#868 closed defect (fixed)
Some Unicode characters aren't stripped properly in non-UTF-8 BibTeX output
| Reported by: | dstillman | Owned by: | simon |
|---|---|---|---|
| Priority: | minor | Milestone: | |
| Component: | export | Version: | |
| Keywords: | | Cc: | codec |
Description
r2025 explicitly sets the BibTeX translator to export as ASCII when not in UTF-8 mode. In theory that shouldn't be necessary if all non-ASCII characters are already being replaced, but without it the Chinese character in the title "台oobar" isn't replaced with a question mark.
This isn't too big of an issue, but it might be worth figuring out why this happens, since other code (for example, code that warned the user if characters were being stripped) might operate under the assumption that the translator (rather than Firefox) was detecting/removing all out-of-range characters.
I haven't looked at how the translate code works, but I'm guessing the string (from the data layer) still needs to be read in as UTF-8 in order for the replace(/[\u0080-\uFFFF]/g, "?") to work properly.
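For reference, here is a minimal sketch of what that replace does to a string that has already been decoded, compared with one whose UTF-8 bytes were read one byte per character. The helper name `stripNonASCII` is hypothetical and not taken from the translator code, and this only illustrates the regex's behavior; it is not a reproduction of the bug.

```javascript
// Hypothetical helper wrapping the replace from the description.
// The name is an assumption for illustration only.
function stripNonASCII(str) {
  // Replace every UTF-16 code unit from U+0080 to U+FFFF with "?".
  return str.replace(/[\u0080-\uFFFF]/g, "?");
}

// Properly decoded: U+53F0 is a single UTF-16 code unit,
// so it becomes a single question mark.
const decoded = "\u53F0oobar";
console.log(stripNonASCII(decoded)); // ?oobar

// The same character as raw UTF-8 bytes (E5 8F B0) read one byte per
// character: three code units are above U+007F, so three question marks.
const misread = "\u00E5\u008F\u00B0oobar";
console.log(stripNonASCII(misread)); // ???oobar
```

The output depends on how the string was decoded before the replace ran, so code that counts or reports stripped characters would need to know which form it is looking at.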
Change History (1)
comment:1 Changed 9 years ago by simon
- Resolution set to fixed
- Status changed from new to closed
(In [2027]) - closes #868, Some Unicode characters aren't stripped properly in non-UTF-8 BibTeX output