Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#868 closed defect (fixed)

Some Unicode characters aren't stripped properly in non-UTF-8 BibTeX output

Reported by: dstillman
Owned by: simon
Priority: minor
Milestone:
Component: export
Version:
Keywords:
Cc: codec

Description

r2025, which explicitly sets the BibTeX translator to export using ASCII when not in UTF-8 mode, theoretically shouldn't be necessary if all non-ASCII characters are being replaced, but without it the Chinese character in the title "台oobar" isn't properly replaced with a question mark.

This isn't a big issue, but it might be worth figuring out why this happens, since other code (for example, code that warns the user when characters are being stripped) might operate under the assumption that the translator, rather than Firefox, is detecting and removing all out-of-range characters.

I haven't looked at how the translate code works, but I'm guessing it might have something to do with the string (from the data layer) still needing to be read in as UTF-8 in order for the replace(/[\u0080-\uFFFF]/g, "?") to work properly.
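For reference, here is a minimal sketch of what that replace does on its own, outside the translator. The helper name stripNonAscii is hypothetical and not taken from the translate code. The second call assumes, purely for illustration, a string whose UTF-8 bytes were never decoded (one character per byte). It only shows that the result depends on how the string was read in, and does not claim to reproduce the bug.

```javascript
// Hypothetical helper (name is an assumption, not from the translator code).
// Replaces every UTF-16 code unit outside 7-bit ASCII with "?".
function stripNonAscii(str) {
  return str.replace(/[\u0080-\uFFFF]/g, "?");
}

// Properly decoded string: the single character U+53F0 becomes one "?".
const decoded = stripNonAscii("\u53F0oobar");

// Same title as raw UTF-8 bytes (E5 8F B0) treated as one character each:
// every byte is >= 0x80, so each one is replaced separately.
const undecoded = stripNonAscii("\u00E5\u008F\u00B0oobar");

console.log(decoded);   // ?oobar
console.log(undecoded); // ???oobar
```

In both cases the output is pure ASCII, so the replace by itself would not leave the character in place. If the character survives, the string is presumably bypassing or preceding this step somewhere else.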

Change History (1)

comment:1 Changed 9 years ago by simon

  • Resolution set to fixed
  • Status changed from new to closed

(In [2027]) - closes #868, Some Unicode characters aren't stripped properly in non-UTF-8 BibTeX output

  • speeds up BibTeX text encoding
  • escapes BibTeX special characters