Opened 9 years ago

Last modified 7 years ago

#669 reopened defect

Handling accented characters with COinS

Reported by: stakats
Owned by: stakats
Priority: major
Milestone:
Component: ingester
Version: 1.0
Keywords:
Cc:

Description

We're currently using escape/unescape, and it looks like we should be using encodeURIComponent/decodeURIComponent in ingester.js.
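For illustration, a minimal sketch of how the two function pairs differ on accented characters. The sample characters and the `samples` object are made up for this example and are not taken from ingester.js.

```javascript
// Illustration only: sample characters chosen for this example.
const latin1Char = "\u00e9"; // "é", U+00E9, fits in a single Latin-1 byte
const wideChar = "\u0151";   // "ő", U+0151, outside Latin-1

const samples = {
  // escape() percent-encodes the code unit itself (Latin-1 style),
  // and uses the non-standard %uXXXX form above U+00FF.
  escapeLatin1: escape(latin1Char),             // "%E9"
  escapeWide: escape(wideChar),                 // "%u0151"
  // encodeURIComponent() percent-encodes the UTF-8 bytes.
  encodeLatin1: encodeURIComponent(latin1Char), // "%C3%A9"
  encodeWide: encodeURIComponent(wideChar),     // "%C5%91"
};

console.log(samples);
```

The two encodings are not interchangeable: `decodeURIComponent("%E9")` throws a `URIError` because `%E9` on its own is not a valid UTF-8 sequence.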

Attachments (1)

ingester.js_decodeURIComponent-escape.diff (1.2 KB) - added by karnesky 7 years ago.


Change History (3)

comment:1 Changed 9 years ago by stakats

  • Resolution set to fixed
  • Status changed from new to closed

(In [1518]) Closes #669 by supporting encode/decode of non-ASCII (e.g. accented) characters in COinS tags

Changed 7 years ago by karnesky

comment:2 Changed 7 years ago by karnesky

  • Resolution fixed deleted
  • Status changed from closed to reopened

COinS does not mandate a character encoding, and OpenURL actually allows a parameter to set the encoding. Many applications emit data that is not UTF-8: older versions of refbase did, OCLC had done this, and I'm sure there are others. They typically use the (deprecated) escape function. Zotero should be more liberal in what it accepts.

This patch uses decodeURIComponent but falls back to unescape. It works with the examples given in http://forums.zotero.org/discussion/2400 . The patch does not consider what character set the COinS claims to use (there seems to be no reason to), nor does it consider any other escaping scheme (do we need to worry about any?). It also fixes minor whitespace flaws.
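A rough sketch of the fallback idea, not the attached patch itself. The helper name `decodeCoinsValue` is hypothetical, and the sketch assumes the fallback is triggered by the `URIError` that decodeURIComponent throws on input that is not valid percent-encoded UTF-8.

```javascript
// Hypothetical helper: the name and structure are assumptions,
// not code from ingester.js or from the attached diff.
function decodeCoinsValue(value) {
  try {
    // Preferred path: value is percent-encoded UTF-8.
    return decodeURIComponent(value);
  } catch (e) {
    if (e instanceof URIError) {
      // Fallback: treat the value as output of the deprecated escape().
      return unescape(value);
    }
    throw e;
  }
}

console.log(decodeCoinsValue("caf%C3%A9")); // UTF-8 encoded input
console.log(decodeCoinsValue("caf%E9"));    // escape()-style input
```

One limitation of this approach: an escape()-style value whose bytes happen to form valid UTF-8 is decoded as UTF-8 and never reaches the fallback.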
