Zotero Translators - The Missing Manual

Tools of the Trade

  • Zotero (to do: discuss the tricky business of choosing Zotero 1.0.x or Zotero 2.0)
  • Scaffold
  • Xpather (http://xpath.alephzarro.com/), which requires DOM Inspector (https://developer.mozilla.org/En/DOM_Inspector)
  • Firebug (http://getfirebug.com/)

Translator Metadata

Each translator is described by several metadata fields. For the stand-alone JavaScript translator files in Zotero 2.0, this metadata is included at the beginning of the file in a JSON block, e.g.:

{
	"translatorID":"fcf41bed-0cbc-3704-85c7-8062a0068a7a",
	"translatorType":12,
	"label":"NCBI PubMed",
	"creator":"Simon Kornblith and Michael Berkowitz",
	"target":"http://[^/]*www\\.ncbi\\.nlm\\.nih\\.gov[^/]*/(pubmed|sites/entrez|entrez/query\\.fcgi\\?.*db=PubMed)",
	"minVersion":"1.0.0b3.r1",
	"maxVersion":"",
	"priority":100,
	"inRepository":true,
	"lastUpdated":"2008-12-15 00:25:00"
}

A description of the metadata fields:

  • translatorID
    The internal ID by which Zotero identifies the translator. It is recommended to use a GUID (GUIDs can be automatically generated in Scaffold). As the translatorID is used for automatic updating of translators, and for calling translators within other translators, using stable GUIDs is strongly recommended.
  • translatorType
    Four types of translator exist, web translators being the most common. The four types are: import (1), export (2), web (4) and search (8). The value of translatorType should be the number listed after the relevant type. Some translators belong to multiple types. In those cases, the value of translatorType is the sum of the types (e.g. a web/search translator will have a translatorType value of 4 + 8 = 12). In Scaffold the translatorType is set with checkboxes.
  • label
    The name of the translator
  • creator
    The author(s) of the translator
  • target
    For web translators, the target should specify a JavaScript regular expression. Whenever a page is loaded, Zotero tests the target regular expressions of all web translators against the URL of the page. Of the matching translators, the translator with the lowest priority number will be used for that page. The translator's detect code (its detectWeb function) is then run, and a Zotero item icon will appear in the address bar if an item is found.
  • minVersion
    The minimum version of Zotero for which the translator works properly
  • maxVersion
    The maximum version of Zotero for which the translator works properly
  • priority
    The priority number is used to determine which translator should be used, if multiple translators are found to be able to translate a certain web page. A lower number indicates a higher priority.
  • inRepository
    FIXME No clue what this is for
  • lastUpdated
    The date and time when the translator was last modified (format YYYY-MM-DD HH:MM:SS). Scaffold automatically updates this field when the translator is saved (or run).
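
The translatorType and target/priority descriptions above can be illustrated with a short stand-alone sketch. It is not Zotero's own code: the helper names decodeTranslatorType and pickWebTranslator and the "Example Generic" translator are made up for illustration, and the selection logic only mirrors the rules stated above (match the target against the URL, then prefer the lowest priority number). It does not run any detect code.

```javascript
// Type flags as listed in the translatorType description above.
const TRANSLATOR_TYPES = { import: 1, export: 2, web: 4, search: 8 };

// Return the names of all types whose flag is set in translatorType.
function decodeTranslatorType(translatorType) {
	return Object.keys(TRANSLATOR_TYPES).filter(
		(name) => (translatorType & TRANSLATOR_TYPES[name]) !== 0
	);
}

// Hypothetical helper: among the web translators whose target matches the
// URL, return the one with the lowest priority number (null if none match).
function pickWebTranslator(translators, url) {
	const matches = translators.filter(
		(t) => (t.translatorType & TRANSLATOR_TYPES.web) !== 0 &&
			new RegExp(t.target).test(url)
	);
	matches.sort((a, b) => a.priority - b.priority);
	return matches.length > 0 ? matches[0] : null;
}

// Metadata taken from the NCBI PubMed example above.
const pubmed = {
	label: "NCBI PubMed",
	translatorType: 12,
	target: "http://[^/]*www\\.ncbi\\.nlm\\.nih\\.gov[^/]*/(pubmed|sites/entrez|entrez/query\\.fcgi\\?.*db=PubMed)",
	priority: 100
};

// Made-up catch-all translator with a lower priority (higher number).
const generic = {
	label: "Example Generic",
	translatorType: 4,
	target: "^http://",
	priority: 300
};

console.log(decodeTranslatorType(pubmed.translatorType));
console.log(pickWebTranslator([generic, pubmed], "http://www.ncbi.nlm.nih.gov/pubmed/12345").label);
```

Both translators match the PubMed URL in this sketch, so the priority number decides: 100 beats 300.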

Zotero.Utilities

To do: include details on all the useful (but hidden) functions in Zotero.Utilities

When writing translator code, you can make use of a number of functions in Zotero.Utilities (https://www.zotero.org/trac/browser/extension/branches/1.0/chrome/content/zotero/xpcom/utilities.js). Each function is described below, with an example of its use.

String manipulation

cleanAuthor

Function description
Zotero.Utilities.prototype.cleanAuthor = function(author, type, useComma)
@param {String} author Creator string
@param {String} type Creator type string (e.g., "author" or "editor")
@param {Boolean} useComma Whether the creator string is in inverted (Last, First) format
@return {Object} firstName, lastName, and creatorType

Cleans extraneous punctuation off a creator name and parses it into a first and last name. White-space and punctuation (.,/[]:) are stripped from the start and end of the supplied string, and runs of multiple internal spaces are replaced by single spaces. If useComma is set to true and the name is inverted and separated by a comma, the first and last name are switched around.

Example code

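// Creator string in inverted (Last, First) format
var name = "Doe, John";
// useComma is true, so the string is split on the comma
Zotero.debug(Zotero.Utilities.cleanAuthor(name, "author", true));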

Example debug output
'firstName' => "John"
'lastName' => "Doe"
'creatorType' => "author"

https://www.zotero.org/trac/browser/extension/branches/1.0/chrome/content/zotero/xpcom/utilities.js#L40
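
For experimenting outside Zotero, the behaviour described above can be approximated with a stand-alone function. This is a simplified sketch written from the description, not the code in utilities.js; the name cleanAuthorSketch is made up, and the handling of names without a comma (split at the last space) is an assumption.

```javascript
// Simplified approximation of the cleanAuthor behaviour described above.
function cleanAuthorSketch(author, type, useComma) {
	// Strip white-space and the punctuation .,/[]: from both ends.
	let name = author
		.replace(/^[\s.,\/\[\]:]+/, "")
		.replace(/[\s.,\/\[\]:]+$/, "");
	// Collapse runs of spaces into a single space.
	name = name.replace(/ {2,}/g, " ");

	let firstName, lastName;
	const commaParts = useComma ? name.split(/, ?/) : [];
	if (commaParts.length > 1) {
		// Inverted "Last, First" form.
		lastName = commaParts[0];
		firstName = commaParts[1];
	} else {
		// Assumption: treat everything after the last space as the last name.
		const spaceIndex = name.lastIndexOf(" ");
		lastName = name.substring(spaceIndex + 1);
		firstName = name.substring(0, spaceIndex);
	}
	return { firstName: firstName, lastName: lastName, creatorType: type };
}

console.log(cleanAuthorSketch("Doe, John", "author", true));
console.log(cleanAuthorSketch(" John  Doe. ", "editor", false));
```

Both calls return John as the first name and Doe as the last name. Inside a translator, use Zotero.Utilities.cleanAuthor itself rather than a copy like this one.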

Zotero.Utilities.trim https://www.zotero.org/trac/browser/extension/branches/1.0/chrome/content/zotero/xpcom/utilities.js#L85

Zotero.Utilities.cleanString https://www.zotero.org/trac/browser/extension/branches/1.0/chrome/content/zotero/xpcom/utilities.js#L114

Zotero.Utilities.superCleanString https://www.zotero.org/trac/browser/extension/branches/1.0/chrome/content/zotero/xpcom/utilities.js#L123

Zotero.Utilities.cleanTags https://www.zotero.org/trac/browser/extension/branches/1.0/chrome/content/zotero/xpcom/utilities.js#L136

Zotero.Utilities.htmlSpecialChars https://www.zotero.org/trac/browser/extension/branches/1.0/chrome/content/zotero/xpcom/utilities.js#L153

Zotero.Utilities.unescapeHTML https://www.zotero.org/trac/browser/extension/branches/1.0/chrome/content/zotero/xpcom/utilities.js#L189

Zotero.Utilities.parseMarkup https://www.zotero.org/trac/browser/extension/branches/1.0/chrome/content/zotero/xpcom/utilities.js#L206

Zotero.Utilities.isInt https://www.zotero.org/trac/browser/extension/branches/1.0/chrome/content/zotero/xpcom/utilities.js#L247

Zotero.Utilities.getPageRange https://www.zotero.org/trac/browser/extension/branches/1.0/chrome/content/zotero/xpcom/utilities.js#L260

Zotero.Utilities.lpad https://www.zotero.org/trac/browser/extension/branches/1.0/chrome/content/zotero/xpcom/utilities.js#L283

Zotero.Utilities.itemTypeExists https://www.zotero.org/trac/browser/extension/branches/1.0/chrome/content/zotero/xpcom/utilities.js#L297

Zotero.Utilities.getCreatorsForType https://www.zotero.org/trac/browser/extension/branches/1.0/chrome/content/zotero/xpcom/utilities.js#L311

Zotero.Utilities.getLocalizedCreatorType https://www.zotero.org/trac/browser/extension/branches/1.0/chrome/content/zotero/xpcom/utilities.js#L327

Zotero.Utilities.capitalizeTitle https://www.zotero.org/trac/browser/extension/branches/1.0/chrome/content/zotero/xpcom/utilities.js#L342

Zotero.Utilities.processAsync https://www.zotero.org/trac/browser/extension/branches/1.0/chrome/content/zotero/xpcom/utilities.js#L361

Item properties

Especially for screen scraper translators, knowing which item types (book, journalArticle, etc) and item fields (title, url, etc) exist in Zotero can be very helpful. Fortunately, the possible item properties can be found in the following source code file (types are listed as “itemTypes” entries, fields as “itemFields”):

https://www.zotero.org/trac/browser/extension/branches/1.0/chrome/locale/en-US/zotero/zotero.properties#L154

Note that the different item types make use of different combinations of item fields (e.g. the book item type has the field ISBN, while the journalArticle item type lacks this field).
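
As a rough illustration of why this matters when scraping, the sketch below checks a scraped item against a table of fields per item type. The table is a tiny illustrative subset built only from the fields named above (title, url, ISBN), not Zotero's real field list, and the helper unsupportedFields is hypothetical.

```javascript
// Illustrative subset only; the authoritative list is in zotero.properties.
const FIELDS_BY_TYPE = {
	book: ["title", "url", "ISBN"],
	journalArticle: ["title", "url"]
};

// Hypothetical helper: list the properties of a scraped item that its
// item type does not support.
function unsupportedFields(item) {
	const allowed = FIELDS_BY_TYPE[item.itemType];
	if (!allowed) {
		throw new Error("Unknown item type: " + item.itemType);
	}
	return Object.keys(item).filter(
		(field) => field !== "itemType" && !allowed.includes(field)
	);
}

// Placeholder data: an ISBN set on a journalArticle is reported as unsupported.
const scraped = { itemType: "journalArticle", title: "An Example Article", ISBN: "0000000000" };
console.log(unsupportedFields(scraped));
```

A check along these lines can help catch a scraper that assigns a field to the wrong item type before the item is saved.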

Translator delegation

To do: describe how translators can call other translators (annotate existing RIS-translator with a bunch of comments?)

Useful translator examples

To do: pick some examples of the different types of translators:

  • XML based: NCBI PubMed, Google Books
  • RIS translators
  • Pure screen scrapers
dev/translators_reference_guide.1246298040.txt.gz · Last modified: 2009/06/29 13:54 by rmzelle