Differences
This shows you the differences between two versions of the page.
dev:translators_reference_guide [2009/06/29 13:54] – rmzelle | dev:translators_reference_guide [2017/11/12 19:53] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
+ | < | ||
+ | in the process of updating the documentation for | ||
+ | <a href=" | ||
+ | may be outdated in the meantime. Thanks for your understanding.</ | ||
+ | |||
+ | |||
====== Zotero Translators - The Missing Manual ====== | ====== Zotero Translators - The Missing Manual ====== | ||
===== Tools of the Trade ===== | ===== Tools of the Trade ===== | ||
- | Zotero | + | Writing |
- | + | ||
- | ===== Translator Metadata ===== | + | |
- | Each translator is described by several metadata fields. For the stand-alone javascript translator | + | * [[/ |
+ | * [[/ | ||
+ | * XPath tools - most translators rely on XPath to extract information from HTML web pages or from XML data files. | ||
+ | * [[http:// | ||
+ | * [[http:// | ||
- | < | ||
- | " | ||
- | " | ||
- | " | ||
- | " | ||
- | " | ||
- | " | ||
- | " | ||
- | " | ||
- | " | ||
- | " | ||
- | }</ | ||
- | |||
- | A description of the metadata fields: | ||
- | |||
- | * **translatorID** \\ The internal ID by which Zotero identifies the translator. It is recommended to use a [[http:// | ||
- | * **translatorType** \\ Four types of translator exist, web translators being the most common. The four types are: import (1), export (2), web (4) and search (8). The value of translatorType should be the number listed after the relevant type. Some translators belong to multiple types. In those cases, the value of translatorType is the sum of the types (e.g. a web/search translator will have a translatorType value of 12). In [[scaffold|Scaffold]] the translatorType is set with checkboxes. | ||
- | * **label** \\ The name of the translator | ||
- | * **creator** \\ The author(s) of the translator | ||
- | * **target** \\ For web translators, | ||
- | * **minVersion** \\ The minimum version of Zotero for which the translator works properly | ||
- | * **maxVersion** \\ The maximum version of Zotero for which the translator works properly | ||
- | * **priority** \\ The priority number is used to determine which translator should be used, if multiple translators are found to be able to translate a certain web page. A lower number indicates a higher priority. | ||
- | * **inRepository** \\ FIXME No clue what this is for | ||
- | * **lastUpdated** \\ The date and time when the translator was last modified (format YYYY-MM-DD HH:MM:SS). Scaffold automatically updates this field when the translator is saved (or run). | ||
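The way translatorType values add up can be sketched as follows. This is a minimal standalone illustration, not code from Zotero: the names ''TRANSLATOR_TYPES'', ''combineTranslatorTypes'' and ''hasTranslatorType'' are hypothetical, and only the numeric values (1, 2, 4, 8) come from the list above.

```javascript
// Numeric values as listed in the translatorType description above.
// The object and function names are hypothetical, for illustration only.
var TRANSLATOR_TYPES = { import: 1, export: 2, web: 4, search: 8 };

// Combine several type names into one translatorType value.
// Because each type is a distinct power of two, OR-ing equals summing.
function combineTranslatorTypes(names) {
  return names.reduce(function (value, name) {
    if (!Object.prototype.hasOwnProperty.call(TRANSLATOR_TYPES, name)) {
      throw new Error("Unknown translator type: " + name);
    }
    return value | TRANSLATOR_TYPES[name];
  }, 0);
}

// Check whether a combined value includes a given type.
function hasTranslatorType(value, name) {
  return (value & TRANSLATOR_TYPES[name]) !== 0;
}

console.log(combineTranslatorTypes(["web", "search"])); // 12
console.log(hasTranslatorType(12, "import"));           // false
```

Because every type is a separate bit, a combined value can always be split back into its component types.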
===== Zotero.Utilities ===== | ===== Zotero.Utilities ===== | ||
Line 39: | Line 22: | ||
To do: include details on all the useful (but hidden) functions in Zotero.Utilities | To do: include details on all the useful (but hidden) functions in Zotero.Utilities | ||
- | When writing translator code, you can make use of a number of functions in [[|https:// | + | When writing translator code, you can make use of a number of functions in [[https:// |
==== String manipulation ==== | ==== String manipulation ==== | ||
Line 46: | Line 29: | ||
**Function description** \\ | **Function description** \\ | ||
+ | https:// | ||
Zotero.Utilities.prototype.cleanAuthor = function(author, | Zotero.Utilities.prototype.cleanAuthor = function(author, | ||
@param {String} author Creator string \\ | @param {String} author Creator string \\ | ||
Line 52: | Line 36: | ||
@return {Object} firstName, lastName, and creatorType | @return {Object} firstName, lastName, and creatorType | ||
- | Cleans | + | Sometimes it is difficult to extract clean author names from webpages. '' |
- | Cleans extraneous punctuation off a creator name and parses it into first and last name | + |
**Example code** | **Example code** | ||
<code javascript> | <code javascript> | ||
- | var name = "Doe, John"; | + | var name = " |
Zotero.debug(Zotero.Utilities.cleanAuthor(name, | Zotero.debug(Zotero.Utilities.cleanAuthor(name, | ||
</ | </ | ||
- | **Example debug output** \\ | + | **Example |
- | ' | + | < |
- | ' | + | ' |
+ | ' | ||
' | ' | ||
+ | </ | ||
- | https:// | + | === trim === |
- | + | **Function description** \\ | |
- | === Zotero.Utilities.trim === | + | |
https:// | https:// | ||
+ | Zotero.Utilities.prototype.trim = function(s) \\ | ||
+ | @type String | ||
- | Zotero.Utilities.cleanString | + | Removes leading and trailing whitespace from a string |
- | https:// | + | |
- | Zotero.Utilities.superCleanString | + | === trimInternal === |
+ | **Function description** \\ | ||
+ | https:// | ||
+ | Zotero.Utilities.prototype.trimInternal = function(s) | ||
+ | @type String | ||
+ | |||
+ | Cleans whitespace off a string and replaces multiple spaces with one | ||
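The difference between ''trim'' and ''trimInternal'' can be illustrated with a small standalone sketch. The functions below are hypothetical stand-ins written for this page, not the Zotero source, and they assume that any run of whitespace (not only spaces) is collapsed.

```javascript
// Hypothetical stand-in for trim: strip leading and trailing whitespace only.
function trimSketch(s) {
  return s.replace(/^\s+|\s+$/g, "");
}

// Hypothetical stand-in for trimInternal: collapse every whitespace run
// to a single space, then strip the ends.
function trimInternalSketch(s) {
  return trimSketch(s.replace(/\s+/g, " "));
}

console.log(JSON.stringify(trimSketch("  Journal of  Things \n")));
console.log(JSON.stringify(trimInternalSketch("  Journal   of\n Things ")));
```

The first call keeps the double space inside the string, while the second returns "Journal of Things".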
+ | |||
+ | === cleanString === | ||
+ | Deprecated function, use trimInternal instead. | ||
+ | |||
+ | === superCleanString === | ||
+ | **Function description** \\ | ||
https:// | https:// | ||
+ | Zotero.Utilities.prototype.superCleanString = function(x) | ||
+ | @type String | ||
- | Zotero.Utilities.cleanTags | + | Cleans any non-word non-parenthesis characters off the ends of a string |
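As a rough illustration of "non-word non-parenthesis characters off the ends", here is a standalone sketch. It is an approximation under stated assumptions, not the Zotero implementation: ''superCleanSketch'' is a hypothetical name, and JavaScript's ''\w'' only covers ASCII letters, digits and the underscore.

```javascript
// Hypothetical approximation of the behaviour described above.
// Leading characters are removed until a word character or "(" is found;
// trailing characters are removed until a word character or ")" is found.
function superCleanSketch(x) {
  return x.replace(/^[^\w(]+/, "").replace(/[^\w)]+$/, "");
}

console.log(superCleanSketch("  --(Hello World)., ")); // (Hello World)
console.log(superCleanSketch("...Title!"));            // Title
```

Since ''\w'' is ASCII-only, this sketch would also strip accented letters at the ends of a string, so it should not be used as-is on non-English text.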
+ | |||
+ | === cleanTags === | ||
+ | **Function description** \\ | ||
https:// | https:// | ||
+ | Zotero.Utilities.prototype.cleanTags = function(x) | ||
+ | @type String | ||
- | Zotero.Utilities.htmlSpecialChars | + | Eliminates HTML tags, replacing each instance of <br> with a newline |
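The described behaviour (line breaks become newlines, all other tags are dropped) can be sketched in a few lines. ''cleanTagsSketch'' is a hypothetical stand-in, and the regular expressions are assumptions that only handle simple, well-formed tags.

```javascript
// Hypothetical stand-in for the behaviour described above.
function cleanTagsSketch(x) {
  return x
    .replace(/<br\s*\/?>/gi, "\n") // <br>, <br/> and <br /> become newlines
    .replace(/<[^>]+>/g, "");      // every remaining tag is removed
}

console.log(JSON.stringify(cleanTagsSketch("Line one<br/>Line <b>two</b>")));
```

The order matters: the line breaks have to be converted before the generic tag removal, otherwise they would be deleted along with the other tags.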
+ | |||
+ | === htmlSpecialChars === | ||
+ | **Function description** \\ | ||
https:// | https:// | ||
+ | Zotero.Utilities.prototype.htmlSpecialChars = function(str) | ||
+ | @type String | ||
- | Zotero.Utilities.unescapeHTML | + | Escapes several predefined characters: |
+ | * & (ampersand) becomes &amp; | ||
+ | * " (double quote) becomes & | ||
+ | * ' (single quote) becomes &# | ||
+ | * < (less than) becomes &lt; | ||
+ | * > (greater than) becomes &gt; | ||
+ | and | ||
+ | * < | ||
+ | * < | ||
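Escaping of this kind can be sketched with a single replacement pass. The function below is a hypothetical stand-in, not the Zotero source; in particular, the entity chosen for the single quote (''&#39;'') is an assumption, and the additional special cases listed after "and" above are not covered.

```javascript
// Hypothetical escape table; the single-quote entity is an assumption.
var ESCAPE_MAP = {
  "&": "&amp;",
  '"': "&quot;",
  "'": "&#39;",
  "<": "&lt;",
  ">": "&gt;"
};

// Escape all five characters in one pass, so that the "&" produced by
// an earlier replacement is never escaped a second time.
function escapeHtmlSketch(str) {
  return str.replace(/[&"'<>]/g, function (ch) {
    return ESCAPE_MAP[ch];
  });
}

console.log(escapeHtmlSketch('<a href="x">Tom & Jerry</a>'));
```

A chain of separate replacements would also work, but only if the ampersand is handled first; the single pass avoids that ordering problem.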
+ | |||
+ | === unescapeHTML === | ||
+ | **Function description** \\ | ||
https:// | https:// | ||
+ | Zotero.Utilities.prototype.unescapeHTML = function(str) | ||
+ | @type String | ||
+ | |||
+ | Converts all HTML entities in a string into Unicode characters. | ||
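To make the idea concrete, here is a reduced standalone sketch of entity decoding. It is not the Zotero implementation: ''unescapeHtmlSketch'' is a hypothetical name, and it only knows numeric entities plus five named ones, whereas the real function is described as handling all HTML entities.

```javascript
// Hypothetical, deliberately small table of named entities.
var NAMED_ENTITIES = { amp: "&", lt: "<", gt: ">", quot: '"', apos: "'" };

function unescapeHtmlSketch(str) {
  return str.replace(/&(#[xX][0-9a-fA-F]+|#\d+|[a-zA-Z]+);/g, function (match, body) {
    if (body.charAt(0) === "#") {
      var isHex = body.charAt(1) === "x" || body.charAt(1) === "X";
      var code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range code points untouched instead of throwing.
      return code <= 0x10ffff ? String.fromCodePoint(code) : match;
    }
    // Unknown named entities are left as they are.
    return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)
      ? NAMED_ENTITIES[body]
      : match;
  });
}

console.log(unescapeHtmlSketch("&lt;p&gt;Caf&#233; &amp; bar&#x21;")); // <p>Café & bar!
```

Decoding in a single pass means that doubly escaped text such as ''&amp;lt;'' is only decoded one level.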
- | Zotero.Utilities.parseMarkup | + | === parseMarkup === | ||
+ | **Function description** \\ | ||
https:// | https:// | ||
+ | Zotero.Utilities.prototype.parseMarkup = function(str) | ||
+ | @return {Array} An array of objects with the following form: | ||
+ | { | ||
+ | type: ' | ||
+ | text: "text content", | ||
+ | [ attributes: { key1: val [ , key2: val, ...] } | ||
+ | }</ | ||
- | Zotero.Utilities.isInt | + | Parses a text string for HTML/XUL markup and returns an array of parts. Currently only finds HTML links (<a> tags) |
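The shape of the returned array may be easier to see with a standalone sketch. The code below is a hypothetical re-creation, not the Zotero source: the type values ''"text"'' and ''"link"'' are assumptions, and only double-quoted attributes on simple <a> tags are recognised.

```javascript
// Hypothetical sketch: split a string into plain-text parts and link parts.
function parseMarkupSketch(str) {
  var parts = [];
  var linkRe = /<a\s+([^>]*)>([\s\S]*?)<\/a>/gi;
  var lastIndex = 0;
  var match;
  while ((match = linkRe.exec(str)) !== null) {
    // Text before this link becomes its own part.
    if (match.index > lastIndex) {
      parts.push({ type: "text", text: str.slice(lastIndex, match.index) });
    }
    // Collect double-quoted attributes of the <a> tag.
    var attributes = {};
    var attrRe = /([\w-]+)\s*=\s*"([^"]*)"/g;
    var attr;
    while ((attr = attrRe.exec(match[1])) !== null) {
      attributes[attr[1]] = attr[2];
    }
    parts.push({ type: "link", text: match[2], attributes: attributes });
    lastIndex = linkRe.lastIndex;
  }
  // Any trailing text after the last link.
  if (lastIndex < str.length) {
    parts.push({ type: "text", text: str.slice(lastIndex) });
  }
  return parts;
}

console.log(JSON.stringify(parseMarkupSketch('See <a href="http://example.org">the site</a> now')));
```

For the input shown, the sketch returns three parts: a text part, a link part carrying an ''href'' attribute, and a final text part.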
+ | |||
+ | === isInt === | ||
+ | **Function description** \\ | ||
https:// | https:// | ||
+ | Zotero.Utilities.prototype.isInt = function(x) | ||
+ | @deprecated Use !isNaN(parseInt(x)) | ||
+ | @type Boolean | ||
- | Zotero.Utilities.getPageRange | + | Tests if a string is an integer |
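An integer test and a check built on ''isNaN(parseInt(x))'' do not behave identically, which the following standalone sketch tries to show. Both functions are hypothetical illustrations written for this page, not the Zotero source.

```javascript
// Hypothetical integer test: the parsed value must loosely equal the input.
function isIntSketch(x) {
  return parseInt(x, 10) == x;
}

// Looser check in the style of the deprecation note: it only asks whether
// parseInt could extract a leading number.
function parsesAsIntSketch(x) {
  return !isNaN(parseInt(x, 10));
}

console.log(isIntSketch("42"));          // true
console.log(isIntSketch("42abc"));       // false
console.log(parsesAsIntSketch("42abc")); // true
console.log(isIntSketch("4.5"));         // false
```

The difference matters for scraped values such as "42abc": ''parseInt'' silently ignores trailing text, so the looser check accepts strings that are not clean integers.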
+ | |||
+ | === getPageRange | ||
+ | **Function description** \\ | ||
https:// | https:// | ||
+ | Zotero.Utilities.prototype.getPageRange = function(pages) | ||
+ | @param {String} pages Page range to parse | ||
+ | @return {Integer[]} Start and end pages | ||
- | Zotero.Utilities.lpad | + | Parses a page range |
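Page-range parsing can be sketched as follows. This is a hypothetical standalone version, not the Zotero source: accepting both a hyphen and an en dash, returning the same number twice for a single page, and returning ''null'' for unparseable input are all assumptions made for the example.

```javascript
// Hypothetical sketch: turn "123-145" into [123, 145].
function getPageRangeSketch(pages) {
  // Two numbers separated by a hyphen or an en dash, with optional spaces.
  var range = /^\s*(\d+)\s*[-\u2013]\s*(\d+)\s*$/.exec(pages);
  if (range) {
    return [parseInt(range[1], 10), parseInt(range[2], 10)];
  }
  // A single page number: start and end are the same (an assumption).
  var single = /^\s*(\d+)\s*$/.exec(pages);
  if (single) {
    var page = parseInt(single[1], 10);
    return [page, page];
  }
  // Anything else (e.g. roman numerals) is not handled by this sketch.
  return null;
}

console.log(getPageRangeSketch("123-145")); // [ 123, 145 ]
console.log(getPageRangeSketch("7"));       // [ 7, 7 ]
```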
+ | |||
+ | === lpad === | ||
+ | **Function description** \\ | ||
https:// | https:// | ||
+ | Zotero.Utilities.prototype.lpad = function(string, | ||
+ | @param {String} string String to pad | ||
+ | @param {String} pad String to use as padding | ||
+ | @param {Integer} length Length of new padded string | ||
+ | @type String | ||
+ | |||
+ | Pads a number or other string with a given string on the left | ||
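Left-padding is typically used for things like zero-filled dates or issue numbers. The sketch below is a hypothetical standalone version that follows the parameter list above (''string'', ''pad'', ''length''); it is not copied from the Zotero source.

```javascript
// Hypothetical sketch: prepend `pad` until the result has at least `length` characters.
function lpadSketch(string, pad, length) {
  string = String(string); // accept numbers as well as strings
  while (string.length < length) {
    string = pad + string;
  }
  return string;
}

console.log(lpadSketch(7, "0", 3));     // 007
console.log(lpadSketch("abc", "0", 2)); // abc
```

With a single-character pad the result has exactly the requested length; with a longer pad string this sketch can overshoot, because it always prepends the whole pad.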
+ | |||
+ | === capitalizeTitle === | ||
+ | **Function description** \\ | ||
+ | https:// | ||
+ | Zotero.Utilities.prototype.capitalizeTitle = function(string, | ||
+ | @param {String} string | ||
+ | @param {Boolean} force Forces title case conversion, even if the capitalizeTitles pref is off | ||
+ | @type String | ||
+ | |||
+ | Cleans a title, converting it to title case and replacing " :" with ":" | ||
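A much simplified sketch of this kind of title cleaning is shown below. It is a hypothetical illustration only: the list of small words is an assumption, the ''force'' parameter and the capitalizeTitles pref are not modelled, and lowercasing every word first means acronyms such as "DNA" would be damaged.

```javascript
// Hypothetical list of words that stay lowercase inside a title.
var SMALL_WORDS = ["a", "an", "and", "as", "at", "but", "by", "for", "in", "of", "on", "or", "the", "to"];

function capitalizeTitleSketch(title) {
  // Replace " :" with ":" as described above, then work word by word.
  var words = title.replace(/ :/g, ":").split(" ");
  return words.map(function (word, i) {
    var lower = word.toLowerCase();
    var startsPhrase = i === 0 || /:$/.test(words[i - 1]);
    // Small words stay lowercase unless they start the title or follow a colon.
    if (!startsPhrase && SMALL_WORDS.indexOf(lower) !== -1) {
      return lower;
    }
    return lower.charAt(0).toUpperCase() + lower.slice(1);
  }).join(" ");
}

console.log(capitalizeTitleSketch("the art of war : a new translation"));
// The Art of War: A New Translation
```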
+ | |||
+ | ==== Other functions ==== | ||
- | Zotero.Utilities.itemTypeExists | + | === itemTypeExists === | ||
+ | **Function description** \\ | ||
https:// | https:// | ||
+ | Zotero.Utilities.prototype.itemTypeExists = function(type) | ||
+ | @param {String} type Item type | ||
+ | @type Boolean | ||
- | Zotero.Utilities.getCreatorsForType | + | Tests if an item type exists (FIXME: what is the use case for this?) |
+ | |||
+ | === getCreatorsForType | ||
+ | **Function description** \\ | ||
https:// | https:// | ||
+ | Zotero.Utilities.prototype.getCreatorsForType = function(type) | ||
+ | @param {String} type Item type | ||
+ | @return {String[]} Creator types | ||
- | Zotero.Utilities.getLocalizedCreatorType | + | Finds valid creator types for a given item type (FIXME: what is the use case for this?) | ||
+ | |||
+ | === getLocalizedCreatorType === | ||
+ | **Function description** \\ | ||
https:// | https:// | ||
+ | Zotero.Utilities.prototype.getLocalizedCreatorType = function(type) | ||
+ | @param {String} type Creator type | ||
+ | @return {String} Localized creator type | ||
- | Zotero.Utilities.capitalizeTitle | + | Gets a creator type name, localized to the current locale (FIXME: what is the use case for this?) |
- | https:// | + | |
Zotero.Utilities.processAsync | Zotero.Utilities.processAsync | ||
https:// | https:// | ||
+ | |||
+ | To Do: | ||
+ | |||
+ | processDocuments, | ||
===== Item properties ===== | ===== Item properties ===== | ||
Line 116: | Line 202: | ||
Especially for screen scraper translators, | Especially for screen scraper translators, | ||
- | https://www.zotero.org/trac/browser/ | + | http://aurimasv.github.io/z2csl/typeMap.xml |
Note that the different item types make use of different combinations of item fields (e.g. the '' | Note that the different item types make use of different combinations of item fields (e.g. the '' |