Differences

This shows you the differences between two versions of the page.

dev:translators [2014/12/18 01:24] – [Translator Structure] dataMode = block or line is deprecated zuphilipdev:translators [2023/08/18 09:15] (current) – [Recommendations for Translator Authors] Add one advice about not modifying the live document. zoe
Line 7: Line 7:
 **Note:** Before writing a translator for a site, look at the [[dev:exposing_metadata|documentation on exposing metadata]]; website authors should try embedding the necessary metadata before attempting to write a translator. **Note:** Before writing a translator for a site, look at the [[dev:exposing_metadata|documentation on exposing metadata]]; website authors should try embedding the necessary metadata before attempting to write a translator.
  
-If you're looking for a broken translator to fix, see the [[https://repo.zotero.org/errors|recent translator errors]] and check on one of the top reported errors. You can also check the status of many translators by reviewing the [[dev:translators:testing#running_tests|translator test overview]].+If you're looking for a broken translator to fix, see the [[https://zotero-translator-tests.s3.amazonaws.com/index.html|recent translator errors]] and check on one of the top reported errors. You can also check the status of many translators by reviewing the [[dev:translators:testing#running_tests|translator test overview]].
 ===== Translator Types - Web, Import, Export and Search ===== ===== Translator Types - Web, Import, Export and Search =====
  
Line 45: Line 45:
   * **creator** \\ The author(s) of the translator.   * **creator** \\ The author(s) of the translator.
   * **target** \\   * **target** \\
-    * For [[dev:translators:coding#web_translators|web translators]], the ''target'' should specify a [[:dev:technologies#regular_expressions|JavaScript regular expression]] (note that escaping requires two backslashes: one for the regular expression itself, and one for the JSON, e.g. "<nowiki>^https?://(www\\.)?example.com/</nowiki>". If using Scaffold, the add-on takes care of the JSON escaping, so backslashes do not need to be escaped).\\ When only matching a domain, the translator should terminate in a forward slash, so it only matches a non-proxied domain. Zotero will take care of de-proxifying the URL and pass the de-proxified URL to the translator.\\ Whenever a webpage is loaded, Zotero tests the target regular expressions of all web translators on the webpage URL. If there is a translator with a matching target, this translator’s ''detectWeb'' function is run. If this function finds item metadata, the Zotero translator icon appears or becomes active in the browser. When multiple translators have a matching target, the translator with the lowest priority number is selected. Web translators with the target set to ''null'' (e.g. the DOI translator) match every webpage, but normally have a high priority number and are only used when no other translator matches.+    * For [[dev:translators:coding#web_translators|web translators]], the ''target'' should specify a [[:dev:technologies#regular_expressions|JavaScript regular expression]] (note that escaping requires two backslashes: one for the regular expression itself, and one for the JSON, e.g. "<nowiki>^https?://(www\\.)?example.com/</nowiki>". If using Scaffold, it takes care of the JSON escaping, so backslashes do not need to be escaped).\\ When only matching a domain, the translator should terminate in a forward slash, so it only matches a non-proxied domain. Zotero will take care of de-proxifying the URL and pass the de-proxified URL to the translator.\\ Whenever a webpage is loaded, Zotero tests the target regular expressions of all web translators on the webpage URL. If there is a translator with a matching target, this translator’s ''detectWeb'' function is run. If this function finds item metadata, the Zotero translator icon appears or becomes active in the browser. When multiple translators have a matching target, the translator with the lowest priority number is selected. Web translators with an empty ''target'' string (e.g. the DOI translator) match every webpage, but normally have a high priority number and are only used when no other translator matches.
     * For import translators, the ''target'' is set to the expected extension (e.g. the BibTeX import/export translator has its target set to "bib"; selecting BibTeX in Zotero’s import window filters for files with a ".bib" extension).     * For import translators, the ''target'' is set to the expected extension (e.g. the BibTeX import/export translator has its target set to "bib"; selecting BibTeX in Zotero’s import window filters for files with a ".bib" extension).
     * For export translators, the ''target'' is set to the extension that should be given to generated files (e.g. the BibTeX translator produces "filename.bib" files).     * For export translators, the ''target'' is set to the extension that should be given to generated files (e.g. the BibTeX translator produces "filename.bib" files).
   * **minVersion** & **maxVersion** \\ Respectively the minimum and maximum version of Zotero (as specified in Zotero’s [[https://developer.mozilla.org/en/install_manifests|Install Manifest]]) with which the translator is compatible.   * **minVersion** & **maxVersion** \\ Respectively the minimum and maximum version of Zotero (as specified in Zotero’s [[https://developer.mozilla.org/en/install_manifests|Install Manifest]]) with which the translator is compatible.
-  * **browserSupport** \\ A string containing one or more of the letters ''g'', ''c'', ''s'', ''i'', representing the connectors that the translator can be run in -- Gecko (Firefox), Chrome, Safari, Internet Explorer, respectively. ''b'' indicates support for the Bookmarklet ([[https://groups.google.com/forum/#!topic/zotero-dev/ZWCe86B3OCw/discussion|zotero-dev thread]]) and ''v'' indicates support for the [[https://github.com/zotero/translation-server|translation-server]].  For more information, see [[dev:translators:connectors|Connectors]]. **Warning: Compatible with Zotero 2.1.9 and later only.**+  * **browserSupport** \\ A string containing one or more of the letters ''g'', ''c'', ''s'', ''i'', representing the connectors that the translator can be run in -- Gecko (Firefox), Chrome, Safari, Internet Explorer, respectively. ''b'' indicates support for the Bookmarklet ([[https://groups.google.com/forum/#!topic/zotero-dev/ZWCe86B3OCw/discussion|zotero-dev thread]]) and ''v'' indicates support for the [[https://github.com/zotero/translation-server|translation-server]].  For more information, see [[dev:translators:connectors|Connectors]].
   * **priority** \\ An integer indicating translator priority. When multiple translators can translate a source, the translator with the lowest priority number is selected. Site-specific web translators normally have a priority of 100. For guidelines on picking an appropriate priority for web translators see [[:dev:translators:priority|this page]]   * **priority** \\ An integer indicating translator priority. When multiple translators can translate a source, the translator with the lowest priority number is selected. Site-specific web translators normally have a priority of 100. For guidelines on picking an appropriate priority for web translators see [[:dev:translators:priority|this page]]
   * **inRepository** \\ Set to ''true'' for translators that are added to the Zotero repo and distributed to all Zotero users, and ''false'' for those that are not.   * **inRepository** \\ Set to ''true'' for translators that are added to the Zotero repo and distributed to all Zotero users, and ''false'' for those that are not.
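
The ''target'' escaping and priority rules above can be sketched roughly as follows. This is only an illustration under stated assumptions: the header values, the example URLs, and the ''pickTranslator'' helper are hypothetical and are not part of Zotero's API, and real selection also depends on each translator's ''detectWeb'' result.

```javascript
// Hypothetical metadata header as it might appear at the top of a translator file.
// String.raw keeps the text exactly as written, so the doubled backslashes
// survive: one level is consumed by JSON, the other by the regular expression.
const headerJson = String.raw`{
	"label": "Example Site",
	"target": "^https?://(www\\.)?example\\.com/",
	"priority": 100
}`;

const header = JSON.parse(headerJson);
const targetRegex = new RegExp(header.target);

// The trailing slash in the target keeps proxied hostnames from matching.
console.log(targetRegex.test("https://www.example.com/article/1")); // true
console.log(targetRegex.test("https://www.example.com.proxy.example.edu/article/1")); // false

// Hypothetical helper (not a Zotero function): among translators whose target
// matches the URL, return the one with the lowest priority number.
// An empty target string matches every URL.
function pickTranslator(translators, url) {
	const matching = translators.filter(
		t => t.target === "" || new RegExp(t.target).test(url)
	);
	matching.sort((a, b) => a.priority - b.priority);
	return matching.length ? matching[0] : null;
}

const translators = [
	header,
	{ label: "Generic Fallback", target: "", priority: 300 }
];
```

With these assumed values, a URL on the example site selects the site-specific translator, and any other URL falls through to the empty-target translator with the higher priority number.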
Line 58: Line 58:
  
   * **configOptions**   * **configOptions**
-    * **dataMode** \\ For [[dev:translators:coding#import_translators|import translators]], this sets the form in which the input data is presented to the translator. If set to "rdf/xml", Zotero will parse the input as XML and expose the data through the ''Zotero.RDF'' object.  If "xml/e4x", Zotero will expose the data through the function ''Zotero.getXML()''. Zotero does not natively support importing N3 representations of RDF. The values "block" and "line" are deprecated and no longer necessary in Zotero 2.1 and later.+    * **dataMode** \\ For [[dev:translators:coding#import_translators|import translators]], this sets the form in which the input data is presented to the translator. If set to "rdf/xml", Zotero will parse the input as XML and expose the data through the ''Zotero.RDF'' object.  If "xml/dom", Zotero will expose the data through the function ''Zotero.getXML()''. Zotero does not natively support importing N3 representations of RDF. The values "block" and "line" are deprecated and no longer necessary in [[dev:client_coding:changes_in_zotero_2.1|Zotero 2.1]] and later.
     * **getCollections** \\ For [[dev:translators:coding#export_translators|export translators]], set to ''true'' or ''false''. If ''true'', an export translator will have access to the collection names and can recreate them in the exported file.     * **getCollections** \\ For [[dev:translators:coding#export_translators|export translators]], set to ''true'' or ''false''. If ''true'', an export translator will have access to the collection names and can recreate them in the exported file.
   * **displayOptions**   * **displayOptions**
Line 75: Line 75:
  
   * **[[dev:translators:coding#web_translators|Web translators]]**   * **[[dev:translators:coding#web_translators|Web translators]]**
-    * //detectWeb// \\ After a web translator has been selected based by its matching target and its priority ranking, ''detectWeb'' is run to determine whether item metadata can indeed be retrieved from the webpage. Should return the detected item type (e.g. "journalArticle", see the [[http://gsl-nagoya-u.net/http/pub/csl-fields/index.html|overview of Zotero item types]]), or, if multiple items are found, "multiple". If ''detectWeb'' does not return a value, the translator with the next-highest priority is selected, until all translators with a matching target have been exhausted.+    * //detectWeb// \\ After a web translator has been selected based on its matching target and its priority ranking, ''detectWeb'' is run to determine whether item metadata can indeed be retrieved from the webpage. Should return the detected item type (e.g. "journalArticle", see the [[https://aurimasv.github.io/z2csl/typeMap.xml|overview of Zotero item types]]), or, if multiple items are found, "multiple". If ''detectWeb'' does not return a value, the translator with the next-highest priority is selected, until all translators with a matching target have been exhausted.
     * //doWeb// \\ Performs the actual item metadata retrieval.     * //doWeb// \\ Performs the actual item metadata retrieval.
   * **[[dev:translators:coding#import_translators|Import translators]]**   * **[[dev:translators:coding#import_translators|Import translators]]**
Line 84: Line 84:
   * **[[dev:translators:coding#search_translators|Search translators]]**   * **[[dev:translators:coding#search_translators|Search translators]]**
     * //detectSearch// \\ Determines whether the translator can look up item metadata. Should return ''true'' if it can, and ''false'' if it cannot.     * //detectSearch// \\ Determines whether the translator can look up item metadata. Should return ''true'' if it can, and ''false'' if it cannot.
-    * //doSearch// \\ Performs the actual look up.+    * //doSearch// \\ Performs the actual lookup.
  
 See [[dev:translators:coding|Translator Coding]] for a detailed description on how these functions can be coded. See [[dev:translators:coding|Translator Coding]] for a detailed description on how these functions can be coded.
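
As a rough sketch of the web translator functions described above, ''detectWeb'' can decide from the URL alone and ''doWeb'' can then build and save an item. The URL patterns are hypothetical, and the small ''Zotero'' object below is only a stand-in so the example runs outside Zotero; inside Zotero the real ''Zotero.Item'' is provided and the stand-in should be dropped.

```javascript
// Stand-in for the Zotero-provided object, for illustration only.
// It records completed items instead of saving them to a library.
const saved = [];
const Zotero = {
	Item: class {
		constructor(itemType) {
			this.itemType = itemType;
			this.creators = [];
		}
		complete() {
			saved.push(this);
		}
	}
};

// Lightweight detection: inspects only the URL and makes no HTTP requests.
// The "/search?" and "/article/<id>" paths are assumptions for this sketch.
function detectWeb(doc, url) {
	if (/\/search\?/.test(url)) {
		return "multiple";
	}
	if (/\/article\/\d+/.test(url)) {
		return "journalArticle";
	}
	// Returning a falsy value signals that this translator found nothing,
	// so the next matching translator can be tried.
	return false;
}

// Retrieves metadata for a single item; handling of "multiple" is omitted here.
function doWeb(doc, url) {
	const type = detectWeb(doc, url);
	if (type !== "journalArticle") {
		return;
	}
	const item = new Zotero.Item(type);
	item.title = doc.title; // reads from the page without modifying it
	item.url = url;
	item.complete();
}
```

Keeping ''detectWeb'' limited to cheap checks like these follows the recommendation that detect functions stay lightweight, while ''doWeb'' does the actual extraction only once the user chooses to save.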
Line 92: Line 92:
 The following tools can make coding Zotero translators easier: The following tools can make coding Zotero translators easier:
  
-  * [[:dev:translators:scaffold|Scaffold]] - Scaffold is a Firefox add-on developed by CHNM to create and modify web translators. Translators can be quickly [[:dev:translators:testing|tested]] and debugged, and item saving is simulated, so no changes are made to your Zotero library. +  * [[:dev:translators:scaffold|Scaffold]] - Scaffold is an IDE for translators built into Zotero (Tools -> Developer -> Translator Editor). Translators can be quickly [[:dev:translators:testing|tested]] and debugged, and item saving is simulated, so no changes are made to your Zotero library. 
-  * XPath Tools \\ Many web translators rely on [[dev:technologies#xpath|XPath]] to extract information from HTML or XML. XPath construction is made easier by using the [[http://dl.dropbox.com/u/848981/it/xp/xp.html|XCpath bookmarklet]] or one of the following Firefox add-ons: +  * Browser inspector - Web translators generally use [[https://developer.mozilla.org/en-US/docs/Web/API/Document/querySelector|querySelector]] and [[https://developer.mozilla.org/en-US/docs/Web/API/Document/querySelectorAll|querySelectorAll]] to extract content from web pages. Your browser likely provides an inspector tool to help you understand pages' structure. You can access it by right-clicking and selecting Inspect (Firefox) or Inspect Element (Chrome). 
-    * [[http://getfirebug.com/|Firebug]] - Useful to inspect the HTML structure of webpages, and [[dev:technologies#using_firebug_to_design_xpath_expressions|find XPath expressions for elements of interest]]. +  * XPath Tools - Many older web translators rely on XPath to extract information from HTML or XML. Various tools can assist with generating and checking XPath expressions, including the DevTools built into browsers. For example, in Firefox, you can get the XPath for any element by finding it in the browser's Inspector tool, right-clicking on the element, and choosing Copy -> XPath. 
-    * [[https://addons.mozilla.org/fr/firefox/addon/firepath/|FirePath]] - Extension to Firebug to edit, inspect and generate XPath expressions.+
 ===== Contributing Translators ===== ===== Contributing Translators =====
  
-If you created or modified a translator and wish to have it added to Zotero, or are looking for support on writing translators, please drop a note on the [[http://groups.google.com/group/zotero-dev|Zotero development mailing list]]. It is often best to post the source on a code sharing site like [[https://gist.github.com|Github]], so that it can be accessed easily. +If you created or modified a translator and wish to have it added to Zotero, or are looking for support on writing translators, please submit a pull request to the [[https://github.com/zotero/translators/|Zotero Translators GitHub repo]]. You can also ask questions about translator development on the [[http://groups.google.com/group/zotero-dev|Zotero development mailing list]].
- +
-There are two ways to submit translators for inclusion in Zotero: +
- +
-==== Submitting for Casual Coders ==== +
- +
-The easiest way to submit translators is post a link to your code on the [[http://groups.google.com/group/zotero-dev|Zotero development mailing list]] (do not paste the code directly into your message), using a code sharing website like http://gist.github.com. +
- +
-=== Using gist.github.com === +
- +
-Go to https://gist.github.com/ and copy and paste your translator code into the large text box. Enter the file name of the translator in the "name this file..." text box, and click the "Create Public Gist" button at the bottom of the page. Note that if you are using [[dev:translators:scaffold|Scaffold]] to develop translators, it is not sufficient to just copy the contents of the "Code" pane -- you will need to open the translator file, which you can find in the translators directory of your [[:zotero_data|Zotero data directory]]. +
- +
-Send a message to the [[http://groups.google.com/group/zotero-dev|Zotero development mailing list]]. In your message, ask for your translator to be uploaded to the repository, and provide the link to the Gist (copy the URL from the address bar, which should be in the form of ''<nowiki>https://gist.github.com/766801</nowiki>''. One of the Zotero developers or community members will review the code and add it to the repository. If you don't hear back within a week, feel free to post a reminder to the list.+
  
-==== Submitting for Frequent Contributors ====+To submit a pull request, fork the [[https://github.com/zotero/translators|Zotero Translator GitHub repository]], commit your changes (i.e., adding or modifying translator files), and create a [[http://help.github.com/pull-requests/|pull request]]. You can use your Git client of choice, but for new users we recommend [[http://www.syntevo.com/smartgit/index.html|SmartGit]], which is free for non-commercial purposes.
  
-While the process is slightly more complicated, you can also submit translators directly to the [[https://github.com/zotero/translators|Zotero translator repository]]. Fork the GitHub repository, commit your changes, and create a [[http://help.github.com/pull-requests/|pull request]]. You can use your Git client of choice, but for new users we recommend [[http://www.syntevo.com/smartgit/index.html|SmartGit]], which is free for non-commercial purposes.+When you submit a pull request on GitHub, your translator code will be reviewed, and you will receive comments from the Zotero developers or experienced volunteers. Once you've made any necessary changes, your translator will be added to the Zotero translator repository. 
  
 ==== Licensing ==== ==== Licensing ====
  
-Please note that contributed translators need to be licensed in a way that allows the Zotero project to distribute them and modify them. We encourage you to license new translators under the [[http://www.gnu.org/licenses/agpl.html|GNU Affero General Public License version 3]] (or later), which is the license used for Zotero. To do so, just add a license statement to the top of the file. Take a look a recently committed translator, like "Die Zeit.js", for an example of such a statement. +Please note that contributed translators need to be licensed in a way that allows the Zotero project to distribute them and modify them. We encourage you to license new translators under the [[http://www.gnu.org/licenses/agpl.html|GNU Affero General Public License version 3]] (or later), which is the license used for Zotero. To do so, just add a license statement to the top of the file. Take a look at a recently committed translator, like "Die Zeit.js", for an example of such a statement.
 ===== Recommendations for Translator Authors ===== ===== Recommendations for Translator Authors =====
 While there are no strict coding guidelines for translators, there are some general recommendations: While there are no strict coding guidelines for translators, there are some general recommendations:
Line 124: Line 112:
   - ''detectWeb'', ''detectImport'' and ''detectSearch'' should be coded to minimize the likelihood of the corresponding ''doWeb'', etc. function failing. Do your minimum required input checking in the detect functions -- a failing ''do'' function will cause user-visible errors.   - ''detectWeb'', ''detectImport'' and ''detectSearch'' should be coded to minimize the likelihood of the corresponding ''doWeb'', etc. function failing. Do your minimum required input checking in the detect functions -- a failing ''do'' function will cause user-visible errors.
   - Make detect functions lightweight-- they may be run on pages that a user is not even considering saving. Detect functions should not need to make additional HTTP requests. This obviously runs counter to the preceding point-- find a happy medium.   - Make detect functions lightweight-- they may be run on pages that a user is not even considering saving. Detect functions should not need to make additional HTTP requests. This obviously runs counter to the preceding point-- find a happy medium.
-  - Minimize HTTP requests. More HTTP requests slow down the user, cause undue load on servers, and in general are bad.+  - When translating the web page in the browser, do not modify any part of its DOM. 
 +  - Minimize HTTP requests. More HTTP requests slow down the user, cause undue load on servers, risk getting the user rate-limited or blocked, and in general are bad.
   - Don't leak user data. HTTP requests should in general not be directed to 3rd-party hosts.   - Don't leak user data. HTTP requests should in general not be directed to 3rd-party hosts.
   - Document your code. If there are input data deficiencies and the translator is working around them, document the deficiencies. If there are specific types of pages that a web translator is for, provide example URLs and expected output.   - Document your code. If there are input data deficiencies and the translator is working around them, document the deficiencies. If there are specific types of pages that a web translator is for, provide example URLs and expected output.
   - Produce [[dev:translators:testing|translator tests]] when possible, covering the basic page types that the translator is designed to support.   - Produce [[dev:translators:testing|translator tests]] when possible, covering the basic page types that the translator is designed to support.
 +  - Run ESLint on your code before submitting it. Zotero provides an ESLint plugin for translator development. You can run it on your translator within a clone of the ''zotero/translators'' repository: <code>npm ci && npm run lint -- "Your Translator.js"</code>
dev/translators.1418883870.txt.gz · Last modified: 2014/12/18 01:24 by zuphilip