Differences

This shows you the differences between two versions of the page.

dev:how_to_write_a_zotero_translator_plusplus [2010/07/27 17:29] – forklift tomrochewikidev:how_to_write_a_zotero_translator_plusplus [2017/11/19 19:24] (current) – Remove HWZT++ adamsmith
Line 1: Line 1:
-===== Chapter 0: Introduction =====
+**Note:** This page used to hold an updated version of Adam Crymble's [[http://niche-canada.org/member-projects/zotero-guide/chapter1.html|How to Write a Zotero Translator]]. The guidance included in both Adam's original guide and this update is too outdated to be useful, and translators based on it are no longer accepted. We instead recommend the following resources:
  
-[[http://niche-canada.org/member-projects/zotero-guide/chapter1.html|How to Write a Zotero Translator]] by Adam Crymble (aka //HWZT//) is still the best guide to writing a simple screen-scraping translator for Zotero. Unfortunately,
+  * The template code included in [[https://www.zotero.org/support/dev/translators/scaffold|Scaffold]], 
 +  * The code of [[https://github.com/zotero/translators/pulls|recently updated translators]],  
 +  * The documentation for writing Zotero translators by [[https://www.mediawiki.org/wiki/Citoid/Creating_Zotero_translators|Wikimedia]], the most comprehensive and up-to-date resource for writing "scrapers" as of November 2017, 
 +  * The [[https://www.zotero.org/support/dev/translators/coding|translator coding documentation]],  
 +  * Search for answers in the [[https://forums.zotero.org/discussions|Zotero Forums]] and [[https://github.com/zotero/translators/issues|on GitHub]].
  
-  * much has changed since HWZT was written, limiting its usability. 
-  * HWZT is not wikified, limiting its maintainability. 
  
-This page (aka //HWZT++//) updates and wikifies HWZT. (Note that HWZT++ is [[http://forums.zotero.org/discussion/12980?page=1#Item_11|adapted from 'How to Write a Zotero Translator' (2009) by Adam Crymble]].) For the moment, all but this chapter and Chapter 1 are merely a list of deltas to HWZT, organized by HWZT chapter: for each HWZT chapter, you must read it, then read the delta(s) here (if any), then execute appropriately. (Hopefully we will soon fully incorporate updated material from HWZT into HWZT++, thus eliminating the need to refer to both.) 
  
-===== Chapter 1: Introduction to Zotero Translators ===== 
- 
-//Note:// the following is adapted from [[http://niche-canada.org/member-projects/zotero-guide/chapter1.html|HWZT chapter 1 (Intro)]] 
- 
-==== Zotero ==== 
- 
-The citation management program Zotero is a wonderful tool for researchers everywhere. Citations from the web may be "grabbed" simply by clicking on an icon in your web browser address bar. The citation information displayed on the screen is then saved to your Zotero collection with little or no additional effort. However, for this to work, each and every website must either follow [[how_to_write_a_zotero_translator_plusplus#metadata_converters|standardized metadata guidelines]], or must have its own personal "translator" that tells Zotero which words on the screen correspond with which bibliographic fields. Computers are stupid; translators make them smart. 
- 
-Most users who know about the citation capture feature are enthralled by it and want more. The [[http://forums.zotero.org/categories/|Zotero forums]] receive multiple requests daily from users hoping their favourite site will be given this capability. Unfortunately, there just aren't enough Zotero programmers around to keep up with the demand for translators, and more intensive coding-projects take priority. 
- 
-Luckily, Zotero translators are fairly easy to create (as far as computer programming goes). This guide seeks to help take some of that load away from the Zotero staff by teaching the community of Zotero users how to create their own translators and to share them with others. 
- 
-==== Who is this guide for? ==== 
- 
-Anyone! No previous experience required! 
- 
-In fact, the guide will assume that you have no programming experience whatsoever. You just need to have spent some time using Zotero and grabbing citations. (Check out the [[http://www.zotero.org/static/videos/tour/zotero_tour.htm|demo video]] if you are not familiar with this feature). 
- 
-This guide uses plain language throughout and is written for people who are not programmers. You just need to be comfortable with computers, able to think logically, and not afraid to do some Google searching when you find a word or come across an error message you don't recognize. 
- 
-Everything will be explained with a new user in mind. Therefore, you might find that you do not need to read all sections. If you are unsure whether or not you should read a section or skip it, it is probably a good idea to jump ahead to the end of the chapter and read the "What you should understand before moving on." If in doubt, it is probably best to read the chapter and refresh your knowledge. Skipping too much background will just leave you frustrated when you start coding. 
- 
-If you fit into one of the following categories, chances are you are a great candidate for writing a translator: 
- 
-  * Website administrator of a searchable database 
-  * Librarian or archivist 
-  * Researcher or journalist 
-  * Graduate student 
-  * Someone who wants to learn JavaScript 
- 
-==== What you will learn ==== 
- 
-When you are finished with this guide, you should not only know enough to create your own working Zotero translator, but you should understand the following concepts and computer languages: 
- 
-  * Basic HTML 
-  * the Document Object Model (DOM) 
-  * XPaths 
-  * JavaScript Regular Expressions (RegExp) 
-  * Basic JavaScript 
- 
-Translators are written in a computer language called "JavaScript" so your work will involve learning to do some basic JavaScript programming. 
- 
-You will also learn how to use a number of programs (all free and reputable). Among these are Firefox, Zotero, Scaffold, Komodo Edit, DOM Inspector, and XPather. 
- 
-You will not learn how to embed JavaScript into HTML documents or to do DOM scripting; however, after learning how to write a Zotero translator you will be well on your way to understanding these concepts. 
- 
-==== Benefits of Writing a Translator ==== 
- 
-If you are a web administrator or work for a company that maintains a searchable database, having a Zotero translator will increase your site's usability. Zotero has over a million users, many of whom judge a website's usability in part by whether or not they can automatically download citations. Adding this capability sends a message to your users that you believe their experience while using your site is important. 
- 
-[[dev/translator_overview#step_5contribute_your_translator|Contributing your own translator]] is a much more proactive way to get your site included in Zotero than submitting a request on the Zotero forums. 
- 
-If you are an end user rather than an administrator, writing your own translator allows you to customize it to your exact needs. If you only want Zotero to save the title and a copy of a pdf from a website, you can set your translator to do this. If you want all possible information from a site, you can arrange this as well. 
- 
-If the website in question is a rather obscure database that may be password protected, you will have to submit your own translator. This is because, even if a Zotero programmer has time to work on your request, without access to the database he or she is powerless to write the translator. 
- 
-Finally, Zotero is an open-source, freeware project, and adding to the software is in the spirit of its creation. Once you've finished a new translator you can submit it to the Zotero team so that everyone can benefit from the fruits of your labour. 
- 
-==== The Three Major Types of Translators ==== 
- 
-  * Scrapers 
-  * Metadata Converters 
-  * Exporters 
- 
-==== Scrapers ==== 
- 
-This guide will teach you how to create a "Scraper." 
- 
-The advantage of a Scraper is that it is the only kind of translator that can be used on any website. A scraper takes (scrapes) words off the webpage and tells Zotero which words correspond to which part of the citation. It's sort of like cutting and pasting in a text document, but by using code rather than keystrokes. 
- 
-Another advantage of a scraper is that you can tailor it to your exact needs by choosing to gather all, some, or very little of the information available. Scrapers are easy to learn and to make. 
- 
-The disadvantage of a scraper is that it relies heavily on format and the consistency of the webpage's creator. If a Webmaster decides to change the structure of a web page even slightly, you will have to alter your scraper to reflect this. However, these changes happen infrequently and once released to all Zotero users, you may find that others take an interest in your translator and help to keep it up to date when needed. 
- 
-The other two types of translators are powerful and accurate, but are only possible to use under certain conditions that are almost always out of your control unless you are the website's administrator. We will look at how these work, but our focus will be on scrapers. 
- 
-==== Metadata Converters ==== 
- 
-These translators take information that a Webmaster has voluntarily embedded in a webpage, known as metadata, and organize it into the correct Zotero fields. You can think of metadata as invisible ink that only appears if you know how to find it. Obviously the catch here is that the Webmaster must have included this information in the first place. 
- 
-The practice of including metadata is becoming more common, especially in databases. In the past couple of years, how people display metadata has become more standardized. Because of this standardization, Zotero already supports most sites that have it. Some of the most commonly used metadata conventions are: 
- 
-  * [[http://dublincore.org/|Dublin Core Metadata Initiative]] (aka "DC") 
-  * [[http://unapi.info/|unAPI]] 
-  * [[http://ocoins.info/|COinS]] 
-  * [[http://research.talis.com/2005/erdf/wiki/Main/RdfInHtml|Embedded RDF]] 
- 
-If you are a website administrator and want your site to automatically be Zotero compliant, it is best to use one of these systems rather than writing a translator; they are standardized and reliable. Let me repeat that: //if you are a website administrator and can include standardized metadata in your site, stop reading this guide and add the metadata!// If you are not a website administrator, you might make this recommendation to the site in question. 
- 
-==== Exporters ==== 
- 
-Exporters also rely on a website providing certain information. In this case, we need a link that allows us to download a citation. For the most part, there are very few export formats. You may have come across them before, labeled as "MARC display" buttons, or a "RefWorks" button. These options are most common in library catalogues and on academic journal databases. This type of translator actually uses two translators, one embedded in the other. This can get quite complicated so we will not cover it in detail in this guide. However, if you do need to write one of these and need a few hints to get started, here are a few (rather technical) pointers. If you do not need to write an exporter please feel free to ignore this section. 
- 
-  - Use an XPath to grab the link URL of the citation download. 
-  - Use an HTTP GET request to download the page found at that URL. 
-  - Call the translator for that type of citation (e.g., MARC, BibTeX, etc.) to interpret the citation. 
-  - Save results into Zotero. 
-  - If you are lost, check one of the many Library Catalogue translators by launching Scaffold and loading the translator code. 
- 
-You do not need to understand the latter two types of translators or the contents of the paragraph above to be successful at writing scrapers. 
- 
-Note: from this point forward, we will be using the word "Translator" exclusively when referring to "scrapers." Most of the terminology used to explain how to write a scraper is the same as would be used to explain how to write any other translator. The other types of translators will be referenced explicitly if used. 
- 
-==== Before we begin ==== 
- 
-As with anything new and computer-related, it is important that you back up your entire computer before you start. Coding is generally a safe practice, but you never know when something could go wrong and wipe out your hard work! Better to back up now than be sorry later. 
- 
-There are many short cuts available when writing JavaScript and experienced programmers may tell you to use these. You are free to learn the short cuts if you wish; however, this guide will not use them, for simplicity's sake. 
- 
-This is NOT a guide detailing how to use Zotero. It is a guide detailing how to write code to extend the usefulness of Zotero. 
- 
-===== Chapter 2: General Troubleshooting Guidelines ===== 
- 
-//Note:// the following is adapted from [[http://niche-canada.org/member-projects/zotero-guide/chapter2.html|HWZT chapter 2 (Intro)]] 
- 
-Before we start, you should be aware: you will get frustrated — at least once. Here are a few tips to help you solve your problems. 
- 
-==== Search Engines ==== 
- 
-If you run into difficulties when writing computer code the great news is: the answer to almost any problem can be found online. All computer programmers have needed help at one time or another, and given their love for computers, most sought that help online. Lucky for you, that means that most of the questions they asked — and the subsequent answers — are still floating around the internet. 
- 
-This means the internet is often your best resource for finding help. If you run into a problem, the first thing you should do is type your problem into a search engine. More often than not someone has already asked your exact question, and someone else has provided an answer. You might even find entire websites dedicated to solving your particular problem. As far as coding goes, Zotero translators are quite basic; you will not come across a problem when writing a Zotero translator that no one has encountered before. 
- 
-Likewise, if you encounter an error message you don't understand, cut and paste that error message into a search engine and surround it with quotation marks. You will likely find dozens of explanations why this error appeared and how to fix it. 
- 
-The more specific you can be about your problem, the better the results you will find. Don't be discouraged if you don't find the answer on your first search. Rephrase the search terms and try again. 
- 
-==== Online Tutorials ==== 
- 
-Your second best option is [[http://www.w3schools.com/|W3Schools]] tutorials. W3Schools has step-by-step tutorials for nearly every internet-related programming language. Particularly helpful for this project are their: 
- 
-  * [[http://www.w3schools.com/html/default.asp|HTML tutorial]] 
-  * [[http://www.w3schools.com/htmldom/default.asp|HTML DOM tutorial]] 
-  * [[http://www.w3schools.com/js/default.asp|JavaScript tutorial]] 
- 
-At W3Schools, you can find great reference charts that will show you at a glance all the different capabilities of JavaScript and HTML. These will come in handy when you want to do something and can't remember how. 
- 
-Apart from W3Schools, you can find many other tutorials online. Try typing in what you want to learn into a search engine and you will likely find a tutorial. Keep in mind that many tutorials teach you how to accomplish a specific task and may not teach you exactly what you're looking for. 
- 
-==== Forums ==== 
- 
-If you've Googled it, Yahoo'd it, looked it up on the W3Schools reference charts and tried various combinations of teas, coffees, and energy drinks to no avail, you're going to need to ask for help. There are numerous internet forums to which you can turn for this; just find a forum you like. Here are a couple to get you started: 
- 
-=== Webdeveloper.com === 
- 
-The [[http://www.webdeveloper.com/forum/index.php?|JavaScript forum at WebDeveloper.com]] is excellent, especially for code-related questions. 
- 
-If you can't figure out why you are getting a particular error message, or why you can't get information from point A to point B, this is the forum for you. At any given time there are over one hundred people logged into the forum just waiting to answer your question. If you post your problem here in a courteous manner, with a little bit of luck you will have a solution within a couple of hours. 
- 
-It may not be the instant gratification we've come to expect, but don't forget, these people are volunteering to help you, and most probably if you're desperate enough to ask for help, you could use a few hours away from the keyboard anyway. 
- 
-=== Zotero Forums === 
- 
-If your question is something specific to Zotero, such as "why can't I put anything in the field Loc. in archive?" the helpful men and women at WebDeveloper.com will have no idea how to answer your question. Instead, post it to the [[http://forums.zotero.org/categories/|Zotero forums]]. Don't expect an answer as quickly as you would get on a more popular forum — there are only so many Zotero programmers to go around. You should get an answer in a couple of days as long as you're clear in your description of your problem. 
- 
-==== Asking Good Questions ==== 
- 
-Clarity and specificity are your friends when it comes to asking for help on a forum. The people who read forums and offer their expertise are busy; make it easy for them by carefully thinking out your problem before you ask. Likewise, make sure you are asking a specific question to a narrowly defined problem. 
- 
-For example, don't post something like: "Why won't my translator work?" 
- 
-Instead, try: "Why am I getting a syntax error when I try to [[http://niche-canada.org/member-projects/zotero-guide/chapter6.html#pushExplanation|Push a value into an Object]]?" 
- 
-Always post the relevant section of your code (and only the relevant section of your code) along with your question. This will make it easier for the experts to help you solve your problem. If the answer you get does not do the trick and you are still stuck, be polite and try rephrasing the question. Remember, don't bite the hand that feeds you; these are volunteers and they're trying to help you! 
- 
-==== Debugging ==== 
- 
-To help you ask good questions you'll learn how to use the ''Zotero.debug()'' method in [[how_to_write_a_zotero_translator_plusplus#chapter_6js_variables|Chapter 6]]. This will let you figure out exactly which part of your code is not working. Until then, keep the following in mind: 
- 
-When fixing problem code, only change one thing at a time. If you are working on a section of code that has several issues, fix one problem and retry the code before moving on. Sometimes if you make three or four changes before retrying the program, you will inadvertently cause another unexpected problem. This can make you think your fix was incorrect, when in fact only the last change you made was wrong. Change one thing and make sure it works before moving on and you will prevent a lot of confusion. 
- 
-===== Chapter 3: Required Software ===== 
- 
-[[http://niche-canada.org/member-projects/zotero-guide/chapter3.html|HWZT chapter 3 (Required Software)]]: deltas are 
- 
-  - Scaffold: don't get Scaffold 1.0 from the link in HWZT; get Scaffold 2.0 [[http://bitbucket.org/rmzelle/scaffold/downloads|here]] (temporarily). 
-  - Solvent: also outdated, so instead get the following current Firefox add-ons (from either the [[https://addons.mozilla.org/en-US/firefox/|official Firefox add-ons repository]] or the project links below): 
-    * XPather: download [[http://xpath.alephzarro.com/download|here]], documentation [[http://xpath.alephzarro.com/documentation|here]]. XPather can stand alone, but works better in combination with ... 
-    * DOM Inspector: download [[https://developer.mozilla.org/en/dom_inspector#Getting_DOM_Inspector|here]], documentation [[https://developer.mozilla.org/en/dom_inspector#Documentation|here]] 
-Install all 3, then restart Firefox. 
- 
-===== Chapter 4: DOM & HTML ===== 
- 
-[[http://niche-canada.org/member-projects/zotero-guide/chapter4.html|HWZT chapter 4 (DOM & HTML)]]: information only, appears up to date, no deltas. 
- 
-===== Chapter 5: XPath directions ===== 
- 
-[[http://niche-canada.org/member-projects/zotero-guide/chapter5.html|HWZT chapter 5 (XPath directions)]]: 
- 
-The {DOM Inspector + XPather} workflow differs from that of Solvent. After opening the [[http://niche-canada.org/member-projects/zotero-guide/sample1.html|first sample page]], 
-  - Open DOM Inspector (aka //DI//) with C-S-i or from the Firefox main menu with Tools>DOM Inspector. XPather functionality is available from UI within the DI window. 
-  - Hit button=Inspect at the upper right of the DI window. This will open pane=Browser in the DI window displaying the contents of the first sample page. 
-  - To test the XPath string denoting the heading (text="Method and Meaning in Canadian Environmental History") of the first sample page, 
-    * in textbox=XPath in the DI window, type <code>//h1</code> 
-    * hit button=Eval next to the text box. This will popup dialog="XPath Browser" showing the matches for your string. (In this case, there should be only 1 match.) 
-    * Move dialog="XPath Browser" so you can see both it and the pane=Browser of the DI window. 
-    * In table="Matching Nodes" in dialog="XPath Browser", click on the row representing the match. In the pane=Browser of the DI window, you should briefly see a flashing red border around the heading. 
-  
-===== Chapter 6: JS Variables ===== 
- 
-[[http://niche-canada.org/member-projects/zotero-guide/chapter6.html|HWZT chapter 6 (JS Variables)]]: 
- 
-A few things have changed since HWZT. To use its first Scaffold example as an example: 
- 
-  - Ensure the [[http://niche-canada.org/member-projects/zotero-guide/sample1.html|first sample page]] is open in your browser and has focus. 
-  - Open Scaffold 2.0 from the Firefox main menu with Tools>Scaffold. This should popup dialog="Zotero Scaffold". 
-  - If you are not already in tab=Metadata, select that. Enter some text in the Label and Creator fields. 
-  - The URI of the sample page has changed since HWZT, so you will need to enter Target=<code>http://niche-canada.org/member-projects/zotero-guide/</code> 
-  - Hit button="Test Regex". You should get a result, in the "Test Frame" on the right of the tab, similar to that described in HWZT. 
-  - Instead of <code>Click on the "Detect Code" tab</code>, click on tab=Code. As directed, in that tab enter the expressions <code>var myVariable=4; 
-Zotero.debug(myVariable); 
-</code>  
-  - Click on icon="Run doWeb" (a stylized thunderbolt) to obtain an evaluation like <code>12:00:00 ===>4<===(number)</code> 
-  - To also obtain an evaluation like <code>12:00:00 detectWeb returned type "undefined"</code>, you will need to also click on icon="Run detectWeb" (the eye next to the thunderbolt). 
- 
-The subsequent examples behave similarly. 
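
The Target field used in the steps above is treated as a regular expression that is tested against the page URL, which is what the "Test Regex" button checks. A minimal sketch of that matching in plain JavaScript, assuming a hypothetical helper named ''targetMatches'' (it is not part of Zotero or Scaffold) and an illustrative non-matching URL:

```javascript
// Sketch only: shows how a Target string behaves when used as a regular
// expression against a URL. `targetMatches` is a hypothetical helper,
// not a Zotero or Scaffold function.
function targetMatches(target, url) {
  // Build a RegExp from the Target text and test it against the URL.
  return new RegExp(target).test(url);
}

var target = "http://niche-canada.org/member-projects/zotero-guide/";
var sampleUrl = "http://niche-canada.org/member-projects/zotero-guide/sample1.html";

console.log(targetMatches(target, sampleUrl));             // true
console.log(targetMatches(target, "http://example.org/")); // false
```

Because the Target is a pattern rather than an exact address, a prefix such as the guide's directory matches every sample page beneath it.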
- 
-===== Chapter 7: JS Methods & Math ===== 
- 
-[[http://niche-canada.org/member-projects/zotero-guide/chapter7.html|HWZT chapter 7 (JS Methods & Math)]]: the examples all work in Scaffold 2.0, no deltas (other than those described in the delta for chapter 6 above). 
- 
-===== Chapter 8: JS If Statements ===== 
- 
-[[http://niche-canada.org/member-projects/zotero-guide/chapter8.html|HWZT chapter 8 (JS If Statements)]]: the examples all work in Scaffold 2.0, no deltas (other than those described in the delta for chapter 6 above). 
- 
-===== Chapter 9: JS Loops ===== 
- 
-[[http://niche-canada.org/member-projects/zotero-guide/chapter9.html|HWZT chapter 9 (JS Loops)]]: the examples all work in Scaffold 2.0, no deltas (other than those described in the delta for chapter 6 above). 
- 
-===== Chapter 10: JS Functions ===== 
- 
-[[http://niche-canada.org/member-projects/zotero-guide/chapter10.html|HWZT chapter 10 (JS Functions)]]: the examples all work in Scaffold 2.0, no deltas (other than those described in the delta for chapter 6 above). 
- 
-===== Chapter 11: XPath containers ===== 
- 
-[[http://niche-canada.org/member-projects/zotero-guide/chapter11.html|HWZT chapter 11 (XPath containers)]]:  
- 
-Again, a few changes since HWZT, and again, using its first Scaffold example as an example: 
- 
-  - Close any running Scaffold 2.0 instances. 
-  - Ensure the [[http://niche-canada.org/member-projects/zotero-guide/sample1.html|first sample page]] is open in your browser and has focus. 
-  - Open Scaffold 2.0 from the Firefox main menu with Tools>Scaffold. This should popup dialog="Zotero Scaffold". 
-  - If you are not already in tab=Metadata, select that. Enter some text in the Label and Creator fields. 
-  - The URI of the sample page has changed since HWZT, so you will need to enter Target=<code>http://niche-canada.org/member-projects/zotero-guide/</code> 
-  - Hit button="Test Regex". You should get a result, in the "Test Frame" on the right of the tab, similar to that described in HWZT. 
-  - Instead of <code>Click on the "Detect Code" tab</code>, click on tab=Code.  
-  - In that tab enter <code>function detectWeb(doc, url) { 
-  var namespace = doc.documentElement.namespaceURI; 
-  var nsResolver = namespace ? function(prefix) { 
-    if (prefix == "x" ) return namespace; else return null; 
-    } : null; 
-  var myXPath = '//td[1]'; 
-  var myXPathObject =  
-    doc.evaluate(myXPath, doc, nsResolver, XPathResult.ANY_TYPE, null).iterateNext().textContent; 
-  Zotero.debug(myXPathObject); 
-}</code>  
-  - Click on icon="Run detectWeb" (the eye): you should get results like <code>12:00:00 Title:</code> 
- 
-The code for the second complete Scaffold example (from "Example 11.10") is similarly 
-<code>function detectWeb(doc, url) { 
-  var namespace = doc.documentElement.namespaceURI; 
-  var nsResolver = namespace ? function(prefix) { 
-    if (prefix == "x" ) return namespace; else return null; 
-    } : null; 
-  var myXPath = '//div[@id="Content"]/div/table[@class="Bibrec"]/tbody/tr/td[1][@class="Label"]'; 
-  var myXPathObject = doc.evaluate(myXPath, doc, nsResolver, XPathResult.ANY_TYPE, null); 
-  var items = new Object(); 
-  var headers; 
-  while (headers = myXPathObject.iterateNext()) { 
-    items[headers.textContent]=''; 
-  } 
-  Zotero.debug(items); 
-} 
-</code> Click on icon="Run detectWeb" (the eye): you should get results like <code>12:00:00 'Title:' => "" 
-  'PrincipalAuthor:' => "" 
-  'Imprint:' => "" 
-  'Subjects:' => "" 
-  '' => "" 
-  'ISBN-10:' => "" 
-  'Collection:' => "" 
-  'Pages:' => "" 
-</code> 
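
The loop in the example above can be tried outside a browser as well. The following is a rough sketch, assuming a hypothetical stand-in called ''fakeXPathResult'' that mimics only the ''iterateNext()'' behaviour of the object returned by ''doc.evaluate()''; the label strings are a subset of the output shown above, and ''collectLabels'' is likewise an illustrative name:

```javascript
// Hypothetical stand-in for an XPath result: it exposes iterateNext() like
// the object returned by doc.evaluate(), but walks a plain array of strings
// instead of real DOM nodes.
function fakeXPathResult(texts) {
  var index = 0;
  return {
    iterateNext: function () {
      // Return the next "node", or null once the array is exhausted.
      return index < texts.length ? { textContent: texts[index++] } : null;
    }
  };
}

// Same pattern as the tutorial code: keep calling iterateNext() until it
// returns null, using each node's text as a key with an empty value.
function collectLabels(xpathResult) {
  var items = {};
  var node;
  while ((node = xpathResult.iterateNext())) {
    items[node.textContent] = '';
  }
  return items;
}

var labels = collectLabels(fakeXPathResult(['Title:', 'Imprint:', 'Pages:']));
console.log(Object.keys(labels));
```

The assignment inside the ''while'' condition is what ends the loop: once no nodes remain, ''iterateNext()'' returns ''null'', which is falsy.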
- 
-===== Chapter 12: Regexps ===== 
- 
-[[http://niche-canada.org/member-projects/zotero-guide/chapter12.html|HWZT chapter 12 (Regexps)]]: the example works in Scaffold 2.0 with no deltas (other than those described in chapter 11 above). To test, open Scaffold 2.0 on any page (e.g. the directions for opening for chapter 11 will work) and enter the code <code>function detectWeb(doc, url) { 
-  var x = "                     346                       ";
-  x = x.replace(/^\s*|\s*$/g, ''); 
-  Zotero.debug(x);  
-} 
-</code>. Click on icon="Run detectWeb" (the eye): in the Test Frame you should get results like <code>12:00:00 346</code> 
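
For readers who want to check the trimming expression without opening Scaffold, here is a small standalone sketch. The function name ''trimWhitespace'' is an assumption made for illustration; the pattern is a close variant of the one above (using ''+'' instead of ''*'' so that it never matches an empty string), and the comparison with the built-in ''trim()'' is added only as a cross-check:

```javascript
// Remove leading and trailing whitespace with a regular expression.
// ^\s+ matches whitespace at the start, \s+$ matches it at the end,
// and the g flag lets both ends be replaced in one call.
function trimWhitespace(text) {
  return text.replace(/^\s+|\s+$/g, '');
}

var padded = "          346          ";
var trimmed = trimWhitespace(padded);

console.log("[" + trimmed + "]");        // [346]
console.log(trimmed === padded.trim());  // true
```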
- 
-===== Chapter 13: Metadata Tab ===== 
- 
-[[http://niche-canada.org/member-projects/zotero-guide/chapter13.html|HWZT chapter 13 (Metadata Tab)]]: mostly informational, all works in Scaffold 2.0 with no deltas (other than those previously described). However the next chapter (14, from which comes the following quote) "builds upon the translator started in [this chapter]" and assumes that, having "already created [this] translator you can find it again by opening Scaffold and clicking on [icon=Load From Database]." There are two problems with this statement, in descending seriousness: 
- 
-  - this chapter (13) does not describe saving the tutorial translator 
-  - the icon's title is merely "Load" in Scaffold 2.0 
- 
-So, to create and save the tutorial translator, do 
- 
-  - Close any running Scaffold 2.0 instances. 
-  - Ensure the [[http://niche-canada.org/member-projects/zotero-guide/sample1.html|first sample page]] is open in your browser and has focus. 
-  - Open Scaffold 2.0 from the Firefox main menu with Tools>Scaffold. This should popup dialog="Zotero Scaffold". 
-  - If you are not already in tab=Metadata, select that. 
-  - In field=Label, enter <code>How to Write a Zotero Translator</code> 
-  - In field=Creator, enter your name 
-  - In field=Target, enter <code>/member-projects/zotero-guide</code> 
-  - Scaffold 2.0 will not save a translator without code (and it will silently refuse to do so), so switch to tab=Code, and enter <code>function detectWeb(doc, url) { 
-  var x = "                     346                       ";
-  x = x.replace(/^\s*|\s*$/g, ''); 
-  Zotero.debug(x);  
-} 
-</code> (though any working code should do). 
-  - Click icon=Save (second from left, looks like some tabs): your translator should save silently. 
- 
-To test your translator saved properly, 
- 
-  - Close any running Scaffold 2.0 instances. 
-  - Ensure the [[http://niche-canada.org/member-projects/zotero-guide/sample1.html|first sample page]] is open in your browser and has focus. 
-  - Open Scaffold 2.0 from the Firefox main menu with Tools>Scaffold. This should popup dialog="Zotero Scaffold" with little data in tab=Metadata. 
-  - Hit icon=Load (the non-OS UI item closest to the upper left of the dialog) to popup dialog="Load Translator". 
-  - Scroll through the "Load Translator" table until you see Label=<code>How to Write a Zotero Translator</code> The labels are in lexicographic order (after the first few), so expect to see it between label="History Cooperative" and label="Hurricane Digital Memory Bank". 
- 
-===== Chapter 14: DetectWeb Tab ===== 
- 
-[[http://niche-canada.org/member-projects/zotero-guide/chapter14.html|HWZT chapter 14 (DetectWeb Tab)]]: the tutorial example works in Scaffold 2.0 and Firefox 3.5, with two minor exceptions:
- 
-  - HWZT says "Save your entry and click Execute": instead click icon="Run detectWeb" (the eye) 
-  - HWZT says "If you would like to see the Icon in the address bar [of the first sample page], you will likely have to relaunch Firefox": in fact, if you have saved your entry, you need only reload the page or tab. 
- 
-As HWZT has gotten increasingly casual, here are some step-by-step instructions for this chapter's tutorial section: 
- 
-  - Close any running Scaffold 2.0 instances. 
-  - Ensure the [[http://niche-canada.org/member-projects/zotero-guide/sample1.html|first sample page]] is open in your browser and has focus. 
-  - Open Scaffold 2.0 from the Firefox main menu with Tools>Scaffold. This should popup dialog="Zotero Scaffold". 
-  - Hit icon=Load (the non-OS UI item closest to the upper left of the dialog). This should popup dialog="Load Translator" displaying a single table mapping "Label" to "Creator". 
-  - Scroll through the "Load Translator" table until you see Label=<code>How to Write a Zotero Translator</code>, then hit button=OK. 
-  - You will return to the main dialog="Zotero Scaffold". Check to see that tab=Metadata is properly populated. 
-  - Switch to tab=Code and enter the following:<code>function detectWeb(doc, url) { 
-  if (doc.title.match("Single Item")) { 
-    return "book"; 
-  } else if (doc.title.match("Search Results")) { 
-    return "multiple"; 
-  } 
-} 
-</code> 
-  - Click icon="Run detectWeb" (the eye): in the Test Frame you should get results like <code>12:00:00 detectWeb returned type "book"</code> 
-  - Click icon=Save (second from left): your translator should save silently. 
-  - Return focus to the page or tab containing the [[http://niche-canada.org/member-projects/zotero-guide/sample1.html|first sample page]] and refresh it: you should see the Zotero book icon in the location field of your Firefox. 
- 
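The title-matching logic of <code>detectWeb</code> can also be checked outside Scaffold. The following is a minimal sketch, assuming plain JavaScript run outside Zotero: the objects <code>singlePage</code>, <code>searchPage</code> and <code>otherPage</code>, and their titles, are hypothetical stand-ins for real documents and are not part of HWZT or the Zotero API.

```javascript
// detectWeb exactly as entered in the tutorial steps above.
function detectWeb(doc, url) {
  if (doc.title.match("Single Item")) {
    return "book";
  } else if (doc.title.match("Search Results")) {
    return "multiple";
  }
}

// Hypothetical stand-ins for real pages: detectWeb only reads doc.title.
var singlePage = { title: "Sample Single Item Page" };
var searchPage = { title: "Sample Search Results Page" };
var otherPage = { title: "About Us" };

console.log(detectWeb(singlePage, "http://example.org/item"));   // "book"
console.log(detectWeb(searchPage, "http://example.org/search")); // "multiple"
console.log(detectWeb(otherPage, "http://example.org/about"));   // undefined
```

When neither title pattern matches, the function falls through without a return statement and yields <code>undefined</code>, which is why no icon appears on unrelated pages.
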
-===== Chapter 15: Scraping the Search Results Page ===== 
- 
-[[http://niche-canada.org/member-projects/zotero-guide/chapter15.html|HWZT chapter 15 (Scraping the Search Results Page)]]: Again, the tutorial example works, but is rather casually presented (notably, it requires a different sample page and really a separate translator), so here are some step-by-step instructions. Basically, we will 
- 
-  - rename and save the original translator 
-  - create the stub for a second translator 
-  - populate, test, and save the second translator 
- 
-To rename the original translator, 
- 
-  - Close any running Scaffold 2.0 instances. 
-  - Ensure the [[http://niche-canada.org/member-projects/zotero-guide/sample1.html|first sample page]] is open in your browser and has focus. 
-  - Open Scaffold 2.0 from the Firefox main menu with Tools>Scaffold. This should pop up dialog="Zotero Scaffold". 
-  - Hit icon=Load (at the upper left of the dialog). This should pop up dialog="Load Translator" displaying a single table mapping "Label" to "Creator". 
-  - Scroll through the "Load Translator" table until you see Label=<code>How to Write a Zotero Translator</code>, then hit button=OK. 
-  - You will return to the main dialog="Zotero Scaffold". Check to see that tab=Metadata is properly populated. 
-  - Append to field=Label " (single result)" so that the field now says <code>How to Write a Zotero Translator (single result)</code>. Click icon=Save (second from left): your translator should save silently. 
-  - Click icon="Run detectWeb" (the eye) to ensure the code still works: in the Test Frame you should get results like <code>12:00:00 detectWeb returned type "book"</code> 
-  - Leave the current instance of Scaffold 2.0 open, since we'll use the same translator in the next section. 
- 
-To create the stub for a second translator,  
- 
-  - Switch to tab=Metadata. 
-  - Click button=Generate next to field="Translator ID": the Translator ID value should change. 
-  - Change the contents of field=Label from <code>How to Write a Zotero Translator (single result)</code> to <code>How to Write a Zotero Translator (search results)</code>. 
-  - Click icon=Save (second from left): your translator should save silently. 
-  - Close all running Scaffold 2.0 instances. 
- 
-To populate, test, and save the second translator, 
- 
-  - Ensure the [[http://niche-canada.org/member-projects/zotero-guide/searchresults1.html|first sample search page]] is open in your browser and has focus. 
-  - Open Scaffold 2.0 from the Firefox main menu with Tools>Scaffold. This should pop up dialog="Zotero Scaffold". 
-  - Hit icon=Load (upper left of the dialog) to pop up dialog="Load Translator". 
-  - Scroll through the "Load Translator" table until you see Label=<code>How to Write a Zotero Translator (search results)</code>, then hit button=OK. 
-  - You will return to the main dialog="Zotero Scaffold". Check to see that tab=Metadata is properly populated. 
-  - Click icon="Run detectWeb" (the eye): in the Test Frame you should get results like <code>12:00:00 detectWeb returned type "multiple"</code> 
-  - Switch to tab=Code and enter the following code (actually, you're just appending <code>function doWeb</code>):<code>function detectWeb(doc, url) { 
-  if (doc.title.match("Single Item")) { 
-    return "book"; 
-  } else if (doc.title.match("Search Results")) { 
-    return "multiple"; 
-  } 
-} 
-function doWeb(doc, url) { 
-  var namespace = doc.documentElement.namespaceURI; 
-  var nsResolver = namespace ? function(prefix) { 
-    if (prefix == 'x') return namespace; else return null; 
-  } : null; 
-  var articles = new Array(); 
-  var items = new Object(); 
-  var nextTitle; 
-  if (detectWeb(doc, url) == "multiple") { 
-    var titles = doc.evaluate('//td[2]/a', doc, nsResolver, XPathResult.ANY_TYPE, null); 
-    while (nextTitle = titles.iterateNext()) { 
-      items[nextTitle.href] = nextTitle.textContent; 
-    } 
-    items = Zotero.selectItems(items); 
-    for (var i in items) { 
-      articles.push(i); 
-    } 
-  } else { 
-    articles = [url]; 
-  } 
-  Zotero.Utilities.processDocuments(articles, scrape, function(){Zotero.done();}); 
-  Zotero.wait(); 
-} 
-</code> 
-  - Click icon=Save (second from left): your translator should save silently. 
-  - Click icon="Run doWeb" (the thunderbolt): a dialog="Select Items" should pop up, with a selection area containing 10 items, corresponding to the 10 items in the [[http://niche-canada.org/member-projects/zotero-guide/searchresults1.html|sample search page]]. Check to see that the titles of the items in the dialog match the titles of the items on the sample search page, then click button=Cancel on dialog="Select Items". 
-  - Click icon=Save (second from left): your translator should save silently. 
-  - Close all running Scaffold 2.0 instances. 
-  - Return focus to the page or tab containing the [[http://niche-canada.org/member-projects/zotero-guide/searchresults1.html|sample search page]] and refresh it: you should see the Zotero folder icon in the location field of your Firefox. 
- 
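The search-results branch of <code>doWeb</code> does two things: it builds a map from each result link's address to its title, and it turns whatever the user selects into a list of URLs to process. The following is a minimal sketch of those two steps in plain JavaScript, assuming ordinary objects in place of DOM nodes. The helpers <code>collectLinks</code> and <code>selectedUrls</code>, the sample data, and the simulated selection are hypothetical and are not part of HWZT or the Zotero API.

```javascript
// Hypothetical helper: build the { href: title } map that doWeb hands to
// the item-selection dialog. Each entry only needs href and textContent.
function collectLinks(anchors) {
  var items = {};
  for (var i = 0; i < anchors.length; i++) {
    items[anchors[i].href] = anchors[i].textContent;
  }
  return items;
}

// Hypothetical helper: convert the map returned by the selection dialog
// into the list of URLs to be scraped.
function selectedUrls(selected) {
  var urls = [];
  for (var href in selected) {
    urls.push(href);
  }
  return urls;
}

// Stand-in data for two result links (assumed, not taken from the sample page).
var anchors = [
  { href: "http://example.org/item1.html", textContent: "First Title" },
  { href: "http://example.org/item2.html", textContent: "Second Title" }
];
var items = collectLinks(anchors);

// Simulate the user ticking only the second entry in the dialog.
var chosen = {
  "http://example.org/item2.html": items["http://example.org/item2.html"]
};
var articles = selectedUrls(chosen);
console.log(articles);
```

Keying the map by URL is what allows the selection result to be converted back into a URL list without consulting the page again.
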
-===== Chapter 16: Scraping the Individual Entry Page ===== 
- 
-[[http://niche-canada.org/member-projects/zotero-guide/chapter16.html|HWZT chapter 16 (Scraping the Individual Entry Page)]]: As previously, the tutorial example works, but is rather casually presented, so here are some step-by-step instructions to produce the result of the tutorial section: 
- 
-  - open the single-result translator 
-  - extend the single-result translator 
-  - test the single-result translator on single-result pages 
-  - test the single-result translator on a search-results page 
- 
-To open the single-result translator: 
- 
-  - Close any running Scaffold 2.0 instances. 
-  - Ensure the [[http://niche-canada.org/member-projects/zotero-guide/sample1.html|first sample page]] is open in your browser and has focus. 
-  - Open Scaffold 2.0 from the Firefox main menu with Tools>Scaffold. This should pop up dialog="Zotero Scaffold". 
-  - Hit icon=Load (at the upper left of the dialog). This should pop up dialog="Load Translator" displaying a single table mapping "Label" to "Creator". 
-  - Scroll through the "Load Translator" table until you see Label=<code>How to Write a Zotero Translator (single result)</code>, then hit button=OK. 
-  - You will return to the main dialog="Zotero Scaffold". Check to see that tab=Metadata is properly populated. 
-  - Click icon="Run detectWeb" (the eye) to ensure its current code still works: in the Test Frame you should get results like <code>12:00:00 detectWeb returned type "book"</code> 
-  - Leave the current instance of Scaffold 2.0 open, since we'll use the same translator in the next section. 
- 
-To extend the single-result translator: add a <code>doWeb</code> function to the JavaScript in tab=Code. Obviously one can implement this several ways. The following code has been tested, and is somewhat more modular than that provided by HWZT. To use it, just 
- 
-  - replace the current contents of tab=Code with the code below:<code>function detectWeb(doc, url) { 
-  if (doc.title.match("Single Item")) { 
-    return "book"; 
-  } else if (doc.title.match("Search Results")) { 
-    return "multiple"; 
-  } 
-} 
- 
-// The function used to save well formatted data to Zotero 
-function associateData (newItem, items, field, zoteroField) { 
-  if (items[field]) { 
-    newItem[zoteroField] = items[field]; 
-  } 
-} 
- 
-function scrape(doc, url) { 
-  // variable declarations 
-  var newItem = new Zotero.Item('book'); 
-  newItem.url = doc.location.href; 
-  newItem.title = "No Title Found"; 
-  var items = new Object(); 
-  var tagsContent = new Array(); 
- 
-  // scrape page data, save to Zotero 
-  getItems(doc, items, tagsContent); 
-  getAuthors(newItem, items); 
-  getImprints(newItem, items); 
-  getTags(newItem, items, tagsContent); 
-  saveToZotero(newItem, items); 
-} 
- 
-function getItems(doc, items, tagsContent) { 
-  // namespace code 
-  var namespace = doc.documentElement.namespaceURI; 
-  var nsResolver = namespace ?  
-    function(prefix) { 
-      if (prefix == 'x') return namespace; else return null; 
-    } : null; 
- 
-  // populate "items" Object and save tags to an Array 
-  var blankCell = "temp"; 
-  var headersTemp; 
-  var headers; 
-  var contents; 
-  var myXPathObject = doc.evaluate('//td[1]', doc, nsResolver, XPathResult.ANY_TYPE, null); 
-  var myXPathObject2 = doc.evaluate('//td[2]', doc, nsResolver, XPathResult.ANY_TYPE, null); 
-  while (headers = myXPathObject.iterateNext()) { 
-    headersTemp = headers.textContent; 
-    if (!headersTemp.match(/\w/)) { 
-      headersTemp = blankCell; 
-      blankCell = blankCell + "1"; 
-    } 
-    contents = myXPathObject2.iterateNext().textContent; 
-    if (headersTemp.match("temp")) { 
-      tagsContent.push(contents); 
-    } 
-    items[headersTemp.replace(/\s+/g, '')]=contents.replace(/^\s*|\s*$/g, ''); 
-  } 
-} 
- 
-function getAuthors(newItem, items) { 
-  //Formatting and saving "Author" field 
-  if (items["PrincipalAuthor:"]) { 
-    var author = items["PrincipalAuthor:"]; 
-    if (author.match("; ")) { 
-      var authors = author.split("; "); 
-      for (var i in authors) { 
-        newItem.creators.push(Zotero.Utilities.cleanAuthor(authors[i], "author")); 
-      } 
-    } else { 
-      newItem.creators.push(Zotero.Utilities.cleanAuthor(author, "author")); 
-    } 
-  } 
-} 
- 
-function getImprints(newItem, items) { 
-  // Format and save "Imprint" fields 
-  if (items["Imprint:"]) { 
-    items["Imprint:"] = items["Imprint:"].replace(/\s\s+/g, ''); 
-    if (items["Imprint:"].match(":")) { 
-      var colonLoc = items["Imprint:"].indexOf(":"); 
-      newItem.place = items["Imprint:"].substr(1, colonLoc-1); 
-      var commaLoc = items["Imprint:"].lastIndexOf(","); 
-      var date1 =items["Imprint:"].substr(commaLoc + 1); 
-      newItem.date = date1.substr(0, date1.length-1); 
-      newItem.publisher = items["Imprint:"].substr(colonLoc+1, commaLoc-colonLoc-1); 
-    } else { 
-      newItem.publisher = items["Imprint:"]; 
-    } 
-  } 
-} 
- 
-function getTags(newItem, items, tagsContent) { 
-  if (items["Subjects:"]) { 
-    tagsContent.push(items["Subjects:"]); 
-  } 
-  for (var i = 0; i < tagsContent.length; i++) { 
-    newItem.tags[i] = tagsContent[i]; 
-  } 
-} 
- 
-function saveToZotero(newItem, items) { 
-  // Associate and save well-formed data to Zotero 
-  associateData (newItem, items, "Title:", "title"); 
-  associateData (newItem, items, "ISBN-10:", "ISBN"); 
-  associateData (newItem, items, "Collection:", "extra"); 
-  associateData (newItem, items, "Pages:", "pages"); 
-  newItem.repository = "NiCHE"; 
-  newItem.complete(); 
-} 
- 
-function doWeb(doc, url) { 
-  // namespace code 
-  var namespace = doc.documentElement.namespaceURI; 
-  var nsResolver = namespace ? function(prefix) { 
-    if (prefix == 'x') return namespace; else return null; 
-  } : null; 
- 
-  // variable declarations 
-  var articles = new Array(); 
-  var items = new Object(); 
-  var nextTitle; 
- 
-  // If Statement checks if page is a Search Result, then saves requested Items 
-  if (detectWeb(doc, url) == "multiple") { 
-    var titles = doc.evaluate('//td[2]/a', doc, nsResolver, XPathResult.ANY_TYPE, null); 
-    while (nextTitle = titles.iterateNext()) { 
-      items[nextTitle.href] = nextTitle.textContent; 
-    } 
-    items = Zotero.selectItems(items); 
-    for (var i in items) { 
-      articles.push(i); 
-    } 
-  } else { 
-    //saves single page items 
-    articles = [url]; 
-  } 
- 
-  // process everything, calling function=scrape to do the heavy lifting 
-  Zotero.Utilities.processDocuments(articles, scrape, function(){Zotero.done();}); 
-  Zotero.wait(); 
-} 
-</code> 
-  - Click icon=Save (second from left): your translator should save silently. 
-  - Close all running Scaffold 2.0 instances. 
- 
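The imprint handling in <code>getImprints</code> relies on fixed character positions, which is easier to follow in isolation. The following is a minimal sketch in plain JavaScript, assuming an imprint string with one leading and one trailing delimiter character, a colon after the place, and a comma before the date. The function <code>parseImprint</code> and the sample string are hypothetical: the actual text on the sample pages may differ. The sketch uses <code>slice</code> where the translator uses <code>substr</code>; the two select the same characters only when both the colon and a later comma are present.

```javascript
// Hypothetical stand-alone version of the parsing step in getImprints.
function parseImprint(imprint) {
  var result = {};
  // Drop runs of two or more whitespace characters, as the translator does.
  var text = imprint.replace(/\s\s+/g, '');
  var colonLoc = text.indexOf(":");
  if (colonLoc === -1) {
    // No colon: treat the whole string as the publisher.
    result.publisher = text;
    return result;
  }
  var commaLoc = text.lastIndexOf(",");
  // Skip the first character, stop just before the colon.
  result.place = text.slice(1, colonLoc);
  // Everything between the colon and the last comma.
  result.publisher = text.slice(colonLoc + 1, commaLoc);
  // Everything after the last comma, minus the final character.
  result.date = text.slice(commaLoc + 1, -1);
  return result;
}

// Assumed input format, chosen only to illustrate the index arithmetic.
var parsed = parseImprint("[Toronto:Nelson Canada,2009]");
console.log(parsed.place, parsed.publisher, parsed.date);
```

Because the positions are hard-coded, an imprint without the surrounding delimiter characters would lose the first letter of the place and the last digit of the date, so this approach only suits sites with a consistent imprint format.
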
-To test the translator on a single-result page:  
- 
-  - Ensure the [[http://niche-canada.org/member-projects/zotero-guide/sample1.html|first sample page]] is open in your browser and has focus. 
-  - Open Scaffold 2.0 from the Firefox main menu with Tools>Scaffold. This should pop up dialog="Zotero Scaffold". 
-  - Hit icon=Load (at the upper left of the dialog). This should pop up dialog="Load Translator" displaying a single table mapping "Label" to "Creator". 
-  - Scroll through the "Load Translator" table until you see Label=<code>How to Write a Zotero Translator (single result)</code>, then hit button=OK. 
-  - You will return to the main dialog="Zotero Scaffold". Check to see that tab=Metadata is properly populated. 
-  - Click icon="Run doWeb" (the thunderbolt) to test the new code: in the Test Frame you should get results like <code>12:00:00 Returned item: 
-    'itemType' => "book" 
-    'creators' ... 
-        '0' ... 
-            'firstName' => "Alan" 
-            'lastName' => "MacEachern" 
-            'creatorType' => "author" 
-        '1' ... 
-            'firstName' => "William J." 
-            'lastName' => " Turkel" 
-            'creatorType' => "author" 
-    'notes' ... 
-    'tags' ... 
-        '0' => "History" 
-        '1' => "Methodology" 
-        '2' => "Tables." 
-        '3' => "Environment" 
-    'seeAlso' ... 
-    'attachments' ... 
-    'url' => "http://niche-canada.org/member-projects/zotero-guide/sample1.html" 
-    'title' => "Method and Meaning in Canadian Environmental History" 
-    'place' => "Toronto" 
-    'date' => "2009" 
-    'publisher' => "Nelson Canada" 
-    'ISBN' => "0176441166" 
-    'extra' => "None" 
-    'pages' => "573" 
-    'libraryCatalog' => "NiCHE" 
-    'complete' => function(...){...}  
-          
-14:36:30 Translation successful 
-</code> 
-  - Close all running Scaffold 2.0 instances. 
- 
-To test the translator on another single-result page: 
- 
-  - Ensure the [[http://niche-canada.org/member-projects/zotero-guide/sample2.html|second sample page]] is open in your browser and has focus. 
-  - Open Scaffold 2.0 from the Firefox main menu with Tools>Scaffold. This should pop up dialog="Zotero Scaffold". 
-  - Hit icon=Load (at the upper left of the dialog). This should pop up dialog="Load Translator" displaying a single table mapping "Label" to "Creator". 
-  - Scroll through the "Load Translator" table until you see Label=<code>How to Write a Zotero Translator (single result)</code>, then hit button=OK. 
-  - You will return to the main dialog="Zotero Scaffold". Check to see that tab=Metadata is properly populated. 
-  - Click icon="Run doWeb" (the thunderbolt) to test the new code: in the Test Frame you should get results like <code>12:00:00 Returned item: 
-             'itemType' => "book" 
-             'creators' ... 
-                 '0' ... 
-                     'firstName' => "David Freeland" 
-                     'lastName' => "Duke" 
-                     'creatorType' => "author" 
-             'notes' ... 
-             'tags' ... 
-                 '0' => "Canada" 
-                 '1' => "Environment" 
-                 '2' => "Bibliography" 
-                 '3' => "Tables." 
-                 '4' => "History" 
-             'seeAlso' ... 
-             'attachments' ... 
-             'url' => "http://niche-canada.org/member-projects/zotero-guide/sample2.html" 
-             'title' => "Canadian Environmental History: Essential Readings" 
-             'place' => "Toronto" 
-             'date' => "2006" 
-             'publisher' => "Canadian Scholars Press" 
-             'ISBN' => "1551303108" 
-             'extra' => "None" 
-             'pages' => "392" 
-             'libraryCatalog' => "NiCHE" 
-             'complete' => function(...){...}  
-          
-18:52:01 Translation successful 
-</code> 
-  - Close all running Scaffold 2.0 instances. 
- 
-To test the translator on a search-results page: 
- 
-  - Ensure the [[http://niche-canada.org/member-projects/zotero-guide/searchresults1.html|first sample search page]] is open in your browser and has focus. 
-  - Open Scaffold 2.0 from the Firefox main menu with Tools>Scaffold. This should pop up dialog="Zotero Scaffold". 
-  - Hit icon=Load (upper left of the dialog) to pop up dialog="Load Translator". 
-  - Scroll through the "Load Translator" table until you see Label=<code>How to Write a Zotero Translator (search results)</code>, then hit button=OK. 
-  - You will return to the main dialog="Zotero Scaffold". Check to see that tab=Metadata is properly populated. 
-  - Click icon="Run doWeb" (the thunderbolt): a dialog="Select Items" should pop up, with a selection area containing 10 items, corresponding to the 10 items in the [[http://niche-canada.org/member-projects/zotero-guide/searchresults1.html|sample search page]]. Check to see that the titles of the items in the dialog match the titles of the items on the sample search page, then click button=Cancel on dialog="Select Items". 
-  - TODO: test adding an item to your library. 
-  - Leave the current instance of Scaffold 2.0 open, since we'll use the same translator in the next section. 
- 
-Since it is now clear that our "single-result translator" also handles search results properly, we can save it as just "How to Write a Zotero Translator" with no suffix: 
- 
-  - Switch to tab=Metadata. 
-  - In field=Label, remove the "(single result)" suffix. 
-  - Click icon=Save (second from left): your translator should save silently. 
-  - TODO: delete now-unnecessary translator="How to Write a Zotero Translator (search results)" 
-  - Close all running Scaffold 2.0 instances. 
- 
-===== Chapter 17: Common Problems when Scraping an Individual Entry Page ===== 
- 
-[[http://niche-canada.org/member-projects/zotero-guide/chapter17.html|HWZT chapter 17 (Common Problems when Scraping an Individual Entry Page)]]: information only, hopefully up to date. 
  
dev/how_to_write_a_zotero_translator_plusplus.1280266167.txt.gz · Last modified: 2010/07/27 17:29 by tomrochewiki