Chapter 15: Scraping the Search Results Page

HWZT chapter 15 (Scraping the Search Results Page): Again, the tutorial example works but is presented rather casually (notably, it requires a different sample page and, really, a separate translator), so here are step-by-step instructions. We will

  1. rename and save the original translator
  2. create the stub for a second translator
  3. populate, test, and save the second translator

To rename the original translator,

  1. Close any running Scaffold 2.0 instances.
  2. Ensure the first sample page is open in your browser and has focus.
  3. Open Scaffold 2.0 from the Firefox main menu with Tools>Scaffold. This should pop up dialog=“Zotero Scaffold”.
  4. Hit icon=Load (at the upper left of the dialog). This should pop up dialog=“Load Translator”, displaying a single table mapping “Label” to “Creator”.
  5. Scroll through the “Load Translator” table until you see Label=“How to Write a Zotero Translator”, then hit button=OK.
  6. You will return to the main dialog=“Zotero Scaffold”. Check to see that tab=Metadata is properly populated.
  7. Append “ (single result)” to field=Label so that the field now says “How to Write a Zotero Translator (single result)”. Click icon=Save (second from left): your translator should save silently.
  8. Click icon=“Run detectWeb” (the eye) to ensure the code still works: in the Test Frame you should get results like
    12:00:00 detectWeb returned type "book"
  9. Leave the current instance of Scaffold 2.0 open, since we'll use the same translator in the next section.
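
The check in step 8 depends only on the page title, so it can also be tried outside Scaffold. A minimal sketch, assuming plain objects with a `title` property standing in for the browser document; the sample titles are invented for illustration, and only the substrings "Single Item" and "Search Results" come from the translator code:

```javascript
// Same title test as the translator's detectWeb.
function detectWeb(doc, url) {
  if (doc.title.match("Single Item")) {
    return "book";
  } else if (doc.title.match("Search Results")) {
    return "multiple";
  }
  // Falls through (returns undefined) for any other page.
}

// Hypothetical stand-ins for the two sample pages.
var singlePage = { title: "Sample Site - Single Item" };
var searchPage = { title: "Sample Site - Search Results" };

console.log(detectWeb(singlePage, "")); // book
console.log(detectWeb(searchPage, "")); // multiple
```

If Scaffold reports no type at all, the page title most likely contains neither substring.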

To create the stub for a second translator,

  1. Switch to tab=Metadata.
  2. Click button=Generate next to field=“Translator ID”: the Translator ID value should change.
  3. Change the contents of field=Label from “How to Write a Zotero Translator (single result)” to “How to Write a Zotero Translator (search results)”.
  4. Click icon=Save (second from left): your translator should save silently.
  5. Close all running Scaffold 2.0 instances.
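
The steps above amount to copying the first translator's metadata under a fresh identity. A rough sketch of that idea in plain JavaScript, run under Node.js rather than Scaffold: the `translatorID` and `label` field names and the use of a random UUID as the ID are assumptions about how the metadata is stored, and `cloneTranslator` is a hypothetical helper, not part of Zotero or Scaffold:

```javascript
const { randomUUID } = require("crypto"); // Node.js 14.17 or later

// Hypothetical helper: copy the metadata, then give the copy a new ID and label.
function cloneTranslator(meta, newLabel) {
  return Object.assign({}, meta, {
    translatorID: randomUUID(), // assumed to mirror button=Generate
    label: newLabel
  });
}

const single = {
  translatorID: randomUUID(),
  label: "How to Write a Zotero Translator (single result)"
};
const search = cloneTranslator(
  single,
  "How to Write a Zotero Translator (search results)"
);

console.log(search.label);
console.log(search.translatorID !== single.translatorID); // true
```

Because the copy carries a different ID, saving it presumably creates a second translator instead of overwriting the first, which is why step 2 comes before the save in step 4.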

To populate, test, and save the second translator,

  1. Ensure the first sample search page is open in your browser and has focus.
  2. Open Scaffold 2.0 from the Firefox main menu with Tools>Scaffold. This should pop up dialog=“Zotero Scaffold”.
  3. Hit icon=Load (upper left of the dialog) to pop up dialog=“Load Translator”.
  4. Scroll through the “Load Translator” table until you see Label=“How to Write a Zotero Translator (search results)”, then hit button=OK.
  5. You will return to the main dialog=“Zotero Scaffold”. Check to see that tab=Metadata is properly populated.
  6. Click icon=“Run detectWeb” (the eye): in the Test Frame you should get results like
    12:00:00 detectWeb returned type "multiple"
  7. Switch to tab=Code and enter the following code (actually, you're just appending function doWeb):
    function detectWeb(doc, url) {
      if (doc.title.match("Single Item")) {
        return "book";
      } else if (doc.title.match("Search Results")) {
        return "multiple";
      }
    }
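    function doWeb(doc, url) {
      // Resolve the 'x' prefix to the page's namespace, if it has one.
      var namespace = doc.documentElement.namespaceURI;
      var nsResolver = namespace ? function(prefix) {
        if (prefix == 'x') return namespace; else return null;
      } : null;
      var articles = new Array();  // URLs to process
      var items = new Object();    // maps link URL to link text
      var nextTitle;
      if (detectWeb(doc, url) == "multiple") {
        // Collect every link found in a second table cell (td[2]).
        var titles = doc.evaluate('//td[2]/a', doc, nsResolver, XPathResult.ANY_TYPE, null);
        while (nextTitle = titles.iterateNext()) {
          items[nextTitle.href] = nextTitle.textContent;
        }
        // Show dialog "Select Items"; keep the URLs of the entries that come back.
        items = Zotero.selectItems(items);
        for (var i in items) {
          articles.push(i);
        }
      } else {
        // Not a search results page: process the current URL only.
        articles = [url];
      }
      // Load each URL and pass it to scrape, the function that is the subject of Chapter 16.
      Zotero.Utilities.processDocuments(articles, scrape, function(){Zotero.done();});
      Zotero.wait();
    }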
  8. Click icon=Save (second from left): your translator should save silently.
  9. Click icon=“Run doWeb” (the thunderbolt): a dialog=“Select Items” should pop up, with a selection area containing 10 items, corresponding to the 10 items on the sample search page. Check that the titles of the items in the dialog match the titles of the items on the sample search page, then click button=Cancel on dialog=“Select Items”.
  10. Click icon=Save (second from left): your translator should save silently.
  11. Close all running Scaffold 2.0 instances.
  12. Return focus to the page or tab containing the sample search page and refresh it: you should see the Zotero folder icon in Firefox's location bar.
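
The "multiple" branch of doWeb does three things: it builds a map from link URL to title, hands that map to the user for selection, and collects the URLs that come back. A minimal sketch of that flow outside Zotero, assuming an array of plain `{href, textContent}` objects in place of the XPath results and a hypothetical `fakeSelectItems` in place of `Zotero.selectItems`. The URLs and titles are invented, and returning null on Cancel is an assumption about the real dialog:

```javascript
// Build the href -> title map, as the while loop in doWeb does.
function buildItems(links) {
  var items = {};
  links.forEach(function (link) {
    items[link.href] = link.textContent;
  });
  return items;
}

// Hypothetical stand-in for Zotero.selectItems: keeps only the chosen URLs,
// or returns null to mimic the user pressing Cancel.
function fakeSelectItems(items, chosen) {
  if (!chosen) return null;
  var picked = {};
  chosen.forEach(function (href) {
    if (href in items) picked[href] = items[href];
  });
  return picked;
}

// Collect the selected URLs, as the for...in loop in doWeb does.
function selectedUrls(selection) {
  var articles = [];
  for (var href in selection) {
    articles.push(href);
  }
  return articles;
}

var links = [
  { href: "http://example.com/item/1", textContent: "First Title" },
  { href: "http://example.com/item/2", textContent: "Second Title" },
  { href: "http://example.com/item/3", textContent: "Third Title" }
];
var items = buildItems(links);

console.log(selectedUrls(fakeSelectItems(items, ["http://example.com/item/2"])));
// [ 'http://example.com/item/2' ]
console.log(selectedUrls(fakeSelectItems(items, null)));
// []
```

Keying the map by URL is what lets the final loop push `i` directly: the keys that survive selection are exactly the pages to process.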

Next: Chapter 16: Scraping the Individual Entry Page: Scrape Function