HWZT
Chapter 15 (Scraping the Search Results Page): Again, the tutorial example works, but it is presented rather casually (notably, it requires a different sample page and, really, a separate translator), so here are some step-by-step instructions. In outline, we will
rename and save the original translator
create the stub for a second translator
populate, test, and save the second translator
To rename the original translator,
Close any running Scaffold 2.0 instances.
Open Scaffold 2.0 from the Firefox main menu with Tools>Scaffold. This should pop up dialog=“Zotero Scaffold”.
Hit icon=Load (at the upper left of the dialog). This should pop up dialog=“Load Translator”, displaying a single table mapping “Label” to “Creator”.
Scroll through the “Load Translator” table until you see Label=“How to Write a Zotero Translator”, then hit button=OK.
You will return to the main dialog=“Zotero Scaffold”. Check to see that tab=Metadata is properly populated.
Append “ (single result)” to field=Label so that the field now says “How to Write a Zotero Translator (single result)”.
Click icon=Save (second from left): your translator should save silently.
Click icon=“Run detectWeb” (the eye) to ensure the code still works: in the Test Frame you should get results like
12:00:00 detectWeb returned type "book"
Leave the current instance of Scaffold 2.0 open, since we'll use the same translator in the next section.
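The result in the Test Frame depends only on how detectWeb branches on the page title. As a rough illustration, the detectWeb function from this tutorial can be exercised outside Scaffold (for example in Node) against mock objects. The page titles below are assumptions made up for the sketch, not the real titles of the sample pages.

```javascript
// detectWeb as used in this tutorial: it only inspects the page title.
function detectWeb(doc, url) {
  if (doc.title.match("Single Item")) {
    return "book";
  } else if (doc.title.match("Search Results")) {
    return "multiple";
  }
}

// Hypothetical stand-ins for real pages: only the title property is mocked,
// and the title strings are invented for this example.
var singlePage = { title: "Sample Site: Single Item" };
var searchPage = { title: "Sample Site: Search Results" };
var otherPage = { title: "Something Else" };

console.log(detectWeb(singlePage, ""));  // "book"
console.log(detectWeb(searchPage, ""));  // "multiple"
console.log(detectWeb(otherPage, ""));   // undefined: neither branch matched
```

If the Test Frame reports a different type than expected, the title of the page currently loaded in the browser is the first thing to check.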
To create the stub for a second translator,
Switch to tab=Metadata.
Click button=Generate next to field=“Translator ID”: the Translator ID value should change.
Change the contents of field=Label from “How to Write a Zotero Translator (single result)” to “How to Write a Zotero Translator (search results)”.
Click icon=Save (second from left): your translator should save silently.
Close all running Scaffold 2.0 instances.
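The steps above amount to copying the first translator's metadata and giving the copy a fresh Translator ID and a new Label, which is presumably why saving produces a second translator instead of overwriting the first. A minimal sketch of that idea follows. The functions generateId and makeStub are hypothetical helpers written for this example (they are not Scaffold's code), the ID layout is assumed to be the usual 8-4-4-4-12 hexadecimal form, and the all-zero ID is a placeholder.

```javascript
// Hypothetical stand-in for Scaffold's Generate button: builds a random ID
// in the 8-4-4-4-12 hexadecimal layout. This is not Scaffold's own code.
function generateId() {
  function hex(n) {
    var s = "";
    for (var i = 0; i < n; i++) {
      s += Math.floor(Math.random() * 16).toString(16);
    }
    return s;
  }
  return [hex(8), hex(4), hex(4), hex(4), hex(12)].join("-");
}

// Copy every metadata field, then replace the ID and the label.
// The original object is left untouched.
function makeStub(original, newLabel) {
  var copy = {};
  for (var key in original) {
    copy[key] = original[key];
  }
  copy.translatorID = generateId();
  copy.label = newLabel;
  return copy;
}

var single = {
  translatorID: "00000000-0000-0000-0000-000000000000", // placeholder ID
  label: "How to Write a Zotero Translator (single result)"
};
var search = makeStub(single, "How to Write a Zotero Translator (search results)");

console.log(single.label);
console.log(search.label);
console.log(search.translatorID !== single.translatorID);
```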
To populate, test, and save the second translator,
Open Scaffold 2.0 from the Firefox main menu with Tools>Scaffold. This should pop up dialog=“Zotero Scaffold”.
Hit icon=Load (upper left of the dialog) to pop up dialog=“Load Translator”.
Scroll through the “Load Translator” table until you see Label=“How to Write a Zotero Translator (search results)”, then hit button=OK.
You will return to the main dialog=“Zotero Scaffold”. Check to see that tab=Metadata is properly populated.
Click icon=“Run detectWeb” (the eye): in the Test Frame you should get results like
12:00:00 detectWeb returned type "multiple"
Switch to tab=Code and enter the following code (actually, you're just appending function doWeb):
function detectWeb(doc, url) {
    if (doc.title.match("Single Item")) {
        return "book";
    } else if (doc.title.match("Search Results")) {
        return "multiple";
    }
}

function doWeb(doc, url) {
    // Namespace resolver, needed when the page is served as XHTML
    var namespace = doc.documentElement.namespaceURI;
    var nsResolver = namespace ? function(prefix) {
        if (prefix == 'x') return namespace; else return null;
    } : null;

    var articles = new Array();
    var items = new Object();
    var nextTitle;

    if (detectWeb(doc, url) == "multiple") {
        // Collect each result link: URL as key, title text as value
        var titles = doc.evaluate('//td[2]/a', doc, nsResolver, XPathResult.ANY_TYPE, null);
        while (nextTitle = titles.iterateNext()) {
            items[nextTitle.href] = nextTitle.textContent;
        }
        // Ask the user which results to save, then keep only their URLs
        items = Zotero.selectItems(items);
        for (var i in items) {
            articles.push(i);
        }
    } else {
        articles = [url];
    }
    // scrape is the per-item function covered in Chapter 16
    Zotero.Utilities.processDocuments(articles, scrape, function(){Zotero.done();});
    Zotero.wait();
}
Click icon=Save (second from left): your translator should save silently.
Click icon=“Run doWeb” (the thunderbolt): a dialog=“Select Items” should pop up, with a selection area containing 10 items, corresponding to the 10 items in the sample search page. Check that the titles of the items in the dialog match the titles of the items on the sample search page, then click button=Cancel on dialog=“Select Items”.
Click icon=Save (second from left): your translator should save silently.
Close all running Scaffold 2.0 instances.
Return focus to the page or tab containing the sample search page and refresh it: you should see the Zotero folder icon in the Firefox location bar.
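To see what the “multiple” branch of doWeb does with the search results, the sketch below replays its data flow outside Firefox: links are collected into a URL-to-title map, a selection step narrows the map, and the surviving URLs end up in the articles array. Everything browser- or Zotero-specific is replaced by a hypothetical mock (mockXPathResult, mockSelectItems), and the URLs and titles are invented, so this illustrates the logic only and is not the real Zotero API.

```javascript
// Hypothetical mock of an XPath result: iterateNext() returns each node in
// turn and then null, which is what ends the while loop in doWeb.
function mockXPathResult(nodes) {
  var i = 0;
  return {
    iterateNext: function () {
      return i < nodes.length ? nodes[i++] : null;
    }
  };
}

// Same loop as in doWeb: map each link's href to its visible title.
function collectItems(titles) {
  var items = {};
  var nextTitle;
  while ((nextTitle = titles.iterateNext())) {
    items[nextTitle.href] = nextTitle.textContent;
  }
  return items;
}

// Hypothetical stand-in for the Select Items dialog: pretends the user
// ticked only the entries whose URLs are listed in `chosen`.
function mockSelectItems(items, chosen) {
  var selected = {};
  chosen.forEach(function (href) {
    if (href in items) {
      selected[href] = items[href];
    }
  });
  return selected;
}

// Invented sample data standing in for the links on a search results page.
var titles = mockXPathResult([
  { href: "http://example.com/item1", textContent: "First title" },
  { href: "http://example.com/item2", textContent: "Second title" },
  { href: "http://example.com/item3", textContent: "Third title" }
]);

var items = collectItems(titles);
var selected = mockSelectItems(items, ["http://example.com/item2"]);

// As in doWeb, only the keys (the URLs) are passed on for scraping.
var articles = [];
for (var url in selected) {
  articles.push(url);
}
console.log(articles);
```

Using the URL as the key and the title as the value is what lets the dialog show readable titles while doWeb still ends up with the list of pages to fetch.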
Next: Chapter 16: Scraping the Individual Entry Page: Scrape Function