====== Chapter 4: The DOM & HTML ======

**Note**: Words appearing between "<" and ">" are HTML tags. Words that appear between "<!--" and "-->" are HTML comments, intended for human readers and ignored by the browser. HTML comments do not tell your browser to do anything, but can be used to provide valuable information to anyone reading your code. If you do not recognize these markup structures, or find it difficult to follow this chapter, please take the W3Schools HTML tutorial before continuing.

===== What is the DOM? =====

DOM stands for "Document Object Model." It is not so much a thing as a way of describing how web pages are structured. Most people think of a web page much the same way as they think of a newspaper spread: there are words, pictures and headlines on various parts of the page, and white space appears wherever nothing else has been placed. However, this is not how websites actually work. Web pages are composed of a series of nodes. These nodes are organized in a particular hierarchy, defined by the person who wrote the web page according to how they wanted the page to function. But before we discuss that further, let's take a look at what a web page really is.

===== Understanding HTML structure =====

If you've ever written a basic web page, you know that it is really just an HTML document. These documents contain the page's content (the words, links and images) as well as a series of tags that help your browser understand what it is looking at. If you've never written a website, go up to your "View" menu and click on "Page Source." A new window will pop up with what is called "source code." This is what your browser interprets. The result of this interpretation is what you see in your browser when you go to the website. Don't worry if you can't understand most of the things you see in the source code; the source code of most websites is not written with human readers in mind.
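To make the idea of a hierarchy of nodes more concrete, here is a minimal sketch in Python (not the JavaScript used in Zotero translators, and not a real DOM implementation). It uses the standard library's ''html.parser'' module to print each tag, comment and piece of text indented by how deeply it is nested. The names ''OutlineParser'', ''outline'' and ''SAMPLE_PAGE'' are made up for this illustration, and the sample page is invented.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not increase the depth.
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}


class OutlineParser(HTMLParser):
    """Hypothetical helper: record every node together with its nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.lines = []

    def _add(self, label):
        # Two spaces of indentation per level of nesting.
        self.lines.append("  " * self.depth + label)

    def handle_starttag(self, tag, attrs):
        self._add(f"<{tag}>")
        if tag not in VOID_ELEMENTS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_ELEMENTS:
            self.depth -= 1

    def handle_comment(self, data):
        # Comments are nodes too, even though the browser does not display them.
        self._add(f"comment: {data.strip()}")

    def handle_data(self, data):
        text = data.strip()
        if text:  # skip runs that are only whitespace
            self._add(f"text: {text}")


def outline(html):
    """Return the indented outline of an HTML string as a list of lines."""
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    return parser.lines


# An invented page, small enough to read at a glance.
SAMPLE_PAGE = """<html>
<head><title>Example</title></head>
<body>
<!-- a note for human readers -->
<h1>Headline</h1>
<p>Some <a href="#">linked</a> words.</p>
</body>
</html>"""

if __name__ == "__main__":
    print("\n".join(outline(SAMPLE_PAGE)))
```

In the printed outline, the link appears indented beneath its paragraph, which in turn sits beneath the body: each level of indentation corresponds to one step down the hierarchy. This sketch assumes well-formed markup; real pages are often messier, and browsers are far more forgiving than this toy outline.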
You certainly don't need to understand everything about web pages to write a Zotero translator. However, you will need a general understanding of how HTML documents are structured. Most newer websites mix several languages and markup styles in a typical page source. These include, but are not limited to, JavaScript, CSS and XML, along with embedded content such as Java applets or Flash. HTML tags always appear between angle brackets, "<" and ">". Looking for these will often make it easier for you to distinguish HTML from other markup and code. Many browsers, including Firefox, will colour code the source for you to make your job even easier. At this point, we are only interested in looking at the HTML bits, which means you can ignore everything else. For the most part, the tags we are interested in start with:
, , , ,
,
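
As a rough illustration of picking the HTML out of a mixed page source, the following Python sketch lists the distinct tag names found in an invented page that also contains CSS, JavaScript and a comment. It relies on the standard library's ''html.parser'' module; ''TagCollector'', ''tag_names'' and ''MIXED_SOURCE'' are hypothetical names chosen for this example, and Zotero itself does not work this way.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical helper: remember each tag name the first time it opens."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.tags:
            self.tags.append(tag)


def tag_names(html):
    """Return the distinct tag names in an HTML string, in order of appearance."""
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags


# An invented page source mixing HTML with CSS, JavaScript and a comment.
MIXED_SOURCE = """<html>
<head>
<style>p { color: red; }</style>
<script>if (1 < 2) { console.log("not a tag"); }</script>
</head>
<body>
<!-- <table> inside a comment is not an element -->
<div><p>Hello</p></div>
</body>
</html>"""

if __name__ == "__main__":
    print(tag_names(MIXED_SOURCE))
```

The contents of the style and script blocks are passed over as plain text, and the tag mentioned inside the comment is not counted, so only the genuine HTML tags end up in the list.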