Introduction
In this article you will learn the basics of HTML: what it is, what it does, its history in brief, and what the structure of an HTML document looks like. The articles that follow this one will look at each individual part of HTML in much greater depth.
What HTML is
Most desktop applications that read and write files use a special file format. For example, Microsoft Word understands “.doc” files and Microsoft Excel understands “.xls”. These files contain the instructions on how to rebuild the documents next time you open them, what the contents of that document are, and “metadata” about the document such as the author, the date the document was last modified, and even things such as a list of changes made so you can go back and forth between versions.
HTML (“HyperText Markup Language”) is a language to describe the contents of web documents. It uses a special syntax containing markers (called “elements”) which are wrapped around the text within the document to indicate how user agents (e.g. web browsers) should interpret that portion of the document.
A user agent is any software that is used to access web pages on behalf of users. There is an important distinction to be made here: all types of desktop browser software (Internet Explorer, Opera, Firefox, Safari, Chrome etc.) and alternative browsers for other devices (such as the Wii Internet channel, and mobile phone browsers such as Opera Mini and WebKit on the iPhone) are user agents, but not all user agents are browser software. The automated programs that Google and Yahoo! use to index the web for their search engines are also user agents, but no human being is controlling them directly.
HTML
HTML is just a plain textual representation of content and its general meaning. For example:
<p id="example">This is a paragraph.</p>
The “<p>” part is a marker (which we refer to as a “tag”) that means “what follows should be considered as a paragraph”. Because it is at the start of the content it affects, this particular tag is an “opening tag”. The “</p>” is a tag that indicates where the paragraph ends (which we refer to as a “closing tag”). The opening tag, the closing tag and everything in between is called an “element”. The id="example" part is an attribute; you'll learn more about these later on. Many people use the terms “element” and “tag” interchangeably, which is not strictly correct.
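To make the terminology concrete, here is a small annotated sketch that builds on the paragraph example above. The link address, the title text and the nested em element are illustrative assumptions and not part of the original example.

```html
<!-- One complete element: opening tag + content + closing tag -->
<p id="example">This is a paragraph.</p>

<!-- Elements can be nested: the em element sits inside the p element. -->
<p>This is a paragraph with <em>emphasised</em> text.</p>

<!-- An element can carry several attributes, each written as a
     name="value" pair inside the opening tag. These values are
     hypothetical. -->
<a href="https://example.com/" title="An example site">A link</a>
```

In each case the tags are only the markers in angle brackets, while the element is the whole unit: both tags plus whatever lies between them.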
In most browsers there is a “Source” or “View Source” option, commonly under the “View” menu. Try this now: go to your favourite web site, choose this option, and spend some time looking at the HTML that makes up the structure of the page.
The history of HTML
In the article “The history of the Internet and the web, and the evolution of web standards” you learned a little about how the modern Web came about. When Tim Berners-Lee invented the World Wide Web, he created the first web server, the first web browser and the first version of HTML.
Whilst HTML has changed considerably since the first days, a lot of the content of modern-day HTML is embodied in that first documentation and more than half of the “tags” described in the original “HTML tags” document still exist.
As more people started writing web pages and alternatives to the original browser software, more features were added to HTML. Many were adopted universally (such as the img element used to insert an image into a document, first implemented in NCSA Mosaic). Others were proprietary and only really used in one or two browsers. There was a growing need for standardisation, so that web developers and authors of web browsing software had a document (called a “specification”) that definitively described what HTML looked like, so they could judge whether they were implementing or using HTML correctly.
The IETF (Internet Engineering Task Force, a standards body concerned with interoperability across the internet) published a draft proposal of HTML in 1993. This expired in 1994 without becoming a standard, but prompted the IETF to create a working group to look at HTML standardisation.
In 1995, HTML 2.0 was written, taking ideas from the original HTML draft. An alternate proposal called HTML+ was also written by Dave Raggett, which was used as a basis for many of the new elements implemented by browsers (such as the method for inserting images into documents, pioneered by NCSA Mosaic).
A draft of HTML 3.0 followed later that year, but work on that version was discontinued because of a lack of support for the direction from browser makers. HTML 3.2 dropped many of the new features of 3.0, and instead adopted many of the creations of the then-popular browsers Mosaic and Netscape Navigator.
In 1997, the W3C published HTML 4.0 as a recommendation that adopted more browser-specific extensions but also attempted to rationalise and clean up HTML. This was done by marking various elements as deprecated, which means that although they still exist in this version, they are considered outdated and may be removed in a later revision. This was to encourage better and more semantic use of HTML in documents (described in more detail in our article “The web standards model”).
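As a rough illustration of what deprecation meant in practice, the sketch below contrasts two elements that HTML 4 deprecated (center and font) with a more semantic alternative. The heading text and the class name “news” are hypothetical, chosen only for this example.

```html
<!-- Deprecated in HTML 4: appearance is mixed into the markup itself -->
<center><font color="red" size="5">Site news</font></center>

<!-- Preferred: the markup says what the content is (a heading) and a
     style sheet decides how it looks. The style element would normally
     be placed in the head of the document. -->
<style type="text/css">
  h2.news { color: red; text-align: center; }
</style>
<h2 class="news">Site news</h2>
```

The second version tells a user agent that the text is a heading, which the first version does not, and its appearance can be changed without touching the content.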
HTML 4.01 was published in 1999, with some errata noted in 2001. It remained the latest published version of HTML for many years while HTML5 was being drafted.
In 2000, the W3C also published the XHTML 1.0 specification, which was HTML re-structured to be a valid XML document.
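A minimal sketch of what “HTML re-structured to be a valid XML document” looks like is shown below. The page title and body text are made up for illustration; the document type declaration and namespace are the ones defined for XHTML 1.0 Strict.

```html
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>A minimal XHTML 1.0 page</title>
  </head>
  <body>
    <!-- Element and attribute names are lowercase, attribute values are
         quoted, and every element is explicitly closed. -->
    <p id="example">This is a paragraph.</p>
    <!-- Empty elements are closed with a trailing slash. -->
    <p>First line<br />second line</p>
  </body>
</html>
```

The elements themselves are the familiar HTML ones; what changes is that the stricter XML syntax rules apply to how they are written.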
In 2007, the W3C restarted the work on HTML by creating a new working group and adopting the work started by the WHATWG as HTML5. In this course we will be using HTML5, but don't worry: if you have already done some work in HTML4, you won't need to relearn everything. HTML5 contains all of HTML4 (albeit with some features redefined), and also adds some powerful new features on top. We'll make it clear when something we are talking about is new in HTML5.
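To give a feel for how HTML5 builds on HTML4, here is a minimal sketch of a complete HTML5 page. The title, heading and footer text are hypothetical; header, article and footer are among the elements introduced in HTML5, while p and h1 are carried over from HTML4.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>A minimal HTML5 page</title>
  </head>
  <body>
    <!-- header, article and footer are new in HTML5 -->
    <header>
      <h1>Example site</h1>
    </header>
    <article>
      <!-- p works just as it did in HTML4 -->
      <p id="example">This is a paragraph.</p>
    </article>
    <footer>
      <p>Hypothetical footer text.</p>
    </footer>
  </body>
</html>
```

Note how much shorter the first line is than the document type declarations used by earlier versions.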
In this article, you have learned the basics of HTML and where it has evolved from, and gained some insight into the structure of an HTML document. We will now continue to describe the head section of an HTML document in some more detail, before continuing to address the body content.
Note: This material was originally published as part of the Opera Web Standards Curriculum, available as 12: The basics of HTML, written by Mark Norman Francis. Like the original, it is published under the Creative Commons Attribution, Non Commercial - Share Alike 2.5 license.