What is HTML?

HTML stands for Hypertext Markup Language, the markup used to create pages for the World Wide Web. It is built from a set of commands known as tags. These tags tell the web browser how to format text in different styles, fonts and colors, which background and images to display, and where each link should lead.

The precursor to HTML was a system called ENQUIRE, designed by Tim Berners-Lee, a researcher at the CERN nuclear research laboratory in Switzerland. The original system was designed as a way for researchers at the laboratory to share documents, but it was developed further with the help of Robert Cailliau, also from CERN, and proposed for use on the Internet in 1989. HTML was made public in 1991, although the first standards did not become available until 1993. The initial specification was followed by HTML 2.0 in 1995 and then HTML 3.2 in 1997; the original HTML 3 proposal was never carried through. HTML 4 followed, and its latest revision, HTML 4.01, was published in 1999. HTML 5 has been released in draft form and is expected to be finalized soon.

Even though HTML has contributed significantly to the growth of the Web, complaints have arisen that the language has become more complicated and awkward to use as it has grown. HTML 4 tried to address these complaints by deprecating some of the tags that were considered particularly problematic. A further step was taken with the launch of the Extensible Hypertext Markup Language (XHTML), which reformulates HTML using the stricter syntax of the Extensible Markup Language (XML). XHTML was designed to be easier both to produce and to check, although it has not had as much impact as intended because a number of browsers do not fully support it. This means that HTML remains the dominant markup language on the Web.
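
Part of what makes XHTML easier to check is that it follows XML's syntax rules, so a general-purpose XML parser can confirm that a page is well formed. The following is a minimal sketch in Python, assuming the standard library's xml.etree.ElementTree parser. The function name is_well_formed and the two sample fragments are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as well-formed XML."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# XHTML style: every element is closed, including the empty <br /> element.
xhtml_fragment = "<p>First line<br />Second line</p>"

# Older HTML style: <br> is left unclosed. Browsers accept this,
# but an XML parser rejects it.
html_fragment = "<p>First line<br>Second line</p>"

print(is_well_formed(xhtml_fragment))  # True
print(is_well_formed(html_fragment))   # False
```

A check like this only confirms that tags are properly closed and nested. It does not confirm that the page uses valid XHTML element names.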

There are three main components of HTML: elements, data types and character references. Elements are the main commands in HTML, and they tell the browser how to treat the items shown on the page. For example, the <p> element tells the browser to start a new paragraph. Other elements, such as <body>, can also take attributes. The <body> element tells the browser that the content inside it forms the main body of the page, and the associated attribute text="#FFFFFF" indicates that the text in that section should be white. Data types describe the kinds of content an element can hold, such as style sheets and scripts, and the kinds of values an attribute can take, such as colors, styles and sizes. Character references tell the browser when a symbol needs to be included on the page, and there are more than 250 of these altogether. Many symbols, such as < and >, have a special meaning in HTML, so a character reference is needed to tell the browser that the symbol should actually appear on the page rather than being read as part of the HTML markup.
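
To make elements, attributes and character references concrete, here is a small sketch in Python that reads a sample page roughly the way a browser's parser would. It assumes the standard library's html.parser and html modules. The sample page, the bgcolor value and the class name TagLister are made up for illustration. The text and bgcolor attributes belong to older HTML; current pages would normally set colors with a style sheet.

```python
from html import unescape
from html.parser import HTMLParser

# A small sample page (illustrative only). The <body> element carries two
# attributes, and the paragraph uses three character references.
PAGE = """<html>
<body text="#FFFFFF" bgcolor="#000000">
<p>Fish &amp; chips cost &lt; &pound;5.</p>
</body>
</html>"""


class TagLister(HTMLParser):
    """Collect each start tag together with its attributes."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        self.elements.append((tag, dict(attrs)))


lister = TagLister()
lister.feed(PAGE)
for tag, attrs in lister.elements:
    print(tag, attrs)

# Character references stand in for symbols that would otherwise be
# read as markup, or that are hard to type.
print(unescape("Fish &amp; chips cost &lt; &pound;5."))
```

The loop lists the html, body and p elements, with only body carrying attributes, and the last line prints the paragraph text with &amp;, &lt; and &pound; turned back into &, < and £.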

HTML files can be written in a basic text editor such as Notepad on Microsoft Windows or SimpleText on Apple Macs. Doing so requires a sound knowledge of HTML, as these programs offer no means of checking what is being written other than loading the file into a browser. A more convenient option is a program such as Dreamweaver, which integrates an HTML editor into a more user-friendly interface. For example, Dreamweaver lets you switch between the HTML source code and a “What you see is what you get” (WYSIWYG) view, so you can adjust what you have entered as HTML until you get the look you want.
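
An HTML file is ordinary text, so the write-then-load-in-a-browser cycle described above can be sketched with a few lines of Python. This is only an illustration: the file name hello.html and the page content are hypothetical, and typing the same markup into Notepad and saving it with an .html extension gives the same result.

```python
from pathlib import Path

# Minimal page content (illustrative only).
PAGE = """<html>
<head><title>My first page</title></head>
<body>
<p>Hello, World Wide Web!</p>
</body>
</html>
"""

# "hello.html" is a hypothetical file name; any name ending in .html works.
page_path = Path("hello.html")
page_path.write_text(PAGE, encoding="utf-8")

# Print a file:// address that can be pasted into a browser's address bar.
print(page_path.resolve().as_uri())
```

Pasting the printed address into a browser, or double-clicking the saved file, displays the page. Python's standard webbrowser.open function can also load that address in the default browser.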