A Beginner's Notes for HTML Learning

Because I want to get better and grow stronger, I will study hard..er :)

Overview of HTML

HTML stands for "HyperText Markup Language", that is, a markup language for hypertext. The idea of hypertext can be traced back to the 1940s, although the term itself was only coined in the 1960s. It originally meant text that contains a hyperlink; a simple example is "Link" in <a href="https://example.com/">Link</a>. By extension, hypertext also means a document that contains hyperlinks to other parts of the same document or to other sources.
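
As a small illustration of both kinds of hyperlink, the sketch below links to another part of the same document and to a different source. The id value "overview" and the example.com URL are placeholders made up for this example.

```html
<!-- A link to another part of the same document (the id "overview" is a made-up example) -->
<a href="#overview">Jump to the overview</a>

<!-- A link to a different source (example.com is a placeholder URL) -->
<a href="https://example.com/">Visit another site</a>

<!-- The target of the first link -->
<h2 id="overview">Overview of HTML</h2>
```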

On the other hand, "markup language" refers to a text-encoding system consisting of a set of symbols inserted in a text document to control its structure, formatting, or the relationship between its parts (from Wikipedia). Other examples of markup languages are XHTML (eXtensible HyperText Markup Language) and XML (eXtensible Markup Language).

HTML is a language for creating webpages: it defines their structure and gives meaning to different blocks of text so that they can be rendered. Most of what we see on the internet, be it a webpage or an application on mobile, contains HTML to some extent. (XHTML has also been used for creating webpages, in the hope of replacing HTML; it will not be covered here.)
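
To make "giving meaning to different blocks of text" a little more concrete, here is a minimal sketch of a fragment of a page. The content is invented for illustration; the point is only that each tag tells the browser what kind of block it is looking at.

```html
<!-- A heading: the title of this block of content -->
<h1>Shopping list</h1>

<!-- A paragraph, with one phrase marked as strongly important -->
<p>Things to buy <strong>before Friday</strong>:</p>

<!-- An unordered list with two list items -->
<ul>
  <li>Milk</li>
  <li>Bread</li>
</ul>
```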

The HTML standard was maintained by the World Wide Web Consortium (W3C) for a very long time. HTML5 became a W3C Recommendation in 2014, replacing HTML 4.01, which had been in use for about 15 years since 1999. Compared with previous versions, HTML5 introduced many interesting features and is considered much easier to use. In the later sections, those new features will be highlighted for ease of reference.

The current specification for HTML is maintained by the WHATWG and is called the HTML Living Standard.

Summary:
It is important to bear in mind that HTML is the backbone of the web and has never stopped evolving. Over the years it has been adjusted continuously to keep up with the rapid development of the internet.

^ Back to the top

Reasons for semantic HTML

Before getting to the reasons for writing HTML in a clear and semantic manner, it helps to know what SEO is.

SEO stands for Search Engine Optimization. Its importance, particularly in internet marketing, can be summarized as follows:

"SEO is performed because a website will receive more visitors from a search engine when websites rank higher on the search engine results page (SERP). These visitors can then potentially be converted into customers." - Wikipedia

To be fair, even if the goal is not to boost the number of potential customers, it is reasonable to want one's webpage to be found and seen by others. After all, that has been the spirit of the internet from the very beginning: to connect with others.

In the past (which was not so long ago), the algorithms search engines employed relied heavily on information provided by webmasters and content providers. The latter could improve the ranking of their webpages by methods as simple as adding repeated or even inaccurate search terms/keywords to the metadata. Needless to say, such manipulation filled the search results with many irrelevant webpages.

In response to such behaviour, search engines started developing more complex ranking algorithms that score a webpage holistically, taking its semantic signals into account.

That is one of the main reasons, and a strong enough one, for writing HTML in a clear and semantic way. HTML may be plain and not very complicated to learn (compared with programming languages). Nonetheless, it is all about semantics, which are crucial for SEO and therefore deserve our attention.


Other than SEO, I think we should bear in mind that the cost of creating something in digital form may be low compared with the material world, but it is not zero. We should always treasure the resources we have at hand and make good use of them.

Writing semantic HTML lowers the maintenance cost and makes the code itself more sustainable. Again, HTML may be plain and not as fancy as programming languages or a stylesheet language such as CSS; however, it is an essential part of the internet, and building a good (sustainable) habit of writing HTML matters just as much as it does for those other languages.

^ Back to the top

How to write clearer HTML? (Good practices for HTML)

The following can serve as a brief reminder/guideline as a starter:

  1. Make good use of semantic tags, for example <main>, <section>, <nav>, <aside>, etc. ...instead of overusing <div> and <span> (use those only when no better option is available).
  2. Give descriptive ID/class names, for example "#picture-left" or ".picture-center", so that the element concerned is easier to refer to in the HTML. ...instead of simply numbering the elements, like "#picture1".
  3. Try to build up the class attribute from several generic classes, the way Bootstrap does; for example, give a forest picture in the center a class of "pictures picture-center picture-landscape". ...thus the element can be manipulated through various classes without always being referred to directly.
  4. Use lowercase and kebab-case throughout HTML, for better readability and consistency. ...let camelCase belong to JavaScript.
  5. Use <h1> on the most important message of the whole page and only ONCE.
  6. When an element's attributes make the line too long, put them on new lines.
  7. Prepare fallback alternatives for any form of multimedia rendering.
  8. Use labels for every labelable element for better machine readability and accessibility.
  9. Add a title attribute to anchor elements (<a>) and, most importantly, give it a meaningful description for the same reason.
  10. Always validate and minify the HTML document. ...everyone makes mistakes one way or another. This habit helps proofread your work and ensures the validity of your code, so it won't be just another tag soup on the internet.
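
The sketch below tries to apply items 1 to 9 of the list to one small, made-up page fragment. The file names (forest.jpg, walk.webm, walk.mp4), the ids, and the text are placeholders invented for this example, and this is only one possible way of following the guidelines. Item 10 (validating and minifying) happens outside the markup and is not shown.

```html
<!-- 1. Semantic tags instead of generic <div>/<span> wrappers -->
<nav>
  <!-- 9. A title attribute with a meaningful description -->
  <a href="#gallery" title="Jump to the photo gallery">Gallery</a>
</nav>

<main>
  <!-- 5. A single <h1> for the most important message of the page -->
  <h1>Forest walks</h1>

  <section id="gallery">
    <h2>Gallery</h2>

    <!-- 2-4. Descriptive, lowercase, kebab-case names; several generic classes combined -->
    <!-- 6. A long attribute list broken across lines -->
    <img
      class="pictures picture-center picture-landscape"
      src="forest.jpg"
      alt="A sunlit path through a pine forest">

    <!-- 7. Fallback content in case the video cannot be played -->
    <video controls>
      <source src="walk.webm" type="video/webm">
      <source src="walk.mp4" type="video/mp4">
      <p>
        Your browser cannot play this video.
        <a href="walk.mp4" title="Download the forest walk video">Download it</a> instead.
      </p>
    </video>
  </section>

  <section>
    <h2>Stay in touch</h2>
    <form>
      <!-- 8. The labelable <input> gets a <label> tied to it through for/id -->
      <label for="visitor-email">Email address</label>
      <input id="visitor-email" name="visitor-email" type="email">
      <button type="submit">Subscribe</button>
    </form>
  </section>
</main>
```

Because the image carries several generic classes, a stylesheet could target all pictures, all centered pictures, or all landscape pictures without needing a unique id for each one.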

By no means is the above list exhaustive; it could carry on endlessly. There is more to take into consideration for good coding practice, however minor and trivial it may seem.

^ Back to the top

HTML elements

Elements are the core part of HTML. Most are wrapped by a start tag/opening tag (<element name>) and an end tag/closing tag (</element name>); a few, the void elements, have only a start tag (<element name> or <element name />). Personally, I find the difference between the two similar to the difference between transitive and intransitive verbs in English. With this markup, User Agents (UAs), a web browser being the typical example, know how to render the content accordingly.
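
A short sketch of the two kinds of element described above. The file name photo.jpg and the text are placeholders made up for this example.

```html
<!-- Start tag, content, end tag -->
<p>This paragraph has a start tag and an end tag.</p>

<!-- Elements can be nested inside other elements -->
<p>Some text with <strong>strong importance</strong>.</p>

<!-- Void elements have only a start tag and cannot hold any content -->
<br>
<hr>
<img src="photo.jpg" alt="A placeholder photo">

<!-- In HTML, the trailing slash on a void element is optional and has no effect -->
<br />
```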

All elements/tags can be grouped into various categories. The "Structure-based Category" and "Content-based Category" below (my own made-up terms) group elements by different logic: either where they are placed or what meaning they define.

  • Structure-based Category
      • Main Root
      • Document Metadata
      • Sectioning Root
      • Content Sectioning
      • Text Content
      • Inline Text Semantics
      • Images and Multimedia
      • SVG and MathML
      • Table Content
      • Form Content
      • Interactive Elements
      • Web Components

In the following lists, each HTML element comes with a brief description and personal notes for highlights (with an example for the elements I am less familiar with).

Main Root

Element Description
<html> Represents the root (top-level element) of an HTML document, and is therefore also referred to as the root element. Every webpage contains exactly one <html> element. All other elements must be its descendants.
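
A minimal sketch of a complete document, to show the root element wrapping everything else. The title, the text, and the lang value are assumptions made for this example.

```html
<!DOCTYPE html>
<!-- The single root element; every other element is one of its descendants -->
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Example page</title>
  </head>
  <body>
    <p>Hello, world.</p>
  </body>
</html>
```

The <!DOCTYPE html> line is not an element itself; it only tells the browser to treat the page as modern HTML.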
^ Back to the top of this section

Sectioning Root

Element Description
<body> Represents the content of an HTML document. There can be only one <body> element in a document.
^ Back to the top of this section

More thoughts on HTML Tags - Common combinations & Differences on Confusing Tags

Common combinations

Differences on Confusing Tags

^ Back to the top

HTML attributes

Global attributes

Other attributes

^ Back to the top
HTML: A good basis for accessibility, MDN
  A further, detailed explanation of the importance of using semantic HTML.
Beware of XHTML, David HAMMOND
  One of many articles and discussions available online explaining the issues related to XHTML.
The W3C Markup Validation Service, W3C
  An HTML validator provided by W3C.
HTML Living Standard, WHATWG
  The official, living specification of the HTML standard.
^ Back to the top