HTML vs XHTML

From SwinBrain

XHTML is the reformulation of HTML 4 as an XML 1.0 application. This means that the syntax standards of XML apply to the way we write our HTML elements and attributes.

XHTML documents, because they must conform to the syntax rules of XML, have a cleaner and more precise markup structure. This is a good thing!

By ensuring that documents are well-formed and valid XHTML, authors are more likely to get the results they want from styling with CSS, from other uses of the DOM such as DHTML features (e.g. client-side JavaScript for form validation), and from web search indexing.

XHTML vs HTML Syntax Rules

  • XHTML elements must be properly nested. Although this is also required for SGML-based HTML, almost all browsers tolerate incorrect nesting, which does not encourage proper nesting.
  • XHTML elements must always be closed, either with a closing tag or with the empty element syntax />
  • XHTML element names must be in lowercase, because XML is case-sensitive and the W3C decided to make everything lowercase.
  • XHTML element attribute names must be in lowercase (for the same reason as element names)
  • XHTML documents must have one root element
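
The nesting, closing and single-root rules above are XML well-formedness rules, so any XML parser can check them. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample fragments are illustrative assumptions, not part of any standard. It does not check the lowercase rules, which are enforced by validating against the XHTML DTD rather than by well-formedness.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Illustrative fragments, one per rule from the list above.
samples = {
    "properly nested": "<p><em>text</em></p>",
    "improperly nested": "<p><em>text</p></em>",
    "unclosed element": "<p>line one<br>line two</p>",
    "empty element syntax": "<p>line one<br />line two</p>",
    "two root elements": "<p>one</p><p>two</p>",
}

for label, markup in samples.items():
    print(f"{label}: {is_well_formed(markup)}")
```

Only the "properly nested" and "empty element syntax" fragments should parse; a typical HTML browser would accept all five.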

HTML 4 Doctypes

In HTML 4, three different document types (DTDs) were defined to divide the many elements used by web developers into more manageable and more specific sets.

This is the declaration for the HTML 4.01 Strict DTD. Note that the name part does not include the word "Strict".

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

Here is the declaration for the Transitional document type. Unfortunately, the DTD file is named "loose.dtd", which can cause some confusion. (XHTML 1.0 fixes this by using consistent document type names and DTD filenames.)

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

Finally, documents that use framesets have their own separate doctype:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">

XHTML 1.0 Doctypes

As in HTML 4, there are three document type (DTD) standards for XHTML 1.0: Strict, Transitional and Frameset. Note: the naming is, thankfully, consistent between document type names and DTD filenames!

Strict is the smallest standard. Start here and avoid deprecated elements:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

Transitional includes everything in Strict, plus the deprecated elements and attributes:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

Frameset includes everything in Transitional, plus the elements for frames:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
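
To make the naming differences between the six declarations concrete, here is a rough sketch that reads a doctype declaration and reports its family, variant and DTD filename. The function name describe_doctype and the regular expression are assumptions made for illustration, and the sketch only recognises the PUBLIC declarations listed in this article.

```python
import re

# Matches declarations of the form:
#   <!DOCTYPE root PUBLIC "public identifier" "system identifier (DTD URL)">
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+\S+\s+PUBLIC\s+"([^"]*)"\s+"([^"]*)"\s*>'
)


def describe_doctype(declaration):
    """Return (family, variant, dtd_file), or None if not recognised.

    Hypothetical helper: it only knows the HTML 4.01 and XHTML 1.0 doctypes.
    """
    match = DOCTYPE_RE.fullmatch(declaration.strip())
    if match is None:
        return None
    public_id, system_id = match.groups()

    # Test for XHTML first; its public identifier never contains "HTML 4.01".
    if "XHTML 1.0" in public_id:
        family = "XHTML 1.0"
    elif "HTML 4.01" in public_id:
        family = "HTML 4.01"
    else:
        return None

    # HTML 4.01 Strict has no variant word in its name, so Strict is the default.
    variant = "Strict"
    for name in ("Transitional", "Frameset"):
        if name in public_id:
            variant = name

    # The DTD filename is the last path segment of the system identifier.
    dtd_file = system_id.rsplit("/", 1)[-1]
    return family, variant, dtd_file


html4_transitional = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" '
    '"http://www.w3.org/TR/html4/loose.dtd">'
)
print(describe_doctype(html4_transitional))
```

For the HTML 4.01 Transitional declaration this reports the variant as Transitional but the filename as loose.dtd, which is the mismatch that XHTML 1.0 removes.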