XHTML 1.0: Where XML and HTML meet.
A short one-page introduction to XHTML.

What is it?
The XHTML family is a reformulation of HTML 4 in XML. It represents the next step in the evolution of the Internet. It lets you enter the XML world, creating content that is both backward and future compatible. XHTML 1.0 became an official W3C Recommendation on January 26, 2000, so when you read this you are on the cutting edge of it all.

Benefits and transition path
Old HTML sins to overcome
Old HTML sins (continued)
Backwards compatibility to HTML
Transition pains from HTML to XHTML
Modularization of XHTML
Extension Modules
Document Profiles and Conclusion
Why all this? What is the problem with good old HTML?
We explored the shortcomings of HTML a while ago. In a nutshell:

HTML has a fixed set of elements, so it is not easily extended to different needs
HTML was designed with the PC in mind, not taking into account the multitude of alternative platforms coming to the Web, like TVs, mobile phones and digital tablets.
HTML is defined relatively sloppily, requiring parsers to be quite forgiving and intelligent in fixing problematic markup on the fly. This intelligence is a large part of the disk space your favorite browser takes up.

So what is there to gain?
Citing the W3C recommendation on XHTML (in rearranged order of my humbly perceived importance):

"Document developers and user agent designers are constantly discovering new ways to express their ideas through new markup. In XML, it is relatively easy to introduce new elements or additional element attributes. The XHTML family is designed to accommodate these extensions through XHTML modules and techniques for developing new XHTML-conforming modules (described in the forthcoming XHTML Modularization specification). These modules will permit the combination of existing and new feature sets when developing content and when designing new user agents.
Alternate ways of accessing the Internet are constantly being introduced. Some estimates indicate that by the year 2002, 75% of Internet document viewing will be carried out on these alternate platforms. The XHTML family is designed with general user agent interoperability in mind. Through a new user agent and document profiling mechanism, servers, proxies, and user agents will be able to perform best effort content transformation. Ultimately, it will be possible to develop XHTML-conforming content that is usable by any XHTML-conforming user agent.
XHTML documents conform to the XML standard, so they are readily viewed, edited, and validated with standard XML tools.
XHTML documents can still be written to operate as well or better than they did before in existing HTML 4-conforming user agents as well as in new, XHTML 1.0 conforming user agents.
XHTML documents can utilize applications (e.g. scripts and applets) that rely upon either the HTML Document Object Model or the XML Document Object Model [DOM].
As the XHTML family evolves, documents conforming to XHTML 1.0 will be more likely to interoperate within and among various XHTML environments."

How do HTML documents become XHTML?
An XHTML document must:

validate against one of the three DTDs.
start with the root element <html>.
refer to the XHTML namespace http://www.w3.org/1999/xhtml in its root element.
contain one of the following DOCTYPE declarations prior to the root element:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "DTD/xhtml1-strict.dtd">
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "DTD/xhtml1-transitional.dtd">
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "DTD/xhtml1-frameset.dtd">

Here is an example of a minimal XHTML document:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>My first XHTML page</title>
  </head>
  <body>
    <p>Hello XHTML world!</p>
  </body>
</html>
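
As a rough illustration of the checklist above, the following Python sketch uses the standard library's xml.dom.minidom to test three of the four requirements on the minimal document. It does not validate against the DTD (the standard library has no validating parser), and the name check_xhtml_basics is made up for this example.

```python
from xml.dom import minidom

XHTML_NS = "http://www.w3.org/1999/xhtml"
XHTML_PUBLIC_IDS = {
    "-//W3C//DTD XHTML 1.0 Strict//EN",
    "-//W3C//DTD XHTML 1.0 Transitional//EN",
    "-//W3C//DTD XHTML 1.0 Frameset//EN",
}

# The minimal document from above, condensed onto fewer lines.
MINIMAL_DOC = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "DTD/xhtml1-strict.dtd">\n'
    '<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">\n'
    "  <head><title>My first XHTML page</title></head>\n"
    "  <body><p>Hello XHTML world!</p></body>\n"
    "</html>\n"
).encode("utf-8")


def check_xhtml_basics(data):
    """Return the structural checks that need no DTD validation."""
    dom = minidom.parseString(data)  # raises ExpatError if not well-formed
    root = dom.documentElement
    doctype = dom.doctype
    return {
        "root_is_html": root.tagName == "html",
        "xhtml_namespace": root.namespaceURI == XHTML_NS,
        "known_doctype": doctype is not None
        and doctype.publicId in XHTML_PUBLIC_IDS,
    }


print(check_xhtml_basics(MINIMAL_DOC))
```

A document that passes all three checks may still be invalid; only a validating parser or the W3C validator can confirm the first requirement.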

=====================================================
Which old HTML sins to overcome?
The Ten Commandments of XHTML are:

Fix up all documents that are not well-formed
Well-formedness is a new concept introduced by XML. Essentially this means that all elements must be properly nested. Although overlapping is illegal in SGML, it was widely tolerated in existing browsers.
NOT: <p>this is <em>emphasized</p></em>
BUT: <p>this is <em>emphasized</em></p>
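
One way to see the difference is to hand both fragments to an XML parser. This is a minimal sketch using Python's xml.etree.ElementTree; the helper name is_well_formed is my own.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<p>this is <em>emphasized</p></em>"))  # overlapping tags
print(is_well_formed("<p>this is <em>emphasized</em></p>"))  # properly nested
```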

Change all tags and attribute names to lower case
XHTML documents must use lower case for all HTML element and attribute names. XML is case-sensitive, so for instance <li> and <LI> are different tags.
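
The case sensitivity is easy to observe with any XML parser. The sketch below, again assuming Python's xml.etree.ElementTree, parses a made-up list with one lower-case and one upper-case item.

```python
import xml.etree.ElementTree as ET

items = ET.fromstring("<ul><li>lower</li><LI>upper</LI></ul>")

# The two children carry different names as far as XML is concerned.
print([child.tag for child in items])

# A lookup for "li" therefore finds only the lower-case element.
lower_only = items.findall("li")
print(len(lower_only))
```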

Add end tags for non-empty elements
In SGML-based HTML 4, certain elements were permitted to omit the end tag, with the elements that followed implying closure. This omission is not permitted in XML-based XHTML. All elements other than those declared in the DTD as EMPTY must have an end tag.
NOT: <p>one<p>two
BUT: <p>one</p><p>two</p>

Quote all attribute values
All attribute values must be quoted, even those which appear to be numeric.
NOT: <td rowspan=3>
BUT: <td rowspan="3">

Unminimize attributes
XML does not support attribute minimization; attribute-value pairs must be written in full. Attribute names such as compact and checked cannot occur in elements without their value being specified.
NOT: <dl compact>
BUT: <dl compact="compact">
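
The lower-casing, quoting, and unminimizing rules lend themselves to automation. Here is a minimal sketch, assuming Python's lenient html.parser module is an acceptable front end for old HTML. The class and function names are hypothetical, and only start tags are rewritten; end tags and nesting are left alone.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import quoteattr


class StartTagFixer(HTMLParser):
    """Collect XHTML-style versions of the start tags in old HTML."""

    def __init__(self):
        super().__init__()
        self.fixed = []

    def handle_starttag(self, tag, attrs):
        # html.parser already lower-cases tag and attribute names.
        parts = [tag]
        for name, value in attrs:
            if value is None:  # minimized attribute such as "compact"
                value = name
            parts.append(name + "=" + quoteattr(value))
        self.fixed.append("<" + " ".join(parts) + ">")


def fix_start_tags(html_text):
    fixer = StartTagFixer()
    fixer.feed(html_text)
    fixer.close()
    return fixer.fixed


print(fix_start_tags("<DL COMPACT><TD ROWSPAN=3>"))
```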

Correctly tag empty elements
Empty elements must either have an end tag or the start tag must end with />. For instance, <br/> or <hr></hr>. See below for information on ways to ensure this is backward compatible with HTML 4 user agents.
NOT: <br><hr>
BUT: <br/><hr/>

Pay attention to whitespace handling in attribute values
User agents will strip leading and trailing whitespace from attribute values and map sequences of one or more whitespace characters (including line breaks) to a single inter-word space (an ASCII space character for western scripts).
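
To preview what a user agent might end up with, the normalization can be emulated. The sketch below is an approximation: a non-validating XML parser that does not read the DTD only turns each whitespace character into a space, so the collapsing and trimming are done by a helper (normalize_attribute, my own name) that follows the description above.

```python
import re
import xml.etree.ElementTree as ET


def normalize_attribute(value):
    """Collapse runs of XML whitespace to one space and trim both ends."""
    return re.sub(r"[ \t\r\n]+", " ", value).strip(" ")


element = ET.fromstring('<img alt="  a photo\n   of the   harbour  " />')
raw = element.get("alt")

# The XML parser has already replaced the line break with a space ...
print("\n" in raw)
# ... and the helper shows the value after full normalization.
print(normalize_attribute(raw))
```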

Escape or externalize script and style elements
In XHTML, the script and style elements are declared as having parsed character content. As a result, < and & will be treated as the start of markup, and entities such as &lt; and &amp; will be recognized as entity references by the XML processor to < and & respectively. Wrapping the content of the script or style element within a CDATA marked section avoids the expansion of these entities.
<script type="text/javascript">
<![CDATA[
... unescaped script content ...
]]>
</script>

CDATA sections are recognized by the XML processor and appear as nodes in the Document Object Model. An alternative is to use external script and style documents.
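
The effect of the CDATA section can be checked with a parser. This sketch assumes Python's xml.etree.ElementTree, which merges the CDATA content into the element text rather than exposing a separate node; the script body is invented for the example.

```python
import xml.etree.ElementTree as ET

SCRIPT_BODY = " if (a < b && b < c) { swap(); } "

wrapped = "<script><![CDATA[" + SCRIPT_BODY + "]]></script>"
bare = "<script>" + SCRIPT_BODY + "</script>"

# Inside the CDATA section, < and & reach the application untouched.
print(ET.fromstring(wrapped).text == SCRIPT_BODY)

# Without it, the same characters are read as markup and parsing fails.
try:
    ET.fromstring(bare)
    bare_parses = True
except ET.ParseError:
    bare_parses = False
print(bare_parses)
```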

Stick to the existing SGML exclusions
SGML gives the writer of a DTD the ability to exclude specific elements from being contained within an element. Such prohibitions (called "exclusions") are not possible in XML. For example, the HTML 4 Strict DTD does not allow the nesting of an 'a' element within another 'a' element. It is not possible to express this in XML. Even though these restrictions cannot be defined in the DTD, certain elements should not be nested.
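
Since the DTD cannot enforce these prohibitions, a separate check has to. Below is a minimal sketch of such a check. The table covers the 'a' case from the text plus a few other prohibitions from the XHTML 1.0 recommendation and is not complete. The sample fragment carries no namespace, so tag names are compared as plain strings; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Ancestor element -> elements that must not appear anywhere inside it.
PROHIBITED = {
    "a": {"a"},
    "form": {"form"},
    "label": {"label"},
    "pre": {"img", "object", "big", "small", "sub", "sup"},
}


def find_exclusion_violations(root):
    """Return (ancestor, descendant) tag pairs that break an exclusion."""
    violations = []
    for ancestor in root.iter():
        banned = PROHIBITED.get(ancestor.tag)
        if not banned:
            continue
        for descendant in ancestor.iter():
            if descendant is not ancestor and descendant.tag in banned:
                violations.append((ancestor.tag, descendant.tag))
    return violations


doc = ET.fromstring(
    '<p><a href="outer.html">outer <em><a href="inner.html">inner</a></em></a></p>'
)
print(find_exclusion_violations(doc))
```

Note that the nested link is caught even though it sits inside an em element: exclusions apply at any depth, not just to direct children.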

Use id for fragment identifiers, not name
HTML 4 defined the name attribute for the elements a, applet, form, frame, iframe, img, and map. HTML 4 also introduced the id attribute. Both of these attributes are designed to be used as fragment identifiers.
In XML, fragment identifiers are of type ID, and there can only be a single attribute of type ID per element. Therefore, in XHTML 1.0 the id attribute is defined to be of type ID. In order to ensure that XHTML 1.0 documents are well-structured XML documents, XHTML 1.0 documents must use the id attribute when defining fragment identifiers, even on elements that historically have also had a name attribute. In XHTML 1.0, the name attribute of these elements is formally deprecated, and will be removed in a subsequent version of XHTML.

=====================================================
How to ensure backwards compatibility?
Here are some design guidelines to follow for XHTML documents to render correctly in existing HTML user agents.

Properly format empty elements I
Include a space before the trailing / and > of empty elements, e.g. <br />, <hr /> and <img src="photo.jpg" alt="Photo" />. Also, use the minimized tag syntax for empty elements, e.g. <br />, as the alternative syntax <br></br> allowed by XML gives uncertain results in many existing user agents.

Properly format empty elements II
Given an empty instance of an element whose content model is not EMPTY (for example, an empty title or paragraph) do not use the minimized form (e.g. use <p> </p> and not <p />).
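
Both empty-element guidelines can be applied at serialization time. The sketch below assumes Python's xml.etree.ElementTree, whose serializer happens to write minimized tags with the recommended space before the slash. The helper name is my own, and the sample fragment carries no namespace.

```python
import xml.etree.ElementTree as ET

# Elements declared EMPTY in the XHTML 1.0 DTDs.
EMPTY_ELEMENTS = {
    "area", "base", "basefont", "br", "col", "frame",
    "hr", "img", "input", "isindex", "link", "meta", "param",
}


def serialize_compatibly(root):
    """Serialize so that only EMPTY elements use the minimized form."""
    for element in root.iter():
        is_blank = element.text is None and len(element) == 0
        if is_blank and element.tag not in EMPTY_ELEMENTS:
            # A single space forces <p> </p> instead of <p />.
            element.text = " "
    return ET.tostring(root, encoding="unicode")


fragment = ET.fromstring("<div><br/><p/><hr></hr></div>")
print(serialize_compatibly(fragment))
```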

Eliminate embedded Style Sheets and Scripts
Use external style sheets and scripts if they use < or & or ]]> or --. XML parsers are permitted to silently remove the contents of comments, so the historical practice of "hiding" scripts and style sheets within comments to make the documents backward compatible is likely to not work as expected in XML-based implementations.

Avoid Line Breaks within Attribute Values
Avoid line breaks and multiple whitespace characters within attribute values. These are handled inconsistently by user agents.

Use only one isindex element
Don't include more than one isindex element in the document head. It is deprecated in favor of the input element.

Use the lang and xml:lang Attributes
Use both the lang and xml:lang attributes when specifying the language of an element.

Fix up Fragment Identifiers
In XML, URIs [RFC2396] that end with fragment identifiers of the form "#target" do not refer to elements with an attribute name="target"; rather, they refer to elements with an attribute defined to be of type ID, e.g., the id attribute in HTML 4. Many existing HTML clients don't support the use of ID-type attributes in this way, so identical values may be supplied for both of these attributes to ensure maximum forward and backward compatibility (e.g., <a id="target" name="target">...</a>).
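
Supplying both attributes by hand is tedious, so here is a small sketch that mirrors name into id on anchors. It assumes an un-namespaced fragment parsed with Python's xml.etree.ElementTree, and it does not check that the name is a legal, unique ID value. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def mirror_name_to_id(root):
    """Give every named anchor a matching id, leaving existing ids alone."""
    changed = 0
    for anchor in root.iter("a"):
        name = anchor.get("name")
        if name is not None and anchor.get("id") is None:
            anchor.set("id", name)
            changed += 1
    return changed


body = ET.fromstring(
    '<body><a name="target">Section</a><a href="#target">jump</a></body>'
)
print(mirror_name_to_id(body))
print(body.find("a").attrib)
```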

Add the XML Character Encoding declaration
To specify a character encoding in the document, use both the encoding attribute specification on the xml declaration (e.g. <?xml version="1.0" encoding="EUC-JP"?>) and a meta http-equiv statement (e.g. <meta http-equiv="Content-type" content="text/html; charset=EUC-JP" />).

Use Ampersand entities in Attribute Values
When an attribute value contains an ampersand, it must be expressed as a character entity reference (e.g. "&amp;"). For example, when the href attribute of the a element refers to a CGI script that takes parameters, the ampersand separating them must be escaped:
NOT: http://webref.com/cgi-bin/xml/demo1.pl?style=none&name=user
BUT: http://webref.com/cgi-bin/xml/demo1.pl?style=none&amp;name=user
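
Page generators can do this escaping automatically. Here is a minimal sketch using Python's xml.sax.saxutils, applied to the URL from the example. It assumes the input is not escaped yet; running it twice would escape the ampersand twice.

```python
from xml.sax.saxutils import escape, quoteattr

url = "http://webref.com/cgi-bin/xml/demo1.pl?style=none&name=user"

# escape() replaces &, < and > with their entity references.
print(escape(url))

# quoteattr() does the same and adds the surrounding quotes.
print("<a href=" + quoteattr(url) + ">demo</a>")
```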

=====================================================
Any transition pains?
Unfortunately, yes. Some of the subtle differences in HTML and XML encoding cause some difficulties:

Boolean Attributes
Some browsers cannot interpret boolean attributes when these appear in their full, non-minimized form, as required by XML 1.0. This problem doesn't affect user agents compliant with HTML 4, though. The following attributes are involved: compact, nowrap, ismap, declare, noshade, checked, disabled, readonly, multiple, selected, noresize, defer.

Document Object Model and XHTML
The Document Object Model level 1 Recommendation defines document object model interfaces for XML and HTML 4. The HTML 4 document object model specifies that HTML element and attribute names are returned in upper-case. The XML document object model specifies that element and attribute names are returned in the case they are specified. In XHTML 1.0, elements and attributes are specified in lower-case. This apparent difference can be addressed in two ways:
Applications that access XHTML documents served as Internet media type text/html via the DOM can use the HTML DOM, and can rely upon element and attribute names being returned in upper-case from those interfaces.
Applications that access XHTML documents served as Internet media types text/xml or application/xml can also use the XML DOM. Elements and attributes will be returned in lower-case. Also, some XHTML elements may or may not appear in the object tree because they are optional in the content model (e.g. the tbody element within table). This occurs because in HTML 4 some elements were permitted to be minimized such that their start and end tags are both omitted (an SGML feature). This is not possible in XML. Rather than require document authors to insert extraneous elements, XHTML has made the elements optional. Applications need to adapt to this accordingly.
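
Scripts that must run against either DOM can avoid both pitfalls by comparing names case-insensitively and by not assuming that tbody is present. The sketch below shows the idea with Python's xml.dom.minidom standing in for a browser's XML DOM; the helper names and the sample markup are my own.

```python
from xml.dom import minidom


def has_tag_name(element, expected):
    """Compare tag names without depending on the case the DOM reports."""
    return element.tagName.lower() == expected.lower()


def table_rows(table):
    """Return tr children whether or not they are wrapped in tbody."""
    rows = []
    for child in table.childNodes:
        if child.nodeType != child.ELEMENT_NODE:
            continue
        if has_tag_name(child, "tr"):
            rows.append(child)
        elif has_tag_name(child, "tbody"):
            rows.extend(table_rows(child))
    return rows


dom = minidom.parseString(
    "<div><table><tr><td>1</td></tr></table>"
    "<table><tbody><tr><td>1</td></tr><tr><td>2</td></tr></tbody></table></div>"
)
tables = dom.getElementsByTagName("table")
print(dom.documentElement.tagName)  # reported in the case used in the source
print([len(table_rows(t)) for t in tables])
```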

XML Processing Instructions
Be aware that processing instructions are rendered on some user agents. However, also note that when the XML declaration is not included in a document, the document can only use the default character encodings UTF-8 or UTF-16.

Cascading Style Sheets (CSS) and XHTML
The Cascading Style Sheets level 2 Recommendation [CSS2] defines style properties which are applied to the parse tree of the HTML or XML document. Differences in parsing will produce different visual or aural results, depending on the selectors used. The following hints will reduce this effect for documents which are served without modification as both media types:
CSS style sheets for XHTML should use lower case element and attribute names. In tables, the tbody element will be inferred by the parser of an HTML user agent, but not by the parser of an XML user agent. Therefore you should always explicitly add a tbody element if it is referred to in a CSS selector.
Within the XHTML name space, user agents are expected to recognize the "id" attribute as an attribute of type ID. Therefore, style sheets should be able to continue using the shorthand "#" selector syntax even if the user agent does not read the DTD.
Within the XHTML name space, user agents are expected to recognize the "class" attribute. Therefore, style sheets should be able to continue using the shorthand "." selector syntax.
CSS defines different conformance rules for HTML and XML documents; be aware that
  the HTML rules apply to XHTML documents delivered as HTML
     and
  the XML rules apply to XHTML documents delivered as XML.
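
The tbody hint can be applied mechanically before a document is published. This is a minimal sketch with Python's xml.etree.ElementTree that wraps the direct tr children of a table in an explicit tbody. It assumes un-namespaced markup and ignores thead and tfoot; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def add_explicit_tbody(table):
    """Wrap the table's direct tr children in a single tbody element."""
    rows = [child for child in table if child.tag == "tr"]
    if not rows:
        return table
    position = list(table).index(rows[0])
    tbody = ET.Element("tbody")
    for row in rows:
        table.remove(row)
        tbody.append(row)
    table.insert(position, tbody)
    return table


table = ET.fromstring(
    "<table><caption>t</caption><tr><td>1</td></tr><tr><td>2</td></tr></table>"
)
add_explicit_tbody(table)
print([child.tag for child in table])
```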

=====================================================
XHTML Modules
XHTML modules specify well-defined sets of XHTML elements that can be combined and extended to deliver content on a greater number and diversity of platforms.

Modularizing XHTML provides a means for product designers to specify which elements are supported by a device using standard building blocks and standard methods for specifying which building blocks are used. It is not economically feasible for content developers to tailor content to each and every permutation of XHTML elements. By specifying a standard, either software processes can autonomously tailor content to a device, or the device can automatically load the software required to process a module.

Modularization also allows for the extension of XHTML's layout and presentation capabilities, using the extensibility of XML, without breaking the XHTML standard. This development path provides a stable, useful, and implementable framework for content developers and publishers to manage the rapid pace of technological change on the Web.

The modules themselves are not yet finalized. Nevertheless, it is useful to get a feeling for the proposed granularity of those modules, and I personally do not expect significant changes to the final recommendation.

Basic Modules
The basic modules are modules that are required to be present in any XHTML Family Conforming Document Type.

Structure Module
The Structure Module defines the major structural elements for XHTML. These elements effectively act as the basis for the content model of many XHTML family document types. The elements and attributes included in this module are:

html
head
title
body
This module is the basic structural definition for XHTML content. The html element acts as the root element for all XHTML Family Document Types.

Basic Text Module
This module defines all of the basic text container elements, attributes, and their content model. Some prominent examples are the headings h1, h2, h3, h4, h5, h6, block directives address, blockquote, div, p, pre, and inline tags abbr, acronym, br, cite, code, dfn, em, kbd, q, samp, span, strong, var.

Hypertext Module
This module adds the a element to the Inline content set of the Basic Text Module.

List Module
As its name suggests, the List Module provides list-oriented elements. Specifically, the List Module supports the elements:

dl, containing dt's and dd's
ol, containing li's
ul, containing li's
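
To get a feeling for the granularity, the basic modules can be written down as a lookup table and matched against a document. The sketch below copies the element lists from the descriptions above, not from the still-unfinished specification, and its function name and sample page are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Element lists copied from the module descriptions above.
BASIC_MODULES = {
    "Structure": {"html", "head", "title", "body"},
    "Basic Text": {
        "h1", "h2", "h3", "h4", "h5", "h6",
        "address", "blockquote", "div", "p", "pre",
        "abbr", "acronym", "br", "cite", "code", "dfn", "em",
        "kbd", "q", "samp", "span", "strong", "var",
    },
    "Hypertext": {"a"},
    "List": {"dl", "dt", "dd", "ol", "ul", "li"},
}


def modules_used(root):
    """Map each basic module to the elements of it found in the document.

    Elements outside the basic modules are collected under the key None.
    """
    used = {}
    for element in root.iter():
        module = next(
            (name for name, tags in BASIC_MODULES.items() if element.tag in tags),
            None,
        )
        used.setdefault(module, set()).add(element.tag)
    return used


page = ET.fromstring(
    "<html><head><title>t</title></head>"
    "<body><h1>News</h1><ul><li><a href='x'>x</a></li></ul><table/></body></html>"
)
print(modules_used(page))
```

Anything that lands under None, such as the table in the sample page, would need one of the extension modules.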

=====================================================
Extension Modules
While all user agents have to support the basic modules, they may or may not support any of the following modules, many of which contain only one or two elements:

Applet: supports the applet and param elements
Text extension: defines a variety of additional textual markup modules
Presentation: contains character modifiers like b, big, i, small, sub, sup, tt, and the horizontal ruler hr.
Edit: deletions and insertions with del and ins for citations, dates, or inline content.
BDO: can be used to declare the bi-directional rules for the element's content.
Forms: There is one module for the forms features found in HTML 3.2, and one for those in HTML 4.0.
Tables: One basic module for the table-related elements table, td, tr, th and caption, and a more advanced one for table-related elements that improve access with non-visual user agents.
Images: basic image embedding with the img tag, and may be used in some implementations independently of client side image maps.
Client-side Image Maps: the area and map elements for client side image maps. It requires that the Image Module (or another module that supports the img element) be included.
Server-side Image Maps: provides support for image-selection and transmission of selection coordinates. It requires that the Image Module (or another module that supports the img element) be included. The Server-side Image Map Module adds the ismap attribute to the img tag.
Objects: elements for general-purpose object inclusion. Specifically, the Object Module sports the object and param tags.
Frames: all frame-related elements like frameset, frame, noframes.
Iframes: the iframe element, which defines an inline frame.
Events: all of the well-known onXXX event handler attributes such as onload, onfocus.
Metainformation: the meta element that describes information within the declarative portion of a document (in XHTML within the head element).
Scripting: elements that are used to contain information pertaining to executable scripts or the lack of support for executable scripts, namely script and noscript.
Stylesheet Module: enables style sheet processing with the style element.
Link Module: used to define links to external resources with the link element.
URL base: the base element that can be used to define a base URL against which relative URIs in the document will be resolved.
Legacy: elements and attributes that were deprecated in previous versions of HTML and XHTML, namely font, s, strike, u, the body attributes background, bgcolor, text, link, vlink, alink, and the br attribute clear.

=====================================================
Document Profiles
With the modularization of XHTML we have solved one problem: the ever-growing variety of Web clients can pick and choose the subsets of the HTML standard that they can meaningfully support given their form factors and display capabilities. But we created two new problems:

How does a client advertise its rendering capabilities to a server?
How does a document express the modules used for its content?
The latter will be addressed through not yet specified document profiles, but will the former remain the art of mapping user-agent-strings to capabilities? Who knows... you?

Conclusion
The reformulation of HTML in XML is an elegant way to bring both worlds together in a future-oriented but compatible way. While it is unlikely that every hand-written HTML page will be upgraded to XHTML, writing new pages in XHTML and improving the templates of page generators like ASP and JSP will give your documents a much wider audience in the brave new world of non-PC Web clients. Make sure you do the best you can to make it happen:

If you are a Web master, start planning the migration of your site to XHTML now!
If you are an HTML tool author, upgrade your tool to support document creation in XHTML now!
If you write HTML documents, adhere to the XHTML rules and DTDs now! Use the W3C validator to be sure you got it right. (Yes, this document has been validated. The remaining errors on lines 30, 84, and 92 are caused by server-side includes not under my direct control.)
Thank you, in the name of all current and future owners of Web-enabled PDAs, phones, TVs, and toasters!
=====================================================