HTML is probably the world's most important data format, and so changes come very slowly. But the World Wide Web Consortium's HTML Working Group has big plans for HTML.
The current HTML standard is actually now XHTML 1.0 Second Edition, which is a set of minor changes to HTML to turn it into valid XML while still allowing current Web browsers to handle it.
The big news is XHTML 2.0, a from-the-ground-up rethinking of HTML that keeps its strengths while ditching some long-standing stupidities or legacy items. It will not be fully backward-compatible with XHTML, so the W3C has freedom to introduce some fundamental (and much-needed) redesigns.
XHTML 2.0 is far off, and since updated browsers will be needed to view it, it could hardly be otherwise. The W3C anticipates an XHTML 2.0 Candidate Recommendation in October 2003 and a Proposed Recommendation in July 2004, so we won't be seeing compliant software until the end of 2004.
XHTML 2.0 has many important changes for Web developers. I'd like to focus on changes to HTML forms handling, since this area of HTML is undergoing a total rewrite, with many enterprise-friendly improvements on tap.
XHTML 2.0 will use XForms 1.0 as its form-handling technology. XForms, of course, uses an XML-based syntax to describe forms and parameters.
My single favorite improvement is the addition of strong data typing to XHTML forms. There's now a formalized way to mark form fields as required instead of having to write JavaScript or server-side logic to always check for this.
Any forms parameter can have an associated XML Schema data type. There are the basic integer and float types, of course, but the real winner is XML Schema's support for ranges, enumerations and arbitrary string regular expressions (allowing text fields to be very precisely input-masked).
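To make the idea concrete, here is a rough sketch in Python of the kinds of checks such typing implies: required fields, numeric ranges, enumerations and regular-expression patterns. This is not XForms or XML Schema syntax; `FieldRule`, `check` and the three sample fields are names invented purely for illustration. The pattern is matched against the whole value, since XML Schema patterns are implicitly anchored.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class FieldRule:
    """Hypothetical stand-in for a typed form field; not an XForms API."""
    required: bool = False
    cast: Callable[[str], object] = str        # e.g. int or float
    min_value: Optional[float] = None          # inclusive lower bound
    max_value: Optional[float] = None          # inclusive upper bound
    choices: Optional[Sequence[str]] = None    # enumeration of allowed values
    pattern: Optional[str] = None              # regex the whole value must match


def check(value: str, rule: FieldRule) -> list:
    """Return a list of problems; an empty list means the value is acceptable."""
    if value == "":
        return ["required"] if rule.required else []
    try:
        typed = rule.cast(value)
    except ValueError:
        return ["wrong type"]
    problems = []
    if rule.min_value is not None and typed < rule.min_value:
        problems.append("below minimum")
    if rule.max_value is not None and typed > rule.max_value:
        problems.append("above maximum")
    if rule.choices is not None and value not in rule.choices:
        problems.append("not an allowed choice")
    if rule.pattern is not None and re.fullmatch(rule.pattern, value) is None:
        problems.append("does not match pattern")
    return problems


# Illustrative fields for an imaginary order form.
quantity = FieldRule(required=True, cast=int, min_value=1, max_value=99)
shipping = FieldRule(choices=("ground", "air"))
zip_code = FieldRule(required=True, pattern=r"\d{5}(-\d{4})?")
```

With these rules, `check("5", quantity)` comes back empty, while `check("100", quantity)` and `check("boat", shipping)` each report one problem. In XForms the equivalent rules would be declared in markup rather than written as code, which is the whole point.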
XML Schema actually has more than double the number of data types in SQL, so XForms will have stronger data typing than the databases storing the data they collect! I am a database guy, so data cleanliness is a big deal to me.
Since this input-checking happens client-side, this won't have a security advantage (you have to check again at the server since the browser is easy to bypass), but it will be a big usability and data-correctness win.
One area that still needs work is date handling, as the XML Schema date and time formats are both precise and cumbersome. I hope user agents (browsers) have some date parsing intelligence to try to normalize different date styles before validating them. Because the browser will know what kind of data type is expected, it could even do something like pop up a calendar control to avoid the whole date parsing issue entirely.
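The sort of date-parsing intelligence I have in mind could look something like the following Python sketch, which tries a handful of common input styles and normalizes whichever one matches to the YYYY-MM-DD form that the XML Schema date type expects. The list of formats and the `normalize_date` name are my own assumptions, month names are assumed to be English, and ambiguous input such as 03/04/2003 is read month-first purely for the sake of the example.

```python
from datetime import datetime
from typing import Optional

# Assumed input styles, tried in order. This list is illustrative only.
CANDIDATE_FORMATS = (
    "%Y-%m-%d",    # 2003-01-27 (already in XML Schema form)
    "%m/%d/%Y",    # 1/27/2003 (month-first is an assumption)
    "%B %d, %Y",   # January 27, 2003
    "%d %B %Y",    # 27 January 2003
    "%b %d, %Y",   # Jan 27, 2003
)


def normalize_date(text: str) -> Optional[str]:
    """Return the date as YYYY-MM-DD, or None if no assumed format matches."""
    cleaned = " ".join(text.split())  # collapse stray whitespace
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue  # wrong style or impossible date; try the next format
    return None
```

Anything that cannot be parsed, including impossible dates such as 2/30/2003, comes back as `None`, at which point the user agent would still need to ask the user to try again.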
XForms also takes form presentation and formatting (fonts, colors, and so on) out of the form itself. Even the type of form control is not specified any longer. Instead, user agents will (by default) render the form in the most appropriate way given the type of user hardware. Style sheets can be used to change this default formatting as Web designers desire.
Forms can be initialized with XML data in a structured way instead of having to fill in value data in an HTML form the way we do today—one could even initialize an XForms form with the result of a Web service query to provide easy personalization.
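As a rough illustration of what structured initialization buys you, the Python sketch below takes a small XML document and turns it into a set of per-field starting values. The `<order>` document, its element names and the `initial_values` helper are all hypothetical; a real XForms processor would bind form controls to the instance data itself rather than flatten it like this.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance data, e.g. the result of a personalization query.
INSTANCE_XML = """
<order>
  <customer>
    <name>Pat Example</name>
    <email>pat@example.com</email>
  </customer>
  <quantity>1</quantity>
</order>
"""


def initial_values(instance_xml: str) -> dict:
    """Map each leaf element's path to its text, giving per-field defaults."""
    root = ET.fromstring(instance_xml)
    values = {}

    def walk(element, path):
        children = list(element)
        if not children:
            # Leaf element: its text is the starting value for that field.
            values[path] = (element.text or "").strip()
        for child in children:
            walk(child, f"{path}/{child.tag}")

    walk(root, root.tag)
    return values


defaults = initial_values(INSTANCE_XML)
```

Here `defaults["order/customer/name"]` holds the customer's name and `defaults["order/quantity"]` holds `"1"`, with no value attributes stitched into the form markup by hand.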
By default, XForms will submit data in XML format, so servers will need to handle forms parsing differently. This should be transparent to Web application code, though, as the Web scripting engine should take care of the necessary parsing.
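On the server side, the extra parsing step might be as small as the Python sketch below, which flattens a simple XML request body into the same kind of name-and-value dictionary that application code already expects from today's forms. The `<order>` payload and both function names are assumptions for illustration, and the sketch only handles one level of child elements.

```python
import xml.etree.ElementTree as ET


def parse_submission(body: bytes) -> dict:
    """Flatten a simple, one-level XML submission into a name -> value dict."""
    root = ET.fromstring(body)
    return {child.tag: (child.text or "").strip() for child in root}


def handle_order(fields: dict) -> str:
    """Stand-in for application code, which never sees the XML itself."""
    return f"{fields['quantity']} x {fields['item']}"


# Hypothetical request body as a form might post it.
posted = b"<order><item>widget</item><quantity>3</quantity></order>"
summary = handle_order(parse_submission(posted))
```

The application function works on plain field names and values, which is why the change in wire format should be invisible once the scripting engine does this step for you.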
The W3C's XForms homepage lists early implementations of client and server software that already support XForms, so you can try it out now. I can't wait to deploy this technology in our own applications. There are big benefits for users and developers ahead.
What changes to HTML would you like to see? I can be reached at timothy_dyck@ziffdavis.com.