HTML5 + RDFa = time to get rid of that 20th century furniture

We're entering a new era of the web. To the ignorant masses, this transition will go largely unnoticed; they'll enjoy increased usability and convenience, with more robust functionality and more relevant data at hand. And they'll mostly just take it for granted.

However, for web designers, front-end developers and data system programmers, we have a lot of work to do.

Why HTML5?

Why indeed? As someone who's worked almost exclusively with Drupal since 2004, my nose has been pretty much in xhtml 1.1. Back then, moving to xhtml took some learning and patience on my part, having played with basic HTML since 1995. Now xhtml feels like the familiar friend and HTML the ugly cousin.

But then I started really looking at HTML5. And the more I am learning about it, the more I am appreciating how HTML5 looks to be a real game-changer.

DOMinate the web

Most of the buzz you see online about HTML5 focuses on the particulars, with the bulk of the coverage devoted to how HTML5's media tags stand to put most uses of Flash out to pasture. And that's certainly big.

However, there's something more fundamental in the change HTML5 is bringing to the web. I quote from Introducing HTML5 (Voices That Matter), by Bruce Lawson and Remy Sharp:

Many of our current methods of developing sites and applications rely on undocumented (or at least unspecified) features incorporated into browsers over time. For example, XMLHttpRequest (XHR) powers untold numbers of Ajax-driven sites. It was invented by Microsoft, and subsequently reverse engineered and incorporated into all other browsers, but had never been specified as a standard…. So one of the first tasks of HTML5 was to document the undocumented, in order to increase interoperability by leaving less to guesswork for web authors and implementors of browsers.

This is big in itself. But it's not even the biggest thing, imho.

It was also necessary to unambiguously define how browsers and other user agents should deal with invalid markup…. The barrier to entry to publishing on the Web was democratically low, but each browser was free to decide how to render bad code. Something as simple as <b><i>Hello mum!</b></i> (note the mismatched closing tags) produces different DOMs in different browsers. Different DOMs can cause the same CSS to have a completely different rendering, and they can make writing JavaScript that runs across browsers much harder than it need be….

…HTML5 specifies new DOM APIs for drag and drop, server-sent events, drawing, video, and the like. These new interfaces that HTML pages expose to JavaScript via objects in the DOM make it easier to write such applications using tightly specified standards rather than barely documented hacks.

In other words, by clarifying specifics — especially in error handling — HTML5 stands to open the doors for much more efficient and effective JavaScript, heralding a new era for robust interactivity with dynamic interfaces and rich user experiences that would be too heavy and difficult, or impossible, to implement in xhtml or HTML4.

Suddenly that existing markup you have is starting to look kind of musty.
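To make the mismatched-tags problem from the quote a bit more tangible, here's a minimal sketch in Python. It only detects mis-nested closing tags with a naive stack check; it is not the HTML5 parsing algorithm, which goes further and defines exactly how a browser should recover. The class and function names are my own inventions for illustration, and the check ignores void elements like br, so treat it as a toy.

```python
from html.parser import HTMLParser


class NestingChecker(HTMLParser):
    """Toy checker: flags closing tags that arrive out of order.

    Assumption: every start tag expects a matching end tag, so void
    elements (br, img, ...) would be reported as false positives.
    """

    def __init__(self):
        super().__init__()
        self.open_tags = []   # stack of tags opened so far
        self.problems = []    # (closing tag seen, tag that was expected)

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
            return
        expected = self.open_tags[-1] if self.open_tags else None
        self.problems.append((tag, expected))
        # Drop the tag from the stack anyway so later checks stay useful.
        if tag in self.open_tags:
            self.open_tags.remove(tag)


def find_misnested(markup):
    """Return a list of (closing tag, expected tag) mismatches."""
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    return checker.problems


print(find_misnested("<b><i>Hello mum!</b></i>"))
print(find_misnested("<b><i>Hello mum!</i></b>"))
```

The first call reports a mismatch and the second reports none. The point is that before HTML5, what a browser did after spotting that mismatch was anybody's guess.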

With new language comes new ways of thinking

The other thing to consider is how the web, and the nature of websites themselves, will change as the collective creativity of web designers worldwide starts to not just understand the syntax of HTML5 but also grok on an intuitive, subconscious level how websites can really let go of being collections of pages and embrace their web application natures.

There's a lot of old conventional thinking that is suddenly up for question. For example, while a website has a "navigation menu," an application may instead have a "toolbar": Does that change how you think about those links at the top of your page? People may be less interested in browsing, more interested in searching: Does that affect the role the search form plays in your interface layout?

Of course, for those of us who've been building software-driven sites (you know, the "Web 2.0" things), this kind of thinking may have been percolating for a while.

Hopefully we'll also have learned the lessons taught by the untold numbers of Flash website designers, who gave us splash pages, annoying, gratuitous motion effects (with obnoxious sound effects), and user interfaces more focused on dazzling the user with the creator's cleverness than on serving the user's needs. Here's hoping that HTML5 does not bring us into a new age of craptastic blinky poppy wooshy buzzy design.

We can do it

HTML5 and Drupal

This is a lively and ongoing process that, so far, has few allies, mostly I think because of the relative obscurity of HTML5 and the design affordances it brings in relation to the world of PHP/Drupal developers. Hopefully that's changing: a small but rapidly growing core of themers and developers has become interested in making this happen, especially in the past few days. Very exciting.

Whither RDFa?

The other side of this revolutionary coin (how many metaphors can I mix into this post?) is the growth and real-world application of RDFa.

In case you didn't know it, RDFa is already here. Google is consuming it where it finds it, using it to generate more accurate and relevant search results. Best Buy is now famous for having enjoyed a reported 30% increase in search traffic since incorporating RDFa into their online shopping site.

Robots speaking in complete sentences

That's the net effect of RDFa. You see, currently hyperlinks, to robots, are analogous to mystery meat navigation. To the human reader, the nature and location of a hyperlink may make total sense in context; even the mouse-over reveal of the hyperlink URL can yield meaning to us cerebral bipeds. But all a plain hyperlink says to a robot is "follow me." The robot doesn't know who made the link (or the destination site), what the destination is, or why the link is there. It knows only where the link is going, and that only by its URL.

RDFa changes that by providing syntax, often abbreviated and/or abetted by vocabularies such as Dublin Core, to add meaning to the link.
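Here's a rough sketch of what a robot can pull out of an RDFa-annotated link versus a plain one. Everything in the sample markup (the example.org URLs, Jane) is made up for illustration, and the extractor is a heavily simplified toy in Python, not a conforming RDFa parser: it only looks at about, rel and href, expands prefixes declared with xmlns:, and never resets the current subject when an element closes.

```python
from html.parser import HTMLParser

# Hypothetical markup; the example.org URLs are placeholders.
SAMPLE = """
<div xmlns:dc="http://purl.org/dc/terms/" about="http://example.org/posts/1">
  Written by <a rel="dc:creator" href="http://example.org/people/jane">Jane</a>.
  See also <a href="http://example.org/elsewhere">this plain link</a>.
</div>
"""


class LinkTriples(HTMLParser):
    """Collects (subject, predicate, object) triples from annotated links."""

    def __init__(self):
        super().__init__()
        self.prefixes = {}   # prefix -> namespace URI, from xmlns: attributes
        self.subject = None  # most recent @about value (simplification)
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        for name, value in attrs.items():
            if name.startswith("xmlns:"):
                self.prefixes[name[len("xmlns:"):]] = value
        if "about" in attrs:
            self.subject = attrs["about"]
        # Only a link that carries @rel says *why* it points where it points.
        if "rel" in attrs and "href" in attrs:
            predicate = self.expand(attrs["rel"])
            self.triples.append((self.subject, predicate, attrs["href"]))

    def expand(self, curie):
        """Turn 'dc:creator' into a full URI if the prefix is known."""
        prefix, sep, local = curie.partition(":")
        if sep and prefix in self.prefixes:
            return self.prefixes[prefix] + local
        return curie


def extract_link_triples(markup):
    parser = LinkTriples()
    parser.feed(markup)
    parser.close()
    return parser.triples


for triple in extract_link_triples(SAMPLE):
    print(triple)
```

The annotated link yields a complete sentence (this post, has creator, Jane's URL), while the plain link yields nothing beyond "follow me."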

Excuse me, did I say something?

One event happening (almost certainly) this year that could make for some very interesting RDFa developments will be the launch of Drupal 7. Historically, the Drupal user base, community and number of downloads have doubled with each major release of Drupal. And there are a number of factors that suggest that Drupal 7 will be no exception.

What's interesting about this is that Drupal 7 core implements RDFa. This means there are going to be umpteen oodles of websites and web apps out there talking RDFa — and, for most of them, without understanding the language. The amateurs are going to be joining the RDFa party. And that could become cacophonous.

This can mean an RDFa-enabled Tower of Babel. And that prospect has prompted some skeptics to argue against the semantic web. As they see it, we should make robots understand how we make the web, rather than try to remake the web so robots understand. It's an interesting topic, and I recommend Kate Ray's short video on the subject:

[Source: Web 3.0, on Vimeo]

From where I sit, perhaps the ideal falls somewhere in a combination? There's no question that Open Data, for example, benefits from semantics. But on the other hand, it would be great to develop parsers that can interpret in existing contexts the underlying semantics of existing content, much like HTML5 provides for the kinds of sloppy markup errors that happen in the web (especially with user-generated content).

Nevertheless, it's hard to dismiss the potential of RDFa when there will be thousands of web developers bringing their creativity and initiative to the rave. For example, in Drupal, one project with interesting potential is AutoRDF, a Google Summer of Code project by Tushar Mahajan, that promises to "automatically tag node content. It will find important words and patterns in a node to tag important keyword and Rdf'ing it. It will build a taxonomy tree."

The SPARQLy Views

The fact that Drupal 7 will play a role in this democratized explosion of RDFa-related endeavors is itself very exciting. And that's not just in adding RDFa structure to published content, but also in reaching out and pulling in structured RDFa data from elsewhere.

In other words, the web can be like one big website. Jane's site could query Joe's site's data without having any direct access to Joe's database.

One example is a module (still in development) that leverages the power and flexibility of the Views module to consume and present SPARQL query data.

The SPARQL Views module by Lin Clark will likely be a first introduction for many to the kinds of wondrous things RDFa on the web can enable. Its drag-and-drop query builder will empower people with the ability to plug SPARQL queries into Views for customized presentation. Awesome! (Disclosure: I am a Google Summer of Code mentor on this project, so feel free to take my enthusiasm with a grain of salt.)
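To give a feel for what "Jane's site querying Joe's site" could look like on the wire, here's a small Python sketch that builds (but does not send) a SPARQL request. The endpoint URL is hypothetical, and this is not how SPARQL Views itself is implemented. It only assumes that Joe exposes a standard SPARQL endpoint, where a query can be passed in the query parameter of a GET request.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint: assumes Joe's site exposes one at this URL.
JOE_ENDPOINT = "http://joe.example.org/sparql"

# Ask for up to ten things that have a Dublin Core title.
QUERY = """
PREFIX dc: <http://purl.org/dc/terms/>
SELECT ?post ?title
WHERE { ?post dc:title ?title . }
LIMIT 10
"""


def build_sparql_request(endpoint, query):
    """Build a GET request carrying the query in the 'query' parameter."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for results as JSON; the endpoint is free to answer otherwise.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(JOE_ENDPOINT, QUERY)
print(request.full_url)
```

Passing the request to urllib.request.urlopen would actually send it. Jane never touches Joe's database; she only needs the endpoint and a query.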

I want my SEO

Ultimately a — if not the — big convincer for adoption of RDFa will be its effectiveness in getting content noticed. Years ago, Drupal seemed natively and naturally to beat other CMSs when it came to SEO. Drupal sites rose quickly to the top. Will it happen again with Drupal 7 and RDFa? We assume so, but the proof is in the results, and those are still many weeks, perhaps months, away yet.

But imagine what the web will be like when so many sites can be queried and polled from the outside. Information is power; shared information is empowering. When the World Wide Web becomes the World Wide Database, watch out. We'll look back at 2010 as the quaint time of horse and buggy.

Out with the old, in with the new

What this all means is that change is upon us. And all of us web designers, programmers, database administrators, information architects, strategists, and front-end developers need to get busy. We have some learning to do. Some new skills to perfect. Some new thinking to explore. Some new best practices to embrace.

The end user may not really notice as these improvements in user experience roll out. But the end user will notice if you're not rolling with these improvements. Just as a "Web 1.0" site from 1999 stands out to us today as an archaic relic, today's "Web 2.0" sites are going to feel very outdated a few years from now. And just as sites with table-based layouts or built in Flash can make for frustrating user experiences (or out and out inaccessibility) on today's delivery platforms like handhelds and tablets, websites built without the front-end affordances made possible with HTML5 may seem limited to end users, and websites without semantic metadata layers on their content may come off as rather opaque.

As a Drupal aficionado, I'm delighted to have Drupal 7 leading the way on RDFa. However, Drupal is still deeply entrenched in xhtml. The HTML5 phenomenon, while a long time coming, really sparked to life only in the last year — too late to be embraced and incorporated into Drupal 7's core.

But perhaps that's a good thing, because things can happen rapidly in Drupal contrib, and HTML5 is still evolving. So there's a new HTML5 working group on g.d.o to drive towards an HTML5 module (to transform all the markup of Drupal into HTML5-valid code) and an HTML5 base theme.

It's all very exciting. We are in interesting times. Don't stand still. The Drop is always moving.
