
ASP.NET Meets application/xhtml+xml

There has been a flurry of articles on serving XHTML with a proper Content-Type header. Having read the documentation and blog posts on this saga, I decided to try it out for myself, since I do my best to serve proper XHTML here.

If you're not sure what I'm talking about or what the fuss about application/xhtml+xml is, check out the following links for a helpful introduction to the subject (listed in order of importance):

If you DID follow these links, you know that serving XHTML as text/html pretty much defeats the purpose of authoring XHTML, since it will be treated as HTML anyway. On the other hand, serve your content as application/xhtml+xml and Internet Explorer will blow up, while "conforming" browsers will handle it as the spec says. To serve two masters we need to feed good ol' text/html to IE and application/xhtml+xml to the Mozillas and Operas of the world.

Back when I published my findings on Unicode in VS.NET 2003, I alluded to the fact that you can hijack the content type.

The trick is to change it before the headers are written out. I thought of my favorite way of handling such tasks: HTTP modules. One of the "nondeterministic" events an HttpModule can subscribe to is PreSendRequestHeaders. The event "occurs just before ASP.NET sends HTTP headers to the client." Perfect. This much accomplishes it:

using System;
using System.Web;

public class ContentType : IHttpModule
{
    // Hook the event raised just before the response headers are sent.
    public void Init(HttpApplication application)
    {
        application.PreSendRequestHeaders +=
            new EventHandler(this.Application_PreSendRequestHeaders);
    }

    private void Application_PreSendRequestHeaders(object sender, EventArgs e)
    {
        HttpApplication app = (HttpApplication)sender;
        HttpContext context = app.Context;

        // Any browser that does not identify itself as IE gets
        // application/xhtml+xml; IE keeps the default text/html.
        if (context.Request.Browser.Browser.ToUpper().IndexOf("IE") == -1)
            context.Response.ContentType = "application/xhtml+xml";
    }

    public void Dispose() {}
}

I used my own Template Generator For HttpModules and HttpHandlers to create the skeleton code of this HttpModule. All it does is look at the browser string submitted in the User-Agent header. There's no else statement because HttpResponse assigns text/html to its private _contentType member by default, in its constructor:

public HttpResponse(TextWriter writer)
{
 this._statusCode = 200;
 this._bufferOutput = true;
 this._contentType = "text/html";
  ...
}

Feeling of Remorse

As of today, all pages on this site are served as application/xhtml+xml to all browsers but IE. I'm willing to experiment. To be honest, this exercise puts me on the edge of my seat because a minor slipup in "well-formedness" and the page doesn't render (this is XML after all, so it "functions as designed"). The power of ASP.NET lies in delivering killer functionality to a developer and hiding all the plumbing. I just hope I lay out the plumbing right and nothing leaks.

We'll see how it goes. If anyone notices a catastrophe please shoot me an email.

August 3, 2004: I've changed the code above as suggested by Charl (see comments). This time around I check if the user agent accepts 'application/xhtml+xml'.
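
The listing above still shows the User-Agent check, so here is a minimal sketch of what an Accept-based check might look like. It is not the code actually running on this site. The class and method names (XhtmlNegotiation, AcceptsXhtml, PickContentType) are made up for illustration, and the q-value handling is an assumption about how strict one would want to be. A bare */* is deliberately not treated as acceptance, so user agents that send only that fall back to text/html.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of the original module: picks the
// Content-Type to send based on the raw Accept request header.
public static class XhtmlNegotiation
{
    public const string Xhtml = "application/xhtml+xml";
    public const string Html = "text/html";

    // True when the Accept header lists application/xhtml+xml explicitly
    // and does not rule it out with q=0. Wildcards are ignored on purpose.
    public static bool AcceptsXhtml(string acceptHeader)
    {
        if (string.IsNullOrEmpty(acceptHeader))
            return false;

        foreach (string range in acceptHeader.Split(','))
        {
            string[] parts = range.Split(';');
            if (!string.Equals(parts[0].Trim(), Xhtml,
                               StringComparison.OrdinalIgnoreCase))
                continue;

            // Look for an optional quality value such as "q=0.9".
            for (int i = 1; i < parts.Length; i++)
            {
                string param = parts[i].Trim();
                if (param.StartsWith("q=", StringComparison.OrdinalIgnoreCase))
                {
                    double q;
                    if (double.TryParse(param.Substring(2), NumberStyles.Float,
                                        CultureInfo.InvariantCulture, out q))
                        return q > 0;
                }
            }
            return true; // listed without a parsable q-value
        }
        return false; // not listed at all: stay with text/html
    }

    public static string PickContentType(string acceptHeader)
    {
        return AcceptsXhtml(acceptHeader) ? Xhtml : Html;
    }
}
```

Inside Application_PreSendRequestHeaders, the browser check would then give way to something like context.Response.ContentType = XhtmlNegotiation.PickContentType(context.Request.Headers["Accept"]).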

Comments

Comment permalink 1 Anne |
Couple of remarks: if the order of a list is important, use OL ;-). The W3C document is only a note. Per RFC 3023 XHTML must be sent as 'application/xhtml+xml'. You can now sign up for the X-Philes: http://www.goer.org/Markup/TheXPhiles/

O, and I'm glad you made it!
Comment permalink 2 Charl van Niekerk |
Shouldn't you rather sniff the Accept header instead? Then search engines will also get it under 'text/html', because otherwise it might have negative impact on your search engine listings. (They will still get your page, but your 'title' and such won't get special treatment.)

I have a simple example here which you can look at if you like. Forgive me for not using HTTP Modules and such, since my ASP.NET coding is far from advanced. ;-)
Comment permalink 3 Richard |
Have you considered, for example, Googlebot? It can't parse application/xhtml+xml, and it's not IE. I'd think again about your strategy unless you want your site to fall out of every search engine on the planet. That's without mentioning the large number of minor and older browsers and other devices (phones, etc.) which aren't IE either and don't grok application/xhtml+xml.

As the above comment mentions, you need to serve text/html by default, and application/xhtml+xml to browsers which include the latter string in the Accept field.
Comment permalink 4 Milan Negovan |
Charl and Richard: good thought about search engines. The last thing I want is trouble with them. If a search engine doesn't understand 'application/xhtml+xml', wouldn't it simply ignore the header and parse the page anyway?

Charl: I love your idea with parsing the Accept header instead. I think it's much safer this way.
Comment permalink 5 Richard |
I've done some tests with application/xhtml+xml pages and Googlebot. I managed to get one into the SERPs, but with no title, the cached version was blank, and it was listed as filetype unknown. Let's just say that it was not well-ranked either.

Googlebot can read application/xml, but only as generic XML, parsed as plain text, with the title tag not recognized and no weight to semantic tags such as h1. I've not tested with Yahoo Slurp, but I suspect it will be similar.

One of the advantages of sending XHTML as text/html to legacy browsers is that you can read it in everything from Lynx upwards. If you exclude everything but IE and the latest popular browsers, it is no better than sticking a "Best viewed with browser X" sign on every page - back to the dark days of 1997 all over again.

Serve XHTML pages as application/xhtml+xml to user agents that can cope with it, yes absolutely. Excluding those who don't have the latest and greatest browser, not good.

PS. it was nice code anyway!
Comment permalink 6 Kjetil Hjartnes |
Not only will this method become problematic with search engines, but I also think people are more likely to tamper with the User-Agent header sent from the browser (to fake their UA, or for whatever reason they may have) than with the Accept header.
By the way, according to http://www.w3.org/People/mimasa/test/xhtml/media-types/results Opera 5.0 and Opera 5.12 don't support 'application/xhtml+xml' either.
