November 18th, 2006


content negotiation breakdown

IE7's content negotiation appears slightly broken. While checking out my web site in IE7, I clicked on the Card Games link on my home page (under Widgets), and was presented with only the widget screenshot JPG. Ok, I'll rewind a bit for some background.

The /widgets/ directory contains (among other things) card-games.html and card-games.jpg. The HTML file refers to the JPG file to show a screenshot of the widget. The home page links to /widgets/card-games (no extension), which Apache (using MultiViews) usually resolves to card-games.html. More precisely, Apache's content negotiation picks among the matching files based on the formats the browser declares it prefers. It's a conceptually simple idea but not at all straightforward in practice.

Firefox uses an HTTP Accept: header with a higher q value for text/html than for image/jpeg. IE7 asks for all kinds of formats without specifying any q values at all. Whatever Apache is doing, it seems to be selecting the card-games.jpg file when IE7 asks for the /widgets/card-games URL.
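To make the q value business a bit more concrete, here's a rough sketch of a shell function that splits an Accept: header into its media ranges and prints the q value of each one. A range with no q parameter defaults to 1, which is what the HTTP spec says. The function name accept_q and the sample headers in the usage note are made up for illustration; they are not the real headers either browser sends.

```shell
# Hypothetical helper: print "q-value media-range" for each entry
# in an Accept header. A missing q parameter defaults to 1.
accept_q() {
  printf '%s\n' "$1" | tr ',' '\n' | awk -F';' '{
    range = $1
    gsub(/^[ \t]+|[ \t]+$/, "", range)     # trim surrounding whitespace
    q = "1"                                # default when no q is given
    for (i = 2; i <= NF; i++) {
      p = $i
      gsub(/[ \t]/, "", p)                 # drop whitespace inside the parameter
      if (p ~ /^q=/) q = substr(p, 3)      # keep the text after "q="
    }
    print q, range
  }'
}
```

Calling accept_q 'text/html;q=0.9, image/jpeg;q=0.5' prints "0.9 text/html" and "0.5 image/jpeg" on separate lines, while accept_q 'image/jpeg, */*' prints "1 image/jpeg" and "1 */*". With the second kind of header nothing tells the server that HTML is preferred over the image.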

The weird thing is, after clicking on Card Games from the home page in IE7 and getting only the screenshot, pressing Reload causes Apache to serve the card-games.html file and all appears well.

I suppose what I need to do is rename the screenshot file so it's called card-games-screenshot.jpg or something, or put it in a different directory from the HTML file. Having a .html and a .jpg file with the same base name in the same directory, when they are not different representations of the same content, goes against the spirit of content negotiation.

I came up with a wizardous unix command line to find files that might be subject to content negotiation problems, which I'd like to share:

find . -type f ! -name '.*' | sed -e 's/\.[^./]*$//' | sort | uniq -c | grep -v '^ *1 ' | less

Comprehension and use of the above is left as an exercise for the reader.
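
For anyone who'd rather skip the exercise, here's a sketch of the same idea wrapped in a function with comments. The name negotiation_clashes is made up, and I've swapped the count-and-filter step for uniq -d, which prints only the duplicated lines. It assumes file names without embedded newlines.

```shell
# Hypothetical helper: list base names (path minus final extension)
# shared by more than one file under the given directory.
negotiation_clashes() {
  find "${1:-.}" -type f ! -name '.*' |   # regular files, skipping dotfiles
    sed -e 's/\.[^./]*$//' |              # strip only the last extension
    sort |                                # uniq needs adjacent duplicates
    uniq -d                               # print each repeated base name once
}
```

Run from the document root as negotiation_clashes . and each line of output is a URL path that more than one file could answer to. Excluding / from the sed pattern keeps a dot in a directory name from being mistaken for the start of an extension.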