Diagnosing Sitecore XHTML Validation Errors

Many of you may already be familiar with the out-of-the-box XHTML validator that Sitecore offers. If not, you can find it at sitecore/system/Settings/Validation Rules/Item Rules/Item/Full Page XHtml.

Recently I encountered an issue where the validator would fail with a 500 error even when the page was semantically valid. After a few hours of investigation, I discovered that the validator method itself was failing and that it lacked the error handling needed to present a useful message to the user.

How the XHTML validator works

  1. Build up a friendly URL for the current item (with extra parameters to handle language, device, etc.)
  2. Create a web request with the combined URL and retrieve the page (this is where the error most likely comes from)
  3. Download the response and send it off to the W3C Validator (there is more logic here, but it appears to handle errors slightly better)
  4. Return validation information to the user
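
As a rough illustration of step 1, here is a minimal Python sketch of how such a URL might be assembled. The function name build_validation_url is hypothetical, and the sc_lang and sc_device query string parameters are my assumption about how language and device are passed; the validator's actual code may build the URL differently.

```python
from urllib.parse import urlencode, urljoin


def build_validation_url(site_root, item_path, language="en", device="default"):
    """Combine a site root and an item's friendly path with context parameters.

    Hypothetical helper: sc_lang and sc_device are assumed parameter names.
    """
    query = urlencode({"sc_lang": language, "sc_device": device})
    # Normalise the slashes so the path is appended rather than replacing the root.
    page_url = urljoin(site_root.rstrip("/") + "/", item_path.lstrip("/"))
    return page_url + "?" + query
```

The point of the sketch is that the validator requests the page over HTTP like any other client would, so anything that stands between the server and its own public URL can break step 2.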

So why does it break?

If the web request above fails for any reason, the validator returns an error without a detailed reason: it reports the exception message when what you really need is the inner exception message. It also fails to write anything to the Sitecore logs.
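
To show why the outer message alone is not enough, here is a small Python sketch of walking an exception chain. Python's __cause__ stands in for .NET's InnerException here; fetch_page, describe_failure and both error messages are made up for illustration and are not taken from the validator.

```python
def describe_failure(error):
    """Return every message in an exception chain, outermost first."""
    messages = []
    seen = set()
    while error is not None and id(error) not in seen:
        seen.add(id(error))
        messages.append(f"{type(error).__name__}: {error}")
        # Follow the explicit cause first, then any implicit context.
        error = error.__cause__ or error.__context__
    return messages


def fetch_page():
    """Hypothetical request that wraps the real cause in a generic error."""
    try:
        raise ConnectionError("proxy authentication required")
    except ConnectionError as inner:
        raise RuntimeError("The remote server returned an error") from inner


try:
    fetch_page()
except RuntimeError as outer:
    # The outer message is generic; the useful detail is one level down.
    report = describe_failure(outer)
```

A custom validator could log the whole chain this way, so that the real cause ends up in the log even when the user only sees the generic message.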

Troubleshooting the XHTML validator

There are many reasons why a web request could fail, and after working through some ideas I stumbled across the cause of my 500 error: proxy settings.

Here are some more quick ideas you can try to help you diagnose the issue:

  • Check that you have valid SSL certificates if you're using them (invalid or untrusted certificates won't work).
  • Check that you've set the server proxy correctly.
  • Check that your server is not rewriting the request URLs in an unexpected way.
  • Create a new implementation of Sitecore.Data.Validators.ItemValidators.FullPageXHtmlValidator based on the original, update the validator item in Sitecore to use your new validator, and then run a debugger over it.
  • Disable the XHTML validator (this would depend on the requirements for your implementation).
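
Before writing a custom validator, it can help to reproduce the request from the server itself, once directly and once through the proxy, and compare the results. The following is a minimal Python sketch of that idea using only the standard library; probe is a hypothetical helper, and the URL and proxy address you pass in are yours to supply.

```python
import urllib.error
import urllib.request


def probe(url, proxy=None, timeout=10):
    """Fetch url directly, or through proxy if given, and describe the outcome."""
    # An empty mapping disables proxies, including any set in environment variables.
    proxies = {"http": proxy, "https": proxy} if proxy else {}
    opener = urllib.request.build_opener(urllib.request.ProxyHandler(proxies))
    try:
        with opener.open(url, timeout=timeout) as response:
            return {"ok": True, "status": response.status, "detail": "OK"}
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 500.
        return {"ok": False, "status": error.code, "detail": str(error.reason)}
    except urllib.error.URLError as error:
        # No HTTP response at all: DNS, proxy, TLS or connection failure.
        return {"ok": False, "status": None, "detail": repr(error.reason)}
    except OSError as error:
        # Low-level errors, such as a read timeout, that were not wrapped.
        return {"ok": False, "status": None, "detail": repr(error)}
```

If the direct call succeeds and the proxied call fails (or the other way round), the proxy configuration is the likely culprit. A status of None with a certificate error in the detail points at SSL instead.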