How to return a proper 404 to Google while still showing user-friendly content to the user?

Posted by Marek on Stack Overflow
Published on 2010-03-30T17:59:35Z

I am bouncing between posting this here and on Superuser. Please excuse me if you feel this does not belong here.

I am observing the behavior described here: Googlebot is requesting random URLs on my site, such as aecgeqfx.html or sutwjemebk.html. I am sure that I do not link to these URLs from anywhere on my site.

I suspect this may be Google probing how we handle nonexistent content. To quote an answer to the linked question:

 [google is requesting random urls to] see if your site correctly 
 handles non-existent files (by returning a 404 response header)

We have a custom page for nonexistent content: a styled page saying "Content not found, if you believe you got here by error, please contact us", with a few internal links, served (naturally) with a 200 OK. The page is served directly at the requested URL (there is no redirect to a single error URL).

I am afraid this may hurt the site's standing with Google: they may not interpret the user-friendly page as a 404 Not Found, and may conclude that we are trying to fake something or are serving duplicate content.

How should I proceed to ensure that Google does not consider the site bogus, while still showing a user-friendly message to users who follow a dead link by accident?
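
To make the setup concrete, here is a minimal sketch of the behavior in question, assuming a site served by Python's standard-library `http.server`. The real site's stack is not stated, and `KNOWN_PAGES`, `NOT_FOUND_PAGE`, `resolve`, `SiteHandler`, and `run` are hypothetical names used only for illustration. It serves the same friendly "Content not found" page directly at the requested URL, without a redirect, but with a 404 status instead of 200 OK, which is the usual way a styled error page and a correct status code are combined:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical content: the paths that really exist on the site.
KNOWN_PAGES = {
    "/": "<h1>Home</h1>",
    "/contact.html": "<h1>Contact us</h1>",
}

# The user-friendly error page, with an internal link.
NOT_FOUND_PAGE = (
    "<h1>Content not found</h1>"
    "<p>If you believe you got here by error, please "
    '<a href="/contact.html">contact us</a>.</p>'
)


def resolve(path):
    """Return (status_code, html_body) for a request path."""
    if path in KNOWN_PAGES:
        return 200, KNOWN_PAGES[path]
    # Same friendly page for the user, but flagged as 404 for crawlers.
    return 404, NOT_FOUND_PAGE


class SiteHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = resolve(self.path)
        payload = body.encode("utf-8")
        # The status line carries the 404; the body is still the styled page.
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def run(host="127.0.0.1", port=8000):
    """Start the sketch server (blocks until interrupted)."""
    HTTPServer((host, port), SiteHandler).serve_forever()
```

Calling `run()` and requesting a random URL such as `/aecgeqfx.html` should show the friendly page in a browser, while the response status seen by a crawler is 404 rather than 200.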

© Stack Overflow or respective owner