How should I deal with user agent parsing in logs?
Posted by Mr. Jefferson on Pro Webmasters
Published on 2012-02-17T19:41:05Z
logging | user-agent
My web app project includes logging so we can see where visitors come from (referrer URL), which user agents are most common, which pages are most popular, and so on. The log is stored in SQL Server, and when I query the user agents I use a large (almost 100 lines) and growing CASE statement that classifies them by string matching (e.g. if the user agent contains the string "Firefox/9", then it's Firefox 9). Is there a better way to do this, so I don't have to keep adding to that CASE statement every time a new browser version is released?
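To illustrate the kind of matching described above, here is a minimal sketch in Python (not the actual SQL Server query) that captures the major version with a pattern instead of listing each release separately. The token list, the ordering, and the name `classify_browser` are assumptions made for illustration, and the list is far from exhaustive.

```python
import re

# Ordered (label, pattern) pairs. Order matters because one user agent can
# carry several product tokens (Chrome's string also contains "Safari/").
# This list is illustrative only; real logs will need more entries.
_BROWSER_PATTERNS = [
    ("Chrome", re.compile(r"Chrome/(\d+)")),
    ("Firefox", re.compile(r"Firefox/(\d+)")),
    ("Internet Explorer", re.compile(r"MSIE (\d+)")),
    ("Safari", re.compile(r"Version/(\d+).*Safari/")),
]


def classify_browser(user_agent):
    """Return (browser_label, major_version), or ("Unknown", None)."""
    for label, pattern in _BROWSER_PATTERNS:
        match = pattern.search(user_agent)
        if match:
            # The captured group is the major version, so a new release
            # (Firefox/10, Firefox/11, ...) needs no new rule.
            return label, int(match.group(1))
    return "Unknown", None
```

With this shape, a new browser release only changes the captured number; a new rule is needed only when a new browser family shows up.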
Also, how should I deal with less common, weird/unknown user agents? I've seen the following in the logs and been unable to find good information online about what they are:
- WordPress/3.3.1; http://www.facecolony.org
- Mozilla/4.0 ( http://www.hairirons.org redips; <a href=http://hairirons.org/>chi hair iron</a>)
I'd guess they're bots/crawlers, but the sites they point to don't appear to reference web crawlers (or even be available sometimes). I've seen other user agents that aren't familiar to me, but I know they're bots because they include "bot", "spider", or something similar in them.
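The keyword check mentioned above can be sketched as follows. The "bot"/"spider" keywords come from the observation above; treating a user agent that contains a URL but no recognizable browser token as a probable bot is an additional assumption, not something confirmed here, and the name `looks_like_bot` is hypothetical. Plain substring matching can also misfire on legitimate agents that happen to contain "bot".

```python
import re

# Keywords that self-identifying crawlers tend to include.
_BOT_KEYWORDS = re.compile(r"bot|spider|crawl", re.IGNORECASE)
# Assumption: a URL inside the user agent hints at an automated client.
_URL = re.compile(r"https?://", re.IGNORECASE)
# Illustrative, non-exhaustive list of mainstream browser tokens.
_BROWSER_TOKENS = re.compile(r"Firefox/|Chrome/|Safari/|MSIE |Opera")


def looks_like_bot(user_agent):
    """Heuristic guess: True if the user agent is probably not a browser."""
    if _BOT_KEYWORDS.search(user_agent):
        return True
    # A URL with no known browser token: likely a script or crawler.
    if _URL.search(user_agent) and not _BROWSER_TOKENS.search(user_agent):
        return True
    return False
```

Under these assumptions, both of the odd user agents listed above would be flagged because they contain a URL and no browser token, while an ordinary Firefox or Chrome string would not.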
© Pro Webmasters or respective owner