HTML parsing with libxml
        Posted by zajcev on Stack Overflow
        
        Published on 2009-04-28T22:32:43Z
        Indexed on 2010/05/02 8:17 UTC
        
        
In another thread I was convinced to use an HTML parser instead of regexps for HTML parsing (I thought regexps would work fine, but they didn't ;) ).
I thought of using libxml (it has an HTML parser built in), but I failed to find any useful tutorial. I also found this site, and it says libxml should cope even with severely broken HTML.
Could you give me some examples of HTML parsing with libxml, or recommend a different free library for Linux? I'm using C++.
I'm hoping someone already has example code, so that I don't have to work through the headers ;)
© Stack Overflow or respective owner