Extracting Demographic and Contact Information from Unstructured Text Files
Posted by jn29098 on Stack Overflow
Published on 2010-06-01T01:50:48Z
        
I am looking to extract specific items from a large pool of unstructured documents. Each document is 1-5 pages of text, formatted however its author chose, but in most cases it contains at least:
- Name
- Address (physical)
- Email address
- Phone number
- Website URL
 
I'm looking for a semantic parser that can attempt to extract these elements from the documents, so that I can load the results into a relational database and work with the records as contacts.
The services I have looked at so far, while valuable for other purposes, do not address this specific need.
Any thoughts, suggestions or leads?
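
To make the goal concrete, here is a rough Python sketch of the pipeline described above: pull fields out of raw text, then load them into a relational table. It only covers the pattern-like fields (email, phone, URL); names and physical addresses are the hard part and are not handled. The function names `extract_contacts` and `save_contact`, the table layout, and the regular expressions are all assumptions made for illustration, and the phone pattern assumes North American style numbers.

```python
import re
import sqlite3

# Simplified, assumed patterns; real documents will need broader ones.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Assumes numbers shaped like (555) 123-4567 or 555-123-4567.
PHONE_RE = re.compile(r"\(?\b\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b")
URL_RE = re.compile(r"(?:https?://|www\.)[^\s<>\"']+", re.IGNORECASE)


def extract_contacts(text):
    """Return the first email, phone, and URL found in text (None if absent)."""
    def first(pattern):
        match = pattern.search(text)
        # Drop trailing sentence punctuation that the pattern may have swallowed.
        return match.group(0).rstrip(".,;)") if match else None

    return {
        "email": first(EMAIL_RE),
        "phone": first(PHONE_RE),
        "url": first(URL_RE),
    }


def save_contact(conn, contact):
    """Insert one extracted record into a hypothetical contacts table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS contacts (email TEXT, phone TEXT, url TEXT)"
    )
    conn.execute(
        "INSERT INTO contacts (email, phone, url) VALUES (:email, :phone, :url)",
        contact,
    )
    conn.commit()


sample = """Jane Doe
12 Main St, Springfield
Email: jane.doe@example.com
Phone: (555) 123-4567
Web: http://www.example.com/about.
"""

conn = sqlite3.connect(":memory:")
save_contact(conn, extract_contacts(sample))
```

Regular expressions like these will miss unusual formats and cannot recover names or street addresses, which is why a semantic parser is being asked for.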
© Stack Overflow or respective owner