Checking for valid document files
Posted by sweb on Super User
        
        Published on 2012-06-25T12:17:45Z
I need a simple way to check whether my files are valid documents (pdf, doc, docx, ppt, pptx, xls, xlsx, odt, ods, odp, etc.).
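I can't rely on the file command because its magic-number detection is not reliable. For example, this is the output I get for a directory of PDF files, where two of the five are reported as application/octet-stream: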
sweb@sweb-laptop:/media/files/ebooks/PDF and CHM$ file --mime *.pdf
PHP 5 for Dummies.pdf: application/pdf; charset=binary
PHP 6 and MySQL 5 for Dynamic Web Sites.pdf: application/octet-stream; charset=binary
PHP6 and MySQL Bible.pdf: application/pdf; charset=binary
PHP6.pdf: application/octet-stream; charset=binary
PHP and MySQL for Dummies SE.pdf: application/pdf; charset=binary
I also tried abiword, which is a good tool, but it will convert any input regardless of format. It doesn't check that the input is a valid document. For example, this command accepts an MP3 file:
abiword --to=txt --to-name=output.txt audio.mp3
Is there a command that checks whether a file is a valid document?
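To make the kind of check I have in mind concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a real validator: the function name sniff_document and its return labels are made up for this example, and it only looks at container signatures (the "%PDF-" header, the OLE2 signature used by legacy doc/xls/ppt, and the ZIP layout used by OOXML and OpenDocument files). It searches the first 1024 bytes for the PDF header rather than only offset 0, on the assumption that some PDFs carry leading bytes that readers tolerate. A truncated or otherwise damaged file with an intact header would still pass.

```python
import zipfile

# Signature of OLE2 compound files (legacy .doc, .xls, .ppt).
OLE2_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")


def sniff_document(path):
    """Return a coarse type label, or None if the file does not look
    like one of the known document containers.

    Hypothetical helper: the name and the labels are assumptions made
    for this sketch, not part of any existing tool.
    """
    with open(path, "rb") as fh:
        head = fh.read(1024)

    # PDF: look for the header anywhere in the first 1024 bytes.
    if b"%PDF-" in head:
        return "pdf"

    # Legacy Microsoft Office formats share one container signature,
    # so doc/xls/ppt cannot be told apart at this level.
    if head.startswith(OLE2_MAGIC):
        return "ole2"

    # OOXML (docx/xlsx/pptx) and OpenDocument (odt/ods/odp) are ZIP archives.
    if zipfile.is_zipfile(path):
        try:
            with zipfile.ZipFile(path) as zf:
                names = set(zf.namelist())
                if "mimetype" in names:
                    mime = zf.read("mimetype").decode("ascii", "replace").strip()
                    if mime.startswith("application/vnd.oasis.opendocument."):
                        return "odf"
                if "[Content_Types].xml" in names:
                    for prefix, label in (("word/", "docx"),
                                          ("xl/", "xlsx"),
                                          ("ppt/", "pptx")):
                        if any(n.startswith(prefix) for n in names):
                            return label
        except zipfile.BadZipFile:
            return None

    return None
```

With this sketch, sniff_document("audio.mp3") would return None for an ordinary MP3, while a file whose first kilobyte contains "%PDF-" would return "pdf" regardless of its extension.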
© Super User or respective owner