Extracting images from a PDF

Posted by sagar on Stack Overflow
Published on 2010-03-18T10:35:54Z

My Query

I want to extract only the images from a PDF document, using Objective-C in an iPhone application.

My Efforts

  • I have read the material at this link, which describes the different PDF operators.
  • I also studied this document from Apple about PDF parsing with Quartz.
  • I also read the entire PDF Reference from the Adobe site. As I understand that document, the following operators are involved for each image:

    • q
    • Q
    • BI
    • EI
  • I have created an operator table to try to get the image:

     // Create the operator table and register one callback per operator.
     myTable = CGPDFOperatorTableCreate();
     CGPDFOperatorTableSetCallback(myTable, "q", arrayCallback2);
     CGPDFOperatorTableSetCallback(myTable, "TJ", arrayCallback);
     CGPDFOperatorTableSetCallback(myTable, "Tj", stringCallback);
    
  • I use this callback to try to get the image:

     void arrayCallback2(CGPDFScannerRef inScanner, void *userInfo) {
         // THIS DOESN'T WORK
         // CGPDFStreamRef stream; // represents a sequence of bytes
         // if (CGPDFDictionaryGetStream(d, "BI", &stream)) {
         //     CGPDFDataFormat t = CGPDFDataFormatJPEG2000;
         //     CFDataRef data = CGPDFStreamCopyData(stream, &t);
         // }
     }
    

This callback is invoked for the "q" operator, but I don't know how to extract an image from inside it.

How can I extract the images from a PDF document? Thanks in advance for your help.
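
A note on the operators listed above, with a small sketch. In PDF content-stream syntax, q and Q only save and restore the graphics state and take no operands, so a callback on "q" has no image operand to read. Images stored as XObjects are painted by the Do operator, whose single operand is a resource name such as /Im1, while BI and EI bracket inline images only. The code below is a plain-C model of that idea and not Quartz code: DoOperands and collect_do_operands are hypothetical names, and it assumes an already-decompressed, whitespace-separated content stream, which real PDFs often do not provide.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define MAX_NAMES 8
#define NAME_LEN  32

/* Hypothetical result holder: names of XObjects painted with "Do". */
typedef struct {
    char names[MAX_NAMES][NAME_LEN];
    int  count;
} DoOperands;

/* Hypothetical helper: walk a whitespace-separated content stream and
   record the name operand that immediately precedes each "Do" operator.
   Operators such as q and Q carry no operands and are simply skipped. */
static void collect_do_operands(const char *stream, DoOperands *out) {
    char prev[NAME_LEN] = "";   /* most recent token seen */
    char tok[NAME_LEN];
    const char *p = stream;

    assert(stream != NULL && out != NULL);
    out->count = 0;

    while (*p) {
        /* Skip whitespace between tokens. */
        while (*p && isspace((unsigned char)*p)) p++;

        /* Copy the next token, truncating anything longer than the buffer. */
        size_t n = 0;
        while (*p && !isspace((unsigned char)*p)) {
            if (n < NAME_LEN - 1) tok[n++] = *p;
            p++;
        }
        tok[n] = '\0';
        if (n == 0) break;      /* only trailing whitespace was left */

        /* "Do" takes one operand: a name such as /Im1. Store it without
           the leading slash. */
        if (strcmp(tok, "Do") == 0 && prev[0] == '/' && out->count < MAX_NAMES) {
            strcpy(out->names[out->count++], prev + 1);
        }
        strcpy(prev, tok);
    }
}
```

For a stream such as "q 100 0 0 100 50 50 cm /Im1 Do Q", the helper records the single name Im1 and nothing for q or Q. In Quartz the scanner does the tokenizing itself, so the corresponding approach would presumably be to register a callback for "Do" with CGPDFOperatorTableSetCallback, pop the name with CGPDFScannerPopName, and resolve it among the page's XObject resources (for example with CGPDFContentStreamGetResource) before checking that its Subtype is Image. That mapping is an assumption and is untested against the code in this post.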

© Stack Overflow or respective owner
