In the previous tutorial, you learnt how to extract all text from a PDF file. Besides text, you might want to get images from the PDF file. In this tutorial, I am going to show you how to extract images from a PDF file.
An image in a PDF is stored as a stream object. You can retrieve an object from the PDF with the GetPdfObject method of PdfReader. Because a PDF stores many different types of objects, you need to check whether the retrieved object is an image. To do this, cast the object to PRStream and call its Get method with PdfName.SUBTYPE to read the subtype of the stream, then compare that subtype with PdfName.IMAGE. Once you know the stream is an image, create an instance of PdfImageObject from it. With the PdfImageObject instance, you can get the image data as an array of bytes by calling the GetImageAsBytes method. Finally, save this array of bytes to disk with the Write method of a FileStream instance.
The example code below extracts the images from the jmf_tutorial.pdf file and saves them in the D:/imagesextracted folder. The picture below shows the images that are extracted from the file and saved to that folder.
using System.IO;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

string path = "D:/imagesextracted/"; // output folder, must already exist
PdfReader reader = new PdfReader("D:/jmf_tutorial.pdf");
int n = reader.XrefSize; // number of objects in the PDF document
for (int i = 0; i < n; i++)
{
    PdfObject po = reader.GetPdfObject(i); // get the object at index i in the objects collection
    if (po == null || !po.IsStream()) // object not found or not a stream, so skip it
        continue;
    PRStream pst = (PRStream)po; // cast the object to a stream
    PdfObject type = pst.Get(PdfName.SUBTYPE); // get the subtype of the stream
    // check whether the object is an image
    if (type != null && type.ToString().Equals(PdfName.IMAGE.ToString()))
    {
        PdfImageObject pio = new PdfImageObject(pst); // get the image
        // read the bytes of the image into an array
        byte[] imgdata = pio.GetImageAsBytes();
        // write the byte array to a file; the using block closes the file afterwards
        using (FileStream fs = new FileStream(path + "image" + i + ".jpg", FileMode.Create))
            fs.Write(imgdata, 0, imgdata.Length);
    }
}
reader.Close();
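The code above always names the output file with a .jpg extension. Depending on how an image is stored in the PDF, the bytes you get back may not always be JPEG data, so you might prefer to pick the extension from the data itself. The sketch below is one possible way to do that. It is not part of iTextSharp: the ImageFormatSniffer class and its GuessExtension method are hypothetical names made up for this illustration, and the helper only recognizes the well-known leading bytes of JPEG, PNG and TIFF files.

```csharp
using System;

static class ImageFormatSniffer
{
    // Hypothetical helper (not an iTextSharp API): guesses a file extension
    // from the leading "magic" bytes of the image data.
    public static string GuessExtension(byte[] data)
    {
        if (StartsWith(data, 0xFF, 0xD8, 0xFF)) return ".jpg";       // JPEG
        if (StartsWith(data, 0x89, 0x50, 0x4E, 0x47)) return ".png"; // PNG
        if (StartsWith(data, 0x49, 0x49, 0x2A, 0x00)) return ".tif"; // TIFF, little-endian
        if (StartsWith(data, 0x4D, 0x4D, 0x00, 0x2A)) return ".tif"; // TIFF, big-endian
        return ".bin"; // unknown format, keep the raw bytes anyway
    }

    // Returns true when data begins with the given sequence of bytes.
    private static bool StartsWith(byte[] data, params byte[] prefix)
    {
        if (data == null || data.Length < prefix.Length) return false;
        for (int k = 0; k < prefix.Length; k++)
        {
            if (data[k] != prefix[k]) return false;
        }
        return true;
    }
}
```

To try it, you could build the file name as path + "image" + i + ImageFormatSniffer.GuessExtension(imgdata) instead of appending ".jpg" directly.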