C# tutorial: list fonts used in a PDF file


Listing fonts used in a PDF file

In the previous tutorial, you learnt how to extract images from a PDF file. In this tutorial, I am going to show you how to list the fonts used in a PDF file.

A font in a PDF file is described by a font dictionary. When you use a standard Type 1 font, iTextSharp adds only a font dictionary to the PDF file. When you use an embedded font, the font dictionary also refers to a stream containing the full or partial (subsetted) font program that is copied into the PDF file.

To list the fonts used in a PDF file, first get the resources of each page by calling the GetAsDict method of the page's PdfDictionary instance. This method accepts one argument, the key to look up. Because you want the resources of a particular page, pass PdfName.RESOURCES. Then call GetAsDict again on the resources dictionary, passing PdfName.FONT, to get the fonts dictionary. Each entry of the fonts dictionary is a pair of a key (the resource name that the page content uses to refer to the font) and a value (the font dictionary). The name of the font is stored under the BaseFont key of that font dictionary.

The example code below processes all pages of the input PDF file, gets the resources of every page, stores the font names in a HashSet so that each name is kept only once, and shows the font names in the Console window.

// Requires: using System; using System.Collections.Generic; using iTextSharp.text.pdf;
PdfReader reader = new PdfReader("D:/jmf_tutorial.pdf");
HashSet<string> names = new HashSet<string>();
for (int p = 1; p <= reader.NumberOfPages; p++)
{
    // get the page dictionary and its resources
    PdfDictionary dic = reader.GetPageN(p);
    PdfDictionary resources = dic.GetAsDict(PdfName.RESOURCES);
    if (resources == null)
        continue;

    // get fonts dictionary
    PdfDictionary fonts = resources.GetAsDict(PdfName.FONT);
    if (fonts == null)
        continue;

    foreach (PdfName key in fonts.Keys)
    {
        PdfDictionary font = fonts.GetAsDict(key);
        if (font == null)
            continue;

        // some fonts (for example Type 3) have no BaseFont entry
        PdfName baseFont = font.GetAsName(PdfName.BASEFONT);
        if (baseFont == null)
            continue;

        // PdfName.ToString() keeps the leading '/', e.g. "/ABCDEF+Arial"
        string name = baseFont.ToString();

        // check for the prefix of a subsetted font: '/' + 6-character tag + '+'
        if (name.Length > 8 && name[7] == '+')
        {
            // C# uses {0}, {1} placeholders; Substring takes (start, length)
            name = String.Format("{0} subset ({1})", name.Substring(8), name.Substring(1, 6));
        }
        else
        {
            // get type of fully embedded fonts
            name = name.Substring(1);
            PdfDictionary desc = font.GetAsDict(PdfName.FONTDESCRIPTOR);
            if (desc == null)
                name += " no font descriptor";
            else if (desc.Get(PdfName.FONTFILE) != null)
                name += " (Type 1) embedded";
            else if (desc.Get(PdfName.FONTFILE2) != null)
                name += " (TrueType) embedded";
            else if (desc.Get(PdfName.FONTFILE3) != null)
                name += " (" + font.GetAsName(PdfName.SUBTYPE).ToString().Substring(1) + ") embedded";
        }
        names.Add(name);
    }
}
reader.Close();

// show every distinct font name
foreach (string fname in names)
{
    Console.WriteLine(fname);
}
Console.Read();
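
The subset check in the example only looks for a '+' character at a fixed position. A subset prefix is a tag of exactly six uppercase letters followed by '+', so a slightly stricter check can also verify the letters. The following is a minimal sketch of such a check. The FontNameHelper class and the TrySplitSubsetTag method are hypothetical names made up for this illustration and are not part of iTextSharp. It works on the string returned by ToString() of the BaseFont name.

```csharp
using System;

// Hypothetical helper for illustration only; not part of iTextSharp.
public static class FontNameHelper
{
    // Splits a BaseFont value such as "/ABCDEF+Arial" into the subset tag
    // ("ABCDEF") and the font name ("Arial"). Returns false when the value
    // does not start with a six-uppercase-letter tag followed by '+';
    // in that case fontName holds the value without the leading slash.
    public static bool TrySplitSubsetTag(string baseFont, out string tag, out string fontName)
    {
        tag = null;
        fontName = null;
        if (baseFont == null)
            return false;

        // PdfName.ToString() keeps the leading slash; drop it if present.
        string s = baseFont.StartsWith("/", StringComparison.Ordinal)
            ? baseFont.Substring(1)
            : baseFont;
        fontName = s;

        // The tag needs six characters, then '+', then at least one more character.
        if (s.Length <= 7 || s[6] != '+')
            return false;

        // Every character of the tag must be an uppercase letter A-Z.
        for (int i = 0; i < 6; i++)
        {
            if (s[i] < 'A' || s[i] > 'Z')
                return false;
        }

        tag = s.Substring(0, 6);
        fontName = s.Substring(7);
        return true;
    }
}
```

For example, calling TrySplitSubsetTag with "/ABCDEF+Arial" returns true with the tag "ABCDEF" and the font name "Arial", while "/Helvetica" returns false and leaves the font name as "Helvetica".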


