Wednesday, 23 September 2009

Under the surface ...

Have you ever thought of looking at the actual PDF code behind a document you were reading? If you have some technical understanding, this can be very interesting. What do you need for it? You already have everything required: just open the file in Notepad, the editor that comes with Windows. With a bit of luck, you'll see that the internal PDF code is readable, too.

The first important piece of information is at the very beginning of the PDF file. You'll find something like "%PDF-1.3%âãÏÓ" (for example). You can ignore the odd characters at the end, but the "PDF" shows you that it's a PDF file (what a surprise). The "1.3" means that this document was created according to the (older) PDF specification 1.3.
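
If you would rather let a script read the header than squint at it in Notepad, here is a minimal Python sketch. The function name pdf_version and the sample bytes are made up for illustration; the only thing taken from above is the "%PDF-x.y" marker at the start of the file.

```python
import re

def pdf_version(data: bytes):
    """Return the version from a '%PDF-x.y' header, or None if there is none."""
    # The header normally sits at the very start; only the first bytes are checked.
    match = re.search(rb"%PDF-(\d+\.\d+)", data[:1024])
    return match.group(1).decode("ascii") if match else None

# Hypothetical sample bytes that mimic the start of a PDF file.
sample = b"%PDF-1.3\n%\xe2\xe3\xcf\xd3\n1 0 obj\n"
print(pdf_version(sample))  # prints 1.3
```

For a real file you would pass in the first bytes read in binary mode, for example open("example.pdf", "rb").read(1024), where example.pdf stands for whatever document you want to inspect.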
Now try your editor's search function...

Try searching for "FontName". You'll probably find more than one entry in the code, showing you all the fonts embedded or used in your document. Such an entry could look like this: "/FontName/Arial-BoldItalicMT/".
Some other interesting search keys/tags are, for example, "Creator", "CreationDate", "Producer", "ModDate", "Title", "Keywords" and "Subject". If you can't find a tag, it only means that there is no content for it. If no title has been set for the document, you won't find the tag "Title" in the code.
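
The same searches can be done by a small script. The following Python sketch is only a rough approximation: the function names find_font_names and find_info_strings are invented here, the sample bytes are hypothetical, and the patterns only catch tags that are stored as plain, uncompressed text with simple "(...)" values. Only the first match per tag is returned, so a "Title" found this way could also belong to something other than the document itself.

```python
import re

INFO_KEYS = ("Title", "Creator", "Producer", "CreationDate",
             "ModDate", "Subject", "Keywords")

def find_font_names(data: bytes):
    """Collect the distinct names that follow /FontName in raw PDF bytes."""
    # A name ends at whitespace or at a delimiter such as / < > [ ] ( ).
    names = re.findall(rb"/FontName\s*/([^\s/<>\[\]()]+)", data)
    return sorted({name.decode("latin-1") for name in names})

def find_info_strings(data: bytes, keys=INFO_KEYS):
    """Return {tag: text} for simple entries such as '/Title (My document)'."""
    found = {}
    for key in keys:
        # Only plain strings without nested parentheses or escapes are matched.
        pattern = rb"/" + key.encode("ascii") + rb"\s*\(([^()\\]*)\)"
        match = re.search(pattern, data)
        if match:
            found[key] = match.group(1).decode("latin-1")
    return found

# Hypothetical fragment resembling what you might see in the editor.
sample = (b"<</FontName/Arial-BoldItalicMT/Flags 32>> "
          b"<</Title (Report)/Producer (Some Tool)>>")
print(find_font_names(sample))    # ['Arial-BoldItalicMT']
print(find_info_strings(sample))  # {'Title': 'Report', 'Producer': 'Some Tool'}
```

Tags that are missing from the file simply do not show up in the result, which matches what you see when the search in the editor finds nothing.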

If the document is encrypted, the text following the tags isn't readable, but even this tells you something about the document ... it's an encrypted document ;-)
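
Besides unreadable tag contents, an encrypted PDF normally carries an "/Encrypt" entry that you can search for in the same way. Here is a tiny Python sketch of that check; the function name looks_encrypted and the sample bytes are invented for illustration, and a plain substring search like this is only a hint, not a proper check.

```python
def looks_encrypted(data: bytes) -> bool:
    """Return True if the raw bytes contain an /Encrypt entry."""
    # Heuristic only: the same bytes could in principle occur elsewhere in the file.
    return b"/Encrypt" in data

# Hypothetical trailer fragments.
print(looks_encrypted(b"trailer <</Root 1 0 R/Encrypt 7 0 R>>"))  # True
print(looks_encrypted(b"trailer <</Root 1 0 R>>"))                # False
```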
Finally, another interesting tag: "Count" or "/Count". Here you'll find the page count of the document. It could look like this: ".../Count 9/...".
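
A document can contain several "/Count" entries, because every group of pages has its own count, and bookmarks can have one as well. As a rough sketch in Python, you could collect all of them and take the largest, on the assumption that the top-level entry covers all pages. The function name guess_page_count and the sample bytes are made up, and the result is a guess, not a guaranteed page count.

```python
import re

def guess_page_count(data: bytes):
    """Guess the page count as the largest '/Count N' value, or None if absent."""
    counts = [int(n) for n in re.findall(rb"/Count\s+(\d+)", data)]
    # Assumption: the top-level page entry has the largest count.
    return max(counts) if counts else None

# Hypothetical fragment: a top-level entry with 9 pages and a sub-group with 4.
sample = b"<</Type/Pages/Count 9/Kids[3 0 R 12 0 R]>> <</Type/Pages/Count 4>>"
print(guess_page_count(sample))  # prints 9
```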
Got an appetite for more PDF? Time to keep exploring ;-)