Contents
Introduction
Methods of consuming fonts
The Standard Fonts
Font Mapping (Windows to PostScript)
Viewing Fonts
Using the Unicode Tables
Font Ambiguity and Confusing Terms
Font Styles and Methods of using them
Encoding and Non-Roman Alphabets
Unicode
Non-Roman Alphabets and Unicode Fonts
Legal
Font Name Registration Problems
Segoe UI Historic Phallus Microsoft Censorship
You may think choosing a font is just a case of scrolling
through that drop-down list in Word, but a little background knowledge of
fonts will help things run smoothly in your applications.
When printing to paper or a bitmap, fonts are no real
concern: provided they are available to the process performing the
printing, it is going to work. With a vector file format like PDF things
are not quite that simple. A vector format describes what is being
drawn rather than storing a static bitmap snapshot, so if a particular font
is used the PDF needs to reference it somehow. Here are the basic font
strategies when dealing with PDF or any vector format:
·
Standard 14 – The PDF and PostScript standards include 14 basic
fonts that should always be present on any rendering device, from a printer to a
software reader. If you stick to these fonts you will never have to
worry.
·
Reference Font – Use any font available to the system but don’t
include it within the PDF. If you are producing PDFs that will be rendered on a
system with the same fonts installed then this is fine. If the PDF is rendered
on a system where the fonts are not present, the font will be substituted and
the result won’t look right.
·
Embed Font – The font (or at least the used parts of it) is included
within the PDF and used when rendering, even if the font is not present
on the target system. The PDF file will be larger because it is
carrying the font.
·
Convert to Outlines/Strokes/Curves – Text is converted to vector
objects and the font is no longer required. The PDF file will be
larger because it now contains extra vector information describing all the text;
if there is a lot of text this may be quite considerable. There is no way of
doing this from code at the moment; you would normally do it in something like
Adobe Illustrator.
·
Include Bitmap – You can include a bitmap image within the PDF
that uses a particular font; for example, your company logo may use an exotic
font. This will also increase the file size; by how much depends on the bitmap
size.
Choosing the right option depends on your application. If
you’re sending out documents, like order confirmations, I would recommend the
PostScript Standard 14 fonts unless you have a good reason not to; this keeps
the file size down and avoids pretty much any problems. If you have to use a
certain font then choose one of the other options depending on how the PDF will
be used at the other end.
Within iTextSharp, if you enumerate FontFactory.RegisteredFonts
you get this list; these are the Standard 14:
·
courier
·
courier-bold
·
courier-oblique
·
courier-boldoblique
·
helvetica
·
helvetica-bold
·
helvetica-oblique
·
helvetica-boldoblique
·
symbol
·
times-roman
·
times-bold
·
times-italic
·
times-bolditalic
·
zapfdingbats
When you boil this down there are only 3 font families and 2
special fonts:
·
Courier – Looks like typewriter text and is mono-spaced (all
characters are the same width). Generally not used for standard paragraph text
but traditionally used for things like code listings within the page. On
Windows it is called “Courier New”.
·
Helvetica – Standard sans-serif font (no little bits on the ends of
characters). Not present on Windows; “Arial” is the closest equivalent.
·
Times – Standard serif font (little bits on the ends of characters),
used in the newspaper of the same name. Windows name: “Times New Roman”.
·
Symbol – Many Greek letters and special brackets used in
mathematical equations, plus the playing card suits.
·
Zapf Dingbats – Various wacky symbols like scissors, telephones,
religious symbols, snowflakes, and arrows.
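The grouping above can be checked mechanically. Here is a minimal Python sketch; the list is copied from the iTextSharp output above, and the helper name family_of is made up purely for illustration:

```python
from collections import defaultdict

# The Standard 14 names exactly as listed above.
STANDARD_14 = [
    "courier", "courier-bold", "courier-oblique", "courier-boldoblique",
    "helvetica", "helvetica-bold", "helvetica-oblique", "helvetica-boldoblique",
    "symbol",
    "times-roman", "times-bold", "times-italic", "times-bolditalic",
    "zapfdingbats",
]


def family_of(name: str) -> str:
    """Hypothetical helper: the family is the part before the first hyphen."""
    return name.split("-", 1)[0]


families = defaultdict(list)
for font in STANDARD_14:
    families[family_of(font)].append(font)

# Families with four members are the text families (regular, bold, slanted,
# bold + slanted); families with a single member are the special fonts.
text_families = sorted(f for f, members in families.items() if len(members) == 4)
special_fonts = sorted(f for f, members in families.items() if len(members) == 1)

print(text_families)
print(special_fonts)
```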
So for your general-purpose text you only have 2 options,
Helvetica (Arial) or Times, depending on which one you prefer. Helvetica is often
considered modern and Times more traditional. You can always mix them, but not
too much; a Times heading with a Helvetica body may work for you.
When using the Spludlow PDF the following fonts will be
automatically mapped:
Windows font        PostScript font
Courier New         Courier
Arial               Helvetica
Times New Roman     Times-Roman
Regular, Bold, Italic, and Bold & Italic styles are also
automatically mapped for these 3 fonts. So if, for example, you use the font
“Courier New, Bold, and Italic” it will map to the PostScript font
“Courier-BoldOblique”.
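To make the mapping rule concrete, here is a minimal Python sketch of how a Windows name plus styles could be turned into one of the Standard 14 PostScript names. This is not the Spludlow implementation; the table and the function name to_postscript are assumptions for illustration only.

```python
# Windows family -> (PostScript family, suffix for the regular style, slant word).
# Courier and Helvetica call their slanted style "Oblique", Times calls it "Italic".
FAMILY_MAP = {
    "Courier New": ("Courier", "", "Oblique"),
    "Arial": ("Helvetica", "", "Oblique"),
    "Times New Roman": ("Times", "Roman", "Italic"),
}


def to_postscript(windows_name: str, bold: bool = False, italic: bool = False) -> str:
    """Hypothetical helper: build the Standard 14 name for a Windows font and style."""
    base, regular, slant = FAMILY_MAP[windows_name]
    style = ("Bold" if bold else "") + (slant if italic else "")
    style = style or regular  # only Times has a suffix on its regular style
    return f"{base}-{style}" if style else base


print(to_postscript("Courier New", bold=True, italic=True))  # Courier-BoldOblique
print(to_postscript("Times New Roman"))                      # Times-Roman
```

Any font outside the three families raises a KeyError here, which mirrors the advice to stick to these 3 fonts when you want the mapping to happen automatically.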
If you want to keep things simple, stick to these 3 fonts and
use the Windows font names with styles applied to them (rather than trying to
use the individual, separately named font styles).
When writing code that will target print, bitmap, and PDF
use these 3 Windows Fonts and the PDF fonts will get mapped without you having
to write any extra code to cope with font differences in the PDF.
The Spludlow Framework provides methods in the
“Spludlow.Drawing.PDF” assembly that can be used to produce PDF Font Books and
PDF Unicode tables. Both of these PDFs can be very quickly paged through using
Adobe Reader or Microsoft Edge.
NOTE: The “Spludlow.Drawing.PDF” assembly has to be
installed (see video).
NOTE: I recommend installing Adobe Reader for viewing PDF
files before you start.
Let’s run through the video:
·
On the Intranet Call page search for “font”. Notice only the
basic font methods are available.
·
Go to the Spludlow Web and download the “Spludlow.Drawing.PDF”
assembly.
·
Extract the archive.
·
Copy the directory to “C:\Program
Files\SpludlowV1\Spludlow.Drawing.PDF”
·
Open in notepad the file “C:\ProgramData\SpludlowV1\Config\Applications.txt”
·
Add the line (3 parts tab delimited):
o
Spludlow.Drawing.PDF	C:\Program Files\SpludlowV1\Spludlow.Drawing.PDF	Lib
·
Close and save the file
·
Back to the Intranet Call page search for “font”. Notice now all
the font methods in the PDF assembly are being found.
·
Click on the method “Spludlow.Drawing.FontReports.Book(string
fontDirectory, string targetPdfFilename)”, leave the page as is for a moment.
·
Create a working directory “C:\FONTS” and give “SpludlowGroup”
full control.
·
Go to “C:\WINDOWS\Fonts” and enable the font “Segoe UI Historic”
(this will be handy later for demonstrating Unicode tables)
·
Back to the Call page enter the 2 method parameters:
o
fontDirectory C:\WINDOWS\Fonts
o
targetPdfFilename C:\FONTS\WindowsFonts.pdf
·
Click “Make Call Text” and “Run Call Text” to start running the
method
·
After a few minutes the PDF is produced. (You can check the
status & log pages for problems)
·
Open the PDF and page through looking for the font you want.
·
The same procedure is now performed on an un-installed collection
of fonts, “Adobe Font Folio” in the demo.
·
The source font directory must be accessible by “SpludlowGroup”,
this is achieved in the demo by moving the directory into “C:\FONTS”.
·
Search for “font” on the Call page again.
·
Click the method “Spludlow.Drawing.FontReports.UnicodeTables(string
fontDirectory, string targetDirectory)”.
·
Create an output directory for the Unicode tables:
“C:\FONTS\Windows Fonts Tables”
·
On the Call page enter the 2 method parameters:
o
fontDirectory C:\WINDOWS\Fonts
o
targetDirectory C:\FONTS\Windows Fonts Tables
·
Click “Make Call Text” and “Run Call Text” to start running the
method.
·
The Unicode Tables are produced one at a time for each font.
The font books and Unicode tables can be produced for
installed and un-installed fonts; just make sure “SpludlowGroup” can read them
(a location inside a user’s profile, like your desktop, will not have permission).
NOTE: You could print the font books to bitmap, printer,
or dummy (view them in the Intranet print page), but PDF in Adobe Reader works
better than anything else because you can page through very quickly.
By looking at the Unicode Table PDF file size you can guess
at how many glyphs a font contains. Fonts supporting the Roman alphabet only
tend to be around 1.4K. Fonts over this, like Arial at 1.7K, carry extras like
Latin Extended (all the European diacritics), Greek, Cyrillic, Hebrew, and Arabic.
Fonts with much larger PDFs are likely to be carrying the CJKV (Chinese, Japanese,
Korean, and Vietnamese) glyphs. Anything in between is most likely a specialist
font.
For older alphabets the “Segoe UI Historic” font (provided
with Windows, but you have to enable it) carries scripts going back to Egyptian
hieroglyphs.
You can “rip” the vector graphics from fonts by opening the
PDF in Illustrator at the page number taken from Adobe Reader; ensure the font
is installed in Windows.
NOTE: Some fonts, due to their licence, may not let you
embed or edit them.
I just created a logo in seconds without drawing anything using
“Segoe UI Emoji” for a legal firm that specializes in bodged fingernail
compensation. They can deal with any copyright issues.
When it comes to using fonts there are quite a lot of
annoyances and inconsistencies that can be a real pain. Here are some points to
clarify things:
·
“New” – Used when a font is recreated by another foundry, to differentiate it
from the original. Most of the time the fonts can be directly substituted.
Example: “Courier” and “Courier New”.
·
“Roman” – The standard western alphabet, often used to just mean
the basic “Regular” font of the family for example “Times Roman” is just Regular
Times.
·
“Latin” – Another name for the “Roman” alphabet, you can use the
2 words interchangeably.
·
Helvetica & Arial – On a Mac you get both. On Windows you do not get
Helvetica, so you have to use Arial, which is also present on Macs these days.
Obviously you can install Helvetica on Windows but it may not be present on
other Windows systems. For basic business documents, using these 2 fonts
interchangeably will be fine. For more design-critical applications the subtle
differences may be noticeable.
·
“Oblique” & “Italic” – Both refer to the font style that slants, and for
practical purposes you can use them interchangeably. For example “courier oblique”
is called “courier italic” in a parallel universe.
·
“Condensed” – Squashed from the sides, you can fit more in the
same space.
·
“Sans” – Short for Sans-Serif (Like Helvetica)
·
“Gothic” – Means a Sans-Serif font (Like Helvetica)
·
“Script” – Often used in font names to mean a handwritten or
calligraphy style
·
“Mono” – Mono-Spaced, all characters the same width like Courier
·
“Book” – Means “Regular”
·
“Demi” – Meaning half, so a demi-bold font is half bold
·
“Semi” – Can be used interchangeably with “Demi”
·
“Neue” – Another word for “New”, for example “Helvetica Neue”
·
“Light” – thinner than “Regular”
·
“Heavy” – bolder than “Bold”
·
“Black” – heavier than “Heavy”
Fonts may also have a reference to the foundry (who created
the font) in the name. This can be confusing and may cause you to wrongly
attribute the foundry reference in the name to something significant when it
means nothing in practice. Some of the big players are:
·
LT – Linotype
·
MT – Monotype
·
ITC – International Typeface Corporation
·
BT – Bitstream
·
MS – Microsoft
·
Adobe – Adobe (Not abbreviated)
Here are the basic font styles found in .NET; they are
self-explanatory:
·
Regular
·
Bold
·
Italic (Leaning)
·
Underline
·
Strikeout (Line through middle)
Bold and italic are the only real styles; underline and
strikeout are just lines drawn over the text.
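In .NET these styles are the members of the System.Drawing.FontStyle flags enumeration, so they can be combined with a bitwise OR. The Python sketch below mirrors those flag values to show the split between styles that select a font variant and styles that are only drawn lines. The class is a stand-in for illustration, and split_style is a made-up helper name.

```python
from enum import IntFlag


class FontStyle(IntFlag):
    """Python stand-in mirroring the values of .NET's System.Drawing.FontStyle."""
    REGULAR = 0
    BOLD = 1
    ITALIC = 2
    UNDERLINE = 4
    STRIKEOUT = 8


def split_style(style: FontStyle):
    """Hypothetical helper: separate what picks a font variant from drawn lines."""
    variant = style & (FontStyle.BOLD | FontStyle.ITALIC)
    lines = style & (FontStyle.UNDERLINE | FontStyle.STRIKEOUT)
    return variant, lines


variant, lines = split_style(FontStyle.BOLD | FontStyle.ITALIC | FontStyle.UNDERLINE)
print(int(variant), int(lines))  # prints: 3 4
```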
Some fonts may be supplied with many extra styles like
“Thin”, “Light”, “Heavy”, and “Black”. You can also get things like “Helvetica
Rounded”, which you would think should be a separate font altogether, but
that’s the way it is.
By default the font encoding used is CP-1252, the
standard 1 byte to 1 character encoding scheme used for most Western European
languages. This encoding is also referred to as ANSI or ISO 8859-1; this is
technically incorrect, but in practice they mean nearly the same thing. In
Acrobat Reader, if you go to “File->Properties->Fonts”, it shows this encoding as
“ANSI”.
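The difference between CP-1252 and ISO 8859-1 is confined to the byte range 0x80 to 0x9F, where CP-1252 puts printable characters such as the euro sign and ISO 8859-1 has control codes. A short Python sketch using the standard codecs shows this (the variable names are only for illustration):

```python
raw = b"\x80"

as_cp1252 = raw.decode("cp1252")   # the euro sign
as_latin1 = raw.decode("latin-1")  # an invisible control character, U+0080

# Bytes 0xA0 to 0xFF (accented letters and so on) decode identically in both.
same_upper = all(
    bytes([b]).decode("cp1252") == bytes([b]).decode("latin-1")
    for b in range(0xA0, 0x100)
)

print(repr(as_cp1252), repr(as_latin1), same_upper)
```

This is why treating the two names as the same thing usually works for Western European text, and why it goes wrong for characters like the euro sign and curly quotes.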
For other alphabets like Chinese, Unicode (see below) is
used for the encoding. To use Unicode for a font, suffix an asterisk to the
font name, for example “Microsoft YaHei*, 24”. The PDF format refers to this
encoding as “Identity-H” (the iTextSharp constant is IDENTITY_H); this is what
you will see in Acrobat Reader’s font information. When using Unicode encoding
iTextSharp seems to automatically embed the font, so you will get larger file
sizes.
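As an illustration of the asterisk convention, here is a rough Python sketch that splits a font string of the form shown above into a name, a size, and an encoding. It is not the Spludlow parser and only handles the simple “name, size” form; parse_font_spec is a hypothetical name. The two encoding strings are the values behind iTextSharp’s BaseFont.CP1252 and BaseFont.IDENTITY_H constants.

```python
def parse_font_spec(spec: str):
    """Hypothetical helper: split 'Name[*], size' into (name, size, encoding)."""
    name, _, size = spec.rpartition(",")
    name = name.strip()
    use_unicode = name.endswith("*")
    if use_unicode:
        name = name[:-1].rstrip()  # drop the asterisk marker
    encoding = "Identity-H" if use_unicode else "Cp1252"
    return name, float(size), encoding


print(parse_font_spec("Microsoft YaHei*, 24"))
print(parse_font_spec("Arial, 12"))
```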
Before Unicode there were different code pages for
different alphabets; the stored single byte value for each character would
represent a different symbol in different code pages. So you had to
specifically use a particular code page to read and write a file, otherwise
you would end up with duff characters all over the place.
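The code page problem is easy to reproduce. In this small Python sketch the same byte is decoded with three Windows code pages (Western European, Cyrillic, and Greek), and then a word saved with one code page is read back with another:

```python
raw = bytes([0xE9])

# One stored byte, three different characters depending on the code page.
decoded = {cp: raw.decode(cp) for cp in ("cp1252", "cp1251", "cp1253")}
print(decoded)

# Written as Western European, read back as Cyrillic: a duff character appears.
garbled = "café".encode("cp1252").decode("cp1251")
print(garbled)
```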
Unicode solves all these problems in these 2 simple ways:
1. Allow more bytes to be used for each character (greater address space).
2. Allocate every character from every alphabet a unique number from this
greater address space.
There is a little more to it, for example there are a few
different encoding methods (how the numbers are stored), e.g. UTF-8 and UTF-16,
but that’s the basics of how it works.
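To see the difference between the number allocated to a character and how it is stored, this Python sketch prints the code point plus the UTF-8 and UTF-16 bytes for a few characters. The last sample is an Egyptian hieroglyph, chosen only because it sits outside the range that fits in 2 bytes.

```python
samples = ["A", "é", "€", "\U00013000"]

rows = []
for ch in samples:
    code_point = f"U+{ord(ch):04X}"  # the unique number allocated by Unicode
    rows.append((code_point, ch.encode("utf-8"), ch.encode("utf-16-be")))

for code_point, utf8, utf16 in rows:
    print(code_point, "utf-8:", utf8.hex(" "), "utf-16:", utf16.hex(" "))
```

UTF-8 grows from 1 to 4 bytes as the number gets larger, while UTF-16 uses 2 bytes for most characters and 4 bytes for the rest.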
Here you can see standard Windows Arial has got the Arabic
alphabet in it as well as things like Greek and the extra accents used in many
European countries.
Obviously not all fonts contain all Unicode characters,
but if you are after a certain alphabet you may be surprised to find it in a
seemingly standard font.
Like anything, fonts are copyrighted and owned by whoever put
the work into creating them. There will be a usage licence somewhere that says
what you can and cannot do with them. If you care then by all means find and
read your font licence.
Some old TTF fonts seem to give problems when figuring out
the name. You can see this when running the “Font Demo”: some fonts aren’t
displayed and you get an error saying the font can’t be found.
Looking into it, there is a discrepancy between the name the font
displays and what the system thinks it is called. Take this
example:
·
Filename: LTe50127.ttf
·
Glyph Typeface Name: Aachen LT
·
Windows Display Name: Aachen LT Regular
·
Registry Name: Aachen LT Bold (TrueType)
·
Actual Working Name: Aachen LT Bold
Obviously this font is having an identity crisis; due to its
age it just isn’t playing ball with my current version of Windows. If you are
experiencing such a problem I’d recommend you don’t use the font and look for an
alternative. If you absolutely have to use the font you can run
“Spludlow.Drawing.Fonts.ReportFonts()” and try to figure out the font name from
the registry name.
Another way, which worked for me, is to install the font in
Windows then go to delete it; the “are you sure?” dialogue
displays the real name.
You can also open the font in a hex editor; you should be
able to spot the name in there somewhere.
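What you are looking for in the hex editor is the font’s “name” table. The Windows entries in it are stored as UTF-16, so the name shows up with a zero byte before each letter. Below is a minimal Python sketch that reads the table directly instead, assuming a single-font TrueType/OpenType file (not a .ttc collection). The function name read_font_names is made up and this is not part of the Spludlow Framework.

```python
import struct

# Common name IDs from the OpenType "name" table.
NAME_LABELS = {1: "family", 2: "subfamily", 4: "full name", 6: "PostScript name"}


def read_font_names(data: bytes) -> dict:
    """Hypothetical helper: return {nameID: text} from a font's 'name' table."""
    num_tables = struct.unpack_from(">H", data, 4)[0]
    name_offset = None
    for i in range(num_tables):
        # Each 16-byte table record: tag, checksum, offset, length.
        tag, _checksum, offset, _length = struct.unpack_from(">4sIII", data, 12 + 16 * i)
        if tag == b"name":
            name_offset = offset
            break
    if name_offset is None:
        raise ValueError("no 'name' table found")

    _format, count, string_offset = struct.unpack_from(">HHH", data, name_offset)
    names = {}
    for i in range(count):
        # Each 12-byte name record: platform, encoding, language, nameID, length, offset.
        platform_id, _enc, _lang, name_id, length, offset = struct.unpack_from(
            ">6H", data, name_offset + 6 + 12 * i
        )
        start = name_offset + string_offset + offset
        raw = data[start:start + length]
        if platform_id in (0, 3):  # Unicode and Windows platforms: UTF-16 big-endian
            names[name_id] = raw.decode("utf-16-be", errors="replace")
        elif platform_id == 1 and name_id not in names:  # Macintosh: usually Mac Roman
            names[name_id] = raw.decode("mac_roman", errors="replace")
    return names
```

To use it, read the font file as bytes (for example the LTe50127.ttf file mentioned above, from wherever you keep it), pass the bytes to read_font_names, and look up IDs 1, 2, and 4 in the result to compare the family, style, and full name the font claims for itself.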
The Microsoft font “Segoe UI Historic” contains many ancient
alphabets including Egyptian hieroglyphs.
For some reason Microsoft have removed the 3 Phallus Glyphs,
but left in U+13072 (look it up).
If you want the complete Egyptian hieroglyphs then try the
font NewGardiner.