Some things you wanted to know about fonts (but were too afraid to ask)

As a web developer, your most common font problem is probably “Why doesn’t this character look like what I expect it to look like?”

Your problem is either

This character is legible but isn’t in the nice font I/my company shelled out big bucks for

or

This character isn’t even legible

In the latter case, your character may look like â€™thisâ€™, or it may just be a bunch of �����.

First I’m going to define some concepts with which we can troubleshoot most font problems. Then I’ll apply them to the three typical font problems I mentioned above.

Encoding: A set of mappings from sequences of bits to characters. For example, ASCII is a set of mappings from sequences of seven bits to characters, in which 1101000 maps to h.

Code point: The key in such a mapping, e.g. 1101000 (104 in decimal, 0x68 in hex).
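To make those two definitions concrete, here's a minimal Python sketch (Python is my choice here, not something the definitions depend on). It looks up the number ASCII assigns to h, prints it as seven bits, and runs the mapping in both directions.

```python
# The number that ASCII (and Unicode) assigns to the character "h".
code_point = ord("h")

# The same number written out as a sequence of seven bits.
seven_bits = format(code_point, "07b")

# Encoding goes from characters to bytes; decoding goes from bytes to characters.
encoded = "h".encode("ascii")
decoded = bytes([0b1101000]).decode("ascii")

print(code_point)  # 104
print(seven_bits)  # 1101000
print(encoded)     # b'h'
print(decoded)     # h
```

Swap "h" for any other ASCII character to see its key in the mapping.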

Glyph: An image of a character. For example, these are all glyphs for the character a:

(source: https://en.wikipedia.org/wiki/Glyph#/media/File:A-small_glyphs.svg)

Font: A set of glyphs representing a range of characters. Comic Sans Regular, for instance, is a font: one glyph for each character it supports. The characters supported by a font typically belong to a group, such as ASCII characters or math symbols.

Now that we have a lexicon with which to talk about these things [1], we can make some headway with troubleshooting each of the three font issues I mentioned:

  1. Correct but ugly characters. This must mean whatever is rendering the characters (since you’re probably a web developer, it’s probably your browser) isn’t using the intended font. Either it doesn’t know that it’s supposed to use that font, or it knows but doesn’t have access to that font, so it can’t find the glyph images it needs to render.
  2. Wrong character (â€™). If, say, you’re expecting ’ but getting â€™, this must mean the bit stream underlying your string is being parsed incorrectly, such that the bits representing ’ are somehow being mapped to â€™ instead of ’. In this case, since you’re expecting one character but getting three, your bits don’t appear to be broken up properly to begin with. If you get one character but it’s the wrong one, that also means the bits -> character mapping went awry. Either way, this sounds like an encoding problem (encoding == mappings, remember?). Maybe your string was encoded in a multi-byte encoding such as UTF-8 but is being interpreted with a single-byte encoding such as Windows-1252.
  3. No characters (���, ???, what have you). Either there’s no mapping from those bits to a character in the encoding your browser is using (for example, ISO-8859-1 doesn’t know what to do with 0x1F), or there’s a mapping but there’s no glyph for that character in the font your browser is using (say, because your stylesheet specifies Comic Sans and the character is Japanese).
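
Issues 2 and 3 are easy to reproduce outside the browser. Here's a rough Python sketch, not a diagnosis of any particular page. The UTF-8 and Windows-1252 (cp1252) pairing and the sample bytes are my assumptions, picked because they produce the symptoms described above.

```python
# The character we expect to see: a right single quotation mark.
original = "\u2019"

# Issue 2: the bytes were written as UTF-8 (three bytes for this character)...
utf8_bytes = original.encode("utf-8")

# ...but read back as Windows-1252, where every byte is its own character.
mojibake = utf8_bytes.decode("cp1252")

# Reversing the wrong step recovers the original character.
repaired = mojibake.encode("cp1252").decode("utf-8")

# Issue 3: these bytes have no mapping at all in UTF-8, so a lenient decoder
# substitutes U+FFFD (the replacement character) for each one.
broken = b"\xff\xfe\xfd".decode("utf-8", errors="replace")

print(utf8_bytes)  # b'\xe2\x80\x99'
print(mojibake)    # â€™
print(repaired)    # ’
print(broken)      # ���
```

One character going in and three coming out is the telltale sign of multi-byte data being read with a single-byte encoding. The second cause of issue 3 (a valid character with no glyph in the font) can't be shown this way, since it happens at render time rather than decode time.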

In later posts, we’ll walk through further debugging steps for each type of issue.


[1] If you find my grossly simplified glossary unsatisfactory, here’s a more thorough and entertaining overview written by someone smarter than me.
