Language

When text contains characters that are missing from the specified font family (or families), an SVG library will try to fall back to some other font that does contain them. This process is called font fallback. Which font gets picked is completely undefined, and it's a rabbit hole of its own.

The problem is that a single Unicode character can be written differently depending on the language. It's not an issue for most languages, but it can become one for languages that use Chinese-derived characters (aka CJK: Chinese, Japanese and Korean).

To force the text language, SVG uses the xml:lang attribute:

<g font-family="sans-serif" font-size="32">
    <!-- Default (will usually fall back to Chinese) -->
    <text x="100" y="58" text-anchor="middle">刃直海角骨入</text>
    <!-- Japanese -->
    <text x="100" y="108" text-anchor="middle" xml:lang="ja">刃直海角骨入</text>
    <!-- Traditional Chinese -->
    <text x="100" y="158" text-anchor="middle" xml:lang="zh-HANT">刃直海角骨入</text>
</g>

Here we set font-family to sans-serif, which can resolve to essentially any font, and we have three text elements with exactly the same content but different languages.
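
Because xml:lang applies to an element and everything inside it (per the XML specification), the language can also be set once on a container instead of on each text element. A minimal sketch, assuming the renderer treats the inherited value the same way as one set directly on text; the coordinates are arbitrary:

```xml
<!-- xml:lang on the group applies to every descendant -->
<g font-family="sans-serif" font-size="32" xml:lang="ja">
    <!-- Japanese, inherited from the parent group -->
    <text x="100" y="58" text-anchor="middle">刃直海角骨入</text>
    <!-- Traditional Chinese: the closer declaration overrides the parent -->
    <text x="100" y="108" text-anchor="middle" xml:lang="zh-HANT">刃直海角骨入</text>
</g>
```

Whether a given renderer honors the inherited value is something to verify per library, just like xml:lang support itself.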

Based on my tests, only Chrome, Firefox and librsvg handle xml:lang. The expected output should look something like this:

As you can see, while our Unicode strings are identical, they are rendered differently.
On macOS, Chrome falls back to the following fonts:

  • PingFang SC for the "default" language
  • Hiragino Kaku Gothic ProN for Japanese
  • PingFang TC for Traditional Chinese

And while the actual "style" of a glyph can differ from font to font, the character form itself should always be the same for a given language.
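
For renderers that ignore xml:lang, one possible workaround is to name a language-specific font explicitly rather than rely on fallback. This is only a sketch: it assumes the macOS fonts listed above are installed, and the trailing sans-serif is just a safety net for systems where they are not.

```xml
<g font-size="32">
    <!-- Japanese forms, selected by the font itself -->
    <text x="100" y="58" text-anchor="middle" xml:lang="ja"
          font-family="'Hiragino Kaku Gothic ProN', sans-serif">刃直海角骨入</text>
    <!-- Traditional Chinese forms, selected by the font itself -->
    <text x="100" y="108" text-anchor="middle" xml:lang="zh-HANT"
          font-family="'PingFang TC', sans-serif">刃直海角骨入</text>
</g>
```

The xml:lang attributes are kept so that renderers which do support them still get the language hint when the named font is missing.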


Note: This chapter was inspired by this article.