Text chunks
When developing resvg, text chunks were probably one of the hardest things about text to wrap my head around.
When someone thinks about SVG text they probably imagine something like this:
<text x="5" y="10" font-family="Arial">
<span fill="red">red</span> text
</text>
Basically, a text with some styles applied to it. And this is true in most cases, but it's not how
SVG text structure actually looks like. SVG has an additional, hidden structure level called
text chunk.
Every time there is an x
or y
attribute in <text>
or <tspan>
- a new text chunk is defined.
This means that an SVG text element actually contains a list of text chunks,
where each chunk contains text with styles applied to it.
In some sort of pseudo-code it should look like:
class Text:
chunks: list[TextChunk] = []
class TextChunk:
text: str = ""
spans: list[TextSpan] = []
alignment: int = 0 # start/mid/end
class TextSpan:
# fill, stroke, font, etc.
For example, this text element has two chunks: some at 5,10 and text at 5,20.
<text x="5" y="10" font-family="Arial">
some
<tspan y="20">text</tspan>
</text>
Why is this important? Because this is what a line of text in SVG means - a chunk.
This is a line of text that would be passed to a text shaper/layout,
and not the whole element's content.
This is a line of text that would be aligned via text-anchor
,
and not the whole element's content or individual <tspan>
.
It's a very subtle nuance, but many libraries fail to handle it correctly.
For example, in this case, text-anchor
has no effect because that <tspan>
doesn't define a new chunk by setting an absolute coordinate.
<text x="5" y="10" font-family="Arial">
some
<tspan text-anchor="end">text</tspan>
</text>
In theory, each text chunk can be extracted into its own text
element, while tspan
can't.
Also, while it's not clear from the spec, each textPath
defines a new chunk as well.
Another interesting edge case is that there could be only one writing-mode
per text element.
An individual chunk or span cannot have their own writing modes.