typst: Emoji do not render in PDFs

Hi,

I’ve just cloned the repo and compiled it. After compiling the following document, I get a blank page (using Evince) or some tofu (using Firefox):

#emoji.face.grin

The generated PDF is 10 MB in size, with the Noto Color Emoji font embedded. When I change the zoom level, Evince reports a font-related failure on stderr.

System: Ubuntu 22.04, cargo 1.68.0, typst 045a1096

About this issue

State: closed
Created a year ago
Reactions: 40
Comments: 18 (10 by maintainers)

Most upvoted comments

As discussed on Discord, here is some background on color fonts and Typst’s PDF font handling and then the steps required to fix this issue.

A bit of background on color fonts

OpenType supports multiple formats for encoding emoji fonts. The data for each of these is stored in OpenType tables within the font. We can query this data with ttf-parser. The following color formats exist:

sbix: A table that encodes emojis as raster images. Backed by Apple. (Example font: Apple Color Emoji)
CBDT: Another table that encodes emojis as raster images, but in a slightly different way. Backed by Google. (Example font: Noto Color Emoji)
SVG: Encodes emojis as a subset of SVG. Backed by Adobe and Mozilla. (Example font: Twitter Color Emoji)
COLRv0: Encodes emojis with the normal font outlines + color palettes. Backed by Microsoft. (Example font: Segoe UI Emoji)
COLRv1: Microsoft noticed that color emojis with just plain colors look a bit boring, so they added a ton of SVG-like features to the COLR table. This format is quite recent and support for it isn’t merged into ttf-parser yet. Even with the latest updates, Windows doesn’t seem to ship it, so we can skip it for now. (Example font: Recent versions of Noto Color Emoji)

The inner workings of these OpenType tables are mostly abstracted away by ttf-parser, but it’s still important to know how they work conceptually.
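To make the list of formats concrete, here is a minimal sketch that detects which color tables a font file contains by reading the OpenType table directory directly. It uses only the standard library and is not how ttf-parser or Typst do it; the names ColorFormat and color_formats are made up for this illustration, and font collections (.ttc) are not handled.

```rust
/// Color glyph formats from the list above (hypothetical enum for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ColorFormat {
    Sbix,
    Cbdt,
    Svg,
    Colr,
}

/// Scan the table directory of a single (non-collection) OpenType file and
/// report which color tables are present, in directory order.
pub fn color_formats(font: &[u8]) -> Vec<ColorFormat> {
    let mut found = Vec::new();
    // Header: sfntVersion (4 bytes), numTables (u16), then three u16 search fields.
    if font.len() < 12 {
        return found;
    }
    let num_tables = u16::from_be_bytes([font[4], font[5]]) as usize;
    for i in 0..num_tables {
        // Each table record is 16 bytes: tag, checksum, offset, length.
        let start = 12 + i * 16;
        let Some(record) = font.get(start..start + 16) else { break };
        let format = match &record[..4] {
            b"sbix" => ColorFormat::Sbix,
            b"CBDT" => ColorFormat::Cbdt,
            b"SVG " => ColorFormat::Svg, // the tag is padded with a space
            b"COLR" => ColorFormat::Colr,
            _ => continue,
        };
        found.push(format);
    }
    found
}
```

The tag alone does not distinguish COLRv0 from COLRv1; that requires reading the version field at the start of the COLR table, which this sketch skips.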

How Typst writes text and fonts into PDFs

Within write_text in typst-pdf/src/page.rs, text is written by writing CIDs (character IDs) into the content stream with the /TJ operator. Despite their name, CIDs are not like Rust chars. Instead, they typically map one-to-one to glyph IDs in a font because we configure an Identity CID-to-GID mapping (except in the case of CFF fonts, which work a bit differently).

The CIDs reference a font configured via the /Tf operator. While writing the text items, we collect all fonts that are referenced, which we then embed into the PDF at the end. This happens in typst-pdf/src/font.rs. To be able to copy from the PDF, a PDF viewer must map the CIDs back to Unicode text. This is what the /ToUnicode mapping is for, which we write for each font.
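As a rough illustration of the two paragraphs above, the following sketch builds the content-stream fragment for a run of glyphs and one bfchar line of a ToUnicode CMap. It is plain string formatting under the assumption of two-byte CIDs with an Identity mapping; show_glyphs and to_unicode_entry are hypothetical helpers, not functions from typst-pdf.

```rust
/// Build a content-stream fragment that shows `glyph_ids` with the font
/// resource `font_name` at `size` points.
pub fn show_glyphs(font_name: &str, size: f32, glyph_ids: &[u16]) -> String {
    // With an Identity CID-to-GID mapping, each CID is the glyph ID itself,
    // written as two big-endian bytes (four hex digits).
    let hex: String = glyph_ids.iter().map(|gid| format!("{gid:04X}")).collect();
    format!("BT\n/{font_name} {size} Tf\n[<{hex}>] TJ\nET\n")
}

/// One `bfchar` line of a ToUnicode CMap: the CID on the left, the text it
/// stands for as UTF-16BE on the right (surrogate pairs for emoji).
pub fn to_unicode_entry(cid: u16, text: &str) -> String {
    let utf16: String = text.encode_utf16().map(|u| format!("{u:04X}")).collect();
    format!("<{cid:04X}> <{utf16}>")
}
```

For example, to_unicode_entry(0x0048, "😀") yields `<0048> <D83DDE00>`, which is what lets a viewer turn the glyph back into U+1F600 when text is copied.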

So, this is how it works for normal fonts. The problem now is that PDF viewers completely ignore the color tables in emoji fonts and fall back to normal outlines (if available). To get emojis to show up, we have two different options:

Encode them as graphics rather than text. Then, they aren’t copy-pastable. While we can, in theory, specify an /ActualText that should be copied, many PDF viewers don’t seem to support that.
Encode them as Type 3 fonts. A Type 3 font is a special type of PDF font that doesn’t embed font data in an external format like TTF or CFF, but rather defines the font’s glyphs directly as PDF objects. This way, we can create the emojis as PDF graphics, but display them with the normal text-showing operators.

Based on the conversation above and what other tools do, Type 3 fonts seem like the better approach. Relevant details can be found in section 9.6.5, “Type 3 Fonts”, of the PDF 1.7 specification.
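To give an idea of what a Type 3 font looks like on the PDF side, here is a hedged sketch that formats a font dictionary and a trivial glyph procedure as strings. The dictionary keys come from the PDF specification; the Type3Glyph struct, the function names, the fixed 1000-unit glyph space, and the object numbers are assumptions made for this example. A real implementation would also need a /Resources dictionary whenever glyph procedures reference images.

```rust
/// One glyph of a Type 3 font (hypothetical struct for this sketch).
pub struct Type3Glyph {
    pub name: String,  // glyph name used in /CharProcs and /Encoding
    pub proc_ref: u32, // object number of the glyph's content stream
    pub width: f32,    // advance width in glyph space
}

/// Format a Type 3 font dictionary for up to 256 glyphs, with character
/// code `i` mapped to `glyphs[i]`.
pub fn type3_font_dict(glyphs: &[Type3Glyph]) -> String {
    assert!(!glyphs.is_empty() && glyphs.len() <= 256);
    let procs: String =
        glyphs.iter().map(|g| format!("/{} {} 0 R ", g.name, g.proc_ref)).collect();
    let names: String = glyphs.iter().map(|g| format!("/{} ", g.name)).collect();
    let widths: String = glyphs.iter().map(|g| format!("{} ", g.width)).collect();
    format!(
        "<< /Type /Font /Subtype /Type3\n\
         /FontBBox [0 0 1000 1000]\n\
         /FontMatrix [0.001 0 0 0.001 0 0]\n\
         /CharProcs << {procs}>>\n\
         /Encoding << /Type /Encoding /Differences [0 {names}] >>\n\
         /FirstChar 0 /LastChar {last}\n\
         /Widths [{widths}] >>\n",
        last = glyphs.len() - 1
    )
}

/// A glyph procedure is an ordinary content stream. `d0` declares the
/// advance width and allows the procedure to set its own colors, which is
/// what color glyphs need. This one just fills the em square with one color.
pub fn solid_square_proc(r: f32, g: f32, b: f32) -> String {
    format!("1000 0 d0\n{r} {g} {b} rg\n0 0 1000 1000 re\nf\n")
}
```

The point of the second function is only that the glyph body uses the same operators as any other PDF graphics, so an emoji converted to PDF drawing commands can be placed there.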

Implementing it in Typst

Here’s a rough outline of the steps involved in implementing emoji handling for PDF in Typst:

When writing a glyph in a text run, we need to detect whether an emoji glyph definition exists in any of the formats above. If yes, we need to terminate the text run and switch to a Type 3 font we will generate for it. This should live in typst-pdf/src/page.rs, likely using some helpers defined in typst-pdf/src/font.rs.
We need code to convert emoji definitions in any of the formats above into PDF content streams. The PNG exporter has existing code for all the formats except COLR (since it’s a recent addition to ttf-parser) whereas the SVG exporter doesn’t handle them yet. To share as much code as possible between the exporters, it would probably make sense to convert a color glyph to a Typst Frame rather than directly producing PDF content for it and then reuse this frame across all three exporters. This code could live in typst/src/text/font/color.rs.
An unfortunate limitation of Type 3 fonts is that they can encode at most 256 glyphs, so if more than 256 emojis from the same font are used, we need to write multiple Type 3 fonts for that one.
We need to actually write the necessary variable number of Type 3 fonts for each font and generate /ToUnicode mappings for them. This should live in typst-pdf/src/font.rs.
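The 256-glyph limit from the steps above can be handled with a small allocator that hands out (font index, character code) pairs and opens a new Type 3 font whenever the current one is full. This is only a sketch of one possible bookkeeping scheme; Type3Allocator and its methods are hypothetical and do not exist in Typst.

```rust
use std::collections::HashMap;

/// A Type 3 font addresses its glyphs with one-byte character codes.
const MAX_TYPE3_GLYPHS: usize = 256;

/// Assigns each color glyph of one source font a slot in a sequence of
/// Type 3 fonts (hypothetical helper for this sketch).
#[derive(Default)]
pub struct Type3Allocator {
    slots: HashMap<u16, (usize, u8)>,
    order: Vec<u16>,
}

impl Type3Allocator {
    /// Return (Type 3 font index, character code) for a glyph, assigning
    /// the next free slot the first time the glyph is seen.
    pub fn slot(&mut self, glyph_id: u16) -> (usize, u8) {
        if let Some(&slot) = self.slots.get(&glyph_id) {
            return slot;
        }
        let n = self.order.len();
        let slot = (n / MAX_TYPE3_GLYPHS, (n % MAX_TYPE3_GLYPHS) as u8);
        self.slots.insert(glyph_id, slot);
        self.order.push(glyph_id);
        slot
    }

    /// Glyph IDs grouped per Type 3 font, in character-code order. Each
    /// group becomes one font with its own /ToUnicode mapping.
    pub fn fonts(&self) -> Vec<&[u16]> {
        self.order.chunks(MAX_TYPE3_GLYPHS).collect()
    }
}
```

While writing a text run, the exporter would call slot for every color glyph, end the current run when the font index changes, and at the end write one Type 3 font per entry returned by fonts.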

+19

laurmaedje on Dec 8, 2023

Damn, all I wanted was a little 🐿️, turns out it’s Specs War Infinity Edition. Thanks for planning on supporting this 😀

+14

lvignoli on Dec 8, 2023

As a workaround, I wrote a package, svg-emoji, that replaces emoji with SVG glyphs directly. For now, it only supports Noto.

+12

polazarus on Sep 18, 2023

@elegaanz thanks a lot for this crucial feature! Typst is now 1.0 for me 🐿️

lvignoli on Apr 17, 2024

The planned fix is to export them as XObjects with /ActualText to make them copyable. If that turns out to be problematic, another alternative would be to embed them as Type 3 fonts.

laurmaedje on Jun 19, 2023

Emoji fonts aren’t correctly exported at the moment. There may also be unrelated font issues at play here.

laurmaedje on Mar 25, 2023

Hum, that’s quite interesting:

#set text(font: (
  "Segoe UI Emoji"
))
Segoe Emoji here → #emoji.face.grin #emoji.amphora ← ?

#set text(font: (
  "Noto Color Emoji"
))
Noto Emoji here → #emoji.face.grin #emoji.amphora ← ?

With the following result:

Here is the file test.pdf

gsurrel on Mar 24, 2023