macOS Character Viewer
Emoji is a conspiracy by the Unicode® Consortium to make Americans care about internationalization.
For a time, many developers operated under the assumption that user-input would be primarily Latin-1-compatible. Or at least they could feel reasonably assured that everything would fall within the Basic Multilingual Plane — fitting comfortably into a single UTF-16 code unit.
But nowadays, with Emoji emerging as the lingua franca of these troubled times, text is presumed international until proven otherwise. Everyone should be ready for when the U+1F4A9 hits the fan.
So whatever you think about those colorful harbingers of societal collapse, at least they managed to break us of our ASCII-only information diets.
This week on NSHipster, we’ll be looking at a relatively obscure part of macOS that will prove essential for developers in today’s linguistic landscape: Character Viewer
From any macOS app, you can select the Edit menu and find an item at the very bottom called “Emoji & Symbols” (tellingly renamed from “Special Characters” in OS X Mavericks).
By default, this opens a panel that looks something like this:
You may have discovered this on your own and found it to be a convenient alternative to searching for Emoji online.
But this isn’t even its final form! Click the icon on the top right to see its true power:
Go ahead and memorize the global shortcut if you haven’t already: ⌃⌘Space. If you do any serious work with text, you’ll be using Character Viewer frequently.
Let’s take a quick tour of Character Viewer and see what it can do for us:
A Quick Tour of Character Viewer
The sidebar on the left provides quick access to your favorite and frequently-used characters, as well as a customizable list of named categories like Emoji, Latin, Punctuation, and Bullets/Stars.
The center column displays a grid of characters. Some categories, including Emoji, provide special views that make it easier to browse through the characters in that collection.
Selecting a character populates the inspector pane on the right with a larger, isolated rendering of the character, the character name, code point, and UTF-8 encoding. The inspector may also show alternate glyph renderings provided by other fonts as well as related characters, as applicable.
Copying Character Information
You can control-click a character and choose “Copy Character Info” from the shortcut menu to copy the information found in the inspector. For example:
😂
face with tears of joy
Unicode: U+1F602, UTF-8: F0 9F 98 82
Let’s take a look at what all of this means and how to use it in Swift code:
Character Literal
The first line of the copied character information is the character itself.
Swift source code fully supports Unicode, so you can copy-paste any character into a string literal and have it work as expected:
"😂" // 😂
All characters found in Character Viewer are valid string and character literals. However, not all entries are valid Unicode scalar literals. For example, the character 👩🏻🦰 is a named sequence consisting of four individual code points:
- 👩 WOMAN (U+1F469)
- 🏻 EMOJI MODIFIER FITZPATRICK TYPE-1-2 (U+1F3FB)
- ␣ ZERO WIDTH JOINER (U+200D)
- 🦰 EMOJI COMPONENT RED HAIR (U+1F9B0)
Attempting to initialize a Unicode.Scalar
value
from a string literal with this character
results in an error.
("👩🏻🦰" as Unicode.Scalar) // error
Unicode Code Point
Each Unicode character is assigned a unique name and number, known as a code point. By convention, Unicode code points are formatted as 4 – 6 hexadecimal digits (0–9, A–F) with the prefix “U+”.
In Swift,
string literals have the \u{n}
escape sequence
which takes a 1 – 6 hexadecimal number corresponding to a
Unicode scalar value
(essentially, the numerical value of any code point
that isn’t a surrogate).
The character 😂 has a scalar value equal to 1F602₁₆ (128514 in decimal).
You can plug that number into a \u{}
escape sequence in a string literal
to have it replaced by the character in the resulting string.
"\u{1F602}" // "😂"
Unicode scalar value escape sequences are especially useful when working with nonprinting control characters like directional formatting characters.
UTF-8 Code Units
The pairs of hexadecimal digits labeled “UTF8” correspond to the code points for the UTF-8 encoded form of the character.
The UTF-8 code unit is a byte (8 bits), which is represented by two hexadecimal digits.
In Swift,
you can use the String(decoding:as:)
initializer
to create a string from an array of UInt8
values
corresponding to the values copied from Character Viewer.
String(decoding: [0x F0, 0x9F, 0x98, 0x82], as: UTF8.self) // 😂
Unicode Character Name
The last piece of information provided by Character Viewer is the name of the character “face with tears of joy”.
The Swift standard library doesn’t currently provide a way to
initialize Unicode scalar values or named sequences.
However, you can use the String
method
applying
provided by the Foundation framework
to get a character by name:
import Foundation
"\\N{FACE WITH TEARS OF JOY}".applying Transform(.to Unicode Name,
reverse: true)
// "😂"
Perhaps more usefully,
you can apply the .to
transform in the non-reverse direction
to get the Unicode names for each character in a string:
"🥳✨".applying Transform(.to Unicode Name, reverse: false)
// \\N{FACE WITH PARTY HORN AND PARTY HAT}\\N{SPARKLES}
Things to Do with Character Viewer
Now that you’re more familiar with Character Viewer, here are some ideas for what to do with it:
Add Keyboard Shortcut Characters to Favorites
All developers should take responsibility for writing documentation about the software they work on and the processes they use in their organization.
When providing instructions for using a Mac app, it’s often helpful to include the keyboard shortcuts corresponding to menu items. The symbols for modifier keys like Shift (⇧) are difficult to type, so it’s often more convenient to pick them from Character Viewer. You can make it even easier for yourself by adding them to your Favorites list.
Click on the Action button at the top left corner of the Character Viewer panel and select the Customize List… menu item. In the displayed sheet, scroll to the bottom of the categories listed under Symbols and check the box next to Technical Symbols.
⌃ | Control | UP ARROWHEAD (U+2303) |
⌥ | Alt / Option | OPTION KEY (U+2325) |
⇧ | Shift | UPWARDS WHITE ARROW (U+21E7) |
⌘ | Command | PLACE OF INTEREST SIGN (U+2318) |
Dismiss the sheet and select Technical Symbols in the sidebar, and you’ll notice some familiar keyboard shortcut characters. Add them to your Favorites list by selecting each individually and clicking the Add to Favorites button in the inspector.
Demystify Unknown Characters
Ever see a character and wonder what it was? Simply copy-paste it into the search field of Character Viewer to get its name and number.
For example, have you ever wondered about the character you get by typing ⌥⇧K? Like, how did Apple get its logo into the Unicode Standard when that goes against their criteria for encoding symbols?
By copy-pasting into the Character Viewer, you can learn that, in fact, the Apple logo isn’t an encoded Unicode character. Rather, it’s a glyph associated with the code point U+F8FF from the Private-Use Area block.
The next time you have a question about what’s in your pasteboard, consider asking Character Viewer instead of Safari.
Divest Your Cryptocurrency
Given the current economic outlook for Bitcoin (₿) and other cryptocurrencies, you may be looking to divest your holdings in favor of something more stable and valuable. Look no further than the Currency Symbols category for some exciting investment opportunities, including French francs (₣) and Italian lira (₤).
Explore the Unicode Code Table
At the bottom of the Customize List sheet, you’ll find a section titled Code Tables.
Go ahead and check the box next to Unicode.
This is arguably the best interface available to you for browsing the Unicode Standard. No web page comes close to matching the speed and convenience of what’s available here in the macOS Character Viewer.
The top panel shows a sortable table of Unicode blocks, with their code point offset, name, and category. Clicking on any of these entries navigates to the corresponding offset in the bottom panel, where characters are displayed in a 16-column grid.
Brilliant.
Character Viewer is an indispensable tool for working with text on computers — a hidden gem in macOS if ever there was one.
But even more than that, Character Viewer offers a look into our collective linguistic and cultural heritage as encoded into the Unicode Standard. Etchings made thousands of years ago by Phoenician merchants and Qin dynasty bureaucrats and Ancient Egyptian priests and Lycian school children — they’re all preserved here digitally, just waiting to be discovered.
Seriously, how amazing is that?
So if ever you grow weary of the awfulness of software… take a scroll through the multitude of scripts and symbols in the Unicode code table, and take solace that we managed to get a few things right along the way.