Device Identifiers and Fingerprinting on iOS
For every era, there’s a monster that embodies the anxieties of the age.
At the dawn of the Holocene, our ancestors traced the contours of shadows cast by the campfire as they kept watch over the darkness. Once we learned to read the night sky for navigation, sailors swapped stories of sea creatures like Leviathan and Siren to describe the dangers of open ocean (and the perils to be found on unfamiliar shores).
Frankenstein’s monster was as much the creation of Mary Shelley as it was a spiritual collaboration with Luigi Galvani. And Bram Stoker’s fictionalized account of the mummy’s curse was more a response to the Egyptomania and European colonialism of the nineteenth century than any personal account of the Middle Kingdom.
More recently, the “monster ruins a beach party” trope of the 1960s arose from concerns about teenage morality, while the Martians who invaded those same drive-in double features served as a proxy for Cold War fears at the height of the Space Race.
All of which begs the question: “What monster best exemplifies our present age?”
Consider the unnamed monster from the film It Follows: a formless, supernatural being that relentlessly pursues its victims anywhere on the planet.
Sounds a bit like the state of ad tech in 2019, no?
This week on NSHipster — in celebration of our favorite holiday 🎃 — we’re taking a look at the myriad ways that you’re being tracked on iOS, both sanctioned and unsanctioned, historically and presently. So gather around the campfire, and allow us to trace the contours of the unseen, formless monsters that stalk us under cover of Dark Mode.
The Cynicism of Marketing and Advertising Technology
Contrary to our intuitions about natural selection in the marketplace, history is littered with examples of inferior-but-better-marketed products winning out over superior alternatives: VHS vs. Betamax, Windows vs. Macintosh, etc. (According to the common wisdom of business folks, at least.) Regardless, most companies reach a point where “if you build it, they will come” ceases to be a politically viable strategy, and someone authorizes a marketing budget.
Marketers are tasked with growing market share by identifying and communicating with as many potential customers as possible. And many, whether out of genuine belief or as a post hoc rationalization, take the potential benefit of their product as a license to flout long-established customs of personal privacy. So they enlist the help of one or more advertising firms, who promise to maximize their allocated budget and provide some accountability for their spending by using technology to target, deliver, and analyze messaging to consumers.
Each of these tasks is predicated on a consistent identity, which is why advertisers go to such great lengths to track you.
- Without knowing who you are, marketers have no way to tell if you’re a likely or even potential customer.
- Without knowing where you are, marketers have no way to reach you other than to post ads where they’re likely to be seen.
- Without knowing what you do, marketers have no way to tell if their ads worked or were seen at all.
Apple-Sanctioned Identifiers
Apple has provided several APIs to facilitate user identification for various purposes:
Universal Identifiers (UDID)
In the early days of iOS, Apple provided a uniqueIdentifier property on UIDevice, affectionately referred to as a UDID (not to be confused with a UUID). Although such functionality seems unthinkable today, that property existed until it was deprecated in iOS 5 and replaced by identifierForVendor in iOS 6.
Vendor Identifiers (IDFV)
Starting in iOS 6, developers can use the identifierForVendor property on UIDevice to generate a unique identifier that’s shared across apps and extensions created by the same vendor (IDFV).
import UIKit
let idfv = UIDevice.current.identifierForVendor // BD43813E-CFC5-4EEB-ABE2-94562A6E76CA
Advertising Identifiers (IDFA)
Along with identifierForVendor came the introduction of a new AdSupport framework, which Apple created to help distinguish identification necessary for app functionality from anything in the service of advertising. The resulting advertisingIdentifier property (affectionately referred to as IDFA by its associates) differs from identifierForVendor by returning the same value to every app on the device, regardless of vendor. The value can change, for example, if the user resets their Advertising Identifier or erases their device.
import AdSupport
let idfa = ASIdentifierManager.shared().advertisingIdentifier
If advertising tracking is limited, the property returns a zeroed-out UUID instead.
idfa.uuidString == "00000000-0000-0000-0000-000000000000" // true if the user has limited ad tracking
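Because the zeroed-out value is still a well-formed UUID, it is easy to store it by accident as if it were a real identifier. The following is a minimal sketch of a guard against that, using only Foundation so it runs outside of an app. The function name `usableAdvertisingIdentifier` is hypothetical and not part of AdSupport.

```swift
import Foundation

// The all-zero UUID that stands in for the IDFA when ad tracking is limited.
let zeroedUUID = UUID(uuid: (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0))

/// Hypothetical helper: returns nil for the zeroed-out placeholder,
/// and the identifier itself otherwise.
func usableAdvertisingIdentifier(_ identifier: UUID) -> UUID? {
    return identifier == zeroedUUID ? nil : identifier
}

// Example: a limited-tracking value is treated as "no identifier".
let placeholder = UUID(uuidString: "00000000-0000-0000-0000-000000000000")!
if usableAdvertisingIdentifier(placeholder) == nil {
    print("Ad tracking is limited; no identifier available")
}
```

In an app, the value passed in would be the advertisingIdentifier read from ASIdentifierManager, as in the snippet above.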
DeviceCheck
identifierForVendor and advertisingIdentifier provide all the same functionality as the uniqueIdentifier property they replaced in iOS 6, save for one: the ability to persist across device resets and app uninstalls.
In iOS 11, Apple quietly introduced the DeviceCheck framework, which allows developers to assign two bits of information that are persisted by Apple until the developer manually removes them.
Interacting with the DeviceCheck framework should feel familiar to anyone who has worked with APNs: after setting things up on App Store Connect and your servers, the client generates tokens on the device, which are sent to your servers to set or query two bits of information:
import DeviceCheck
let device = DCDevice.current
if device.isSupported {
    device.generateToken { data, error in
        if let token = data?.base64EncodedString() {
            // send token to your server
        }
    }
}
Based on the device token and other information sent by the client, the server tells Apple to set each bit value by sending a JSON payload like this:
{
  "device_token": "QTk4QkFDNEItNTBDMy00Qjc5LThBRUEtMDQ5RTQzRjNGQzU0Cg==",
  "transaction_id": "D98BA630-E225-4A2F-AFEC-BE3A3D591708",
  "timestamp": 1572531720,
  "bit0": true,
  "bit1": false
}
To retrieve those two bits at a later point in time, the server sends a payload without bit0 and bit1 fields:
{
  "device_token": "QTk4QkFDNEItNTBDMy00Qjc5LThBRUEtMDQ5RTQzRjNGQzU0Cg==",
  "transaction_id": "D98BA630-E225-4A2F-AFEC-BE3A3D591708",
  "timestamp": 1572532500
}
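The update and query payloads differ only in whether the bit0 and bit1 fields are present, so one way to model them is a single Encodable type with optional bits. This is a sketch for a server written in Swift, under the assumption that the JSON field names match the payloads shown here. The type name `DeviceCheckRequest` and the function `encodePayload` are hypothetical, and the authentication and HTTP request to Apple are not shown.

```swift
import Foundation

// Hypothetical model of the payloads above. Optional bits are omitted
// from the encoded JSON when nil, which yields the query form.
struct DeviceCheckRequest: Encodable {
    var deviceToken: String
    var transactionID: String
    var timestamp: Int
    var bit0: Bool? = nil
    var bit1: Bool? = nil

    enum CodingKeys: String, CodingKey {
        case deviceToken = "device_token"
        case transactionID = "transaction_id"
        case timestamp
        case bit0
        case bit1
    }
}

/// Encodes a request as a JSON string with keys in sorted order.
func encodePayload(_ request: DeviceCheckRequest) throws -> String {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.sortedKeys]
    let data = try encoder.encode(request)
    return String(decoding: data, as: UTF8.self)
}

let token = "QTk4QkFDNEItNTBDMy00Qjc5LThBRUEtMDQ5RTQzRjNGQzU0Cg=="
let transaction = "D98BA630-E225-4A2F-AFEC-BE3A3D591708"

// Update: both bits present.
let update = DeviceCheckRequest(deviceToken: token, transactionID: transaction,
                                timestamp: 1572531720, bit0: true, bit1: false)
// Query: bits left as nil, so they are not encoded.
let query = DeviceCheckRequest(deviceToken: token, transactionID: transaction,
                               timestamp: 1572532500)

if let json = try? encodePayload(query) {
    print(json)
}
```

Relying on the synthesized Encodable conformance keeps the two payload shapes in one place, at the cost of making it possible to send only one of the two bits.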
If everything worked, Apple’s servers would respond with a 200 status code and the following JSON payload:
{
  "bit0": true,
  "bit1": false,
  "last_update_time": "2019-10"
}
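On the receiving end, a response of this shape can be read with a small Decodable type. This is a sketch that assumes the three fields shown in the example response are always present. The type name `DeviceCheckBits` and the function `decodeBits` are hypothetical.

```swift
import Foundation

// Hypothetical model of the query response shown above.
struct DeviceCheckBits: Decodable {
    let bit0: Bool
    let bit1: Bool
    let lastUpdateTime: String

    enum CodingKeys: String, CodingKey {
        case bit0
        case bit1
        case lastUpdateTime = "last_update_time"
    }
}

/// Decodes a JSON response body into the two bits and their last update time.
func decodeBits(from json: String) throws -> DeviceCheckBits {
    return try JSONDecoder().decode(DeviceCheckBits.self, from: Data(json.utf8))
}

let responseBody = """
{ "bit0": true, "bit1": false, "last_update_time": "2019-10" }
"""

if let bits = try? decodeBits(from: responseBody) {
    print(bits.bit0, bits.bit1, bits.lastUpdateTime)
}
```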
Fingerprinting in Today’s iOS
Despite these affordances by Apple, advertisers have continued to work to circumvent user privacy protections and use any and all information at their disposal to identify users by other means.
Over the years, Apple has restricted access to information about device hardware, installed apps, and nearby Wi-Fi networks. They’ve required apps to request permission to get your current location, access your camera and microphone, flip through your contacts, and find and connect to Bluetooth accessories. They’ve taken bold steps to prevent user tracking in Safari.
For lack of this information, companies have had to get creative, looking to forge unique identities from the scraps of what’s still available. This process of identification by a combination of external factors is known as fingerprinting.
The unfortunate reality is that we can be uniquely identified by vanishingly small amounts of information. For example, individuals within a population can be singled out by as few as four timestamped coordinates (de Montjoye et al., 2013) or little more than a birthday and a ZIP code (Sweeney, 2000).
Every WWDC since 2012 has featured a session about Privacy, but the only mention of fingerprinting specifically was a brief discussion in 2014 about how to avoid doing it.
By our count, a determined party could use conventional, unrestricted APIs to generate a few dozen bits of randomness:
Locale Information (~36 bits)
Locale information is the greatest source of identifying information on Apple platforms. The combination of your preferred languages, region, calendar, time zone, and installed keyboards says a lot about who you are, especially if you have less conventional preferences.
import Foundation
Locale.current.languageCode
log2(Double(Locale.isoLanguageCodes.count)) // 9.217 bits
Locale.current.regionCode
log2(Double(Locale.isoRegionCodes.count)) // 8 bits
Locale.current.calendar.identifier
// ~2^4 (16) Calendars
TimeZone.current.identifier
log2(Double(TimeZone.knownTimeZoneIdentifiers.count)) // 8.775 bits
UserDefaults.standard.object(forKey: "AppleKeyboards")
// ~2^6 (64) iOS keyboards
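To see where the ~36-bit figure comes from, the per-signal estimates can be added together. The sketch below uses the bit counts quoted in the comments above rather than querying the system, so the numbers are the article’s estimates and not measurements. The names `entropyBits` and `localeSignals` are hypothetical.

```swift
import Foundation

/// Upper bound on identifying bits for a signal with `optionCount`
/// possible values, assuming every value is equally likely.
func entropyBits(forOptionCount optionCount: Int) -> Double {
    precondition(optionCount > 0, "a signal needs at least one possible value")
    return log2(Double(optionCount))
}

// Estimates taken from the comments in the snippet above.
let localeSignals: [(name: String, bits: Double)] = [
    ("language", 9.217),
    ("region", 8),
    ("calendar", entropyBits(forOptionCount: 16)),
    ("time zone", 8.775),
    ("keyboards", entropyBits(forOptionCount: 64)),
]

// Adding bits assumes the signals are independent of one another.
let totalLocaleBits = localeSignals.reduce(0) { $0 + $1.bits }
print(totalLocaleBits)
```

Both assumptions overstate the result: values are not equally likely, and signals such as language and region are strongly correlated, so the sum is best read as a ceiling rather than an expected value.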
Accessibility (~10 bits)
Accessibility preferences also provide a great deal of information, with each individual setting contributing a single potential bit:
import UIKit
UIAccessibility.isBoldTextEnabled
UIAccessibility.isShakeToUndoEnabled
UIAccessibility.isReduceMotionEnabled
UIAccessibility.isDarkerSystemColorsEnabled
UIAccessibility.isReduceTransparencyEnabled
UIAccessibility.isAssistiveTouchRunning
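The idea that each setting contributes one bit can be made concrete by packing a list of Boolean values into an integer, one bit per setting. This is an illustrative sketch that takes plain Booleans so it runs without UIKit. The function name `packFlags` is hypothetical.

```swift
/// Packs Booleans into an unsigned integer, with the first flag in the
/// least significant bit. Six settings yield one of 2^6 = 64 values.
func packFlags(_ flags: [Bool]) -> UInt {
    precondition(flags.count <= UInt.bitWidth, "too many flags for one word")
    var mask: UInt = 0
    for (index, isOn) in flags.enumerated() where isOn {
        mask |= 1 << UInt(index)
    }
    return mask
}

// Example with made-up settings: bold text on, reduce motion on.
let exampleFlags = [true, false, true, false, false, false]
print(packFlags(exampleFlags))
```

In practice most people leave these settings at their defaults, so the identifying power is concentrated in the minority who change them.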
For the roughly 25% of users who take advantage of Dynamic Type by configuring a preferred font size, that selection may also be used to fingerprint them:
let application = UIApplication.shared
application.preferredContentSizeCategory
Hardware Information (~5 or ~6 bits)
Although most of the juiciest bits have been locked down in OS updates over the years, there’s just enough to contribute a few more bits for purposes of identification.
On iOS, you can get the current model and amount of storage of a user’s device:
import UIKit
let device = UIDevice.current
device.name // "iPhone 11 Pro"
let fileManager = FileManager.default
if let path = fileManager.urls(for: .libraryDirectory, in: .systemDomainMask).last?.path,
   let systemSize = try? fileManager.attributesOfFileSystem(forPath: path)[.systemSize] as? Int
{
    Measurement<UnitInformationStorage>(value: Double(systemSize), unit: .bytes)
        .converted(to: .gigabytes) // ~256GB
}
With 14 supported iOS devices, most having 3 configurations each, let’s say that this contributes about 32 possibilities, or 5 bits.
On macOS, you can go a few steps further and differentiate hardware by its processor count and amount of RAM:
import Foundation
let processInfo = ProcessInfo.processInfo
processInfo.processorCount // 8
Measurement<UnitInformationStorage>(value: Double(processInfo.physicalMemory),
                                    unit: .bytes)
    .converted(to: .gigabytes) // 16GB
It’s hard to get a sense of how many different Mac models are in use, but a reasonable estimate would be on the order of 2^6 or 2^7.
Cellular Network (~2 bits)
Knowing whether someone’s phone is on Verizon or Vodafone can also be factored into a fingerprint. You can use the CTTelephonyNetworkInfo class from the CoreTelephony framework to look up the providers for devices with cellular service:
import CoreTelephony
let networkInfo = CTTelephonyNetworkInfo()
let carriers = networkInfo.serviceSubscriberCellularProviders?.values
carriers?.map { ($0.mobileNetworkCode, $0.mobileCountryCode) }
The number of providers varies per country, but using the 4 major carriers in the United States as a guideline, we can say carrier information would contribute about 2 bits (or more if you have multiple SIM cards installed).
Communication Preferences (2 bits)
More generally, even knowing whether someone can send texts or email at all can be factored into a fingerprint. This information can be gathered without permissions via the MessageUI framework.
import MessageUI
MFMailComposeViewController.canSendMail()
MFMessageComposeViewController.canSendText()
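None of the signals in the sections above identifies anyone on its own; the fingerprint comes from combining them into a single stable value. The sketch below illustrates that step with a 64-bit FNV-1a hash over a canonical string. The choice of FNV-1a, the function names, and the sample values are all assumptions made for illustration, not a description of how any particular ad network works.

```swift
/// 64-bit FNV-1a hash of a string's UTF-8 bytes.
func fnv1a64(_ text: String) -> UInt64 {
    var hash: UInt64 = 0xcbf29ce484222325  // FNV offset basis
    for byte in text.utf8 {
        hash ^= UInt64(byte)
        hash = hash &* 0x100000001b3       // FNV prime, wrapping multiply
    }
    return hash
}

/// Hypothetical combiner: sorts signals by name so the same inputs
/// always produce the same value, then hashes the joined string.
func fingerprint(from signals: [String: String]) -> UInt64 {
    let canonical = signals
        .sorted { $0.key < $1.key }
        .map { "\($0.key)=\($0.value)" }
        .joined(separator: ";")
    return fnv1a64(canonical)
}

// Made-up sample values standing in for the signals discussed above.
let sampleSignals = [
    "language": "en",
    "region": "US",
    "timeZone": "America/Los_Angeles",
    "canSendMail": "true",
]
print(String(fingerprint(from: sampleSignals), radix: 16))
```

Sorting the keys matters because dictionary iteration order in Swift is not stable between runs, and an unstable input order would defeat the purpose of a repeatable identifier.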
Additional Sources of Identifying Information
If the use of digital fingerprinting seems outlandish, consider that it only scratches the surface of what companies and researchers have devised to circumvent your privacy.
GeoIP and Relative Network Speeds
Although access to geolocation through conventional APIs requires explicit authorization, third parties may be able to get a general sense of where you are in the world based on how you access the Internet.
Geolocation by source IP address is used extensively for things like region locking and localization. You could also combine this information with ping-time measurements to hosts in known locations to get a more accurate pinpoint on location (Weinberg et al., 2018):
ping -c 5 99.24.18.13 # San Francisco, USA
--- 99.24.18.13 ping statistics ---
5 packets transmitted, 5 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 11.900/12.184/12.895/0.380 ms
ping -c 5 203.47.10.37 # Adelaide, Australia
--- 203.47.10.37 ping statistics ---
5 packets transmitted, 5 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 202.122/202.433/203.436/0.504 ms
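Round-trip times like these constrain location because a signal cannot travel faster than light. The sketch below turns a round-trip time into an upper bound on distance, assuming signals move through fiber at roughly two-thirds the speed of light (about 200 km per millisecond). That constant and the function name are assumptions, and this is only the physical bound, not the method from the cited paper.

```swift
// Assumption: propagation speed in fiber, roughly 2/3 the speed of light.
let fiberSpeedKmPerMs = 200.0

/// Upper bound on the distance to a host given a round-trip time.
/// The one-way trip takes at most half the round-trip time.
func maxDistanceKm(roundTripMs: Double) -> Double {
    return (roundTripMs / 2) * fiberSpeedKmPerMs
}

// Minimum round-trip times from the two ping runs above.
let sanFranciscoBound = maxDistanceKm(roundTripMs: 11.900)
let adelaideBound = maxDistanceKm(roundTripMs: 202.122)
print(sanFranciscoBound, adelaideBound)
```

The first measurement places the device within roughly 1,200 km of the San Francisco host. The second allows for more than 20,000 km, which covers the whole planet, so only the low round-trip times carry much information.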
Battery Health
It’s unclear whether this is a concern in iOS, but depending on how precise the results of UIDevice battery APIs are, you may be able to use them to identify a device by its battery health (Olejnik et al., 2016).
import UIKit
var timestampedBatteryLevels: [(Date, Float)] = []
// batteryLevel reports -1.0 unless battery monitoring has been enabled
if UIDevice.current.isBatteryMonitoringEnabled {
    timestampedBatteryLevels.append((Date(), UIDevice.current.batteryLevel))
}
And so on…
Everything from your heartbeat, to your gait, to your butt shape seems capable of leaking your identity. It can all be quite overwhelming.
I mean, if a motivated individual can find your home address by cross-referencing the reflection in your eyes against Google Street View, how can we even stand a chance out there?
Much as we may bemoan the current duopoly of mobile operating systems, we might take some solace in the fact that at least one of the players actually cares about user privacy. Though it’s unclear whether that’s a fight that can ever be won.
At times, our fate of being tracked and advertised to may seem as inevitable as the victims in It Follows.
But let’s not forget that, as technologists, as people with a voice, we’re in a position to fight back.