Device Identifiers and Fingerprinting on iOS
For every era, there’s a monster that embodies the anxieties of the age.
At the dawn of the Holocene, our ancestors traced the contours of shadows cast by the campfire as they kept watch over the darkness. Once we learned to read the night sky for navigation, sailors swapped stories of sea creatures like Leviathan and Siren to describe the dangers of open ocean (and the perils to be found on unfamiliar shores).
Frankenstein’s monster was as much the creation of Mary Shelley as it was a spiritual collaboration with Luigi Galvani. And Bram Stoker’s fictionalized account of the mummy’s curse was more a response to the Egyptomania and European colonialism of the nineteenth century than any personal account of the Middle Kingdom.
More recently, the “monster ruins a beach party” trope of the 1960s arose from concerns about teenage morality, while the Martians who invaded those same drive-in double features served as a proxy for Cold War fears at the height of the Space Race.
All of which begs the question: “What monster best exemplifies our present age?”
Consider the unnamed monster from the film It Follows: a formless, supernatural being that relentlessly pursues its victims anywhere on the planet.
Sounds a bit like the state of ad tech in 2019, no?
This week on NSHipster — in celebration of our favorite holiday 🎃 — we’re taking a look at the myriad ways that you’re being tracked on iOS, both sanctioned and unsanctioned, historically and presently. So gather around the campfire, and allow us to trace the contours of the unseen, formless monsters that stalk us under cover of Dark Mode.
The Cynicism of Marketing and Advertising Technology
Contrary to our intuitions about natural selection in the marketplace, history is littered with examples of inferior-but-better-marketed products winning out over superior alternatives: VHS vs. Betamax, Windows vs. Macintosh, etc. (According to the common wisdom of business folks, at least.) Regardless, most companies reach a point where “if you build it, they will come” ceases to be a politically viable strategy, and someone authorizes a marketing budget.
Marketers are tasked with growing market share by identifying and communicating with as many potential customers as possible. And many (whether out of genuine belief or as a post hoc rationalization) take the potential benefit of their product as a license to flout long-established customs of personal privacy. So they enlist the help of one or more advertising firms, who promise to maximize their allocated budget and provide some accountability for their spending by using technology to target, deliver, and analyze messaging to consumers.
Each of these tasks is predicated on a consistent identity, which is why advertisers go to such great lengths to track you.
- Without knowing who you are, marketers have no way to tell if you’re a likely or even potential customer.
- Without knowing where you are, marketers have no way to reach you other than to post ads where they’re likely to be seen.
- Without knowing what you do, marketers have no way to tell if their ads worked or were seen at all.
Apple-Sanctioned Identifiers
Apple’s provided various APIs to facilitate user identification for various purposes:
Universal Identifiers (UDID)
In the early days of iOS, Apple provided a uniqueIdentifier property on UIDevice, affectionately referred to as a UDID (not to be confused with a UUID). Although such functionality seems unthinkable today, that property existed through iOS 5, after which it was deprecated and replaced by identifierForVendor in iOS 6.
Vendor Identifiers (IDFV)
Starting in iOS 6, developers can use the identifierForVendor property on UIDevice to generate a unique identifier that’s shared across apps and extensions created by the same vendor (IDFV).
import UIKit
let idfv = UIDevice.current.identifierForVendor // BD43813E-CFC5-4EEB-ABE2-94562A6E76CA
Advertising Identifiers (IDFA)
Along with identifierForVendor came the introduction of a new AdSupport framework, which Apple created to help distinguish identification necessary for app functionality from anything in the service of advertising. The resulting advertisingIdentifier property (affectionately referred to as IDFA by its associates) differs from identifierForVendor by returning the same value to every app on the device, regardless of vendor. The value can change, for example, if the user resets their Advertising Identifier or erases their device.
import AdSupport
let idfa = ASIdentifierManager.shared().advertisingIdentifier
If advertising tracking is limited, the property returns a zeroed-out UUID instead.
idfa.uuidString == "00000000-0000-0000-0000-000000000000" // true if the user has limited ad tracking
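As a minimal sketch of how an app might treat that zeroed-out value as "no identifier available", the helper below maps the all-zero UUID to nil. The function name usableAdvertisingIdentifier is hypothetical and not part of AdSupport. The sample values stand in for whatever advertisingIdentifier would return, so the snippet runs with Foundation alone.

```swift
import Foundation

/// Hypothetical helper (not part of AdSupport): returns nil when the
/// identifier is the all-zero placeholder, otherwise returns it unchanged.
func usableAdvertisingIdentifier(_ identifier: UUID) -> UUID? {
    let zeroed = UUID(uuidString: "00000000-0000-0000-0000-000000000000")!
    return identifier == zeroed ? nil : identifier
}

// Stand-ins for values that advertisingIdentifier might return.
let limited = UUID(uuidString: "00000000-0000-0000-0000-000000000000")!
let sample = UUID(uuidString: "BD43813E-CFC5-4EEB-ABE2-94562A6E76CA")!

print(usableAdvertisingIdentifier(limited) as Any)
print(usableAdvertisingIdentifier(sample) as Any)
```

Comparing UUID values directly avoids depending on the letter case of uuidString.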
DeviceCheck
identifierForVendor and advertisingIdentifier provide all the same functionality as the uniqueIdentifier property they replaced in iOS 6, save for one: the ability to persist across device resets and app uninstalls.
In iOS 11, Apple quietly introduced the DeviceCheck framework, which allows developers to assign two bits of information that are persisted by Apple until the developer manually removes them.
Interacting with the DeviceCheck framework should feel familiar to anyone who has worked with APNs: after setting things up on App Store Connect and your servers, the client generates tokens on the device, which are sent to your servers to set or query two bits of information:
import DeviceCheck
let device = DCDevice.current
if device.isSupported {
    device.generateToken { data, error in
        if let token = data?.base64EncodedString() {
            // send token to your server
        }
    }
}
Based on the device token and other information sent by the client, the server tells Apple to set each bit value by sending a JSON payload like this:
{
    "device_token": "QTk4QkFDNEItNTBDMy00Qjc5LThBRUEtMDQ5RTQzRjNGQzU0Cg==",
    "transaction_id": "D98BA630-E225-4A2F-AFEC-BE3A3D591708",
    "timestamp": 1572531720,
    "bit0": true,
    "bit1": false
}
To retrieve those two bits at a later point in time, the server sends a payload without bit0 and bit1 fields:
{
    "device_token": "QTk4QkFDNEItNTBDMy00Qjc5LThBRUEtMDQ5RTQzRjNGQzU0Cg==",
    "transaction_id": "D98BA630-E225-4A2F-AFEC-BE3A3D591708",
    "timestamp": 1572532500
}
If everything worked, Apple’s servers would respond with a 200 status code and the following JSON payload:
{
    "bit0": true,
    "bit1": false,
    "last_update_time": "2019-10"
}
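If your server happened to be written in Swift, decoding a response shaped like the one above could look something like the following sketch. The DeviceCheckBits type and its property names are hypothetical. Only the three JSON keys come from the payload shown here, and the response body is hard-coded so the example runs on its own.

```swift
import Foundation

/// Hypothetical model mirroring the fields of the response shown above.
struct DeviceCheckBits: Codable, Equatable {
    let bit0: Bool
    let bit1: Bool
    let lastUpdateTime: String

    enum CodingKeys: String, CodingKey {
        case bit0
        case bit1
        case lastUpdateTime = "last_update_time"
    }
}

// Hard-coded stand-in for the body of a successful query response.
let responseBody = """
{
    "bit0": true,
    "bit1": false,
    "last_update_time": "2019-10"
}
""".data(using: .utf8)!

// `try?` yields nil rather than crashing if the body isn't the JSON we expect.
let bits = try? JSONDecoder().decode(DeviceCheckBits.self, from: responseBody)
print(bits as Any)
```

Keeping last_update_time as a plain string sidesteps any assumption about its date format beyond what the example payload shows.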
Fingerprinting in Today’s iOS
Despite these affordances by Apple, advertisers have continued to work to circumvent user privacy protections and use any and all information at their disposal to identify users by other means.
Over the years, Apple’s restricted access to information about device hardware, installed apps, and nearby WiFi networks. They’ve required apps to request permission to get your current location, access your camera and microphone, flip through your contacts, and find and connect to Bluetooth accessories. They’ve taken bold steps to prevent user tracking in Safari.
For lack of this information, companies have had to get creative, looking to forge unique identities from the scraps of what’s still available. This process of identification by a combination of external factors is known as fingerprinting.
The unfortunate reality is that we can be uniquely identified by vanishingly small amounts of information. For example, individuals within a population can be singled out by as few as four timestamped coordinates (de Montjoye et al., 2013) or little more than a birthday and a ZIP code (Sweeney, 2000).
Every WWDC since 2012 has featured a session about Privacy, but the only mention of fingerprinting specifically was a brief discussion in 2014 about how to avoid doing it.
By our count, a determined party could use conventional, unrestricted APIs to generate a few dozen bits of randomness:
Locale Information (~36 bits)
Locale information is the greatest source of identifying information on Apple platforms. The combination of your preferred languages, region, calendar, time zone, and which keyboards you have installed says a lot about who you are, especially if you have less conventional preferences.
import Foundation
Locale.current.languageCode
log2(Double(Locale.isoLanguageCodes.count)) // 9.217 bits
Locale.current.regionCode
log2(Double(Locale.isoRegionCodes.count)) // 8 bits
Locale.current.calendar.identifier
// ~2^4 (16) Calendars
TimeZone.current.identifier
log2(Double(TimeZone.knownTimeZoneIdentifiers.count)) // 8.775 bits
UserDefaults.standard.object(forKey: "AppleKeyboards")
// ~2^6 (64) iOS keyboards
Accessibility (~10 bits)
Accessibility preferences also provide a great deal of information, with each individual setting contributing a single potential bit:
UIAccessibility.isBoldTextEnabled
UIAccessibility.isShakeToUndoEnabled
UIAccessibility.isReduceMotionEnabled
UIAccessibility.isDarkerSystemColorsEnabled
UIAccessibility.isReduceTransparencyEnabled
UIAccessibility.isAssistiveTouchRunning
Of the roughly 25% of users who take advantage of Dynamic Type by configuring a preferred font size, that selection may also be used to fingerprint you:
let application = UIApplication.shared
application.preferredContentSizeCategory
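To illustrate how the Boolean accessibility settings in this section could each contribute one bit, here's a rough sketch that packs a list of flags into a single integer. The packFlags function and the sample values are hypothetical. In a real app, the inputs would be the UIAccessibility properties listed earlier.

```swift
/// Hypothetical helper: packs up to eight Boolean settings into a bitmask,
/// with the first flag stored in the least significant bit.
func packFlags(_ flags: [Bool]) -> UInt8 {
    precondition(flags.count <= 8, "a UInt8 holds at most eight flags")
    var mask: UInt8 = 0
    for (index, isEnabled) in flags.enumerated() where isEnabled {
        mask |= UInt8(1) << UInt8(index)
    }
    return mask
}

// Made-up values standing in for, in order: bold text, shake to undo,
// reduce motion, darker system colors, reduce transparency, AssistiveTouch.
let sampleFlags = [false, true, true, false, false, false]
let accessibilityMask = packFlags(sampleFlags)
print(String(accessibilityMask, radix: 2)) // prints "110"
```

Six flags yield at most 64 distinct masks. In practice the distribution is far from uniform, so the real information content is lower than six bits.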
Hardware Information (~5 or ~6 bits)
Although most of the juiciest bits have been locked down in OS updates over the years, there’s just enough to contribute a few more bits for purposes of identification.
On iOS, you can get the current model and amount of storage of a user’s device:
import UIKit
let device = UIDevice.current
device.name // "iPhone 11 Pro"
let fileManager = FileManager.default
if let path = fileManager.urls(for: .libraryDirectory, in: .systemDomainMask).last?.path,
   let systemSize = try? fileManager.attributesOfFileSystem(forPath: path)[.systemSize] as? Int
{
    Measurement<UnitInformationStorage>(value: Double(systemSize), unit: .bytes)
        .converted(to: .gigabytes) // ~256GB
}
With 14 supported iOS devices, most having 3 configurations each, let’s say that this contributes about 32 possibilities, or 5 bits.
You can go a few steps further on macOS, to further differentiate hardware by its processor count and amount of RAM:
import Foundation
let processInfo = ProcessInfo.processInfo
processInfo.processorCount // 8
Measurement<UnitInformationStorage>(value: Double(processInfo.physicalMemory), unit: .bytes)
    .converted(to: .gigabytes) // 16GB
It’s hard to get a sense of how many different Mac models are in use, but a reasonable estimate would be on the order of 2^6 or 2^7 (64 to 128).
Cellular Network (~2 bits)
Knowing whether someone’s phone is on Verizon or Vodafone can also be factored into a fingerprint. You can use the CTTelephonyNetworkInfo class from the CoreTelephony framework to look up the providers for devices with cellular service:
import CoreTelephony
let networkInfo = CTTelephonyNetworkInfo()
let carriers = networkInfo.serviceSubscriberCellularProviders?.values
carriers?.map { ($0.mobileNetworkCode, $0.mobileCountryCode) }
The number of providers varies per country, but using the 4 major carriers in the United States as a guideline, we can say carrier information would contribute about 2 bits (or more if you have multiple SIM cards installed).
Communication Preferences (2 bits)
More generally, even knowing whether someone can send texts or email at all can be factored into a fingerprint. This information can be gathered without permissions via the MessageUI framework.
import MessageUI
MFMailComposeViewController.canSendMail()
MFMessageComposeViewController.canSendText()
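To put those per-source estimates in perspective, the sketch below tallies the figures from the section headings (using the iOS hardware estimate of 5 bits) and compares the total against the number of bits needed to single out one person. It rests on two assumptions that aren't from the text above: a round-number population of 8 billion, and that the signals are independent and uniformly distributed. The second is optimistic, so the total is best read as an upper bound.

```swift
import Foundation

// Rough per-source estimates taken from the section headings above.
let estimatedBits: [(source: String, bits: Double)] = [
    (source: "Locale", bits: 36),
    (source: "Accessibility", bits: 10),
    (source: "Hardware", bits: 5),
    (source: "Cellular network", bits: 2),
    (source: "Communication preferences", bits: 2),
]

// Adding bits assumes the signals are independent and uniformly distributed,
// so treat the sum as an optimistic upper bound rather than a measurement.
let totalBits = estimatedBits.reduce(0.0) { $0 + $1.bits }

// Bits needed to distinguish one person among `population` people.
// The 8 billion figure is a round-number assumption.
let population = 8_000_000_000.0
let bitsNeeded = log2(population)

print("Estimated: \(totalBits) bits; needed: \(bitsNeeded) bits")
```

Even if correlation between signals cut the estimate substantially, it would still be in the neighborhood of the roughly 33 bits needed under that population assumption.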
Additional Sources of Identifying Information
If the use of digital fingerprinting seems outlandish, that’s just scratching the surface of how companies and researchers have figured out how to circumvent your privacy.
GeoIP and Relative Network Speeds
Although access to geolocation through conventional APIs requires explicit authorization, third parties may be able to get a general sense of where you are in the world based on how you access the Internet.
Geolocation by source IP address is used extensively for things like region locking and localization. You could also combine this information with ping-time measurements to hosts in known locations to get a more accurate pinpoint on location (Weinberg et al., 2018):
ping -c 5 99.24.18.13 # San Francisco, USA
--- 99.24.18.13 ping statistics ---
5 packets transmitted, 5 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 11.900/12.184/12.895/0.380 ms
ping -c 5 203.47.10.37 # Adelaide, Australia
--- 203.47.10.37 ping statistics ---
5 packets transmitted, 5 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 202.122/202.433/203.436/0.504 ms
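As a back-of-the-envelope illustration (not the method used by Weinberg et al.), a round-trip time puts a ceiling on how far away a host can be, because a signal can't travel faster than light. The sketch below assumes a propagation speed of about 200 km per millisecond, roughly two-thirds the speed of light, as in optical fiber. The function name and default value are hypothetical.

```swift
/// Hypothetical helper: upper bound on the one-way distance implied by a
/// round-trip time, assuming signals travel at about 200 km per millisecond.
func maximumDistanceInKilometers(
    roundTripMilliseconds: Double,
    kilometersPerMillisecond: Double = 200
) -> Double {
    // Half the round trip is the one-way travel time.
    return (roundTripMilliseconds / 2) * kilometersPerMillisecond
}

// Average round-trip times from the ping output above.
let sanFranciscoBound = maximumDistanceInKilometers(roundTripMilliseconds: 12.184)
let adelaideBound = maximumDistanceInKilometers(roundTripMilliseconds: 202.433)

print(sanFranciscoBound) // about 1,218 km
print(adelaideBound)     // about 20,243 km
```

Routing detours and queuing delays mean the real distance is usually well under this bound, which is why combining measurements to several hosts in known locations narrows things down further.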
Battery Health
It’s unclear whether this is a concern in iOS, but depending on how precise the results of UIDevice battery APIs are, you may be able to use them to identify a device by its battery health (Olejnik et al., 2016).
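import UIKit
var timestampedBatteryLevels: [(Date, Float)] = []
// Battery monitoring is off by default; batteryLevel reads -1.0 until it's enabled.
UIDevice.current.isBatteryMonitoringEnabled = true
if UIDevice.current.isBatteryMonitoringEnabled {
    timestampedBatteryLevels.append((Date(), UIDevice.current.batteryLevel))
}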
And so on…
Everything from your heartbeat, to your gait, to your butt shape seems capable of leaking your identity. It can all be quite overwhelming.
I mean, if a motivated individual can find your home address by cross-referencing the reflection in your eyes against Google Street View, how can we even stand a chance out there?
Much as we may bemoan the current duopoly of mobile operating systems, we might take some solace in the fact that at least one of the players actually cares about user privacy. Though it’s unclear whether that’s a fight that can ever be won.
At times, our fate of being tracked and advertised to may seem as inevitable as that of the victims in It Follows.
But let’s not forget that, as technologists, as people with a voice, we’re in a position to fight back.