Hashing Doesn’t Make Data Anonymous!

By: Linda L. Goodman

The Federal Trade Commission (“FTC”) routinely evaluates the privacy representations a company makes against its data handling practices. When discrepancies arise between claim and reality, incorrect assertions about whether data is identifiable are often to blame. Data is anonymous only when it can never be associated back to a person. If data can be used to uniquely identify or target a user, it is Personally Identifiable Information (“PII”) and must be handled accordingly.

Hashing only obscures personal data. It involves taking a piece of data (like an email address, a phone number, or a user ID) and turning it into a number (a hash) in a consistent way: the same input data will always create the same hash. For example, hashing the fictional email spam@gmail.com transforms it into the hash “2813448ce6316cb70b38fa29c8c64130”, a hexadecimal number that might appear random but is always what someone gets when they hash that email.
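The consistency described above can be illustrated with a short Python sketch. The article does not say which hash function a company might use, so MD5 is assumed here only because it produces a 32-character hexadecimal value like the one quoted. The function name hash_identifier and the second email address are hypothetical, and the sketch does not attempt to reproduce the specific hash value quoted above.

```python
import hashlib


def hash_identifier(value: str) -> str:
    """Return a hexadecimal hash of an identifier.

    MD5 is an assumed, illustrative choice; it is not named in the article.
    """
    normalized = value.strip().lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


first = hash_identifier("spam@gmail.com")
second = hash_identifier("spam@gmail.com")
other = hash_identifier("someone.else@example.com")  # hypothetical address

print(first == second)  # True: the same input always gives the same hash
print(first == other)   # False: a different input gives a different hash
print(len(first))       # 32 hexadecimal characters
```

Because the output never changes for a given input, the hash works as a stable stand-in for the email address wherever it is stored or shared.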

Hashing is beneficial because a hash by itself cannot easily be used to identify the original data. For this reason, companies often use hashing in cases where they are uncomfortable writing down or sharing the directly identifying data, but they still want to be able to store the data for matching against later. Since the hash “2813448ce6316cb70b38fa29c8c64130” appears meaningless and seemingly can’t be used to find the original data, companies often claim that hashing allows them to preserve user privacy.

But hashes aren’t “anonymous” and can still be used to identify users, and their misuse can lead to harm.  The FTC has long held that companies should not act or claim as if hashing personal information renders it anonymized.  FTC staff are vigilant to ensure companies are following the law and take action when the privacy claims they make are deceptive.

More importantly, the FTC (and the CCPA Agency) does not accept the misconception that hashing makes data anonymous. In 2015, the FTC brought a case against Nomi, alleging that the company had surveilled consumers within stores using their devices’ MAC addresses (a MAC address is a number that identifies a device when it connects to a network). The complaint explained, “Nomi cryptographically hashes the MAC addresses it observes prior to storing them on its servers. Hashing obfuscates the MAC address, but the result is still a persistent unique identifier.”
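Why a hashed MAC address remains a persistent unique identifier can be sketched in a few lines of Python. This is an illustration only: the MAC addresses, store names, days, and the choice of SHA-256 are assumptions made for the example and are not drawn from the Nomi complaint.

```python
import hashlib
from collections import defaultdict


def hash_mac(mac: str) -> str:
    """Hash a MAC address (SHA-256 is an assumed, illustrative choice)."""
    return hashlib.sha256(mac.lower().encode("utf-8")).hexdigest()


# Hypothetical sightings: (MAC address seen by a sensor, location, day).
sightings = [
    ("02:00:00:00:00:01", "Store A", "Monday"),
    ("02:00:00:00:00:02", "Store A", "Monday"),
    ("02:00:00:00:00:01", "Store B", "Wednesday"),
    ("02:00:00:00:00:01", "Store A", "Friday"),
]

# Only the hash is stored, yet every sighting of the same device
# lands under the same key, so its visits can be linked over time.
visits_by_hash = defaultdict(list)
for mac, store, day in sightings:
    visits_by_hash[hash_mac(mac)].append((store, day))

for hashed, visits in visits_by_hash.items():
    print(hashed[:12], visits)
```

The raw MAC address never appears in the stored records, but the first device's three visits are still grouped together under a single value.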

Nomi wasn’t the only company the Commission alleged had incorrectly relied on hashing to make data less sensitive. In 2022, the FTC brought a case against the online counseling service BetterHelp, alleging the company had shared consumers’ sensitive health data, including hashed email addresses, with Facebook. The complaint laid out that BetterHelp knew that Facebook would “undo the hashing and reveal the email addresses of those Visitors and Users.” Though BetterHelp sent hashes to Facebook rather than email addresses, the outcome was the same: Facebook allegedly learned who was seeking counseling for mental health and used that sensitive information to target ads to them.
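One general way a recipient can “undo” hashing is to hash the email addresses it already knows and compare the results. The Python sketch below shows that idea under stated assumptions: the addresses are hypothetical, SHA-256 is an arbitrary choice, and nothing here describes how any particular company actually processes hashed data.

```python
import hashlib


def hash_email(email: str) -> str:
    """Hash a normalized email address (SHA-256 assumed for illustration)."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


# Hypothetical: addresses the recipient already holds for its own users.
known_emails = ["alice@example.com", "bob@example.com", "carol@example.com"]

# Precompute a lookup table from hash back to email address.
lookup = {hash_email(email): email for email in known_emails}

# Hypothetical: hashes received from a sender who never shared raw emails.
received_hashes = [
    hash_email("bob@example.com"),
    hash_email("stranger@example.net"),
]

# Any received hash that matches a known address is re-identified.
recovered = [lookup.get(h) for h in received_hashes]
print(recovered)  # ['bob@example.com', None]
```

The matching requires no weakness in the hash function itself. It works because the recipient already holds a list of candidate inputs, which is exactly the situation of a large platform with many registered users.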

The privacy harms in both cases originate from the fact that the companies could identify users, not the way that they did so. Hashing is just one tool used in persistent user identification, and the FTC has recently called out other mechanisms of user tracking that rely on pseudonymous identifiers.

In 2023, the FTC brought a complaint against Premom, alleging the company had collected and shared users’ unique advertising and device identifiers with third parties, contrary to Premom’s representation that it would share only “non-identifiable data” with third parties. In the complaint, the FTC laid out how Premom’s collection and sharing of these identifiers enabled “third parties to circumvent operating system privacy controls, track individuals, infer the identity of an individual user, and ultimately associate the use of a fertility app to that user.” In this case, persistent user tracking was done using a unique advertising ID, which didn’t provide the user any anonymity.

Similarly, in January of 2024, the FTC announced a complaint against InMarket, alleging that the company had unlawfully collected data associated with a unique mobile device identifier. The Commission alleged that this unique identifier was used to track individuals over time and across apps without their informed consent.

The Federal Trade Commission pays close attention to the identifiers used to recognize users online: email addresses, phone numbers, MAC addresses, hashed email addresses, device identifiers, and advertising identifiers, to name a few. Regardless of what they look like, all user identifiers can be used to identify and track people over time, and therefore must be disclosed in your privacy policy!

 

____________________________________________________________________________________________________________

This article was originally posted on Cliclaw.com as part of my ongoing efforts to share valuable legal insights. I regularly contribute guest blogs to leading websites in the field of internet compliance. In these posts, I cover a range of topics to help businesses stay compliant in the ever-evolving digital world. You can read my latest guest contributions on Cliclaw.com.

This article is a publication of The Goodman Law Firm and is intended to provide information on recent legal developments. This article does not create an attorney-client relationship, nor should it be construed as legal advice or an opinion on specific situations. This may constitute “Attorney Advertising” under the Rules of Professional Conduct and under the law of other jurisdictions.

Linda L. Goodman is an attorney specializing in internet compliance and privacy law. With years of experience helping businesses navigate complex legal landscapes, Linda contributes expert insights on compliance issues in the digital space. To learn more about her services and insights, visit her law firm website at The Goodman Law Firm.

© 2024 TGLF, A.P.C.