De-Identification Guidelines

Purpose

The purpose of this guideline is to outline the UO standards for de-identifying data. To protect the privacy and security of data subjects and to maintain confidentiality of other sensitive information, data (human subject or other types) may need to be de-identified before use in several academic, research, business, or operational functions. For example, sensitive data may need to be stripped of identifying information before use for research purposes, institutional effectiveness studies, operational efficiency, public safety or information security, or prior to release to other entities.

Acceptable De-Identification Methods

Two de-identification methods are acceptable: expert determination and safe harbor. Both are based on the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule, as detailed in the US Department of Health and Human Services resources listed in the Resources section below.

Expert Determination De-Identification Method

The expert determination method of de-identification is acceptable if an expert determines that the risk of re-identification is “very small” when the anticipated recipients use the data alone or in combination with other reasonably available information. The expert should document the methods of the analysis. Experts may be found in the statistical, mathematical, or other scientific domains. When expert determination is used and the data is subject to the provisions of HIPAA, the US Department of Health and Human Services Office for Civil Rights may review the professional experience and credentials of the expert with de-identification methodologies.

Safe Harbor De-Identification Method

To achieve the safe harbor method of de-identification, the unique identifiers listed below should be removed for the individual and for relatives, employers, or household members of the individual.

Unique Identifiers

Names
All geographic subdivisions smaller than a State, including
- Street address
- City
- County
- Precinct
- ZIP code
All elements of dates (except year) for dates directly related to the individual, including
- Birth date
- Admission date
- Discharge date
- Date of death
- Elements of dates for individuals over 89 years old[1]
Telephone numbers
Fax numbers
Social security numbers
Medical record numbers
Health plan beneficiary numbers
Account numbers
Certificate/license numbers
Email addresses
Social media profile names (or handles)
Web Universal Resource Locators (URLs)
Internet Protocol (IP) address numbers
Device identifiers and serial numbers
Vehicle identifiers and serial numbers, including license plate numbers
Biometric identifiers, including finger and voice prints
Full-face photographs and any comparable images
Any other unique identifying number, characteristic, or code

In addition to the removal of unique identifiers, there should be reasonable assurance that the individual or entity intending to use the data does not have actual knowledge that the remaining information could be used, alone or in combination with any reasonably available information, to identify an individual who is a subject of the data. Other details that may result in the identification of an individual include initials, circumstances associated with the care of an individual, highly publicized details, and profession or occupation.

[1] All ages over 89 and all elements of dates (including year) indicative of such age should be removed, except that such ages and elements may be aggregated into a single category of age 90 or older.
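
As an illustration only, the safe harbor steps above might be sketched in Python as follows. The field names (`name`, `zip_code`, `birth_date`, and so on) are hypothetical and cover only part of the identifier list; a real dataset will have its own schema, and every identifier category listed above still needs to be reviewed. The sketch drops the listed identifier fields, reduces dates to the year, and aggregates ages of 90 or older into a single category. As a simplifying assumption, it treats the birth year as the only date element indicative of an age over 89.

```python
from datetime import date

# Hypothetical field names chosen for illustration; not a complete identifier list.
DIRECT_IDENTIFIER_FIELDS = {
    "name", "street_address", "city", "county", "precinct", "zip_code",
    "phone", "fax", "ssn", "medical_record_number", "email", "ip_address",
}
DATE_FIELDS = {"birth_date", "admission_date", "discharge_date", "death_date"}


def generalize_age(age: int) -> str:
    """Aggregate ages of 90 or older into a single category."""
    return "90 or older" if age >= 90 else str(age)


def strip_identifiers(record: dict, as_of: date) -> dict:
    """Return a copy of `record` without the listed identifiers,
    with dates reduced to the year and age reported in generalized form."""
    age = None
    birth = record.get("birth_date")
    if birth is not None:
        # Age in whole years as of the reference date.
        had_birthday = (as_of.month, as_of.day) >= (birth.month, birth.day)
        age = as_of.year - birth.year - (0 if had_birthday else 1)

    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIER_FIELDS:
            continue  # drop the identifier entirely
        if field in DATE_FIELDS:
            if value is None:
                continue
            if field == "birth_date" and age is not None and age > 89:
                continue  # the birth year itself would reveal an age over 89
            # Keep only the year element of the date.
            out[field.replace("_date", "_year")] = value.year
            continue
        out[field] = value

    if age is not None:
        out["age"] = generalize_age(age)
    return out
```

A call such as `strip_identifiers(record, as_of=date.today())` returns a new dictionary and leaves the original record unchanged. Free-text fields are not inspected by this sketch and would need separate review.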

Re-Identification by Code

Often, de-identified data may need to be re-identified (rendered distinguishable) for various purposes, such as enhancing research activities. For these situations, a special code or pseudonym may be assigned to individual records, provided that the following criteria are met:

The code or pseudonym may not be derived from other related information about the individual. Common de-identification techniques include the use of a one-way cryptographic function, known as hashing, or a random number generator.
Only authorized parties should know or have access to the re-identification method.
The re-identification method is documented and includes, at a minimum, the following security controls:
- Physical, technical, and administrative safeguards to protect the index of codes used to re-identify the data
- Physical and/or logical separation of storage of the de-identified data and the index of codes
- Documented retention period for the index of codes
- Documented steward responsible for safeguarding the index of codes
- Documented list of individuals authorized to access the index of codes
- Description of re-identification purpose, frequency, and duration (e.g., will the data remain de-identified for the duration of the study, or will it be re-identified periodically, and if so, by whom and for what reasons?)
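
The criteria above can be illustrated with a minimal Python sketch that uses the random number generator approach. The function name `assign_codes` and the `ssn` field are hypothetical, chosen only for the example. Each individual receives a random code from Python's `secrets` module, so the code is not derived from any information about the individual, and the index of codes is returned as a separate object so that it can be stored apart from the de-identified data.

```python
import secrets


def assign_codes(records, id_field="ssn"):
    """Split records into de-identified rows and a separate index of codes.

    `id_field` names the identifier being replaced (hypothetical default).
    """
    code_index = {}   # code -> original identifier; store separately, restricted access
    id_to_code = {}   # lets repeat records for one individual share a code
    deidentified = []

    for record in records:
        identifier = record[id_field]
        code = id_to_code.get(identifier)
        if code is None:
            # Random code from the operating system's randomness source,
            # not computed from the identifier or any other record field.
            code = secrets.token_hex(16)
            while code in code_index:  # guard against an (unlikely) collision
                code = secrets.token_hex(16)
            id_to_code[identifier] = code
            code_index[code] = identifier

        # Copy every field except the identifier, then attach the code.
        row = {k: v for k, v in record.items() if k != id_field}
        row["code"] = code
        deidentified.append(row)

    return deidentified, code_index
```

In this sketch, re-identification is a lookup in `code_index`, which is why the index is the artifact that the safeguards, separation of storage, retention period, steward, and access list above apply to. The function does not persist or protect the index itself; that is left to the documented controls.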

Resources

45 CFR Part 164, Subpart E: Privacy of Individually Identifiable Health Information, section 164.514(a) and (b).

"Guidance Regarding Methods for De-identification of Protected Health Information in Accordance with the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule," US Department of Health and Human Services, Office for Civil Rights.
