The purpose of this guideline is to outline the UO standards for de-identifying data. To protect the privacy and security of data subjects and to maintain the confidentiality of other sensitive information, data (human subject or other types) may need to be de-identified before use in academic, research, business, or operational functions. For example, sensitive data may need to be stripped of identifying information before it is used for research, institutional effectiveness studies, operational efficiency, public safety, or information security, or before it is released to other entities.
Acceptable De-identification Methods
Two de-identification methods are acceptable: expert determination and safe harbor. Both are based on the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule, as detailed in the US Department of Health and Human Services resources referenced in the Resources section below.
Expert Determination De-Identification Method
The expert determination method of de-identification is acceptable if an expert determines that the risk of re-identification is “very small” when the anticipated recipients use the data alone or in combination with other reasonably available information. The expert should document the methods of the analysis. Experts may be found in the statistical, mathematical, or other scientific domains. When expert determination is used and the data is subject to the provisions of HIPAA, the US Department of Health and Human Services Office for Civil Rights will review and vet the expert's professional experience and credentials with de-identification methodologies.
Safe Harbor De-identification Method
To achieve the “safe harbor” method of de-identification, the following unique identifiers of the individual, or of relatives, employers, or household members of the individual, should be removed:
- All geographic subdivisions smaller than a State, including:
  - Street address
  - Zip code
- All elements of dates (except year) for dates directly related to the individual, including:
  - Birth date
  - Admission date
  - Discharge date
  - Date of death
- Elements of dates for individuals over 89 years old
- Telephone numbers
- Fax numbers
- Social security numbers
- Medical record numbers
- Health plan beneficiary numbers
- Account numbers
- Certificate/license numbers
- Email addresses
- Social media profile names (or handles)
- Web Universal Resource Locators (URLs)
- Internet Protocol (IP) address numbers
- Device identifiers and serial numbers
- Vehicle identifiers and serial numbers, including license plate numbers
- Biometric identifiers, including finger and voice prints
- Full-face photographs and any comparable images
- Any other unique identifying number, characteristic, or code
In addition to the removal of unique identifiers, there should be reasonable assurance that the individual or entity intending to use the data does not have actual knowledge that the remaining information could be used, alone or in combination with any reasonably available information, to identify an individual who is a subject of the data. Other details that may result in the identification of an individual include initials, circumstances associated with the care of an individual, highly publicized details, and profession or occupation.
Zip codes and their equivalent geocodes should be removed, except for the initial three digits of a zip code if, according to the current publicly available data from the Bureau of the Census: (1) the geographic unit formed by combining all zip codes with the same three initial digits contains more than 20,000 people; and (2) the initial three digits of a zip code for all such geographic units containing 20,000 or fewer people are changed to 000.
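The zip code rule can be sketched in code. The following is a minimal illustration, not an official implementation: the function name `generalize_zip` and the `example_population` table are hypothetical, and the population figures are invented for the example rather than taken from Census data. A real implementation would load current three-digit population counts from the Bureau of the Census.

```python
def generalize_zip(zip_code: str, zip3_population: dict) -> str:
    """Return the three-digit prefix if its area has more than 20,000 people, else "000"."""
    # Keep digits only, so "97403-1234" is handled like "97403".
    digits = "".join(ch for ch in zip_code if ch.isdigit())
    if len(digits) < 3:
        return "000"
    prefix = digits[:3]
    # Assumption: a prefix missing from the table is treated as a small area.
    if zip3_population.get(prefix, 0) > 20_000:
        return prefix
    return "000"


# Hypothetical population counts, for illustration only (not Census data).
example_population = {"974": 350_000, "999": 12_000}

print(generalize_zip("97403-1234", example_population))
print(generalize_zip("99950", example_population))
```

Treating unknown prefixes as small areas is a conservative choice made for this sketch: when in doubt, less geographic detail is retained.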
All ages over 89, and all elements of dates (including year) indicative of such age, should be removed, except that such ages and elements may be aggregated into a single category of age 90 or older.
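As an illustration of the date and age rules, the sketch below reduces a birth date to its year and aggregates ages of 90 or older into a single category. The function names (`age_on`, `generalize_birth_date`) and the returned field names are assumptions made for this example and are not prescribed by the guideline.

```python
from datetime import date
from typing import Dict, Optional


def age_on(birth_date: date, as_of: date) -> int:
    """Age in completed years on the given reference date."""
    had_birthday = (as_of.month, as_of.day) >= (birth_date.month, birth_date.day)
    return as_of.year - birth_date.year - (0 if had_birthday else 1)


def generalize_birth_date(birth_date: date, as_of: date) -> Dict[str, Optional[str]]:
    """Keep only the birth year, and suppress it when it would reveal an age over 89."""
    age = age_on(birth_date, as_of)
    if age >= 90:
        # The birth year itself would indicate the age, so it is dropped.
        return {"birth_year": None, "age": "90+"}
    return {"birth_year": str(birth_date.year), "age": str(age)}


print(generalize_birth_date(date(1950, 6, 15), date(2024, 6, 14)))
print(generalize_birth_date(date(1930, 1, 1), date(2024, 6, 14)))
```

Other dates tied to the individual, such as admission or discharge dates, could be reduced to the year in the same way.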
Re-Identification by Code
Often, de-identified data may need to be re-identified (rendered distinguishable) for various purposes, including to enhance research activities. In these situations, a special code or pseudonym may be assigned to individual records, provided the following criteria are met:
- The code or pseudonym may not be derived from other related information about the individual. Common techniques for generating codes include a one-way cryptographic function, known as hashing, or a random number generator.
- Only authorized parties should know or have access to the re-identification method.
- The re-identification method is documented and includes, at a minimum, the following security controls:
  - Physical, technical, and administrative safeguards to protect the index of codes used to re-identify the data
  - Physical and/or logical separation of storage of the de-identified data and the index of codes
  - Documented retention period for the index of codes
  - Documented steward responsible for safeguarding the index of codes
  - Documented list of individuals authorized to access the index of codes
  - Description of re-identification purpose, frequency, and duration (e.g., will the data remain de-identified for the duration of the study, or will it be re-identified periodically, by whom, and for what reasons?)
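The criteria above can be illustrated with a short sketch that assigns random codes and keeps the index of codes apart from the de-identified records. It assumes the random number generator approach; the function name `pseudonymize`, the `code` field, and the sample records are hypothetical. Where and how the index is stored and protected is left to the documented safeguards described above.

```python
import secrets


def pseudonymize(records, id_field):
    """Replace the identifier field with a random code.

    Returns (deidentified_records, index), where index maps each code back to
    the original identifier and is meant to be stored separately.
    """
    code_for_id = {}
    index = {}
    deidentified = []
    for record in records:
        original_id = record[id_field]
        if original_id not in code_for_id:
            # Random codes are not derived from any information about the person.
            code = secrets.token_hex(8)
            while code in index:  # regenerate in the unlikely case of a collision
                code = secrets.token_hex(8)
            code_for_id[original_id] = code
            index[code] = original_id
        stripped = {k: v for k, v in record.items() if k != id_field}
        stripped["code"] = code_for_id[original_id]
        deidentified.append(stripped)
    return deidentified, index


# Hypothetical sample data, for illustration only.
sample = [
    {"student_id": "A1", "score": 90},
    {"student_id": "B2", "score": 85},
    {"student_id": "A1", "score": 70},
]
deidentified, index = pseudonymize(sample, "student_id")
```

Because the same individual always receives the same code, records can still be linked within the de-identified data, while re-identification requires access to the separately stored index.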
Resources
45 CFR Part 164, Subpart E: Privacy of Individually Identifiable Health Information, section 164.514 (a) and (b).
"Guidance Regarding Methods for De-identification of Protected Health Information in Accordance with the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule," US Department of Health and Human Services, Office for Civil Rights.