Answered By: Thomas King
Last Updated: Nov 29, 2018

De-identification is the removal of information that can be used to identify individuals. These are usually research participants, but may also include other people referred to in the data.

The first step is to remove all direct identifiers: pieces of information that can be used in isolation to identify an individual. Direct identifiers include names, email addresses, phone numbers, and ID numbers (including staff numbers). In quantitative datasets, this is often as simple as removing the fields that contain such information.
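For a quantitative dataset, removing direct identifiers could look something like the sketch below. It is only an illustration: the field names (`name`, `email`, `phone`, `staff_id`) and the sample records are hypothetical, and a real dataset will need its own list of fields to drop.

```python
# Hypothetical field names; adjust this set to match the actual dataset.
DIRECT_IDENTIFIERS = {"name", "email", "phone", "staff_id"}


def drop_direct_identifiers(records, identifiers=DIRECT_IDENTIFIERS):
    """Return copies of the records without the direct-identifier fields."""
    return [
        {field: value for field, value in record.items() if field not in identifiers}
        for record in records
    ]


# Made-up sample data for illustration only.
records = [
    {"name": "A. Example", "email": "a@example.org", "staff_id": "S001", "score": 4},
    {"name": "B. Example", "email": "b@example.org", "staff_id": "S002", "score": 2},
]

cleaned = drop_direct_identifiers(records)
print(cleaned)  # [{'score': 4}, {'score': 2}]
```

The function builds new records rather than editing the originals, so the raw data stays intact and can be stored separately under tighter access controls if it must be kept.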

The second step is to remove all indirect identifiers: pieces of information that can be combined with other information to identify someone. For example, an individual's position in an institution, combined with the date the data was collected, can sometimes be enough to identify them. Indirect identifiers are more common in qualitative data and are considerably harder to remove.
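One rough way to spot risky indirect identifiers in tabular data is to look for combinations of values that occur only once. The sketch below assumes hypothetical fields `position` and `collection_date`, mirroring the example above. It only flags the most obvious cases: a combination shared by several records is not automatically safe, and this approach does not carry over to free-text qualitative data.

```python
from collections import Counter

# Hypothetical indirect-identifier fields, following the example in the text.
INDIRECT_IDENTIFIERS = ("position", "collection_date")


def find_unique_combinations(records, fields=INDIRECT_IDENTIFIERS):
    """Return the combinations of field values that occur in exactly one record."""
    counts = Counter(tuple(record.get(f) for f in fields) for record in records)
    return [combo for combo, n in counts.items() if n == 1]


# Made-up sample data for illustration only.
records = [
    {"position": "Lecturer", "collection_date": "2018-03-01", "score": 4},
    {"position": "Lecturer", "collection_date": "2018-03-01", "score": 2},
    {"position": "Dean", "collection_date": "2018-03-02", "score": 5},
]

risky = find_unique_combinations(records)
print(risky)  # [('Dean', '2018-03-02')]
```

Records flagged this way are candidates for having one of the fields removed or made less specific, for instance by dropping the exact collection date.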