by Michael Werneburg
I currently have a contract with a firm that "de-identifies" health information before it is shared with third parties such as marketers, drug manufacturers, and researchers. De-identification is the process of ensuring that the payload information (about a series of hospital visits, say, or about drug prescriptions) cannot be traced back to individual patients.
It's a tricky business, because it's not about the direct identifiers that you simply blot out of the information before sharing it: the names, birth dates, and patient IDs. It's about handling the rest of the information in such a way that no journalist or prosecutor will be able to piece together who a patient is from what remains. This involves things like introducing subtle changes to the data that still allow it to retain its value. Several of the tools used involve statistical processes.
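To give a rough sense of what this can look like, here is a minimal sketch in Python. It drops the direct identifiers, coarsens two of the remaining fields, and then counts how many records share each combination of those coarsened values (a measure usually called k-anonymity). The field names, the sample records, the helper functions `generalize` and `smallest_group`, and the choice of k-anonymity as the yardstick are all illustrative assumptions, not a description of any particular firm's tooling.

```python
from collections import Counter

# Hypothetical field names, used for illustration only.
DIRECT_IDENTIFIERS = {"name", "birth_date", "patient_id"}


def generalize(record):
    """Drop direct identifiers and coarsen the remaining quasi-identifiers."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Keep only the decade of birth rather than the full date.
    out["birth_decade"] = int(record["birth_date"][:4]) // 10 * 10
    # Keep only the first three characters of the postal code.
    out["postal_prefix"] = out.pop("postal_code")[:3]
    return out


def smallest_group(records, quasi_identifiers):
    """Size of the smallest group of records sharing the same quasi-identifier values."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(counts.values())


# Made-up sample data.
patients = [
    {"name": "Alice", "birth_date": "1971-03-02", "patient_id": "P1",
     "postal_code": "M5V 2T6", "diagnosis": "asthma"},
    {"name": "Bob", "birth_date": "1975-11-20", "patient_id": "P2",
     "postal_code": "M5V 1J1", "diagnosis": "diabetes"},
    {"name": "Carol", "birth_date": "1982-06-14", "patient_id": "P3",
     "postal_code": "K1A 0B1", "diagnosis": "asthma"},
    {"name": "Dave", "birth_date": "1988-01-30", "patient_id": "P4",
     "postal_code": "K1A 0A9", "diagnosis": "fracture"},
]

released = [generalize(p) for p in patients]

# Before generalization every patient is unique on (birth_date, postal_code).
print(smallest_group(patients, ["birth_date", "postal_code"]))        # 1
# Afterwards each patient shares a (decade, prefix) pair with one other.
print(smallest_group(released, ["birth_decade", "postal_prefix"]))    # 2
```

The trade-off is visible even in this toy case: the released records are harder to link to a person, but anyone analysing them now knows only a decade and a postal prefix rather than an exact date and address.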
It's very interesting, and has opened a new dimension in my understanding of information risk.