
Protected health information and logging

By Abigail Watson 16 Aug 2017

The U.S. Department of Health and Human Services, in conjunction with the Office for Civil Rights, has identified the following 18 types of Protected Health Information (PHI) as being applicable to the HIPAA Privacy Rule’s De-Identification Standard. If you’re developing a healthcare app that uses any of these types of PHI, writing this data to a logging service that resides in a third-party data center may result in a HIPAA violation. These are the fields that healthcare providers and healthcare apps need to de-identify or anonymize before logging.

Protected health information

  1. Names
  2. All geographical identifiers smaller than a state
  3. Dates (other than year) directly related to an individual
  4. Phone numbers
  5. Fax numbers
  6. Email addresses
  7. Social Security numbers
  8. Medical record numbers
  9. Health insurance beneficiary numbers
  10. Account numbers
  11. Certificate/license numbers
  12. Vehicle identifiers and serial numbers
  13. Device identifiers and serial numbers
  14. Web Uniform Resource Locators (URLs)
  15. Internet Protocol (IP) address numbers
  16. Biometric identifiers, including finger, retinal, and voice prints
  17. Full face photographic images
  18. Any other unique identifying number, characteristic, or code

Historically, tracking all of these pieces of data has been difficult at best. As such, previous tutorials and best practices may provide a hash function to show how to anonymize a field and leave the responsibility of protecting health data to the developers and system administrators of the application. This is fine, but it leaves the problem of identifying which data is PHI to people who may not be familiar with HIPAA requirements, and there is a small chance that the hash could be reversed and the original data recovered. Ideally, organizations would like a structured, peer-reviewed approach to identifying PHI that consistently anonymizes or ‘zeros out’ the data, so that DevOps engineers and others who don’t need this data aren’t accidentally exposed to it.
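
As a rough illustration of the hashing approach that such tutorials describe, the following Node.js sketch replaces a field with a keyed hash (HMAC-SHA-256) from the built-in crypto module. The helper name hashField, the sample phone number, and the hard-coded key are assumptions made for this example and do not come from any particular tutorial. A key is used here because a plain hash of a short, predictable value such as a phone number can be recovered by hashing every possible input and comparing.

```javascript
const crypto = require('crypto');

// Hypothetical helper: replace a PHI value with a keyed hash, so that log
// entries about the same person can still be correlated with each other.
function hashField(value, secretKey) {
  return crypto
    .createHmac('sha256', secretKey)
    .update(String(value))
    .digest('hex');
}

// Assumption: a real deployment would load the key from a secrets store,
// never from source code.
const secretKey = 'example-key-do-not-use';
const hashedPhone = hashField('555-0100', secretKey);

console.log(hashedPhone.length); // 64 (hex characters)
```

Even with a key, this only pseudonymizes a field that someone has already recognized as PHI; it does nothing to help identify which fields need the treatment.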

In the past few years, we’ve seen a very interesting initiative from Health Level Seven International (HL7) that may be able to do just that. The initiative is called Fast Healthcare Interoperability Resources (FHIR) and is a web-standards interoperability initiative that defines a standard web API for participating vendors to implement. It’s a global healthcare API, which has the support of major electronic health record vendors.

HL7 helps logging efforts because the FHIR specification defines standard data models for the data objects that are commonly exchanged between systems, and those data models can specify if and how they handle protected health information. For instance, the FHIR patient resource is defined by the following schema:

We can see that, at a minimum, PHI is located in the following places on the patient schema: name[].family[] and name[].given[].
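
To make those locations concrete, here is a small sketch built around a hypothetical Patient-shaped object. The sample values and the listNameParts helper are made up for illustration, and the object is not a complete or validated FHIR resource. The family field is shown as an array to match name[].family[]; its exact shape depends on the FHIR release in use, so the helper accepts either an array or a single string.

```javascript
// Hypothetical sample data; not a complete or validated FHIR resource.
const samplePatient = {
  resourceType: 'Patient',
  name: [
    { text: 'Jane Q Doe', family: ['Doe'], given: ['Jane', 'Q'] }
  ]
};

// Collect every string found under name[].family and name[].given.
function listNameParts(patient) {
  const parts = [];
  (patient.name || []).forEach(function (humanName) {
    // concat() flattens arrays one level and appends plain strings as-is.
    [].concat(humanName.family || [], humanName.given || [])
      .forEach(function (part) { parts.push(part); });
  });
  return parts;
}

console.log(listNameParts(samplePatient)); // [ 'Doe', 'Jane', 'Q' ]
```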

With this API and schema information, we can start writing libraries to automatically anonymize PHI. Note that HL7 documents can contain unstructured data that might store PHI, which the application developer is still responsible for anonymizing and protecting. But for routine structured data, the FHIR schemas provide a basis for libraries that help maintain HIPAA compliance when using third-party data centers, logging services, and PaaS infrastructure.

The following code sample demonstrates how an organization can anonymize its patient names using a Patient object whose internal data structure conforms to the FHIR patient schema. The Patient resource is available by running meteor add clinical:hl7-resource-patient clinical:autopublish.

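Patient.prototype.anonymize = function () {
  // Note: this aliases, and therefore modifies, the current Patient object.
  var anonymizedPatient = this;
  if(this.name && this.name[0]){
    var anonymizedName = this.name[0];

    if(this.name[0].family){
      anonymizedName.family = Anon.name(this.name[0].family);
    }
    // Mask the first given name and drop any others.
    if(this.name[0].given && this.name[0].given[0]){
      var secretGiven = Anon.name(this.name[0].given[0]);
      anonymizedName.given = [];
      anonymizedName.given.push(secretGiven);
    }
    if(this.name[0].text){
      anonymizedName.text = Anon.name(this.name[0].text);
    }
    // Keep only the first, now-masked, name entry.
    anonymizedPatient.name = [];
    anonymizedPatient.name.push(anonymizedName);
  }
  return anonymizedPatient;
};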

Anon = {
  // Replace every character except spaces with an asterisk.
  name: function(name){
    var anonName = '';
    for(var i = 0; i < name.length; i++){
      if(name[i] === " "){
        anonName = anonName + " ";
      } else {
        anonName = anonName + "*";
      }
    }
    return anonName;
  }
};

With such a utility function in place, we can then begin thinking about anonymizing data as we log it:

  // the clinical:hl7-resource-patient package adjusts the findOne() function so it returns a Patient object
  var currentPatient = Patients.findOne({'identifier.value': Meteor.userId()});

  // we can now call anonymize() on our data object when we send it to loggly 
  winston.log('info', "Hello World from Clinical Meteor!", currentPatient.anonymize());
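
The same flow can be tried outside of Meteor with plain objects. The sketch below is a stand-in written under several assumptions: maskName mirrors the Anon.name helper, anonymizeNames is a hypothetical variant of anonymize() that returns a copy instead of modifying its input, the sample patient is made up, and console.log takes the place of the logger.

```javascript
// Mirrors Anon.name: every character except a space becomes '*'.
function maskName(name) {
  return String(name).replace(/[^ ]/g, '*');
}

// Hypothetical non-mutating variant of anonymize(): returns a shallow copy
// of the patient whose first name entry has been masked.
function anonymizeNames(patient) {
  const copy = Object.assign({}, patient);
  if (patient.name && patient.name[0]) {
    const first = Object.assign({}, patient.name[0]);
    if (first.family) { first.family = maskName(first.family); }
    if (first.given && first.given[0]) { first.given = [maskName(first.given[0])]; }
    if (first.text) { first.text = maskName(first.text); }
    copy.name = [first];
  }
  return copy;
}

// Made-up sample record.
const patient = {
  id: 'example',
  name: [{ text: 'Jane Doe', family: 'Doe', given: ['Jane'] }]
};

const safe = anonymizeNames(patient);
console.log(JSON.stringify(safe.name));
// [{"text":"**** ***","family":"***","given":["****"]}]
```

Returning a copy is a design choice for this sketch: the original record stays usable by the rest of the application after the masked version has been logged.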

And that creates all sorts of exciting possibilities, since we’re using isomorphic JavaScript and have the anonymize() function on both the client and server. This approach allows developers to anonymize on the client based on user role; to anonymize on the server when logging to external systems; to anonymize when exporting data to flat files; and to log events from both server and client. Of course, we need to implement FHIR resources for the 100+ resources that HL7 has defined, and we need to go through each of them and identify where PHI is located. But for the first time, we have a path forward to implement a coherent anonymization strategy for all PHI types according to a standard API.
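
One possible way to scale this across many resource types is to keep the PHI locations in a table and drive a generic function from it. The sketch below is only an illustration: the PHI_FIELDS map and the zeroOutPhi helper are hypothetical, only top-level fields are handled, and the fields listed are examples rather than a vetted inventory of where PHI lives in any FHIR resource.

```javascript
// Hypothetical, illustrative map of resource type -> top-level fields to drop.
const PHI_FIELDS = {
  Patient: ['name', 'telecom', 'address', 'birthDate'],
  Practitioner: ['name', 'telecom', 'address']
};

// Return a shallow copy of the resource with the listed fields removed.
// Unmapped resource types are rejected so nothing is logged by accident.
function zeroOutPhi(resource) {
  const type = resource.resourceType;
  if (!Object.prototype.hasOwnProperty.call(PHI_FIELDS, type)) {
    throw new Error('No PHI map for resource type: ' + type);
  }
  const copy = Object.assign({}, resource);
  PHI_FIELDS[type].forEach(function (field) { delete copy[field]; });
  return copy;
}

const cleaned = zeroOutPhi({
  resourceType: 'Patient',
  id: 'example',
  gender: 'female',
  birthDate: '1970-01-01',
  name: [{ text: 'Jane Doe' }]
});

console.log(cleaned);
// { resourceType: 'Patient', id: 'example', gender: 'female' }
```

Failing on unmapped resource types is a deliberate choice in this sketch: it forces someone to review a resource for PHI before instances of it can reach a log.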


For more information, see these links. And happy coding!

Abigail Watson

Abigail Watson: 20+ years of IT industry experience, now focusing on biomedical informatics and full-stack JavaScript applications. Five years’ experience as a Node/Meteor developer. Chicago-area cyclist and entrepreneur.
