EHR Data Normalization Using FHIR Standards

The Challenge

Fast Healthcare Interoperability Resources (FHIR) is an industry standard for normalizing healthcare data, and the normalized data can then be used for AI/ML model development. To convert EHR data to FHIR, we compared two implementation approaches:

1. NextGen Connect, a commercial tool
2. A custom Python script

The data covers 55k patient records, which were converted into a FHIR-based target canonical structure. The records consist of:

360 million observations
21 million medication administrations
4 million medication requests
120k encounters
2 million diagnostic reports
250k procedures

Approach

Data Source and Format: The input data is relational and was provided to us in CSV format. It does not contain any personally identifiable information (PII) or personal health information (PHI).

Data Pre-processing: All files larger than 50 GB were split into smaller per-patient files to speed up search, access, and loading, and to reduce memory use during processing.
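The per-patient split described above can be sketched roughly as follows. This is a minimal illustration rather than the script used in the project: the function name and the `patient_id` column are hypothetical, and it assumes patient identifiers are safe to use as file names and that the output directory starts out empty.

```python
import csv
from pathlib import Path


def split_csv_by_patient(src_path, out_dir, patient_column="patient_id"):
    """Stream a large CSV and append each row to a per-patient file.

    Rows are read one at a time, so memory use does not grow with file size.
    `patient_column` is a hypothetical column name, not one from the source data.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    seen = set()  # patients whose output file already has a header row

    with open(src_path, newline="", encoding="utf-8") as src:
        reader = csv.DictReader(src)
        for row in reader:
            patient_id = row[patient_column]
            target = out_dir / f"{patient_id}.csv"
            # Reopen in append mode for each row so that thousands of file
            # handles are never open at once; a real pipeline might cache them.
            with open(target, "a", newline="", encoding="utf-8") as dst:
                writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
                if patient_id not in seen:
                    writer.writeheader()
                    seen.add(patient_id)
                writer.writerow(row)
    return sorted(seen)
```

Reopening the target file for every row trades speed for simplicity. For files of the size described here, sorting by patient first or caching a bounded number of open handles would likely be faster.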

Compute Infrastructure: We used elastic cloud compute and storage services, such as BigQuery and the Cloud Healthcare API, on Google Cloud Platform (GCP). This effort can also be replicated on AWS and Azure.
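Bulk loading into services like these is commonly done with newline-delimited JSON (one resource per line). The sketch below shows only that serialization step. It assumes the resources are already available as Python dictionaries, `write_ndjson` is a hypothetical helper name, and the upload itself is out of scope.

```python
import io
import json


def write_ndjson(resources, stream):
    """Write each resource as one compact JSON object per line."""
    count = 0
    for resource in resources:
        # Compact separators keep every resource on a single line.
        stream.write(json.dumps(resource, separators=(",", ":")))
        stream.write("\n")
        count += 1
    return count


# Usage with an in-memory buffer; a real run would write to a file instead.
buffer = io.StringIO()
written = write_ndjson(
    [
        {"resourceType": "Patient", "id": "p1"},
        {"resourceType": "Patient", "id": "p2"},
    ],
    buffer,
)
```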

Data Mapping and Transformation: We transformed the data into FHIR resources by mapping and transforming values from the input relational structure to standard fields in FHIR (STU3).
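As an illustration of this mapping step, the sketch below turns one relational row into a FHIR STU3 Observation represented as a plain dictionary. The input column names (`obs_id`, `patient_id`, `loinc_code`, and so on) are hypothetical, and the fixed `final` status is an assumption. The actual mapping depends on the source schema and on clinical input.

```python
def row_to_observation(row):
    """Map one relational row (a dict) to a FHIR STU3 Observation as a dict.

    The input keys are hypothetical column names chosen for illustration.
    """
    return {
        "resourceType": "Observation",
        "id": str(row["obs_id"]),
        "status": "final",  # assumes the source only holds finalized results
        "code": {
            "coding": [{
                "system": "http://loinc.org",
                "code": row["loinc_code"],
                "display": row["label"],
            }]
        },
        "subject": {"reference": f"Patient/{row['patient_id']}"},
        "effectiveDateTime": row["charted_at"],
        "valueQuantity": {
            "value": float(row["value"]),  # CSV values arrive as strings
            "unit": row["unit"],
        },
    }


# Example row as csv.DictReader would produce it (all values are strings).
example = row_to_observation({
    "obs_id": "1001",
    "patient_id": "42",
    "loinc_code": "8867-4",
    "label": "Heart rate",
    "charted_at": "2020-01-01T08:30:00Z",
    "value": "80",
    "unit": "beats/minute",
})
```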

FHIR Validation: The converted data was validated for consistency and format correctness using pre-built Python packages. The same conversion was also performed in NextGen Connect, with the mapping written in JavaScript.
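The kind of check a validator performs can be sketched with a few hand-written rules for an Observation. This is not the package used in the project and it is far from full FHIR validation: it only tests the resource type, the STU3 status codes, the presence of `code`, and the shape of the subject reference. The function name is hypothetical.

```python
import re

# Status codes allowed for Observation in FHIR STU3.
OBSERVATION_STATUSES = {
    "registered", "preliminary", "final", "amended",
    "corrected", "cancelled", "entered-in-error", "unknown",
}
# Relative reference of the form "<ResourceType>/<id>".
REFERENCE_PATTERN = re.compile(r"^[A-Za-z]+/[A-Za-z0-9\-.]{1,64}$")


def validate_observation(resource):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    if resource.get("resourceType") != "Observation":
        problems.append("resourceType must be 'Observation'")
    if resource.get("status") not in OBSERVATION_STATUSES:
        problems.append("status is missing or not an allowed code")
    if not resource.get("code"):
        problems.append("code is required")
    reference = resource.get("subject", {}).get("reference")
    if reference is not None and not REFERENCE_PATTERN.match(reference):
        problems.append("subject.reference should look like 'Patient/<id>'")
    return problems
```

Returning a list of problems instead of raising on the first one makes it easier to report every issue in a resource during a bulk run.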

Results

Advantages of Script

The custom script approach can be faster than working with a commercial tool when the data engineering team is not familiar with the software product.

FHIR resource validation is relatively easy with a custom script, which can use open-source packages or packages written in-house.

Custom scripts can be extended with new features as needed, whereas the tool may be limited to its existing features and involves a steep learning curve.

Advantages of Tool

The commercial product handles a range of data challenges through an easy-to-use GUI, which makes it approachable for non-technical functional users.

With documentation and instructions, data transformation and mapping work can be easier to replicate within the tool than with a proprietary custom script.

Conclusion

Interoperability is a challenge that the healthcare industry has been trying to solve since EHR systems were first implemented.

Several open-source technical solutions have adopted interoperability standards such as FHIR to normalize and prepare data for AI and ML model development. However, data mapping and transformation remains a crucial step that requires clinical domain expertise from the hospital where the data was created, because each hospital's data has its own idiosyncrasies.
