How to transform European patient data to the OMOP Common Data Model

The first OHDSI Symposium in Europe will be held in Rotterdam (NL) on March 23, 2018, hosted by the Erasmus Medical Centre. As with previous international OHDSI symposiums, Maxim and Kees will attend the meeting, accompanied this time by Stefan, Julia, Irina, and Marinel. This blog post gives a preview of the lessons we learned working with OHDSI.

In recent years, The Hyve has been involved in transforming patient and medication data from four different countries to the OMOP Common Data Model (CDM), the standard data model adopted by OHDSI. The projects required mapping Swedish electronic health record (EHR) data, Danish drug description data, Dutch general practitioner (GP) data and British EHR data to the OMOP CDM. This proved quite challenging at times.

Our experiences with mapping to the OMOP CDM

At the OHDSI symposium in Rotterdam, Maxim will present the issues he encountered in the Danish project. A set of 4754 drugs, with their active ingredient dose and dosage form, needed to be mapped to RxNorm, the OMOP standard vocabulary for drugs. With a script that The Hyve developed in-house, 67 percent of the drugs could be mapped automatically without data loss. The remaining 33 percent were mapped to a lower level of the hierarchy or had to be mapped by hand, a labour-intensive and time-consuming task.
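The in-house script itself is not shown in this post. As a rough illustration of the idea, here is a minimal sketch of a tiered lookup, assuming that "a lower level of the hierarchy" means falling back to a less specific concept when the exact combination of ingredient, dose and form is not found. The function name `map_drug`, the record fields and the lookup keys are all hypothetical, and the lookup table stands in for whatever vocabulary tables a real project would query.

```python
def map_drug(drug, lookup):
    """Try the most specific key first, then progressively less specific ones.

    `drug` is a dict with the hypothetical fields 'ingredient', 'dose' and 'form'.
    `lookup` is a hypothetical dict from key tuples to target concept identifiers.
    Returns (level, concept); the level is 'manual' when nothing matched.
    """
    levels = [
        ("full", (drug["ingredient"], drug["dose"], drug["form"])),
        ("ingredient+form", (drug["ingredient"], drug["form"])),
        ("ingredient", (drug["ingredient"],)),
    ]
    for level, key in levels:
        if key in lookup:
            return level, lookup[key]
    # Nothing matched at any level: leave the drug for manual mapping.
    return "manual", None


# Placeholder lookup content, for illustration only (not real vocabulary data).
example_lookup = {
    ("paracetamol", "500 mg", "tablet"): "concept-A",
    ("paracetamol", "tablet"): "concept-B",
    ("ibuprofen",): "concept-C",
}
```

Returning the level alongside the concept makes it easy to count afterwards how many drugs were mapped in full, how many lost detail, and how many are left for manual work.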

European cohorts always require a combination of automated and manual mapping, and The Hyve ran into a number of recurring issues across the various cohorts.

During the step in which the active ingredient is mapped to RxNorm, Maxim often found that drugs with more than one active ingredient could not be mapped automatically. One drug, for example, contained bendroflumethiazide and potassium. Both ingredients exist separately in the ATC, the classification ontology that lists all active ingredients of drugs, but the combination does not. The combined drug could be mapped manually to the RxNorm concept, but automating this step would be extremely helpful.
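To make the multi-ingredient problem concrete, here is a small sketch, not the script we used. It assumes a hypothetical lookup keyed by the set of ingredient names. A combination that is missing from the lookup is flagged for manual review, together with the single ingredients that are known. The names `INGREDIENT_LOOKUP` and `map_ingredients` and the lookup entries are placeholders.

```python
# Hypothetical lookup from a set of ingredient names to a target concept name.
# The entries are illustrative placeholders, not real vocabulary content.
INGREDIENT_LOOKUP = {
    frozenset({"bendroflumethiazide"}): "bendroflumethiazide",
    frozenset({"potassium"}): "potassium",
}


def map_ingredients(ingredients):
    """Map a drug's ingredient list as a whole, or flag it for manual review."""
    key = frozenset(name.strip().lower() for name in ingredients)
    concept = INGREDIENT_LOOKUP.get(key)
    if concept is not None:
        return {"status": "mapped", "concept": concept}
    # The combination is unknown: report which single ingredients are known,
    # so the person doing the manual mapping has a starting point.
    known = sorted(n for n in key if frozenset({n}) in INGREDIENT_LOOKUP)
    return {"status": "manual", "known_parts": known}
```

Using a `frozenset` as the key makes the lookup independent of the order in which the source lists the ingredients.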

In The Hyve's experience, mapping the dose form to RxNorm proved even more difficult. In the Danish dataset, for example, no fewer than 490 different dose forms were identified. In the end, only the 90 most frequently occurring forms were mapped.
Sometimes the Danish description needed to be translated to the more commonly used name. For example, the prescription records never mentioned ‘cream’, whereas ‘topical cream’ was used 4177 times. For better comparability of the data, we decided to map the Danish dose form ‘cream’ to ‘topical cream’ instead.
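Picking the most frequent dose forms and renaming some of them can be sketched in a few lines. This is an illustration under assumptions rather than our actual code. `DOSE_FORM_SYNONYMS` is a hypothetical translation table holding only the example from the text, and `plan_dose_form_mapping` is a hypothetical helper that takes the dose forms as a list of strings.

```python
from collections import Counter

# Hypothetical translation table from a source dose form to the preferred
# target name. Only the example from the text is included.
DOSE_FORM_SYNONYMS = {"cream": "topical cream"}


def plan_dose_form_mapping(source_forms, top_n):
    """Return (source form, count, target form) for the top_n most frequent forms."""
    counts = Counter(form.strip().lower() for form in source_forms)
    plan = []
    for form, count in counts.most_common(top_n):
        # Fall back to the source name when no preferred name is known.
        target = DOSE_FORM_SYNONYMS.get(form, form)
        plan.append((form, count, target))
    return plan
```

Called with `top_n=90` on the Danish dose forms, a helper like this would list the forms to map first. Everything below the cut-off stays unmapped.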

At the symposium, Maxim will also present results of the quality analysis he performed on the Swedish, Dutch and UK datasets that The Hyve converted to the OMOP Common Data Model. The Achilles Heel tool is available for this type of analysis, but colleagues at The Hyve extended its code to reveal additional quality information, for example how much can be gained by mapping the top unmapped concepts.
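The "how much can be gained" question comes down to simple arithmetic on record counts. The sketch below is not the Achilles Heel code or our extension of it. It is a standalone illustration that assumes you already have the number of mapped records and a count per unmapped source code. The function name `coverage_gain` and its parameters are hypothetical.

```python
def coverage_gain(mapped_records, unmapped_counts, top_k):
    """Return (current, potential) mapped fractions of all records.

    `unmapped_counts` maps each unmapped source code to its record count.
    `potential` is the fraction reached if the top_k most frequent
    unmapped codes were mapped as well.
    """
    total = mapped_records + sum(unmapped_counts.values())
    if total == 0:
        return 0.0, 0.0
    top = sorted(unmapped_counts.values(), reverse=True)[:top_k]
    current = mapped_records / total
    potential = (mapped_records + sum(top)) / total
    return current, potential
```

Because unmapped codes are often very unevenly distributed, a handful of frequent codes can account for a large share of the unmapped records, which is why the top of the list is the obvious place to start.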

Future opportunities and challenges

As a member of an expert panel, Kees will discuss opportunities and challenges for the adoption of the OMOP CDM in Europe. The four OHDSI categories (drugs, medical conditions, surgical procedures and measurements such as lab results) all pose their own challenges.
In The Hyve’s experience, one of the major hurdles for adoption of the OHDSI standard is the translation of medical terms from Swedish, Danish and Dutch to English, and then often translating them again to the US standard vocabulary that the OMOP Common Data Model uses.

If you want to hear more about these and our other experiences with OHDSI, please come by our booth at the symposium.

Written by

Maxim Moinat