Harmonizing Observational Data – Supporting Real World Data Life Cycle


OHDSI

OHDSI stands for Observational Health Data Sciences and Informatics and is pronounced “Odyssey”. It is an initiative to enable analysis and sharing of real world health data (or observational data) between different institutes and companies, and it is the successor of the Observational Medical Outcomes Partnership (OMOP). It can also be used as a standalone data warehouse application.


Reasons to use OHDSI

Once you have mapped your data to the OMOP Common Data Model (OMOP CDM), OHDSI provides you with a complete suite of tools to analyze your data: ATLAS. ATLAS can be hosted on your own server and accessed via a web browser. It helps you build your own cohorts, or focus on a specific subject, and visualize the results.

Another strong point of OHDSI is that you can extend your data set by exchanging analysis results with other groups, without having to share the actual data. This works because all data sets follow the same structure in the OMOP CDM and use the same analysis tools. Over 650 million records worldwide are reported to be available for analysis in OHDSI at the moment. This makes it the preferred data warehouse for observational studies.

As in most open source community efforts, new features, updates and tools are continuously developed by the OHDSI community. You can try ATLAS here (hosted by The Hyve) or here (hosted by the OHDSI organization). See this page for some other examples. New tools can be used without modification once your data is converted to the OMOP CDM.

An example of an ATLAS feature is the profile timeline, which shows the complete history of one patient: 

OHDSI atlas overview

Try this yourself by navigating to ‘Profile’ and selecting a study and a patient.

Check out ATLAS!

There is an active OHDSI community to help you out, with participants from different disciplines (e.g. clinical medicine, biostatistics, computer science, epidemiology, life sciences) and stakeholder groups (e.g. researchers, patients, providers, payers, product manufacturers, regulators).


How does it work?

As with other data warehouses, data from different sources are transformed to fit a general data model: the Observational Medical Outcomes Partnership (OMOP) common data model (CDM).

This model has the patient as the central entity, with tables for the data commonly needed in clinical trials and observational studies, such as drug use and procedures performed. These tables are grouped together in the “clinical data” part of the data warehouse.
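To make the patient-centric layout concrete, here is a minimal sketch using Python's built-in sqlite3 module. It assumes a heavily simplified subset of two OMOP CDM tables (person and drug_exposure); the real tables have many more columns. The helper names build_demo_db and exposures_for are hypothetical, and the drug concept IDs are placeholders rather than real vocabulary entries.

```python
import sqlite3

# Simplified subset of two OMOP CDM tables (assumption: most columns omitted).
SCHEMA = """
CREATE TABLE person (
    person_id         INTEGER PRIMARY KEY,
    gender_concept_id INTEGER NOT NULL,
    year_of_birth     INTEGER NOT NULL
);
CREATE TABLE drug_exposure (
    drug_exposure_id         INTEGER PRIMARY KEY,
    person_id                INTEGER NOT NULL REFERENCES person(person_id),
    drug_concept_id          INTEGER NOT NULL,
    drug_exposure_start_date TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database with one patient and two drug exposures."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # 8532 is the OMOP standard concept for female gender.
    conn.execute("INSERT INTO person VALUES (1, 8532, 1970)")
    # Drug concept IDs below are placeholders, not real vocabulary entries.
    conn.executemany(
        "INSERT INTO drug_exposure VALUES (?, ?, ?, ?)",
        [(10, 1, 1000001, "2020-01-15"), (11, 1, 1000002, "2020-03-02")],
    )
    return conn


def exposures_for(conn, person_id):
    """Return (drug_concept_id, start_date) pairs for one patient, oldest first."""
    rows = conn.execute(
        "SELECT drug_concept_id, drug_exposure_start_date FROM drug_exposure "
        "WHERE person_id = ? ORDER BY drug_exposure_start_date",
        (person_id,),
    )
    return rows.fetchall()
```

Every clinical table refers back to the patient through person_id, so collecting one patient's history comes down to filtering each table on that single key.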

Furthermore, the OMOP model includes the major commonly used ontologies, such as SNOMED, LOINC and RxNorm. All details of the model are described extensively on the OHDSI wiki pages.


To facilitate the actual ETL (Extract, Transform, Load) process of your data into this data model, OHDSI has developed a number of tools:

  • White Rabbit: take inventory of your source data (which tables and variables are present)
  • Rabbit in a Hat: document and visualize the mapping of the source data to the data model
  • Usagi: map concepts used in the source data to the ontologies (vocabularies), for instance for medication or treatments
  • Achilles: evaluate the loaded data and identify data quality issues

More information on these tools can be found on the OHDSI Wiki pages. The tools are available on GitHub.
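To give an idea of what taking inventory of source data involves, the sketch below lists the tables, columns and row counts of a small SQLite database. This is not White Rabbit itself, only an illustration of the idea; the inventory function and the example source tables (patients, prescriptions) are hypothetical.

```python
import sqlite3


def inventory(conn):
    """Return {table: {"rows": n, "columns": [names]}} for a SQLite database."""
    report = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        (n_rows,) = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
        report[table] = {"rows": n_rows, "columns": columns}
    return report


# Hypothetical source database, used only for illustration.
source = sqlite3.connect(":memory:")
source.executescript("""
CREATE TABLE patients (id INTEGER, sex TEXT, birth_year INTEGER);
CREATE TABLE prescriptions (patient_id INTEGER, drug_code TEXT, start_date TEXT);
INSERT INTO patients VALUES (1, 'F', 1970), (2, 'M', 1985);
INSERT INTO prescriptions VALUES (1, 'LOCAL-001', '2020-01-15');
""")

for table, info in inventory(source).items():
    print(table, info["rows"], info["columns"])
```

An overview like this is a useful starting point for a mapping workshop, because it shows which source tables and variables have to be accounted for in the target model.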

For the actual transformation of the data, you can choose any language and application you prefer (e.g. SQL, Python); none is prescribed. To perform the ETL you need extensive knowledge of your source data set and a good overview of the OMOP data model. Once you have transformed and loaded your data into the OMOP CDM database, it is easy to query. You can even share your analyses with other groups using OHDSI, or run them on their OMOP data sets, without sharing the underlying data.
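As an illustration of a single transformation step, the following Python sketch converts hypothetical source patient records into rows shaped like the OMOP person table. The source field names (id, sex, birth_date) and the function to_person_rows are assumptions made for this example; a real ETL covers many more tables, columns and edge cases.

```python
# Hypothetical source records; field names are assumptions for this example.
SOURCE_PATIENTS = [
    {"id": "P-001", "sex": "F", "birth_date": "1970-05-17"},
    {"id": "P-002", "sex": "M", "birth_date": "1985-11-02"},
    {"id": "P-003", "sex": "", "birth_date": "1992-01-30"},
]

# OMOP standard gender concepts; 0 is the conventional "no matching concept" value.
GENDER_CONCEPTS = {"F": 8532, "M": 8507}


def to_person_rows(source_rows):
    """Map source patient records to simplified OMOP person rows."""
    persons = []
    for person_id, row in enumerate(source_rows, start=1):
        persons.append({
            "person_id": person_id,                # new surrogate key
            "person_source_value": row["id"],      # keep the original identifier
            "gender_concept_id": GENDER_CONCEPTS.get(row["sex"].upper(), 0),
            "gender_source_value": row["sex"],     # keep the original code
            "year_of_birth": int(row["birth_date"][:4]),
        })
    return persons


for person in to_person_rows(SOURCE_PATIENTS):
    print(person)
```

Keeping the source values next to the mapped concept IDs makes the transformation traceable, which helps when reviewing unmapped or unexpected values later.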


Our services for OHDSI

We offer professional services and support at all the different stages of the process:

Data mapping

To standardize the data, OMOP has its own “standard” concepts which uniquely identify concepts from different ontologies. For example, medication concepts are drawn from the RxNorm vocabulary and conditions from the SNOMED vocabulary. We have extensive experience in mapping source data to these standard ontologies. We offer advice and support on the mapping of your source data.
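The mapping idea can be sketched in a few lines of Python: each local source code is looked up in a mapping table, and codes without a match receive concept ID 0 so they can be reviewed. The mapping table, the local codes and the concept IDs below are placeholders invented for this example, not real vocabulary entries, and map_codes is a hypothetical helper.

```python
from collections import Counter

# Hypothetical mapping from local drug codes to standard concept IDs (placeholders).
DRUG_CODE_TO_CONCEPT = {"LOCAL-001": 1000001, "LOCAL-002": 1000002}


def map_codes(source_codes, mapping):
    """Return (code, concept_id) pairs plus a count of codes that did not map."""
    # Concept ID 0 marks a source code without a standard concept.
    mapped = [(code, mapping.get(code, 0)) for code in source_codes]
    unmapped = Counter(code for code, concept_id in mapped if concept_id == 0)
    return mapped, unmapped


codes = ["LOCAL-001", "LOCAL-999", "LOCAL-002", "LOCAL-999"]
mapped, unmapped = map_codes(codes, DRUG_CODE_TO_CONCEPT)
print(mapped)
print(unmapped.most_common())
```

Counting the unmapped codes by frequency shows where additional mapping effort pays off most.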

Building your own ETL pipeline

We can help to transform your data to the OMOP Common Data Model. We will guide you through this process from start to end:

  • Analysing your source data
  • Mapping your data to the OMOP CDM in a workshop in which we define the transformation rules
  • Creating a reproducible mapping process
  • Defining a Quality Control summary
  • Creating clear documentation of the mapping process

Setting up a local database

To get the most from your OMOP data, we help with setting up your own local database and OHDSI tools. We have ample experience with installing and configuring the OHDSI tools, including ATLAS, Achilles and TxPath.

Additionally, we offer software engineering services to extend the functionality of the OHDSI tools to optimally support your analysis needs.


I am interested in OHDSI services

Read more about OHDSI use cases and blog posts