Knowledge graphs and semantic models for drug discovery and healthcare

Webinar hosted by Pistoia Alliance 


Data for drug discovery and healthcare is often trapped in silos which hampers effective interpretation and reuse. To remedy this, such data needs to be linked both internally and to external sources to make a FAIR data landscape which can power semantic models and knowledge graphs.


The IMI EHDEN project: large-scale analysis of observation data in Europe 

Talk at CDISC 2020 Europe Interchange Virtual Conference


The European Health Data & Evidence Network (EHDEN) project, funded via the Innovative Medicines Initiative (IMI), is the largest of its kind in Europe working in the domain of real-world data and real-world evidence (RWD/RWE). At its core are the standardisation of RWD via the Observational Medical Outcomes Partnership (OMOP) common data model (CDM), standardised analytics, and a sustainable research community for the coming decades. Between EHDEN and OHDSI there are potentially several use cases on the boundary between clinical trials and observational data, and we look forward to discussing these with the CDISC community.


Reuse of R&D data and the promise of FAIR data lakes

Talk at BioData World Congress 2019 


At the BioData World Congress in Basel in December 2019, Kees van Bochove, founder of The Hyve, gave a talk on the reuse of pharma R&D data and on strategies for operationalizing FAIR data at scale.


How 2019 became the year FAIR landed in biopharmaceutical R&D

Keynote at Proventa International’s Bioinformatics East Coast Strategy Meeting 2019 and Talk at Pharmaceutical IT & Data Congress 2019


  • Fairspace: a new cloud service to enable collaborative science
  • Implementation of FAIR in practice: which of the 15 principles to start with, and what is the ROI?
  • Common Data Models: OMOP/OHDSI, i2b2/tranSMART, CDISC, FHIR, etc.: how do they relate, and which one to choose?


Large scale observational clinical research with OHDSI 

Talk at i2b2 tranSMART Tübingen Symposium 2019

With 200 researchers from 25 countries and half a billion unique patients, OHDSI carries out federated studies at sufficient scale to answer questions about diagnosis and treatment. At the heart of the OHDSI platform is the OMOP Common Data Model, currently at v6, around which a toolset is built for carrying out reusable, repeatable and reproducible observational clinical research on a large scale.
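
As a rough illustration of why a common data model makes federated studies possible, the sketch below runs one identical query against two separate in-memory databases and combines only the aggregate counts. It is a minimal sketch under stated assumptions, not OHDSI tooling: the `person` and `condition_occurrence` tables are cut down to a few OMOP-style columns, and the site names, helper functions and concept ID 1001 are all made up for this example.

```python
import sqlite3

# Made-up concept ID used purely for illustration (not a real OMOP concept).
EXAMPLE_CONCEPT_ID = 1001

# The same query text is sent to every site; this only works because
# each site stores its data in the same table and column layout.
COHORT_COUNT_SQL = """
SELECT COUNT(DISTINCT p.person_id)
FROM person AS p
JOIN condition_occurrence AS c ON c.person_id = p.person_id
WHERE c.condition_concept_id = ?
"""


def make_site(persons, conditions):
    """Build an in-memory database with two heavily simplified OMOP-style tables."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE person (person_id INTEGER PRIMARY KEY, year_of_birth INTEGER)"
    )
    con.execute(
        "CREATE TABLE condition_occurrence ("
        "person_id INTEGER, condition_concept_id INTEGER, condition_start_date TEXT)"
    )
    con.executemany("INSERT INTO person VALUES (?, ?)", persons)
    con.executemany("INSERT INTO condition_occurrence VALUES (?, ?, ?)", conditions)
    return con


def count_patients(con, concept_id):
    """Count distinct patients at one site with at least one matching condition record."""
    return con.execute(COHORT_COUNT_SQL, (concept_id,)).fetchone()[0]


def federated_count(sites, concept_id):
    """Run the query locally at each site and share only the aggregate counts."""
    per_site = {name: count_patients(con, concept_id) for name, con in sites.items()}
    return per_site, sum(per_site.values())


# Two hypothetical sites with toy data; patient-level rows never leave a site.
sites = {
    "site_a": make_site(
        [(1, 1950), (2, 1960), (3, 1970)],
        [(1, 1001, "2018-01-05"), (1, 1001, "2019-03-01"), (2, 2002, "2017-06-10")],
    ),
    "site_b": make_site(
        [(10, 1980), (11, 1990)],
        [(10, 1001, "2016-02-02"), (11, 1001, "2016-09-09")],
    ),
}

per_site, total = federated_count(sites, EXAMPLE_CONCEPT_ID)
print(per_site, total)
```

Only the per-site counts are combined here. A real network study involves far more, such as vocabulary mapping, cohort definitions and statistical analysis.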


Overview of the features and architecture of Glowing Bear and tranSMART

Talk at i2b2 tranSMART Tübingen Symposium 2019

Glowing Bear is a cohort selection user interface for the tranSMART clinical data warehouse. In recent years, features for several use cases have been added: time series data, standard ontologies, family relations, and sample-level lab data. Meanwhile, the structure of the platform has been transformed to be more modular and maintainable. We give an overview of the added features and the changes to the data model and architecture.


Easy and secure deployment of Glowing Bear and tranSMART

Talk at i2b2 tranSMART Tübingen Symposium 2019

Deployment of tranSMART and all its dependencies used to be a complex task, mainly because of the number of dependencies, versions and configuration options involved. With the new structure of the platform, dockerization of all its components and a main compose script, it is not only faster to deploy everything, but also easier to manage the configuration, ensure security and monitor the components.


Building ETL pipelines for tranSMART 17.X – New tools for the data loader

Talk at i2b2 tranSMART Tübingen Symposium 2019

An overview of tools for loading data into tranSMART 17.X, for use in Jupyter Notebook and in automated ETL pipelines.


Tackling the Clinical Data Challenges When Analyzing a Million Genomes 

Talk at BioIT World 2019 

Population genetics and genomics is an emerging topic for the application of machine learning methods in healthcare and biomedical sciences. Currently, several large genomics initiatives […] are all in the process of making both clinical and genomics data available from large numbers of patients to benefit biomedical research. However, a key challenge in these initiatives is the standardization of the clinical and outcomes data in such a way that machine learning methods can be effectively trained to discover useful medical and scientific insights. 


Applying the OMOP data model & OHDSI software to national European health data registries: the IMI EMIF project

Talk at SCOPE Summit 2017 – Real World Data track


OHDSI (Observational Health Data Sciences and Informatics) is a large open source initiative for the standardisation and epidemiological analysis of real-world data. OHDSI leverages the OMOP common data model for observational data and provides data analysis tools for a broad range of use cases. This talk explains OMOP and OHDSI using the IMI EMIF project as a case study, in which health data from over 50 million patients from 13 national and regional European registries is brought together.