tranSMART, an open source platform for data storage, exchange and hypothesis testing, was originally developed by Janssen R&D, which subsequently open sourced it. It has been steadily gaining adoption as a data warehousing solution ever since.
This knowledge management platform aims to enable scientists to develop and refine research questions by investigating correlations between genetic and phenotypic data. However, a few missing functionalities in the earlier versions of tranSMART (version 16.2 and before) restricted the capabilities of the tool and the opportunities for research. Researchers reported that they especially missed support for modeling absolute and relative time series, support for multiple samples per patient, and the ability to ask questions about variables across multiple studies. They also requested large-scale file storage and analysis.
The tranSMART Foundation brought together four top-10 pharmaceutical companies (Pfizer, Sanofi, AbbVie and Roche) to sponsor a joint project that would develop the new functionality in a coherent way and improve tranSMART. The sponsors focused on adding three main functionalities:
- Cross-study and ontology term support.
- Support for modeling time series and sample data, to allow the storage of longitudinal and EHR data.
- A connection between tranSMART and Arvados, to support large data storage and analysis.
In addition to developing the new functionality, the project aimed to increase the quality of the tranSMART back-end to make it ready for the future.
The focus of the project was to implement the functionality in the tranSMART REST API first, and adjust the user interface later. However, backwards compatibility with the user interface and a migration path for the database were key requirements.
To meet the project objectives, The Hyve designed and developed a new version of the tranSMART back-end. The results of the development project include:
- An adapted data model that allows for better support of cross-study and longitudinal data;
- A new query language that supports all required cross-study and longitudinal queries and also performs better;
- API calls for linking files between tranSMART and Arvados and for starting analysis workflows on them;
- An upgrade of tranSMART to the latest version of the Grails framework;
- Simplification of the tranSMART installation procedure;
- Expansion of the automated testing suite so that it can now be run with each installation;
- Expansion of the tranSMART technical documentation (including database documentation, an installation guide and code documentation that updates automatically).
Overview of the tranSMART 17.1 data model aligned with the i2b2 data model.
The back-end improvements implemented in the development project, delivered early in 2017, had a large impact on the capabilities of tranSMART, as well as its quality.
With the updated data modelling functionality for time series, samples and ontologies, tranSMART 17.1 can now elegantly model EHR and other observational data. These changes have also brought the platform much closer to the i2b2 clinical data platform, on which it was originally built.
The new REST API version 2, which exposes all this functionality, has paved the way for an ecosystem of applications on top of tranSMART. This is already shown by the development of the new user interface Glowing Bear, the tranSMART Python client and the analysis platform Fractalis, which will succeed the SmartR analyses.
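To give a feel for what a cross-study question against the REST API could look like, here is a minimal sketch in Python. It only builds a JSON constraint and a request URL; it does not contact a server. The endpoint path `/v2/observations` and the constraint field names (`type`, `studyId`, `conceptCode`, `args`) are assumptions modeled on the style of the version 2 API and should be checked against the API documentation. The host name, study identifiers and concept code are hypothetical.

```python
import json
from urllib.parse import urlencode


def study(study_id):
    """Constraint selecting observations from one study (assumed field names)."""
    return {"type": "study_name", "studyId": study_id}


def concept(concept_code):
    """Constraint selecting observations for one concept (assumed field names)."""
    return {"type": "concept", "conceptCode": concept_code}


def any_of(*constraints):
    """Combine constraints so that at least one must match."""
    return {"type": "or", "args": list(constraints)}


def all_of(*constraints):
    """Combine constraints so that every one must match."""
    return {"type": "and", "args": list(constraints)}


def observations_url(base_url, constraint):
    """Build a GET URL carrying the constraint as a URL-encoded JSON parameter."""
    query = urlencode({"constraint": json.dumps(constraint, separators=(",", ":"))})
    return f"{base_url.rstrip('/')}/v2/observations?{query}"


# Hypothetical question: heart rate observations from either of two studies.
constraint = all_of(
    concept("HEART_RATE"),
    any_of(study("STUDY_A"), study("STUDY_B")),
)
url = observations_url("https://transmart.example.org", constraint)
print(url)
```

Composing small constraint objects in this way is what lets a single query span several studies: the study restriction becomes just another term in the expression, rather than a fixed scope chosen up front.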
The project has also made it a lot easier to contribute to and install the tranSMART platform. The new tranSMART 2017 Server code has already been released in beta on the tranSMART Foundation GitHub and is being installed at several of our clients.
If you want to know more about the tranSMART 17.1 project or recent developments on the tool, please contact us.