cBioPortal setup made easy using Docker (Updated 2019)

Diana Baiceanu

01-11-2019

The easiest way to install and run cBioPortal is with Docker. The commands below install and configure cBioPortal together with all its dependencies and prerequisites.

Once you have Docker installed (installation steps for Ubuntu), please make sure you have:

Dedicated at least 4GB of memory to Docker.
Added yourself to the docker group after installation, so that you can run docker as a regular user without sudo:

$ sudo groupadd docker
$ sudo usermod -aG docker $USER
$ newgrp docker
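
Both prerequisites can also be checked from a script. The following is a minimal sketch, not part of the official setup: the function names are hypothetical, and the threshold of 4,000,000,000 bytes is an assumption chosen because Docker typically reports slightly less memory than the amount configured.

```shell
# Assumed threshold: 4 GB in decimal bytes (Docker usually reports a bit
# less than the configured amount, so a strict 4 GiB check could fail).
MIN_BYTES=4000000000

# mem_ok BYTES: succeed if BYTES is at least MIN_BYTES.
mem_ok() {
  [ "$1" -ge "$MIN_BYTES" ]
}

# check_docker_memory: read the total memory visible to the Docker daemon
# and compare it with the threshold.
check_docker_memory() {
  bytes=$(docker info --format '{{.MemTotal}}') || return 1
  if mem_ok "$bytes"; then
    echo "OK: Docker reports $bytes bytes of memory"
  else
    echo "Docker reports only $bytes bytes; at least $MIN_BYTES expected" >&2
    return 1
  fi
}

# in_docker_group: succeed if the current user belongs to the docker group.
in_docker_group() {
  id -nG | tr ' ' '\n' | grep -qx docker
}
```

Calling check_docker_memory and in_docker_group before step 1 gives an early warning if either prerequisite is missing.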

#1) Download the seed database and schema to a local folder named seedDB:

$ wget https://github.com/cBioPortal/datahub/raw/master/seedDB/seed-cbioportal_hg19_v2.7.3.sql.gz -P ./seedDB

$ wget https://raw.githubusercontent.com/cBioPortal/cbioportal/v2.0.0/db-scripts/src/main/resources/cgds.sql -P ./seedDB
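
An interrupted download only shows up later as a confusing import error, so it can help to verify the two files first. This is a minimal sketch under the assumption that the file names match the wget commands above; check_seed_files is a hypothetical helper name.

```shell
# check_seed_files DIR: succeed if both files from step 1 exist, are not
# empty, and the gzip archive passes an integrity test.
check_seed_files() {
  dir=$1
  seed="$dir/seed-cbioportal_hg19_v2.7.3.sql.gz"
  schema="$dir/cgds.sql"
  [ -s "$schema" ] || { echo "missing or empty: $schema" >&2; return 1; }
  [ -s "$seed" ]   || { echo "missing or empty: $seed" >&2; return 1; }
  gzip -t "$seed"  || { echo "corrupt archive: $seed" >&2; return 1; }
  echo "seed files look complete"
}

# Example call, assuming the files were downloaded to ./seedDB:
#   check_seed_files ./seedDB
```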

#2) Create a network:

$ docker network create cbio-net

#3) Start a MySQL server. Make sure the folder db_files does not exist yet or is empty:

$ docker run -d --restart=always \
    --name=cbioDB \
    --net=cbio-net \
    -e MYSQL_ROOT_PASSWORD='P@ssword1' \
    -e MYSQL_USER=cbio \
    -e MYSQL_PASSWORD='P@ssword1' \
    -e MYSQL_DATABASE=cbioportal \
    -v $PWD/db_files/:/var/lib/mysql/ \
    mysql:5.7
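
On first start the MySQL container needs some time to initialise its data directory, and the import in the next step fails if the server is not yet accepting connections. A minimal sketch of a readiness check follows; wait_for_mysql is a hypothetical helper, and the container name, user and password are those used in step 3.

```shell
# wait_for_mysql [TRIES] [DELAY]: poll the cbioDB container until the server
# answers a ping over TCP, or give up after TRIES attempts.
wait_for_mysql() {
  tries=${1:-30}
  delay=${2:-2}
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -h127.0.0.1 forces a TCP connection; the temporary server used during
    # first-time initialisation listens on the socket only, so it is skipped.
    if docker exec cbioDB mysqladmin ping -h127.0.0.1 -ucbio -p'P@ssword1' --silent >/dev/null 2>&1; then
      echo "MySQL is ready"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "MySQL did not become ready after $tries attempts" >&2
  return 1
}

# Example call: up to 30 attempts, 2 seconds apart.
#   wait_for_mysql 30 2
```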

#4) Upload the seed database to the server. This command may take a few minutes.

$ docker run \
    --rm -it \
    --name=load-seeddb \
    --net=cbio-net \
    -e MYSQL_USER=cbio \
    -e MYSQL_PASSWORD='P@ssword1' \
    -v $PWD/seedDB/cgds.sql:/mnt/cgds.sql:ro \
    -v $PWD/seedDB/seed-cbioportal_hg19_v2.7.3.sql.gz:/mnt/seed.sql.gz:ro \
    mysql:5.7 \
    sh -c 'cat /mnt/cgds.sql | mysql -hcbioDB -u"$MYSQL_USER" -p"$MYSQL_PASSWORD" cbioportal \
    && zcat /mnt/seed.sql.gz | mysql -hcbioDB -u"$MYSQL_USER" -p"$MYSQL_PASSWORD" cbioportal'
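
To confirm that the import created the schema, the number of tables in the cbioportal database can be counted. This is a sketch only: count_tables and seed_loaded are hypothetical helpers, and they reuse the network, host, user and password from the steps above.

```shell
# count_tables: print the number of tables in the cbioportal database,
# using a throwaway MySQL client container on the same network.
count_tables() {
  docker run --rm --net=cbio-net mysql:5.7 \
    mysql -hcbioDB -ucbio -p'P@ssword1' -N -B \
    -e "SELECT COUNT(*) FROM information_schema.tables WHERE table_schema='cbioportal'"
}

# seed_loaded: succeed if at least one table exists after the import.
seed_loaded() {
  tables=$(count_tables 2>/dev/null) || return 1
  [ "$tables" -gt 0 ] || return 1
  echo "cbioportal database contains $tables tables"
}
```

A count of zero, or a connection error, suggests that step 4 should be repeated.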

#5) Save the portal.properties.EXAMPLE file to the current directory, rename it to portal.properties and edit it as follows:

db.user=cbio

db.password=P@ssword1

db.host=cbioDB

db.portal_db_name=cbioportal

db.connection_string=jdbc:mysql://cbioDB/
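
These edits can also be applied from a script instead of a text editor. The sketch below uses a hypothetical set_prop helper that replaces any existing uncommented line for a key and appends the new value. It assumes simple key=value lines and does not touch commented-out entries.

```shell
# set_prop FILE KEY VALUE: drop existing "KEY=..." lines and append the
# new setting at the end of FILE.
set_prop() {
  file=$1
  key=$2
  value=$3
  tmp=$(mktemp) || return 1
  # Keep every line that does not start with the literal text "KEY=".
  awk -v prefix="$key=" 'index($0, prefix) != 1' "$file" > "$tmp" || return 1
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Write back through cat so the original file keeps its permissions.
  cat "$tmp" > "$file" && rm -f "$tmp"
}

# Example, using the values from step 5:
#   cp portal.properties.EXAMPLE portal.properties
#   set_prop portal.properties db.user cbio
#   set_prop portal.properties db.password 'P@ssword1'
#   set_prop portal.properties db.host cbioDB
#   set_prop portal.properties db.portal_db_name cbioportal
#   set_prop portal.properties db.connection_string jdbc:mysql://cbioDB/
```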

#6) Migrate the database to the latest version:

$ docker run --rm -it --net cbio-net \
    -v $PWD/portal.properties:/cbioportal/portal.properties:ro \
    cbioportal/cbioportal:latest \
    migrate_db.py -p /cbioportal/portal.properties -s /cbioportal/db-scripts/src/main/resources/migration.sql

#7) Start a MongoDB container and run a session service container that connects to it:

$ docker run -d --name=mongoDB --net=cbio-net \
    -e MONGO_INITDB_DATABASE=session_service \
    mongo:3.6.6

$ docker run -d --name=cbio-session-service --net=cbio-net \
    -e SERVER_PORT=5000 \
    -e JAVA_OPTS="-Dspring.data.mongodb.uri=mongodb://mongoDB:27017/session-service" \
    cbioportal/session-service:latest

#8) Run the cBioPortal server:

$ docker run -d --restart=always \
    --name=cbioportal-container \
    --net=cbio-net \
    -v $PWD/portal.properties:/cbioportal/portal.properties:ro \
    -e JAVA_OPTS='-Xms2g -Xmx4g -Dauthenticate=noauthsessionservice -Dsession.service.url=http://cbio-session-service:5000/api/sessions/my_portal/' \
    -p 8081:8080 \
    cbioportal/cbioportal:latest \
    /bin/sh -c 'java ${JAVA_OPTS} -jar webapp-runner.jar /cbioportal-webapp'

cBioPortal can now be reached at http://localhost:8081/

The status of the Docker containers can be checked with:

$ docker ps -a
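
For a quicker overview than the full docker ps listing, the four long-running containers started above can be checked by name. This is a minimal sketch; is_running and report_containers are hypothetical helpers, and the container names are the ones passed to --name in steps 3, 7 and 8.

```shell
# Containers started in steps 3, 7 and 8.
EXPECTED="cbioDB mongoDB cbio-session-service cbioportal-container"

# is_running NAME: succeed if Docker reports the container as running.
is_running() {
  [ "$(docker inspect -f '{{.State.Running}}' "$1" 2>/dev/null)" = "true" ]
}

# report_containers: print one status line per expected container and
# return non-zero if any of them is not running.
report_containers() {
  status=0
  for name in $EXPECTED; do
    if is_running "$name"; then
      echo "$name: running"
    else
      echo "$name: NOT running"
      status=1
    fi
  done
  return "$status"
}
```

If a container is reported as not running, docker logs followed by its name usually shows the reason.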

The next step is to load a study into cBioPortal. For this you can follow the instructions in our blog post on simple data loading into cBioPortal with Docker, which uses a public TCGA study from Datahub.

Leave us a message there if you run into trouble. To get in touch with us for (commercial) support, send us a message at marketing@thehyve.nl or fill in this contact form.
