New cBioPortal Feature: orientation-aware OQL filter for gene fusion events

Identifying structural variants, such as gene fusions, plays a crucial role in cancer genomics, as they can drive the development and progression of various cancers. Gene fusions, where parts of two different genes combine to form a new hybrid gene, are often the result of chromosomal rearrangements such as translocations, deletions, or inversions. These genetic alterations often lead to abnormal protein function, which can disrupt cellular processes and promote cancer cell growth and disease progression. Understanding and identifying these events is essential for developing targeted therapies and improving diagnostic accuracy.

cBioPortal supports gene fusion data by providing visualizations and detailed annotations for these fusion events in the OncoPrint tab in the Results View. Users can identify which genes are involved in fusions across different cancer types and understand the frequency and distribution of these fusions in selected studies. The Onco Query Language (OQL) also supports querying for fusion events for a given gene across one or multiple studies. The basic OQL notation that exists in cBioPortal is as follows:

GeneID: FUSION

The above query searches for all reported gene fusion events for a gene across the selected study or studies. Let’s look at the result of such a query in the MSK-IMPACT Clinical Sequencing Cohort study in cBioPortal. In this study, TMPRSS2 is the gene most commonly involved in structural variants in prostate cancer tumor samples. We first submit a query (TMPRSS2: FUSION) for all fusion events involving the TMPRSS2 gene in prostate cancer.

Figure 1: OncoPrint tab in the Results View for the query: all fusion events detected for TMPRSS2 gene in prostate cancer.

Following the query above, we can see in the OncoPrint tab (Figure 1) that in 29% of the samples profiled for gene fusions, TMPRSS2 is fused with another gene in prostate cancer. However, this simple query notation does not show how often TMPRSS2 is in the upstream or the downstream position of the fusion. The most frequent fusion partners are also only visible when hovering over the samples in the TMPRSS2: FUSION track. This was a limitation for users specifically interested in researching fusion events. To answer the need for a more detailed look at gene fusion events in cBioPortal, our team at The Hyve has developed new OQL functionality.

A new way to query for gene fusions in cBioPortal

The new update to the OQL introduces a syntax to specify the orientations of any gene pair involved in the fusion event. These enhancements allow for a more precise and detailed exploration of fusion events. Table 1 summarizes the new syntax.

Table 1: The newly developed OQL support syntax for querying gene fusion events.

Now, let’s take another look at our example from before using the newly developed OQL support. To find fusions where TMPRSS2 is specifically the upstream or downstream gene, we add “TMPRSS2::” and “::TMPRSS2” to our query, respectively. After modifying the query, we see two new tracks added to the OncoPrint tab for the specific fusion events. The results make it clear that the TMPRSS2 gene is more often found in the downstream position of the gene fusion events (Figure 2).

Figure 2: The TMPRSS2 gene is detected in the downstream position in 26% of the samples profiled for structural variants.

What about other genes involved in fusion events with TMPRSS2? In prostate cancer, the ERG:TMPRSS2 fusion has been identified as a potential predictive marker [1]. Using the new OQL support developed by The Hyve, we can easily query for fusion events involving these two genes in a specific order. For fusions where ERG is the downstream gene, we add “TMPRSS2::ERG”, and for fusions where TMPRSS2 is the downstream gene, we add “ERG::TMPRSS2” to our query.
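To make the orientation semantics concrete, here is a minimal Python sketch of how a pattern written as upstream::downstream could be matched against fusion records, with an empty side meaning "any gene". It is an illustration only, not cBioPortal's implementation: the `Fusion` record, the helper functions, and the sample data are all hypothetical, and the exact text accepted by the query box is the one defined in Table 1.

```python
from typing import List, NamedTuple, Optional, Tuple


class Fusion(NamedTuple):
    """Hypothetical fusion record; field names are assumptions for this sketch."""
    sample: str
    upstream: str
    downstream: str


def parse_pattern(pattern: str) -> Tuple[Optional[str], Optional[str]]:
    """Split 'UP::DOWN' into (upstream, downstream); an empty side means any gene."""
    up, sep, down = pattern.partition("::")
    if not sep:
        raise ValueError(f"pattern must contain '::': {pattern!r}")
    return (up or None, down or None)


def matches(fusion: Fusion, pattern: str) -> bool:
    """True if the fusion has the requested gene(s) in the requested position(s)."""
    up, down = parse_pattern(pattern)
    return (up is None or fusion.upstream == up) and (
        down is None or fusion.downstream == down
    )


def samples_for(pattern: str, fusions: List[Fusion]) -> List[str]:
    """Sorted sample IDs with at least one fusion matching the pattern."""
    return sorted({f.sample for f in fusions if matches(f, pattern)})


# Made-up records, purely to exercise the four patterns quoted in the text.
FUSIONS = [
    Fusion("S1", "TMPRSS2", "ERG"),
    Fusion("S2", "ERG", "TMPRSS2"),
    Fusion("S3", "TMPRSS2", "ETV1"),
    Fusion("S4", "ERG", "TMPRSS2"),
]

for pattern in ("TMPRSS2::", "::TMPRSS2", "TMPRSS2::ERG", "ERG::TMPRSS2"):
    print(pattern, samples_for(pattern, FUSIONS))
```

With this toy data, "TMPRSS2::" selects S1 and S3, "::TMPRSS2" selects S2 and S4, "TMPRSS2::ERG" selects only S1, and "ERG::TMPRSS2" selects S2 and S4, which mirrors how each pattern produces its own track in the OncoPrint.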

Figure 3: Most putative driver fusion events are between ERG and TMPRSS2, where the latter is in the downstream position.

The OncoPrint tab now shows us a detailed breakdown of the fusion events for TMPRSS2 (Figure 3). Among the fusions where TMPRSS2 is the upstream gene, the fusion partner is ERG only 1% of the time. Which other genes are involved in putative driver fusions with TMPRSS2? We can quickly answer this question by evaluating the putative driver alterations (dark purple) that do not overlap between the TMPRSS2:: and TMPRSS2::ERG tracks. Hovering over the samples, one can see that these genes are all part of the ETS Variant Transcription Factors family. The results also show that all ERG::TMPRSS2 fusions result in putative driver events in this study cohort.
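The "tracks that do not overlap" comparison above amounts to a set difference on samples. The sketch below shows that reasoning in Python on made-up rows; the tuple layout, the function name, and every sample ID and partner gene in `EVENTS` are hypothetical placeholders, not values from the MSK-IMPACT cohort.

```python
from collections import Counter

# Hypothetical rows: (sample_id, upstream_gene, downstream_gene, is_putative_driver).
EVENTS = [
    ("S1", "TMPRSS2", "ERG", True),
    ("S2", "TMPRSS2", "ETV1", True),
    ("S3", "TMPRSS2", "ETV4", True),
    ("S4", "TMPRSS2", "ETV1", True),
    ("S5", "ERG", "TMPRSS2", True),
    ("S6", "TMPRSS2", "GENE_X", False),  # not a putative driver, so ignored
]


def other_driver_partners(events, gene="TMPRSS2", excluded_partner="ERG"):
    """Count downstream partners of `gene` in driver fusions from samples
    that appear in the 'gene::' track but not in the 'gene::excluded_partner' track."""
    upstream_track = {s for s, up, _, drv in events if drv and up == gene}
    pair_track = {
        s for s, up, down, drv in events
        if drv and up == gene and down == excluded_partner
    }
    only_upstream = upstream_track - pair_track  # samples without the excluded pair
    return Counter(
        down for s, up, down, drv in events
        if drv and up == gene and s in only_upstream
    )


print(other_driver_partners(EVENTS).most_common())
```

In the portal the same answer comes from hovering over the non-overlapping samples; the sketch only spells out the logic behind that visual comparison.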

Figure 4: The new structural variants table with positional information (right) can be added to the Study View using the option skin.study_view.show_sv_table=true in local cBioPortal instances.

This valuable information was extracted within minutes, thanks to the new OQL syntax for querying gene fusions in cBioPortal. This public feature lets you ask precise, orientation-aware questions about fusion events in your studies. To enhance data analysis further, local instances can display a fusion summary table with positional information in the Study View (Figure 4). To enable it, add the option skin.study_view.show_sv_table=true to the application.properties file.
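For deployments that manage configuration with scripts, the small Python sketch below sets this option in the text of a properties file without duplicating it if it is already present. Only the property name and value come from this article; the helper name is hypothetical, and the parsing is deliberately simplified (it assumes plain key=value lines and does not handle the other separators or line continuations that Java properties files allow).

```python
SV_TABLE_KEY = "skin.study_view.show_sv_table"


def enable_sv_table(properties_text: str) -> str:
    """Return the properties text with skin.study_view.show_sv_table set to true."""
    out = []
    found = False
    for line in properties_text.splitlines():
        stripped = line.strip()
        is_comment = stripped.startswith(("#", "!"))
        key = stripped.split("=", 1)[0].strip()
        if not is_comment and key == SV_TABLE_KEY:
            # Overwrite an existing setting instead of adding a second one.
            out.append(f"{SV_TABLE_KEY}=true")
            found = True
        else:
            out.append(line)
    if not found:
        out.append(f"{SV_TABLE_KEY}=true")
    return "\n".join(out) + "\n"


# Example with a made-up existing property.
print(enable_sv_table("some.other.option=value\n"), end="")
```

Reading application.properties, passing its contents through this function, and writing the result back would be the typical use; the instance usually needs a restart before the change takes effect.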

The Hyve provides services to develop, extend, and improve features in cBioPortal. This includes developing tailor-made filtering and visualization tools within cBioPortal, such as the orientation-aware OQL for structural variants. If you have a desired feature in mind to improve your cancer genomics data exploration with cBioPortal, feel free to contact us.

[1] Lorenzin F, Demichelis F. Past, Current, and Future Strategies to Target ERG Fusion-Positive Prostate Cancer. Cancers (Basel). 2022 Feb 22;14(5):1118. doi: 10.3390/cancers14051118
