Advances in molecular sequencing and profiling technology have generated enormous amounts of genomic, transcriptomic, proteomic, and metabolomic data in cancer research. However, analyzing these complex omics datasets to separate genuine biomarkers from the effects of potential confounding variables remains a significant challenge.
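Researchers at Beth Israel Deaconess Medical Center (BIDMC) recently explored whether a statistical technique, propensity score matching (PSM), can reduce the influence of confounding factors in omics analysis. PSM pairs cases from the groups being compared that have similar estimated probabilities of group membership given their measured covariates, so that the matched groups are balanced on those covariates. Their findings were published in the journal PLOS ONE.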
The team analyzed two colorectal cancer (CRC) datasets derived from archived tissue samples. They divided the cancer cases into favorable and unfavorable prognosis groups based on patient survival information, then compared the groups to uncover prognostic protein or transcript biomarkers.
The first dataset consisted of proteomic expression profiles from 544 surgically resected CRC tissues stored in the pathology archives of a major academic medical center. This cohort included 367 patients who survived more than three years without recurrence (favorable outcome group) and 60 patients who had recurrence within three years (unfavorable outcome group). The second dataset comprised RNA sequencing profiles of 163 CRC cases, with 130 cases in the favorable prognosis group and 33 cases in the unfavorable prognosis group.
The study’s results suggest that PSM could emerge as an efficient and cost-effective strategy for multiomic data analysis and clinical trial design in biomarker discovery. By minimizing confounding factors, PSM could help researchers identify more accurate and reliable biomarkers for cancer diagnosis and treatment.
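For readers unfamiliar with the technique, the general idea behind PSM can be sketched in a few lines of Python. This is a minimal illustration on synthetic data, not the study's pipeline: the article does not say which covariates or matching algorithm the team used, so the two covariates (a standardized age and a late-stage flag), the 0.1 caliper, the greedy 1:1 nearest-neighbor matching, and all function names here are assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_propensity(X, t, lr=0.5, epochs=500):
    """Estimate P(group = 1 | covariates) with logistic regression
    fitted by plain batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, ti in zip(X, t):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) - ti
            grad_b += err
            for j in range(d):
                grad_w[j] += err * x[j]
        b -= lr * grad_b / n
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) for x in X]


def match_pairs(scores, t, caliper=0.1):
    """Greedy 1:1 nearest-neighbor matching without replacement.
    Each group-1 case is paired with the unused group-0 case whose
    propensity score is closest, if the gap is within the caliper."""
    cases = [i for i, ti in enumerate(t) if ti == 1]
    pool = [i for i, ti in enumerate(t) if ti == 0]
    pairs = []
    for i in cases:
        if not pool:
            break
        j = min(pool, key=lambda c: abs(scores[c] - scores[i]))
        if abs(scores[j] - scores[i]) <= caliper:
            pairs.append((i, j))
            pool.remove(j)
    return pairs


# Synthetic cohort (hypothetical): group 1 = unfavorable prognosis.
random.seed(0)
X, t = [], []
for _ in range(200):
    age_z = random.gauss(0.0, 1.0)                        # standardized age
    late_stage = 1.0 if random.random() < 0.4 else 0.0    # late-stage flag
    p_unfav = sigmoid(-1.5 + 0.8 * age_z + 1.0 * late_stage)
    X.append([age_z, late_stage])
    t.append(1 if random.random() < p_unfav else 0)

scores = fit_propensity(X, t)
pairs = match_pairs(scores, t)
print(f"{sum(t)} unfavorable cases, {len(pairs)} matched pairs")
```

In a workflow of this kind, any downstream comparison of protein or transcript levels would be restricted to the matched pairs, so that differences between the groups are less likely to reflect the covariates used for matching.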
