GitHub - Nizarassad/Pan-Cancer-Analysis: The Cancer Genome Atlas Pan-Cancer analysis project

The Cancer Genome Atlas Pan-Cancer Analysis Project

Pan-cancer analysis involves assessing frequently mutated genes and other genomic abnormalities common to many different cancers, regardless of tumor origin. Using next-generation sequencing (NGS), pan-tumor projects such as The Cancer Genome Atlas (TCGA) have made significant contributions to our understanding of DNA and RNA variants across many cancer types.

Microscopic view of pan cancer cells: highlighting the diverse morphology of cancerous tissues

The objective of this project is to utilize regression techniques to predict a continuous value using data from The Cancer Genome Atlas (TCGA) Pan-Cancer analysis project.

Goal and Aims:

The goal of this project is to train regression models using this data to predict a continuous value representing a specific molecular characteristic of cancers. The aims are the following:

Study the genomes of various cancers to better understand their molecular characteristics.
Assist in developing new treatment strategies.

Dataset

The data, collected from different types of tumors, can be downloaded from the two links below:

Models

This project implements four different regression models using cancer genome data:

Simple Linear Regression (SLR)
Multiple Linear Regression (MLR)
Ridge Regression (RR)
Lasso Regression (LR)
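
As a minimal sketch of how these four models could be fitted, the example below uses scikit-learn on synthetic data. The synthetic dataset, the choice of the first column as the single predictor, and the `alpha` values are all assumptions made for illustration; they are not taken from `main.py` or from the TCGA data.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression, Ridge

# Synthetic stand-in for the feature matrix and continuous target
# (assumption: the real TCGA data is not reproduced here).
X, y = make_regression(
    n_samples=200, n_features=10, n_informative=5, noise=5.0, random_state=0
)

# Each entry pairs an estimator with the features it is trained on.
# Simple linear regression uses a single predictor; the others use all of them.
# The alpha values are illustrative, not tuned.
models = {
    "Simple Linear Regression": (LinearRegression(), X[:, [0]]),
    "Multiple Linear Regression": (LinearRegression(), X),
    "Ridge Regression": (Ridge(alpha=1.0), X),
    "Lasso Regression": (Lasso(alpha=0.1), X),
}

fitted = {}
for name, (model, features) in models.items():
    fitted[name] = model.fit(features, y)
    print(f"{name}: trained on {features.shape[1]} feature(s)")
```

Ridge and Lasso differ from ordinary least squares only in the penalty added to the loss (L2 and L1 respectively), which is why the same `fit` interface applies to all four.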

Evaluation

The models are evaluated using three measures: the model score, a cross-validation score, and mean squared error (MSE).

Model	Score	Cross-validation	Mean Squared Error
Simple Linear Regression	0.98	0.98	0.05
Multiple Linear Regression	0.98	0.98	0.04
Ridge Regression	0.97	0.96	0.05
Lasso Regression	0.92	0.94	0.07

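Simple and multiple linear regression perform comparably (score 0.98 for both; MSE 0.05 and 0.04). Ridge regression is slightly lower (score 0.97, cross-validation 0.96), and Lasso regression is lowest on all three measures (score 0.92, cross-validation 0.94, MSE 0.07).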
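
The three measures in the table could be computed along the following lines. This is a hedged sketch on synthetic data: it assumes the "Score" column is the estimator's default score (R² for scikit-learn regressors) and that cross-validation uses 5 folds, neither of which the README states. Ridge regression is used only as an example estimator.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in data (assumption: not the TCGA dataset).
X, y = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = Ridge(alpha=1.0).fit(X_train, y_train)

# Default score of a scikit-learn regressor is R^2, here on the held-out split.
score = model.score(X_test, y_test)

# 5-fold cross-validation on the training split; one R^2 value per fold.
cv_scores = cross_val_score(Ridge(alpha=1.0), X_train, y_train, cv=5)

# Mean squared error on the held-out split, in squared units of the target.
mse = mean_squared_error(y_test, model.predict(X_test))

print(f"score={score:.2f}  cv_mean={cv_scores.mean():.2f}  mse={mse:.2f}")
```

Because MSE is expressed in squared units of the target, values are only comparable across models trained on the same target with the same scaling.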
