grein_loader

Python package to automatically download datasets from GREIN

http://www.ilincs.org/apps/grein

VBIL Note

To run our version of grein_loader, use the following code:

Use local version of grein_loader

from pathlib import Path
import sys

ROOT = Path.cwd().resolve().parent
GREIN_LOADER_DIR = ROOT / "grein_loader"
GREIN_LOADER_SRC = GREIN_LOADER_DIR / "src"

if str(GREIN_LOADER_SRC) not in sys.path:
    sys.path.insert(0, str(GREIN_LOADER_SRC))

print("workspace root:", ROOT)
print("grein_loader dir:", GREIN_LOADER_DIR)
print("grein_loader src:", GREIN_LOADER_SRC)
print("using local src:", GREIN_LOADER_SRC.exists())
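After adjusting `sys.path` as above, it can be worth confirming which copy of the package Python will actually import. The following is a minimal sketch using only the standard library; the helper name `module_origin` is hypothetical and not part of grein_loader.

```python
import importlib.util
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the file Python would import `name` from.

    Returns None if the module cannot be found or has no file origin.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


# Shows the location of the grein_loader copy that an import would pick up,
# or None if the package is not importable from the current sys.path.
print("grein_loader resolves to:", module_origin("grein_loader"))
```

If the printed path does not point into the local `src` directory, an installed copy is probably taking precedence over the local one.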

Introduction

Grein Loader lets you access data from the GREIN website using a dataset's GSE identification number.

Installation

Install the package from PyPI:

pip install grein_loader

Usage

The package lets you download the description, metadata, and raw counts of a GREIN dataset based on its GSE ID. The GREIN datasets are publicly available and can also be accessed via the GREIN webpage. Each dataset is identified by a GEO accession ID, which is used to access its data.

load_dataset()

import grein_loader

geo_accession = "GSE112749"
description, metadata, count_matrix = grein_loader.load_dataset(geo_accession)

Input/Output parameters

Input parameter:
| gse_id | string | GEO accession ID

Output parameters:
| description  | dictionary       | description of the dataset
| metadata     | dictionary       | metadata of the dataset
| count_matrix | pandas DataFrame | raw counts
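Because `load_dataset()` takes the accession as a plain string, a malformed ID only surfaces once the request is made. The sketch below checks the format first. It assumes GEO series accessions have the form `GSE` followed by digits; the helpers `is_gse_id` and `load_checked` are hypothetical names, not part of grein_loader.

```python
import re

# Assumption: a GEO series accession is the prefix "GSE" followed by digits.
_GSE_PATTERN = re.compile(r"GSE\d+")


def is_gse_id(value: str) -> bool:
    """Return True if `value` looks like a GEO series accession such as GSE112749."""
    return _GSE_PATTERN.fullmatch(value.strip()) is not None


def load_checked(gse_id: str):
    """Validate the accession format, then delegate to grein_loader.load_dataset."""
    if not is_gse_id(gse_id):
        raise ValueError(f"not a GSE accession: {gse_id!r}")
    # Imported lazily so the format check works even without the package installed.
    import grein_loader

    return grein_loader.load_dataset(gse_id.strip())
```

A call such as `description, metadata, count_matrix = load_checked("GSE112749")` then behaves like the direct call, but rejects inputs like `"112749"` before any request is sent.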

load_overview()

Loads a given number of datasets from GREIN. These datasets are also listed on the main page of GREIN.

import grein_loader

number_of_datasets = 10
overview = grein_loader.load_overview(number_of_datasets)

The function returns a list of dictionaries. Each dictionary contains the GSE ID, number of samples, species, and description provided by GREIN.

Input parameter:
| number_of_datasets | integer | number of datasets to load

Output parameter:
| overview | list of dictionaries | each with the keys "geo_accession", "no_samples", "species", "title", "study_summary"
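Since the overview is a plain list of dictionaries, it can be filtered with ordinary Python before any full dataset is downloaded. The following is a minimal sketch; `filter_by_species` is a hypothetical helper, and the entries in `sample_overview` are made-up placeholders that only mirror the documented keys, not real GREIN records. The exact species strings GREIN returns are an assumption here.

```python
from typing import Any, Dict, List


def filter_by_species(overview: List[Dict[str, Any]], species: str) -> List[str]:
    """Return the geo_accession of every entry whose species matches (case-insensitive)."""
    wanted = species.strip().lower()
    return [
        entry["geo_accession"]
        for entry in overview
        if str(entry.get("species", "")).strip().lower() == wanted
    ]


# Placeholder entries for illustration only (not real GREIN data).
sample_overview = [
    {"geo_accession": "GSE000001", "no_samples": 6, "species": "Homo sapiens",
     "title": "placeholder A", "study_summary": "placeholder"},
    {"geo_accession": "GSE000002", "no_samples": 4, "species": "Mus musculus",
     "title": "placeholder B", "study_summary": "placeholder"},
    {"geo_accession": "GSE000003", "no_samples": 12, "species": "Homo sapiens",
     "title": "placeholder C", "study_summary": "placeholder"},
]

human_ids = filter_by_species(sample_overview, "homo sapiens")
print(human_ids)
```

With a real overview, each returned ID could then be passed to `load_dataset()` one at a time.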