Per recent discussions, there is more data we want to use from TCR and BCR libraries than just the clonotype_id and cdr3 amino acid sequences that are currently pulled in.

This isn't as simple as just pulling in more columns, because cellranger doesn't put all the columns we want into the 'clonotypes.csv' file that we currently extract from.

We've discussed minimally expanding to get:
cdr1, cdr2, cdr3, v_gene, j_gene

But perhaps we instead output a metadata file, where we don't need to worry about cluttering the Seurat metadata, and then also grab all of:
c("v_gene","d_gene","j_gene","c_gene","fwr1","cdr1","fwr2","cdr2","fwr3","cdr3","fwr4","reads","umis")
(The updated function would build this dataframe by ';'-joining these values across the multiple lines in the all_contig_annotations.csv file that have matching 'barcode' + 'raw_contig_annotation' values.)
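
To make the ';'-join concrete, here is a minimal base-R sketch of the collapsing step. It is only an illustration under assumptions: `collapse_contigs` and its arguments are hypothetical names, the toy barcodes and gene values are made up, and the key column 'raw_contig_annotation' is taken as written above, so it may need adjusting to whatever the column is actually called in the file.

```r
# Hypothetical helper (not existing pipeline code): collapse per-contig rows
# into one row per key combination, joining each value column with ';'.
collapse_contigs <- function(contigs,
                             key_cols = c("barcode", "raw_contig_annotation"),
                             value_cols = c("v_gene", "d_gene", "j_gene", "c_gene",
                                            "fwr1", "cdr1", "fwr2", "cdr2",
                                            "fwr3", "cdr3", "fwr4", "reads", "umis")) {
  # Fail early if the input lacks any column we were asked to use.
  missing_cols <- setdiff(c(key_cols, value_cols), colnames(contigs))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  # aggregate() groups rows by the key columns and applies FUN to each value
  # column within a group. Note that rows with NA in a key column are dropped.
  aggregate(
    contigs[, value_cols, drop = FALSE],
    by = contigs[, key_cols, drop = FALSE],
    FUN = function(x) paste(x, collapse = ";")
  )
}

# Toy input standing in for all_contig_annotations.csv (values are made up).
contigs <- data.frame(
  barcode = c("AAAC-1", "AAAC-1", "TTTG-1"),
  raw_contig_annotation = c("clonotype1", "clonotype1", "clonotype2"),
  v_gene = c("TRAV1", "TRBV2", "TRBV5"),
  cdr3 = c("CAVF", "CASSF", "CASSL"),
  reads = c(100, 50, 80),
  stringsAsFactors = FALSE
)

# One row per barcode + key value; multi-contig cells get ';'-joined strings.
collapsed <- collapse_contigs(contigs, value_cols = c("v_gene", "cdr3", "reads"))
print(collapsed)
```

In practice `contigs` would come from reading all_contig_annotations.csv (for example with `read.csv()`), and numeric columns such as reads and umis become ';'-joined strings after collapsing.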
Questions:
- Do we want all of the columns in the latter suggestion?
- Should this get output to a metadata file, or should we pull ALL of these columns into the Seurat object?
- If a metadata file, where? automated_processing/(T|B)CR_contigs.csv?