Index
Intro
Stargazer is a bioinformatics tool for calling star alleles (haplotypes) in PGx genes using data from NGS or SNP arrays. For NGS, Stargazer accepts data from both WGS and TS.
Stargazer identifies star alleles by detecting SNVs, indels, and SVs. Stargazer can detect complex SVs including gene deletions, duplications, and hybrids by calculating paralog-specific copy number from read depth.
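To give a rough feel for how copy number can be read off read depth, here is a minimal Python sketch. It is not Stargazer's actual implementation: the function names, the use of a control region assumed to be present in two copies, and the toy depth values are all assumptions made for illustration. The idea is simply that depth over a gene, divided by depth over a region of known copy number, scales with the number of copies of that gene.

```python
from statistics import mean


def estimate_copy_number(target_depth, control_depth, ploidy=2):
    """Estimate per-position copy number from read depth.

    target_depth:  per-base read depths across the gene of interest
    control_depth: per-base read depths across a control region that is
                   assumed (hypothetically) to be present in `ploidy` copies
    """
    if not control_depth:
        raise ValueError("control_depth must not be empty")
    baseline = mean(control_depth)
    if baseline <= 0:
        raise ValueError("control region has no coverage")
    # Depth relative to the control region, scaled to the control's copy number.
    return [ploidy * depth / baseline for depth in target_depth]


def call_integer_copy_number(copy_numbers):
    """Collapse per-position estimates into a single integer call."""
    return round(mean(copy_numbers))


if __name__ == "__main__":
    # Toy depths (made up): computing each paralog separately is what makes
    # the estimate paralog-specific.
    control = [30, 30, 30, 30]
    depths = {"CYP2D6": [15, 15, 15], "CYP2D7": [30, 30, 30]}
    for gene, depth in depths.items():
        cn = call_integer_copy_number(estimate_copy_number(depth, control))
        print(gene, cn)
```

In this toy input, CYP2D6 has half the depth of the control region, which is the pattern one would expect from a heterozygous gene deletion. Real data are noisier and need normalization that this sketch leaves out.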
When building Stargazer, we used the clinically important CYP2D6 gene as a model for detection and interpretation of SVs in the context of other observed SNVs and indels. We purposely chose CYP2D6 as a starting point because 1) the enzyme it encodes metabolizes ~25% of prescription drugs, 2) its activity varies considerably among individuals due to the gene's highly polymorphic nature, and 3) it is one of the most complex genetic loci to genotype in the human genome. To date, over 100 star alleles have been defined for CYP2D6, some involving a gene hybrid with its nearby non-functional but highly homologous paralog CYP2D7.
For more details on how Stargazer works, please see the Documentation page. Thanks for your interest in Stargazer!
Abbreviations and acronyms
The abbreviations and acronyms listed below are used throughout this website (you've been warned).
1KGP, The 1000 Genomes Project; AF, allele fraction; AS, activity score; CN, copy number; DOI, digital object identifier; GDF, GATK-DepthOfCoverage format; indels, insertion-deletion variants; NGS, next-generation sequencing; PGx, pharmacogenomics; SDF, SAMtools depth format; SGE, Sun Grid Engine; SNP, single nucleotide polymorphism; SNV, single nucleotide variant; SV, structural variant; TS, targeted sequencing; VCF, variant call format; WGS, whole genome sequencing
Target genes
The latest version of Stargazer can call star alleles in the 58 PGx genes listed below. We are continuously extending Stargazer to additional genes, so stay tuned! If you do not see your favorite PGx gene in the list, let us know and we will try to prioritize it when adding new genes.
2C_CLUSTER, ABCB1, ABCG2, CACNA1S, CFTR, CYP17A1, CYP19A1, CYP1A1, CYP1A2, CYP1B1, CYP26A1, CYP2A13, CYP2A6, CYP2B6, CYP2C19, CYP2C8, CYP2C9, CYP2D6, CYP2E1, CYP2F1, CYP2J2, CYP2R1, CYP2S1, CYP2W1, CYP3A4, CYP3A43, CYP3A5, CYP3A7, CYP4A11, CYP4A22, CYP4B1, CYP4F2, DPYD, G6PD, GSTM1, GSTP1, IFNL3, NAT1, NAT2, NUDT15, POR, PTGIS, RYR1, SLC15A2, SLC22A2, SLCO1B1, SLCO1B3, SLCO2B1, SULT1A1, TBXAS1, TPMT, UGT1A1, UGT1A4, UGT2B15, UGT2B17, UGT2B7, VKORC1, XPC
Author
Stargazer was developed by Seung-been Lee (he goes by "Steven") during his PhD in the Nickerson lab at the University of Washington. Steven graduated in June 2019 and now works in industry. He was instrumental in developing Stargazer; Sean and Aparna now develop and maintain it.
Citation
If you use Stargazer in a published analysis, please report the program version and cite the appropriate article.
The most recent reference for Stargazer's genotyping algorithm is:
Lee et al., 2019. Calling star alleles with Stargazer in 28 pharmacogenes with whole genome sequences. Clinical Pharmacology & Therapeutics. DOI: https://doi.org/10.1002/cpt.1552.
The PGx genotyping pipeline using Stargazer is described in:
Lee et al., 2018. Stargazer: a software tool for calling star alleles from next-generation sequencing data using CYP2D6 as a model. Genetics in Medicine. DOI: https://doi.org/10.1038/s41436-018-0054-0.
Publications
Below are selected articles in which Stargazer was used for genotype analysis.
Gaedigk et al., 2022. CYP2C8, CYP2C9, and CYP2C19 characterization using next-generation sequencing and haplotype analysis: a GeT-RM collaborative project. The Journal of Molecular Diagnostics 24(4): 337-350. DOI:
McInnes et al., 2020. Transfer learning enables prediction of CYP2D6 haplotype function. BioRxiv. DOI: https://doi.org/10.1101/684357.
Taliun et al., 2019. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. BioRxiv. DOI: https://doi.org/10.1101/563866.
Dalton and Lee et al., 2019. Interrogation of CYP2D6 structural variant alleles improves the correlation between CYP2D6 genotype and CYP2D6-mediated metabolic activity. Clinical and Translational Science. DOI: https://doi.org/10.1111/cts.12695.
Claw et al., 2019. Pharmacogenomics of nicotine metabolism: novel CYP2A6 and CYP2B6 genetic variation patterns in Alaska Native and American Indian populations. Nicotine & Tobacco Research. DOI: https://doi.org/10.1093/ntr/ntz105.
Bhatt et al., 2018. Hepatic Abundance and Activity of Androgen and Drug Metabolizing Enzyme, UGT2B17, are Associated with Genotype, Age, and Sex. Drug Metabolism and Disposition. DOI: https://doi.org/10.1124/dmd.118.080952.