What is Stargazer?

Stargazer is a bioinformatics tool for calling star alleles in various polymorphic pharmacogenes from next-generation sequencing (NGS) data. Stargazer works with data from whole genome sequencing or targeted sequencing.

Stargazer identifies star alleles from NGS data by detecting single nucleotide variants (SNVs), insertion-deletion variants (indels), and structural variants (SVs). It detects SVs, including gene deletions, duplications, and hybrids, by calculating paralog-specific copy number from read depth.
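To make the read-depth idea concrete, here is a minimal sketch of turning per-position depth into a copy number estimate. It is not Stargazer's actual code: the function name, the use of a two-copy control region for normalization, and the depth values are all assumptions made for illustration.

```python
from statistics import mean


def copy_number_profile(target_depth, control_depth, control_copies=2):
    """Convert per-position read depth into copy number estimates.

    target_depth: read depths across the region of interest.
    control_depth: read depths from a region ASSUMED to carry
        `control_copies` copies (hypothetical normalization choice).
    """
    # Average depth contributed by a single copy of the control region.
    baseline = mean(control_depth) / control_copies
    if baseline <= 0:
        raise ValueError("control region has no coverage")
    return [depth / baseline for depth in target_depth]


# Made-up depths: the gene looks like it has lost one copy,
# while its paralog keeps the usual two copies.
control = [30, 32, 28, 30]
gene = [15, 16, 14, 15]
paralog = [30, 29, 31, 30]

gene_cn = copy_number_profile(gene, control)
paralog_cn = copy_number_profile(paralog, control)
print(round(mean(gene_cn)), round(mean(paralog_cn)))
```

Computing the profile separately for the gene and for its paralog is what makes the estimate paralog-specific in this sketch; it presumes reads have already been assigned to one or the other.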

When building Stargazer, we used the clinically important CYP2D6 gene as a model for detecting and interpreting SVs in the context of other observed SNVs and indels. We chose CYP2D6 as a starting point because it is one of the most difficult loci in the human genome to genotype: over 100 star alleles have been defined for CYP2D6, some of which involve a hybrid between the gene and its nearby, non-functional, but highly homologous paralog CYP2D7. Genotyping CYP2D6 also matters for precision drug therapy because the enzyme it encodes metabolizes approximately 25% of drugs, and its activity varies considerably among individuals owing to the gene's highly polymorphic nature.

For more details on how Stargazer works, please see the Documentation page. Thanks for your interest in Stargazer!

Target gene list

The latest version of Stargazer (v1.0.4) can call star alleles in the following 28 genes: CYP1A1, CYP1A2, CYP2A6/CYP2A7, CYP2B6/CYP2B7, CYP2C8, CYP2C9, CYP2C19, CYP2D6/CYP2D7, CYP2E1, CYP3A4, CYP3A5, CYP4F2, DPYD, GSTM1, GSTP1, GSTT1, NAT1, NAT2, SLC15A2, SLC22A2, SLCO1B1, SLCO2B1, TPMT, UGT1A1, UGT2B7, UGT2B15, UGT2B17, VKORC1. We are continuously extending Stargazer to include other clinically important genes, so stay tuned!


Stargazer was developed by Seung-been "Steven" Lee in the Nickerson lab at the University of Washington.


Seung-been Lee, Marsha M. Wheeler, Karynne Patterson, Sean McGee, Rachel Dalton, Erica L. Woodahl, Andrea Gaedigk, Kenneth E. Thummel, and Deborah A. Nickerson (2018). Stargazer: a software tool for calling star alleles from next-generation sequencing data using CYP2D6 as a model. Genetics in Medicine. DOI: https://doi.org/10.1038/s41436-018-0054-0.


Stargazer_v1.0.4 (March 3, 2019)

  • Stargazer has been extended to call star alleles in 28 genes.
  • Many useful optional arguments have been added.

Stargazer_v1.0.3 (July 9, 2018)

  • Stargazer has been extended to call star alleles in CYP2A6/CYP2A7, including those with structural variation (e.g., CYP2A6*4, *1x2, *12, and *34). Examples of copy number plots for samples with structural variation can be found on the Documentation page.

Stargazer_v1.0.2 (June 14, 2018)

  • To determine which star allele is duplicated in samples with three or more gene copies (e.g., CYP2D6*1x2/*4 vs. *1/*4x2), Stargazer computes allele fractions from the sequence reads that carry the corresponding variant. Previous versions of Stargazer tested whether the observed allele fraction in such a sample was greater than the mean allele fraction across all samples in the same sequencing project that are heterozygous for the variant of interest and have no structural variation. This version instead uses an optimal decision boundary found with Bayesian updating, for two main reasons. First, the empirical mean is not always obtainable (e.g., when only one sample carries the variant), and it may be inaccurate when few samples carry the variant. Second, the approach allows the use of an informative prior stating that allele fractions should be centered at 0.5 when heterozygous samples without structural variation are used.
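The entry above does not spell out the model, so the following is only an illustrative sketch of how a prior centered at 0.5 can be combined with observed allele fractions. It assumes a normal-normal conjugate update with a known observation variance; the function names and the variance values are hypothetical and are not taken from Stargazer.

```python
def posterior_mean(het_fractions, prior_mean=0.5, prior_var=0.01, obs_var=0.0025):
    """Posterior mean of the heterozygous allele fraction.

    Assumes a normal prior and normally distributed observations with
    known variance (illustrative values, not Stargazer's).
    """
    n = len(het_fractions)
    if n == 0:
        # No heterozygous reference samples: fall back on the prior alone.
        return prior_mean
    sample_mean = sum(het_fractions) / n
    prior_precision = 1.0 / prior_var
    data_precision = n / obs_var
    return (prior_precision * prior_mean + data_precision * sample_mean) / (
        prior_precision + data_precision
    )


def variant_is_on_duplicated_allele(sample_fraction, het_fractions):
    """Compare a multi-copy sample's allele fraction to the updated boundary."""
    return sample_fraction > posterior_mean(het_fractions)


# With no reference samples the boundary stays at the prior mean.
print(posterior_mean([]))
# A single reference sample pulls the boundary only part of the way.
print(posterior_mean([0.46]))
print(variant_is_on_duplicated_allele(0.66, [0.46]))
```

The point of the sketch is the behavior at small sample sizes: a boundary still exists when no heterozygous samples are available, and a single noisy sample cannot move it far from 0.5.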

Stargazer_v1.0.1 (April 11, 2018)

  • For detection of structural variation, this version no longer filters out loci based on the variance in read depth across samples. Instead, it filters out pre-selected regions that have been empirically shown to produce high noise (e.g., regions where reads map to multiple locations).
  • To call structural variants, this version fits every pairwise combination of known sequence structures (one for each chromosome) against the sample's observed copy number profile and then selects the combination with the least deviance. Stargazer_v1.0.0 also uses this combinatorial testing, but only for samples with more than one structural variation (an abnormal structure on both chromosomes). This version generalizes the combinatorial testing to samples without any structural variation (two normal structures) and to samples with only one structural variation (one normal and one abnormal structure). As a result, the copy number plot now displays the two best-fitting sequence structures for the sample in addition to the sample's original copy number.
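The combinatorial testing described above can be sketched as follows. This is a simplified illustration, not Stargazer's implementation: the structure names and their three-region profiles are made up, and the sum of squared residuals stands in for the deviance.

```python
from itertools import combinations_with_replacement

# Hypothetical per-chromosome structures: copies contributed in each of
# three regions of the locus (illustrative values only).
STRUCTURES = {
    "normal": [1, 1, 1],
    "deletion": [0, 0, 0],
    "duplication": [2, 2, 2],
    "hybrid": [1, 1, 0],
}


def best_structure_pair(observed):
    """Return the pair of structures whose summed profile best fits `observed`.

    Fit is measured by the sum of squared residuals, used here as a
    stand-in for the deviance. Ties go to the first pair encountered.
    """
    best_pair, best_score = None, float("inf")
    for a, b in combinations_with_replacement(sorted(STRUCTURES), 2):
        expected = [x + y for x, y in zip(STRUCTURES[a], STRUCTURES[b])]
        score = sum((o - e) ** 2 for o, e in zip(observed, expected))
        if score < best_score:
            best_pair, best_score = (a, b), score
    return best_pair, best_score


# Three copies across the whole locus: one normal and one duplicated chromosome.
pair, score = best_structure_pair([3.1, 2.9, 3.0])
print(pair, round(score, 3))
```

Because every pair is tried, including the pair of two normal structures, samples with no structural variation go through the same procedure as samples with one or two. Copy number alone cannot separate pairs that sum to the same profile (here, two normal chromosomes versus a deletion plus a duplication), so the tie-breaking by iteration order in this sketch is arbitrary.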

Stargazer_v1.0.0 (March 11, 2018)

  • This version is described in Lee et al., 2018.