The 23andMe genotyping platform detects single nucleotide polymorphisms (SNPs) and some more complex variations, such as insertions and deletions, at a predetermined set of locations in the genome that have been shown to vary between individuals. These locations are known as “markers,” and the possible outcomes at each marker are known as “variants” or “alleles.”
Base Pairs (A, C, T, G)
There are four DNA bases: adenine (A), thymine (T), guanine (G), and cytosine (C). At a given genomic location, you might have a C and someone else might have a T.
Your genotype will usually be reported as a pair of alleles (e.g., "A/G") because you have two copies of each autosome (chromosomes 1-22), one inherited from your mother and one from your father.
For markers genotyped by 23andMe, the Raw Data feature reports:
- The marker name (an rsID or internal ID number)
- The marker’s exact genomic location
- The possible alleles at that marker (usually A, C, G, or T)
- The variants detected in your saliva sample (i.e. your genotype)
In some cases, your genotype will be reported as a single allele because not all DNA is inherited in chromosome pairs. Notably, this applies to mitochondrial DNA and, for the most part, the X and Y chromosomes in males.
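As a rough illustration of the fields listed above, the following Python sketch parses one record and labels its genotype as a pair of alleles or a single allele. It assumes a tab-separated layout (marker name, chromosome, position, genotype) with the two alleles written without a separator; that layout, the marker ID `rs0000001`, and the names `Marker`, `parse_line`, and `describe_genotype` are illustrative assumptions, not a description of an official file specification.

```python
from typing import NamedTuple

VALID_BASES = set("ACGT")


class Marker(NamedTuple):
    marker_id: str   # rsID or internal ID
    chromosome: str  # e.g. "1", "X", "MT"
    position: int    # genomic location of the marker
    genotype: str    # e.g. "AG", "C", or "--"


def parse_line(line: str) -> Marker:
    """Parse one record, assuming four tab-separated fields (hypothetical layout)."""
    marker_id, chromosome, position, genotype = line.rstrip("\n").split("\t")
    return Marker(marker_id, chromosome, int(position), genotype)


def describe_genotype(genotype: str) -> str:
    """Label a genotype made of A, C, G, T as a pair or a single allele."""
    if genotype and all(base in VALID_BASES for base in genotype):
        if len(genotype) == 2:
            return "homozygous" if genotype[0] == genotype[1] else "heterozygous"
        if len(genotype) == 1:
            return "single allele"
    # Anything else (for example '--') is left for separate handling.
    return "other"


# Made-up record for illustration only.
example = parse_line("rs0000001\t1\t12345\tAG")
```

With this sketch, `describe_genotype(example.genotype)` labels the made-up record as heterozygous, while a one-letter genotype such as `"C"` is labeled a single allele.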
Insertions and Deletions
Occasionally, one or more bases may be inserted into or deleted from the genetic code at a particular location. In this case, your genotype may be reported as an insertion or deletion (‘--’) instead of an allele pair.
Depending on where in the genome the change is located, either an insertion or a deletion could represent the normal version of the variant. In other words, there are some places in the genome where having an extra base (insertion) is the normal variant and having a deletion is the rare variant. Conversely, there are some places in the genome where having an insertion is rare, making a deletion the normal variant at that location.
23andMe does not report on all possible insertions or deletions. In general, the ones reported on are small, spanning only one or a few bases.
In order to return highly accurate data to customers, we use a stringent algorithm to make genotype calls. Occasionally, a person's data may not allow us to determine their genotype confidently at a particular marker. When the algorithm cannot make a confident genotype call, it gives a "not determined" result instead. In downloaded data, the entry for any uncalled SNP displays ‘--’ instead of a two-letter genotype.
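To see how many markers in a download were left uncalled, one could count the ‘--’ entries. The sketch below assumes the genotypes have already been read into a list of strings; the function name `no_call_rate` is hypothetical.

```python
from typing import Iterable

NO_CALL = "--"  # how an uncalled marker appears in downloaded data


def no_call_rate(genotypes: Iterable[str]) -> float:
    """Return the fraction of genotypes that are uncalled ('--')."""
    genotypes = list(genotypes)
    if not genotypes:
        return 0.0  # avoid dividing by zero on empty input
    uncalled = sum(1 for g in genotypes if g == NO_CALL)
    return uncalled / len(genotypes)


# Made-up genotypes for illustration only.
sample = ["AG", "--", "CC", "--"]
rate = no_call_rate(sample)
```

This only measures how often ‘--’ appears; it says nothing about why a particular marker could not be called.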
Some "not determined" results are expected throughout the raw data, and your data would not have been returned to you if it had not met our quality standards. However, it's important to keep in mind that only a subset of markers has been individually validated for accuracy.
A small portion of markers, including those on the sex chromosomes (X and Y) and the mitochondrial DNA, are difficult to analyze because of biological issues (e.g. pseudogenes, DNA structure, and highly variable regions). These markers are more likely to have a “not determined” result.
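One way to check whether uncalled markers cluster on particular chromosomes in your own data is to group the ‘--’ entries by chromosome. The sketch below assumes each record has already been reduced to a (chromosome, genotype) pair; the chromosome labels and the function name `no_call_rate_by_chromosome` are illustrative assumptions.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def no_call_rate_by_chromosome(
    records: Iterable[Tuple[str, str]],
) -> Dict[str, float]:
    """Return the fraction of uncalled ('--') genotypes for each chromosome."""
    total = Counter()
    uncalled = Counter()
    for chromosome, genotype in records:
        total[chromosome] += 1
        if genotype == "--":
            uncalled[chromosome] += 1
    return {chrom: uncalled[chrom] / total[chrom] for chrom in total}


# Made-up records for illustration only.
records = [("1", "AG"), ("1", "--"), ("X", "--"), ("MT", "A")]
rates = no_call_rate_by_chromosome(records)
```

A higher rate on one chromosome in a small sample is not meaningful by itself; the comparison is only informative across the full set of markers.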