What is this tool?
DNA Relatives Clustering is a tool that automatically groups your DNA Relatives based on how they’re genetically related to each other—not just to you. Instead of manually piecing things together, clustering provides a visual map of your family network, making it easier to track down shared ancestors, visualize family branches, and ultimately uncover hidden family connections.
It’s inspired by the Leeds Method but uses a novel algorithm designed by 23andMe scientists.
What can I learn from using this tool?
Use this tool if you want to simplify complex DNA match lists to help you break down genealogical brick walls and accelerate your research.
- Quickly identify which branch of the family tree a mystery match belongs to (e.g., “this cluster seems to be related to me through my great-grandmother on my mom’s mom’s side”). This allows you to focus your research efforts.
Tip: Identify “anchor” relatives: people in your clusters whose place in your family tree you already know. These anchor relatives provide a roadmap for working out the common ancestor you may share with other, unfamiliar relatives in the same cluster.
How do I “read” the clustering results?
The tool generates a "map" of your family connections:
- Rows and Columns: Your relatives are listed along both the top and side.
- The Diagonal: Squares on the diagonal (from top left to bottom right) show DNA sharing between you and your DNA Relatives.
- Off-Diagonal: Squares off the diagonal show the amount of DNA sharing between your DNA Relatives.
- Clusters: A solid "block" of color corresponds to some branch of your family (e.g., a specific set of shared ancestors such as great-great-grandparents).
- Colored Squares: A colored square means two people are in the same cluster because they share enough DNA with each other and with other relatives in the same cluster.
- Dark Gray Squares: People who share DNA but are not in the same cluster are shown with dark gray outlines “off the diagonal.”
- “Single Square” clusters: A colorful square on the diagonal shows a relative who shares DNA with other relatives, but not with enough other relatives to be included in a larger cluster; this can be informative.
- Hover & Click: Hover over a square in a cluster to see the amount of shared DNA between those two relatives. Tip: Dig deeper by clicking on the square and then the names in the dialogue to view their 23andMe Profiles, where you may discover shared surnames and ancestor locations.
- Filter by mom’s side or dad’s side: If a parent has tested on 23andMe and you’ve connected with them, use the Maternal/Paternal filters to isolate and speed up research on clusters related to your mom or dad.
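One way to picture the map is as a square grid with the same relatives along both axes. The sketch below builds such a grid from made-up names and cM values. The names, the numbers, and the `cell` helper are all hypothetical, and this only illustrates the layout described above, not how 23andMe builds the map.

```python
# Hypothetical relatives and cM values, invented for illustration only.
relatives = ["Ana", "Ben", "Cal"]
shared_with_you = {"Ana": 210.0, "Ben": 95.0, "Cal": 60.0}
# DNA shared between pairs of relatives; pairs not listed share none.
shared_between = {("Ana", "Ben"): 150.0}


def cell(row: str, col: str) -> float:
    """Return the cM value behind the square at (row, col)."""
    if row == col:
        # Diagonal: sharing between you and that relative.
        return shared_with_you[row]
    # Off-diagonal: sharing between the two relatives (order does not matter).
    return shared_between.get((row, col), shared_between.get((col, row), 0.0))


grid = [[cell(r, c) for c in relatives] for r in relatives]
for name, row in zip(relatives, grid):
    print(f"{name:>4}", " ".join(f"{value:6.1f}" for value in row))
```

The grid is symmetric: the square for Ana and Ben holds the same value whichever of them is the row.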
Customizing clusters
The default settings are designed for broad discovery, but advanced users can adjust the algorithm's settings to refine their analysis. By default, your clusters will exclude very close relatives. Why? They are related to you through multiple family lines and can act as “super-connectors” that merge clusters artificially.
Understanding "connection strength threshold"
This controls the "tightness" of a cluster.
- Increase this threshold to produce highly interconnected clusters. Tip: Set this to 1 if you want to see groups of relatives who are all mutually related to one another. You'll end up with more clusters that highlight small family groups.
- Decrease this threshold to merge clusters into fewer, more loosely related clusters. Tip: Set this to 0 to maximally merge clusters based on connections between relatives. This is useful if you want to identify four clusters corresponding to different grandparents.
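To build intuition for the two extremes, here is a toy model. It assumes that two groups may merge when the fraction of linked pairs between them is at least the threshold. That rule, the `link_density` and `toy_clusters` helpers, and the six relatives are all invented for illustration; the actual clustering algorithm is 23andMe's own and is not described here.

```python
from itertools import combinations


def link_density(a, b, links):
    """Fraction of cross pairs (one person from a, one from b) that share DNA."""
    linked = sum(1 for x in a for y in b if frozenset((x, y)) in links)
    return linked / (len(a) * len(b))


def toy_clusters(people, links, threshold):
    """Greedy merging: join two groups when their link density is above
    zero and at least `threshold`. A toy model, not 23andMe's algorithm."""
    clusters = [frozenset([p]) for p in people]
    while True:
        best = None
        for a, b in combinations(clusters, 2):
            density = link_density(a, b, links)
            if density > 0 and density >= threshold:
                if best is None or density > best[0]:
                    best = (density, a, b)
        if best is None:
            return sorted(sorted(c) for c in clusters)
        _, a, b = best
        # Put the merged group first so it gets the next chance to grow.
        clusters = [a | b] + [c for c in clusters if c not in (a, b)]


# Hypothetical links: A, B, C all share DNA with each other, as do D, E, F.
# A single link (C with D) bridges the two families.
people = ["A", "B", "C", "D", "E", "F"]
pairs = [("A", "B"), ("A", "C"), ("B", "C"),
         ("D", "E"), ("D", "F"), ("E", "F"), ("C", "D")]
links = {frozenset(p) for p in pairs}

tight = toy_clusters(people, links, threshold=1.0)
loose = toy_clusters(people, links, threshold=0.0)
print(tight)
print(loose)
```

In this toy data, a threshold of 1 keeps the two fully interconnected families apart, while a threshold of 0 lets the single bridging link merge everyone into one group.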
Strategic filtering using centimorgans (cM)
A centimorgan (cM) is a unit used in genetics to measure how much DNA two people share. Adjust cM sharing thresholds in this tool as needed to help you narrow your search or test hypotheses.
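As a rough picture of what a minimum and maximum threshold do, the sketch below keeps only relatives whose shared DNA with you falls inside a chosen window. The names, cM values, and the 50 to 400 cM window are placeholders chosen for illustration, and `in_range` is a hypothetical helper.

```python
def in_range(shared_cm: float, min_cm: float, max_cm: float) -> bool:
    """True when the shared DNA falls inside the inclusive window."""
    return min_cm <= shared_cm <= max_cm


# Hypothetical relatives and the cM each shares with you.
shared_with_you = {
    "Relative A": 850.0,  # very close: above the window
    "Relative B": 230.0,
    "Relative C": 75.0,
    "Relative D": 22.0,   # distant: below the window
}

kept = [name for name, cm in shared_with_you.items()
        if in_range(cm, min_cm=50.0, max_cm=400.0)]
print(kept)
```

Only the relatives inside the window would take part in clustering; widening or narrowing the window changes who is included.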
- The "Sweet Spot": If your goal is to sort your matches into four groups, one for each grandparent line (the Leeds Method goal), go to “Which Relatives to Include” and try setting the minimum DNA shared with you to 50–70 cM and the maximum DNA shared with you to 300–500 cM. This filters out the "noise" of distant cousins as well as the "overlap" of closer family members.
- What about “DNA shared between relatives”?
- Decreasing minimum DNA shared between relatives: Identifies more connections, but also more false links (from endogamy, pedigree collapse, or population effects).
- Increasing minimum DNA shared between relatives: Produces cleaner clusters, but may exclude legitimate links. Clusters tend to map to grandparent or great-grandparent lines.
Tip: If DNA shared between relatives is set too low, you’ll end up with large, messy, merged clusters. If DNA shared between relatives is set too high, you’ll end up missing connections that should put relatives in the same cluster. The sweet spot can be a little different for every user!
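The trade-off can be seen in a small sketch that counts connected groups of relatives at different minimums. It treats two relatives as linked when they share at least the minimum, and groups anyone reachable through links. That grouping rule is a simplification assumed for illustration, and the people and cM values are invented.

```python
def count_groups(people, pair_cm, min_between_cm):
    """Count connected groups when only pairs sharing at least
    `min_between_cm` count as linked (simple union-find)."""
    parent = {p: p for p in people}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for (a, b), cm in pair_cm.items():
        if cm >= min_between_cm:
            parent[find(a)] = find(b)
    return len({find(p) for p in people})


# Hypothetical chain of relatives with one weak link in the middle.
people = ["A", "B", "C", "D", "E"]
pair_cm = {("A", "B"): 120.0, ("B", "C"): 45.0,
           ("C", "D"): 12.0, ("D", "E"): 90.0}

counts = {t: count_groups(people, pair_cm, t) for t in (10.0, 30.0, 100.0)}
print(counts)
```

In this toy data, a minimum of 10 cM merges everyone into one group through the weak 12 cM link, 30 cM separates two groups, and 100 cM breaks most connections and leaves four groups, three of them single people.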
Tip: Identifying relative clusters that map to each grandparent is easiest when many 2nd cousins are tested in the database.
Overlap between clusters
What does it mean if there’s a lot of overlap (off-diagonal gray squares) between clusters?
This often means that these clusters are related to you through the same grandparent line, and that you are seeing an earlier split within the same lineage (for example, connections via different great-grandparents). However, if there are only a few off-diagonal gray squares at low minimum cM thresholds, this is more likely to reflect noise.
Dealing with endogamy
“Endogamy” describes a population where repeated intermarriage causes individuals to be related to each other in multiple ways, producing widespread shared DNA that does not come from a single recent ancestor. In a relative clustering analysis, endogamy can incorrectly merge clusters based on background DNA sharing through multiple family lines.
If your background includes endogamy:
- Start by raising the minimum DNA shared between relatives to 30–50 cM. Go even higher as needed until clusters start to separate. For some populations with significant histories of endogamy, you may need to raise this minimum threshold above 100 cM.
- Raise the minimum DNA shared between you and your relatives to at least what you set for the minimum sharing between relatives, and if that doesn't help, go higher. Why? In an endogamous population, small amounts of DNA sharing may not reflect genealogically significant relationships. Excluding such relationships will help focus the analysis on meaningful relationships.
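The two steps above can be sketched as a small search: try increasing minimums for DNA shared between relatives until the groups separate, then raise the minimum shared with you to match. The grouping rule (anyone reachable through links is grouped together), the candidate values, the starting value of 50 cM, and the data are all assumptions made for illustration. In the tool itself you would adjust the settings by hand and inspect the clusters.

```python
def groups(people, pair_cm, min_between_cm):
    """Group people connected by pairs sharing at least `min_between_cm`."""
    clusters = [{p} for p in people]
    for (a, b), cm in pair_cm.items():
        if cm < min_between_cm:
            continue
        ca = next(c for c in clusters if a in c)
        cb = next(c for c in clusters if b in c)
        if ca is not cb:
            clusters.remove(cb)
            ca |= cb  # merge cb into ca in place
    return clusters


def first_separating_minimum(people, pair_cm, candidates, wanted_groups=2):
    """Return the first candidate minimum that yields at least
    `wanted_groups` groups (single people count as groups), else None."""
    for min_cm in candidates:
        if len(groups(people, pair_cm, min_cm)) >= wanted_groups:
            return min_cm
    return None


# Hypothetical endogamous data: two real families (A, B, C and D, E) plus
# smaller "background" sharing across families.
people = ["A", "B", "C", "D", "E"]
pair_cm = {("A", "B"): 180.0, ("A", "C"): 160.0, ("B", "C"): 150.0,
           ("D", "E"): 200.0,
           ("A", "D"): 35.0, ("B", "E"): 60.0, ("C", "D"): 42.0}

min_between = first_separating_minimum(people, pair_cm, [30.0, 50.0, 70.0, 100.0])
# Second step: keep the minimum shared with you at least as high.
min_with_you = max(50.0, min_between)
print(min_between, min_with_you)
```

Here the background sharing of 35 to 60 cM keeps the families merged at 30 and 50 cM, and they separate once the minimum reaches 70 cM.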
Relatives in Common Mode
Instead of clustering your whole list, use the Relatives in Common filter if you want to track down how you might be related to a specific individual or better understand a specific branch of your family tree.
Recommended workflow:
- From the Relatives in Common section of the cluster customization shelf, select a match you don't recognize with whom you share higher amounts of DNA (cM).
- Click “Update clustering.”
- Check where the mystery match lands. If it falls into a cluster with known cousins from your "Smith" line, you have narrowed your genealogical search to that specific branch of your family tree.
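The reasoning in the last step can be sketched as follows: collect the relatives who share DNA with the mystery match, then tally which of your known family lines they belong to. The names, the "Smith" and "Jones" labels, and the `relatives_in_common` helper are hypothetical, and the source does not describe how the tool computes this internally.

```python
from collections import Counter


def relatives_in_common(target, links):
    """Relatives on your list who also share DNA with `target`."""
    return sorted({other for pair in links if target in pair
                   for other in pair if other != target})


# Hypothetical anchor relatives whose family line you already know.
known_lines = {"Cousin Kim": "Smith", "Cousin Lee": "Smith", "Cousin Pat": "Jones"}
# Hypothetical pairs of relatives who share DNA with each other.
links = {frozenset(p) for p in [("Mystery", "Cousin Kim"),
                                ("Mystery", "Cousin Lee"),
                                ("Cousin Pat", "Cousin Ray")]}

in_common = relatives_in_common("Mystery", links)
line_counts = Counter(known_lines[r] for r in in_common if r in known_lines)
print(in_common)
print(line_counts.most_common())
```

In this toy data, both relatives in common with the mystery match are known "Smith" cousins, which points the search toward that line.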