# Calculating Pedigree Collapse on DNA Matches

Here’s how to calculate the pedigree collapse effect in your own DNA matching discoveries. The bottom line may not be as scary as you thought for your genetic genealogy!

In the first part of this two-part article, I defined how pedigree collapse (the intermarrying of ancestors with one another) may affect your genetic relatedness to your DNA matches. Here, learn when pedigree collapse might become significant to evaluating your genetic relationships to your matches, and how to calculate this effect in your own tree (it’s not as scary as it looks).

## Calculating shared DNA in pedigree collapse

This same systematic process described in the previous article can be applied to any relationship in the last several generations to determine an expected average amount of DNA sharing. So let’s bring this back to what you might see on your cousin match list.

If you see a match where you both descend through ancestors involved in pedigree collapse, what degree of elevated DNA sharing would you expect to see? Would it be enough to bump you up into a different relationship category? For example, if you were 2nd cousins, would the elevated shared DNA be enough to make you look like 1st cousins? The Colapso family can help us answer that question.

Back to our example. Let’s determine the amount of shared DNA that would be expected between the cousins that are downstream from the event that initiated the collapse: when the two 1st cousins, Charlie and Cindy, had children together.

Ethan and Eva are 1st cousins through their common grandparents, Charlie and Cindy. As we explored earlier, it is expected that they would share on average 12.5% of DNA due to this 1st cousin relationship. However, because their grandparents are also 1st cousins themselves, Ethan and Eva are more genetically similar than cousins that descend through unrelated people. We can quantify this elevated level of DNA sharing by utilizing the coefficient of relationship.

First we’ll identify the distinct inheritance pathways that exist between Ethan and Eva, and the degrees of separation along each path. Isolating their closest relationship through their grandparents, Charlie and Cindy, Ethan and Eva share two distinct inheritance pathways: one through each grandparent. From Ethan there are two steps up to Charlie and two steps down to Eva (shown in red), making four degrees of relationship along that inheritance pathway. Similarly along the distinct path through Cindy (shown in blue), from Ethan there are two steps up to Cindy and two steps down to Eva, with four degrees of relationship between them.

Using the same methodology explored previously we can calculate the coefficient of relationship for Ethan and Eva through their grandparents, Charlie and Cindy.

This means that focusing on their 1st cousin relationship, Ethan and Eva are expected on average to share 12.5% of their genome, or approximately 850 cM.

### 12.5% x 6800 cM = 850 cM
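The arithmetic behind this figure can be sketched in a few lines of Python. This is only an illustration, not a tool from the article: the function name `coefficient_of_relationship` and the constant `TOTAL_CM` are labels chosen here, and the list of path lengths is simply the two four-step paths through Charlie and Cindy described above.

```python
# Total genome length used in this article's calculations, in centimorgans.
TOTAL_CM = 6800

def coefficient_of_relationship(path_steps):
    """Sum (1/2)**steps over each distinct inheritance path."""
    return sum(0.5 ** steps for steps in path_steps)

# Ethan and Eva as 1st cousins: one path through Charlie and one through
# Cindy, each with two steps up and two steps down (four steps in total).
first_cousin_paths = [4, 4]

r_first_cousins = coefficient_of_relationship(first_cousin_paths)
cm_first_cousins = r_first_cousins * TOTAL_CM

print(f"{r_first_cousins:.1%} of the genome, about {cm_first_cousins:.0f} cM")
# prints: 12.5% of the genome, about 850 cM
```

Each path contributes one half raised to the number of steps along it, so two four-step paths give 2 x 1/16 = 1/8.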

This could be the end of the story for predicting the expected amount of shared DNA for this relationship, except that Ethan and Eva also share another set of recent common ancestors. Because their grandparents Charlie and Cindy are 1st cousins, Charlie and Cindy’s shared grandparents, Adam and Anna, are Ethan and Eva’s great-great-grandparents. Adam and Anna appear twice in their pedigree, leaving only 14 unique great-great-grandparents rather than the typical 16. Ethan and Eva each have a double-dose of Adam and Anna floating around in their genetic make-up. This is what causes the elevated level of shared DNA in downstream descendants over what they would expect in the absence of pedigree collapse.

The figure below describes an important concept to draw out of this particular case study. Adam and Anna exist twice at the great-great-grandparent level. We want to calculate the amount of increased DNA sharing we would expect with this “double-dose” of the same ancestors. The inheritance pathways through one set of Adam and Anna contribute the typical amount of shared DNA to their downstream descendants, as would happen in the absence of duplicated ancestors. The second set of Adam and Anna is the source of the increased DNA sharing in their descendants. We therefore use the distinct inheritance pathways through this second set to augment the coefficient of relationship we’ve already determined for Ethan and Eva as 1st cousins through Charlie and Cindy. Because of this pedigree collapse, they will share the typical amount of DNA that 1st cousins would, plus the coefficient of relationship determined from the pathways and degrees of separation through only the extra set of Adam and Anna.

As before, we will determine the distinct inheritance pathways and degrees of separation between Ethan and Eva, but this time through Adam and Anna. Because they are on the pedigree twice, there are separate paths that traverse each set of Adam and Anna. The first pathway analysis goes through the first set of Adam and Anna, and we will designate this as the one that represents the typical amount of shared DNA that a pair of ancestors would contribute to their descendants.

Remember though, we want to quantify the amount of increased DNA sharing over what would be the typical amount of DNA contribution from a single set of the same ancestors. To calculate the amount with which to augment the coefficient of relationship we will only include the analysis that comes through the extra set of Adam and Anna, as these are the inheritance pathways that provide the elevated DNA contribution.

Using the same methodology for calculating the coefficient of relationship, we can determine the amount that this secondary relationship between Ethan and Eva would augment the numbers for their closer 1st cousin relationship.

Remember that Ethan and Eva are also 1st cousins, and are expected to share 12.5% of their DNA through that relationship. Shifting the focus to their secondary relationship as 3rd cousins through Adam and Anna, Ethan and Eva are expected to share an additional 0.781% of their genome over what would be expected in the absence of pedigree collapse, due to the double-dose of Adam and Anna in their pedigree at the great-great-grandparent level. This translates into approximately 53 cM on average of elevated shared DNA between these 1st cousins who descend from grandparents who are also 1st cousins:

### 0.781% x 6800 cM = 53 cM
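As a rough check on where 0.781% comes from, here is a small Python sketch. It assumes, following the text, that only the two paths through the extra set of Adam and Anna are counted, and that each path has eight steps (four up from Ethan and four down to Eva, as for any 3rd cousin relationship). The variable names are illustrative only.

```python
TOTAL_CM = 6800  # genome length assumed in the article, in cM

# Paths through the extra set of Adam and Anna: one via Adam, one via Anna,
# each with four steps up from Ethan and four steps down to Eva.
extra_paths = [8, 8]

# Each path contributes (1/2)**steps to the coefficient of relationship.
extra_fraction = sum(0.5 ** steps for steps in extra_paths)
extra_cm = extra_fraction * TOTAL_CM

print(f"extra fraction: {extra_fraction:.3%}")   # extra fraction: 0.781%
print(f"extra shared DNA: {extra_cm:.0f} cM")    # extra shared DNA: 53 cM
```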

On average, the total amount of DNA that Ethan and Eva would expect to share is calculated by summing over all of their recent common relationships in the last several generations.

Ethan and Eva represent relatives who have a recent incident of isolated pedigree collapse in their ancestry. They are 1st cousins themselves, and they share grandparents who are also 1st cousins. Ethan and Eva are genetically more similar to one another than 1st cousins who have entirely-unrelated recent ancestors, so we expect them to have an elevated level of cM sharing. For 1st cousins in the absence of pedigree collapse, we expect that on average they would share 850 cM. For Ethan and Eva this level of DNA sharing is elevated on average by 53 cM, making an expected augmented average of 903 cM in the presence of this particular form of recent pedigree collapse.

It is important to note that these average shared DNA figures are just that: averages. This means that due to the random inheritance of DNA fragments of varying sizes at each generation, it is very possible that pairs of relatives will share DNA at levels higher or lower than the predicted average, but within a range that spans the average predicted amount. This is evident in shared DNA numbers compiled in the Shared cM Project (Table 1), as well as in data simulated by AncestryDNA, which predicts the ranges of cM sharing that are expected for various levels of relationship (see Figure 5.2 in this white paper). Information from these sources on the ranges of shared DNA expected statistically or observed empirically for various relationships is summarized in the table below.

First cousins share on average 850 cM, but in practice 1st cousins report levels of DNA sharing in the range of 619-1159 cM (Shared cM Project), and AncestryDNA predicts statistically that their shared DNA will fall anywhere between 650 and 1300 cM. The recent pedigree collapse event in the pedigree of Ethan and Eva increases their average shared DNA as 1st cousins from 850 cM to 903 cM, but this elevated figure still falls well within the expected range for DNA sharing among 1st cousins. This isolated incident of pedigree collapse is not enough to inflate their amount of shared DNA to make them appear to belong to a different relationship category (half sibling or otherwise). They still fall solidly within the range of DNA sharing expected for 1st cousins.
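The comparison above can be written out as a short Python sketch. The averages and ranges are the ones quoted in this article; the dictionary and variable names are just labels chosen for the example.

```python
first_cousin_cm = 850  # average for 1st cousins without pedigree collapse
extra_cm = 53          # average added by the duplicated ancestors

expected_total = first_cousin_cm + extra_cm  # 903 cM

# Ranges for 1st cousins as quoted in the text: (low, high) in cM.
first_cousin_ranges = {
    "Shared cM Project": (619, 1159),
    "AncestryDNA": (650, 1300),
}

for source, (low, high) in first_cousin_ranges.items():
    inside = low <= expected_total <= high
    print(f"{source}: is {expected_total} cM within {low}-{high} cM? {inside}")
```

Both checks come out true, which is the point of the paragraph above: the augmented average stays inside either range for ordinary 1st cousins.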

If you want to read even more about the amount of expected shared DNA between cousins with different relationships, read this article by fellow genetic genealogist Leah Larkin.

## How much more DNA are these cousins sharing?

If Ethan and Eva were to encounter each other for the first time on a DNA company cousin match list, this exercise tells us that their increased DNA sharing would likely not be enough to tip them off to the fact that there is a recent occurrence of pedigree collapse in their shared ancestry. They still look a lot like just regular 1st cousins—1st cousins whose recent ancestors are entirely unrelated.

This proves to be true for other, more distant, cousin relationships downstream from this particular form of isolated pedigree collapse, where 1st cousins form a union and have children. The amount of shared DNA between 2nd cousins Fred and Fiona in the Colapso family, who are what we call downstream from our pedigree collapse, is only elevated by 13 cM on average, because we are adding an additional 4th cousin relationship (through Anna and Adam) to their 2nd cousin relationship through Cindy and Charlie. (Note that while our theoretical calculation says 13 cM, the Shared cM Project indicates that the average amount of shared DNA for 4th cousins is 35 cM. There are lots of fun reasons for this, but we aren’t going to talk about that right now.) But no matter how you look at it, the shared DNA between Fiona and Fred is still compatible with the expected range of shared cM for 2nd cousins without pedigree collapse. For downstream 3rd cousins (so Fiona and Fred’s kids), the amount of DNA sharing is only increased by 3 cM (the calculated added amount for their extra 5th cousin relationship), again keeping the total shared DNA well within the expected range. The effect of the pedigree collapse is lessened with each succeeding generation, as descendants become more distant from the “double-dose” ancestors that appear twice in their pedigree.
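To see how quickly the effect fades, the same calculation can be repeated for each downstream generation. The sketch below assumes, as in the text, that the extra relationship contributes two paths (one via Adam, one via Anna) and that an n-th cousin relationship has n + 1 steps up and n + 1 steps down. The function name `extra_cm` is made up for this example.

```python
TOTAL_CM = 6800  # genome length assumed in the article, in cM

def extra_cm(cousin_degree, n_paths=2):
    """Average cM added by an extra n-th cousin relationship.

    An n-th cousin relationship has (n + 1) steps up to the shared
    ancestor and (n + 1) steps back down; two extra paths are counted,
    one via Adam and one via Anna.
    """
    steps = 2 * (cousin_degree + 1)
    return n_paths * 0.5 ** steps * TOTAL_CM

downstream_pairs = [
    ("Ethan and Eva (1st cousins, extra 3rd cousin link)", 3),
    ("Fred and Fiona (2nd cousins, extra 4th cousin link)", 4),
    ("their children (3rd cousins, extra 5th cousin link)", 5),
]

for label, degree in downstream_pairs:
    print(f"{label}: about {extra_cm(degree):.0f} cM extra")
```

Each generation adds two more steps to every path, so the extra contribution drops by a factor of four each time: roughly 53, 13, and 3 cM.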

Because this is one of the closest forms of isolated pedigree collapse possible (the union of 1st cousins to produce children would only be upstaged in genetic similarity by the union of siblings, half-siblings, uncle-niece, and other immediately related relative pairs), it follows that more distantly related unions (2nd cousin and so forth) would have even less of an effect on the total shared DNA between downstream descendants. So cousins resulting from these more distantly related unions who encounter each other for the first time on a DNA company match list would likely see no evidence of this form of more distant and isolated pedigree collapse in their amount of shared DNA.

## Multiplying effect of pedigree collapse

There are situations that some encounter in their ancestry where, unlike this example in the Colapso family with just one isolated incident of the union of related ancestors, there are many unions between related people at many generational levels. This situation produces much more widespread pedigree collapse where many of the same ancestors appear multiple times in the recent generations of a pedigree. In this case, there are many inheritance pathways that intertwine between relatives and unions, and children are produced from parents that are much more genetically similar than unrelated people.

What if you’re actually seeing endogamy, not just a couple instances of pedigree collapse?

Due to the now widespread accessibility of consumer genetic testing, downstream descendants in these families are often encountering each other for the first time on a DNA company match list. Upon comparing pedigrees they find that their level of shared DNA is substantially elevated from what would be expected for the typical ranges of cM sharing for their closest relationship. See Kimberly T. Powell’s excellent book chapter, “The Challenge of Endogamy and Pedigree Collapse” in Debbie Parker Wayne’s Advanced Genetic Genealogy: Techniques and Case Studies (2019, pp. 127-153), for examples of widespread pedigree collapse within an extended family and how it affects the level of cM sharing among downstream individuals.

Thank you for sticking with me through all that fun math, and learning more about pedigree collapse through several generations of the Colapso family! This exercise has introduced some of the issues that attend the interpretation of genetic data from a collapsed pedigree. The most global takeaway is that pedigree collapse always results in elevated levels of DNA sharing in the next several generations of descendants, but the farther away from the collapsing event the less effect it has on the shared DNA of descendants.

If the presence of pedigree collapse is isolated rather than widespread, downstream cousins will still expect increased DNA sharing but the elevated amount is modest enough to keep the augmented values well within the range of expected shared DNA for that relationship. Cousins will likely not recognize from their amount of shared DNA that they have an isolated incident of pedigree collapse in their recent pedigree.

This changes, however, if the incidents of closely related unions are widespread, as the augmented level of cM sharing among descendants may be enough to bump them up to a closer relative category. In the presence of widespread pedigree collapse, the same tools used to determine the coefficient of relationship among relatives can be applied and are a great help in deciphering the confounding and overlapping nature of relationships in these complicated pedigrees. And remember, if you encounter a cousin on a DNA match list who descends through a shared lineage with pedigree collapse, you may not recognize from the amount of shared cM that you are experiencing an elevated level of DNA sharing. But look closely at those numbers and see where you fall in the range of expected DNA sharing: you may find you are on the upper end.

## Put these skills in action

Now that you understand more about how you may be related to your matches, take the next step and start learning how you can use your matches to find your ancestors. Our free downloadable guide, 4 Next Steps for your DNA, has simple, actionable steps you can take to get started!

#### Jayne Ekins

Jayne has been in the field of genetic genealogy since its beginnings as part of the Sorenson Molecular Genealogy Foundation. She has lectured throughout the United States and international venues on the applications of molecular biology to elucidating ancient and recent genealogical connections. She has authored and co-authored many peer-reviewed scientific publications, as well as general articles on genetic genealogy. It is a pleasure for her to see the accelerating developments in genetic genealogy, and the wide accessibility and application it has for the average human curious about their origins.

1. I come from a village in the Scottish Western Isles where pedigree collapse is common and recent. By analysing the various relationships between 9 of my cousins I have found that, adding together the average cM for each relationship gives a very good prediction of the actual cM each one shares, about + or – 20%. I’ve calculated across 36 relationship and it works — a practical application of your theory.

2. My aunt married her first cousin. Would I use the same technique to calculate the average cM increase in my match to her daughter (my first cousin)?

• Right.
WHEW!
So you will have to take both of those relationships into account when you are comparing with that cousin.

3. My father-in-law’s parents were cousins to each other in more than one way. My mother-in-law’s parents were cousins to each other more than once as well. I administer tests for both my husband and his half-uncle on his mother’s side. My husband has matches with shared DNA greater than 100 cM and those people are super excited to find him. When I look at those same matches with the half-uncle, the truth reveals itself in that my husband shouldn’t actually share DNA with the match. (The half-uncle’s shared DNA is usually less than 20-30 cM.) When talking with matches about my husband’s DNA, the first thing I make sure to tell them is that his DNA is "unnaturally over-saturated" and try to help them figure out exactly how they are really connected.

4. Thanks for a very interesting article and the method for calculation of the effect of Pedigree Collapse on the amount of DNA two descendants have in common. I have a question about your calculation. I modified your model and gave Bob a brother and attempted to calculate the amount of DNA Eva would be expected to have in common with Bob’s brother’s great grandchild, Eva’s third cousin. In my model, I found four distinct paths of 8 steps each between Eva and her third cousin giving them 106 cM in common or twice the "extra" DNA you calculated in common between Eva and her cousin Ethan. I believe that there are four paths between Eva and Ethan as well. Do you agree? I would very much appreciate hearing your response to my question. Thank you.

5. Hello Kenneth Wing, this is Jayne Ekins. Thank you for your interesting question. It’s great to collaborate on vetting methods like this with different scenarios. This would be much better to discuss interactively, but I’ll try to be as clear as possible about the way I see this problem in this message format.

First your question about number of distinct paths. If Bob has a brother (I’ll call him Bruce) whose descendants do not intermarry with their cousins, then yes I do see 4 distinct paths of 8 steps each between Eva and the great-grandchild descendant of Bruce (I’ll call him Eric). Eva and Eric are 3rd cousins, and because Eva descends from an incidence of cousin intermarriage she still brings to the table an "extra dose" of their mutual common ancestors Adam and Anna. To calculate the amount of DNA expected to be shared between Eva and Eric, the paradigm I’m coming from first takes into account the "typical" amount of DNA expected to be shared in a pair’s primary relationship outside of pedigree collapse. In this case for 3rd cousins that is 53 cM, coming from 2 of the 4 distinct pathways between Eva and Eric. The other 2 pathways I consider to contribute "extra" DNA in common to the relationship due to pedigree collapse coming from Eva’s ancestry. Taken together, the "typical" amount plus the "extra" amount of DNA sharing, makes 106 cM of expected DNA sharing between Eva and Eric.

I think the root of your question centers around how this is different from the relationship between Eva and Ethan who are 3rd cousins as well, but also 1st cousins due to their grandparents being 1st cousins themselves. For Eva and Ethan there are 6 distinct paths (2 for their 1st cousin relationship, and 4 for their 3rd cousin relationship). The key difference to me comes through needing to disregard two of the paths of the 3rd cousin relationship through Adam and Anna, because these are already contained in the "typical" amount of DNA that 1st cousins share. The "extra" for Eva and Ethan due to pedigree collapse comes from just 2 of the paths through the 3rd cousin relationship as these are outside of the typical relationship paths that exist between 1st cousins that do not have incidence of pedigree collapse.

This is what I see in the data.

6. Hi Jayne. This is Kenneth Wing again.

Thank you for responding to my question regarding the calculation of the effect of Pedigree Collapse on the average expected amount of DNA two descendents of a common ancestor (pair) have in common. I am happy that you could confirm my calculation of the DNA shared by Eva and Eric. I would like to add that this calculation applies only to a situation in which inter-marriage occurred in one leg of the relationship downstream from the common ancestor(s).

I had some difficulty understanding your calculation of the expected average DNA shared by Eva and Ethan as a result of their being both first and twice third cousins, particularly your reasoning for having to disregard two of the paths of the 3rd cousin relationship for Eva and Ethan from Anna and Adam. It feels arbitrary designating two of the paths through their grandparents as the routes for "typical" DNA in common and the other two paths as the routes for "extra" DNA in common.

I have followed a different line of reasoning. As the result of their grandparents Cindy and Charlie being first cousins, Eva and Ethan each inherited about 25% of their DNA from Adam and Anna. As cousins, Eva and Ethan share about 12.5% of their total DNA. As Adam and Anna are the only ancestors Eva and Ethan have in common (in this example at least), 12.5% represents half of the DNA they each inherited from Adam and Anna.

As twice third cousins, Eva and Ethan should have 1.56% of their DNA in common. However, as they also share half of the DNA they each inherited from Anna and Adam, statistically only half the 1.56% or 0.781% should be added to the DNA from Anna and Adam which they share as first cousins. The remaining 0.781% is an integral part of the DNA they share as first cousins.

Thus it would appear that we arrived at the same result using two different methods of reasoning. That too makes me very happy!

Thanks again for this rewarding discussion.

Best regards, Kenneth

P.S. If you would like to pursue this discussion further, it might be easier if you contacted me directly using my e-mail address: kennethrwing@gmail.com

7. I am collaborating with Ancestry DNA matches; we share common in-breeding situations (Parent-Child; Sibling-Sibling). Can you describe the approach for distinct paths in these situations?

• Hi Jane. It really depends on what you want to know. If you are researching a common ancestor between you, there is nothing to separate, necessarily. You can just use both of you to help find more matches that have to do with your shared ancestor. It is really just the descendants of these relationships who have different amounts of shared DNA. But as long as you are aware of the relationships, you can account for it.

8. Greetings Jayne, I found your articles on pedigree collapse while attempting to determine the average percentage of shared DNA between the offspring of two brothers who married women who are first cousins. The children of these two marriages share grandparents through their fathers, making them first cousins, and great-grandparents through their mothers, making them second cousins. By my calculations, the average percentage of shared DNA between the offspring is 37.5%. Did I understand the information contained in your articles correctly?

• Thanks for your comment! You are absolutely right, in a case where brothers marry first cousins, the children of these couples would be first cousins AND second cousins. First cousins share about 11.6% of DNA with each other, while second cousins share about 3% of DNA, so we would expect these children to share about 14-15% of DNA. That’s an average of course, so anywhere in a range of 6-25% would be a possibility. The Shared centimorgan tool is a great resource for cases like this (https://dnapainter.com/tools/sharedcmv4)

9. Hello
My mother was adopted and it is likely her birth parents were brother and sister.
Is there a formula I can use to help figure out the true likelihood of her relationship with her DNA matches?
Thank you!

• Hi Christine, there is a formula and it’s pretty simple! It’s just addition. So for instance, if your mom’s birth parents had another child together, genetically that child would be a sibling to your mom as well as a first cousin. You would take the amount of DNA siblings share on average (2600 cM) and the amount 1st cousins share (800 cM) and add them to get your new combined amount, about 3400 cM. For each relative you investigate you’ll want to go through this process based on what their two relationships to your mom would be.
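For readers who would like to see the "just addition" approach from the reply above written out, here is a minimal Python sketch. The averages are the round figures quoted in that reply, not authoritative values, and the names `average_cm` and `combined_average` are made up for this example.

```python
# Average shared cM per relationship, using the round figures from the reply.
average_cm = {
    "sibling": 2600,
    "first cousin": 800,
}

def combined_average(relationships):
    """Add up the average shared cM for every relationship a pair has."""
    return sum(average_cm[name] for name in relationships)

# A second child of the same two birth parents would be both a sibling
# and a first cousin to the first child.
total = combined_average(["sibling", "first cousin"])
print(f"combined average: about {total} cM")  # combined average: about 3400 cM
```

The same function can be reused for any relative by listing each of that person's relationships to the tester, provided an average for each relationship is added to the table.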