
How Big Is The Huntington’s Disease Iceberg?
We can count how many people are seen in a clinic with Huntington’s disease by genetic testing, but how many people with the gene for HD are never diagnosed by a doctor? A new mathematical approach seeks to revisit this tricky question.
Huntington’s disease (HD) is caused by a stretch of repeating C-A-G letters of genetic code that is too long. Everyone who develops HD is born with 36 or more CAG repeats, but not everyone with 36 or more CAG repeats is actually diagnosed with HD. That’s either because they are not yet old enough to have symptoms, or because they have symptoms but have not been given a correct diagnosis by a doctor. As a result, mathematical models of how many people have HD don’t match up with how many people have been predictively tested or diagnosed in the clinic. Researchers have now tried a new way of calculating how many people have HD but are not diagnosed.
Three repeating letters – and 36 or more cause HD
The repeating C-A-G letters in the huntingtin gene that cause HD are like three letters repeated on a specific page of a book. People who develop HD are born with 36 or more CAG repeats, one after the other, like this: …CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG … (That’s 40 CAGs by the way.)
The genetic cause of HD means that everyone who develops the disease has an identical and easily identified region in their genetic code that can be used for diagnostic or predictive testing. When a doctor suspects someone has HD, based on their symptoms, they will order a test that counts the number of CAG repeats that person has. If the test comes back with 36 or more CAG repeats in the huntingtin gene, then that person is formally diagnosed with HD. Counting up all the people with a formal diagnosis of HD is how we measure the prevalence of HD.
However, not everyone who has 36 or more repeats is diagnosed with HD. For one thing, someone might decide to get the test predictively, because they may have inherited 36 or more repeats, but are not old enough yet to have symptoms of HD. Someone like this who receives a test result with 36 or more CAG repeats, but does not yet have symptoms of HD, is usually called gene-positive. They are not counted in prevalence, because they don’t yet have symptoms of HD.
But there are also people who have 36 or more CAG repeats and symptoms of HD who have not been tested. This could be because they don’t have adequate access to health care, because of the social stigma of HD, or because of insurance concerns. Or perhaps they’ve never even suspected they may have HD, either because they don’t know about their family history of HD or because they are the first person in their family to develop the disease. This raises the question: how many people with 36 or more CAG repeats have symptoms of HD but don’t get counted in the prevalence data for HD?

Finding everyone with 36 or more CAG repeats – how big is the iceberg?
Figuring out how many people have 36 or more CAG repeats, but never show up to a doctor, is a bit like sizing up an iceberg. There’s a visible part above water and an unknown part hidden out of view. The visible part of the iceberg represents the people who get a positive test for 36 or more CAG repeats: we can see and count them.
The part of the iceberg below the water represents the many people who have 36 or more CAG repeats but are never tested. Most of the people in this hidden part of the iceberg are too young to have symptoms of HD, even though they have 36 or more CAG repeats. But at least some of them are people with symptoms of HD who are never tested or diagnosed.
HD researchers have tried to figure out how many people have 36 or more CAG repeats, but are never tested, and they are getting close to an answer. Some scientists have anonymously tested thousands of people from the general public to determine how many have 36 or more CAG repeats within their huntingtin gene. Researchers with newer technology and bigger pools of DNA have further refined these numbers. The consensus is that about 1 in 400 people has 36 or more CAG repeats in Europe and North America, where HD is most common.
How many people have HD but are never tested?
Ok, so 1 in 400 people has 36 or more CAG repeats. But remember, some of these are people who are too young to develop symptoms of HD. How many people with 36 or more CAG repeats actually have symptoms of HD, but haven’t been tested or diagnosed?
This question has been surprisingly hard to answer, because we don’t know how many people in that underwater part of the iceberg actually have symptoms of HD. We can only count people with symptoms of HD in the visible part of the iceberg, who get tested and diagnosed in a doctor’s office.
Some researchers think a large portion of people in the hidden part of the iceberg don’t have HD and will never get HD. A tantalizing thought for HD families! But why do they think this? Because 1 in 400 is already a lot more people than ever get diagnostically tested.
The prevalence of HD, meaning people with HD in the visible part of the iceberg, is about 1 in 8000. This is how many people actually get diagnosed with HD by a doctor, which is way less (about one-twentieth!) than the number of all people who have 36 or more CAG repeats (1 in 400). Even after accounting for people who are gene-positive and too young to have symptoms, that would leave a huge number of unknown cases of HD, which some researchers think doesn’t make sense. Other researchers think most people with 36 or more CAG repeats will eventually develop HD symptoms if they live long enough, but just aren’t getting tested and appropriately diagnosed. They may have symptoms that simply don’t get recognized as HD, especially if they are very old. This concept of people having symptoms of HD but not getting diagnosed is called underascertainment. Literally, this means that some people with HD are missing from the prevalence count.
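To make the size of this gap concrete, here is a small back-of-the-envelope calculation using only the two rounded figures quoted above. The population of one million is a hypothetical number chosen purely to make the arithmetic easy to read.

```python
from fractions import Fraction

# Rounded figures quoted in the article (estimates, not exact counts).
carrier_rate = Fraction(1, 400)      # people with 36 or more CAG repeats
diagnosed_rate = Fraction(1, 8000)   # people formally diagnosed with HD

# Hypothetical population size, chosen only for illustration.
population = 1_000_000

carriers = carrier_rate * population           # whole iceberg
diagnosed = diagnosed_rate * population        # visible tip
visible_share = diagnosed_rate / carrier_rate  # tip as a share of the whole

print(f"People with 36+ repeats per million: {int(carriers)}")
print(f"People diagnosed with HD per million: {int(diagnosed)}")
print(f"Visible share of the iceberg: {visible_share} ({float(visible_share):.0%})")
```

With these figures, the diagnosed group works out to one-twentieth of everyone carrying 36 or more repeats. Keep in mind that much of the remainder is expected to be people who are still too young to have symptoms.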

Using clever math to tackle the problem
A well-known research group at Massachusetts General Hospital has recently tackled this question, using a new mathematical approach to explore underascertainment. They started with the question above: how many people have HD but are not diagnosed?
To estimate how many people have 36 or more CAG repeats and might have symptoms of HD, the researchers used an interesting feature of CAG repeats across people: the longer the repeat, the fewer people have it. 17 is the most common number of CAG repeats in people, but there are fewer people with 18 CAG repeats, then still fewer with 19 CAG repeats, and fewer and fewer all the way up to 36 CAG repeats.
This is part of why HD is a relatively rare disease: repeats of 36 or more CAGs are actually pretty uncommon among people in general. Dr. Jong-Min Lee and his team used this observation to estimate how many people with 36 or more CAG repeats should be found among millions of people.
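The general idea of "fewer and fewer people at each longer repeat" can be illustrated with a toy model. This is not the researchers’ actual method, which the article does not spell out. It is a minimal sketch that assumes each extra repeat is a fixed fraction as common as the one before. The function name `tail_fraction` and every input number below are made up for illustration and are not real CAG frequencies.

```python
def tail_fraction(start_freq, ratio, start_len, threshold):
    """Share of people at or above `threshold` repeats in a toy model where
    each extra repeat is `ratio` times as common as the one before.

    Closed form of the geometric series:
    sum over n >= threshold of start_freq * ratio ** (n - start_len)
    """
    if not 0 < ratio < 1:
        raise ValueError("ratio must be strictly between 0 and 1")
    return start_freq * ratio ** (threshold - start_len) / (1 - ratio)


# Made-up toy inputs, NOT real data: half of people sit at the most common
# length (17), and each extra repeat is half as common as the previous one.
toy_tail = tail_fraction(start_freq=0.5, ratio=0.5, start_len=17, threshold=36)
print(f"Toy model: about 1 in {round(1 / toy_tail):,} people at 36 or more repeats")
```

The toy inputs are deliberately unrealistic, so the printed number should not be compared with the real-world estimates. The point is only the shape of the calculation: once you know how quickly repeat lengths thin out, you can add up the expected share of people at 36 repeats and beyond.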
More than expected – but still just an estimate
The researchers estimated that about 1 in 325 people have 36 or more CAG repeats. That’s a bit more than reported in the anonymized studies mentioned earlier. But it’s important to note this is a simulated number, and not directly tested from people, so we don’t know if it’s any more accurate than 1 in 400.
The researchers then did some further calculations to simulate the ages of people, estimate how many should have developed symptoms of HD, and estimate how many would have died of HD or other causes. This complex math is needed because people develop HD at different ages and also pass away at different ages. They applied these calculations to the total number of people with 36 or more CAG repeats and, voilà, arrived at an estimate of the number of people with 36 or more CAG repeats who actually have symptoms of HD. Finally, the researchers compared this estimate of how many people have HD to the published prevalence of HD, that is, how many people have been formally counted from a clinical diagnosis. Surprisingly, they estimate that only about 50% of people with symptoms of HD might be counted in prevalence.
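The last step, comparing an estimate with the published prevalence, can be sketched with the rounded numbers quoted in this article. This is only a rough illustration and not the study’s own calculation. In particular, it assumes the published prevalence used for comparison is the 1 in 8000 figure mentioned earlier, which the article does not confirm.

```python
from fractions import Fraction

# Rounded figures from the article. Using 1 in 8000 as the comparison
# prevalence is an ASSUMPTION made for this sketch.
diagnosed_prevalence = Fraction(1, 8000)  # people formally diagnosed with HD
ascertainment = Fraction(1, 2)            # "about 50%" are counted
carrier_rate = Fraction(1, 325)           # simulated share with 36+ repeats

# If only half of people with symptoms are counted, the number of people
# with symptoms must be about double the number diagnosed.
estimated_symptomatic = diagnosed_prevalence / ascertainment

# Share of everyone with 36+ repeats who would have symptoms at one time.
symptomatic_share = estimated_symptomatic / carrier_rate

print(f"Implied people with HD symptoms: 1 in {estimated_symptomatic.denominator}")
print(f"Implied share of 36+ carriers with symptoms: {float(symptomatic_share):.1%}")
```

Under these assumptions, roughly 1 in 4000 people would have symptoms of HD, which is only a small share of everyone carrying 36 or more repeats at any one time.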
What about the rest?
Why might half of people estimated to have HD not be diagnosed and appropriately counted? There are many potential explanations. One explanation is that some people have symptoms but don’t recognize them, or don’t seek to be tested. Another is that some people have subtle symptoms later in life that are just mistaken for old age. Or perhaps CAG repeats between 36 and 39 – repeats found in a grey zone known as reduced penetrance – don’t lead to symptoms of HD as often as we thought. CAG repeats between 36 and 39 are found in the general public, but aren’t that common in people diagnosed with HD. We still don’t know how often these CAG repeats between 36 and 39 might lead to symptoms of HD.
But you can be sure that researchers like these are hard at work to figure out how many people have HD and how to find them. Having a better understanding of how many people there are that have the gene for HD, but who don’t develop symptoms of the disease, or only do so very late in their lives, could help scientists prolong the healthspan and/or lifespan of people with HD and help develop future treatments.
Summary
- HD is caused by 36+ CAG repeats in the huntingtin gene, but not everyone with this expansion is diagnosed.
- Prevalence estimates don’t match reality because many people with the gene aren’t tested or diagnosed.
- Population studies suggest ~1 in 400 people in Europe/North America carry 36+ repeats, and the new simulation puts the figure as high as 1 in 325, far more than the ~1 in 8,000 clinically diagnosed.
- This mismatch raises two possibilities: either many carriers never develop HD, or many people with symptoms remain undiagnosed (underascertainment).
- A new mathematical model suggests only ~50% of people with HD symptoms are formally diagnosed.
- Reasons for undercounting may include lack of testing, subtle late-onset symptoms, misdiagnosis, or reduced penetrance at 36–39 repeats.
- Understanding the full “iceberg” of HD prevalence is critical for preparing treatments and supporting families.
Learn More
Original research article, “Significant underascertainment in Huntington’s disease” (open access).