Gene and allele: how to tell the difference
These two concepts are often confused. Once you know what a gene is and what an allele is, you will be ready for the next step: understanding haplosufficiency.
By Alfonso Prado-Cabrero, PhD

Our genetic information is in duplicate
Each human being inherits 23 chromosomes from their mother and 23 chromosomes from their father, making 46 chromosomes in total (Fig. 1). As these two sets of 23 chromosomes are essentially equivalent, we can pair each chromosome from dad with its equivalent (or homologous) chromosome from mum. Homologous means that both chromosomes of each pair (generally) contain the same genes in the same positions. That is, we receive our genetic information in duplicate. Nevertheless, there are lots of small differences between the two chromosomes of each homologous pair. In most cases these differences do not ruin anything, but make us taller or shorter, blue-eyed or brown-eyed. However, small differences can also lead to disease.

Figure 1. Baby girl with her mother (pink), her father (purple) and the respective inherited chromosomes. In pink, the chromosomes that she inherited from mum, and in purple the chromosomes that she inherited from dad.
If we focus on the first pair of homologous chromosomes (Fig. 2), we see black lines of different lengths inside them. Each of these lines symbolizes a gene. If you compare the pattern of lines in both chromosomes, it is the same. To emphasize this homology, the names of the first four genes are shown.

Figure 2. Zoom of the first pair of homologous chromosomes. The black lines of different lengths represent the genes that each chromosome contains. Of note, the pattern of lines is the same on each chromosome. The names of the first four genes are shown.
Now we delve into a gene. Its alleles have a little difference…
We will take the gene TBH as an example. This gene does not really exist, but it will help us understand what the alleles of a gene are. Fig. 3 shows the DNA sequence of the two alleles of the gene TBH that the girl of Fig. 1 has inherited. Each allele is named following the nomenclature used in human genetics (TBH*1 and TBH*2).
If we look at the DNA sequence of each allele (the A, G, C and T thread), it is identical in both cases, except for an adenine (A) in the allele TBH*1 (distinguishable by its larger font size), which is a guanine (G) in the allele TBH*2. This change is minimal, and such a small change in the sequence of a gene is usually of no consequence, but sometimes such a change can affect the function of the gene and therefore the protein it encodes.

Figure 3. DNA sequence of two alleles of the gene TBH. The nucleotide that differs between the two alleles (A in the allele *1 and G in the allele *2) can be distinguished by its larger font size.
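The comparison of the two alleles can be sketched in a few lines of Python. The sequences below are hypothetical: they are not the ones drawn in Fig. 3, and were chosen only so that the two alleles differ by a single A/G change. The function name `differing_positions` is likewise made up for this illustration.

```python
# Hypothetical sequences (assumption): NOT the sequences shown in Fig. 3.
TBH_1 = "ATGGCCAAGCAGCTGTAA"
TBH_2 = "ATGGCCGAGCAGCTGTAA"

def differing_positions(allele_a, allele_b):
    """Return (1-based position, base in a, base in b) for every mismatch."""
    if len(allele_a) != len(allele_b):
        raise ValueError("this sketch only compares alleles of equal length")
    return [
        (position, base_a, base_b)
        for position, (base_a, base_b) in enumerate(zip(allele_a, allele_b), start=1)
        if base_a != base_b
    ]

print(differing_positions(TBH_1, TBH_2))  # [(7, 'A', 'G')]
```

Everything else in the two sequences is identical, which is what makes them two alleles of the same gene rather than two different genes.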
What implications can such a small difference have?
As we know, the cell transcribes each allele into messenger RNA (mRNA), and then translates each mRNA into protein. Fig. 4 shows schematically the two alleles of the gene TBH of the girl of Fig. 1, highlighting the nucleotide that differs between the two alleles (A in TBH*1 and G in TBH*2). The purpose of Fig. 4 is to show that the cell transcribes both alleles into mRNA normally, and then successfully translates these mRNAs into protein. Nevertheless, if we look at the proteins produced from each allele, the protein from TBH*2 has a small bump (indicated by the arrow). This is the effect of the A and G difference: a single amino acid has changed, making the structure of the two proteins slightly different.

Figure 4. Schematic representation of the flow from gene to mRNA and protein for the two alleles of the TBH gene. These alleles differ in one deoxynucleotide: A for TBH*1 and G for TBH*2. This difference is responsible for the subtle difference in the structure of the resulting protein, indicated by the arrow.
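As a rough sketch of this flow from gene to mRNA to protein, the Python below transcribes and translates two hypothetical allele sequences (assumed for illustration, not taken from the figures). The codon table is deliberately incomplete: it holds only the codons of the standard genetic code that these made-up sequences use. The function names `transcribe` and `translate` are also illustrative.

```python
# Hypothetical coding-strand sequences (assumption for illustration only).
TBH_1 = "ATGGCCAAGCAGCTGTAA"
TBH_2 = "ATGGCCGAGCAGCTGTAA"

# Only the codons needed here; the standard genetic code has 64.
CODON_TABLE = {
    "AUG": "Met", "GCC": "Ala", "AAG": "Lys", "GAG": "Glu",
    "CAG": "Gln", "CUG": "Leu", "UAA": "Stop",
}

def transcribe(coding_strand):
    """Return the mRNA that matches a DNA coding strand (T becomes U)."""
    return coding_strand.replace("T", "U")

def translate(mrna):
    """Read the mRNA three bases at a time until a stop codon is reached."""
    protein = []
    for start in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[start:start + 3]]
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein

protein_1 = translate(transcribe(TBH_1))
protein_2 = translate(transcribe(TBH_2))
print(protein_1)  # ['Met', 'Ala', 'Lys', 'Gln', 'Leu']
print(protein_2)  # ['Met', 'Ala', 'Glu', 'Gln', 'Leu']
```

Both alleles are transcribed and translated in full. The single A/G difference swaps one amino acid (Lys for Glu in this made-up example), which corresponds to the small bump in the schematic.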
What are the practical implications of such a difference?
Fig. 5 shows the implications in this particular case. The protein encoded by TBH*1 can add a white ball to 20 black balls per second, whereas the protein encoded by TBH*2 manages only 18 black balls per second. This is because the small bump highlighted in Fig. 4 in the protein TBH*2 partly obstructs the entry of the substrate (the black ball) into the protein. Researchers take the protein that works best (TBH*1) as the reference and assign it 100% activity. TBH*2 therefore works at 18/20 = 90% of that activity.

Figure 5. Performance of the proteins encoded by TBH*1 and TBH*2. The protein TBH*1 can convert 20 black balls into black and white balls per second. The protein TBH*2, in contrast, can only process 18 black balls per second because of the protuberance at its entrance.
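The percentages come from dividing each rate by the rate of the reference protein. Here is a minimal Python sketch using the two rates given in the text; the function name `relative_activity` is made up for this illustration.

```python
# Rates from the text: black balls processed per second by each protein.
rates_per_second = {"TBH*1": 20, "TBH*2": 18}

def relative_activity(rate, reference_rate):
    """Express a rate as a percentage of the reference rate."""
    return 100 * rate / reference_rate

# The best-performing protein is taken as the 100% reference.
reference = max(rates_per_second.values())
activities = {
    allele: relative_activity(rate, reference)
    for allele, rate in rates_per_second.items()
}
print(activities)  # {'TBH*1': 100.0, 'TBH*2': 90.0}
```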
Are there only two alleles per gene?
No, usually there are more than two alleles of each gene. For the fictitious gene of our example, after a lot of hard work, researchers have identified and characterized four alleles of TBH in the human population. In Fig. 6 we show the two remaining alleles.
If we look at TBH*3, we see that it resembles TBH*2, because it also has the G that gives the protein a bump in its structure. However, it has another change in its sequence: an A in a position where TBH*1 and TBH*2 have a different nucleotide. This change impairs transcription of this allele into mRNA, and therefore reduces the number of proteins synthesized. Each protein molecule should have an activity of 90%, but the overall activity is lower because there is less protein. Researchers have found that for TBH*3, the bump plus the lower number of proteins result in an activity that is 40% of the activity of TBH*1.
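One way to read these numbers is to assume that overall activity is simply the activity of each protein molecule multiplied by the relative amount of protein. This multiplicative model is an assumption made here for illustration and is not stated in the example. Under it, the 40% and 90% figures imply how much less protein TBH*3 produces:

```python
# Assumption (not from the text): overall activity =
#   per-protein activity x relative amount of protein.
def implied_protein_fraction(overall_activity_pct, per_protein_activity_pct):
    """Relative amount of protein needed to explain the overall activity."""
    return overall_activity_pct / per_protein_activity_pct

fraction = implied_protein_fraction(40, 90)
print(f"{fraction:.0%}")  # 44%
```

In other words, under this simplified model TBH*3 would yield a bit less than half the amount of protein that TBH*1 or TBH*2 does.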

Figure 6. Alleles *3 and *4 of the TBH gene. The additional nucleotide differences of these alleles compared with *1 and *2 have additional consequences at the protein level. For TBH*3, the additional difference makes transcription of DNA into mRNA less efficient, so less mRNA is available. The consequence is a lower number of proteins produced. For TBH*4, the additional nucleotide difference does not affect transcription, but the mRNA produced contains an ‘end of translation’ signal, which leaves the protein incomplete and therefore useless.
The TBH*4 allele resembles TBH*1 because it has an A in the position we studied at the beginning, so its activity should be 100%. However, this allele contains another nucleotide change that introduces a termination signal into its code. When the mRNA is translated into protein, the ribosome reaches this signal early and releases the protein under construction, leaving it unfinished. The synthesized protein is useless, and its activity is 0% compared with the activity of TBH*1.
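The effect of an early termination signal can be sketched in Python as well. The sequences are hypothetical (assumed for illustration): the made-up TBH*4 differs from the made-up TBH*1 by one C to T change that turns the codon CAG into the stop codon TAG. The codon table lists only the codons these sequences use, and the function name `translate_dna` is illustrative.

```python
# Hypothetical coding-strand sequences (assumption for illustration only).
TBH_1 = "ATGGCCAAGCAGCTGTAA"
TBH_4 = "ATGGCCAAGTAGCTGTAA"  # C -> T at position 10 creates a stop codon

# Only the codons needed here; UAA and UAG are stop codons.
CODON_TABLE = {
    "AUG": "Met", "GCC": "Ala", "AAG": "Lys", "CAG": "Gln",
    "CUG": "Leu", "UAA": "Stop", "UAG": "Stop",
}

def translate_dna(coding_strand):
    """Transcribe a coding strand to mRNA, then translate until a stop codon."""
    mrna = coding_strand.replace("T", "U")
    protein = []
    for start in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[start:start + 3]]
        if amino_acid == "Stop":
            break  # the ribosome releases the chain here
        protein.append(amino_acid)
    return protein

print(translate_dna(TBH_1))  # ['Met', 'Ala', 'Lys', 'Gln', 'Leu']
print(translate_dna(TBH_4))  # ['Met', 'Ala', 'Lys']
```

The mRNA of the made-up TBH*4 is as long as that of TBH*1, but translation stops after three amino acids, so the protein is left unfinished.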
Who has these alleles of the TBH gene?
Fig. 8 is an example of how the alleles TBH*1, TBH*2, TBH*3 and TBH*4 are distributed in the population. If you want to learn about the implications of carrying different combinations of alleles of a gene, you can check out our post on haplosufficiency and haploinsufficiency.

Figure 8. Example distribution of the four alleles of the TBH gene in the population.

This content is licensed under a Creative Commons CC0 Universal Public Domain Dedication license.

Alfonso Prado-Cabrero is a research fellow at Nutrition Research Centre Ireland, Waterford Institute of Technology. He specialises in molecular biology, biotechnology, genetics, carotenoids and fatty acids.