Genotype, Epigenotype, and Phenotype
Objective 6.8
6.8.1 Define: genotype, epigenotype, and phenotype.
6.8.2 Explain the connection between genotype and the manufacture of proteins.
6.8.3 List some of the epigenetic factors which alter gene expression and therefore phenotype.
6.8.4 Compare and contrast specific examples of gene expression which give each individual unique features.
Genotype
The genotype of a human is the precise sequence of 3 billion DNA bases found in human DNA.
In the mid-19th century, the Moravian monk Gregor Mendel proposed a model for the inheritance of observable characteristics. This was more than 80 years before the structure of DNA, worked out in 1953 with the help of Rosalind Franklin’s X-ray diffraction images, explained inheritance at the molecular level.
Mendel proposed a unit of inheritance that we now call an allele. We now know that an allele corresponds to the sequence of DNA bases for a gene on a single chromatid (i.e. a single DNA molecule).
One allele is inherited from your mother, and one from your father. Mendel proposed segregation of these alleles: that is, your mother has two alleles for each gene, and you have an equal chance of inheriting each one. We have already seen how those alleles are patched together in each parent by crossing over, which happens during the process of germ cell (egg or sperm) formation by meiosis.
Although it’s difficult to count the differences between humans and chimpanzees, these two species share at least 96% of their DNA. Yet a chimpanzee does not look or act 96% like a human. How can this be? Something much more important than genotype must be going on.
Still, we will use genotype as a starting point. Let’s review how the precise sequence of bases is turned into proteins, which are responsible for almost all anatomical and physiological specializations of human cells.
Codons Carry the Genetic Code; Alterations in the Code Lead to Alterations in Proteins
In this image we see the precise expression of the Central Dogma. Notice we are changing the DNA language to RNA language (transcription). Each thymine (T) becomes a uracil (U) and the RNA strand is built on a ribose-phosphate backbone, but otherwise the language is unchanged. Translation, as the name implies, converts the nucleic acid language to a different language: the primary sequence of amino acids.
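The two steps described above can be sketched in a few lines of Python. This is only an illustration, not a model of what a cell actually does: the DNA sequence is made up, it is assumed to be the coding strand (so the mRNA is the same text with U in place of T), and the lookup table holds just four entries from the standard genetic code.

```python
# Illustrative sketch of transcription (DNA -> RNA) and translation (RNA -> amino acids).
# Only a handful of codons from the standard genetic code are included.
CODON_TABLE = {
    "AUG": "Met",  # methionine
    "CCG": "Pro",  # proline
    "CAA": "Gln",  # glutamine
    "UAA": None,   # stop codon: no amino acid
}

def transcribe(coding_dna: str) -> str:
    """Rewrite the coding DNA strand in RNA language: each T becomes a U."""
    return coding_dna.upper().replace("T", "U")

def translate(mrna: str) -> list:
    """Read the mRNA three bases at a time and look up each codon."""
    amino_acids = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid is None:  # a stop codon ends translation
            break
        amino_acids.append(amino_acid)
    return amino_acids

dna = "ATGCCGCAATAA"       # hypothetical sequence, chosen for this example
mrna = transcribe(dna)
peptide = translate(mrna)
print(mrna)     # AUGCCGCAAUAA
print(peptide)  # ['Met', 'Pro', 'Gln']
```

Notice that transcription changes only one letter of the alphabet, while translation needs a lookup table because it moves to a completely different language.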
Why is the codon three DNA or RNA bases, and not one? Or two? Or four?
There are four DNA bases: A, C, G, and T. There are also four RNA bases: A, C, G, and U. Taken one at a time, four bases can code for only four amino acids: 1. A 2. C 3. G 4. U. Mathematically, we say
$4^1 = 4$
How many different amino acids could we code for with two of the four bases? The possibilities are 1. AA, 2. AC, 3. AG, 4. AU, 5. CA, 6. CC, 7. CG, 8. CU, 9. GA, 10. GC, 11. GG, 12. GU, 13. UA, 14. UC, 15. UG, 16. UU
$4^2 = 16$
That’s still short of where we need to be for 20 amino acids.
I won’t list all the possibilities for four bases taken in groups of three, but I hope you’ll see that the number is
$4^3 = 64$
(If not, count the possibilities in the codon table shown below.) That’s enough to code for 20 amino acids, with some information left over. So, four bases in groups of three is how we do it. Nirenberg and his lab figured out the probability of each combination of three bases in mixtures with different proportions of A, C, G, and U, and then matched that to the proportion of amino acids in artificial proteins made by ribosomes exposed to those artificial mRNA sequences. In this way, they found that AUG codes for methionine, CCG for proline, CAA for glutamine, and so forth.
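The counting argument above can be checked by brute force. The short Python sketch below simply lists every possible "word" of one, two, or three RNA bases and counts them; the names `BASES` and `count_words` are made up for this example.

```python
from itertools import product

BASES = "ACGU"  # the four RNA bases

def count_words(length: int) -> int:
    """Count every possible sequence of the given length built from four bases."""
    return len(list(product(BASES, repeat=length)))

counts = {n: count_words(n) for n in (1, 2, 3)}
print(counts)  # {1: 4, 2: 16, 3: 64}

# The two-base words come out in the same order as the list in the text.
doublets = ["".join(pair) for pair in product(BASES, repeat=2)]
print(doublets[:4])  # ['AA', 'AC', 'AG', 'AU']

# Smallest word length that gives at least 20 different words.
shortest = min(n for n in counts if counts[n] >= 20)
print(shortest)  # 3
```

Three is the shortest word length that gives room for 20 amino acids, which is the same conclusion reached by the arithmetic.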
Because there are 64 combinations of bases for 20 amino acids, the genetic code is called degenerate. In information theory terms, that means there are more possible codes than there are amino acids to encode. That lets us use multiple combinations to code for the same amino acid. For example, there are four codons for proline, two for histidine, one for tryptophan, and so forth.
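To see the degeneracy directly, we can build the standard codon table and count how many codons point to each amino acid. The sketch below assumes the standard genetic code written in a common compact form: codons are listed in U, C, A, G order, amino acids use their one-letter abbreviations (P = proline, H = histidine, W = tryptophan), and `*` marks a stop codon. The variable names are invented for this example.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, in UUU, UUC, UUA, UUG, UCU, ... order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

codon_table = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# How many different codons code for each amino acid (or for "stop")?
codons_per_amino_acid = Counter(codon_table.values())

print(len(codon_table))            # 64 codons in total
print(codons_per_amino_acid["P"])  # 4 codons for proline
print(codons_per_amino_acid["H"])  # 2 codons for histidine
print(codons_per_amino_acid["W"])  # 1 codon for tryptophan
print(codons_per_amino_acid["*"])  # 3 stop codons
```

The counts add up to 64 even though only 20 amino acids (plus "stop") are being encoded, which is exactly what degenerate means here.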
We’ve been able to determine phenotypes for a long time, ever since we started looking at each other’s faces. The analysis of human genotypes is a fairly recent development, dating back to the 1990s. The culmination of this scientific achievement was the Human Genome Project, which sequenced an entire human genome. This also allowed us to infer the protein sequences that we studied in Unit 3 in the exercise on the BLAST database. In order to make the linkage between DNA sequences and protein sequences, we needed a Rosetta Stone*: a way of converting one language to another, the same way a ribosome converts the nucleic acid language to the amino acid language of proteins.
Some clever science in the 1960s provided this Rosetta Stone. Nirenberg made artificial RNAs that contained different proportions of A, C, G, and U. Then his lab applied some information theory to the problem: how many DNA or RNA bases are needed to code for 20 amino acids?
* The Rosetta Stone was a carved rock that had both Greek and Egyptian hieroglyphics on it. There was no “hieroglyphic dictionary” that survived to the modern period so the Rosetta Stone was the only way we could translate hieroglyphs. Archeologists knew Ancient Greek, so they used the Rosetta Stone to convert hieroglyphs to Ancient Greek, and from there to French and other modern languages.
Epigenotype
As we saw in Objective 3, genes can be turned “on” or “off” at different times, in different cells, by transcriptional control. Sometimes, genes are permanently turned “on” or “off” by DNA modification (methylation turns DNA “off”) or by modifying the way in which DNA interacts with histones (histone acetylation turns the nearby DNA “on” while histone methylation often turns the nearby DNA “off”). Transcription factors (proteins) and non-coding RNAs can dynamically turn a specific gene “on” or “off”. All these factors make up the epigenotype of the cell.
Phenotype
The phenotype is the collection of observable characteristics shown by an individual. The name was originally German (Phænotypus) but pheno–, as in the English word phenomenon, is from the Greek word phainein which means to “show” or “shine”.
The phenotype is generally related to, but not exactly the same as, the genotype.
Take the example shown here. As Mendel suggested, there are two alleles for wing shape in fruit flies (genus Drosophila, which literally means “dew-loving”).
Mendel called an allele dominant if inheriting only one copy caused an observable characteristic. An allele is called recessive if one has to have two copies of that allele to cause an observable characteristic.
In the case shown here, one form of the gene (allele) is designated with a capital W, while the other is designated with a lower-case w. The parents are both Ww. Each offspring has a 50% chance of inheriting a W from mother, and a 50% chance of inheriting a w from mother. Each offspring has a 50% chance of inheriting a W from father, and a 50% chance of inheriting a w from father.
Putting this together, there are four possible genotypes, each with an equal chance of occurring: they can either be WW (1/4 of the total), Ww (1/4 of the total), wW (1/4 of the total), or ww (1/4 of the total).
Because the wrinkled wings phenotype only appears if the fly inherits two lower-case w (i.e. ww), we call this characteristic recessive.
The WW genotype results in a normal winged phenotype. (Scientists call this normal appearance “wild-type” because it’s what we see in wild-caught fruit flies.)
The Ww or wW genotypes also result in a normal winged phenotype. The dominant W allele is “strong” enough to create a wild-type wing all by itself.
In the absence of the W instructions, a normal, wild-type wing cannot be made. The flies with the ww genotype are phenotypically abnormal and have wrinkled wings.
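The Ww × Ww cross described above can be worked through as a small Python sketch. It assumes, as the text does, that each parent passes on either allele with equal chance and that a single W is enough for a normal wing; the function and variable names are made up for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

mother = "Ww"
father = "Ww"

# Every equally likely pairing of one allele from mother with one from father.
offspring = [m + f for m, f in product(mother, father)]
print(offspring)  # ['WW', 'Ww', 'wW', 'ww']

def phenotype(genotype: str) -> str:
    """W is dominant: one copy is enough for a wild-type wing."""
    return "normal" if "W" in genotype else "wrinkled"

total = len(offspring)
genotype_odds = {g: Fraction(n, total) for g, n in Counter(offspring).items()}
phenotype_odds = {
    p: Fraction(n, total)
    for p, n in Counter(phenotype(g) for g in offspring).items()
}

print(genotype_odds["ww"])         # 1/4
print(phenotype_odds["normal"])    # 3/4
print(phenotype_odds["wrinkled"])  # 1/4
```

Four equally likely genotypes collapse into only two phenotypes, in a 3:1 ratio, because WW, Ww, and wW all look the same from the outside.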
Putting it All Together
Now we can see what our intuition has told us already: very few human traits or characteristics can be explained simply by genetics. For years, evolutionary psychologists have asked of human behaviors: nature or nurture? (In other words, genes or environment?) The answer is both and neither at the same time. Rather, there is a complex and poorly understood interaction between genes and environment, acting through epigenetic mechanisms, which gives each individual a unique epigenotype. The epigenotype then is reflected in the phenotype which other individuals can observe.
Pink Flamingos
Flamingo feathers are not genetically pink. We can see this when looking at a newly hatched flamingo: its genotype is to have feathers without pink pigmentation.
The characteristic pink color of flamingos comes from a chemical called carotene, which as the name suggests is found in carrots but is also found in brine shrimp. Brine shrimp eat carotene-containing algae and the flamingos eat the algae and the brine shrimp, which turns them pink. There is an epigenetic component as well, because different birds have slightly different enzymes for metabolizing what they eat and the genes for depositing the pigment in feathers have to be turned on as well.
We will re-visit the effect of carotene on human skin color in Unit 8.
Race and High Blood Pressure
Americans of African ancestry have a higher incidence of high blood pressure (hypertension) and respond differently to antihypertensive medications than Americans of European ancestry. In fact, drug companies have marketed one antihypertensive drug, Nebivolol, for use in only African-American populations (Weiss, 2006).
In the past, researchers have investigated whether this difference is genetic or environmental. Surprisingly, it’s a mixture of both. This is because both genetics and epigenetics come into play and interact with each other in unexpected ways.
Genotype plays a role. A key family of enzymes, the cytochromes P450, is involved in the metabolism of different drugs, including antihypertensives. Research has shown that African-Americans tend to have a different set of single nucleotide polymorphisms (SNPs, changes in a single DNA base) in the genes for some types of cytochrome P450. We can imagine that changing amino acid –R groups in the structure of the P450 enzyme (shown above) might have some subtle effects on its function, for example by changing the active site (“pocket”) shape as we discussed in Unit 3. Also, a blood pressure-regulating hormone system we will study in Unit 14, the renin-angiotensin-aldosterone system (RAAS), is set up differently in African-Americans than in European-Americans (Brewster & Seedat, 2013).
But genetics does not explain many of these differences (Brewster & Seedat, 2013), and diet, gender, and stress interact differently with genetics than they do in European-Americans. Male hormones regulate cytochrome P450 (subtype 4A11) gene expression. Androgens released into the bloodstream of African-American males and androgens released into the bloodstream of European-American males affect cytochromes P450 in different ways (Pikuleva & Waterman, 2013).
What matters most is not what ancestry you have (i.e. your genotype). What matters most is what ancestry you identify with (Brewster & Seedat, 2013). Patients who claim African ancestry do not respond to low salt diets the way patients who claim European ancestry do (Brewster et al., 2016). Self-identified race is a better predictor of the effects of salt intake, and response to the drug Nebivolol, than genotype.
In lieu of biochemical or pharmacogenomic parameters, self-defined African ancestry seems the best available predictor of individual responses to antihypertensive drugs. — Brewster & Seedat, 2013
It is quite likely (though certainly not proven) that the stressors unique to living as a minority in a mostly European population, such as the microaggressions, socioeconomic forces, and cultural norms experienced by self-defined Black people, interact with genetics to create the unique response of this group. Otherwise, genetic ancestry would be a better predictor than self-defined ancestry for the observed differences in phenotype.
Literature cited in this section
Brewster LM & Seedat YK. Why do hypertensive patients of African ancestry respond better to calcium blockers and diuretics than to ACE inhibitors and β-adrenergic blockers? A systematic review. BMC Med (2013) 11:141 https://doi.org/10.1186/1741-7015-11-141
Brewster LM, van Montfrans GA, Oehlers GP, and Seedat YK. Systematic review: antihypertensive drug therapy in patients of African and South Asian ethnicity. Intern Emerg Med (2016) 11:355–374. https://doi.org/10.1007/s11739-016-1422-x
Pikuleva IA and Waterman MR. Cytochromes P450: roles in diseases. J Biol Chem (2013) 288(24):P17091-P17098. https://doi.org/10.1074/jbc.R112.431916
Weiss R. Nebivolol: a novel beta-blocker with nitric oxide-induced vasodilatation. Vasc Health Risk Manag (2006) 2(3):303–308. https://doi.org/10.2147/vhrm.2006.2.3.303
The Sickle Cell Mutation
Sickle cell anemia is caused by a genetic mutation in which a change in a single DNA base causes a change in hemoglobin, the protein which carries oxygen in the blood. In one form of the disease, a glutamic acid (GAA) codon is changed to a valine (GTA) codon. Recall from Unit 3 that the glutamic acid –R group is polar (hydrophilic) with a negative charge, while the –R group of valine is non-polar (hydrophobic). Just this single amino acid substitution results in a completely different structure for hemoglobin, from its usual “glob” shape (from which we get the name “globin”) to a rod shape. Because red blood cells are merely bags of hemoglobin, the change in the hemoglobin structure changes the red blood cell shape from round to a crescent or sickle shape.
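As a small illustration of how little has to change, the Python sketch below swaps one base in the GAA codon named above and looks up the result. The two-entry table and the helper `point_mutate` are made up for this example; they cover only the two DNA codons mentioned in the text.

```python
# DNA codons from the text and the amino acids they code for.
DNA_CODONS = {
    "GAA": "glutamic acid",  # polar, negatively charged -R group
    "GTA": "valine",         # non-polar -R group
}

def point_mutate(codon: str, position: int, new_base: str) -> str:
    """Return a copy of the codon with one base replaced (positions count from 0)."""
    return codon[:position] + new_base + codon[position + 1:]

normal_codon = "GAA"
sickle_codon = point_mutate(normal_codon, 1, "T")  # change the middle base A -> T

bases_changed = sum(a != b for a, b in zip(normal_codon, sickle_codon))

print(normal_codon, "->", DNA_CODONS[normal_codon])  # GAA -> glutamic acid
print(sickle_codon, "->", DNA_CODONS[sickle_codon])  # GTA -> valine
print(bases_changed)                                 # 1
```

One base out of three is different, yet the amino acid changes from a charged, water-loving one to a water-avoiding one.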
These sickle cells tend to get stuck in capillaries, causing problems for the patients. They are also not as good at carrying oxygen as round red blood cells.
Individuals with sickle cell anemia tend to be weak and die at an early age, especially in the primordial environment. According to the theory of natural selection, anything that reduces an individual’s fitness to reproduce should eventually be eliminated from the population. Yet the sickle cell mutation persists. How could this be?
Evolutionary scientists believe there must be a countervailing benefit: the malaria parasite, which likes to live inside red blood cells, does not like to live in sickled red blood cells. Thus individuals with the sickle cell mutation are less likely to get malaria. Malaria also reduces an individual’s fitness to reproduce, and the effect of the sickle cell mutation on the ability to reproduce and pass along one’s genes, and the effect of malaria on that ability, must be about the same size.
In support of this theory, the map showing the prevalence of the sickle mutation (purple) and the map showing the prevalence of malaria (green) are almost a perfect overlap.
Media Attributions
- U06-042 Genotype © Crookston, Alexa is licensed under a CC BY-NC-ND (Attribution NonCommercial NoDerivatives) license
- U06-043 DNA to Proteins © Crookston, Alexa is licensed under a CC BY-NC-ND (Attribution NonCommercial NoDerivatives) license
- U06-029 codon table 640x480px © Hutchins, Jim is licensed under a CC BY-SA (Attribution ShareAlike) license
- U06-046 Unit 06 Obj 8 fig 4 phenotype © National Human Genome Research Institute is licensed under a CC0 (Creative Commons Zero) license
- U06-047 Genotype, Epigenetic, Phenotype © Crookston, Alexa is licensed under a CC BY-NC-ND (Attribution NonCommercial NoDerivatives) license
- Flamants_à_Thyna_(Sfax) © Mohamed, El Golli is licensed under a CC BY-SA (Attribution ShareAlike) license
- U06-051 Unit 06 Obj 8 fig 9 cytochrome P450 © Sevrioukova, I. and Astrojan is licensed under a CC BY-SA (Attribution ShareAlike) license
- Sickle_cell_anemia © Pkleong is licensed under a Public Domain license
- U06-055 Sickle_cell_distribution © Muntuwandi is licensed under a CC BY-SA (Attribution ShareAlike) license
- OLYMPUS DIGITAL CAMERA © Burini, João P. is licensed under a CC BY-NC-ND (Attribution NonCommercial NoDerivatives) license
- U06-056 Malaria_distribution © Muntuwandi is licensed under a CC BY-SA (Attribution ShareAlike) license