For our lab on October 8th, we worked through this activity: hhmi_lab_hw_malaria_populationgenetics_student

But first, we did some introductory work on the Hardy-Weinberg equation:

- A Bozeman Science (Paul Andersen) video on where the HW equation comes from and how it connects to the Punnett square.

For many students, the HW (Hardy-Weinberg) equation is just another math equation; they don’t realize that it comes directly from a square the biologist Reginald Punnett was working on at Cambridge. The HW equation came up when Professor G.H. Hardy saw Punnett’s square (which had no numbers in it yet) and quickly realized that it was analogous to a probability square. And thus the equation was discovered (Weinberg came up with it independently in Germany, which is why it is called the Hardy-Weinberg equation).

Doing HW problems: we started using the HW equation to solve problems, but didn’t get a chance to work completely through the set at the end of our lab (hhmi_lab_hw_malaria_populationgenetics_student).

See below for the questions and additional simulations (done with a piece of software called AlleleA1). We then have a virtual meeting on Saturday, October 15th, at 3pm to go over any questions or problems that you didn’t understand.

Remember the equation (p² + 2pq + q² = 1.0) and where it comes from. You can always start with a Punnett square if the problem seems confusing or unclear. Use the probability square we learned about in Paul Andersen’s video (little squares) to make the frequencies do the work. You will also need to download AlleleA1 (see link below).
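If it helps to see the probability square as arithmetic, here is a minimal sketch in Python. The function name and the A1/A2 labels are just for illustration; the only real content is that each little square is one allele frequency multiplied by another.

```python
def punnett_frequencies(p):
    """Return (p^2, 2pq, q^2) from the probability square for allele frequency p."""
    q = 1.0 - p
    # Each cell of the square = (frequency of one parent's allele) x (frequency of the other's).
    square = {
        ("A1", "A1"): p * p,
        ("A1", "A2"): p * q,
        ("A2", "A1"): q * p,
        ("A2", "A2"): q * q,
    }
    homozygous_dominant = square[("A1", "A1")]
    # The two off-diagonal cells are both heterozygotes, which is where the 2 in 2pq comes from.
    heterozygous = square[("A1", "A2")] + square[("A2", "A1")]
    homozygous_recessive = square[("A2", "A2")]
    return homozygous_dominant, heterozygous, homozygous_recessive


p2, two_pq, q2 = punnett_frequencies(0.7)
print(round(p2, 2), round(two_pq, 2), round(q2, 2))
print(round(p2 + two_pq + q2, 2))  # the four cells always add up to 1
```

Try a few different values of p and check that the three genotype frequencies always add up to 1.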

Questions to answer on your own (after filling out the tables and charts at the end of our HW lab):

PRE_QUESTIONS:

1. The Hardy-Weinberg principle and its equations predict that frequencies of alleles and genotypes remain constant from generation to generation in populations that are not evolving. What five conditions does this prediction assume to be true about such a population?

a. _________________________________

b. _________________________________

c. _________________________________

d. _________________________________

e. _________________________________

2. Before beginning the activity, answer the following general Hardy-Weinberg problems for practice (assume that the population is at Hardy-Weinberg equilibrium).

a. If the frequency of a recessive allele is 0.3, what is the frequency of the dominant allele? ____________

b. If the frequency of the homozygous dominant genotype is 0.36, what is the frequency of the dominant allele? ____________

c. If the frequency of the homozygous recessive genotype is 0.49, what is the frequency of the dominant allele? ____________

d. In a certain population, the dominant phenotype of a certain trait occurs 87% of the time. What is the frequency of the dominant allele? ____________

e. If the frequency of the homozygous dominant genotype is 0.49, what is the frequency of the homozygous recessive genotype? ____________

f. If the frequency of an autosomal recessive disease is 1 in 1,500 births, what are the allele and genotype frequencies in a population of 3,000?

p = _______ q = _______ p² = _______ 2pq = _______ q² = _______

Remember the power of using the square and where the HW equation comes from. If the terms start getting confusing, or you don’t know if you want to obtain allele frequency or genotypic frequency or whatever, just make a quick square, and figure out what p, q, pq are in the problem.

p² + 2pq + q² = 1.0

Finishing up our HW Lab from October 8th

Make sure Table 4.1 is filled out and that you have graphed your data for the graphs on page 9 of the handout. Remember the following:

Simulation 1 = all genotypes were 100% fit (i.e., 100% of offspring survived). No selection pressure and in HW equilibrium.

Simulation 2 = The recessive homozygous genotype (SS) has 0% fitness (100% lethal).

Simulation 3 = Again, SS genotype has 0% fitness, and the dominant homozygous genotype has 50% (or 0.5) fitness, or only 50% of the offspring survive.

It is important to keep track of the genotype AND allele frequencies, and how they relate to one another.

Before we can tackle the data analysis, we need to run some more generations for our 3 simulations. It took us quite some time to run 2-3 generations for each simulation, so we are going to look to the computer to make this process painless for us. But it is crucial to realize that the computer is doing EXACTLY what we did with the beads in a bag: putting the alleles into a gene pool and randomly selecting pairs (just like we did) to collect data on how many of each genotype are produced. There is no mathematical formula at work here, just random mating and pairing up. This is why one of the criteria for HW equilibrium is a large population, and you will see why this is true in just a few minutes.
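To make the "beads in a bag" idea concrete, here is a small Python sketch of one generation. It is not how AlleleA1 is written (I don't know its internals); it assumes each bead goes back in the bag after it is drawn, and the function name and seed are made up for the example.

```python
import random


def draw_offspring(bag, n_offspring, rng):
    """Pull pairs of beads (alleles) out of the bag and count the genotypes."""
    counts = {"AA": 0, "AS": 0, "SS": 0}
    for _ in range(n_offspring):
        # Two random beads make one offspring; sorting makes "SA" count as "AS".
        pair = sorted(rng.choice(bag) for _ in range(2))
        counts["".join(pair)] += 1
    return counts


rng = random.Random(8)            # arbitrary seed so the run can be repeated
bag = ["A"] * 60 + ["S"] * 60     # 60 parents = 120 alleles, half A and half S
counts = draw_offspring(bag, 60, rng)
print(counts)
```

Change the seed and run it a few times: the counts wander around the 15 / 30 / 15 you would predict from the square, just as our bead counts did.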

AlleleA1 Program

Now, we can use a program already made for us to run the same simulations we ran in lab for 100-200 generations to see what will happen to our frequencies. I have tried the three possible applications, and the AlleleA1 program gives us the numbers instead of just a graph. Go here to download it: http://faculty.washington.edu/herronjc/SoftwareFolder/AlleleA1.html. You’ll need to extract the files and then open the application.

Now, run our three simulations again (I promise it will be super quick!)

Simulation 1: Sickle cell alleles with no selection

– Start with 60 members in the population, and input 0.5 for the allele frequencies for both A1 (the A allele) and A2 (the S allele)

– select 1.00 fitness for all (this means all offspring survive)

– select 200 generations

– Hit “run”.

Notice that there is a button on the lower right that allows you to have multiple lines on one graph. Select “auto”, and hit “run” again. Do at least 10 runs. What do you notice? Our small starting population is hugely affected by random chance. Did you get a few runs where the frequency of A1 dropped to 0? Increased to 1.0? The AlleleA1 program only uses the available alleles from the previous generation, so it provides a nice example of how random effects can have a large influence on allele and genotype frequencies.

Now, clear the runs, and enter in 10,000 for the starting population. We are still in simulation #1. What are your results now? Much less effect from random occurrences and much more stable numbers, right? You can also enter in “infinite” for the population size, but the 10,000 size and infinite size will have similar results. Or will they? Hmmm…
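If you want to see the same small-versus-large comparison outside of AlleleA1, here is a rough Python sketch. It assumes each new generation is simply 2N alleles drawn at random from the previous generation's gene pool, with no selection; the function name, seed, and number of runs are my own choices, not anything taken from the program.

```python
import random


def drift_run(pop_size, generations, p0, rng):
    """Track the frequency of A1 when each generation is a random draw of 2N alleles."""
    n_alleles = 2 * pop_size
    p = p0
    history = [p]
    for _ in range(generations):
        # Each allele in the new generation is A1 with probability p (the current frequency).
        a1_count = sum(1 for _ in range(n_alleles) if rng.random() < p)
        p = a1_count / n_alleles
        history.append(p)
    return history


rng = random.Random(2016)  # arbitrary seed, only so the run is repeatable
small = [drift_run(60, 200, 0.5, rng)[-1] for _ in range(10)]
large = [drift_run(10_000, 200, 0.5, rng)[-1] for _ in range(2)]  # takes a few seconds
print("N = 60:    ", [round(p, 2) for p in small])
print("N = 10,000:", [round(p, 2) for p in large])
```

Compare how far the final frequencies end up from 0.5 in the two lists, and notice that once a run reaches 0 or 1.0 it can never come back.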

Make sure you note the conditions of the run and the ending allele and genotype frequencies. You can also do 25, 100 or even more generations; it can be illustrative doing fewer runs to see how the frequencies change.

Now, notice the little arrow on the Y axis: you can select which frequency to plot! Unfortunately the program does not allow you to toggle between those frequencies on the same run, so if you want to compare results, it is probably best to use large population sizes (or infinite) to remove the effect of randomness on the results. There is also a little chart of “Final Frequencies” to the right of the graph that is handy for noting final numbers.

Simulation 2: Sickle cell alleles with homozygous recessive genotype 100% lethal

– Start with 60 members in the population, and input 0.5 for the allele frequencies for both A1 (the A allele) and A2 (the S allele)

– select 1.00 fitness for A1A1, A1A2 and 0 for A2A2

– select 200 generations

– Hit “run”.

Make sure you note the conditions of the run and the ending allele and genotype frequencies. You can also do 25, 100 or even more generations; it can be illustrative doing fewer runs to see how the frequencies change. We were only able to do 2-3 generations in our lab class, so we were only getting a slight indication of how the numbers were going to go.

Again, play with population size, # generations, and so on. Record your data.
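For comparison with your "infinite" population runs, here is a sketch of what the numbers should do when there is no randomness at all. It uses the standard textbook selection calculation (weight each genotype frequency by its fitness, then rescale so everything adds to 1); I am assuming that is a fair stand-in for the program's infinite setting, and the function names are made up.

```python
def next_p(p, w11, w12, w22):
    """Frequency of A1 after one round of selection in an infinitely large population."""
    q = 1.0 - p
    # Average fitness of the population = genotype frequencies weighted by survival.
    mean_fitness = p * p * w11 + 2 * p * q * w12 + q * q * w22
    # A1 alleles come from all A1A1 survivors and half of the A1A2 survivors.
    return (p * p * w11 + p * q * w12) / mean_fitness


def trajectory(p0, w11, w12, w22, generations):
    """List of A1 frequencies from generation 0 through the last generation."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_p(freqs[-1], w11, w12, w22))
    return freqs


# Simulation 2: A1A1 and A1A2 fully fit, A2A2 (SS) lethal.
sim2 = trajectory(0.5, 1.0, 1.0, 0.0, 200)
for gen in (0, 1, 2, 5, 25, 100, 200):
    print(gen, "q =", round(1.0 - sim2[gen], 4))
```

Watch how quickly q falls in the first few generations and how slowly it falls later on. In this model it gets small but never quite reaches zero.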

Simulation 3: Sickle cell with homozygous recessive being lethal/heterozygous genotype being advantageous

– Again, start with 60 members in the population, and input 0.5 for the allele frequencies for both A1 (the A allele) and A2 (the S allele)

– Again, the A2A2 genotype has 0 fitness (100% lethal), but this time, make the fitness for A1A1 0.5. This gives the heterozygote a selective advantage.

– select 200 generations

– Hit “run”.

Again, do multiple runs to see the variation in the results.

Again, change the population size to 10,000 and infinite to see how the results smooth out and become more predictable.
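Here is a sketch you can check your infinite-population result against. It assumes the standard textbook selection calculation with the simulation 3 fitness values, starts from three different allele frequencies, and compares where they end up with the usual textbook prediction for heterozygote advantage. The function name and the starting values 0.1 and 0.9 are my own additions.

```python
def next_p(p, w11, w12, w22):
    """Frequency of A1 after one round of selection (infinite population, no drift)."""
    q = 1.0 - p
    mean_fitness = p * p * w11 + 2 * p * q * w12 + q * q * w22
    return (p * p * w11 + p * q * w12) / mean_fitness


# Simulation 3: A1A1 at 0.5, A1A2 at 1.0, A2A2 (SS) lethal.
w11, w12, w22 = 0.5, 1.0, 0.0

finals = []
for p0 in (0.1, 0.5, 0.9):
    p = p0
    for _ in range(200):
        p = next_p(p, w11, w12, w22)
    finals.append(p)
    print("start p =", p0, "-> end p =", round(p, 4), ", end q =", round(1.0 - p, 4))

# Textbook prediction: the balance point depends only on how much each homozygote loses
# relative to the heterozygote.
predicted_p = (w12 - w22) / ((w12 - w11) + (w12 - w22))
print("predicted p =", round(predicted_p, 4))
```

Notice that the S allele does not disappear here even though SS is lethal, and that the ending point does not depend on where you started.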

Any surprises??

Questions from our 3 simulations:

1. Provide two explanations for why the S allele persists after five generations.

2. If you continued both simulation 2 and simulation 3 for three more generations (up to five generations), do you predict that the frequency of the S allele in simulation 2 would be greater, less than, or equal to the S allele frequency in simulation 3? Explain your answer.

3. Which simulation might represent a population of people who live in the moist lowlands of East Africa? Use data to explain why you chose this simulation.

4. Which simulation might represent a population of people living in a remote village in the dry highlands of Africa? Use data to explain why you chose this simulation.

Design your own simulation with the heterozygous genotype being at a disadvantage:

Design a simulation that models equal selection for the two homozygous genotypes and selection against the heterozygous genotype. Start with the original parent population as established in Table 1.1. Design your simulation to have 60 parents and 60 offspring in each successive generation. Feel free to vary the survival percentage of particular genotypes, as in simulation 3; however, be sure to incorporate selection for the two homozygous genotypes and selection against the heterozygous genotype. Explain your simulation in the space provided.

So, how do we do this? Be ready to share data Saturday.
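One possible way to set this up, if you would rather script it than use beads, is sketched below. Everything specific in it is an assumption on my part: the fitness values (both homozygotes 1.0, heterozygote 0.5), the starting frequency of 0.5 (Table 1.1 is not reproduced here), and the rule that we keep drawing offspring until 60 survive. Treat it as a starting point for your own design, not the answer.

```python
import random

# Hypothetical fitness values: both homozygotes fully fit, heterozygote at 50%.
FITNESS = {"AA": 1.0, "AS": 0.5, "SS": 1.0}


def one_run(pop_size, p0, generations, rng):
    """Return the frequency of the A allele in each generation."""
    p = p0
    history = [p]
    for _ in range(generations):
        survivors = []
        while len(survivors) < pop_size:
            # Random mating: pull two alleles out of the gene pool.
            pair = sorted("A" if rng.random() < p else "S" for _ in range(2))
            genotype = "".join(pair)
            # Selection: the offspring survives with probability equal to its fitness.
            if rng.random() < FITNESS[genotype]:
                survivors.append(genotype)
        # Count the A alleles among the 60 survivors to get the new frequency.
        p = sum(g.count("A") for g in survivors) / (2 * pop_size)
        history.append(p)
    return history


rng = random.Random(15)  # arbitrary seed so the run can be repeated
finals = [one_run(60, 0.5, 100, rng)[-1] for _ in range(10)]
print([round(p, 2) for p in finals])
```

Look at where the ten runs end up, then change the fitness values or the starting frequency and see whether the pattern holds.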

Answers to beginning HW problems above:

- For a population to be in Hardy Weinberg equilibrium, the five conditions are: large population, no mutations, random mating, no selection, and no gene flow.
- a. This is allele frequency, and the problem assumes we are talking about two alleles, one dominant and one recessive. Say, A1 and A2. If the frequency of A2 = 0.3, the frequency of the dominant allele (A1) must be 0.7.

q = 0.3

p+q = 1.0

Therefore, p = 0.7.

b. Frequency of homozygous dominant genotype = 0.36, what is frequency of dominant allele?

Okay, let’s see what these terms mean.

If A1 = dominant allele, A2 = recessive allele, then A1A1 = homozygous dominant genotype.

A2A2 = homozygous recessive genotype

A1A2 = heterozygous genotype

From our Punnett square, frequency of A1A1 = p² (p squared) = 0.36, which means p = 0.6.

Since p = frequency of dominant allele, our answer is 0.6

c. If the frequency of the homozygous recessive genotype = 0.49, what is frequency of dominant allele?

Okay, here they have given us the frequency of the A2A2 genotype, which is q² (q squared) = 0.49

Take the square root, and q = 0.7

Remember q + p = 1.0, so p = 0.3

d. Dominant phenotype = 87% of the time.

Okay, so we have three possible genotypes here: A1A1, A1A2, A2A2

Since A1 is dominant, that means two of the genotypes will have the same phenotypes (A1A1 and A1A2)

So, the frequencies of the homozygous dominant genotype (p²) and the heterozygous genotype (2pq) add up to 0.87

Which means q² = 1 – 0.87 = 0.13, and then q = 0.36. Since p + q = 1.0, the frequency of the dominant allele is p = 0.64.

e. Frequency of homozygous dominant genotype = 0.49; what is frequency of homozygous recessive genotype?

Notice we only have p² here, not p² + 2pq

p² = 0.49, so p = 0.7

If p = 0.7, then q = 0.3, and the frequency of the homozygous recessive genotype is q² = 0.09.

f. Frequency of autosomal recessive disease is 1 in 1500 births. What are the allele and genotype frequencies in a population of 3000?

If you divide 1 by 1500, you have 0.00067

But what does it refer to? We are talking about phenotypes here, so you have two possibilities: A1A2 or A2A2

However, the “autosomal recessive” tells you that ONLY the homozygous recessive genotype will give you the recessive phenotype. The phenotype of A1A2 will look like that of A1A1.

Therefore, 0.00067 = frequency of A2A2 genotype, or q² (q squared).

q must then be = 0.0258

1 – 0.0258 = 0.974 = p, which means p² = 0.949

So, frequency of A1 = p = 0.974

frequency of A2 = q = 0.0258

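frequency of A1A1 = p² = 0.949

frequency of A2A2 = q² = 0.00067

frequency of A1A2 = 2pq = 2(0.974)(0.0258) = 0.05

In a population of 3000, that works out to about 2847 people with A1A1, about 151 with A1A2 (carriers), and about 2 with A2A2.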
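If you want to double-check your arithmetic on problems a through f, here is a short Python sketch that redoes each one. The helper name is made up; the only thing it relies on is p + q = 1 and p² + 2pq + q² = 1.

```python
import math


def hw_from_p(p):
    """Given the dominant allele frequency p, return all five HW quantities."""
    q = 1.0 - p
    return {"p": p, "q": q, "p2": p * p, "2pq": 2 * p * q, "q2": q * q}


answers = {
    "a": hw_from_p(1.0 - 0.3),                    # q is given directly
    "b": hw_from_p(math.sqrt(0.36)),              # p2 is given
    "c": hw_from_p(1.0 - math.sqrt(0.49)),        # q2 is given
    "d": hw_from_p(1.0 - math.sqrt(1.0 - 0.87)),  # p2 + 2pq is given, so q2 = 1 - 0.87
    "e": hw_from_p(math.sqrt(0.49)),              # p2 is given
    "f": hw_from_p(1.0 - math.sqrt(1 / 1500)),    # q2 = 1 in 1500 births
}

for label, freqs in answers.items():
    print(label, {name: round(value, 4) for name, value in freqs.items()})

# Part f also asks about a population of 3000: multiply each genotype frequency by 3000.
population = 3000
for name in ("p2", "2pq", "q2"):
    print(name, round(answers["f"][name] * population))
```

The first step in every problem is deciding whether the number you were handed is p, q, p², q², or p² + 2pq; after that it is the same calculation every time.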