Math 340
Remarks and Hints Related to Homework Problems
___________________________________________________________________________

Below I will use "\geq" for "greater than or equal to."
___________________________________________________________________________

1. (p.108, #3)

a. First note that the Law of Large Numbers is not the appropriate theorem to use here. This law is the (intuitively reasonable) statement that as the number of trials n increases, the ratio

      (number of successes in n trials) / n

   approaches the "theoretical" value of P(S). In this problem, however, we are concerned with the events "55 or more successes in 100 trials" and "220 or more successes in 400 trials." In the case of "220 or MORE successes in 400 trials" we are asking for 20 successes beyond the expected 200, as opposed to 5 beyond the expected 50 in the case of "55 or MORE successes in 100 trials." The required excess is four times as large, while (as computed in part b) the standard deviation only doubles. Hence, it should be a much rarer event. Namely, we expect the probability of "220 or more successes in 400 trials" to be much smaller than the probability of the other event in question.

b. Here we need to calculate two separate probabilities and compare them.

   i. n = 100, p = 1/2, X = no. of heads in 100 tosses. Then,

         mu = np = (100)(1/2) = 50
         sigma = \sqrt(npq) = \sqrt[(100)(1/2)(1/2)] = 5

      Now, with Y the approximating normal r.v.,

         P(X \geq 55) \approx P(Y > 54.5) = P(Z > 0.9) = 1 - 0.8159 = 0.1841

      where 0.9 = (54.5 - 50)/5 is the z-score for 54.5. Note that the answer in the back of the book is inaccurate.

   ii. n = 400, p = 1/2, X = no. of heads in 400 tosses. Then,

         mu = np = (400)(1/2) = 200
         sigma = \sqrt(npq) = \sqrt[(400)(1/2)(1/2)] = 10

      Now,

         P(X \geq 220) \approx P(Y > 219.5) = P(Z > 1.95) = 1 - 0.9744 = 0.0256

      where 1.95 = (219.5 - 200)/10 is the z-score for 219.5.

   Thus, P(X \geq 55) > P(X \geq 220), as expected.
___________________________________________________________________________

2.
(p.108, #7)

Ask yourself the following question: in approximating the value of p = P(S) for a population, what happens to the standard deviation of our estimate as we increase the sample size n? For further discussion of this question, read pages 101 through 104 in your book.

a. The sample size for city A is (0.01%)(4000000) = 400 and for city B it would be 600. Since n is larger for city B, the standard deviation of the sample proportion, sqrt[p(1-p)/n], is smaller for city B. (The standard deviation of the NUMBER of women in the sample, sigma = \sqrt(npq), is larger for city B, but it grows only like \sqrt(n), so after dividing by n the spread of the proportion shrinks.) Hence, the confidence interval for city B is narrower, and therefore our estimate for p is more accurate.

b. Since n and p are the same for both cities, our estimates (i.e., our confidence intervals) would be equivalent.

c. Again look at the standard deviation and see which sigma gives the narrower confidence interval. That would be the better estimate for p.

ADDITIONAL COMMENTS:

The proportion of women in both cities is the same unknown number p. The goal of sampling is to approximate this number. The statistical procedure for this goal is called the Test of Confidence. A test of confidence bounds a population parameter, like p in this case, within an interval called a confidence interval. So, for example, the conclusion of such a test would be: p lies in the interval

      (p-hat - E, p-hat + E)

with probability 95%. The probability 95% is our confidence level, E is the error of this approximation, and p-hat is the proportion of women in a sample of size n. Note that in the literature a confidence interval is often written in the form

      p = p-hat plus/minus E.

When the confidence level is left implicit, it is assumed to be 95%.

In this example we are working with what is called the "sample proportion r.v." Based on a theorem that we will cover later, the standard deviation of this r.v. is sqrt[p(1-p)/n], where n is the sample size and p is estimated by p-hat, the proportion of women in our sample.
Hence, we can set up this problem as follows:

      f_A = number of women in the sample from city A
      p_A = proportion of women in the sample from city A
      f_B = number of women in the sample from city B
      p_B = proportion of women in the sample from city B

Note that since the proportion of women in these two cities is known to be about the same, we expect that in random samples from these two cities p_A \approx p_B. Hence, a 95% confidence interval for each city would be as follows:

      City A: p = p_A plus/minus E_A
      City B: p = p_B plus/minus E_B

where

      E_A = (1.96) sqrt[p_A (1 - p_A) / 400]
      E_B = (1.96) sqrt[p_B (1 - p_B) / 600]

and 1.96 is the z-value for the 95% confidence level, i.e., P(-1.96 < Z < 1.96) = 0.95. Since p_A and p_B are expected to be the same, all the results for this problem are derived from a comparison of sample sizes.
___________________________________________________________________________
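
For those who want to check the arithmetic by machine, here is a minimal Python sketch of the two calculations in these notes: the normal approximation with continuity correction from Problem 1(b), and the 95% error terms E_A and E_B from Problem 2. The function names are my own, and the value p-hat = 0.5 used for both cities is only an illustrative placeholder (the problem does not give a sample proportion); any common value of p-hat leads to the same comparison.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal CDF, P(Z <= z), written with the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def upper_tail_normal_approx(k, n, p):
    """Approximate P(X >= k) for X ~ Binomial(n, p).

    Uses the normal approximation with continuity correction:
    P(X >= k) is replaced by P(Y > k - 0.5), Y normal with
    mean np and standard deviation sqrt(npq).
    """
    mu = n * p
    sigma = sqrt(n * p * (1.0 - p))
    z = (k - 0.5 - mu) / sigma
    return 1.0 - normal_cdf(z)


def margin_of_error(p_hat, n, z=1.96):
    """Error term E = z * sqrt[p_hat (1 - p_hat) / n]; z = 1.96 for 95%."""
    return z * sqrt(p_hat * (1.0 - p_hat) / n)


# Problem 1(b): 55 or more heads in 100 tosses vs. 220 or more in 400.
p_100 = upper_tail_normal_approx(55, 100, 0.5)
p_400 = upper_tail_normal_approx(220, 400, 0.5)
print("P(X >= 55),  n = 100:", round(p_100, 4))
print("P(X >= 220), n = 400:", round(p_400, 4))

# Problem 2(a): sample sizes 400 (city A) and 600 (city B).
# ASSUMPTION: p_hat = 0.5 is a placeholder, not a value from the problem.
p_hat = 0.5
e_a = margin_of_error(p_hat, 400)
e_b = margin_of_error(p_hat, 600)
print("E_A:", round(e_a, 4))
print("E_B:", round(e_b, 4))
```

The two probabilities should agree with the table values 0.1841 and 0.0256 to four decimal places, and E_B should come out smaller than E_A, which is the point of part 2(a): the larger sample gives the narrower interval.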