Final Exam Form A

Question 1

Emerson's math test scores are given in the table below:

87, 93, 92, 25, 96

a) Find the median.

dat1 <- c(87,93,92,25,96)
median(dat1)
92

b) Find the sample mean.

mean(dat1)
78.6

c) Find the sample standard deviation.

sd(dat1)
30.1380158603714

Question 2: Answer the following questions.

a) An animal shelter has a $58\%$ adoption rate for puppies. Of all puppies in the shelter, $80\%$ live to be $7$ years or older. Of the puppies who are adopted, $90\%$ live to be $7$ years or older. What is the probability that a randomly selected puppy in the shelter will get adopted and live $7$ or more years?

Answer:

The sample space is the set of all puppies living in the shelter.

Let $A$ be the event that a puppy is adopted, and $O$ the event that a puppy lives to be $7$ years or older. We are given $P(A) = .58$, $P(O) = .8$, and $P(O|A) = .9$, and we are asked to find $P(A \textbf{ and } O)$. We will use the formula

$$P(A \textbf{ and } O) = P(A)\, P(O|A)$$

.58 * .9

So $P(A \textbf{ and } O) = .522$

Question 2 b) The probability of a plant going to seed is $34\%$. The probability of that same type of plant surviving the winter is $38\%$, and the probability of both is $10\%$. What is the probability that a randomly selected plant will go to seed or survive the winter?

Answer:

The sample space is the set of all plants of that type.

Let $S$ be the event that a plant will go to seed, and $W$ the event that a plant will survive the winter. We are given $P(S) = .34$, $P(W) = .38$, and $P(S \textbf{ and } W) = .1$, and we are asked to find $P(S \textbf{ or } W)$. We will use the formula:

$$P(S \textbf{ or } W) = P(S) + P(W) - P(S \textbf{ and } W)$$

.34 + .38 - .1

So $P(S \textbf{ or } W) = .62$

Question 3: A hair salon completed a survey of $347$ customers about satisfaction with service and type of customer. A walk-in customer is one who has seen no ads and not been referred. The other customers either saw a TV ad or were referred to the salon (but not both). The results follow.

| | Walk-In | TV Ad | Referred | Total |
| --- | --- | --- | --- | --- |
| Dissatisfied | 21 | 7 | 2 | 30 |
| Neutral | 21 | 22 | 36 | 79 |
| Satisfied | 26 | 41 | 61 | 128 |
| Very Satisfied | 31 | 35 | 44 | 110 |
| Total | 99 | 105 | 143 | 347 |

Assume the table represents the entire population of customers. Find the probability that a customer is

a) Dissatisfied

Answer: From the table we see that $P(\text{Dissatisfied}) = \dfrac{30}{347}$

30/347

b) Dissatisfied and a walk-in

Answer: From the table we see that $P(\text{Dissatisfied and a walk-in}) = \dfrac{21}{347}$

21/347

c) Referred

Answer: $P(\text{Referred}) = \dfrac{143}{347}$

143/347

d) Very satisfied, given referred

Answer: There are $143$ referred customers, $44$ of whom are very satisfied. So

$$P(\text{Very Satisfied, given Referred}) = \frac{44}{143}$$

44/143

e) Very satisfied or saw a TV ad

Answer: We will use the formula:

$$P(\text{Very Satisfied} \textbf{ or } \text{Saw a TV ad}) = P(\text{Very Satisfied}) + P(\text{Saw a TV ad}) - P(\text{Very Satisfied} \textbf{ and } \text{Saw a TV ad})$$

From the table we get:

110/347 + 105/347 - 35/347
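
As a cross-check on parts (a) through (e), the table can be entered once as a matrix and every probability read off from it. This is only a sketch: the object name `counts` and the `p_*` variable names are illustrative choices, not part of the exam.

```r
# Survey counts from the table (rows: satisfaction, columns: customer type).
counts <- matrix(
  c(21,  7,  2,
    21, 22, 36,
    26, 41, 61,
    31, 35, 44),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("Dissatisfied", "Neutral", "Satisfied", "Very Satisfied"),
    c("Walk-In", "TV Ad", "Referred")
  )
)
n <- sum(counts)  # total number of customers

# a) P(Dissatisfied): row total over grand total
p_dissatisfied <- sum(counts["Dissatisfied", ]) / n
# b) P(Dissatisfied and walk-in): single cell over grand total
p_dis_and_walkin <- counts["Dissatisfied", "Walk-In"] / n
# c) P(Referred): column total over grand total
p_referred <- sum(counts[, "Referred"]) / n
# d) P(Very Satisfied | Referred): cell over column total
p_vs_given_ref <- counts["Very Satisfied", "Referred"] / sum(counts[, "Referred"])
# e) P(Very Satisfied or TV ad): addition rule
p_vs_or_tv <- (sum(counts["Very Satisfied", ]) + sum(counts[, "TV Ad"]) -
                 counts["Very Satisfied", "TV Ad"]) / n
```

Working from the cell counts rather than the printed totals also confirms that the row and column totals in the table are consistent.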

Question 4: A basketball player makes $70\%$ of the free throws he shoots. Suppose that he tries $15$ free throws.

a) What is the probability that he will make more than $7$ throws?

b) Find the expected value.

c) Find the standard deviation.

Answer:

We have a binomial distribution with probability of success (makes the free throw) $p = .7$, and $n = 15$.

a) We are asked for $P(r > 7) = P(r=8) + P(r=9) + \cdots + P(r=15)$

# Add the binomial probabilities for r = 8, 9, ..., 15
sum(dbinom(8:15, size = 15, prob = .7))

Alternatively we could have used the cumulative probability for the binomial distribution:

1 - pbinom(7, size = 15, prob = .7)

b) The expected value is given by $\mu = n\,p$

15*.7

c) The standard deviation is given by the formula $\sigma = \sqrt{n\, p\, q}$ where, in our case, the probability of failure is $q = 1 - .7 = .3$

sqrt(15*.7*.3)
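
The shortcut formulas for the mean and standard deviation can be checked against the definitions, computed directly from the binomial probabilities. The sketch below assumes only base R; the names `r`, `pmf`, `mu`, and `sigma` are illustrative.

```r
# All possible numbers of made free throws and their probabilities
r <- 0:15
pmf <- dbinom(r, size = 15, prob = .7)

# Mean and standard deviation from the definitions
mu <- sum(r * pmf)                     # should agree with n * p
sigma <- sqrt(sum((r - mu)^2 * pmf))   # should agree with sqrt(n * p * q)

# Upper-tail probability P(r > 7) without subtracting from 1
p_more_than_7 <- pbinom(7, size = 15, prob = .7, lower.tail = FALSE)
```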

Question 5: Let $x$ be a random variable that represents the length of time it takes a student to write a response paper. It was found that $x$ has an approximately normal distribution with mean $\mu = 7.2$ hours and standard deviation $\sigma = 1.8$ hours.

a) What is the probability that it takes at least $5$ hours for a student to write a response paper?

b) Suppose $20$ students are selected at random. What is the probability that the mean time $\bar{x}$ of writing a paper for these $20$ students is not more than $8$ hours?

Answer: We have a normal distribution with $\mu = 7.2$ and $\sigma = 1.8$, and for Part a) we are asked to find $P(x \ge 5)$.

1 - pnorm(5, mean = 7.2, sd = 1.8)

b) The distribution of the sample means $\bar{x}$ is normal with $\mu_{\bar{x}} = 7.2$ and standard deviation $\sigma_{\bar{x}} = \frac{1.8}{\sqrt{20}}$.

1.8/sqrt(20)

We want the probability $P(\bar{x} \le 8)$

# Use the unrounded standard error rather than the rounded value 0.402
pnorm(8, mean = 7.2, sd = 1.8/sqrt(20))
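
Both parts can also be done by converting to a standard $z$-score first, which is how the same problem would be worked with a printed normal table. This is a sketch of that alternative route; the variable names are illustrative.

```r
mu <- 7.2
sigma <- 1.8

# a) P(x >= 5): standardize, then take the upper tail of the standard normal
z_a <- (5 - mu) / sigma
p_a <- pnorm(z_a, lower.tail = FALSE)

# b) P(xbar <= 8) for n = 20: standardize with the standard error sigma / sqrt(n)
n <- 20
se <- sigma / sqrt(n)
z_b <- (8 - mu) / se
p_b <- pnorm(z_b)
```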

Question 6: A random sample of $14$ candy store franchises had a mean start-up cost of $\bar{x} = \$104.70$ thousand and a sample standard deviation of $s = \$28.30$ thousand. Find a $95\%$ confidence interval for the population average start-up cost $\mu$ for candy store franchises. Assume $x$ has a distribution that is approximately normal.

Answer: We find the confidence interval using the Student $t$-distribution. The confidence level is $95\%$, so the two tails together have area $\alpha = .05$ and each tail has area $.025$; the area up to the critical value is therefore $.975$. We use the inverse $t$ function with $n - 1 = 13$ degrees of freedom:

t <- qt(.975, df = 13)
t

So we can calculate the margin of error using the formula $E = t_c \frac{s}{\sqrt{n}}$

E <- t*28.30/sqrt(14)
E

The endpoints of the $95\%$ confidence interval are then $\bar{x} \pm E = 104.70 \pm 16.3399$

104.70 - E
104.70 + E

So the interval is $[88.36, 121.04]$, in thousands of dollars.
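
The same steps can be wrapped in a small helper that works from summary statistics alone. This is a minimal sketch; `t_interval` is a hypothetical name introduced here, not a base R function.

```r
# Confidence interval for a mean from summary statistics, using the t-distribution.
# t_interval is a hypothetical helper, not part of base R.
t_interval <- function(xbar, s, n, level = 0.95) {
  alpha <- 1 - level
  t_c <- qt(1 - alpha / 2, df = n - 1)  # critical value with n - 1 degrees of freedom
  E <- t_c * s / sqrt(n)                # margin of error
  c(lower = xbar - E, upper = xbar + E)
}

ci <- t_interval(xbar = 104.70, s = 28.30, n = 14)
round(ci, 2)
```

Passing a different `level`, for example `0.99`, widens the interval because the critical value grows.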

Question 7: Let $x$ be a random variable that represents the hemoglobin count (HC) in human blood (measured in grams per milliliter). In healthy adult females, $x$ has an approximately normal distribution with a population mean of $\mu = 14.2$ and a population standard deviation of $\sigma = 2.5$. Suppose a female patient had $10$ blood tests over the past year, and the sample mean HC was determined to be $\bar{x} = 15.1$. With a level of significance $\alpha = .05$, determine whether the patient's HC is higher than the population average. Specifically, do the following:

a) State the null hypothesis $H_0$ and the alternate hypothesis $H_1$.

b) Determine the value of the sample test statistic (either $z$ or $t$).

c) Find the $P$-value.

d) Based on your answers for parts (a) through (c), would you reject or fail to reject the null hypothesis?

Answer: We have:

$$H_0: \mu = 14.2 \qquad H_1: \mu > 14.2$$

The population standard deviation is known, so we use a $z$-test. This is a right-tailed test with level of significance $\alpha = .05$. The sample test statistic $z$ is given by the formula:

$$z = \frac{\bar{x} - \mu}{\sigma}\, \sqrt{n} = \frac{15.1 - 14.2}{2.5}\, \sqrt{10}$$

z = (15.1 - 14.2)*sqrt(10)/2.5
z

The $P$-value for this test is then $P(z > 1.14) = P(z < -1.14)$.

pnorm(-z, mean = 0, sd = 1)

Since the $P$-value is more than the level of significance, we fail to reject $H_0$.
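
The same decision can be reached by comparing the test statistic with the critical value for a right-tailed test at $\alpha = .05$. The sketch below shows both approaches side by side; the names `z_crit`, `p_value`, and `reject` are illustrative.

```r
alpha <- .05

# Sample test statistic
z <- (15.1 - 14.2) * sqrt(10) / 2.5

# Critical-value approach: reject H0 only if z falls beyond the cutoff
z_crit <- qnorm(1 - alpha)
reject <- z > z_crit

# P-value approach: upper-tail area beyond the observed z
p_value <- pnorm(z, lower.tail = FALSE)

c(z = z, z_crit = z_crit, p_value = p_value)
reject
```

The two approaches always agree: $z$ exceeds the critical value exactly when the $P$-value is below $\alpha$.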

Question 8: In South Africa the size of locust populations may be related to the average temperature during the time of year when most insect eggs incubate. In the following table $x$ is a random variable representing the average temperature over the incubation period in degrees Celsius while $y$ represents the length of incubation period in hours.

| $x$ | 11 | 15 | 19 | 24 | 25 |
| --- | --- | --- | --- | --- | --- |
| $y$ | 0.5 | 5.2 | 21.0 | 30.1 | 28.9 |

a) Plot a scatter diagram of the data

Answer:

x <- c(11, 15, 19, 24, 25)
y <- c(0.5, 5.2, 21.0, 30.1, 28.9)
plot(x,y)

b) Based on a scatter diagram, would you estimate the correlation coefficient to be positive, close to zero, or negative?

Answer: The correlation coefficient should be positive.

c) Interpret your results from parts (a) and (b).

Answer: We expect that the higher the temperature during the incubation period the longer the incubation period is.
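
The visual estimate in part (b) can be checked numerically. The question only asks for an estimate from the scatter diagram, so the sketch below is an optional verification using base R's `cor()`; the name `r_xy` is illustrative.

```r
x <- c(11, 15, 19, 24, 25)
y <- c(0.5, 5.2, 21.0, 30.1, 28.9)

# Sample (Pearson) correlation coefficient between temperature and incubation time
r_xy <- cor(x, y)
r_xy
```

A value close to $1$ agrees with the upward trend in the scatter diagram, although with only five data points the estimate should be read with caution.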