Question 1
Emerson's math test scores are listed below:
$87$, $93$, $92$, $25$, $96$
a) Find the median.
dat1 <- c(87,93,92,25,96)
median(dat1)
b) Find the sample mean.
mean(dat1)
c) Find the sample standard deviation.
sd(dat1)
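As a cross-check on the built-in functions, the same three statistics can be computed from their definitions. This is only a sketch; the variable names (`scores`, `xbar`, `s_manual`, `med_manual`) are illustrative and not part of the original solution.

```r
# Emerson's scores, as given in the question
scores <- c(87, 93, 92, 25, 96)

n <- length(scores)
xbar <- sum(scores) / n                  # sample mean
dev_sq <- (scores - xbar)^2              # squared deviations from the mean
s_manual <- sqrt(sum(dev_sq) / (n - 1))  # sample sd divides by n - 1, not n

sorted <- sort(scores)
med_manual <- sorted[(n + 1) / 2]        # middle value; valid here because n is odd
```

Note that the single low score of $25$ pulls the mean well below the median, which is why the two measures of center differ so much for this data set.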
Question 2: Answer the following questions.
a) An animal shelter has a $58\%$ adoption rate for puppies. Of all puppies in the shelter, $80\%$ live to be $7$ years or older. Of the puppies who are adopted, $90\%$ live to be $7$ years or older. What is the probability that a randomly selected puppy in the shelter will get adopted and live $7$ or more years?
Answer:
The sample space is the set of all puppies living in the shelter.
Let $A$ be the event that a puppy is adopted, and $O$ the event that a puppy lives to be $7$ years or older. We are given $P(A) = .58$, $P(O) = .8$, and $P(O|A) = .9$, and we are asked to find $P( A \textbf{ and } O)$. We will use the formula
$P( A \textbf{ and } O) = P(A)\, P(O|A)$
.58 * .9
So $P(A \textbf{ and } O) = .522$
b) The probability of a plant going to seed is $34\%$. The probability of that same type of plant surviving the winter is $38\%$, and the probability of both is $10\%$. What is the probability that a randomly selected plant will go to seed or survive the winter?
Answer:
The sample space is the set of all plants of that type.
Let $S$ be the event that a plant will seed, and $W$ the event that a plant will survive the winter. We are given $P(S) = .34$, $P(W) = .38$, and $P(S \textbf{ and } W) = .1$, and we are asked to find $P(S \textbf{ or } W)$. We will use the formula:
$P(S \textbf{ or } W) = P(S) + P(W) - P(S \textbf{ and } W)$
.34 + .38 - .1
So $P(S \textbf{ or } W) = .62$
Question 3: A hair salon completed a survey of $347$ customers about satisfaction with service and type of customer. A walk-in customer is one who has seen no ads and not been referred. The other customers either saw a TV ad or were referred to the salon (but not both). The results follow.
Satisfaction | Walk-In | TV Ad | Referred | Total |
---|---|---|---|---|
Dissatisfied | $21$ | $7$ | $2$ | $30$ |
Neutral | $21$ | $22$ | $36$ | $79$ |
Satisfied | $26$ | $41$ | $61$ | $128$ |
Very Satisfied | $31$ | $35$ | $44$ | $110$ |
Total | $99$ | $105$ | $143$ | $347$ |
Assume the table represents the entire population of customers. Find the probability that a customer is
a) Dissatisfied
Answer: From the table we see that $P(\text{Dissatisfied}) = \dfrac{30}{347}$
30/347
b) Dissatisfied and a walk-in
Answer: From the table we see that $P(\text{Dissatisfied and a walk-in}) = \dfrac{21}{347}$
21/347
c) Referred
Answer: $P(\text{Referred}) = \dfrac{143}{347}$
143/347
d) Very satisfied, given referred
Answer: There are $143$ referred customers, $44$ of whom are very satisfied. So
$P(\text{Very Satisfied, given Referred}) = \frac{44}{143}$
44/143
e) Very satisfied or saw a TV ad
Answer: We will use the formula: $P(\text{Very Satisfied} \textbf{ or } \text{Saw a TV ad}) = P(\text{Very Satisfied}) + P(\text{Saw a TV ad}) -P(\text{Very Satisfied} \textbf{ and } \text{Saw a TV ad})$
From the table we get:
110/347 + 105/347 - 35/347
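The five probabilities in this question can also be read off programmatically once the table is stored as a matrix. The following is a minimal sketch; the object names (`counts`, `row_tot`, `col_tot`, and the `p_...` variables) are assumptions made for illustration.

```r
# Counts copied from the survey table (rows: satisfaction, columns: customer type)
counts <- matrix(
  c(21,  7,  2,
    21, 22, 36,
    26, 41, 61,
    31, 35, 44),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("Dissatisfied", "Neutral", "Satisfied", "Very Satisfied"),
    c("Walk-In", "TV Ad", "Referred")
  )
)

total   <- sum(counts)      # all surveyed customers
row_tot <- rowSums(counts)  # totals by satisfaction level
col_tot <- colSums(counts)  # totals by customer type

p_dissatisfied <- row_tot["Dissatisfied"] / total              # part a)
p_dis_and_walk <- counts["Dissatisfied", "Walk-In"] / total    # part b)
p_referred     <- col_tot["Referred"] / total                  # part c)
# part d): condition on the Referred column only
p_vs_given_ref <- counts["Very Satisfied", "Referred"] / col_tot["Referred"]
# part e): addition rule, subtracting the overlap counted twice
p_vs_or_tv <- (row_tot["Very Satisfied"] + col_tot["TV Ad"] -
               counts["Very Satisfied", "TV Ad"]) / total
```

Working from the matrix makes the difference between parts b) and d) explicit: the joint probability divides by the grand total, while the conditional probability divides by the column total.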
Question 4: A basketball player makes $70\%$ of the free throws he shoots. Suppose that he tries $15$ free throws.
a) What is the probability that he will make more than $7$ throws?
b) Find the expected value.
c) Find the standard deviation.
Answer:
We have a binomial distribution with probability of success (makes the free throw) $p=.7$, and $n=15$.
a) We are asked for $P(r > 7) = P(r=8) + P(r=9) + \cdots + P(r=15)$
dbinom(8, size = 15, prob = .7) + dbinom(9, size = 15, prob = .7) + dbinom(10, size = 15, prob = .7) + dbinom(11, size = 15, prob = .7) + dbinom(12, size = 15, prob = .7) + dbinom(13, size = 15, prob = .7) + dbinom(14, size = 15, prob = .7) + dbinom(15, size = 15, prob = .7)
Alternatively we could have used the cumulative probability for the binomial distribution:
1 - pbinom(7, size = 15, prob = .7)
b) The expected value is given by $\mu = n\,p$
15*.7
c) The standard deviation is given by the formula: $\sigma = \sqrt{n\, p\,q}$ where, in our case, the probability of failure $q = 1 - .7 = .3$
sqrt(15*.7*.3)
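The shortcut formulas above can be checked against the definitions of expected value and standard deviation for a discrete distribution. The sketch below does this with a vectorized call to `dbinom`; the variable names are illustrative only.

```r
n <- 15
p <- 0.7
r <- 0:n
pmf <- dbinom(r, size = n, prob = p)  # P(r = 0), ..., P(r = 15)

# Part a): sum the probabilities for r = 8, ..., 15
p_more_than_7 <- sum(pmf[r > 7])
# Same area, requested directly as an upper tail
p_upper_tail <- pbinom(7, size = n, prob = p, lower.tail = FALSE)

# Parts b) and c) from the definitions rather than the shortcut formulas
mu    <- sum(r * pmf)                  # expected value
sigma <- sqrt(sum((r - mu)^2 * pmf))   # standard deviation
```

Passing the whole vector `0:n` to `dbinom` avoids typing out eight separate terms, and `lower.tail = FALSE` is equivalent to subtracting the cumulative probability from $1$.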
Question 5: Let $x$ be a random variable that represents the length of time it takes a student to write a response paper. It was found that $x$ has an approximately normal distribution with mean $\mu = 7.2$ hours and standard deviation $\sigma = 1.8$ hours.
a) What is the probability that it takes at least $5$ hours for a student to write a response paper?
b) Suppose $20$ students are selected at random. What is the probability that the mean time $\bar{x}$ of writing a paper for these $20$ students is not more than $8$ hours?
Answer: a) We have a normal distribution with $\mu = 7.2$ and $\sigma = 1.8$, and we are asked to find $P(x\ge 5)$.
1 - pnorm(5, mean = 7.2, sd = 1.8)
b) The distribution of the sample means $\bar{x}$ is normal with $\mu_{\bar{x}} = 7.2$ and standard deviation $\sigma_{\bar{x}} = \frac{1.8}{\sqrt{20}}$.
1.8/sqrt(20)
We want the probability $P(\bar{x} \le 8)$
pnorm(8, mean = 7.2, sd = 1.8/sqrt(20))
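The same probability can be obtained by first converting the sample mean to a $z$-score, which is how the calculation would be done with a standard normal table. This is a sketch with illustrative variable names; it keeps the standard error unrounded so that both routes agree.

```r
mu    <- 7.2
sigma <- 1.8
n     <- 20

se <- sigma / sqrt(n)   # standard deviation of the sample mean
z  <- (8 - mu) / se     # z-score of a sample mean of 8 hours

p_direct <- pnorm(8, mean = mu, sd = se)  # area to the left of 8 under the x-bar distribution
p_from_z <- pnorm(z)                      # same area under the standard normal curve
```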
Question 6: A random sample of $14$ candy store franchises had a mean start-up cost of $\bar{x} = \$104.70$ thousand and $s = \$28.30$ thousand. Find a $95 \%$ confidence interval for the population average start-up cost $\mu$ for candy store franchises. Assume $x$ has a distribution that is approximately normal.
Answer: We have to find a confidence interval using the Student t-distribution. The confidence level is $95\%$, so the sum of the two tails is $\alpha = .05$ and each tail has area $.025$. The area up to the critical value is therefore $.975$. We use the inverse t function, with $n-1 = 13$ degrees of freedom:
t <- qt(.975, df = 13)
t
So we can calculate the margin of error using the formula: $E = t_c \frac{s}{\sqrt{n}}$
E <- t*28.30/sqrt(14)
E
The endpoints of the $95\%$ confidence interval are then $\bar{x} \pm E = 104.70 \pm 16.3399$
104.70 - E
104.70 + E
So the interval is $[ 88.36, 121.04]$
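The steps above can be collected into a small function that works from summary statistics. Note that `t_interval` is a hypothetical helper written for this sketch, not a base R function and not part of the original solution.

```r
# Hypothetical helper: t confidence interval from a sample mean, sample sd, and sample size
t_interval <- function(xbar, s, n, level = 0.95) {
  alpha  <- 1 - level
  t_crit <- qt(1 - alpha / 2, df = n - 1)  # critical value with n - 1 degrees of freedom
  E      <- t_crit * s / sqrt(n)           # margin of error
  c(lower = xbar - E, upper = xbar + E)
}

# Values from the question, in thousands of dollars
ci <- t_interval(xbar = 104.70, s = 28.30, n = 14)
round(ci, 2)
```

Changing `level` (for example to `0.99`) widens or narrows the interval without repeating the tail-area arithmetic by hand.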
Question 7: Let $x$ be a random variable that represents the hemoglobin count (HC) in human blood (measured in grams per milliliter). In healthy adult females, $x$ has an approximately normal distribution with a population mean of $\mu=14.2$, and population standard deviation of $\sigma=2.5$. Suppose a female patient had $10$ blood tests over the past year, and the sample mean HC was determined to be $\overline x =15.1$. With a level of significance $\alpha =.05$, determine whether the patient's HC is higher than the population average. Specifically, do the following:
a) State the null hypothesis $H_0$ and the alternate hypothesis $H_1$.
b) Determine the value of the sample test statistic (either $z$ or $t$).
c) Find the $P$-value
d) Based on your answers for parts (a) through (c), would you reject or fail to reject the null hypothesis?
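Answer: The hypotheses are $H_0: \mu = 14.2$ and $H_1: \mu > 14.2$, since we are testing whether the patient's mean HC is higher than the population average.
The population standard deviation is known, so we are going to use a $z$-test. This is a right-tailed test with level of significance $\alpha = .05$. The sample test statistic is given by the formula: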
$z = \frac{\bar{x} - \mu}{\sigma}\, \sqrt{n} = \frac{15.1 - 14.2}{2.5}\, \sqrt{10}$
z = (15.1 - 14.2)*sqrt(10)/2.5
z
The $p$-value for this test is then $P(z > 1.14) = P(z<-1.14)$.
pnorm(-z, mean = 0, sd = 1)
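Since the $p$-value is more than the level of significance, we fail to reject $H_0$: at the $5\%$ level, the data do not show that the patient's HC is higher than the population average.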
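The decision can be reached either by comparing the $p$-value with $\alpha$ or by comparing the test statistic with the critical value for a right-tailed test. The sketch below shows both; the variable names (`mu0`, `z_crit`, `reject`) are illustrative assumptions.

```r
mu0   <- 14.2   # population mean under H0
sigma <- 2.5    # known population standard deviation
n     <- 10
xbar  <- 15.1
alpha <- 0.05

z       <- (xbar - mu0) / (sigma / sqrt(n))  # sample test statistic
p_value <- pnorm(z, lower.tail = FALSE)      # right-tail area beyond z
z_crit  <- qnorm(1 - alpha)                  # critical value for a right-tailed test

reject <- p_value < alpha                    # equivalent to checking z > z_crit
```

Here `reject` is `FALSE`, which agrees with the conclusion above: the test statistic does not reach the critical value, so $H_0$ is not rejected.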
Question 8: In South Africa the size of locust populations may be related to the average temperature during the time of year when most insect eggs incubate. In the following table $x$ is a random variable representing the average temperature over the incubation period in degrees Celsius while $y$ represents the length of incubation period in hours.
$\mathrm{x}$ | $11$ | $15$ | $19$ | $24$ | $25$ |
---|---|---|---|---|---|
$\mathrm{y}$ | $0.5$ | $5.2$ | $21.0$ | $30.1$ | $28.9$ |
a) Plot a scatter diagram of the data
Answer:
x <- c(11, 15, 19, 24, 25)
y <- c(0.5, 5.2, 21.0, 30.1, 28.9)
plot(x,y)
b) Based on a scatter diagram, would you estimate the correlation coefficient to be positive, close to zero, or negative?
Answer: The correlation coefficient should be positive.
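The visual estimate can be confirmed by computing the sample correlation coefficient. This sketch uses the built-in `cor` and, for comparison, the defining formula; `r_manual` is an illustrative name.

```r
x <- c(11, 15, 19, 24, 25)
y <- c(0.5, 5.2, 21.0, 30.1, 28.9)

r <- cor(x, y)  # Pearson sample correlation coefficient

# Same quantity from the definition: sum of cross-deviations over (n - 1) * s_x * s_y
n <- length(x)
r_manual <- sum((x - mean(x)) * (y - mean(y))) / ((n - 1) * sd(x) * sd(y))

sign(r)  # +1 indicates a positive linear relationship
```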
c) Interpret your results from parts (a) and (b).
Answer: The points generally rise from left to right, so higher average temperatures during the incubation period are associated with longer incubation periods.