Practice final exam questions STAT2331.

Topics:

Displaying data (stemplots etc., including ideas of symmetry, skewness, outliers,
modes)

Measures of center (mean, median), ideas of robustness vs efficiency

Measures of variation (IQR, s)

Regression and correlation

Density curves including the normal curve and the 68-95-99.7 rule

Two-way tables, 3-way tables and Simpson's paradox

Probability rules, sampling distributions, the central limit theorem,
confidence intervals, and conditional probability (including conditional
probability computed from two-way tables)

Testing and confidence intervals. CIs for means (one
and two sample), proportions (also one and two sample), and the 1-sample t-test.
Don't forget about the little things like defining parameters. You will also
need to know definitions such as those of Type I and Type II errors and of
the p-value.

Disclaimer: These questions are intended as a guide to the possible kinds of questions and topics you may encounter on the real exam. The real exam may contain questions on topics that are not presented in this practice exam. In addition it may contain questions on the same topics, but exploring different aspects. There are no guarantees.

What to bring with you: 3 sheets of formulae from your previous exams, a calculator and your t-table. Don't forget to add in any additional material.

Here are the practice questions:

Multiple choice:

1) The output from a sewage treatment plant is constantly monitored to
assess treatment efficacy. Suppose that the mean coliform content is 20
bacteria/ml with a standard deviation of 4 bacteria/ml. An automatic measuring
device is being used to monitor the bacterial levels. An alarm should ring
whenever the bacterial level exceeds the 97.5th percentile. At what level
should the alarm ring?

- 20 per ml
- 24 per ml
- 28 per ml
- 32 per ml
- 16 per ml

2) A research biologist has carried out an experiment on a random sample of 15 experimental plots in a field. Following the collection of data, a test of significance was conducted under appropriate null and alternative hypotheses and the P-value was determined to be approximately .03. This indicates that:

- this result is statistically significant at the .01 level.
- the probability of being wrong in this situation is only .03.
- this result is statistically significant at the .05 level.
- if this experiment were repeated 3 per cent of the time we would get this same result.
- the sample is so small that little confidence can be placed on the result.

3) Does playing music to dairy cattle increase their milk production? An experiment was conducted where a group of dairy cattle was divided into two groups. Music was played to one group; the control group did not have music played. The average increase in production of the music (treatment) group as compared to the control group was 2.5 L/cow over the time period in question. A 95% confidence interval for the difference (treatment-control) in the mean production was computed to be (1.5,3.5) L/cow. (Note here that L=liter.) This means:

- 95% of the cows increased their production by between 1.5 and 3.5 L.
- We are 95% confident that the average increase in production in the sample is 2.5 L/cow.
- Because the confidence interval does not contain zero, we are 95% confident that there was no effect of playing music.
- We don't know the true increase in production, but we will claim that the true increase in production is between 1.5 and 3.5 L/cow.
- Because the confidence interval does not include zero, we are 95% confident that the true increase in production for all cows is 2.5 L

Short answer

4) I am weighing cats to determine the average weight of a cat. I obtain a random sample of 25 cats and find a mean weight of 10 pounds with a standard deviation of 2 pounds. Find a 98% CI for the true mean weight of cats.
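
As a check on the arithmetic for question 4, here is a minimal sketch of a one-sample t interval. The critical value 2.492 is an assumption read from a standard t-table (24 degrees of freedom, upper-tail area 0.01); use the value from your own table. The sketch also assumes the weights are roughly normal, which the question does not state.

```python
import math

# Summary statistics given in question 4
n = 25        # sample size
xbar = 10.0   # sample mean (pounds)
s = 2.0       # sample standard deviation (pounds)

# Assumption: t* for df = n - 1 = 24 and 98% confidence, taken from a t-table
t_star = 2.492

se = s / math.sqrt(n)        # standard error of the mean
margin = t_star * se         # margin of error
ci = (xbar - margin, xbar + margin)

print(f"98% CI for the mean weight: ({ci[0]:.2f}, {ci[1]:.2f}) pounds")
```

With these inputs the interval is roughly 9.0 to 11.0 pounds.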

5) A study of post-operative pain relief is conducted to determine if drug A
has a different duration of pain relief from drug B. Observations of the hours
of pain relief are recorded for 55 patients given drug A and 58 given drug B.
For drug A the average number of hours of pain relief is 5.64 with a standard
deviation of 1.25 hours. For drug B the mean is 5.03 hours with a standard
deviation of 1.82 hours.

(a) Find a 95% CI for the true mean difference in hours of pain relief between
the two drugs.

(b) Use your CI to test for a difference. Don't forget to make a conclusion.
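
One possible way to work question 5 is sketched below, using the unpooled two-sample t interval. The critical value 2.009 is an assumption: it is the t-table entry for 50 degrees of freedom, the nearest table row below the conservative choice df = min(55, 58) - 1 = 54. A different df rule (or z = 1.96) changes the endpoints slightly.

```python
import math

# Summary statistics given in question 5
n_a, mean_a, sd_a = 55, 5.64, 1.25   # drug A
n_b, mean_b, sd_b = 58, 5.03, 1.82   # drug B

# Assumption: t* from a t-table, df = 50 row, 95% confidence
t_star = 2.009

diff = mean_a - mean_b                               # estimated difference (A - B)
se = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)        # unpooled standard error
margin = t_star * se
ci = (diff - margin, diff + margin)

# Part (b): a two-sided test at the 5% level rejects "no difference"
# exactly when zero lies outside the 95% interval.
zero_inside = ci[0] <= 0 <= ci[1]

print(f"95% CI for mean(A) - mean(B): ({ci[0]:.3f}, {ci[1]:.3f}) hours")
print("Zero inside interval:", zero_inside)
```

Under these assumptions the interval runs from about 0.02 to about 1.20 hours and excludes zero, so the data suggest drug A gives longer relief on average. The lower endpoint is close to zero, so the conclusion is not a strong one.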

6) The Denver Post reported that based on a survey of 300 residents in the Denver area, 75% of those surveyed agreed with the statement "leaving old-growth forests unharvested is a good strategy". However a random sample of 116 Aspen residents showed that 82% agreed with this statement. Can you claim there is a difference between the proportions who support the statement?
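
A rough sketch of one approach to question 6 is a pooled two-proportion z-test. It assumes both surveys can be treated as independent random samples (the question only says so for Aspen) and uses a 5% significance level, which the question does not specify.

```python
import math
from statistics import NormalDist

# Sample sizes and sample proportions given in question 6
n_denver, p_denver = 300, 0.75
n_aspen, p_aspen = 116, 0.82

# Pooled proportion under H0: the two population proportions are equal
pooled = (n_denver * p_denver + n_aspen * p_aspen) / (n_denver + n_aspen)
se = math.sqrt(pooled * (1 - pooled) * (1 / n_denver + 1 / n_aspen))

z = (p_aspen - p_denver) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided P-value

alpha = 0.05   # assumption: significance level not stated in the question
print(f"z = {z:.2f}, two-sided P-value = {p_value:.3f}")
print("Reject H0:", p_value < alpha)
```

Here z comes out near 1.5 with a P-value of roughly 0.13, so at the 5% level there is not enough evidence to claim a difference.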

7) It has been found through a study of elite male runners that their weights
and heights are roughly linearly related by the following equation

weight = -100 + 3.43 height

where weight is in pounds and height in inches.

(a) Predict the weight of an elite male runner of height 68 inches.

(b) I have an elite male runner of 140 pounds. How tall do you think he is?

(c) Is the correlation between height and weight positive here?

(d) Is this a strong positive relationship? (Answer Yes, No or can't tell, with
a brief reason.)

(e) Suppose I have an elite runner who weighs 200 pounds. What height would he
be? Comment on the use of this model for such a runner.
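
The calculations in question 7 can be sketched as below. The function names are made up for illustration. Solving the fitted line for height is only a rough back-calculation; strictly, predicting height from weight calls for a separate regression of height on weight, which the question does not provide.

```python
def predict_weight(height_in):
    """Fitted line from question 7: weight (lb) = -100 + 3.43 * height (in)."""
    return -100 + 3.43 * height_in


def height_for_weight(weight_lb):
    """Solve the fitted line for height (a back-calculation, not a new regression)."""
    return (weight_lb + 100) / 3.43


weight_68 = predict_weight(68)         # part (a)
height_140 = height_for_weight(140)    # part (b)
height_200 = height_for_weight(200)    # part (e)
slope_positive = 3.43 > 0              # part (c): r has the same sign as the slope

print(f"(a) predicted weight at 68 in: {weight_68:.2f} lb")
print(f"(b) height for 140 lb: {height_140:.1f} in")
print(f"(c) correlation positive: {slope_positive}")
print(f"(e) height for 200 lb: {height_200:.1f} in")
```

The slope alone says nothing about how strong the relationship is, so part (d) cannot be settled without r. The height of about 87 inches in part (e) is well over seven feet, which suggests the line is being pushed outside the range of runners it was fitted to.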

8) Studies show that the chances of being killed in a serious auto accident are .2 if you are wearing a seatbelt and .7 if you aren't wearing a seatbelt. Also about 75% of the driving population regularly wear seatbelts. Suppose you hear that someone has been killed driving in a serious auto accident. What is the chance they were wearing their seatbelt?
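
Question 8 is a conditional probability reversal (Bayes' rule). The sketch below assumes that the 75% who "regularly wear seatbelts" can be read as the probability that a person in a serious accident was wearing one.

```python
# Probabilities given in question 8
p_belt = 0.75                  # P(wearing a seatbelt); see assumption above
p_killed_given_belt = 0.2      # P(killed | seatbelt)
p_killed_given_no_belt = 0.7   # P(killed | no seatbelt)

p_no_belt = 1 - p_belt

# Total probability of being killed in a serious accident
p_killed = p_belt * p_killed_given_belt + p_no_belt * p_killed_given_no_belt

# Bayes' rule: P(seatbelt | killed)
p_belt_given_killed = p_belt * p_killed_given_belt / p_killed

print(f"P(killed) = {p_killed:.3f}")
print(f"P(seatbelt | killed) = {p_belt_given_killed:.3f}")
```

The result is 0.15 / 0.325, or about 0.46. The same numbers can be laid out in a two-way table of seatbelt use by outcome.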

9) Two species of predatory birds, collared flycatchers and tits, compete for nest holes during breeding season on the island of Gotland, Sweden. Frequently, dead flycatchers are found in nest boxes occupied by tits. A field study examined whether the risk of mortality to flycatchers is related to the degree of competition between the two bird species for nest sites. Specifically the study found a relationship between y, the number of flycatchers killed, and x, the nest box tit occupancy rate expressed as a percentage (not as a fraction!!). The relationship was approximately linear, with the best-fitting line given by

y = -3.05 + .11x

(a) Predict the average number of flycatchers killed if the nest box tit occupancy is 50% (don’t write 50% as a fraction, it is x=50, not x=0.5).

(b) Suppose you find a site where 3 flycatchers have been killed. What is your best guess as to the % of nest boxes occupied by tits?

(c) Suppose r=0.75. What proportion of variation in number of flycatchers killed is explained by its linear relationship to nest box occupancy by tits?

(d) Predict the number of flycatchers killed if nest box occupancy by tits is 10%.

(e) Comment on the answer to (d). What does it tell you about the model?
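
A short sketch of the arithmetic for question 9 follows, with x entered as a percentage as the question requires. The helper name is hypothetical.

```python
def predict_killed(occupancy_pct):
    """Fitted line from question 9: y = -3.05 + 0.11 * x, with x in percent."""
    return -3.05 + 0.11 * occupancy_pct


killed_at_50 = predict_killed(50)          # part (a)
occupancy_for_3 = (3 + 3.05) / 0.11        # part (b): solve the line for x
r = 0.75
r_squared = r ** 2                         # part (c): proportion of variation explained
killed_at_10 = predict_killed(10)          # part (d)

print(f"(a) predicted kills at 50% occupancy: {killed_at_50:.2f}")
print(f"(b) occupancy for 3 kills: {occupancy_for_3:.1f}%")
print(f"(c) r^2 = {r_squared:.4f}")
print(f"(d) predicted kills at 10% occupancy: {killed_at_10:.2f}")
```

The prediction at 10% occupancy is negative (about -1.95), which is impossible for a count. That points to the line not being usable at low occupancy rates, most likely because such values lie outside the data used to fit it.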