Dr. Sarah's Statistics Review Sheet
One 8.5 x 11 sheet with writing on both sides is
allowed.
A calculator and a straight edge/ruler
are mandatory.
Some of the questions will be similar to problems you have seen before.
There will also be some problems with new twists to test your
understanding of the material. Partial credit will be granted.
For example, you can receive full credit for part b) even if it depends
on part a), which you did incorrectly, as long as you communicate
your understanding of part b) on your test.
Creating various statistical
representations of data:
Analyzing and critiquing various statistical
representations of data:
- Make sure that you understand
what kind of information you can get from statistical
representations where you are not given the underlying data (like the
ASULearn Material Review Questions for Test 3).
- Make sure that you
understand how to give various spins on data and statistical
representations (as if you were
in advertising and wanted to use the stats to say something positive or
negative about the situation, like "Here's Good News... SAT scores
are declining at a slower rate").
- Make sure that you understand statistical common sense
in the context of the real life problem that we are working on
in order to critique statistical methodology of data collection,
interpretations, conclusions and predictions. For example:
- problematic instructions, as on the MRT test, may affect the results,
- it doesn't make sense
to predict someone's ability when they haven't slept for 2000
hours because a human can't stay awake that long
- the stock market
is based on emotion and thus not predictable.
If the r^2 value were 100% for an increasing stock, we would suspect that
something fishy is going on, since
stock prices should fluctuate instead of following a perfectly linear trend. So we might
be just as safe shorting the stock (betting it will decline) as investing in
it.
- It is problematic to use data to predict far into the future,
except in very specialized circumstances like the egg bungee, where we
had good reason to expect a linear trend to continue because
the slope (the change in distance stretched per added
rubber band) was approximately constant,
since each rubber band stretched about the same amount...
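The egg bungee reasoning can be checked numerically. The sketch below uses made-up measurements (not the class data) and computes the change in stretch per added rubber band between consecutive measurements. If those values stay close to one another, a linear trend is a reasonable assumption for a few more bands.

```python
# Hypothetical measurements for illustration only (not the class data):
# number of rubber bands and the distance stretched, in centimeters.
bands = [1, 2, 3, 4, 5]
stretch_cm = [12.0, 23.5, 36.0, 47.5, 60.0]

# Slope between each pair of consecutive measurements:
# change in distance stretched per added rubber band.
slopes = [
    (stretch_cm[i + 1] - stretch_cm[i]) / (bands[i + 1] - bands[i])
    for i in range(len(bands) - 1)
]

average_slope = sum(slopes) / len(slopes)
spread = max(slopes) - min(slopes)  # a small spread suggests a roughly constant slope

print("slopes between consecutive points:", slopes)
print("average slope:", average_slope)
print("spread of slopes:", spread)
```

A small spread relative to the average slope is what "approximately constant" means here. A large spread would be a warning against extending the line.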
Review
- mean and median (lab, Heart of Math reading)
- boxplot (How Do You Know readings
and ASULearn Material Review Questions for Test 3;
be sure that you understand and could do similar problems)
- linear regression, including
what kind of predictor something is (see HDYK p. 179
and put this chart on your cheat sheet),
what prediction the regression line
gives for x-values not on the graph (see Example 4.4 on p. 179,
and p. 187 #15 part b and c,
and be able to do similar calculations using the line),
and whether or not the prediction makes sense given the context of the
problem (see also egg bungee and linear regression lab worksheets,
class notes, ASULearn questions, Buchanan votes class analysis...).
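As a refresher on using a regression line for prediction, here is a minimal sketch with made-up data (not taken from the readings or labs). The helper name fit_line is hypothetical. It computes the least-squares slope and intercept, the r^2 value, and the prediction for an x-value that is not on the graph.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept, plus r^2 (hypothetical helper)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept, r_squared = fit_line(xs, ys)

# Prediction for an x-value outside the range of the data.
x_new = 10
prediction = slope * x_new + intercept

print("slope:", slope, "intercept:", intercept)
print("r^2:", r_squared)
print("prediction at x =", x_new, "is", prediction)
```

The line always produces a number for x_new. Whether that number makes sense is a separate question that depends on the context of the problem.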
Know the big picture ideas, but it is NOT necessary to know the
calculation details, i.e., I won't ask you to do a calculation, but
I do expect you to be able to discuss the highlights and point of our
homework readings and class discussions on
- golden mean
- census and sampling issues
- stereotype vulnerability and statistics related to success in mathematics
- stock market issues
- HIV testing
- election issues
The big picture as related to our mathematicians
First review the top portion of the
test 2 study guide for the
themes.
We employed the same methodology in the statistics segment.
For this test you will be asked to look at examples from the statistics
segment and discuss the similarities. Ideas from the rest of this
study guide work well here. For example:
Impossibility of checking all the cases, but finding a solution by
shifting our viewpoint.
It is impossible to
conduct a census of everyone for most data collection situations, so
we have to carefully shift our view to sampling issues (elaborate).
Viewing objects that are impossible to see by managing small pieces at
a time.
When we examined the Vietnam Draft data, we obtained a scatterplot
that looked random to the naked eye, but it was hard to tell because there
were so many points. The regression line had a negative
slope, indicating that the later in the year a birthday occurred,
the lower the draft number tended to be, but the r^2 value was very small,
indicating a very weak prediction. It was impossible to see any patterns
via the complete picture, so instead we broke the data up into smaller pieces,
by month, via 12 boxplots. Here it was easy to see the pattern.
November and December birthdays had a 75% chance of being drafted because
q3 was under the 196 draft number, while
earlier birthdays only had a 50% chance, since the median was near the 196
draft number.
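The link between a boxplot's quartiles and these percentages can be sketched as follows. The draft numbers below are invented for illustration and are not the real lottery data. The cutoff of 196 comes from the discussion above, and treating "under 196" as a strict inequality is an assumption.

```python
from statistics import quantiles

CUTOFF = 196  # draft number from the discussion above; "under" read as strictly less than

# Invented draft numbers for two months (illustration only, not the real lottery data).
late_month = [5, 20, 34, 51, 77, 98, 120, 143, 160, 181, 250, 310]
early_month = [10, 45, 88, 130, 170, 190, 202, 240, 275, 300, 333, 360]

def summarize(numbers):
    """Return (median, q3, fraction of numbers under the cutoff)."""
    q1, med, q3 = quantiles(numbers, n=4)  # the three quartile cut points
    fraction_under = sum(1 for n in numbers if n < CUTOFF) / len(numbers)
    return med, q3, fraction_under

med_late, q3_late, frac_late = summarize(late_month)
med_early, q3_early, frac_early = summarize(early_month)

print("late month:  median", med_late, "q3", q3_late, "fraction under cutoff", frac_late)
print("early month: median", med_early, "q3", q3_early, "fraction under cutoff", frac_early)
```

When q3 falls below the cutoff, roughly three quarters or more of that month's numbers are under it. When the median sits near the cutoff, about half are.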
Similarly, reflect on statistics examples that satisfy the following:
Impossibility of constructing a solution but finding a non-constructivist
approach.
Reaching numerous conclusions from a complete set of measurements