Teaching
Lecture Content
1. Biological inference
• Why statistics?
• Hypothesis versus description (in general)
• Deducing causality. Correlation vs. causation.
• Pseudoreplication. Independence.
• Strong versus weak inference
• Subjectivity/bias.
• Statistical versus biological significance.
2. Study design.
• Observational & experimental studies: relative merits
• Designing an observational study. Sampling. Stratification.
• Efficient regression designs
• Designing an experiment. Controls. Randomisation. Replication. Initial conditions.
• Natural experiments
• Basic experimental designs: completely randomised; fully factorial; recognition of less than fully cross-factored designs:
nested, Latin squares, repeated measures, matched pairs.
• Blocking
• Interpreting interactions
3. Models of Randomness
• Populations versus samples.
• Null models: what would have happened if nothing was going on?
• Randomisation tests
• Models of biological ‘randomness’:
• Families of curves for describing data and test statistics
• These curves represent probability distributions that can be used for hypothesis testing
At the end, a feel for normal, standard normal, binomial and Poisson distributions as descriptions of biological variability
and their use in estimating the probability that the data could have arisen by chance if nothing was going on (hypothesis
testing).
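As a rough illustration of the null-model idea in this lecture, here is a minimal Python sketch of a randomisation test for a difference between two group means. The function name `randomisation_test`, the number of shuffles and the measurements are all made up for illustration; they are not course material.

```python
import random
from statistics import mean


def randomisation_test(group_a, group_b, n_shuffles=2000, seed=1):
    """Two-sided randomisation test for a difference in means.

    Returns the proportion of shuffled data sets whose absolute
    difference in means is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_shuffles):
        # Null model: if nothing is going on, group labels are arbitrary,
        # so any reshuffling of the labels is equally likely.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Count the observed arrangement itself, so the estimate is never zero.
    return (extreme + 1) / (n_shuffles + 1)


# Made-up measurements, for illustration only.
control = [4.1, 3.8, 4.4, 4.0, 3.9, 4.2]
treated = [5.0, 5.3, 4.9, 5.4, 5.1, 5.2]
p_value = randomisation_test(control, treated)
print(f"randomisation p-value: {p_value:.4f}")
```

The p-value here is simply the fraction of relabelled data sets that look at least as extreme as the real one, which is the "what would have happened if nothing was going on?" question answered by brute force.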
4. Confidence intervals and hypothesis testing.
• Confidence intervals - what they mean
• t-distribution
• Degrees of freedom
• Hypothesis testing
• meaning of a p-value
• one- and two-tailed tests
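A minimal Python sketch of a 95% confidence interval for a mean, using the t-distribution with n - 1 degrees of freedom. To stay within the standard library it uses a short table of standard two-tailed 95% critical values (rounded to three decimals) for 1 to 10 degrees of freedom only; the function name `mean_ci_95` and the sample are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# Two-tailed 95% critical values of Student's t by degrees of freedom.
# Only df 1 to 10 are tabulated in this sketch (an assumption made for
# brevity); a statistics package would supply any df.
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def mean_ci_95(sample):
    """Return (lower, upper) limits of a 95% CI for the population mean."""
    n = len(sample)
    df = n - 1  # one degree of freedom is used up estimating the mean
    if df not in T_CRIT_95:
        raise ValueError("this sketch only tabulates t for 1 to 10 df")
    standard_error = stdev(sample) / sqrt(n)
    half_width = T_CRIT_95[df] * standard_error
    centre = mean(sample)
    return centre - half_width, centre + half_width


# Made-up sample, for illustration only.
sample = [4.0, 5.0, 6.0]
low, high = mean_ci_95(sample)
print(f"95% CI for the mean: {low:.2f} to {high:.2f}")
```

With so few degrees of freedom the critical value is much larger than the familiar 1.96 of the normal distribution, which is why small samples give wide intervals.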
5. When to do which tests
• Do pictures first!!!
• Choice charts.
• Relative merits of parametric and non-parametric analyses
• Assumptions of parametric tests
• Recognising violations: model fitting; diagnostics
• What to do about violations: transformations
• Transformations: arcsin, log, square root, Box-Cox.
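The transformations listed above can be sketched in a few lines of Python. The function names are hypothetical, and the Box-Cox parameter `lam` is chosen by hand here rather than estimated from the data as it would be in practice.

```python
from math import asin, log, sqrt


def arcsine_sqrt(p):
    """Arcsine square-root transform for a proportion between 0 and 1."""
    return asin(sqrt(p))


def box_cox(y, lam):
    """Box-Cox transform of a positive value y.

    lam = 0 gives the log transform; lam = 0.5 is a rescaled square root;
    lam = 1 leaves the shape of the data unchanged (just y - 1).
    """
    if y <= 0:
        raise ValueError("Box-Cox needs strictly positive data")
    if lam == 0:
        return log(y)
    return (y ** lam - 1) / lam


# Made-up counts and proportions, for illustration only.
counts = [1, 4, 9, 16]
proportions = [0.1, 0.5, 0.9]
print("square root:", [sqrt(c) for c in counts])
print("log:", [log(c) for c in counts])
print("Box-Cox, lam=0.5:", [box_cox(c, 0.5) for c in counts])
print("arcsine square root:", [arcsine_sqrt(p) for p in proportions])
```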
6. ANOVA I [This lecture is largely from the chapter in Grafen and Hails, which is available free online]
• ANOVA as an improvement on multiple t-tests.
• One way ANOVA (yield and fertiliser)
• What ANOVA is, how it works graphically
• Partitioning SS and df.
• F ratios.
• What the ANOVA table means.
• Assumptions; Diagnostics
• 2-Way ANOVA with replication (yield = fertiliser + pesticide + fertiliser*pesticide)
• Interactions
• 2-way ANOVA, no replication - no interactions
• ANOVA as a particular case of a GLM
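A minimal Python sketch of how a one-way ANOVA partitions the sums of squares and degrees of freedom and forms the F ratio. The function name `one_way_anova` and the yield figures are invented for illustration; they are not taken from Grafen and Hails.

```python
from statistics import mean


def one_way_anova(groups):
    """Partition SS and df for a one-way ANOVA and return the F ratio."""
    all_values = [y for group in groups for y in group]
    grand_mean = mean(all_values)
    group_means = [mean(group) for group in groups]

    # Between groups: how far each group mean lies from the grand mean.
    ss_between = sum(len(group) * (m - grand_mean) ** 2
                     for group, m in zip(groups, group_means))
    # Within groups: scatter of each observation around its own group mean.
    ss_within = sum((y - m) ** 2
                    for group, m in zip(groups, group_means)
                    for y in group)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return {"ss_between": ss_between, "ss_within": ss_within,
            "df_between": df_between, "df_within": df_within,
            "f_ratio": f_ratio}


# Made-up yields under three hypothetical fertilisers.
yields = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
table = one_way_anova(yields)
for name, value in table.items():
    print(name, value)
```

The between-group and within-group sums of squares add up to the total sum of squares around the grand mean, and the two df add up to n - 1, which is what the ANOVA table records.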
7. ANOVA II/General Linear Models
• Controlling for other factors in ANOVA: step from 1- to 2- way ANOVA
• GLM of a two-way ANOVA (GLM for unbalanced designs)
• What model fits mean
• Complex ANOVAs from simple situations: nesting and repeated measures
• Dirty ways of avoiding repeated measures
• Directionality tests (Rice and Gaines)
• Simple regression
• relationship to ANOVA
• which is the response variable?
• unusual/influential data points
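A minimal Python sketch of simple least-squares regression, which also shows why the choice of response variable matters: regressing y on x and x on y give different lines. The function name `fit_line` and the data are made up for illustration.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares slope and intercept for the response y on predictor x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Made-up measurements, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope_y_on_x, intercept_y_on_x = fit_line(x, y)
slope_x_on_y, intercept_x_on_y = fit_line(y, x)

# The product of the two slopes equals r squared, so unless the points lie
# exactly on a line, the two regressions are not mirror images of each other.
r_squared = slope_y_on_x * slope_x_on_y
print(f"y on x: slope {slope_y_on_x:.3f}, intercept {intercept_y_on_x:.3f}")
print(f"x on y: slope {slope_x_on_y:.3f}, intercept {intercept_x_on_y:.3f}")
print(f"r squared: {r_squared:.4f}")
```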
8. Multiple Regression/General Linear Models
• Multiple regression: 2+ continuous variables
• ANCOVA
• Uses: prediction and statistical eliminations
• Standard tests as GLMs
• Interactions (again)
• Modelling with GLMs
9. Hypothesis testing II
• What a p-value is (again!)
• Type I and II errors
• Power: definition, determinants of, why it matters, what you learn from power calculations
• Relative merits of Type I & II errors: axiomatic or context-dependent
• Multiplicity - loss of control of Type I error: the problem, some solutions
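A short Python sketch of the multiplicity problem and two common corrections. The Bonferroni and Holm adjustments are offered as examples of "some solutions"; the outline does not say which methods the lecture covers. The function names and p-values are hypothetical.

```python
def familywise_error(alpha, n_tests):
    """Chance of at least one Type I error in n independent tests at level alpha."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply each by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[i])
        # Adjusted values must not decrease as the raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


# Made-up p-values from three tests, for illustration only.
raw = [0.01, 0.04, 0.03]
print("familywise error for 20 tests at 0.05:", familywise_error(0.05, 20))
print("Bonferroni:", bonferroni(raw))
print("Holm:", holm(raw))
```

Holm never gives a larger adjusted p-value than Bonferroni, so it loses less power while still controlling the familywise Type I error rate.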
10. Analysing frequencies
• Frequency data/contingency tables (cf continuous data and replicated counts)
• When to make continuous data discrete
• Confidence intervals for a proportion
• chi-sq as a test statistic - its strange properties when expected values are small
• Goodness of fit
• Contingency tables
• Problems with analysing some tables
• Tables that mislead: Use raw data; do not pool across heterogeneous tables
• Multi-way tables (brief mention)
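A minimal Python sketch of the chi-squared statistic for a two-way contingency table. It also flags small expected values, using the common rule of thumb of an expected count below 5; that threshold is an assumption here, not something stated in this outline. The function name `chi_square_table` and the counts are made up.

```python
def chi_square_table(table):
    """Chi-squared statistic, df and expected counts for a two-way table.

    Assumes every row and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    chi_sq = sum((obs - exp) ** 2 / exp
                 for obs_row, exp_row in zip(table, expected)
                 for obs, exp in zip(obs_row, exp_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)

    # Rule-of-thumb warning (assumed threshold of 5): the chi-squared
    # approximation becomes unreliable when expected counts are small.
    small_expected = any(exp < 5 for row in expected for exp in row)
    return chi_sq, df, expected, small_expected


# Made-up counts, for illustration only.
observed = [[10, 20],
            [20, 10]]
chi_sq, df, expected, small_expected = chi_square_table(observed)
print(f"chi-squared = {chi_sq:.3f} on {df} df")
print("expected counts:", expected)
print("any expected count below 5:", small_expected)
```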