# IB/NRES 509: Statistical Modeling

## Instructor: Dr. Michael Dietze

mdietze at illinois.edu
Morrill 183 / IGB 1405
Office hours by appointment

## Teaching Assistant: Ryan Kelly

rkelly at life.illinois.edu
Morrill 171B
Phone: 244-9871
Office hours: Mon/Wed 11-12pm or by appointment

## Course Format:

4 credit hours -- three 50-minute lectures and one 3-hour computer lab

## Course Description:

Researchers in the biological and environmental sciences are often confronted with data that are complex in nature and fall outside the assumptions of classical statistical tests. The goal of confronting our scientific theories with data often requires us to embrace the complexity of our data -- to make inference on indirectly observed quantities, to bring multiple types of data to bear on a single question, to synthesize past observations with new data, or to separate the effects of different error processes (e.g. observation error vs. inherent variability in the process). This class provides an introduction to modern statistical modeling from both likelihood and Bayesian perspectives. The focus is on science-driven, problem-specific design of statistical analyses for complex data. Topics include point estimation, interval estimation, model selection, regression, non-linear models, non-Gaussian models, generalized linear models, hierarchical models, time-series analysis, spatial models, data assimilation, and statistical forecasting. Computational methods such as numerical optimization and Markov chain Monte Carlo simulation are covered with a focus on hands-on application to real data. The course is designed around case-study problem sets using R and BUGS. Students are encouraged to use their own datasets for class projects.

## Prerequisites:

Calculus (e.g. MATH 220); introductory statistics; or consent of instructor.

In practice, a basic familiarity with math and statistics is what is required. You should know what a derivative, an integral, and a sum are; you should have heard of ANOVA and regression; and you should have a general understanding of experimental design, randomization, and exploratory data analysis.

## Text:

Models for Ecological Data by James S. Clark

While the primary text takes an ecological perspective, the methods are applicable to all aspects of research in the biological and environmental sciences. The primary text will be supplemented with selected readings from additional textbooks and the primary literature. Literature readings will focus on examples of the application of statistical models in the biological and environmental literature rather than on methods papers. These case studies will also serve as the focus for the analysis problems in the lab component.

## Software:

• The R project for statistical computing
• The BUGS project - Bayesian inference Using Gibbs Sampling
• Hints for Mac users on how to install BUGS using WINE

## Grading:

Grading will be based on lab reports/problem sets, a semester-long project, and four exams. Students are encouraged to make use of their own data sets for the semester project.
• Lab reports/problem sets (10 points each) = 150
• Semester project = 95
    • Project proposal (10)
    • Model description (15)
    • Preliminary analysis (20)
    • Final report (50)
• Exams (25, 30, 30, 30 points) = 115

Total = 360

## Lecture Schedule

| Date | Topics | Reading | Assignments |
|---|---|---|---|
| 1/18 | Introduction to model-based inference | Clark: Chapter 1; Optional: Otto and Day: Math Review; Lecture Notes | |
| 1/20 | Probability theory: joint, conditional, and marginal distributions | Hilborn and Mangel Ch 3 p39-62; Optional: Clark Appendix D; Lecture Notes | |
| 1/23 | Probability theory: discrete and continuous distributions | Hilborn and Mangel Ch 3 p62-93; Optional: Clark Appendix F; Lecture Notes | |
| 1/25 | Maximum Likelihood | Chapter 3.1-3.2; Optional: Chapter 2; Lecture Notes | |
| 1/27 | Point estimation by MLE | Chapter 3.3-3.5; Lecture Notes | |
| 1/31 | Analytically tractable MLEs | Chapter 3.6-3.9; Optional: Bolker Ch 3; Lecture Notes | |
| 2/1 | Intractable MLEs and basic numerical optimization | Chapter 3.10-3.13; Lecture Notes | |
| 2/3 | EXAM 1: Probability Theory, Maximum Likelihood | | |
| 2/6 | Bayes Theorem | Chapter 4.1; Ellison 2004; Lecture Notes | |
| 2/8 | Point estimation using Bayes | Chapter 4.2; Lecture Notes | |
| 2/10 | Analytically-tractable Bayes: conjugacy and priors | Chapter 4.3, Appendix G; Lecture Notes | |
| 2/13 | Numerical methods for Bayes: MCMC | Chapter 7.1-7.2, 7.3 intro; Lecture Notes | |
| 2/15 | MCMC: Metropolis-Hastings | Chapter 7.3.1, 7.3.2, 7.5; Lecture Notes | Project Proposals |
| 2/17 | MCMC: Gibbs sampler | Chapter 7.3.3, 7.3.4; Lecture Notes; BUGS code | |
| 2/20 | Interval Estimation: Bayesian credible intervals | Chapter 5; Lecture Notes | |
| 2/22 | Frequentist confidence intervals I | Chapter 5; Lecture Notes | |
| 2/24 | Frequentist confidence intervals II | Chapter 5; Lecture Notes | |
| 2/27 | EXAM 2: Bayes, CI | | |
| 2/29 | Model Selection: Likelihood ratio test, AIC | Hilborn and Mangel Chapter 2; Lecture Notes | |
| 3/2 | Model Selection: DIC, predictive loss, model averaging | Chapter 6; Lecture Notes | |
| 3/5 | Regression | Chapter 5.4 & 7.4; Lecture Notes | |
| 3/07 | Regression: Errors in variables, heteroskedasticity | Chapter 7.6, 7.7, 8.1; Lecture Notes | |
| 3/09 | Logistic regression | Chapter 8.2-8.2.3; Lecture Notes | |
| 3/12 | GLMs | Chapter 8.2-8.2.3 | |
| 3/14 | Mixed Models | Chapter 8.2.4; Lecture Notes | Model Description |
| 3/16 | Regression | Lecture Notes; R demo | |
| 3/26 | Hierarchical Bayes | Chapter 8.2.5 - 8.3; Lecture Notes | |
| 3/28 | Nonlinear models | Chapter 8.4; Lecture Notes | |
| 3/30 | Applications of random effects models | Chapter 8.5-8.7; Lecture Notes | |
| 4/2 | EXAM 3: GLMM, HB | | |
| 4/4 | Time series: Basics and State-Space | Chapters 9.1, 9.2, 9.6; Lecture Notes | |
| 4/6 | Time series: State Space and Mark-Recapture | Chapter 9.7, 9.8, 9.16; Lecture Notes | |
| 4/09 | Time series: ARMA | Chapter 9.3, 9.5; Lecture Notes | |
| 4/11 | Time Series: Repeated Measures | Chapter 9.10, 9.14, 9.15; Lecture Notes | Preliminary Analysis |
| 4/13 | Time series: Matrix models, SIR | Chapter 9.17, 9.18; Hatala et al. 2011 | |
| 4/16 | Spatial: point pattern data | Chapter 10.6; Lecture Notes | |
| 4/18 | Spatial: point-referenced (geostatistical) data and Kriging | Chapter 10.7; Lecture Notes | |
| 4/20 | Spatial: block-referenced data and misalignment | Chapter 10.8; Lecture Notes | |
| 4/23 | Spatial: conditional autoregressive models (CAR) | Chapter 10.9, 10.10; Lecture Notes | |
| 4/25 | Data assimilation: classic Kalman filter | Wikle and Berliner 2007 | |
| 4/27 | Data assimilation: Kalman variants | | |
| 4/30 | Data assimilation: Bayesian state-space revisited | | |
| 5/2 | Forecasting: Ensemble analysis | | |
| 5/8, 8 AM | EXAM 4: Time, Space, DA | 211 Davenport Hall | FINAL PROJECT |

## Lab Syllabus

| Lab | Week | Topics | Software |
|---|---|---|---|
| 1 | 1/18 | Introduction to R | R |
| 2 | 1/25 | Probability distributions and sampling | R |
| 3 | 2/1 | Maximum likelihood - basics | R |
| 4 | 2/8 | Maximum likelihood - numerical optimization | R |
| 5 | 2/15 | Introduction to BUGS | BUGS |
| 6 | 2/22 | Gibbs sampler | R |
| 7 | 2/29 | Metropolis Algorithm | R |
| 8 | 3/7 | Interval estimation and model selection | R |
| 9 | 3/14 | Regression | Both |
| 10 | 3/28 | Hierarchical modeling | WinBUGS |
| 11 | 4/4 | State-space time series | WinBUGS |
| 12 | 4/11 | Peer Assessment | |
| 13 | 4/18 | Exploratory data analysis: space and time | R |
| 14 | 4/25 | Spatial CAR and Kriging | WinBUGS |
| 15 | 5/2 | Data Assimilation | R |

### Books

• Ecological Perspectives
    • Hilborn, R., M. Mangel. 1997. The Ecological Detective. Princeton University Press.
    • Bolker, B. 2008. Ecological Models and Data in R. Princeton University Press.
    • Clark, J.S. & A. E. Gelfand. 2006. Hierarchical Modelling for the Environmental Sciences: Statistical Methods and Applications. Oxford University Press.
    • McCarthy, M. 2007. Bayesian Methods for Ecology. Cambridge University Press.
• WinBUGS examples
    • Congdon, P. 2001. Bayesian Statistical Modelling. John Wiley & Sons.
    • Congdon, P. 2003. Applied Bayesian Modelling. John Wiley & Sons.
    • Congdon, P. 2005. Bayesian Models for Categorical Data. John Wiley & Sons.
• Intro Bayesian & Hierarchical Bayes
    • Bolstad, W. M. 2007. Introduction to Bayesian Statistics, 2nd Edition. John Wiley & Sons, Inc., Hoboken, NJ.
    • Gelman, A., J.B. Carlin, H.S. Stern, D.B. Rubin. 2003. Bayesian Data Analysis, 2nd Edition. Chapman & Hall/CRC Press.
    • Gelman, A. & J. Hill. 2007. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press, Cambridge.
    • Sivia, D.S. & J. Skilling. 2006. Data Analysis: A Bayesian Tutorial, 2nd Edition. Oxford University Press, Oxford.
    • Winkler, R.L. 2003. An Introduction to Bayesian Inference and Decision. Probabilistic Publishing, Gainesville, FL.
• Statistical books
    • Albert, J. 2008. Bayesian Computation with R. Springer. 270 pages.
    • Carlin, B.P., T.A. Louis. 2000. Bayes and Empirical Bayes Methods for Data Analysis. Chapman & Hall/CRC.
    • Gamerman, D., H. F. Lopes. 2006. Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference, 2nd Edition. Chapman & Hall/CRC.
    • Gilks, W.R., S. Richardson, D.J. Spiegelhalter. 1996. Markov Chain Monte Carlo in Practice. Chapman & Hall/CRC.
    • Marin, J.-M., C. P. Robert. 2007. Bayesian Core: A Practical Approach to Computational Bayesian Statistics. Springer, New York.

### General review articles

• Special issue in Ecology. 2003. Ecological Uncertainty and Forecasting. Vol. 84(6): 1349-1414.
• Special issue in Ecological Applications. 1996. Bayesian Inference. Vol. 6(4): 1034+.
• Special issue in Ecological Applications. 2006. Deepening Ecological Insights Using Contemporary Statistics. Vol. 16(1): 3-124.
• Forum articles in Ecological Applications. 2008. Forum-Hierarchical Statistical Models in Ecology. Vol. 19(3): 551-596.
• Ellison (2004) Bayesian inference in ecology. Ecology Letters 7:509-520.
• Clark (2005) Why Environmental scientists are becoming Bayesians. Ecology Letters 8:2-14.
• Clark & Gelfand (2006) A future for models and data in environmental sciences. Trends in Ecology & Evolution 21:375-380.
• Ogle & Barber (2008) Bayesian data-model integration in plant physiological and ecosystem ecology. Progress in Botany Vol. 69:281-311.