Linear regression analysis
Predicting future performance and explaining observations
Linear regression analysis means “fitting a straight line to data”. It is a widely used technique for modelling and understanding real-world phenomena that is easy to apply and to understand intuitively, and it allows you to predict future outputs of the phenomenon you are modelling. Learn how to use plots for exploratory data analysis to determine whether a linear model might be suitable for your data, and how to build univariate and multivariate linear models using the Python statsmodels library.
We also present a number of possible pitfalls when using linear regression, including sample size issues, treatment of outliers and order of effect problems.
This submodule is a part of the risk analysis module.
Regression analysis using Python
Python notebook on regression analysis of health impact of smoking
Python notebook on regression analysis of a combined cycle power plant
Python notebook with bootstrap regression of helmet performance data
We recommend the following sources of further information on this topic:
The Stanford Online class on Statistical Learning (via edX), which introduces supervised learning with a focus on regression and classification methods
Khan Academy material on regression
The edX course The Analytics Edge from MIT
The online, open-access textbook Forecasting: principles and practice (uses R rather than Python)