Linear regression analysis
Predicting future performance and explaining observations


Linear regression analysis means “fitting a straight line to data”. It is a widely used technique for modelling and understanding real-world phenomena, and it is easy to use and to understand intuitively. It allows you to predict future outputs of the phenomenon you are modelling. Learn how to use plots for exploratory data analysis to determine whether a linear model might be suitable for your data, and how to build univariate and multivariate linear models using the Python statsmodels library.
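To make “fitting a straight line to data” concrete, here is a minimal sketch of the closed-form least-squares fit for a single input variable, written in plain Python. The function name fit_line and the data are made up for illustration; the course materials use library routines rather than this hand-written version.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data lying exactly on the line y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_line(xs, ys)

# Use the fitted line to predict the output at a new input, x = 5.
prediction = intercept + slope * 5.0
print(intercept, slope, prediction)
```

Because the example data contain no noise, the fit recovers the generating line exactly. With real measurements the fitted coefficients are estimates, and predictions carry uncertainty.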

We also present a number of possible pitfalls when using linear regression, including sample size issues, treatment of outliers and order of effect problems.
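As a small illustration of the outlier pitfall, the sketch below fits a line to a tiny made-up dataset with and without one extreme observation. The helper fit_line and all numbers are hypothetical. The point is only that, in a small sample, a single outlier can change the estimated slope dramatically.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Five made-up points lying exactly on y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]
clean_slope = fit_line(xs, ys)[1]

# Append one outlier at the edge of the x range, far below the trend.
xs_out = xs + [6.0]
ys_out = ys + [0.0]
outlier_slope = fit_line(xs_out, ys_out)[1]

print("slope without outlier:", clean_slope)
print("slope with outlier:   ", outlier_slope)
```

Whether such a point should be removed, corrected or kept depends on why it is unusual. The sketch shows its influence on the fit, not how to treat it.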

This submodule is a part of the risk analysis module.

Course material

Regression analysis using Python

Lecture slides (PDF)
View on Slideshare

Python notebook on regression analysis of health impact of smoking

Python notebook on regression analysis of a combined cycle power plant

In these course materials, applications are presented using the NumPy, SciPy and statsmodels libraries for the Python programming language.

Linear regression and societal regression, by J. Cham (

Other resources

We recommend the following sources of further information on this topic:
