Risk Engineering

Linear regression analysis
Predicting future performance and explaining observations


Overview

Linear regression analysis means “fitting a straight line to data”. It is a widely used technique for modelling and understanding real-world phenomena that is easy to use and to understand intuitively, and it allows you to predict future outputs of the phenomenon you are modelling. Learn how to use plots for exploratory data analysis to determine whether a linear model might be suitable for your data, and how to build univariate and multivariate linear models using the Python statsmodels library.

The lecture also presents a number of possible pitfalls when using linear regression, including sample size issues, treatment of outliers and order of effect problems.
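To give a feel for the outlier pitfall, the sketch below fits a line to a small, made-up dataset twice: once as is, and once after a single point has been corrupted. The data and the size of the corruption are assumptions chosen for illustration, and NumPy's `polyfit` is used here only as a convenient least-squares fit; the lecture material may treat outliers differently.

```python
import numpy as np

# Made-up, noise-free data lying exactly on the line y = 1 + 2 * x
x = np.arange(10, dtype=float)
y_clean = 1.0 + 2.0 * x

# Same data with one corrupted observation at the largest x value
y_outlier = y_clean.copy()
y_outlier[-1] += 30.0

# np.polyfit with degree 1 returns [slope, intercept]
slope_clean, intercept_clean = np.polyfit(x, y_clean, 1)
slope_outlier, intercept_outlier = np.polyfit(x, y_outlier, 1)

print(f"slope without outlier: {slope_clean:.3f}")
print(f"slope with outlier:    {slope_outlier:.3f}")
```

With only ten observations, one bad point at the edge of the data range shifts the fitted slope noticeably, which is why small samples and outliers deserve attention before a linear model is trusted.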

This submodule is a part of the risk analysis module.

Course material

Regression analysis using Python

Lecture slides (PDF)
View on Slideshare

Python notebook on regression analysis of health impact of smoking

IPython notebook
View notebook online

Python notebook on regression analysis of a combined cycle power plant

IPython notebook
View notebook online

In these course materials, applications are presented using the NumPy, SciPy and statsmodels libraries for the Python programming language.

Other resources

We recommend the following sources of further information on this topic:
