Statistical modelling
Introduction to probabilistic and statistical modelling of risk
Overview
Risk analysis is sometimes based on data concerning a hazardous event, such as the occurrence of an earthquake or the exceedance of a threshold. Analyzing such data relies on statistical modelling, most often with computer tools. When the risk analyst puts her data scientist hat on, she collects data (measurements, observations) from various sources and inputs them into the computer, obtains a general overview of the data and their distribution, and builds a statistical model which attempts to reproduce properties of the underlying phenomena. After checking that the statistical model is a good fit for the observations, she can generate various risk metrics and quantify the level of uncertainty in the predictions.
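As a rough illustration of this workflow, the sketch below walks through the same steps with NumPy and SciPy. It is not taken from the course notebooks: the data are synthetic, and the lognormal model, the threshold value and the number of bootstrap resamples are all assumptions chosen for the example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=42)

# Step 0: "collect" data. Assumption: synthetic values stand in for
# real measurements, which would normally be loaded from a file.
observations = rng.lognormal(mean=1.0, sigma=0.5, size=500)

# Step 1: general overview of the data (descriptive statistics).
summary = {
    "n": observations.size,
    "mean": observations.mean(),
    "std": observations.std(ddof=1),
    "median": np.median(observations),
}

# Step 2: build a statistical model. Assumption: a lognormal
# distribution with location fixed at 0, fitted by maximum likelihood.
shape, loc, scale = stats.lognorm.fit(observations, floc=0)
fitted = stats.lognorm(shape, loc=loc, scale=scale)

# Step 3: check the fit with a Kolmogorov-Smirnov test. Because the
# parameters were estimated from the same data, the p-value is
# optimistic and should be read as a rough indication only.
ks_result = stats.kstest(observations, fitted.cdf)

# Step 4: risk metrics from the fitted model.
threshold = 8.0                      # hypothetical critical level
p_exceed = fitted.sf(threshold)      # probability of exceeding it
q99 = fitted.ppf(0.99)               # 99% quantile

# Step 5: quantify uncertainty on the 99% quantile with a
# nonparametric bootstrap (refit on resampled data).
n_boot = 200
boot_q99 = np.empty(n_boot)
for i in range(n_boot):
    resample = rng.choice(observations, size=observations.size, replace=True)
    s, l, sc = stats.lognorm.fit(resample, floc=0)
    boot_q99[i] = stats.lognorm.ppf(0.99, s, loc=l, scale=sc)
ci_low, ci_high = np.percentile(boot_q99, [2.5, 97.5])

print(summary)
print(f"KS statistic: {ks_result.statistic:.3f}")
print(f"P(X > {threshold}): {p_exceed:.4f}")
print(f"99% quantile: {q99:.2f} (95% bootstrap interval {ci_low:.2f} to {ci_high:.2f})")
```

With real data, the choice of distribution and the goodness-of-fit check would need more care than this sketch suggests; a probability plot is often a useful complement to the test.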
Statistical modelling (or “data science” or “machine learning”, to use related, trendier terms) is an important part of risk analysis and safety in various engineering areas (mechanical engineering, nuclear engineering), in the management of natural hazards, in quality control, and in finance.
This submodule is a part of the risk analysis module.
Learning objectives
Upon completion of this submodule, you should be able to:
Analyze data using descriptive statistics and graphical tools
Fit a probability distribution to data (estimate distribution parameters)
Express various risk measures as statistical tests
Determine quantile measures of various risk metrics
Build flexible models to allow estimation of quantities of interest and associated uncertainty measures
Select appropriate distributions of random variables/vectors for random phenomena
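To give a flavour of the last objective, here is a minimal sketch of one common way to compare candidate distributions: fit each by maximum likelihood and rank them by the Akaike information criterion (AIC). The synthetic sample, the list of candidates and the helper name `aic_for` are assumptions made for this illustration, not part of the course material.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=0)

# Assumption: a synthetic positive-valued sample stands in for real data.
sample = rng.gamma(shape=2.0, scale=1.5, size=400)

# Candidate models for a positive random variable (illustrative choice).
candidates = {
    "gamma": stats.gamma,
    "lognorm": stats.lognorm,
    "weibull_min": stats.weibull_min,
}

def aic_for(dist, data):
    """Fit `dist` by maximum likelihood (location fixed at 0) and return (AIC, params)."""
    params = dist.fit(data, floc=0)
    log_lik = np.sum(dist.logpdf(data, *params))
    k = len(params) - 1  # the location is fixed, so it is not counted
    return 2 * k - 2 * log_lik, params

results = {name: aic_for(dist, sample) for name, dist in candidates.items()}

# Lower AIC indicates a better trade-off between fit and complexity.
best = min(results, key=lambda name: results[name][0])

for name, (aic, params) in sorted(results.items(), key=lambda item: item[1][0]):
    print(f"{name:12s} AIC = {aic:.1f}")
print("Lowest AIC:", best)
```

AIC only ranks the candidates that were supplied; it does not show that the best-ranked one is adequate, so a goodness-of-fit check and physical reasoning about the phenomenon remain necessary.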
Course material
Statistics and risk modelling with Python 

Python notebook on basic statistics 

Python notebook on coins and dice 

Python notebook on probability distributions 

Brief reminder on statistics 

Interactive examples of probability distributions 
Analyzing data with Python 

Exercise (Python notebook) on descriptive statistics 

Python notebook on simple descriptive statistics 

Python notebook on analysis and curve fitting for weather data 

Python notebook on analysis of speed of light measurements 

Python notebook on analysis of earthquake data 

Python notebook on Semmelweis’s work on risk reduction in hospitals 

Python notebook on intervention analysis using a hierarchical model 
In these course materials, applications are presented using the NumPy and SciPy libraries for the Python programming language. You can try out our computational notebooks using free online services such as MyBinder, Google Colab notebooks or Microsoft Azure notebooks.
Other resources
We recommend the following sources of further information on this topic:
CMU Open Learning Initiative course Probability and Statistics, a free and open course (course materials can be followed at any time)
Course materials for the MIT Introduction to Probability and Statistics course, which is free and open (course materials can be followed at any time)
EdX course Introduction to probability (MITx) – note that the course can only be taken at specific periods during the year
Udacity course Introduction to statistics (no prerequisites, but note that the course can only be taken at specific periods during the year)
Harvard Extension School online lecture on Sets, Counting and Probability
Textbook Introduction to Probability by C. Grinstead and J. L. Snell, freely available under the GNU Free Documentation License
Textbook Statistical inference for everyone, freely available under a Creative Commons licence
Computational statistics in Python, an online textbook with many examples
Python for econometrics (University of Oxford, UK)
Exploratory computing with Python, a set of Python notebooks on analyzing data using NumPy/SciPy
Book: Statistical modeling: a fresh approach by Daniel Kaplan
The Probability and statistics cookbook, by Matthias Vallentin
The NIST Engineering Statistics Handbook, an online compendium of information on statistics useful for engineering analysis
Seeing theory, a visual introduction to basic concepts in probability and statistics