
Sensitivity analysis

This notebook is an element of the risk-engineering.org courseware. It can be distributed under the terms of the Creative Commons Attribution-ShareAlike licence.

Author: Eric Marsden eric.marsden@risk-engineering.org


This notebook contains an introduction to the use of Python, SciPy, SymPy and the SALib library for sensitivity analysis. Consult the accompanying course materials for details of the applications of sensitivity analysis and some intuition and theory of the technique, and to download this content as a Jupyter/Python notebook.

In [1]:
import numpy
import scipy.stats
import pandas
import matplotlib.pyplot as plt
plt.style.use("bmh")
%config InlineBackend.figure_formats=["svg"]

The Rosenbrock function, $f(x_1, x_2) = 100 (x_2 - x_1^2)^2 + (1 - x_1)^2$, is a classic test function in uncertainty analysis and sensitivity analysis.

In [2]:
def rosenbrock(x1, x2):
    return 100 * (x2 - x1**2)**2 + (1 - x1)**2

Since we are lucky enough to be working in a small number of dimensions, let’s plot the function over the domain $[-2, 2]^2$ to get a feel for its shape.

In [3]:
from mpl_toolkits.mplot3d import Axes3D

N = 100
ax = plt.figure().add_subplot(projection="3d")
x = numpy.linspace(-2, 2, N)
y = numpy.linspace(-2, 2, N)
X, Y = numpy.meshgrid(x, y)
ax.plot_surface(X, Y, rosenbrock(X, Y), alpha=0.8)
ax.set_facecolor("white")
plt.xlabel(r"$x_1$")
plt.ylabel(r"$x_2$");
[Figure: surface plot of the Rosenbrock function over the domain $[-2, 2]^2$]

Local sensitivity analysis

We can undertake a local sensitivity analysis by calculating the local derivatives of the Rosenbrock function with respect to the two input parameters. The local derivatives can be estimated numerically, or calculated analytically (if you know the analytical form of the function you are interested in, and if the function is not excessively difficult to differentiate).
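
If the analytical form were not available, the partial derivatives could be estimated numerically instead. Below is a minimal sketch using a central finite difference; the helper name local_gradient and the step size h are illustrative choices, not part of the original material.

def local_gradient(f, x1, x2, h=1e-6):
    # central finite difference approximations of the partial derivatives
    # of f with respect to x1 and x2, evaluated at the point (x1, x2)
    df_dx1 = (f(x1 + h, x2) - f(x1 - h, x2)) / (2 * h)
    df_dx2 = (f(x1, x2 + h) - f(x1, x2 - h)) / (2 * h)
    return df_dx1, df_dx2

# compare with the analytical derivatives obtained below with SymPy
local_gradient(rosenbrock, -2, -2)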

In this case our Rosenbrock function is easy to differentiate by hand, but let us demonstrate the use of the SymPy library to do symbolic differentiation with the computer. You may need to install the SymPy package for your Python installation.

You may be interested in the minireference.com tutorial on SymPy.

In [4]:
import sympy
from sympy.interactive import printing
printing.init_printing(use_latex="mathjax")
In [5]:
x1 = sympy.Symbol("x1")
x2 = sympy.Symbol("x2")
rosen = 100 * (x2 - x1**2)**2 + (1 - x1)**2
d1 = sympy.diff(rosen, x1)
d1
Out[5]:
$\displaystyle - 400 x_{1} \left(- x_{1}^{2} + x_{2}\right) + 2 x_{1} - 2$
In [6]:
d2 = sympy.diff(rosen, x2)
d2
Out[6]:
$\displaystyle - 200 x_{1}^{2} + 200 x_{2}$

The Rosenbrock function looks pretty flat around $(0, 0)$; let’s check the local sensitivity in that location. First check $\frac{\partial f}{\partial x_1}(0, 0)$, then $\frac{\partial f}{\partial x_2}(0, 0)$.

In [7]:
d1.subs({x1: 0, x2: 0})
Out[7]:
$\displaystyle -2$
In [8]:
d2.subs({x1: 0, x2: 0})
Out[8]:
$\displaystyle 0$

The function looks much steeper (higher local sensitivity) around $(-2, -2)$; let’s evaluate the partial derivatives at that point to check.

In [9]:
d1.subs({x1: -2, x2: -2})
Out[9]:
$\displaystyle -4806$
In [10]:
d2.subs({x1: -2, x2: -2})
Out[10]:
$\displaystyle -1200$

At $(-2, 2)$ the sensitivity should lie somewhere between the values at these two points.

In [11]:
d1.subs({x1: -2, x2: 2})
Out[11]:
$\displaystyle -1606$
In [12]:
d2.subs({x1: -2, x2: 2})
Out[12]:
$\displaystyle -400$

We can use SciPy’s optimization functionality to find the minimum of the Rosenbrock function on the domain $[-2, 2]^2$, then check that (as we expect) the local sensitivity at the minimum is zero. The last argument [2, 2] to the function scipy.optimize.fmin is the starting point of the optimization search.

In [13]:
import scipy.optimize
scipy.optimize.fmin(lambda x: rosenbrock(x[0], x[1]), [2, 2])
Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 62
         Function evaluations: 119
Out[13]:
array([0.99998292, 0.99996512])

The subs function in SymPy does variable substitution; it allows you to evaluate an expression with given values for the variables (x1 and x2 in this case). We use it here to check that both partial derivatives are zero at the minimum $(1, 1)$ found above.

In [14]:
d1.subs({x1: 1, x2: 1})
Out[14]:
$\displaystyle 0$
In [15]:
d2.subs({x1: 1, x2: 1})
Out[15]:
$\displaystyle 0$

Global sensitivity analysis

We can use the SALib library (available for download from https://github.com/SALib/SALib) to undertake a global sensitivity analysis, using Saltelli’s scheme to estimate the Sobol’ sensitivity indices (this is one implementation of the family of methods sometimes called Monte Carlo pick-freeze). You may need to install this library. If you’re using Linux, a command that may work is

sudo pip install SALib

or

sudo pip3 install SALib

or if you’re using a Google Colaboratory notebook, execute a code cell containing

!pip install SALib

In [16]:
# this will fail if SALib isn't properly installed
import SALib.sample.sobol
import SALib.analyze.sobol

As indicated in the SALib documentation, a typical sensitivity analysis using SALib follows four steps:

  • Specify the model inputs (parameters) and their bounds (amount of input variability)
  • Run the sample function to generate the model inputs
  • Evaluate the model at each generated input point and save the outputs
  • Run the analyze function on the outputs to compute the sensitivity indices
In [17]:
N = 1024
# Specify the model inputs and their bounds. The default probability
# distribution is a uniform distribution between lower and upper bounds.
problem = {
    "num_vars": 2, 
    "names": ["x1", "x2"], 
    "bounds": [[-2, 2], [-2, 2]]
}
# generate the input sample
sample = SALib.sample.sobol.sample(problem, N)
Y = numpy.empty([sample.shape[0]])
# now we evaluate our model for each point in the input sample
for i in range(len(Y)):
    x = sample[i]
    Y[i] = rosenbrock(x[0], x[1])
# estimate the sensitivity indices, using the Sobol' method
sensitivity = SALib.analyze.sobol.analyze(problem, Y)

The variable sensitivity is a Python dictionary that contains the different sensitivity indices. sensitivity["S1"] contains the first-order sensitivity indices, which tell us how much $x_1$ and $x_2$ each contribute to the overall output variability of the Rosenbrock function over the domain $[-2, 2]^2$.

In [18]:
sensitivity["S1"]
Out[18]:
array([0.50356558, 0.294727  ])

Interpretation: we note that $x_1$ (whose sensitivity index is around 0.5) contributes roughly half of the total output uncertainty, and over this domain is a little less than twice as influential (or sensitive) as $x_2$ (whose sensitivity index is around 0.3).

ST contains the total indices, which include the interaction effects with other variables.

In [19]:
sensitivity["ST"]
Out[19]:
array([0.69933197, 0.50365889])

Interpretation: the total sensitivity of $x_1$ (around 0.7) indicates that a significant share (roughly 0.70 - 0.50 ≈ 20%) of our total output uncertainty is due to the interaction of $x_1$ with other input variables. Since there are only two input variables, we know that this interaction effect must be with $x_2$.

Some sensitivity analysis methods are also able to provide second and third order sensitivity indices. A second order index $s_{i,j}$ tells you the level of interaction effects between $x_i$ and $x_j$ (interaction effects are greater than zero when your function is non-additive: the sensitivity of parameter $i$ may then depend on the value of parameter $j$). A third order index $s_{i,j,k}$ tells you the level of interaction between three parameters $x_i$, $x_j$ and $x_k$.
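
For example, since both the sampling and the analysis above used SALib's default calc_second_order=True, the sensitivity dictionary should also contain an "S2" matrix holding the estimated second-order indices. A minimal sketch (not part of the original notebook) of how to inspect the interaction term between $x_1$ and $x_2$:

# second-order (interaction) index between x1 and x2: entry [0, 1] of the
# "S2" matrix returned by SALib.analyze.sobol.analyze when calc_second_order=True
sensitivity["S2"][0, 1]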

Using non-uniform probability distributions

If your input variability represents uncertainty, you may wish to represent your input variables using normal probability distributions (a standard choice for measurement uncertainty) or triangular probability distributions (commonly used to represent epistemic uncertainty). This can be specified in the problem dictionary.

In [20]:
N = 1024
# Specify the model inputs and their bounds. Here we are using normal probability
# distributions for x1 and x2 (specify the mean and stdev in `bounds`)
problem = {
    "num_vars": 2, 
    "names": ["x1", "x2"], 
    "dists": ["norm", "norm"], # or "unif", "triang", "lognorm"
    "bounds": [[0, 1], [0, 1]]
}
# generate the input sample using Saltelli's scheme
sample = SALib.sample.sobol.sample(problem, N)
Y = numpy.empty([sample.shape[0]])
# now we evaluate our model for each point in the input sample
for i in range(len(Y)):
    x = sample[i]
    Y[i] = rosenbrock(x[0], x[1])
# estimate the sensitivity indices, using the Sobol' method
sensitivity = SALib.analyze.sobol.analyze(problem, Y)
In [21]:
sensitivity["S1"]
Out[21]:
array([1.00483068, 0.03110834])
In [22]:
sensitivity["ST"]
Out[22]:
array([0.87462909, 0.10580669])
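
Note that the estimated first-order index for $x_1$ is slightly greater than 1 and exceeds the corresponding total index, which is impossible for the true indices (they satisfy $S_i \le S_{T_i} \le 1$). This is an artifact of Monte Carlo estimation error, and can be reduced by increasing the sample size N.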

Exercise

The Ishigami function is a well-known test function for uncertainty analysis and sensitivity analysis (it is highly non-linear). Its definition is given below.

In [23]:
def ishigami(x1, x2, x3) -> float:
    return numpy.sin(x1) + 7*numpy.sin(x2)**2 + 0.1 * x3**4 * numpy.sin(x1)

Task: undertake a global sensitivity analysis of the Ishigami function over the domain $[-\pi, \pi]^3$ (uniform probability distribution) and estimate the first-order and total sensitivity indices.

Check: your estimated first-order indices should be approximately 0.3139, 0.4424 and 0 for $x_1$, $x_2$ and $x_3$ respectively.

In [ ]: