Home Course Concepts About

Bootstrap analysis of helmet safety data

This notebook is an element of the risk-engineering.org courseware. It can be distributed under the terms of the Creative Commons Attribution-ShareAlike licence.

Author: Eric Marsden eric.marsden@risk-engineering.org.


This notebook contains an short case study using Python and SciPy to analyze some online data on the safety performance of different helmet technologies. It includes regression analysis to estimate the impact of technology on performance, and bootstrap resampling to obtain an estimate despite a small sample size. See the associated course materials for background information and to download this content as a Jupyter notebook.

In [31]:
import numpy
import scipy.stats
import pandas
import matplotlib.pyplot as plt
plt.style.use("bmh")
%config InlineBackend.figure_formats=["svg"]

Acknowledgement. This notebook is heavily inspired by a blog post by Daniel Phadley where the analysis is undertaken in R.

We are going to analyze test data produced by researchers at Virginia Tech (a public university in the US state of Virginia). The researchers run impact tests on a range of helmets and assess the likelihood of sustaining a concussion from different types of impacts. From this, they attribute a performance score. This type of testing, and of course publication of the results to help consumers make an informed choice, is very important to encourage manufacturers to improve the safety performance of their products; congratulations to the Virginia Tech researchers.

Their data is published on a web page, but isn’t available in a convenient format that we can use directly from Python, such as CSV or JSON. We’re going to have to “scrape” the data from the web page, which means download the web page content and use some simple parsing techniques to extract the data. This is a fairly common activity for data analysts, because quite often people who generate interesting data don’t think of, or sometimes don’t want to, make it available as a dataset.

Scraping data from web page

Techniques for scraping data from the web are fairly well documented online. They require a little technical knowledge of HTML parsing and a little Python text processing.

In [32]:
import requests
from bs4 import BeautifulSoup

page = requests.get("https://helmet.beam.vt.edu/bicycle-helmet-ratings.html")
soup = BeautifulSoup(page.content, "lxml")
items = soup.find_all("li", class_="filterLi")
items[0]
Out[32]:
<li class="filterLi mountain">
<h2 class="helmet">Fox Dropframe Pro (MIPS)</h2>
<img alt="Fox Dropframe Pro MIPS helmet" class="helmetphoto" src="images/bike-helmets/fox-dropframe-pro-mips.jpg"/>
<p class="stars">
<i class="fas fa-star"></i>
<i class="fas fa-star"></i>
<i class="fas fa-star"></i>
<i class="fas fa-star"></i>
<i class="fas fa-star"></i>
</p>
<p class="cost">Cost: $200</p>
<p class="detail">Style: Mountain</p>
<p class="note">Certifications: CPSC</p>
<p class="score">Score: 8.9</p>
</li>

From the result shown above, we see that the data is presented in an HTML list, with each helmet presented in a list item (li element in HTML). We can access the data as subelements of each list item.

In [33]:
# extract the helmet name
items[0].h2.text
Out[33]:
'Fox Dropframe Pro (MIPS)'
In [34]:
# extract the helmet cost
items[0].find(class_="cost").text
Out[34]:
'Cost: $200'
In [35]:
def parse_helmet_data(li: str):
    helmet = {}
    helmet["name"] = li.h2.text
    cost = li.find(class_="cost").text
    if cost.startswith("Cost: $"):
        helmet["price"] = int(cost[7:])
    score = li.find(class_="score").text
    if score.startswith("Score: "):
        helmet["score"] = float(score[7:])
    style = li.find(class_="detail").text
    if style.startswith("Style: "):
        helmet["style"] = style[7:]
    return helmet

parse_helmet_data(items[0])
Out[35]:
{'name': 'Fox Dropframe Pro (MIPS)',
 'price': 200,
 'score': 8.9,
 'style': 'Mountain'}

We can apply the function above that parses the data for one helmet in the list to the entire list and collect everything in a pandas DataFrame.

In [36]:
helmets = pandas.DataFrame([parse_helmet_data(li) for li in items])
len(helmets)
Out[36]:
116
In [37]:
helmets.head()
Out[37]:
name price score style
0 Fox Dropframe Pro (MIPS) 200 8.9 Mountain
1 Lazer G1 MIPS 240 9.2 Road
2 Bontrager Rally MIPS 150 9.3 Mountain
3 Specialized Align II (MIPS) 50 9.6 Multi-sport
4 Troy Lee Designs A2 MIPS Decoy 179 10.0 Mountain

Let's explore our dataset a little, looking at the distribution of helmet price and of performance score.

In [38]:
plt.hist(helmets.price, bins=20, alpha=0.5)
plt.title("Distribution of helmet price");
2021-04-05T18:54:08.504435 image/svg+xml Matplotlib v3.3.4, https://matplotlib.org/
In [39]:
plt.hist(helmets.score, bins=20, alpha=0.5)
plt.title("Distribution of helmet score");
2021-04-05T18:54:08.657442 image/svg+xml Matplotlib v3.3.4, https://matplotlib.org/

We can examine whether there is a correlation between helmet price and performance score.

In [40]:
helmets.plot("price", "score", "scatter", alpha=0.5)
plt.xlabel("Helmet price (USD)")
plt.ylabel("Helmet score (lower is better)");
2021-04-05T18:54:08.826521 image/svg+xml Matplotlib v3.3.4, https://matplotlib.org/

Analyze impact of helmet technology

The helmets in this dataset use three design technologies:

  • MIPS or Multi-directional Impact Protection System, which reduces the harmful rotational motion generated on impact which would otherwise be transferred to the brain

  • WaveCel, a shock-absorbing material with a collapsible cellular structure. This technology is only available from a single manufacturer, so few helmets are available.

  • conventional impact-absorbing materials, with a certification provided by the US Consumer Product Safety Commission (all helmets sold in the USA must meet this certification)

There is some commercial and technical squabbling about the relative performance of these design approaches, so it’s interesting to see what the test data says.

The helmet technology is not represented explicitly in our dataset, but it’s encoded in the helmet name.

In [41]:
def technology_from_name(name: str):
    if "MIPS" in name: return "MIPS"
    if "WaveCel" in name: return "WaveCel"
    return "conventional"

We can add a column to the dataset with the design technology used.

In [42]:
helmets["technology"] = helmets["name"].apply(technology_from_name)
helmets.head(20)
Out[42]:
name price score style technology
0 Fox Dropframe Pro (MIPS) 200 8.9 Mountain MIPS
1 Lazer G1 MIPS 240 9.2 Road MIPS
2 Bontrager Rally MIPS 150 9.3 Mountain MIPS
3 Specialized Align II (MIPS) 50 9.6 Multi-sport MIPS
4 Troy Lee Designs A2 MIPS Decoy 179 10.0 Mountain MIPS
5 Lazer Century MIPS 180 10.0 Road MIPS
6 SCOTT Centric Plus (MIPS) 200 10.2 Multi-sport MIPS
7 Giant Rev Pro MIPS 273 10.2 Road MIPS
8 Liv Rev Pro MIPS 273 10.2 Road MIPS
9 Lazer Sphere MIPS 160 10.3 Road MIPS
10 Lazer Cyclone MIPS 75 10.5 Multi-sport MIPS
11 POC Octal X SPIN 250 10.6 Road conventional
12 SCOTT ARX PLUS (MIPS) 2020 100 10.7 Multi-sport MIPS
13 Bontrager Specter WaveCel 150 10.8 Road WaveCel
14 Fox Speedframe Pro (MIPS) 160 10.8 Mountain MIPS
15 Bontrager Ballista MIPS 200 10.9 Road MIPS
16 Lazer Anverz NTA MIPS 200 11.0 Urban MIPS
17 Troy Lee Designs A3 MIPS 220 11.0 Mountain MIPS
18 Lazer Jackal MIPS 200 11.2 Mountain MIPS
19 Bontrager Starvos WaveCel 100 11.2 Road WaveCel

Let’s examine a few plots of this dataset to get a feel for the impact of helmet technology on performance score, and see whether there’s a relationship between price and performance.

In [43]:
import seaborn as sns

sns.violinplot(x="technology", y="score", data=helmets);
2021-04-05T18:54:09.011902 image/svg+xml Matplotlib v3.3.4, https://matplotlib.org/

A violin plot, shown above, is an enhanced version of the traditional boxplot, or “box-and-whisker” plot.

This plot shows that MIPS and WaveCel are clearly scoring better (remember that lower scores indicate better safety performance) than conventional helmets, but it isn’t very helpful in determining whether MIPS performs better than WaveCel (the median points for the two categories, shown by the white dot in the violin plot, are very similar). We'll undertake a more sophisticated analysis below to help resolve this question.

In [44]:
sns.relplot(x="price", y="score", hue="technology", data=helmets);
2021-04-05T18:54:09.495282 image/svg+xml Matplotlib v3.3.4, https://matplotlib.org/

Note how the plot above encodes information on three dimensions of our dataset, the price and score but also the technology as the color of the points. This plot highlights the fact that we have few helmets in our dataset using the WaveCel technology.

We can also check whether the type of helmet (targeting road use, urban use or mountainbike use) is correlated with performance.

In [45]:
sns.violinplot(x="style", y="score", data=helmets);
2021-04-05T18:54:09.701601 image/svg+xml Matplotlib v3.3.4, https://matplotlib.org/

Regression analysis

We will run a regression analysis to analyze the comparative impact of helmet technology (MIPS or WaveCel) on performance score. To do this, we need to add columns to the Pandas dataframe indicating whether each technology is present.

In [46]:
# generate extra boolean columns that indicate whether MIPS and WaveCell technologies
# are present
helmets["has_mips"] = helmets["technology"].apply(lambda t: t == "MIPS")
helmets["has_wavecel"] = helmets["technology"].apply(lambda t: t == "WaveCel")
helmets.head(20)
Out[46]:
name price score style technology has_mips has_wavecel
0 Fox Dropframe Pro (MIPS) 200 8.9 Mountain MIPS True False
1 Lazer G1 MIPS 240 9.2 Road MIPS True False
2 Bontrager Rally MIPS 150 9.3 Mountain MIPS True False
3 Specialized Align II (MIPS) 50 9.6 Multi-sport MIPS True False
4 Troy Lee Designs A2 MIPS Decoy 179 10.0 Mountain MIPS True False
5 Lazer Century MIPS 180 10.0 Road MIPS True False
6 SCOTT Centric Plus (MIPS) 200 10.2 Multi-sport MIPS True False
7 Giant Rev Pro MIPS 273 10.2 Road MIPS True False
8 Liv Rev Pro MIPS 273 10.2 Road MIPS True False
9 Lazer Sphere MIPS 160 10.3 Road MIPS True False
10 Lazer Cyclone MIPS 75 10.5 Multi-sport MIPS True False
11 POC Octal X SPIN 250 10.6 Road conventional False False
12 SCOTT ARX PLUS (MIPS) 2020 100 10.7 Multi-sport MIPS True False
13 Bontrager Specter WaveCel 150 10.8 Road WaveCel False True
14 Fox Speedframe Pro (MIPS) 160 10.8 Mountain MIPS True False
15 Bontrager Ballista MIPS 200 10.9 Road MIPS True False
16 Lazer Anverz NTA MIPS 200 11.0 Urban MIPS True False
17 Troy Lee Designs A3 MIPS 220 11.0 Mountain MIPS True False
18 Lazer Jackal MIPS 200 11.2 Mountain MIPS True False
19 Bontrager Starvos WaveCel 100 11.2 Road WaveCel False True

We can now run a linear regression on the dataset to determine which boolean indicator variable (has_mips or has_wavecel) has the largest impact (the largest absolute value). The two indicator variables have negative impact here, which means (remember that lower performance scores are better) that the presence of either technology, compared with the bulk of our sample that uses conventional shock-absorbing materials, improves performance.

In [47]:
import statsmodels.formula.api as smf

lm = smf.ols(formula="score ~ style + has_wavecel + has_mips", data=helmets).fit()
lm.params
Out[47]:
Intercept               17.888782
style[T.Multi-sport]     1.057642
style[T.Road]           -0.089425
style[T.Urban]           3.164297
has_wavecel[T.True]     -6.254786
has_mips[T.True]        -5.457927
dtype: float64
In [48]:
# lm.params is a pandas Series, which we can index by the parameter name
lm.params.loc["has_mips[T.True]"]
Out[48]:
-5.4579267978938555

On the full sample, WaveCel performs slightly better than MIPS. Our sample size is fairly small, so we can undertake a bootstrap analysis to see what level of confidence we have in that observation.

Bootstrap regression

Bootstrapping is a statistical procedure that resamples a single dataset to create many simulated samples. Assuming the original dataset is representative of the larger population that it was drawn from, the simulated samples (resampled values) should have similar statistical characteristics to the full dataset. Therefore, if we estimate statistics on our resampled values, it should give as a reasonable estimation for population statistics.

We include background on the bootstrap method and a procedure to implement it in Python (and in Excel if you're into that kind of thing!) in our free course materials on statistical modelling and data analysis methods, and illustrate its use to estimate confidence intervals in this notebook.

Below, we are using the bootstrap method to run a large number (N) of linear regression analyses on the resampled data, to help determine the comparative impact of the two helmet technologies that we want to compare, MIPS and WaveCel.

In [49]:
# a bootstrap analysis comparing the effect of has_wavecel and has_mips
runs = 1000
effect_mips = numpy.empty(runs)
effect_wavecel = numpy.empty(runs)
for i in range(runs):
    sample = helmets.sample(len(helmets), replace=True)
    lm = smf.ols(formula="score ~ style + has_wavecel + has_mips", data=sample).fit()
    effect_mips[i] = lm.params.loc["has_mips[T.True]"]
    effect_wavecel[i] = lm.params.loc["has_wavecel[T.True]"]
In [50]:
plt.hist(effect_mips, bins=30, label="MIPS", color="blue", alpha=0.3)
plt.hist(effect_wavecel, bins=30, label="WaveCel", color="red", alpha=0.3)
plt.legend();
2021-04-05T18:54:17.029882 image/svg+xml Matplotlib v3.3.4, https://matplotlib.org/

We can calculate the proportion of bootstrap samples where WaveCel dominates MIPS (ie its performance score is lower, remember that a lower score is better here).

In [51]:
(effect_wavecel < effect_mips).sum() / float(len(effect_wavecel))
Out[51]:
0.915
In [ ]: