
reliability Documentation

Release latest

Aug 29, 2020


Quickstart Intro

1 Quickstart for reliability

2 Introduction to the field of reliability engineering

3 Recommended resources

4 Equations of supported distributions

5 Creating and plotting distributions

6 Fitting a specific distribution to data

7 Fitting all available distributions to data

8 Mixture models

9 Competing risks models

10 Probability plots

11 Quantile-Quantile plots

12 Probability-Probability plots

13 Kaplan-Meier estimate of reliability

14 Nelson-Aalen estimate of reliability

15 Stress-Strength interference for any distributions

16 Stress-Strength interference for normal distributions

17 Reliability growth

18 Optimal replacement time

19 ROCOF

20 Mean cumulative function

21 SN diagram

22 Stress-strain and strain-life

23 Fracture mechanics

24 Creep

25 Palmgren-Miner linear damage model

26 Acceleration factor

27 Solving simultaneous equations with sympy

28 ALT probability plots

29 Fitting a model to ALT data

30 Similar Distributions

31 Convert dataframe to grouped lists

32 Crosshairs

33 Histogram

34 Make right censored data

35 Datasets

36 One sample proportion

37 Two proportion test

38 Sample size required for no failures

39 Sequential sampling chart

40 Reliability test planner

41 Reliability test duration

42 Changelog

43 Development roadmap

44 Citing reliability in your work

45 How to request or contribute a new feature

46 How to get help

47 How to donate to the project

48 About the author

49 Credits to those who helped

reliability is a Python library for reliability engineering and survival analysis. It significantly extends the functionality
of scipy.stats and also includes many specialist tools that are otherwise only available in proprietary software.

CHAPTER 1

Quickstart for reliability

1.1 Installation

Install via pip:

pip install reliability

1.2 A quick example

In this example, we will create a Weibull Distribution, and from that distribution we will draw 20 random samples.
Using those samples we will fit a 2-parameter Weibull Distribution. The fitting process generates the probability plot.
We can then access the fitted distribution object to plot the survival function.

from reliability.Distributions import Weibull_Distribution
from reliability.Fitters import Fit_Weibull_2P
from reliability.Probability_plotting import plot_points
import matplotlib.pyplot as plt

dist = Weibull_Distribution(alpha=30, beta=2)  # creates the distribution object
data = dist.random_samples(20, seed=42)  # draws 20 samples from the distribution. Seeded for repeatability

plt.subplot(121)
fit = Fit_Weibull_2P(failures=data)  # fits a Weibull distribution to the data and generates the probability plot

plt.subplot(122)
fit.distribution.SF(label='fitted distribution')  # uses the distribution object from Fit_Weibull_2P and plots the survival function
dist.SF(label='original distribution', linestyle='--')  # plots the survival function of the original distribution
plot_points(failures=data, func='SF')  # overlays the original data on the survival function

plt.legend()
plt.show()

A key feature of reliability is that probability distributions are created as objects, and these objects have many
properties (such as the mean) that are set once the parameters of the distribution are defined. Using the dot operator
allows us to access these properties as well as a large number of methods (such as drawing random samples as seen in
the example above).
Each distribution may be visualised in five different plots. These are the Probability Density Function (PDF), Cumulative
Distribution Function (CDF), Survival Function (SF) [also known as the reliability function], Hazard Function
(HF), and the Cumulative Hazard Function (CHF). Accessing the plot of any of these is as easy as any of the other
methods. Eg. dist.SF() in the above example is what plots the survival function using the distribution object that
was returned from the fitter.
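For example, a brief sketch of this dot operator access (reusing the distribution object from the quickstart above; the printed values depend on the chosen parameters):

from reliability.Distributions import Weibull_Distribution
import matplotlib.pyplot as plt

dist = Weibull_Distribution(alpha=30, beta=2)
print(dist.mean)                # properties are set once the parameters are defined
print(dist.standard_deviation)
print(dist.quantile(0.05))      # the b5 life (time at which 5% of units have failed)
dist.CHF()                      # plots the cumulative hazard function
plt.show()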



CHAPTER 2

Introduction to the field of reliability engineering

Reliability engineering is a field of study that deals with the estimation, prevention, and management of failures by
combining statistics, risk analysis, and physics. By understanding how failures may occur or have occurred, we are
able to better predict the lifespan of a product, allowing us to manage its lifecycle and risks associated with failure.
All engineering systems, components, and structures will eventually fail, and knowing how and when that failure will
occur is of great interest to the owners and operators of those systems. Due to the similarities between the lifecycle of
engineering systems and the lifecycle of humans, the field of study known as survival analysis has many concepts that
are used in reliability engineering.
Everyone is acutely aware of the importance of reliability, particularly when something doesn’t work at a time we
expect it to. Whether it be your car not starting, your computer’s hard drive failing, or the chance of being delayed on
the runway because your aircraft’s air conditioning unit just stopped working, we know that system failure is something
we all want to avoid, and when it can’t be avoided, we at least want to know when it is likely to occur. Reliability
engineering is most frequently used for systems which are of critical safety importance (such as in the nuclear industry),
or for systems which are numerous (such as vehicles or electronics) where the cost of fleetwide reliability problems
can be extremely high.
Much of reliability engineering involves the analysis of data (such as time to failure), to uncover the patterns in how
failures occur and to use those patterns to forecast how the failures will occur throughout the lifetime of a population
of items, or the lifetime of one or more repairable items. It is the data analysis part of reliability engineering that this
Python library is designed to help with.
Further reading is available on Wikipedia and in many other reliability resources.



CHAPTER 3

Recommended resources

The following resources are things I have found useful during my reliability engineering studies and
also while writing the Python reliability library. There are many other resources available (especially textbooks and
academic papers), so I encourage you to do your own research. If you find something you think is worth adding here,
please send me an email.
Textbooks
• Reliability Engineering and Risk Analysis: A Practical Guide, Third Edition (2017), by M. Modarres, M.
Kaminskiy, and V. Krivtsov.
• Probabilistic Physics of Failure Approach to Reliability (2017), by M. Modarres, M. Amiri, and C. Jackson.
• Probability Distributions Used in Reliability Engineering (2011), by A. O’Connor, M. Modarres, and A. Mosleh.
• Practical Reliability Engineering, Fifth Edition (2012), by P. O’Connor and A. Kleyner.
• Recurrent Events Data Analysis for Product Repairs, Disease Recurrences, and Other Applications (2003), by
W. Nelson
• Reliasoft has compiled a much more comprehensive list of textbooks.
• The reliability analytics toolkit (linked below in free online tools and calculators) has also compiled a much
more comprehensive list of textbooks.
Free software
• Lifelines - a Python library for survival analysis. Very powerful collection of tools, only a few of which overlap
with the Python reliability library.
• Parameter Solver v3.0 - a biostatistics tool for quickly making some simple calculations with probability
distributions.
• Orange - a standalone data mining and data visualization program that runs using Python. Beautifully
interactive data analysis workflows with a large toolbox. Not much reliability related content but good for data
preprocessing.
• R (Programming Language) - R is one of the most popular programming languages for data science, and it has
several libraries that are targeted towards reliability engineering and survival analysis. These include WeibullR,
abrem, and survival.


• CumFreq - a program for cumulative frequency analysis with probability distribution fitting for a wide range of
distributions. Limited functionality beyond fitting distributions.
Paid software
The listing of a software package here does not imply my endorsement, and is only intended to give readers an
understanding of the broad range of reliability engineering software packages that are available. It is difficult to
find a comprehensive list of software resources since most developers of proprietary software rarely acknowledge the
existence of any software other than their own. I have not used most of the paid software listed here due to the high
cost, so most of my comments in this section are based purely on the content from their websites.
• Minitab - a great collection of statistical tools. A few reliability focussed tools included.
• Reliasoft - the industry leader for reliability engineering software.
• SAS JMP - lots of statistical tools for data modelling and visualization. A few purpose built reliability tools.
Its utility for reliability engineering will depend on your application. SAS has also released the SAS University
Edition which is a free software package that runs in VirtualBox and offers a reduced set of tools compared to
the paid package.
• PTC Windchill - a powerful tool for risk and reliability. Similar to Reliasoft but it forms one part of the larger
PTC suite of tools.
• Isograph Reliability Workbench - A collection of tools designed specifically for reliability engineering.
• Item Software - A collection of tools for reliability engineering including FMECA, fault trees, reliability
prediction, and many others.
• SuperSMITH - This software is designed specifically for reliability engineering and has many useful tools.
The user interface looks like it is from the early 1990s but the methods used are no less relevant today. This
software was developed alongside the New Weibull Handbook, an excellent resource for interpreting the results
of reliability engineering software.
• RAM Commander - A software tool for Reliability and Maintainability Analysis and Prediction, Spares
Optimisation, FMEA/FMECA, Testability, Fault Tree Analysis, Event Tree Analysis and Safety Assessment.
• RelCalc - RelCalc for Windows automates the reliability prediction procedure of Telcordia SR-332, or MIL-
HDBK-217, providing an alternative to tedious, time consuming, and error prone manual methods.
• @RISK - A comprehensive Excel addon that allows for distribution fitting, reliability modelling, MC simulation
and much more.
• Quanterion Automated Reliability Toolkit (QuART) - A collection of reliability tools including reliability
prediction, FMECA, derating, stress-strength interference, and many others. Quanterion produces several software
products so their tools are not all available in one place.
• TopEvent FTA - Fault Tree Analysis software. Tailored specifically for fault tree analysis so it lacks other RAM
tools but it is good at its intended function. A demo version is available with size and data export limitations.
• Maintenance Aware Design (MADe) - FMECA and RCM software that is extremely useful at the product design
stage to inform the design and service plan which then improves the inherent reliability and maintainability.
There is an academic license which allows non-profit users to run the software for free.
Free online tools and calculators
• Reliability Analytics Toolkit - a collection of tools which run using the Google App Engine. Includes a tool for
fitting a Weibull_2P distribution.
• Weibull App - An online tool for fitting a Weibull_2P distribution. Download the example template to see what
format the app is expecting your data to be in before you can upload your own data. The backend is powered by
the abrem R package. This tool has limited applications beyond fitting a Weibull_2P distribution.


• Distributome - Provides PDF and CDF of a large number of probability distributions that can be easily changed
using sliders for their parameters. It also includes a quantile / CDF calculator. Similar to the Distribution
calculator below.
• Distribution Calculator - Provides PDF and CDF of a large number of probability distributions that can be easily
changed using sliders for their parameters. It also includes a quantile / CDF calculator. Similar to Distributome
above.
• Kijima G-renewal process - an online calculator for simulating the G-renewal process.
• Prediction of future recurrent events - an online calculator for predicting future recurrent events with different
underlying probability functions.
• Maintenance optimization - an online calculator for optimal replacement policy (time) under Kijima imperfect
repair model.
• e-Fatigue - This website provides stress concentration factors (Kt) for various notched geometries. You will
need this if using the functions for fracture mechanics in the Physics of Failure section.
• Fault Tree Analyser - A simple online tool where you can build a fault tree, give each branch a failure rate and
run a variety of reports including reliability prediction at time, minimal cut sets, and several others.
• Wolfram Alpha - an amazing computational knowledge engine. Great for checking your calculations.
• Derivative calculator - calculates derivatives. Slightly more user friendly input method than Wolfram alpha and
doesn’t time out as easily for big calculations.
• Integral calculator - calculates integrals. Slightly more user friendly input method than Wolfram alpha and
doesn’t time out as easily for big calculations.
• GeoGebra - An interactive calculator that is extremely useful for plotting equations. Also includes many
mathematical operations (such as integrals and derivatives) that allow you to keep your equations in symbolic form.
You can download your current calculator to save it. The only downside is that there are not many probability
distribution functions inbuilt so you will need to enter the equations manually.
Online information resources
• Reliawiki - an excellent reference written by Reliasoft that is intended largely as a guide to reliability engineering
when using Reliasoft’s software but is equally as good to understand concepts without using their software.
• Reliasoft’s Accelerated Life Testing Data Analysis Reference
• Reliasoft’s collection of Military Directives, Handbooks and Standards Related to Reliability
• Univariate distributions relationships - a great interactive diagram for understanding more about probability
distributions and how they are related. Some strange parametrisations are used in the documentation.
• Cross Validated - a forum for asking statistics and mathematics questions. Check for existing answers before
posting your own question.
• Stack Overflow - a forum for programmers where you can post questions and answers related to programming.
Check for existing answers before posting your own question.
• Wikipedia - it’s always worth checking if there’s an article on there about the topic you’re trying to understand.
Getting free access to academic papers
• arXiv - a database run by Cornell University that provides open access to over 1.5 million academic papers that
have been submitted. If you can’t find it here then check on Sci-Hub.
• Sci-Hub - paste in a DOI to get a copy of the academic paper. Accessing academic knowledge should be free
and this site makes it possible.



CHAPTER 4

Equations of supported distributions

The following expressions provide the equations for the Probability Density Function (PDF), Cumulative Distribution
Function (CDF), Survival Function (SF) (this is the same as the reliability function R(t)), Hazard Function (HF), and
Cumulative Hazard Function (CHF) of all supported distributions. Readers should note that there are many ways
to write the equations for probability distributions and careful attention should be afforded to the parametrization
to ensure you understand each parameter. For more equations of these distributions, see the textbook “Probability
Distributions Used in Reliability Engineering” listed in recommended resources.

4.1 Weibull Distribution

$\alpha$ = scale parameter ($\alpha > 0$)

$\beta$ = shape parameter ($\beta > 0$)

Limits ($t \geq 0$)

PDF: $f(t) = \frac{\beta t^{\beta-1}}{\alpha^{\beta}}\,e^{-(t/\alpha)^{\beta}} = \frac{\beta}{\alpha}\left(\frac{t}{\alpha}\right)^{\beta-1} e^{-(t/\alpha)^{\beta}}$

CDF: $F(t) = 1 - e^{-(t/\alpha)^{\beta}}$

SF: $R(t) = e^{-(t/\alpha)^{\beta}}$

HF: $h(t) = \frac{\beta}{\alpha}\left(\frac{t}{\alpha}\right)^{\beta-1}$

CHF: $H(t) = \left(\frac{t}{\alpha}\right)^{\beta}$

4.2 Exponential Distribution

$\lambda$ = scale parameter ($\lambda > 0$)

Limits ($t \geq 0$)

PDF: $f(t) = \lambda e^{-\lambda t}$

CDF: $F(t) = 1 - e^{-\lambda t}$

SF: $R(t) = e^{-\lambda t}$

HF: $h(t) = \lambda$

CHF: $H(t) = \lambda t$

Note that some parametrizations of the Exponential distribution (such as the one in scipy.stats) use $\frac{1}{\lambda}$ in place of $\lambda$.
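For instance, a quick numerical check of this difference in parametrization (assuming scipy is installed):

import numpy as np
from scipy.stats import expon

lam = 0.5  # made-up rate parameter
print(expon.sf(2, scale=1 / lam))  # scipy parametrizes with scale = 1/lambda
print(np.exp(-lam * 2))            # R(t) = exp(-lambda*t); both print 0.36787944...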

4.3 Normal Distribution

$\mu$ = location parameter ($-\infty < \mu < \infty$)

$\sigma$ = scale parameter ($\sigma > 0$)

Limits ($-\infty < t < \infty$)

PDF: $f(t) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{t-\mu}{\sigma}\right)^{2}\right] = \frac{1}{\sigma}\,\phi\left(\frac{t-\mu}{\sigma}\right)$

where $\phi$ is the standard normal PDF with $\mu = 0$ and $\sigma = 1$

CDF: $F(t) = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{t}\exp\left[-\frac{1}{2}\left(\frac{\theta-\mu}{\sigma}\right)^{2}\right]\mathrm{d}\theta = \frac{1}{2} + \frac{1}{2}\,\mathrm{erf}\left(\frac{t-\mu}{\sigma\sqrt{2}}\right) = \Phi\left(\frac{t-\mu}{\sigma}\right)$

where $\Phi$ is the standard normal CDF with $\mu = 0$ and $\sigma = 1$

SF: $R(t) = 1 - \Phi\left(\frac{t-\mu}{\sigma}\right) = \Phi\left(\frac{\mu-t}{\sigma}\right)$

HF: $h(t) = \frac{\phi\left(\frac{t-\mu}{\sigma}\right)}{\sigma\,\Phi\left(\frac{\mu-t}{\sigma}\right)}$

CHF: $H(t) = -\ln\left[\Phi\left(\frac{\mu-t}{\sigma}\right)\right]$

4.4 Lognormal Distribution

$\mu$ = scale parameter ($-\infty < \mu < \infty$)

$\sigma$ = shape parameter ($\sigma > 0$)

Limits ($t \geq 0$)

PDF: $f(t) = \frac{1}{\sigma t\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{\ln(t)-\mu}{\sigma}\right)^{2}\right] = \frac{1}{\sigma t}\,\phi\left(\frac{\ln(t)-\mu}{\sigma}\right)$

where $\phi$ is the standard normal PDF with $\mu = 0$ and $\sigma = 1$

CDF: $F(t) = \frac{1}{\sigma\sqrt{2\pi}}\int_{0}^{t}\frac{1}{\theta}\exp\left[-\frac{1}{2}\left(\frac{\ln(\theta)-\mu}{\sigma}\right)^{2}\right]\mathrm{d}\theta = \frac{1}{2} + \frac{1}{2}\,\mathrm{erf}\left(\frac{\ln(t)-\mu}{\sigma\sqrt{2}}\right) = \Phi\left(\frac{\ln(t)-\mu}{\sigma}\right)$

where $\Phi$ is the standard normal CDF with $\mu = 0$ and $\sigma = 1$

SF: $R(t) = 1 - \Phi\left(\frac{\ln(t)-\mu}{\sigma}\right)$

HF: $h(t) = \frac{\phi\left(\frac{\ln(t)-\mu}{\sigma}\right)}{t\sigma\left(1-\Phi\left(\frac{\ln(t)-\mu}{\sigma}\right)\right)}$

CHF: $H(t) = -\ln\left[1 - \Phi\left(\frac{\ln(t)-\mu}{\sigma}\right)\right]$

4.5 Gamma Distribution

$\alpha$ = scale parameter ($\alpha > 0$)

$\beta$ = shape parameter ($\beta > 0$)

Limits ($t \geq 0$)

PDF: $f(t) = \frac{t^{\beta-1}}{\Gamma(\beta)\,\alpha^{\beta}}\,e^{-t/\alpha}$

where $\Gamma(x)$ is the complete gamma function: $\Gamma(x) = \int_{0}^{\infty} t^{x-1}e^{-t}\,\mathrm{d}t$

CDF: $F(t) = \frac{1}{\Gamma(\beta)}\,\gamma\left(\beta,\frac{t}{\alpha}\right)$

where $\gamma(x,y)$ is the lower incomplete gamma function: $\gamma(x,y) = \int_{0}^{y} t^{x-1}e^{-t}\,\mathrm{d}t$

SF: $R(t) = \frac{1}{\Gamma(\beta)}\,\Gamma\left(\beta,\frac{t}{\alpha}\right)$

where $\Gamma(x,y)$ is the upper incomplete gamma function: $\Gamma(x,y) = \int_{y}^{\infty} t^{x-1}e^{-t}\,\mathrm{d}t$

HF: $h(t) = \frac{t^{\beta-1}\exp\left(-\frac{t}{\alpha}\right)}{\alpha^{\beta}\,\Gamma\left(\beta,\frac{t}{\alpha}\right)}$

CHF: $H(t) = -\ln\left[\frac{1}{\Gamma(\beta)}\,\Gamma\left(\beta,\frac{t}{\alpha}\right)\right]$

Note that some parametrizations of the Gamma distribution use $\frac{1}{\alpha}$ in place of $\alpha$. There is also an alternative
parametrization which uses shape and rate instead of shape and scale. See Wikipedia for an example of this.

4.6 Beta Distribution

$\alpha$ = shape parameter ($\alpha > 0$)

$\beta$ = shape parameter ($\beta > 0$)

Limits ($0 \leq t \leq 1$)

PDF: $f(t) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\,t^{\alpha-1}(1-t)^{\beta-1} = \frac{1}{B(\alpha,\beta)}\,t^{\alpha-1}(1-t)^{\beta-1}$

where $\Gamma(x)$ is the complete gamma function: $\Gamma(x) = \int_{0}^{\infty} t^{x-1}e^{-t}\,\mathrm{d}t$

where $B(x,y)$ is the complete beta function: $B(x,y) = \int_{0}^{1} t^{x-1}(1-t)^{y-1}\,\mathrm{d}t$

CDF: $F(t) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\int_{0}^{t}\theta^{\alpha-1}(1-\theta)^{\beta-1}\,\mathrm{d}\theta = \frac{B_{t}(t|\alpha,\beta)}{B(\alpha,\beta)} = I_{t}(t|\alpha,\beta)$

where $B_{t}(t|x,y)$ is the incomplete beta function: $B_{t}(t|x,y) = \int_{0}^{t}\theta^{x-1}(1-\theta)^{y-1}\,\mathrm{d}\theta$

where $I_{t}(t|x,y)$ is the regularized incomplete beta function, which is defined in terms of the incomplete beta function
and the complete beta function: $I_{t}(t|x,y) = \frac{B_{t}(t|x,y)}{B(x,y)}$

SF: $R(t) = 1 - I_{t}(t|\alpha,\beta)$

HF: $h(t) = \frac{t^{\alpha-1}(1-t)^{\beta-1}}{B(\alpha,\beta) - B_{t}(t|\alpha,\beta)}$

CHF: $H(t) = -\ln\left[1 - I_{t}(t|\alpha,\beta)\right]$

Note that there is a parameterization of the Beta distribution that changes the lower and upper limits beyond 0 and 1.
For this parametrization, see the reference listed in the opening paragraph of this page.

4.7 Loglogistic Distribution

Note: The Loglogistic distribution will be available in version 0.5.3 which is currently unreleased.

$\alpha$ = scale parameter ($\alpha > 0$)

$\beta$ = shape parameter ($\beta > 0$)

Limits ($t \geq 0$)

PDF: $f(t) = \frac{\left(\frac{\beta}{\alpha}\right)\left(\frac{t}{\alpha}\right)^{\beta-1}}{\left(1+\left(\frac{t}{\alpha}\right)^{\beta}\right)^{2}}$

CDF: $F(t) = \frac{1}{1+\left(\frac{t}{\alpha}\right)^{-\beta}} = \frac{\left(\frac{t}{\alpha}\right)^{\beta}}{1+\left(\frac{t}{\alpha}\right)^{\beta}} = \frac{t^{\beta}}{\alpha^{\beta}+t^{\beta}}$

SF: $R(t) = \frac{1}{1+\left(\frac{t}{\alpha}\right)^{\beta}}$

HF: $h(t) = \frac{\left(\frac{\beta}{\alpha}\right)\left(\frac{t}{\alpha}\right)^{\beta-1}}{1+\left(\frac{t}{\alpha}\right)^{\beta}}$

CHF: $H(t) = \ln\left(1+\left(\frac{t}{\alpha}\right)^{\beta}\right)$

There is another parameterization of the loglogistic distribution using $\mu$ and $\sigma$, which is designed to look more like the
parametrization of the logistic distribution and is related to the above parametrization by $\mu = \ln(\alpha)$ and $\sigma = \frac{1}{\beta}$. This
parametrisation can be found here.

4.8 Location shifting the distributions

Within reliability the parametrization of the Exponential, Weibull, Gamma, Lognormal, and Loglogistic
distributions allows for location shifting using the gamma parameter. This will simply shift the distribution’s lower limit to
the right from 0 to 𝛾. In the location shifted form of the distributions, the equations listed above are almost identical,
except everywhere you see 𝑡 replace it with 𝑡 − 𝛾. The reason for using the location shifted form of the distribution is
that some phenomena that can be modelled well by a certain probability distribution do not begin to occur
immediately, so it becomes necessary to shift the lower limit of the distribution so that the data can be accurately modelled
by the distribution.
If implementing this yourself, ensure you set all y-values to 0 for 𝑡 ≤ 𝛾, as the raw form of the location shifted
distributions above will not automatically zero these values for you and may result in negative values. This zeroing
is done automatically within reliability.
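As a quick illustration of the substitution, a minimal sketch (using numpy, with made-up parameter values) of a location shifted Weibull CDF with the zeroing described above:

import numpy as np

alpha, beta, gamma = 30.0, 2.0, 20.0  # made-up parameters

def weibull_cdf_shifted(t):
    t = np.asarray(t, dtype=float)
    cdf = 1 - np.exp(-((t - gamma) / alpha) ** beta)  # everywhere t appears, use t - gamma
    return np.where(t <= gamma, 0.0, cdf)  # zero all y-values for t <= gamma

print(weibull_cdf_shifted([10, 20, 50, 110]))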

4.9 Relationships between the five functions

The PDF, CDF, SF, HF, and CHF of a probability distribution are inter-related, and any of these functions can be obtained
by applying the correct transformation to any of the others. The following transformations are some of the most
useful:
$\mathrm{PDF} = \frac{d}{dt}\,\mathrm{CDF}$

$\mathrm{CDF} = \int_{-\infty}^{t} \mathrm{PDF}\;\mathrm{d}t$

$\mathrm{SF} = 1 - \mathrm{CDF}$

$\mathrm{HF} = \frac{\mathrm{PDF}}{\mathrm{SF}}$

$\mathrm{CHF} = -\ln(\mathrm{SF})$
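As a quick numerical check of these relationships, a short sketch using the library's distribution objects (made-up parameters, and assuming the plotting methods return their y-values when given xvals with show_plot=False, as described in the next chapter):

from reliability.Distributions import Weibull_Distribution
import numpy as np

dist = Weibull_Distribution(alpha=30, beta=2)  # made-up parameters
t = np.linspace(1, 100, 500)
pdf = dist.PDF(xvals=t, show_plot=False)
sf = dist.SF(xvals=t, show_plot=False)
hf = dist.HF(xvals=t, show_plot=False)
chf = dist.CHF(xvals=t, show_plot=False)
print(np.allclose(hf, pdf / sf))      # HF = PDF / SF
print(np.allclose(chf, -np.log(sf)))  # CHF = -ln(SF)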


CHAPTER 5

Creating and plotting distributions

Probability distributions within reliability are Python objects, which allows us to specify just the type of
distribution and parameters. Once the distribution object is created, we can access a large number of methods, some
of which will require additional input. There are 6 different probability distributions available in
reliability.Distributions. These are:
• Weibull Distribution (𝛼, 𝛽, 𝛾)
• Exponential Distribution (𝜆, 𝛾)
• Gamma Distribution (𝛼, 𝛽, 𝛾)
• Normal Distribution (𝜇, 𝜎)
• Lognormal Distribution (𝜇, 𝜎, 𝛾)
• Beta Distribution (𝛼, 𝛽)
In all of the distributions which use 𝛾, the 𝛾 parameter is used to location shift the distribution to the right. The Beta
distribution is only defined in the range {0,1}. All distributions except the Normal distribution are defined in the
positive domain only (x>0).
Understanding how to create and plot distributions is easiest with an example. The following code will create a
Lognormal Distribution with parameters mu=5 and sigma=1. From this distribution, we will use the plot() method
which provides a quick way to visualise the five functions and also provides a summary of the descriptive statistics.

from reliability.Distributions import Lognormal_Distribution


dist = Lognormal_Distribution(mu=5,sigma=1)
dist.plot()


The following methods are available for all distributions:


• name - a string of the distribution name. Eg. ‘Weibull’
• name2 - a string of the distribution name including the number of parameters. Eg. ‘Weibull_2P’
• param_title_long - Useful in plot titles, legends and in printing strings. Varies by distribution. eg. ‘Weibull
Distribution (𝛼=5,𝛽=2)’
• param_title - Useful in plot titles, legends and in printing strings. Varies by distribution. eg. ‘𝛼=5,𝛽=2’
• parameters - returns an array of parameters. These are in the order specified in the bullet points above, so for
Lognormal it would return [mu,sigma,gamma].
• alpha, beta, gamma, Lambda, mu, sigma - these vary by distribution but will return the value of their respective
parameter. Eg. dist.mu would return 5 in the above example.
• mean
• variance
• standard_deviation
• skewness
• kurtosis
• excess_kurtosis


• median
• mode
• b5 - the time at which 5% of units have failed. Same as dist.quantile(0.05)
• b95 - the time at which 95% of units have failed. Same as dist.quantile(0.95)
• plot() - plots all functions (PDF, CDF, SF, HF, CHF). No additional arguments are accepted.
• PDF() - plots the probability density function. Also accepts xvals, xmin, xmax, show_plot.
• CDF() - plots the cumulative distribution function. Also accepts xvals, xmin, xmax, show_plot.
• SF() - plots the survival function (also known as reliability function). Also accepts xvals, xmin, xmax,
show_plot.
• HF() - plots the hazard function. Also accepts xvals, xmin, xmax, show_plot.
• CHF() - plots the cumulative hazard function. Also accepts xvals, xmin, xmax, show_plot.
• quantile() - Calculates the quantile (time until a fraction has failed) for a given fraction failing. Also known as
‘b’ life where b5 is the time at which 5% have failed. Eg. dist.quantile(0.05) will give the b5 life.
• inverse_SF() - Calculates the inverse of the survival function. Useful when producing QQ plots.
• mean_residual_life() - Average residual lifetime of an item given that the item has survived up to a given time.
Effectively the mean of the remaining amount (right side) of a distribution at a given time. You must specify the
x-value at which to calculate MRL. Eg. dist.mean_residual_life(10)
• stats() - prints all the descriptive statistics. Same as the statistics shown using .plot() but printed to console.
• random_samples() - draws random samples from the distribution to which it is applied. Same as rvs in
scipy.stats. You must specify the number of samples. Eg. data = dist.random_samples(100) will set data
as a list of 100 random samples from the distribution. If you want repeatability, set the random seed using
numpy.random.seed(1) before drawing the random samples.
For all of the individual plotting functions (PDF, CDF, SF, HF, CHF), all standard matplotlib plotting keywords (such
as label, color, linestyle, etc.) are accepted and used. If not specified they are preset. In specifying the plotting
positions for the x-axis, there are optional keywords to be used. The first of these is ‘xvals’ which accepts a list of
x-values to use for the horizontal axis. Alternatively, the user may specify ‘xmin’ and ‘xmax’ and the axis will be
created using np.linspace(xmin, xmax, 1000).
Note that .plot() does not require plt.show() to be used as it will automatically show, however the other 5 plotting
functions will not be displayed until plt.show() is used. This is to allow the user to overlay multiple plots on the figure
or change titles, labels, and legends as required. The plot can be turned off by specifying show_plot=False.
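For instance, a minimal sketch (with made-up parameter values) of passing matplotlib keywords and the x-axis options described above to the individual plotting functions:

from reliability.Distributions import Weibull_Distribution
import matplotlib.pyplot as plt
import numpy as np

dist = Weibull_Distribution(alpha=50, beta=2)  # made-up parameters
dist.PDF(xvals=np.linspace(0, 150, 1000), color='red', linestyle='--', label='PDF')  # explicit x-values
dist.CDF(xmin=0, xmax=150, label='CDF')  # axis created using np.linspace(xmin, xmax, 1000)
plt.legend()
plt.show()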
As another example, we will create a bathtub curve by creating and layering several distributions. The bathtub curve
is only for the Hazard function as it shows how a variety of failure modes throughout the life of a population can shape
the hazard into a bathtub shape. The three distinct regions are infant mortality, random failures, and wear out.
from reliability.Distributions import Weibull_Distribution, Lognormal_Distribution, Exponential_Distribution
import matplotlib.pyplot as plt
import numpy as np

xvals = np.linspace(0, 1000, 1000)
infant_mortality = Weibull_Distribution(alpha=400, beta=0.7).HF(xvals=xvals, label='infant mortality [Weibull]')
random_failures = Exponential_Distribution(Lambda=0.001).HF(xvals=xvals, label='random failures [Exponential]')
wear_out = Lognormal_Distribution(mu=6.8, sigma=0.1).HF(xvals=xvals, label='wear out [Lognormal]')
combined = infant_mortality + random_failures + wear_out

plt.plot(xvals, combined, linestyle='--', label='Combined hazard rate')
plt.legend()
plt.title('Example of how multiple failure modes at different stages of\nlife create a "Bathtub curve" for the total Hazard function')
plt.show()

Further detail about all of the functions is available using the help function within Python. Simply type:

from reliability.Distributions import Lognormal_Distribution


print(help(Lognormal_Distribution))



CHAPTER 6

Fitting a specific distribution to data

The module reliability.Fitters provides many probability distribution fitting functions. These functions can
be thought of in two categories; non-location shifted distributions [eg. Weibull (𝛼,𝛽)], and location shifted distributions
[eg. Weibull (𝛼,𝛽,𝛾)]. All of the distributions can be fitted to both complete and incomplete (right censored) data. All
distributions in the Fitters module are named with their number of parameters (eg. Fit_Weibull_2P uses 𝛼,𝛽, whereas
Fit_Weibull_3P uses 𝛼,𝛽,𝛾). This is intended to remove ambiguity about what distribution you are fitting.
Distributions are fitted simply by using the desired function and specifying the data as failures or right_censored data.
You must have at least as many failures as there are distribution parameters or the fit would be under-constrained. It
is generally advisable to have at least 4 data points as the accuracy of the fit is proportional to the amount of data.
Once fitted, the results are assigned to an object and the fitted parameters can be accessed by name, as shown in the
examples below. The goodness of fit criteria are also available as AICc (Akaike Information Criterion corrected) and
BIC (Bayesian Information Criterion), though these are more useful when comparing the fit of multiple distributions
such as in the Fit_Everything function. As a matter of convenience, each of the modules in Fitters also generates a
distribution object that has the parameters of the fitted distribution.
The supported distributions are:
• Weibull_2P
• Weibull_3P
• Exponential_1P
• Exponential_2P
• Gamma_2P
• Gamma_3P
• Lognormal_2P
• Lognormal_3P
• Normal_2P
• Beta_2P
• Weibull_Mixture (see the section on this)


Note: The Beta distribution is only for data in the range 0 to 1. Specifying data outside of this range will cause an
error.

Note: If you have a very large amount of data (>100000 samples) then it is likely that your computer will take
significant time to compute the results. This is a limitation of interpreted languages like Python compared to
compiled languages like C++, which many commercial reliability software packages are written in. If you have very large
volumes of data, you may want to consider using commercial software for faster computation time. The function
Fit_Weibull_2P_grouped is designed to accept a dataframe which has multiple occurrences of some values (eg. multi-
ple values all right censored to the same value when the test was ended). Depending on the size of the data set and the
amount of grouping in your data, Fit_Weibull_2P_grouped may be much faster than Fit_Weibull_2P and achieve the
same level of accuracy. This difference is not noticeable if you have less than 10000 samples. For more information,
see the example below on using Fit_Weibull_2P_grouped.

Note: Heavily censored data (>99.9% censoring) may result in a failure of the optimizer to find a solution. If you
have heavily censored data, you may have a limited failure population problem. It is recommended that you do not try
fitting one of these standard distributions to such a dataset as your results (while they may have achieved a successful
fit) will be a poor description of your overall population statistic and you risk drawing the wrong conclusions when
the wrong model is fitted. The limited failure population model is planned for a future release of reliability, though
development on this model is yet to commence. In the meantime, see JMP Pro’s model for Defective Subpopulations.

Note: The confidence intervals shown on the plots below are only available for the Exponential (1P and 2P) and
Weibull (2P and 3P) fitters. This library is in active development and over the next few months the confidence intervals
will be added to the Normal and Lognormal Fitters followed by the Gamma and Beta Fitters.

If you do not know which distribution you want to fit, then please see the section on using the Fit_Everything function
which will find the best distribution to describe your data. It is highly recommended that you always try to fit everything
and accept the best fit rather than choosing a particular distribution for other reasons.
Each of the fitters listed above (except Fit_Weibull_Mixture and Fit_Weibull_2P_grouped) has the following inputs
and outputs:
Inputs:
• failures - an array or list of failure data
• right_censored - an array or list of right censored data. Optional input
• show_probability_plot - True/False. Defaults to True. Produces a probability plot of the failure data and fitted
distribution.
• print_results - True/False. Defaults to True. Prints a dataframe of the point estimate, standard error, Lower CI
and Upper CI for each parameter.
• CI - confidence interval for estimating confidence limits on parameters. Must be between 0 and 1. Default is
0.95 for 95% CI.
• force_beta (in Fit_Weibull_2P) or force_sigma (in Fit_Normal_2P and Fit_Lognormal_2P). This allows the
user to force the shape parameter to be a set value. Useful for ALT probability plotting. Optional input. Not
available for Fit_Beta_2P (due to there being 2 shape parameters), Fit_Expon_1P (due to there being only 1
parameter), Fit_Gamma_2P (due to Gamma_2P not being suitable for ALT probability plotting) or any of the
location shifted distributions (due to these not typically being used for ALT probability plotting).


• keyword arguments are also accepted for the probability plot (eg. color, linestyle, marker)
Outputs (the following example outputs are for the Fit_Weibull_2P distribution but for other distributions the parameter
names may be different from alpha and beta):
• alpha - the fitted Weibull_2P alpha parameter
• beta - the fitted Weibull_2P beta parameter
• loglik - Log-Likelihood (as used in Minitab and Reliasoft)
• loglik2 - Log-Likelihood*-2 (as used in JMP Pro)
• AICc - Akaike Information Criterion
• BIC - Bayesian Information Criterion
• distribution - a Distribution object with the parameters of the fitted distribution
• alpha_SE - the standard error (sqrt(variance)) of the parameter
• beta_SE - the standard error (sqrt(variance)) of the parameter. This will be ‘’ if the shape parameter has been
forced to a set value.
• Cov_alpha_beta - the covariance between the parameters. This will be ‘’ for Fit_Expon_1P or if the shape
parameter has been forced to a set value.
• alpha_upper - the upper CI estimate of the parameter
• alpha_lower - the lower CI estimate of the parameter
• beta_upper - the upper CI estimate of the parameter. This will be ‘’ if the shape parameter has been forced to a
set value.
• beta_lower - the lower CI estimate of the parameter. This will be ‘’ if the shape parameter has been forced to a
set value.
• results - a dataframe of the results (point estimate, standard error, Lower CI and Upper CI for each parameter)
• success - True/False. Indicates whether the solution was found by autograd. If success is False a warning will
be printed indicating that scipy’s fit was used as autograd failed. This fit will not be accurate if there is censored
data as scipy does not have the ability to fit censored data. Failure of autograd to find the solution should be
rare and if it occurs, it is likely that the distribution is an extremely bad fit for the data. Try scaling your data,
removing extreme values, or using another distribution.
To learn how we can fit a distribution, we will start by using a simple example with 10 failure times. These times were
generated from a Weibull distribution with 𝛼=50, 𝛽=2. Note that the output also provides the confidence intervals
and standard error of the parameter estimates. The probability plot is generated by default (you will need to specify
plt.show() to show it). See the section on probability plotting to learn how to interpret this plot.

from reliability.Fitters import Fit_Weibull_2P
import matplotlib.pyplot as plt

data = [42.1605147, 51.0479599, 41.424553, 35.0159047, 87.3087644, 30.7435371, 52.2003467, 35.9354271, 71.8373629, 59.171129]
wb = Fit_Weibull_2P(failures=data)
plt.show()

'''
Results from Fit_Weibull_2P (95% CI):
           Point Estimate  Standard Error   Lower CI   Upper CI
Parameter
Alpha           56.682213        6.062570  45.962610  69.901889
Beta             3.141680        0.733552   1.987993   4.964886
Log-Likelihood: -42.4263105092607
'''

The above probability plot is the typical way to visualise how the CDF (the blue line) models the failure data (the
black points). If you would like to view the failure points alongside the CDF, SF, or CHF without the axis being
scaled then you can generate the scatter plot using the function plot_points which is available within
reliability.Probability_plotting. In the example below we create some data, then fit a Weibull distribution to the data (ensuring
we turn off the probability plot). From the fitted distribution object we plot the Survival Function (SF). We then use
plot_points to generate a scatter plot of the plotting positions for the survival function.
For the function plot_points the inputs are:
• failures - an array or list of failure data
• right_censored - an array or list of right censored data. Optional input
• func - the function to be plotted. Must be ‘CDF’, ‘SF’, or ‘CHF’. Default is ‘CDF’
• h1 and h2 - these are the plotting heuristics. See probability plotting for more details.
• keywords for the scatterplot are also accepted.

from reliability.Distributions import Weibull_Distribution
from reliability.Fitters import Fit_Weibull_2P
from reliability.Probability_plotting import plot_points
import matplotlib.pyplot as plt

data = Weibull_Distribution(alpha=25, beta=4).random_samples(30)
weibull_fit = Fit_Weibull_2P(failures=data, show_probability_plot=False, print_results=False)
weibull_fit.distribution.SF(label='Fitted Distribution', color='steelblue')
plot_points(failures=data, func='SF', label='failure data', color='red', alpha=0.7)
plt.legend()
plt.show()

It is beneficial to see the effectiveness of the fitted distribution in comparison to the original distribution. In this
second example, we are creating 500 samples from a Weibull distribution and then we will right censor all of the data
above our chosen threshold. Then we are fitting a Weibull_3P distribution to the data. Note that we need to specify
“show_probability_plot=False, print_results=False” in the Fit_Weibull_3P to prevent the normal outputs of the fitting
function from being displayed.
from reliability.Distributions import Weibull_Distribution
from reliability.Fitters import Fit_Weibull_3P
from reliability.Other_functions import make_right_censored_data, histogram
import matplotlib.pyplot as plt

a = 30
b = 2
g = 20
threshold = 55
dist = Weibull_Distribution(alpha=a, beta=b, gamma=g)  # generate a weibull distribution
raw_data = dist.random_samples(500, seed=2)  # create some data from the distribution
data = make_right_censored_data(raw_data, threshold=threshold)  # right censor some of the data
print('There are', len(data.right_censored), 'right censored items.')

wbf = Fit_Weibull_3P(failures=data.failures, right_censored=data.right_censored, show_probability_plot=False, print_results=False)  # fit the Weibull_3P distribution
print('Fit_Weibull_3P parameters:\nAlpha:', wbf.alpha, '\nBeta:', wbf.beta, '\nGamma', wbf.gamma)

histogram(raw_data, white_above=threshold)  # generates the histogram using optimal bin width and shades the censored part as white
dist.PDF(label='True Distribution')  # plots the true distribution's PDF
wbf.distribution.PDF(label='Fit_Weibull_3P', linestyle='--')  # plots the PDF of the fitted Weibull_3P
plt.title('Fitting comparison for failures and right censored data')
plt.legend()
plt.show()

'''
There are 118 right censored items.
Fit_Weibull_3P parameters:
Alpha: 28.87483648004299
Beta: 2.0295019039127347
Gamma 20.383900235710296
'''


As a final example, we will fit a Gamma_2P distribution to some partially right censored data. To provide a comparison
of the fitting accuracy as the number of samples increases, we will do the same experiment with varying sample sizes.
The results highlight that the accuracy of the fit improves with the number of samples, so you should always try to
obtain more data if possible.
from reliability.Distributions import Gamma_Distribution
from reliability.Fitters import Fit_Gamma_2P
from reliability.Other_functions import make_right_censored_data, histogram
import matplotlib.pyplot as plt

a = 30
b = 4
threshold = 180  # this is used when right censoring the data
trials = [10, 100, 1000, 10000]
subplot_id = 221
plt.figure(figsize=(9, 7))
for sample_size in trials:
    dist = Gamma_Distribution(alpha=a, beta=b)
    raw_data = dist.random_samples(sample_size, seed=2)  # create some data. Seeded for repeatability
    data = make_right_censored_data(raw_data, threshold=threshold)  # right censor the data
    gf = Fit_Gamma_2P(failures=data.failures, right_censored=data.right_censored, show_probability_plot=False, print_results=False)  # fit the Gamma_2P distribution
    print('\nFit_Gamma_2P parameters using', sample_size, 'samples:', '\nAlpha:', gf.alpha, '\nBeta:', gf.beta)  # print the results
    plt.subplot(subplot_id)
    histogram(raw_data, white_above=threshold)  # plots the histogram using optimal bin width and shades the right censored part white
    dist.PDF(label='True')  # plots the true distribution
    gf.distribution.PDF(label='Fitted', linestyle='--')  # plots the fitted Gamma_2P distribution
    plt.title(str(str(sample_size) + ' samples\n' + r'$\alpha$ error: ' + str(round(abs(gf.alpha - a) / a * 100, 2)) + '%\n' + r'$\beta$ error: ' + str(round(abs(gf.beta - b) / b * 100, 2)) + '%'))
    plt.ylim([0, 0.012])
    plt.xlim([0, 500])
    plt.legend()
    subplot_id += 1
plt.subplots_adjust(left=0.11, bottom=0.08, right=0.95, top=0.89, wspace=0.33, hspace=0.58)
plt.show()

'''
Fit_Gamma_2P parameters using 10 samples:
Alpha: 19.426036067937076
Beta: 4.690126148584235

Fit_Gamma_2P parameters using 100 samples:
Alpha: 36.26423078012859
Beta: 3.292935791420746

Fit_Gamma_2P parameters using 1000 samples:
Alpha: 28.825237245698354
Beta: 4.062929181298052

Fit_Gamma_2P parameters using 10000 samples:
Alpha: 30.30127379917658
Beta: 3.960086262727073
'''

6.1 Using Fit_Weibull_2P_grouped for large data sets

The function Fit_Weibull_2P_grouped is effectively the same as Fit_Weibull_2P, except for a few small differences
that make it more efficient at handling grouped data sets. Grouped data sets are typically found in very large data that
may be heavily censored. The function includes a choice between two optimizers and a choice between two methods
for generating the initial guess that is given to the optimizer. These help in cases where the data is very heavily
censored (>99.9%). The defaults for these options are usually the best but you may want to try different options to see
which one gives you the lowest log-likelihood. The inputs and outputs are the same as for Fit_Weibull_2P except for
the following:
• initial_guess_method - ‘scipy’ OR ‘least squares’. Default is ‘least squares’. Neither takes censored data into
account; ‘scipy’ uses MLE, while ‘least squares’ performs least squares regression of the plotting positions. Least
squares proved more accurate during testing.
• optimizer - ‘L-BFGS-B’ or ‘TNC’. These are both bound constrained methods. If the bounded method fails,
nelder-mead will be used. If nelder-mead fails then the initial guess will be returned with a warning. For more
information on optimizers see the scipy documentation.


• dataframe - a pandas dataframe of the appropriate format. The requirements of the input dataframe are: The
column titles MUST be ‘category’, ‘time’, ‘quantity’. The category values MUST be ‘F’ for failure or ‘C’ for
censored (right censored). The time values are the failure or right censored times. The quantity is the number of
items at that time. The quantity must be specified for all values even if the quantity is 1.
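As a minimal illustration of this format, the same structure can be built directly in pandas (the values below are taken from the dataset printed in the example that follows):

import pandas as pd

df = pd.DataFrame({'category': ['F', 'F', 'F', 'C', 'C'],
                   'time': [220, 179, 123, 44798, 62715],
                   'quantity': [1, 1, 1, 817, 823]})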
The following example shows how we can use Fit_Weibull_2P_grouped to fit a Weibull_2P distribution to grouped
data from a spreadsheet (shown below) on the Windows desktop. We change the optimizer from the default (L-BFGS-B)
to TNC as it is more successful for this dataset. In almost all cases the L-BFGS-B optimizer is better than TNC
but it is worth trying both if the first does not look good. You may also want to try changing the initial_guess_method
as the results from the optimizers can be sensitive to their initial guess for problems in which there are local minima
or insufficient gradients to find the global minimum. If you would like to access this data, it is available in
reliability.Datasets.electronics and includes both the failures and right_censored format as well as the dataframe format. An
example of this is provided in the code below (option 2).

from reliability.Fitters import Fit_Weibull_2P_grouped
import pandas as pd

# option 1 for importing this dataset (from an excel file on your desktop)
filename = 'C:\\Users\\Current User\\Desktop\\data.xlsx'
df = pd.read_excel(io=filename)

# option 2 for importing this dataset (from the dataset in reliability)
# from reliability.Datasets import electronics
# df = electronics().dataframe

print(df.head(15), '\n')
Fit_Weibull_2P_grouped(dataframe=df, optimizer='TNC', show_probability_plot=False)  # note that the TNC optimizer usually underperforms the default (L-BFGS-B) optimizer but in this case it is better

'''
     time  quantity category
0     220         1        F
1     179         1        F
2     123         1        F
3     146         1        F
4     199         1        F
5     181         1        F
6     191         1        F
7     216         1        F
8       1         1        F
9      73         1        F
10  44798       817        C
11  62715       823        C
12  81474       815        C
13  80632       813        C
14  62716       804        C

Results from Fit_Weibull_2P_grouped (95% CI):
           Point Estimate  Standard Error      Lower CI      Upper CI
Parameter
Alpha        6.205380e+21    7.780317e+22  1.319435e+11  2.918427e+32
Beta         1.537356e-01    4.863029e-02  8.270253e-02  2.857789e-01
Log-Likelihood: -144.61675864154972
Number of failures: 10
Number of right censored: 4072
Fraction censored: 99.75502 %
'''

6.2 How does the code work with censored data?

All functions in this module work using a Python library called autograd to find the derivative of the log-likelihood
function. In this way, the code only needs to specify the log PDF and log SF in order to apply Maximum-Likelihood
Estimation (MLE) to obtain the fitted parameters. Initial guesses of the parameters are essential for autograd and are
obtained using scipy.stats on all the data as if it wasn’t censored (since scipy doesn’t accept censored data). If the
distribution is an extremely bad fit or is heavily censored (>99% censored) then these guesses may be poor and the fit
might not be successful. In this case, the scipy fit is used which will be incorrect if there is any censored data. If this
occurs, a warning will be printed. Generally the fit achieved by autograd is highly successful.
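To illustrate the approach described above (though this is a minimal sketch, not the library's actual internal code), censored MLE for a Weibull_2P can be written directly using scipy's optimizer, with failures contributing their log PDF and right censored items contributing their log SF:

import numpy as np
from scipy.optimize import minimize

failures = np.array([25.0, 38.0, 41.0, 52.0, 67.0])  # made-up failure times
right_censored = np.array([70.0, 70.0, 70.0])        # made-up censoring times

def neg_log_likelihood(params):
    alpha, beta = params
    # failures contribute their log PDF; censored items contribute their log SF
    log_pdf = (np.log(beta / alpha) + (beta - 1) * np.log(failures / alpha)
               - (failures / alpha) ** beta)
    log_sf = -(right_censored / alpha) ** beta
    return -(log_pdf.sum() + log_sf.sum())

# initial guess from the uncensored data, then bounded optimization
result = minimize(neg_log_likelihood, x0=[np.mean(failures), 1.0],
                  method='L-BFGS-B', bounds=[(1e-6, None), (1e-6, None)])
print(result.x)  # fitted [alpha, beta]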
A special thanks goes to Cameron Davidson-Pilon (author of the Python library lifelines and website dataorigami.net)
for providing help with getting autograd to work, and for writing the Python library autograd-gamma, without which
it would be impossible to fit the Beta or Gamma distributions using autograd.



CHAPTER 7

Fitting all available distributions to data

Fitting all of the distributions available in reliability is a similar process to fitting a specific distribution. The
user needs to specify the failures and any right censored data. The Beta distribution will only be fitted if you specify
data that is in the range {0,1}. The selection of what can be fitted is all done automatically based on the data provided.
Inputs:
• failures - an array or list of the failure times.
• right_censored - an array or list of the right censored times.
• sort_by - goodness of fit test to sort results by. Must be either ‘BIC’ or ‘AIC’. Default is BIC.
• show_histogram_plot - True/False. Defaults to True. Will show the PDF and CDF of the fitted distributions
along with a histogram of the failure data.
• show_PP_plot - True/False. Defaults to True. Provides a comparison of parametric vs non-parametric fit using
a semiparametric Probability-Probability (PP) plot for each fitted distribution.
• show_probability_plot - True/False. Defaults to True. Provides a probability plot of each of the fitted distribu-
tions.
• print_results - True/False. Defaults to True. Will show the results of the fitted parameters and the goodness of
fit tests in a dataframe.
Outputs:
• results - a dataframe of the fitted distributions and their parameters, along with the AICc and BIC goodness of
fit statistics. This is sorted automatically to provide the best fit first. Use sort_by to change the sort
between AICc and BIC. Default sort is BIC. print_results controls whether this is printed. In displaying these
results, the pandas dataframe is designed to use the common greek letter parametrisations rather than the scale,
shape, location, threshold parametrisations which can become confusing for some distributions.
• a plot of the PDF and CDF of each fitted distribution along with a histogram of the failure data. The legend is
not in any particular order.
• a probability plot of each of the fitted distributions.
• a semi-parametric probability-probability plot of parametric vs non-parametric distribution (a better fit will
lie on the red diagonal).


• best_distribution - a distribution object created based on the parameters of the best fitting distribution. The best
distribution is created as a distribution object that can be used like any of the other distribution objects. See the
examples below for how this can be used.
• best_distribution_name - the name of the best fitting distribution. E.g. ‘Weibull_3P’
• parameters and goodness of fit tests for each fitted distribution. For example, the Weibull_3P distribution values
are: Weibull_3P_alpha, Weibull_3P_beta, Weibull_3P_gamma, Weibull_3P_BIC, Weibull_3P_AICc.
Confidence intervals for each of the fitted parameters are not reported by Fitters.Fit_Everything as this
would be a large number of outputs. If you need the confidence intervals for the fitted parameters you can repeat
the fitting using just a specific distribution and the results will include the confidence intervals. Whilst Minitab uses
the Anderson-Darling statistic for the goodness of fit, it is generally recognised that AICc and BIC are more accurate
measures as they take into account the number of parameters in the distribution.
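For reference, these criteria follow the standard definitions, with $k$ estimated parameters, $n$ observations, and maximized likelihood $L$:

$AICc = 2k - 2\ln(L) + \frac{2k^{2} + 2k}{n - k - 1}$

$BIC = k\,\ln(n) - 2\ln(L)$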
In this first example, we will use Fit_Everything on some data and will return only the dataframe of results. Note that
we are actively suppressing the 3 plots that would normally be shown to provide graphical goodness of fit indications.
from reliability.Fitters import Fit_Everything
data = [4,4,2,4,7,4,1,2,7,1,4,3,6,6,6,3,2,3,4,3,2,3,2,4,6,5,5,2,4,3]  # created using Weibull_Distribution(alpha=5,beta=2), and rounded to nearest int
Fit_Everything(failures=data, show_histogram_plot=False, show_probability_plot=False, show_PP_plot=False)

'''
                   Alpha      Beta     Gamma       Mu     Sigma    Lambda        AICc         BIC
Distribution
Weibull_2P       4.21932   2.43761                                          117.696224  120.054175
Gamma_2P        0.816685   4.57132                                          118.404666  120.762616
Normal_2P                                     3.73333   1.65193             119.697592  122.055543
Lognormal_2P                                  1.20395  0.503621             120.662122  123.020072
Lognormal_3P                          0       1.20395  0.503621             123.140754  123.020072
Weibull_3P       3.61252   2.02388  0.530239                                119.766821  123.047337
Exponential_2P                        0.999                       0.36572   124.797704  127.155654
Gamma_3P         3.49645  0.781773    0.9999                                125.942453  129.222968
Exponential_1P                                                   0.267857   141.180947  142.439287
'''

In this second example, we will create some right censored data and use Fit_Everything. All outputs are shown, and
the best fitting distribution is accessed and printed.
from reliability.Fitters import Fit_Everything
from reliability.Distributions import Weibull_Distribution
from reliability.Other_functions import make_right_censored_data

raw_data = Weibull_Distribution(alpha=12, beta=3).random_samples(100, seed=2)  # create some data
data = make_right_censored_data(raw_data, threshold=14)  # right censor the data
results = Fit_Everything(failures=data.failures, right_censored=data.right_censored)  # fit all the models
print('The best fitting distribution was', results.best_distribution_name, 'which had parameters', results.best_distribution.parameters)

'''
                  Alpha     Beta     Gamma       Mu     Sigma     Lambda        AICc         BIC
Distribution
Weibull_2P      11.2773  3.30301                                          488.041154  493.127783
Normal_2P                                 10.1194   3.37466               489.082213  494.168842
Gamma_2P        1.42315  7.21352                                          490.593729  495.680358
Weibull_3P      10.0786  2.85825  1.15083                                 489.807329  497.372839
Gamma_3P        1.42315   7.2135        0                                 492.720018  500.285528
Lognormal_2P                               2.26524  0.406436              495.693518  500.780147
Lognormal_3P                    0.883941   2.16125  0.465752              500.938298  500.780147
Exponential_2P                    2.82802                      0.121869   538.150905  543.237534
Exponential_1P                                                0.0870022   594.033742  596.598095

The best fitting distribution was Weibull_2P which had parameters [11.27730642  3.30300716  0.        ]
'''


The histogram is scaled based on the amount of censored data. If your censored data is all above or below your
failure data then the histogram bars should line up well with the fitted distributions (assuming you have enough data).
However, if your censored data is not always greater than the max of your failure data then the heights of the histogram
bars will be scaled down and the plot may look incorrect. This is to be expected as the histogram is only a plot of the
failure data and the totals will not add to 100% if there is censored data.

Note: The confidence intervals shown on the probability plots are only available for the Exponential (1P and 2P) and
Weibull (2P and 3P) fitters. This library is in active development and over the next few months the confidence intervals
will be added to the Normal and Lognormal Fitters followed by the Gamma and Beta Fitters.



CHAPTER 8

Mixture models

8.1 What are mixture models?

Mixture models are a combination of two or more distributions added together to create a distribution that has a shape
with more flexibility than a single distribution. Each of the mixture’s components must be multiplied by a proportion,
and the sum of all the proportions is equal to 1. The mixture is generally written in terms of the PDF, but since the
CDF is the integral (cumulative sum) of the PDF, we can equivalently write the Mixture model in terms of the PDF or
CDF. For a mixture model with 2 distributions, the equations are shown below:
$$PDF_{mixture} = p \times PDF_1 + (1 - p) \times PDF_2$$
$$CDF_{mixture} = p \times CDF_1 + (1 - p) \times CDF_2$$
$$SF_{mixture} = 1 - CDF_{mixture}$$
$$HF_{mixture} = \frac{PDF_{mixture}}{SF_{mixture}}$$
$$CHF_{mixture} = -\ln(SF_{mixture})$$


Another option to obtain the hazard function is to find the numerical derivative of the CHF. This is often more reliable
since the PDF/SF approach can lead to noisy results caused by limitations in floating point precision when the SF is
near zero.
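A minimal sketch of that numerical approach is shown below, using plain NumPy formulas for two hypothetical Weibull components rather than the library's internal code.

import numpy as np

xvals = np.linspace(1e-5, 100, 1000)
sf1 = np.exp(-(xvals / 30) ** 2.0)  # SF of a hypothetical Weibull component: exp(-(x/alpha)^beta)
sf2 = np.exp(-(xvals / 60) ** 4.0)  # SF of a second hypothetical component

p = 0.4                                       # mixing proportion
sf_mixture = p * sf1 + (1 - p) * sf2          # additive, like the CDF
chf_mixture = -np.log(sf_mixture)             # CHF = -ln(SF)
hf_mixture = np.gradient(chf_mixture, xvals)  # HF as the numerical derivative of the CHF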
Mixture models are useful when there is more than one failure mode that is generating the failure data. This can be
recognised by the shape of the PDF and CDF being outside of what any single distribution can accurately model. On a
probability plot, a mixture of failure modes can be identified by bends or S-shapes in the data that you might otherwise
expect to be linear. An example of this is shown in the image below. You should not use a mixture model just because
it can fit almost anything really well, but you should use a mixture model if you suspect that there are multiple failure
modes contributing to the failure data you are observing. To judge whether a mixture model is justified, look at the
goodness of fit criterion (AICc or BIC) which penalises the score based on the number of parameters in the model.
The lower the goodness of fit criterion, the better the fit.
See also competing risk models for another method of combining distributions using the product of the SF rather than
the sum of the CDF.


8.2 Creating a mixture model

Within reliability.Distributions is the Mixture_Model. This class accepts an array or list of distribution
objects created using the reliability.Distributions module (available distributions are Exponential, Weibull, Normal,
Lognormal, Gamma, Beta). There is no limit to the number of components you can add to the mixture, but it is
generally preferable to use as few as are required to fit the data appropriately (typically 2 or 3). In addition to the
distributions, you can specify the proportions contributed by each distribution in the mixture. These proportions must
sum to 1. If not specified, the proportions will be set as equal for each component.
As this process is additive for the survival function, and may accept many distributions of different types, the
mathematical formulation quickly gets complex. For this reason, the algorithm combines the models numerically
rather than analytically, so there are no simple formulas for many of the descriptive statistics (mean, median, etc.).
Also, the accuracy of the model is dependent on xvals. If the xvals array is small (<100 values) then the answer will
be "blocky" and inaccurate. The variable xvals is only accepted for PDF, CDF, SF, HF, and CHF. The other methods
(like random samples) use the default xvals for maximum accuracy. The default number of values generated when
xvals is not given is 1000. Consider this carefully when specifying xvals in order to avoid inaccuracies in the results.
The API is similar to the other probability distributions (Weibull, Normal, etc.) and has the following inputs and
methods:
Inputs:
• distributions - a list or array of probability distributions used to construct the model
• proportions - how much of each distribution to add to the mixture. The sum of proportions must always be 1.


Methods:
• name - ‘Mixture’
• name2 - ‘Mixture using 3 distributions’
• mean
• median
• mode
• variance
• standard_deviation
• skewness
• kurtosis
• excess_kurtosis
• b5 - The time where 5% have failed. Same as quantile(0.05)
• b95 - The time where 95% have failed. Same as quantile(0.95)
• plot() - plots all functions (PDF,CDF,SF,HF,CHF)
• PDF() - plots the probability density function
• CDF() - plots the cumulative distribution function
• SF() - plots the survival function (also known as reliability function)
• HF() - plots the hazard function
• CHF() - plots the cumulative hazard function
• quantile() - Calculates the quantile (time until a fraction has failed) for a given fraction failing. Also known as
b life where b5 is the time at which 5% have failed.
• inverse_SF() - the inverse of the Survival Function. This is useful when producing QQ plots.
• mean_residual_life() - Average residual lifetime of an item given that the item has survived up to a given time.
Effectively the mean of the remaining amount (right side) of a distribution at a given time.
• stats() - prints all the descriptive statistics. Same as the statistics shown using .plot() but printed to console.
• random_samples() - draws random samples from the distribution to which it is applied.
The following example shows how the Mixture_Model object can be created, visualised and used.

from reliability.Distributions import Lognormal_Distribution, Gamma_Distribution, Weibull_Distribution, Mixture_Model
import matplotlib.pyplot as plt

# create the mixture model
d1 = Lognormal_Distribution(mu=2, sigma=0.8)
d2 = Weibull_Distribution(alpha=50, beta=5, gamma=100)
d3 = Gamma_Distribution(alpha=5, beta=3, gamma=30)
mixture_model = Mixture_Model(distributions=[d1, d2, d3], proportions=[0.3, 0.4, 0.3])

# plot the 5 functions using the plot() function
mixture_model.plot()

# plot the PDF and CDF
plot_components = True  # this plots the component distributions. Default is False
plt.figure(figsize=(9, 5))
plt.subplot(121)
mixture_model.PDF(plot_components=plot_components, color='red', linestyle='--')
plt.subplot(122)
mixture_model.CDF(plot_components=plot_components, color='red', linestyle='--')
plt.subplots_adjust(left=0.1, right=0.95)
plt.show()

# extract the mean of the distribution
print('The mean of the distribution is:', mixture_model.mean)

'''
The mean of the distribution is: 74.91674657035722
'''
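The other methods listed above can be called in the same way. A minimal sketch with two hypothetical Weibull components:

from reliability.Distributions import Weibull_Distribution, Mixture_Model

model = Mixture_Model(distributions=[Weibull_Distribution(alpha=10, beta=2), Weibull_Distribution(alpha=40, beta=4)], proportions=[0.5, 0.5])
print(model.quantile(0.05))          # the b5 life: the time at which 5% have failed
print(model.mean_residual_life(20))  # average remaining life for an item that has survived to t=20
model.stats()                        # prints all of the descriptive statistics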


8.3 Fitting a mixture model

Within reliability.Fitters is Fit_Weibull_Mixture. This function will fit a Weibull mixture model consisting
of 2 x Weibull_2P distributions (this does not fit the gamma parameter). Just as with all of the other distributions
in reliability.Fitters, right censoring is supported, though care should be taken to ensure that there still
appears to be two groups when plotting only the failure data. A second group cannot be made from a mostly or totally
censored set of samples.
Whilst some failure modes may not be fitted as well by a Weibull distribution as they may be by another distribution,
it is unlikely that a mixture of data from two distributions (particularly if they are overlapping) will be fitted noticeably
better by other types of mixtures than would be achieved by a Weibull mixture. For this reason, other types of mixtures
are not implemented.
Inputs:
• failures - an array or list of the failure data. There must be at least 4 failures, but it is highly recommended to
use another model if you have fewer than 20 failures.
• right_censored - an array or list of right censored data
• print_results - True/False. This will print results to console. Default is True
• CI - confidence interval for estimating confidence limits on parameters. Must be between 0 and 1. Default is
0.95 for 95% CI.
• show_probability_plot - True/False. This will show the probability plot with the fitted mixture CDF. Default is
True.
Outputs:
• alpha_1 - the fitted Weibull_2P alpha parameter for the first (left) group
• beta_1 - the fitted Weibull_2P beta parameter for the first (left) group
• alpha_2 - the fitted Weibull_2P alpha parameter for the second (right) group


• beta_2 - the fitted Weibull_2P beta parameter for the second (right) group
• proportion_1 - the fitted proportion of the first (left) group
• proportion_2 - the fitted proportion of the second (right) group. Same as 1-proportion_1
• alpha_1_SE - the standard error on the parameter
• beta_1_SE - the standard error on the parameter
• alpha_2_SE - the standard error on the parameter
• beta_2_SE - the standard error on the parameter
• proportion_1_SE - the standard error on the parameter
• alpha_1_upper - the upper confidence interval estimate of the parameter
• alpha_1_lower - the lower confidence interval estimate of the parameter
• beta_1_upper - the upper confidence interval estimate of the parameter
• beta_1_lower - the lower confidence interval estimate of the parameter
• alpha_2_upper - the upper confidence interval estimate of the parameter
• alpha_2_lower - the lower confidence interval estimate of the parameter
• beta_2_upper - the upper confidence interval estimate of the parameter
• beta_2_lower - the lower confidence interval estimate of the parameter
• proportion_1_upper - the upper confidence interval estimate of the parameter
• proportion_1_lower - the lower confidence interval estimate of the parameter
• loglik - Log Likelihood (as used in Minitab and Reliasoft)
• loglik2 - LogLikelihood*-2 (as used in JMP Pro)
• AICc - Akaike Information Criterion
• BIC - Bayesian Information Criterion
• results - a dataframe of the results (point estimate, standard error, Lower CI and Upper CI for each parameter)
In this first example, we will create some data using two Weibull distributions and then combine the data using
np.hstack. We will then fit the Weibull mixture model to the combined data and will print the results and show the
plot. As the input data is made up of 40% from the first group, we expect the proportion to be around 0.4.

from reliability.Fitters import Fit_Weibull_Mixture
from reliability.Distributions import Weibull_Distribution
from reliability.Other_functions import histogram
import numpy as np
import matplotlib.pyplot as plt

# create some failures from two distributions
group_1 = Weibull_Distribution(alpha=10, beta=3).random_samples(40, seed=2)
group_2 = Weibull_Distribution(alpha=40, beta=4).random_samples(60, seed=2)
all_data = np.hstack([group_1, group_2])  # combine the data
results = Fit_Weibull_Mixture(failures=all_data)  # fit the mixture model

# this section is to visualise the histogram with PDF and CDF
# it is not part of the default output from the Fitter
plt.figure(figsize=(9, 5))
plt.subplot(121)
histogram(all_data)
results.distribution.PDF(xmin=0, xmax=60)
plt.subplot(122)
histogram(all_data, cumulative=True)
results.distribution.CDF(xmin=0, xmax=60)
plt.show()

'''
Results from Fit_Weibull_Mixture (95% CI):
              Point Estimate  Standard Error   Lower CI   Upper CI
Parameter
Alpha 1             8.654923        0.394078   7.916006   9.462815
Beta 1              3.910594        0.509724   3.028959   5.048845
Alpha 2            38.097040        1.411773  35.428112  40.967028
Beta 2              3.818227        0.421366   3.075574   4.740207
Proportion 1        0.388206        0.050264   0.295325   0.489987
Log-Likelihood: -375.9906311550037
'''


In this second example, we will compare how well the Weibull Mixture performs vs a single Weibull_2P. Firstly, we
generate some data from two Weibull distributions, combine the data, and right censor it above our chosen threshold.
Next, we will fit the Mixture and Weibull_2P distributions. Then we will visualise the histogram and PDF of the fitted
mixture model and Weibull_2P distributions. The goodness of fit measure is used to check whether the mixture model
is really a much better fit than a single Weibull_2P distribution (which it is due to the lower BIC).

from reliability.Fitters import Fit_Weibull_Mixture, Fit_Weibull_2P
from reliability.Distributions import Weibull_Distribution
from reliability.Other_functions import histogram, make_right_censored_data
import numpy as np
import matplotlib.pyplot as plt

# create some failures and right censored data
group_1 = Weibull_Distribution(alpha=10, beta=2).random_samples(700, seed=2)
group_2 = Weibull_Distribution(alpha=30, beta=3).random_samples(300, seed=2)
all_data = np.hstack([group_1, group_2])
data = make_right_censored_data(all_data, threshold=30)

# fit the Weibull Mixture and Weibull_2P
mixture = Fit_Weibull_Mixture(failures=data.failures, right_censored=data.right_censored, show_probability_plot=False, print_results=False)
single = Fit_Weibull_2P(failures=data.failures, right_censored=data.right_censored, show_probability_plot=False, print_results=False)
print('Weibull_Mixture BIC:', mixture.BIC, '\nWeibull_2P BIC:', single.BIC)  # print the goodness of fit measure

# plot the Mixture and Weibull_2P
histogram(all_data, white_above=30)
xvals = np.linspace(0, 60, 1000)
mixture.distribution.PDF(label='Weibull Mixture', xvals=xvals)
single.distribution.PDF(label='Weibull_2P', xvals=xvals)
plt.title('Comparison of Weibull_2P with Weibull Mixture')
plt.legend()
plt.show()

'''
Weibull_Mixture BIC: 6432.417425636481
Weibull_2P BIC: 6511.51175959736
'''



CHAPTER 9

Competing risks models

9.1 What are competing risks models?

Competing risks models are a combination of two or more distributions that represent failure modes which are
"competing" to end the life of the system being modelled. This model is similar to a mixture model in the sense that
it uses multiple distributions to create a new model that has a shape with more flexibility than a single distribution.
However, unlike a mixture model, we are not adding proportions of the PDF or CDF, but are instead multiplying the
survival functions. The formula for the competing risks model is typically written in terms of the survival function
(SF). Since we may consider the system's reliability to depend on the reliability of all the parts of the system (each
with its own failure modes), the equation is written as if the system was in series, using the product of the survival
functions for each failure mode. For a competing risks model with 2 distributions, the equations are shown below:
$$SF_{Competing\,Risks} = SF_1 \times SF_2$$
$$CDF_{Competing\,Risks} = 1 - SF_{Competing\,Risks}$$
Since $SF = \exp(-CHF)$ we may equivalently write the competing risks model in terms of the hazard or cumulative
hazard function as:
$$HF_{Competing\,Risks} = HF_1 + HF_2$$
$$CHF_{Competing\,Risks} = CHF_1 + CHF_2$$
$$PDF_{Competing\,Risks} = HF_{Competing\,Risks} \times SF_{Competing\,Risks}$$
Another option to obtain the PDF is to find the derivative of the CDF. This is easiest to do numerically since the
formula for the SF of the competing risks model can become quite complex as more risks are added. Note that using
the PDF = HF x SF method requires the conversion of nan to 0 in the PDF for high xvals. This is because the HF
of the component distributions is obtained using PDF/SF and for the region where the SF and PDF of the component
distributions is 0 the resulting HF will be nan.
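A minimal sketch of this numerical approach is shown below (plain NumPy with two hypothetical Weibull failure modes, not the library's internal code).

import numpy as np

xvals = np.linspace(1e-5, 120, 1000)
sf1 = np.exp(-(xvals / 50) ** 2.0)   # SF of failure mode 1 (hypothetical Weibull)
sf2 = np.exp(-(xvals / 40) ** 10.0)  # SF of failure mode 2 (hypothetical Weibull)

sf_cr = sf1 * sf2                    # the SF of the competing risks model is the product of the component SFs
cdf_cr = 1 - sf_cr
pdf_cr = np.gradient(cdf_cr, xvals)  # PDF as the numerical derivative of the CDF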
The image below illustrates the difference between the competing risks model and the mixture model, each of which is
made up of the same two component distributions. Note that the PDF of the competing risks model is always equal to
or to the left of the component distributions, and the CDF is equal to or higher than the component distributions. This
shows how a failure mode that occurs earlier in time can end the lives of units under observation before the second
failure mode has the chance to. This behaviour is characteristic of real systems which experience multiple failure
modes, each of which could cause system failure.

Competing risks models are useful when there is more than one failure mode that is generating the failure data. This
can be recognised by the shape of the PDF and CDF being outside of what any single distribution can accurately
model. On a probability plot, a combination of failure modes can be identified by bends in the data that you might
otherwise expect to be linear. An example of this is shown in the image below. You should not use a competing risks
model just because it fits your data better than a single distribution, but you should use a competing risks model if you
suspect that there are multiple failure modes contributing to the failure data you are observing. To judge whether a
competing risks model is justified, look at the goodness of fit criterion (AICc or BIC) which penalises the score based
on the number of parameters in the model. The lower the goodness of fit criterion, the better the fit.
See also mixture models for another method of combining distributions using the sum of the CDF rather than the
product of the SF.


9.2 Creating a competing risks model

Within reliability.Distributions is the Competing_Risks_Model. This class accepts an array or list of
distribution objects created using the reliability.Distributions module (available distributions are Exponential,
Weibull, Normal, Lognormal, Gamma, Beta). There is no limit to the number of components you can add to the model,
but it is generally preferable to use as few as are required to fit the data appropriately (typically 2 or 3). Unlike the
mixture model, you do not need to specify any proportions.
As this process is multiplicative for the survival function (or additive for the hazard function), and may accept many
distributions of different types, the mathematical formulation quickly gets complex. For this reason, the algorithm
combines the models numerically rather than analytically, so there are no simple formulas for many of the descriptive
statistics (mean, median, etc.). Also, the accuracy of the model is dependent on xvals. If the xvals array is small (<100
values) then the answer will be "blocky" and inaccurate. The variable xvals is only accepted for PDF, CDF, SF, HF,
and CHF. The other methods (like random samples) use the default xvals for maximum accuracy. The default number
of values generated when xvals is not given is 1000. Consider this carefully when specifying xvals in order to avoid
inaccuracies in the results.
The API is similar to the other probability distributions (Weibull, Normal, etc.) and has the following inputs and
methods:
Inputs:
• distributions - a list or array of probability distributions used to construct the model
Methods:


• name - ‘Competing risks’


• name2 - ‘Competing risks using 3 distributions’
• mean
• median
• mode
• variance
• standard_deviation
• skewness
• kurtosis
• excess_kurtosis
• b5 - The time where 5% have failed. Same as quantile(0.05)
• b95 - The time where 95% have failed. Same as quantile(0.95)
• plot() - plots all functions (PDF,CDF,SF,HF,CHF)
• PDF() - plots the probability density function
• CDF() - plots the cumulative distribution function
• SF() - plots the survival function (also known as reliability function)
• HF() - plots the hazard function
• CHF() - plots the cumulative hazard function
• quantile() - Calculates the quantile (time until a fraction has failed) for a given fraction failing. Also known as
b life where b5 is the time at which 5% have failed.
• inverse_SF() - the inverse of the Survival Function. This is useful when producing QQ plots.
• mean_residual_life() - Average residual lifetime of an item given that the item has survived up to a given time.
Effectively the mean of the remaining amount (right side) of a distribution at a given time.
• stats() - prints all the descriptive statistics. Same as the statistics shown using .plot() but printed to console.
• random_samples() - draws random samples from the distribution to which it is applied.
The following example shows how the Competing_Risks_Model object can be created, visualised and used.

from reliability.Distributions import Lognormal_Distribution, Gamma_Distribution, Weibull_Distribution, Competing_Risks_Model
import matplotlib.pyplot as plt

# create the competing risks model
d1 = Lognormal_Distribution(mu=4, sigma=0.1)
d2 = Weibull_Distribution(alpha=50, beta=2)
d3 = Gamma_Distribution(alpha=30, beta=1.5)
CR_model = Competing_Risks_Model(distributions=[d1, d2, d3])

# plot the 5 functions using the plot() function
CR_model.plot(xmin=0, xmax=100)

# plot the PDF and CDF
plot_components = True  # this plots the component distributions. Default is False
plt.figure(figsize=(9, 5))
plt.subplot(121)
CR_model.PDF(plot_components=plot_components, color='red', linestyle='--', xmin=0, xmax=130)
plt.subplot(122)
CR_model.CDF(plot_components=plot_components, color='red', linestyle='--', xmin=0, xmax=130)
plt.subplots_adjust(left=0.1, right=0.95)
plt.show()

# extract the mean of the distribution
print('The mean of the distribution is:', CR_model.mean)

'''
The mean of the distribution is: 27.04449126275214
'''


9.3 Fitting a competing risks model

Within reliability.Fitters is Fit_Weibull_CR. This function will fit a Weibull competing risks model
consisting of 2 x Weibull_2P distributions (this does not fit the gamma parameter). Just as with all of the other
distributions in reliability.Fitters, right censoring is supported.
Whilst some failure modes may not be fitted as well by a Weibull distribution as they may be by another distribution,
it is unlikely that a competing risks model of data from two distributions (particularly if they are overlapping) will be
fitted noticeably better by other types of competing risks models than would be achieved by a Weibull competing risks
model. For this reason, other types of competing risks models are not implemented.
Inputs:
• failures - an array or list of the failure data. There must be at least 4 failures, but it is highly recommended to
use another model if you have fewer than 20 failures.
• right_censored - an array or list of right censored data
• print_results - True/False. This will print results to console. Default is True.
• CI - confidence interval for estimating confidence limits on parameters. Must be between 0 and 1. Default is
0.95 for 95% CI.
• show_probability_plot - True/False. This will show the probability plot with the fitted Weibull_CR CDF. Default
is True.
Outputs:
• alpha_1 - the fitted Weibull_2P alpha parameter for the first distribution
• beta_1 - the fitted Weibull_2P beta parameter for the first distribution
• alpha_2 - the fitted Weibull_2P alpha parameter for the second distribution
• beta_2 - the fitted Weibull_2P beta parameter for the second distribution


• alpha_1_SE - the standard error on the parameter


• beta_1_SE - the standard error on the parameter
• alpha_2_SE - the standard error on the parameter
• beta_2_SE - the standard error on the parameter
• alpha_1_upper - the upper confidence interval estimate of the parameter
• alpha_1_lower - the lower confidence interval estimate of the parameter
• beta_1_upper - the upper confidence interval estimate of the parameter
• beta_1_lower - the lower confidence interval estimate of the parameter
• alpha_2_upper - the upper confidence interval estimate of the parameter
• alpha_2_lower - the lower confidence interval estimate of the parameter
• beta_2_upper - the upper confidence interval estimate of the parameter
• beta_2_lower - the lower confidence interval estimate of the parameter
• loglik - Log Likelihood (as used in Minitab and Reliasoft)
• loglik2 - LogLikelihood*-2 (as used in JMP Pro)
• AICc - Akaike Information Criterion
• BIC - Bayesian Information Criterion
• results - a dataframe of the results (point estimate, standard error, Lower CI and Upper CI for each parameter)
In this first example, we will create some data using a competing risks model from two Weibull distributions. We will
then fit the Weibull competing risks model to the data and will print the results and show the plot.

from reliability.Distributions import Weibull_Distribution, Competing_Risks_Model
from reliability.Fitters import Fit_Weibull_CR
from reliability.Other_functions import histogram
import matplotlib.pyplot as plt

# create some data that requires a competing risks model
d1 = Weibull_Distribution(alpha=50, beta=2)
d2 = Weibull_Distribution(alpha=40, beta=10)
CR_model = Competing_Risks_Model(distributions=[d1, d2])
data = CR_model.random_samples(100, seed=2)

# fit the Weibull competing risks model
results = Fit_Weibull_CR(failures=data)

# this section is to visualise the histogram with PDF and CDF
# it is not part of the default output from the Fitter
plt.figure(figsize=(9, 5))
plt.subplot(121)
histogram(data)
results.distribution.PDF(xmin=0, xmax=60)
plt.subplot(122)
histogram(data, cumulative=True)
results.distribution.CDF(xmin=0, xmax=60)
plt.show()

'''
Results from Fit_Weibull_CR (95% CI):
           Point Estimate  Standard Error   Lower CI   Upper CI
Parameter
Alpha 1         55.185550       14.385243  33.108711  91.983192
Beta 1           1.896577        0.454578   1.185637   3.033816
Alpha 2         38.192099        1.083595  36.126262  40.376067
Beta 2           7.978213        1.181428   5.968403  10.664810
Log-Likelihood: -352.47978488894165
'''


In this second example, we will compare the mixture model to the competing risks model. The data is generated from
a competing risks model so we expect the Weibull competing risks model to be more appropriate than the Mixture
model. Through comparison of the AICc or BIC we can see which model is more appropriate. Since the AICc and
BIC penalise the goodness of fit criterion based on the number of parameters and the mixture model has 5 parameters
compared to the competing risk model’s 4 parameters, we expect the competing risks model to have a lower (closer to
zero) goodness of fit than the Mixture model, and this is what we observe in the results. Notice how the log-likelihood
of the mixture model indicates a better fit (because the value is closer to zero), but this does not take into account the
number of parameters in the model.

from reliability.Distributions import Weibull_Distribution, Competing_Risks_Model
from reliability.Fitters import Fit_Weibull_CR, Fit_Weibull_Mixture
import matplotlib.pyplot as plt
import pandas as pd

# create some data that requires a competing risks model
d1 = Weibull_Distribution(alpha=250, beta=2)
d2 = Weibull_Distribution(alpha=210, beta=10)
CR_model = Competing_Risks_Model(distributions=[d1, d2])
data = CR_model.random_samples(50, seed=2)

CR_fit = Fit_Weibull_CR(failures=data)  # fit the Weibull competing risks model
MM_fit = Fit_Weibull_Mixture(failures=data)  # fit the Weibull mixture model
plt.legend()
plt.show()

# create a dataframe to display the goodness of fit criterion as a table
goodness_of_fit = {'Model': ['Competing Risks', 'Mixture'], 'AICc': [CR_fit.AICc, MM_fit.AICc], 'BIC': [CR_fit.BIC, MM_fit.BIC]}
df = pd.DataFrame(goodness_of_fit, columns=['Model', 'AICc', 'BIC'])
print(df)

'''
Results from Fit_Weibull_CR (95% CI):
           Point Estimate  Standard Error    Lower CI    Upper CI
Parameter
Alpha 1        229.232894       50.606249  148.717724  353.338649
Beta 1           2.509209        0.747318    1.399663    4.498318
Alpha 2        199.827204        8.581465  183.696232  217.374691
Beta 2           9.220198        2.205702    5.769147   14.735637
Log-Likelihood: -255.44381150244595

Results from Fit_Weibull_Mixture (95% CI):
              Point Estimate  Standard Error    Lower CI    Upper CI
Parameter
Alpha 1           103.714797       14.256788   79.219721  135.783855
Beta 1              4.029494        1.329587    2.110497    7.693363
Alpha 2           190.242625        5.107832  180.490306  200.521884
Beta 2              7.851357        1.365270    5.583796   11.039768
Proportion 1        0.226028        0.084489    0.101793    0.429403
Log-Likelihood: -254.50393768335337

             Model        AICc         BIC
0  Competing Risks  519.776512  526.535715
1          Mixture  520.371512  528.567990
'''



CHAPTER 10

Probability plots

Probability plots are a general term for several different plotting techniques. One of these techniques is a graphical
method for comparing two data sets and includes probability-probability (PP) plots and quantile-quantile (QQ) plots.
The second plotting technique is used for assessing the goodness of fit of a distribution by plotting the empirical CDF
of the failures against their failure time and scaling the axes in such a way that the distribution appears linear. This
method allows the reliability analyst to fit the distribution parameters using a simple "least squares" fitting method for
a straight line and was popular before computers were capable of calculating the MLE estimates of the parameters.
While we do not typically favour the use of least squares as a fitting method, we can still use probability plots to assess
the goodness of fit. The module reliability.Probability_plotting contains functions for each of the six
distributions supported in reliability. These functions are:
• Weibull_probability_plot
• Normal_probability_plot
• Lognormal_probability_plot
• Gamma_probability_plot
• Beta_probability_plot
• Exponential_probability_plot
• Exponential_probability_plot_Weibull_Scale
There is also a function to obtain the plotting positions as well as the functions for custom axes scaling. These are
explained more in the help file and will not be discussed further here.
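To illustrate the axis scaling idea described above, the Weibull CDF, $F(t) = 1 - \exp(-(t/\alpha)^\beta)$, rearranges to $\ln(-\ln(1-F)) = \beta \ln(t) - \beta \ln(\alpha)$, which is linear in $\ln(t)$ with slope $\beta$. A minimal sketch with hypothetical parameters:

import numpy as np
import matplotlib.pyplot as plt

alpha, beta = 50, 2.0  # hypothetical Weibull parameters
t = np.linspace(1, 200, 500)
F = 1 - np.exp(-(t / alpha) ** beta)         # Weibull CDF
plt.plot(np.log(t), np.log(-np.log(1 - F)))  # a straight line with slope beta
plt.xlabel('ln(t)')
plt.ylabel('ln(-ln(1-F))')
plt.show()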
Within each of the above probability plotting functions you may enter failure data as well as right censored data.
For those distributions that have a function in reliability.Fitters for fitting location shifted distributions
(Weibull_3P, Gamma_3P, Lognormal_3P, Exponential_2P), you can explicitly tell the probability plotting function to
fit the gamma parameter using fit_gamma=True. By default the gamma parameter is not fitted. Fitting the gamma
parameter will also change the x-axis to time-gamma such that everything will appear linear. An example of this is
shown in the second example below.
Note that the Beta and Gamma probability plots have their y-axes scaled based on the distribution’s parameters so you
will find that when you overlay two gamma or two beta distributions on the same gamma or beta probability paper,
one will be a curved line if they have different shape parameters. This is unavoidable due to the nature of gamma and
beta probability paper and is the reason why you will never find a hardcopy of such paper and also the reason why
these distributions are not used in ALT probability plotting.
Inputs:
• failures - the array or list of failure times
• right_censored - the array or list of right censored failure times
• fit_gamma - this is only included for Weibull, Gamma, Lognormal, and Exponential probability plots. Specify
fit_gamma=True to fit the location shifted distribution.
• h1 and h2 - these are the heuristic constants for plotting positions of the form (k-h1)/(n+h2). Default is
h1=0.3, h2=0.4 which is the median rank method (same as in Minitab). You can specify any heuristic that
follows this form (a short sketch of this calculation is shown after the inputs and outputs below). For more
heuristics, see Wikipedia.
• show_fitted_distribution - True/False. If True, the fitted distribution will be plotted on the probability plot.
Defaults to True. If you want a probability plot with just the data points and no line for the distribution then set
this to False.
• plotting keywords are also accepted where relevant and they are mostly applied to the fitted distribution line.
The exception to this is for color which defaults to red line and black points but if specified the chosen color will
be applied to both line and points. This is useful when overlaying multiple datasets on a single probability plot.
• keyword arguments for the plot are accepted (eg. color, linestyle, marker)
Outputs:
• The plot is the only output. Use plt.show() to show it.
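As a minimal sketch of the default plotting position heuristic (the median rank method), computed with plain NumPy for a hypothetical set of failure times:

import numpy as np

failures = np.sort(np.array([25.0, 43.0, 53.0, 65.0, 76.0, 86.0, 95.0, 115.0]))  # hypothetical failure times
n = len(failures)
k = np.arange(1, n + 1)  # failure order number
h1, h2 = 0.3, 0.4        # the default median rank heuristic
F = (k - h1) / (n + h2)  # fraction failing plotted against each failure time
print(np.column_stack([failures, F]))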
In the example below we generate some samples from a Normal Distribution and provide these to the probability
plotting function. It is also possible to overlay other plots of the CDF as is shown by the dashed line.

from reliability.Distributions import Normal_Distribution
from reliability.Probability_plotting import Normal_probability_plot
import matplotlib.pyplot as plt

dist = Normal_Distribution(mu=50, sigma=10)
dist.CDF(linestyle='--', label='True CDF')  # this is the actual distribution provided for comparison
failures = dist.random_samples(100)
Normal_probability_plot(failures=failures)  # generates the probability plot
plt.show()


In this second example, we will fit an Exponential distribution to some right censored data. To create this data, we
will draw it from an Exponential distribution that has a location shift of 12. Once again, the true CDF has also been
plotted to provide the comparison. Note that the x-axis is time-gamma as it is necessary to subtract gamma from the
x-plotting positions if we want the plot to appear linear.

from reliability.Distributions import Exponential_Distribution
from reliability.Probability_plotting import Exponential_probability_plot
from reliability.Other_functions import make_right_censored_data
import matplotlib.pyplot as plt

dist = Exponential_Distribution(Lambda=0.25, gamma=12)
raw_data = dist.random_samples(100, seed=42)  # draw some random data from an exponential distribution
data = make_right_censored_data(raw_data, threshold=17)  # right censor the data at 17
Exponential_Distribution(Lambda=0.25).CDF(linestyle='--', label='True CDF')  # we can't plot dist because it will be location shifted
Exponential_probability_plot(failures=data.failures, right_censored=data.right_censored, fit_gamma=True)  # do the probability plot. Note that we have specified to fit gamma
plt.show()


Note: The confidence intervals appear on the Exponential plot above but not in the Normal probability plot in the
first example. This is because the confidence intervals are only available for the Exponential (1P and 2P) and Weibull
(2P and 3P) fitters. This library is in active development and over the next few months the confidence intervals will be
added to the Normal and Lognormal Fitters followed by the Gamma and Beta Fitters.

In this third example, we will see how probability plotting can be used to highlight the importance of getting as much
data as possible. This code performs a loop in which increasing numbers of samples are used for fitting a Weibull
distribution and the accuracy of the results (shown both in the legend and by comparison with the True CDF) increases
with the number of samples.

from reliability.Distributions import Weibull_Distribution
from reliability.Probability_plotting import Weibull_probability_plot
import matplotlib.pyplot as plt

dist = Weibull_Distribution(alpha=250, beta=3)
for i, x in enumerate([10, 100, 1000]):
    plt.subplot(131 + i)
    dist.CDF(linestyle='--', label='True CDF')
    failures = dist.random_samples(x, seed=42)  # take 10, 100, 1000 samples
    Weibull_probability_plot(failures=failures)  # this is the probability plot
    plt.title(str(x) + ' samples')
plt.gcf().set_size_inches(15, 7)  # adjust the figure size after creation. Necessary to do it after as it is automatically adjusted within probability_plot
plt.subplots_adjust(left=0.08, right=0.98, top=0.92, wspace=0.35)  # formatting for the figure layout
plt.show()

In this fourth example, we will take a look at the special case of the Exponential probability plot using the Weibull
Scale. This plot is essentially a Weibull probability plot, but the fitting and plotting functions are Exponential. The
reason for plotting an Exponential distribution on Weibull probability paper is to achieve parallel lines for different
Lambda parameters rather than having the lines radiating from the origin as we see in the Exponential probability plot
on Exponential probability paper. This has applications in ALT probability plotting. An example of the differences
between the plots is shown below. Remember that the alpha parameter from the Weibull distribution is equivalent
to 1/Lambda from the Exponential distribution and a Weibull distribution with Beta = 1 is the same as an exponential
distribution.

from reliability.Distributions import Exponential_Distribution
from reliability.Probability_plotting import Exponential_probability_plot, Weibull_probability_plot, Exponential_probability_plot_Weibull_Scale
import matplotlib.pyplot as plt

data1 = Exponential_Distribution(Lambda=1 / 10, gamma=5).random_samples(50, seed=42)  # should give Lambda = 0.1 OR Weibull alpha = 10
data2 = Exponential_Distribution(Lambda=1 / 100, gamma=5).random_samples(50, seed=42)  # should give Lambda = 0.01 OR Weibull alpha = 100
plt.subplot(131)
Exponential_probability_plot(failures=data1, fit_gamma=True)
Exponential_probability_plot(failures=data2, fit_gamma=True)
plt.subplot(132)
Weibull_probability_plot(failures=data1, fit_gamma=True)
Weibull_probability_plot(failures=data2, fit_gamma=True)
plt.subplot(133)
Exponential_probability_plot_Weibull_Scale(failures=data1, fit_gamma=True)
Exponential_probability_plot_Weibull_Scale(failures=data2, fit_gamma=True)
plt.gcf().set_size_inches(15, 7)
plt.subplots_adjust(left=0.08, right=0.97, top=0.91, wspace=0.30)  # format the plot
plt.show()


In this final example, we take a look at how a probability plot can show us that there's something wrong with our
assumption of a single distribution. To generate the data, the random samples are drawn from two different distributions
which are shown in the left image. In the right image, the scatterplot of failure times is clearly non-linear. The red line
is the attempt to fit a single Weibull_2P distribution and this will do a poor job of modelling the data. Also note that
the points of the scatterplot do not fall on the True CDF of each distribution. This is because the median rank method
of obtaining the plotting positions does not work well if the failure times come from more than one distribution. If you
see a pattern like this, try a mixture model. Always remember that cusps, corners, and doglegs indicate a mixture of
failure modes.

from reliability.Probability_plotting import Weibull_probability_plot
from reliability.Distributions import Weibull_Distribution
import matplotlib.pyplot as plt
import numpy as np

dist_1 = Weibull_Distribution(alpha=200, beta=3)
dist_2 = Weibull_Distribution(alpha=900, beta=4)
plt.subplot(121)  # this is for the PDFs of the 2 individual distributions
dist_1.PDF(label=dist_1.param_title_long)
dist_2.PDF(label=dist_2.param_title_long)
plt.legend()
plt.title('PDF of two different distributions\nthat are contributing the failure data')

plt.subplot(122)  # this will be the probability plot
dist_1_data = dist_1.random_samples(50, seed=1)
dist_2_data = dist_2.random_samples(50, seed=1)
all_data = np.hstack([dist_1_data, dist_2_data])  # combine the failure data into one array
dist_1.CDF(label=dist_1.param_title_long)  # plot each individual distribution for comparison
dist_2.CDF(label=dist_2.param_title_long)
Weibull_probability_plot(failures=all_data)  # do the probability plot
plt.gcf().set_size_inches(13, 7)  # adjust the figure size after creation. Necessary to do it after as it is automatically adjusted within probability_plot
plt.subplots_adjust(left=0.08, right=0.96)  # formatting the layout
plt.legend(loc='lower right')
plt.show()


10.1 What does a probability plot show me?

A probability plot shows how well your data is modelled by a particular distribution. By scaling the axes in such a way
that the fitted distribution’s CDF appears to be a straight line, we can judge whether the empirical CDF of the failure
data (the black dots) are in agreement with the CDF of the fitted distribution. Ideally we would see that all of the black
dots would lie on the straight line but most of the time this is not the case. A bad fit is evident when the line or curve
formed by the black dots is deviating significantly from the straight line. We can usually tolerate a little bit of deviation
at the tails of the distribution but the majority of the black dots should follow the line. A historically popular test was
the ‘fat pencil test’ which suggested that if a fat pencil could cover the majority of the data points then the fit was
probably suitable. Such a method makes no mention of the size of the plot window which could easily affect the result
so it is best to use your own judgement and experience. This approach is not a substitute for statistical inference so it
is often preferred to use quantitative measures for goodness of fit such as AICc and BIC. Despite being an imprecise
measure, probability plots remain popular among reliability engineers and in reliability engineering software.

from reliability.Probability_plotting import Weibull_probability_plot, Exponential_probability_plot
from reliability.Distributions import Weibull_Distribution
import matplotlib.pyplot as plt

data = Weibull_Distribution(alpha=5, beta=3).random_samples(100)
plt.subplot(121)
Weibull_probability_plot(failures=data)
plt.title('Example of a good fit')
plt.subplot(122)
Exponential_probability_plot(failures=data)
plt.title('Example of a bad fit')
plt.subplots_adjust(bottom=0.1, right=0.94, top=0.93, wspace=0.34)  # adjust the formatting
plt.show()



CHAPTER 11

Quantile-Quantile plots

This section contains two different styles of quantile-quantile plots. These are the fully parametric quantile-quantile
plot (reliability.Probability_plotting.QQ_plot_parametric) and the semi-parametric quantile-quantile
plot (reliability.Probability_plotting.QQ_plot_semiparametric). These will be described separately
below. A quantile-quantile (QQ) plot is made by plotting failure units vs failure units for shared quantiles. A quantile
is the value of the random variable (time, cycles, landings, etc.) which corresponds to a given fraction failing (that
ranges from 0 to 1).

11.1 Parametric Quantile-Quantile plot

To generate this plot we calculate the failure units (these may be units of time, strength, cycles, landings, rounds fired,
etc.) at which a certain fraction has failed (0.01, 0.02, 0.03 . . . 0.99). We do this for each distribution so we have an
array of failure units for each, and then we plot these failure units against each other. The time (or any other failure
unit) at which a given fraction has failed is found using the inverse survival function. If the distributions are similar in
shape, then the QQ plot should be a reasonably straight line (but not necessarily a 45 degree line). By plotting the
failure times at equal quantiles for each distribution we can obtain a conversion between the two distributions. Such
conversions are useful for accelerated life testing (ALT) to easily convert field time to test time.
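A minimal sketch of this underlying calculation is shown below, using the inverse_SF() method of the distribution objects (the fraction failing is 1-SF, so inverse_SF is evaluated at 1 minus each quantile; it is assumed here that inverse_SF accepts an array).

import numpy as np
from reliability.Distributions import Weibull_Distribution

quantiles = np.arange(0.01, 1, 0.01)  # fractions failing: 0.01, 0.02, ..., 0.99
dist_X = Weibull_Distribution(alpha=128, beta=2.11)
dist_Y = Weibull_Distribution(alpha=350, beta=2.01)
failure_units_X = dist_X.inverse_SF(1 - quantiles)  # failure units at each fraction failing
failure_units_Y = dist_Y.inverse_SF(1 - quantiles)
# plotting failure_units_Y against failure_units_X produces the parametric QQ plot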
Inputs:
• X_dist - a probability distribution. The failure times at given quantiles from this distribution will be plotted
along the X-axis.
• Y_dist - a probability distribution. The failure times at given quantiles from this distribution will be plotted
along the Y-axis.
• show_fitted_lines - True/False. Default is True. These are the Y=mX and Y=mX+c lines of best fit.
• show_diagonal_line - True/False. Default is False. If True the diagonal line will be shown on the plot.
Outputs:
• The QQ_plot will always be output. Use plt.show() to show it.
• [m,m1,c1] - these are the values for the lines of best fit. m is used in Y=mX, and m1 and c1 are used in
Y=m1X+c1


In the example below, we have determined that the field failures follow a Weibull distribution (𝛼=350, 𝛽=2.01) with
time represented in months. By using an accelerated life test we have replicated the failure mode and Weibull shape
parameter reasonably closely and the Lab failures follow a Weibull distribution (𝛼=128, 𝛽=2.11) with time measured
in hours. We would like to obtain a simple Field-to-Lab conversion for time so we know how much lab time is required
to simulate 10 years of field time. The QQ plot will automatically provide the equations for the lines of best fit. If we
use the Y=mX equation we see that Field(months)=2.757×Lab(hours). Therefore, to simulate 10 years of field time
(120 months) we need to run the accelerated life test for approximately 43.5 hours in the Lab.

from reliability.Probability_plotting import QQ_plot_parametric
from reliability.Distributions import Weibull_Distribution
import matplotlib.pyplot as plt

Field = Weibull_Distribution(alpha=350, beta=2.01)
Lab = Weibull_Distribution(alpha=128, beta=2.11)
QQ_plot_parametric(X_dist=Lab, Y_dist=Field)
plt.show()
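The Field-to-Lab conversion described above can also be computed directly from the returned values. A short sketch, assuming the function returns [m, m1, c1] as listed in the Outputs above:

from reliability.Probability_plotting import QQ_plot_parametric
from reliability.Distributions import Weibull_Distribution
import matplotlib.pyplot as plt

Field = Weibull_Distribution(alpha=350, beta=2.01)
Lab = Weibull_Distribution(alpha=128, beta=2.11)
m, m1, c1 = QQ_plot_parametric(X_dist=Lab, Y_dist=Field)  # assumed unpacking of the returned [m, m1, c1]
print('Lab hours to simulate 120 months of field time:', 120 / m)  # approximately 43.5 using Y=mX
plt.show()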

11.2 Semiparametric Quantile-Quantile plot

This plot is still a Quantile-Quantile plot (plotting failure units vs failure units for shared quantiles), but instead of
using two parametric distributions, we use the failure data directly as one set of quantiles. We then estimate what the
quantiles of the parametric distribution would be and plot the parametric (theoretical) failure units against the actual
failure units. To generate this plot we begin with the failure units (these may be units of time, strength, cycles, landings,
etc.). We then obtain an empirical CDF using either Kaplan-Meier or Nelson-Aalen. The empirical CDF gives us the
quantiles we will use to equate the actual and theoretical failure times. Once we have the empirical CDF, we use the
inverse survival function of the specified distribution to obtain the theoretical failure units and then plot the actual and
theoretical failure units together. The primary purpose of this plot is as a graphical goodness of fit test. If the specified
distribution is a good fit to the data then the QQ plot should be a reasonably straight line along the diagonal.
Inputs:
• X_data_failures - the failure times in an array or list. These will be plotted along the X-axis.
• X_data_right_censored - the right censored failure times in an array or list. Optional input.
• Y_dist - a probability distribution. The quantiles of this distribution will be plotted along the Y-axis.
• method - ‘KM’ or ‘NA’ for Kaplan-Meier or Nelson-Aalen. Default is ‘KM’
• show_fitted_lines - True/False. Default is True. These are the Y=mX and Y=mX+c lines of best fit.
• show_diagonal_line - True/False. Default is False. If True the diagonal line will be shown on the plot.
Outputs:
• The QQ_plot will always be output. Use plt.show() to show it.
• [m,m1,c1] - these are the values for the lines of best fit. m is used in Y=mX, and m1 and c1 are used in
Y=m1X+c1
In the example below, we generate 100 random samples from a Normal distribution. We then fit a Weibull_2P
distribution to this data and using QQ_plot_semiparametric we compare the actual quantiles (the original data) with the
theoretical quantiles (from the fitted distribution). The lines of best fit are automatically provided and the Y=0.992X
shows the relationship is very close to perfect with only some deviation around the tails of the distribution. The final
example on this page compares a QQ_plot_semiparametric with a PP_plot_semiparametric for the same dataset to
show the differences between the two.

from reliability.Probability_plotting import QQ_plot_semiparametric
from reliability.Fitters import Fit_Weibull_2P
from reliability.Distributions import Normal_Distribution
import matplotlib.pyplot as plt

data = Normal_Distribution(mu=50, sigma=12).random_samples(100)
fitted_dist = Fit_Weibull_2P(failures=data, print_results=False, show_probability_plot=False).distribution
QQ_plot_semiparametric(X_data_failures=data, Y_dist=fitted_dist)
plt.show()


11.3 Comparing PP plots with QQ plots

In this example we compare a QQ_plot_parametric with a PP_plot_parametric for the same pair of distributions.
Normally, it is not practical to compare the output of the two plots as they are so vastly different and are used for
different purposes, but the comparison is provided for the reader’s understanding. The differences between these plots
are so significant because one is the time at which the fraction has failed (the Quantile) and the other is the fraction
failing (the CDF). Parametric PP plots are not very common as their only use is in providing a graphical understanding
of the differences between the CDFs of two distributions, such as how one lags or leads the other at various times. See
Probability-Probability plots for more detail on the uses of parametric PP plots.

from reliability.Probability_plotting import QQ_plot_parametric, PP_plot_parametric
from reliability.Distributions import Weibull_Distribution
import matplotlib.pyplot as plt

Field = Weibull_Distribution(alpha=350, beta=2.01)
Lab = Weibull_Distribution(alpha=128, beta=2.11)
plt.figure(figsize=(10, 5))
plt.subplot(121)
QQ_plot_parametric(X_dist=Lab, Y_dist=Field, show_diagonal_line=True, show_fitted_lines=False)
plt.subplot(122)
PP_plot_parametric(X_dist=Lab, Y_dist=Field, show_diagonal_line=True)
plt.show()


In this example we compare a QQ_plot_semiparametric with a PP_plot_semiparametric for the same dataset. Both
plots are intended to be used as graphical goodness of fit tests. In a PP plot we get a lot of resolution in the center
of the distributions, but less at the tails, whereas the QQ plot gives very good resolution at the tails, but less in the
center. Because most data analysts are more concerned about the extremes (tails) of a distribution, QQ plots are the
more commonly used plot between the two.

from reliability.Probability_plotting import PP_plot_semiparametric, QQ_plot_semiparametric
from reliability.Fitters import Fit_Normal_2P
from reliability.Distributions import Weibull_Distribution
import matplotlib.pyplot as plt

data = Weibull_Distribution(alpha=100, beta=3).random_samples(100)  # create some data
dist = Fit_Normal_2P(failures=data, print_results=False, show_probability_plot=False).distribution  # fit a normal distribution
plt.figure(figsize=(10, 5))
plt.subplot(121)
QQ_plot_semiparametric(X_data_failures=data, Y_dist=dist, show_fitted_lines=False, show_diagonal_line=True)
plt.subplot(122)
PP_plot_semiparametric(X_data_failures=data, Y_dist=dist)
plt.show()



CHAPTER 12

Probability-Probability plots

This section contains two different styles of probability-probability plots. These are the fully parametric
probability-probability plot (reliability.Probability_plotting.PP_plot_parametric) and the
semi-parametric probability-probability plot (reliability.Probability_plotting.
PP_plot_semiparametric). These will be described separately below. A probability-probability (PP)
plot is made by plotting the fraction failing (CDF) of one distribution vs the fraction failing (CDF) of another
distribution. In the semiparametric form, when we only have the failure data and one hypothesised distribution, the
CDF for the data can be obtained non-parametrically to generate an Empirical CDF.

12.1 Parametric Probability-Probability plot

To generate this plot we simply plot the CDF of one distribution vs the CDF of another distribution. If the distributions
are very similar, the points will lie on the 45 degree diagonal. Any deviation from this diagonal indicates that one
distribution is leading or lagging the other. Fully parametric PP plots are rarely used as their utility is limited to
providing a graphical comparison of the similarity between two CDFs. To aid this comparison, the PP_plot_parametric
function accepts x and y quantile lines that will be traced across to the other distribution.
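As a minimal sketch of this underlying idea (computed with plain NumPy and SciPy for two hypothetical distributions, not using the library function):

import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt

xvals = np.linspace(0, 250, 500)
cdf_X = norm.cdf(xvals, loc=100, scale=30)  # Normal CDF for the X-axis
cdf_Y = 1 - np.exp(-(xvals / 120) ** 3)     # Weibull CDF for the Y-axis
plt.plot(cdf_X, cdf_Y)                      # the parametric PP plot
plt.plot([0, 1], [0, 1], linestyle='--')    # 45 degree reference diagonal
plt.show()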
Inputs:
• X_dist - a probability distribution. The CDF of this distribution will be plotted along the X-axis.
• Y_dist - a probability distribution. The CDF of this distribution will be plotted along the Y-axis.
• y_quantile_lines - starting points for the trace lines to find the X equivalent of the Y-quantile. Optional input.
Must be list or array.
• x_quantile_lines - starting points for the trace lines to find the Y equivalent of the X-quantile. Optional input.
Must be list or array.
• show_diagonal_line - True/False. Default is False. If True the diagonal line will be shown on the plot.
Outputs:
• The PP_plot is the only output. Use plt.show() to show it.


In the example below, we generate two parametric distributions and compare them using a PP plot. We are interested in
the differences at specific quantiles so these are specified and the plot traces them across to the opposing distribution.

from reliability.Probability_plotting import PP_plot_parametric
from reliability.Distributions import Weibull_Distribution, Normal_Distribution
import matplotlib.pyplot as plt

Field = Normal_Distribution(mu=100, sigma=30)
Lab = Weibull_Distribution(alpha=120, beta=3)
PP_plot_parametric(X_dist=Field, Y_dist=Lab, x_quantile_lines=[0.3, 0.6], y_quantile_lines=[0.1, 0.6])
plt.show()

12.2 Semiparametric Probability-Probability plot

A semiparametric PP plot is still a probability-probability plot, but since we only have one parametric distribution to
give us the CDF, we must use the failure data to obtain the non-parametric estimate of the empirical CDF. To create a
semiparametric PP plot, we must provide the failure data and the non-parametric method (‘KM’ or ‘NA’ for Kaplan-
Meier or Nelson-Aalen) to estimate the empirical CDF, and we must also provide the parametric distribution for the
parametric CDF. The failure units (times, cycles, rounds fired, strength units, etc.) are the limiting values here so the
parametric CDF is only calculated at the failure units since that is the result we get from the empirical CDF. Note that
the empirical CDF also accepts X_data_right_censored just as Kaplan-Meier and Nelson-Aalen will also accept right
censored data. If the fitted distribution is a good fit the PP plot will follow the 45 degree diagonal line. Assessing
goodness of fit in a graphical way is the main purpose of this type of plot. The Fit_everything function also uses a

78 Chapter 12. Probability-Probability plots


reliability Documentation, Release latest

semiparametric PP plot to show the goodness of fit in a graphical way.


Inputs:
• X_data_failures - the failure times in an array or list. The empirical CDF of this data will be plotted along the
X-axis.
• X_data_right_censored - the right censored failure times in an array or list. This is an optional input.
• Y_dist - a probability distribution. The CDF of this distribution will be plotted along the Y-axis.
• method - ‘KM’ or ‘NA’ for Kaplan-Meier or Nelson-Aalen. Default is ‘KM’
• show_diagonal_line - True/False. Default is True. If True the diagonal line will be shown on the plot.
Outputs:
• The PP_plot is the only output. Use plt.show() to show it.
In the example below, we create 100 random samples from a Weibull distribution. We hypothesise that a Normal
distribution may fit this data well so we fit the Normal distribution and then plot the CDF of the fitted distribution
against the empirical CDF (obtained using the Kaplan-Meier estimate). We see that the plot follows the 45 degree
diagonal quite well so we may consider that the fitted Normal distribution is reasonably good at describing this data.
Ideally, this comparison should be made against other distributions as well and the graphical results are often hard to
tell apart which is why we often use quantitative goodness of fit measures like AICc and BIC.

from reliability.Probability_plotting import PP_plot_semiparametric
from reliability.Fitters import Fit_Normal_2P
from reliability.Distributions import Weibull_Distribution
import matplotlib.pyplot as plt

data = Weibull_Distribution(alpha=5, beta=3).random_samples(100)
dist = Fit_Normal_2P(failures=data, show_probability_plot=False, print_results=False).distribution
PP_plot_semiparametric(X_data_failures=data, Y_dist=dist)
plt.show()


To see how semiparametric PP plots compare with semiparametric QQ plots as a graphical goodness of fit test, please
see the second example in the section on comparing PP plots with QQ plots.



CHAPTER 13

Kaplan-Meier estimate of reliability

The Kaplan-Meier estimator provides a method by which to estimate the survival function (reliability function) of a
population without assuming that the data comes from a particular distribution. Due to the lack of parameters required
in this model, it is a non-parametric method of obtaining the survival function. With a few simple transformations,
the survival function (SF) can be used to obtain the cumulative hazard function (CHF) and the cumulative distribution
function (CDF). It is not possible to obtain a useful version of the probability density function (PDF) or hazard function
(HF) as this would require the differentiation of the CDF and CHF respectively, which results in a very spiky plot due
to the non-continuous nature of these plots.
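To make the mechanics concrete, the following is a minimal sketch of the product-limit calculation on some hypothetical failure and censoring times (an illustration of the formula, not the library's internal code). At each failure the survival estimate is multiplied by (1 - 1/n), where n is the number of items still at risk:

# A minimal sketch of the Kaplan-Meier product-limit calculation.
# The times are hypothetical; this is not the library's internal code.
failures = [5, 8, 8, 12]
right_censored = [6, 10]
events = sorted([(t, True) for t in failures] + [(t, False) for t in right_censored],
                key=lambda e: (e[0], not e[1]))  # failures processed first in the event of ties
at_risk = len(events)
S = 1.0
for time, is_failure in events:
    if is_failure:
        S *= 1 - 1 / at_risk  # each failure steps the survival estimate down
        print('t =', time, ' S(t) =', round(S, 4))
    at_risk -= 1  # failures and censorings both leave the risk set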
The Kaplan-Meier estimator is very similar in result (but quite different in method) to the Nelson-Aalen estimator.
While neither has been proven to be more accurate than the other, the Kaplan-Meier estimator is generally more
popular as a non-parametric means of estimating the SF. Confidence intervals are provided using the Greenwood
method with Normal approximation.
The Kaplan-Meier estimator can be used with both complete and right censored data. This function can be accessed
from reliability.Nonparametric.KaplanMeier.
Inputs:
• failures - an array or list of failure times.
• right_censored - an array or list of right censored failure times. Defaults to None.
• show_plot - True/False. Default is True. Plots the SF.
• print_results - True/False. Default is True. Will display a pandas dataframe of results in the console.
• plot_CI - shades the upper and lower confidence interval
• CI - confidence interval between 0 and 1. Default is 0.95 for 95% CI.
• plot_type - SF, CDF, or CHF. Default is SF.
Outputs:
• results - dataframe of results
• KM - list of Kaplan-Meier column from results dataframe. This column is the non parametric estimate of the
Survival Function (reliability function).


• xvals - the x-values to plot the stepwise plot as seen when show_plot=True
• SF - survival function stepwise values (these differ from the KM values as there are extra values added in to
make the plot into a step plot)
• CDF - cumulative distribution function stepwise values
• CHF - cumulative hazard function stepwise values
• SF_lower - survival function stepwise values for lower CI
• SF_upper - survival function stepwise values for upper CI
• CDF_lower - cumulative distribution function stepwise values for lower CI
• CDF_upper - cumulative distribution function stepwise values for upper CI
• CHF_lower - cumulative hazard function stepwise values for lower CI
• CHF_upper - cumulative hazard function stepwise values for upper CI
Other plotting keywords (such as color, label, linestyle, etc.) are accepted and used on the point estimate line. The
color of the confidence intervals is matched automatically to the point estimate line, but no other keywords are carried
across to the confidence interval plot as it is only a shaded region.
In this first example, we will provide Kaplan-Meier with a list of failure times and right censored times. By leav-
ing everything else unspecified, the plot will be shown with the confidence intervals shaded. We will layer this first
Kaplan-Meier plot with a second one using just the failure data. As can be seen in the example below, including
censored data is paramount to obtaining an accurate estimate of the reliability, because without it the population's
survivors are not counted and the reliability will appear much lower than it truly is.
from reliability.Nonparametric import KaplanMeier
import matplotlib.pyplot as plt

f = [5248, 7454, 16890, 17200, 38700, 45000, 49390, 69040, 72280, 131900]
rc = [3961, 4007, 4734, 6054, 7298, 10190, 23060, 27160, 28690, 37100, 40060, 45670,
      53000, 67000, 69630, 77350, 78470, 91680, 105700, 106300, 150400]
KaplanMeier(failures=f, right_censored=rc, label='Failures + right censored', print_results=False)
KaplanMeier(failures=f, label='Failures only')  # this will print results to console
plt.title('Kaplan-Meier estimates showing the\nimportance of including censored data')
plt.xlabel('Miles to failure')
plt.legend()
plt.show()

'''
               Censoring code (censored=0)  Items remaining  Kaplan Meier Estimate  Lower CI bound  Upper CI bound
Failure times
5248.0                                 1.0               10                    0.9        0.714061        1.000000
7454.0                                 1.0                9                    0.8        0.552082        1.000000
16890.0                                1.0                8                    0.7        0.415974        0.984026
17200.0                                1.0                7                    0.6        0.296364        0.903636
38700.0                                1.0                6                    0.5        0.190102        0.809898
45000.0                                1.0                5                    0.4        0.096364        0.703636
49390.0                                1.0                4                    0.3        0.015974        0.584026
69040.0                                1.0                3                    0.2        0.000000        0.447918
72280.0                                1.0                2                    0.1        0.000000        0.285939
131900.0                               1.0                1                    0.0        0.000000        0.000000
'''

In this second example, we will create some data from a Weibull distribution, and then right censor the data above our
chosen threshold. We will then fit a Weibull_2P distribution to the censored data, and also obtain the Kaplan-Meier
estimate of this data. Using the results from the Fit_Weibull_2P and the Kaplan-Meier estimate, we will plot the CDF,
SF, and CHF, for both the Weibull and Kaplan-Meier results. Note that the default plot from KaplanMeier will only
give you the SF, but the results object provides everything you need to reconstruct the SF plot yourself, as well as what
we need to plot the CDF and CHF.

from reliability.Distributions import Weibull_Distribution
from reliability.Fitters import Fit_Weibull_2P
from reliability.Nonparametric import KaplanMeier
from reliability.Other_functions import make_right_censored_data
import matplotlib.pyplot as plt

dist = Weibull_Distribution(alpha=5, beta=2)  # create a distribution
raw_data = dist.random_samples(100, seed=2)  # get some data from the distribution. Seeded for repeatability
data = make_right_censored_data(raw_data, threshold=9)
wbf = Fit_Weibull_2P(failures=data.failures, right_censored=data.right_censored, show_probability_plot=False, print_results=False)  # fit the Weibull_2P

# Create the subplots. In each subplot we plot the parametric distribution and obtain the Kaplan-Meier fit.
# Note that the plot_type is changed each time.
plt.figure(figsize=(12, 5))
plt.subplot(131)
KaplanMeier(failures=data.failures, right_censored=data.right_censored, plot_type='SF', print_results=False, label='Kaplan-Meier')
wbf.distribution.SF(label='Parametric')
plt.legend()
plt.title('SF')
plt.subplot(132)
KaplanMeier(failures=data.failures, right_censored=data.right_censored, plot_type='CDF', print_results=False, label='Kaplan-Meier')
wbf.distribution.CDF(label='Parametric')
plt.legend()
plt.title('CDF')
plt.subplot(133)
KaplanMeier(failures=data.failures, right_censored=data.right_censored, plot_type='CHF', print_results=False, label='Kaplan-Meier')
wbf.distribution.CHF(label='Parametric')
plt.legend()
plt.title('CHF')
plt.subplots_adjust(left=0.07, right=0.95, top=0.92, wspace=0.25)  # format the plot layout
plt.show()



CHAPTER 14

Nelson-Aalen estimate of reliability

The Nelson-Aalen estimator provides a method by which to estimate the hazard function of a population without
assuming that the data comes from a particular distribution. From the hazard function, the Nelson-Aalen method
obtains the cumulative hazard function, which is then used to obtain the survival function. Due to the lack of parameters
required in this model, it is a non-parametric method of obtaining the survival function. As with the Kaplan-Meier
estimator, once we have the survival function (SF), then we also have the cumulative hazard function (CHF) and the
cumulative distribution function (CDF). It is not possible to obtain a useful version of the probability density function
(PDF) or hazard function (HF). While the hazard function is obtained directly by the Nelson-Aalen method, it is of
little use on its own as it produces a very spiky plot due to the non-continuous nature of the hazard. It is only when
we smooth the results using the cumulative hazard function that we obtain some utility from the results.
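As a minimal sketch of the calculation (hypothetical times, not the library's internal code), the cumulative hazard increments by 1/n at each failure, where n is the number of items still at risk, and the survival function is then obtained as SF = exp(-CHF):

# A minimal sketch of the Nelson-Aalen calculation (hypothetical times)
import math

failures = [5, 8, 8, 12]
right_censored = [6, 10]
events = sorted([(t, True) for t in failures] + [(t, False) for t in right_censored],
                key=lambda e: (e[0], not e[1]))  # failures first in the event of ties
at_risk = len(events)
CHF = 0.0
for time, is_failure in events:
    if is_failure:
        CHF += 1 / at_risk  # the hazard increment at each failure
        print('t =', time, ' CHF =', round(CHF, 4), ' SF =', round(math.exp(-CHF), 4))
    at_risk -= 1  # failures and censorings both leave the risk set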
The Nelson-Aalen estimator is very similar in result (but quite different in method) to the Kaplan-Meier estimator.
While neither has been proven to be more accurate than the other, the Kaplan-Meier estimator is generally more
popular as a non-parametric means of estimating the SF. Confidence intervals are provided using the Greenwood
method with Normal approximation.
The Nelson-Aalen estimator can be used with both complete and right censored data. This function can be accessed
from reliability.Nonparametric.NelsonAalen.
Inputs:
• failures - an array or list of failure times.
• right_censored - an array or list of right censored failure times. Defaults to None.
• show_plot - True/False. Default is True. Plots the SF.
• print_results - True/False. Default is True. Will display a pandas dataframe of results in the console.
• plot_CI - shades the upper and lower confidence interval
• CI - confidence interval between 0 and 1. Default is 0.95 for 95% CI.
• plot_type - SF, CDF, or CHF. Default is SF.
Outputs:
• results - dataframe of results


• NA - list of Nelson-Aalen column from results dataframe. This column is the non parametric estimate of the
Survival Function (reliability function).
• xvals - the x-values to plot the stepwise plot as seen when show_plot=True
• SF - survival function stepwise values (these differ from the NA values as there are extra values added in to
make the plot into a step plot)
• CDF - cumulative distribution function stepwise values
• CHF - cumulative hazard function stepwise values
• SF_lower - survival function stepwise values for lower CI
• SF_upper - survival function stepwise values for upper CI
• CDF_lower - cumulative distribution function stepwise values for lower CI
• CDF_upper - cumulative distribution function stepwise values for upper CI
• CHF_lower - cumulative hazard function stepwise values for lower CI
• CHF_upper - cumulative hazard function stepwise values for upper CI
Other plotting keywords (such as color, label, linestyle, etc.) are accepted and used on the point estimate line. The
color of the confidence intervals is matched automatically to the point estimate line, but no other keywords are carried
across to the confidence interval plot as it is only a shaded region.
In the example below, we will compare the results from the Kaplan-Meier estimator with the results from the Nelson-
Aalen estimator. We will also extract the column of point estimates from the results and print these for each method
in a dataframe.
from reliability.Nonparametric import KaplanMeier, NelsonAalen
import matplotlib.pyplot as plt
import pandas as pd

failures = [5248, 7454, 16890, 17200, 38700, 45000, 49390, 69040, 72280, 131900]
censored = [3961, 4007, 4734, 6054, 7298, 10190, 23060, 27160, 28690, 37100, 40060, 45670,
            53000, 67000, 69630, 77350, 78470, 91680, 105700, 106300, 150400]
KM = KaplanMeier(failures=failures, right_censored=censored, label='Kaplan-Meier', print_results=False)
NA = NelsonAalen(failures=failures, right_censored=censored, label='Nelson-Aalen', print_results=False)
plt.title('Comparison of Kaplan-Meier vs Nelson-Aalen\nwith 95% CI bounds')
plt.legend()

# print a table of the SF estimates for each method
data = {'Kaplan-Meier': KM.KM, 'Nelson-Aalen': NA.NA}
df = pd.DataFrame(data, columns=['Kaplan-Meier', 'Nelson-Aalen'])
print(df)
plt.show()

'''
    Kaplan-Meier  Nelson-Aalen
0       1.000000      1.000000
1       1.000000      1.000000
2       1.000000      1.000000
3       0.964286      0.964916
4       0.964286      0.964916
5       0.964286      0.964916
6       0.925714      0.927081
7       0.925714      0.927081
8       0.885466      0.887637
9       0.845217      0.848193
10      0.845217      0.848193
11      0.845217      0.848193
12      0.845217      0.848193
13      0.845217      0.848193
14      0.795499      0.799738
15      0.795499      0.799738
16      0.742465      0.748161
17      0.742465      0.748161
18      0.685353      0.692768
19      0.685353      0.692768
20      0.685353      0.692768
21      0.616817      0.626842
22      0.616817      0.626842
23      0.539715      0.553186
24      0.539715      0.553186
25      0.539715      0.553186
26      0.539715      0.553186
27      0.539715      0.553186
28      0.539715      0.553186
29      0.269858      0.335524
30      0.269858      0.335524
'''


Two further examples are provided in the documentation for the Kaplan-Meier estimator, as the Nelson-Aalen function
is written to work in exactly the same way.



CHAPTER 15

Stress-Strength interference for any distributions

Stress-Strength interference is a model to predict the probability of failure when the probability distributions of the
stress and the strength are known. Failure is defined as when stress > strength. The model calculates the probability
of failure through integration.
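To sketch the idea with scipy (hypothetical distributions; not the library's internal code), the probability that stress exceeds strength can be written as the integral of the strength PDF multiplied by the probability that the stress exceeds each possible strength value:

# A minimal sketch of the integration concept (not the library's internal code)
from scipy import stats
from scipy.integrate import quad

stress = stats.norm(loc=30, scale=5)         # hypothetical stress distribution
strength = stats.weibull_min(c=3, scale=50)  # hypothetical strength distribution

# P(failure) = P(stress > strength) = integral of f_strength(x) * SF_stress(x) dx
prob_failure, _ = quad(lambda x: strength.pdf(x) * stress.sf(x), 0, 200)
print('Probability of failure:', prob_failure)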
If both the stress and strength distributions are normal distributions, then there exists a simple analytical solution
which will give an exact result. For this method, use the function Probability_of_failure_normdist. For two Normal
distributions, the difference between the integration method and the analytical method is very small (around 1e-11).
Inputs:
• stress - a probability distribution from the Distributions module
• strength - a probability distribution from the Distributions module
• show_distribution_plot - True/False (default is True)
• print_results - True/False (default is True)
• warn - a warning will be issued if both stress and strength are Normal as you should use
Probability_of_failure_normdist. You can suppress this using warn=False
Outputs:
• the probability of failure
• the distribution plot (only shown if show_distribution_plot=True)
• results printed to console (only shown if print_results=True)
In this example, we will create a stress and strength distribution, and leaving everything else as default, we will see
the results printed and the distribution plot.

from reliability import Distributions
from reliability.Stress_strength import Probability_of_failure
import matplotlib.pyplot as plt

stress = Distributions.Weibull_Distribution(alpha=2, beta=3, gamma=1)
strength = Distributions.Gamma_Distribution(alpha=2, beta=3, gamma=3)
result = Probability_of_failure(stress=stress, strength=strength)
plt.show()

'''
Probability of failure: 0.0017078240697779028
'''

Note: In Version 0.5.0 the calculation method was changed from Monte Carlo to integration. This resulted in an API
change. See the Changelog for details.

Easter Egg: The logo for reliability was created using the stress strength interference plot of two Weibull
distributions.



CHAPTER 16

Stress-Strength interference for normal distributions

Stress-Strength interference is a model to predict the probability of failure when the probability distributions of the
stress and the strength are known. The model calculates the probability of failure by determining the probability
that a random stress (drawn from a stress distribution) is greater than a random strength (drawn from a strength
distribution). If you are working with stress and strength distributions that are not both Normal distributions, then you
will need to use the integration method provided in Probability_of_failure. However, if both the stress and strength
distributions are Normal distributions then the exact solution can be found using an analytical approach provided here
in Probability_of_failure_normdist.
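The analytical solution comes from the fact that the difference of two Normal random variables is also Normal. A minimal sketch (not the library's internal code) using the same parameters as the example below:

# A sketch of the analytical solution for two Normal distributions
from scipy.stats import norm

mu_stress, sigma_stress = 20, 6
mu_strength, sigma_strength = 40, 7

# stress - strength is Normal with mean (mu_stress - mu_strength) and
# variance (sigma_stress^2 + sigma_strength^2), so P(stress > strength) is:
prob_failure = norm.cdf((mu_stress - mu_strength) / (sigma_stress**2 + sigma_strength**2) ** 0.5)
print('Probability of failure:', prob_failure)  # approximately 0.01503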
Inputs:
• stress - a Normal probability distribution from the Distributions module
• strength - a Normal probability distribution from the Distributions module
• show_distribution_plot - True/False (default is True)
• print_results - True/False (default is True)
Outputs:
• the probability of failure
• the distribution plot (only shown if show_distribution_plot=True)
• results printed to console (only shown if print_results=True)
In this example, we will create a stress and strength distribution (both of which are Normal distributions), and leaving
everything else as default, we will see the results plotted and printed. Unlike the integration method this approach is
analytical, though a comparison of the two methods reveals the error in the integration method is negligible (around
1e-11).

from reliability import Distributions
from reliability.Stress_strength import Probability_of_failure_normdist
import matplotlib.pyplot as plt

stress = Distributions.Normal_Distribution(mu=20, sigma=6)
strength = Distributions.Normal_Distribution(mu=40, sigma=7)
result = Probability_of_failure_normdist(stress=stress, strength=strength)
plt.show()

'''
Probability of failure: 0.015029783946206214
'''



CHAPTER 17

Reliability growth

Reliability growth involves gradual product improvement through the elimination of design deficiencies. In
repairable systems, reliability growth is observable through an increase in the interarrival times of failures. Reliability
growth is applicable to all levels of design decomposition from complete systems down to components.
The Duane method of modeling reliability growth uses the total time on test [t] (we may also use distance,
cycles, etc.) at which each failure occurred and the sequence number of the failure [N]. The cumulative mean time between failures
(MTBF) is t/N. By plotting ln(t) vs ln(t/N) we obtain a straight line which is used to get the parameters lambda and beta.
Using these parameters, we can model the instantaneous MTBF in the form (t^(1-beta))/(lambda*beta). The function
reliability_growth accepts the failure times and performs this model fitting to obtain the parameters lambda
and beta, as well as producing the reliability growth plot. It is often of interest to know how much total time on test we
need to meet a target MTBF. This can be found analytically and is included in this function.
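The model fitting described above can be sketched manually with numpy (an illustration only; reliability_growth performs this for you). Since ln(t/N) = (1-beta)*ln(t) - ln(lambda), a single straight-line fit recovers both parameters:

# A sketch of fitting the Duane model manually (not the library's internal code)
import numpy as np

times = np.array([10400, 26900, 43400, 66400, 89400, 130400, 163400, 232000, 242000, 340700])
N = np.arange(1, len(times) + 1)  # failure sequence number
slope, intercept = np.polyfit(np.log(times), np.log(times / N), 1)  # straight line on log-log axes
beta = 1 - slope
Lambda = np.exp(-intercept)
instantaneous_MTBF = times[-1] ** (1 - beta) / (Lambda * beta)
print(Lambda, beta, instantaneous_MTBF)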
Inputs:
• times - array of list of failure times
• xmax - xlim to plot up to. Default is 1.5*max(times)
• target_MTBF - specify the target MTBF to obtain the total time on test required to reach it.
• show_plot - True/False. Defaults to True. Other keyword arguments (such as color, title, etc) are also accepted
and used.
• print_results - True/False. Defaults to True.
Outputs:
• Lambda - the lambda parameter from the Duane model
• Beta - the beta parameter from the Duane model
• time_to_target - The time (from the start of the test) until the target MTBF is reached. time_to_target is only
returned if target_MTBF is specified.
• If show_plot is True, it will plot the reliability growth. Use plt.show() to show the plot.
• If print_results is True, it will print a summary of the fitted parameters and time to target MTBF (if target is
specified).

97
reliability Documentation, Release latest

In the example below, we supply the total time on test when each failure occurred, and we also supply the target_MTBF
as 50000 so that we can find out how much total time on test will be needed to reach the target MTBF.

from reliability.Repairable_systems import reliability_growth
import matplotlib.pyplot as plt

times = [10400, 26900, 43400, 66400, 89400, 130400, 163400, 232000, 242000, 340700]
reliability_growth(times=times, target_MTBF=50000, label='Reliability growth curve', xmax=500000)
plt.legend()
plt.show()

'''
Reliability growth model parameters:
lambda: 0.002355878294089656
beta: 0.6638280053477188
Time to reach target MTBF: 428131.18039448344
'''



CHAPTER 18

Optimal replacement time

When conducting maintenance planning, we must optimise the frequency of preventative maintenance (PM) for the
minimum overall cost. If PM is conducted too frequently then we will have high costs, but if not conducted often
enough then failures will result and we incur the higher cost of corrective maintenance (CM). Depending on the
underlying failure distribution, it is possible to model these costs for a range of PM intervals, with the lowest cost per
unit time resulting from the optimal replacement time. This function calculates the cost per unit time to determine
how cost varies with replacement time. The cost model can be used for HPP (ie. the maintenance makes the system
“as good as new”) or Power Law NHPP (ie. the maintenance makes the system “as good as old”). The default is for
“as good as new”. Currently, this model is only implemented to use the Weibull distribution as the underlying failure
distribution.
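For the "as good as new" case, a common formulation of the cost per unit time with replacement interval t is CPUT(t) = (cost_PM x R(t) + cost_CM x (1 - R(t))) / integral of R(x) from 0 to t. The sketch below uses this standard age-replacement formulation as an assumption about the general approach; it is not necessarily the exact internals of this function:

# A sketch of the age-replacement cost model for q=0 (as good as new).
# The formulation is the standard textbook one, assumed for illustration.
import numpy as np

cost_PM, cost_CM = 1, 5
alpha, beta = 1000, 2.5  # Weibull parameters of the failure distribution

t = np.linspace(1, 3000, 3000)
R = np.exp(-((t / alpha) ** beta))          # Weibull survival function
mean_life = np.cumsum(R) * (t[1] - t[0])    # crude numerical integral of R(x) from 0 to t
CPUT = (cost_PM * R + cost_CM * (1 - R)) / mean_life
print('optimal replacement time ~', t[np.argmin(CPUT)])
print('minimum cost per unit time ~', CPUT.min())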
Inputs:
• cost_PM - cost of preventative maintenance (must be smaller than cost_CM)
• cost_CM - cost of corrective maintenance (must be larger than cost_PM)
• weibull_alpha - scale parameter of the underlying Weibull distribution
• weibull_beta - shape parameter of the underlying Weibull distribution. Should be greater than 1 otherwise
conducting PM is not economical.
• q - restoration factor. q=1 is Power Law NHPP (as good as old), q=0 is HPP (as good as new). Default is q=0
(as good as new).
• show_plot - True/False. Defaults to True. Other plotting keywords are also accepted and used.
• print_results - True/False. Defaults to True
Outputs:
• ORT - the optimal replacement time
• min_cost - the minimum cost per unit time
• Plot of cost model if show_plot is set to True. Use plt.show() to display it.
• Printed results if print_results is set to True.


In the example below, we provide the cost of preventative maintenance (cost_PM) and the cost of corrective
maintenance (cost_CM), as well as the Weibull parameters of the failure distribution. Leaving the default outputs, we obtain
a plot of the cost per unit time and the printed results. This example is based on an example provided in a ReliaSoft
article.

from reliability.Repairable_systems import optimal_replacement_time
import matplotlib.pyplot as plt

optimal_replacement_time(cost_PM=1, cost_CM=5, weibull_alpha=1000, weibull_beta=2.5, q=0)
plt.show()

'''
Cost model assuming as good as new replacement (q=0):
The minimum cost per unit time is 0.0035
The optimal replacement time is 493.19
'''



CHAPTER 19

ROCOF

ROCOF is the Rate of Occurrence of Failures. It is used to model the trend (constant, increasing, decreasing) in the
failure interarrival times. For a repairable system, we want the ROCOF to be improving (failure interarrival times to
be increasing). As failure times can often appear quite random, it is necessary to conduct a statistical test to determine
if there is a statistically significant trend, and if there is a trend we can then model that trend using a Power Law NHPP.
The test for statistical significance is the Laplace test which compares the Laplace test statistic (U) with the z value
(z_crit) from the standard normal distribution. If there is a statistically significant trend, the parameters of the model
(Lambda_hat and Beta_hat) are calculated. By default the results are printed and a plot of the failure interarrival times
and MTBF is plotted.
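As a sketch of the Laplace test itself, the standard statistic for a failure-truncated test (one that ends at the last failure) is shown below. This illustrates the textbook formula under that assumption; it is not necessarily the library's exact implementation:

# A sketch of the Laplace test statistic for a failure-truncated test
# (standard textbook formulation, assumed for illustration)
import numpy as np

t = [104, 131, 1597, 59, 4, 503, 157, 6, 118, 173, 114, 62, 101, 216, 106, 140,
     1, 102, 3, 393, 96, 232, 89, 61, 37, 293, 7, 165, 87, 99]
failure_times = np.cumsum(t)  # convert interarrival times to failure times
tn = failure_times[-1]        # the test ends at the last failure
n = len(failure_times)

U = (failure_times[:-1].mean() - tn / 2) / (tn * np.sqrt(1 / (12 * (n - 1))))
print('U =', U)  # compare with z_crit = (-1.96, +1.96) for a 95% two-sided test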
Inputs:
• times_between_failures - these are the failure interarrival times.
• failure_times - these are the actual failure times. Note 1: You can specify either times_between_failures OR
failure_times but not both. Both options are provided for convenience so the conversion between the two is done
internally. failure_times should be the same as np.cumsum(times_between_failures). Note 2: The repair time is
assumed to be negligible. If the repair times are not negligibly small then you will need to manually adjust your
input to factor in the repair times.
• test_end - use this to specify the end of the test if the test did not end at the time of the last failure.
• CI - the confidence interval for the Laplace test. Default is 0.95 for 95% CI.
• show_plot - True/False. Default is True. Plotting keywords are also accepted (eg. color, linestyle).
• print_results - True/False. Default is True
Outputs:
• U - The Laplace test statistic
• z_crit - (lower,upper) bound on z value. This is based on the CI.
• trend - ‘improving’,’worsening’,’constant’. This is based on the comparison of U with z_crit
• Beta_hat - the Beta parameter for the NHPP Power Law model. Only calculated if the trend is not constant.
• Lambda_hat - the Lambda parameter for the NHPP Power Law model. Only calculated if the trend is not
constant.


• ROCOF - the Rate of Occurrence Of Failures. Only calculated if the trend is constant. If trend is not constant
then ROCOF changes over time in accordance with Beta_hat and Lambda_hat.
• printed results. Only printed if print_results is True.
• plotted results. Only plotted if plot_results is True. Use plt.show() to display it.
In the example below, we provide the failure interarrival times. The function will run the Laplace test using the default
95% confidence interval and then, when a trend is found, it will plot the MTBF based on the calculated NHPP Power
Law model. MTBF = 1/ROCOF.

from reliability.Repairable_systems import ROCOF
import matplotlib.pyplot as plt

t = [104, 131, 1597, 59, 4, 503, 157, 6, 118, 173, 114, 62, 101, 216, 106, 140,
     1, 102, 3, 393, 96, 232, 89, 61, 37, 293, 7, 165, 87, 99]
ROCOF(times_between_failures=t)
plt.show()

'''
Laplace test results: U = 2.409, z_crit = (-1.96,+1.96)
At 95% confidence level the ROCOF is WORSENING. Assume NHPP.
ROCOF assuming NHPP has parameters: Beta_hat = 1.588 , Lambda_hat = 3.703e-05
'''



CHAPTER 20

Mean cumulative function

The Mean Cumulative Function (MCF) is a cumulative history function that shows the cumulative number of recur-
rences of an event, such as repairs over time. In the context of repairs over time, the value of the MCF can be thought
of as the average number of repairs that each system will have undergone after a certain time. It is only applicable to
repairable systems and assumes that each event (repair) is identical. For the non-parametric MCF it does not assume
that each system’s MCF is identical, but this assumption is made for the parametric MCF.
The shape of the MCF is a key indicator that shows whether the systems are improving, worsening, or staying the
same over time. If the MCF is concave down (appearing to level out) then the system is improving. A straight line
(constant increase) indicates it is staying the same. Concave up (getting steeper) shows the system is worsening as
repairs are required more frequently as time progresses.
Obtaining the MCF from failure times is an inherently non-parametric process (similar to Kaplan-Meier), but once the
values are obtained, a model can be fitted to obtain the parametric estimate of the MCF. Each of these two approaches
is described below as they are performed by separate functions within reliability.Repairable_systems.
Note that in some textbooks and academic papers the Mean Cumulative Function is also referred to as the Cumulative
Intensity Function (CIF). These are two names for the same thing. If the shape of your MCF is more of an S than a
single smooth curve, you may have a change in operating condition or in the repair effectiveness factor. This can be
dealt with by splitting the MCF into segments, however, such models are more complex and are generally only found
in academic literature.

20.1 Non-parametric MCF

The non-parametric estimate of the MCF provides both the estimate of the MCF and the confidence bounds at a
particular time. The procedure to obtain the non-parametric MCF is outlined here. The confidence bounds are the
one-sided bounds as this was chosen to align with the method used by Reliasoft.
Inputs:
• data - the repair times for each system. Format this as a list of lists. eg. data=[[4,7,9],[3,8,12]] would be the
data for 2 systems. The largest time for each system is assumed to be the retirement time and is treated as a
right censored value. If the system was retired immediately after the last repair then you must include a repeated
value at the end as this will be used to indicate a right censored value. eg. A system that had repairs at 4, 7, and


9 then was retired after the last repair would be entered as data = [4,7,9,9] since the last value is treated as a right
censored value. If you only have data from 1 system you may enter the data in a single list as data = [3,7,12]
and it will be nested within another list automatically.
• print_results - prints the table of MCF results (state, time, MCF_lower, MCF, MCF_upper, variance)
• CI - Confidence interval. Default is 0.95 for 95% CI (one sided).
• show_plot - if True the plot will be shown. Default is True. Use plt.show() to show it.
• plot_CI - the plot will include the confidence intervals. Default is True.
Outputs:
• If print_results is True, a table of the results will be printed showing state, time, MCF_lower, MCF, MCF_upper,
variance. In this table state is F for failure or C for right censored (retirement).
• If show_plot is True, the MCF plot will be shown.
• results - this is a dataframe of the results that are printed. It includes the blank lines for censored values
• time - this is the time column from results. Blank lines for censored values are removed
• MCF - this is the MCF column from results. Blank lines for censored values are removed
• variance - this is the Variance column from results. Blank lines for censored values are removed
• lower - this is the MCF_lower column from results. Blank lines for censored values are removed
• upper - this is the MCF_upper column from results. Blank lines for censored values are removed
The following example is taken from an example provided by Reliasoft. The failure times and retirement times
(retirement time is indicated by +) of 5 systems are:

from reliability.Repairable_systems import MCF_nonparametric
import matplotlib.pyplot as plt

times = [[5, 10, 15, 17], [6, 13, 17, 19], [12, 20, 25, 26], [13, 15, 24], [16, 22, 25, 28]]
MCF_nonparametric(data=times)
plt.show()

'''
Mean Cumulative Function results (95.0% CI)
state  time  MCF_lower      MCF  MCF_upper  variance
    F     5  0.0459299      0.2   0.870893     0.032
    F     6    0.14134      0.4    1.13202     0.064
    F    10   0.256603      0.6    1.40294     0.096
    F    12   0.383374      0.8    1.66939     0.128
    F    13   0.517916        1    1.93081      0.16
    F    13   0.658169      1.2    2.18789     0.192
    F    15   0.802848      1.4    2.44131     0.224
    F    15   0.951092      1.6    2.69164     0.256
    F    16    1.10229      1.8    2.93935     0.288
    F    17    1.25598        2    3.18478      0.32
    C    17
    C    19
    F    20    1.49896  2.33333    3.63215  0.394074
    F    22    1.74856  2.66667    4.06684  0.468148
    C    24
    F    25    2.12259  3.16667    4.72431  0.593148
    F    25     2.5071  3.66667    5.36255  0.718148
    C    26
    C    28
'''
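The point estimates in the table above can be reproduced with a short sketch of the procedure (an illustration only; it omits the variance and confidence bound calculations):

# A minimal sketch of the non-parametric MCF point estimate
times = [[5, 10, 15, 17], [6, 13, 17, 19], [12, 20, 25, 26], [13, 15, 24], [16, 22, 25, 28]]

events = []
for system in times:
    events += [(t, 'F') for t in system[:-1]]  # all but the largest time are failures
    events.append((system[-1], 'C'))           # the largest time is the retirement
events.sort(key=lambda e: (e[0], e[1] == 'C'))  # failures processed first in the event of ties

n_at_risk = len(times)
MCF = 0.0
for t, state in events:
    if state == 'F':
        MCF += 1 / n_at_risk  # each failure adds 1/(number of systems still observed)
        print('t =', t, ' MCF =', round(MCF, 5))
    else:
        n_at_risk -= 1  # a retirement reduces the number of systems observed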

20.2 Parametric MCF

The estimates of the parametric MCF are obtained using MCF_nonparametric as this is the procedure required to
obtain the points for the plot. To these points a Non-Homogeneous Poisson Process (NHPP) parametric model is fitted
of the form:
$MCF(t) = \left(\frac{t}{\alpha}\right)^{\beta}$
You may notice that this looks identical to the Weibull CHF, but despite this similarity, they are entirely different
functions and the alpha and beta parameters from the MCF cannot be applied to a Weibull distribution for fitting the
repair times or repair interarrival times.
The purpose of fitting a parametric model is to obtain the shape parameter (𝛽) which indicates the long term health
of the system/s. If the MCF is concave down (𝛽<1) then the system is improving. A straight line (𝛽=1) indicates it
is staying the same. Concave up (𝛽>1) shows the system is worsening as repairs are required more frequently as time
progresses.
Many methods exist for fitting the model to the data. Within reliability, scipy.optimize.curve_fit is used which returns
the covariance matrix and allows for the confidence intervals to be calculated using the appropriate formulas.
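A sketch of this fitting step, using hypothetical (time, MCF) points of the kind produced by MCF_nonparametric (an illustration of the approach, not the library's exact code):

# A sketch of fitting MCF(t) = (t/alpha)^beta with scipy.optimize.curve_fit
import numpy as np
from scipy.optimize import curve_fit

t = np.array([5, 6, 10, 12, 13, 15, 16, 17, 20, 22, 25])          # hypothetical points
mcf = np.array([0.2, 0.4, 0.6, 0.8, 1.2, 1.6, 1.8, 2.0, 2.33333, 2.66667, 3.66667])

def power_law(t, alpha, beta):
    return (t / alpha) ** beta  # the NHPP power law model

popt, pcov = curve_fit(power_law, t, mcf, p0=[10, 1.5])
alpha, beta = popt
alpha_SE, beta_SE = np.sqrt(np.diag(pcov))  # standard errors from the covariance matrix
print(alpha, beta, alpha_SE, beta_SE)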
Inputs:
• data - the repair times for each system. Format this as a list of lists. eg. data=[[4,7,9],[3,8,12]] would be the
data for 2 systems. The largest time for each system is assumed to be the retirement time and is treated as a
right censored value. If the system was retired immediately after the last repair then you must include a repeated
value at the end as this will be used to indicate a right censored value. eg. A system that had repairs at 4, 7, and
9 then was retired after the last repair would be entered as data = [4,7,9,9] since the last value is treated as a right
censored value. If you only have data from 1 system you may enter the data in a single list as data = [3,7,12]
and it will be nested within another list automatically.
• CI - the confidence interval. Default is 0.95 for 95% CI.
• print_results - prints the fitted parameters (alpha and beta) of the parametric MCF model.
• show_plot - if True the plot will be shown. Default is True. Use plt.show() to show it.
• plot_CI - True/False. Plots the confidence intervals. Default is True.
Outputs:
• If print_results is True, the model parameters will be printed along with a brief diagnosis of the long term health
of the system based on the beta parameter.
• time - this is the times (x values) from the scatter plot. This value is calculated using MCF_nonparametric.
• MCF - this is the MCF (y values) from the scatter plot. This value is calculated using MCF_nonparametric.
• alpha - the calculated alpha parameter
• beta - the calculated beta parameter
• alpha_SE - the standard error in the alpha parameter
• beta_SE - the standard error in the beta parameter
• cov_alpha_beta - the covariance between the parameters
• alpha_upper - the upper CI estimate of the parameter
• alpha_lower - the lower CI estimate of the parameter
• beta_upper - the upper CI estimate of the parameter
• beta_lower - the lower CI estimate of the parameter
• results - a dataframe of the results (point estimate, standard error, Lower CI and Upper CI for each parameter)
The following example uses the same data as the MCF_nonparametric example provided above. From the output we
can clearly see that the system is degrading over time as repairs are needed more frequently.

from reliability.Repairable_systems import MCF_parametric
import matplotlib.pyplot as plt

times = [[5, 10, 15, 17], [6, 13, 17, 19], [12, 20, 25, 26], [13, 15, 24], [16, 22, 25, 28]]
MCF_parametric(data=times)
plt.show()

'''
Mean Cumulative Function Parametric Model (95% CI):
MCF = (t/α)^β
           Point Estimate  Standard Error   Lower CI   Upper CI
Parameter
Alpha           11.980590        0.401372  11.219187  12.793666
Beta             1.673622        0.094654   1.498017   1.869813
Since Beta is greater than 1, the system repair rate is WORSENING over time.
'''

The parametric model that is fitted to the MCF is not always an appropriate model. The example below shows data
from a collection of systems, some of which are improving and some are worsening. The net effect is an S-shaped
MCF. The power model used by MCF_parametric is not able to accurately follow an S-shaped dataset. In this case,
the MCF_nonparametric model is more appropriate, though there are some other parametric models (discussed in the
first paragraph) which may be useful to model this dataset.

from reliability.Repairable_systems import MCF_parametric
from reliability.Datasets import MCF_2
import matplotlib.pyplot as plt

times = MCF_2().times
MCF_parametric(data=times, print_results=False)
plt.show()



CHAPTER 21

SN diagram

This function will plot the stress vs number of cycles (S-N) diagram when supplied with data from a series of fatigue
tests. An S-N diagram is a common procedure used to model the fatigue life of metals which are subject to known
cyclic loads. Typically, the plot is done using a semilog scale where the number of cycles is scaled logarithmically.
This has the effect of linearizing the plot and making the accuracy of the model much easier to visualize. For steels,
titanium alloys, and some other metals, there exists an endurance limit. This limit is the minimum stress required to
propagate fatigue cracks, and all stresses below this endurance limit do not contribute to fatigue growth. The plot can
be adjusted to use an endurance limit using the optional inputs, however, there must be runout data (equivalent to right
censored data) supplied in order for the program to determine where to set the endurance limit.
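As a sketch of the linearization idea, a straight line can be fitted between stress and log10(cycles) using the data from the first example below (an illustration of the concept; not necessarily the exact model fitted by SN_diagram):

# A sketch of the semilog S-N linearization with numpy
import numpy as np

stress = np.array([340, 300, 290, 275, 260, 255, 250, 235, 230, 220, 215, 210])
cycles = np.array([15000, 24000, 36000, 80000, 177000, 162000, 301000, 290000,
                   361000, 881000, 1300000, 2500000])

slope, intercept = np.polyfit(np.log10(cycles), stress, 1)  # stress = slope*log10(N) + intercept
life_at_260 = 10 ** ((260 - intercept) / slope)  # trace a stress of 260 across to a life estimate
print(slope, intercept, life_at_260)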
Inputs:
• stress - an array or list of stress values at failure
• cycles - an array or list of cycles values at failure
• stress_runout - an array or list of stress values that did not result in failure Optional
• cycles_runout - an array or list of cycles values that did not result in failure. Optional
• xscale - ‘log’ or ‘linear’. Default is ‘log’. Adjusts the x-scale.
• stress_trace - an array or list of stress values to be traced across to cycles values.
• cycles_trace - an array or list of cycles values to be traced across to stress values.
• show_endurance_limit - This will adjust all lines of best fit to be greater than or equal to the average
stress_runout. Defaults to False if stress_runout is not specified. Defaults to True if stress_runout is speci-
fied.
• method_for_bounds - ‘statistical’, ‘residual’, or None. Defaults to ‘statistical’. If set to ‘statistical’ the CI value
is used, otherwise it is not used for the ‘residual’ method. Residual uses the maximum residual datapoint for
symmetric bounds. Setting the method for bounds to None will turn off the confidence bounds.
• CI - Must be between 0 and 1. Default is 0.95 for 95% confidence interval. Only used if method_for_bounds =
‘statistical’
• Other plotting keywords (eg. color, linestyle, etc) are accepted for the line of best fit.
Outputs:


• The plot is the only output. All calculated values are shown on the plot.
In this first example, we use the data for stress and cycles to produce an S-N diagram. We will not provide any runout
data here so the endurance limit will not be calculated.
from reliability.PoF import SN_diagram
import matplotlib.pyplot as plt

stress = [340, 300, 290, 275, 260, 255, 250, 235, 230, 220, 215, 210]
cycles = [15000, 24000, 36000, 80000, 177000, 162000, 301000, 290000, 361000, 881000, 1300000, 2500000]
SN_diagram(stress=stress, cycles=cycles)
plt.show()

In this second example, we will use the same data as above, but also supply runout data so that the endurance limit will
be calculated. We will also adjust the method_for_bounds to be ‘residual’. We are also going to find the life (in cycles)
at a stress of 260 by using stress_trace, and the stress required to achieve a life of 5x10^5 cycles using cycles_trace.
from reliability.PoF import SN_diagram
import matplotlib.pyplot as plt

stress = [340, 300, 290, 275, 260, 255, 250, 235, 230, 220, 215, 210]
cycles = [15000, 24000, 36000, 80000, 177000, 162000, 301000, 290000, 361000, 881000, 1300000, 2500000]
stress_runout = [210, 210, 205, 205, 205]
cycles_runout = [10**7, 10**7, 10**7, 10**7, 10**7]
SN_diagram(stress=stress, cycles=cycles, stress_runout=stress_runout, cycles_runout=cycles_runout,
           method_for_bounds='residual', cycles_trace=[5 * 10**5], stress_trace=[260])
plt.show()

References:
• Probabilistic Physics of Failure Approach to Reliability (2017), by M. Modarres, M. Amiri, and C. Jackson. pp.
17-21.



CHAPTER 22

Stress-strain and strain-life

In the strain-life method of fatigue analysis, the elastic and plastic deformation of the material is used to determine
how many cycles the material will last before failure. In this context, failure is defined as the formation of a small crack
(typically 1mm) so the geometry of the material does not need to be considered provided the material properties have
been accurately measured using a stress-strain test. This section of the documentation describes three functions which
are useful in strain-life analysis. The first of these is useful to fit the stress-strain and strain-life models to available
data, thereby providing the material properties. The second function is a diagram of the relationship between stress
and strain during cyclic fatigue which shows the hysteresis loop and finds the min and max stress and strain. The third
function produces the strain-life diagram and the equations for this diagram are used for calculating the number of
cycles to failure. Further detail is available below for each of the respective functions.
The equations used for stress-strain and strain-life are:

Ramberg-Osgood equation: $\varepsilon_{tot} = \underbrace{\frac{\sigma}{E}}_{\text{elastic}} + \underbrace{\left(\frac{\sigma}{K}\right)^{1/n}}_{\text{plastic}}$

Hysteresis curve equation: $\Delta\varepsilon = \underbrace{\frac{\Delta\sigma}{E}}_{\text{elastic}} + \underbrace{2\left(\frac{\Delta\sigma}{2K}\right)^{1/n}}_{\text{plastic}}$

Coffin-Manson equation: $\varepsilon_{tot} = \underbrace{\frac{\sigma_f}{E}(2N_f)^{b}}_{\text{elastic}} + \underbrace{\varepsilon_f(2N_f)^{c}}_{\text{plastic}}$

Morrow Mean Stress Correction: $\varepsilon_{tot} = \underbrace{\frac{\sigma_f - \sigma_m}{E}(2N_f)^{b}}_{\text{elastic}} + \underbrace{\varepsilon_f(2N_f)^{c}}_{\text{plastic}}$

Modified Morrow Mean Stress Correction: $\varepsilon_{tot} = \underbrace{\frac{\sigma_f - \sigma_m}{E}(2N_f)^{b}}_{\text{elastic}} + \underbrace{\varepsilon_f\left(\frac{\sigma_f - \sigma_m}{\sigma_f}\right)^{c/b}(2N_f)^{c}}_{\text{plastic}}$

Smith-Watson-Topper Mean Stress Correction: $\varepsilon_{tot} = \underbrace{\frac{\sigma_f^2}{\sigma_{max}E}(2N_f)^{2b}}_{\text{elastic}} + \underbrace{\frac{\sigma_f\varepsilon_f}{\sigma_{max}}(2N_f)^{b+c}}_{\text{plastic}}$
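A minimal sketch evaluating the Ramberg-Osgood equation above for the total strain at a given stress (hypothetical material constants; not library code):

# Evaluate total strain from the Ramberg-Osgood equation
E = 210000   # modulus of elasticity (MPa), hypothetical
K = 1200     # cyclic strength coefficient, hypothetical
n = 0.2      # cyclic strain hardening exponent, hypothetical
sigma = 400  # stress (MPa)

elastic_strain = sigma / E
plastic_strain = (sigma / K) ** (1 / n)
print('total strain:', elastic_strain + plastic_strain)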

22.1 Stress-Strain and Strain-Life parameter estimation

The function stress_strain_life_parameters_from_data will use stress and strain data to calculate the
stress-strain parameters (K, n) from the Ramberg-Osgood relationship. If cycles is provided it will also produce the
strain-life parameters (sigma_f, epsilon_f, b, c) from the Coffin-Manson equation. You cannot find the strain-life
parameters without stress as we must use stress to find elastic strain. Note that if you already have the parameters K, n,
sigma_f, epsilon_f, b, c, then you can use the functions ‘stress_strain_diagram’ or ‘strain_life_diagram’ as described
below.
Inputs:
• strain - an array or list of strain
• stress - an array or list of stress
• E - The modulus of elasticity. Ensure this is in the same units as stress (typically MPa)
• cycles - the number of cycles to failure. Optional input. Required if you want to obtain the parameters sigma_f,
epsilon_f, b, c
• print_results - True/False. Default is True.
• show_plot - True/False. Default is True.
Outputs:
• The stress-strain plot will be generated if show_plot is True. If cycles is provided then the strain-life plot will
also be generated. Use plt.show() to show any plots.
• The results will be printed in the console if print_results is True.
• K - the cyclic strength coefficient
• n - the cyclic strain hardening exponent
• sigma_f - the fatigue strength coefficient. Only generated if cycles is provided.
• epsilon_f - the fatigue strain coefficient. Only generated if cycles is provided.
• b - the elastic strain exponent. Only generated if cycles is provided.
• c - the plastic strain exponent. Only generated if cycles is provided.
Note that the parameters generated are stored to an object, so you may find it useful to use this function with
print_results=False and show_plot=False, and then access the calculated parameters later. This is done in the first
example for the section on stress-strain diagram.
In the example below, we provide data from a fatigue test including stress, strain, and cycles to failure. We must also
provide the modulus of elasticity (E) for the material. All other options are left as default values. The plots shown
below are provided and the results are printed to the console.
from reliability.PoF import stress_strain_life_parameters_from_data
import matplotlib.pyplot as plt

strain_data = [0.02, 0.015, 0.01, 0.006, 0.0035, 0.002]
stress_data = [650, 625, 555, 480, 395, 330]
cycles_data = [200, 350, 1100, 4600, 26000, 560000]
params = stress_strain_life_parameters_from_data(stress=stress_data, strain=strain_data, cycles=cycles_data, E=216000)
plt.show()

'''
K (cyclic strength coefficient): 1462.4649152172044
n (strain hardening exponent): 0.19810419512368083
sigma_f (fatigue strength coefficient): 1097.405402055844
epsilon_f (fatigue strain coefficient): 0.23664541556833998
b (elastic strain exponent): -0.08898339316495743
c (plastic strain exponent): -0.4501077996416115
'''


22.2 Stress-Strain diagram

The function stress_strain_diagram is used to visualize how the stress and strain vary with successive load
cycles as described by the hysteresis curve equation. Due to residual tensile and compressive stresses, the stress and
strain in the material does not unload in the same way that it loads. This results in a hysteresis loop being formed and
this is the basis for crack propagation in the material leading to fatigue failure. The size of the hysteresis loop increases
for higher strains. Fatigue tests are typically strain controlled; that is they are subjected to a specified amount of strain
throughout the test, typically in a sinusoidal pattern. Fatigue tests may also be stress controlled, whereby the material
is subjected to a specified amount of stress. This function accepts either input (max_stress or max_strain) and will find
the corresponding stress or strain as required. If you do not specify min_stress or min_strain then it is assumed to be
negative of the maximum value.
The cyclic loading sequence defaults to begin with tension, but can be changed using
initial_load_direction='compression'. If your test begins with compression it is important to specify this as the
residual stresses in the material from the initial loading will affect the results for the first reversal. This difference is
caused by the Bauschinger effect. Only the initial loading and the first two reversals are plotted. For most materials
the shape of the hysteresis loop will change over many hundreds of cycles as a result of fatigue hardening (also known
as work-hardening) or fatigue-softening. More on this process is available in the eFatigue training documents.
Note that if you do not have the parameters K, n, but you do have stress and strain data then you can use the function
‘stress_strain_life_parameters_from_data’. This will be shown in the first example below.
Inputs:


• K - cyclic strength coefficient


• n - strain hardening exponent
• E - The modulus of elasticity. Ensure this is in the same units for which K and n were obtained (typically MPa)
• max_strain - the maximum strain to use for cyclic loading when plotting the hysteresis loop.
• max_stress - the maximum stress to use for cyclic loading when plotting the hysteresis loop. When specifying
min and max stress or strain, do not specify both stress and strain as the corresponding value will be automati-
cally calculated.
• min_strain - if this is not -max_strain then specify it here. Optional input.
• min_stress - if this is not -max_stress then specify it here. Optional input.
• initial_load_direction - ‘tension’ or ‘compression’. Default is ‘tension’.
Outputs:
• The stress-strain plot will always be generated. Use plt.show() to show it.
• If print_results is True, the calculated parameters below will be printed.
• max_stress
• max_strain
• min_stress
• min_strain
In the example below, we are using the same data from the first example, but this time, we will store the calculated
parameters in an object named ‘params’. Then we can specify the calculated parameters to the stress_strain_diagram
function. The hysteresis loop generated is for a strain-controlled fatigue test where the strain goes from -0.006 to
+0.006.

from reliability.PoF import stress_strain_life_parameters_from_data, stress_strain_diagram
import matplotlib.pyplot as plt

strain_data = [0.02, 0.015, 0.01, 0.006, 0.0035, 0.002]
stress_data = [650, 625, 555, 480, 395, 330]
cycles_data = [200, 350, 1100, 4600, 26000, 560000]
params = stress_strain_life_parameters_from_data(stress=stress_data, strain=strain_data,
                                                 cycles=cycles_data, E=216000, show_plot=False, print_results=False)
stress_strain_diagram(E=216000, n=params.n, K=params.K, max_strain=0.006)
plt.show()

'''
Max stress: 483.85816239406745
Min stress: -483.8581623940621
Max strain: 0.006
Min strain: -0.006
'''


In this second example, we will use the stress_strain_diagram to visualise the effects of residual stresses for a material
subjected to non-zero mean stress. The material parameters (K and n) are already known so we do not need to obtain
them from any data. We specify the max_stress is 378 MPa and the min_stress is -321 MPa. We will do this for two
scenarios; initial tensile load, and initial compressive load. Upon inspection of the results we see for the initial tensile
load, the min_stress in the material is actually -328.893 MPa which exceeds the min_stress we specified in our test.
When we have an initial compressive load, the max_stress is 385.893 MPa which exceeds the max_stress we specified
in our test. These results are not an error and are caused by the residual stresses in the material that were formed
during the first loading cycle. In the case of an initial tensile load, when the material was pulled apart in tension by
an external force, the material pulls back but due to plastic deformation, these internal forces in the material are not
entirely removed, such that when the first compressive load peaks, the material’s internal stresses add to the external
compressive forces. This phenomenon is important in load sequence effects for variable amplitude fatigue.
from reliability.PoF import stress_strain_diagram
import matplotlib.pyplot as plt

plt.figure()
plt.subplot(121)
print('Tension first:')
stress_strain_diagram(E=210000, K=1200, n=0.2, max_stress=378, min_stress=-321, initial_load_direction='tension')
plt.title('Cyclic loading - tension first')
plt.subplot(122)
print('Compression first:')
stress_strain_diagram(E=210000, K=1200, n=0.2, max_stress=378, min_stress=-321, initial_load_direction='compression')
plt.title('Cyclic loading - compression first')
plt.gcf().set_size_inches(12, 7)
plt.show()

'''
Tension first:
Max stress: 378.0
Min stress: -328.8931121800317
Max strain: 0.004901364196875
Min strain: -0.0028982508530831477
Compression first:
Max stress: 385.8931121800323
Min stress: -320.99999999999943
Max strain: 0.004901364196875
Min strain: -0.0028982508530831477
'''

22.3 Strain-Life diagram

The function strain_life_diagram provides a visual representation of the Coffin-Manson relationship between strain and life. In this equation, strain is split into elastic strain and plastic strain, which are shown on the plot as straight lines (on a log-log scale), and life is represented by reversals (with 2 reversals per cycle). The total strain amplitude is used to determine the fatigue life by solving the Coffin-Manson equation. When a min_stress or min_strain is specified that results in a non-zero mean stress, there are several mean stress correction methods available. These are 'morrow', 'modified_morrow' (also known as Manson-Halford), and 'SWT' (Smith-Watson-Topper). The default method is 'SWT' but this can be changed using the options described below. The equation used is displayed in the legend of the plot. Also shown on the plot is the life of the material at the specified strain amplitude, and the transition life (2Nt) at which the material failure transitions from being dominated by plastic strain to elastic strain.
Note that if you do not have the parameters sigma_f, epsilon_f, b, c, but you do have stress, strain, and cycles data, then you can use the function 'stress_strain_life_parameters_from_data'.


The residual stresses in a material subjected to non-zero mean stress (as shown in the previous example) are not considered in this analysis; the specified max and min values for stress or strain are taken as the true values to which the material is subjected.
Inputs:
• E - The modulus of elasticity. Ensure this is in the same units for which K and n were obtained (typically MPa)
• sigma_f - fatigue strength coefficient
• epsilon_f - fatigue strain coefficient
• b - elastic strain exponent
• c - plastic strain exponent
• K - cyclic strength coefficient. Optional input. Only required if you specify max_stress or max_strain.
• n - strain hardening exponent. Optional input. Only required if you specify max_stress or max_strain.
• max_stress - specify the max_stress if you want cycles to failure. If specified, you will also need to specify K
and n.
• max_strain - specify the max_strain if you want cycles to failure.
• min_stress - if this is not -max_stress then specify it here. Optional input.
• min_strain - if this is not -max_strain then specify it here. Optional input. When specifying min and max stress
or strain, do not specify both stress and strain as the corresponding value will be automatically calculated. Only
specify the min if it is not -max
• mean_stress_correction_method - must be either ‘morrow’,’modified_morrow’, or ‘SWT’. Default is ‘SWT’.
This is only used if mean_stress is found to be non-zero.
• print_results - True/False. Defaults to True. If use_level_stress or use_level_strain is specified then the printed
results will be the cycles_to_failure
• show_plot - True/False. Default is True
Outputs:
• The strain-life plot will be generated if show_plot = True. Use plt.show() to show it.
• cycles_to_failure - only calculated if max_stress OR max_strain is specified. This will be printed if print_results
= True.

from reliability.PoF import strain_life_diagram
import matplotlib.pyplot as plt

strain_life_diagram(E=210000, sigma_f=1000, epsilon_f=1.1, b=-0.1, c=-0.6, K=1200, n=0.2, max_strain=0.0049, min_strain=-0.0029)
plt.show()

'''
Failure will occur in 13771.39 cycles (27542.78 reversals).
'''
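As a cross-check, the Coffin-Manson equation can be solved numerically. The sketch below is a minimal illustration that ignores the SWT mean stress correction applied by strain_life_diagram in this example (the mean stress is non-zero here), so it returns a somewhat longer life than the 13771.39 cycles reported above.

# Minimal sketch: solving the uncorrected Coffin-Manson equation
# strain_amplitude = (sigma_f/E)*(2N)^b + epsilon_f*(2N)^c for the example above
from scipy.optimize import brentq

E, sigma_f, epsilon_f, b, c = 210000, 1000, 1.1, -0.1, -0.6
strain_amplitude = (0.0049 - (-0.0029)) / 2  # total strain amplitude from max and min strain

def coffin_manson(reversals):
    return (sigma_f / E) * reversals**b + epsilon_f * reversals**c - strain_amplitude

reversals = brentq(coffin_manson, 1, 1e9)  # find the root (reversals to failure)
print(reversals / 2, 'cycles')  # around 15700 cycles without the mean stress correction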


References:
• Probabilistic Physics of Failure Approach to Reliability (2017), by M. Modarres, M. Amiri, and C. Jackson. pp.
23-33



CHAPTER 23

Fracture mechanics

Fracture mechanics is an approach to fatigue analysis that involves calculating the number of cycles until failure of a component that is undergoing cyclic loading. We are generally interested in two values: the number of cycles needed to initiate a crack, and the number of cycles to grow the crack to a certain size (usually the size at which brittle fracture occurs). The fracture mechanics functions described below are useful for solving typical fatigue problems given to students. Unfortunately, the limitation of these functions is that they are only applicable to thin plates with through-thickness cracks. As soon as you encounter a component that is not a thin plate, the formulas required for analysis will be different from those used below. This is part of what makes fracture mechanics such a complex topic that is better handled by purpose-built fatigue analysis software such as Altair HyperLife. Even in the relatively simple thin flat plate geometry, there are many complications that make fracture mechanics a challenging subject. These include variable amplitude loading, surface roughness, frequency effects, environmental effects (temperature, corrosion), and other miscellaneous factors. Solving fracture mechanics problems for flat plates can still provide engineers with an appreciation for how fatigue operates and the factors affecting fatigue, so that they can incorporate these lessons into their work.
Most textbooks (including Probabilistic Physics of Failure Approach to Reliability (2017), which was used to design the functions below) apply a few simplifications for solving crack growth problems. These simplifications assume that the stress in the component is constant, that the geometry factor is constant, and that the crack length to cause failure (which has the geometry factor in its formula) is constant. These simplifications are necessary for hand calculations, but in reality they all change as the crack grows, which necessitates an iterative calculation. Both the simplified and iterative methods are included in the crack growth function. Also included in both functions is the ability to solve these problems for notched components by providing the appropriate correction factors for the notched geometry.

23.1 Crack initiation

The function fracture_mechanics_crack_initiation uses the material properties, the local cross-sectional area, and the force applied to the component to determine how many cycles until crack initiation (of a 1 mm crack). Units should always be in MPa (and mm^2 for area). This function may be used for an un-notched or notched component. If the component is un-notched, the parameters q and Kt may be left as their default values of 1.
While there are formulas to find the parameters q and Kt, these formulas have not been included here so that the function remains reasonably generic to different materials and geometries. Resources for finding some of these parameters if they are not given to you:
q = 1/(1+a/r) where r is the notch radius of curvature (in mm) and a = 0.025*(2070/Su). Su is the ultimate strength in MPa. This only applies to high strength steels where Su > 550 MPa.
Kt can be found from the eFatigue website, which has an online calculator that will provide the appropriate Kt for your notched geometry.
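As an illustration of the notch sensitivity formula above, the calculation can be done in a few lines. The values of Su and r below are hypothetical and chosen only to demonstrate the arithmetic.

# Hypothetical evaluation of q = 1/(1 + a/r) with a = 0.025*(2070/Su)
Su = 1000  # ultimate strength in MPa (hypothetical, must be > 550 MPa for this formula)
r = 3      # notch radius of curvature in mm (hypothetical)
a = 0.025 * (2070 / Su)
q = 1 / (1 + a / r)
print(q)  # approximately 0.983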
Inputs:
• P - Force applied on the component [units of MPa]
• A - Cross sectional area of the component (at the point of crack initiation) [units of mm^2]
• Sy - Yield strength of the material [units of MPa]
• E - Elastic modulus (Young’s modulus) [units of MPa]
• K - Strength coefficient of the material
• n - Strain hardening exponent of the material
• b - Elastic strain exponent of the material
• c - Plastic strain exponent of the material
• sigma_f - Fatigue strength coefficient of the material
• epsilon_f - Fatigue strain coefficient of the material
• q - Notch sensitivity factor. (default is 1 for no notch)
• Kt - stress concentration factor. (default is 1 for no notch)
• mean_stress_correction_method - must be either 'morrow', 'modified_morrow', or 'SWT'. Default is 'modified_morrow' as this is the same as the uncorrected Coffin-Manson relationship when mean stress is zero.
Outputs:
• The results will be printed to the console if print_results is True
• sigma_max
• sigma_min
• sigma_mean
• epsilon_max
• epsilon_min
• epsilon_mean
• cycles_to_failure
In the following example we will provide the function with the appropriate inputs for our problem (taken from Example 2.8 in Probabilistic Physics of Failure Approach to Reliability (2017)). The mean_stress_correction_method is changed to 'SWT' and the results will be printed to the console.

from reliability.PoF import fracture_mechanics_crack_initiation

fracture_mechanics_crack_initiation(P=0.15, A=5*80, Kt=2.41, q=0.9857, Sy=690, E=210000, K=1060, n=0.14, b=-0.081, c=-0.65, sigma_f=1160, epsilon_f=1.1, mean_stress_correction_method='SWT')

'''
A crack of 1 mm will be formed after: 2919.91 cycles ( 5839.82 reversals )
Stresses in the component: Min = -506.7291 MPa , Max = 506.7291 MPa , Mean = -5.684341886080802e-14 MPa.
Strains in the component: Min = -0.0075 , Max = 0.0075 , Mean = 8.673617379884035e-19
Mean stress correction method used: SWT
'''

23.2 Crack growth

The function fracture_mechanics_crack_growth uses the principles of fracture mechanics to find the number of cycles required to grow a crack from an initial length to a final length. The final length (a_final) may be specified, but if not specified then a_final will be set to the critical crack length (a_crit) which causes failure due to rapid fracture. This function performs the same calculation using two methods: simplified and iterative. The simplified method assumes that the geometry factor (f(g)), the stress (S_net), and the critical crack length (a_crit) are constant. This method is the way most textbooks show these problems solved, as they can be done by hand in a few steps. The iterative method does not make those assumptions, and as a result the parameters f(g), S_net, and a_crit must be recalculated based on the current crack length at every cycle.
This function is applicable only to thin plates with a through-thickness edge crack or centre crack (which is to be specified using the parameter crack_type). You may also use this function for notched components (edge notches only, not centre holes) by specifying the parameters Kt and D, which are based on the geometry of the notch. For any notched components, this method assumes the notched component has a "shallow notch" where the notch depth (D) is much less than the plate width (W). The value of Kt for notched components may be found on the eFatigue website. In the case of notched components, the local stress concentration from the notch will often cause slower crack growth. In these cases, the crack length is calculated in two parts (stage 1 and stage 2) which can clearly be seen on the plot using the iterative method (as shown in the example below).
Inputs:
• Kc - fracture toughness
• Kt - stress concentration factor (default is 1 for no notch).
• D - depth of the notch (mm) (default is None for no notch). A notched component is assumed to be doubly-notched (symmetric notches on both sides so that no bending occurs)
• C - material constant (sometimes referred to as A)
• m - material constant (sometimes referred to as n). This value must not be 2.
• P - external load on the material (MPa)
• t - plate thickness (mm)
• W - plate width (mm)
• a_initial - initial crack length (mm) (default is 1 mm)
• a_final - final crack length (mm) - default is None in which case a_final is assumed to be a_crit (length at failure).
It is useful to be able to enter a_final in cases where there are different loading regimes over time.
• crack_type - must be either ‘edge’ or ‘center’. Default is ‘edge’. The geometry factor used for each of these
in the simplified method is f(g) = 1.12 for edge and f(g) = 1.0 for center. The iterative method calculates these
values exactly using a_initial and W (plate width).
• print_results - True/False. Default is True
• show_plot - True/False. Default is True.
Outputs:


• If print_results is True, all outputs will be printed with a description of the process.
• If show_plot is True, the crack growth plot will be shown for the iterative method.
• Nf_stage_1_simplified (in the case of single stage calculations this will be zero)
• Nf_stage_2_simplified
• Nf_total_simplified
• final_crack_length_simplified
• transition_length_simplified
• Nf_stage_1_iterative (in the case of single stage calculations this will be zero)
• Nf_stage_2_iterative
• Nf_total_iterative
• final_crack_length_iterative
• transition_length_iterative
In the following example, a crack of 1 mm is grown to failure. The function determines that the notch (described by Kt and D) causes a local stress concentration which initially slows the propagation of the crack until the crack reaches the transition length. Once past the transition length, the crack grows much faster and results in brittle fracture of the material. This change in crack growth rate is evident on the plot from the iterative method. The reason for the different transition lengths between the simplified and iterative methods is that the simplified method uses 1.12 for the geometry factor, whereas the iterative method finds the geometry factor using the local geometry (using W and D). A generic sketch of the simplified method's underlying calculation is shown after the example.

from reliability.PoF import fracture_mechanics_crack_growth
import matplotlib.pyplot as plt

fracture_mechanics_crack_growth(Kc=66, C=6.91*10**-12, m=3, P=0.15, W=100, t=5, Kt=2.41, D=10)
plt.show()

'''
SIMPLIFIED METHOD (keeping f(g), S_max, and a_crit as constant):
Crack growth was found in two stages since the transition length ( 2.08 mm ) due to the notch, was greater than the initial crack length ( 1 mm ).
Stage 1 (a_initial to transition length): 6802 cycles
Stage 2 (transition length to a_final): 1133 cycles
Total cycles to failure: 7935 cycles.
Critical crack length to cause failure was found to be: 7.86 mm.

ITERATIVE METHOD (recalculating f(g), S_max, and a_crit for each cycle):
Crack growth was found in two stages since the transition length ( 2.45 mm ) due to the notch, was greater than the initial crack length ( 1 mm ).
Stage 1 (a_initial to transition length): 7576 cycles
Stage 2 (transition length to a_final): 671 cycles
Total cycles to failure: 8247 cycles.
Critical crack length to cause failure was found to be: 6.39 mm.
'''
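The generic sketch promised above: the simplified method amounts to a closed-form integration of the Paris crack-growth law da/dN = C(ΔK)^m with ΔK = f(g).S.sqrt(πa), holding f(g) and S constant. The values below are arbitrary and this single-stage integration does not reproduce the staged notch calculation performed by the function.

# Closed-form Paris law integration with constant geometry factor and stress (arbitrary values)
import numpy as np

C, m = 6.91e-12, 3   # material constants
S = 30               # net stress, assumed constant
fg = 1.12            # geometry factor for an edge crack
a_i, a_f = 1, 7.86   # initial and final crack lengths (mm)

exponent = 1 - m / 2
N = (a_f**exponent - a_i**exponent) / (C * (fg * S * np.sqrt(np.pi))**m * exponent)
print(round(N), 'cycles')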


References:
• Probabilistic Physics of Failure Approach to Reliability (2017), by M. Modarres, M. Amiri, and C. Jackson. pp.
37-57



CHAPTER 24

Creep

Creep is the progressive accumulation of plastic strain in a component under stress at an elevated temperature over a period of time. All creep modelling requires data that is unique to the material undergoing creep, since all materials behave differently. This data may be stress, temperature, and time-to-failure data, or it may be material constants which are derived from the former. This section of reliability contains two functions to determine time to failure due to creep. These functions are creep_rupture_curves and creep_failure_time. Creep is generally modelled using the Larson-Miller relation or the Manson-Haferd relation.
The function creep_rupture_curves plots the creep rupture curves for a given set of creep data. The function also fits lines of best fit to each temperature. The time to failure for a given stress and temperature can be found by specifying stress_trace and temp_trace.
Inputs:
• temp_array - an array or list of temperatures
• stress_array - an array or list of stresses
• TTF_array - an array or list of times to failure at the given temperatures and stresses
• stress_trace - The stress to use for finding the time to failure (only 1 value is accepted)
• temp_trace - The temperature to use for finding the time to failure (only 1 value is accepted)
Outputs:
• The plot is the only output. Use plt.show() to show it.
In the following example (taken from example 2.16 of Probabilistic Physics of Failure Approach to Reliability (2017)),
we provide creep data in the form of temperatures, stresses, and times to failure in order to obtain the creep rupture
curves. We also are interested in the time to failure of a component at a stress of 70 and a temperature of 1100.
from reliability.PoF import creep_rupture_curves
import matplotlib.pyplot as plt

TEMP = [900,900,900,900,1000,1000,1000,1000,1000,1000,1000,1000,1100,1100,1100,1100,1100,1200,1200,1200,1200,1350,1350,1350]
STRESS = [90,82,78,70,80,75,68,60,56,49,43,38,60.5,50,40,29,22,40,30,25,20,20,15,10]
TTF = [37,975,3581,9878,7,17,213,1493,2491,5108,7390,10447,18,167,615,2220,6637,19,102,125,331,3.7,8.9,31.8]

creep_rupture_curves(temp_array=TEMP, stress_array=STRESS, TTF_array=TTF, stress_trace=70, temp_trace=1100)
plt.show()

The function creep_failure_time uses the Larson-Miller relation to find the time to failure due to creep. The
method uses a known failure time (time_low) at a lower failure temperature (temp_low) to find the unknown failure
time at the higher temperature (temp_high). This relation requires the input temperatures in Fahrenheit. To convert
Celsius to Fahrenheit use F = C*(9/5)+32. Also note that the conversion between Fahrenheit and Rankine used in this
calculation is R = F+459.67.
Inputs:
• temp_low - temperature (in degrees Fahrenheit) where the time_low is known
• temp_high - temperature (in degrees Fahrenheit) at which the time to failure is unknown and will be found by this function
• time_low - time to failure at temp_low
• C - creep constant (default is 20). Typically 20-22 for metals
• print_results - True/False
Outputs:
• The time to failure at the higher temperature.
• If print_results is True, the output will also be printed to the console.

136 Chapter 24. Creep


reliability Documentation, Release latest

In the following example (which follows on from the previous example), we will use the Larson-Miller relation to find
the time to failure due to creep at 1100°F for a component which we know fails at 9878 hours when subjected to the
same stress at 900°F.

from reliability.PoF import creep_failure_time


creep_failure_time(temp_low=900,temp_high=1100,time_low=9878)

'''
The time to failure at a temperature of 1100 °F is 8.27520045913433
The Larson-Miller parameter was found to be 32624.83162890552
'''
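The printed results can be verified by hand. A minimal sketch, assuming the standard form of the Larson-Miller parameter LMP = (T + 459.67)(C + log10(t)) with T in °F (converted to Rankine) and C = 20:

# Hand verification of the Larson-Miller calculation above
import numpy as np

C = 20
temp_low, temp_high, time_low = 900, 1100, 9878
LMP = (temp_low + 459.67) * (C + np.log10(time_low))  # Larson-Miller parameter
time_high = 10 ** (LMP / (temp_high + 459.67) - C)    # failure time at the higher temperature
print(LMP)        # 32624.83...
print(time_high)  # 8.275...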

References:
• Probabilistic Physics of Failure Approach to Reliability (2017), by M. Modarres, M. Amiri, and C. Jackson. pp.
81-90



CHAPTER 25

Palmgren-Miner linear damage model

The function palmgren_miner_linear_damage uses the Palmgren-Miner linear damage hypothesis to find the
outputs listed below.
Inputs:
• rated_life - an array or list of how long the component will last at a given stress level
• time_at_stress - an array or list of how long the component is subjected to the stress that gives the rated_life
• stress - what stress the component is subjected to. Not used in the calculation but is required for printing the
output.
Ensure that the time_at_stress and rated_life are in the same units. The answer will also be in those units. The number
of items in each input must be the same.
Outputs:
• Fraction of life consumed per load cycle
• Service life of the component
• Fraction of damage caused at each stress level
In the following example, we consider a scenario in which ball bearings fail after 50000 hrs, 6500 hrs, and 1000 hrs,
after being subjected to a stress of 1kN, 2kN, and 4kN respectively. If each load cycle involves 40 mins at 1kN, 15
mins at 2kN, and 5 mins at 4kN, how long will the ball bearings last?

from reliability.PoF import palmgren_miner_linear_damage

palmgren_miner_linear_damage(rated_life=[50000,6500,1000], time_at_stress=[40/60, 15/60, 5/60], stress=[1, 2, 4])

'''
Palmgren-Miner Linear Damage Model results:
Each load cycle uses 0.01351 % of the components life.
The service life of the component is 7400.37951 load cycles.
The amount of damage caused at each stress level is:
Stress = 1 , Damage fraction = 9.86717 %.
Stress = 2 , Damage fraction = 28.463 %.
Stress = 4 , Damage fraction = 61.66983 %.
'''
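These results follow directly from Miner's rule: the fraction of life consumed per load cycle is the sum of time-at-stress divided by rated life for each stress level. A quick hand check:

# Hand check of the Palmgren-Miner results above
rated_life = [50000, 6500, 1000]       # hours
time_at_stress = [40/60, 15/60, 5/60]  # hours per load cycle

damage_per_cycle = [t / L for t, L in zip(time_at_stress, rated_life)]
total = sum(damage_per_cycle)
print(total * 100)  # 0.01351... % of life used per load cycle
print(1 / total)    # 7400.37... load cycles of service life
print([100 * d / total for d in damage_per_cycle])  # damage fractions: ~9.87%, ~28.46%, ~61.67%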

References:
• Probabilistic Physics of Failure Approach to Reliability (2017), by M. Modarres, M. Amiri, and C. Jackson. pp.
33-37



CHAPTER 26

Acceleration factor

The Arrhenius model for the acceleration factor due to higher temperature is:

$AF = \exp\left[\frac{E_a}{K_B}\left(\frac{1}{T_{use}} - \frac{1}{T_{acc}}\right)\right]$

This function accepts T_use as a mandatory input; you may specify any two of the three other variables, and the third variable will be found.
Inputs:
• T_use - Temp of usage in Celsius
• T_acc - Temp of acceleration in Celsius (optional input)
• Ea - Activation energy in eV (optional input)
• AF - Acceleration factor (optional input)
• print_results - True/False. Default is True
Outputs:
• Results will be printed to console if print_results is True
• AF - Acceleration Factor
• T_acc - Accelerated temperature
• T_use - Use temperature
• Ea - Activation energy (in eV)
In the example below, the acceleration factor is found for an accelerated test at 100°C for a component that is normally
run at 60°C and has an activation energy of 1.2 eV.

from reliability.PoF import acceleration_factor

acceleration_factor(T_use=60, T_acc=100, Ea=1.2)

'''
Acceleration Factor: 88.29574588463338
Use Temperature: 60 °C
Accelerated Temperature: 100 °C
Activation Energy (eV): 1.2 eV
'''
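This result can be checked directly against the Arrhenius equation above, with temperatures converted to Kelvin and the Boltzmann constant in eV/K:

# Direct evaluation of the Arrhenius acceleration factor for the example above
import numpy as np

K_B = 8.617e-5        # Boltzmann constant in eV/K
T_use = 60 + 273.15   # use temperature in Kelvin
T_acc = 100 + 273.15  # accelerated temperature in Kelvin
Ea = 1.2              # activation energy in eV

AF = np.exp((Ea / K_B) * (1 / T_use - 1 / T_acc))
print(AF)  # approximately 88.3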



CHAPTER 27

Solving simultaneous equations with sympy

This document is a tutorial for how to use the Python module sympy to solve simultaneous equations. Since sympy
does this so well, there is no need to implement it within reliability, but users may find this tutorial helpful as
problems involving physics of failure will often require the solution of simultaneous equations. Sympy is not installed
by default when you install reliability so users following this tutorial will need to ensure sympy is installed on their
machine. The following three examples should be sufficient to illustrate how to use sympy for solving simultaneous
equations. Further examples are available in the sympy documentation.
Example 1
Eqn 1: $x + y = 5$
Eqn 2: $x^2 + y^2 = 17$
Solving with sympy:

import sympy as sym


x,y = sym.symbols('x,y')
eq1 = sym.Eq(x+y,5)
eq2 = sym.Eq(x**2+y**2,17)
result = sym.solve([eq1,eq2],(x,y))
print(result)

'''
[(1, 4), (4, 1)] # these are the solutions for (x, y). There are 2 solutions because the equations represent a line passing through a circle.
'''

Example 2
Eqn 1: $a \cdot 1000000^b = 119.54907$
Eqn 2: $a \cdot 1000^b = 405$
Solving with sympy:


import sympy as sym


a,b = sym.symbols('a,b')
eq1 = sym.Eq(a*1000000**b,119.54907)
eq2 = sym.Eq(a*1000**b,405)
result = sym.solve([eq1,eq2],(a,b))
print(result)

'''
[(1372.03074854535, -0.176636273742481)] #these are the solutions for a,b
'''

Example 3
Eqn 1: $2x^2 + y + z = 1$
Eqn 2: $x + 2y + z = c_1$
Eqn 3: $-2x + y = -z$
The actual solution to the above set of equations is:

$x = -\frac{1}{2} + \frac{\sqrt{3}}{2}$

$y = c_1 - \frac{3\sqrt{3}}{2} + \frac{3}{2}$

$z = -c_1 - \frac{5}{2} + \frac{5\sqrt{3}}{2}$

and a second solution:

$x = -\frac{1}{2} - \frac{\sqrt{3}}{2}$

$y = c_1 + \frac{3\sqrt{3}}{2} + \frac{3}{2}$

$z = -c_1 - \frac{5\sqrt{3}}{2} - \frac{5}{2}$
Solving with sympy:

import sympy as sym


x,y,z = sym.symbols('x,y,z')
c1 = sym.Symbol('c1')
eq1 = sym.Eq(2*x**2+y+z,1)
eq2 = sym.Eq(x+2*y+z,c1)
eq3 = sym.Eq(-2*x+y,-z)
result = sym.solve([eq1,eq2,eq3],(x,y,z))
print(result)

'''
[(-1/2 + sqrt(3)/2, c1 - 3*sqrt(3)/2 + 3/2, -c1 - 5/2 + 5*sqrt(3)/2), (-sqrt(3)/2 - 1/2, c1 + 3/2 + 3*sqrt(3)/2, -c1 - 5*sqrt(3)/2 - 5/2)]
'''

Note: If you are using an iPython notebook, the display abilities are much better than the command line interface, so
you can simply add sym.init_printing() after the import line and your equations should be displayed nicely.

A special thanks to Brigham Young University for offering this tutorial.



CHAPTER 28

ALT probability plots

Before reading this section, you should be familiar with what a probability plot is and how to use it. For a detailed explanation, please see the section on probability plots.
The module reliability.ALT_probability_plotting contains four ALT probability plotting functions.
These functions are:
• ALT_probability_plot_Weibull
• ALT_probability_plot_Lognormal
• ALT_probability_plot_Normal
• ALT_probability_plot_Exponential
An ALT probability plot produces a multi-dataset probability plot which includes the probability plots for the data
and the fitted distribution at each stress level, as well as a refitted distribution assuming a common shape parameter at
each stress level. All of these functions perform in a similar way, with the main difference being the distribution that
is fitted. The probability plots provided do not include the Beta distribution (because there are two shape parameters),
the Gamma distribution (because changing the scale parameter will also change the slope of the line even when the
shape parameter is constant) or any of the location shifted distributions (because these are not typically used for ALT
probability plotting). The ALT_probability_plot_Exponential is a special case of the ALT_probability_plot_Weibull
in which the common shape parameter is forced to be 1. An example of this is shown below.
When producing the ALT probability plot, the function automates the following process: fit a distribution to the data for each unique stress level; find the common shape parameter (several methods are provided); refit the distribution to the data for each unique stress level whilst forcing the shape parameter to be equal to the common shape parameter; plot the data along with the original and new fitted distributions; and calculate the change in the common shape parameter from the original shape parameter to see if the model is applicable to this dataset. Each of the ALT plotting functions listed above has the following inputs and outputs.
Inputs:
• failures - an array or list of all the failure times
• failure_stress - an array or list of the corresponding stresses (such as temperature) at which each failure occurred.
This must match the length of failures as each failure is tied to a failure stress.
• right_censored - an array or list of all the right censored failure times


• right_censored_stress - an array or list of the corresponding stresses (such as temperature) at which each
right_censored datapoint was obtained. This must match the length of right_censored as each right_censored
value is tied to a right_censored stress.
• print_results - True/False. Default is True
• show_plot - True/False. Default is True
• common_shape_method - ‘BIC’, ‘weighted_average’, ‘average’. Default is ‘BIC’. This is the method used to
obtain the common_shape (beta) parameter. ‘BIC’ will find the common_shape that gives lowest total BIC
(equivalent to the best overall fit), ‘weighted_average’ will perform a weighted average based on the amount of
data (failures and right censored) for each stress, ‘average’ is simply the average. Note for the Lognormal and
Normal plots, this variable is named common_sigma_method as we are forcing sigma to be a common value.
Outputs:
• The plot will be produced if show_plot is True
• A dataframe of the fitted distributions parameters will be printed if print_results is True
• results - a dataframe of the fitted distributions parameters and change in the shape parameter
• common_shape - the common shape parameter. This is Beta for Weibull, Sigma for Lognormal and Normal,
and 1 for Exponential.
• BIC_sum - the sum of the BIC for each of the distributions when fitted using the common_shape
• AICc_sum - the sum of the AICc for each of the distributions when fitted using the common_shape
The time to run the function will be a few seconds if you have a large amount of data and the common_shape_method
is set to ‘BIC’. This is because the distributions need to be refitted for each iteration of the optimizer (which is usually
around 20 to 30 iterations). With 100 datapoints this should take less than 5 seconds for the ‘BIC’ method, and less
than 1 second for the ‘average’ and ‘weighted_average’ methods. The more data you have, the longer it will take, so
please be patient as a lot of computation is required.
In the following example we will use a dataset from reliability.Datasets which contains failures and right_censored data for three stress levels. We will analyse this dataset using the Weibull and Lognormal ALT probability plots to determine which model is a better fit for the data. All other inputs are left to their default values which gives us the plot and the results dataframe. From the printed results we can see how well the model fits our data. The AICc and BIC values suggest that the Lognormal model is a slightly better fit overall for this dataset, but both models would be suitable. The fitted distributions with a common shape parameter still agree well with the majority of our data (except for the lower tail of the 40 degree data), and the amount of change to the shape parameter was within the acceptable limits. See the section below for more details on what we are looking to get out of these plots.
from reliability.ALT_probability_plotting import ALT_probability_plot_Weibull, ALT_probability_plot_Lognormal
from reliability.Datasets import ALT_temperature
import matplotlib.pyplot as plt

plt.figure()
plt.subplot(121)
ALT_probability_plot_Weibull(failures=ALT_temperature().failures, failure_stress=ALT_temperature().failure_stresses, right_censored=ALT_temperature().right_censored, right_censored_stress=ALT_temperature().right_censored_stresses)
plt.subplot(122)
ALT_probability_plot_Lognormal(failures=ALT_temperature().failures, failure_stress=ALT_temperature().failure_stresses, right_censored=ALT_temperature().right_censored, right_censored_stress=ALT_temperature().right_censored_stresses)
plt.gcf().set_size_inches(15,7)
plt.show()

'''
ALT Weibull probability plot results:
stress  original alpha  original beta  new alpha     common beta  beta change
40      21671.954375    1.625115       21671.954523  1.519015     -6.53%
60      6628.557163     1.315739       6628.557053   1.519015     +15.45%
80      1708.487268     1.397979       1831.456045   1.519015     +8.66%
Total AICc: 693.399657861714
Total BIC: 700.5811529186203

ALT Lognormal probability plot results:
stress  original mu  original sigma  new mu    common sigma  sigma change
40      9.814749     1.008337        9.717867  0.939793      -6.8%
60      8.644075     1.187552        8.507165  0.939793      -20.86%
80      7.141321     0.770355        7.147846  0.939793      +21.99%
Total AICc: 690.9683704897413
Total BIC: 698.1498655466478
'''

In this second example, we examine the difference between ALT_probability_plot_Weibull and ALT_probability_plot_Exponential. A dataset is generated from several Exponential distributions. Ideally, we want to fit a distribution to this data which does not overfit, such that it should have as few parameters as necessary. Both the Weibull and Exponential distributions could be used here, but we know the Exponential is a more appropriate distribution since it was the source of the data. Upon examination of the results, we see very little difference between the common shape (from Exponential) and common beta (from Weibull) and very little difference in the plots, but the AICc and BIC are both lower for the Exponential model, indicating that the Exponential distribution should be used in preference to the Weibull distribution. Conveniently, the function ALT_probability_plot_Exponential also provides the AICc and BIC results from Weibull and will print a warning if it finds Weibull to be a more appropriate fit than Exponential based on the BIC.

from reliability.ALT_probability_plotting import ALT_probability_plot_Weibull, ALT_probability_plot_Exponential
import matplotlib.pyplot as plt
import numpy as np
from reliability.Distributions import Exponential_Distribution

# create the data using an Exponential distribution
data1 = Exponential_Distribution(Lambda=1 / 100).random_samples(10, seed=42)
data2 = Exponential_Distribution(Lambda=1 / 500).random_samples(10, seed=42)
data3 = Exponential_Distribution(Lambda=1 / 3000).random_samples(10, seed=42)
f = np.hstack([data1, data2, data3])
f_stress = np.hstack([np.ones_like(data1) * 50, np.ones_like(data1) * 40, np.ones_like(data1) * 30])

# plot the data
plt.subplot(121)
ALT_probability_plot_Exponential(failures=f, failure_stress=f_stress)
plt.subplot(122)
ALT_probability_plot_Weibull(failures=f, failure_stress=f_stress, common_shape_method='average')
plt.subplots_adjust(right=0.94, wspace=0.35)
plt.show()

'''
ALT Exponential probability plot results:
stress  weibull alpha  weibull beta  new 1/Lambda  common shape  shape change
30.0    3304.561499    1.085037      3080.910234   1.0           -7.84%
40.0    527.347816     1.074161      513.485039    1.0           -6.9%
50.0    105.469669     1.074162      102.697008    1.0           -6.9%
Total AICc: 445.619684746024
Total BIC: 445.0274400250061
Total AICc (weibull): 455.05689896150307
Total BIC (weibull): 451.7295523766102

ALT Weibull probability plot results:
stress  original alpha  original beta  new alpha    common beta  beta change
30.0    3304.561499     1.085037       3304.561393  1.077786     -0.67%
40.0    527.347816      1.074161       528.205229   1.077786     +0.34%
50.0    105.469669      1.074162       105.603906   1.077786     +0.34%
Total AICc: 455.05689896150307
Total BIC: 451.7295523766102
'''


28.1 Getting your input data in the right format

Because the ALT probability plots need failures and right censored data from many stress levels, it was not practical to
make an input for each stress level. Instead, the failure times are combined in a single input and the failure_stress input
provides a list of the corresponding stresses at which each failure occurred. The same is true of the right_censored and
right_censored_stress inputs.
To get your data in the correct format, ensure you have combined all your failure times into a single list or numpy array and there is a corresponding list or array of the same length that provides all of the stresses. The following example illustrates one method to do this if you do not have the list already imported from Excel or another source. This is done for failures only, but if you have right_censored data then you would do the same thing, keeping it separate from the failure data. There is no need to sort the data in any particular order as this is all done automatically. The only requirement is that the length of failures matches the length of the failure_stress, and that there are no new stresses in right_censored_stress that are not present in failure_stress.
import numpy as np

# create the data
failure_times_at_stress_1 = [800,850,910,940]
failure_stress_1 = [40,40,40,40]
failure_times_at_stress_2 = [650,670,715,740]
failure_stress_2 = [50,50,50,50]
failure_times_at_stress_3 = [300,320,350,380]
failure_stress_3 = [60,60,60,60]

# combine the data
failures = np.hstack([failure_times_at_stress_1, failure_times_at_stress_2, failure_times_at_stress_3])
failure_stresses = np.hstack([failure_stress_1, failure_stress_2, failure_stress_3])

# print for inspection
print(failures)
print(failure_stresses)

'''
[800 850 910 940 650 670 715 740 300 320 350 380]
[40 40 40 40 50 50 50 50 60 60 60 60]
'''

28.2 What does an ALT probability plot show me?

An ALT probability plot shows us how well our dataset can be modeled by the chosen distribution. This is more than
just a goodness of fit at each stress level, because the distribution needs to be a good fit at all stress levels and be
able to fit well with a common shape parameter. If you find the shape parameter changes significantly as the stress
increases then it is likely that your accelerated life test is experiencing a different failure mode at higher stresses. When
examining an ALT probability plot, the main things we are looking for are:
• Does the model appear to fit the data well at all stress levels (ie. the dashed lines pass reasonably well through
all the data points)
• Examine the AICc and BIC values when comparing multiple models. A lower value suggests a better fit.
• Is the amount of change to the shape parameter within the acceptable limits (generally less than 50% for each
distribution).
The image provided above shows two distributions that fit well. If we apply the same data to the function ALT_probability_plot_Normal as shown in the example below, we get the image shown below. From this image we can see that the model does not fit well at the higher stress (80 degrees) and the amount of change to the shape parameter was up to 93%. Also note that the total AICc and total BIC for the Normal_2P model are higher (worse) than for the Weibull_2P and Lognormal_2P models shown in the first example. Based on these results, we would reject the Normal_2P model and try another model. If you find that none of the models work without large changes to the shape parameter at the higher stresses, then you can conclude that there must be a change in the failure mode for higher stresses and you may need to look at changing your accelerated test to keep the failure mode consistent across tests.

from reliability.ALT_probability_plotting import ALT_probability_plot_Normal
from reliability.Datasets import ALT_temperature
import matplotlib.pyplot as plt

ALT_probability_plot_Normal(failures=ALT_temperature().failures, failure_stress=ALT_temperature().failure_stresses, right_censored=ALT_temperature().right_censored, right_censored_stress=ALT_temperature().right_censored_stresses)
plt.show()

'''
ALT Normal probability plot results:
stress  original mu  original sigma  new mu       common sigma  sigma change
40      9098.952789  3203.855964     7764.809372  2258.042218   -29.52%
60      5174.506831  3021.353462     4756.980072  2258.042218   -25.26%
80      1600.117162  1169.703757     1638.730664  2258.042218   +93.04%
Total AICc: 716.6685648106153
Total BIC: 723.8500598675216
'''


References:
• Probabilistic Physics of Failure Approach to Reliability (2017), by M. Modarres, M. Amiri, and C. Jackson. pp.
117-137



CHAPTER 29

Fitting a model to ALT data

Before reading this section, you should be familiar with ALT probability plots, and Fitting distributions to non-ALT datasets.
The module reliability.ALT_fitters contains fitting functions for 20 different ALT life-stress models. Each model is a combination of a life model with the scale or location parameter replaced by a life-stress model. For example, the Weibull-Exponential model is found by replacing the $\alpha$ parameter with the equation for the exponential life-stress model as follows:

Weibull PDF: $f(t) = \frac{\beta t^{\beta-1}}{\alpha^{\beta}}.\exp\left(-\left(\frac{t}{\alpha}\right)^{\beta}\right)$

Exponential Life-Stress model: $L(T) = b.\exp\left(\frac{a}{T}\right)$

Replacing $\alpha$ with $L(T)$ gives the PDF of the Weibull-Exponential model:

Weibull-Exponential: $f(t,T) = \frac{\beta t^{\beta-1}}{\left(b.\exp\left(\frac{a}{T}\right)\right)^{\beta}}.\exp\left(-\left(\frac{t}{b.\exp\left(\frac{a}{T}\right)}\right)^{\beta}\right)$

The correct substitutions for each type of model are:

Weibull: $\alpha = L(T)$

Normal: $\mu = L(T)$

Lognormal: $\mu = \ln(L(T))$

Exponential: $\lambda = \frac{1}{L(T)}$

The life models available are:
• Weibull_2P
• Normal_2P
• Lognormal_2P
• Expon_1P

The life-stress models available are:

Exponential (also used for the Arrhenius equation): $L(T) = b.\exp\left(\frac{a}{T}\right)$

Eyring: $L(T) = \frac{1}{T}.\exp\left(-\left(c - \frac{a}{T}\right)\right)$

Power (also known as inverse power): $L(S) = a.S^{n}$

Dual-Exponential (also known as Temperature-Humidity): $L(T,H) = c.\exp\left(\frac{a}{T} + \frac{b}{H}\right)$

Power-Exponential (also known as Thermal-Non-Thermal): $L(T,S) = c.S^{n}.\exp\left(\frac{a}{T}\right)$
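To make the substitution concrete, below is a minimal sketch of the Weibull-Exponential PDF obtained by inserting the exponential life-stress model in place of the Weibull alpha parameter. The parameter values are arbitrary and for illustration only.

# Minimal sketch of the Weibull-Exponential PDF (arbitrary parameter values)
import numpy as np

def weibull_exponential_pdf(t, T, a, b, beta):
    alpha = b * np.exp(a / T)  # the exponential life-stress model replaces alpha
    return (beta * t**(beta - 1) / alpha**beta) * np.exp(-(t / alpha)**beta)

print(weibull_exponential_pdf(t=500, T=350, a=2000, b=3, beta=2))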
When choosing a model, it is important to consider the physics involved in the life-stress model rather than just trying
everything to see what fits best. For example, the Power-Exponential model is most appropriate for a dataset that was
obtained from an ALT reliability test with a thermal and a non-thermal stress (such as temperature and voltage). It
would be inappropriate to model the data from a Temperature-Humidity test using a Power-Exponential model as the
physics suggests that a Temperature-Humidity test should be modelled using the Dual-Exponential model.
Each of the fitting functions works in a very similar way so the documentation below can be applied to all of the
models with minor modifications to the parameter names of the outputs. The following documentation is for the
Weibull-Power model.
Inputs:
• failures - an array or list of the failure times.
• failure_stress - an array or list of the corresponding stresses (such as temperature) at which each failure occurred.
This must match the length of failures as each failure is tied to a failure stress.
• right_censored - an array or list of all the right censored failure times
• right_censored_stress - an array or list of the corresponding stresses (such as temperature) at which each
right_censored data point was obtained. This must match the length of right_censored as each right_censored
value is tied to a right_censored stress.
• use_level_stress - The use level stress at which you want to know the mean life. Optional input.
• print_results - True/False. Default is True
• show_plot - True/False. Default is True
• CI - confidence interval for estimating confidence limits on parameters. Must be between 0 and 1. Default is
0.95 for 95% CI.
• initial_guess - starting values for [a,n]. Default is calculated using a curvefit to failure data. Optional input. If
fitting fails, you will be prompted to try a better initial guess and you can use this input to do it.
Outputs:
• a - fitted parameter from the Power model
• n - fitted parameter from the Power model
• beta - the fitted Weibull_2P beta
• loglik2 - LogLikelihood*-2
• AICc - Akaike Information Criterion
• BIC - Bayesian Information Criterion
• a_SE - the standard error (sqrt(variance)) of the parameter
• n_SE - the standard error (sqrt(variance)) of the parameter
• beta_SE - the standard error (sqrt(variance)) of the parameter
• a_upper - the upper CI estimate of the parameter
• a_lower - the lower CI estimate of the parameter
• n_upper - the upper CI estimate of the parameter


• n_lower - the lower CI estimate of the parameter


• beta_upper - the upper CI estimate of the parameter
• beta_lower - the lower CI estimate of the parameter
• results - a dataframe of the results (point estimate, standard error, Lower CI and Upper CI for each parameter)
• mean_life - the mean life at the use_level_stress. Only calculated if use_level_stress is specified
In the following example, we will fit the Weibull-Power model to an ALT dataset obtained from a fatigue test. This
dataset can be found in reliability.Datasets. We want to know the mean life at the use level of 60 so the
parameter use_level_stress is specified. All other values are left as defaults and the results and plot are shown.

from reliability.ALT_fitters import Fit_Weibull_Power
from reliability.Datasets import ALT_load2
import matplotlib.pyplot as plt

data = ALT_load2()
Fit_Weibull_Power(failures=data.failures, failure_stress=data.failure_stresses, right_censored=data.right_censored, right_censored_stress=data.right_censored_stresses, use_level_stress=60)
plt.show()

'''
Results from Fit_Weibull_Power (95% CI):
           Point Estimate  Standard Error        Lower CI       Upper CI
Parameter
a           398816.280655   519397.785342  -619184.672265   1.416817e+06
n               -1.417306        0.243944       -1.895428  -9.391834e-01
beta             3.017297        0.716426        1.894563   4.805374e+00
At the use level stress of 60 , the mean life is 1075.32841
'''


In this second example, we will fit a dual stress model to a dual stress dataset. The dataset contains temperature and voltage data, so it is most appropriate to model it using a Power-Exponential model. A few differences to note with the dual stress models are that each stress requires a separate input, so if you also have censored data then this will require 6 inputs. If using the Power-Exponential model, it is essential that the thermal and non-thermal stresses go in their named inputs or the model will likely fail to fit the data. In this example we want to know the life at a use level stress of 325 K and 0.5 V, which the output tells us is 4673 hours.

from reliability.ALT_fitters import Fit_Weibull_Power_Exponential
from reliability.Datasets import ALT_temperature_voltage
import matplotlib.pyplot as plt

data = ALT_temperature_voltage()
Fit_Weibull_Power_Exponential(failures=data.failures, failure_stress_thermal=data.failure_stress_temp, failure_stress_nonthermal=data.failure_stress_voltage, use_level_stress=[325, 0.5])
plt.show()

'''
Results from Fit_Weibull_Power_Exponential (95% CI):
           Point Estimate  Standard Error     Lower CI     Upper CI
Parameter
a             3404.486044      627.680074  2174.255705  4634.716383
c                0.087610        0.141218    -0.189172     0.364393
n               -0.713424        0.277561    -1.257434    -0.169413
beta             4.997527        1.173998     3.153512     7.919829
At the use level stresses of 325 and 0.5 , the mean life is 4673.15246
'''


References:
• Probabilistic Physics of Failure Approach to Reliability (2017), by M. Modarres, M. Amiri, and C. Jackson. pp.
136-168
• Accelerated Life Testing Data Analysis Reference - ReliaWiki, Reliawiki.com, 2019. [Online].



CHAPTER 30

Similar Distributions

The function similar_distributions is a tool for finding the probability distributions that are most similar to an input
distribution. It samples the CDF of an input distribution and then fits all other distributions to those samples to
determine the best fitting and therefore most similar distributions.
Inputs:
• distribution - a distribution object created using the reliability.Distributions module
• include_location_shifted - True/False. Default is True. When set to True it will include Weibull_3P, Lognormal_3P, Gamma_3P, Expon_2P
• show_plot - True/False. Default is True
• print_results - True/False. Default is True
• number_of_distributions_to_show - the number of similar distributions to show. Default is 3. If the number
specified exceeds the number available (typically 8), then the number specified will automatically be reduced.
Outputs:
• If show_plot is True then the plot of PDF and CDF will automatically be shown.
• If print_results is True then the parameters of the most similar distributions will be printed.
• results - an array of distributions objects ranked in order of best fit.
• most_similar_distribution - a distribution object. This is the first item from results.
In the example below, we create a Weibull Distribution object using the reliability.Distributions module. We then provide the Weibull Distribution as input to similar_distributions and the output reveals the top 3 most similar distributions. The optional input include_location_shifted has been set to False.

from reliability.Distributions import Weibull_Distribution
from reliability.Other_functions import similar_distributions

dist = Weibull_Distribution(alpha=50, beta=3.3)
similar_distributions(distribution=dist, include_location_shifted=False)

'''
The input distribution was:
Weibull Distribution (α=50,β=3.3)

The top 3 most similar distributions are:
Normal Distribution (μ=44.8471,σ=14.9226)
Gamma Distribution (α=5.7607,β=7.7849)
Lognormal Distribution (μ=3.7377,σ=0.3856)
'''



CHAPTER 31

Convert dataframe to grouped lists

This function was written because many of our datasets are in the form of columns with one column indicating the
label (eg. success, failure) and the other column providing the value. Since many functions in reliability are
written to accept grouped lists (lists where all the values belong to the same group such as right_censored failure
times), it is beneficial to have a fast way to perform this conversion. This function will split the dataframe into as many
grouped lists as there are unique values in the labels (left) column.
Inputs:
• input_dataframe. This must be a dataframe containing 2 columns, where the label column is the left column and
the value column is the right column. The column titles are not important.
Outputs:
• lists , names - lists is a list of the grouped lists, and names is the identifying labels used to group the lists from
the first column.
In the example below, we will create some data in a pandas dataframe, print it to see what it looks like, and then split
it up using this function and print the results.

from reliability.Other_functions import convert_dataframe_to_grouped_lists
import pandas as pd

# create some data in a dataframe
data = {'outcome': ['Failed', 'Censored', 'Failed', 'Failed', 'Censored'],
        'cycles': [1253, 1500, 1342, 1489, 1500]}
df = pd.DataFrame(data, columns=['outcome', 'cycles'])
print(df, '\n')

# usage of the function
lists, names = convert_dataframe_to_grouped_lists(df)
print(names[0], lists[0])
print(names[1], lists[1])

'''
    outcome  cycles
0    Failed    1253
1  Censored    1500
2    Failed    1342
3    Failed    1489
4  Censored    1500

Censored [1500, 1500]
Failed [1253, 1342, 1489]
'''



CHAPTER 32

Crosshairs

This function provides interactive crosshairs on matplotlib plots. The crosshairs will follow the users’ mouse cursor
when they are near lines or points and will snap to these lines and points. Upon a mouse click the crosshairs will add
an annotation to the plot. This annotation can be dragged to a new position. To delete the annotation, right click on it.
To temporarily hide all annotations, toggle ‘h’ on your keyboard.
Note that crosshairs should be called after everything is added to the plot (but before plt.show()) so that the objects in
the plot are identified for the ‘snap to’ feature. If something is added to the plot after calling crosshairs then you will
not be able to move the crosshairs onto it.
If your interactive development environment does not generate the plot in its own window then your plot is not
interactive and this will not work. For iPython notebook users, the interactive window should be available by typing
“%matplotlib qt” after importing matplotlib as described here.
There are some customisable attributes of the crosshairs and annotations using the following inputs:
• xlabel - the x-label for the annotation. Default is x.
• ylabel - the y-label for the annotation. Default is y.
• decimals - the number of decimals to use when rounding values in the crosshairs and in the annotation. Default
is 2.
• plotting kwargs are also accepted. eg. color, linestyle, etc.
In the following example, we see the crosshairs being used to display the value of the Weibull CDF. The dynamic
nature of this feature is shown in the video at the bottom of this page.

from reliability.Other_functions import crosshairs


from reliability.Distributions import Weibull_Distribution
import matplotlib.pyplot as plt

Weibull_Distribution(alpha=50,beta=2).CDF()
crosshairs(xlabel='t',ylabel='F') #it is important to call this last
plt.show()


A special thanks goes to Antony Lee, the author of mplcursors. The crosshairs function works using mplcursors to
enable the ‘snap to’ feature and the annotations. Antony was very helpful in getting this to work.



CHAPTER 33

Histogram

This function plots a histogram using the matplotlib histogram (plt.hist()), but adds some additional features. Default
formatting is improved, the number of bins is optimized using the Freedman–Diaconis rule, and there is an option to
shade the bins white above a chosen threshold. If you would like to specify the number of bins rather than having the
optimal number calculated, then the bins argument allows this.
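For reference, the Freedman-Diaconis rule sets the bin width from the interquartile range (IQR) of the data. A minimal sketch of the calculation:

# Minimal sketch of the Freedman-Diaconis rule: bin width = 2*IQR/n^(1/3)
import numpy as np

def freedman_diaconis_bins(data):
    data = np.asarray(data)
    iqr = np.percentile(data, 75) - np.percentile(data, 25)  # interquartile range
    bin_width = 2 * iqr / len(data) ** (1 / 3)
    return int(np.ceil((data.max() - data.min()) / bin_width))

print(freedman_diaconis_bins(np.random.normal(50, 10, 500)))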
Inputs:
• data - the data to plot. Array or list.
• white_above - bins above this value will be shaded white
• bins - the number of bins to use. Must be int. Leave empty to have the optimal number calculated automatically
• density - True/False. Default is True. Always use True if plotting with a probability distribution.
• cumulative - True/False. Default is False. Use False for PDF and True for CDF.
• kwargs - plotting kwargs for the histogram (color, alpha, etc.)
The following example shows the difference between the appearance of the default histogram in matplotlib, and the
histogram in reliability.
from reliability.Distributions import Gamma_Distribution
from reliability.Fitters import Fit_Gamma_2P
from reliability.Other_functions import make_right_censored_data, histogram
import matplotlib.pyplot as plt

a = 30
b = 4
threshold = 180  # this is used when right censoring the data
dist = Gamma_Distribution(alpha=30, beta=4)
raw_data = dist.random_samples(500, seed=2)  # create some data. Seeded for repeatability
data = make_right_censored_data(raw_data, threshold=threshold)  # right censor the data
gf = Fit_Gamma_2P(failures=data.failures, right_censored=data.right_censored, show_probability_plot=False, print_results=False)

plt.subplot(121)
gf.distribution.PDF()
plt.hist(raw_data, density=True)  # default histogram from matplotlib
plt.title('matplotlib histogram')

plt.subplot(122)
gf.distribution.PDF()
histogram(raw_data, white_above=threshold)  # histogram from reliability - better formatting, optimal bin width, white_above option
plt.title('reliability histogram')

plt.subplots_adjust(right=0.95, wspace=0.38)
plt.show()



CHAPTER 34

Make right censored data

This function is a simple tool to generate right censored data using complete data and a threshold from which to right
censor the data. It is primarily used in testing of the Fitters functions when some right censored data is needed.
Inputs:
• data - list or array of data
• threshold - point to right censor (right censoring is done if value is > threshold)
Outputs:
• failures - array of failures (data <= threshold)
• right_censored - array of right_censored values (data > threshold). These will be set at the value of the threshold.

from reliability.Other_functions import make_right_censored_data


output = make_right_censored_data(data=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], threshold=6)
print('Failures:',output.failures)
print('Right Censored:',output.right_censored)

'''
Failures: [1 2 3 4 5 6]
Right Censored: [6 6 6 6] #the numbers 7 to 10 have been set equal to the threshold
'''



CHAPTER 35

Datasets

There are a few datasets that have been included with reliability that users may find useful for testing and experimenting. While this list is currently small, expect it to increase significantly over time. Within reliability.Datasets the following datasets are available:
Standard datasets
• automotive - 10 failures, 21 right censored. It is used in this example
• defective_sample - 1350 failures, 12296 right censored. It exhibits the behavior of a defective sample (also
known as Limited failure population or Defective subpopulation).
• electronics - 10 failures, 4072 right censored. It is used in this example.
ALT Datasets
• ALT_temperature - conducted at 3 temperatures. 35 failures, 102 right censored. For example usage of many of
the ALT Datasets see the examples here.
• ALT_temperature2 - conducted at 4 temperatures. 40 failures, 20 right censored.
• ALT_temperature3 - conducted at 3 temperatures. 30 failures, 0 right censored.
• ALT_load - conducted at 3 loads. 20 failures, 0 censored.
• ALT_load2 - conducted at 3 loads. 13 failures, 5 right censored.
• ALT_temperature_voltage - conducted at 2 different temperatures and 2 different voltages. 12 failures, 0 right
censored.
• ALT_temperature_voltage2 - conducted at 3 different temperatures and 2 different voltages. 18 failures, 8 right
censored.
• ALT_temperature_humidity - conducted at 2 different temperatures and 2 different humidities. 12 failures, 0
right censored.
MCF Datasets
• MCF_1 - this dataset contains failure and retirement times for 5 repairable systems. Exhibits a worsening repair
rate.

• MCF_2 - this dataset contains failure and retirement times for 56 repairable systems. Exhibits a worsening then improving repair rate. This dataset is difficult to fit.
All datasets are functions which create objects and every dataset object has several attributes.
For the standard datasets, these attributes are:
• info - a dataframe of statistics about the dataset
• failures - a list of the failure data
• right_censored - a list of the right_censored data
• right_censored_stress - a list of the right_censored stresses (ALT datasets only)
For the ALT datasets, these attributes are similar to the above standard attributes, just with some variation for the
specific dataset. These include things like:
• failure_stress_humidity
• right_censored_stress_voltage
• failure_stress_temp
• other similarly named attributes based on the dataset
For the MCF datasets these attributes are:
• times
• number_of_systems
If you would like more information on a dataset, you can type the name of the dataset in the help function (after
importing it).

from reliability.Datasets import automotive


print(help(automotive))

If you would like the statistics about a dataset you can access the info dataframe as shown below.

from reliability.Datasets import defective_sample


print(defective_sample().info)

'''
Stat Value
Name defective_sample
Total Values 13645
Failures 1350 (9.89%)
Right Censored 12295 (90.11%)
'''

The following example shows how to import a dataset and use it. Note that we must use () before accessing the failures
and right_censored values.

from reliability.Datasets import automotive
from reliability.Fitters import Fit_Weibull_2P

Fit_Weibull_2P(failures=automotive().failures, right_censored=automotive().right_censored, show_probability_plot=False)

'''
Results from Fit_Weibull_2P (95% CI):
           Point Estimate  Standard Error      Lower CI       Upper CI
Parameter
Alpha       140882.303527    49299.609699  70956.382925  279718.647273
Beta             1.132769        0.301468      0.672370       1.908422
Log-Likelihood: -128.98350896528038
'''

If you have an interesting dataset, please email me (m.reid854@gmail.com) and I may include it in this database.



CHAPTER 36

One sample proportion

This function calculates the upper and lower bounds of reliability for a given number of trials and successes. It is most
applicable to analysis of test results in which there are only success/failure results and the analyst wants to know the
reliability of the batch given those sample results.
Inputs:
• trials - the number of trials which were conducted
• successes - the number of trials which were successful
• CI - the desired confidence interval. Defaults to 0.95 for 95% CI.
Outputs:
• lower, upper - confidence interval limits. Note that this will return nan for lower or upper if a one-sided CI is calculated (i.e. when successes=0 or successes=trials).
In this example, consider a scenario in which we have a large batch of items that we need to test for their reliability.
The batch is large and testing is expensive so we will conduct the test on 30 samples. From those 30 samples, 29
passed the test. If the batch needs at least 85% reliability with a 95% confidence, should we accept or reject the
batch?

from reliability.Reliability_testing import one_sample_proportion

result = one_sample_proportion(trials=30, successes=29)
print(result)

'''
(0.8278305443665873, 0.9991564290733695)
'''

The lower bound (with 95% confidence) on the reliability was 82.78%. Since this is below our requirement of 85%, we should reject the batch.
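These bounds agree with the exact (Clopper-Pearson) binomial confidence interval, so they can be cross-checked with scipy.stats.beta. The sketch below is for verification only and makes no claim about how reliability computes the bounds internally.

from scipy.stats import beta

n, x, CI = 30, 29, 0.95  # trials, successes, confidence interval
alpha = 1 - CI
lower = beta.ppf(alpha / 2, x, n - x + 1)      # 0.82783...
upper = beta.ppf(1 - alpha / 2, x + 1, n - x)  # 0.99915...
print(lower, upper)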



CHAPTER 37

Two proportion test

This function determines if there is a statistically significant difference in the results from two different tests. As with one_sample_proportion, we are working with results from a success/failure test, but now we want to know whether the difference in results between two tests is statistically significant.
Inputs:
• sample_1_trials - number of trials in the first sample
• sample_1_successes - number of successes in the first sample
• sample_2_trials - number of trials in the second sample
• sample_2_successes - number of successes in the second sample
• CI - desired confidence interval. Defaults to 0.95 for 95% CI.
Outputs:
• lower, upper, result - lower and upper are the bounds on the difference between the two proportions. If the bounds include 0 then the difference is statistically non-significant.
In this example, consider that sample 1 and sample 2 are batches of items that two suppliers sent you as part of their
contract bidding process. You test everything each supplier sent you and need to know whether the reliability difference
between suppliers is significant. At first glance, the reliability for sample 1 is 490/500 = 98%, and for sample 2 is
770/800 = 96.25%. Without considering the confidence intervals, we might be inclined to think that sample 1 is almost
2% better than sample 2. Let's run the two proportion test with the 95% confidence interval.
from reliability.Reliability_testing import two_proportion_test

result = two_proportion_test(sample_1_trials=500, sample_1_successes=490, sample_2_trials=800, sample_2_successes=770)
print(result)

'''
(-0.0004972498915250083, 0.03549724989152493, 'non-significant')
'''

Because the confidence interval on the difference includes 0, we can say with 95% confidence that there is no statistically significant difference between the suppliers based on the results from the batches supplied.
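The returned bounds match the standard normal-approximation (Wald) confidence interval for a difference of two proportions. A minimal cross-check of that calculation is sketched below; it is for illustration only and is not part of reliability.

from scipy.stats import norm

n1, x1 = 500, 490  # sample 1 trials and successes
n2, x2 = 800, 770  # sample 2 trials and successes
p1, p2 = x1 / n1, x2 / n2
se = (p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2) ** 0.5  # standard error of the difference
z = norm.ppf(1 - (1 - 0.95) / 2)  # two-sided 95% CI
print(p1 - p2 - z * se, p1 - p2 + z * se)  # (-0.000497..., 0.035497...)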



CHAPTER 38

Sample size required for no failures

The function sample_size_no_failures is used to determine the minimum sample size required for a test in
which no failures are expected, and the desired outcome is the lower bound on the reliability based on the sample size
and desired confidence interval.
Inputs:
• reliability - lower bound on product reliability (between 0 and 1)
• CI - confidence interval of result (between 0.5 and 1). Defaults to 0.95 for 95% CI.
• lifetimes - if testing the product for multiple lifetimes then more failures are expected so a smaller sample size
will be required to demonstrate the desired reliability (assuming no failures). Conversely, if testing for less than
one full lifetime then a larger sample size will be required. Default is 1.
• weibull_shape - if the weibull shape (beta) of the failure mode is known, specify it here. Otherwise leave the
default of 1 for the exponential distribution.
Outputs:
• number of items required in the test. This will always be an integer (rounded up).
As an example, consider a scenario in which we want to be sure that a batch of LEDs meets the reliability target for
on/off cycles. Testing is for the planned lifetime (1 million cycles) and tested items will have most or all of their
lifetime used up during testing so we can’t test everything. How many items from the batch do we need to test to
ensure we achieve 99.9% reliability with a 95% confidence interval?

from reliability.Reliability_testing import sample_size_no_failures


result = sample_size_no_failures(reliability=0.999)
print(result)

'''
2995
'''

Based on this result, we need to test 2995 items from the batch and not have a single failure in order to be 95% confident that the reliability of the batch meets or exceeds 99.9%. If we tested for more on/off cycles (let's say 3 million, which is 3 lifetimes), then the number of successful results would only need to be 999. In this way, we can design our qualification test based on the desired reliability, CI, and lifetimes that are tested to.
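The 3-lifetime case mentioned above can be checked directly using the lifetimes argument:

from reliability.Reliability_testing import sample_size_no_failures

result = sample_size_no_failures(reliability=0.999, lifetimes=3)
print(result)

'''
999
'''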
If we suffer a single failure during this test, we will need to adjust the testing method, either by finishing the testing and calculating the lower bound on reliability using the one_sample_proportion test, or by using a sequential_sampling_chart.



CHAPTER 39

Sequential sampling chart

A sequential sampling chart provides decision boundaries so that a success/failure test may be stopped as soon as there
have been enough successes or enough failures to exceed the decision boundary. The decision boundary is calculated
based on four parameters. These are the producer’s quality, consumer’s quality, producer’s risk, and consumer’s risk.
Consider the producer’s and consumer’s quality to be the desired reliability of the sample, and the producer’s and
consumer’s risk to be 1-confidence interval.
Inputs:
• p1 - producer_quality. The acceptable failure rate for the producer (typically around 0.01)
• p2 - consumer_quality. The acceptable failure rate for the consumer (typically around 0.1)
• alpha - producer_risk. Producer’s CI = 1-alpha (typically 0.05)
• beta - consumer_risk. Consumer’s CI = 1-beta (typically 0.1)
• test_results - array or list of binary test results. e.g. [0,0,0,1] for 3 successes and 1 failure. Default=None
• show_plot - True/False. Defaults to True.
• print_results - True/False. Defaults to True.
• max_samples - the x_lim of the plot. Default=100.
Outputs:
• The sequential sampling chart - A plot of sequential sampling chart with decision boundaries. test_results are
only plotted on the chart if provided as an input.
• results - a dataframe of tabulated decision results. These will be printed if print_results=True
In the example below, we use the inputs p1=0.01, p2=0.10, alpha=0.05, beta=0.10. The resulting decision boundaries are plotted, and the test results that we have supplied are also plotted as a stepped line. The plot shows that after our 3rd failure, the test should be stopped as the batch can be rejected. The dataframe of results is also printed by default. In this dataframe, the value of x is used to replace impossible numbers, i.e. we cannot reject 2 failures if we have only conducted 1 inspection. This example is based on an example in the Engineering Statistics Handbook published online by NIST.


from reliability.Reliability_testing import sequential_samling_chart

test_results = [0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1]

sequential_samling_chart(p1=0.01, p2=0.10, alpha=0.05, beta=0.10, test_results=test_results)

'''
Failures to accept Failures to reject
Samples
0 x x
1 x x
2 x 2
3 x 2
4 x 2
5 x 2
6 x 2
7 x 2
8 x 2
9 x 2
10 x 2
11 x 2
12 x 2
13 x 2
14 x 2
15 x 2
16 x 2
17 x 2
18 x 2
19 x 2
20 x 3
21 x 3
22 x 3
23 x 3
24 0 3
25 0 3
26 0 3
27 0 3
28 0 3
29 0 3
... ... ...
71 1 5
72 1 5
73 1 5
74 2 5
75 2 5
76 2 5
77 2 5
78 2 5
79 2 5
80 2 5
81 2 5
82 2 5
83 2 5
84 2 5
85 2 5
86 2 5
87 2 5
88 2 5
89 2 5
90 2 5
91 2 5
92 2 5
93 2 5
94 2 5
95 2 5
96 2 6
97 2 6
98 2 6
99 2 6
100 3 6
'''



CHAPTER 40

Reliability test planner

A solver to determine the parameters of a reliability test when given 3 out of the 4 unknowns (lower confidence bound
on MTBF, test duration, number of failures, confidence interval).
The underlying assumption is that the failures follow an exponential distribution (i.e. failures occur randomly and the hazard rate does not change with age). Using this assumption, the Chi-squared distribution is used to find the lower confidence bound on MTBF for a given test duration, number of failures, and specified confidence interval:

$$MTBF = \frac{2T}{\chi^2\left(\frac{1-CI}{n},\; 2F+p\right)}$$
Where:
• MTBF = Mean time between failures (same as mean time to failure (MTTF) when the hazard rate is constant, as it is here). Note that this is the lower confidence bound on MTBF. If you want the point estimate then specify CI=0.5 and one_sided=True.
• T = Test duration (this is the total time on test across all units being tested)
• CI = Confidence interval (the confidence interval to be used for the lower bound on the MTBF)
• F = number of failures during the test
• n = adjustment for one sided (n=1) or two sided (n=2) test
• p = adjustment for time terminated (p=2) or failure terminated (p=0) test
The above formula can be rearranged, or solved iteratively to determine any of these parameters when given the other
3. The user must specify any 3 out of the 4 variables (not including one_sided, print_results, or time_terminated)
and the remaining variable will be calculated. Note the difference between the one-sided and two-sided confidence
intervals which are specified using the input one_sided=True/False described below. A description of the difference
between one-sided and two-sided confidence intervals is provided at the end of this page. The formula used defaults
to a time_terminated test (where the test was stopped at a particular time which was not related to the number of
failures). If the test was stopped after a particular number of failures (such as all items failing) then you must specify
time_terminated=False to ensure the correct formula is used.
A similar calculator is available in the reliability analytics toolkit.
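As a quick sanity check of the formula, the lower bound can be computed directly with scipy.stats.chi2. The sketch below reproduces the first example further down this page (a time-terminated test with a two-sided 80% CI, so n=2 and p=2). Note that scipy's ppf takes the lower-tail probability, so the upper-tail area (1-CI)/n is entered as 1-(1-CI)/n.

from scipy.stats import chi2

T, F, CI = 19520, 7, 0.8  # total time on test, number of failures, confidence interval
n, p = 2, 2  # two-sided interval, time-terminated test
MTBF = 2 * T / chi2.ppf(1 - (1 - CI) / n, 2 * F + p)
print(MTBF)  # 1658.3248...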
Inputs:


• MTBF - mean time between failures. This is the lower confidence bound on the MTBF. Units given in same
units as the test_duration.
• number_of_failures - the number of failures recorded (or allowed) to achieve the MTBF. Must be an integer.
• test_duration - the amount of time on test required (or performed) to achieve the MTBF. May also be distance, rounds fired, cycles, etc. Units given in same units as MTBF.
• CI - the confidence interval at which the lower confidence bound on the MTBF is given. Must be between 0.5
and 1. For example, specify 0.95 for 95% confidence interval.
• print_results - True/False. Default is True.
• one_sided - True/False. Default is True. If set to False, the two sided confidence interval will be returned.
• time_terminated - True/False. Default is True. If set to False, the formula for the failure-terminated test will be
used.
Outputs:
• If print_results is True, all the variables will be printed.
• An output object is also returned with the same values as the inputs and the remaining value also calculated.
This allows for any of the outputs to be called by name.
In the example below, we have done a time-terminated reliability test for 19520 hours (the units are not important here as they may be days, cycles, rounds, etc.). During the test there were 7 failures. We want to know the MTBF that was achieved during the test with an 80% confidence (two-sided). The second example shows how the output may be suppressed and the results accessed by name.

# example 1
from reliability.Reliability_testing import reliability_test_planner
reliability_test_planner(test_duration=19520, CI=0.8, number_of_failures=7, one_sided=False)

'''
Reliability Test Planner results for time-terminated test
Solving for MTBF
Test duration: 19520
MTBF (lower confidence bound): 1658.3248534993454
Number of failures: 7
Confidence interval (2 sided): 0.8
'''

# example 2
from reliability.Reliability_testing import reliability_test_planner
output = reliability_test_planner(number_of_failures=6, test_duration=10000, CI=0.8, print_results=False)
print(output.MTBF)

'''
1101.8815940201118
'''

40.1 One-sided vs two-sided confidence interval

The image below illustrates the difference between one-sided and two-sided confidence intervals. You can use either the one-sided or two-sided interval when you are seeking only the lower bound, but it is essential to understand that they will give very different results for the same CI. They will give equivalent results if the CI is set appropriately (e.g. 90% one-sided is the same as 80% two-sided). If you are unsure which to use, the more conservative approach is to use the two-sided interval. If you want the point estimate, use the one-sided interval with CI=0.5.
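This equivalence can be demonstrated with reliability_test_planner itself, reusing the example values from earlier on this page:

from reliability.Reliability_testing import reliability_test_planner

one = reliability_test_planner(test_duration=19520, number_of_failures=7, CI=0.9, one_sided=True, print_results=False)
two = reliability_test_planner(test_duration=19520, number_of_failures=7, CI=0.8, one_sided=False, print_results=False)
print(one.MTBF, two.MTBF)  # both give 1658.3248... as the same chi-squared quantile is used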



CHAPTER 41

Reliability test duration

Note: This function will be available in version 0.5.3 which is currently unreleased.

This function is an extension of the reliability_test_planner which allows users to calculate the required duration for a reliability test to achieve the specified producer's and consumer's risks. This is done based on the required MTBF (mean time between failures) and the design MTBF.
This type of determination must be made when an organisation looking to test an item is uncertain of how much testing is required, but knows the amount of risk it is willing to accept, as well as the MTBF required and the MTBF to which the item has been designed.
Inputs:
• MTBF_required - the required MTBF that the equipment must demonstrate during the test.
• MTBF_design - the design target for the MTBF that the producer aims to achieve.
• consumer_risk - the risk the consumer is accepting. This is the probability that a bad product will be accepted
as a good product by the consumer.
• producer_risk - the risk the producer is accepting. This is the probability that a good product will be rejected as
a bad product by the consumer.
• one_sided - default is True. The risk is analogous to the confidence interval, and the confidence interval can be
one sided or two sided.
• time_terminated - default is True. Whether the test is time terminated or failure terminated. Typically it will be time terminated if the required test duration is sought.
• show_plot - True/False. Default is True. This will create a plot of the risk vs test duration. Use plt.show() to
show it.
• print_results - True/False. Default is True. This will print the results to the console.
Outputs:
• test duration
• If print_results is True, all the variables will be printed to the console.
• If show_plot is True, a plot of producer's and consumer's risk vs test duration will be generated. Use plt.show() to display it.
In the example below, the consumer requires a vehicle to achieve an MTBF of 2500 km and is willing to accept 20% risk that they accept a bad item when they should have rejected it. The producer has designed the vehicle to have an MTBF of 3000 km and they are willing to accept 20% risk that the consumer rejects a good item when they should have accepted it. How many kilometres should the reliability test be? Using the function we find the test needs to be 231616 km.

from reliability.Reliability_testing import reliability_test_duration
import matplotlib.pyplot as plt

reliability_test_duration(MTBF_required=2500, MTBF_design=3000, consumer_risk=0.2, producer_risk=0.2)
plt.show()

'''
Reliability Test Duration Solver for time-terminated test
Required test duration: 231615.79491309822  # Note that this duration is the total time on test and may be split across several vehicles.
Specified consumer's risk: 0.2
Specified producer's risk: 0.2
Specified MTBF required by the consumer: 2500
Specified MTBF designed to by the producer: 3000
'''


How does the algorithm work?


The underlying method is as follows:
Step 1) Begin with failures = 1. This will be iterated later.
Step 2) Using the function Reliability_testing.reliability_test_planner, we set CI = 1-consumer_risk, MTBF =
MTBF_required to solve for the test_duration that is achieved by this test. This is the test duration required if there
was 1 failure which would give the specified MTBF required and specified consumer’s risk.
Step 3) We again use the function Reliability_testing.reliability_test_planner but this time we set MTBF =
MTBF_design and use the test_duration as the output from step 2. Still keeping failures = 1 we are solving for
the CI achieved. This is effectively the producer’s risk for the given test_duration and number of failures.
Step 4) The answer is higher than the specified producer's risk, so we now repeat steps 2 and 3, increasing the number of failures by 1 each iteration. This is continued until the producer's risk is below what was specified. We then go back 1 failure since it is standard that the producer's risk can't be below what was specified (or the consumer may think the producer is cheating by lowering their risk).
We now have a value for test_duration that will give our required outputs in both equations. We also happen to arrive
at the number of failures, though this is not particularly relevant since it is just part of the solution process and the
actual number of failures will be determined based on the conduct of the reliability test.
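A minimal sketch of this iteration is shown below, using reliability_test_planner for steps 2 and 3. It assumes the planner's output attributes follow the input names (test_duration and CI, as shown in the previous chapter) and omits edge-case handling, so it illustrates the method rather than reproducing the library's actual implementation.

from reliability.Reliability_testing import reliability_test_planner

MTBF_required, MTBF_design = 2500, 3000
consumer_risk, producer_risk = 0.2, 0.2

failures = 1  # Step 1
durations = []
while True:
    # Step 2: test duration needed to demonstrate MTBF_required at the consumer's risk
    step2 = reliability_test_planner(MTBF=MTBF_required, CI=1 - consumer_risk, number_of_failures=failures, print_results=False)
    # Step 3: CI achieved for MTBF_design with that duration. Producer's risk achieved = 1 - CI.
    step3 = reliability_test_planner(MTBF=MTBF_design, test_duration=step2.test_duration, number_of_failures=failures, print_results=False)
    durations.append(step2.test_duration)
    if 1 - step3.CI < producer_risk:
        break  # the producer's risk has dropped below what was specified
    failures += 1  # Step 4: repeat with one more failure

test_duration = durations[-2]  # go back 1 failure, as described in step 4
print(test_duration)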
The plot that is produced by Reliability_testing.reliability_test_duration displays a scatter plot at each failure. Since
the number of failures must be an integer, we get results for reliability test durations that go in steps. The result returned
corresponds to the test_duration at the last failure before the producer’s risk dropped below what was specified. Also
note that if the consumer’s risk is different from the producer’s risk, the solution for test_duration will not occur near
the point on the graph where producer’s risk and consumer’s risk are equal.



CHAPTER 42

Changelog

42.1 Version: 0.5.3 — Currently Unreleased — Scheduled for release around the end of September

New features
• Implemented Loglogistic_Distribution
• Implemented Fit_Loglogistic_2P
• Added the function Reliability_testing.reliability_test_duration
• Added Fit_Loglogistic_2P to Fitters.Fit_Everything
API Changes
• Reliability_testing.reliability_test_planner has an optional argument of two_sided which was set to True as
default. This has been changed to one_sided=True, making the default calculation use the one-sided confidence
interval and changing the argument name. The reason for this change was to align the function with the approach
more commonly used in industry.
Bug Fixes
• Fixed autoscale for cases where the HF is constant so it no longer lies along the y-axis upper limit
• Fit_Everything had a bug in the default xvals for the Beta_Distribution’s histogram which caused an error in
some special cases. This is now resolved.
Other
• Fixed the HF and CHF equations for Exponential_Distribution to be actual equations. This is preferred over using the HF = PDF/SF and CHF = -ln(SF) relationships, which break down when SF=0 at high xvals. This has also been implemented for the loglogistic distribution. It can't be done for the Normal, Lognormal, Gamma, and Beta distributions as these do not have closed form solutions for HF and CHF which don't involve the SF.
• Changed the Gamma_Distribution and Weibull_Distribution mode to be self.gamma when beta < 1. Previously it was "No mode exists when beta < 1" which is true from a formula perspective, but it is clear that the mode is equal to gamma as that's where the asymptote occurs. The only distribution with "no mode exists..." is the Beta distribution as it can have 2 modes for certain values of alpha and beta.
• Updated Utils.generate_X_array to use 200 points (rather than 100) and allocated more points to the right hand
side of the plot (beyond b99). This was because plots were not displaying smoothly enough for distributions
with high skewness.
• Changed default plotting upper limit to b9999. Previously it was slightly more and was not a round quantile.
Done for simplicity and minimal change will be noticed.
• Changed the layout of the Probability plots and PP plots in Fit_Everything from a 5x2 grid to a 4x3 grid. This
made more sense due to the addition of the Loglogistic Distribution which would have made the layout 6x2
which is too long.

42.2 Version: 0.5.2 — Released: 14 August 2020

New features
• New distributions
– Mixture_Distribution
– Competing_Risks_Distribution
• A new fitter for the Weibull competing risks model (Fit_Weibull_CR)
• The output of the Fit_Weibull_Mixture now includes a probability plot instead of a histogram of the PDF and
CDF
• The output of the Fit_Weibull_Mixture now prints the confidence interval estimates of the parameters
• Added some datasets for use with the mean cumulative function (MCF_1 and MCF_2).
API Changes
• Within Fitters.Fit_Weibull_Mixture the option show_plot has been changed to show_probability_plot to align with all the other fitters.
Bug Fixes
• Fixed the autoscale in Weibull and Exponential distributions that locked autoscaling when confidence intervals
were plotted sequentially.
• Automatic removal of zeros for all fitters (except Normal_2P). Previously the zeros were left in the data and
resulted in NaNs and crashes. Also added a dedicated error to report input with times below zero.
• Fixed the confidence interval bounds for Kaplan-Meier and Nelson-Aalen CHF plots. Some of the bounds were
inf since the CHF = -ln(SF) which will be inf when SF=0.
• MCF_Nonparametric and MCF_Parametric had a bug which caused crashes when the dataset included a system
with only one censored time. This has now been fixed.
Other
• Minor clean up of code. Removed unnecessary imports, removed unused variables, etc. Hopefully this will have no noticeable effects.
• Within Fitters.Fit_Everything the histogram output has been improved with better formatting and it now uses
the Freedman-Diaconis rule for obtaining optimal bin width.
• Fixed Weibull HF and CHF equations to use actual equations and not PDF/SF or -ln(SF) as these result in NaN
when SF=0 (an issue at high xvals). These changes are currently only implemented for Weibull_Distribution.


• Improved creation of xvals for PDF,CDF,SF,HF,CHF within the Weibull Distribution. The changes now generate
datapoints where there is more detail (between the 0.1% and 99.9% quantiles) such that only 100 datapoints are
needed to show more detail than was previously achieved with 1000 datapoints. This is most noticeable with
Weibull distributions that have high beta values and are significantly location shifted. An example of this is
shown in the plot below. These changes are only implemented for Weibull_Distribution but will be extended to
all distributions in the very near future.
• Improved autoscaling for the Weibull Distribution plots. For location shifted distributions, this zooms in on the
0.1% to 99.9% quantiles allowing users to see more detail. The HF and CHF ylimits are also limited based on
the quantiles so that they do not obscure the detail if there is an asymptote to large values or infinity. An example
of this is shown in the plot below. These changes are only implemented for Weibull_Distribution but will be
extended to all distributions in the very near future.

42.3 Version: 0.5.1 — Released: 08 July 2020

New features
• More efficient method used within Other_functions.similar_distributions. Results are always consistent and
more accurate now.
• Other_functions.histogram. This plots a histogram with optimal bin width, better default formatting, and an
option to shade bins white above a threshold.
API Changes
• Some of the functions in reliability.Other_functions have been moved into reliability.Utils and reliability.Reliability_testing. The new layout is:
– Utils ⇒ round_to_decimals, transform_spaced, axes_transforms
– Other_functions ⇒ similar_distributions, convert_dataframe_to_grouped_lists, crosshairs,
make_right_censored_data
– Reliability_testing ⇒ one_sample_proportion, two_proportion_test, sample_size_no_failures, sequen-
tial_sampling_chart, reliability_test_planner


• Within Other_functions.similar_distributions the option 'monte_carlo_trials' has been removed as the distribution sampling method is no longer random.
Bug Fixes
• Fixed confidence interval color inheritance for Nonparametric.Kaplan_Meier and Nonparametric.Nelson_Aalen. Previously the color was only inherited if specified rather than left as default.
• The default axes labels for both Stress_strength.Probability_of_failure and Stress_strength.Probability_of_failure_normdist were reversed. They have now been switched to the correct labels.
Other
• Documentation updates to reflect the API changes in Version 0.5.1

42.4 Version: 0.5.0 — Released: 04 July 2020

New features
• Confidence intervals on fitted distributions ==> this has only been implemented for Weibull and Exponential. It is quite difficult and takes considerable time and testing. I will do Normal and Lognormal distributions next, then Gamma and Beta distributions. I hope to finish them all by September 2020.
• Confidence intervals have been disabled in ALT_probability_plotting and ALT_fitters to avoid cluttering the plot.
• The probability plot in Fit_Everything now uses the Exponential_probability_plot_Weibull_Scale instead of
Exponential_probability_plot. It is much clearer to see the effectiveness of the fit using the Weibull scale.
• Added an option to seed the random_samples functions within the Distributions module. This allows for repeatable results.
• Improvements to rounding of all titles, labels, and stats in Distributions and Probability_plotting using a new
function, round_to_decimals.
• Added Other_functions.round_to_decimals which keeps the specified number of decimals after leading zeros.
This is useful as round would make very small values appear as 0.
• Minor improvements to color inheritance for probability_plotting.
• Minor improvements to confidence interval color inheritance for Nonparametric.Kaplan_Meier and Nonparametric.Nelson_Aalen.
• Within Stress_strength, the method of obtaining the solution has been changed from monte carlo to integration.
Thanks to Thomas Enzinger for providing the formula for this method in response to an Issue that was raised.
Using the integration method, accuracy is much higher (1e-11 error now vs 1e-3 error previously) and always
consistent, and the speed is significantly improved over the monte carlo method. As noted below in API changes,
there is no need to specify the number of monte_carlo_samples and no option to obtain the convergence plot.
• Within Stress_strength, the colors used for shading have been changed to improve the style.
• Probability_plotting.plot_points now includes the option to plot the points for the PDF and HF. These are not very useful as they appear messy due to the discontinuous nature of the function, but they are added for completeness.
• Added Other_functions.transform_spaced. This is similar to np.linspace and np.logspace but it creates an array
that is ‘weibull spaced’, ‘normal spaced’, ‘exponential spaced’, ‘beta spaced’, or ‘gamma spaced’. It is used to
get data points for the confidence intervals so they are as evenly spaced as possible, particularly on probability
paper. This function is likely to be moved into utils.


• Other_functions.make_right_censored_data has been added. This function accepts uncensored data and a
threshold, and returns failures and right_censored arrays.
• Added mplcursors to requirements in setup.py as it is needed for the crosshairs function.
• Added crosshairs function to Other_functions. This is a very useful feature that provides interactive crosshairs
to the plot using snap-to feature and also adds annotations on click events. Thanks to Antony Lee (the author of
mplcursors) for help with getting this to work using his library.
Bug fixes
• Within Stress_strength, there are improvements to the fill_between method as it had errors in some special cases.
• Fixed an Issue in Lognormal_Probability_Plot that occurred for very large numbers (above 1e20)
API Changes
• Within Stress_strength, the output format has changed from an object to a returned value of the probability of
failure. This makes it much more simple to access the answer since the object had only one value.
• Within Stress_strength, the method of obtaining the solution has been changed from monte carlo to integration.
As a result, there is now no need to specify the number of monte_carlo_samples and no option to obtain the
convergence plot.
• Added the options initial_guess_method and optimizer to Fit_Weibull_2P and Fit_Weibull_3P. They were previously only in Fit_Weibull_2P_grouped. It is planned to add these options to all fitters.
• There is now the option CI_type for the Weibull and Exponential fitters. This allows users to choose between
confidence bounds on reliability and time. This option will be added to all fitters as the confidence intervals for
the other distributions are completed.
Other
• Added tests folder. This is planned to include automated tests.
• Created utils module. I plan to move some utilities into here that are currently inside other modules where users
can access them, but users should never need to access them so they just create clutter in the dropdown lists of
your IDE.
• Added Reliability_testing module. I plan to move everything related to reliability testing out of Other_functions
as there are now enough functions to justify a new module dedicated to reliability testing.
• Documentation updates to reflect the changes in Version 0.5.0

42.5 Version: 0.4.9 — Released: 27 April 2020

New features
• Updates to reliability_test_planner to include option for failure terminated test
Other
• Addition of this Changelog to the documentation



CHAPTER 43

Development roadmap

The following development roadmap is the current task list and implementation plan for the Python reliability library.
I welcome the addition of new suggestions, both large and small, as well as help with writing the code if you feel that
you have the ability. This roadmap is regularly changing and you may see some things remain on here for a while
without progressing, while others may be prioritized at short notice. If you have a suggested feature or you find a bug,
please raise an Issue on Github or email me (m.reid854@gmail.com) and I will endeavour to either add it rapidly (for
simple tasks and bug fixes) or add it to the roadmap.
Next task (currently in development)
• Plotting enhancements to increase the detail in plots using fewer points (by generating more points where the plots curve and fewer where the plots are flat). Using 100 instead of 1000 points will make the plots much faster, particularly when multiple distributions are layered. Done for Weibull but need to implement for the rest.
• Plotting enhancements to the x and y scale such that the limits are based on the quantiles. This will ensure more
relevant detail is shown, particularly for location shifted distributions. Done for Weibull but need to implement
for the rest.
• Writing more automated tests. This will speed up the code development processes and help prevent future
changes having unidentified effects.
• Confidence intervals for Normal and Lognormal distributions (Gamma and Beta will come later). Currently the
confidence intervals have only been completed for Weibull and Exponential distributions.
• Fitters for Loglogistic_Distribution: Fit_Loglogistic_2P, Fit_Loglogistic_3P
• Probability plot for loglogistic distribution: Loglogistic_probability_plot
High priority (expected by the end of 2020)
• New Distributions:
– Defective_Subpopulation_Distribution. This is for when the CDF does not reach 1 due to a lot of right
censored data.
– Zero_Inflated_Distribution. This is for when the CDF starts above 0 due to a lot of ‘dead on arrival’
products in the dataset.
– Gumbel_Distribution.


• New Fitters for the above 3 new distributions:


– Fit_Weibull_DS
– Fit_Weibull_ZI
– Fit_Gumbel_2P, Fit_Gumbel_3P
• New probability plot for Gumbel_Distribution: Gumbel_probability_plot
• Add least squares as a method to obtain the initial guess for all Fitters. Currently this has only been implemented
in Fit_Weibull_2P, Fit_Weibull_2P_grouped, and Fit_Weibull_3P but all the other Fitters use scipy which is
slower but more accurate for small datasets.
• Amalgamate Fit_Weibull_2P and Fit_Weibull_2P_grouped under a single function. Need to decide the best
input format ==> failures=[], right_censored=[], xcn=[[x],[c],[n]] or df. If this works, do it for all Fitters so they
are fast for large datasets.
• Improvement to the online documentation for how some of these methods work, including the addition of more
formulas, algorithms, and better referencing.
Low priority (more of a wish list at this point)
• Warranty Module. This will be a new module of many tools for warranty calculation.
• New reliability growth models. Currently there is only the Duane model. It is planned to include the Crow
Extended and AMSAA PM2 models.
• Cox Proportional Hazards Model - This is available in Lifelines.
• Add the rank adjustment method to Nonparametric. Rank adjustment is the method used in Probability plotting (e.g. to obtain the Median Ranks) and is a common and useful nonparametric estimate of the CDF, SF, and CHF.
• 3D ALT probability plots (reliability vs time vs stress). This feature is seen in Reliasoft.
• Simple solver for various parameters - for example, find the parameters of a Weibull Distribution given the b5
and b95 life. Or for an Exponential Distribution find the time until the reliability reaches 50% given the MTTF.
There are so many combinations of these problems which are easy to solve but would be nice to have in a simple
tool. Reliasoft’s Quick Calculation Pad is a GUI version of this. The free software Parameter Solver also does a
few neat functions that I intend to include. Some of these do not have formulas so need to be solved iteratively.
• Speed improvements to fitters by using cython for key components.



CHAPTER 44

Citing reliability in your work

If reliability contributes to a project that leads to a scientific publication, please acknowledge this contribution
by citing the DOI 10.5281/zenodo.3938000.
The following reference uses APA style:
Reid, M. (2020). Reliability – a Python library for reliability engineering (Version 0.5.1) [Computer software].
Zenodo. https://doi.org/10.5281/ZENODO.3938000
If you would like to use another referencing style, the details you may need are:
• Author: Matthew Reid
• Year published: 2020
• Title: Reliability – a Python library for reliability engineering
• Version: 0.5.1
• Platform: Python
• Available from: https://pypi.org/project/reliability/
• DOI: 10.5281/zenodo.3938000
Note that the version number is constantly changing so please check PyPI for the current version.
If you have used reliability in any published academic work, I would love to hear about it
(m.reid854@gmail.com). Depending on the usage, I may provide a link to your work below.
Links to articles and papers that have used the Python reliability library:
• Probabilistic characterization of random variables - Phase II - by Javier Alfonso Ochoa Moreno. Note that this
article is in Spanish.
• A tutorial for reliability engineers: going from scratch to building Weibull Analyses using Python - by Dr Sarah
Lukens.



CHAPTER 45

How to request or contribute a new feature

If you would like to see something added or an existing feature changed to make it better, please send me an email
(m.reid854@gmail.com) and from there we can put together a plan on how to proceed. I greatly appreciate all help,
even if it is just pointing out an error in the code or documentation. There are a large number of features currently
identified for inclusion into this library so if you would like to contribute something but don’t know what to help with,
please email me and we can discuss the functions that are currently planned for development.
If you are requesting something new, it is helpful to include as much detail as possible, such as:
• a detailed explanation of what the new feature is and why you think it would be useful
• example Python code
• any references such as academic papers or textbooks which describe the necessary steps
• screenshots from other programs (such as Minitab, JMP Pro, Reliasoft) which show the feature in use
Remember to upgrade reliability to the newest version and double check that the requested feature is not in
there, as the codebase is constantly evolving.



CHAPTER 46

How to get help

Questions on statistics or mathematics


If you have a question about a statistical or mathematical process that reliability performs, please consult Google
and Wikipedia to see if you can find an existing explanation. If you still need to ask someone then I recommend asking
your question on Stack Exchange. If you still can’t get an answer on there, you’re welcome to email me directly
(m.reid854@gmail.com) and I will try my best to help you.
Questions on using the Python reliability library
If you have a question about how to do something using reliability or how to use one of the features within
reliability then you should first consult the documentation and the help files within Python. An example
of how to access one of the help files is provided below. If the documentation and help files still do not answer
your question then you’re welcome to email me directly (m.reid854@gmail.com) and I will work to improve the
documentation if it is unclear.

from reliability import Fitters


print(help(Fitters))



CHAPTER 47

How to donate to the project

The Python reliability library is free, open source, and it always will be. It aims to provide students and professionals
alike with a set of tools that you would otherwise only find in commercial software. Most commercial software is
prohibitively expensive, especially for individuals, so I aim to bring these features to you for free by continuing to
develop this library.
Developing and maintaining this library is all done in my free time and can be quite a time consuming process. If you
would like to donate as a way of showing your appreciation, you can send whatever amount you feel is appropriate
using paypal.me. If you don’t want to say thanks with money that is fine too, as I always appreciate getting emails
from people all over the world telling me that they find the library useful.



CHAPTER 48

About the author

The Python reliability library was written by Matthew Reid. Matthew holds a Masters of Reliability Engineering
from the University of Maryland, a Masters of Project Management from the University of New South Wales, and
a Bachelor of Aeronautical Engineering from the University of New South Wales. Based in Melbourne, Australia,
Matthew currently works as a reliability engineer on a variety of acquisition and sustainment projects for land materiel.
The Python reliability library was written because there were no dedicated reliability engineering libraries for Python,
and Matthew found himself needing to use scipy.stats, lifelines, Minitab, MATLAB, JMP Pro, and his own Python
scripts, for a variety of common reliability engineering tasks that were not available in one place. This library is
intended to make reliability engineering more accessible to the world, particularly to those individuals and small
businesses who find the high cost of proprietary software to be a barrier to entry into conducting reliability engineering
analysis.
This is Matthew's first Python library on the Python Package Index and is currently in active development. In accordance with the LGPLv3 license, every effort has been made to ensure the software is free of errors, however, no
guarantees or warranties are provided in any form. Feedback on the Python reliability library is most welcome. If you
find an error, have a suggestion, would like to request something to be included, or would like to contribute something,
please send it through by email (m.reid854@gmail.com).



CHAPTER 49

Credits to those who helped

During the process of writing reliability there have been many problems that I was unable to solve alone. I
would like to thank the following people who provided help and feedback on problems with the code and with the
reliability concepts:
• Cameron Davidson-Pilon for help with getting autograd to work to fit censored data and for writing autograd-gamma which makes it possible to fit the gamma and beta distributions. Also for providing help with obtaining
the Fisher Information Matrix so that the confidence intervals for parameters could be estimated.
• Dr. Vasiliy Krivtsov for providing feedback on PP and QQ plots, for further explaining optimal replacement
time equations, and for guidance in developing the Competing risk model. Dr. Krivtsov teaches “Collection and
analysis of Reliability Data (ENRE640)” at the University of Maryland.
• Dr. Mohammad Modarres for help with PoF, ALT_fitters, and ALT_probability_plotting. Dr. Modarres teaches
several reliability engineering subjects at the University of Maryland and has authored several of the textbooks
listed under recommended resources.
• The Stack Overflow user ImportanceOfBeingErnest for this answer that was necessary to get the probability
plotting functions working correctly for Gamma and Beta distributions.
• Antony Lee for help in adapting parts of his mplcursors library into the crosshairs function in reliability.Other_functions.crosshairs
• Thomas Enzinger for help in improving the method of finding the area in stress-strength interference between
any two distributions. Previously this was done using a monte-carlo method, but Thomas’ method is much more
accurate and always consistent. This is incorporated in Version 0.5.0.
