Building PoD Curves


12th International Conference on Applications of Statistics and Probability in Civil Engineering, ICASP12

Vancouver, Canada, July 12-15, 2015

Building Probability of Detection Curves via Metamodels

Thomas Browne
PhD Student, EDF R&D, Chatou, France
Loïc Le Gratiet
Research Engineer, EDF R&D, Chatou, France
Géraud Blatman
Research Engineer, EDF R&D, Moret sur Loing, France
Sara Cordeiro
Engineer, EDF CEIDRE, Saint Denis, France
Benjamin Goursaud
Research Engineer, EDF R&D, Clamart, France
Bertrand Iooss
Senior Researcher, EDF R&D, Chatou, France
Léa Maurice
Engineer, EDF CEIDRE, Saint Denis, France

ABSTRACT: Probability of Detection (POD) curves are a standard tool in several industries to evaluate
the performance of Non Destructive Testing (NDT) procedures. However, the classical methods for POD
determination rely on strong statistical assumptions (linearity, residuals normality and homoscedasticity).
In the context of numerical POD estimation (with data coming from numerical simulations of the system),
we study classic and novel model-based approaches. Applications are performed on Eddy Current Non
Destructive Examination numerical data.

1. INTRODUCTION

In several industries (such as aeronautics), the Probability of Detection (POD) curve is a standard tool to evaluate the performance of Non Destructive Testing (NDT) procedures (Gandosi and Annis, 2010; MIL-HDBK-1823A, 2009). The goal is to quantify the capability of an inspection to detect harmful flaws in the inspected structure. For the French electricity company (EDF), the potential of this tool is studied in the context of Eddy Current Non Destructive Examination in order to ensure the integrity of steam generator tubes (Maurice et al., 2012).

However, the high cost of experimental POD campaigns, combined with the continuous increase in the complexity of the configurations, sometimes makes them unaffordable. To overcome this problem, it is possible to resort to numerical simulation of the NDT process.

In this work, we focus on the examination under wear anti-vibration bars (AVB) of steam generator tubes with simulations performed by the computer code Code_Carmel3D (developed by EDF R&D). The construction of the numerical model requires a specific mesh, as displayed in Figure 1. In the following experiments, the input set was chosen only in order to test the numerical models. It has not been validated.

The determination of this “numerical POD” is based on a four-step approach:
1. Identify the set of parameters that significantly affect the NDT signal;
2. Attribute a specific probability distribution to each of these parameters (for instance from expert judgment);
3. Propagate the input parameter uncertainties through the NDT numerical model;
4. Build the POD curve from standard approaches like the so-called Berens method.

Figure 1. Illustration of the mesh in the numerical model of NDT simulation.

This approach is closely related to the generic uncertainty management methodology in numerical simulation as explained in de Rocquigny et al. (2008) and Pasanisi and Dutfoy (2012). Several statistical tools based on numerical design of experiments, efficient uncertainty propagation algorithms and metamodeling concepts will then be useful (Fang et al., 2006).

The model parameterization and the design of numerical experiments (Code_Carmel3D computations) are explained in the following section. The third section introduces the three POD curve determination methods: the Berens method (based on a linear regression model), a binomial-Berens method and a method based on a Gaussian process surrogate model. A conclusion section synthesizes the work and introduces our prospects.

2. MODEL PARAMETERIZATION AND NUMERICAL DESIGN

2.1. Influent parameters and associated random distributions definition

Based on both expert reports and simulation data, the Influent Parameters (IP) that can have an impact on the code outputs and that are chosen for this feasibility study are:

• $E \sim \mathcal{N}(a_E, b_E)$: pipe thickness (mm), based on data obtained from 5000 pipes,
• $h_1 \sim \mathcal{U}[a_{h_1}, b_{h_1}]$: first flaw height (mm),
• $h_2 \sim \mathcal{U}[a_{h_2}, b_{h_2}]$: second flaw height (mm),
• $P_1 \sim \mathcal{U}[a_{P_1}, b_{P_1}]$: first flaw depth (mm),
• $P_2 \sim \mathcal{U}[a_{P_2}, b_{P_2}]$: second flaw depth (mm),
• $\mathrm{ebav}_1 \sim \mathcal{U}[-P_1 + a_{\mathrm{ebav}_1}, b_{\mathrm{ebav}_1}]$: length of the gap between the AVB and the first flaw (mm),
• $\mathrm{ebav}_2 \sim \mathcal{U}[-P_2 + a_{\mathrm{ebav}_2}, b_{\mathrm{ebav}_2}]$: length of the gap between the AVB and the second flaw (mm).

As displayed in Figure 2, we consider the occurrence of one flaw on each side of the pipe due to the AVB. To take this eventuality into account in the computations, 50% of the experiments are modeled with one flaw, and 50% with two flaws.

Figure 2. Illustration of the considered inputs.

2.2. Definition of the design of experiments

In order to build a simplified model (i.e. a surrogate model) that estimates the output of interest ProjY related to the IP of the system, Code_Carmel3D has to be evaluated at some points of the IP set. This dataset, called the design of experiments, has to be defined at the very beginning of the study, that is to say before any numerical simulation. A classic method consists in building the design of experiments by picking points of the IP set completely at random (Monte Carlo type simulations). However, this sometimes leads to a design which does not properly "fill in" the IP set (Fang et al., 2006): the idea is to spread the numerical simulations all over the IP set so that no big subset is left "unknown". To this effect, it is more relevant to choose the values according to a deterministic rule, such as a Quasi-Monte Carlo method. Indeed, for a given design size N, this method has been shown to be often more precise than the classic Monte Carlo method (Morokoff and Caflisch, 1995). Given the available computing time, a design of experiments of size 100 is created.
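As an illustration of Sections 2.1 and 2.2, the following minimal sketch builds a 100-point quasi-Monte Carlo design and maps it onto the input distributions. It is not the authors' implementation: the paper does not say which low-discrepancy sequence was used, so a Halton sequence is assumed here, and every numerical bound (`E_MEAN`, `E_STD`, `H_LOW`, `H_HIGH`, `P_LOW`, `P_HIGH`, `A_EBAV`, `B_EBAV`) is a hypothetical placeholder. Treating $b_E$ as a standard deviation and encoding an absent second flaw as zeros are also assumptions.

```python
from statistics import NormalDist

# One prime base per dimension; base 2 drives the one-flaw / two-flaw switch.
PRIMES = (2, 3, 5, 7, 11, 13, 17, 19)


def radical_inverse(index: int, base: int) -> float:
    """Van der Corput radical inverse of a positive integer, in (0, 1)."""
    result, weight = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * weight
        weight /= base
    return result


def halton(n_points: int, dim: int) -> list:
    """First n_points of the Halton sequence (index 0 is skipped)."""
    return [[radical_inverse(i, PRIMES[d]) for d in range(dim)]
            for i in range(1, n_points + 1)]


# Hypothetical values: the paper does not report the distribution parameters.
E_MEAN, E_STD = 1.09, 0.01    # assumes N(a_E, b_E) means (mean, std), in mm
H_LOW, H_HIGH = 0.0, 3.0      # bounds for h1 and h2 (mm)
P_LOW, P_HIGH = 0.05, 0.5     # bounds for P1 and P2 (mm)
A_EBAV, B_EBAV = 0.0, 0.6     # a_ebav and b_ebav in U[-P + a_ebav, b_ebav]


def scale(u: float, low: float, high: float) -> float:
    """Map u in (0, 1) to the uniform distribution on [low, high]."""
    return low + u * (high - low)


def flaw(u_h: float, u_p: float, u_e: float) -> tuple:
    """Height, depth and AVB gap of one flaw; the gap bound depends on depth."""
    depth = scale(u_p, P_LOW, P_HIGH)
    return scale(u_h, H_LOW, H_HIGH), depth, scale(u_e, -depth + A_EBAV, B_EBAV)


def build_design(n_points: int = 100) -> list:
    """Quasi-random design: half of the points have one flaw, half have two."""
    design = []
    for u in halton(n_points, 8):
        h1, p1, ebav1 = flaw(u[2], u[3], u[4])
        if u[0] < 0.5:   # two flaws (true for every second Halton point)
            h2, p2, ebav2 = flaw(u[5], u[6], u[7])
        else:            # one flaw; zeros for the absent flaw are an assumption
            h2, p2, ebav2 = 0.0, 0.0, 0.0
        design.append({
            "E": NormalDist(E_MEAN, E_STD).inv_cdf(u[1]),
            "h1": h1, "P1": p1, "ebav1": ebav1,
            "h2": h2, "P2": p2, "ebav2": ebav2,
            "a": max(p1, p2),   # parameter of interest used in Section 3
        })
    return design


design = build_design(100)
print(len(design), sum(point["P2"] > 0.0 for point in design))
```

Using the base-2 coordinate for the one-flaw / two-flaw switch makes the 50% / 50% split exact for an even design size, because that coordinate alternates between the two halves of the unit interval.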

3. METHODS OF POD CURVES ESTIMATION

In this framework, one wants to build the POD curve as a function of its most influent parameter: $a := \max(P_1, P_2)$. By using the computer code Code_Carmel3D, one focuses on the output ProjY, which is a projection of the simulated signal we would get after the NDT process. The other inputs are seen as random variables, which makes ProjY itself another random variable. The effects of all the IP are displayed in Figure 3. The bold values are the correlation coefficients between the output ProjY and the corresponding IP. Strong influences of $P_1$ and $P_2$ on ProjY are detected.

Figure 3. ProjY with respect to the IP. [Scatter plots of ProjY against E, P1, iP2, P2, ebav1, ebav2, h1 and h2; the correlation coefficients printed on the panels are −0.05, 0.66, 0.51, 0.66, 0.09, −0.08, 0.06 and 0.08 respectively.]

Given a threshold $s > 0$, a flaw is considered to be detected when $\mathrm{ProjY} > s$. Therefore the one-dimensional POD curve is defined by: $\forall a > 0,\ \mathrm{POD}(a) = P(\mathrm{ProjY} > s \mid a)$. In this paper, we propose three different regression models of ProjY in order to build an estimate of the POD curve. Numerical simulations are computed for the N = 100 points of the design of experiments.

3.1. Berens method (Berens, 1988)

It consists in a linear regression of the output ProjY. To improve the model, a Box-Cox transformation (Box and Cox, 1964) is applied to the output, which means that we now focus on: $y_{\mathrm{ProjY}} = (\mathrm{ProjY}^{\lambda} - 1)/\lambda$. $\lambda$ is determined by maximum likelihood as the real number that offers the best linear regression of $y_{\mathrm{ProjY}}$ with respect to the parameter $a$ (see Figure 4).
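The choice of $\lambda$ described above can be sketched as follows. This is a hedged illustration rather than the paper's code: the data are synthetic (generated so that the true exponent is 0.3), the grid search over $\lambda$ is an assumed optimisation strategy, and the names `box_cox`, `profile_loglik` and `fit_lambda` are hypothetical. The profile log-likelihood is the standard one for a Box-Cox linear model, i.e. the Gaussian term built from the residual sum of squares plus the Jacobian term $(\lambda - 1)\sum_i \log \mathrm{ProjY}_i$.

```python
import math
import random


def box_cox(y: float, lam: float) -> float:
    """Box-Cox transform (y^lam - 1) / lam, with the log limit at lam = 0."""
    return math.log(y) if lam == 0.0 else (y ** lam - 1.0) / lam


def ols_rss(x, y) -> float:
    """Residual sum of squares of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    b0 = mean_y - b1 * mean_x
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))


def profile_loglik(lam: float, a, proj_y) -> float:
    """Profile log-likelihood of lam (up to a constant) for the linear model in a."""
    n = len(a)
    transformed = [box_cox(v, lam) for v in proj_y]
    jacobian = (lam - 1.0) * sum(math.log(v) for v in proj_y)
    return -0.5 * n * math.log(ols_rss(a, transformed) / n) + jacobian


def fit_lambda(a, proj_y, grid=None) -> float:
    """Grid-search maximum likelihood estimate of the Box-Cox exponent."""
    grid = grid or [k / 100 for k in range(-100, 201)]
    return max(grid, key=lambda lam: profile_loglik(lam, a, proj_y))


# Synthetic stand-in for the simulator output (not the paper's data):
# the transformed response is linear in a when lam = TRUE_LAMBDA.
rng = random.Random(0)
TRUE_LAMBDA = 0.3
a = [0.1 + 0.4 * i / 99 for i in range(100)]
z = [10.0 + 30.0 * ai + rng.gauss(0.0, 0.2) for ai in a]
proj_y = [(1.0 + TRUE_LAMBDA * zi) ** (1.0 / TRUE_LAMBDA) for zi in z]

lam_hat = fit_lambda(a, proj_y)
print("estimated lambda:", lam_hat)
```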

Figure 4. Box-Cox transformation with parameter $\lambda = 0.3$ for the response ProjY.

The model is now based on $y_{\mathrm{ProjY}}$ and is defined as

$$y_{\mathrm{ProjY}}(a) = \beta_0 + \beta_1 a + \varepsilon, \tag{1}$$

with $\varepsilon$ the model error such that $\varepsilon \sim \mathcal{N}(0, \sigma_\varepsilon^2)$.

The maximum likelihood method provides the estimators $\hat\beta_0$, $\hat\beta_1$ and $\hat\sigma_\varepsilon$. Hence the model implies the following result: $\forall a > 0,\ y_{\mathrm{ProjY}}(a) \sim \mathcal{N}(\hat\beta_0 + \hat\beta_1 a, \hat\sigma_\varepsilon^2)$. Then the value of the POD curve can be estimated as displayed in Figure 5.

Figure 5. Linear model illustration. The Gaussian predictive distributions for $a = \max(P_1, P_2) = 0.2$, 0.3 and 0.4 are given. The horizontal line represents the detection threshold.

We finally get the one-dimensional POD curve (see Figure 6). By considering the error that is provided by the property of a maximum likelihood estimator in the case of a linear regression, we can use this uncertainty on both $\beta_0$ and $\beta_1$ to build confidence intervals. The 95% confidence curve on the estimated POD curve is also illustrated in Figure 6.

Figure 6. Example of POD curve estimation and confidence interval with the Berens method.

3.2. Binomial-Berens mix method

Here we keep the linear regression on $y_{\mathrm{ProjY}}$, that is: $\forall a > 0,\ y_{\mathrm{ProjY}}(a) = \hat\beta_0 + \hat\beta_1 a + \varepsilon$, but we no longer assume that $\varepsilon$ is Gaussian. However, the errors are still assumed to be independent and identically distributed. We then consider that we have N of its realizations, which we gather in the following vector

$$\varepsilon^N = y^N_{\mathrm{ProjY}} - \hat\beta_0 - \hat\beta_1 a^N. \tag{2}$$

We then build its histogram and add it to the prediction of the linear model, as shown in Figure 7. By using the i.i.d. property of $\varepsilon$, we consider that we have N realizations of the random variable $y_{\mathrm{ProjY}}(a)$ for $a > 0$, and we can use them to estimate the probability for $y_{\mathrm{ProjY}}(a)$ to exceed the threshold $s$ (see Figure 7).

4
12th International Conference on Applications of Statistics and Probability in Civil Engineering, ICASP12
Vancouver, Canada, July 12-15, 2015

Figure 7. Binomial-Berens method: Berens method without the normality hypothesis. The Gaussian densities are replaced by the sample histogram.

For each $a > 0$, let $N_s(a)$ be the number of realizations of the random variable $y_{\mathrm{ProjY}}(a)$ that are higher than $s$. That is to say:

$$N_s(a) = \mathrm{Card}\left\{ (\varepsilon_i)_{i \in \{1,\dots,N\}} \mid \hat\beta_0 + \hat\beta_1 a + \varepsilon_i > s \right\}. \tag{3}$$

Therefore an estimate of $\mathrm{POD}(a)$ is given by $N_s(a)/N$, with $N_s(a) \sim \mathcal{B}(N, \mathrm{POD}(a))$. This assumption on the distribution of $N_s(a)$ can then be used to build confidence intervals on the value of $\mathrm{POD}(a)$, for $a > 0$.

3.3. Kriging method

As some criticism could be made regarding the i.i.d. property of the model error $\varepsilon$, let us set up a Gaussian process regression (Sacks et al., 1989; Fang et al., 2006) in order to build a surrogate model of the transformed output $y_{\mathrm{ProjY}}$. Now the influence of the other inputs (described in Section 2) is explicitly accounted for in the model, whereas it used to be entirely included in $\varepsilon$. That is why we now consider the set of the most influent inputs

$$x = (E,\ a,\ \mathrm{ebav}_1,\ \mathrm{ebav}_2,\ h_1,\ h_2). \tag{4}$$

Since the linear trend used in the two previous methods was relevant, we keep it as the mean of the Gaussian process that we are about to use. Therefore, the model is defined as follows:

$$y_{\mathrm{ProjY}}(x) = \beta_0 + \beta_1 a + Z(x), \tag{5}$$

where $Z$ is a centered Gaussian process. We make the assumption that $Z$ is second-order stationary with variance $\sigma^2$. Besides, we assume that its covariance kernel $k(\cdot, \cdot)$ is the Matérn 5/2 kernel, which is parameterized by its lengthscale $\theta$ ($\in \mathbb{R}^6$ in this case). Thanks to the maximum likelihood method, we can estimate the values of the so far unknown parameters: $\beta_0$, $\beta_1$, $\sigma^2$ and $\theta$.

Kriging provides an estimator of $y_{\mathrm{ProjY}}(x)$, which we write $\hat y_{\mathrm{ProjY}}(x)$. Moreover, through its variance $\sigma_Z^2(x)$, kriging quantifies the uncertainty induced by estimating $y_{\mathrm{ProjY}}(x)$ with $\hat y_{\mathrm{ProjY}}(x)$. Indeed, one has the new probability distribution:

$$\forall x, \quad y_{\mathrm{ProjY}}(x) \mid y^N_{\mathrm{ProjY}} \sim \mathcal{N}\left(\hat y_{\mathrm{ProjY}}(x), \sigma_Z^2(x)\right), \tag{6}$$

where $\hat y_{\mathrm{ProjY}}(x)$ is the kriging mean (i.e. $E\left[y_{\mathrm{ProjY}}(x) \mid y^N_{\mathrm{ProjY}}\right]$) and $\sigma_Z^2(x)$ the kriging variance. They can both be explicitly estimated. Hence we can estimate the value of $\mathrm{POD}(a)$, for $a > 0$:

$$\mathrm{POD}(a) \simeq P\left( \mathcal{N}\left(\hat y_{\mathrm{ProjY}}(x), \sigma_Z^2(x)\right) > s \mid a \right). \tag{7}$$

By using the uncertainty implied by the Gaussian distribution regressions, one can build new confidence intervals, as illustrated in Figure 8. We visualize the confidence interval induced by the Monte Carlo estimation, the one induced by the kriging approximation and the total confidence interval (including both approximations).

4. CONCLUSIONS

This paper has presented different techniques for the determination of Probability of Detection (POD) curves (flaw detection probability) in the context of Non Destructive Testing (NDT) procedures. As part of this study, we focus on the examination under wear anti-vibration bars of steam generator tubes with simulations performed by the finite-element computer code Code_Carmel3D. The model parameterization and the design of numerical experiments are explained.

For the POD curves determination, the Berens method, based on a linear regression model, is first studied. It has to be noted that the method to get confidence intervals on the POD
first introduced in MIL-HDBK-1823A (2009) and Gandosi and Annis (2010) was identified and corrected by the authors. In addition, the normality assumption on residuals required for this method may be violated. Second, we propose an alternative strategy to both build POD curves and assess confidence intervals without assuming Gaussian residuals. We compare it with the standard Berens method. In both cases mentioned above, the POD construction methods are based on a linearity assumption. We then present a third approach based on the non-linear model of the Gaussian process regression.

Figure 8. Example of POD curve estimated with a kriging model. [Plot of POD(a) against a; legend: POD, MC 95% CI, PG 95% CI, PG + MC 95% CI.]

It is important to remember that all the POD curves that we obtained are only examples as long as the inputs remain to be validated. Nevertheless these three methods proved to be valuable tools to evaluate POD curves over a wide range of problems. Further work will be devoted to developing sensitivity analysis methods (Iooss and Lemaître, 2015) dedicated to POD curves.

5. REFERENCES

Berens, A. (1988). NDE reliability data analysis, Vol. 17. Metals Handbook, 9th edition, 689–701.

Box, G. and Cox, D. (1964). “An analysis of transformations.” Journal of the Royal Statistical Society, 26, 211–252.

de Rocquigny, E., Devictor, N., and Tarantola, S., eds. (2008). Uncertainty in industrial practice. Wiley.

Fang, K.-T., Li, R., and Sudjianto, A. (2006). Design and modeling for computer experiments. Chapman & Hall/CRC.

Gandosi, L. and Annis, C. (2010). “Probability of detection curves: Statistical best-practice.” ENIQ TGR Technical Document, 41.

Iooss, B. and Lemaître, P. (2015). “A review on global sensitivity analysis methods.” Uncertainty management in Simulation-Optimization of Complex Systems: Algorithms and Applications, C. Meloni and G. Dellino, eds., Springer.

Maurice, L., Costan, V., Guillot, E., and Thomas, P. (2012). “Eddy current NDE performance demonstrations using simulation tools.” Review of Progress in Quantitative Non Destructive Evaluation, Denver, Colorado, USA, 32, 464–471.

MIL-HDBK-1823A (2009). “Nondestructive evaluation system reliability assessment.” Department of Defense Handbook, http://mh1823.com/mh1823.

Morokoff, W. and Caflisch, R. (1995). “Quasi-Monte Carlo integration.” Journal of Computational Physics, 122, 218–230.

Pasanisi, A. and Dutfoy, A. (2012). “An industrial viewpoint on uncertainty quantification in simulation: Stakes, methods, tools, examples.” Uncertainty quantification in scientific computing - 10th IFIP WG 2.5 working conference, WoCoUQ 2011, Boulder, CO, USA, August 1-4, 2011, A. Dienstfrey and R. Boisvert, eds., Vol. 377 of IFIP Advances in Information and Communication Technology, Berlin: Springer, 27–45.

Sacks, J., Welch, W., Mitchell, T., and Wynn, H. (1989). “Design and analysis of computer experiments.” Statistical Science, 4, 409–435.
