The Application of Fuzzy Logic and Genetic Algorithm
Abstract. A 3D model of oil and gas fields is important for reserves estimation,
for cost effective well placing and for input into reservoir simulators. Reservoir
characterization of permeability, litho-facies and other properties of the rocks is
essential. A good model depends on calibration at the well locations, with cored
wells providing the best data. A subset of wells may contain specialized
information such as shear velocity data, whereas other wells may contain only
basic logs. We have developed techniques able to populate the entire field
database with a complete set of log and core data using fuzzy logic, genetic
algorithms and hybrid models. Once the gaps in the well database have been filled,
well logs can be imported to a 3D modeling software package, blocked and
upscaled to match the geocellular model cell size.
1. Introduction
In this paper we describe two soft computing techniques, fuzzy logic and genetic
algorithms, for making predictions from electrical logs. These results are used to
improve reservoir characterization and modeling.
The soft computing concepts of fuzzy logic and genetic algorithms have been
around since the 1960s, but have only recently been applied to reservoir
characterization and modeling. This is mainly due to the dramatic improvement in
the speed of computers. The computer programs described in this paper take only
a couple of minutes to run on a 400 MHz computer. A number of oil and service
companies have confidential fuzzy logic and genetic algorithms software. It is
hoped that this paper will introduce these topics to the public domain.
This is the mathematics of fuzzy logic. Once the reality of the gray scale has been
accepted, a system is required to cope with the multitude of possibilities.
Probability theory helps quantify the grayness or fuzziness. It may not be possible
to understand the reason behind random events, but fuzzy logic can help bring
meaning to the bigger picture. Take, for instance, a piece of reservoir rock.
Aeolian rock generally has good porosity and fluvial rock poorer porosity. If we
find a piece of rock with a porosity of 2 porosity units (pu), is it aeolian or fluvial?
We could say it is definitely fluvial and get on with more important matters. But
let’s say it is probably fluvial but there is a slim probability that it could be
aeolian. Aeolian rocks are generally clean (i.e., contain little or no clay minerals)
and fluvial rocks shalier (i.e., contain clay minerals). The same piece of rock
contains 30% clay minerals. Is it aeolian or fluvial? We could say it is
approximately equally likely to be aeolian or fluvial based on this measurement.
This is how fuzzy logic works. It does not accept that something is either this or that.
Rather, it assigns a grayness, or probability, to the quality of the prediction on
each parameter of the rock, whether it is porosity, shaliness or colour. There is
also the possibility that there is a measurement error and the porosity is 20 pu not
2 pu. Fuzzy logic combines these probabilities and predicts that, based on
porosity, shaliness and other characteristics, the rock is most likely to be aeolian
and provides a probability for this scenario. However, fuzzy logic says that there is
also the possibility it could be fluvial, and provides a probability for this to be the
case too. In essence, fuzzy logic maintains that any interpretation is possible but
some are more probable than others. One advantage of fuzzy logic is that we never
need to make a concrete decision. In addition, fuzzy logic can be described by
established statistical algorithms, and computers, which themselves work in ones
and zeros, can do this effortlessly for us.
Fuzzy logic does not require a normal distribution to work as any type of
distribution that can be described mathematically can be used. Because of the
prevalence of the normal distribution, supported by the Central Limit Theorem
and observation, it is the best distribution to use in most cases. The normal
distribution is completely described by two parameters, its mean and variance. As
a consequence, core-plugs from a particular litho-facies may have dozens of
underlying variables controlling their porosities but their porosity distribution will
tend to be normal in shape and defined by two parameters - their average value
(arithmetic mean) and their variance, which is a measure of the width of the
distribution. This variance (the standard deviation squared) depends on the hidden
underlying parameters and measurement error. This variance, or fuzziness, about
the average value, is key to the method and the reason why it is called fuzzy logic.
To clarify the importance of the fuzzy term, take an example of two litho-types.
Aeolian facies may have an average porosity of 20 pu and a standard deviation, or
fuzziness, of ±2 pu. Fluvial facies may have an average porosity of 10 pu with a
standard deviation of ±4 pu. If we measure the porosity of an unknown facies as 15 pu, it could belong
to either litho-facies. However, it is less likely to be aeolian because the aeolian
distribution is much tighter, even though its porosity is equally distant from the
“most likely” or average porosity expected for each litho-type. Litho-facies
prediction using fuzzy logic is based on the assertion that a particular litho-facies
type can give any log reading although some readings are more likely than others.
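The two-facies example can be checked numerically. The sketch below assumes that the ±2 pu and ±4 pu figures are standard deviations, and it uses only the exponential term of a normal distribution as a relative possibility. The function name `relative_possibility` is illustrative and does not come from the original software.

```python
import math

def relative_possibility(x, mean, std):
    """Exponential term of a normal distribution: 1.0 at the mean, falling towards 0."""
    return math.exp(-((x - mean) ** 2) / (2.0 * std ** 2))

# Facies statistics from the example in the text (pu).
# Assumption: the quoted spreads are standard deviations.
aeolian = {"mean": 20.0, "std": 2.0}
fluvial = {"mean": 10.0, "std": 4.0}

x = 15.0  # measured porosity, 5 pu away from both means
r_aeolian = relative_possibility(x, **aeolian)
r_fluvial = relative_possibility(x, **fluvial)

print(f"aeolian: {r_aeolian:.3f}, fluvial: {r_fluvial:.3f}")
```

Under these assumptions the reading lies 2.5 standard deviations from the aeolian mean but only 1.25 from the fluvial mean, so the fluvial possibility comes out roughly ten times larger.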
4 The Fuzzy Mathematics of Litho-Facies Prediction
The normal distribution is given by:
$$P(x) = \frac{e^{-(x-\mu)^2 / 2\sigma^2}}{\sigma\sqrt{2\pi}} \qquad (1)$$
P(x) is the probability density that an observation x is measured in the data-set
described by the arithmetic mean µ and the standard deviation σ.
In conventional statistics the area under the curve described by the normal
distribution represents the probability of a variable x falling into a range, say
between x1 and x2. The curve itself represents the relative probability of variable x
occurring in the distribution. That is to say, the mean value is more likely to occur
than values 1 or 2 standard deviations from it. This curve is used to estimate the
relative probability, or fuzzy possibility, that a data value belongs to a particular
data set. If a litho-facies type has a porosity distribution with a mean µ and
standard deviation σ, the fuzzy possibility that a well log porosity value x is
measured in this litho-facies type can be estimated using Equation (1). The mean
and standard deviation are simply derived from the calibrating or conditioning
data set; usually core data.
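As a minimal sketch of this calibration step, the code below derives a mean and standard deviation from a small set of core porosities and evaluates Equation (1) for one log reading. The porosity values are hypothetical, and the use of the sample standard deviation is an assumption; the text does not say which estimator was used.

```python
import math
import statistics

def normal_density(x, mean, std):
    """Equation (1): normal probability density at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * std ** 2)) / (std * math.sqrt(2.0 * math.pi))

# Hypothetical core-plug porosities (pu) for one litho-facies; not field data.
core_porosity = [18.5, 21.0, 19.2, 22.4, 20.1, 17.8, 20.9]

mu = statistics.mean(core_porosity)
sigma = statistics.stdev(core_porosity)  # sample standard deviation (assumed choice)

log_porosity = 19.0  # a well-log reading to be assessed against this facies
possibility = normal_density(log_porosity, mu, sigma)
print(f"mu={mu:.2f} pu, sigma={sigma:.2f} pu, P(x)={possibility:.4f}")
```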
Where there are several litho-facies types in a well, the porosity value x may
belong to any of these litho-facies, but some are more likely than others. Each of
these litho-facies types has its own mean and standard deviation, such that for f
litho-facies types there are f pairs of µ and σ. If the porosity measurement is
assumed to belong to litho-facies f, the fuzzy possibility that porosity x is
measured (logged) can be calculated using Equation (1) by substituting µf and σf.
Similarly, the fuzzy possibilities can be computed for all f litho-facies. These
fuzzy possibilities refer only to particular litho-facies and cannot be compared
directly as they are not additive and do not sum to unity. It is necessary, therefore,
to devise a means of comparing these possibilities.
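We would like to know the ratio of the fuzzy possibility for each litho-facies to the
fuzzy possibility of the mean or most likely observation. This is achieved by
de-normalizing Equation (1):
$$R(x_f) = e^{-(x-\mu_f)^2 / 2\sigma_f^2}$$
Weighting this relative possibility by the square root of n_f, the expected
occurrence of litho-facies f in the calibrating data set, gives the fuzzy possibility:
$$F(x_f) = \sqrt{n_f}\, e^{-(x-\mu_f)^2 / 2\sigma_f^2} \qquad (4)$$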
The fuzzy possibility F(x_f) is based on the porosity measurement (log), x, alone.
This process is repeated for a second log type such as the volume of shale, y. This
will give F(y_f), the fuzzy possibility of the measured volume of shale y belonging
to litho-facies type f. This process can be repeated for another log type, say z, to
give F(z_f). At this point we have several fuzzy possibilities (F(x_f), F(y_f), F(z_f), …)
based on the fuzzy possibilities from different measurements (x, y, z, …)
predicting that litho-facies type f is most probable. These fuzzy possibilities are
combined harmonically to give a combined fuzzy possibility:
$$\frac{1}{C_f} = \frac{1}{F(x_f)} + \frac{1}{F(y_f)} + \frac{1}{F(z_f)} + \dots \qquad (5)$$
This process is repeated for each of the f litho-facies types. The litho-facies that is
associated with the highest combined fuzzy possibility is taken as the most likely
litho-facies for that set of logs. The associated fuzzy possibility Cf(max) provides
the confidence factor for the litho-facies prediction. There are statistical
techniques for combining probabilities based on Bayes' theorem. The fuzzy logic
technique described in this paper has been developed by analysis of large data sets
from many oil fields, and differs from Bayes' theorem in two respects. First, the fuzzy
possibilities in fuzzy logic are combined harmonically, whereas the Bayes
approach combines probabilities geometrically. When comparing lithologies that
are equally likely, with similar probabilities, the harmonic combination
emphasizes any indicator that suggests the lithology selection is unlikely.
Second, fuzzy logic weights the possibilities by the square root of the proportion
in the calibrating data set, whereas the Bayes approach uses the direct proportion.
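The prediction procedure can be sketched as follows: compute Equation (4) for each log and each litho-facies, combine the per-log values with Equation (5), and select the facies with the highest combined possibility. The calibration numbers, dictionary layout and function names are hypothetical, and n_f is taken here to be the proportion of each facies in the calibrating data.

```python
import math

def fuzzy_possibility(x, mean, std, proportion):
    """Equation (4): exponential term weighted by the square root of the facies proportion."""
    return math.sqrt(proportion) * math.exp(-((x - mean) ** 2) / (2.0 * std ** 2))

def combine_harmonically(possibilities):
    """Equation (5): 1/C = sum of 1/F over the logs. Any zero possibility gives C = 0."""
    if any(p == 0.0 for p in possibilities):
        return 0.0
    return 1.0 / sum(1.0 / p for p in possibilities)

def predict_facies(reading, calibration):
    """Return (facies, combined possibility) for the most likely facies."""
    combined = {}
    for facies, stats in calibration.items():
        per_log = [
            fuzzy_possibility(reading[log], *stats["logs"][log], stats["proportion"])
            for log in reading
        ]
        combined[facies] = combine_harmonically(per_log)
    best = max(combined, key=combined.get)
    return best, combined[best]

# Hypothetical calibration: (mean, std) per log, plus the facies proportion in the core data.
calibration = {
    "aeolian": {"proportion": 0.6, "logs": {"porosity": (20.0, 2.0), "vshale": (5.0, 5.0)}},
    "fluvial": {"proportion": 0.4, "logs": {"porosity": (10.0, 4.0), "vshale": (30.0, 10.0)}},
}

facies, confidence = predict_facies({"porosity": 12.0, "vshale": 25.0}, calibration)
print(facies, round(confidence, 3))
```

Because the harmonic combination is dominated by its smallest term, a single log that makes a facies unlikely pulls the combined possibility down sharply, which is the behaviour the text describes.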
Litho-facies prediction using fuzzy logic is based on the assertion that a particular
litho-facies type can give any log reading although some readings are more likely
than others. For instance, clean aeolian sand is most likely to have a high porosity,
although there is a finite probability that the logging tool could measure a low
porosity. It is important to have a consistent set of logs between wells, although
accuracy is not essential. In practice the best curves to use are the porosity log (in
pu), as this can be calibrated to core, and the normalized gamma ray (in API
units). The gamma ray can be normalized by creating a frequency distribution of
the gamma ray readings within the reservoir formation. The fifth-percentile point
is determined for each well, and this point is regarded as the clean point. This
clean point plus a fixed number of API units (say 100 API) determines the shale
point. The gamma-ray log can then be re-scaled between 0 and 100%.
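The normalization steps above can be sketched as follows. The nearest-rank percentile convention and the clipping of values outside the clean-to-shale range are assumptions made for this illustration, and the readings are hypothetical.

```python
import math

def normalize_gamma_ray(gr_values, shale_offset_api=100.0, percentile=5.0):
    """Rescale gamma-ray readings (API) to 0-100% between a clean point and a shale point."""
    ordered = sorted(gr_values)
    # Nearest-rank percentile (assumed convention; other definitions interpolate).
    rank = max(1, math.ceil(percentile * len(ordered) / 100.0))
    clean_point = ordered[rank - 1]
    shale_point = clean_point + shale_offset_api
    normalized = []
    for gr in gr_values:
        fraction = (gr - clean_point) / (shale_point - clean_point)
        # Clipping to [0, 1] is an assumption; the text only says the log is re-scaled.
        normalized.append(100.0 * min(1.0, max(0.0, fraction)))
    return clean_point, shale_point, normalized

# Hypothetical gamma-ray readings (API) from one well within the reservoir formation.
gr_log = list(range(20, 120, 5))
clean, shale, gr_normalized = normalize_gamma_ray(gr_log)
print(clean, shale, gr_normalized[:3])
```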
Any number of curves can be used by the technique. However, the addition of
further curves may not necessarily improve the prediction as the porosity and
shaliness response to the litho-facies type generally controls other log responses.
The photoelectric, nuclear magnetic resonance and resistivity log curves are
possible exceptions to this rule.
The Viking area is located on the northern flank of the Permian Rotliegendes
Sandstone in the Southern North Sea. The Viking field was developed in 1972 and
to date has produced 2.8 Tcf of gas. Consideration has recently been given to tying
back several smaller satellite pools. As part of the feasibility study, 13 exploration
and production wells, drilled between 1969 and 1994, have been re-evaluated
using fuzzy logic.
Sabkha. Sandy sabkha has good porosity but the presence of detrital clay
enhances compaction effects and thus reduces primary porosity. Muddy
sabkha porosities and permeabilities are very low with no reservoir potential.
Fluvial. The fluvial sandstones often have poorer permeabilities (<0.3 mD)
and porosities (<10 pu) than the sandy sabkha sandstones. Their porosity is
dependent on the detrital clay content and pore filling cements.
One recent well with substantial core coverage was used to calibrate the litho-
facies and permeability predictor for the older wells. The left track of Figure 1
shows the core-described facies from this well. Several litho-facies are
described: aeolian, fluvial and sabkha. The aeolian facies is sub-divided into
grainflow, wind-ripple and sand sheet sub-facies; the sabkha facies into sandy,
mixed and muddy sub-facies; and the fluvial facies into cross-bedded and
structureless sub-facies. The result of the fuzzy predicted litho-facies is shown in
the second track. There is near perfect differentiation between aeolian, fluvial and
sabkha rock types. In addition, the technique goes some way towards
differentiating between sandy, mixed and muddy sabkhas. The right track shows
the comparison of core derived and fuzzy predicted permeabilities. It must be
remembered that the core descriptions themselves are from observations and can
contain errors due to the subjective nature of the measurement. Consequently,
sedimentologists can use predicted litho-facies as an aid to refining core
interpretations. This example of a self-calibrated well has helped the sub-surface
team develop the Viking satellite reservoir pools. “Blind-testing” between wells
can test the predictive ability of the technique in the same field. This was
conducted on data from the South Ravenspurn field.
The South Ravenspurn gas field is located in the southern North Sea, 40 miles off
the English coast. Gas reserves are around 1 Tcf, and current production is 200
mmscf/d. The field is developed by some 40 wells, in shallow water no more than
50 meters deep. Descriptions from 10 cored wells were used to derive facies in 30
uncored wells. The left well shown in Figure 2 shows the described and predicted
facies types for one cored well in the field. The prediction success rate is over
86% compared to a random prediction rate of 13%. The prediction success rate is
calculated as the number of correct predictions divided by the total number of
possible predictions. When we are attempting to predict X facies types, say 10, a
random prediction success rate would be around 1/X or 10%. Any prediction
method is expected to produce successful predictions greater than this threshold.
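The success-rate definition and the random baseline can be written out as a short sketch. The facies labels below are hypothetical and the function names are illustrative.

```python
def prediction_success_rate(described, predicted):
    """Number of correct predictions divided by the total number of possible predictions."""
    if len(described) != len(predicted):
        raise ValueError("described and predicted must have the same length")
    correct = sum(1 for d, p in zip(described, predicted) if d == p)
    return correct / len(described)

def random_baseline(n_facies):
    """Expected success rate when guessing uniformly among n_facies types."""
    return 1.0 / n_facies

# Hypothetical core-described and predicted facies at five depths.
described = ["aeolian", "aeolian", "fluvial", "sabkha", "fluvial"]
predicted = ["aeolian", "fluvial", "fluvial", "sabkha", "fluvial"]

rate = prediction_success_rate(described, predicted)
print(f"success rate {rate:.0%}, random baseline {random_baseline(10):.0%}")
```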
Using the fuzzy relationships between the described litho-facies and electrical
logs, litho-facies were predicted in a second well shown on the right of Figure 2.
The prediction success in this second well between the predicted facies and
“hidden” but known and core-described facies is 73%, with the majority of the
“failed” predictions falling into the next closest litho-facies type rather than one
with completely different reservoir characteristics.
This logic is clarified using Figure 3. For simplicity it shows only 5 bins that
represent each of the 5 familiar decades for logarithmic permeability. The diagram
shows only 2 axes (porosity and volume of shale) whereas the technique can use
an unlimited number of bins in n-dimensional space. The mean value of porosity
and volume of shale for each permeability bin is represented by the point at the
center of each cross. For instance, for core permeability greater than 100 mD the
average porosity and volume of shale are 26 pu and 12%, respectively. The
vertical and horizontal lines through each point represent the error bar or standard
deviation (fuzziness) of data in that bin. The error bars are different for each bin.
The resulting permeability line, through the points, is field specific and is “S”
shaped and shown without error. A real cross plot of log data would show
considerable scatter about this curve. A single curve predictor would predict
different permeabilities depending on whether porosity or volume of shale was taken
as the predictor. Take a log depth that has a porosity of 23 pu and volume of shale
of 30% as shown on Figure 3. A porosity only predictor would estimate a
permeability of 10-100 mD by extrapolating the point vertically. The volume of
shale only predictor would give a permeability of 0.1-1 mD by extrapolating the
point horizontally.
In contrast, fuzzy logic can deal with “shades of gray”. The point at 23 pu and
30% volume of shale would be compared to all permeability bins. Knowing the
mean and standard deviation of each bin, the fuzzy possibility that the point lies in
that bin can be calculated using Equation (3). It is not necessary to normalize the
distributions because the permeability bins are of equal size. This is done
separately for porosity and the volume of shale. Their fuzzy possibilities are
combined to predict the permeability for that log depth with its associated fuzzy
possibility or “grayness”. Figure 4 shows typical results of this analysis where
each of the ten permeability bins has an associated fuzzy possibility. The highest
fuzzy possibility is taken as the most probable permeability for that combination
of log measurements. A predicted permeability is calculated as the weighted mean
of the two most probable bins.
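A minimal sketch of this bin-based prediction follows. Apart from the 26 pu and 12% means quoted for the highest bin, all bin statistics are hypothetical. The harmonic combination follows Equation (5); the possibility-weighted mean in log10 space is an assumed weighting scheme, since the text does not specify how the two bins are weighted.

```python
import math

def relative_possibility(x, mean, std):
    """Exponential term of the normal distribution, without normalization."""
    return math.exp(-((x - mean) ** 2) / (2.0 * std ** 2))

def harmonic(values):
    """Harmonic combination; any zero possibility gives zero."""
    return 0.0 if min(values) == 0.0 else 1.0 / sum(1.0 / v for v in values)

# Hypothetical bin statistics: (log10 of bin-centre permeability in mD,
# (mean, std) of porosity in pu, (mean, std) of volume of shale in %).
bins = [
    (-1.5, (8.0, 3.0), (45.0, 10.0)),
    (-0.5, (13.0, 3.0), (35.0, 8.0)),
    (0.5, (18.0, 3.0), (27.0, 7.0)),
    (1.5, (22.0, 2.5), (20.0, 6.0)),
    (2.5, (26.0, 2.0), (12.0, 5.0)),
]

def predict_permeability(porosity, vshale):
    """Return (permeability in mD, highest combined possibility)."""
    scored = []
    for log_k, (phi_mean, phi_std), (vsh_mean, vsh_std) in bins:
        combined = harmonic([
            relative_possibility(porosity, phi_mean, phi_std),
            relative_possibility(vshale, vsh_mean, vsh_std),
        ])
        scored.append((combined, log_k))
    scored.sort(reverse=True)
    (c1, k1), (c2, k2) = scored[0], scored[1]
    if c1 + c2 == 0.0:
        return 10.0 ** k1, c1
    # Possibility-weighted mean of the two most probable bins (assumed scheme).
    return 10.0 ** ((c1 * k1 + c2 * k2) / (c1 + c2)), c1

k_md, best = predict_permeability(23.0, 30.0)
print(f"predicted permeability {k_md:.1f} mD, possibility {best:.3f}")
```

With these illustrative statistics, the point at 23 pu and 30% volume of shale falls between the porosity-only and shale-only estimates rather than following either curve alone.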
The program uses any number of permeability bins with any number of input
curves. The distribution of bin boundaries depends on the range of expected
permeabilities, as described above. The number of bins depends on the number of
core permeabilities available for calibration, the statistical sample size. A
reasonable sample size is around 30. Consequently the number of bins is
determined so that there are at least 30 sample points per bin. For a well with
300 core permeabilities it would be appropriate to use 10 permeability bins. The
permeability prediction has also been attempted using genetic algorithms [6].
Vertical permeability can be predicted simultaneously by simply comparing the
core vertical permeabilities with the logs in a similar manner.
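The rule of thumb for the number of bins amounts to a one-line calculation. The sketch below assumes integer division and a minimum of one bin; the function name is illustrative.

```python
def number_of_bins(n_core_samples, min_per_bin=30):
    """Largest bin count that still leaves about min_per_bin samples per bin on average."""
    return max(1, n_core_samples // min_per_bin)

print(number_of_bins(300))
```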
The right hand track of Figure 5 shows the comparison between core-derived and
fuzzy-predicted permeabilities in one of the cored Ula wells. “Blind-testing”
between wells was used to test the predictive ability of the technique. To test the
fuzzy prediction, the technique was calibrated in a cored well and “blind-tested” in
another well to see how well it fitted the actual core permeabilities. Figure 6
shows the second well where permeabilities were predicted using the calibration
from the first well. The comparison between the predicted and core-derived
permeabilities is good compared to the natural spread in permeability data.
In a recent study, calibration data from 4 wells with shear velocity data were used
to populate all the wells in a large field. This gave the oil company a cost effective
method of building a 3D reservoir model that enabled improved location of oil
wells. Dts can be acquired by dipole logging tools. If Dts data have not been
acquired by logging, they can be estimated from other curve responses using genetic
algorithms.
Shear velocities are related to porosity φ, formation resistivity Rt, and the volume
of shale Vsh. Porosity is the measure of pore space in the rock matrix that is filled
with reservoir fluids such as oil, gas and water. Formation resistivity is the inverse
of the conductivity of the fluid-saturated rock. The volume of shale, in this
context, is a normalized measure of the radioactivity of the rock matrix by
measuring the formation gamma-ray background. Porosity φ, Rt and Vsh are
measured by borehole electrical logs.
The next step is to provide a method for determining how good a given f(φ,Rt,Vsh)
is as a predictor of Dts. The approach we adopt is to sum absolute errors in
prediction over all depth levels for a given borehole. We seek a function of the
form Equation (6), which minimizes this sum. A more standard way to do this
might be to use least squares rather than absolute values of residuals. The reason
for the approach that we take is that the borehole data is noisy and includes many
“outliers”. These can only be removed by extensive manual editing of the data sets
and rechecking of measurements. By using the absolute value of residuals, one
diminishes the effect of noise and outliers and produces more appropriate
predictor functions. Mathematically, the problem can be stated as:
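One way to write the objective described in this paragraph is given here as a sketch. The depth index i and the number of depth levels N are notation introduced for illustration:

$$\min_{f} \; \sum_{i=1}^{N} \left| D_{ts,i} - f(\phi_i, R_{t,i}, V_{sh,i}) \right|$$

For comparison, the least-squares alternative mentioned above would minimize

$$\sum_{i=1}^{N} \left( D_{ts,i} - f(\phi_i, R_{t,i}, V_{sh,i}) \right)^2$$

in which each residual is squared, so a single outlier contributes disproportionately to the sum.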
Each chromosome is a vector of length 10. Three alleles are binary integer values
that represent the mathematical operators •1, •2, •3. The rest of the alleles are
floating point values that represent the coefficients a,b,c,d,e,g,h. The initial
population is generated by creating chromosomes with random binary numbers
for •1, •2, •3 and random floating point numbers for the coefficients a,b,c,d,e,g,h.
If the allele represents the operator •, its value is binary and it will be switched. If
the allele represents one of the real variables, it will be modified by multiplication
by a value randomly picked from the range 0.8 to 1.2. This range decreases in
value as the number of generations increases. This provides a method that allows
the search to become more local towards the end of the algorithm as better
solutions emerge.
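The chromosome layout and mutation rule can be sketched as follows. The initial coefficient range, the linear shrinkage of the multiplier range, and all function names are assumptions made for this illustration; the text states only that the range starts at 0.8 to 1.2 and decreases as the generations increase.

```python
import random

N_OPERATORS = 3     # binary alleles for the three operators
N_COEFFICIENTS = 7  # floating-point alleles for a, b, c, d, e, g, h

def random_chromosome(rng, low=-10.0, high=10.0):
    """Vector of length 10: three binary operator alleles, then seven coefficients.

    The coefficient range [low, high] is an assumed starting range.
    """
    operators = [rng.randint(0, 1) for _ in range(N_OPERATORS)]
    coefficients = [rng.uniform(low, high) for _ in range(N_COEFFICIENTS)]
    return operators + coefficients

def mutation_half_width(generation, n_generations, start=0.2):
    """Half-width of the multiplier range; linear shrinkage is an assumed schedule."""
    return start * (1.0 - generation / n_generations)

def mutate(chromosome, index, generation, n_generations, rng):
    """Return a copy of the chromosome with the allele at `index` mutated."""
    mutated = list(chromosome)
    if index < N_OPERATORS:
        mutated[index] = 1 - mutated[index]  # switch the binary operator allele
    else:
        width = mutation_half_width(generation, n_generations)
        mutated[index] *= rng.uniform(1.0 - width, 1.0 + width)
    return mutated

rng = random.Random(0)  # fixed seed so the sketch is repeatable
parent = random_chromosome(rng)
child_operator = mutate(parent, 0, generation=0, n_generations=100, rng=rng)
child_coefficient = mutate(parent, 5, generation=0, n_generations=100, rng=rng)
print(len(parent), parent[:3], child_operator[:3])
```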
Figures 8 and 9 show the recorded data cross-plotted against the predictions for
the fuzzy logic and genetic algorithm techniques. Data from 4 wells in the field are
shown. The key point to notice is the good predictions by both techniques at the
extremes of the scale. This is where methods such as linear regression fail. The
extremes of the scale are often the most important in reservoir characterization.
The “stratification” of the data in Figure 8 is a result of fuzzy logic being a
binning technique. This effect is not noticed when the data is displayed as a
continuous curve as in Figure 7. The fuzzy logic technique has the advantage of
not requiring all the input data to make a prediction. Often curves are missing
from some wells due to borehole problems. Genetic algorithms derive an equation
that calculates a continuous curve. This equation therefore requires all the input curves to
make a prediction, but does not show any stratification. In this example fuzzy
logic and genetic algorithms compute shear velocities by two independent
methods and therefore provide confidence in the predictions.
11 Comparison of Fuzzy Logic and Genetic Algorithms with Other Methods
Genetic algorithms and fuzzy logic (GAFL) are two ways, out of many, of making
predictions from logs. Standard statistical techniques such as least squares
regressions are essential tools of the geoscientist but are poor at predicting
extremes, whereas fuzzy logic seeks these out. However, least squares regression
has the ability to extrapolate and predict values outside the range of the
conditioning data set whereas fuzzy techniques are confined to look only within
the calibrating data set.
Neural networks are a promising technique, but require the correct amount of
conditioning. In addition, neural networks are very hard to “figure out” and are
therefore often regarded as “black boxes”. By contrast, GAFL results are
completely open and easy to understand, and relate to the problem at hand.
Although interpreting fuzzy results is simple, they often describe complex
non-linear systems that would defy conventional logic. Cluster analysis works well but
can have difficulty in dealing with data equidistant from cluster centers, and
requires extensive user interaction via cross plots. Artificial intelligence and expert
systems have clear decision logic and generally ignore or minimize the error in the
data. These other methods have their place and are valuable to the geosciences.
There is no reason why they should not incorporate elements of GAFL or
complement the GAFL results.
12 Conclusions
Fuzzy logic and genetic algorithms have found several applications in reservoir
characterization and modeling including the prediction of litho-facies,
permeability and shear velocities. These are simple tools for confirming known
correlations, or as powerful predictors in uncored wells. Litho-facies typing is
used for well correlation and as input for building a 3D model of the field.
Permeability prediction is useful to complement current technology and to gain
insight into older wells without core or extensive logging programmes. The
measurement of shear velocities is important for understanding reservoir rock
The methods described here use basic log data sets such as porosity and density,
which are cheap and easy to obtain, rather than depending on new and expensive
logging technology. Over recent years, oil exploration has suffered due to erratic
and often low oil prices. Oil producing countries are now struggling to meet
demand, and there is an urgent need to find new reservoirs and make efficient use
of existing resources. Fuzzy logic and genetic algorithms make an important
contribution to this endeavor.
13 Nomenclature
x = log variable
µ = arithmetic mean
σ = standard deviation
nf = expected occurrence of x in litho-facies f
µf = arithmetic mean value of x in litho-facies f
σf = standard deviation of x in litho-facies f
P(x) = fuzzy possibility density of an observation x
R(x_f) = relative fuzzy possibility of x
F(x_f) = fuzzy possibility of x belonging to litho-facies f
Cf = combined fuzzy possibility
Dts = shear velocity
φ = porosity
Rt = resistivity
Vsh = volume of shale
a,b,c,d,e,g,h = numerical coefficients to be determined
•1 , •2 , •3 = operators representing either addition or multiplication
