Formulation Simplified
Finding the Sweet Spot through Design and
Analysis of Experiments with Mixtures
Mark J. Anderson
Patrick J. Whitcomb
Martin A. Bezener
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made
to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all
materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all
material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been
obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future
reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in
any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, micro-
filming, and recording, or in any information storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.
copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-
8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that
have been granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identi-
fication and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
Contents
Preface
Acknowledgments
Authors
Introduction
All that is gold does not glitter; Not all those who wander are lost.
—J. R. R. Tolkien (The Fellowship of the Ring)
This book rounds out our series of “Simplified” books (Anderson and
Whitcomb, 2015, 2016) into a trilogy on the design of experiments (DOE).
It may not achieve the stature of Tolkien’s towering trio—The Lord of the
Rings, but the detailing of mixture design completes our quest to provide
the statistical tools needed by modern-day industrial experimenters. The
beneficiaries of this third “Simplified” book will be formulators of alloys,
beverages, chemicals, cosmetics, construction materials (such as concrete),
food, flavors, pharmaceuticals, paints, plastics, pulp, paper, rubber, textiles,
and so forth, that is, any product made from stuff.
Formulation Simplified is derived from a popular workshop on mixture
design that my coauthor, Pat, developed over twenty years ago. He’s worked
unstintingly to continuously incorporate new statistical methods that prove
to be of practical use. More recently, statistician Martin Bezener joined our
team at Stat-Ease and took to mixtures like a barista to coffee. However, it’s
one thing to be trained intensively by expert instructors like Pat or Martin, but
another thing to learn on your own from a book. That’s where I come in by
making these powerful statistical tools of experimental design and analysis as
unintimidating as possible in a self-study, written format. Luckily, I can rely on
Pat and Martin to bolster any inadequate mathematical details, thus helping us
maintain statistical rigor throughout. If we feel that this may create too much
information (TMI) for some readers, the in-depth explanations go into sidebars
or appendices that can be glossed over (at least on the first go-through!).
What differentiates Formulation Simplified from the standard statisti-
cal texts on mixture design by Cornell (2002) and Smith (2005) is that we
make things relatively easy and fun to read. To convey my experience that
Mark J. Anderson
Acknowledgments
Mark J. Anderson
Authors
To avoid disheartenment, this book swoops down from above the forest to
treetop level, where it remains for the most part. Go ahead and enjoy the
ride for the first pass through Formulation Simplified—it will be a far easier
read than any other statistical textbook you are likely to see, especially if
you skip the formula-laden appendices and the serious sidebars (read the
trivial ones just for fun).
However, if you genuinely hope to master the tools for design and analysis of
experiments with mixtures, go back and do your homework via the practice
problems. Be sure to pursue the links to web-based content that provides
many of the details you will need to interpret the statistical analyses and
graphics from the software we make available or others that offer the same
features (there are several good alternatives that can be easily searched out
if not already at your fingertips from your enterprise’s server).
Finally, to leave no leaf unturned, consider going back to the first two
books in this trilogy—DOE Simplified and RSM Simplified. Even if you have
already read these two books, it will be useful to leaf through them (pun
intended) and review the detailed statistical tools presented that remain
useful for mixture design and analysis, for example—diagnostics for model
validation.
See the flowchart for specific parts of the prior Simplified books that will
be very relevant to what’s covered in Formulation Simplified.
[Flowchart: DOE Simplified (Chapters 1–4) -> RSM Simplified (Chapters 1, 2, 6–9, 11, and glossary of terms) -> Formulation Simplified]
Ultimately this all becomes just an academic exercise (like reading a text-
book about how to ride a bicycle) unless you actually take these tools for
a spin on your own. It will not be hard to find a proper application for
mixture design and analysis—just consider your favorite food or beverage
and search out the sweet spot in their formulation. You will see plenty of
other ideas throughout the book on experiments you can do at home, but
better yet, dive in on something that will benefit your sponsor or employer.
The secret weapon you will develop by reading this book and putting its
tools into practice is the ability to handle many ingredients simultaneously—
not just one at a time, as dictated from time immemorial by “the scientific
method.” As you will see from example after example, one needn’t hold
all else constant while changing only one thing. Instead, take advantage
of modern-day parallel processing schemes with mixture designs that
provide multicomponent testing. This is the forest we hope you will see
from the highest level, which then will provide the necessary motivation
for sharpening up your ax before going back to hacking at the trees of
formulation development.
Chapter 1
Getting Your Toe into Mixtures
Come on in—the water’s fine! Ok, maybe you’d do best by first sampling
the temperature of the pool with just your finger or toe. That’s what we
will try to do in this chapter—start with the simple stuff before getting too
mathematical and statistical about mixture design and modeling for optimal
formulation. Our two previous books, DOE Simplified and RSM Simplified,
both featured chapters on mixture design that differentiate this tool from
factorials and response surface methods (RSM), respectively. However, if
you did not read these books, that’s OK. We will start with an empty pool
and fill it up for you!
It’s natural to think of mixtures as liquids, such as the composition of
chemicals a pool owner must monitor carefully to keep it sanitary. However,
mixtures can be solids too, such as cement or pharmaceutical excipients—
substances that bind active ingredients into pills. The following two defini-
tions of mixtures leave the form of matter open:
the little they sip, after several samples, the amount of alcohol ingested
could matter—not just the proportions. Therefore, one should never
permit sensory evaluators to consume alcoholic beverages—only sip,
spit, and rinse the mouth afterward with water. Keep that in mind if you
wish to apply the methods of Cornell’s landmark book (2002) to such
purpose (e.g., if you become inspired by our case study in Chapter 2
on blending beers).
PS: Of course, some mixtures are better liberally applied—for example,
primer paint—the more, the better for hiding power. This would be a good
candidate for a “mixture-amount” design of formulation experiment. Such designs
require more complicated approaches and modeling, so let’s set this aside
for now. We will devote our full attention to “mixture-amount” experi-
ments towards the end of the book.
Figure 1.1 This exquisite necklace, now in London’s British Museum, came from
a necropolis (burial site) on Rhodes. It features Artemis, the Greek goddess of hunting.
(Courtesy of Bridgeman Art Library, London/New York.)
A EUREKA MOMENT!
You may recall from studying Archimedes’ principle of buoyancy that
this Greek mathematician, physicist, and inventor, who lived from 287 to 212
BC, was asked by his king (Hiero of Syracuse) to determine whether
a crown was pure gold or was alloyed with a cheaper, lighter metal.
Archimedes was confused as to how to prove this, until one day, when
he started observing the overflow of water from his bathtub, he suddenly
realized that, since gold is denser, a given weight of gold represents
a smaller volume than an equal weight of the cheap alloy. Therefore, a
given weight of gold would displace less water. Delighted at his discovery,
Archimedes ran home without his clothes, shouting “Eureka! Eureka!”
which means “I have found it! I have found it!” When you make your
discovery with the aid of mixture design for an optimal formulation, feel
free to yell “Eureka!” as well, but wait until you get dressed.
PS: If you have a copy of DOE Simplified, 3rd Edition, see the Chapter 9
sidebar “Worth its weight in gold?” It provides information on a linear blend-
ing model we derived based on the individual densities of copper versus
the much heavier (nearly double) gold.
Getting Your Toe into Mixtures ◾ 5
The ancient Greek and Roman goldsmiths mixed their solder by a simple
recipe of 2 parts gold and 1 part copper (Humphrey et al., 1998). The use
of “parts,” while extremely convenient for formulators as a unit of measure,
is very unwieldy for doing mathematical modeling of product performance.
The reason is obvious: the more parts of one material that you add, the
more diluted the other ingredients become, but there is no quantitative
accounting for this. For example, some goldsmiths added 1 part of silver to
the original recipe. That now brings the total to 4 parts, and thus the gold
becomes diluted further (2 parts out of 4, or 50%, versus the original
concentration of 2/3, or about 67%). Therefore, one of the first things we must
do is wean formulators wanting to use modern tools of mixture design off
the old-fashioned use of parts. In this case, it will be convenient to spec-
ify the metal mixture by weight fraction—scaled from zero (0) to one (1).
However, all that matters is that the total is fixed, such as one for the weight
fraction. Alternatively, if our goldsmith used a 50-milliliter crucible, then
the ingredients could be specified by volume—provided that when added
together, they will always be 50 mL. You will see various units of measure-
ments used in mixture designs throughout this book, although the most
common may be by weight. Regardless, the first thing we will always spec-
ify is the total.
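The conversion from parts to fractions of a fixed total can be sketched in a few lines of Python. This is a minimal illustration, not part of the original text; the function name `parts_to_fractions` is hypothetical, and the recipes are the 2:1 gold-copper solder and its silver-extended variant described above.

```python
def parts_to_fractions(parts):
    """Convert a recipe given in parts to fractions that sum to one."""
    total = sum(parts.values())
    if total <= 0:
        raise ValueError("recipe must contain a positive total amount")
    return {name: amount / total for name, amount in parts.items()}


# Original solder: 2 parts gold, 1 part copper
original = parts_to_fractions({"gold": 2, "copper": 1})
# Variant: 1 part silver added, which dilutes the gold
with_silver = parts_to_fractions({"gold": 2, "copper": 1, "silver": 1})

print(f"gold fraction, original recipe: {original['gold']:.3f}")
print(f"gold fraction, with silver:     {with_silver['gold']:.3f}")
```

Once every recipe is expressed this way, the total is fixed at one and the dilution of gold becomes explicit rather than hidden in the count of parts.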
Getting back to the task at hand, let’s see the results for the temperature
at which various mixtures of gold and copper begin to melt. Assume this
was done in ancient times when measurements were not very accurate.
(This is a pretend experiment!) We’ve covered the entire range from zero to
one of each metal (Table 1.1).
Notice that the table sorts the blends by their purity of gold. The
actual order for experimentation can be assumed to be random. As
emphasized in both our previous books on statistical design, random-
ization provides insurance against lurking variables such as warm-up
effects from the furnace, cross-contamination in the crucible, learning
curves of operators, and so forth. As R. A. Fisher, the inventor of modern-day
industrial statistics, said, “Designing an experiment is like
gambling with the devil: only a random strategy can defeat all his
betting systems.”
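A random run order for the ten blends of Table 1.1 could be generated as follows. This is only a sketch using Python's standard library; the seed value is an arbitrary assumption, included so that the shuffled order can be reproduced, and any design software will offer its own randomization.

```python
import random

# Blend numbers 1..10, in the sorted order shown in Table 1.1
blends = list(range(1, 11))

# Hypothetical seed, used only so the same order can be regenerated
rng = random.Random(2018)
run_order = blends[:]
rng.shuffle(run_order)

for run, blend in enumerate(run_order, start=1):
    print(f"run {run:2d}: blend {blend}")
```

Every blend still gets made exactly once; only the sequence in which the furnace sees them changes.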
Table 1.1 Melt Points of Copper versus Gold and Mixtures of the Two
Blend #  Point Type         Blend Type  Gold (wt fraction)  Copper (wt fraction)  Melt Point (deg C)
1        Vertex             Pure        0.00                1.00                  1073
2        Vertex             Pure        0.00                1.00                  1063
3        Vertex             Pure        0.00                1.00                  1083
4        Axial check blend  Quarter     0.25                0.75                  955
5        Third edge         Third       0.33                0.67                  951
6        Centroid           Binary      0.50                0.50                  926
7        Third edge         Third       0.67                0.33                  929
8        Axial check blend  Quarter     0.75                0.25                  952
9        Vertex             Pure        1.00                0.00                  1049
10       Vertex             Pure        1.00                0.00                  1036
This mixture model, developed by Henri Scheffé (1958), is derived from the
conventional second-order polynomial for process RSM, called a quadratic
equation. The mathematical details are spelled out by Cornell (2002). Two
things distinguish Scheffé’s polynomial from that used for RSM. First, there
is no intercept. Normally this term represents the response when factors are
set to zero—set by standard coding to their midpoints for process modeling.
However, a mixture would disappear entirely if all the components went to
zero—we can’t have that! The second aspect of this second-order mixture-
model that differs from those used for RSM is that it lacks the squared
terms. Again, refer to Cornell’s book for the mathematical explanation, but
suffice it to say for our purposes that the x1 x2 terms capture the nonlinear
blending behavior—in this case, one that is synergistic, that is—a desirable
combination of two components.
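To make the structure of this model concrete, here is a minimal sketch that fits a second-order Scheffé polynomial (terms x1, x2, and x1x2, with no intercept and no squared terms) to the Table 1.1 data by ordinary least squares. It assumes NumPy is available and takes x1 as gold and x2 as copper; it is an illustration, not the software used by the authors, so the fitted values should only land close to the coefficients quoted in this chapter (about 1043, 1072, and -536).

```python
import numpy as np

# Table 1.1: gold weight fraction (x1) and observed melt point (deg C)
gold = np.array([0.00, 0.00, 0.00, 0.25, 0.33, 0.50, 0.67, 0.75, 1.00, 1.00])
melt = np.array([1073, 1063, 1083, 955, 951, 926, 929, 952, 1049, 1036],
                dtype=float)
copper = 1.0 - gold  # x2: the components always total one

# Model matrix with columns x1, x2, x1*x2 (no intercept, no squared terms)
X = np.column_stack([gold, copper, gold * copper])
coef, *_ = np.linalg.lstsq(X, melt, rcond=None)
b1, b2, b12 = coef

print(f"b1 (gold) = {b1:.1f}, b2 (copper) = {b2:.1f}, b12 = {b12:.1f}")
```

Because x1 + x2 = 1 in every row, an intercept column would be redundant with the two linear columns, which is one way to see why the Scheffé form drops it.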
Observe that although this experiment requires the control of two inputs—
gold versus copper, only one X-axis is needed on the response surface plot
shown in Figure 1.2. That is because of the complete inverse correlation of
one component with the other—as one goes up the other goes down and
vice versa. In statistical terms, this can be expressed as r = −1, where r sym-
bolizes correlation, and the minus sign indicates the inverse relationship.
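The complete inverse correlation can be checked numerically on the blends of Table 1.1. The sketch below computes the Pearson correlation by hand with the standard library; `pearson_r` is a hypothetical helper written for this illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


gold = [0.00, 0.00, 0.00, 0.25, 0.33, 0.50, 0.67, 0.75, 1.00, 1.00]
copper = [1.0 - g for g in gold]  # fixed total of one

r = pearson_r(gold, copper)
print(f"r = {r:.6f}")
```

Any two-component mixture with a fixed total gives the same result, since one component is fully determined by the other.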
Let’s see how that model for m.p. connects to the graph. First, the
coefficient of 1043 for x1 estimates that temperature in degrees Celsius at
which pure gold melts. On the other hand, pure copper melts at a higher
temperature—estimated from this experiment to be 1072°C. Always keep in
mind that results will vary from any given experiment, which represents only
a sampling of the true population of all possible results from your process—
an unknown and unknowable value. The predicted values represented by
the solid line in Figure 1.2 are simply an estimate. This is accentuated by the
addition of 95% confidence bands (dashed) to the plot. What really counts is
that, as a practical matter, the predictions serve the purpose of the goldsmith
for using copper to formulate an optimal jewelry-solder.
Figure 1.2 Response surface for melt point of copper versus gold and their mixtures. (Vertical axis: melting point, deg C.)
The most intriguing feature of this mixture model is the large negative
coefficient of 536 on the x1x2 term. The analysis of variance (ANOVA)
shows the term to be significant at p < 0.0001, that is, less than a 1-in-10,000 chance
of it being this large if the true effect were null. (For a primer on ANOVA
and p-values, refer to DOE Simplified.) So together gold and copper melt at a
lower temperature than either one alone—isn’t that amazing!
Mathematically, due to the coding on a zero to one scale for each compo-
nent, the maximum impact of this second-order effect (x1x2) occurs at the
0.5–0.5 (“50/50”) blend. Some quick figuring will help you see that this must
be so. First, multiply 0 by 1 and 1 by 0 to get the products at either end of
the scale. If you do not compute zero in both cases, then perhaps you pos-
sess the street smarts to be a vendor like the one we quote in the sidebar
below. Now things get a lot harder because fractions are involved. Multiply
¼ by ¾ and ¾ by ¼ to work out the result for the two axial check blends
that this design specifies between the centroid and the vertices. If you got the first
calculation, we trust you know that either way this product comes to
three-sixteenths. This is a little less than the 1/4 you get from
multiplying 0.5 by 0.5 for the “50/50” blend at the centroid.
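The same figuring can be done with exact fractions. This short sketch tabulates the product x1 * x2 at the vertices, the axial check blends, and the centroid; the variable names are illustrative only.

```python
from fractions import Fraction

# Gold fraction x1 at the vertices, axial check blends, and centroid
gold_levels = [Fraction(0), Fraction(1, 4), Fraction(1, 2),
               Fraction(3, 4), Fraction(1)]

# Second-order term x1 * x2, with x2 = 1 - x1
products = {x1: x1 * (1 - x1) for x1 in gold_levels}

for x1, product in products.items():
    print(f"x1 = {x1}, x2 = {1 - x1}, x1*x2 = {product}")

# Blend at which the second-order term has its greatest impact
peak = max(products, key=products.get)
print(f"largest product at x1 = {peak}")
```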
If you look closely at the curve in Figure 1.2, you may notice that the mini-
mum actually occurs just a little to the right of the 0.5–0.5 point. This is due
to the gold having a lower m.p. than the copper, thus favoring a bit more of
this noble metal. A computerized search for the minimum using a hill-climbing
algorithm finds it at 0.55 weight fraction gold, and thus a 0.45 weight fraction of copper
is required to make the two components total to 1.
Now for a major disclaimer: A mixture experiment like this one on
gold and copper will only produce an approximation of the true response
surface—it may not be accurate, particularly for the fine points such as
the eutectic temperature. In the end, you must ask yourself as a formulator
whether the results can be useful for improving your recipe. In this case, the
next step would be to select a composition that meets the needs of solder
for goldsmithing fine jewelry. Determine the predicted m.p. from the graph
or more precisely via the mathematical model. Then run a confirmation test
to see how close you get. As a practical matter, this might be off by some
degrees and yet still be useful for improving your process.
fine gold, and one-twelfth alloy (copper). So accurate became the com-
position and weight of the coin issued from the mint that at the 1871 trial
of the “Pyx” the jury reported that every piece they separately examined,
representing many millions of pounds sterling, was found to be accurate
for both weight and fineness. The term “Pyx,” Greek in origin, refers to
the wooden chest in which newly-minted coins are placed for presenta-
tion to the expert jury of assayers assembled once a year at the Hall of
the Worshipful Company of Goldsmiths in the United Kingdom. This
ceremony dates to 1282.
Source: Encyclopaedia Britannica, 10th Edition (1902).
The hat (^), properly known as a circumflex, over the letter y symbolizes
that we want to predict this response value. The β (beta) symbols represent
coefficients, fitted via regression.
We detail the third order (cubic), which you may never need, in
Appendix 1A. There, for added measure, we also spell out the
fourth-order (quartic) Scheffé equation. By this stage, very complex behavior can
be modeled for all practical purposes. However, this process of model-
building could continue to infinite orders of the inputs x to approximate
any true surface in what mathematicians refer to as a Taylor polynomial
series.
The second-order equation not only may suffice for your needs to character-
ize the two primary components in your formulation, but it also could reveal a
surprising nonlinear blending effect. The possibilities are illustrated graphically
in Figure 1.3, which presumes that the higher the response, the better.
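[Figure 1.3: response (higher is better) versus the blend of two components, showing the linear blending line running from β2 to β1, a nonlinear synergism curve above it, and a nonlinear antagonism curve below it, each deflecting by 1/4 β12 at the midpoint.]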
Notice that we tilted the linear blending line upwards, in other words, β1
exceeds β2. So this response surface predicts better performance for pure
x1 than for pure x2. If together these two ingredients produce at the same
rates as when working alone, then at the 0.5–0.5 midpoint the response will
fall on the linear blending line. However, you hope that they really hit it
off and produce more than either one alone. Then the response will curve
upwards—producing the maximum deflection at the midpoint. This syner-
gistic (positive) nonlinear blending effect equals one-fourth (0.5 * 0.5) of the
second-order coefficient. Unfortunately, some components just do not work
very well together, and things get antagonistic. Then the response curves
downward and the β12 coefficient becomes negative.
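The one-fourth rule can be verified numerically. The sketch below assumes the second-order form y = β1x1 + β2x2 + β12x1x2 and, purely for illustration, plugs in the gold-copper coefficients quoted earlier in this chapter (1043, 1072, and -536); the function names are hypothetical. Note that in that example the deflection is negative, which was the desirable direction for a solder.

```python
def scheffe_quadratic(x1, b1, b2, b12):
    """Second-order Scheffe prediction for two components (x2 = 1 - x1)."""
    x2 = 1.0 - x1
    return b1 * x1 + b2 * x2 + b12 * x1 * x2


def linear_blend(x1, b1, b2):
    """Prediction on the linear blending line."""
    return b1 * x1 + b2 * (1.0 - x1)


# Coefficients quoted in this chapter for gold (x1) and copper (x2)
b1, b2, b12 = 1043.0, 1072.0, -536.0

midpoint = scheffe_quadratic(0.5, b1, b2, b12)
deflection = midpoint - linear_blend(0.5, b1, b2)

print(f"predicted response at the 0.5-0.5 blend: {midpoint}")
print(f"deflection off the linear blending line: {deflection}")
print(f"one-fourth of b12:                       {b12 / 4}")
```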
ISOBOLOGRAMS
In 1871 T.R. Fraser introduced a graphical tool called the “isobologram.” It
characterized departures from additivity between combinations of drugs.
Although it differs a bit in shape from our graph in Figure 1.2, the isobo-
logram is essentially equivalent—it plots the dose-response surface associ-
ated with the combination superimposed on a plot of the same contour
under the assumption of additivity, that is—linear blending. The observed
results are called the “isobol,” generally produced for the combinations
of individual drug dosages that produce a 50% response by the subjects.
If the isobol falls below the line of additivity, a synergism is claimed,
because less of the drugs will be needed. On the other hand, if the
isobol rises above the line, then the drugs are presumed to be antagonis-
tic. However, there are two major shortcomings associated with the use
of isobolograms. They do not account for data variability, and they are
restricted to only a few components.
Source: Meadows, S.L. et al., Environ. Health Perspect., 110, 979, 2002.
In this example, we made the response one where higher is better. Thus,
a positive β12 coefficient is desirable for this nonlinear blending effect.
However, in the first example—blending of copper into gold—the negative
nonlinear coefficient was what the jewelry maker hoped to see. Thus, a
synergistic deflection off the linear blending slope on the response surface
could be positive or negative, depending on the goal being maximization or
minimization.
Practice Problems
To practice using the statistical techniques you learned in Chapter 1, work
through the following problems. Statistical software used for such compu-
tations can be accessed freely via a website developed in support of this
book. There you will also find answers posted. See “About the Software”
for instructions.
Errors, like straws, upon the surface flow; He who would search for
pearls must dive below.
—John Dryden (1678)
When p is low, null must go. When p is high, null will fly.
–Author unknown
P-values are the calculated probability, used to evaluate statistical
significance in a hypothesis test.
Problem 1.1
To reinforce the basics of mixture modeling presented in this chapter, we
will start you off with some obvious questions that stem from this imaginary,
but commonplace, situation in our heartland of the United States.
The old truck on your hobby farm gets very poor gas mileage. Luckily
you can purchase fuel from a wholesaler who serves the agricultural
market: a low-grade gasoline that produces 10 miles to the gallon (mpg),
which is all right for driving the old truck all the way back into the city where
you usually dwell. It’s cheap: only 3 dollars a gallon. Another possibility is
to purchase the highly refined premium gasoline that increases the engine
efficiency to 14 mpg. However, it costs 4 dollars a gallon.
the typical USA car owner. Professor Larrick was inspired to promote “gpm”
(vs. mpg) after realizing in the end that he’d be better off trading in the fam-
ily minivan and only gaining 10 miles per gallon with a station wagon, rather
than swapping his second car, a small sedan, for a highly efficient hybrid.
Are you still not sure about the NPR puzzler? Imagine you and your
spouse work at separate locations that each require an annual commute of
precisely 10,000 miles, with the two of you driving separately (two
automobiles). Then your 16-mpg guzzler consumes 625 gallons (10,000/16).
By trading that for a 20-mpg car, you will need only 500 gallons the next
year—a savings of 125 gallons. On the other hand, your spouse drives
the far more efficient 34 mpg sedan—it requires only 294 gallons of
gas per year (10,000/34). Upgrading this to the 50-mpg hybrid saves just
94 gallons! We will let you do the math on this last bit.
It is surprising how something as simple as an inverse transformation
makes things so much clearer.
PS: For more details on transformations, refer to Chapter 4, “Dealing
with Nonnormality via Response Transformations” in the 3rd edition of
DOE Simplified.
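The arithmetic behind the puzzler is easy to reproduce. This sketch applies the inverse transformation (gallons consumed rather than miles per gallon) to the two trade-ins described above, assuming the same 10,000-mile annual commute; the function name is illustrative.

```python
def gallons_per_year(mpg, miles=10_000):
    """Fuel consumed over a year of driving at a given fuel economy."""
    return miles / mpg


# Trade the 16-mpg guzzler for a 20-mpg car
guzzler_saving = gallons_per_year(16) - gallons_per_year(20)
# Trade the 34-mpg sedan for a 50-mpg hybrid
sedan_saving = gallons_per_year(34) - gallons_per_year(50)

print(f"16 -> 20 mpg saves {guzzler_saving:.0f} gallons per year")
print(f"34 -> 50 mpg saves {sedan_saving:.0f} gallons per year")
```

The 4-mpg improvement on the guzzler saves more fuel than the 16-mpg improvement on the sedan, which is the point that gallons per mile makes obvious.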
Problem 1.2
This exercise stems from an experiment done by Mark with help from his
daughter Katie. To demonstrate an experiment on mixtures, they blew up a
plastic film canister—not just once, but over a dozen times. The explosive
power came from Alka Seltzer®—an amalgam of citric acid, sodium bicar-
bonate (baking soda) and aspirin (Figure 1.4).
You can see the experimental apparatus pictured: launching tube, a
container with water, the tablets, plastic film canister (Fuji’s works best), a
scale and stop-watch. Research via the Internet produced many write-ups
on making Alka Seltzer “rockets.” These generally recommend
using only a quarter of one tablet, and they advocate experimentation on
the amount of water, starting by filling the canister halfway. Mark quickly
discovered that the tablets break apart very easily, so he found it most con-
venient and least variable to simply put in a whole tablet every time (a con-
stant). It then took a steady hand to quickly snap on the top of the canister,
over which Katie placed the launching tube and Mark prepared to press his
stopwatch. (Subsequent research on this experiment indicated it would have
been far less nerve-wracking to stick the tablet on the lid with chewing gum,
put water in the container, put the lid on, and then tip it over—shooting the
canister into the air.) After some seconds the explosion occurred—propelling
the lid from the back porch to nearly the roof of his two-story home.
Before designing this experiment, Mark did some range finding to discover
that only 4 cubic centimeters (cm3) of water in the 34 cm3 canister would
produce a very satisfactory explosion. However, it would not do to fill the
container completely because the Alka Seltzer effervesced too quickly and
prevented placement of the lid. After some further fiddling, Mark found
that a reasonable maximum of water would be 20 cm3—more than half full.
Figure 1.5 The MIGHTY Seltzer Rocket pictured from a launch pad in Tucson.
Notice that the coefficient on the highest order, non-linear blending term
is distinguished by the Greek letter delta. Think of delta (“d” for difference)
as a symbol for the differences pictured in Figure 1A.1. It
depicts a very unusual response surface for two components with only first
and third order behavior—the second-order coefficient was zeroed out to
provide a clearer view of how the new term superimposes a wave around
the linear blending line. Also, to add another wrinkle (pun intended) to this
surface, the coefficient is negative.
See if you can bend your brain around this complex mixture model: It’s
challenging!
At the 50–50 blend point the components are equal, so the offset is
zeroed (x1 – x2 = 0). When x1 exceeds x2 to the right of the midpoint,
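[Figure 1A.1: response (higher is better) versus the blend of two components, showing the linear blending line running from β2 to β1 and a cubic wave around it, with deflections of 3/32 δ12 marked on either side of the midpoint.]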
For illustrative purposes only, we dramatized the impact of the cubic term
in Figure 1A.1. Usually, it creates a far subtler “shaping” of the surface, such
as you see illustrated in Figure 1A.2, which shows a cubic model (solid line)
fitting noticeably better than the simpler quadratic (dotted).
Think of polynomial terms as shape parameters, becoming subtler in
their effect as they increase by order. Linear (first order) terms define the
slope. As shown in the blending case of gold and copper we went through
earlier in this first chapter of Formulation Simplified, the second order
(quadratic) fits curvature. The cubic order that we’ve just introduced in the
Appendix accommodates asymmetry in the response surface.
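The asymmetry contributed by the cubic term can be tabulated directly. This sketch assumes the term takes the form δ12 x1 x2 (x1 - x2), as in the third sum of the Appendix equation, and evaluates the multiplier of δ12 at the quarter points and the midpoint using exact fractions; `cubic_offset` is a hypothetical name.

```python
from fractions import Fraction


def cubic_offset(x1):
    """Multiplier of delta12 in the cubic term x1 * x2 * (x1 - x2)."""
    x2 = 1 - x1
    return x1 * x2 * (x1 - x2)


for x1 in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)):
    print(f"x1 = {x1}: offset multiplier = {cubic_offset(x1)}")
```

The multiplier changes sign across the 50-50 blend, so the term pushes the surface in opposite directions on the two sides of the midpoint; a negative δ12 simply flips which side goes up.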
It’s good at this stage to simply consider mixture design as a special form
of RSM, which relies on empirical, not mechanistic, model building. In other
words, it’s best that you don’t try relating specific model parameters to the
underlying chemistry and physics of your formulation behavior. However,
In any case, to fit this cubic equation, one must design an experiment with
at least four unique blends, whereas three suffices (at the bare minimum)
to fit the quadratic. The more complex the behavior you want to model, the
more work you must do as a formulator. You get what you pay for.
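\[
\begin{aligned}
\hat{y} ={} & \sum_{i=1}^{q} \beta_i x_i
  + \sum_{i<j}\sum_{j} \beta_{ij} x_i x_j
  + \sum_{i<j}\sum_{j} \delta_{ij} x_i x_j (x_i - x_j)
  + \sum_{i<j}\sum_{j} \gamma_{ij} x_i x_j (x_i - x_j)^2 \\
& + \sum_{i<j}^{q-2}\sum_{j<k}^{q-1}\sum_{k}^{q} \beta_{iijk} x_i^2 x_j x_k
  + \sum_{i<j}^{q-2}\sum_{j<k}^{q-1}\sum_{k}^{q} \beta_{ijjk} x_i x_j^2 x_k
  + \sum_{i<j}^{q-2}\sum_{j<k}^{q-1}\sum_{k}^{q} \beta_{ijkk} x_i x_j x_k^2 \\
& + \sum_{i<j}^{q-3}\sum_{j<k}^{q-2}\sum_{k<l}^{q-1}\sum_{l}^{q} \beta_{ijkl} x_i x_j x_k x_l
\end{aligned}
\]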
Notice that squared terms now appear. Although statistical software (such
as the one we provide to you readers) will handle the design and analysis
of a mixture experiment geared to this fourth order, it is very unlikely that
this will provide any practical gain over the fit you get from cubic or qua-
dratic models. For response surface modeling, it’s good to keep in mind
the principle of parsimony, which advises that when confronted with many
equally accurate explanations of a scientific phenomenon it’s best to choose
the simplest one (Anderson and Whitcomb, 2005, Chapter 1, sidebar “How
Statisticians Keep Things Simple”).
OUT OF ORDER?
Back in the days when computer-aided mixture modeling was limited to
cubic, an industrial statistician cornered Mark at a conference and com-
plained that he needed quartic to fit a formulation over the entire experi-
mental region. Quadratic fit fine for most of the results but fell short where
the performance fell off very rapidly. Mark tried a trick that his doctor told
him after he injured his shoulder playing softball. “When does it hurt,” the
medico asked. “Only when I throw a softball,” said Mark. “Just don’t do
that,” the doctor advised. In similar fashion, Mark—being ever practical—
suggested that one could simply not look at the response surface where
it drops off and gets fit inaccurately because no one cares at that point.
This flippant advice is more helpful than you might think. If you can
apply your subject matter knowledge and do some pre-experimentation
to restrict the focus of the mixture design to a desired region, the degree
of Scheffé polynomial required to approximate the response surface will
likely be less, thus reducing the number of blends required by simplify-
ing the modeling needed for adequate prediction power. For example,
why model all the Rocky Mountains when you are really interested only
in exploring one of the peaks?
Some might say that this question is academic, but that’s OK because
I am an academic.
Furthermore, because the quadratic term is significant, all the linear terms
come back into the model to maintain hierarchy, which we explain in Chapter 5
of DOE Simplified in our sidebar (p. 103) on “Preserving Family Unity.” To put it
simply, parents must always be included with their children. Therefore, in this
case, where the quadratic term AB merits inclusion in the model, both A and B
must come back in for support, even though these linear terms, on their own,
are insignificant. It would be especially problematic in the case of a mixture
not to include the main ingredients in the model. That would not make any sense.
The computational details on the SMSS displayed in Table 1A.1 are
shown diagrammatically in Figure 1A.3.
[Figure 1A.3 diagram: Total (SS = 10,071,131, df = 10) splits into Mean (SS = 10,034,029, df = 1) and Residual (SS = 37,102, df = 9). At the cubic step, Cubic (SS = 2, df = 1) against Residual (SS = 463, df = 6) gives F = (2/1)/(463/6) = 0.026.]
Figure 1A.3 Calculations for sequential model sums of squares in Table 1A.1.
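As a rough check on the arithmetic in Figure 1A.3, the F-ratio for the cubic step can be recomputed from the sums of squares and degrees of freedom shown there. The sketch below is only an illustration; the function name `f_ratio` is hypothetical and does not come from any particular software package.

```python
def f_ratio(ss_term: float, df_term: int, ss_resid: float, df_resid: int) -> float:
    """Ratio of the term's mean square to the residual mean square."""
    ms_term = ss_term / df_term        # mean square for the added term
    ms_resid = ss_resid / df_resid     # mean square for what is left over
    return ms_term / ms_resid

# Values read from Figure 1A.3.
total_ss = 10_071_131
mean_ss = 10_034_029
residual_ss = 37_102

# The mean and residual sums of squares should add back to the total.
print(mean_ss + residual_ss == total_ss)

# Cubic step: SS = 2 on 1 df versus a residual SS = 463 on 6 df.
f_cubic = f_ratio(ss_term=2, df_term=1, ss_resid=463, df_resid=6)
print(round(f_cubic, 3))
```

An F-ratio this far below 1 says the cubic term explains less variation than the leftover noise, which is why it earns no place in the model.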
[Plot: melting point (deg C), vertical axis from 900 to 1100.]
Figure 1A.4 Response surface for the linear model of gold–copper melting points.
Table 1A.2 shows the lack of fit statistics for models ranging from linear to
quartic for the gold-copper blending case. Not surprisingly, the linear model gets
rejected, its lack-of-fit p-value falling far below the 0.05 benchmark for significance.
The quadratic model wins out. Going to the next order, cubic or beyond,
does no good; it only creates an overcomplicated model.
Figure 1A.5 diagrams the derivation by source for the LOF statistics.
[Diagram: Total (SS = 10,071,131, df = 10) splits into Mean (SS = 10,034,029, df = 1) and Residual (SS = 37,102, df = 9).]
Figure 1A.5 Calculations for sequential model sums of squares in Table 1A.2.
The quadratic model [A, B, and AB] comes out on top overall with the sum-
mary measures of the adjusted and predicted R-squared (never mind the raw
R-squared!).
Chapter 2
Triangulating Your Region of Formulation
If you don’t know where you are going, you will wind up
somewhere else.
—Yogi Berra
In this chapter, we will build up from the simplicity of dealing with only a
two-component system and then experiment on three or more. The biggest
step will be recognizing that if you lay this out in rectangular coordinates
then you really do not know where you are going and you will wind up
somewhere else (to paraphrase baseball guru Yogi). You need to get yourself
into the triangular space depicted in Figure 2.1.
The levels of three ingredients can be represented on this two-dimen-
sional graph paper, also known as “trilinear” for the way it’s ruled. It depicts
blends of up to three materials:
1. Vertices are the pure components. For example, pure X1 (or ingredient “A”)
is the point plotted at the top. For the sake of formulators, this paper is
marked off on a zero to one-hundred scale which can be easily trans-
lated to a more mathematically convenient range of zero to one.
2. Sides are binary blends. Being a yardstick on two components, the
sides are also referred to as “q-2 flats” (Myers et al., 2009, p. 570). The
midpoints of these q-2 flats are 50/50 blends of the components at each
end of the side. For example, the point between A and C represents
exactly half of each (and none of material B!).
[Ternary diagram: X1 (A) at the top vertex, X2 (B) at lower left, X3 (C) at lower right; each axis ruled from 10 to 90.]
Figure 2.1 Trilinear graph paper for mixtures with points to mark pure components,
binary blends, and overall centroid.
3. Mixtures of three components are in the center area. For example, the
point located precisely in the middle of the triangle, called the “centroid,”
represents a blend of one-third each of all three ingredients.
The neat thing about mapping mixtures to this triangular space is that
once you know two component fractions, the third is determined by the total.
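To make this concrete, here is a small sketch of how the third component falls out of the other two, and how a blend could be placed on a triangle drawn with X1 at the top. The function names and the Cartesian layout (X2 at the origin, X3 one unit to the right) are assumptions made for illustration. The example numbers use the 18-8 stainless steel composition, treated here as a pure three-component blend of chromium, nickel, and iron.

```python
import math

def third_component(x1: float, x2: float, total: float = 1.0) -> float:
    """Return the amount of the third component given the other two."""
    x3 = total - x1 - x2
    if min(x1, x2, x3) < 0:
        raise ValueError("component amounts must be non-negative")
    return x3

def ternary_to_xy(x1: float, x2: float, x3: float) -> tuple:
    """Place a blend on an equilateral triangle with unit sides.

    Assumed layout: X2 at (0, 0), X3 at (1, 0), X1 at the top.
    """
    total = x1 + x2 + x3
    a, c = x1 / total, x3 / total
    height = math.sqrt(3) / 2
    return (c + a / 2, a * height)

# 18% Cr and 8% Ni on the 0-to-100 scale leave the rest for iron.
iron = third_component(18, 8, total=100)
print(iron)

# The centroid (equal thirds) lands in the middle of the triangle.
print(ternary_to_xy(1, 1, 1))
```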
[Ternary diagram: X1 (Cr) at the top vertex, X2 (Fe) at lower left, X3 (Ni) at lower right; the 18-8 point is marked at 18% Cr and 8% Ni.]
Figure 2.2 Locating the 18-8 composition of stainless steel (for flatware).
figure with one more vertex than the number of dimensions. In this case,
only two dimensions are needed to graph the three components on to an
equilateral triangle. However, a four-component mixture experiment requires
another dimension in simplex geometry—a tetrahedron, which looks like
a pyramid, but with three sides rather than four. To show how easy it is to
create a simplex centroid, here is how you’d lay it out for four components:
1. Four points for the pure components (A, B, C, D) plotted at the corners
of the tetrahedron.
2. Six points at the edges for the 50/50 binary blends (AB, AC, AD, BC,
BD, CD).
3. Four three-component blend points at the centroids of the triangular
faces of the tetrahedron.
4. The one blend with equal parts of all ingredients at the overall centroid
of the tetrahedron.
This totals to 15 unique compositions from the four components. See these
depicted in Figure 2.3.
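The four-step layout above generalizes to any number of components: take every non-empty subset of the ingredients and blend its members in equal parts. The following sketch, with a hypothetical function name `simplex_centroid`, enumerates the points that way.

```python
from itertools import combinations

def simplex_centroid(q: int) -> list:
    """List the blends of a simplex centroid design for q components."""
    points = []
    for size in range(1, q + 1):                  # pure, binary, ternary, ...
        for subset in combinations(range(q), size):
            # Equal parts of the chosen components, none of the others.
            points.append(tuple(1 / size if i in subset else 0.0
                                for i in range(q)))
    return points

design = simplex_centroid(4)
for blend in design:
    print(blend)
print(len(design), "unique blends")
```

Because each non-empty subset contributes exactly one blend, the count is 2^q − 1, which grows quickly as components are added.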
[Figure 2.3 diagram: tetrahedron with vertices X1, X2, X3, and X4.]
Here was an opportunity to put the beer snob to the test via a blind, ran-
domized, statistically-planned experiment. You can guess the outcome:
He rated the Old “Swill”Waukee (his misnomer) number 1!
Here are the beer-cocktail ingredients (prices per 12 ounces, serving shown
in parentheses):
[Ternary diagram: A: Wheat at the top vertex, B: Lager at lower left, C: Black at lower right; axes ruled in grams from 0 to 60; replicated blends flagged with the number 2.]
Figure 2.4 Simplex centroid bolstered with replicates and check blends.
◾ Three added blends midway between the centroid and each of the
vertices (pure beers). In the jargon of mixture design, these are called
“axial check blends.” They otherwise fill empty spaces in the experi-
mental region. The addition of points to a textbook layout like the sim-
plex centroid is called “design augmentation.”
◾ Four point replicates (designated by “2”s)—the three binary blends
(midpoints of sides) plus the centroid. This provides four measures
(“degrees of freedom” in statistical lingo) of pure error. By establishing a
benchmark against which the deviations of actual points from the fitted
line can be assessed, pure error enables the testing of lack of fit—useful
for assessing model adequacy.
◾ Three replications of the entire design were sampled by three tasters.
Although the three subjects were chosen carefully based on their good
taste in beer, they differed in their generosity of rating; that is, tending
to score every brew higher or lower. These individual biases were cor-
rected via a statistical technique called “blocking.”
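The tally of blends per taster can be reconstructed from this description: the seven simplex-centroid points, three axial check blends, and four replicates. The sketch below lays them out in grams, assuming the 60-gram blend total used in this experiment and the component order A (wheat), B (lager), C (black). The variable and function names are hypothetical.

```python
from fractions import Fraction

TOTAL_GRAMS = 60  # assumed blend size by weight

def midpoint(p, q):
    """Blend halfway between two compositions."""
    return tuple((a + b) / 2 for a, b in zip(p, q))

third, half = Fraction(1, 3), Fraction(1, 2)

vertices = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]            # pure beers
binaries = [(half, half, 0), (half, 0, half), (0, half, half)]
centroid = (third, third, third)

# Axial check blends sit midway between the centroid and each vertex.
axial = [midpoint(centroid, v) for v in vertices]

# Replicates: the three binary blends plus the centroid.
replicates = binaries + [centroid]

runs = vertices + binaries + [centroid] + axial + replicates

def to_grams(blend):
    """Convert fractions of the blend to grams of each beer."""
    return tuple(float(x * TOTAL_GRAMS) for x in blend)

for blend in runs:
    print(to_grams(blend))
print(len(runs), "blends per taster")
```

In practice the run order would be shuffled for each taster before pouring.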
The 14 blends per person (blocked) were provided in random run order for
these three sensory responses:
To keep things simple for educational purposes, we will only look at the
overall liking (Y3) and the response of cost, which is determined com-
pletely by the blend’s composition and, of course, the current cost of each
ingredient.
Mark owns a very accurate kitchen scale (pictured in Figure 2.5)
that he uses to weigh out green coffee beans for roasting (another story!)
so it was convenient for him to set the total for each blend by weight
rather than volume—to 60 grams (roughly two fluid ounces). That kept
the total beer consumption per person to a reasonable level—about two
bottles worth. (Mark admits that during the experiments he managed
to drink about the same amount—in the name of science, naturally).
Figure 2.5 Precisely mixing a beer cocktail behind the screen (to keep tasters blind).
—Stephen Beaumont
World of Beer
See Table 2.1 for the test matrix, laid out by blend type and location,
and the overall liking ratings for the three tasters. The actual order of
presentation was randomized, thus decoupling the cocktail type from
possibly lurking variables such as degrading taste (related to admissions
above), dehydration from exposure to the summertime elements, and
so on.
Be careful about drawing too many conclusions and extrapolating these
very far. However, like all experiments, this one may produce some useful
findings. Let’s see what can be made of it.
Go ahead and look over the results—as Yogi Berra said “you
can see a lot just by looking.” For example, is it possible that some
combinations of beers might be perceived as being unexpectedly tasty?
Or, perhaps, the opposite may be true: Putting individual taste of certain
beers together may not be such a good idea. Keep in mind that this
experiment represents only a sampling of possible reactions by these
particular tasters, who may or may not represent a particular segment of
the beer-drinking market. Does it appear as if any of the three tasters
may have been tougher than the others (hint!)? If so, do not worry; so
long as this individual remains consistent with the others in his or her
relative rankings by blend, then this consistent bias can be easily (and
appropriately) blocked out mathematically, thus eliminating this easily-
anticipated source of variation (person-to-person).
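One simple way to picture what blocking does is to shift each taster's scores so that every taster ends up with the same average. The ratings below are invented purely for illustration (they are not the Table 2.1 data), and the centering shown here is only a sketch of the idea; statistical software would normally estimate the block effect as part of the model fit.

```python
from statistics import mean

# Hypothetical ratings for the same four blends by three tasters.
# Taster 3 is "tougher": every score sits lower, but the ranking matches.
ratings = {
    "taster_1": [6, 4, 7, 5],
    "taster_2": [7, 5, 8, 6],
    "taster_3": [4, 2, 5, 3],
}

def remove_block_effect(scores: dict) -> dict:
    """Subtract each taster's offset from the grand average."""
    grand = mean(v for row in scores.values() for v in row)
    adjusted = {}
    for taster, row in scores.items():
        offset = mean(row) - grand          # this taster's consistent bias
        adjusted[taster] = [v - offset for v in row]
    return adjusted

adjusted = remove_block_effect(ratings)
for taster, row in adjusted.items():
    print(taster, [round(v, 2) for v in row])
```

With a bias that is perfectly consistent, as in this made-up case, the adjusted rows come out identical, leaving only the blend-to-blend differences.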
[3D response surface: overall liking (axis 1 to 9) over the triangle with corners A (60), B (60), and C (60).]
Figure 2.6 Response surface shows peak taste with a synergistic blend of two beers.
negative coefficient (−4.65) on model term BC. You can see this downturn
in the response surface along the BC edge. It is less of a deviation from lin-
ear blending than is observed for the AC binary blend (BC < AC).
That leaves one coefficient to be interpreted—that of term AB. It turns
out that the p-value for the statistical test on this coefficient (2.01) exceeds
0.1, that is, there is more than a ten percent risk that it could truly be zero.
(In contrast, the coefficients for terms AC and BC were both significant at
the 0.01 level). This time around, we did not bother to exclude the insignifi-
cant term (AB) from the model. Removing it would make little difference in
the response surface—just a straight edge between the wheat beer (A) and
lager (B), rather than a slightly upwards curve. We will revisit the issue of
model reduction later. As the number of components increases and modeling
gets more complex, it will become cumbersome to retain insignificant terms.
One advantage you gain from this format is being able to plug in the
actual blend weights and toss out the predicted response for overall lik-
ing. This gets more intense as the order of terms goes up due to the
exponential impact on coefficients—they get really small or very large,
depending on whether your actual inputs are greater than one (as in
this case) or less than one (e.g., if you were serving beer to ants—they
would be happy with very tiny amounts).
Now, look back at the coded equation we provided in the main text and
consider how easy it is to interpret. For example, one can see immediately
what the predicted sensory result will be for each of the pure components
(A, B, and C)—these are the coefficients—no calculating required.
So, here’s the bottom line: For interpretation purposes, always use the
coded equation as your predictive model.
PS: In case you were wondering (?), neither the coded nor the actual
equation features a coefficient for the block effect. These models are
intended for predicting how an elite beer drinker will react to these three
types and their blends. True, some of these individuals will feel compelled
to be snobby and look down on all beers, but this cannot be anticipated
by the formulator, nor controlled once a product goes up for sale. Thus,
the block effect provides no value for predicting future behavior—only
to explain what happened during the experiment.
[Plot: overall liking along the BC edge; actual lager runs from 0 to 60 while actual black runs from 60 to 0.]
Figure 2.7 View of BC (Lager-Black) edge after misfit with linear model.
[3D response surface: cost (axis 0.80 to 1.30) over the triangle with corners A (1), B (1), and C (1).]
[Two contour plots on the A: Wheat, B: Lager, C: Black triangle (0 to 60): (a) overall liking, with contours at 3.5, 4, and 4.5; (b) cost, with contours at 0.90 and 0.95.]
Figure 2.9 (a) Contour plots for overall liking and (b) cost of blended beers.
[Overlay plot on the A: Wheat, B: Lager, C: Black triangle (0 to 60): limits at cost 1.10 and overall liking 5; the flagged blend shows a predicted liking of 5.5 at a cost of $1.08.]
Figure 2.10 Contour plots for overall liking and cost overlaid.
Figure 2.11 Factorial design on formulation of Black and Blue Moon beer cocktail.
so this guy must have come quite early to stock up so! Such behavior goes
beyond the pale of good taste in our opinion. We urge you to be more
moderate if you try to replicate our beer-blending experiments.
Practice Problems
To practice using the statistical techniques you learned in Chapter 2, work
through the following problems.
Problem 2.1
Normally, we do not recommend the simplex centroid design, especially if
done “by the book,” that is—without check blends. However, it can be use-
ful for three components that cannot conveniently be broken down too far
into fractions. A case in point is the “Teany Beany Experiment” we detailed
in DOE Simplified, 2nd Edition (Chapter 9). Per a randomized plan, a dozen
or so tasters were each required to chew one or more small jelly beans
flavored with apple, cinnamon, or lemon. Biting down on all three at once
presented a challenge, but everyone got the job done. Table 2.2 shows the
experiment design and the taste ratings.
Problem 2.2
This is a case where materials can be freely blended, thus the formulators
could augment the simplex centroid with check blends. The experimenters
measured the effects of three solvents known to dissolve a particular family
of complex organic chemicals (Del Vecchio, 1997). They had previously dis-
covered a new compound in this family. It needed to be dissolved for purifi-
cation purposes, so they needed to find the optimal blend of solvents.
Table 2.3 shows the experimental design and results. Remember that the
actual run order for experiments like this should always be randomized to
counteract any time-related effects due to the aging of material, and so on.
Also, we recommend that you always replicate at least four blends to get a
measure of pure error. In this case, it would have been helpful to do two
each of the pure materials and, also, replicate the centroid.
Notice that for the sake of formulating convenience, the interior points
(centroid and check blends) were rounded to the nearest tenth of a percent
so that they always added to one hundred. Which chemical, or a blend of
two or three, will work best as a solvent? (Hint: Read the Appendix before
finalizing your answer.) For extra credit on this problem, determine the
relative costs of each chemical (MEK is methyl ethyl ketone) and work the
material expense into your choice.
[3D response surface: elasticity (axis 2300 to 3100, with 3034 marked) over the triangle with corners A (1), B (1), and C (1).]
The coefficient on ABC seems surprisingly large unless you remember that
the components are on a scale of 0 to 1. For example, recall from the last
chapter that the maximum deflection from linear blending occurs at the
1/2–1/2 (“50/50”) point, thus you must multiply the two-component terms,
such as BC, by one-fourth (1/4) to assess the synergism (or antagonism).
However, for the three-component term, the maximum non-linear effect
occurs at the 1/3-1/3-1/3 point (centroid) with a magnitude of 1/27 the
coefficient of that term. Thus, in this case the maximum effect for BC of
approximately 400 (1/4 of 1597) almost doubles the greatest impact of ABC
(1/27 of 6141).
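The scaling behind those two numbers is easy to verify. In the sketch below, the coefficients 1597 (BC) and 6141 (ABC) come from the text; the function names are hypothetical and exist only for this illustration.

```python
def max_binary_effect(coefficient: float) -> float:
    """Deflection from linear blending at the 50/50 point: coef * (1/2)(1/2)."""
    return coefficient * 0.5 * 0.5

def max_ternary_effect(coefficient: float) -> float:
    """Deflection at the centroid: coef * (1/3)(1/3)(1/3)."""
    return coefficient * (1 / 3) ** 3

bc_effect = max_binary_effect(1597)
abc_effect = max_ternary_effect(6141)

print(round(bc_effect, 1))              # roughly 400
print(round(abc_effect, 1))             # roughly 227
print(round(bc_effect / abc_effect, 2)) # the BC effect relative to ABC
```

So a three-component coefficient nearly four times larger than a two-component one still delivers the smaller real effect.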
How do higher order terms compare to the linear ones in this case (2351,
2446, and 2653)? Here again, you must be careful not to jump to conclusions
without first considering the meaning of linear coefficients in Scheffé
polynomial mixture models—the difference is what counts, not the absolute
magnitude. The range of linear coefficients is only a bit over 300 (2653 for C
minus 2351 for A), so the tilt in the plane of response (upwards to component C)
is actually exceeded by the synergism of B and C.
Have you had enough of trying to interpret coefficients in these higher
order mixture models? We hope so because it’s not worth belaboring—simply
look at the response surfaces to get a feel for things. Then with the aid of
computer tools, use the model to numerically pinpoint the most desirable
recipe for your formulation.
This model provides additional terms that capture more complex non-
linear blending than the special cubic. However, be forewarned that the
number of unique blends in the mixture design must always exceed the
count of terms in the model you want to fit. As spelled out in Table 2A.1,
the special quartic model requires considerably more blends at four or
more components than the special cubic, which may make the experi-
ment unaffordable.
Figure 3.1 (a) Laying out the fold lines for building a tetrahedron and (b) the result-
ing 3D paper model.
Label the vertices of the larger triangle “1”—these represent the first
ingredient. Identify the corners of the smaller triangle as 2, 3, and 4 to pinpoint
three more components (thus allowing for four, total). Now cut out the large
triangle and fold the 1’s along lines 4–2, 2–3, and 3–4 to a point, as shown in
Figure 3.1b. There—you’ve made a tetrahedron! Keep this handy to help you
visualize our illustrations of four-component mixture design.
The simplex lattice design comprises m + 1 equally spaced values from 0 to 1
for each component, thus

$$x_i = 0,\ 1/m,\ 2/m,\ \ldots,\ 1$$

With a degree of m = 1, this becomes

$$x_i = 0,\ 1/1$$

$$x_i = 0,\ 1$$

There are only two points, 0 and 1! Going beyond 1 is not allowed, so the
design must stop there. It’s designated as “(2, 1)” based on the number of
components and degree, respectively. We do not recommend this (2, 1)
simplex lattice as is, there being too few points for any appreciable power.
However, it serves as a launching pad to designs for three components that
are not that bad, for example, the two pictured in Figure 3.2a and b for
second- and third-degree modeling.
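A lattice of any size can be generated by brute force: let every component take each of the m + 1 levels and keep only the combinations that add to one. The sketch below does this with exact fractions; the function name `simplex_lattice` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def simplex_lattice(q: int, m: int) -> list:
    """All blends of q components at levels 0, 1/m, ..., 1 that sum to 1."""
    levels = range(m + 1)                      # numerators 0..m
    return [tuple(Fraction(k, m) for k in combo)
            for combo in product(levels, repeat=q)
            if sum(combo) == m]                # fractions then sum to 1

for q, m in [(2, 1), (3, 2), (3, 3)]:
    points = simplex_lattice(q, m)
    print((q, m), len(points), "points")

# The (3, 2) lattice: vertices plus midpoints of the edges.
for blend in simplex_lattice(3, 2):
    print([str(x) for x in blend])
```

Note that the (3, 2) lattice never produces a point with all three components present, which is the gap that the centroid and check blends are meant to fill.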
[Two ternary diagrams, (a) and (b), each with vertices X1 = 1 (top), X3 = 1, and X2 = 1.]
Figure 3.2 Three-component simplex lattices of 2nd (a) and 3rd (b) degree.
If you are a formulator, it will seem odd to experiment with several com-
ponents but never a complete blend, which is exactly what happens with
the (3, 2) design depicted in Figure 3.2a—its interior remains devoid of points.
However, keep in mind that you need not adhere strictly to this template—
strongly consider adding the centroid and, if not impractical, additional
check blends inside the simplex formulation space. This will be demon-
strated by example later in this chapter.
Two more simplex lattice designs are shown in Figure 3.3a and b. Notice
by their geometry—tetrahedral—that these encompass four components.
[Two tetrahedral diagrams, (a) and (b), each with vertices X1 = 1, X2 = 1, X3 = 1, and X4 = 1.]
Figure 3.3 Four-component simplex lattices of 2nd (a) and 3rd (b) degree.
Also evident at a glance is the increase in the degree of the lattice, from
two on the left to three on the right.
An easy way to infer the degree is by the number of design points along
the edges; when broken in half, the degree is two—whereas a fragmentation
by thirds indicates a third-degree lattice.
You may recall from math and/or stats class that the exclamation marks
denote a factorial notation. This will come back to you more quickly by
following this example calculation on the number of design points for a
four-component simplex lattice designed to the third degree—a (q, m) of
(4,3), which computes as:
$$\frac{(q+m-1)!}{m!\,(q-1)!} = \frac{(4+3-1)!}{3!\,(4-1)!} = \frac{6!}{3!\,3!} = \frac{6 \times 5 \times 4 \times 3 \times 2 \times 1}{(3 \times 2 \times 1)(3 \times 2 \times 1)} = 20$$
Table 3.1 Number of Points in Textbook (Raw) Simplex Designs versus What’s
Required by Model
Components (q)   Simplex Centroid   Simplex Lattice (m = 2)   Quadratic Model Terms
3                7                  6                         6
4                15                 10                        10
5                31                 15                        15
6                63                 21                        21
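The three columns of Table 3.1 follow from simple counting rules, which the sketch below reproduces. It assumes the standard counts: 2^q − 1 for the simplex centroid, the factorial formula for the lattice, and q linear terms plus one term per pair of components for the quadratic Scheffé model. The function names are hypothetical.

```python
from math import comb

def centroid_points(q: int) -> int:
    """One blend per non-empty subset of the q components."""
    return 2 ** q - 1

def lattice_points(q: int, m: int) -> int:
    """(q + m - 1)! / (m! (q - 1)!), written as a binomial coefficient."""
    return comb(q + m - 1, m)

def quadratic_terms(q: int) -> int:
    """q linear terms plus one two-component term per pair."""
    return q + comb(q, 2)

print("q  centroid  lattice(m=2)  quadratic terms")
for q in range(3, 7):
    print(q, centroid_points(q), lattice_points(q, 2), quadratic_terms(q))

# The (4, 3) lattice worked out in the text.
print(lattice_points(4, 3))
```

The second-degree lattice supplies exactly as many points as the quadratic model has terms, which leaves nothing over for checking the fit.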
We detailed axial check blends via the beer blending case study presented
in Chapter 2, so you’re aware of how these points fill the gaps between the
centroid and each of the q vertices. This design also featured replicates and
how they provide measures of pure error, which, in conjunction with check
blends, facilitate testing for lack of fit. Now we are establishing this augmen-
tation as the standard procedure for simplex designs—centroid or lattice.
Second-order designs augmented per the guidelines we’ve provided are suit-
able for producing a “simplex response-surface” (Smith, p. 49). This leads to
an important insight: Mixture design for an optimal formulation is a close
cousin to response surface methods (RSM) for process optimization.
To augment this lattice, the formulators add 4 axial check blends plus the overall
centroid. They then specify that the four vertices (chosen for their high leverage)
and the centroid be replicated (for an added measure of pure error) at random
intervals. (Always randomize!) Assume that the formulators use a 1-liter blender
to mix the oils, 30 blends in total after the augmentation. This ASL design and
the results for overall sensory ratings are shown in Table 3.2. (Note that, for the
sake of simplicity, the one-third and two-thirds levels are rounded to 0.333 and
0.667, respectively, so the blends add, to within rounding, to the total of 1.)

Table 3.2 ASL Design for Blending Four Olive Oils and Their Sensory Results
 #   Point Type     A: Buza   B: Bianchera   C: Leccino   D: Karbonaca   Sensory Rating
 1   Vertex         1         0              0            0              6.98
 2   “              1         0              0            0              6.84
 3   Vertex         0         1              0            0              6.49
 4   “              0         1              0            0              6.45
 5   Vertex         0         0              1            0              7.25
 6   “              0         0              1            0              7.30
 7   Vertex         0         0              0            1              5.88
 8   “              0         0              0            1              5.95
 9   Third edge     0.667     0.333          0            0              7.38
10   Third edge     0.333     0.667          0            0              7.12
11   Third edge     0.667     0              0.333        0              6.87
12   Third edge     0         0.667          0.333        0              6.84
13   Third edge     0.333     0              0.667        0              6.95
14   Third edge     0         0.333          0.667        0              7.17
15   Third edge     0.667     0              0            0.333          7.36
16   Third edge     0         0.667          0            0.333          7.14
17   Third edge     0         0              0.667        0.333          7.50
18   Third edge     0.333     0              0            0.667          7.16
19   Third edge     0         0.333          0            0.667          6.95
20   Third edge     0         0              0.333        0.667          7.00
21   Triple blend   0.333     0.333          0.333        0              7.56
22   Triple blend   0.333     0.333          0            0.333          7.53
23   Triple blend   0.333     0              0.333        0.333          7.29
24   Triple blend   0         0.333          0.333        0.333          7.28
25   Axial CB       0.625     0.125          0.125        0.125          7.41
26   Axial CB       0.125     0.625          0.125        0.125          7.37
27   Axial CB       0.125     0.125          0.625        0.125          7.50
28   Axial CB       0.125     0.125          0.125        0.625          7.19
29   Centroid       0.25      0.25           0.25         0.25           7.58
30   “              0.25      0.25           0.25         0.25           7.55
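The run count and the axial check blend levels in Table 3.2 can be reproduced with a few lines. This sketch assumes the augmentation recipe described above for four components and a third-degree lattice; the function name `axial_check_blends` is hypothetical.

```python
from math import comb

def axial_check_blends(q: int) -> list:
    """Blends midway between the overall centroid and each vertex."""
    centroid = [1 / q] * q
    blends = []
    for i in range(q):
        vertex = [1.0 if j == i else 0.0 for j in range(q)]
        blends.append(tuple((c + v) / 2 for c, v in zip(centroid, vertex)))
    return blends

q, m = 4, 3
lattice_runs = comb(q + m - 1, m)   # textbook (4, 3) simplex lattice
axial_runs = q                      # one check blend per vertex
centroid_runs = 1                   # overall centroid
replicate_runs = q + 1              # four vertices plus the centroid, repeated

total_runs = lattice_runs + axial_runs + centroid_runs + replicate_runs
print(total_runs, "blends")

for blend in axial_check_blends(q):
    print(blend)
```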
The statistical analysis of this data is detailed via Problem 3.2. The chosen
model is a reduced special cubic:
The presence of the ABC nonlinear blending term supports the choice of a
third-degree lattice design. The other three special-cubic terms (ABD, ACD, and
BCD) were insignificant (p > 0.1) so we chose to remove them from the final
model. Rather than laboriously dissecting the model by its remaining terms,
let’s focus on the response surface graphics: The pictures will tell the story.
Unfortunately, now that we’ve gone to the third dimension the imaging
gets trickier—for example, only three out of the four components can be
depicted on a contour plot. This complication provides the perfect oppor-
tunity to present the “trace” plot—a way to view the relative effects of any
number of components. A trace plot for the olive-oil mixture experiment is
shown in Figure 3.5.
The traces are drawn from the overall centroid—all components at equal
volume within the 1-liter vessel. This is called the “reference blend.” Each
component alone is then mathematically varied while holding all others in
constant proportion. This reveals, for example, that the predicted sensory
evaluation falls off dramatically as the Karbonaca oil (D) is increased relative
to the three alternatives.
To give you a better feel of how the trace plot is produced, let’s consider
the simpler case of only three components. Figure 3.6 shows the paths of
the three traces.
[Figure 3.5 trace plot: sensory rating (axis 5.80 to 7.70) traced for components A, B, C, and D; reference blend A: Buza = 0.250, B: Bianchera = 0.250, C: Leccino = 0.250, D: Karbonaca = 0.250.]
[Figure 3.6 ternary diagram: vertices X1 = 1, X2 = 1, and X3 = 1; opposite sides X1 = 0, X2 = 0, and X3 = 0; axes ruled from 10 to 90.]
The trace for x1 starts at the overall centroid where it amounts to
one-third of the three-component blend. The other two components are
also at one-third. Thus their ratio is one-to-one. Tracing x1 from the centroid
down to the base of the ternary diagram reduces the amount of this individual
component to zero. At this point the amounts of the other two components
become one half each, so their ratio remains one-to-one. In fact, if
you pick any point along any of the three traces, the other two components
remain at constant proportions! Try working this out for yourself; it will be
good practice for reading off coordinates on the ternary diagram.
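The rule for walking along a trace can be written down directly: set one component to its new level and rescale the others so they keep the ratios they had in the reference blend. The sketch below is one way to do that; the function name `trace_blend` is hypothetical, and it assumes the reference blend is not a pure component.

```python
def trace_blend(reference: tuple, index: int, new_level: float) -> tuple:
    """Move one component to new_level, keeping the others in constant ratio."""
    rest = 1.0 - reference[index]
    if rest <= 0:
        raise ValueError("reference blend must contain the other components")
    scale = (1.0 - new_level) / rest   # shrink or stretch the remainder
    return tuple(new_level if j == index else share * scale
                 for j, share in enumerate(reference))

# Three components: trace x1 from the centroid down to the base.
centroid3 = (1 / 3, 1 / 3, 1 / 3)
print(trace_blend(centroid3, 0, 0.0))

# Four olive oils: vary Karbonaca (D) away from the reference blend.
reference = (0.25, 0.25, 0.25, 0.25)
for level in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(level, trace_blend(reference, 3, level))
```

Feeding each of these blends into the fitted model and plotting the predictions against the level of the varied component gives one line of the trace plot.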
Now that we’ve been provided with clues on the non-linear blending behavior
of the four olive oils, it seems sensible to study the response surfaces of
the three good components “sliced” at varying levels of the inferior fourth
component. For example, Figure 3.7a and b show the sensory results at the
overall centroid (all components, including D, at 0.25 concentration) versus no
Karbonaca oil (D = 0). If anything, it’s the Bianchera (B) oil that creates the
greatest effect on taste—very noticeably on these slices with D held fixed at
two specific levels (0 and 0.25).
[Two 3D response surfaces, (a) and (b): sensory rating (axis 6 to 8) over the A-B-C triangle.]
Figure 3.7 (a, b) Sensory results at the overall centroid versus no Karbonaca oil (D = 0).
These response surface graphs are very illuminating! It appears as if the
complete four-part blend at the centroid, shown on the left (Figure 3.7a),
will be most robust to variations in olive oil concentrations and deliver a
superior sensory rating for the most part. A more comprehensive computer-aided
search of the entire tetrahedral formulation space produced the blend
detailed below and depicted in Figure 3.8:
1. 0.333 Buza
2. 0.299 Bianchera
3. 0.189 Leccino
4. 0.179 Karbonaca
[Figure 3.8 graphic: desirability = 0.454.]
[Contour plot: sensory rating contours from 7.0 to 7.6 with A: Buza at the top vertex; flagged blend X1 = 0.333, X2 = 0.299, X3 = 0.189; prediction 7.63, 95% PI low 7.51, 95% PI high 7.76.]
Figure 3.9 Most desirable blend flagged on contour plot (D sliced at 0.179).
MEDITERRANEAN DIET
The key components of the Mediterranean diet include:
Many benefits have been attributed to this diet, including a reduced rate
of coronary events and weight loss. See the American Heart Association’s
internet post on the “Lyon Diet Heart Study” at
http://circ.ahajournals.org/content/103/13/1823 for details on a randomized,
controlled trial with free-living subjects (Kris-Etherton et al., 2001).
Practice Problems
To practice using the statistical techniques you learned in Chapter 3, work
through the following problems.
Problem 3.1
Experience how easy it will be to design a mixture experiment and analyze
the results by using a computer tool specialized for this purpose. It can
be freely accessed via the web site developed in support of this book;
see About the Software for the path. When you arrive at the internet page,
follow the link to the accompanying tutorials. Then download and print
the Mixture Design Tutorial (Part 1—The Basics). It details a case study on
a detergent product that introduces some very practical aspects on how to
experiment on only a portion of an entire formulation. If you come across
a few new concepts, follow the advice in the Introduction of the book:
just keep moving, and worry later about the explanations, which will be
forthcoming. The primary purpose of this exercise is to get a feel for how
dedicated DOE software can facilitate your design and analysis of mixture
experiments.
Problem 3.2
Via the web site developed in support of this book, go to the answer posted
for this problem—a follow up on the olive oil mixture experiment. It details
all the important statistical models and validation of the chosen one by
analysis of variance (ANOVA) and residual diagnostics. Your assignment will
be to reproduce these results using the software made available to you for
this purpose.
Chapter 4
The more constraints one imposes, the more one frees one’s
self. And the arbitrariness of the constraint serves only to obtain
precision of execution.
—Igor Stravinsky
that the bartender runs low on the alcoholic ingredients and, furthermore,
you appear to be someone who doesn’t drink heavily. That leads to him
establishing these minimums:
The bartender starts by pouring at least 6 ounces of orange juice into the
highball glass. This constraint is labeled “1” (segment 1–1) in the ternary
diagram (Figure 4.1). The “mixologist” follows with a jigger of vodka—
constraint 2 (segment 2–2)—and garnishes the drink with a splash of
Galliano—constraint 3 (segment 3–3). This shrinks the original simplex
region to the area labeled “A-B-C.” The point pins down the IBA’s ideal
recipe from which the bartender allows himself some variation—providing
a triangular zone of acceptable Harvey Wallbanger’s drink mixes.
Notice that the addition of lower constraints does not affect the shape
of the mixture space: It remains a simplex region. However, the shrinkage
makes it very inconvenient to work graphically within this space. We need
a way to blow it back up—like a photo enlarger. Fortunately, experts in
mixture design developed mathematical tools to accomplish this.
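[Figure 4.1 ternary diagram: OJ (A) at the top vertex; constraint segments 1–1, 2–2, and 3–3 enclose the smaller triangle A-B-C; axes ruled from 1 to 9.]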
If you look over the years, the styles have changed—the clothes,
the hair, the production, the approach to the songs. The icing to the
cake has changed flavors. But if you really looked at the cake itself,
it’s really the same.
—John Oates
$$x'_i = \frac{X_i}{\sum X_i} = \frac{X_i}{9}$$

based on 9% as the formulator’s total. Table 4.1 shows the ranges, lower
(L and l′) to upper (U and u′), for the three detergent ingredients in actual
percents (by weight) and their real values scaled 0 to 1 (dimensionless).

              L_i   U_i   l′_i    u′_i
1. Water      3%    5%    0.333   0.556
2. Alcohol    2%    4%    0.222   0.444
3. Urea       2%    4%    0.222   0.444

[Figure 4.2 ternary diagram in real coordinates (X′1, X′2, X′3; axes ruled from .1 to .9) with boundary lines for Alcohol (B) at l′2 = 0.222 and u′2 = 0.444 and for Urea (C) at l′3 = 0.222 and u′3 = 0.444.]

Figure 4.2 maps out the real boundaries of a ternary mixture diagram.
Notice how the space, a simplex triangle, is defined by the lower (l′)
limits of the three components. However, it will be hard to do design
work within such a small region and, ultimately, visualize the modeled
response. We must perform one more mathematical transformation, called
“pseudo” coding, to expand the restricted reals into a maximum range
from 0 to 1:
$$\text{L\_Pseudo} = \frac{\text{Real} - l'_i}{1 - \sum l'_i}$$

$$x_i = \frac{x'_i - l'_i}{1 - \sum l'_i}$$

$$x_1 = \frac{x'_1 - 0.333}{1 - 0.778}$$

$$x_2 = \frac{x'_2 - 0.222}{1 - 0.778}$$

$$x_3 = \frac{x'_3 - 0.222}{1 - 0.778}$$
The prefix “L” indicates that this coding has been done on the basis of
the lower boundaries (for more explanation, see the appendix on upper-
bounded pseudos). We’ve removed the prime (′) mark to differentiate the
pseudos from the reals. Table 4.2 provides the amazing results of this trans-
formation: The components now range from 0 to 1!
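Both transformations, actual percent to real and real to L-pseudo, can be checked against the detergent ranges with a short script. The sketch below uses the Table 4.1 limits and the 9% total from the text; the function and variable names are hypothetical, and exact fractions are used so the end points come out as precisely 0 and 1.

```python
from fractions import Fraction

TOTAL_PERCENT = 9  # the formulator's total for the three ingredients
lower_percent = {"water": 3, "alcohol": 2, "urea": 2}
upper_percent = {"water": 5, "alcohol": 4, "urea": 4}

def to_real(percent: int) -> Fraction:
    """Scale an actual percentage to the 0-to-1 'real' scale."""
    return Fraction(percent, TOTAL_PERCENT)

lower_real = {name: to_real(p) for name, p in lower_percent.items()}
upper_real = {name: to_real(p) for name, p in upper_percent.items()}

def l_pseudo(name: str, real: Fraction) -> Fraction:
    """L-pseudo coding: (real - lower) / (1 - sum of all lower limits)."""
    return (real - lower_real[name]) / (1 - sum(lower_real.values()))

for name in lower_real:
    low = l_pseudo(name, lower_real[name])
    high = l_pseudo(name, upper_real[name])
    print(name,
          round(float(lower_real[name]), 3), round(float(upper_real[name]), 3),
          float(low), float(high))
```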
The payoff to pseudo coding can be seen in Figure 4.3—the entire
ternary graph is now utilized to map out the design points for the detergent
experiment.
You can see in Figure 4.3 that we’ve enlarged the space via this coding.
It now becomes far easier to work with for design purposes. Notice that the
points are now shown—two each (designated by the number “2”) at the
three corners of the triangular experimental space, for example. (The overall
centroid is also replicated.) Refer to Problem 3.1 and the associated online
tutorial for more details on this design.
              l′_i    u′_i    l_i   u_i
1. Water      0.333   0.556   0     1
2. Alcohol    0.222   0.444   0     1
3. Urea       0.222   0.444   0     1

[Figure 4.3 ternary diagram: pseudo-coded axes X1, X2, X3 (ruled 10 to 90) enlarged from the small sub-triangle of the real axes X′1, X′2, X′3 (ruled .1 to .9); replicated points flagged with the number 2.]
silently hums the famous Elvis Presley song about his blue suede shoes
and knows that this is all wrong as well. That puts him back on track for
“pseudo”—the correct spelling. Such are the lengths one must go through
to contend with peculiar scientific jargon.
Drink my liquor from an old fruit jar. … You can do anything but lay
off of my blue suede shoes.
—Excerpt from lyrics by Carl Perkins (music also
originated by him—recorded in 1956)
Practice Problem
To practice using the statistical techniques you learned in Chapter 4, work
through the following problem.
Problem 4.1
Two Portuguese food scientists, Margarida Vieira and Cristina Silva (2004),
applied mixture design to optimize the taste of an exotic nectar based on
the Cupuacu (pronounced “koo-poo-a-soo”) fruit from the Amazon jungle.
They established the following ranges (coded on a 0 to 1 scale) for their
mixture components (in percentages by weight):
[Figure: ternary diagram with axes X1, X2, and X3.]
1. X1 ≤ 0.4
2. X2 ≤ 0.6
3. X3 ≤ 0.3
Figure 4A.1 Inverted simplex (real) created by entering only upper component
constraints.
[Figure: ternary diagram with axes A: X1, B: X2, and C: X3, each scaled from 0.000 to 1.000.]
$$\text{U\_Pseudo} = \frac{u'_i - \text{Real}}{\sum u'_i - 1}$$
If you are into math, compare this to the equation for L-pseudo coding we
provided in the core of the chapter. Software designed for mixture experi-
mentation, such as the program provided with this book, can detect when
upper (“U”) coding will be advantageous. Then, if the user elects to go this
route, it can do the necessary calculations.
Let’s see how U-pseudo coding works on a real-life example. A coatings
chemist designed a mixture experiment on a chemical paint remover for
an aerospace application (Hensley, 2008). The chemical supplier’s material
safety data sheet (MSDS) specified the following constraints for the three
key active ingredients for their recommended formulation, which totaled
12% by weight:
1. 0%–5%
2. 0%–5%
3. 2%–7%
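As a rough check on the U-pseudo equation, the sketch below applies it to the three paint-remover ranges just listed. The function names are ours, and we assume the reals are formed by dividing each actual percentage by the 12% total.

```python
from fractions import Fraction as F

def u_pseudo(reals, upper):
    """Code real proportions as U-pseudo components."""
    excess = sum(upper) - 1  # how far the upper limits overshoot a full blend
    return [(u - x) / excess for x, u in zip(reals, upper)]

TOTAL = 12                    # the three actives total 12% by weight
upper_actual = [5, 5, 7]      # upper limits in actual percent
upper = [F(u, TOTAL) for u in upper_actual]

def code_blend(actual):
    """Convert actual percentages to reals, then to U-pseudo values."""
    reals = [F(a, TOTAL) for a in actual]
    return u_pseudo(reals, upper)

# Ingredient 1 at its LOW limit (0%) codes to a U-pseudo value of 1.
print(code_blend([0, 5, 7]))
# Ingredient 3 at its low limit (2%) likewise codes to 1.
print(code_blend([5, 5, 2]))
```

Each ingredient at its lower limit lands on a corner of the coded triangle, which is why the U-pseudo simplex appears inverted relative to the reals.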
[Figures: two ternary diagrams with axes A: A, B: B, and C: C; in the second, “2” marks replicated design points.]
(see sidebar for details). However, the payoff will be far more visible via the
larger region of experimentation displayed in the response graphics.
Beware when working in this upside-down world of U-pseudo coding:
Low becomes high, and high becomes low on the ternary diagrams. For
example, look carefully at the triangle in Figure 4A.5, which displays the
[Figure: contour plot of Removal on ternary axes A: A, B: B, and C: C in actual units, with a flagged blend at X1 = 5.00, X2 = 1.93, X3 = 5.07; “2” marks replicated blends.]
The computer can’t tell you the emotional story. It can give you the
exact mathematical design, but what’s missing is the eyebrows.
—Frank Zappa
[Figure: ternary diagram with axes A: A, B: B, and C: C scaled from 0 to 100; “2” marks replicated design points.]
They required that the sum of these three components come to a constant
of 30% of the shampoo. In other words, 70% of the formulation (water, thick-
eners, preservatives, etc.) remained fixed. The chemists focused only on one
response, which is the foam height—measured in millimeters (mm).
When the ranges of the factors are not equal such as in this case with low to
high differences of 10%, 6%, and 2% for A, B, and C, respectively, the design
space will not be a simplex. It turns out that the constraints for the shampoo
experiment create a four-cornered shape defined by the vertices laid out in
Table 5.1, which we identified in actual component levels via the algorithm
detailed in Appendix 5A. This table also provides the compositions as a
fraction of the total (30%), then referred to as “reals.”
When scaled to reals, the vertices can be plotted on ternary paper so the
experimental region can be laid out and additional points around and inside
it plotted. Figure 5.2 provides the picture of the plot.
This experiment design looks a bit cramped in the rectangular
“penthouse” of the triangular structure. It needs to be converted to pseudo-
component coding, which we introduced in Chapter 4. See Figure 5.3 for
the resulting expansion of space on the ternary plot and blends filled in
along the edges (the four midpoints) and the interior (overall centroid and
four axial checkpoints). For added precision in fitting, all the blends were
replicated in the final randomized plan.
The result of this experiment, for what it’s worth (a lot to the formulators,
but not for our discussion on optimal design), is displayed in Figure 5.4,
with the axis labeling and gridlines reverted conveniently to actual scale.
The optimum foam height in mm is flagged and identified with the
compositional coordinates.
[Figures: ternary plots of the shampoo design. The first, in reals (x1, x2, x3), numbers the four vertices 1 to 4; the second, in pseudo-component coding, has axes A: TEA-LS (wt%), B: Cocamide (wt%), and C: Lauramide (wt%), with “2” marking replicated blends.]
Figure 5.4 Contour plot of foam height (mm) from shampoo experiment. [Flagged prediction: 177.1.]
This case study on the shampoo experiment provides a good start on
how to contend with constraints that do not form a simplex. Keep in mind
that, this being done before the availability of software for mixture DOE,
the cosmetic chemists had to go through all the work to lay out the design
graphically, decode the point coordinates back to actual component levels,
and randomize the order before embarking on the experiment. Nowadays
specialized programs, such as the one that accompanies this book, provide
optimal custom designs that do all of this for formulators. Figure 5.5 presents
the shampoo design the way it would be done with these state-of-the-art
computerized tools. In this case, we kept it simple by doing the building
via an algorithm called “point exchange.” We will provide details on its
construction, and a more-sophisticated option called “coordinate exchange,”
later—after covering optimal design.
This may look a bit odd, compared to the original handmade
experiment design. Keep in mind, though, that computers care nothing for
what pleases the eye—only what the algorithmic code drives them to do.
Without the utilization of software tools like this, it is daunting to design
[Figure: ternary plot of design points on axes A: TEA-LS (wt%), B: Cocamide (wt%), and C: Lauramide (wt%) in actual units; “2” marks replicated blends.]
criteria aim to do, and helpful hints on Getting the Computer to be “More
Open-Minded About What’s Optimal.”
Our mission now is to provide formulators like you just enough infor-
mation (i.e., a “need-to-know” basis) to feel comfortable using optimal
design. You are likely to discover that your constraints will not create
a simplex. Thus these custom design tools will be a godsend. Just do
not get hung up on the details of their development and mathematical
construction.
In this case, the tablet formulators went with this standard selection
of a quadratic model, which featured 10 terms—4 for the main compo-
nent effects (A, B, C, and D) plus 6 for nonlinear blending (AB, AC, AD,
BC, BD, and CD). To fit these 10 terms, 10 unique blends are required.
This is where the I-optimal criterion comes into play. It will be the judge
of which blends to run from within the mixture space. Although
this can be done with few geometric restrictions via a method called
“coordinate exchange” (detailed in sidebar “Candidate-free Approaches for
Exact Optimal Designs,” Appendix 7C, RSM Simplified, 2nd ed.), let’s keep
things simple by identifying a discrete number of points as candidates for
the optimal selection. This is called the “point exchange” method. The
only requirement is that the candidate set exceeds the number of blends
required to fit the model.
The starting point for building up a good candidate set is the extreme
vertices. These being far apart will provide the highest leverage for
fitting the main component effects. There are 12 extreme vertices in the
mixture space for the tablet experiment. To fit the nonlinear terms, it is
helpful to also include centers of edges (binary blends)—18 in this case.
One more point to include is the overall centroid—a complete blend.
This brings the total to 31. Furthermore, to fill in the gaps between the
centroid and vertices, let’s add the 12 axial check blends. This brings
the total to 43 candidate points—well beyond the 10 needed for the
quadratic mixture model.
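The bookkeeping in the last few paragraphs is easy to confirm by enumeration. The sketch below simply lists the quadratic model terms for four components and tallies the candidate points quoted above; the variable names are ours.

```python
from itertools import combinations

components = ["A", "B", "C", "D"]

# Quadratic Scheffe model: one linear term per component plus one
# nonlinear blending term per pair of components.
linear = list(components)
nonlinear = ["".join(pair) for pair in combinations(components, 2)]
terms = linear + nonlinear
print(terms, len(terms))

# Candidate counts as quoted in the text for the tablet experiment.
candidates = {
    "extreme vertices": 12,
    "centers of edges": 18,
    "overall centroid": 1,
    "axial check blends": 12,
}
total = sum(candidates.values())
print(total, "candidates for", len(terms), "model terms")
```

For q components the same count works out to q + q(q − 1)/2 terms, which gives 15 for a five-component quadratic model.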
From this set of candidate points, we laid out (with the aid of the soft-
ware) the 20-run custom design shown in Table 5.2. (Note that, to make it
simpler to see that the components add up to 100%, we included the API
(shaded to highlight it being fixed) and expressed the fractions). It is com-
prised of the following blends:
Practice Problem
To practice using the statistical techniques you learned in Chapter 5, work
through the following problem.
Problem 5.1
For a fun do-it-yourself experiment that involves optimal design for
mixtures, we recommend you make play putty per the recipe posted at
www.statease.com/publications/marks-play-putty-experiment (also refer to
Anderson and Whitcomb, 2002). We chose the following components and
levels to make up 100 milliliters (mL) of this silly stuff for our experiment:
1. Glue, 40–59 mL
2. Water, 40–59 mL
3. Borax, 1–3 mL
Optimal Design to Customize Your Experiment ◾ 99
For the three-component shampoo case, the first step of this algorithm
dictates the construction of three two-by-two tables that spell out the four
extreme combinations for A versus B, A versus C, and B versus C. These are
shown from left to right, respectively, in Table 5.3.
Six combinations out of the twelve meet all constraints, of which four are
unique (blend (20, 7, 3) is repeated several times). These are the ones boxed
by thicker lines. The combinations fail either due to being outside of the
individual component limits (Li−Ui) or the total constraint (TC). For example,
if you blended up 20% of A with 1% of B, then, to meet the total of 30%, 9%
of C would be needed to make up the difference. However, the cosmetic
chemists restricted C to a maximum of only 3%. Therefore, this combination
does not satisfy the rules of the algorithm. Other combinations get thrown
out straightaway due to one component taking up the entire 30% allowable
total—leaving no room for anything else.
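The screening logic just described can be sketched in Python. Treat this as an illustration rather than the published algorithm: the function name is ours, and the component limits (A 20 to 30, B 1 to 7, C 1 to 3, all in percent) are our inference from the ranges, the 3% cap on C, and the blends mentioned in the text.

```python
from itertools import product

def extreme_vertex_candidates(lower, upper, total):
    """Set all but one component to a limit, solve for the last, keep feasible blends."""
    q = len(lower)
    feasible = []
    for free in range(q):                      # component computed from the total
        others = [i for i in range(q) if i != free]
        limit_pairs = [(lower[i], upper[i]) for i in others]
        for levels in product(*limit_pairs):   # every low/high combination
            blend = [0] * q
            for i, level in zip(others, levels):
                blend[i] = level
            blend[free] = total - sum(levels)  # total constraint (TC)
            if lower[free] <= blend[free] <= upper[free]:
                feasible.append(tuple(blend))
    return feasible

# Assumed limits for the shampoo case (inferred, not quoted from a table).
lower = (20, 1, 1)
upper = (30, 7, 3)
feasible = extreme_vertex_candidates(lower, upper, total=30)

print(len(feasible), "feasible combinations")
print(sorted(set(feasible)), "are the unique vertices")
```

Under these assumed limits the sketch tries twelve combinations and keeps six, with blend (20, 7, 3) turning up three times, in line with the tally given above.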
More efficient algorithms have been developed since McLean and
Anderson invented theirs, most notably XVERT by Snee and Marquardt
(1974) and Piepel’s CONVRT algorithm (Piepel, 1988). Code is available
in R to compute extreme vertices and other points in nonsimplex spaces;
see Lawson and Willden’s “Mixture Experiments in R Using mixexp”
(Lawson and Willden, 2016, Code Snippet 2).
Chapter 6
1. 0.1 ≤ X1 ≤ 0.5
2. 0.1 ≤ X2 ≤ 0.7
3. 0.0 ≤ X3 ≤ 0.7
[Figures: two ternary diagrams on axes X1, X2, and X3. The first labels the constraint boundaries on A, B, and C; the second numbers points 1 to 8 around and within the constrained region.]
bar-screen balcony. This created a crisis when the gluten blob got going
toward the waste-water treatment plant and threatened to overwhelm
it with protein. Only by the last stand of push-broom wielding workers
was the disaster averted.
I think you should send us the biggest transport plane you have, and
take this thing to the Arctic or somewhere and drop it where it will
never thaw.
—Lieutenant Dave, The Blob, 1958, Source: IMDB
http://www.imdb.com/title/tt0051418/quotes
Mark laid out these ingredients and ranges for the Pound-Cake mixture
design, all in ounces (oz.) by weight:
He kept the total at 16 ounces, that is, one pound (1 lb), which, being
one-fourth the size of the traditional Pound Cake, fit nicely into four-cavity,
nonstick, mini-loaf pans for baking in his kitchen oven. But most importantly,
to avoid an overdose of flour, Mark specified the following MCC:
3 ≤ A + B ≤ 5
This created a little experiment with flour types (cake versus AP) within the
greater experiment on the recipe for Pound Cake (relative amounts of flour,
sugar, butter, and eggs). Table 6.1 lays out 12 out of 24 mixtures from an
augmented I-optimal design to fit a quadratic model. It provides 15 blends
to fit the model, plus 4 lack-of-fit points (check blends) and 5 replicates (for
pure error estimation). The total of 24 runs breaks down conveniently into 6
four-cavity pans that can be made in one oven batch.
Notice from the selected runs under columns A and B in Table 6.1 how
the MLC keeps the total of the cake and AP flour within the bounds of
3–5 oz. Success!
Ratio Constraints
In some mixture problems, the ratios of components must be carefully
controlled. For example, bakers of bread do well by blending a ratio of
5 parts of flour to 3 parts of water (Ruhlman, 2009). In some cases, these
ratios relate to the ideal stoichiometry for chemical reactions, such as the
air-to-fuel ratio in a combustion engine, which, depending on the grade
of gasoline, runs at about 15-to-1 (“Air-fuel Requirement in SI Engines
(Automobile),” what-when-how,
http://what-when-how.com/automobile/air-fuel-requirement-in-si-engines-automobile/).
These ratios generally can be mathematically converted to MLCs as you will
see now via a real-life example.
An adhesives chemist ran an optimal mixture design to model and
control gel time for a liquid epoxy while maintaining other functional prop-
erties (Roesler, 2004). He varied three key components in a mixture, totaling
100% by weight:
[Figure: ternary plot with axes B: Plasticizer (wt%) and C: Catalyst (wt%) visible; “2” marks replicated blends.]
Figure 6A.1 Trace plot of taste from a Pound-Cake experiment with both flours
shown. [Taste (rating) plotted for components A to E.]
Figure 6A.2 Trace plot redone after combining the two flours. [Taste (rating) plotted for Sugar, Butter, Eggs, and Flour.]
Getting Crafty with Multicomponent Constraints ◾ 111
Keep in mind that this trick of combining components only comes into
play in the final analysis as a simplification. It should be done judiciously
on the basis of subject-matter knowledge, that is, not on disparate materials
that coincidentally create similar effects, but only on ones that are chemically
similar.
Chapter 7
Multiple Response
Optimization Hits the Spot
Desirability Simplified
Being on a roll with paint (pun intended), let’s see how desirability works
by way of a case study from the coatings industry. In this case (Anderson
and Whitcomb, 1997a), experimenters explored the impact of three rhe-
ology modifiers (“RM”s) on the viscosity and flow of an architectural
coating.
Each of the three components was varied from zero to one-hundred
percent (0%–100%) via a second-degree simplex-lattice design augmented
with the overall centroid and axial check blends. The three vertices were
replicated to provide a measure of pure error. Table 7.1 lays out all of these
blends and the test results. Cost is also included based on a hypothetical
spread of pricing per kilogram of $5.00, $10.00, and $15.00 for RMs A, B,
and C, respectively.
Fitting the rheological data to Scheffé polynomials produces these highly
significant models, quadratic and linear, respectively:
These models for viscosity and flow provide complete control over the
rheology required for any application needs. For example, assume that the
following specifications must be met:
1. Viscosity: 0.5 to 0.7 poise (target 0.6)
2. Flow: 8.0 units or better on a 10-point scale
3. Cost: $10.00/kg or less ($5.00/kg the cheapest possible)
Figure 7.1a–c shows contour plots for viscosity, flow and cost, respectively.
Each of the three plots displays two or more blend-points that fall within
specification. However, none of them meet all the requirements, the center
of edge A–C blend—ID #5 at composition (50, 0, 50) being a near miss.
Figure 7.1 Contour plots for viscosity, flow, and cost (a, b, and c).
[Figure: overlay plot on ternary axes A: RM-A (wt%), B: RM-B (wt%), and C: RM-C (wt%). The flag reads Viscosity 0.64, Flow 8.30, Cost 9.05 at X1 = 56.56, X2 = 5.79, X3 = 37.64; the window is bounded by Viscosity 0.5 and 0.7, Flow 8, and Cost 10.]
Figure 7.2 Overlay plot, showing the sweet spot for a blend of rheology modifiers.
The area meeting all three specifications becomes far clearer by overlaying the
three plots and then shading out the regions going out of bounds. As shown
in Figure 7.2, this overlay plot provides a window that frames the sweet spot.
The flag marks a better blend than any made for the experiment: 56.56%
RM-A, 5.79% RM-B, and 37.64% RM-C. (As a practical matter, the coatings
chemist would do well by rounding these levels to 56.5%, 5.8%, and 37.7%,
respectively.) This optimal blend beats mixture #5 on two out of the three
specifications—viscosity (a bit closer to target) and cost (much cheaper).
Only on flow does it fall off somewhat—8.30 versus 8.67 for blend #5.
To achieve this optimal blend, all three response measures were rescaled to
one objective function called “desirability” going from 0 (none) to 1 (complete).
Figure 7.3 shows how desirability ramps up and down for viscosity (a target),
up for flow (maximized) and down for cost (minimized). (The number lines
are bounded by the observed extremes printed in smaller font size.)
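To make the ramps concrete, here is a minimal sketch of the desirability calculation for the flagged blend. The linear ramps follow the specifications listed earlier; combining them by a geometric mean with equal importance is the customary form and is our assumption here, since the mathematical details are deferred to RSM Simplified. The function names are ours, and the responses (viscosity 0.64, flow 8.30, cost 9.05) are the predictions flagged on the overlay plot.

```python
def d_target(y, low, target, high):
    """Ramp up from low to target, then back down to high."""
    if y <= low or y >= high:
        return 0.0
    if y <= target:
        return (y - low) / (target - low)
    return (high - y) / (high - target)

def d_maximize(y, low, high):
    """Ramp from 0 at low to 1 at high, clipped to the 0-1 range."""
    return min(1.0, max(0.0, (y - low) / (high - low)))

def d_minimize(y, low, high):
    """Ramp from 1 at low to 0 at high, clipped to the 0-1 range."""
    return min(1.0, max(0.0, (high - y) / (high - low)))

def overall_desirability(ds):
    """Geometric mean of the individual desirabilities (equal importance assumed)."""
    product = 1.0
    for d in ds:
        product *= d
    return product ** (1.0 / len(ds))

d_visc = d_target(0.64, low=0.5, target=0.6, high=0.7)   # viscosity, poise
d_flow = d_maximize(8.30, low=8.0, high=10.0)            # flow, 10-point scale
d_cost = d_minimize(9.05, low=5.0, high=10.0)            # cost, $/kg

D = overall_desirability([d_visc, d_flow, d_cost])
print(round(d_visc, 2), round(d_flow, 2), round(d_cost, 2), round(D, 2))
```

Because the individual values multiply, a blend that fails any single goal (d = 0) scores zero overall no matter how well it does on the others.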
Figure 7.3 Ramps view of the most desirable blend of rheology modifiers. [Viscosity: d = 0 at 0.5 and 0.7, d = 1 at 0.6. Flow: d = 0 at 8, d = 1 at 10. Cost: d = 1 at 5, d = 0 at 10.]
DETAILS ON DESIRABILITY
For mathematical details on calculating desirability and the attendant
search algorithms for numerical optimization, refer to Chapter 6 of RSM
Simplified, 2nd ed. There, you will also find a background on how to pri-
oritize responses by scaling them in terms of “importance.” For example,
in this case, a coatings chemist could push the selection of the most
desirable blend of rheology modifiers closer to the target on viscosity
by making the response most important—5 on the 5-point scale, while
setting both flow and cost at the least important level of 1.
As evidenced by Figure 7.4, the D of 0.26 is the best that can be achieved
for the goals set on the responses. Its absolute value, that is, the overall
desirability not achieving a perfect value of 1, is of no concern.
When every goal is at least minimally met, that is, just inside the lower
and/or upper threshold level, an overall desirability above zero is achieved. This
“satisfices” (a la the quote by Simon in the sidebar), rather than optimizes.
“Good enough.”
[Figure: contour plot of desirability on ternary axes (A: RM-A (wt%) shown), with the flagged maximum at a desirability of 0.26.]
1. Lactose, 5%–42%
2. Phosphate, 5%–47%
3. Cellulose, 5%–52%
4. Polymer, 17%–25%
5. Drug, 1%–2%
They wanted their final formula to achieve the following two specifications
reliably:
Assuming quadratic mixture behavior, that is, nonlinear blending effects, the
chemists set up an I-optimal design with 15 model points. They augmented
this base experiment with 5 check-blends, and 5 replicates chosen optimally.
Table 7.2 lays out the resulting 25 runs.
After fitting the full quadratic model (A, B, C, D, E, AB, AC, AD, AE, BC,
BD, BE, CD, CE, DE) to each response, the following terms were removed via
backward elimination at p of 0.1 (for background on this tool, see “A Brief
Word on Algorithmic Model Reduction”, RSM Simplified, 2nd ed., pp. 30–32.):
Figure 7.6 Overlay plots for tablet formulation: (a) as-is, (b) with CI, and (c) with TI. [Ternary axes A: Lactose (wt%), B: Phosphate (wt%), and C: Cellulose (wt%); the flag reads Hardness 7.97 at X1 = 28.47, X2 = 24.58, X3 = 28.95.]
confidence and/or the nearer to 100% the requirements reach, the smaller
the QbD design space becomes—all else equal. In fact, we often see the
space covered up entirely by the TI, due to the experiment being sized
too small to support the imposition of the interval. For example, adding
the TI (or even the CI) to the viscosity specification closes the window
in the overlay plot depicted by Figure 7.2 for the rheology-modifier
formulation—this 10-blend design being far too meager for QbD.
◾ For “functional” design, impose CIs, which reduce the risk of mean
results falling outside the allowable operating boundaries.
◾ To verify that your product falls within final manufacturing speci-
fications, for example, to achieve a QbD design space, enforce
tolerance intervals (TI).
Keep in mind that TIs range far wider than CIs. Therefore, a relatively
large experiment will be required to create any design space meeting such
a rigorous statistical requirement.
Practice Problem
To practice using the statistical techniques you learned in Chapter 7, work
through the following problems.
Problem 7.1
Follow up on the detergent case from Problem 3.1 by completing the
multiple-response optimization. Is there a sweet spot for the formula-
tion that achieves all specifications? Find out by using the computer tool
specialized for this purpose—see About the Software for the path to the
program and the link to accompanying tutorials. Download and print
the Mixture Design Tutorial (Part 2/2—Optimization). Follow it through
to the conclusion of this case study.
Nearly all the grandest discoveries of science have been but the
rewards of accurate measurement and patient long-continued labor
in the minute sifting of numerical results.
—Lord Kelvin (Presidential Address to Royal
Society, 1871, quoted p. 940 in Life of Lord
Kelvin, Silvanus Phillips Thompson, 1910)
DON’T KNOCK IT
The octane number characterizes a gasoline’s antiknock quality, that is,
its capacity to withstand damaging premature detonation in an engine’s
combustion chamber. Isooctane was the original antiknock additive—
hence the name “octane” for the rating. The higher the octane num-
ber, the more compression the fuel can handle. When gasoline is first
distilled from oil, it has an octane number of about 70, which would
severely limit engine efficiency. An octane rating of 87 meets the needs
for regular gasoline, a level that correlates directly to a mixture of 87%
isooctane and 13% heptane. (A 100% heptane fuel is the zero point of
the octane rating scale.)
In 1921, a team of General Motors chemists discovered that tetraethyl
lead worked wonders for octane ratings. However, due to health risks,
the Clean Air Act banned the sale of leaded gasoline for on-road
vehicles in the U.S.A. as of 1996.
One of the cheapest and most common replacements for the lead as an
octane booster is ethanol. However, its energy content is only 70% of
that of gasoline. Also, ethanol creates a host of engine issues. Currently,
the best alternative to ethanol is a toxic mixture of benzene, toluene,
ethylbenzene, and xylene (BTEX).
The search for safe and effective octane-additives continues.
[Figure: three-component simplex with vertices X1 = 1, X2 = 1, and X3 = 1, and opposite sides X1 = 0, X2 = 0, and X3 = 0.]
Keep in mind that along the axes of a simplex (e.g., X1 from 0 to 1),
the proportion of each component spans its entire range while the
relative proportions of the other components to one another remain
unchanged. At a minimum, a simplex screening design should be
comprised of the q vertices (10 in the case of the octane additives).
Next, if the experimental budget allows, we advise you add the centroid
and replicate it 5 or so times to provide an estimate of pure error.
Beyond that, consider adding q axial points (located midway between
the centroid and the vertices) to fill the gaps and provide more power.
This brings the total blends to 2q + 1. Finally, another q “end effect”
blends (Snee and Marquardt, 1976, p. 22) can be added, these being
the points along the sides in Figure 8.1 where one component goes to
zero (e.g., X1 = 0). These latter blends, expanding the design to 3q + 1,
measure the effect from the complete elimination of any single
ingredient.
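The build-up from q to 2q + 1 to 3q + 1 blends can be sketched as follows. The function name is ours, and we assume each end-effect blend splits the mixture equally among the remaining q − 1 components, which the text implies but does not spell out.

```python
from fractions import Fraction as F

def simplex_screening(q):
    """Return the point groups of a simplex screening design for q components."""
    centroid = [F(1, q)] * q
    vertices, axial, end_effect = [], [], []
    for i in range(q):
        vertex = [F(1) if j == i else F(0) for j in range(q)]
        vertices.append(vertex)
        # Axial check blend: midway between the centroid and vertex i.
        axial.append([(v + c) / 2 for v, c in zip(vertex, centroid)])
        # End-effect blend: component i removed, the rest shared equally (assumed).
        end_effect.append([F(0) if j == i else F(1, q - 1) for j in range(q)])
    return {
        "vertices": vertices,
        "centroid": [centroid],
        "axial": axial,
        "end effect": end_effect,
    }

design = simplex_screening(10)      # ten octane additives
for name, blends in design.items():
    print(name, len(blends))
print("unique blends:", sum(len(b) for b in design.values()))
```

Replicates of the centroid would be added on top of these unique blends when laying out the actual run sheet.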
The petroleum chemists chose the middle-sized 2q + 1 simplex screen-
ing design shown in Table 8.2, which included 5 replicates of the centroid
Screening for Vital Components ◾ 131
[Figure: trace plot of Octane (number) for components A to K.]
(blends 21–25) and the 10 axial check blends (#s 11–20), as well as the
10 vertices (#s 1–10) that must be tested at a minimum.
The resulting octane numbers, which range from 62.3 to 86.0,
produce a significant fit to the linear mixture model. The most
important view of the relative effects is provided by the trace plot
depicted in Figure 8.2.
As we explained in Appendix 6A, this plot displays the predicted
response as any given component deviates from any chosen reference
point (typically the centroid), while holding all other components in
constant proportion. For example, follow component A (X1) from the nexus
of all 10 ingredients at the zero point on the abscissa (0.0 deviation from
reference blend). As indicated by the line going up to the left, reducing
A to its minimum of zero (−0.1 deviation), that is, taking it out of the
mixture, increases the octane. As more and more of component A goes
into the blend, it steadily degrades the octane. The lowest response
prediction comes when the additive is comprised only of A (0.9 units
above the centroid reference blend). Clearly, ingredient X1 (component A)
should be discarded.
$$E_i = \beta_i - \frac{\sum_{j \neq i} \beta_j}{q - 1}$$
where 80.84 is the octane number on average for the nine other additives.
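The effect calculation amounts to comparing each coefficient with the average of all the others. The sketch below shows the arithmetic on three hypothetical coefficients of our own choosing; they are illustrative only and are not the fitted octane model.

```python
def component_effects(betas):
    """Effect of component i: its coefficient minus the mean of the other q - 1."""
    q = len(betas)
    total = sum(betas)
    return [b - (total - b) / (q - 1) for b in betas]

# Hypothetical linear coefficients for a three-component mixture.
betas = [60.0, 82.0, 79.0]
effects = component_effects(betas)
print(effects)
```

A strongly negative effect, like the first one here, flags a component that drags the response down relative to its peers.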
1. X1, 0.10–0.45
2. X2, 0.05–0.50
3. X3, 0.00–0.10
4. X4, 0.00–0.10
5. X5, 0.10–0.60
6. X6, 0.05–0.20
7. X7, 0.00–0.05
8. X8, 0.00–0.05
Figure 8.3 Trace plot of component effects from eight-component EVD case. [Response Y plotted for components A to H.]
Components C (X3), D (X4), and F (X6) do not affect the response much
one way or the other, as evidenced by their insignificant (p > 0.05) gradi-
ents. If any or all of these ingredients come cheap, these might be retained
in the formula. Otherwise, they should be taken out.
Figure 8A.1 Directions of trace for Cox (a) versus Piepel (b) in pipe case. [Both panels are ternary plots with axes B: SAN (%) and C: pitch (%) visible.]
corners of the big triangle (coded in reals), whereas Piepel steers for
the vertices of the small triangle (coded in pseudos).
The resulting trace plots are displayed in Figure 8A.2a and b.
The advantage provided by Piepel now becomes apparent by its tracks
being of a consistent length and, for the most part, longer than Cox. Do
not let the change in slope for component B from negative (Cox) to positive
(Piepel) put you off: this wavering indicates a linear effect that may not
be significant and, indeed, pales in comparison to the impacts of A and C,
which, throughout, remain consistent.
Figure 8A.2 Trace plots by Cox (a) versus Piepel (b) in pipe case. [Both panels plot Izod (ft-lb/in) against deviation from reference blend for components A, B, and C.]
Consider that the traces are one-dimensional only, and thus, cannot
provide a beneficial view of a response surface, especially with a nonlinear
blending model. Furthermore, they depend not only on direction but also
on the point of origin.
Chapter 9
Working Amounts, Categorical
and Process Factors into
the Mix
This chapter lays out designs that combine varying compositions of mixtures
with changes in levels of process factors and other variables. The frosting on
the cake, perhaps literally, is to experiment on two mixtures simultaneously,
that is, a “mix–mix.” These combined designs unlock a universe of potential
synergisms. However, they can quickly expand to far more runs than can be
afforded. Therefore, they must be used judiciously.
[Figure: two three-component simplexes (vertices X1 = 1, X2 = 1, X3 = 1), one at Z1 = −1 and one at Z1 = +1.]
$$Y(z) = \alpha_0 + \alpha_1 z_1 + \alpha_{11} z_1^2$$
Due to the size of the crossed models for combined designs such as this,
we recommend that they be reduced. One of the better ways to eliminate
terms is to take out any that fall below a specified p-value (typically 0.1)
but do so in step-wise fashion, starting with the least significant one and
going on from there until only the vital ones remain, that is, when no
further reduction could be made. As noted in Chapter 7 for the tableting
case, this is called the “backward” selection method (as opposed to
“forward”—a method starting from a core model and adding the most
significant term, next most, and so on). The two reduced models are shown
below—all terms being significant but for AC in Y1, which comes in due
to it being needed to support the hierarchy of ACD. (We detailed hierarchy
in Appendix 1B. However, you may not have realized its repercussions
on third-order terms such as ACD, which require not only the three main
components (A, C, and D) be modeled, but also the three nonlinear
blending terms: AC, AD, and CD.)
The presence of significant terms that include factor D in both responses
indicates that the amount does affect the release of the drug, as one would
assume. The pharma chemists hoped to achieve 10%–30% dissolution at one hour
and, for the same composition and amount of coating, a release of 80%–100% at
12 hours. They succeeded, as seen by the window that appears in the graphical
overlay depicted in Figure 9.2.
Figure 9.2 Sweet spot for the release of ibuprofen at coating amount (D) of 19 mL. [Ternary axes A: EA (w/v %), B: MMA (w/v %), and C: TEC (w/v %); the window is bounded by Dis 1 hr: 30 and Dis 12 hr: 80.]
This elusive sweet spot occurs only between 16.5 and 19.7 mL of the
coating and only for a particular combination of ingredients. Only by a well-
conceived combined DOE can such desirable outcomes be found in a rea-
sonable number of runs.
Elastomer and fiber types are categoric factors; that is, only one of each can
be present in any given mixture.
The ranges in weight percent for the elastomer and fiber are 5%–10%
and 54%–62%, respectively. The ratio of epoxy to hardener must be
maintained from 1.8 to 2.1 (refer to Chapter 6 for detail on how to convert
this into a multicomponent constraint). These specifications set the ranges
for hardener and epoxy levels somewhere between zero and one-hundred
percent (0%–100%)–an algorithm such as that detailed in Appendix 5A will
get this job done. Nothing about the mixture part of the combined design
is new at this stage in the book; the only new aspect is that it is being
combined with categorical variables. Figure 9.3 shows how this complicates
the structure.
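The ratio limits quoted above can be rewritten as two linear constraints, epoxy − 1.8 × hardener ≥ 0 and 2.1 × hardener − epoxy ≥ 0, which is the form an optimal design tool needs. The sketch below checks a blend against them and works out the hardener and epoxy ranges they imply. It assumes the composite is made up of exactly four components (hardener, epoxy, elastomer, fiber) summing to 100%; the function names are ours.

```python
LO, HI = 1.8, 2.1   # allowed epoxy-to-hardener ratio

def ratio_ok(epoxy, hardener, lo=LO, hi=HI, tol=1e-9):
    """Ratio limits written as two linear constraints (no division needed)."""
    return epoxy - lo * hardener >= -tol and hi * hardener - epoxy >= -tol

def implied_ranges(elastomer=(5, 10), fiber=(54, 62), lo=LO, hi=HI, total=100):
    """Hardener and epoxy ranges implied by the other limits and the ratio."""
    resin_min = total - elastomer[1] - fiber[1]   # least room left for resin
    resin_max = total - elastomer[0] - fiber[0]   # most room left for resin
    # With epoxy = r * hardener and epoxy + hardener = resin:
    hardener = (resin_min / (1 + hi), resin_max / (1 + lo))
    epoxy = (resin_min * lo / (1 + lo), resin_max * hi / (1 + hi))
    return hardener, epoxy

hardener_range, epoxy_range = implied_ranges()
print("hardener:", [round(v, 4) for v in hardener_range])
print("epoxy:   ", [round(v, 4) for v in epoxy_range])
print(ratio_ok(epoxy=20, hardener=10), ratio_ok(epoxy=25, hardener=10))
```

Writing the ratio as linear inequalities keeps the constraint well behaved even when the denominator component approaches zero.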
The minimal number of points for an optimal design expands
exponentially as more categorical variables are added, especially,
when they expand beyond two types each, as in the case of the fiber
alternatives. To keep the design to a manageable level, this experiment
on composites is set up for a quadratic mixture model with 10 terms
(the 4 main components plus 6 second-order nonlinear blending
combinations) crossed by a two-factor interaction equation for the
[Plot axes: Z1 and Z2.]
[Plot with axis limits 5 to 10, 54 to 62, 9.03226 to 14.6429, and 18 to 27.7742; treatments 1 to 2 and 1 to 3; settings D: Epoxy resin = 21.4357, E: Elastomer (type) = Elast-2, F: Fiber (type) = Fiber-3.]
Figure 9.4 An optimal composite recipe made with selected types of elastomer
and fiber.
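To get a feel for why crossing models inflates the run count, the following sketch counts terms. It takes the 10-term quadratic mixture model for four components from the text and crosses it with a two-factor interaction model for the categorical factors. The level counts (two elastomer types, three fiber types) are an assumption read from the figure labels, and the helper names are hypothetical.

```python
from math import comb

def scheffe_quadratic_terms(q):
    """Linear blending terms plus two-component nonlinear blending terms."""
    return q + comb(q, 2)

def categorical_2fi_terms(levels):
    """Intercept, main effects, and two-factor interactions for
    categorical factors with the given numbers of levels."""
    main = [n - 1 for n in levels]
    interactions = sum(a * b for i, a in enumerate(main) for b in main[i + 1:])
    return 1 + sum(main) + interactions

mixture_terms = scheffe_quadratic_terms(4)          # 4 + 6
categorical_terms = categorical_2fi_terms([2, 3])   # assumed: 2 elastomers, 3 fibers
print(mixture_terms, categorical_terms, mixture_terms * categorical_terms)
```

Because crossing multiplies every mixture term by every categorical term, the total is the product of the two counts, here 10 x 6 = 60 coefficients, which sets a floor on the number of distinct runs.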
Practice Problem
To practice using the statistical techniques you learned in Chapter 9, work
through the following problem.
Problem 9.1
Experience how easy it will be to design a mixture experiment, combined
with multiple factors (one of the combinations we skipped over) and
analyze the results by using a computer tool specialized for this purpose.
It can be freely accessed via the website developed in support of this book:
See About the Software for the path. When you arrive at the internet page,
follow the link to the accompanying tutorials. Then download and print the
one on Combined Mixture-Process. It explains how food scientists came up
with just the right texture (never mind the taste!) for fish sticks made from
an optimal blend of mullet, sheepshead, and croaker (yuk!). They were then
deep-fried for an ideal period before being baked to perfection at the ideal
temperature and time settings. You will be amazed.
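$$
\eta(\mathbf{x},\mathbf{z}) = \sum_{i=1}^{q} \beta_i x_i
+ \sum\sum_{i<j}^{q} \beta_{ij} x_i x_j
+ \sum_{i=1}^{q} \sum_{n=1}^{k} \gamma_{in} x_i z_n
+ \sum\sum_{n<m}^{k} \alpha_{nm} z_n z_m
+ \sum_{n=1}^{k} \alpha_{nn} z_n^2
$$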
where q and k are the numbers of mixture components and process factors,
respectively. The interactions between the linear blending
terms in the mixture and main-effect process variables (e.g., X1Z1) provide
the core value for the crossing of models, that is, an opportunity to
find just the right combination of component levels at a particular set of
factors.
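The five groups of terms in this combined model can be enumerated for a single run, as in the minimal sketch below. It assumes mixture proportions `x` (summing to one) and coded process settings `z`; the function name and the term labels are illustrative only and do not come from any particular software package.

```python
from itertools import combinations

def combined_model_terms(x, z):
    """Named model terms for one run: linear and nonlinear blending,
    mixture-by-process interactions, and process-only second-order terms."""
    q, k = len(x), len(z)
    terms = {}
    for i in range(q):                          # linear blending
        terms[f"x{i+1}"] = x[i]
    for i, j in combinations(range(q), 2):      # nonlinear blending
        terms[f"x{i+1}x{j+1}"] = x[i] * x[j]
    for i in range(q):                          # mixture x process
        for n in range(k):
            terms[f"x{i+1}z{n+1}"] = x[i] * z[n]
    for n, m in combinations(range(k), 2):      # process interactions
        terms[f"z{n+1}z{m+1}"] = z[n] * z[m]
    for n in range(k):                          # process squared terms
        terms[f"z{n+1}^2"] = z[n] ** 2
    return terms

# Three components and two process factors give 3 + 3 + 6 + 1 + 2 = 15 terms.
terms = combined_model_terms([0.5, 0.3, 0.2], [1, -1])
print(len(terms), terms["x1z1"])
```

Stacking one such row per run gives the model matrix for a least-squares fit; the `x1z1`-type columns are the ones that let the best blend shift with the process settings.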
These limitations lead to groupings of runs via the block and split plots,
respectively.
The blocked mixture experiment cited by Draper et al. and Lewis et al. in their
1993 and 1994 publications, respectively, was conducted by an English miller
on four flours, each derived from a different variety of wheat. The food
chemists blended these four components into doughs in various proportions, which,
when baked into bread, created the varying results (simplified somewhat) in
specific volumes shown in Table 10.1. The higher the volume the better, because
it produces loaves of the least density, as desired by consumers in Great Britain.
Breaking this experiment into two blocks lessened the impact of unknown
time-related variables that might otherwise be confounding. The block size of
9 runs each fell well within the capacity of the pilot-scale bakery.
After removing the variation due to the blocks, the regression produces
a model that exhibits strong nonlinear blending of flours 2–4 with flour 1,
that is, terms AB, AC, and AD. The strong influence of this first flour (A) can
be seen in the trace plot shown in Figure 10.1.
The mean square for the blocks exceeds the residual mean square
by nearly fourfold, which is appreciable. Only by filtering out this
variation over time was it possible for this design to reveal the formulation
behavior exhibited in the trace plot.
This experiment on bread flour pioneered the application of blocking
to mixture designs. However, modern-day computational tools (such as those
provided by the software accompanying this book) provide more flexible
(not restricted to simplex regions) and better (statistically optimal) designs
for any number of blocks within reason. Applying optimal design, in
[Trace plot: Volume (ml/100 g), axis from 340 to 440, with traces for components A, B, C, and D.]
mixture designs were still in their infancy. To be fair, all of these come
at a very slight cost: optimal designs, such as those we laid out, are not
orthogonally blocked, only nearly so (e.g., a VIF of 1.4 in our design, with a
value of 1.0 being completely orthogonal).
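For readers unfamiliar with the variance inflation factor (VIF), the toy sketch below shows why 1.0 signals complete orthogonality. It uses the general definition VIF = 1/(1 − R²) in the simplest case of one block column and one other coded column, where R² is just the squared correlation. The columns are invented for illustration; this is not how design software evaluates a blocked mixture design, and it does not reproduce the 1.4 quoted above.

```python
def pearson_r(u, v):
    """Sample correlation between two equal-length columns."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5

def vif_two_columns(u, v):
    """VIF = 1 / (1 - R^2); with a single other column, R^2 = r^2."""
    r = pearson_r(u, v)
    return 1.0 / (1.0 - r * r)

block = [-1, -1, -1, -1, 1, 1, 1, 1]        # two blocks of four runs
balanced = [-1, -1, 1, 1, -1, -1, 1, 1]     # orthogonal to the block column
lopsided = [-1, -1, -1, 1, -1, 1, 1, 1]     # partly aligned with the blocks

print(vif_two_columns(block, balanced))             # 1.0
print(round(vif_two_columns(block, lopsided), 3))   # 1.333
```

The balanced column is split evenly within each block, so the block effect cannot masquerade as a factor effect; the lopsided column is correlated with the blocks (r = 0.5), which inflates the variance of its coefficient by a third.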
Table 10.2 Flour Experiment Rebuilt with Modern Tools for Optimal Design
ID    Block  Build Type   Space Type  A: Flour 1  B: Flour 2  C: Flour 3  D: Flour 4
1a    1      Model        CentEdge    0.500       0.500       0.000       0.000
1b    1      Replicate    CentEdge    0.500       0.500       0.000       0.000
2     1      Model        Vertex      0.000       1.000       0.000       0.000
3a    1      Model        CentEdge    0.500       0.000       0.500       0.000
3b    1      Replicate    CentEdge    0.500       0.000       0.500       0.000
4     1      Model        CentEdge    0.000       0.500       0.500       0.000
5     1      Model        Vertex      0.000       0.000       1.000       0.000
6     1      Model        PlaneCent   0.000       0.33̅3̅       0.33̅3̅       0.33̅3̅
7     1      Model        Interior    0.375       0.125       0.125       0.375
8     1      Lack of fit  AxialCB     0.125       0.125       0.125       0.625
9     2      Model        Vertex      1.000       0.000       0.000       0.000
10    2      Lack of fit  PlaneCent   0.33̅3̅       0.33̅3̅       0.33̅3̅       0.000
11    2      Lack of fit  AxialCB     0.125       0.625       0.125       0.125
12    2      Lack of fit  AxialCB     0.125       0.125       0.625       0.125
13a   2      Model        CentEdge    0.500       0.000       0.000       0.500
13b   2      Replicate    CentEdge    0.500       0.000       0.000       0.500
14a   2      Model        CentEdge    0.000       0.500       0.000       0.500
14b   2      Replicate    CentEdge    0.000       0.500       0.000       0.500
15    2      Model        CentEdge    0.000       0.000       0.500       0.500
16    2      Model        Vertex      0.000       0.000       0.000       1.000
Figure 10.2 Point locations in blocks 1 (a) and 2 (b) for optimal design on flours.
Figure 10.2 shows the optimal design block by block within the tetrahedral
space. The double-circled points are replicated.
Note how the point allocation in our optimal design is mutually exclusive,
that is, the blocks complement each other, whereas the original
experimenters reran blends 1–4 and 9 in their second block. If you do want a
“control” blend across both blocks, we recommend it be the overall centroid.
Also, since three out of the four vertices made it into this algorithmic design,
adding the missing one (D, the pure blend of flour 4) would be sensible.
To summarize, using standard templates, or more-versatile computer
tools for optimal design, blocks should be set up to be orthogonal, or nearly
orthogonal, to the component effects you want to estimate. Having done so,
the breakdown of runs into subgroups will change the model-coefficient
estimates very little, if at all. Then if blocks differ to an appreciable
degree, you need not worry, because this variation will be filtered out of
the overall system noise, thus making it easier to detect significant changes
to the response caused by the components themselves.
◾ A new blend must be prepared even if the recipe listed in the experi-
mental plan does not change.
◾ The process must be reset, even though all of the factors remain at the
same level in the design from one run to the next.
If all the variables are easy to change (ETC), this complete randomization
with resets should be done. However, it is not often practical, or even
possible, to perform an experiment in this way. Consider, for example, a food
scientist mixing up a single cookie by the recipe specified in the design
layout and then baking it all by itself per the plan for that run. That would
be ridiculous. A more sensible approach in this case would be to mix a
batch of cookies with varying recipes and then bake a tray of them. This
introduces a restriction in randomization that creates a designed experiment
called a “split plot.”
For background on split plots see Chapter 11 of DOE Simplified, 3rd edition.
Here we will detail its application to an experiment done at Stat-Ease
headquarters to improve the office coffee by a better blend of beans, combined
with improved methods for grinding and brewing (Bezener and Anderson,
2016).
The coffee provided by the vendor of the Stat-Ease brewing machine
created a backlash from our programmers who declared it to be “disgusting
and unacceptable.” They questioned not only the type of coffee being used
but also the fineness of its grind and how much of it should be brewed per pot.
The team agreed that changes would be made only if the new coffee be
The fully randomized experiment required freshly blended beans for each
pot. However, it being far more convenient to mix these up in quantity,
the blends were restricted to 16 groups using a 74-run split-plot design.
Table 10.3 shows the first 9 runs of the experiment to illustrate the grouping
(3 shown) of the mixtures.
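The restricted randomization behind a split plot can be sketched as follows: the hard-to-change groups (here, bean blends) are put in random order, and the easy-to-change settings are then randomized separately within each group. The blends, grind levels, and loadings below are hypothetical placeholders, not the actual 74-run design, and `split_plot_run_order` is a name made up for this example.

```python
import random

def split_plot_run_order(whole_plots, subplot_settings, seed=None):
    """Randomize the order of whole plots (hard-to-change groups), then
    randomize the subplot settings separately within each group."""
    rng = random.Random(seed)
    groups = list(whole_plots)
    rng.shuffle(groups)
    order = []
    for group in groups:
        settings = list(subplot_settings)
        rng.shuffle(settings)
        order.extend((group, s) for s in settings)
    return order

# Hypothetical bean blends (whole plots) and grind/load settings (subplots).
blends = ["blend A", "blend B", "blend C"]
settings = [("fine", 2.5), ("fine", 3.5), ("coarse", 2.5)]
order = split_plot_run_order(blends, settings, seed=1)
for run, (blend, (grind, load)) in enumerate(order, 1):
    print(run, blend, grind, load)
```

Every blend is mixed once and used for consecutive pots, which is the convenience the experimenters wanted; the price is that the analysis must account for two sources of error, one between groups and one within them.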
A supplement to the book details the modeling and statistical analysis of this
combined split-plot design. A subsequent numerical search came up with
a most desirable 50–50 blend of medium and dark beans (no light ones)
when ground to the fine level and brewed at a loading of 2.5 oz. This hit the
“sweet spot” for the taste testers as evidenced by 10 follow-up runs—4 of the
chosen blend, 2 with the standard office coffee, and 4 at various other
combinations of beans ground in different ways and produced with changing
amounts, none of which deviated significantly from the model predictions.
[Two panels, (a) and (b), of simplex plots with vertices P1, P2, and P3, arranged by the process factors Extrusion rate and Drying temp.]
Figure 10.3 Split plots for mixture HTC (a), versus process HTC (b).
Chapter 11
Eye of newt and toe of frog, wool of bat and tongue of dog.
Adder’s fork and blindworm’s sting...Barbados lime is just the
thing. Cragged salt like a sailor’s stubble! Flip the switch and let
the cauldron bubble!
—Owen’s sisters (witches) reciting recipe for
margaritas (Practical Magic, Warner Brothers, 1998)
In this last chapter, we come back to what defines a mixture and provide
practical advice on how to put the statistical design of experiments to good
use for optimizing your formulation. To begin, let’s dispel various excuses
to avoid the peculiarities of mixture design and analysis by denying the true
nature of the experimental variables.
ingredients. For this to work, you must fix the total of your components
that will be varied, for example, a 20-gallon cauldron for a magical
concoction. Unfortunately, many formulators, particularly those who are
only taught the standard DOE tools for factorials and response surface
methods, do not take to mixture design. They excuse themselves along
the following two lines epitomized by these actual quotes from clients
who must remain nameless:
[Stacked bar charts for recipes 1–13 showing Plasticizer, Filler, Other, and Polymer: first on an absolute scale (0 to 500), then as percentages (0% to 100%).]
1. Filler, 23%–55%
2. Plasticizer, 11%–28%
3. Other, 11%–19% (not a constant!)
4. Polymer, 19%–33%
Practical Magic for Making the Most of a Mixture ◾ 169
1. Filler 25%–55%
2. Plasticizer 11%–28%
3. Polymer 20%–33%
[Stacked bar chart (0% to 100%) for recipes 1–13 showing Plasticizer, Filler, Polymer, and Other.]
[Trace plot: Overall liking, axis from 5 to 7, with traces for components A, B, C, and D.]
[Flowchart of the strategy of experimentation for mixtures]
Phase: Screening (known and unknown components). Designs: simplex screening, extreme vertices. Purpose: screening the vital few from the trivial many.
Phase: Optimization. Designs: simplex lattice (≥ 2nd degree), optimal (for ≥ 2nd order). Purpose: estimate nonlinear blending effects.
Phase: Verification. Design: confirmation runs. Confirm? No: backup. Yes: celebrate!!!
via confirmation runs; for example, those done in the coffee experiment
detailed in Chapter 10, or more rigorous methods (see Martin Bezener’s
March 2015 webinar on “Practical Strategies for Model Verification” posted at
www.statease.com/training/webinar.html).
This concludes our book on mixture design for optimal formulation. We hope
you put these methods to good use for revising your recipe to the sweetest
spot ever.
References
Cornell, J. Experiments with Mixtures, 3rd ed. New York: John Wiley & Sons, 2002.
Cornell, J., and G. Piepel. Methods for Designing and Analyzing Mixture
Experiments, notes from a short course presented at the Fall Technical
Conference by the Chemical & Process Industries Divisions (CPID) and the
Statistics Division of the American Society for Quality (ASQ), and by the
Section on Physical and Engineering Sciences (SPES) and the Section on
Quality & Productivity (Q&P) of the American Statistical Association (ASA),
October 8, 2008, Phoenix, AZ.
Cox, D. R. A note on polynomial response functions for mixtures. Biometrika,
1971, 58, 155–159.
Crosier, R. B. Mixture experiments: Geometry and pseudocomponents.
Technometrics, 1984, 26(3), 209–216.
Daniel, C., and E. L. Lehmann. Henry Scheffé 1907–1977. The Annals of Statistics,
1979, 7(6), 1149–1161.
Del Vecchio, R. J. Design of Experiments. Hanser/Gardner, Cincinnati, OH, 1997,
pp. 100–101.
DeVries, P. Let Me Count the Ways. Boston, MA: Little Brown, 1965.
Dick, P. K. Valis (Valis Trilogy #1). Santa Ana, CA, 2011. www.amazon.com/
VALIS-Valis-Trilogy-Philip-Dick/dp/0547572417.
Draper, N. R., and I. Guttman. Rationalization of the “alphabetic-optimal” and
“variance plus bias” approaches to experimental design. Technical Report 841,
1988, Department of Statistics, University of Wisconsin.
Draper, N. R., P. Prescott, S. M. Lewis, A. M. Dean, P. W. M. John, and M. G. Tuck.
Mixture designs for four components in orthogonal blocks. Technometrics,
1993, 35(3), 268–276.
Dryden, J. All for Love; Or, The World Well Lost: A Tragedy. 1678. www.bartleby.
com/18/1/.
Goos, P., and A. N. Donev. The D-optimal design of blocked and split-plot
experiments with mixture components, Research Report 0303, Departement
Toegepaste Economische Wetenschappen, Katholieke Universiteit Leuven,
January 2003, p. 1. https://lirias.kuleuven.be/bitstream/123456789/118367/1/
OR_0303.pdf.
Guidance for Industry. Q8(R2) Pharmaceutical development. U.S. FDA, November
2009, p. 9.
Hensley, C. Design of experiments helps reduce time to remove aerospace
coatings. Aerospace Engineering & Manufacturing Technology Update, 2008,
pp. 21–23. http://articles.sae.org/2916/.
Humphrey, J. W., J. P. Oleson, and A. N. Sherwood. Greek and Roman Technology:
A Sourcebook. Routledge, New York, 1998.
Kalichevsky, V. A. The Amazing Petroleum Industry. New York: Reinhold, 1943, p. 7.
Koons, G. F., and M. H. Wilt. Design and analysis of an acrylonitrile—butadiene—
styrene (ABS) pipe compound experiment. Computer Applications in Applied
Polymer Science, Chapter 27. Washington, DC: American Chemical Society,
1982, pp. 439–448.
Kowalski, S., J. A. Cornell, and G. G. Vining. A new model and class of designs
for mixture experiments with process variables. Communications in Statistics:
Theory and Methods, 2000, 29(9–10), 2255–2280.
Kowalski, S., J. A. Cornell, and G. G. Vining. Split-plot designs and estimation
methods for mixture experiments with process variables. Technometrics, 2002,
44(1), 72.
Kris-Etherton, P., R. H. Eckel, B. V. Howard, et al. Lyon Diet Heart Study: Benefits
of a Mediterranean-lifestyle, National Cholesterol Education Program/American
Heart Association Step I dietary pattern on cardiovascular disease. Circulation,
2001, 103, 1823–1825.
Lawson, J., and C. Willden. Mixture experiments in R using mixexp. Journal of
Statistical Software, 2016, 72, 1–20.
Lewis, S. M., A. M. Dean, N. R. Draper, and P. Prescott. Mixture designs for
q components in orthogonal blocks. Journal of the Royal Statistical Society.
Series B (Methodological), 1994, 56(3), 457–467.
McLean, R. A., and V. L. Anderson. Extreme vertices design of mixture
experiments. Technometrics, 1966, 8(3), 449.
Meadows, S. L., C. Gennings, W. H. Carter Jr., and D. S. Bae. Experimental
designs for mixtures of chemicals along fixed ratio rays–classic methodology
for detecting and characterizing departures from additivity, isobolograms.
Environmental Health Perspectives, 2002, 110(Suppl 6), 979.
Myers, R. H., D. C. Montgomery, and C. M. Anderson-Cook. Response Surface
Methodology, Process and Product Optimization Using Designed Experiments,
3rd ed. New York: John Wiley & Sons, 2009.
Piepel, G. F. Measuring component effects in constrained mixture experiments.
Technometrics, 1982, 25, 97–105.
Piepel, G. F. Programs for generating extreme vertices and centroids of linearly
constrained experimental regions. Journal of Quality Technology, 1988, 20(2),
125–139.
Piepel, G. F., and J. A. Cornell. Designs for mixture-amount experiments. Journal
of Quality Technology, 1987, 19(1), 11–28.
Roesler, R. R. How to bake the perfect cake. Paint & Coatings Industry, 2004.
Roudsari, S. F., R. Dhib, and F. Ein-Mozaffari. Mixing effect on emulsion
polymerization in a batch reactor. Polymer Engineering and Science, 2014,
55(4), 945–956.
Ruhlman, M. Ratio: The Simple Codes behind the Craft of Everyday Cooking, 2009,
Scribner, New York, p. ix.
Sahrmann, H. F., G. F. Piepel, and J. A. Cornell. In search of the optimum Harvey
Wallbanger recipe via mixture experiment techniques. American Statistician,
1987, 41, 190–194.
Scheffé, H. Experiments with mixtures. Journal of the Royal Statistical Society,
Series B (Methodological), 1958, 20(2), 344–360.
Scheffé, H. Simplex-centroid design for experiments with mixtures. Journal of the
Royal Statistical Society, Series B (Methodological), 1963, 25(2), 235–263.
Stat-Ease, Inc.
2021 East Hennepin Ave, Suite 480
Minneapolis, MN 55413
Telephone: 612-378-9449
Fax: 612-378-2152
E-mail: support@statease.com
Website: www.statease.com