Hayman: A Generalised Analysis of Diallel Crosses
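TABLE 1

Parents                          Genotype     AA           Aa             aa
                                 Frequency    α            β              γ
                                 Mean         +d           h              -d
Genotype   Frequency   Mean
AA         α           +d        Progeny      AA           AA:Aa          Aa
                                 Frequency    α²           αβ             αγ
                                 Mean         +d           ½(d + h)       h
Aa         β           h         Progeny      AA:Aa        AA:2Aa:aa      Aa:aa
                                 Frequency    αβ           β²             βγ
                                 Mean         ½(d + h)     ½h             ½(-d + h)
aa         γ           -d        Progeny      Aa           Aa:aa          aa
                                 Frequency    αγ           βγ             γ²
                                 Mean         h            ½(-d + h)      -d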
Note that FI and FII can be either both positive or both negative; all other compo-
nents of variation are positive. Substituting the above components of variation,
the variances and covariances become
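\[
\begin{aligned}
V_{P1} &= \tfrac{1}{2}D_{I} + \tfrac{1}{4}H_{III} - \tfrac{1}{4}F_{II} \\
V_{P2} &= \tfrac{1}{2}D_{I} + \tfrac{1}{16}H_{III} - \tfrac{1}{8}F_{II} \\
\bar{V}_{r} &= \tfrac{1}{8}D_{I} + \tfrac{1}{16}H_{I} - \tfrac{1}{16}F_{I} \\
V_{\bar{r}} &= \tfrac{1}{8}D_{I} + \tfrac{1}{16}H_{I} - \tfrac{1}{16}H_{II} - \tfrac{1}{16}F_{I} \\
\bar{W}_{P1/r} &= \tfrac{1}{4}D_{I} + \tfrac{1}{8}H_{III} - \tfrac{1}{8}H_{IV} - \tfrac{1}{16}F_{I} - \tfrac{1}{16}F_{II} \\
\bar{W}_{P2/r} &= \tfrac{1}{4}D_{I} + \tfrac{1}{16}H_{III} - \tfrac{1}{16}H_{IV} - \tfrac{1}{16}F_{I} - \tfrac{1}{32}F_{II} \\
W_{P1/P2} &= \tfrac{1}{2}D_{I} + \tfrac{1}{8}H_{III} - \tfrac{3}{16}F_{II}
\end{aligned}
\]
where $\bar{V}_{r}$ is the mean variance of arrays, $V_{\bar{r}}$ is the variance of array means, $\bar{W}_{P1/r}$ and
$\bar{W}_{P2/r}$ are the mean covariances of arrays with P1 and P2 respectively and $W_{P1/P2}$
is the covariance of P1 and P2.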
These formulae may be rearranged as follows:
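\[
\begin{aligned}
D_{I} &= 2V_{P1} + 8V_{P2} - 8W_{P1/P2} \\
H_{I} &= 4V_{P1} + 16V_{P2} + 16\bar{V}_{r} + 16\bar{W}_{P1/r} - 32\bar{W}_{P2/r} - 16W_{P1/P2} \\
H_{II} &= 16\bar{V}_{r} - 16V_{\bar{r}} \\
H_{III} &= 16V_{P1} + 16V_{P2} - 32W_{P1/P2} \\
H_{IV} &= 8V_{P1} - 16\bar{W}_{P1/r} + 16\bar{W}_{P2/r} - 8W_{P1/P2} \\
F_{I} &= 8V_{P1} + 32V_{P2} + 16\bar{W}_{P1/r} - 32\bar{W}_{P2/r} - 32W_{P1/P2} \\
F_{II} &= 16V_{P1} + 32V_{P2} - 48W_{P1/P2}
\end{aligned}
\]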
The components of variation have been defined such that in a random mating
population, they reduce to those of MATHER (l.c.). The only difference is that
MATHER’S D contains a small h component if there is inequality of allele frequency.
This component is herein represented separately as $F_{I}$, $F_{II}$, $H_{I}$ and $H_{II}$, following
the modification used by JINKS and HAYMAN (1953).
Thus under random mating:
\[
\begin{aligned}
D_{I} = D_{II} &= 4\sum uv\,d^{2} \\
H_{I} = H_{III} &= 8\sum uv(1 - 2uv)h^{2} \\
H_{II} = H_{IV} &= 16\sum u^{2}v^{2}h^{2} \\
F_{I} = F_{II} &= 16\sum uv(u - v)dh
\end{aligned}
\]
For random mating with unequal allele frequencies, $F_{I}$ and $F_{II}$ are only zero under
special conditions of opposing d’s and h’s.
Other properties of the components of variation are:
for equal allele frequencies, irrespective of the mating system,
$H_{I} = H_{II}$, $H_{III} = H_{IV}$ and $F_{I} = F_{II} = 0$
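The random-mating and equal-frequency properties above can be checked numerically for a single locus. The sketch below is illustrative only. It builds the family means of Table 1 from chosen values of α, β, γ, d and h, computes the seven statistics, and recovers the components from the rearranged formulae. It assumes that P1 is the parent's own value, that P2 is the mean of the parent's selfed family, that all statistics are frequency-weighted population moments, and that the coefficients of the rearranged formulae read as coded. The function names are hypothetical.

```python
def statistics(alpha, beta, gamma, d, h):
    """Seven diallel statistics for one locus, weighted by parental frequency."""
    freq = (alpha, beta, gamma)            # parents AA, Aa, aa
    send = (1.0, 0.5, 0.0)                 # chance of transmitting allele A
    p1 = (d, h, -d)                        # P1: parental values
    p2 = (d, h / 2, -d)                    # P2: selfed-family means (assumption)

    def family_mean(i, j):                 # Table 1 cell for the cross i x j
        a, b = send[i], send[j]
        return a * b * d + (a + b - 2 * a * b) * h - (1 - a) * (1 - b) * d

    def mean(x):
        return sum(f * v for f, v in zip(freq, x))

    def cov(x, y):
        return mean([a * b for a, b in zip(x, y)]) - mean(x) * mean(y)

    arrays = [[family_mean(i, j) for j in range(3)] for i in range(3)]
    array_means = [mean(row) for row in arrays]
    return {
        "Vp1": cov(p1, p1), "Vp2": cov(p2, p2), "Wp1p2": cov(p1, p2),
        "Vr": mean([cov(row, row) for row in arrays]),    # mean variance of arrays
        "Vrbar": cov(array_means, array_means),           # variance of array means
        "Wp1r": mean([cov(row, p1) for row in arrays]),   # mean covariance with P1
        "Wp2r": mean([cov(row, p2) for row in arrays]),   # mean covariance with P2
    }


def components(s):
    """Components of variation from the rearranged formulae (coefficients assumed)."""
    return {
        "D_I": 2 * s["Vp1"] + 8 * s["Vp2"] - 8 * s["Wp1p2"],
        "H_I": (4 * s["Vp1"] + 16 * s["Vp2"] + 16 * s["Vr"]
                + 16 * s["Wp1r"] - 32 * s["Wp2r"] - 16 * s["Wp1p2"]),
        "H_II": 16 * s["Vr"] - 16 * s["Vrbar"],
        "H_III": 16 * s["Vp1"] + 16 * s["Vp2"] - 32 * s["Wp1p2"],
        "H_IV": 8 * s["Vp1"] - 16 * s["Wp1r"] + 16 * s["Wp2r"] - 8 * s["Wp1p2"],
        "F_I": (8 * s["Vp1"] + 32 * s["Vp2"] + 16 * s["Wp1r"]
                - 32 * s["Wp2r"] - 32 * s["Wp1p2"]),
        "F_II": 16 * s["Vp1"] + 32 * s["Vp2"] - 48 * s["Wp1p2"],
    }


u, v, d, h = 0.7, 0.3, 1.0, 0.6                                        # illustrative values
random_mating = components(statistics(u * u, 2 * u * v, v * v, d, h))  # Hardy-Weinberg parents
equal_freq = components(statistics(0.3, 0.4, 0.3, d, h))               # alpha = gamma, not random mating
```

With these inputs `random_mating` can be compared against the closed forms in u and v, and `equal_freq` against the equalities stated for equal allele frequencies.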
which becomes h/d if the n loci have equal effects; whereas if the loci have unequal
effects, the estimated dominance ratio is automatically weighted in favour of those
with largest effect, with allele frequency nearest to 0.5, and with least heterozygosity
among the parents. As unequal effects will be the rule rather than the exception,
and since we can only deal with the combined effects of all the loci, the weighted
estimate will include relatively more information about loci of greater consequence
in these crosses than about loci at low frequency and/or of small effect.
In addition to the above, on similar grounds
The estimates of $h/d$ and $h/d^{2}$ can be solved simultaneously for h and d.
Average level of heterozygosity ($\beta$)
Of the estimates of this parameter which are available, some depend on com-
parisons within H’s, one on comparisons within F’s and another on comparisons
within D’s. Estimates depending on H’s will refer to the average level of heterozygosity
at loci showing dominance since the ratios are restricted to terms in $h^{2}$
and they are weighted in favour of loci showing greater dominance since such loci
contribute relatively more than ones showing little dominance. The estimate based
on F’s again involves only loci showing dominance, but weighting in favour of loci
showing greater dominance is less than in estimates based on H’s. The remaining
estimate, that using D’s, takes into account all loci, provided that they show no
linkage. This latter restriction is serious but necessary because DII, as shown later
herein, is derived from within family variance.
The estimates of $\beta$ are
The two estimates using H’s will be weighted in favour of loci with allele frequency
nearest to 0.5. In addition, loci with a low level of heterozygosity contribute relatively
more to these estimates than ones with a high level; hence underestimation of the
mean heterozygosity will result whenever $\beta$ has a different value for different loci,
which will in practice be the normal situation.
The three ratios
are useful in giving the relative values of $\beta$ and $Z$. They are not, however, independent
of the previous estimates involving H’s and must therefore only be used as
alternative, but not additional, estimates.
By solving the above equations for both $\beta$ and $Z$, it is also possible to estimate
the mean value of $\alpha\gamma$, since
\[ Z = \alpha\gamma + \tfrac{1}{4}\beta(1 - \beta) \]
Inbreeding coefficient of parental population (f)
In terms of the $\alpha\beta\gamma$ notation, the inbreeding coefficient is given by
\[ f = \frac{4\alpha\gamma - \beta^{2}}{(2\alpha + \beta)(\beta + 2\gamma)} \]
whence
These estimates of f are weighted in a manner comparable with the estimates of
$\beta$.
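As a rough single-locus illustration of the inbreeding coefficient, the sketch below evaluates f directly from genotype frequencies, and also from β together with Z. It assumes the relation Z = αγ + ¼β(1 − β) for the mean value of αγ, and uses the identity (2α + β)(β + 2γ) = 4αγ + 2β − β², which follows from α + β + γ = 1. The function names are hypothetical, and this is not necessarily the form of the estimates used in the analysis itself.

```python
def inbreeding_coefficient(alpha, beta, gamma):
    """f from the frequencies of AA, Aa and aa parents at one locus."""
    return (4 * alpha * gamma - beta ** 2) / ((2 * alpha + beta) * (beta + 2 * gamma))


def mean_alpha_gamma(z, beta):
    """alpha*gamma recovered from Z and beta, assuming Z = alpha*gamma + beta(1 - beta)/4."""
    return z - beta * (1 - beta) / 4


def inbreeding_from_z(z, beta):
    """Same f, written in terms of Z and beta only."""
    four_ag = 4 * mean_alpha_gamma(z, beta)
    return (four_ag - beta ** 2) / (four_ag + 2 * beta - beta ** 2)


# Random mating (Hardy-Weinberg proportions) should give f = 0,
# and a wholly homozygous parental population should give f = 1.
u, v = 0.6, 0.4
f_random = inbreeding_coefficient(u * u, 2 * u * v, v * v)
f_homozygous = inbreeding_coefficient(0.6, 0.0, 0.4)

# An intermediate case, computed both ways.
alpha, beta, gamma = 0.5, 0.2, 0.3
z = alpha * gamma + beta * (1 - beta) / 4
f_direct = inbreeding_coefficient(alpha, beta, gamma)
f_via_z = inbreeding_from_z(z, beta)
```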
Interpretation of the sign of $F_{I}$ and $F_{II}$
The sign is dependent on the composite sign of $(\alpha - \gamma)h$, both components of
which can take either sign.
Thus, $F_{I}$ and $F_{II}$ are positive when $(\alpha - \gamma)$ and $h$ are of like sign:

Sign of                Sign of    Therefore, there is       $F_{I}$ and $F_{II}$
$(\alpha - \gamma)$    $h$        an excess of              will be
+                      +          dominant increasers       +
+                      -          recessive increasers      -
-                      +          recessive decreasers      -
-                      -          dominant decreasers       +
a positive result will indicate an excess of increasers while the result will be negative
when decreasers are in excess.
provided that certain assumptions are made. Thus, assuming that the n loci involved
have equal effects and show unidirectional dominance, the expression becomes
\[ \frac{(2nZh)^{2}}{4nZ^{2}h^{2}} = n \]
This will be an underestimate of n if either or both assumptions are incorrect. Thus,
when all loci do not show unidirectional dominance, + and - values of h cancel
out in F1 and P2 means, with a consequent reduction in the estimated value of n.
In addition, inequality of effects at the n loci will also give a spuriously low estimate,
the reasons for which, in a comparable estimate, are discussed by MATHER (l.c.).
It is obvious that the present estimate, being restricted to a comparison of terms
in h, will give an estimate for loci showing dominance and will be biased in favour
of those of greater dominance. No similar comparison involving d’s has been found,
hence there is no available estimate of MATHER’S ‘effective factors’.
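The two sources of underestimation can be illustrated with a small numerical sketch. It assumes that, with unequal effects, the estimate takes the form (Σ 2Zh)² / (4 Σ Z²h²), with the same Z at every locus. This form is an assumption made for illustration, and the function name and the h values are hypothetical.

```python
def loci_estimate(h_values, z=0.25):
    """Apparent number of loci from per-locus h values (assumed form of the estimate)."""
    numerator = sum(2 * z * h for h in h_values) ** 2
    denominator = 4 * sum((z * h) ** 2 for h in h_values)
    return numerator / denominator


equal_unidirectional = loci_estimate([1.0, 1.0, 1.0, 1.0])   # both assumptions hold
mixed_direction = loci_estimate([1.0, 1.0, 1.0, -1.0])       # one locus with opposite dominance
unequal_effects = loci_estimate([2.0, 1.0, 0.5, 0.5])        # same direction, unequal sizes
```

Under these assumptions the first case returns the true number of loci (four), while the other two return smaller values, because the ratio reduces to (Σh)²/Σh², which cannot exceed n.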
Linkage
The generalised diallel analysis, as described above, has been restricted to the
analysis of family-means, and at the F1 level, the use of family-means avoids the
consequences of linkage. If, however, individuals are used instead of family-means,
linkage phenomena and their resulting complications become involved.
Calculation of the total variance of individuals in a diallel is possible in a manner
comparable to that for means. This variance will include an environmental component
different from that for variance of means, in this case being the one for individuals,
$E_{1}$ (MATHER, l.c.).
The variance of individuals about the grand mean ($V_{t}$) is,
\[ \tfrac{1}{4}D_{I} + \tfrac{1}{4}D_{II} + \tfrac{1}{8}H_{I} - \tfrac{1}{16}H_{II} + \tfrac{1}{16}H_{III} + \tfrac{1}{4}\sum\beta h^{2} - \tfrac{1}{8}F_{I} - \tfrac{1}{8}F_{II} \]
where $D_{II}$ (as defined on page 67) is $2\sum\beta d^{2}$. This variance is computed on the
assumption of no linkage. If linkage is an important element, the actual variance
will deviate from this expectation. It has not been possible to specify whether preponderant
coupling or repulsion linkage will increase or decrease the actual variance.
This difficulty arises from the joint effect of the frequencies of the various parental
genotypes and the particular linkage conditions in each parent.
Theoretically, it should be possible to detect linkage if the several H’s, F’s and
$D_{I}$ (which have been computed from the family-means) are substituted in the
above formula for $V_{t}$; this will only be possible, however, if no other source of bias
is present, such as unequal effects. The item $(\tfrac{1}{2}\sum\beta d^{2} + \tfrac{1}{4}\sum\beta h^{2})$ can then be determined
both by remainder and also by substitution of $\beta$, $d^{2}$ and $h^{2}$ which have
been estimated. The value of this item should be the same by remainder as by
substitution if linkage is absent, if allele frequencies are uncorrelated and if all loci
show dominance and have equivalent effects. Any difference between “remainder”
and “substitution” values will indicate the general level of bias from the combined
sources or, in the special case of equal effects and uncorrelated allele frequencies,
linkage will be indicated.
The accuracy of the method outlined above will not, however, be very great,
since the components of variation will have relatively large standard errors. No
significance tests have, however, been devised.
In a diallel between homozygous lines the two new items $\tfrac{1}{2}\sum\beta d^{2} + \tfrac{1}{4}\sum\beta h^{2}$ are both
equal to zero. The linkage test, therefore, is simply one of homogeneity of the remaining
components of variation derived from the analysis of family means and
those from the family variances of segregating families. Since segregating families
do not appear in the parents and first generation progeny of a homozygous diallel
this test cannot be applied until at least segregating generations such as F2’s have
been raised.
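The point that a homozygous diallel contains no segregating families in the first generation can be illustrated by computing the genetic variance within each family of Table 1. The sketch below is a single-locus illustration under assumed values. It takes the segregation ratios from Table 1 and weights each cross by the product of the parental frequencies. The function names are hypothetical.

```python
def within_family_variance(gp, gq, d, h):
    """Genetic variance among the progeny of one cross.
    gp, gq = number of increasing alleles (0, 1 or 2) carried by each parent."""
    p, q = gp / 2, gq / 2                                # chance of transmitting allele A
    probs = (p * q, p + q - 2 * p * q, (1 - p) * (1 - q))   # progeny AA, Aa, aa
    values = (d, h, -d)
    mean = sum(pr * val for pr, val in zip(probs, values))
    return sum(pr * (val - mean) ** 2 for pr, val in zip(probs, values))


def mean_within_family_variance(alpha, beta, gamma, d, h):
    """Within-family variance averaged over all crosses of the diallel."""
    freq = {2: alpha, 1: beta, 0: gamma}                 # parents AA, Aa, aa
    return sum(freq[a] * freq[b] * within_family_variance(a, b, d, h)
               for a in freq for b in freq)


homozygous_parents = mean_within_family_variance(0.6, 0.0, 0.4, d=1.0, h=0.5)
mixed_parents = mean_within_family_variance(0.5, 0.3, 0.2, d=1.0, h=0.5)
additive_only = mean_within_family_variance(0.5, 0.3, 0.2, d=1.0, h=0.0)
```

With no heterozygous parents the within-family variance vanishes, and with h = 0 it reduces to ½βd², that is, one quarter of $D_{II}$ as defined in the text.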
Non-allelic interaction
In all the statistics which have been developed non-allelic interaction will appear
as mimicking allelic interaction (i.e. dominance, h ≤ d, or overdominance, h > d).
Thus, for instance, duplicate factor interaction will appear as h ≤ d, whilst complementary
gene action will appear as h > d. Other aspects of non-allelic interaction
are considered in the section immediately following.
The regression of array covariance ($W_{r}$) on array variance ($V_{r}$)
A brief summary of the conclusions of JINKS (1954) and HAYMAN (1954) relating
to the homozygous analysis will form a useful basis for discussion of the present
general case. Thus, with parental homozygosity, and in the absence of non-allelic
interaction, the regression of $W_{r}$ on $V_{r}$ represents a line of unit slope for any particular
level of dominance, whilst in the absence of dominance the regression line
degenerates to a single point (see fig. 1). All points must, on mathematical grounds,
be within a limiting parabola, as indicated in figure 1. The point of intersection of
the regression line with the covariance axis, relative to the position of $\tfrac{1}{4}V_{P1}$ on that
axis, can be used to estimate the overall degree of dominance. The order of parents
along the regression line indicates the relative proportion of dominants to recessives
in the corresponding parents: points with lower values of $W_{r}$ and $V_{r}$ have the
greater proportion of dominants. Non-allelic interaction, particularly complementary
gene action, results in deviation of the points corresponding to the arrays of inter-
acting parents from the expected regression line. Hence, in such cases, the actual
regression line will usually deviate from unit slope.
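The homozygous case can be illustrated with a small model. The sketch below is illustrative only: it assumes two loci of equal effect with partial dominance, four equally frequent homozygous parents, population (divide-by-N) variances and covariances, and no environmental variation. The names and numerical values are hypothetical. It computes the ($V_{r}$, $W_{r}$) point of each array, the regression of $W_{r}$ on $V_{r}$, and the degree of dominance from the intercept X relative to Y = ¼$V_{P1}$, taking the required ratio as √((Y − X)/Y).

```python
from itertools import product

D_EFFECT, H_EFFECT = 1.0, 0.5          # assumed per-locus d and h (h/d = 0.5)


def family_mean(p, q):
    """F1 family mean for two homozygous parents coded +1 (AA) or -1 (aa) per locus."""
    return sum(a * D_EFFECT if a == b else H_EFFECT for a, b in zip(p, q))


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])


parents = list(product((+1, -1), repeat=2))                   # AABB, AAbb, aaBB, aabb
parent_values = [sum(a * D_EFFECT for a in p) for p in parents]

points = []
for r in parents:                                             # one array per common parent
    array = [family_mean(r, q) for q in parents]
    points.append((cov(array, array), cov(array, parent_values)))   # (Vr, Wr)

vr = [pt[0] for pt in points]
wr = [pt[1] for pt in points]
slope = cov(vr, wr) / cov(vr, vr)
x_intercept = mean(wr) - slope * mean(vr)                     # X: intercept on the Wr axis
y_position = cov(parent_values, parent_values) / 4            # Y: one quarter of V_P1
dominance = ((y_position - x_intercept) / y_position) ** 0.5  # sqrt(XY/OY)
```

Under these assumptions the slope is unity and the estimated degree of dominance equals the h/d ratio put into the model.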
Returning to our main theme, in the generalised case, a two-gene, two-allele model
can usefully illustrate the consequences of heterozygosity of one or more of the
parents. The nine parents in the diallel model show all combinations of increasing
and decreasing alleles of two genes, A and B. For simplicity, both genes are shown
with equal effects and interaction between loci is excluded. Family means are used
in the calculation of array variances and covariances. In the graphs (fig. 2 and 3)
A and B represent dominant increasing alleles and a and b represent recessive decreasing
alleles, for the various levels of h/d.
A. G. DICKINSON AND J. L. JINKS
FIGURE 1. Regression lines in the absence of non-allelic interaction. Complementary gene action shifts points corresponding to the arrays of any interacting parents, as shown by arrows.
FIGURE 2. $W_{r}/V_{r}$ graph; labels: complete heterozygote, regression line.
FIGURE 3. $W_{r}/V_{r}$ graph.
In figure 2, the $W_{r}/V_{r}$ graph is shown of the model for full dominance. Points
representing homozygous parents are exactly as in a homozygous-only diallel; thus
a line through such points is of unit slope. The complete heterozygote (AaBb) lies
on the limiting parabola at a point which, in a homozygous diallel, represents that
for parents showing no dominance. The points for partial heterozygotes are interspersed
between that for the complete heterozygote, on the one hand, and, on the
other hand, those for homozygous parents. The actual regression line for the points
in figure 2 is illustrated; its slope deviates from unity (b = 0.92 ± 0.01) even though
the model excludes non-allelic interaction. In an actual cross, the regression line will
only rarely be of unit slope; the actual slope of the line will depend on the heterozygosity
of the parents present (excluding interaction). The order of points along
the regression line indicates the relative proportions of dominants to recessives in
each parent, just as with the homozygous analysis. In practice, however, this will
be more difficult to determine in the generalised case due to the scatter of points
about the regression line.
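The departure from unit slope can be reproduced in outline with the two-gene model. The sketch below makes several assumptions that are not stated in the text: the nine parents are taken as equally frequent, $W_{r}$ is the covariance of each array with the parents' own values, variances and covariances are population (divide-by-N) moments, and the regression is an unweighted fit over the nine arrays. Function names are hypothetical.

```python
from itertools import product

D, H = 1.0, 1.0        # equal effects at both loci, full dominance (h = d)


def family_mean(gp, gq):
    """Single-locus progeny mean (Table 1); gp, gq = number of increasing alleles."""
    p, q = gp / 2, gq / 2
    return p * q * D + (p + q - 2 * p * q) * H - (1 - p) * (1 - q) * D


def parent_value(g):
    return (-D, H, D)[g]              # aa, Aa, AA


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])


def wr_vr_points(parents):
    """(Vr, Wr) for the array of each parent in the diallel."""
    values = [sum(parent_value(g) for g in par) for par in parents]
    points = []
    for r in parents:
        array = [sum(family_mean(a, b) for a, b in zip(r, q)) for q in parents]
        points.append((cov(array, array), cov(array, values)))
    return points


def regression(points):
    """Slope and intercept of the regression of Wr on Vr."""
    vr = [pt[0] for pt in points]
    wr = [pt[1] for pt in points]
    slope = cov(vr, wr) / cov(vr, vr)
    return slope, mean(wr) - slope * mean(vr)


nine_parents = list(product((0, 1, 2), repeat=2))          # all genotypes at A and B
homozygous_only = [p for p in nine_parents if 1 not in p]  # AABB, AAbb, aaBB, aabb

slope_all, intercept_all = regression(wr_vr_points(nine_parents))
slope_hom, intercept_hom = regression(wr_vr_points(homozygous_only))
```

Under these assumptions the nine-parent slope works out at 24/26, about 0.92, which agrees with the value quoted for figure 2, while the homozygous-only subset gives unit slope through the origin. The nine-parent intercept is positive, which on the homozygous interpretation would suggest incomplete dominance even though dominance is complete in the model.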
Figure 3 is similar to figure 2 but includes various levels of dominance: it will be
evident that points for heterozygous parents always lie to the left of and above those
for homozygous parents. With an increasing proportion of heterozygosity in the
parents, for any particular level of dominance, there is a proportional shifting of the
regression line upwards and to the left. The effect of this shift is to simulate lower
levels of dominance.
Except in cases of extreme over-dominance, the points for heterozygous parents
lie in a relatively restricted area, the long-axis of which is roughly parallel to a line
of unit slope. Therefore, the deviation of the regression line from unit slope due to
heterozygosity will not usually be very great, though the deviation may be sig-
nificant. As mentioned above, it has been shown that complementary gene action
in particular, can cause extreme deviation of the corresponding point to the right
of and below the expected regression line. Hence it is considered that any interaction
causing extreme deviation, will not be confused with heterozygosity using this
regression method, but confusion will result when any such interaction is relatively
small in effect compared with the level of dominance.
In the homozygous diallel, in the absence of non-allelic interaction, the level of
dominance can be deduced using the $W_{r}/V_{r}$ graph from the point of intersection
(X) of the regression line with the $W_{r}$ axis, relative to the position (Y) of $\tfrac{1}{4}V_{P1}$ on
that axis: the necessary relationship is $\sqrt{XY/OY}$, where O is the origin of the
$W_{r}$ axis (HAYMAN 1954). With homozygosity, after correction for $E_{2}$, $\tfrac{1}{4}V_{P1}$ is $\sum uv\,d^{2}$
but with parental heterozygosity it becomes
\[ \sum\left[\alpha\gamma + \tfrac{1}{4}\beta(1 - \beta)\right]d^{2} + \sum\left[\tfrac{1}{4}\beta(1 - \beta)\right]h^{2} - \sum\left[\tfrac{1}{2}\beta(\alpha - \gamma)\right]dh \]
in which case it is, in general, impracticable to deduce h/d from $\sqrt{XY/OY}$ due to
complexity in the position of Y. Also, as discussed earlier herein, the position of X
will always tend to underestimate dominance when heterozygosity is involved.
Scaling Tests
In the homozygous diallel, $(W_{r} - V_{r})$ is constant over arrays in the absence of
non-allelic interaction: if there is a significant regression of $(W_{r} - V_{r})$ on the corresponding
parental and array means, rescaling is necessary (HAYMAN 1954). In the
generalised diallel there is heterogeneity of $(W_{r} - V_{r})$ over arrays due to the heterozygous
arrays, thereby vitiating any scaling test using this parameter. No suitable
F1 scaling test has been devised.
DISCUSSION
Before summarising the generalised method of diallel analysis which has been
described, some discussion is required of its application to both theoretical and
practical problems. The theoretical applications will be taken first since they require
the briefer description in this paper.
The parental model used in deriving the foregoing analysis is representative of
any population quite independently of whether its mating system is known or not.
Based on this model, it is therefore possible to calculate correlations between relatives
SUMMARY
A method is given for the analysis of quantitative data from a diallel cross using
as parents any type of material, homozygous or heterozygous. The method repre-
sents an extension to heterozygous crosses of the one developed by JINKS (1954) and
HAYMAN (1954) for homozygous material and the theory is discussed in terms of
components of variation similar to those of MATHER (1949).
The analysis provides estimates of the overall degree of dominance, of the inbreeding
coefficient or degree of heterozygosity of loci showing dominance and of the
allele frequency at such loci. Whether dominants or recessives are in excess can also
be determined. The effect of non-allelic interaction on the statistics is discussed,
together with a means of detecting such interaction.
Departures from the homozygous analysis are noted throughout and several
instances are mentioned where heterozygosity, non-allelic interaction and linkage
are confounded in particular statistics.
In conclusion some practical and theoretical applications of the method are
considered.
ACKNOWLEDGMENTS
LITERATURE CITED
HAYMAN, B. I., 1954 The theory and analysis of diallel crosses. Genetics 39: 789-809.
HAYMAN, B. I., and K. MATHER, 1953 The progress of inbreeding when homozygotes are at a
disadvantage. Heredity 7: 165-183.
JINKS, J. L., 1954 The analysis of continuous variation in a diallel of Nicotiana rustica varieties.
Genetics 39: 767-788.
JINKS, J. L., and B. I. HAYMAN, 1953 The analysis of diallel crosses. Maize Genetics Coop. News
Letter 27: 48-54.
LERNER, I. M., 1950 Population Genetics and Animal Improvement. Cambridge Univ. Press.
1954 Genetic Homeostasis. Edinburgh, Oliver & Boyd.
MATHER, K., 1949 Biometrical Genetics. London, Methuen.
WRIGHT, S., 1923 Mendelian analysis of pure breeds of livestock. I. The measurement of inbreeding
and relationship. J. Heredity 14: 339-348.
1935 The analysis of variance and the correlations between relatives with respect to deviations
from an optimum. J. Genet. 30: 243-256.
1952 In Quantitative Inheritance. Ed. Reeve & Waddington, H.M.S.O., London.