BULLETINS
OF THE
Zoological Society of San Diego
No. 17
Four Papers on the Applications of Statistical
Methods to Herpetological Problems
I. THE FREQUENCY DISTRIBUTIONS OF CERTAIN
HERPETOLOGICAL VARIABLES
II. ILLUSTRATIONS OF THE RELATIONSHIP BETWEEN
POPULATIONS AND SAMPLES
III. THE CORRELATION BETWEEN SCALATION AND LIFE
ZONES IN SAN DIEGO COUNTY SNAKES
IV. THE RATTLESNAKES LISTED BY LINNAEUS IN 1758
By L. M. KLAUBER
Consulting Curator of Reptiles, Zoological Society of San Diego
SAN DIEGO, CALIFORNIA
October 15, 1941
ZOOLOGICAL SOCIETY OF SAN DIEGO
BOARD OF DIRECTORS
W. C. Crandall, President
F. L. Annable, First Vice-President
Gordon Gray, Second Vice-President
Fred Kunzel, Secretary
T. M. Russell, Treasurer
Thos. O. Burger, M. D.
C. F. Cotant
J. Waldo Malmberg
F. T. Olmstead
Mrs. Robert P. Scripps
Robert J. Sullivan
Milton Wegeforth
HONORARY VICE-PRESIDENTS
G. Allan Hancock
Osa Johnson
Fred E. Lewis
STAFF OF ZOOLOGICAL GARDEN
Mrs. Belle J. Benchley, Executive Secretary
Ralph Virden, Superintendent of Maintenance
C. B. Perkins, Herpetologist
Milton Leeper, Head Gardener
Karl L. Koch, Ornithologist
Mrs. Lena P. Crouse, Head of Educational Department
Peter March, Head Keeper
Frank D. McKenney, D.V.M., Veterinary Pathologist
THE PURPOSES OF THE SOCIETY
1. To advance a sincere and scientific study of nature.
2. To foster and stimulate interest in the conservation of wild life.
3. To maintain a permanent Zoological Exhibit in San Diego.
4. To stimulate public interest in the building and maintenance of a
Zoological Hospital.
5. To provide for the delivery of lectures, exhibit of pictures and
publication of literature dealing with natural history and science.
6. To operate a society for the mutual benefit of its members for
non-lucrative purposes.
TABLE OF CONTENTS
Page
I. The Frequency Distribution of Certain Herpetological
Variables 5
Introduction 5
Importance of Homogeneity 6
Tests of Normality
Scale Rows 8
Ventrals 11
Subcaudals 14
Labials and Other Head Scales 16
Lizard Scales 19
Turtle Scutes 19
Pattern 20
Broods 20
Rattles 21
Hemipenes 21
Measurements 21
Illustrative Sampling 25
Acknowledgments 27
Summary 27
Appendix 28
Bibliography of Statistical Texts 28
II. Illustrations of the Relationship Between Populations
and Samples 33
Introduction 33
Artificial Populations 35
The Methods of Random Sampling 41
Variations of the Mean 42
Ranges of Variation 47
Dispersions of Samples Compared to Those of Populations 53
Sampling of Alternative Attributes 70
Summary 71
III. The Correlation Between Scalation and Life Zones in
San Diego County Snakes 73
IV. The Rattlesnakes Listed by Linnaeus in 1758 81
Introduction 81
Past Usages 81
Linnaeus’ Method and Type Specimens 82
Studies of Scale Counts 83
Horridus 87
Dryinas 89
Durissus 90
Conclusions 92
Acknowledgments 92
Summary 92
Bibliography 93
Klauber: Frequency Distributions
5
APPLICATIONS OF STATISTICAL METHODS TO
HERPETOLOGICAL PROBLEMS
I. THE FREQUENCY DISTRIBUTIONS OF CERTAIN
HERPETOLOGICAL VARIABLES
Introduction
The methods of mathematical statistics may be used to advantage in the
investigation of a number of herpetological problems; they will often va-
lidate conclusions as to species relationships, morphology, ontogeny, and
genetics. They are particularly valuable in assessing the significance of
differences, relative degrees of variation, and the reality of correlations.
But the accuracy of several of the formulas most commonly used depends
to some extent on the closeness of adherence of the distribution of the
variates to the normal probability curve.¹ For example, in taxonomic
problems, one of the most frequently used formulas is that for determining the
significance of the difference between two means, or the related problem
of the probability that two samples were drawn from the same population
and therefore represent the same species. This formula assumes a normal
distribution of the population variates, although giving satisfactory results
with moderate departures from normality.² Similarly, normality is assumed
in applying the correlation coefficient.³ Certain descriptive indicators, such
as the interquartile range, do not give a satisfactory picture of a distribu-
tion unless that distribution is substantially normal.
Since, in taxonomy, we are interested primarily in the population which
a sample represents, rather than the sample itself, it is usually desirable to
have some indication of the probability that the sample was drawn from a
normally distributed population. For the fortuities of sampling cause devia-
tions from a normal distribution in the sample, even though the population
from which the sample has been drawn is normally distributed. In herpeto-
logical work of the kind here under consideration, the sample may com-
prise from one to several hundred specimens of preserved laboratory material
available for study; the population is the much larger, but unknown, group
of live animals which were in the wild at the time the sample specimens were
collected.
¹ For references covering the statistical terms and methods used see the appendix and
bibliography. In this discussion it is to be understood that a "normal" distribution is one
in which the frequency distribution of the variates follows the normal probability curve.
² Kenney, vol. 2, p. 141.
³ Treloar, p. 104.
6
Bulletin 17 : Zoological Society of San Diego
It is the purpose of this paper to set forth the results of an investigation
of some typical herpetological distributions of scale counts, morphological
features, and pattern to see whether normal distributions are frequent; and
particularly to determine the nature of the deviations from normal in cer-
tain especially important characters. For if substantially normal distribu-
tions are the rule, as demonstrated in large samples representing these
characters (ventral scale counts, for example), we may with some assur-
ance presume normality in taxonomic problems involving closely related
species, even though the available samples are too small to warrant final
conclusions with respect to normality. (Larger samples permit greater
assurance than smaller as to the probability of non-normality in the basic
population.) If it be indicated that the variates in a certain character are
distributed normally in several species, we need not investigate completely
the distribution in some related form; if a visual inspection shows the dis-
persion to be substantially normal we may safely use any formulas which
give accurate results with approximate normality, as, for example, that
for determining the significance of the difference between averages.
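The test for the significance of the difference between two averages, referred to above, may be sketched as follows. This is a minimal illustration only: the ventral counts are invented, the function names are hypothetical, and the use of the normal curve to obtain the probability is an assumption suited to fairly large samples (for small samples a t-distribution would ordinarily be substituted).

```python
import math
import statistics


def mean_difference_ratio(sample_a, sample_b):
    """Return (difference of means, its standard error, their ratio)."""
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    # Squared standard error of each mean: sample variance divided by n.
    se_sq_a = statistics.variance(sample_a) / len(sample_a)
    se_sq_b = statistics.variance(sample_b) / len(sample_b)
    se_diff = math.sqrt(se_sq_a + se_sq_b)
    diff = mean_a - mean_b
    return diff, se_diff, diff / se_diff


def two_sided_normal_p(ratio):
    """Two-sided tail probability of a standard normal deviate."""
    return math.erfc(abs(ratio) / math.sqrt(2.0))


# Hypothetical ventral counts, invented for illustration only.
series_a = [170, 172, 168, 171, 169, 173, 170, 171, 169, 172]
series_b = [175, 174, 177, 176, 173, 178, 175, 176, 174, 177]

diff, se_diff, ratio = mean_difference_ratio(series_a, series_b)
p_value = two_sided_normal_p(ratio)
print(f"difference = {diff:.2f}, standard error = {se_diff:.3f}, P = {p_value:.2g}")
```

A small P here would suggest that the two series were not drawn from the same population, subject to the assumption of approximate normality discussed in the text.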
Importance of Homogeneity
In an investigation of the shape of a dispersion curve we should be sure
of the homogeneity of the sample, otherwise an inaccurate conclusion may
be drawn. Care must be exercised not to complicate the situation by the
introduction of extraneous variables or stratification. Thus, if sexual
dimorphism be present, the sexes should be tested separately; for if the
ventral scutes in each sex of a certain snake be distributed normally, but
there is a sex difference, then combining the sexes will produce a platykurtic
(flat-topped) distribution, or even one which is bimodal. Hence, in such
a combination a non-normal result may merely be an inefficient proof
of sexual dimorphism. Similarly, geographically widespread samples are
usually to be avoided in testing dispersion curves, for kurtosis or skewness
may be only a cumbersome proof of geographic variation or incipient
speciation.
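The flattening effect of combining the sexes can be shown with a short calculation. The sketch below assumes, for illustration only, that the two sexes are equally numerous and that each is normally distributed with the same standard deviation; the function name and the numerical values are hypothetical. Under those assumptions the excess kurtosis of the pooled distribution is never positive, and the pooled curve becomes bimodal once the sex difference exceeds twice the standard deviation.

```python
def pooled_sexes_shape(sex_difference, sd):
    """Shape of an equal mixture of two normal curves.

    sex_difference: difference between the male and female means.
    sd: common standard deviation within each sex.
    Returns (excess kurtosis of the pooled curve, whether it is bimodal).
    """
    d = sex_difference / 2.0  # each mean lies d from the pooled mean
    variance = sd ** 2 + d ** 2
    fourth_moment = 3 * sd ** 4 + 6 * sd ** 2 * d ** 2 + d ** 4
    excess_kurtosis = fourth_moment / variance ** 2 - 3.0
    # An equal mixture of two equal-variance normals has two modes only
    # when the means differ by more than two standard deviations.
    bimodal = abs(sex_difference) > 2 * sd
    return excess_kurtosis, bimodal


# Hypothetical values: within-sex standard deviation of 3 ventrals.
for difference in (0, 6, 12):
    kurt, bimodal = pooled_sexes_shape(difference, 3.0)
    print(f"sex difference {difference:2d}: excess kurtosis {kurt:+.2f}, bimodal {bimodal}")
```

With no sex difference the pooled curve remains normal (excess kurtosis zero); as the difference grows the curve first becomes flat-topped and then two-peaked.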
If there be doubt as to the existence of sexual or territorial dimorphism,
one of the usual tests for the significance of differences should be made
before treating the entire collection as a homogeneous unit. This may
seem like arguing in a circle, since one of the purposes of determining
normality is to validate the significance test. However, the latter is sub-
stantially accurate even with some departure from normality, provided
the distribution is unimodal and not strongly skewed. Sometimes a
platykurtic distribution may in itself suggest an unrecognized hetero-
geneity, if the same character is known to be normally distributed in
other populations. For example, if the ventral scale counts of male rattle-
snakes ordinarily have a normal distribution and they are found to be
markedly platykurtic in a certain territory, one might suspect the presence
of an unrecognized species confused with the one being investigated.
Such a test would have indicated the composite character of Crotalus
cinereous in Arizona, before the recognition and acceptance of C. scutulatus
as a valid species. However, the same result can usually be achieved some-
what more simply by comparing the coefficients of variation in suspected
populations with samples from other areas wherein the populations are
assuredly homogeneous.
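The comparison of coefficients of variation may be sketched in this way. Both frequency distributions below are invented for illustration, and the function name is hypothetical; distributions are written as a mapping from the value of the variate to its frequency.

```python
import math


def coefficient_of_variation(freq):
    """Coefficient of variation, in per cent, of a frequency distribution.

    freq maps each value of the variate to its frequency of occurrence.
    """
    n = sum(freq.values())
    mean = sum(x * f for x, f in freq.items()) / n
    variance = sum(f * (x - mean) ** 2 for x, f in freq.items()) / (n - 1)
    return 100.0 * math.sqrt(variance) / mean


# Hypothetical ventral counts, invented for illustration only.
reference_series = {168: 5, 169: 12, 170: 20, 171: 12, 172: 5}
suspected_series = {166: 8, 168: 15, 170: 8, 172: 15, 174: 8}

cv_reference = coefficient_of_variation(reference_series)
cv_suspected = coefficient_of_variation(suspected_series)
print(f"reference V = {cv_reference:.2f} per cent")
print(f"suspected V = {cv_suspected:.2f} per cent")
```

A coefficient in the suspected series markedly above that of series known to be homogeneous would be the indication of a composite population.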
Sometimes it may be desired to study distributions within a species or
subspecies as a whole, even though territorial variations are known to
exist intraspecifically. In such cases an attempt should be made to draw
equal samples from each territorial or other element of the population,
lest the result be distorted by the over-emphasis of the more numerous
sections of the general sample.
Accuracy, in making counts and in sexing, is essential if the results are
to have value. For example, if there be sexual dimorphism in a certain
character, inaccurate sexing will cause the distribution in each sex to be
skewed toward the other. Also, there must be a uniform method of making
counts. Herpetological methods are not highly standardized; if data
are accumulated from more than one source there must be assurance that
uniform rules were employed in deciding questionable counts.
Tests of Normality
Two types of tests of normality are in current use: (1) the chi-square
test for goodness of fit; and (2) the comparison of certain moments of
the sample distribution with the corresponding moments of the normal
curve. Although involving somewhat less computation than a test of
moments, the chi-square method has been criticized because of the neces-
sity of grouping the small edge-frequencies, and because it ignores the
signs and distribution of the differences of the class frequencies from
normal; that is, it does not distinguish between kurtosis and skewness, or
their directions: lepto- from platykursis, or positive from negative
skewness.⁴ Both of these methods will be found discussed at length in statistical
texts (see the appendix and bibliography) together with the nomenclature
of aberrations from the normal curve (Figs. 1-5). Graphic methods, such as
plotting the points of a distribution against the normal curve of best fit,
either on rectangular co-ordinates or using a probability scale, will also
give a picture of the fit, although they will not indicate the probability
that the parent population is non-normal.⁵ In fact, it will often be found
worthwhile to plot the theoretical against the actual frequencies, after a
chi-square calculation has been made, in order to visualize the departure
⁴ Geary and Pearson, p. 1. Although the chi-square test is criticized because of the
subjective factor involved in grouping edge-classes, there are cases where one or two freak
specimens too greatly affect moment determinations. These, often juveniles which probably
would not survive, should usually be eliminated.
⁵ Codex Book Co. Arithmetic Probability Paper No. 3127 will be found useful; also the
Otis Normal Percentile Chart of the World Book Co. Pearl (1940) p. 382 suggests
another method of making a visual comparison between an empirical distribution and the
normal curve.
from normality. This is of particular value if one desires to find whether
a certain attribute maintains a similar quality of deviation from the normal
curve through several species; that is, whether this kind of deviation is
characteristic of the attribute. Occasionally it will be desired to
investigate such characteristic deviations further by fitting to other curves than
the normal probability curve. See Croxton and Cowden, pp. 293-304;
Elderton, pp. 58-127.
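A test by moments of the general kind described above may be sketched as follows. The standard errors used here, the square roots of 6/n and 24/n, are the familiar large-sample approximations and are an assumption of this sketch; they are not necessarily the exact procedure of the texts cited. The frequency distribution and the function name are hypothetical.

```python
import math


def moment_shape_test(freq):
    """Skewness and kurtosis of a frequency distribution, with
    approximate two-sided probabilities under normality.

    freq maps each value of the variate to its frequency of occurrence.
    """
    n = sum(freq.values())
    mean = sum(x * f for x, f in freq.items()) / n

    def central_moment(k):
        return sum(f * (x - mean) ** k for x, f in freq.items()) / n

    m2, m3, m4 = central_moment(2), central_moment(3), central_moment(4)
    skewness = m3 / m2 ** 1.5           # positive: long tail to the right
    excess_kurtosis = m4 / m2 ** 2 - 3  # positive: lepto-, negative: platy-

    # Large-sample standard errors under normality (an approximation).
    se_skewness = math.sqrt(6.0 / n)
    se_kurtosis = math.sqrt(24.0 / n)

    def two_sided_p(ratio):
        return math.erfc(abs(ratio) / math.sqrt(2.0))

    return {
        "skewness": skewness,
        "excess_kurtosis": excess_kurtosis,
        "p_skewness": two_sided_p(skewness / se_skewness),
        "p_kurtosis": two_sided_p(excess_kurtosis / se_kurtosis),
    }


# Hypothetical symmetrical distribution, invented for illustration only.
result = moment_shape_test({8: 1, 9: 4, 10: 6, 11: 4, 12: 1})
print(result)
```

Unlike the chi-square test, the two ratios carry signs, so the direction of any departure from the normal curve is preserved.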
Scale Rows
The tapering bodies of most species of snakes are usually correlated
with changes in the number of dorsal scale rows. In order to discuss the
shape of the dispersion curve of this character it is necessary to define
the particular count which is to be used as a basis of investigation.
In most taxonomic work either the number of rows at mid-body is
cited, or the number at the neck, at mid-body, and just before the vent.
There are four types of suppression (or lack of suppression) of scale
rows between the neck and the vent: (1) a constant number; (2) a con-
stant number from neck to mid-body, followed by a decrease to the vent;
(3) a continuously decreasing number from neck to vent; (4) an increase
from neck (at its point of least diameter) to mid-body, followed by a
decrease toward the vent. While many species, and even genera, adhere
to only one of these methods, others may follow two or more, since,
after all, there is no very sharp line between them. In fact, there is not
entire agreement as to the definitions of the three points at which the
rows are to be counted; usually they are (1) on the neck one head-length
posterior to the hinge of the jaw; (2) at mid-body half-way between the
head and vent; and (3) a short distance anterior to the vent, to avoid the
irregularities involved in the considerable diminution in body diameter
at that point.
Complete studies of scale rows involve, not a determination of the num-
ber of rows at these arbitrary and somewhat ill-defined points, but rather
a two-dimensional picture presenting the sequence number of each row
suppressed (considering the row bordering the ventrals as No. 1), and
the point of suppression, the latter being located by the number of the
ventral scute (counting from the head toward the tail) opposite which
the suppression occurs. This introduces a more complicated set of variables
than can be the subject of the present investigation which, therefore, will
be restricted to the number of rows at mid-body. But the term will be
used in the rather broad sense of the maximum rows evident in a trans-
verse band at approximately the central part of the body, rather than at
a single carefully determined mid-point. This will avoid variations pro-
duced by a rigid definition rather than a true condition of the scale rows;
it will not differentiate between suppression just anterior or posterior to
the exact mid-body.
Another difficulty in dealing with the distribution of scale rows lies in
the strong tendency toward uneven rows in nearly all genera. This results
from the fact that most counts are made up of a mid-dorsal row and two
equal sets of lateral rows on either side, thus producing an odd-numbered
total. Thus only in genera wherein the mid-dorsal row is occasionally sup-
pressed, as for example in Trimorphodon, or in individuals having unequal
numbers of lateral rows, is there an even-numbered total. Even where the
mid-body count is considered to be the maximum found in a band extend-
ing both anterior and posterior to the true mid-point there are cases of
unbalance, that is, bilateral asymmetry, wherein a row is suppressed much
sooner on one side than the other, or fails entirely to appear on one side.
Sometimes a row may be represented by only a few scattered scales.
But these even-numbered specimens are the exception rather than the
rule; they will rarely reach eight per cent of the total, and in most species
will be fewer. In checking the distribution of these variates against nor-
mality these even-numbered specimens may be allocated to the uneven
classifications next above and below; that is, if there are 10 specimens
with 24 rows, add 5 to the number with 23 and 5 to the number with 25.
The series of uneven numbers can then be tested for normality of distribu-
tion. However, it is best, in those genera where there is true bilateral
asymmetry (not the suppression of the mid-dorsal itself, as in Trimorpho-
don), to consider the laterals, rather than the total dorsals, as the variable.
This is done by deducting the mid-dorsal row and determining the dis-
tribution of the laterals. In this method a snake with, say, 24 scale rows
is presumed to have a mid-dorsal, 12 laterals on one side and 11 on the
other; while, of course, one with 25 rows has a mid-dorsal and two sets of
12 laterals each. Thus we can make the type of transformation shown
in Table 1. This distribution of laterals can then be checked for normality
in the usual way.
TABLE 1
San Diego County Crotalus viridis oreganus
Conversion of Dorsals to Laterals

Number of      Number of        Number of lateral rows
dorsal rows    specimens      11      12      13      14
     23             7         14
     24             5          5       5
     25           440                880
     26            37                 37      37
     27           121                        242
     28             1                          1       1
     29             2                                  4
   Total          613         19     922     280       5
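The conversion of Table 1 can be sketched in a few lines. The function name is hypothetical; the rule applied is the one stated in the text (deduct the mid-dorsal row and divide the remainder between the two sides, an even total giving unequal sides), so it is suitable only where even counts arise from bilateral asymmetry and not from suppression of the mid-dorsal row itself.

```python
from collections import Counter


def dorsals_to_laterals(dorsal_freq):
    """Convert a distribution of total dorsal rows into one of lateral rows.

    dorsal_freq maps the number of dorsal rows to the number of specimens.
    Each specimen contributes two lateral counts, one for each side.
    """
    lateral_freq = Counter()
    for rows, specimens in dorsal_freq.items():
        laterals = rows - 1          # deduct the mid-dorsal row
        low_side = laterals // 2     # e.g. 24 rows gives 11 on one side
        high_side = laterals - low_side  # and 12 on the other
        lateral_freq[low_side] += specimens
        lateral_freq[high_side] += specimens
    return dict(sorted(lateral_freq.items()))


# Dorsal-row distribution of Table 1.
table_1_dorsals = {23: 7, 24: 5, 25: 440, 26: 37, 27: 121, 28: 1, 29: 2}
table_1_laterals = dorsals_to_laterals(table_1_dorsals)
print(table_1_laterals)
```

The result agrees with the totals of Table 1, and the number of lateral counts is twice the number of specimens.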
The scale rows of most snakes at mid-body are too nearly invariant to
require or justify an investigation of normality. The smaller and slimmer
colubrids are often almost or quite without variation. For example, 213
specimens of Sonora occipitalis from eastern San Diego County all have
15 scale rows. The distribution in 274 specimens of Diadophis amabilis
similis from western San Diego County is 13(12), 14(7), 15(255).⁶
Of 334 specimens of Phyllorhynchus decurtatus perkinsi from desert San
Diego County all have 19 scale rows except four, which are distributed as
follows: 17(1), 18(2), and 21(1). Of 202 specimens of Rhinocheilus
lecontei from southern California all but four have 23 scale rows; of the
four aberrants, three have 25, and the other 24 scale rows. Some of the
larger species have a greater diversity. For example, 431 specimens of
Lampropeltis getulus californiae (both pattern phases) from cismontane
San Diego County have the following distribution: 22(2), 23(354),
24(27), 25(48); or, expressed as laterals, 10(2), 11(737), 12(123).
This distribution does not exhibit enough variation to warrant a test
for normality, but some of the larger colubrids may have a sufficient
spread to justify such a test. For example, 178 specimens of Pituophis
catenifer annectens from coastal San Diego County have the following
dispersion of laterals: 13(2), 14(22), 15(111), 16(169), 17(48), 18(4).
A chi-square test indicates that the dispersion probably approximates a
normal distribution (P = 0.25).⁷
It is to be presumed that some of the larger species of the Boidae, with
their high numbers of scale rows, would have some interesting variations,
but securing sufficient data presents obvious difficulties. Among the smaller
boids we have the following distribution of the laterals of 103 specimens
of Lichanura roseofusca roseofusca from western San Diego County: 17(2),
18(5), 19(35), 20(93), 21(67), 22(4). By the chi-square test P = 0.13.
In Charina bottae 142 specimens are distributed as follows: 19(17),
20(88), 21(89), 22(41), 23(36), 24(13). Here P is less than 0.001, for
the distribution is skewed; however, these data represent a territorially non-
homogeneous population from a large area, which has affected the result.
The rattlesnakes exhibit a moderate degree of variation in scale rows,
no doubt because of their thick bodies; for it can be shown that among
snakes there is often a positive intrageneric, and even intrafamily correla-
tion of the number of scale rows with adult body diameter; and, assuming
⁶ Throughout this paper, in expressing distributions in this way, the values of the variate
will be stated first, followed in parentheses by the frequency of occurrence of that value.
⁷ The chi-square test for normality does not state the probability that a certain
distribution is normal; it answers the question in this way: "If the population from which this
sample was drawn were indeed normal, what percentage or proportion of similarly sized
samples, drawn at random, would exhibit as great, or a greater, departure from normality
than this one?" So in a way it gives a negative rather than a positive answer to the question.
If we take the often-used probability limit of 0.05 we merely determine (when P is above
0.05) that there is not a strong indication that the parent distribution is not normal.
a constant coefficient of variation, the thicker species will have a higher
number of scale-row classes.
Table 2 sets forth the data on the five largest homogeneous series of
rattlers available to me.
TABLE 2
Distribution of Lateral Scale Rows in
Homogeneous Series of Rattlesnakes

Number of    Platteville    Pierre         Pateros         S. D. County    San Lucan
Lateral      series         series         series          series          series
Rows         C.v. viridis   C.v. viridis   C.v. oreganus   C.v. oreganus   C. lucasensis
   11              4                            12              19
   12            493            387            894             922              15
   13           1113            895            322             280             523
   14             56             64              2               5             148
   15                                                                            5
   16                                                                            3
 Totals         1666           1346           1230            1226             694
These distributions are unimodal, but the dispersions are not great
enough — that is, there are not enough classes — to determine whether there
is a tendency away from normality.
Summarizing the scale-row study, it may be said that in most species
of snakes the scale rows are too constant to permit useful determinations
of whether such variation as there is follows a normal dispersion. Probably
only the largest boids would produce results of interest. The significance
of differences in scale rows may best be ascertained by means of a
chi-square test of an R x 2 table,⁸ rather than by determining the difference
between means. No doubt the location of the termination of dropped rows
will warrant statistical examination in some cases. Sexual dimorphism is
sometimes present, either in the number of rows, or the point of termina-
tion of suppressed rows.
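A chi-square test of an R x 2 table may be sketched as follows, using the Platteville and Pierre columns of Table 2. The pooling of the four Platteville specimens having 11 rows with the 12-row class is an assumption made here to avoid a very small edge-class, and the function name is hypothetical. Only the value of chi-square and its degrees of freedom are computed; the probability would then be read from a table of chi-square.

```python
def chi_square_r_by_2(rows):
    """Chi-square for an R x 2 table of frequencies.

    rows is a list of (count in series 1, count in series 2) pairs, one
    pair per class. Returns (chi-square, degrees of freedom).
    """
    column_totals = [sum(r[0] for r in rows), sum(r[1] for r in rows)]
    grand_total = sum(column_totals)
    chi_square = 0.0
    for row in rows:
        row_total = sum(row)
        for j in (0, 1):
            # Expected frequency if both series had the same proportions.
            expected = row_total * column_totals[j] / grand_total
            chi_square += (row[j] - expected) ** 2 / expected
    return chi_square, len(rows) - 1


# Lateral scale rows from Table 2: (Platteville, Pierre).
lateral_rows = [
    (4 + 493, 387),  # 12 rows or fewer (11-row class pooled, an assumption)
    (1113, 895),     # 13 rows
    (56, 64),        # 14 rows
]
chi_square, degrees_of_freedom = chi_square_r_by_2(lateral_rows)
print(f"chi-square = {chi_square:.2f} with {degrees_of_freedom} degrees of freedom")
```

Where the SciPy library is available, `scipy.stats.chi2.sf(chi_square, degrees_of_freedom)` would give the corresponding probability directly.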
Ventrals
The most important character employed in intrageneric classification
is the ventral scale count, for it may be determined with accuracy and
usually has a high degree of constancy in any territorially homogeneous
series, the coefficient of variation approximating 2 per cent. Yet it is
subject to sufficient plasticity to show the effects of ecological and other
changes.
⁸ For the use of this method see such texts as Mills, p. 633; Snedecor, p. 164; Simpson
and Roe, p. 295; Pearl, p. 329.
Chi-square tests of a number of species of colubrids indicate that nor-
mality of distribution is probably the rule, the results being shown in
Table 3 for several homogeneous series.
TABLE 3
Evidence of Normality
In the Distribution of Ventral Scale Counts
of Example Colubrids

                                                           Number      Chi-square
                                                           of          Probability
Species                        Area                 Sex    Specimens   P
Diadophis a. similis           Coastal S. D. Co.    M        128         0.22
                                                    F        131         0.82
Phyllorhynchus d. perkinsi     Desert S. D. Co.     M        126         0.73
                                                    F         99         0.29
Pituophis c. annectens         Coastal S. D. Co.    M         96         0.76
                                                    F         80         0.05
Thamnophis hammondii           San Diego Co.        M        170         0.61
                                                    F        159         0.93
Thamnophis o. ordinoides       Nw. Oregon           M        149         0.81
                                                    F        151         0.41
Lampropeltis g. californiae    San Diego Co.        M        202         0.46
                                                    F        171         0.60
Geophis nasalis                Volcan Zunil         M        124         0.26
                                                    F         89         0.45
P in the table indicates the proportion of similarly sized samples that
would show a departure from normality at least as great as that shown by
the available sample, if the population sampled were truly normal. Thus
the evidence for normality of distribution is quite strong in this group
of colubrids.
The five homogeneous series of rattlesnakes have been investigated by
the alternative method of moments. The results are shown in Table 4.
TABLE 4
Evidence of
Skewness and Kurtosis in Ventral Scale Counts
in Homogeneous Series of Rattlesnakes
by the Method of Moments

                                          Number of        Probability⁹
Species          Series           Sex     Specimens     Skewness    Kurtosis
C.v. viridis     Platteville      M          441         0.44        0.73
                                  F          392         0.64        0.08
C.v. viridis     Pierre           M          342         0.79        0.007
                                  F          331         0.45        0.44
C.v. oreganus    Pateros          M          326         0.80        0.05
                                  F          289         0.00001-    0.81
C.v. oreganus    San Diego Co.    M          292         0.57        0.16
                                  F          278         0.74        0.72
C. lucasensis    San Lucan        M          168         0.0009      0.0004
                                  F          125         0.00001-    0.00001-
These dispersions were also checked by the chi-square method and all
were found to be well above the 5 per cent limit toward normality, except
in the case of the Pateros females, but including lucasensis, which Table 4
shows to be non-normal. Thus we find substantial agreement between the
two methods except in the case of the lucasensis series. Here it is determined
that the low probability disclosed by the moment method results from two
specimens (defective young) in each sex. These, of course, are grouped
with others in the edge-classes when employing the chi-square method,
and hence have small effect on the result. If we drop them out and recal-
culate the results by the moment method we have the following:
                  Male    Female
P (skewness)      0.43     0.62
P (kurtosis)      0.46     0.63
Thus there is little evidence of anormality in lucasensis when these aberrant
individuals are omitted.
As to the directions of the deviations, we find that the skewnesses are
all positive, indicating a long tail toward the right, that is, a surplus of
the higher ventral counts. With respect to kurtosis, six cases are positive
⁹ As in the case of the chi-square test, the method of moments tests the evidence of
non-normality, rather than normality, by determining the ratios of certain departures from
normality to their standard errors. From these ratios the significance of the departures may
be determined. In the above table, if the probability is above 0.05, there is assumed to be
no strong evidence for non-normality in the parent population. Any result smaller than
0.01 is taken to indicate a high probability of non-normality.
(leptokurtic) , the other four negative; obviously there is no weight of
evidence for a trend toward either a peaked or a flat-topped curve being
the mode.
In the worm snakes, genus Leptotyphlops, the dorsals, rather than the
ventrals, are used in taxonomy. The chi-square test applied to 56 specimens
of L. h. humilis from western San Diego County, and to 40 specimens of
L. h. cahuilae or cahuilae-humilis intergrades from the desert slope of the
county, indicates normal distributions, although the number of specimens
is too small to permit a fully adequate determination. A series of 108
specimens of L. dulcis dulcis from Texas has a definitely flat-topped
distribution (P = .001-). However, this population may not be homogeneous,
as discussed elsewhere.¹⁰
Subcaudals
The distributions of the subcaudals in some typical series of colubrids
are shown in Table 5. The evidence is in favor of normality in most cases.
TABLE 5
Evidence of Normality in the
Distribution of Subcaudal Scale Counts in Colubrids

                                                           Number      Chi-square
                                                           of          Probability
Species                        Area                 Sex    Specimens   P
Diadophis a. similis           Coastal S. D. Co.    M        119         0.02
                                                    F        118         0.28
Phyllorhynchus d. perkinsi     Desert S. D. Co.     M        126         0.05
                                                    F         94         0.53
Pituophis c. annectens         Coastal S. D. Co.    M         88         0.64
                                                    F         80         0.48
Thamnophis hammondii           San Diego Co.        M        149         0.74
                                                    F        133         0.92
Thamnophis o. ordinoides       Nw. Oregon           M        110         0.44
                                                    F        122         0.35
Lampropeltis g. californiae    San Diego Co.        M        184         0.90
                                                    F        154         0.27
Geophis nasalis                Volcan Zunil         M        115         0.06
                                                    F         84         0.001-
¹⁰ Trans. S. D. Soc. Nat. Hist., Vol. 9, No. 18, p. 103, 1940.
The same result is apparent in the subcaudals of the rattlesnakes (Table
6), analyzed by the method of moments, although the foreshortened tails
of the rattlesnakes render this character less important than in the sharp-
tailed snakes.
TABLE 6
Evidence of
Skewness and Kurtosis in Subcaudal Scale Counts in
Homogeneous Series of Rattlesnakes
by the Method of Moments

                                          Number of        Probability
Species          Series           Sex     Specimens     Skewness    Kurtosis
C.v. viridis     Platteville      M          441         0.76        0.21
                                  F          390         0.09        0.001-
C.v. viridis     Pierre           M          342         0.41        0.51
                                  F          331         0.46        0.85
C.v. oreganus    Pateros          M          324         0.57        0.93
                                  F          289         0.002       0.07
C.v. oreganus    S. D. County     M          294         0.38        0.29
                                  F          279         0.72        0.24
C. lucasensis    San Lucan        M          168         0.22        0.50
                                  F          123         0.85        0.04
Thus, while not every test results in a probability above 0.05 the
general indication is toward normality. I have rechecked the Platteville
and Pateros females, the two cases which show the greatest probability of
non-normality. I find that in both there are several high counts, possibly
the result of inaccurately sexing juveniles, always a possibility in handling
some hundreds of these little specimens. If we employ the chi-square test,
which groups these end-classes, we find for the Platteville series P=0.82,
and for the Pateros series P = 0.55. Thus the appearance of non-normality
results from these few aberrant specimens, totaling only about one per
cent of the available material.
These distributions afford good examples of the extent of the departures
from the normal curve of sets of variates in which there is strong evi-
dence that the parent populations are truly normal. For instance, these
are the detailed figures for the female Pierre viridis:
Number of      Actual          Theoretical         Theoretical
Subcaudals     Distribution    Distribution (A)    Distribution (B)
    16               1               1.84*               2.18*
    17              10               9.36               10.00
    18              34              32.19               32.80
    19              65              68.53               67.86
    20              95              90.88               89.17
    21              74              74.92               74.04
    22              36              38.37               38.86
    23              13              12.24               12.98
    24               2               2.42                2.71
    25               1               0.32*               0.39*
   Total           331             331.07              330.99
Column A gives the theoretical distribution by ordinates, Column B,
by areas. I have stated that in cases such as this, where the variates can
take only integral values, I have usually determined the theoretical distri-
bution by the first method. It will be observed that the area method gives
a slightly more platykurtic curve than the other. The actual distribution
is exceedingly close to either, closer in fact than would be obtained in nine
out of ten random samples if the parent population is indeed normally
distributed; for P = 0.95 when the value of chi-square is obtained by com-
paring with the ordinate curve, or P = 0.93 if the area curve is taken.
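The theoretical distribution by ordinates, and the chi-square comparison with it, may be sketched from the actual frequencies given above. Several points are assumptions of this sketch and not statements of the procedure used for the table: the standard deviation is computed with divisor n, the edge-classes are single-class ordinates rather than the "or less" and "or more" totals marked with asterisks, and the grouping of edge-classes is chosen arbitrarily. The function names are hypothetical.

```python
import math

# Actual distribution of subcaudals in the female Pierre viridis.
actual = {16: 1, 17: 10, 18: 34, 19: 65, 20: 95,
          21: 74, 22: 36, 23: 13, 24: 2, 25: 1}


def normal_ordinate_frequencies(freq):
    """Theoretical frequencies from ordinates of the fitted normal curve."""
    n = sum(freq.values())
    mean = sum(x * f for x, f in freq.items()) / n
    # Standard deviation with divisor n (an assumption of this sketch).
    sd = math.sqrt(sum(f * (x - mean) ** 2 for x, f in freq.items()) / n)
    scale = n / (sd * math.sqrt(2.0 * math.pi))
    return {x: scale * math.exp(-0.5 * ((x - mean) / sd) ** 2) for x in freq}


def grouped_chi_square(observed, expected, groups):
    """Chi-square after pooling the classes listed in each group."""
    total = 0.0
    for group in groups:
        obs = sum(observed[x] for x in group)
        exp = sum(expected[x] for x in group)
        total += (obs - exp) ** 2 / exp
    return total


theoretical = normal_ordinate_frequencies(actual)

# Edge-classes pooled so that no expected frequency is very small
# (the particular grouping is an assumption).
groups = [[16, 17], [18], [19], [20], [21], [22], [23], [24, 25]]
chi_square = grouped_chi_square(actual, theoretical, groups)

for x in sorted(actual):
    print(f"{x}: actual {actual[x]:3d}, theoretical {theoretical[x]:6.2f}")
print(f"chi-square = {chi_square:.2f}")
```

The central ordinates computed in this way fall close to those of Column A; the value of chi-square depends on the grouping adopted and so need not reproduce the probability quoted in the text.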
Labials and Other Head Scales
The labials, in homogeneous series of colubrids, seldom have a sufficient
variation to indicate more than a strong unimodal tendency. In the rattlers
there is more variation, usually five or more classes being present, and
from these the trend toward normality can be ascertained. The distri-
butions in the five series which have been used before are shown in Table 7.
* 16 or less; 25 or more.
TABLE 7
Distribution of Labial Scale Counts in
Homogeneous Series of Rattlesnakes

                       Platteville    Pierre    Pateros    S. D. Co.    San Lucan
Supralabials
  12                        11           6                      1            2
  13                        89          65         18          28            4
  14                       542         362        190         225           13
  15                       728         625        600         569          103
  16                       256         233        351         320          308
  17                        35          54         67          69          203
  18                         4           1          3          11           56
  19                                                                         4
  Total                   1665        1346       1229        1223          693
  P (by chi-square)       0.01      0.001-       0.07       0.025         0.08
Infralabials
  12                         1
  13                        12           9                      1            1
  14                       106          90         30          51            1
  15                       487         428        232         301           11
  16                       641         503        532         502           61
  17                       382         254        363         295          198
  18                        79          56         71          48          275
  19                         6           5          2          12          124
  20                                                            2           20
  21                                                                         4
  Total                   1664        1345       1230        1212          695
  P (by chi-square)       0.24        0.02       0.89        0.31         0.20
It will be noted that there is a greater indication of non-normality in the
supralabials than the infralabials. A study of the former by the method
of moments indicates that the trend away from normality is brought about
by leptokurtosis, the peak being sharper than that of a normal curve.
Sometimes it is desirable to investigate entire species to determine curve
shapes, especially if, in a character, there is reason to believe that little
territorial variation is involved; for the greater number of specimens will
give greater assurance of the curve shape. However, if there be territorial
variation it is obvious that there will be a tendency toward greater dispersion
in the larger samples; so that if homogeneous segments of the
population have normal distributions, the combinations will tend toward
platykurtosis.
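The flattening produced by combining segments can be shown with a small numerical sketch in Python. The two segments below are hypothetical (means of 150 and 154, a standard deviation of 3, and equal numbers in each); they are assumptions made for illustration and are not data from any series discussed here.

```python
import math


def normal_ordinates(mean, sd, classes):
    """Relative frequencies of integer classes, taken as normal ordinates."""
    return [math.exp(-0.5 * ((x - mean) / sd) ** 2) for x in classes]


def excess_kurtosis(classes, weights):
    """Fourth moment ratio minus 3; zero for a normal distribution."""
    total = sum(weights)
    mean = sum(x * w for x, w in zip(classes, weights)) / total
    m2 = sum(w * (x - mean) ** 2 for x, w in zip(classes, weights)) / total
    m4 = sum(w * (x - mean) ** 4 for x, w in zip(classes, weights)) / total
    return m4 / m2 ** 2 - 3.0


classes = list(range(120, 185))
# Two hypothetical homogeneous segments: same dispersion, different means.
north = normal_ordinates(150.0, 3.0, classes)
south = normal_ordinates(154.0, 3.0, classes)
combined = [a + b for a, b in zip(north, south)]

k_north = excess_kurtosis(classes, north)
k_south = excess_kurtosis(classes, south)
k_combined = excess_kurtosis(classes, combined)

print(f"segments: {k_north:+.4f} {k_south:+.4f}  combined: {k_combined:+.4f}")
```

Each segment by itself has an excess kurtosis of practically zero, while the combination is negative, that is, platykurtic.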
There is not much variation in the labials of C. v. viridis, especially in
the northern part of its range. I have therefore investigated the curve
shapes of the labials in all the specimens available to me, to see whether
these larger samples tend to verify the non-normality of the supralabials
and the normality of the infralabials, as indicated by the Platteville and
Pierre series. The data are as follows:
                10    11    12    13    14    15    16    17    18    19   Total
Supralabials     2     8    33   279  1460  2125   817   153    10   ...    4887
Infralabials   ...     1     2    47   284  1434  1834   897   214    14    4827
The results are as follows:
                 P (chi-square)         P (moments)
                                    Skewness     Kurtosis
Supralabials        0.0001-          0.037      0.000001-
Infralabials        0.05             0.415      0.222
It will be seen that there is strong evidence that the supralabial frequency
is not normal; in fact, the value of P for kurtosis is very much smaller
than the figure given, for the distribution is strongly peaked. On the other
hand, these tests, especially that based on moments, indicate that the infra-
labial distribution is probably normal.
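A rough version of the moment tests for the supralabials can be sketched in Python as follows. This is only an approximation offered for illustration: it uses the simple large-sample standard errors, the square roots of 6/N and 24/N, rather than Fisher's exact formulas, and it refers the ratios to the normal curve. The class labels are replaced by consecutive integers, which does not affect skewness or kurtosis.

```python
import math

# Supralabial frequencies for C. v. viridis, in class order, from the
# tabulation above; classes are treated as consecutive integers.
freqs = [2, 8, 33, 279, 1460, 2125, 817, 153, 10]

n = sum(freqs)
mean = sum(i * f for i, f in enumerate(freqs)) / n


def central_moment(k):
    """k-th moment about the mean, with divisor n."""
    return sum(f * (i - mean) ** k for i, f in enumerate(freqs)) / n


m2, m3, m4 = central_moment(2), central_moment(3), central_moment(4)
skewness = m3 / m2 ** 1.5
excess_kurtosis = m4 / m2 ** 2 - 3.0

# Large-sample standard errors (an assumed simplification).
se_skew = math.sqrt(6.0 / n)
se_kurt = math.sqrt(24.0 / n)


def two_sided_p(z):
    """Probability of a normal deviate at least as large as |z|."""
    return math.erfc(abs(z) / math.sqrt(2.0))


p_skew = two_sided_p(skewness / se_skew)
p_kurt = two_sided_p(excess_kurtosis / se_kurt)

print(f"skewness {skewness:+.3f}  P = {p_skew:.3f}")
print(f"excess kurtosis {excess_kurtosis:+.3f}  P = {p_kurt:.2e}")
```

Under these assumptions the skewness is slight and negative, while the excess kurtosis is positive and many times its standard error, in keeping with the strongly peaked distribution described above.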
Occasionally other head scales vary sufficiently to warrant investigation.
For example, consider this distribution of the minimum scales between the
supraoculars in the Pierre series of C. v. viridis: 1(7), 2(145), 3(370),
4(138), 5(12). By the method of moments we find P (skewness) to be
0.20, and P (kurtosis) 0.69. In the Platteville series of 831 specimens the
distribution is 1(4), 2(71), 3(381), 4(311), 5(51), 6(12), 7(1). This
distribution is definitely skewed, and P (chi-square) is 0.01. A species
with a high number of scales in the supraocular bridge is C. ruber, in
which the distribution among 243 specimens from all areas is 4(6), 5(35),
6(96), 7(75), 8(29), 9(2). This is a markedly skewed distribution. The
distribution in C. lucasensis is more symmetrical: 3(3), 4(34), 5(70),
6(135), 7(70), 8(24), 9(1); total 337. C. m. molossus and C. scutulatus
are two forms in which these minimum scales across the frontal area are
strongly skewed. For example, in 148 C. m. molossus the variation is
2(86), 3(29), 4(21), 5(10), 6(1), 7(1), giving what is known as
a J-shaped curve. C. scutulatus is even more strongly skewed: 1(3),
2(324), 3(37), 4(5), 5(1); total 370. This is quite a different distribution
from that of C. cinereous (atrox), which in all specimens available
to me is as follows: 3(76), 4(276), 5(230), 6(90), 7(11), 8(1);
total 684. In determining the significance of the difference between two
such non-normal distributions as these, it is best to use an R × 2 table, rather
than a comparison of means. Such a test would show the probability of a
common origin to be far below 0.0001, clearly demonstrating the validity
of scutulatus.
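The R × 2 comparison of the scutulatus and atrox counts can be sketched in Python as below. The sketch keeps every class separate, which leaves some very small expected frequencies; pooling the sparse end classes, as is done elsewhere in this paper, would be a reasonable alternative and is not attempted here. The critical value quoted in the code is the tabulated 0.001 point of chi-square for 7 degrees of freedom.

```python
# Minimum scales between the supraoculars (scale count -> specimens),
# as given in the text above.
scutulatus = {1: 3, 2: 324, 3: 37, 4: 5, 5: 1}
atrox = {3: 76, 4: 276, 5: 230, 6: 90, 7: 11, 8: 1}

classes = sorted(set(scutulatus) | set(atrox))
table = [(scutulatus.get(c, 0), atrox.get(c, 0)) for c in classes]

col_totals = [sum(row[j] for row in table) for j in range(2)]
grand_total = sum(col_totals)

chi_square = 0.0
for row in table:
    row_total = sum(row)
    for j, obs in enumerate(row):
        # Expected frequency if both forms shared one distribution.
        expected = row_total * col_totals[j] / grand_total
        chi_square += (obs - expected) ** 2 / expected

degrees_of_freedom = (len(classes) - 1) * (2 - 1)

# Tabulated 0.001 critical value of chi-square for 7 degrees of freedom.
CRITICAL_0_001_DF7 = 24.32

print(f"chi-square = {chi_square:.1f} with {degrees_of_freedom} d.f.")
print("exceeds 0.001 critical value:", chi_square > CRITICAL_0_001_DF7)
```

The statistic is so far beyond the tabulated value that the choice of pooling would not alter the conclusion.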
The scales in the prefrontal area on the snout of a rattlesnake may be
distributed normally, although usually either skewed, platykurtic, or
both. In C. scutulatus there is slight skewness (P by chi-square = 0.07 in
274 specimens). But in the Pierre series of viridis the distribution in 672
specimens is decidedly flat-topped and P (chi-square) is less than 0.001.
A few species of snakes have considerable diversity in loreals, although
most are rather constant. A form exhibiting variation is Lichanura r.
roseofusca, wherein we find the following distribution: 2(1), 3(22),
4(56), 5(69), 6(11), 7(5). This indicates the possibility of a normal
distribution in the parent population, for P (chi-square) is found to be
0.09. Also in this species we have an ocular ring, with no definite distinction
to be drawn between supra-, pre-, sub-, or post-oculars. The distribution
in 163 counts is as follows: 7(5), 8(16), 9(70), 10(61), 11(9),
and 12(2). P is 0.17 and a normal distribution is therefore possible.
In Pituophis the prefrontals are subject to considerable variation. For
example, in 120 specimens of P. c. annectens from San Diego County the
distribution is 2(4), 3(7), 4(91), 5(6), 6(9), 7(2), 8(1). This clearly
is not a normal distribution, being sharply peaked at 4, and P is much
less than 0.001.
Lizard Scales
Lizard scale distributions are also useful in taxonomic problems and
may be checked for normality by the same methods. Thus in 56 specimens
of Sceloporus j. jarrovii the scales around the body indicate a normal dis-
tribution (P=0.59). In 760 specimens of Cnemidophorus t. tessellatus
from all areas, the distribution of ventrals is surprisingly close to normal
(P = 0.88). In the same species 1424 counts of the scales on the fourth
toe show a leptokurtic distribution as follows: 17(12), 18(92), 19(337),
20(673), 21(205), 22(86), 23(15), 24(4). In this distribution, by the
chi-square test, P is found to be much below 0.001. Similarly the number
of dorsal scale rows in a large series of Anniella p. pulchra is peaked in
distribution. In 1506 counts in C. t. tessellatus (sexes combined), the
femoral pores indicate that the distribution may be normal (P=0.27).
I think there will be a growing tendency to apply statistical methods in
future taxonomic studies of the lizards, particularly in verifying the sig-
nificance of differences, for they have a greater number of countable
characters than snakes.
Turtle Scutes
Southern California is a territory notably poor in chelonians and I have
no original data on turtles. I have tested the distributions of the scutes of
Lepidochelys olivacea, cited in the "Tetrapod Reptiles of Ceylon" by P.
Deraniyagala. The total scutes in 378 specimens (p. 133) are not dis-
tributed normally (P less than 0.001), for the distribution is decidedly
platykurtic. The costals (p. 137) in 756 counts are both platykurtic and
skewed (P less than 0.001). The vertebrals may possibly be normally dis-
tributed (P = 0.08), although this particular sample is platykurtic.
Pattern
Where the pattern of a snake or lizard includes bands, saddles, blotches,
or spots, the numbers on the body, tail, or both are frequently used in
taxonomy. They often approach normality in distribution. Thus in 180
specimens of Pituophis c. annectens we find for the body blotches P = 0.06;
in the tail spots of 89 males P = 0.32 and in 80 females P = 0.90.
In the five large series of rattlesnakes a normal distribution is indicated,
as shown in Table 8. In one case, the Pateros oreganus female tail
rings, the distribution is strongly skewed; in three others the variation
is limited to only three classes and the tendency is indeterminate.
TABLE 8
Chi-Square Test of Normality of
Body Blotches and Tail Rings in
Homogeneous Series of Rattlesnakes

                            Body Blotches              Tail Rings
                                                  Males           Females
Series                      Number     P      Number     P     Number     P
Platteville viridis           832     0.09      440     0.20     392     0.41
Pierre viridis                672     0.28      342     0.32     330     0.81
Pateros oreganus              616     0.22      326     0.33     290     0.001
San Diego Co. oreganus        579     0.90      285     0.15     283     ...
San Lucan lucasensis          339     0.13      198     ...      146     ...
Broods
The sizes of broods of young snakes have been shown to be correlated
with the size of the mothers, larger females having more young. 11 Thus
the frequency distribution of brood sizes may depend on the dispersion of
fertile females, as well as on the variation for any given size of mother.
The only series which I have available, large enough to afford a determina-
tion, comprises data on broods and developing eggs of Crotalus v. viridis.
Of these there are a total of 303 sets. The distribution is found to be both
platykurtic and positively skewed; P (chi-square) is 0.006. It may well be
that a more nearly normal distribution would be in evidence in a series
from mothers within a narrow length-range. The data available are not
sufficient to permit checking this possibility.
11 Occ. Papers S. D. Soc. Nat. Hist., No. 1, p. 16, 1936.
Rattles
One rattle-variable is the number of rattles in adult strings. Data on
this subject have been given in a discussion of the rattle. 12 The distribu-
tions in two series, the Platteville and San Lucan, including both complete
and broken strings, are found to be leptokurtic in the first case (chi-
square P less than 0.001) and possibly normal in the second (P=0.13).
True normality could not be expected in these cases since, with very large
series, a considerable number of snakes should have less than no rattles —
an obvious impossibility.
Hemipenes
In some species the spines and fringes of the hemipenes are quite variable,
although the spines, if non-uniform, are often difficult to count with
accuracy. In a series of C. scutulatus the spines seem normally distributed
(P=0.29) and the fringes likewise (P = 0.72).
Measurements
Thus far I have dealt with scales, blotches, and other countable quanti-
ties, characters which can only take integral values. Because of their
scalation most reptiles are well supplied with such countable characters
(only fish equal them), a fortunate provision from the standpoint of
ascertaining the significance of differences. However, even in herpetology,
dispersion curves of measurements and weights are of interest. They are
of importance in determining factors affecting variation and correlation,
and in calculating the frequency of expectation of unusual specimens. But
in using measurements in most difference problems we are confronted by
the complication that proportionalities are subject to ontogenetic varia-
tion. We are seldom concerned with the total variation within a species
from birth to death; more often we wish to know the extent of variation
at a single age or life period, since it is only within such limitations that
a three-dimensional variation can be reduced to two dimensions, permitting
an analysis of the frequency distribution. For usually the measurements
to be used are of a relative or proportional nature involving two variables —
for example the ratio between some part of the body and the whole; and
although such ratios eliminate the unit of measurement, they do not ob-
viate the effects of ontogeny. It is seldom that a body part remains in con-
stant ratio with another part (or with the body as a whole) throughout
life. Where such ontogenetic changes in proportionality are in evidence,
to determine dispersion we must either have a very large assemblage of
specimens at a single value of the independent variable (that is to say, at
a given age or size) or we must make some assumptions and convert the
available specimens to a standard value of the independent variable. This
usually involves the determination of the probable size of a body part at
a standard value of the body size. This may then be followed by a study
of the frequency distribution of the dependent variable at what is equivalent
to a cross-section of the dispersion surface. For example, if it be desired
to determine the dispersion of the head size of a snake as a proportion
of body size, we first set a standard body size (usually somewhere in the
adult range) and then translate the head lengths of all the available
specimens to what they would probably be at this standard body size.
The translation is effected by determining the regression line for all specimens
and then assuming that any specimen, in growing to (or returning to)
the standard size, would do so by maintaining a constant percentage deviation
from the regression line. This assumption is validated by the fact that
the coefficients of dispersion of characters of this type seem to remain
substantially constant through life. The results of several studies of this
nature, with example computations, have already been published. 13 It is
only by the use of such calculations that sufficient data can be secured to
permit the study of frequency distributions.
12 Occ. Papers S. D. Soc. Nat. Hist., No. 6, pp. 18-19, 1940.
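The translation to a standard body size described above can be sketched in Python. The measurements, the straight-line form of the regression, and the standard size of 800 mm. are all hypothetical values chosen for illustration; the published studies cited should be consulted for the computations actually used.

```python
def fit_line(xs, ys):
    """Least-squares regression of y on x; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def standardize(body, head, intercept, slope, standard_body):
    """Head length at the standard body size, keeping the specimen's
    percentage deviation from the regression line unchanged."""
    predicted_here = intercept + slope * body
    predicted_standard = intercept + slope * standard_body
    return head / predicted_here * predicted_standard


# Hypothetical (body length, head length) pairs, in millimeters.
body_lengths = [300.0, 450.0, 600.0, 750.0, 900.0, 1050.0]
head_lengths = [17.5, 23.0, 29.5, 34.0, 41.0, 45.5]

intercept, slope = fit_line(body_lengths, head_lengths)
STANDARD_BODY = 800.0  # assumed standard adult size

standardized = [
    standardize(b, h, intercept, slope, STANDARD_BODY)
    for b, h in zip(body_lengths, head_lengths)
]
for b, h, s in zip(body_lengths, head_lengths, standardized):
    print(f"body {b:6.0f}  head {h:5.1f}  head at standard size {s:5.2f}")
```

The standardized values can then be treated as a single frequency distribution, since each represents the same body size.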
In a population of snakes (rattlesnakes, for example) the lengths do not
approach a normal distribution. On the contrary, the distribution is bimodal, for the
young of the year are rather sharply differentiated from the adolescents
and adults. 14 The young of the year taken by themselves are probably
normally distributed with respect to body length as shown by the fol-
lowing tests on homogeneous series:
Series                      Number of Individuals    P (chi-square)
Zacatecas nigrescens                  82                  0.51
San Patricio cinereous               139                  0.64
Pierre viridis                       152                  0.77
Platteville viridis                  229                  0.01
Only the Platteville series is distributed non-normally; it is negatively
skewed.
A set of miscellaneous broods totaling 320 young snakes, when the in-
dividual lengths were expressed as percentages of the mean of each brood,
had a dispersion giving a chi-square value of P=0.25.
Starting with young of the year having substantially normal distributions,
it would be interesting if we could trace the progress of each age-class
as it passes through maturity until it finally disappears, losing individuals
continuously along the way, to see whether the distribution continues
normal. But studies have shown that it is impossible to segregate
successive classes by size after the first year; for even in their second
year the most rapidly growing adolescents will have overtaken the smallest
adults of the preceding year, thus preventing an accurate segregation. Subsequently
the adults grow so slowly (probably never stopping growth entirely
as do mammals and birds), compared to individual variations, that
the separation of the age-classes becomes continually smaller and the
overlap between successive years greater. While we may assume that the
distribution of each age-class remains normal, since they start with such
a distribution as young of the year, and second year individuals having
complete strings of 5 rattles have been found to approach normality, we
cannot prove this continuity. A complete population of adolescents and
adults is both positively skewed and platykurtic, as might be expected
from the nature of a curve comprising the sum of several normal curves
of successively decreasing areas.
13 Ratio of weight to length, Occ. Papers S. D. Soc. Nat. Hist., No. 3, p. 47, 1937;
ratio of head length to body length overall, idem, No. 4, p. 22, 1938; ratio of fang length
to head and body length, idem, No. 5, p. 36, 1939.
14 Occ. Papers S. D. Soc. Nat. Hist., No. 3, p. 20, 1937.
It would be useful if the sizes of an adult population of snakes could
be shown to have a normal distribution, and the mean and standard devia-
tion could be determined; for from such parameters we could determine
the probable frequency of occurrence of unusually large specimens, cer-
tainly a matter of interest.
But this is a difficult assignment. First, very large samples would be
required, some hundreds of specimens of each sex, at least; for as there is
sexual dimorphism in size in most species, the sexes must be treated sep-
arately. There must be no conscious selection with respect to size; the
sample must be truly representative of the population as a whole. This will
at once eliminate the larger species from consideration owing to the prac-
tical difficulty of collecting, preserving, and measuring great numbers of
large individuals. There is the further complication that incomplete tails
are numerous among the larger specimens of many species, particularly
of those with slender tails such as the racers. This tends to distort the
true frequency distribution. But most important of all, there is the difficulty
of segregating the adolescents, as already mentioned. To avoid this com-
plication there is some possibility of investigating only the right hand
half of the curve of distribution; that is, the half above the mean which
contains the largest specimens. Probably the garter snakes, because of their
occurrence in large numbers, and their ease of capture around certain
ponds and lakes will offer the best material for an investigation of this
kind. To determine with any degree of certainty whether the distri-
bution is normal, and particularly whether the largest specimens occur
with greater or less frequency than would be expected with a normal
distribution, would probably require the measurements of at least 500
specimens of each sex.
I have no such series available, but I have checked the two best homogeneous
series at hand, although admittedly they are quite inadequate in
numbers to afford conclusive evidence respecting the shape of the dispersion
curve. These are series of the little snakes Diadophis amabilis similis
and Phyllorhynchus decurtatus perkinsi. In the interest of homogeneity
I have restricted the investigation to specimens from San Diego County,
since territorial variations in size are evident in many species. By rather
arbitrary methods I have attempted to eliminate adolescents. The statistics
of the adult populations are as follows, all lengths being given in millimeters:
                                      Diadophis a. similis    Phyllorhynchus d. perkinsi
                                      Males       Females      Males       Females
Number                                 101          86          119          70
Mean length                           294.1       340.9        392.1       406.6
Standard deviation                    38.64       50.11        53.10       32.85
Length of an individual 2 standard
  deviations above the mean           371.4       441.1        498.3       474.3
Theoretical number greater than
  this length (2.28%)                  2.30        1.96         2.71        1.60
Actual number greater than
  this length                            2           4            0           2
It will be noted that in two cases (Diadophis males and Phyllorhynchus
females) the actual number of specimens at least two standard deviations
larger than the mean is as near the theoretical number as possible. The
number of large Diadophis females is four instead of two as calculated;
while in the case of the Phyllorhynchus males there should be about three
specimens above 498.3 mm. long, whereas actually there are none so large.
But it is interesting to note that the largest specimens come close to expectation,
for the three largest are 490, 491, and 495 mm., respectively.
At least we can say, for these admittedly inadequate tests, that they offer
no particular evidence that the size distributions of these adult populations
are not normal with respect to the presence of unusually large individuals.
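The theoretical numbers in the table, and a rough measure of how surprising the actual numbers are, can be obtained with the following Python sketch. The binomial calculation is an addition made here for illustration and was not part of the original tests; it assumes each specimen independently has the normal-curve chance of exceeding the two-standard-deviation limit.

```python
import math

# Fraction of a normal distribution lying above mean + 2 SD.
TAIL = 0.5 * math.erfc(2.0 / math.sqrt(2.0))

# (number of adults, actual number above the limit), from the table above.
series = {
    "Diadophis males": (101, 2),
    "Diadophis females": (86, 4),
    "Phyllorhynchus males": (119, 0),
    "Phyllorhynchus females": (70, 2),
}


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def tail_probabilities(n, seen, p=TAIL):
    """Chance of 'seen or fewer' and of 'seen or more' large specimens."""
    at_most = sum(binomial_pmf(k, n, p) for k in range(seen + 1))
    at_least = 1.0 - sum(binomial_pmf(k, n, p) for k in range(seen))
    return at_most, at_least


results = {}
for name, (n, seen) in series.items():
    at_most, at_least = tail_probabilities(n, seen)
    results[name] = (n * TAIL, at_most, at_least)
    print(f"{name:24s} expected {n * TAIL:4.2f}  actual {seen}  "
          f"P(<= actual) {at_most:.3f}  P(>= actual) {at_least:.3f}")
```

On these assumptions none of the four series departs from expectation at the conventional 0.05 level, which agrees with the cautious conclusion drawn above.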
One of the important correlative studies that may be made is that of
weight on length. I have determined 15 that the dispersion around the regression
line of 818 individuals of the Platteville viridis (standardized as
discussed above) is probably not normally distributed, for P (chi-square)
is 0.003. The distribution is skewed.
Head length dispersions are found to be normally distributed about the
regression lines of head on body length over-all, in two series investigated. 16
833 Platteville viridis P (chi-square) 0.893
715 Pierre viridis P (chi-square) 0.226
15 Occ. Papers S. D. Soc. Nat. Hist., No. 3, p. 47, 1937.
16 Idem. p. 18. A graphic illustration of one of the distributions is given.
Similarly the fang lengths about either the fang-head or
the fang-body regression lines are probably normally distributed. The
results in the Platteville series were as follows:
Fang on head length, 519 specimens, P (chi-square) = 0.351
Fang on body length overall, 526 specimens, P (chi-square) = 0.165.
One measurement which remains unchanged in each individual during
life is that of rattle width; that is, the width of any specific rattle of the
sequence. Using only specimens with complete strings, so that the sequence
number of each ring is known, the frequency distribution of measure-
ments of any particular ring can be determined. An investigation of a
number of series, upon which I hope to publish some notes later, would
indicate that the distribution approximates normality. For example, the
chi-square P for 448 buttons (No. 1 rings) of the Platteville series is
0.155. This particular series is somewhat platykurtic, but not excessively so.
Illustrative Sampling
To illustrate the variations in a series of random samples from a truly
normal population, I have assumed a hypothetical homogeneous popula-
tion comprising 100,000 snakes (all of one sex) with a mean ventral scale
count of 100 and a coefficient of variation of 2 per cent, which is a degree
of variation closely approached by many species. Thus, the standard devia-
tion is two scales. Then, by the use of random sampling numbers (Tippett,
1927; Fisher and Yates, 1938, pp. 18 and 82) I have selected ten random
samples, each comprising 100 specimens. The distributions of the entire
normal population and each of the samples are shown in Table 9. The fit
for all samples taken together is, by the chi-square test, P = 0.77, which is
quite high; that is, the fit is very close. If we take only the first five sets
of samples, the fit is not so good, for P = 0.20, the greatest deviation from
normal being the low number of specimens with 101 ventrals (drawn 73;
expected 88.0). Some of the individual samples will be observed to have
still poorer fits.
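A comparable experiment can be repeated with the following Python sketch. It builds the population by ordinates, as in Table 9, and then draws ten samples of 100 with a pseudo-random generator in place of printed random sampling numbers. The seed is arbitrary, and the draws are made with replacement, which for 100 specimens out of 100,000 is an assumption of no practical consequence; the samples obtained will of course differ from those tabulated.

```python
import math
import random

MEAN, SD, POPULATION_SIZE = 100.0, 2.0, 100_000


def ordinate_count(x):
    """Number of snakes with x ventrals, from the normal-curve ordinate."""
    z = (x - MEAN) / SD
    return POPULATION_SIZE * math.exp(-0.5 * z * z) / (SD * math.sqrt(2.0 * math.pi))


counts = list(range(91, 110))
population = {x: round(ordinate_count(x)) for x in counts}

rng = random.Random(17)  # arbitrary seed, chosen only for repeatability
weights = [population[x] for x in counts]
samples = [rng.choices(counts, weights=weights, k=100) for _ in range(10)]

for number, sample in enumerate(samples, start=1):
    sample_mean = sum(sample) / len(sample)
    print(f"sample {number:2d}: mean {sample_mean:6.2f}, "
          f"range {min(sample)} to {max(sample)}")
```

The rounded ordinates agree with the population column of Table 9 to within a single specimen in each class.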
TABLE 9
Distribution of Ventrals in a Hypothetical Population of 100,000 Snakes
Normally Distributed, Together with 10 Random Samples
Each Composed of 100 Specimens

Number of   Composition of                     Samples                      Total of
Ventrals    Population*      1   2   3   4   5   6   7   8   9  10          Samples
  91               1
  92               7
  93              44
  94             222        [1, 2]†                                             3
  95             876        [1, 1, 1, 2, 3, 1, 1]†                             10
  96           2,700        [5, 4, 4, 4, 4, 3, 3, 4, 1]†                       32
  97           6,476         7   7   8   6   5   5   9   8   3   6             64
  98          12,098        15  12  15   9  15  12  10  11   5  10            114
  99          17,603        20  18  20  18  22  24  21  14  18  17            192
 100          19,946        16  17  17  20  16  21  17  23  20  19            186
 101          17,603        16  14  13  16  14  16  15  18  20  23            165
 102          12,098         9  17  16  14   8   8  15  12  13   9            121
 103           6,476         6   8   3   8   8   7   8   9   9   6             72
 104           2,700         4   1   4   3   4   2   1   1   4   2             26
 105             876        [1, 1, 1, 3, 1, 1, 2, 3]†                          13
 106             222        [1, 1]†                                             2
 107              44
 108               7
 109               1
Total        100,000       100 100 100 100 100 100 100 100 100 100           1000

* Calculated by ordinates, not areas.
† Entries are given in column order; the samples in which the blank cells fall are not shown.
Acknowledgments
I wish to acknowledge my indebtedness to Messrs. Charles E. Shaw,
James Deuel, and Laurence H. Cook for scale counts, and to Mrs. Elizabeth
Leslie and Alice G. Klauber for assistance in computations. Mr. Joseph R.
Slevin was kind enough to furnish the scale counts of Geophis nasalis. Mr.
C. B. Perkins made several editorial suggestions for improvement.
Summary
A considerable number of tests, both by the chi-square method and the
method of moments, indicate that many of the countable variable char-
acters studied in herpetology, particularly in problems of taxonomy,
follow a normal distribution, or one closely approximating such a distri-
bution. Amongst others this is found to be the case with ventral scale
counts, probably the most important single character used in herpetological
classification.
APPENDIX
I have tried, as far as possible, to eliminate descriptions of routine sta-
tistical methods from the herpetological discussion, mentioning only unusual
points. Statistical texts of such number and variety have lately appeared
that extensive references are no longer necessary. However, some refer-
ences are given below for the use of those not familiar with these methods;
they are limited to a few on each separate element.
The characteristics of the normal curve: Walker (199-211), Treloar
(76-83), Simpson and Roe (70-75), Croxton and Cowden (265-271).
Skewness and kurtosis: Croxton and Cowden (234-245), Treloar (32-35),
Goulden (28-31).
Tables of the normal curve: Abridged tables will be found in nearly
every statistical text, a particularly convenient set being those of
Camp (380-385). The following are more detailed and extensive:
Davenport and Ekas (164-172), Kelley (14-114), Glover (392-411),
Pearson, Part 2 (2-10).
Biological approximations to the normal curve: Simpson and Roe (129-132),
Treloar (34-35). See also the interesting comment in the
Preface to Kelley’s Tables.
Fitting the normal curve to data: (a) by areas, Arkin and Colton (106-108),
Chaddock and Croxton (123-126), Croxton and Cowden (275-280);
(b) by ordinates, Arkin and Colton (108-109), Croxton and
Cowden (271-275).
The chi-square test for normality: Arkin and Colton (109-112), Mills
(626-630), Treloar (219-226).
Chi-square tables: Fisher (118-119), Fisher and Yates (27), Pearson,
Part 1 (26-28), Davis and Nelson (399-405).
The moment tests for skewness and kurtosis: Arkin and Colton (145-149),
Geary and Pearson (1-15), Yule and Kendall (154-166), Tippett
(33-42), Goulden (27-32), Fisher (54-56; 74-79), Madow
(515-517).
In applying the chi-square test I have used the standard deviation of
the sample, rather than the estimated standard deviation of the popula-
tion (Fisher, 53). Edge classes have been combined until the theoretical
frequency was at least 5 in each group. The classes, as suppressed, in all
cases numbered less than 20 (Kenney, Vol. 2, p. 170). The degrees of free-
dom were taken at 3 less than the number of classes as suppressed (Rider,
pp. 109-110), since the theoretical distribution is made to conform to the
actual in total number of variates, mean, and standard deviation. In fitting
a normal curve to the data, except in a few cases where grouping has been
necessary, I have used the ordinate, rather than the area method, as seems
to be preferable for discrete variates (Baten, p. 94). For this reason Shep-
pard’s correction was not made in calculating the standard deviation of the
sample. The differences involved in employing the two methods will
usually be unimportant, unless near some assumed critical level of sig-
nificance.
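The procedure just described can be sketched in Python, using the 331-specimen distribution tabulated in the main text. Several details are assumptions made for this illustration: the tails beyond the observed range are folded into the end classes before pooling, and the probability is taken from the closed-form series for the chi-square integral rather than from printed tables. The function names are hypothetical.

```python
import math

observed = {16: 1, 17: 10, 18: 34, 19: 65, 20: 95,
            21: 74, 22: 36, 23: 13, 24: 2, 25: 1}

n = sum(observed.values())
mean = sum(x * f for x, f in observed.items()) / n
# Standard deviation of the sample (divisor n), no Sheppard's correction.
sd = math.sqrt(sum(f * (x - mean) ** 2 for x, f in observed.items()) / n)


def ordinate(x):
    """Theoretical frequency at x by the ordinate method."""
    z = (x - mean) / sd
    return n * math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def pool_edges(obs, exp, minimum=5.0):
    """Combine edge classes until each theoretical frequency is >= minimum."""
    obs, exp = list(obs), list(exp)
    while len(exp) > 1 and exp[0] < minimum:
        o, e = obs.pop(0), exp.pop(0)
        obs[0] += o
        exp[0] += e
    while len(exp) > 1 and exp[-1] < minimum:
        o, e = obs.pop(), exp.pop()
        obs[-1] += o
        exp[-1] += e
    return obs, exp


def chi_square_sf(x, df):
    """Probability of exceeding x, from the finite series for integer df."""
    half = x / 2.0
    if df % 2 == 0:
        term = math.exp(-half)
        total = term
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return total
    root = math.sqrt(x)
    total = math.erfc(root / math.sqrt(2.0))
    term = math.sqrt(2.0 / math.pi) * math.exp(-half) * root
    for j in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * j + 1)
    return total


classes = sorted(observed)
obs = [observed[x] for x in classes]
exp = [ordinate(x) for x in classes]
# Fold the theoretical tails into the end classes (an assumption here).
exp[0] += sum(ordinate(x) for x in range(classes[0] - 20, classes[0]))
exp[-1] += sum(ordinate(x) for x in range(classes[-1] + 1, classes[-1] + 21))

pooled_obs, pooled_exp = pool_edges(obs, exp)
chi_square = sum((o - e) ** 2 / e for o, e in zip(pooled_obs, pooled_exp))
df = len(pooled_obs) - 3  # total, mean, and standard deviation are fitted
p_value = chi_square_sf(chi_square, df)

print(f"classes {len(pooled_obs)}, chi-square {chi_square:.3f}, "
      f"d.f. {df}, P = {p_value:.2f}")
```

With these choices the pooled classes number seven, leaving four degrees of freedom, and the resulting P is close to the value of 0.95 reported for the ordinate curve.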
I have used Fisher’s methods in determining the significance of skewness
and kurtosis.
BIBLIOGRAPHY OF STATISTICAL TEXTS
Arkin, H. and Colton, R. R.
Statistical Methods, Fourth Edition, New York, 1940.
Baten, W. D.
Elementary Mathematical Statistics, New York, 1938.
Camp, B. H.
The Mathematical Part of Elementary Statistics, New York, 1931.
Chaddock, R. E. and Croxton, F. E.
Exercises in Statistical Methods, Cambridge, Mass., 1928.
Croxton, F. E. and Cowden, D. J.
Applied General Statistics, New York, 1939.
Dahlberg, G.
Statistical Methods for Medical and Biological Students, London, 1940.
Davenport, C. B. and Ekas, M. P.
Statistical Methods in Biology, Medicine and Psychology, Fourth Edition,
New York, 1936.
Davies, G. R. and Yoder, D.
Business Statistics, New York, 1937.
Davis, H. T. and Nelson, W. F. C.
Elements of Statistics, Bloomington, Ind., 1935.
Dunlap, J. W. and Kurtz, A. K.
Handbook of Statistical Nomographs, Tables and Formulas, Yonkers-on-Hudson, 1932.
Elderton, W. P.
Frequency Curves and Correlation, Cambridge, England, 1938.
Ezekiel, M.
Methods of Correlation Analysis, New York, 1930.
Fisher, R. A.
Statistical Methods for Research Workers, Seventh Edition, Edinburgh
and London, 1938.
Fisher, R. A. and Yates, F.
Statistical Tables for Biological, Agricultural and Medical Research,
Edinburgh and London, 1938.
Geary, R. C. and Pearson, E. S.
Tests of Normality, Cambridge, England, 1938.
Glover, J. W.
Tables of Applied Mathematics in Finance, Insurance, Statistics, Ann
Arbor, 1923.
Goulden, C. H.
Methods of Statistical Analysis, New York, 1939.
Kelley, T. L.
The Kelley Statistical Tables, New York, 1938.
Kenney, J. F.
Mathematics of Statistics, 2 vols., New York, 1939.
Kurtz, A. K. and Edgerton, H. A.
Statistical Dictionary of Terms and Symbols, New York, 1939.
Madow, W. G.
Note on Tests of Departure from Normality, Jour. Am. Statistical
Assn., Vol. 35, No. 211, pp. 515-517, Sept., 1940.
Mills, F. C.
Statistical Methods Applied to Economics and Business, Revised Edition,
New York, 1938.
Otis, A. S.
Normal Percentile Chart, Yonkers-on-Hudson, 1938.
Pearson, K.
Tables for Statisticians and Biometricians, Part 1, Cambridge, England,
1914 (also Third Edition, 1930); Part 2, 1931.
Pearl, R.
Introduction to Medical Biometry and Statistics, Third Edition, Phila-
delphia, 1940.
Peters, C. C. and Van Voorhis, W. R.
Statistical Procedures and their Mathematical Bases, New York, 1940.
Rider, P. R.
An Introduction to Modern Statistical Methods, New York, 1939.
Simpson, G. G. and Roe, Anne.
Quantitative Zoology, New York, 1939.
Snedecor, G. W.
Statistical Methods, Applied to Experiments in Agriculture and Biology,
Third Edition, Ames, Iowa, 1940.
Thurstone, L. L.
The Fundamentals of Statistics, New York, 1927.
Tippett, L. H. C.
Tracts for Computers, XV, Random Sampling Numbers, Cambridge,
England, 1927.
The Methods of Statistics, Second Edition, London, 1937.
Treloar, A. E.
Elements of Statistical Reasoning, New York, 1939.
Walker, Helen M.
Mathematics Essential for Elementary Statistics, New York, 1934.
Yule, G. U. and Kendall, M. G.
An Introduction to the Theory of Statistics, Eleventh Edition, London,
1937.
Figure 1. Normal Curve.
Figure 2. Leptokurtic Distribution.
Figure 3. Platykurtic Distribution.
Figure 4. Positive Skewness.
Figure 5. Negative Skewness.
Klauber: Populations and Samples
33
II. ILLUSTRATIONS OF THE RELATIONSHIP BETWEEN
POPULATIONS AND SAMPLES
Introduction
The relationship between a sample, whether an individual specimen or
a series of specimens, and the total population out of which it was col-
lected, is always somewhat uncertain. We have the sample before us; it
is tangible; we know as much about it as our senses and our methods
of investigation permit us to learn. Behind it lies the population which
the sample represents, indefinite and nebulous, and in many ways un-
known. It is true that the methods of mathematical statistics permit the
approximate definition of a population from a sample; yet even with
these formulas the population is presented only as a sort of shadow, out-
lined by statements to the effect that it probably has such and such
characteristics, and there is a certain percentage of chance that it falls
within such and such limits. But its exact form, character, and limita-
tions we can never know.
In herpetology, as in other branches of biology, we deal with samples.
They are the particular specimens which we have been able to acquire
for study. Some of these samples, often the first collected, are assigned
special importance by being selected as taxonomic or nomenclatorial ref-
erence guides or anchors. These are the types. But all the while we are
studying and classifying these samples, and attempting to differentiate
them from others, we are not really thinking of the samples themselves,
but of the populations, still in the wild, which the samples represent. For
if taxonomy is to have any real purpose, it is not primarily the determina-
tion of the similarities and differences between two or more individuals
which we have at hand, but is a judgment respecting these similarities and
differences as they are manifested in the original populations from which
the samples were drawn. Whether we use the mathematical formulas for
estimating the characteristics of populations from samples, or draw infer-
ences as to these characteristics somewhat unconsciously, we are nonethe-
less really aiming at a definition of the population rather than the sample.
Of course it is individual variation that leads to the uncertainty. Were
the animals of a single kind invariant we would know at once all about a
population (except its number) from a single sample. But all animals,
however closely related, differ in some degree from each other; the prob-
lem is to estimate, from the known spread in a sample, how wide the range
of these differences becomes in an entire population. Even if two samples
are different we cannot be sure, without investigation, that both may not
be included within the spread or range of the entire population. For ex-
ample, one snake may have 150 ventral scutes and another specimen 160.
How are we to know whether the entire population — a single homogeneous
group of these snakes — does not contain individuals running from as low
as 145 to as high as 170 scutes, thus including both? Parenthetically, I
may say that this discussion has nothing to do with the interpretation of
the extent of these differences into classification — that is, whether a given
difference is great enough to warrant subspecific or specific recognition,
or whether it should be considered merely an intrasubspecific territorial
or ecological variation. The problem has to do with the determination
of differences rather than their interpretation in taxonomy and nomen-
clature.
The reasons why the relationships between samples and parent popu-
lations are sometimes ignored, even today when the statisticians have
made available the formulas governing such relationships, are, first, their
expression in mathematical terms, which may seem too abstract to permit
visualizing the result; and secondly, the absence of the population itself
for comparison. For the latter remains always hazy and ill-defined, and
we never know how well our description, based on the samples, really fits
it. We may presume that the larger the sample the more representative it
becomes of the population — that is, the more closely its characteristics
are likely to approach those of the population, but we must accept this
largely on faith or inherent common sense. This leads to a tendency to
treat the sample as if it were the population, and to draw unwarranted
conclusions with respect to identities, similarities, and differences from
other populations.
While in actual practice we can never secure an entire population for
study (except of an animal approaching extinction) we may experiment
with theoretical or artificial populations, in any size and form desired, by
setting up large groups of individuals segregated into classes premised on
the variation in some particular character. An example of such a popula-
tion, using subcaudal scales as the basic character, would be the follow-
ing, comprising, for the sake of simplicity, only four classes:
Number of      Number of
Subcaudals     Specimens
13                   185
14                 2,149
15               131,650
16                   477
Total population 134,461
From such a population we may then select samples quite at random and
thus see in operation, without recourse to mathematical formulas, the prin-
ciples which cause samples to resemble their parent populations; and how
they fluctuate and differ from other samples drawn from the same or
different populations. Thus, we will have before us, for continuous ex-
amination and comparison, both the entire parent population, simplified
and perfected in character as compared to a real population, and the
sample. We can watch the sample grow (in numbers, not in the size of
its individuals) and witness the favorable effects of larger samples or the
adverse effects of heterogeneity in samples. All the while the formulas of
the mathematicians will be at work, but we will not be using them; we
will see only their results. To carry out such a program the tests on
artificial populations which follow have been made; they will serve to
illustrate some of the principles of the relationship between populations
and samples.
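The procedure just described, drawing specimens at random from a fully known artificial population, can be imitated on a computer. The following Python sketch is only an illustration, not the method used for the tests in this paper (which relied on printed tables of random numbers). The names `draw_sample` and `class_proportions` and the seed are assumptions introduced here. It draws samples of increasing size from the four-class subcaudal population given above and prints the class proportions of each.

```python
import random
from collections import Counter

# The four-class subcaudal population given in the text.
POPULATION = {13: 185, 14: 2_149, 15: 131_650, 16: 477}


def draw_sample(population, size, rng):
    """Draw `size` specimens at random, with replacement.

    Drawing with replacement treats the population as if it were infinite,
    so that withdrawing a few specimens leaves its composition unchanged.
    """
    classes = list(population)
    weights = [population[c] for c in classes]
    return rng.choices(classes, weights=weights, k=size)


def class_proportions(counts):
    """Convert class counts to proportions of the total."""
    total = sum(counts.values())
    return {c: n / total for c, n in sorted(counts.items())}


if __name__ == "__main__":
    rng = random.Random(1941)  # arbitrary seed, chosen only for repeatability
    print("population", class_proportions(POPULATION))
    for size in (10, 100, 1_000, 10_000):
        sample = draw_sample(POPULATION, size, rng)
        print(size, class_proportions(Counter(sample)))
```

As the sample grows, its proportions should settle toward those of the population, although any single small sample may depart from them widely.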
An artificial population has certain fundamental advantages over any
real one. First, as previously mentioned, it may be visualized in its en-
tirety, and thus be made available for comparison with samples; secondly,
it may be very large, so large, in fact, that it may be assumed to remain
unchanged in composition when a few individuals have been withdrawn as a
sample; and lastly, it may be designed to follow any scheme of variation,
and to fit that scheme perfectly, avoiding the complicating peculiarities
found in every real population. For in considering a real population it is
often difficult to divorce the relationship to be demonstrated from the
particular characteristics of that species, its variations and morphology.
When these tests on theoretical populations are made there is no
guarantee that the result will follow a particular course, for each sample will
be a truly random or chance sample. Thus, one may start out to prove a
given point (for instance that the means of two separate samples tend to
approach each other as the samples are increased in size) and by some
freak of chance the first trial might prove exactly the opposite. But re-
peated trials will surely demonstrate the truth of this proposition; in the
long run the results will follow the mathematical formulas, but without
their seeming complications. For this reason several tests will usually be
made to illustrate each type of relationship.
It should be stated for the benefit of those unfamiliar with statistical
methods that there is nothing having the slightest originality or novelty
developed herein with respect to the mathematical relationships between
populations and samples. And of course I am not presenting these illustra-
tive examples for the purpose of proving the validity of mathematical
formulas; such proofs are available in any statistical text. The purpose
of the tests is to give a direct picture of the relationship, without
recourse to the formulas. However, from time to time, after demonstrating a
relationship by example, I have pointed out, in the interest of clarity, what
formula is involved, and how well the illustrative test follows it. But
certainly this is with no idea of furnishing a proof where none is needed.
Artificial Populations
Populations may vary in several ways, such as the number of individuals
included, the arithmetical mean or average value of a character, and the
extent and nature of its variability, that is, how closely and in what way
it is dispersed about the mean. In the present discussion, in the interest of
simplicity, I shall prepare my populations according to rather rigid
specifications. First, only one character at a time will be considered; for
example, a population will be made up of a group of individuals having
different numbers of ventral scutes, and all other features of real reptiles will
for the time be ignored. Secondly, with a few exceptions, each popula-
tion will contain exactly 100,000 individuals, this number being sufficiently
large to demonstrate the results of sampling without being cumbersome.
However, the removal of a few individual specimens will not change the
relative class-compositions of the remaining population, which is treated,
in this regard, as if it were infinite. Next, the characters discussed will be
of the type always expressible in integral terms, as far as any single speci-
men is concerned; that is, they will be countable (rather than measure-
able) characters, such as ventrals, subcaudals, body blotches, etc. Further,
in the interest of simplification, the population will always be so
arranged that its arithmetical mean or average will be expressible as an
integer; for example, the mean in these hypothetical populations will
always be exactly 166 ventrals, or 48 blotches, instead of 166.352 ventrals,
or 48.713 blotches, as would be the case in a real population. This simpli-
fication can be made without in any way interfering with the demon-
stration of the trends of samples, but has the important practical ad-
vantage that the number of individuals in each class above the mean equals
the number in the corresponding class below, if the distribution be sym-
metrical. Finally, the populations will be normally distributed, that is, the
variations will follow the normal curve of error. This distribution has
several advantages: it has been extensively investigated and tabled, and
thus the populations can be set up with only the most elementary calcula-
tions; its distribution may be fully described and fixed by three simple
statistics (the mean, the number of individuals, and the standard
deviation); and finally, this type of distribution is closely approximated by
many characters in natural history, and herpetology is no exception, as
shown by investigations previously made. 1
I have stated that these normal populations may be completely defined
by three statistics — the number of individuals, the mean, and the standard
deviation. It will be desirable to show how populations change when one
of these statistics or parameters varies while the other two remain constant. 2
In Tables 1, 2, and 3, such variations are set forth. I have chosen to
consider the populations as representing the distribution of the ventral
scutes in a hypothetical species of snake; of course, they might equally
well have signified any other character. Table 1 shows three normal
populations, each about 100 times as large as the one before. To avoid confusion these
have not been given the slight adjustment necessary to cause them to
total exactly to the nearest even thousand.
1 See Part I of this series.
2 The word "statistic" is usually taken to refer to a numerical characteristic of a sample,
while "parameter" refers to the corresponding characteristic of a population.
TABLE 1
Effect of Variation in the Number of Individuals in a Population.
Number of    Population    Population    Population
Ventrals        No. 1         No. 2         No. 3
 95              ....          ....            15
 96              ....            13         1,338
 97                 4           443        44,318
 98                54         5,399       539,910
 99               242        24,197     2,419,707
100               399        39,894     3,989,423
101               242        24,197     2,419,707
102                54         5,399       539,910
103                 4           443        44,318
104              ....            13         1,338
105              ....          ....            15
Total             999        99,998     9,999,999
An interesting feature of Table 1 is the increase in the over-all range —
minimum to maximum — that follows an increased population.
In Table 2 the mean of one population has been shifted from 100 to
98. It will be observed that the sizes of the groups or classes of variates
remain otherwise unaffected.
TABLE 2
Effect of Variation in the Mean of a Population.
Number of    Population    Population
Ventrals        No. 1         No. 2
             Mean = 100     Mean = 98
 94              ....            13
 95              ....           443
 96                13         5,399
 97               443        24,197
 98             5,399        39,894
 99            24,197        24,197
100            39,894         5,399
101            24,197           443
102             5,399            13
103               443          ....
104                13          ....
Total          99,998        99,998
In Table 3 the effect of changing the standard deviation (σ) of a
population is shown, the number of specimens remaining constant at 100,000,
and the mean at 100. It will be noted that the spread or scatter increases,
for the standard deviation is a measure of dispersion. In this and subse-
quent populations I have in some instances made slight adjustments in
the last figure of one or two of the most populous classes to cause the
total to equal 100,000 exactly. This facilitates comparisons without chang-
ing to an appreciable degree the chances involved in drawing random
samples from that population.
TABLE 3
Effect of Variation in the Standard Deviation of a Population.
             Population    Population    Population
Ventrals        No. 1         No. 2         No. 3
              σ = 0.8       σ = 1.0      σ = 1.333
 94              ....          ....             1
 95              ....          ....            26
 96              ....            13           332
 97                44           443         2,380
 98             2,191         5,399         9,714
 99            22,831        24,197        22,586
100            49,868        39,896        29,922
101            22,831        24,197        22,586
102             2,191         5,399         9,714
103                44           443         2,380
104              ....            13           332
105              ....          ....            26
106              ....          ....             1
Total         100,000       100,000       100,000
A few notes on the method of forming these populations should be
recorded. Since we are dealing with characters that can take only discrete
values, the ordinates rather than the areas of the normal curve have been
used in segregating a population into classes. If one sets up a population
by areas, the square of the standard deviation will always come out too
high by 1/12, this being the amount of Sheppard’s correction for group-
ing. But in the present instance, since the variates are integral, they always
take a central position in each class and no correction should be made. A
comparison of distributions by ordinates and areas is given in Table 4 for
one value of the standard deviation (σ = 1). It will be noted that the
area basis gives a slightly wider dispersion.
TABLE 4
Effect of Calculating Distributions by Ordinates and by Areas.
Number of    Population      Population
Ventrals        No. 1           No. 2
             by Ordinates     by Areas
 96                13              23
 97               443             598
 98             5,399           6,060
 99            24,197          24,173
100            39,896          38,292
101            24,197          24,173
102             5,399           6,060
103               443             598
104                13              23
Total         100,000         100,000
The standard deviation of the population as finally set up is never ex-
actly equal to that sought, even when the ordinate method is employed.
However, no population that I have used differs in its standard deviation
from the figure desired by more than 0.001. For example, in the third
population given in Table 3, the standard deviation calculated, after the
population was set up, was found to be 1.33312 instead of 1.33333. Obviously,
this slight difference will not affect the results in the sampling tests that
I have made. The particular table employed in setting up the populations
is that of W. F. Sheppard in Karl Pearson: Tables for Statisticians and
Biometricians, Part 1, Ed. 3, 1930, pp. 2-8.
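The two ways of segregating a normal population into integral classes, by ordinates and by areas, may be sketched in Python as follows. This is an illustration under stated assumptions and not the procedure actually followed, which used Sheppard's printed tables. The function names are introduced here, the normal curve is computed from `math.exp` and `math.erf`, and no adjustment is made to force the total to exactly 100,000. For that reason the central class by ordinates comes out as 39,894, as in Table 1, rather than the adjusted 39,896 of Table 4.

```python
import math


def normal_pdf(z):
    """Ordinate of the standard normal curve at z."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Area of the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def population_by_ordinates(mean, sd, size, half_width):
    """Class frequency proportional to the ordinate at the class centre."""
    return {
        k: round(size * normal_pdf((k - mean) / sd) / sd)
        for k in range(mean - half_width, mean + half_width + 1)
    }


def population_by_areas(mean, sd, size, half_width):
    """Class frequency proportional to the area between k - 0.5 and k + 0.5."""
    pop = {}
    for k in range(mean - half_width, mean + half_width + 1):
        low = normal_cdf((k - 0.5 - mean) / sd)
        high = normal_cdf((k + 0.5 - mean) / sd)
        pop[k] = round(size * (high - low))
    return pop


def variance(pop):
    """Variance of a population given as {class value: frequency}."""
    n = sum(pop.values())
    mean = sum(k * c for k, c in pop.items()) / n
    return sum(c * (k - mean) ** 2 for k, c in pop.items()) / n


if __name__ == "__main__":
    by_ord = population_by_ordinates(100, 1.0, 100_000, 4)
    by_area = population_by_areas(100, 1.0, 100_000, 4)
    for k in by_ord:
        print(k, by_ord[k], by_area[k])
    # The area method should exceed the ordinate method by roughly 1/12.
    print(variance(by_area) - variance(by_ord), 1 / 12)
```

The printed difference between the two variances should lie close to 1/12, the amount of Sheppard's correction mentioned above.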
It will be observed that changing the mean or the number of indi-
viduals in a population utilizes the same or proportionate numbers in
each class; only a change in the standard deviation requires a completely
new set of figures. For the purposes of this investigation, populations of
100,000 with the following 9 values of the standard deviation were set
up: 0.667, 0.8, 1, 1.333, 2, 2.857, 4, 6.667, 10. It was found that these
would fit almost any variable character used in herpetological classifica-
tion closely enough to test an illustrative sampling problem.
Before leaving these populations for the experiments in sampling, I
wish to show the effect of heterogeneity on a composite population, each
component of which is normally distributed. Let us assume a population
composed of 100,000 males and an equal number of females, and observe
the effect on the distribution of the combined ventral scale counts, as
sexual dimorphism increases. The standard deviation will be taken as 1.333
in both sexes, but the means will be caused to diverge. In the first com-
posite population both sexes average 100 ventrals; in the second, the
females average 101 and the males 99; in the third the females 102 and
males 98, etc. In each successive distribution the difference between the
means increases by 2, but the composite mean remains at 100. The results
are shown in Table 5. It will be observed in a case such as this, that, if
the means are different, the composite distribution ceases to be normal;
it becomes flat-topped, and, as the difference between the means increases,
it even becomes bimodal, as is evident in the last two columns. It can be
shown that bimodality begins when the difference between the means
exceeds twice the standard deviation. Estimating population characteristics
from samples, using rules and formulas premised on the substantial
normality of the population, will not produce accurate results when the normal
components differ enough to produce marked abnormality in the composite
group. Thus, combining sexes should usually be avoided in making
comparisons, if there be an important degree of sexual dimorphism.
TABLE 5
Effect of Heterogeneity on Composite Populations
when Each Component is Normally Distributed.
σ = 1.333
Number of    Difference Between the Means of Males and Females
Ventrals           0           2           4           6
 91             ....        ....        ....           1
 92             ....        ....           1          26
 93             ....           1          26         332
 94                2          26         332       2,380
 95               52         333       2,380       9,714
 96              664       2,406       9,715      22,586
 97            4,760      10,046      22,612      29,923
 98           19,428      24,966      30,254      22,612
 99           45,172      39,636      24,966      10,046
100           59,844      45,172      19,428       4,760
101           45,172      39,636      24,966      10,046
102           19,428      24,966      30,254      22,612
103            4,760      10,046      22,612      29,923
104              664       2,406       9,715      22,586
105               52         333       2,380       9,714
106                2          26         332       2,380
107             ....           1          26         332
108             ....        ....           1          26
109             ....        ....        ....           1
Total        200,000     200,000     200,000     200,000
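The statement that a composite of two equal normal components turns bimodal once their means differ by more than twice the standard deviation can be checked numerically. The Python sketch below is an illustration only. The function names are assumptions introduced here, the frequencies are left unrounded, and the modes are counted on the integral classes alone, as in Table 5.

```python
import math


def normal_ordinate(x, mean, sd):
    """Ordinate of a normal curve with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def composite_population(mean, sd, difference, size_per_sex, half_width=12):
    """Sum of two equal normal components whose means differ by `difference`."""
    male_mean = mean - difference / 2
    female_mean = mean + difference / 2
    return {
        k: size_per_sex
        * (normal_ordinate(k, male_mean, sd) + normal_ordinate(k, female_mean, sd))
        for k in range(mean - half_width, mean + half_width + 1)
    }


def count_modes(pop):
    """Count classes that are strictly more populous than both neighbours."""
    values = [0.0] + [pop[k] for k in sorted(pop)] + [0.0]
    return sum(
        1
        for i in range(1, len(values) - 1)
        if values[i] > values[i - 1] and values[i] > values[i + 1]
    )


if __name__ == "__main__":
    sd = 1.333
    for difference in (0, 2, 4, 6):
        pop = composite_population(100, sd, difference, 100_000)
        print(difference, difference > 2 * sd, count_modes(pop))
```

With a standard deviation of 1.333 the critical difference is 2.667, so differences of 0 and 2 should give one mode and differences of 4 and 6 should give two, in agreement with the columns of Table 5.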
The Methods of Random Sampling
Random sampling tables are available and their use described in the
following publications: (a) Tracts for Computers, No. 15, Random
Sampling Numbers arranged by L. H. Tippett, pp. VIII + 24, Cambridge
University Press, 1927; (b) Statistical Tables for Biological, Agricultural,
and Medical Research, by Fisher and Yates, pp. 18-20, 82-87, London and
Edinburgh, 1938. The first table comprises 208 columns, each column
containing 50 4-figure numbers; the second 30 columns, each column
containing 50 10-figure numbers. The individual single-figure columns
can be grouped in a great variety of ways; any five contiguous single-figure
columns may be employed as random selections of 5-figure numbers,
which may in turn be used directly in drawing individuals from a
population of 100,000. For example, we set up our population with equiva-
lent limiting numbers as given in Table 6. Then we have only to decide
on a method of selecting five figure numbers from one of the columns
(or combinations of parts of columns) in either table and we have a series
of ventral counts by chance. We may use dice, cards, a roulette wheel, or
bingo numbers to select both the page and the group of columns which
are to be used.
TABLE 6
An Example Application of Random Sampling Numbers.
Number of    Population      Inclusive
Ventrals     Distribution    Numbers
 96                 13       00001*-00013
 97                443       00014-00456
 98              5,399       00457-05855
 99             24,197       05856-30052
100             39,896       30053-69948
101             24,197       69949-94145
102              5,399       94146-99544
103                443       99545-99987
104                 13       99988-100000
Total          100,000
* The number 00000 is recorded as 100,000.
Suppose our method of selection leads to the numbers in columns 25-29
on p. XII of Tippett. Then the first five samples are represented by the
numbers 00433, 48901, 27228, 72094, and 13224. Referring to Table 6,
we find that we have drawn, in order, snakes with counts of 97, 100, 99,
101, and 99 ventrals.
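The conversion of five-figure random numbers into ventral counts by means of the inclusive limits of Table 6 may be sketched in Python as follows. The names `TABLE_6`, `UPPER_LIMITS`, and `ventrals_for` are assumptions introduced here for illustration. The five numbers tried are those quoted above from Tippett.

```python
from bisect import bisect_left
from itertools import accumulate

# (ventral count, number of specimens) for the population of Table 6.
TABLE_6 = [
    (96, 13), (97, 443), (98, 5_399), (99, 24_197), (100, 39_896),
    (101, 24_197), (102, 5_399), (103, 443), (104, 13),
]

# Upper inclusive limit of each class: 13, 456, 5855, ..., 100000.
UPPER_LIMITS = list(accumulate(count for _, count in TABLE_6))


def ventrals_for(random_number):
    """Return the ventral count assigned to a five-figure random number."""
    n = int(random_number)
    if n == 0:
        n = UPPER_LIMITS[-1]  # 00000 is recorded as 100,000
    if not 1 <= n <= UPPER_LIMITS[-1]:
        raise ValueError(f"number out of range: {random_number!r}")
    # The first class whose upper limit is not below n contains n.
    return TABLE_6[bisect_left(UPPER_LIMITS, n)][0]


if __name__ == "__main__":
    drawn = ["00433", "48901", "27228", "72094", "13224"]
    print([ventrals_for(number) for number in drawn])
```

Keeping only the upper limit of each class is sufficient, since every number falls in the first class whose limit it does not exceed.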
Random selections from heterogeneous populations may be made by
using dice, the odd numbers representing males, for example, and the even
females. Or, if the population components are not evenly divided, we
may use cards or a roulette wheel, allocating the numbers in any desired
ratios; or a two or three figure column, selected by lot from either of the
tables of random numbers. It is only necessary that we play the game
fairly and use a system without bias. Many ways of solving such prob-
lems of selection will readily suggest themselves to any one accustomed
to games of chance. Even the tables of random sampling numbers may
be supplanted by the use of numbered discs or balls, although this will
slow selection and introduce the possibility of bias through mechanical
imperfection. The totalizer wheels of an old speedometer may be used as
a selector by freeing them from each other, but they must be well bal-
anced to avoid concentration on particular numbers.
Variations of the Mean
So much for discussions of the methods of setting up populations and
drawing random samples from them. We shall now put these schemes to
work to illustrate various relationships between populations and samples,
and how samples tend to vary. We shall, as far as possible, select illustra-
tions approximating situations found in the herpetological field.
It is first desired to follow the trend in the mean of the ventral scale
counts in samples representing a homogeneous population of rattlesnakes.
The investigation is limited to one sex so that sexual dimorphism will not
complicate the result. There are, as usual, 100,000 individuals in the
population. The mean is 200 ventrals, and the coefficient of variation is
2 per cent, a figure representative of homogeneous series of rattlers. The
standard deviation is then 4, since the coefficient of variation is 100σ/mean, so that σ = 2 × 200/100 = 4.
194 211
192 207
191 205
Note: To visualize the population from which these samples were drawn see the second
column in Table 18.
Table 11 is interesting in showing a considerable variation in the final
scores; particularly Sample 2 reaches a range rather close to the true
population range for so few specimens.
TABLE 11
Changes in the Range with Increasing Sizes of Samples.
Body Blotches. Population Mean 40;
Absolute Range 23 to 57.
Specimen      Sample No. 1    Sample No. 2    Sample No. 3    Sample No. 4
Number        Min.   Max.     Min.   Max.     Min.   Max.     Min.   Max.
1             39     39       43     43       44     44       43     43
2
36 .... 41
41
39
3
46
4
35 .... 36
40
5
.... 44
38
6
40
7
8
.... 45
49
37
9
46
36
10
36
11
35
12
13
14
15
16
29
45
17
.... 46
18
.... 50
19
20
32
21
22
23
47
24
25
Final Score   35     46       29     50       32     49       36     47
Note: For the population distribution see Table 13.
Before proceeding to the effects of heterogeneity I wish to carry further
the range experiments on the population used in Table 11, by bringing
the samples up to 200 each; but to conserve space I shall combine each
series in groups of ten. Thus it will only be possible to tell within a range
of 10 specimens when a new record was made. The results are set forth
in Table 12. It will be seen how much greater are the ranges attained in
this table with 200 specimens in each sample, as compared to those in
Table 11 having 25 specimens per sample. This relationship between the
number of specimens available and the extreme range of a character should
always be given consideration in taxonomic work, not too much de-
pendence being placed on over-all ranges derived from small samples.
TABLE 12
Changes in the Range with Increasing Sizes of Samples by Groups of 10.
Body Blotches. Population Mean 40; Absolute Range 23 to 57.
Specimen      Sample No. 1    Sample No. 2    Sample No. 3    Sample No. 4
Number        Min.   Max.     Min.   Max.     Min.   Max.     Min.   Max.
1 to 10       33     44       34     50       36     50       34     45
11 to 20
21 to 30
31 to 40
41 to 50
51 to 60
61 to 70
71 to 80
81 to 90
91 to 100
101 to 110
111 to 120
121 to 130
131 to 140
141 to 150
151 to 160
161 to 170
171 to 180
181 to 190
191 to 200
46
31
49
48
30
29
51
32
51
27
49
27
30
28
51
Final Score   30     49       27     51       28     51       27     51
Note: For the population distribution see Table 13.
A table giving the relationship between the standard deviation of a
normal distribution and the mean range, as it varies with the number of
specimens in the sample, is available in Pearson’s Tables for Statisticians
and Biometricians, Part II, Table XXII, pp. 165-166, 1931. We find from
this table that the mean range attained in my Table 11 should be about
15.7; the actual ranges are 11, 21, 17, and 11 blotches, which gives a
mean of 15. In Table 12 the final ranges were 19, 24, 23, and 24; mean
22.5. From Pearson’s table we learn that the range should average 5.49
times the standard deviation (4), or 22.0; this is good agreement for
only 4 samples. Theoretically a sample comprising 5 specimens will have
about twice the range shown by the first two specimens, while 60 speci-
mens will again double the range of the first five. But even 1000 speci-
mens will not again double the range; they will add only 40 per cent.
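The figures quoted from Pearson's table can be approximated without the table by a short numerical integration. The Python sketch below is an illustration, not the source of the tabled values. It rests on a standard result, not derived in this paper, that the mean range of n specimens from a distribution with cumulative function F is the integral of 1 - F(x)^n - (1 - F(x))^n over all x. The function names, the integration limits, and the step count are assumptions introduced here.

```python
import math


def normal_cdf(z):
    """Area of the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def mean_range_in_sd_units(n, lower=-10.0, upper=10.0, steps=20_000):
    """Mean range of n normal specimens, expressed in standard deviations.

    Integrates 1 - F**n - (1 - F)**n by the trapezoid rule.
    """
    h = (upper - lower) / steps
    total = 0.0
    for i in range(steps + 1):
        f = normal_cdf(lower + i * h)
        y = 1.0 - f ** n - (1.0 - f) ** n
        total += y if 0 < i < steps else 0.5 * y
    return total * h


if __name__ == "__main__":
    sd = 4  # standard deviation of the body-blotch population
    for n in (2, 5, 25, 60, 200, 1000):
        ratio = mean_range_in_sd_units(n)
        print(n, round(ratio, 3), round(ratio * sd, 1))
```

For 25 and 200 specimens the products with the standard deviation of 4 should fall near the 15.7 and 22.0 blotches cited above, and the successive ratios for 2, 5, 60, and 1000 specimens should show the slow growth of the range that the text describes.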
This matter of range is thought to be of sufficient interest and im-
portance to warrant the presentation of Table 13, which shows the actual
population from which the samples in Tables 11 and 12 were drawn.
TABLE 13
Normal Distribution of Body Blotches in a Population of 100,000 Specimens.
Mean 40; Standard Deviation 4.
Number of    Number of
Blotches     Specimens
23                  1
24                  3
25                  9
26                 22
27                 51
28                111
29                227
30                438
31                793
32              1,350
33              2,157
34              3,238
35              4,566
36              6,049
37              7,529
38              8,802
39              9,667
40              9,974
The other half of the distribution is not given since it duplicates the
first half in reverse; that is, there are 9,667 specimens with 41
blotches, 8,802 with 42, etc.
A random sample of 4000 specimens drawn from this population had
a range of 23 to 54 blotches; in other words, the lowest individual in the
population was drawn, but the highest drawn was 3 below the absolute
maximum contained in the population.
Table 14 represents the results of sampling the same composite
population discussed under Table 9. The spread is increased and the results
are more erratic than those disclosed in sampling homogeneous
populations. The results are somewhat similar to sampling a normal population
with a higher standard deviation, but this is not exactly true, as the
composite population is not normally distributed.
TABLE 14
Changes in the Range with Increasing Sizes of Samples,
Showing the Effect of Heterogeneity.
Ventral Scale Counts; Absolute Range 162 to 205.
Specimen      Sample No. 1    Sample No. 2    Sample No. 3    Sample No. 4
Number        Min.   Max.     Min.   Max.     Min.   Max.     Min.   Max.
1             187    187      188    188      185    185      183    183
2
191
185
189
182
3
184
193
193
4
184
178
178
178
5
182
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
Final
Score
181
171
175
193
77
176
171
191
175
193
177 193
176 193
It should be understood that
while Tables 9 and 14 are drawn from the same composite population,
they are based on separately selected samples. A comparison of the trends
in the means, as indicated by Tables 7, 8, and 9, with trends and varia-
tions in the over-all ranges as shown in Tables 10, 11, 12, and 14, will
demonstrate how much more accurately the mean represents the popula-
tion mean than the range in the sample represents the population range.
For example, compare the results of Tables 8 and 11, Table 11 being used
only up to and including Specimen 20:
                        Mean    Minimum    Maximum
Population Parameter    40.0       23         57
Sample 1                39.9       35         46
Sample 2                40.9       29         50
Sample 3                40.2       32         49
Sample 4                39.8       36         45
It is not only that the sample range fails to reach the population range,
for this is to be expected, but there is a considerable discrepancy between
the several sample ranges as compared with the slight variation between
the sample means.
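The contrast between the steadiness of sample means and the unsteadiness of sample ranges can be seen in a small simulation. The Python sketch below is an illustration only. It sets up the body-blotch population of Table 13 by ordinates, draws many samples of 20 specimens with replacement, and compares the scatter of their means with the scatter of their ranges. The function names, the seed, and the number of samples are assumptions introduced here and are not taken from the tests reported in this paper.

```python
import math
import random
import statistics


def normal_population(mean, sd, size, half_width):
    """Class frequencies by normal-curve ordinates, rounded to whole specimens."""
    scale = size / (sd * math.sqrt(2.0 * math.pi))
    return {
        k: round(scale * math.exp(-0.5 * ((k - mean) / sd) ** 2))
        for k in range(mean - half_width, mean + half_width + 1)
    }


def sample_means_and_ranges(population, sample_size, n_samples, rng):
    """Draw repeated samples and record the mean and over-all range of each."""
    classes = list(population)
    weights = list(population.values())
    means, ranges = [], []
    for _ in range(n_samples):
        sample = rng.choices(classes, weights=weights, k=sample_size)
        means.append(sum(sample) / sample_size)
        ranges.append(max(sample) - min(sample))
    return means, ranges


if __name__ == "__main__":
    blotches = normal_population(mean=40, sd=4, size=100_000, half_width=17)
    rng = random.Random(17)  # arbitrary seed, chosen only for repeatability
    means, ranges = sample_means_and_ranges(blotches, 20, 500, rng)
    print("scatter of sample means :", round(statistics.pstdev(means), 2))
    print("scatter of sample ranges:", round(statistics.pstdev(ranges), 2))
    print("widest sample range     :", max(ranges), "of a possible", 57 - 23)
```

The scatter of the ranges should come out several times that of the means, which is the point made by the comparison above.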
In subsequent studies of dispersion, comparisons are drawn between
population parameters and sample statistics, with respect to the mean
and range, where each sample comprises 100 specimens, instead of only
20 as above. The results are set forth in Table 22. The consistency of
sample means, and the lack of dependability in the maxima and minima
are there evident. Further, it is to be remembered that these results are
derived from populations that are truly normal. In real populations it is
quite likely that there may be fringe specimens in greater number than
are expected in a normal distribution, especially if juveniles are included.
For juvenile specimens are occasionally so distorted and aberrant that
they probably would not survive. Although the range of a character is
often used in taxonomic work, especially to show whether there is an
overlap between two forms, it is really a rather poor indicator because of
its lack of close adherence to the range of the population. It should never
be used without a statement giving the number of specimens in the sample;
otherwise it is almost without value.
Later, in discussing dispersion in samples, the use of the interquartile
range, as a dispersion indicator, will be examined, and additional examples
of the relationship between population and sample ranges will be adduced.
Dispersions of Samples Compared to Those of Populations
Thus far I have discussed the relationships of samples and parent popu-
lations in respect of the two simplest and most frequently used herpeto-
logical statistics — the mean and the range of variation. Illustrations have
been given showing how these sample statistics change, and in general
tend to approach the population parameters as the samples increase in
size. There remains the important attribute of dispersion, that is the
scatter of the variates about the mean — how closely they adhere to the
mean and what the nature of the dispersion may be. This is an extremely
important statistic in taxonomic problems, especially those having to do
with the significance of differences between subspecies or species. For,
given a certain difference between means, the extent of the overlap be-
tween two forms will obviously depend upon the extent to which the
variates spread on each side of the mean.
Returning to the relationship between samples and the populations
from which they were drawn, I again resort to the method of illustration
by drawing a number of samples from each of several different popu-
lations, setting them beside each other in tabular form to facilitate a
visual comparison. Since samples of appreciable size are rarely identical
(each sample having its own individuality), no single sample can out-
line this relationship completely, which is the reason for drawing a num-
ber of illustrative samples from each population.
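TABLE 15
Changes in the Dispersion of a Sample with Enlargement of the Sample.
Infralabials: Mean = 16, σ = 1
Number of     Distribution                     Successive Sample Steps
Infralabials  of Population      1     2     3     4     5     6     7     8     9    10    11
12                       13   ....  ....  ....  ....  ....  ....  ....  ....  ....  ....  ....
13                      443   ....  ....  ....  ....  ....  ....     1     1     1     1     1
14                    5,399   ....  ....  ....  ....  ....  ....  ....  ....     2     5    19
15                   24,197   ....     1     2     3     3     5    10    15    24    48   129
16                   39,896      1     1     1     1     1     1     7    15    38    74   199
17                   24,197   ....  ....  ....  ....     1     3     4    16    32    61   125
18                    5,399   ....  ....  ....  ....  ....     1     3     3     3     9    25
19                      443   ....  ....  ....  ....  ....  ....  ....  ....  ....     2     2
20                       13   ....  ....  ....  ....  ....  ....  ....  ....  ....  ....  ....
Total               100,000      1     2     3     4     5    10    25    50   100   200   500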
In presenting tables showing the trends of means and over-all ranges with
the growth of samples, each specimen increment was shown separately.
This is usually too cumbersome a method when illustrating the relation-
ship between the dispersion in samples and populations; however, in
Table 15 I have set forth such a growth of one sample by successive steps,
using, as the population, an assumed distribution of infralabials in a
species of rattlesnake, this particular distribution being closely
approximated in several species. Table 15 shows how, as a sample grows, it tends
more closely to approximate the parent population in the character of its
dispersion. But even at 200 specimens the discrepancies are quite con-
spicuous; in this particular sample there are too few specimens with 14
infralabials and too many with 17. By the time the sample has been built
up to 500 specimens these imperfections have mostly disappeared, and the
sample shows a closer resemblance to the parent population. In making such
a comparison, however, it is important to recall how seldom we have a
herpetological sample as large as 500 specimens.
TABLE 16
Dispersions of 10 Samples, Each Comprising 100 Individuals.
Infralabials: Mean = 16, σ = 1
Number of     Distribution                     Sample Number
Infralabials  of Population      1     2     3     4     5     6     7     8     9    10   Total
12
13
13
443
2
2
1
1
6
14
5,399
1
9
7
4
9
3
3
1
7
7
51
15
24,197
29
24
15
15
29
30
22
34
21
16
235
16
39,896
40
37
39
47
37
36
38
39
43
42
398
17
24,197
24
25
37
30
20
28
33
21
23
27
268
18
5,399
6
3
2
4
2
3
3
4
4
8
39
19
443
1
1
1
3
20
13
Total
100,000
100
100
100
100
100
100
100
100
100
100
1000
Table 16 uses the same population, drawing therefrom 10 new samples,
each containing 100 specimens. A considerable variation amongst the
samples will be noted; each seems to have an individuality of its own.
In the summation of the ten samples, bringing the total to 1000
specimens, there is a rather marked deficiency in specimens having 18
infralabials, and correspondingly an overabundance of those with 17. Some of
the samples are badly distorted, while others more closely follow the dis-
tribution of the population. Sample No. 9 is probably the closest fit to
1/1000 of the population, while No. 3 is particularly unbalanced. A
chi-square test of the total of the 1000 specimens gives P=0.21. 4 This is
not a particularly good fit, for it shows that 79 similarly drawn samples
out of 100 would more closely approximate the distribution of the
population.
4 In making these chi-square tests I have taken the degrees of freedom as one less than
the number of classes as suppressed, since a fit has been forced only with respect to the
totals of the theoretical and actual distributions.
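The chi-square comparison of the 1000-specimen total with the population may be sketched in Python as follows. This is an illustration and not the author's own computation. The pooling of the two extreme classes (12 with 13, and 20 with 19) is an assumption made here, chosen because it leaves seven classes and six degrees of freedom. The function names are likewise hypothetical.

```python
import math

# Population frequencies per 100,000 (Table 16), classes 12 to 20 infralabials.
POPULATION = {12: 13, 13: 443, 14: 5399, 15: 24197, 16: 39896,
              17: 24197, 18: 5399, 19: 443, 20: 13}
# Totals of the ten samples (1000 specimens), from the Total column of Table 16.
OBSERVED = {12: 0, 13: 6, 14: 51, 15: 235, 16: 398,
            17: 268, 18: 39, 19: 3, 20: 0}


def pool_tails(freq):
    """Merge class 12 into 13 and class 20 into 19 (ASSUMED pooling)."""
    pooled = {k: float(v) for k, v in freq.items() if 13 <= k <= 19}
    pooled[13] += freq[12]
    pooled[19] += freq[20]
    return pooled


def chi_square(observed, population):
    """Return (chi-square statistic, degrees of freedom) after pooling."""
    obs = pool_tails(observed)
    n = sum(obs.values())
    total = sum(population.values())
    # Expected counts: population proportions scaled to the sample size.
    exp = {k: v * n / total for k, v in pool_tails(population).items()}
    stat = sum((obs[k] - exp[k]) ** 2 / exp[k] for k in obs)
    return stat, len(obs) - 1  # one less than the number of classes


def chi_square_p_even_df(stat, df):
    """Upper-tail probability; this closed form holds for even df only."""
    if df % 2:
        raise ValueError("closed form used here requires an even df")
    half = stat / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k)
                                 for k in range(df // 2))


stat, df = chi_square(OBSERVED, POPULATION)
p = chi_square_p_even_df(stat, df)
print(f"chi-square = {stat:.2f}, df = {df}, P = {p:.2f}")
```

Under these assumptions the probability comes out at roughly 0.2, in reasonable agreement with the figure quoted in the text.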
But the principal point to be observed with respect to these samples,
comparing each with the others, and with the parent population from
which they were drawn, is the varying impression that a taxonomist might
gain from them, having in mind the fact that the taxonomist would
have only one sample before him, neither the population nor the other
samples being available.
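The drawing of such samples can be imitated with a short Python sketch. It is offered only as an illustration of the procedure, not as the method by which the tables were prepared. Drawing with replacement is an assumption that approximates sampling from a very large population, and the seed value and function name are hypothetical.

```python
import random
from collections import Counter

# Population frequencies per 100,000 (Table 16), classes 12 to 20 infralabials.
POPULATION = {12: 13, 13: 443, 14: 5399, 15: 24197, 16: 39896,
              17: 24197, 18: 5399, 19: 443, 20: 13}


def draw_samples(population, n_samples=10, size=100, seed=None):
    """Draw n_samples random samples of `size` specimens each.

    Each specimen's class is chosen with probability proportional to the
    population frequency (sampling with replacement, an ASSUMPTION).
    """
    rng = random.Random(seed)
    classes = list(population)
    weights = list(population.values())
    return [Counter(rng.choices(classes, weights=weights, k=size))
            for _ in range(n_samples)]


samples = draw_samples(POPULATION, seed=1941)  # hypothetical seed
for number, sample in enumerate(samples, start=1):
    row = " ".join(f"{sample.get(c, 0):3d}" for c in sorted(POPULATION))
    print(f"Sample {number:2d}: {row}")
```

Each printed row is one sample of 100, laid out by class from 12 to 20. Changing the seed gives a different set of ten samples, each with the same kind of "individuality" noted above.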
Tables 15 and 16 gave the results of random sampling from a population
having a rather closely concentrated character; after all there is not
much variation in the infralabials of any homogeneous series of rattlesnakes.
Table 17 sets forth the results of sampling a population which is
divided into a greater number of classes. The character considered is the
subcaudal scale count; the mean is 30 and the standard deviation 2. Thus
the coefficient of variation is 6.67 per cent. Such a distribution is not
unusual; it is closely approximated, for example, by female Phyllorhynchus
decurtatus perkinsi. It will be observed that, from the separate samples,
one gains a less accurate suggestion of the parent population than was
the case with the more concentrated character of Table 16. Sample 2 is
particularly distorted; Sample 9 is good as far as the central classes are
concerned, but poorly distributed in the edge-classes. The grand total of
1000 specimens rather closely approximates the parent curve (P by
chi-square = 0.61), although there is an overabundance of specimens with 32
subcaudals, and a shortage of those with 28.
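TABLE 17
Dispersions of 10 Samples, Each Comprising 100 Individuals.
Subcaudals: Mean = 30, σ = 2

Number of    Population                     Sample Number
Subcaudals   Distribution    1    2    3    4    5    6    7    8    9   10   Total
21                      1    -    -    -    -    -    -    -    -    -    -       -
22                      7    -    -    -    -    -    -    -    -    -    -       -
23                     44    [1 specimen; sample not determinable]               1
24                    222    [2 specimens; samples not determinable]             2
25                    876    1    1    1    1    2    -    2    1    -    2      11
26                  2,700    4    2    2    3    5    5    2    2    3    2      30
27                  6,476    7    4    7   10    8    6    1    6    9    7      65
28                 12,098   11    4   12   15    4   12   12    9    9   10      98
29                 17,603   16   26   19   15   17   17   12   18   18   16     174
30                 19,946   18   24   18   22   19   13   21   23   22   18     198
31                 17,603   15   16   21   15   20   15   25   15   17   14     173
32                 12,098   15   11   12   10   11   17   14   16   10   17     133
33                  6,476    6    6    3    4   10   11    6    8    5   10      69
34                  2,700    5    5    2    4    4    1    4    1    7    3      36
35                    876    *    *    2    -    -    2    1    1    -    1       8
36                    222    [1 specimen; sample not determinable]               1
37                     44    [1 specimen; sample not determinable]               1
38                      7    -    -    -    -    -    -    -    -    -    -       -
39                      1    -    -    -    -    -    -    -    -    -    -       -
Total             100,000  100  100  100  100  100  100  100  100  100  100    1000
* The column positions in rows 25 and 35 are placed by means of the sample
totals of 100. The first entry of row 35 (1 specimen) belongs to Sample 1 or
Sample 2. The single specimens of classes 23, 24, 36, and 37 cannot be
assigned to particular samples.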
Passing on to a character with a still wider spread, Table 18 presents
the distributions of the usual 10 samples of 100 individuals each, drawn
from the same population which comprised the basis of Tables 7 and 10,
namely the ventral scutes in a homogeneous series of rattlesnakes with
TABLE 18
Dispersions of 10 Samples, Each Comprising 100 Individuals.
Ventrals: Mean = 200
[Remainder of caption and table body not recoverable from the source scan.]
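[Concluding rows of a table of sample ranges; the heading and earlier rows
are not recoverable from the source scan.]
Sample                      Min.    Max.    M ± 3σ
(row not identified)                        80.0-121.1
6                            83     116     80.4-120.9
7                            79     122     78.7-120.3
8                            84     116     81.1-117.4
9                            85     125     78.9-122.9
10                           84     110     81.8-117.5
Total (Sample of 1000)       79     125     80.2-119.8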
We see that the population minimum is 72, while the sample minimums
vary from 79 to 88; the population maximum is 128 and the sample
maximums vary from 110 to 125. On the other hand the sample statistics
represented by M ± 3σ are much more consistent indicators of the corresponding
population parameter; for the population low figure is 80.0
while the samples vary between 78.7 and 82.0, and the population high
figure is 120, with the samples varying between 117.4 and 122.9. Thus,
there is every reason to recommend the M ± 3σ statistic as a population
indicator as compared to the over-all range. Admittedly the samples I
have used in this test (100 individuals) are relatively large, but similar
advantages in the use of this statistic will be found in the case of smaller
samples. For example, in Table 26 I have built a sample up to 25 specimens
by adding one randomly selected individual at a time, all the while
recording the trend in the over-all range and in the statistic M ± 3σ.
The population used is that shown in Table 13, that is, a homogeneous
series of snakes having an average number of 40 body blotches, with a
standard deviation of 4. All data are calculated from the sample as it
grows, the population being assumed unavailable, as would be the case in
actual practice. Thus, both M and σ change as each specimen is added;
and in each calculation the factor $[N/(N-1)]^{1/2}$ is used in securing the
optimum population value of σ.
TABLE 26
Changes in the Range and in M ± 3σ
as a Sample Increases from 1 Specimen to 25.
Body Blotches. Mean = 40, σ = 4.

                                  Overall Range       Range
Specimen                Count     Min.     Max.       M ± 3σ
Population Parameter      40       23       57       28.0-52.0
Specimen No.  1           44       44       44       44.0-44.0
              2           43       43       44       42.0-44.0
              3           41       41       44       38.1-47.3
              4           37       37       44       32.0-50.5
              5           34       34       44       27.2-52.4
              6           44       34       44       28.1-52.9
              7           41       34       44       29.2-51.9
              8           44       34       44       30.0-52.0
              9           35       34       44       28.2-52.3
             10           33       33       44       26.3-52.9
             11           39       33       44       27.0-52.2
             12           37       33       44       27.1-51.5
             13           42       33       44       27.6-51.5
             14           38       33       44       27.8-51.0
             15           33       33       44       26.9-51.2
             16           31       31       44       25.3-51.7
             17           38       31       44       25.7-51.3
             18           36       31       44       25.8-50.8
             19           42       31       44       26.1-51.0
             20           42       31       44       26.4-51.0
             21           42       31       44       26.7-51.1
             22           40       31       44       27.0-50.9
             23           45       31       45       26.9-51.4
             24           40       31       45       27.2-51.2
             25           32       31       45       26.4-51.4
Note: In determining the range M ± 3σ, the estimated standard deviation of
the population was calculated from the sample.
We see that the extreme or over-all range never approaches closely to
that of the population. Even after 25 specimens are accumulated the range
is only from 5 above the mean to 9 below, a notable unbalance in itself.
But the statistic M ± 3σ reaches a figure quite close to the population
parameter after the accumulation of only 5 specimens, and remains
consistently close to that parameter as long as specimens are added. So once
more the consistency of a statistic which is a multiple of the standard
deviation is demonstrated. However, it should be noted that these strictures
upon the relative values of statistics of the over-all range as compared to
some multiple of the standard deviation, are only pertinent when applied
to variates having a substantially normal distribution, or at least one
which is fairly symmetrical. The over-all range may be a better criterion
in strongly skewed distributions.
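The running calculation of Table 26 can be retraced with a short Python sketch. The blotch counts are transcribed from the Count column of that table. The standard deviation is taken with the divisor N - 1, which is equivalent to applying the factor [N/(N - 1)]^(1/2) to the uncorrected value. The function name is hypothetical, and the sketch is an illustration rather than the author's worksheet.

```python
import statistics

# Body-blotch counts of the 25 specimens, in order of drawing (Table 26).
COUNTS = [44, 43, 41, 37, 34, 44, 41, 44, 35, 33, 39, 37, 42,
          38, 33, 31, 38, 36, 42, 42, 42, 40, 45, 40, 32]


def running_limits(counts):
    """For each sample size n >= 2 return (n, min, max, M - 3s, M + 3s).

    s is the standard deviation estimated from the sample with divisor N - 1.
    """
    rows = []
    for n in range(2, len(counts) + 1):
        seen = counts[:n]
        mean = statistics.mean(seen)
        sd = statistics.stdev(seen)  # divisor N - 1
        rows.append((n, min(seen), max(seen), mean - 3 * sd, mean + 3 * sd))
    return rows


for n, low, high, low3, high3 in running_limits(COUNTS):
    print(f"{n:2d}  range {low}-{high}   M ± 3σ {low3:.1f}-{high3:.1f}")
```

Run as written, the sketch gives 27.2-52.4 at five specimens and 26.4-51.4 at twenty-five, the figures shown in the table, while the over-all range at twenty-five specimens is still only 31 to 45.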
These statistics of populations and samples are primarily necessary in
taxonomic work to demonstrate the validity of differences — namely the
chances that two supposed species overlap in a particular character and
the extent of such overlap. I shall illustrate a typical case of overlap and
how it becomes increasingly evident as more specimens are added to the
available collections, that is, as the samples increase in size.
Table 27 represents the results of sampling two populations coincidently,
the same number of specimens being added to each. The populations sampled
have similar characteristics, except with respect to their averages. They
represent ventral scale counts in two species of snakes, both having
standard deviations of 4; but the mean of one population is 200 scutes
and of the other 184. Thus, the difference between the means is 4 times
the standard deviation of either. The vertical columns in the table are
cumulative samples; that is, the first column shows the first sample
drawn, the second, the first plus the second, etc. To conserve space indi-
vidual drawings are shown up to Sample 5, then by twos, threes, fives, etc.
The interesting feature of this test is that one does not get the feeling that
there is a probable overlap between the two forms until the seventh speci-
men of each has become available; and an actual overlap did not occur until
the drawing of the eighteenth specimen. This is somewhat typical of our
knowledge of rarer forms, which are often first thought to be quite well
separated, and are so noted in keys, but which later are shown to over-
lap, when additional specimens have become available. The probability
that such an overlap would eventually be evident might have been pre-
dicted by calculation as early as Specimen 5. Of course these remarks on
the gradual evidence of an overlap have little to do with the validity of
the two species, since even with the overlap, the difference between the
means is sufficient to warrant recognition. According to Ginsburg's criterion⁵
the small overlap (2.28%) would indicate full species. However,
the discussion of the extent of divergence, its measure and significance, and
5 Zoologica, Vol. 23, pp. 253-286, 1938.
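The overlap figure of 2.28 per cent can be checked with a brief Python sketch. It assumes, as the text states, normal distributions with means of 184 and 200 and a common standard deviation of 4. It further assumes that the overlap is measured as the fraction of either population lying beyond the midpoint between the two means; that reading is an interpretation made here, not a definition given in the text.

```python
from statistics import NormalDist

# The two populations of ventral counts described in the text.
lower = NormalDist(mu=184, sigma=4)
upper = NormalDist(mu=200, sigma=4)

# ASSUMPTION: overlap is judged at the midpoint between the means (192),
# which lies two standard deviations from each mean.
midpoint = (lower.mean + upper.mean) / 2

beyond_lower = 1 - lower.cdf(midpoint)  # share of the lower form above 192
below_upper = upper.cdf(midpoint)       # share of the upper form below 192

print(f"lower form above midpoint: {100 * beyond_lower:.2f}%")
print(f"upper form below midpoint: {100 * below_upper:.2f}%")
```

Both fractions are equal by symmetry, and each comes to about 2.28 per cent, the value cited.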
TABLE 27. Simultaneous Sampling of Two Populations. Ventral Scutes: M₁ = 184, M₂ = 200; σ = 4
[Table body not recoverable from the source scan; the text that followed it
is likewise missing.]
Klauber: Scalation and Life Zones
species. The only exception is Pituophis catenifer. The evidence continues
to multiply that the two supposed subspecies, P. c. annectens and P. c.
deserticola, meet or overlap, but do not intergrade in eastern San Diego
County.
The results of the investigation of the number of ventral scutes are
given in Table 1. Since sexual dimorphism is present in nearly all these
forms, the sexes are treated separately, except in the case of Leptotyphlops
humilis. Also in this species the dorsals, rather than the ventrals are used,
since they can be more accurately counted, and therefore are more often
employed in taxonomic work.
From this tabulation of the thirteen forms we find that there is an almost
universal tendency toward a higher number of ventrals in the desert
specimens, as compared to those which were collected in the more humid⁴ ⁵
cismontane region. No less than ten out of the thirteen forms show this
trend in both sexes; and in every instance the significance is beyond the
usually accepted level of P = 0.05, meaning less than one chance in 20 that
the result has occurred through an accident of random sampling. In the
majority of cases the probability is below one in a thousand, leaving no
doubt as to the reality of the trend.
The exceptions are three in number: Lichanura roseofusca, Coluber
flagellum frenatus, and Crotalus mitchellii pyrrhus. In the first two we
find that the males follow the usual trend, that is, the desert males have
more ventrals than the coastal. The females show no significant territorial
variation in Lichanura; while in C. f. frenatus, the desert females average
lower than those from the coastal side of the mountains, although the
difference is below the usually accepted level of significance.
The last species which fails to follow the trend of the majority is the
rattlesnake C. m. pyrrhus. Here the desert males average slightly higher
than the coastal, and the desert females somewhat lower, but the differences
are below the significance level. It may be seriously doubted whether larger
samples would reverse the condition noted in this species. Of the 13 forms
listed in this study C. m. pyrrhus is the only one, besides Trimorphodon
vandenburghi, which inhabits only a limited zone in the cismontane area;
for although both may rarely be found in the coastal and inland valleys
zones, their infrequency shows that they are more or less strays. Their
headquarters are in the foothill zones on both sides of the mountains. Thus
it may be said, in possible explanation of the deviation of C. m. pyrrhus
from the trend followed by the majority of forms, that it has never been
fully subjected to the coastal influence.
4 The difference in humidity is the outstanding difference between the two regions;
however, I do not claim to have shown that this is the cause of the observed difference in
ventral scale counts, which might result from any of a number of secondary environmental
characteristics, of which temperature is outstanding.
5 As several of the samples are rather small, especially in the case of some species which are
not plentiful in the desert region, I have in all cases used the t-test and the method of
pooling in determining the significance of the difference. The equation is
$$t = (M - M')\,[NN'(N + N' - 2)]^{1/2}\,[(N + N')(Nv + N'v')]^{-1/2},$$
where M and M' are the means of the two samples, N and N' the numbers of specimens in
each of the samples, and v and v' are the variances (standard deviations squared) of the
samples. The t-table is entered at N + N' - 2 degrees of freedom. (R. A. Fisher, Statistical
Methods for Research Workers, Seventh Edition, 1938, p. 128; J. F. Kenney, Mathematics
of Statistics, part 2, 1939, p. 140; P. R. Rider, An Introduction to Modern Statistical
Methods, 1939, p. 91.)
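The pooled t-test described in footnote 5 may be sketched in Python as below. The function name and the trial figures are hypothetical; no data from the study are reproduced. The variances v and v' are taken here with the divisor N, an assumption that makes Nv the sum of squared deviations and so brings the footnote's equation into agreement with the usual pooled form.

```python
import math
import statistics


def pooled_t(sample_a, sample_b):
    """Return (t, degrees of freedom) for two samples, pooling variances.

    t = (M - M') [N N' (N + N' - 2)]^(1/2) [(N + N')(N v + N' v')]^(-1/2)
    with v, v' the sample variances using divisor N (ASSUMPTION).
    """
    n, n2 = len(sample_a), len(sample_b)
    m, m2 = statistics.fmean(sample_a), statistics.fmean(sample_b)
    v, v2 = statistics.pvariance(sample_a), statistics.pvariance(sample_b)
    t = ((m - m2) * math.sqrt(n * n2 * (n + n2 - 2))
         / math.sqrt((n + n2) * (n * v + n2 * v2)))
    return t, n + n2 - 2


# Hypothetical ventral counts, for illustration only.
desert = [231, 228, 235, 230, 233, 229]
coastal = [224, 227, 222, 226, 225]
t, df = pooled_t(desert, coastal)
print(f"t = {t:.2f} with {df} degrees of freedom")
```

The resulting t is then referred to a t-table at the stated degrees of freedom to obtain the probability P.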
It will be observed that the three smallest snakes, Leptotyphlops humilis,
Hypsiglena ochrorhynchus, and Tantilla eiseni, have the highest coefficients
of divergence.⁶ This is probably evidence of the trend, frequently observable,
that slowly moving forms exhibit greater differences per unit of distance
than more widely wandering species.
Of the three colubrids which fail to attain a significance of P = .001,
or greater than one in a thousand, two have not been fully influenced by
zonal extremes; L. g. californiae is quite rare in the desert, and T.
vandenburghi is virtually absent along the coast. Even so, both would
probably attain a significance of .001 with larger samples, since, if the
coefficient of divergence remains unchanged, the significance increases with
the size of the sample.
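The dependence of significance on sample size can be illustrated with a short Python sketch. The coefficient of divergence is taken as the difference between the means divided by half their sum, and t is computed by the pooled formula from summary statistics. The means, variances, and sample sizes used are hypothetical, and the function names are assumptions of this sketch.

```python
import math


def coefficient_of_divergence(m1, m2):
    """Difference between the means divided by half the sum of the means."""
    return (m1 - m2) / ((m1 + m2) / 2)


def pooled_t_from_stats(m1, v1, n1, m2, v2, n2):
    """Pooled t from means, variances (divisor N, ASSUMED), and sample sizes."""
    return ((m1 - m2) * math.sqrt(n1 * n2 * (n1 + n2 - 2))
            / math.sqrt((n1 + n2) * (n1 * v1 + n2 * v2)))


# Hypothetical figures: means of 204 and 200 ventrals, variance 16 in each.
print(f"coefficient of divergence = {coefficient_of_divergence(204, 200):.4f}")
for n in (10, 26, 101):
    t = pooled_t_from_stats(204.0, 16.0, n, 200.0, 16.0, n)
    print(f"N = {n:3d} in each sample: t = {t:.2f}")
```

With equal sample sizes the formula reduces to t = (M - M') [(N - 1)/(v + v')]^(1/2), so that with means and variances held fixed t grows as the square root of N - 1.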
Although a general tendency can be shown to exist in some genera (the
rattlesnakes are an example) for larger species and subspecies to have
higher scale counts, this is not a causative factor in increasing the number
of ventrals in these desert specimens. In several cases desert specimens
run somewhat smaller in size than in the cismontane region, as is the
case, for example, in Crotalus ruber and Coluber flagellum frenatus. The
reverse is true of Leptotyphlops; in other forms the inhabitants of the
two areas do not differ conspicuously in size.
The trend shown definitely to exist in the ventrals in at least ten of
thirteen species is not repeated in any other characteristic, the following
having been tested by the same method: scale rows, caudals, supralabials,
infralabials, body blotches, and tail rings. Some have no variations at all,
such being true of several species in the case of scale rows and labials;
for many colubrids have little or no intraspecific variation in these char-
acters. Many show differences below the level of significance, which I have
taken at P = .05. But even where there is significance in one species, there
is no consistent trend throughout the thirteen forms or even a majority
of them.
Thus in the case of the dorsal scale rows we find that C. ruber has a
significantly higher average on the coast, while C. m. pyrrhus has a
correspondingly higher average in the desert. The others either have no
differences, or such differences as there are fall below the level of significance.
6 Defined as the difference between the means divided by half the sum of the means.
With respect to the caudals the following show significant differences:
Salvadora, Lampropeltis, Rhinocheilus, and Hypsiglena average higher in
the desert than in the cismontane region, thus following the trend in the
ventrals; Pituophis, on the other hand, has the opposite trend, for the
coastal subspecies has a higher average. The study of the caudals is somewhat
handicapped by lack of specimens; for many specimens have incomplete
tails, thus always reducing the number below those available for
the study of ventrals. Larger series may show more definite trends; but
they are not likely to be as consistent as is found to be the case in the
ventrals.
The labials are constant in a number of species, as is often the case in
the colubrids. Only Trimorphodon shows a significant difference, the
coastal specimens having the greater number of supralabials. As to the
infralabials A. e. occidentalis, T. vandenburghi, and C. ruber average
significantly higher in the coastal region, while the contrary is true of
Pituophis.
As regards pattern, those species which have rings or blotches — as dif-
ferentiated from the striped forms — do exhibit differences, although not
always above the level of significance. Thus Rhinocheilus and Pituophis
have markedly fewer blotches in the desert than along the coast. However,
the opposite is true in Arizona and Trimorphodon, although in both these
cases the differences are somewhat below the significance level. Desert
Hypsiglena also has a higher number of blotches than the cismontane form.
Thus we see that the lightening of color in the desert individuals, which is
universal in all 13 forms except Lampropeltis, is not secured by reducing
the number of blotches, although this is the case with Pituophis, in which
both fewer blotches and reduced pigment contribute to the lighter tone.
Rather, it is effected by a reduction of pigment in blotches, ground color,
or both.
It is worthy of note that the trend in ventrals, which has been shown
to exist, is evident in that character which is relatively the most consistent
of the really variable scale counts, that is, the ventrals have the lowest
coefficients of variation.
I wish to acknowledge my indebtedness to Charles Shaw for making
scale counts, Mrs. Elizabeth Leslie for statistical computations, and C. B.
Perkins for his usual pertinent suggestions. Scale counts of particular
specimens were received from Dr. R. B. Cowles, Charles M. Bogert, David
Regnery, and J. R. Slevin.
Rattlesnakes Listed by Linnaeus
IV. THE RATTLESNAKES LISTED BY LINNAEUS IN 1758.
Introduction
Linnaeus' descriptions of reptiles were so brief and so frequently based
on composite material that, unless the type specimens are still extant,
linking them with known species is often difficult and sometimes impossible.
Yet, as the tenth edition of the Systema Naturae, 1758, is the
foundation of all nomenclature, it is important that attempts be made
to solve these problems of identification.
Linnaeus listed three species of rattlesnakes in the tenth edition: horridus,
dryinas, and durissus. In the twelfth edition he added two more; these are
not difficult to assign, being the species now known as Sistrurus miliarius
and Lachesis muta, the latter not a rattlesnake. But the first three have
been the source of much confusion among taxonomists, and even now
there is not complete agreement respecting the proper applications of these
names. It has occurred to me that the large collections of specimens at
present available might justify a re-examination of the problems involved,
since we can now more accurately define the scale-count ranges of the
several species which may have been the real subjects of Linnaeus'
descriptions. We likewise have new statistical methods of determining degrees
of difference.
The confusion primarily relates to the correct names to be assigned to
five species and subspecies of rattlesnakes; these are the timber (or banded)
rattlesnake of the eastern United States, the canebrake rattler (a
southeastern subspecies of the timber rattler), the Florida diamondback, the
Central American rattlesnake, and the South American rattlesnake, the
last two being subspecies of the Neotropical rattlesnake. I am constrained,
for the moment, to refer to these by their common names, since to use the
technical names would involve the confusion I am trying to explain.
Besides the three initiated by Linnaeus, another technical name, that of
terrificus Laurenti, 1768, must also be considered.
Past Usages
Some past allocations have been as follows:
(a) The timber rattlesnake (to which the name h. horridus is now
usually assigned) was identified as durissus by Holbrook, 1842, and by
Dumeril, Bibron, and Dumeril, 1854.
(b) The Florida diamondback (now generally known as adamanteus)
was called durissus by Boulenger, 1896, and in the Mission Scientifique,
1909; and terrificus by Le Conte, 1853.
(c) The Central American rattlesnake (now usually called durissus)
was assigned to horridus by D., B., and D., 1854, and Gunther, 1902; and
to terrificus by Boulenger, 1896.
(d) The South American rattlesnake (to which the name terrificus is
usually applied) was referred to as durissus by Jan, 1859.
Dryinas has been adjudged so vague that the name was dropped at an
early date and has not been used for many years.
I have mentioned the decisions of only a few herpetologists; the list
and confusion could be greatly extended.
Linnaeus’ Method and Type Specimens
Linnaeus, in describing the snakes in the Systema Naturae, generally
used a schedule comprising five parts, especially if the type specimen was
contained in one of the collections to which he had access, as was the
case with the rattlesnakes. These parts are:
(1) The sex and number of ventrals and subcaudals of the type
specimen.
(2) A primary reference, in which a more complete description of the
type, either by himself or some other author, may be found.
(3) Secondary references which Linnaeus assigned to the same species.
However, these often lead to confusion, since they may refer to species
other than that of the type, or they may refer to composite or indefinitely
described material. When there is a conflict, these secondary references
must obviously yield to 1 and 2. It is important that the primary refer-
ence be not confused with the secondary; it can usually be identified
through the scale counts, as well as by its initial position. Sometimes all
references have an equal value, but such is not the case with the three
rattlesnakes. The primary reference may contain still others, which may
be considered tertiary.
(4) A habitat. Since this is expressed more as an over-all range than a
type locality, as we know the latter today, the statement is usually too
broad to be of any service in assigning names. For example, the habitats
of all three rattlesnakes are given in the tenth edition simply as "America,"
and therefore do not facilitate the problems of identification.
(5) A description, usually including color notes. These are all too
brief; they sometimes involve descriptions of specimens other than the
type, and it is often clear that the specimens described were much faded.
Sometimes the descriptive notes on the type specimen are supplanted by
natural history notes culled from other references.
None of the three types of Linnaeus' rattlesnakes is now available for
study. The type of horridus was contained in the King Adolf Fredrik
Museum, most of the material from which was eventually transferred to
the Royal Museum in Stockholm. Andersson, 1899, p. 5, states that
horridus is one of the types now missing. He mentions (p. 27) two jars
labeled Crotalus horridus, one containing a head, which is not that of a
rattler; and the other the tail of a rattlesnake, which presumably is not
that of the type of horridus, since it has more rattles than the type had,
and more subcaudals as well. I communicated with the Royal Natural
History Museum in 1935, hoping the rattle might be a complete string
which could be analyzed, but Count Nils Gyldenstolpe replied that the
string was incomplete, and that there were no new developments with
regard to the lost type of horridus, only the jars and their contents
mentioned by Andersson remaining.
Lonnberg, 1896, states that the types of both dryinas and durissus are
also lost (pp. 18 and 27). These specimens were originally contained in
two collections which were available to Linnaeus for study, the first the
Adolf Fredrik Collection (not the same assemblage as the Adolf Fredrik
Museum) and the second the Claudius Grill Collection, also referred to
as the Surinam Collection. Both collections were eventually transferred
to the Zoological Museum of the Royal University at Upsala, but the two
rattlesnakes have disappeared. Thus our investigation must be restricted
to the original descriptions supplied by Linnaeus. In fact, it must be clear
that the uncertainties respecting the proper applications of the Linnean
names are present only because the types are gone. Were they available,
they would take precedence over the inadequate and conflicting descrip-
tions upon which we must now depend.
Studies of Scale Counts
The rattlesnake scale counts given by Linnaeus are as follows:
192. horridus. 167-23:2.
195. Dryinas. 165-30.
196. Durissus. 172-21:3.
In each case the figure preceding the name is the sum of the ventrals and
subcaudals.¹ If there are two figures, separated by a colon, representing the
subcaudals, it is to be understood that the first indicates the number of
entire scales, and the second those which are divided. Only two other scale
counts are made available by Linnaeus; in the primary reference the
supralabials of dryinas are given as 14-14, and the infralabials 14-14 also. All
three types are stated to be males.
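The notation of these entries can be made explicit with a small Python sketch, which splits each count into ventrals, entire subcaudals, and divided subcaudals, and confirms that their sum equals the figure preceding the name. The function and variable names are hypothetical and serve only to illustrate the reading of the notation described above.

```python
def parse_count(entry):
    """Split '167-23:2' into (ventrals, entire subcaudals, divided subcaudals).

    An entry without a colon, such as '165-30', has no divided subcaudals.
    """
    ventrals, _, subcaudals = entry.partition("-")
    entire, _, divided = subcaudals.partition(":")
    return int(ventrals), int(entire), int(divided or 0)


# Index figure and scale-count entry for each name, as given by Linnaeus.
LINNAEUS = {
    "horridus": (192, "167-23:2"),
    "dryinas": (195, "165-30"),
    "durissus": (196, "172-21:3"),
}

for name, (index, entry) in LINNAEUS.items():
    ventrals, entire, divided = parse_count(entry)
    total = ventrals + entire + divided
    print(f"{name}: {ventrals} ventrals, {entire + divided} subcaudals, "
          f"sum {total} (index figure {index})")
```

In all three cases the sum of the parts agrees with the index figure.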
Before matching these scale counts against the known dispersions of
present day species and subspecies by the methods of mathematical statistics,
we can narrow the field by some general considerations of the characters
and ranges of the forms which are now recognized as valid. Hereafter I
shall use the scientific names in their customary modern assignments
(Klauber, 1936).
Of the more than forty species and subspecies of rattlesnakes now
recognized, many can be eliminated from consideration on one of two
counts: Either their ranges are so restricted or were so inaccessible to the
early eighteenth century traveler as virtually to exclude the possibility of
their being in the three collections which contained these types; or their
ventral and subcaudal scale counts are so widely different from those
of the three types as to preclude their being the species described. The
first criterion practically eliminates all species except those found along
the eastern coasts of North and South America, or territories not far
inland; the second excludes many other forms.
1 It is interesting to note that Linnaeus listed the snakes in each genus in the order of
their total ventral plus subcaudal scales, beginning with the lowest number, in effect a
sort of numerical index.
I think we may exclude C. viridis and all of its subspecies on the score
of geographic inaccessibility. In fact, it seems to me highly significant that
no amphibian or reptile specimen from what is now the United States
was contained in any of the three collections which included the three
rattlesnake types. In the tenth edition of the Systema Naturae, Linnaeus
described six land reptiles (other than C. horridus) whose ranges center
in the United States. Using their present-day names these are: Eumeces
fasciatus, Coluber constrictor, Natrix sipedon, Thamnophis sirtalis,
Chelydra serpentina, and Terrapene carolina. I omit Bufo marinus as being
primarily Neotropical. The type descriptions of all of these were based on
Kalm, Catesby, or Edwards, with the exception of C. serpentina (about
the original of which type Linnaeus was not definite), none being described
from specimens in the three Swedish collections containing the rattlers.
This would lead one to infer that the chance that, of all the reptiles in
these collections, only one or more of the rattlers came from the United
States, is somewhat remote; at least it would take rather strong evidence
to balance the probability that they did not. In the twelfth edition of
the Systema Naturae, Linnaeus described fourteen additional land reptiles
from the United States, but all except one were premised on Catesby's
descriptions; thus up to 1766 no U. S. specimens had reached these
collections which Linnaeus studied, although many Neotropical forms were
included therein. Nevertheless I have not excluded the timber rattler and
Florida diamondback as possibilities. But at least we are justified in
eliminating such western forms as viridis and its subspecies.
Returning to other species which may be omitted from consideration
on the score of rarity or geographical inaccessibility, I think we can
exclude both molossus and scutulatus, which, although they are found in the
vicinity of Mexico City, are rare so near the southern limits of their
ranges. Neither reaches the east coast of Mexico.
C. triseriatus and S. catenatus are eliminated on the score of scale counts.
The likeliest candidates remaining are the following:
C. d. durissus          Central American Rattlesnake.
C. d. terrificus        South American Rattlesnake.
C. unicolor             Aruba Island Rattlesnake.
C. adamanteus           Florida Diamondback Rattlesnake.
C. cinereous (atrox)    Western Diamond Rattlesnake.
C. h. horridus          Timber Rattlesnake.
C. h. atricaudatus      Canebrake Rattlesnake.
I have included C. unicolor as a possibility because the color descrip-
tion of Linnaeus’ dryinas fits it well, although the chance that he had
access to such an island form appears remote. However, unicolor may also
occur on the mainland (Klauber, 1936, p. 197).
I now proceed to analyze the relative chances that the three species
described by Linnaeus represent each of the seven species and subspecies
listed as being possibilities. This analysis is made by taking the statistics
of these forms, as deduced from scale counts now at hand, and calculating,
by the t-test, the significance of the difference between the population
mean and Linnaeus' scale count. The formula is t = (M - X)/ [the
denominator is not legible in the source].²
This scheme involves the following assumptions: That the scale counts
(and sexes) as given by Linnaeus are accurate and were made by the
same methods as those used today; that the dispersions of these characters
are substantially normal, as seems to be the case (see Sec. 1 of this series);
and finally that the scale counts included in my samples represent the
areas from which Linnaeus' specimens were derived, so that the tests are
not adversely affected by intrasubspecific trends or clines.
With respect to the scale counts, aside from obvious slips, Linnaeus
seems to have been quite accurate. When we compare Andersson’s checks
on Linnean types we find that there is seldom a difference of more than
one in either the ventrals or subcaudals. Usually when there is a difference,
Andersson’s results are one higher than Linnaeus’. If there are any counts
especially to be doubted they are the 14-14 of both supralabials and in-
fralabials in the type of dryinas. This is a uniformity rarely met with in
actuality.
Admittedly these conditions render conclusions somewhat hazardous;
nevertheless, by this method we will be making the best use of the tenuous
numerical data which Linnaeus has supplied. Table 1 sets forth the statistics
resulting from studies of the species and subspecies considered to be
possible solutions.
2 R. A. Fisher: Statistical Methods for Research Workers, Seventh Edition, pp. 104-106,
1938.
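The comparison of a single specimen with a sample may be sketched in Python as follows. Because the formula as printed here is incomplete, the denominator used below is an assumption and may differ from the author's: it is the customary form for testing one new observation against a sample of N, namely the standard deviation multiplied by (1 + 1/N)^(1/2), with N - 1 degrees of freedom. The function name is hypothetical. The trial figures are those given for the ventrals of C. h. horridus in Table 1 and the ventral count of Linnaeus' horridus.

```python
import math


def single_specimen_t(mean, sd, n, x):
    """Return (t, degrees of freedom) for one specimen x against a sample.

    ASSUMPTION: t = (M - X) / (sd * sqrt(1 + 1/N)), the customary form for a
    single new observation; the source's own denominator is not legible.
    """
    t = (mean - x) / (sd * math.sqrt(1 + 1 / n))
    return t, n - 1


# C. h. horridus ventrals (Table 1): N = 154, M = 167.66, sd = 3.29.
# Linnaeus' horridus: 167 ventrals.
t, df = single_specimen_t(167.66, 3.29, 154, 167)
print(f"t = {t:.2f} with {df} degrees of freedom")
```

A value of t so near zero would indicate that a count of 167 ventrals is entirely compatible with the timber rattlesnake on this character alone, under the assumptions stated.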
TABLE 1
Statistics of Scale Counts

                               Ventrals                 Subcaudals
Subspecies               N       M        σ        N       M        σ
C. d. durissus           52    175.27    3.92      51     30.14    1.91
C. d. terrificus         18    170.39    3.26      18     28.44    2.18
C. unicolor              12    158.67    2.10      12     28.67    1.30
C. adamanteus            35    170.49    2.73      35     29.49    1.34
C. cinereous            147    178.63    3.24     147     25.88    1.47
C. h. horridus          154    167.66    3.29     154     24.69    1.82
C. h. atricaudatus       24    171.04    3.52      25     26.72    1.86
Supralabials
Infralabials
Subspecies
N
M
~~