STATISTICS: The study of methods for collecting,
organizing, and analyzing data
oDescriptive Statistics: Procedures used to organize and
present data in a convenient and communicable form
oInferential Statistics: Procedures employed to draw conclusions about populations on the basis of samples
POPULATION: The complete set of actual or potential observations about which conclusions are to be drawn
SAMPLE: A subset of the population selected using
some sampling method
oSampling methods
-Cluster sample: A population is divided into
groups called clusters; some clusters are randomly
selected, and every member in them is observed
-Stratified sample: The population is divided into
strata, and a fixed number of elements of each
stratum are selected for the sample
-Simple random sample: A sample selected so that each possible sample of the same size has an equal probability of being selected; used for most elementary inference
VARIABLE: An attribute of elements of a population or sample that can be measured; ex: height, weight, IQ, hair color, and pulse rate are some of the many variables that can be measured for people
DATA: Values of variables that have been
observed
oTypes of data
-Qualitative (or "categorical") data are descriptive; ex: the color of an automobile
-Quantitative data take numeric values
-Discrete data take counting numbers (0, 1, 2, ...) as values, usually representing things that can be counted; ex: the number of fleas on a dog, the number of times a professor is late in a semester
-Continuous data can take a range of numeric
values, not just counting numbers; ex: the height of
a child, the weight of a bag of beans, the amount of
time a professor is late
oLevels of measurement
-Qualitative data can be measured at the:
oNominal level: Values are just names, without any
order; ex: color of a car, major in college
oOrdinal level: Values have some natural order; ex: high school class (freshman/sophomore/junior/senior), military rank
-Quantitative data can be measured at the:
oInterval level: Numeric data with no natural zero point; intervals (differences) are meaningful, but ratios are not; ex: temperature in Fahrenheit degrees; 80°F is 20°F hotter than 60°F, but it is not 133% as hot
oRatio level: Numeric data for which there is a
true zero; both intervals and ratios are
meaningful; ex: weight, length, duration, most
physical properties
STATISTIC: A numeric measure computed from sample data, used to describe the sample and to estimate the corresponding population parameter

PARAMETER: A numeric measure that describes a
population; parameters are usually not computed, but
are inferred from sample statistics

FREQUENCY DISTRIBUTION

Provides the frequency (number of times observed) of each value of a variable

Table #1: Students in a driving class are polled regarding number of accidents they've had:

x (# of accidents)   f (frequency)   RF (relative frequency)
       5                   2               0.0351
       4                   3               0.0526
       3                   9               0.1579
       2                  15               0.2632
       1                  16               0.2807
       0                  12               0.2105

RELATIVE FREQUENCY DISTRIBUTION: Each frequency is divided by the total number of observations to produce the proportion or percentage of the data set having that value; ex: third column of Table 1

GROUPED FREQUENCY DISTRIBUTION: Values of the variable are grouped into classes

Table #2: The scores on a midterm exam are grouped into classes:

class     f    cumulative freq.
90-99     4          80
80-89    18          76
70-79    31          58
60-69    19          27
50-59     7           8
40-49     1           1

CUMULATIVE FREQUENCY DISTRIBUTION: Frequencies count all observations at a particular value or class and all those less; ex: third column of Table 2

MEASURES OF CENTRAL TENDENCY

MEAN: Most commonly used measure of central tendency, usually meant by "average"; sensitive to extreme values
oSample mean: x̄ = (1/n) Σxᵢ
oPopulation mean: μ = (1/N) Σxᵢ
oTrimmed mean: Computed after discarding an equal number of the highest and lowest values; less sensitive than ordinary mean
oWeighted mean: Computed with a weight multiplied to each value, making some values influence the mean more heavily than others: x̄w = (Σwᵢxᵢ)/(Σwᵢ)
MEDIAN: Value that divides the set so the same number of observations lie on each side of it; less sensitive to extreme values; for an odd number of values, it is the middle value; for an even number, it is the average of the middle two; ex: in Table 1, the median is the average of the 28th and 29th observations, or 1.5
MODE: Observation that occurs with the greatest frequency; ex: in Table 1, the mode is 1

MEASURES OF DISPERSION

SUM OF SQUARES (SS): The sum of squared deviations from the mean
oPopulation SS: Σ(xᵢ − μ)²  or  Σxᵢ² − (Σxᵢ)²/N
oSample SS: Σ(xᵢ − x̄)²  or  Σxᵢ² − (Σxᵢ)²/n
VARIANCE: The average of squared differences between observations and their mean
oPopulation variance: σ² = (1/N) Σ(xᵢ − μ)²
oSample variance: s² = (1/(n − 1)) Σ(xᵢ − x̄)²
oVariances for grouped data (mᵢ = midpoint of class i, fᵢ = its frequency):
-Population: σ² = (1/N) Σfᵢ(mᵢ − μ)²
-Sample: s² = (1/(n − 1)) Σfᵢ(mᵢ − x̄)²
STANDARD DEVIATION: The square root of the variance; unlike variance, it has the same units as the original data and is more commonly used; ex: Pop. S.D.: σ = √[(1/N) Σ(xᵢ − μ)²]
STANDARD SCORES: Also known as z-scores; the standard score of a value is the directed number of standard deviations from the mean at which the value is found; that is, z = (x − μ)/σ
oA positive z-score indicates a value greater than the mean; a negative z-score indicates a value less than the mean; a z-score of zero indicates the mean value
oConverting every value in a data set or distribution to a z-score is called standardization; once a data set or distribution has been standardized, it has a new mean μ = 0, and a new standard deviation σ = 1
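The summary measures defined above can be checked against Table 1 directly; a minimal sketch using Python's standard library, with the raw observations rebuilt from the table's frequencies:

```python
from statistics import mean, mode, variance

# Rebuild the raw observations behind Table 1 (value -> frequency)
freq = {5: 2, 4: 3, 3: 9, 2: 15, 1: 16, 0: 12}
data = [x for x, f in freq.items() for _ in range(f)]

n = len(data)                              # 57 students polled
rf = {x: f / n for x, f in freq.items()}   # relative frequency column

xbar = mean(data)          # sample mean, 95/57 or about 1.67 accidents
s2 = variance(data)        # sample variance: divides the SS by n - 1
most_common = mode(data)   # 1, as noted under MODE
```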

GRAPHING TECHNIQUES
BAR GRAPH: A graph that uses bars to indicate the
frequency of occurrence of observations
oHistogram: A bar graph used with quantitative,
continuous variables
FREQUENCY CURVE: A graph representing a
frequency distribution in the form of a continuous
line that traces a histogram
oCumulative frequency curve: A continuous line
that traces a histogram where bars in all the lower
classes are stacked up in the adjacent higher class;
cannot have a negative slope
oSymmetric curve: The frequency curve is unchanged
if rotated around its center; median = mean
oNormal curve: Bell-shaped curve; symmetric
oSkewed curve: Deviates from symmetry; frequency curve is shifted with a longer "tail" to the left (mean < median) or to the right (mean > median)

[Figure: SKEWED CURVE; frequency curves skewed to the left and to the right, plotted from −10 to +10]

PROBABILITY

A measure of the likelihood of a random event; the long-term relative frequency with which an outcome or event occurs
Probability of occurrence of Event A:
P(A) = (Number of outcomes favoring Event A) / (Total number of outcomes)
•Sample space: All possible simple outcomes of an experiment
•Relationships between events
-Exhaustive: 2 or more events are said to be exhaustive if they represent all possible outcomes
•Symbolically, P(A or B or ...) = 1
-Non-exhaustive: 2 or more events are said to be non-exhaustive if they do not exhaust all possible outcomes
-Mutually exclusive: Events that cannot occur simultaneously: P(A and B) = 0, and P(A or B) = P(A) + P(B); ex: males, females
-Non-mutually exclusive: Events that can occur simultaneously: P(A or B) = P(A) + P(B) − P(A and B); ex: males, brown eyes
-Independent: Events whose probability is unaffected by occurrence or nonoccurrence of each other: P(A|B) = P(A); P(B|A) = P(B); and P(A and B) = P(A)P(B); ex: gender and eye color
-Dependent: Events whose probability changes depending upon the occurrence or non-occurrence of each other: P(A|B) differs from P(A); P(B|A) differs from P(B); and P(A and B) = P(A)P(B|A) = P(B)P(A|B); ex: race and eye color
•Joint probabilities: Probability that 2 or more events occur simultaneously
•Marginal (unconditional) probabilities: The probability of a single event, found by summing the joint probabilities involving that event
•Conditional probabilities: Probability of A given the occurrence of B, written P(A|B)
•Ex: Given the numbers 1 to 9 as observations in a sample space:
-Events mutually exclusive and complementary; ex: P(all odd numbers); P(all even numbers)
-Events mutually exclusive but not complementary; ex: P(an even number); P(the numbers 7 and 5)
-Events neither mutually exclusive nor exhaustive; ex: P(an even number or a 2)
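The addition rule for non-mutually-exclusive events can be verified by brute-force enumeration of the 1-to-9 example above; a small sketch (assuming each number is equally likely):

```python
from fractions import Fraction

# The example above: sample space is the numbers 1..9, equally likely
space = range(1, 10)

def p(event):
    # probability = favorable outcomes / total outcomes
    return Fraction(sum(1 for x in space if event(x)), len(space))

even = lambda x: x % 2 == 0
is_two = lambda x: x == 2

# General addition rule for non-mutually-exclusive events:
# P(even or 2) = P(even) + P(2) - P(even and 2)
lhs = p(lambda x: even(x) or is_two(x))
rhs = p(even) + p(is_two) - p(lambda x: even(x) and is_two(x))
# both equal 4/9
```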

•A random variable takes numeric values randomly, with probabilities specified by a probability distribution (or density) function
•Discrete random variables: Take only distinct values (as with discrete data)
•Binomial distribution: A model for the number (x) of successes in a series of n independent trials where each trial results in success with probability p, or failure with probability 1 − p; ex: the number (x) of heads ("successes") obtained in 12 (n) tosses of a fair (probability of heads = p = 0.5) coin
P(x) = nCx pˣ(1 − p)ⁿ⁻ˣ, where P(x) is the probability of exactly x successes out of n trials with a constant probability p of success on each trial; nCx = n!/((n − x)! x!)
-Binomial mean: μ = np
-Binomial variance: σ² = np(1 − p)
-As n increases, the binomial approaches the normal distribution
•Hypergeometric distribution:
-Represents the number of successes from a series of n trials where each trial results in success or failure
-Like the binomial, except that each trial is drawn from a small population with N elements split between N₁ successes and N₂ failures
-Then the probability of splitting the n trials between x₁ successes and x₂ failures is:
P(x₁ and x₂) = [N₁!/(x₁!(N₁ − x₁)!)] × [N₂!/(x₂!(N₂ − x₂)!)] / [N!/(n!(N − n)!)]
-Hypergeometric mean: μ₁ = E(x₁) = nN₁/N and variance: σ² = [(N − n)/(N − 1)] × [nN₁/N] × [N₂/N]
•Poisson distribution: A model for the number of occurrences of an event x = 0, 1, 2, ..., counted over some fixed interval of space or time rather than some fixed number of trials; the parameter λ is the average number of occurrences;
P(x) = e⁻λ λˣ/x! for x = 0, 1, 2, ... and λ > 0; otherwise P(x) = 0; Poisson mean and variance: λ
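The binomial and Poisson formulas above translate directly into code; a minimal sketch using only the standard library, with the coin-toss example (n = 12, p = 0.5):

```python
from math import comb, exp, factorial

# Binomial pmf P(x) = nCx p^x (1 - p)^(n - x); the example above:
# n = 12 tosses of a fair coin, p = 0.5
def binom_pmf(x, n, p):
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 12, 0.5
pmf = [binom_pmf(x, n, p) for x in range(n + 1)]
mu = sum(x * q for x, q in enumerate(pmf))                # np = 6
var = sum((x - mu) ** 2 * q for x, q in enumerate(pmf))   # np(1 - p) = 3

# Poisson pmf P(x) = e^(-lam) lam^x / x!
def poisson_pmf(x, lam):
    return exp(-lam) * lam ** x / factorial(x)
```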

FREQUENCY TABLE

           Event C   Event D   Totals
Event E       52        35       87
Event F       62        71      133
Totals       114       106      220

EX: Joint Probability Between C and E
P(C & E) = 52/220 = 0.24

JOINT, MARGINAL & CONDITIONAL PROBABILITY TABLE

             Event C    Event D    Marginal     Conditional
                                   Probability  Probability
Event E        0.24       0.16        0.40      (C|E)=0.60  (D|E)=0.40
Event F        0.28       0.32        0.60      (C|F)=0.47  (D|F)=0.53
Marginal       0.52       0.48        1.00
Probability
Conditional  (E|C)=0.46 (E|D)=0.33
Probability  (F|C)=0.54 (F|D)=0.67
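The joint, marginal, and conditional probabilities in the table above all follow mechanically from the cell counts; a minimal sketch (cell counts taken so that the row and column totals 87, 133, 114, 106, and 220 agree):

```python
from fractions import Fraction

# Cell counts from the frequency table (row event, column event)
counts = {("E", "C"): 52, ("E", "D"): 35,
          ("F", "C"): 62, ("F", "D"): 71}
total = sum(counts.values())                 # 220

# Joint probabilities: P(row and column)
joint = {k: Fraction(v, total) for k, v in counts.items()}

# Marginal probabilities: sum the joints over the other event
p_E = joint[("E", "C")] + joint[("E", "D")]  # 87/220, about 0.40
p_C = joint[("E", "C")] + joint[("F", "C")]  # 114/220, about 0.52

# Conditional probability: P(C|E) = P(C and E) / P(E)
p_C_given_E = joint[("E", "C")] / p_E        # 52/87, about 0.60
```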

-A continuous random variable may take on any value along an uninterrupted interval of a number line
-Probabilities are measured only over intervals, never for single values; the probability that a continuous random variable falls between two values is exactly equal to the area under the density curve between those two values
•Normal distribution: Bell curve; a distribution whose values cluster symmetrically around the mean (also median and mode); common in nature and important in making inferences
-The density curve is the graph of:
f(x) = (1/(σ√(2π))) e^(−(x − μ)²/(2σ²)), where
f(x) = frequency at a given value
σ = standard deviation of the normal distribution
μ = the mean of the normal distribution
x = value of the normally distributed variable
•Standard normal distribution: A normal distribution with a mean of 0 and standard deviation of 1; values following a normal distribution can be transformed to the standard normal distribution by using z-scores [see Measures of Dispersion, page 1]
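The density formula above, and the "probability = area under the curve" rule, can be sketched with the standard library (the CDF is computed via `math.erf`, a standard identity rather than anything stated in the chart):

```python
from math import erf, exp, pi, sqrt

# Normal density f(x) = (1 / (sigma sqrt(2 pi))) e^(-(x - mu)^2 / (2 sigma^2))
def normal_pdf(x, mu=0.0, sigma=1.0):
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

# Area under the standard normal curve to the left of z (the CDF)
def phi(z):
    return 0.5 * (1 + erf(z / sqrt(2)))

# Probability over an interval = area between the two values
p_within_1sd = phi(1) - phi(-1)   # about 0.6827
```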

STATISTICAL INFERENCE

•In order to make inferences about a population, which is unobserved, a random sample is drawn
-The sample is used to compute statistics, which are then used to draw probability conclusions about the parameters of the population

Population (unobserved) --random sampling--> Sample (observed)
        |                                         |
   measured by                               measured by
        |                                         |
Parameters (unknown) <--statistical inference-- Statistics (known)

BIASED & UNBIASED ESTIMATORS

•Unbiased estimator of a parameter: An estimator (sample statistic) with an average value equal to the value of the parameter; ex: the sample mean is an unbiased estimator of the population mean; the average value of all possible sample means is the population mean; all other factors being equal, an unbiased estimator is preferable to a biased one
•Biased estimator of a parameter: An estimator (sample statistic) that does not on average equal the value of the parameter; ex: the median is a biased estimator, since the average of sample medians is not always equal to the population median; variance calculated from a sample, dividing by n, is a biased estimator of the population variance; however, when calculated with n − 1 it is unbiased
-Note: Estimators themselves present only one source of bias; even when an unbiased estimator is used, bias in the sample (elements not all equally likely to be chosen) may still be present
-Elementary methods of inference assume unbiased sampling
-Sampling distribution: The probability distribution of a sample statistic that would result from drawing all possible samples of a given size from some population; because samples are drawn at random, every sample statistic is a random variable, and has a probability distribution that can be described using mean and standard deviation
•Standard error: The standard deviation of the estimator; do not confuse this with the standard deviation of the sample itself; the standard error measures the variability in the estimates around their expected value, while the standard deviation of the sample reflects the variability within the sample around the sample mean
-The standard deviation of all possible sample means of a given sample size, drawn from the same population, is called the standard error of the sample mean
-If the population standard deviation σ is known, the standard error is: σ(x̄) = σ/√n
-Usually, the population standard deviation σ is unknown, and is estimated by s; in this case, the estimated standard error is: s(x̄) = s/√n
-Note: in either case, the standard error of the sample mean decreases as sample size is increased; a larger sample provides more reliable information about the population
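The claim above that dividing the sum of squares by n gives a biased variance estimator, while dividing by n − 1 does not, can be checked by simulation; a sketch using a fair die as the population (the die, sample size, and repetition count are illustrative choices, not from the chart):

```python
import random

random.seed(1)

# Population with known variance: the faces of a fair die
pop = [1, 2, 3, 4, 5, 6]
mu = sum(pop) / len(pop)                              # 3.5
sigma2 = sum((x - mu) ** 2 for x in pop) / len(pop)   # 35/12, about 2.92

# Average many sample variances computed two ways: dividing the sum
# of squares by n (biased) and by n - 1 (unbiased)
n, reps = 5, 100_000
avg_n = avg_n1 = 0.0
for _ in range(reps):
    sample = [random.choice(pop) for _ in range(n)]
    m = sum(sample) / n
    ss = sum((x - m) ** 2 for x in sample)
    avg_n += ss / n / reps          # divides SS by n
    avg_n1 += ss / (n - 1) / reps   # divides SS by n - 1

# avg_n1 lands near sigma2, while avg_n systematically underestimates it
```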

HYPOTHESIS TESTING

•In a hypothesis test, sample data is used to accept or reject a null hypothesis (H0) in favor of an alternative hypothesis (H1); the significance level at which the null hypothesis can be rejected indicates how much evidence the sample provides against the null hypothesis
•Null hypothesis (H0): Always specifies a value (the null hypothesis value) for a population parameter; the null hypothesis is assumed to be true; this assumption underlies the computations for the hypothesis test; ex: H0: "a coin is unbiased," that is, the proportion of heads is 0.5: H0: p = 0.5
•Alternative hypothesis (H1): Never specifies a value for a parameter; the alternative hypothesis states that a population parameter has some value different from the one specified under the null hypothesis; ex: H1: a coin is biased; that is, the proportion of heads is not 0.5: H1: p ≠ 0.5
1. Two-tailed (or nondirectional): An alternative hypothesis (H1) that states only that the population parameter is simply different from the one specified under H0; two-tailed probability is employed; ex: to use sample data to test whether the population mean pulse rate is different from 65, we would use the two-tailed hypothesis test H0: μ = 65 vs. H1: μ ≠ 65
2. One-tailed (or directional): An alternative hypothesis (H1) that states that the population parameter is greater than (right-tailed) or less than (left-tailed) the value specified under H0; one-tailed probability is employed; ex: to use sample data to test whether the population mean pulse rate is greater than 65, we would use the right-tailed hypothesis test H0: μ = 65 vs. H1: μ > 65
•The alternative hypothesis H1 is also sometimes known as the "research hypothesis," as only claims expressed as alternative hypotheses can be positively asserted
•Level of significance: The probability of observing sample results as extreme or more extreme than those actually observed, under the assumption the null hypothesis is true; if this probability is small enough, we conclude there is sufficient evidence to reject the null hypothesis; two basic approaches:
1. Fixed significance level (traditional method): A level of significance α is predetermined; commonly used significance levels are 0.01, 0.05, and 0.10
•The smaller the significance level α, the higher the standard for rejecting H0; critical value(s) for the test statistic are determined such that the probability of the test statistic being farther from zero than the critical value (in one or two tails, depending on H1) is α; if the test statistic falls beyond the critical value (in the rejection region), then H0 can be rejected at that fixed significance level α
2. Observed significance level (p-value method): The test statistic is computed using the sample data, then the appropriate probability distribution is used to find the probability of observing a sample statistic that differs at least that much from the null hypothesis value for the population parameter (the probability value, or p-value); the smaller the p-value, the better the evidence against H0
•This method is more commonly used by computer applications
•The p-value also represents the smallest significance level α at which H0 can be rejected; thus, p-value results can be used with a fixed significance level by rejecting H0 if p-value ≤ α


•Generally, the larger (farther from zero, positive or negative) the value of the test statistic, the smaller the p-value will be, providing better evidence against the null hypothesis in favor of the alternative
•Notion of indirect proof: Through traditional hypothesis testing, the null hypothesis can never be proven true; ex: if we toss a coin 200 times and heads comes up exactly 100 times, we have no evidence the coin is biased, but cannot prove the coin is fair because of the random nature of sampling; it is possible to flip an unfair coin 200 times and get exactly 100 heads, just as it is possible to draw a sample from a population with mean 104.5 and find a sample mean of 101; failing to reject the null hypothesis does not prove it true, and rejecting it does not prove it false
•Two types of errors
-Type I error: Rejecting H0 when it is actually true; the probability of a type I error is given by the significance level α; type I is generally more prominent, as it can be controlled
-Type II error: Failing to reject H0 when it is actually false; the probability of a type II error is denoted β; type II error is often (foolishly) disregarded; it is difficult to measure or control, as β depends on the unknown true value of the parameter in question

                        True Status of H0
Decision           H0 True             H0 False
Accept H0      Correct (1 − α)     Type II error (β)
Reject H0      Type I error (α)    Correct (1 − β)

CENTRAL LIMIT THEOREM
(for sample mean x̄)
If x1, x2, x3, ... xn is a simple random sample of n elements from a large (infinite) population, with mean μ and standard deviation σ, then the distribution of x̄ takes on the bell-shaped distribution of a normal random variable as n increases, and the distribution of the ratio:
(x̄ − μ) / (σ/√n)
approaches the standard normal distribution as n goes to infinity; in practice, a normal approximation is acceptable for samples of size 30 or larger
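The theorem can be watched in action by simulation; a sketch that standardizes the means of many samples of die rolls (the die, n = 36, and the repetition count are illustrative choices, not from the chart):

```python
import random

random.seed(7)

# Non-normal population: rolls of a fair die, mu = 3.5, sigma^2 = 35/12
mu, sigma = 3.5, (35 / 12) ** 0.5

# Standardize the mean of many samples of size n = 36
n, reps = 36, 20_000
zs = []
for _ in range(reps):
    xbar = sum(random.randint(1, 6) for _ in range(n)) / n
    zs.append((xbar - mu) / (sigma / n ** 0.5))

m = sum(zs) / reps                                   # near 0
sd = (sum((z - m) ** 2 for z in zs) / reps) ** 0.5   # near 1
within_1 = sum(abs(z) < 1 for z in zs) / reps        # roughly 68%
```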

INFERENCE FOR POPULATION MEAN USING THE Z-STATISTIC
(σ KNOWN)

Requires that the sample must be drawn from a normal distribution or have a sample size (n) of at least 30
•Used when the population standard deviation σ is known: If σ is known (treated as a constant, not random) and the above conditions are met, then the distribution of the sample mean follows a normal distribution, and the test statistic z follows a standard normal distribution; note that this is rarely the case in reality, and the t-distribution is more widely used

•The test statistic is z = (x̄ − μ)/σ(x̄), where μ = population mean (either known or hypothesized under H0) and σ(x̄) = σ/√n
•Critical region: The portion of the area under the curve which includes those values of the test statistic that provide sufficient evidence for the rejection of the null hypothesis
-The most often used significance levels are 0.01, 0.05, and 0.1; for a one-tailed test using the z-statistic, these correspond to z-values of 2.33, 1.65, and 1.28 respectively (positive values for a right-tailed test, negative for a left-tailed test)
•For a two-tailed test, the critical region for α = 0.01 is split into two equal outer areas marked by z-values of ±2.58; for α = 0.05, the critical values of z are ±1.96, and for α = 0.10, the critical values are ±1.65
-Ex 1: Given a population with σ = 50, a simple random sample of n = 100 values is chosen with a sample mean x̄ of 255; test using the p-value method H0: μ = 250 vs. H1: μ > 250; is there sufficient evidence to reject the null hypothesis?
•In this case, the test statistic z = (255 − 250)/(50/√100) = 1.00
•Looking at Table A, the area given for z = 1.00 is 0.3413; the area to its right (since H1 is ">", this is a right-tailed test) is 0.5 − 0.3413 = 0.1587, or 15.87%
•This is the p-value: the probability, if H0 is true (that is, if μ = 250), of obtaining a sample mean of 255 or greater; it also represents the smallest significance level α at which H0 can be rejected
•Since, even if H0 is true, the probability of obtaining a sample mean ≥ 255 from this population with a sample of size n = 100 is about 16%, it is quite plausible that H0 is true; there is not very good evidence to support the alternative hypothesis that the population mean is greater than 250, so we fail to reject H0
•It can't even be rejected at the weakest common significance level of α = 0.10, since 0.1587 > 0.10; remember, this doesn't prove the population mean to be equal to 250; we just haven't accumulated sufficient evidence against the claim
-Ex 2: A simple random sample of size n = 25 is taken from a population following a normal distribution with σ = 15; the sample mean x̄ is 95; use the p-value method to test H0: μ = 100 vs. H1: μ ≠ 100; is there sufficient evidence to reject the claim that the population mean is 100 at a significance level α of 0.10? At α = 0.05?
•In this case, the test statistic z = (95 − 100)/(15/√25) = −5/3 = −1.67
•Since the normal curve is symmetric, we can look up a z-score of 1.67; the value in Table A is 0.4525, that is, P(0 < z < 1.67) = P(−1.67 < z < 0) = 0.4525
-Thus, P(z < −1.67) = P(z > 1.67) = 0.5 − 0.4525 = 0.0475
•Since this is a two-tailed test (H1: μ ≠ 100), the p-value is twice this area, or 0.095
•Since the p-value = 0.095 < 0.10 = α, there is sufficient evidence to reject the null hypothesis at a significance level α of 0.10, but in the second case, the p-value = 0.095 > 0.05 = α, so the sample data are not strong enough to reject at the higher (0.05) level of significance
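Ex 2 can be reproduced with the standard library; the exact p-value (about 0.0956) differs slightly from the worked value of 0.095 only because Table A rounds z to two decimals:

```python
from math import erf, sqrt

# Two-tailed z-test of Ex 2: n = 25, sigma = 15, xbar = 95,
# H0: mu = 100 vs. H1: mu != 100
def phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

n, sigma, xbar, mu0 = 25, 15, 95, 100
z = (xbar - mu0) / (sigma / sqrt(n))   # -5/3, about -1.67
p_value = 2 * (1 - phi(abs(z)))        # about 0.0956 (two tails)

# Reject H0 at alpha = 0.10 (p <= 0.10) but not at alpha = 0.05
```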

Table A: Normal Curve Areas
(each entry gives the area under the standard normal curve from the mean, 0, to z)

z     .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
0.0  .0000  .0040  .0080  .0120  .0160  .0199  .0239  .0279  .0319  .0359
0.1  .0398  .0438  .0478  .0517  .0557  .0596  .0636  .0675  .0714  .0753
0.2  .0793  .0832  .0871  .0910  .0948  .0987  .1026  .1064  .1103  .1141
0.3  .1179  .1217  .1255  .1293  .1331  .1368  .1406  .1443  .1480  .1517
0.4  .1554  .1591  .1628  .1664  .1700  .1736  .1772  .1808  .1844  .1879
0.5  .1915  .1950  .1985  .2019  .2054  .2088  .2123  .2157  .2190  .2224
0.6  .2257  .2291  .2324  .2357  .2389  .2422  .2454  .2486  .2517  .2549
0.7  .2580  .2611  .2642  .2673  .2704  .2734  .2764  .2794  .2823  .2852
0.8  .2881  .2910  .2939  .2967  .2995  .3023  .3051  .3078  .3106  .3133
0.9  .3159  .3186  .3212  .3238  .3264  .3289  .3315  .3340  .3365  .3389
1.0  .3413  .3438  .3461  .3485  .3508  .3531  .3554  .3577  .3599  .3621
1.1  .3643  .3665  .3686  .3708  .3729  .3749  .3770  .3790  .3810  .3830
1.2  .3849  .3869  .3888  .3907  .3925  .3944  .3962  .3980  .3997  .4015
1.3  .4032  .4049  .4066  .4082  .4099  .4115  .4131  .4147  .4162  .4177
1.4  .4192  .4207  .4222  .4236  .4251  .4265  .4279  .4292  .4306  .4319
1.5  .4332  .4345  .4357  .4370  .4382  .4394  .4406  .4418  .4429  .4441
1.6  .4452  .4463  .4474  .4484  .4495  .4505  .4515  .4525  .4535  .4545
1.7  .4554  .4564  .4573  .4582  .4591  .4599  .4608  .4616  .4625  .4633
1.8  .4641  .4649  .4656  .4664  .4671  .4678  .4686  .4693  .4699  .4706
1.9  .4713  .4719  .4726  .4732  .4738  .4744  .4750  .4756  .4761  .4767
2.0  .4772  .4778  .4783  .4788  .4793  .4798  .4803  .4808  .4812  .4817
2.1  .4821  .4826  .4830  .4834  .4838  .4842  .4846  .4850  .4854  .4857
2.2  .4861  .4864  .4868  .4871  .4875  .4878  .4881  .4884  .4887  .4890
2.3  .4893  .4896  .4898  .4901  .4904  .4906  .4909  .4911  .4913  .4916
2.4  .4918  .4920  .4922  .4925  .4927  .4929  .4931  .4932  .4934  .4936
2.5  .4938  .4940  .4941  .4943  .4945  .4946  .4948  .4949  .4951  .4952
2.6  .4953  .4955  .4956  .4957  .4959  .4960  .4961  .4962  .4963  .4964
2.7  .4965  .4966  .4967  .4968  .4969  .4970  .4971  .4972  .4973  .4974
2.8  .4974  .4975  .4976  .4977  .4977  .4978  .4979  .4979  .4980  .4981
2.9  .4981  .4982  .4982  .4983  .4984  .4984  .4985  .4985  .4986  .4986
3.0  .4987  .4987  .4987  .4988  .4988  .4989  .4989  .4989  .4990  .4990

INFERENCE FOR POPULATION MEAN USING THE t-STATISTIC
(σ UNKNOWN)

Requires that the sample must be drawn from a normal distribution or have a sample size (n) of at least 30
•When σ is not known (as is usually the case) it is estimated by s, the sample standard deviation
•Because of the variability of both estimates (the sample mean as well as the sample standard deviation) the test statistic follows not a z-distribution, but a t-distribution
•Comparison between t- and z-distributions
-Although both distributions are symmetric about a mean of zero, the t-distribution is more spread out than the normal distribution, producing a larger critical value of t as the boundary for the rejection region
-The t-distribution is characterized by its degrees of freedom (df), referring to the number of values that are free to vary after placing certain restrictions on the data
•For example, if we know that a sample of size 4 produces a mean of 87, we know that the sum of the numbers is 4 × 87 = 348; this tells us nothing about the individual values in the sample (there are an infinite number of ways to get four numbers to add up to 348) but as soon as we've chosen three of them, the fourth is determined
•For instance, the first number might be 84, the second 98, and the third 81; but if the first three numbers are 84, 98, and 81, then the fourth must be 85, the only number producing the known sample mean; that is, there are n − 1 or 3 degrees of freedom in this example
-For a test about a population mean, the t-statistic follows a t-distribution with n − 1 df
•As df increases, the t-distribution approaches the standard normal z-distribution
-The test statistic t used for testing hypotheses about a population mean is: t = (x̄ − μ)/s(x̄), where μ = population mean under H0 and s(x̄) = s/√n
Note: This is not so different from the test statistic z used when σ is known!
-Ex: A simple random sample of size 25 is taken from a population following a normal distribution, with a sample mean 42 and a sample standard deviation 7.5; test at a fixed significance level α = 0.05: H0: μ = 45 vs. H1: μ < 45
•This is a left-tailed test (H1: μ < 45), so the critical value and rejection region will be negative
•Consulting Table B to find the appropriate critical value, with df = n − 1 = 24, produces a critical value of −1.711; the null hypothesis can be rejected at α = 0.05 if the value of the test statistic t < −1.711
•The test statistic t = (42 − 45)/(7.5/√25) = −3/1.5 = −2; since this is less than the critical value of −1.711, H0 is rejected at α = 0.05

Table B: Critical Values of t
(values indicate the critical value of t with the given area to its right)

      A*:  0.1     0.05    0.025    0.01     0.005
df    B*:  0.2     0.1     0.05     0.02     0.01
1         3.078   6.314   12.706   31.821   63.657
2         1.886   2.920    4.303    6.965    9.925
3         1.638   2.353    3.182    4.541    5.841
4         1.533   2.132    2.776    3.747    4.604
5         1.476   2.015    2.571    3.365    4.032
6         1.440   1.943    2.447    3.143    3.707
7         1.415   1.895    2.365    2.998    3.499
8         1.397   1.860    2.306    2.896    3.355
9         1.383   1.833    2.262    2.821    3.250
10        1.372   1.812    2.228    2.764    3.169
11        1.363   1.796    2.201    2.718    3.106
12        1.356   1.782    2.179    2.681    3.055
13        1.350   1.771    2.160    2.650    3.012
14        1.345   1.761    2.145    2.624    2.977
15        1.341   1.753    2.131    2.602    2.947
16        1.337   1.746    2.120    2.583    2.921
17        1.333   1.740    2.110    2.567    2.898
18        1.330   1.734    2.101    2.552    2.878
19        1.328   1.729    2.093    2.539    2.861
20        1.325   1.725    2.086    2.528    2.845
21        1.323   1.721    2.080    2.518    2.831
22        1.321   1.717    2.074    2.508    2.819
23        1.319   1.714    2.069    2.500    2.807
24        1.318   1.711    2.064    2.492    2.797
25        1.316   1.708    2.060    2.485    2.787
26        1.315   1.706    2.056    2.479    2.779
27        1.314   1.703    2.052    2.473    2.771
28        1.313   1.701    2.048    2.467    2.763
29        1.311   1.699    2.045    2.462    2.756
30        1.310   1.697    2.042    2.457    2.750
∞         1.282   1.645    1.960    2.326    2.576

A* = Level of significance for one-tailed test
B* = Level of significance for two-tailed test
Note: The t-distribution is a robust alternative to the z-distribution when testing for the population mean: inferences are likely to be valid even if the population distribution is far from normal; however, the larger the departure from normality in the population, the larger the sample size needed for a valid hypothesis test using either distribution
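The left-tailed t-test example above reduces to a few lines of arithmetic; a minimal sketch:

```python
from math import sqrt

# Left-tailed one-sample t-test from the example above:
# n = 25, xbar = 42, s = 7.5, H0: mu = 45 vs. H1: mu < 45
n, xbar, s, mu0 = 25, 42, 7.5, 45

se = s / sqrt(n)          # estimated standard error = 1.5
t = (xbar - mu0) / se     # -2.0

t_crit = -1.711           # Table B: df = 24, one-tailed alpha = 0.05
reject = t < t_crit       # True, so H0 is rejected at alpha = 0.05
```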

CONFIDENCE INTERVALS

Confidence interval: Interval within which a population parameter is likely to be found; determined by sample data and a chosen level of confidence (1 − α, where α refers to the level of significance)
•Common confidence levels are 90%, 95%, and 99%, just as common levels of significance are 0.10, 0.05, and 0.01
•(1 − α) confidence interval for μ:
x̄ − z(α/2)(σ/√n) ≤ μ ≤ x̄ + z(α/2)(σ/√n), where z(α/2) is the value of the standard normal variable z that puts an area α/2 in each tail of the distribution
•A t-statistic should be used in place of the z-statistic when σ is unknown and s must be used as an estimate
•Ex: Given x̄ = 108, s = 15, and n = 26, estimate a 95% confidence interval for the population mean
-Since the population variance is unknown, the t-distribution is used
-The resulting interval, using a t-value of 2.060 from Table B (row 25 of the middle column), is approximately 102 to 114
-Consequently, any null hypothesis that μ is between 102 and 114 is tenable based on this sample
-Any hypothesized μ below 102 or above 114 would be rejected at 0.05 significance
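The interval in the example above is just x̄ ± t(α/2) · s/√n; a minimal sketch of the computation:

```python
from math import sqrt

# 95% t confidence interval from the example above:
# xbar = 108, s = 15, n = 26, df = 25, t_(alpha/2) = 2.060 from Table B
xbar, s, n = 108, 15, 26
t_half = 2.060

margin = t_half * s / sqrt(n)    # about 6.06
lo, hi = xbar - margin, xbar + margin
# interval is roughly 102 to 114
```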

COMPARING
POPULATION MEANS

~

-Sampling distribution of the difference
between means: If a number of pairs of sam­
III pIes were taken from the same population or
II from two different populations, then:
-The distribution of differences between
"
...
pairs of sample means tends to be normal
(z-distribution)
- The mean of these differences between means
~ f.i x I - X 2 is equal to the difference between
the population means, that IS 111 - 112
-Independent samples
- We are testing whether or not two samples
are drawn from populations with the same
mean, that is, HO: 111 = 112, versus a one- or
two-tailed alternative

- When 01 and 02 are known, the test statistic
z follows a standard normal distribution
under the null hypothesis
- The standard error of the difference between

Z

=

means

CY x

I

-Homogeneity of variances (a criterion for the pooled 2-sample t-test): The condition that the variances of two populations are equal; to establish homogeneity of variances, test H0: σ1² = σ2² vs. H1: σ1² ≠ σ2² (note that this is equivalent to testing H0: σ1²/σ2² = 1 vs. H1: σ1²/σ2² ≠ 1)
- Under the null hypothesis, the test statistic s1²/s2² follows an F-distribution with degrees of freedom (n1 − 1, n2 − 1); if the test statistic exceeds the critical value in Table C, then the null hypothesis can be rejected at the indicated level of significance
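A minimal sketch of this variance-ratio test in Python; the two samples are invented for illustration, and the critical value would come from Table C at the chosen significance level:

```python
from statistics import variance

# Hypothetical samples; test H0: sigma1^2 = sigma2^2 via F = s1^2 / s2^2
sample1 = [4.1, 5.2, 6.3, 5.8, 4.9, 5.5]        # n1 = 6
sample2 = [5.0, 5.1, 4.8, 5.3, 4.9, 5.2, 4.7]   # n2 = 7
F = variance(sample1) / variance(sample2)       # df = (n1 - 1, n2 - 1) = (5, 6)
# Reject H0 if F exceeds the Table C critical value for df = (5, 6)
print(round(F, 2))
```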

- Where (μ1 − μ2) represents the hypothesized difference in means, the following statistic can be used for hypothesis tests:
z = [(x̄1 − x̄2) − (μ1 − μ2)] / σ_(x̄1 − x̄2)


- When σ1 and σ2 are unknown, which is usually the case, substitute s1 and s2 for σ1 and σ2, respectively, in the above formulas, and use the t-distribution with df = n1 + n2 − 2
-Pooled t-test
- Both populations have normal distributions
- n < 30
- Requires homogeneity of variance: σ1 and σ2 are not known but assumed equal (a risky assumption!)
- Many statisticians do not recommend the t-distribution with pooled standard error; the above approach is more conservative
-The hypothesis test may be 2-tailed (= vs. ≠) or 1-tailed: H0: μ1 ≤ μ2 and the alternative is μ1 > μ2 (or H0: μ1 ≥ μ2 and the alternative is μ1 < μ2)
-Degrees of freedom (df): (n1 − 1) + (n2 − 1) = n1 + n2 − 2
-Use the given formula below for estimating σ_(x̄1 − x̄2) to determine s_(x̄1 − x̄2)
-Determine the critical region for rejection by assigning an acceptable level of significance and looking at the t-table with df = n1 + n2 − 2
-Use the following formula for the estimated standard error:
s_(x̄1 − x̄2) = √{ [(n1 − 1)s1² + (n2 − 1)s2²] / (n1 + n2 − 2) · [(n1 + n2) / (n1·n2)] }
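The pooled two-sample procedure can be sketched as follows; the group data are invented for illustration, and the resulting t would be compared with a t-table at df = n1 + n2 − 2:

```python
from math import sqrt
from statistics import mean, variance

# Pooled two-sample t sketch; the group data are invented for illustration
g1 = [12.0, 14.5, 11.8, 13.2, 12.9, 14.1]
g2 = [10.9, 12.1, 11.4, 10.7, 12.3, 11.0, 11.6]
n1, n2 = len(g1), len(g2)
# Estimated standard error of the difference, per the formula above
se = sqrt(((n1 - 1) * variance(g1) + (n2 - 1) * variance(g2))
          / (n1 + n2 - 2) * (n1 + n2) / (n1 * n2))
t = (mean(g1) - mean(g2)) / se   # compare with the t-table at df = n1 + n2 - 2
```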


-Matched pairs: When making repeated measurements of the same elements, we can test for the mean difference
-For instance, clients of a weight-loss program might be weighed before and after the program, and a significant mean difference ascribed to the effectiveness of the program
-Standard error of the mean difference
-General formula: s_d̄ = s_d/√n, where s_d is the standard deviation of the differences
-We can then test H0: μ_d = 0 versus a one- or two-tailed alternative by using a t-test statistic
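The matched-pairs computation can be sketched directly from the weight-loss example; the before/after weights below are invented:

```python
from math import sqrt
from statistics import mean, stdev

# Matched-pairs sketch; the before/after weights are invented
before = [180, 165, 200, 172, 190, 158]
after = [172, 160, 191, 170, 181, 155]
d = [b - a for b, a in zip(before, after)]   # differences
se = stdev(d) / sqrt(len(d))                 # standard error of the mean difference
t = mean(d) / se                             # df = n - 1; compare with the t-table
```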




• Correlation: A relationship between two variables
- The correlation coefficient r (also known as the "Pearson Product-Moment Correlation Coefficient") is a measure of the linear (straight-line) relationship between two quantitative variables
- Ex: Given observations on two variables X and Y, we can compute their corresponding sums of squares:
SSx = Σ(x − x̄)² and SSy = Σ(y − ȳ)²
- The formulas for the Pearson correlation (r):

ANALYSIS OF VARIANCE (ANOVA)
- Purpose: To determine whether any significant difference exists between more than two group means
- Indicates possibility of overall mean effect of the experimental treatments; does not specify which of the means are different
-ANOVA: Consists of obtaining independent estimates from population subgroups
- The total sum of squares is partitioned into known components of variation
- Partition of variances
- Between-group variance (BGV): Reflects the magnitude of the difference(s) among the group means
- Within-group variance (WGV): Reflects the dispersion within each treatment group; also referred to as the error term
-Test
- When the BGV is large relative to the WGV, the F-ratio will also be large

BGV = n·Σ(x̄i − x̄tot)² / (k − 1), where x̄i = mean of the ith treatment group and x̄tot = mean of all n values across all k treatment groups

WGV = (SS1 + SS2 + ... + SSk) / (n − k), where the SS's are the sums of squares [see Measures of Central Tendency, page 1] of each subgroup's values around the subgroup mean

- As the sample size increases, the sample proportion concentrates more around its target mean; it also gets closer to the normal distribution, in which case: z = (p − π) / √(π(1 − π)/n), using the proportion standard error √(π(1 − π)/n)


- In random samples of size n, the sample proportion p fluctuates around the population proportion π with a variance of π(1 − π)/n
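The normal approximation for a sample proportion can be sketched with made-up numbers; here p = 0.56 is observed in n = 400 trials against a hypothesized π = 0.50:

```python
from math import sqrt

# Sketch with invented numbers: p = 0.56 observed in n = 400 trials, pi0 = 0.50
p, pi0, n = 0.56, 0.50, 400
z = (p - pi0) / sqrt(pi0 * (1 - pi0) / n)   # standard error = sqrt(pi(1 - pi)/n)
print(round(z, 2))
```

The resulting z is compared with the standard normal distribution in the usual way.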


-USING F-RATIO: F = BGV/WGV
- Degrees of freedom are k − 1 for the numerator and n − k for the denominator
- If BGV > WGV, the experimental treatments are responsible for the large differences among group means
- Null hypothesis: The group sample means are all estimates of a common population mean; that is, H0: μ1 = μ2 = μ3 = ... = μk for all k treatment groups, vs. H1: at least one pair of means is different (determining which pair(s) are different requires follow-up testing)
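The BGV/WGV partition can be sketched as below; the three equal-sized treatment groups are invented, and with equal group sizes the guide's n in the BGV formula is each group's size:

```python
from statistics import mean

# One-way ANOVA sketch; three equal-sized invented treatment groups
groups = [[4, 5, 6], [7, 8, 9], [5, 6, 7]]
k = len(groups)
values = [v for g in groups for v in g]
n = len(values)
grand = mean(values)
# Between-group and within-group variance estimates, as defined above
bgv = sum(len(g) * (mean(g) - grand) ** 2 for g in groups) / (k - 1)
wgv = sum(sum((v - mean(g)) ** 2 for v in g) for g in groups) / (n - k)
F = bgv / wgv   # compare with Table C at df = (k - 1, n - k)
```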

Table C: Critical Values of F (top row = .05; bottom row = .01; degrees of freedom for the numerator across the top) [table values omitted]



r = SSxy / √(SSx · SSy) = Σ(x − x̄)(y − ȳ) / √(SSx · SSy)

r = [Σxy − (Σx)(Σy)/n] / √{ [Σx² − (Σx)²/n] · [Σy² − (Σy)²/n] }

Note that −1 ≤ r ≤ 1 for any data set; when r = 1, the data are said to have perfect positive correlation (if plotted, they would form a straight line with positive, upward slope); when r = −1, the data are said to have perfect negative correlation (if plotted, they would form a straight line with negative, downward slope); if r = 0, the data are said to have no linear correlation (it is possible, of course, that they are related in some other way)
Note: It is possible, of course, for a random sample from a population with zero correlation to produce by chance a sample with r ≠ 0!
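The computational formula for r translates directly into code; the five (x, y) pairs below are invented for illustration:

```python
from math import sqrt

# Pearson r via the computational formula; the (x, y) pairs are invented
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(xs)
ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
ss_x = sum(x * x for x in xs) - sum(xs) ** 2 / n
ss_y = sum(y * y for y in ys) - sum(ys) ** 2 / n
r = ss_xy / sqrt(ss_x * ss_y)
print(round(r, 3))
```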

CHI-SQUARE (χ²) TESTS
- Most widely used non-parametric test
- The χ² mean = its degrees of freedom
- The χ² variance = twice its degrees of freedom
- Can be used to test independence, homogeneity, and goodness-of-fit
- The square of a standard normal variable is a chi-square variable with df = 1
- Like the t-distribution, the shape of the distribution depends on the value of df

CHI-SQUARE (χ²) TESTS (continued)
DEGREES OF FREEDOM (df) COMPUTATION
•If chi-square tests for the goodness-of-fit to a hypothesized distribution (uses frequency distribution), df = g − 1, where g = number of groups, or classes, in the frequency distribution
•If chi-square tests for homogeneity or independence (uses two-way contingency table), df = (# of rows − 1)(# of columns − 1)

Regression is a method for predicting values of
one variable (the outcome or dependent variable)
on the basis of the values of one or more
independent or predictor variables; fitting a
regression model is the process of using sample
data to determine an equation to represent the
relationship

GOODNESS-OF-FIT TEST: To apply the chi-square distribution in this manner, the critical chi-square value is expressed as:
χ² = Σ (fo − fe)² / fe
where fo = observed frequency of the variable, fe = expected frequency (based on hypothesized population distribution)
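The goodness-of-fit statistic is a one-line sum; the observed counts below are invented and the expected counts assume a uniform hypothesized distribution:

```python
# Goodness-of-fit sketch; observed counts invented, expected uniform
observed = [18, 22, 30, 30]              # fo
expected = [25, 25, 25, 25]              # fe under the hypothesized distribution
chi_sq = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
# df = g - 1 = 3; reject if chi_sq exceeds the chi-square table value
print(round(chi_sq, 2))
```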
TESTS OF CONTINGENCY: Application of chi-square tests to two separate populations to test statistical independence of attributes
TESTS OF HOMOGENEITY: Application of chi-square tests to two samples to test if they came from populations with like distributions
RUNS TEST: Tests whether a sequence (to comprise a sample) is random; the following equations are applied:
R̄ = 2n1n2 / (n1 + n2) + 1
SR = √{ 2n1n2(2n1n2 − n1 − n2) / [(n1 + n2)²(n1 + n2 − 1)] }
where:
R̄ = mean number of runs
n1 = number of outcomes of one type
n2 = number of outcomes of the other type
SR = standard deviation of the distribution of the number of runs
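The runs-test quantities can be sketched for a short two-outcome sequence; the sequence below is invented, and the resulting z is compared with the standard normal:

```python
from math import sqrt

# Runs-test sketch on an invented two-outcome sequence
seq = "HHTHTTHHHTHT"
runs = 1 + sum(1 for a, b in zip(seq, seq[1:]) if a != b)  # count the runs
n1, n2 = seq.count("H"), seq.count("T")
r_bar = 2 * n1 * n2 / (n1 + n2) + 1                        # mean number of runs
s_r = sqrt(2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
           / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
z = (runs - r_bar) / s_r   # compare with the standard normal distribution
```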

HYPOTHESIS TEST FOR

LINEAR CORRELATION

With a simple random sample of size n producing a sample correlation coefficient r, it is possible to test for linear correlation in the population, ρ; that is, we conduct the hypothesis test H0: ρ = ρ0, versus a right-, left-, or two-tailed alternative; usually we are interested in determining whether there is any linear correlation at all; that is, ρ0 = 0
The test statistic is:
t = (r − ρ0) / √((1 − r²)/(n − 2))
which follows a t-distribution with n − 2 degrees of freedom under H0; this hypothesis test assumes that the sample is drawn from a population with a bivariate normal distribution.
•Ex: A simple random sample of size 27 produces a correlation coefficient r = −0.41; is there sufficient evidence at α = 0.05 of a negative linear relationship?
- Since we're testing for a negative linear relationship, we need a left-tailed test: H0: ρ = 0 vs. H1: ρ < 0; the critical value can be found from the t-distribution with n − 2 = 25 df, and one-tailed α = 0.05; since this is a left-tailed test, we take the negative: −1.708; that is, if the test statistic is less than −1.708, we conclude that there is sufficient evidence of a negative linear relationship
- The test statistic t = −0.41 / √((1 − (−0.41)²)/(27 − 2)) ≈ −2.248, allowing us to reject the null hypothesis of no linear correlation and support the alternative hypothesis of a negative linear correlation at α = 0.05
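The worked example's test statistic can be checked numerically with the values given (r = −0.41, n = 27):

```python
from math import sqrt

# Reproducing the worked example: r = -0.41, n = 27, testing rho < 0
r, n = -0.41, 27
t = (r - 0) / sqrt((1 - r ** 2) / (n - 2))
# t is about -2.25, below the critical value -1.708, so H0 is rejected
```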

SIMPLE LINEAR REGRESSION
ISBN-13: 978-157222944-0
ISBN-10: 157222944-6

In a simple linear regression model, we use only one predictor variable and assume that the relationship to the outcome variable is linear; that is, the graph of the regression equation is that of a straight line (we often refer to the "regression line"); for the entire population, the model can be expressed as:
y = β0 + β1x + ε
y is called the dependent variable (or outcome variable), as it is assumed to depend on a linear relationship to x
x is the independent variable, also called the predictor variable
β0 is the intercept of the regression line; that is, the predicted value for y when x = 0
β1 is the slope of the regression line; that is, the marginal change in y per unit change in x
ε refers to random error; the error term is assumed to follow a normal distribution with a mean of zero and constant variation; that is, there should be no increase or decrease in dispersion for different regions along the regression line; in addition, it is assumed that error terms are independent for different (x, y) observations
On the basis of sample data, we find estimates b0 and b1 of the intercept β0 and slope β1; this gives us the estimated (or sample) regression equation:
ŷ = b0 + b1x
The parameter estimates b0 and b1 can be derived in a variety of ways; one of the most common is known as the method of least squares; least squares estimates minimize the sum of squared differences between predicted and actual values of the dependent variable y
For a simple linear regression model, the least squares estimates of the intercept and slope are:
estimated slope = b1 = SSxy / SSx
estimated intercept = b0 = ȳ − b1x̄
These estimates, and other calculations in regression, involve sums of squares:
SSxy = Σ(x − x̄)(y − ȳ) = Σxy − (Σx)(Σy)/n
SSx = Σ(x − x̄)² = Σ(x²) − (Σx)²/n
SSy = Σ(y − ȳ)² = Σ(y²) − (Σy)²/n
Ex: A simple random sample of 8 cars provides the
following data on engine displacement (x) and
highway mileage (y); fit a simple linear regression
model


x (displacement)   y (mileage)   x²      y²     xy
5.7                18            32.49   324    102.6
2.5                19            6.25    361    47.5
3.8                20            14.44   400    76
2.8                19            7.84    361    53.2
4.6                17            21.16   289    78.2
1.6                32            2.56    1024   51.2
1.6                29            2.56    841    46.4
1.4                30            1.96    900    42
SUMS: 24           184           89.26   4500   497.1

Fitting a model entails computing the least-squares estimates b0 and b1; note that there are 8 observations; that is, n = 8
First, SSxy = Σxy − (Σx)(Σy)/n = −54.9, SSx = Σ(x²) − (Σx)²/n = 17.26, and SSy = Σ(y²) − (Σy)²/n = 268
Then the estimated slope is b1 = SSxy/SSx = −3.18, and the estimated intercept is b0 = ȳ − b1x̄ = 32.54
The estimated regression model, then, is mileage = 32.54 − 3.18 × displacement
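The least-squares arithmetic for the 8-car example can be reproduced directly from the data table:

```python
# Reproducing the least-squares fit for the 8-car example above
xs = [5.7, 2.5, 3.8, 2.8, 4.6, 1.6, 1.6, 1.4]
ys = [18, 19, 20, 19, 17, 32, 29, 30]
n = len(xs)
ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n   # -54.9
ss_x = sum(x * x for x in xs) - sum(xs) ** 2 / n                     # 17.26
b1 = ss_xy / ss_x                      # estimated slope, about -3.18
b0 = sum(ys) / n - b1 * sum(xs) / n    # estimated intercept, about 32.54
print(round(b1, 2), round(b0, 2))
```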

SIGNIFICANCE OF A

REGRESSION MODEL

We can assess the significance of the model by testing to see if the sample provides sufficient evidence of a linear relationship in the population; that is, we conduct the hypothesis test H0: β1 = 0 versus H1: β1 ≠ 0; this is exactly equivalent to testing for linear correlation in the population: H0: ρ = 0 versus H1: ρ ≠ 0; the test for correlation is somewhat simpler:
The correlation coefficient r = SSxy / √(SSx · SSy) = −0.8072
The test statistic t = (r − 0) / √((1 − r²)/(n − 2)) = −3.350
Consulting Table B, with degrees of freedom = n − 2 = 6, we obtain a critical value of 3.143 at α = 0.02, and a critical value of 3.707 at α = 0.01; since we have a two-tailed test, we should consider the absolute value of the test statistic, which exceeds 3.143 but does not exceed 3.707; that is, we can reject H0 at α = 0.02 but not at α = 0.01, so the p-value is between 0.02 and 0.01 (the actual p-value, which can be found using computer applications, is 0.0154); this is a reasonably significant model
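The significance computation for the mileage model can likewise be checked from the raw data:

```python
from math import sqrt

# Reproducing the significance test for the mileage model above
xs = [5.7, 2.5, 3.8, 2.8, 4.6, 1.6, 1.6, 1.4]
ys = [18, 19, 20, 19, 17, 32, 29, 30]
n = len(xs)
ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
ss_x = sum(x * x for x in xs) - sum(xs) ** 2 / n
ss_y = sum(y * y for y in ys) - sum(ys) ** 2 / n
r = ss_xy / sqrt(ss_x * ss_y)                  # about -0.8072
t = (r - 0) / sqrt((1 - r ** 2) / (n - 2))     # about -3.35, df = n - 2 = 6
```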

LINEAR DETERMINATION
Regression models are also assessed by the coefficient of linear determination, r²; this represents the proportion of total variation in y that is explained by the regression model; the coefficient of linear determination can be calculated in a variety of ways; the easiest is to compute r² = (r)²; that is, the coefficient of determination is the square of the coefficient of correlation

RESIDUALS
The difference between an observed and a fitted value of y (y − ŷ) is called a residual; examining the residuals is useful to identify outliers (observations far from the regression line, representing unusual values for x and y) and to check the assumptions of the model

NOTICE TO STUDENT: This QuickStudy® guide covers the basics of Introductory Statistics. Due to its condensed format, however, use it as a Statistics guide and not as a replacement for assigned course work.
No part of this publication may be reproduced or transmitted in any form, or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without written permission from the publisher.
© 2002, 2005 BarCharts, Inc., Boca Raton, FL
