# O'Reilly Statistics Hacks: Tips and Tools for Measuring the World and Beating the Odds (May 2006, ISBN 0596101643)

Statistics Hacks
By Bruce Frey
Publisher: O'Reilly
Pub Date: May 2006
Print ISBN-10: 0-596-10164-3
Print ISBN-13: 978-0-59-610164-0
Pages: 356


Want to calculate the probability that an event will happen? Be able to spot fake data?
Prove beyond doubt whether one thing causes another? Or learn to be a better gambler?
You can do that and much more with 75 practical and fun hacks packed into Statistics
Hacks. These cool tips, tricks, and mind-boggling solutions from the world of statistics,
measurement, and research methods will not only amaze and entertain you, but will give
This book is ideal for anyone who likes puzzles, brainteasers, games, gambling, magic
tricks, and those who want to apply math and science to everyday circumstances. Several
hacks in the first chapter alone, such as the "central limit theorem," which allows you to
know everything by knowing just a little, serve as sound approaches for marketing and

way probability works, discover relationships, predict events with uncanny accuracy, and
even make a little money with a well-placed wager here and there.

Statistics Hacks presents useful techniques from statistics, educational and psychological
measurement, and experimental research to help you solve a variety of problems in

Play smart when you play Texas Hold 'Em, blackjack, roulette, dice games, or even the lottery
Design your own winnable bar bets to make money and amaze your friends
Predict the outcomes of baseball games, know when to "go for two" in football, and anticipate the winners of other sporting events with surprising accuracy

Demystify amazing coincidences and distinguish the truly random from the only seemingly random; even keep your iPod's "random" shuffle honest
Spot fraudulent data, detect plagiarism, and break codes
How to isolate the effects of observation on the thing observed

Whether you're a statistics enthusiast who does calculations in your sleep or a civilian who
is entertained by clever solutions to interesting problems, Statistics Hacks has tools to give
you an edge over the world's slim odds.

Table of Contents

Credits
Preface
Chapter 1. The Basics
Hack 1. Know the Big Secret
Hack 2. Describe the World Using Just Two Numbers
Hack 3. Figure the Odds
Hack 4. Reject the Null
Hack 5. Go Big to Get Small
Hack 6. Measure Precisely
Hack 7. Measure Up
Hack 8. Power Up
Hack 9. Show Cause and Effect
Hack 10. Know Big When You See It
Chapter 2. Discovering Relationships
Hack 11. Discover Relationships
Hack 12. Graph Relationships
Hack 13. Use One Variable to Predict Another
Hack 14. Use More Than One Variable to Predict Another
Hack 15. Identify Unexpected Outcomes
Hack 16. Identify Unexpected Relationships
Hack 17. Compare Two Groups
Hack 18. Find Out Just How Wrong You Really Are
Hack 19. Sample Fairly
Hack 20. Sample with a Touch of Scotch
Hack 21. Choose the Honest Average
Hack 22. Avoid the Axis of Evil
Chapter 3. Measuring the World
Hack 23. See the Shape of Everything
Hack 24. Produce Percentiles
Hack 25. Predict the Future with the Normal Curve
Hack 26. Give Raw Scores a Makeover
Hack 27. Standardize Scores
Hack 29. Test Fairly
Hack 30. Improve Your Test Score While Watching Paint Dry
Hack 31. Establish Reliability
Hack 32. Establish Validity
Hack 34. Make Wise Medical Decisions
Chapter 4. Beating the Odds
Hack 35. Gamble Smart
Hack 36. Know When to Hold 'Em
Hack 37. Know When to Fold 'Em
Hack 38. Know When to Walk Away
Hack 39. Lose Slowly at Roulette
Hack 40. Play in the Black in Blackjack
Hack 41. Play Smart When You Play the Lottery
Hack 42. Play with Cards and Get Lucky
Hack 43. Play with Dice and Get Lucky
Hack 44. Sharpen Your Card-Sharping
Hack 45. Amaze Your 23 Closest Friends
Hack 46. Design Your Own Bar Bet
Hack 47. Go Crazy with Wild Cards
Hack 48. Never Trust an Honest Coin
Hack 49. Know Your Limit
Chapter 5. Playing Games
Hack 50. Avoid the Zonk
Hack 51. Pass Go, Collect $200, Win the Game
Hack 52. Use Random Selection as Artificial Intelligence
Hack 53. Do Card Tricks Through the Mail
Hack 54. Check Your iPod's Honesty
Hack 55. Predict the Game Winners
Hack 56. Predict the Outcome of a Baseball Game
Hack 57. Plot Histograms in Excel
Hack 58. Go for Two
Hack 59. Rank with the Best of Them
Hack 60. Estimate Pi by Chance
Chapter 6. Thinking Smart
Hack 61. Outsmart Superman
Hack 62. Demystify Amazing Coincidences
Hack 63. Sense the Real Randomness of Life
Hack 64. Spot Faked Data
Hack 65. Give Credit Where Credit Is Due
Hack 66. Play a Tune on Pascal's Triangle
Hack 67. Control Random Thoughts
Hack 68. Search for ESP
Hack 69. Cure Conjunctionitus
Hack 70. Break Codes with Etaoin Shrdlu
Hack 71. Discover a New Species
Hack 72. Feel Connected
Hack 73. Learn to Ride a Votercycle
Hack 75. Seek Out New Life and New Civilizations
Colophon
Index

Credits
Bruce Frey, Ph.D., is a comic book collector and film buff. In his
conducts research in his secret identity as an assistant
professor in Educational Psychology and Research at the
University of Kansas. He is an award-winning teacher, and his
tests and classroom assessment, the measurement of
spirituality, and program evaluation methods. Bruce's honors
include taking third place in the Kansas Monopoly Championship
as a teenager, second place in the Kansas Film Festival as a
college student, and a respectable third-place finish in the
Lawrence, Kansas, Texas Hold 'Em Poker Tournament as a
middle-aged man. He is proudest of two accomplishments: his
copy of Showcase #4, a comic book wherein the "Silver Age
Flash first appears," whatever that means.

Contributors
The following people contributed their hacks, writing, and
inspiration to this book:
at VeriSign, focusing on problems in user authentication,
managed security services, and RFID security. Joe has years
of experience analyzing data, building statistical models,
consultant for companies including DoubleClick, American

Massachusetts Institute of Technology with an Sc.B. and an
M.Eng. in computer science and computer engineering. Joe
is an unapologetic Yankees fan, but he appreciates any
good baseball game. Joe lives in Silicon Valley with his wife,

Ron Hale-Evans is a writer, thinker, and game designer who
earns his daily sandwich with frequent gigs as a technical
writer. He has a Bachelor's degree in Psychology from Yale,
him to create the Mentat Wiki
(http://www.ludism.org/mentat), which led to his recent
book, Mind Performance Hacks (O'Reilly). You can find his
multinefarious [sic] other projects at his home page,
http://ron.ludism.org, including his award-winning board
games, a list of his Short-Duration Personal Saviors, and his
especially since his series of articles on that topic for the
dearly departed The Games Journal
(http://www.thegamesjournal.com) has been relatively
email Ron the names of some gullible publishers, or if you
just want to bug him, you can reach him at
rwhe@ludism.org (rhymes with nudism and has nothing to
do with Luddism).

Brian E. Hansen, 27, grew up in the Dallas, Texas area.
After serving a two-year religious mission in Spain, he
a B.S. degree in Petroleum Engineering. He currently works
as a Reservoir Engineer for a large independent oil and gas
Irving, Texas.

from The University of Massachusetts, Amherst. She is
currently the Evaluation Director for the School Program
Evaluation and Research group at the University of Kansas.
Jill likes outdoor sports, especially running, hiking, and
playing soccer with her kids.

Ernest E. Rothman is a Professor and Chair of the
Mathematical Sciences Department at Salve Regina
University (SRU) in Newport, Rhode Island. Ernie holds a
Ph.D. in Applied Mathematics from Brown University and
held positions at the Cornell Theory Center in Ithaca, New
York before coming to SRU. His interests are primarily in
scientific computing, mathematics and statistics education,
and the Unix underpinnings of Mac OS X. You can keep
abreast of his latest activities at
http://homepage.mac.com/samchops.

Neil J. Salkind is a sometimes faculty member at the
University of Kansas with an office opposite that of Bruce
author of Statistics for People Who (Think They) Hate
collects books, cooks, works on old houses and a P1800
Studio B Literary Agency in New York.

William Skorupski is currently an assistant professor in the
School of Education at the University of Kansas, where he
teaches courses in psychometrics and statistics. He earned
his Bachelor's degree in educational research and
psychology from Bucknell University in 2000, and his
Doctorate in psychometric methods from the University of
Massachusetts, Amherst in 2004. His primary research
interest is in the application of mathematical models to
psychometric data, including the use of Bayesian statistics
for solving practical measurement problems. He also enjoys
applying his knowledge of statistics and probability to
everyday situations, such as playing poker against the
author of this book!

Acknowledgments
I'd like to thank all the contributors to this book, both those
who are listed in the "Contributors" section and those who
helped with ideas, reviewed the manuscript, and provided
suggestions of sources and resources. Thanks in this capacity
Blackstone, Jr.'s paperback book There's One Born Every Minute
(Jove Publications) provided great inspiration for many of the
hacks herein.
I'd like to thank my editor, Brian Sawyer, who shepherded this
project with a strong hand and a strong vision of what is and is
not a hack. He was right most of the time. (Though not all the
time....) Brian was instrumental in bringing this project to
completion, especially during a string of unlucky rolls where the
odds of success looked slim.
I'd like to thank Neil Salkind, statistics writer supreme, for his
help with many facets of my professional life and this book.
Most importantly, thanks to Bonnie Johnson, my sweet wife,
whom I vaguely recall, but who I think will be waiting for me at
home when I finally turn in the last revision of this book.

Preface
Chance plays a huge part in your life, whether you know it or
not. Your particular genetic makeup mutated slightly when you
were created, and it did so based on specific laws of probability.
Performance in school involves human errors, yours and
others', which tends to keep your actual ability level from being
reflected precisely in your report card or on those high-stakes
tests. Research on careers even suggests that what you do for a
living was probably not a result of careful planning and
preparation, but more likely due to happenstance. And, of
course, chance determines your fate in games of chance and
plays a large role in the outcome of sporting events.
Fortunately, an entire set of scientific tools, the various
applications of statistics, can be used to solve the problems
caused by our fate-influenced system. Inferential statistics, a
field of science based entirely on the nature of probability,
allows us to understand the way things work, discover
relationships among variables, describe a huge population by
seeing just a small bit of it, make uncannily accurate
predictions, and, yes, even make a little money with a
well-placed wager here and there.
This book is a collection of statistical tricks and tools. Statistics
Hacks presents useful tools from statistics, of course, but also
from the realms of educational and psychological measurement
and experimental research design. It provides solutions to a
variety of problems in the world of social science, but also in the
in your sleep, you'll enjoy this book and the creative
applications it finds for those rusty old tools you know so well.
If you just like the scientific approach to life and are entertained
by cool ideas and clever solutions to interesting problems, don't
worry. Statistics Hacks was written with the nonscientist in
mind, too, so if that is you, you've come to the right place. It's
written for the nonstatistician as well, so if this still describes
you, you'll feel safe here.
If, on the other hand, you are taking a statistics course or have
find this book a pleasant companion to the textbooks typically
required for those sorts of courses. There won't be any
theoretical won't hurt your development. It's just that there are
some pretty cool things that you can do with statistics that
seem more like fun than like work.

Why Statistics Hacks?
to refer to people who break into systems or wreak havoc,
using computers as their weapon. Among people who write
code, though, the term hack refers to a "quick-and-dirty"
solution to a problem or a clever way to get something done.
And the term hacker is taken very much as a compliment,
referring to someone as being creative, having the technical
chops to get things done. The Hacks series is an attempt to
reclaim the word, document the good ways people are hacking,
and pass the hacker ethic of creative participation on to the
uninitiated. Seeing how others approach systems and problems
The technologies at the heart of this book are statistics,
measurement, and research design. Computer technology has
developed hand-in-hand with these technologies, so the use of
the term hacks to describe what is done in this book is
consistent with almost every perspective on that word. Though
there is just a little computer hacking covered in these pages,
there is a plethora of clever ways to get things done.

How This Book Is Organized
hack stands on its own, so feel free to browse and jump to the
different sections that interest you most. If there's a
guide you to the right hack.
The earlier hacks are more foundational and probably provide
generalized solutions or strategic approaches across a variety of
problems to a greater extent than later hacks. On the other
hand, later hacks provide much more specific tricks for winning
games or just information to help you understand what's going
on around you.
The book is divided into several chapters, organized by subject:

Chapter 1, The Basics
Use these hacks as a strong set of foundational tools, the
ones you will use most often when you are stat-hacking
your way into and out of trouble. Think of these as your
basic toolkit: your hammer, saw, and various screwdrivers.

Chapter 2, Discovering Relationships
test relationships among variables. You will be able to make
the invisible visible with these hacks.

Chapter 3, Measuring the World
A variety of tips and tricks for measuring the world around
questions, assess accurately, and even increase your own
performance on high-stakes tests.

Chapter 4, Beating the Odds
This chapter is for the gambler. Use the odds to your
determines the outcome.

Chapter 5, Playing Games
From TV game show strategy to winning Monopoly to
enjoying sports to just having fun, this chapter presents
different hacks for getting the most out of your game
playing.

Chapter 6, Thinking Smart
This chapter is perhaps the most cerebral of them all. Get
your mind right, play mind games, make discoveries, and
unlock the mysteries of the world around us using the
statistics hacks you'll find here.

Conventions Used in This Book
The following is a list of the typographical conventions used in
this book:

Italics
Used to indicate key terms and concepts, URLs, and
filenames.

Constant width
Used for Excel functions and code examples.

Constant width italic
Used for code text that should be replaced by user-supplied
values.

Gray type
Used to indicate a cross-reference within the text.

You should pay special attention to notes set apart from the
text with this icon:

This is a tip, suggestion, or general note. It contains useful

The thermometer icons, found next to each hack, indicate the
relative complexity of the hack:

Safari Enabled

When you see a Safari® Enabled icon on the cover of your
favorite technology book, that means the book is available
online through the O'Reilly Network Safari Bookshelf.
Safari offers a solution that's better than e-books. It's a virtual
library that lets you easily search thousands of top tech books,
Try it for free at http://safari.oreilly.com.

How to Contact Us
We have tested and verified the information in this book to the
best of our ability, but you may find that the rules or
characteristics of a given situation are different than described
statements, and typos that you find anywhere in this book.
incorporate reasonable suggestions into future editions. You can
write to us at:
O'Reilly Media, Inc.
1005 Gravenstein Hwy N.
Sebastopol, CA 95472
707-829-0515 (international/local)
707-829-0104 (fax)
email to:
bookquestions@oreilly.com

The web site for Statistics Hacks lists examples, errata, and
plans for future editions. You can find this page at:
http://www.oreilly.com/catalog/statisticshks
O'Reilly web site:
http://www.oreilly.com

Got a Hack?
To explore Hacks books online or to contribute a hack for future
titles, visit:
http://hacks.oreilly.com

Chapter 1. The Basics
There's only a small group of tools that statisticians use to
the way that statisticians use probability or knowledge of the
normal distribution to help them out in different situations that
varies. This chapter presents these basic hacks.
as a probability [Hack #1] is an essential trick frequently used
by stat-hackers, as is using a tiny bit of sample data to
accurately describe all the scores in a larger population [Hack
#2]. Knowledge of basic rules for calculating probabilities [Hack
#3] is crucial, and you gotta know the logic of significance
testing if you want to make statistically-based decisions [Hacks
#4 and #8].
Minimizing errors in your guesses [Hack #5] and scores [Hack
#6] and interpreting your data [Hack #7] correctly are key
strategies that will help you get the most bang for your buck in
a variety of situations. And successful stat-hackers have no
trouble recognizing what the results of any organized set of
observations or experimental manipulation really mean [Hacks
#9 and #10].
Learn to use these core tools, and the later hacks will be a
breeze to learn and master.

Hack 1. Know the Big Secret

Statisticians know one secret thing that makes them
seem smarter than everybody else.
The primary purpose of statistics as a scientific methodology is
we jump into that, we need some quick definitions to get us
rolling, both to understand this hack and to lay a foundation for
other statistics hacks.
Samples are numeric values that you have gathered together
and can see in front of you that represent some larger
population of scores that you have not gathered together and
cannot see in front of you. Because these values are almost
always numbers that indicate the presence or level of some
characteristic, measurement folks call these values scores. A
some event occurring.
Probability is the heart and soul of statistics. A common
perception of statisticians, in fact, is that they mainly calculate
the exact likelihood that certain events of interest will occur,
such as winning the lottery or being struck by lightning.
to describe a large group of people using only a few summary
statistics.
time spent on the basic rules of probability: the methods for
calculating the chances of various combinations or permutations
of possible outcomes. More common applications of statistics,
however, are the use of descriptive statistics to describe a group
of scores, or the use of inferential statistics to make guesses
contained in a sample of scores. In social science, the scores
usually describe either people or something that is happening to
them.
It turns out, then, that researchers and measurers (the people
who are most likely to use statistics in the real world) are called
upon to do more than calculate the probability of certain
combinations and permutations of interest. They are able to
questions of varying levels of complexity without once needing
to compute the odds of throwing a pair of six-sided dice and
getting three 7s in a row.

Those odds are .005 or 1/2 of 1 percent if you start from scratch. If you
that third 7.
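
The .005 figure in the note can be checked by counting outcomes. The sketch below is an illustration added here, not code from the book; the function name `prob_of_sum` is made up for the example, and it assumes two fair six-sided dice and independent throws.

```python
from fractions import Fraction
from itertools import product


def prob_of_sum(target: int, sides: int = 6) -> Fraction:
    """Exact probability that two fair dice add up to `target`."""
    # Enumerate all equally likely (die1, die2) outcomes: 36 for six-sided dice.
    outcomes = list(product(range(1, sides + 1), repeat=2))
    hits = sum(1 for a, b in outcomes if a + b == target)
    return Fraction(hits, len(outcomes))


p_seven = prob_of_sum(7)        # 6 of the 36 outcomes total 7
p_three_sevens = p_seven ** 3   # independent throws: multiply the probabilities

print(p_seven)                  # 1/6
print(p_three_sevens)           # 1/216
print(round(float(p_three_sevens), 3))
```

Rounded to three decimal places, 1/216 gives the .005 quoted in the note.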

The Big Secret
The key reason that probability is so crucial to what statisticians
the scores in real or theoretical distributions.

sometimes, how many of each value there are.

25 percent of the class got 10 points, then I might say, without
chance that you got 10 points. I could also say that there is a
75 percent chance that you did not get 10 points. All I have
values and expressed that information as a statement of
probability. This is a trick. It is the secret trick that all
statisticians know. In fact, this is mostly all that statisticians
ever do!
some values and express that information as a statement of
probability. This is worth repeating (or, technically,
threepeating, as I first said it five sentences ago). Statisticians
and express that information as a statement of probability.
Heavens to Betsy, we can all do that. How hard could it be?
Imagine that there are three marbles in an otherwise empty
coffee can. Further imagine that you know that only one of the
marbles is blue. There are three values in the distribution: one
blue marble and two marbles of some other color, for a total
sample size of three. There is one blue marble out of three
marbles. Oh, statistician, what are the chances that, without
looking, I will draw the blue marble out first? One out of three.
1/3. 33 percent.
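
The coffee can trick, turning a known distribution of values into statements of probability, fits in a few lines of code. This is a minimal sketch rather than anything from the book; the helper name `as_probabilities` and the labels `"blue"` and `"other"` are assumptions chosen for the example.

```python
from collections import Counter
from fractions import Fraction


def as_probabilities(values):
    """Turn a known distribution of values into probabilities of drawing each one."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: Fraction(count, total) for value, count in counts.items()}


# The coffee can: one blue marble and two marbles of some other color.
coffee_can = ["blue", "other", "other"]
probs = as_probabilities(coffee_can)

p_blue = probs["blue"]      # chance of drawing the blue marble first
p_not_blue = 1 - p_blue     # chance of drawing anything else first

print(p_blue, p_not_blue)   # 1/3 2/3
```

The same helper applies to the classroom example: if a quarter of the scores in a list equal 10, it returns 1/4 for the value 10, which is the 25 percent chance described earlier.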
To be fair, the values and their distributions most commonly
used by statisticians are a bit more abstract or complex than
those of the marbles in a coffee can scenario, and so much of
what statisticians do is not quite that transparent. Applied social
science researchers usually produce values that represent the
difference between the average scores of several groups of
people, for example, or an index of the size of the relationship
between two or more sets of scores. The underlying process is
the same as that used with the coffee can example, though:

reference the known distribution of the value of interest and
The key, of course, is how one knows the distribution of all
these exotic types of values that might interest a statistician.
How can one know the distribution of average differences or the
distribution of the size of a relationship between two sets of
variables? Conveniently, past researchers and mathematicians
have developed or discovered formulas and theorems and rules
of thumb and philosophies and assumptions that provide us
with the knowledge of the distributions of these complex values
most often sought by researchers. The work has been done for
us.

A Smaller, Dirtier Secret
Most of the procedures that statisticians use to take known
information as a statement of probability have certain
requirements that must be met for the probability statement to
be accurate. One of these assumptions that almost always must
be met is that the values in a sample have been randomly
drawn from the distribution.
Notice that in the coffee can example I slipped in that "without
guiding the sampling process, then the associated probabilities
reported are simply wrong and, here's the worst part, we can't
possibly know how wrong they are. Much, and maybe most, of
the applied psychological and educational research that occurs
today uses samples of people that were not randomly drawn
from some population of interest.
College students taking an introductory psychology course
make up the samples of much psychological research, for
example, and students at elementary schools conveniently
located near where an educational researcher lives are often
chosen for study. This is a problem that social science
nevertheless, it is a limitation of much social science research.
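
A small simulation shows why the random-draw requirement matters. Everything in this sketch is invented for illustration: the population of 10,000 scores, the sample size of 200, and the assumed mechanism by which the convenience sample can reach only above-100 scorers. None of it comes from the book.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 scores with mean near 100 and SD near 15.
population = [rng.gauss(100, 15) for _ in range(10_000)]

# Random sample: every score has the same chance of being drawn.
random_sample = rng.sample(population, 200)

# Convenience sample (assumed mechanism): only scorers above 100 are within reach.
reachable = [score for score in population if score > 100]
convenience_sample = rng.sample(reachable, 200)

pop_mean = mean(population)
random_error = abs(mean(random_sample) - pop_mean)
convenience_error = abs(mean(convenience_sample) - pop_mean)

print(f"population mean:          {pop_mean:.1f}")
print(f"random sample error:      {random_error:.1f}")
print(f"convenience sample error: {convenience_error:.1f}")
```

In the simulation the whole population is visible, so the size of the convenience sample's error can be measured. A researcher who sees only the sample has no such check, which is the limitation the passage describes.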

Hack 2. Describe the World Using Just Two Numbers

Most of the statistical solutions and tools presented in
this book work only because you can look at a sample
The Central Limit Theorem is the meta-tool, the prime
directive, the king of all secrets that allows us to pull off
these inferential tricks.
Statistics provide solutions to problems whenever your goal is
to describe a group of scores. Sometimes the whole group of
scores you want to describe is in front of you. The tools for this
only part of the group of the scores you want to describe, but
you still want to describe the whole group. This summary
approach is called inferential statistics. In inferential statistics,
the part of the group of scores you can see is called a sample,
and the whole group of scores you wish to make inferences
describe with any confidence a population of values when, by
definition, you are not directly observing those values. By using
three pieces of information, two sample values and an
population, you can confidently and accurately describe those
invisible populations. The set of procedures for deriving that
eerily accurate description is collectively known as the Central
Limit Theorem.

Some Quick Statistics Basics

Inferential statistics tend to use two values to describe
populations, the mean and the standard deviation.

Mean
Rather than describe a sample of values by showing them all, it
is simply more efficient to report some fair summary of a group
number is meant to fairly represent all the scores and what
they have in common. Consequently, this single number is
referred to as the central tendency of a group of scores.
Typically, the best measure of central tendency, for a variety of
reasons, is the mean [Hack #21]. The mean is the arithmetic
the values in a group, and then dividing that total by the
all the scores in a group than other central tendency options
(such as reporting the middle score, the most common score,
and so on).
In fact, mathematically, the mean has an interesting property. A
by the number of scores) produces a number that is as close as
possible to all the other scores. The mean will be close to some
distances, you get a total that is as small as possible. No other
number, real or imagined, will produce a smaller total distance
from all the scores in a group than the mean.
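
The "as close as possible" property can be checked numerically. This sketch assumes the distances are squared before they are totaled, which is the sense in which the mean is the minimizer (for plain absolute distances the median takes that role). The scores are made up for the example.

```python
from statistics import mean

scores = [2, 4, 4, 5, 10]  # hypothetical scores, chosen so the mean is a whole number


def total_squared_distance(center, values):
    """Sum of squared distances from `center` to every value."""
    return sum((v - center) ** 2 for v in values)


m = mean(scores)  # 25 / 5 = 5

# Signed distances above and below the mean cancel exactly.
signed_total = sum(v - m for v in scores)

# Try the mean and a few nearby candidates; the mean gives the smallest total.
candidates = [m - 1, m - 0.1, m, m + 0.1, m + 1]
best = min(candidates, key=lambda c: total_squared_distance(c, scores))

print(m, signed_total, best)
```

The signed distances summing to zero is the same cancellation that forces the standard deviation formula to square each distance first.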

Standard deviation
of the scores. Are they mostly close to the mean or mostly far
from the mean? Two wildly different distributions could have the
same mean but differ in their variability. The most commonly
reported measure of variability summarizes the distances
between each score and the mean.
As with the mean, the more informative measure of variability
measure of variability that does this is the standard deviation.
The standard deviation is the average distance of each score
from the mean. A standard deviation calculates all the distances
are the distance between each score and the mean.

Another commonly reported value that summarizes the variability in a
distribution is the variance. The variance is simply the standard
deviation squared and is not particularly useful in picturing a
is frequently used as a value in statistical calculations, such as with the
independent t test [Hack #17].

The formula for the standard deviation appears to be more
complicated than it needs to be, but there are some
mathematical complications with summing distances (negative
distances always cancel out the positive distances when the
mean is used as the dividing point). Consequently, here is the
equation:
Σ means to sum up. The x means each score, and the n means
the number of scores.
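
The equation itself did not survive in this copy of the text. A standard form that matches the symbols just defined is sketched below. The symbol $\bar{x}$ for the mean and the $n - 1$ denominator (the usual choice when the scores are a sample) are assumptions; some presentations divide by $n$ instead.

```latex
s = \sqrt{\frac{\sum (x - \bar{x})^{2}}{n - 1}}
```

Squaring each distance before summing keeps the negative distances from canceling the positive ones, and the square root returns the result to the original units of the scores.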

Central Limit Theorem
The Central Limit Theorem is fairly brief, but very powerful.

