Cassandra high availability

www.it-ebooks.info


Table of Contents
Cassandra High Availability
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Errata
Piracy
Questions
1. Cassandra’s Approach to High Availability
ACID
The monolithic architecture
The master-slave architecture
Sharding
Master failover
Cassandra’s solution
Cassandra’s architecture
Distributed hash table
Replication
Replication across data centers
Tunable consistency
The CAP theorem
Summary
2. Data Distribution
Hash table fundamentals
Distributing hash tables
Consistent hashing
The mechanics of consistent hashing
Token assignment
Manually assigned tokens
vnodes
How vnodes improve availability
Adding and removing nodes
Node rebuilding
Heterogeneous nodes
Partitioners
Hotspots
Effects of scaling out using ByteOrderedPartitioner
A time-series example
Summary
3. Replication
The replication factor
Replication strategies
SimpleStrategy
NetworkTopologyStrategy
Snitches
Maintaining the replication factor when a node fails
Consistency conflicts
Consistency levels
Repairing data
Balancing the replication factor with consistency
Summary
4. Data Centers
Use cases for multiple data centers
Live backup
Failover
Load balancing
Geographic distribution
Online analysis
Analysis using Hadoop
Analysis using Spark
Data center setup
RackInferringSnitch
PropertyFileSnitch
GossipingPropertyFileSnitch
Cloud snitches
Replication across data centers
Setting the replication factor
Consistency in a multiple data center environment
The anatomy of a replicated write
Achieving stronger consistency between data centers
Summary
5. Scaling Out
Choosing the right hardware configuration
Scaling out versus scaling up
Growing your cluster
Adding nodes without vnodes
Adding nodes with vnodes
How to scale out
Adding a data center
How to scale up
Upgrading in place
Scaling up using data center replication
Removing nodes
Removing nodes within a data center
Decommissioning a data center
Other data migration scenarios
Snitch changes
Summary
6. High Availability Features in the Native Java Client
Thrift versus the native protocol
Setting up the environment
Connecting to the cluster
Executing statements
Prepared statements
Batched statements
Caution with batches
Handling asynchronous requests
Running queries in parallel
Load balancing
Failing over to a remote data center
Downgrading the consistency level
Defining your own retry policy
Token awareness
Tying it all together
Falling back to QUORUM
Summary
7. Modeling for High Availability
How Cassandra stores data
Implications of a log-structured storage
Understanding compaction
Size-tiered compaction
Leveled compaction
Date-tiered compaction
CQL under the hood
Single primary key
Compound keys
Partition keys
Clustering columns
Composite partition keys
The importance of the storage model
Understanding queries
Query by key
Range queries
Denormalizing with collections
How collections are stored
Sets
Lists
Maps
Working with time-series data
Designing for immutability
Modeling sensor data
Queries
Time-based ordering
Using a sentinel value
Satisfying our queries
When time is all that matters
Working with geospatial data
Summary
8. Antipatterns
Multikey queries
Secondary indices
Secondary indices under the hood
Distributed joins
Deleting data
Garbage collection
Resurrecting the dead
Unexpected deletes
The problem with tombstones
Expiring columns
TTL antipatterns
When null does not mean empty
Cassandra is not a queue
Unbounded row growth
Summary
9. Failing Gracefully
Knowledge is power
Monitoring via Java Management Extensions
Using OpsCenter
Choosing a management toolset
Logging
Cassandra logs
Garbage collector logs
Monitoring node metrics
Thread pools
Column family statistics
Finding latency outliers
Communication metrics
When a node goes down
Marking a downed node
Handling a downed node
Handling slow nodes
Backing up data
Taking a snapshot
Incremental backups
Restoring from a snapshot
Summary
Index

Cassandra High Availability
Copyright © 2014 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system,
or transmitted in any form or by any means, without the prior written permission of the
publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the
information presented. However, the information contained in this book is sold without
warranty, either express or implied. Neither the author, nor Packt Publishing, and its
dealers and distributors will be held liable for any damages caused or alleged to be caused
directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the
companies and products mentioned in this book by the appropriate use of capitals.
However, Packt Publishing cannot guarantee the accuracy of this information.
First published: December 2014
Production reference: 1221214
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78398-912-6
www.packtpub.com

Credits
Author
Robbie Strickland
Reviewers
Richard Low
Jimmy Mårdell
Rob Murphy
Russell Spitzer
Commissioning Editor
Kunal Parikh
Acquisition Editors
Richard Harvey
Owen Roberts
Content Development Editors
Samantha Gonsalves
Azharuddin Sheikh
Technical Editor
Ankita Thakur
Copy Editors
Pranjali Chury
Merilyn Pereira
Project Coordinator
Sanchita Mandal
Proofreaders
Simran Bhogal
Maria Gould
Ameesha Green
Paul Hindle
Indexer
Rekha Nair
Graphics
Sheetal Aute
Disha Haria
Abhinash Sahu
Production Coordinator
Alwin Roy
Cover Work
Alwin Roy

About the Author
Robbie Strickland got involved in the Apache Cassandra project in 2010, and he initially
went into production with the 0.5 release. He has made numerous contributions over the
years, including his work on drivers for C# and Scala, and multiple contributions to the
core Cassandra codebase. In 2013, he became the very first certified Cassandra developer,
and in 2014, DataStax selected him as an Apache Cassandra MVP.
While this is Robbie’s first published technical book, he has been an active speaker and
writer in the Cassandra community and is the founder of the Atlanta Cassandra Users
Group. Other examples of his writing can be found on the DataStax blog, and he has
conducted numerous webinars and spoken at many conferences over the years.
I would like to thank my wife for encouraging me to go forward with this project and for
continuing to be supportive throughout the significant time commitment required to write
a book. Also, I am truly appreciative of my excellent reviewers: Richard Low, Jimmy
Mårdell, Rob Murphy, and Russell Spitzer. They helped keep me honest, and their deep
expertise added materially to the quality of the content. I would also like to thank the
entire staff at Packt Publishing who were involved in the book’s publishing process.
Lastly, I want to thank Logan Johnson who initially pointed me toward Cassandra. The
risk has paid off, and Logan is responsible for starting me off on this path.

About the Reviewers
Richard Low has worked with Cassandra since Version 0.6 and has managed and
supported some of the largest Cassandra deployments. He has contributed fixes and
features to the project and has helped many users build their first Cassandra deployment.
He is a regular speaker at Cassandra events and a contributor to Cassandra online forums.
Jimmy Mårdell is a senior software engineer and Cassandra contributor who has spent
the last 4 years working with large distributed systems using Cassandra. Since 2013, he
has been leading a database infrastructure team at Spotify, focusing on improving the
Cassandra ecosystem at Spotify and empowering other teams to operate Cassandra
clusters. Jimmy likes algorithms and competitive programming and won the programming
competition Google Code Jam in 2003.
Rob Murphy is a solutions engineer at DataStax with more than 16 years of experience in
the field of data-driven application development and design. Rob’s background includes
work with most RDBMS platforms as well as DataStax/Apache Cassandra, Hadoop,
MongoDB, Apache Accumulo, and Apache Spark. His passion for solving “data
problems” goes beyond the system level to the data itself. Rob has a Master’s degree in
Predictive Analytics from Northwestern University with a specific research interest in
machine learning and predictive algorithms at the “Internet scale”.
Russell Spitzer received his PhD in Bioinformatics from UCSF in 2013, where he
became increasingly interested in data analytics and distributed computation. He followed
these interests and joined DataStax, the enterprise company behind the Apache Cassandra
distributed database. At DataStax, he works on the testing and development of the
integration between Cassandra and other groundbreaking open source technologies, such
as Spark, Solr, and Hadoop.
I would like to thank my wife, Maggie, who put up with a lot of late-night laptop screen
glow so that I could help out with this book.

www.PacktPub.com

Support files, eBooks, discount offers, and more
For support files and downloads related to your book, please visit www.PacktPub.com.
Did you know that Packt offers eBook versions of every book published, with PDF and
ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as
a print book customer, you are entitled to a discount on the eBook copy. Get in touch with
us at for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up
for a range of free newsletters and receive exclusive discounts and offers on Packt books
and eBooks.

https://www2.packtpub.com/books/subscription/packtlib
Do you need instant solutions to your IT questions? PacktLib is Packt’s online digital
book library. Here, you can search, access, and read Packt’s entire library of books.

Why subscribe?

Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser

Free access for Packt account holders
If you have an account with Packt at www.PacktPub.com, you can use this to access
PacktLib today and view 9 entirely free books. Simply use your login credentials for
immediate access.

Preface
Cassandra is a fantastic data store and is certainly well suited as the foundation for a
highly available system. In fact, it was built for such a purpose: to handle Facebook’s
messaging service. However, it hasn’t always been so easy to use, with its early Thrift
interface and unfamiliar data model causing many potential users to pause, and in many
cases for a good reason.
Fortunately, Cassandra has matured substantially over the last few years. I used to advise
people to use Cassandra only if nothing else would do the job because the learning curve
for it was quite high. However, the introduction of newer features such as CQL and
vnodes has changed the game entirely.
What once appeared complex and overly daunting now comes across as deceptively
simple. A SQL-like interface masks the underlying data structure, whose familiarity can
lure an unsuspecting new user into dangerous traps. The moral of this story is that it’s not
a relational database, and you still need to know what it’s doing under the hood.
Imparting this knowledge is the core objective of this book. Each chapter attempts to
demystify the inner workings of Cassandra so that you no longer have to work blindly
against a black box data store. You will learn to configure, design, and build your system
based on a fundamentally solid foundation.
The good news is that Cassandra makes the task of building massively scalable and
incredibly reliable systems relatively straightforward, presuming you understand how to
partner with it to achieve these goals.
Since you are reading this book, I presume you are either already using Cassandra or
planning to do so, and that you’re interested in building a highly available system on top
of it. If so, I am confident that you will meet with success if you follow the principles and
guidelines offered in the chapters that follow.

What this book covers
Chapter 1, Cassandra’s Approach to High Availability, is an introduction to concepts
related to system availability and the problems that have been encountered historically
while trying to make data stores highly available. This chapter outlines Cassandra’s
solutions to these problems.
Chapter 2, Data Distribution, outlines the core mechanisms that underlie Cassandra’s
distributed hash table model, including consistent hashing and partitioner
implementations.
Chapter 3, Replication, offers an in-depth look at the data replication architecture used in
Cassandra, with a focus on the relationship between consistency levels and replication
factors.
Chapter 4, Data Centers, enables you to thoroughly understand Cassandra’s robust data
center replication capabilities, including deployment on EC2 and building separate
clusters for analysis using Hadoop or Spark.
Chapter 5, Scaling Out, is a discussion on the tools, processes, and general guidance
required to properly increase the size of your cluster.
Chapter 6, High Availability Features in the Native Java Client, covers the new native
Java driver and its availability-related features. We’ll discuss node discovery, cluster-aware
load balancing, automatic failover, and other important concepts.
Chapter 7, Modeling for High Availability, explains the important concepts you need to
understand while modeling highly available data in Cassandra. CQL, keys, wide rows, and
denormalization are among the topics that will be covered.
Chapter 8, Antipatterns, complements the data modeling chapter by presenting a set of
common antipatterns that proliferate among inexperienced Cassandra developers. Some
patterns include queues, joins, high delete volumes, and high cardinality secondary
indexes among others.
Chapter 9, Failing Gracefully, helps the reader to understand how to deal with various
failure cases, as failure in a large distributed system is inevitable. We’ll examine a number
of possible failure scenarios, and discuss how to detect and resolve them.

What you need for this book
This book assumes you have access to a running Cassandra installation that’s at least as
new as release 1.2.x. Some features discussed will be applicable only to the 2.0.x series,
and we will point these out when this applies. Users of versions older than 1.2.x can still
gain a lot from the content, but there will be some portions that do not directly translate to
those versions.
For the coverage of the Java driver in Chapter 6, High Availability Features in the Native
Java Client, you will need the Java Development Kit 1.7 and a suitable text editor to write
Java code. All command-line examples assume a Linux environment since this is the only
supported operating system for use with a production Cassandra system.

Who this book is for
This book is for developers and system administrators who are interested in building an
advanced understanding of Cassandra’s internals for the purpose of deploying high
availability services using it as a backing data store. This is not an introduction to
Cassandra, so those who are completely new would be well served to find a suitable
tutorial before diving into this book.

Conventions
In this book, you will find a number of styles of text that distinguish between different
kinds of information. Here are some examples of these styles and an explanation of their
meaning.
Code words in text, database table names, folder names, filenames, file extensions,
pathnames, dummy URLs, user input, and Twitter handles are shown as follows: “The
PropertyFileSnitch configuration allows an administrator to precisely configure the
topology of the network by means of a properties file named cassandra-topology.properties.”
A block of code is set as follows:
CREATE KEYSPACE AddressBook
WITH REPLICATION = {
  'class': 'NetworkTopologyStrategy',
  'dc1': 3,
  'dc2': 2
};

When we wish to draw your attention to a particular part of a code block, the relevant
lines or items are set in bold:
CREATE KEYSPACE AddressBook
WITH REPLICATION = {
  'class': 'SimpleStrategy',
  'replication_factor': 3
};

Any command-line input or output is written as follows:
# nodetool status

New terms and important words are shown in bold. Words that you see on the screen,
for example, in menus or dialog boxes, appear in the text like this: “Then, fill in the host,
port, and your credentials in the dialog box and click on the Connect button.”

Note
Warnings or important notes appear in a box like this.

Tip
Tips and tricks appear like this.

Reader feedback
Feedback from our readers is always welcome. Let us know what you think about this
book: what you liked or may have disliked. Reader feedback is important for us to
develop titles that you really get the most out of.
To send us general feedback, simply send an e-mail to, and
mention the book title through the subject of your message.
If there is a topic that you have expertise in and you are interested in either writing or
contributing to a book, see our author guide on www.packtpub.com/authors.

Customer support
Now that you are the proud owner of a Packt book, we have a number of things to help
you to get the most from your purchase.

Errata
Although we have taken every care to ensure the accuracy of our content, mistakes do
happen. If you find a mistake in one of our books (maybe a mistake in the text or the
code), we would be grateful if you could report this to us. By doing so, you can save other
readers from frustration and help us improve subsequent versions of this book. If you find
any errata, please report them by visiting http://www.packtpub.com/submit-errata,
selecting your book, clicking on the Errata Submission Form link, and entering the
details of your errata. Once your errata are verified, your submission will be accepted and
the errata will be uploaded to our website or added to any list of existing errata under the
Errata section of that title.
To view the previously submitted errata, go to
https://www.packtpub.com/books/content/support and enter the name of the book in the
search field. The required information will appear under the Errata section.
