
How HBase Works: Study Notes

Source: collected by IT165  Published: 2016-07-21 21:37:05


 

1 Introduction to HBase

HBase is a highly reliable, high-performance, column-oriented, scalable distributed storage system. With HBase, a large-scale structured storage cluster can be built on inexpensive PC servers. Its goal is to store and process very large data sets: using only commodity hardware, it can handle tables made up of thousands upon thousands of rows and columns.

Unlike MapReduce, which is an offline batch-processing framework, HBase is a storage and retrieval platform that supports random access. It makes up for HDFS's inability to access data randomly and suits workloads whose real-time requirements are not extremely strict. Everything HBase stores is a byte array; it is indifferent to data types, which allows a dynamic and flexible data model.

 

The figure above shows the layers of the Hadoop 2.0 ecosystem. HBase sits in the structured storage layer. HDFS provides HBase with highly reliable underlying storage, MapReduce provides high-performance batch processing, and ZooKeeper provides stable coordination services and a failover mechanism. Pig and Hive offer high-level language support for statistical processing of HBase data, while Sqoop offers convenient import from an RDBMS, which makes migrating business data from traditional databases to HBase straightforward.

2 HBase Architecture

2.1 Design Approach

HBase is a distributed database that uses ZooKeeper to manage the cluster and HDFS as its underlying storage. Architecturally it consists of an HMaster (the leader elected through ZooKeeper) and multiple HRegionServers. The basic architecture is shown in the figure below:

 

In HBase terms, an HRegionServer corresponds to one node in the cluster. One HRegionServer manages multiple HRegions, and one HRegion holds part of the data of one table. A single table may need many HRegions to store its data, and the data inside each HRegion is not arranged arbitrarily: HBase assigns every HRegion a RowKey range, and data that falls within a given range is handed to the corresponding HRegion. This spreads the load across multiple nodes and takes full advantage of the distributed design. In addition, HBase automatically adjusts where HRegions are placed. If an HRegionServer becomes hot, that is, a large share of requests lands on the HRegions it manages, HBase moves HRegions to other, relatively idle nodes so that the cluster as a whole is fully used.
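The RowKey-range idea can be illustrated with a small lookup routine. This is only a sketch under stated assumptions: the start keys, the server names (`rs1` to `rs3`), and the function names are invented for illustration and are not HBase APIs. Each region is assumed to cover the keys from its own start key up to, but not including, the next region's start key.

```python
from bisect import bisect_right

# Hypothetical region layout: region i covers [START_KEYS[i], START_KEYS[i + 1]).
# The empty string stands for "from the beginning of the table".
REGION_START_KEYS = ["", "g", "n", "t"]
# Hypothetical assignment of each region to a server (names are made up).
REGION_SERVERS = ["rs1", "rs2", "rs1", "rs3"]


def locate_region(row_key: str) -> int:
    """Return the index of the region whose key range contains row_key."""
    # bisect_right returns the position of the first start key greater than
    # row_key; the region just before that position owns the key.
    return bisect_right(REGION_START_KEYS, row_key) - 1


def locate_server(row_key: str) -> str:
    """Return the server that currently hosts the region for row_key."""
    return REGION_SERVERS[locate_region(row_key)]
```

Because only the mapping from region to server changes when a region is moved, rebalancing a hot server in this model amounts to editing `REGION_SERVERS` while the key ranges stay the same.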

2.2 Basic Architecture

HBase consists of an HMaster and HRegionServers and follows a master/slave server architecture. It divides each logical table into multiple data blocks called HRegions, which are stored on HRegionServers. The HMaster manages all HRegionServers. It stores no table data itself, only the mapping from data to HRegionServers (the metadata). All nodes in the cluster coordinate through ZooKeeper, which also helps handle the various problems HBase may run into at runtime. The basic architecture of HBase is shown in the figure below:

 

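Client: Uses HBase's RPC mechanism to communicate with the HMaster and the HRegionServers, submitting requests and receiving results. For administrative operations the client makes RPCs to the HMaster; for data reads and writes it makes RPCs to the HRegionServers.

ZooKeeper: Every node registers its state in ZooKeeper, so the HMaster can track the health of each HRegionServer at any time. ZooKeeper also removes the HMaster as a single point of failure.

HMaster: Manages all HRegionServers, tells each one which HRegions it must serve, and monitors the running state of every HRegionServer. When a new HRegionServer registers with the HMaster, the HMaster tells it to wait for data to be assigned. When an HRegionServer crashes, the HMaster marks all HRegions that server was responsible for as unassigned and then assigns them to other HRegionServers. The HMaster is not a single point of failure: HBase can start several HMasters, and ZooKeeper's election mechanism ensures that one HMaster is always running, which improves the availability of the cluster.

HRegion: When a table grows beyond a preset size, HBase automatically divides it into regions, each containing a subset of the table's rows. To the user, a table is a collection of data distinguished by its primary key (RowKey). Physically, a table is split into multiple pieces, and each piece is an HRegion. An HRegion is identified by the table name plus its start and end keys. One HRegion stores a contiguous range of a table's data, and the complete table is stored across multiple HRegions.

HRegionServer: At the lowest level, all HBase data is generally stored in HDFS, and users reach it through HRegionServers. A cluster node usually runs only one HRegionServer, and the HRegion for any given key range is served by exactly one HRegionServer. The HRegionServer responds to user I/O requests and reads and writes data in HDFS; it is the core module of HBase. Internally it manages a set of HRegion objects, each corresponding to a contiguous range of a logical table. An HRegion consists of multiple HStores, and each HStore holds the storage for one column family of the logical table, so each column family is effectively one centralized storage unit. For efficiency, columns with similar I/O characteristics are therefore best placed in the same column family.

HStore: The core of HBase storage, made up of a MemStore and StoreFiles. The MemStore is an in-memory buffer. Data written by users goes into the MemStore first; when the MemStore is full it is flushed into a StoreFile (implemented underneath as an HFile). When the number of StoreFiles reaches a threshold, a compaction is triggered that merges multiple StoreFiles into one, merging versions and removing deleted data in the process. HBase therefore only ever appends data: all updates and deletes are actually applied during later compactions. A write can return as soon as it has reached memory, which keeps HBase I/O fast. As StoreFiles are compacted they gradually grow larger. When a single StoreFile exceeds a size threshold, a split is triggered and the current HRegion is split into two. The parent HRegion goes offline, and the HMaster assigns the two new child HRegions to HRegionServers, so that the load of the original HRegion is spread over two HRegions.

HLog: Every HRegionServer has an HLog object, a class that implements a write-ahead log. Each time a user operation writes data to the MemStore, a copy is also written to the HLog file. HLog files are rolled periodically, and old files (whose data has already been persisted to StoreFiles) are deleted. When the HMaster learns through ZooKeeper that an HRegionServer has terminated unexpectedly, it first processes the leftover HLog files, splitting the log data by HRegion and placing each part in the directory of the corresponding HRegion. It then reassigns the affected HRegions. While loading such an HRegion, the HRegionServer that receives it finds historical HLog data to process, replays it into the MemStore, and flushes it to StoreFiles, which completes the recovery.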
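The HLog recovery flow described above (log first, split the log by region, replay into memory) can be sketched in a few lines. This is a toy model, not HBase code: `ToyRegionServer`, `split_log`, and `replay` are hypothetical names, the log is a plain Python list rather than a file on HDFS, and flushing to StoreFiles is left out.

```python
class ToyRegionServer:
    """Illustrative only: one shared write-ahead log plus per-region memory."""

    def __init__(self):
        self.log = []        # stands in for the HLog file
        self.memstore = {}   # region name -> {key: value}

    def put(self, region, key, value):
        # Append to the log first, then update the in-memory store.
        self.log.append((region, key, value))
        self.memstore.setdefault(region, {})[key] = value


def split_log(log):
    """Group log entries by region, preserving their original order."""
    per_region = {}
    for region, key, value in log:
        per_region.setdefault(region, []).append((key, value))
    return per_region


def replay(entries):
    """Rebuild one region's in-memory state by replaying its log entries."""
    memstore = {}
    for key, value in entries:
        memstore[key] = value  # later entries overwrite earlier ones
    return memstore


# Usage: write to two regions, "lose" the server's memory, recover from the log.
server = ToyRegionServer()
server.put("r1", "a", 1)
server.put("r2", "b", 2)
server.put("r1", "a", 3)
recovered = {region: replay(entries) for region, entries in split_log(server.log).items()}
```

Since every write reaches the log before memory, replaying the log in order reproduces exactly the state the failed server held in memory.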

2.3 The -ROOT- and .META. Tables

The metadata for all HRegions in HBase is stored in the .META. table. As the number of HRegions grows, the .META. table grows too and is itself split into multiple HRegions. To locate the HRegions of the .META. table, their metadata is kept in the -ROOT- table, and ZooKeeper records the location of the -ROOT- table. Before accessing user data, every client must first ask ZooKeeper for the location of -ROOT-, then read -ROOT- to find the location of the relevant .META. region, and finally use the information in .META. to determine where the user data is stored, as shown in the figure below.

 

The -ROOT- table is never split and always has exactly one HRegion, which guarantees that any HRegion can be located in at most three hops. To speed up access, all HRegions of the .META. table are kept in memory. Clients cache the locations they have looked up, and the cache is never invalidated proactively. If a client cannot reach the data using its cached information, it asks the region server holding the relevant .META. region for the data's location. If that also fails, it asks the -ROOT- table where the relevant .META. region is. Finally, if all of that information is stale, it goes back to ZooKeeper to relocate the HRegion. So if every cached entry on the client is stale, six network round trips are needed to find the correct HRegion.
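The three-hop lookup and the effect of the client cache can be sketched as follows. Everything here is an assumption made for illustration: the dictionary contents, the server names, and the `LocationClient` class are invented, and real lookups are RPCs rather than dictionary reads. The sketch covers only the cold-cache and warm-cache cases, not the stale-cache retry path.

```python
# Hypothetical lookup tables; keys and server names are made up.
ZOOKEEPER = {"root-location": "rs1"}                 # where -ROOT- lives
ROOT_TABLE = {".META.,region0": "rs2"}               # -ROOT-: where each .META. region lives
META_TABLE = {("users", ""): "rs3", ("users", "m"): "rs4"}  # .META.: where user regions live


class LocationClient:
    def __init__(self):
        self.cache = {}  # (table, region start key) -> server
        self.hops = 0    # number of remote lookups performed so far

    def locate(self, table, region_start):
        key = (table, region_start)
        if key in self.cache:
            return self.cache[key]  # cached: no remote lookup at all
        self.hops += 1
        root_server = ZOOKEEPER["root-location"]    # hop 1: ask ZooKeeper for -ROOT-
        self.hops += 1
        meta_server = ROOT_TABLE[".META.,region0"]  # hop 2: ask -ROOT- (on root_server)
        self.hops += 1
        server = META_TABLE[key]                    # hop 3: ask .META. (on meta_server)
        self.cache[key] = server
        return server


client = LocationClient()
first = client.locate("users", "m")
hops_after_first = client.hops
second = client.locate("users", "m")
```

The first call pays for three lookups; the second is answered from the cache, which is why a warm client normally goes straight to the right HRegionServer.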

3 HBase Data Model

HBase is a distributed database similar to BigTable. It is a sparse, persistent (stored on HDFS), multidimensional, sorted map. The map is indexed by row key, column key, and timestamp. All data in HBase is stored as untyped strings (byte arrays).

 

A table can be thought of as one large map: a row key, a row key plus timestamp, or a row key plus column (column family:column qualifier) is enough to locate specific data. Because HBase stores data sparsely, some columns may be empty. The table above gives the logical view of the data stored for the site com.cnn.www. It contains a single row whose unique identifier is "com.cnn.www", and every logical modification to the row is associated with a timestamp. The table has four columns: contents:html, anchor:cnnsi.com, anchor:my.look.ca, and mime:type. The prefix of each column name gives the column family it belongs to.

The row key (RowKey) uniquely identifies a row in the table and serves as the primary key for retrieving records. There are only three ways to access rows in HBase: by a single row key, by a range of row keys, or by a full table scan. A row key can be any string (up to 64 KB long), and rows are stored in lexicographic order of their keys. For rows that are often read together, the key must be designed carefully so that they are stored next to each other.
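One way to picture this model is as a nested map from row key to column to timestamp to value. The sketch below is a toy in-memory version under that assumption; `ToyTable` and its methods are hypothetical names rather than the HBase client API, and the timestamps and cell values in the usage lines are invented. Python compares strings by code point, which matches byte-wise lexicographic order for the ASCII keys used here.

```python
from bisect import bisect_left


class ToyTable:
    """Illustrative sparse map: row key -> "family:qualifier" -> timestamp -> value."""

    def __init__(self):
        self.rows = {}

    def put(self, row, column, timestamp, value):
        # Only cells that are actually written take up space (sparse storage).
        self.rows.setdefault(row, {}).setdefault(column, {})[timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before timestamp (newest overall if None)."""
        versions = self.rows.get(row, {}).get(column, {})
        candidates = [t for t in versions if timestamp is None or t <= timestamp]
        return versions[max(candidates)] if candidates else None

    def scan(self, start_row, stop_row):
        """Return the row keys in [start_row, stop_row) in lexicographic order."""
        keys = sorted(self.rows)
        return keys[bisect_left(keys, start_row):bisect_left(keys, stop_row)]


# Usage with the row from the text; timestamps and values are made up.
table = ToyTable()
table.put("com.cnn.www", "contents:html", 3, "<html>v1")
table.put("com.cnn.www", "contents:html", 5, "<html>v2")
table.put("com.cnn.www", "anchor:cnnsi.com", 4, "CNN")
table.put("org.apache", "contents:html", 1, "<html>apache")
```

The three access paths named above map onto `get` (single row key), `scan` (row key range), and a scan over the whole key space (full table scan).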

4 HBase Read and Write Paths

 

The figure above shows how data is stored within an HRegionServer. As mentioned earlier, HBase uses the MemStore and StoreFiles to store updates to a table. An update is first written to the HLog and the MemStore. Data in the MemStore is kept sorted. When the MemStore reaches a certain size, a new MemStore is created and the old one is added to a flush queue, from which a separate thread flushes it to disk as a StoreFile. At the same time, the system records a checkpoint in ZooKeeper indicating that all changes before that moment have been persisted. If the system fails unexpectedly, data in the MemStore may be lost; the HLog is then used to recover the data written after the checkpoint.

A StoreFile is read-only and cannot be modified once created, so an update in HBase is really a continual append. When the number of StoreFiles in a Store reaches a threshold, a compaction merges the modifications to each key into one large StoreFile. When the size of a StoreFile reaches a threshold, it is in turn split evenly into two StoreFiles.
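The flush-and-compact cycle can be sketched with a small in-memory model. This is a simplification under stated assumptions: `ToyStore`, the `TOMBSTONE` marker, and both thresholds are invented for illustration, thresholds count entries and files rather than bytes, and the compaction shown always merges every file and drops deleted keys.

```python
TOMBSTONE = object()  # marker meaning "this key was deleted"


class ToyStore:
    def __init__(self, flush_threshold=2, compact_threshold=3):
        self.flush_threshold = flush_threshold      # entries in memory before a flush
        self.compact_threshold = compact_threshold  # files on disk before a compaction
        self.memstore = {}
        self.store_files = []  # oldest first; each file is an immutable sorted tuple

    def put(self, key, value):
        self.memstore[key] = value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def delete(self, key):
        # A delete is just another write: it records a marker, nothing is erased yet.
        self.put(key, TOMBSTONE)

    def flush(self):
        # Freeze the sorted contents of memory as a new read-only file.
        items = sorted(self.memstore.items(), key=lambda kv: kv[0])
        self.store_files.append(tuple(items))
        self.memstore = {}
        if len(self.store_files) >= self.compact_threshold:
            self.compact()

    def compact(self):
        merged = {}
        for store_file in self.store_files:  # newer files overwrite older ones
            merged.update(store_file)
        # Deleted keys are physically dropped only here, during compaction.
        live = sorted(((k, v) for k, v in merged.items() if v is not TOMBSTONE),
                      key=lambda kv: kv[0])
        self.store_files = [tuple(live)]


store = ToyStore()
store.put("a", 1)
store.put("b", 2)    # second entry triggers the first flush
store.put("a", 10)   # an update is a new entry, not an in-place change
store.delete("b")    # triggers the second flush
store.put("c", 3)
store.put("d", 4)    # third flush, which in turn triggers a compaction
```

Until the compaction runs, both versions of `a` and the delete marker for `b` coexist in separate files, which is what "updates are appends" means in practice.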

4.1 Write Path

Step 1: Directed by ZooKeeper, the client sends a write request to the HRegionServer, and the data is written within the target HRegion.

Step 2: Data is written to the HRegion's MemStore until the MemStore reaches its preset threshold.

Step 3: The data in the MemStore is flushed into a StoreFile.

Step 4: As StoreFiles accumulate and their number reaches a threshold, a compaction is triggered. It merges multiple StoreFiles into one, merging versions and removing deleted data at the same time.

Step 5: Through repeated compactions, the StoreFiles gradually form larger and larger StoreFiles.

Step 6: When a single StoreFile exceeds a size threshold, a split is triggered and the current HRegion is split into two new HRegions. The parent HRegion goes offline, and the HMaster assigns the two child HRegions to HRegionServers, so that the load of the original HRegion is spread over two HRegions.
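The split in Step 6 can be pictured as cutting a region's key range in two. The sketch below is an assumption-laden toy: regions are plain dictionaries, `split_region` is a hypothetical helper, and the split point is simply the middle row key, chosen here for illustration rather than taken from how HBase selects it.

```python
def split_region(region):
    """Split a region into two daughters at its middle row key.

    A region is modeled as {"start": key, "end": key, "rows": {row_key: value}},
    covering the half-open key range [start, end).
    """
    keys = sorted(region["rows"])
    mid = keys[len(keys) // 2]  # illustrative choice of split point
    left = {
        "start": region["start"],
        "end": mid,
        "rows": {k: v for k, v in region["rows"].items() if k < mid},
    }
    right = {
        "start": mid,
        "end": region["end"],
        "rows": {k: v for k, v in region["rows"].items() if k >= mid},
    }
    return left, right


parent = {"start": "", "end": "z", "rows": {"a": 1, "b": 2, "c": 3, "d": 4}}
left, right = split_region(parent)
```

The two daughters together cover exactly the parent's key range, so they can be assigned to different servers without any row being lost or duplicated.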

4.2 Read Path

Step 1: The client contacts ZooKeeper, looks up the -ROOT- table, and obtains the .META. table information.

Step 2: It looks up the .META. table to find the HRegion that holds the target data, and from that the corresponding HRegionServer.

Step 3: It retrieves the requested data through that HRegionServer.

Step 4: The HRegionServer's memory is divided into the MemStore and the BlockCache. The MemStore is used mainly for writes and the BlockCache mainly for reads. A read request first checks the MemStore, then the BlockCache if the data is not found, and finally reads from the StoreFiles, placing the result in the BlockCache.
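The lookup order in Step 4 can be sketched as a three-level read. This is a simplified model with invented names (`ToyReader`, `disk_reads`): StoreFiles are plain dictionaries, and the cache here stores individual keys, whereas a block cache holds whole blocks of a file.

```python
class ToyReader:
    def __init__(self, memstore, store_files):
        self.memstore = memstore        # recent writes, in memory
        self.block_cache = {}           # recently read data, in memory
        self.store_files = store_files  # list of dicts, oldest first (stands in for disk)
        self.disk_reads = 0             # how many files had to be consulted

    def get(self, key):
        if key in self.memstore:            # 1. newest data lives in the MemStore
            return self.memstore[key]
        if key in self.block_cache:         # 2. then try the read cache
            return self.block_cache[key]
        for store_file in reversed(self.store_files):  # 3. newest file first
            self.disk_reads += 1
            if key in store_file:
                self.block_cache[key] = store_file[key]  # remember for next time
                return store_file[key]
        return None


reader = ToyReader({"a": 1}, [{"b": 2}, {"b": 3, "c": 4}])
from_memstore = reader.get("a")
reads_after_memstore_hit = reader.disk_reads
from_disk = reader.get("b")
reads_after_disk_hit = reader.disk_reads
from_cache = reader.get("b")
```

The second read of `b` never touches the files, which is the whole point of filling the cache on the way back from disk.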

 

5 HBase Use Cases

 

Semi-structured or unstructured data: HBase suits data whose fields are not well defined or are disorganized and hard to extract under a single schema. For example, when business growth requires storing additional fields, an RDBMS needs downtime to alter the table structure, whereas HBase supports adding columns dynamically.

Very sparse records: An RDBMS row has a fixed number of columns, and empty columns waste storage space. In HBase, empty columns are not stored at all, which both saves space and improves read performance.

Multi-version data: The value located by a RowKey and column identifier can have any number of versions (distinguished by timestamp), so HBase is very convenient for data whose change history must be kept.

Very large data volumes: As data grows and a single RDBMS can no longer cope, read/write splitting is introduced: one master handles writes and several slaves handle reads, which multiplies server costs. As load increases further and the master can no longer keep up, the database has to be partitioned: loosely related data is deployed separately, some join queries stop working, and a middle layer becomes necessary. As the data grows still more and individual tables hold ever more records, queries become slow, so tables must be sharded as well, for example by taking the ID modulo some number to spread rows across multiple tables and reduce the record count of each. Anyone who has been through this knows how painful the process is. With HBase it is simple: just add new nodes to the cluster, and HBase partitions and scales horizontally on its own. Its seamless integration with Hadoop ensures both data reliability (HDFS) and high-performance analysis of massive data sets (MapReduce).

6 HBase and MapReduce

 

The relationship between a Table and its Regions in HBase is somewhat like the relationship between a File and its Blocks in HDFS. HBase provides companion APIs for working with MapReduce, such as TableInputFormat and TableOutputFormat, so an HBase table can be used directly as the input and output of a Hadoop MapReduce job. This simplifies the development of MapReduce applications, which for the most part need not concern themselves with HBase's internal processing details.
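The Table/Region analogy suggests a natural way to parallelize: treat each region as one input split, run a map task per region, and reduce the partial results. The sketch below illustrates only that idea in plain Python. It does not use TableInputFormat or any Hadoop API, and the region contents, the column name `cf:n`, and the function names are all invented for illustration.

```python
from functools import reduce

# Hypothetical table: three regions, each holding a contiguous range of row keys.
REGIONS = [
    {"a": {"cf:n": 1}, "b": {"cf:n": 2}},
    {"m": {"cf:n": 3}},
    {"x": {"cf:n": 4}, "y": {"cf:n": 5}, "z": {"cf:n": 6}},
]


def map_region(region):
    """Map task: runs once per region and emits a partial (row count, sum)."""
    return len(region), sum(row["cf:n"] for row in region.values())


def combine(left, right):
    """Reduce step: merge two partial results into one."""
    return left[0] + right[0], left[1] + right[1]


def run_job(regions):
    """Apply the map task to every region, then fold the partial results."""
    return reduce(combine, map(map_region, regions), (0, 0))


row_count, total = run_job(REGIONS)
```

Each map task touches only the rows of its own region, so the work can run where the region is hosted, in the same way an HDFS block is processed on the node that stores it.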

