Chinaunix
Title:
Implementation details in Hadoop
Author:
xuyuanchao_cnu
Time:
2011-12-23 02:39
<DIV>
<DIV class=Section0 style="LAYOUT-GRID: 15.6pt none">
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt; TEXT-ALIGN: center"><SPAN style="FONT-WEIGHT: bold; FONT-SIZE: 15pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">A Brief Discussion of the Hadoop Cloud Computing Model</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt; TEXT-ALIGN: center"><SPAN style="FONT-SIZE: 12pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">Shen Yan (College of Information Engineering, Capital Normal University, Beijing 100048)</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt; TEXT-ALIGN: center"><SPAN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman'; mso-spacerun: 'yes'"></SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-WEIGHT: bold; FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">Abstract: </SPAN><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">As user demands keep growing, traditional Internet technologies can no longer meet them. Many different models have been proposed in China and abroad to cope with the data deluge, and Google's MapReduce parallel model is the one that has been most widely adopted.</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-WEIGHT: bold; FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">Keywords: </SPAN><SPAN style="FONT-WEIGHT: bold; FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">HBase; parallel computing; MapReduce model; Hadoop; cloud computing</SPAN></P>
<P class=p16 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">1 Introduction</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'"> The data deluge has put pressure on traditional Oracle databases and other commercial databases. Driven by its own technical needs, Google developed the MapReduce parallel computing framework in-house. Because that project is not open source, Hadoop was created, and today Hadoop has become almost synonymous with cloud computing. With Hadoop one can manage a cluster and build enterprise-grade applications more conveniently. Hadoop is an open-source implementation of Google's cloud computing technologies and a distributed framework from the Apache open-source community. This paper examines how Hadoop is implemented, focusing on HDFS and HBase.</SPAN></P>
<P class=p16 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-WEIGHT: bold; FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">Advantages of Hadoop</SPAN></P>
<P class=p16 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">Being open source gives Hadoop many advantages: developers can build their own products on it, and it can genuinely handle large volumes of relatively independent data. The original goal of Hadoop's developers was to implement Google's three core technologies (GFS, MapReduce, and BigTable) and make them available to everyone. Hadoop has also been very successful at large-scale distributed computing for batch jobs.</SPAN></P>
<P class=p16 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-WEIGHT: bold; FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">Disadvantages of Hadoop</SPAN></P>
<P class=p16 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">Hadoop's drawbacks should not be underestimated. No programming model suits every situation, and Hadoop's own implementation is quite simple and does not incorporate many sophisticated techniques. When the data items are strongly interrelated, the effectiveness of Hadoop's MapReduce programming model drops sharply. MapReduce also works well for batch jobs but is unsuitable for applications with strong real-time requirements.</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-WEIGHT: bold; FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">The environment Hadoop is designed for:</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt; MARGIN-LEFT: 21.25pt; TEXT-INDENT: -21.25pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">1) </SPAN><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">Hardware failure is the norm rather than the exception. A cloud computing environment contains thousands of server nodes, so at any time some hardware is likely to be failing. Detecting faults and recovering from them is therefore the technical core of the Hadoop file system.</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt; MARGIN-LEFT: 21.25pt; TEXT-INDENT: -21.25pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">2) </SPAN><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">File access differs from that of other applications. Ordinary file access aims for low latency, whereas file access in Hadoop aims for high throughput.</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt; MARGIN-LEFT: 21.25pt; TEXT-INDENT: -21.25pt"><SPAN style="FONT-WEIGHT: normal; FONT-SIZE: 10.5pt; COLOR: rgb(0,0,0); FONT-STYLE: normal; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">3) </SPAN><SPAN style="FONT-WEIGHT: normal; FONT-SIZE: 10.5pt; COLOR: rgb(0,0,0); FONT-STYLE: normal; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">HDFS is designed to support large data sets. A typical file stored in it is gigabytes to terabytes in size, and a single HDFS instance should be able to support tens of millions of files.</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt; MARGIN-LEFT: 21.25pt; TEXT-INDENT: -21.25pt"><SPAN style="FONT-WEIGHT: normal; FONT-SIZE: 10.5pt; COLOR: rgb(0,0,0); FONT-STYLE: normal; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">4) </SPAN><SPAN style="FONT-WEIGHT: normal; FONT-SIZE: 10.5pt; COLOR: rgb(0,0,0); FONT-STYLE: normal; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">Moving computation is cheaper than moving data. A computation requested by an application is more efficient the closer it runs to the data it operates on, and this is especially true when the data set is huge. Moving the computation to the data is clearly better than moving the data to where the application runs, and HDFS provides interfaces that let applications do so.</SPAN><SPAN style="FONT-WEIGHT: normal; FONT-SIZE: 10.5pt; VERTICAL-ALIGN: super; COLOR: rgb(0,0,0); FONT-STYLE: normal; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">[2]</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-WEIGHT: bold; FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">Hadoop architecture</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt; TEXT-INDENT: 21pt"><SPAN style="FONT-WEIGHT: normal; FONT-SIZE: 10.5pt; COLOR: rgb(0,0,0); FONT-STYLE: normal; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">Hadoop uses a master/slave architecture. An HDFS cluster consists of one Namenode and a number of Datanodes. A jobtracker process also runs on the Namenode machine and is responsible for distributing tasks to the different nodes. The figure below shows how the output of the Namenode and of the Datanode differs in a pseudo-distributed setup.</SPAN></P>
<P class=p19 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt; TEXT-INDENT: 21pt"><IMG height=95 src="file:///C:/DOCUME~1/shenyan/LOCALS~1/Temp/ksohtml/wps_clip_image-31.png" width=494><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Cambria'; mso-spacerun: 'yes'"></SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt; TEXT-ALIGN: center"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: 'Times New Roman'; mso-spacerun: 'yes'">Figure 1</SPAN><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'"> Node information of a Hadoop pseudo-distributed system</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt; TEXT-INDENT: 21pt"><SPAN style="FONT-WEIGHT: normal; FONT-SIZE: 10.5pt; COLOR: rgb(0,0,0); FONT-STYLE: normal; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">The Namenode is a central server that manages the file system namespace and client access to files. Datanodes are spread across the nodes of the cluster, and each one manages the storage attached to the node it runs on. Internally, a file is split into one or more blocks, typically 128 MB each (the size is configurable). Every file and every block is tracked in the Namenode's memory, so a very large number of small files consumes a great deal of Namenode memory, which is very unfavorable for HDFS. Hadoop mitigates this with HAR archive files, which pack many small files into a single archive. The blocks themselves are stored on a set of Datanodes. Operations on HDFS files resemble operations on local files. The example below lists the contents of the current directory.</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt; TEXT-ALIGN: center"><IMG height=229 src="file:///C:/DOCUME~1/shenyan/LOCALS~1/Temp/ksohtml/wps_clip_image-325.png" width=554><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: 'Times New Roman'; mso-spacerun: 'yes'"></SPAN></P>
<P class=p19 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt; TEXT-ALIGN: center"><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Cambria'; mso-spacerun: 'yes'">Figure 2</SPAN><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: '黑體'; mso-spacerun: 'yes'"> HDFS file operations</SPAN></P>
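<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt; TEXT-INDENT: 21pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">The block layout and the small-file problem described above can be illustrated with a short sketch. It is not HDFS code: BLOCK_SIZE, split_into_blocks, and namenode_objects are hypothetical names, the 128 MB block size is a configurable assumption, and the sketch assumes that the Namenode keeps one in-memory entry per file and one per block. Under those assumptions, the same amount of data costs far more metadata when stored as many small files.</SPAN></P>

```python
# Illustrative sketch only; this is not HDFS code.
BLOCK_SIZE = 128 * 1024 * 1024  # assumed block size in bytes (configurable in HDFS)


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the sizes of the blocks a file of file_size bytes is split into."""
    if file_size == 0:
        return []
    full, rest = divmod(file_size, block_size)
    # The last block only holds the remaining bytes; it is not padded.
    return [block_size] * full + ([rest] if rest else [])


def namenode_objects(file_sizes, block_size=BLOCK_SIZE):
    """Count metadata objects, assuming one per file plus one per block."""
    return sum(1 + len(split_into_blocks(size, block_size)) for size in file_sizes)


one_gib = 1024 * 1024 * 1024
large_file = [one_gib]              # one 1 GiB file
small_files = [1024 * 1024] * 1024  # the same data as 1024 files of 1 MiB

print(namenode_objects(large_file))   # 1 file + 8 blocks
print(namenode_objects(small_files))  # 1024 files + 1024 blocks
```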
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt; TEXT-INDENT: 21pt"><SPAN style="FONT-WEIGHT: normal; FONT-SIZE: 10.5pt; COLOR: rgb(0,0,0); FONT-STYLE: normal; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">The Namenode executes file system namespace operations such as opening, closing, and renaming files and directories, and it determines the mapping of blocks to specific Datanodes. Datanodes create, delete, and replicate blocks under the direction of the Namenode. Both the Namenode and the Datanodes are designed to run on ordinary, inexpensive machines running Linux. HDFS is written in Java, so it can be deployed on a wide range of machines. In a typical deployment, one machine runs a single Namenode and every other machine in the cluster runs one Datanode instance. The architecture does not preclude running several Datanodes on one machine, but this is uncommon.</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt; TEXT-INDENT: 21pt"><SPAN><IMG height=435 src="file:///C:/DOCUME~1/shenyan/LOCALS~1/Temp/ksohtml/wps_clip_image-4296.png" width=508></SPAN><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: 'Times New Roman'; mso-spacerun: 'yes'"> </SPAN><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'"></SPAN></P>
<P class=p19 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt; TEXT-ALIGN: center"><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Cambria'; mso-spacerun: 'yes'">Figure 3</SPAN><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: '黑體'; mso-spacerun: 'yes'"> Schematic of the HDFS architecture</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt; TEXT-INDENT: 21pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">Measures HDFS takes to ensure reliability</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">1) </SPAN><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">Redundant replication. For fault tolerance, every block of a file is replicated; the number of copies is the replication factor. Files in HDFS are write-once, and at most one writer is allowed at any time. A Datanode stores HDFS data in its local file system. It knows nothing about HDFS files and simply stores each HDFS block as a separate local file. When a Datanode starts up, it scans its local file system, builds a report mapping HDFS blocks to local files, and sends this block report to the Namenode. The block report lists all the blocks held by that Datanode.</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">2) </SPAN><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">Replica placement. When the replication factor is 3, Hadoop applies a rack-aware policy: one replica is placed on a node in the local rack, another on a different node in the same rack, and the third on a node in a different rack.</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'"></SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="MARGIN-TOP: 6px; Z-INDEX: 1; LEFT: 0px; MARGIN-LEFT: 22px; WIDTH: 184px; POSITION: absolute; HEIGHT: 104px"><IMG height=104 src="file:///C:/DOCUME~1/shenyan/LOCALS~1/Temp/ksohtml/wps_clip_image-27087.png" width=184></SPAN><SPAN style="MARGIN-TOP: 31px; Z-INDEX: 1; LEFT: 0px; MARGIN-LEFT: 309px; WIDTH: 65px; POSITION: absolute; HEIGHT: 69px"><IMG height=69 src="file:///C:/DOCUME~1/shenyan/LOCALS~1/Temp/ksohtml/wps_clip_image-18106.png" width=65></SPAN><SPAN style="MARGIN-TOP: 31px; Z-INDEX: 1; LEFT: 0px; MARGIN-LEFT: 227px; WIDTH: 65px; POSITION: absolute; HEIGHT: 69px"><IMG height=69 src="file:///C:/DOCUME~1/shenyan/LOCALS~1/Temp/ksohtml/wps_clip_image-15337.png" width=65></SPAN><SPAN style="MARGIN-TOP: 7px; Z-INDEX: 1; LEFT: 0px; MARGIN-LEFT: 210px; WIDTH: 184px; POSITION: absolute; HEIGHT: 104px"><IMG height=104 src="file:///C:/DOCUME~1/shenyan/LOCALS~1/Temp/ksohtml/wps_clip_image-11453.png" width=184></SPAN><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'"></SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="MARGIN-TOP: 9px; Z-INDEX: 1; LEFT: 0px; MARGIN-LEFT: 129px; WIDTH: 65px; POSITION: absolute; HEIGHT: 69px"><IMG height=69 src="file:///C:/DOCUME~1/shenyan/LOCALS~1/Temp/ksohtml/wps_clip_image-19593.png" width=65></SPAN><SPAN style="MARGIN-TOP: 11px; Z-INDEX: 1; LEFT: 0px; MARGIN-LEFT: 46px; WIDTH: 65px; POSITION: absolute; HEIGHT: 69px"><IMG height=69 src="file:///C:/DOCUME~1/shenyan/LOCALS~1/Temp/ksohtml/wps_clip_image-11258.png" width=65></SPAN><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'"></SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'"></SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'"> </SPAN><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'"></SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'"></SPAN></P>
<P class=p19 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Cambria'; mso-spacerun: 'yes'"></SPAN></P>
<P class=p19 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt; TEXT-ALIGN: center"><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Cambria'; mso-spacerun: 'yes'">Figure 4</SPAN><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: '黑體'; mso-spacerun: 'yes'"> Rack-aware placement policy</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">3) </SPAN><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">Heartbeat detection</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">The Namenode periodically receives heartbeats and block reports from each Datanode. Receiving a heartbeat indicates that the Datanode is working properly. The Namenode marks Datanodes without a recent heartbeat as dead, continually tracks which blocks need to be replicated, and starts re-replication whenever necessary. Re-replication can become necessary because a Datanode is unavailable, a replica is corrupted, a disk on a Datanode fails, or the replication factor is increased.</SPAN></P>
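<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">The heartbeat mechanism in 3) can be sketched as follows. This is a toy model, not the HDFS implementation: the class Namenode, its methods, and the 30-second HEARTBEAT_TIMEOUT are all assumptions made for illustration. It records heartbeats and block reports, treats nodes that have been silent for too long as dead, and lists the blocks that have fewer live replicas than the replication factor.</SPAN></P>

```python
# Illustrative sketch only; names and the timeout value are assumptions.
HEARTBEAT_TIMEOUT = 30.0  # seconds without a heartbeat before a node counts as dead


class Namenode:
    def __init__(self, replication=3):
        self.replication = replication
        self.last_heartbeat = {}   # datanode -> time of last heartbeat
        self.block_locations = {}  # block id -> set of datanodes holding a replica

    def heartbeat(self, datanode, now):
        """Record that a datanode reported in at time now."""
        self.last_heartbeat[datanode] = now

    def block_report(self, datanode, blocks):
        """Replace what is known about one datanode with its full block list."""
        for holders in self.block_locations.values():
            holders.discard(datanode)
        for block in blocks:
            self.block_locations.setdefault(block, set()).add(datanode)

    def dead_nodes(self, now):
        """Datanodes whose last heartbeat is older than the timeout."""
        return {node for node, seen in self.last_heartbeat.items()
                if now - seen > HEARTBEAT_TIMEOUT}

    def under_replicated(self, now):
        """Map each block to the number of live replicas it is missing."""
        dead = self.dead_nodes(now)
        missing = {}
        for block, holders in self.block_locations.items():
            live = len(holders - dead)
            if live < self.replication:
                missing[block] = self.replication - live
        return missing


nn = Namenode(replication=2)
for node in ("dn1", "dn2", "dn3"):
    nn.heartbeat(node, now=0.0)
nn.block_report("dn1", ["blk_1", "blk_2"])
nn.block_report("dn2", ["blk_1"])
nn.block_report("dn3", ["blk_2"])

# dn1 and dn2 keep reporting, dn3 falls silent.
nn.heartbeat("dn1", now=40.0)
nn.heartbeat("dn2", now=40.0)
print(nn.dead_nodes(now=40.0))        # {'dn3'}
print(nn.under_replicated(now=40.0))  # {'blk_2': 1}
```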
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-WEIGHT: bold; FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">How MapReduce works</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">1. </SPAN><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">The client, which submits the MapReduce job.</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">2. </SPAN><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">The jobtracker, which coordinates the job run. Class: JobTracker</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">3. </SPAN><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">The tasktrackers, which run the tasks that the job has been split into. Class: TaskTracker</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">4. </SPAN><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">The distributed file system, which is used to share job files among the other entities.</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'"> A job is submitted through runJob. Step 1: runJob submits the job and then polls its progress periodically, reporting to the console whenever the progress has changed since the last report. Step 2: the JobClient asks the jobtracker for a new job ID. Step 3: the resources needed to run the job (the job JAR file, the configuration file, the input splits, and so on) are copied to the jobtracker's file system, into a directory named after the job ID. The job JAR is copied with a high replication factor (controlled by the mapred.submit.replication property). Step 4: the jobtracker is told that the job is ready for execution (by calling its submitJob method).</SPAN></P>
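<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">The four submission steps can be mimicked with a small simulation. None of this is the Hadoop JobClient API: ToyJobTracker, run_job, the job ID format, and the canned progress values are hypothetical and exist only to show the order of the steps and the rule that progress is reported only when it changes.</SPAN></P>

```python
# Illustrative sketch only; this is not the Hadoop JobClient API.
import itertools


class ToyJobTracker:
    """Stand-in for a jobtracker that hands out IDs and tracks progress."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.staging = {}   # job id -> resources copied for that job
        self.progress = {}  # job id -> iterator of progress values

    def get_new_job_id(self):
        return "job_%04d" % next(self._ids)

    def submit_job(self, job_id):
        # Pretend the job advances in fixed steps, repeating some values.
        self.progress[job_id] = iter([0, 0, 50, 50, 100])

    def poll(self, job_id):
        return next(self.progress[job_id])


def run_job(tracker, resources):
    """Follow the four steps: get an ID, copy resources, submit, poll."""
    job_id = tracker.get_new_job_id()          # step 2: new job ID
    tracker.staging[job_id] = dict(resources)  # step 3: copy under the job ID
    tracker.submit_job(job_id)                 # step 4: job is ready
    reported = []
    last = None
    while last != 100:                         # step 1: poll until finished
        current = tracker.poll(job_id)
        if current != last:                    # report only on change
            reported.append(current)
            print("%s: %d%% complete" % (job_id, current))
        last = current
    return job_id, reported


job_id, reported = run_job(ToyJobTracker(), {"job.jar": b"jar", "job.xml": b"conf"})
```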
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">Job initialization</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'"> When the jobtracker receives a call to its submitJob method, it hands the job to the job scheduler, which schedules and initializes it. To build the list of tasks to run, the job scheduler retrieves from the shared file system the input splits that the JobClient has already computed, and creates one map task for each split. The number of reduce tasks is determined by the mapred.reduce.tasks property of the JobConf, and the scheduler creates that many reduce tasks to run. Tasks are given their IDs at this point.</SPAN></P>
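<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">Job initialization, as described above, amounts to building two task lists. The sketch below assumes a plain dictionary in place of the JobConf and uses a hypothetical helper initialize_job, a hypothetical task ID format, and made-up input splits. Only the property name mapred.reduce.tasks comes from Hadoop, and the fallback value of 1 is an assumption.</SPAN></P>

```python
# Illustrative sketch only; the ID format and helper name are assumptions.
def initialize_job(job_id, input_splits, conf):
    """Create one map task per input split and the configured number of reduce tasks."""
    num_reduces = int(conf.get("mapred.reduce.tasks", 1))
    map_tasks = [
        {"id": "%s_m_%06d" % (job_id, i), "split": split}
        for i, split in enumerate(input_splits)
    ]
    reduce_tasks = [
        {"id": "%s_r_%06d" % (job_id, i)}
        for i in range(num_reduces)
    ]
    return map_tasks, reduce_tasks


# Hypothetical splits: (path, offset in MB, length in MB).
splits = [("/input/log.txt", 0, 64), ("/input/log.txt", 64, 64), ("/input/log.txt", 128, 10)]
maps, reduces = initialize_job("job_0001", splits, {"mapred.reduce.tasks": "2"})
print([t["id"] for t in maps])
print([t["id"] for t in reduces])
```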
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">任務(wù)的分配</SPAN><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'"></SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'"> Each tasktracker runs a simple loop that periodically sends a "heartbeat" to the jobtracker. A tasktracker has a fixed number of slots for map tasks and for reduce tasks; for example, it may be able to run two or more map tasks and reduce tasks at the same time. The exact number depends on the number of cores and the amount of memory on the tasktracker's machine. If the tasktracker has at least one free map slot, the jobtracker selects a map task for it; otherwise it selects a reduce task. To select a reduce task, the jobtracker simply takes the next one from its list of reduce tasks waiting to run, without considering data locality. For a map task, however, the jobtracker takes the tasktracker's network location into account and picks a task whose input split is as close as possible to that tasktracker. In the ideal case the task is data-local, meaning it runs on the node that holds the split; failing that, it may be rack-local.</SPAN><SPAN style="FONT-SIZE: 10.5pt; VERTICAL-ALIGN: super; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">[3]</SPAN></P>
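<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">The assignment rule described above can be sketched roughly as follows. This is a simplified illustration in Python, not Hadoop's actual scheduler code: the names MapTask, Tracker, locality and assign_on_heartbeat are hypothetical, and the sketch assumes that each heartbeat hands out at most one task.</SPAN></P>

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class MapTask:
    task_id: str
    split_hosts: set  # hosts that hold a replica of this task's input split


@dataclass
class Tracker:
    host: str
    rack: str
    free_map_slots: int
    free_reduce_slots: int


def locality(task, tracker, rack_of):
    """Return 0 for data-local, 1 for rack-local, 2 for off-rack."""
    if tracker.host in task.split_hosts:
        return 0
    if any(rack_of.get(h) == tracker.rack for h in task.split_hosts):
        return 1
    return 2


def assign_on_heartbeat(tracker, pending_maps, pending_reduces, rack_of):
    """Choose one task for the tracker that has just sent a heartbeat.

    pending_maps is a list of MapTask, pending_reduces a deque of task ids.
    """
    if tracker.free_map_slots > 0 and pending_maps:
        # Map slots are filled first; prefer the split closest to the tracker.
        best = min(pending_maps, key=lambda t: locality(t, tracker, rack_of))
        pending_maps.remove(best)
        return ("map", best.task_id)
    if tracker.free_reduce_slots > 0 and pending_reduces:
        # Reduce tasks are taken in list order; data locality is ignored.
        return ("reduce", pending_reduces.popleft())
    return None
```

<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">With a tracker on host h1 and pending map tasks whose splits live on h3, h2 and h1, the sketch hands out the h1 task first (data-local), then a task whose split is on the same rack, and only then an off-rack task.</SPAN></P>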
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-WEIGHT: bold; FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">HBase, a distributed structured data table</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">Differences between traditional SQL and HBase</SPAN><SPAN style="FONT-SIZE: 10.5pt; VERTICAL-ALIGN: super; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">[1]</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">Hadoop is a framework for processing data, and for large-scale workloads it has several advantages over relational databases.</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">1) HBase is aimed at large volumes of unstructured data, whereas SQL databases are designed for large volumes of structured data.</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">2) Commercial relational databases are very expensive, and their design favors scaling up: to run a larger database, a business has to buy a bigger, faster server. As data sets keep growing, however, even high-end servers can no longer keep pace. In addition, query speed on a local disk is dominated by seek time, so as disk capacity grows, seek time becomes the bottleneck. Hadoop, by contrast, is designed to scale out: when more resources are needed, it is enough to add more inexpensive PCs to the cluster.</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">3) In terms of electricity cost, a distributed system is more energy efficient than a SQL system. A single high-performance, high-specification machine consumes more energy than four low-end PCs, so expensive hardware still does not meet practical needs.</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">4) Relational databases handle data whose items depend on one another, yet many kinds of data are largely unrelated, for example images, XML and text documents. A great deal of data is unorganized and unstructured. Hadoop represents data as key/value pairs, which makes it flexible when processing large amounts of unstructured data.</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">5) The query style differs. SQL retrieves content through query statements, whereas Hadoop queries and retrieves data through programs and scripts. Many SQL users are not used to Hadoop's programming model, but Hadoop provides an interface for them: the SQL-like Pig language can be used for retrieval, and its statements are translated automatically into MapReduce programs.</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">6) Hadoop is suited to offline processing of large data sets, not to online transactions over them. It is a poor fit for random reads and writes, and its real-time performance is weak. It is, however, well suited to write-once, read-many data storage.</SPAN></P>
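<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">To make the key/value style in point 4 more concrete, here is a minimal single-process sketch of a word count written as map, shuffle and reduce steps. It is plain Python for illustration only and does not use Hadoop's API; all function names (map_phase, shuffle, reduce_phase, word_mapper, sum_reducer, word_count) are hypothetical.</SPAN></P>

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every (key, value) record and yield its output pairs."""
    for key, value in records:
        yield from mapper(key, value)


def shuffle(pairs):
    """Group intermediate values by key, the step between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Run the reducer once per key, visiting keys in sorted order."""
    return dict(kv for key in sorted(groups) for kv in reducer(key, groups[key]))


def word_mapper(_line_number, line):
    # Unstructured text in, (word, 1) pairs out.
    for word in line.split():
        yield word.lower(), 1


def sum_reducer(word, counts):
    yield word, sum(counts)


def word_count(lines):
    records = enumerate(lines)  # key = line number, value = line text
    return reduce_phase(shuffle(map_phase(records, word_mapper)), sum_reducer)
```

<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">The input needs no schema: each line is just a value under an arbitrary key, and structure appears only in the pairs the mapper chooses to emit.</SPAN></P>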
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt; MARGIN-LEFT: 18pt; TEXT-INDENT: -18pt"><SPAN style="FONT-WEIGHT: bold; FONT-SIZE: 10.5pt; FONT-FAMILY: 'Times New Roman'; mso-spacerun: 'yes'">4 </SPAN><SPAN style="FONT-WEIGHT: bold; FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">Summary</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt; TEXT-ALIGN: left"><SPAN style="FONT-WEIGHT: normal; FONT-SIZE: 10.5pt; FONT-FAMILY: '黑體'; mso-spacerun: 'yes'"> Hadoop has by now become practically synonymous with cloud computing in many people's eyes. For processing data at the PB scale it plays a role that is hard to replace. China Mobile, for example, launched its "Big Cloud" project, which modifies and adapts the Hadoop platform to process massive data sets.</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt; TEXT-ALIGN: center"><SPAN style="FONT-WEIGHT: bold; FONT-SIZE: 10.5pt; FONT-FAMILY: '黑體'; mso-spacerun: 'yes'">References</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">[1] Chuck Lam. Hadoop in Action. 2010.</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">[2] </SPAN><SPAN style="FONT-WEIGHT: normal; FONT-SIZE: 10.5pt; COLOR: rgb(0,0,0); FONT-STYLE: normal; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">http://hadoop.apache.org/core/docs/current/hdfs_design.html</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'">[3] Tom White. Hadoop: The Definitive Guide. 2011.</SPAN></P>
<P class=p0 style="MARGIN-TOP: 0pt; MARGIN-BOTTOM: 0pt"><SPAN style="FONT-SIZE: 10.5pt; FONT-FAMILY: '宋體'; mso-spacerun: 'yes'"></SPAN></P></DIV></DIV>