[Services & Applications] Using and analyzing smartctl, the Linux disk diagnostic tool

Posted 2014-03-20 11:40
Using and analyzing smartctl, the Linux disk diagnostic tool

Posted 2014-03-20 11:41
1        Purpose
In today's big-data environments, disk performance and stability are critical to the business. On Linux, smartctl is one of the most commonly used disk diagnostic tools.
This document examines smartctl on Linux. It explains how to use the tool and analyzes SMART (Self-Monitoring, Analysis and Reporting Technology).
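2        Terms, definitions and abbreviations
2.1        Terms and definitions
The terms and definitions used in this document are listed in Table 2.1.
Table 2.1
Term        Meaning
SMART        Self-Monitoring, Analysis and Reporting Technology
2.2        Abbreviations
The abbreviations used in this document are listed in Table 2.2.
Table 2.2
Abbreviation        Full form        Meaning
SMART        Self-Monitoring, Analysis and Reporting Technology        Technology by which a disk monitors itself, analyzes its condition and reports the result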
3        smartctl
smartctl is a command-line tool from the smartmontools-5.38-2.el5 RPM. It performs SMART tasks: printing the SMART self-test and error logs, enabling or disabling SMART automatic testing, and starting disk self-tests.
Syntax:
       smartctl  [options]  device
device:
"/dev/hd[a-t]"    IDE/ATA disks
"/dev/sd[a-z]"    SCSI disks. Note: SATA disks are accessed through the libata library, so add the option "-d  ata" for them.
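3.1        [options]:
        The options are grouped by type.
3.1.1        Information options:
-h                Help
-V                Version information
-i                Print basic information (device model, serial number, firmware version, ...)
-a                Print all SMART information for the disk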
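3.1.2        Run-time behavior options:
-q  TYPE        Sets the quiet mode for output.
                        TYPE is one of 3 choices:
                          errorsonly        Print only the error logs.
                          silent                Print nothing.
                          noserial        Do not print the serial number.
-d  TYPE        Specifies the disk type. If it is not given, smartctl guesses the type from the device name.
-T  TYPE        Specifies how tolerant smartctl is of errors, that is, whether it keeps running after one.
                        TYPE is one of 4 choices:
                          conservative        Exit on any error
                          normal                Exit if a mandatory SMART command fails
                          permissive        Ignore one failure of a mandatory SMART command
                          verypermissive        Ignore all failures of mandatory SMART commands
-b  TYPE        Specifies what smartctl does when a checksum error occurs.
                        TYPE is one of 3 choices:
                          warn                Issue a warning and continue
                          exit                Exit smartctl
                          ignore                Continue without a warning
-r  TYPE        For smartmontools developers.
-n  POWERMODE        Specifies whether smartctl still checks a disk that is in a low-power mode. By default the power mode is ignored.
                        POWERMODE is one of 4 choices:
                          never                Always check.
                          sleep                Check unless the disk is in sleep mode.
                          standby                Check unless the disk is in sleep or standby mode.
                          idle                Check unless the disk is in sleep, standby, or idle mode.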
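The four POWERMODE choices can be read as thresholds on an ordering of power states. Here is a small sketch of that rule in Python. It is only a model of the behavior described above, not smartctl's implementation; the names POWER_LEVELS and should_check are made up for this post, and "active" is an assumed label for a disk that is not in any low-power mode.

```python
# Power states ordered from deepest sleep to fully awake.
# "active" is an assumed label for a disk that is not in a low-power mode.
POWER_LEVELS = {"sleep": 0, "standby": 1, "idle": 2, "active": 3}


def should_check(n_option: str, disk_state: str) -> bool:
    """Model of -n POWERMODE: True if the disk would still be checked."""
    if n_option == "never":
        return True  # the power mode is ignored, so the disk is always checked
    # Skip the disk when it is in the named mode or in any deeper one.
    return POWER_LEVELS[disk_state] > POWER_LEVELS[n_option]


# Print, for each -n choice, the states in which the disk is still checked.
for option in ("never", "sleep", "standby", "idle"):
    checked = [state for state in POWER_LEVELS if should_check(option, state)]
    print(option, "->", checked)
```

So "-n standby" is the usual choice when you do not want a monitoring job to spin up a disk that has been put to rest.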
3.1.3        SMART feature switches:
-s  on/off        Enable or disable SMART on the disk.
-o  on/off        Enable or disable SMART automatic offline testing, which scans the disk for defects every 4 hours.
-S  on/off        Enable or disable autosave of vendor-specific attributes.
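3.1.4        SMART read and display options
-H                Report whether the disk is healthy. An unhealthy report means that the disk has already failed or is predicted to fail within 24 hours.
-c                Show the generic SMART capabilities the disk supports and their current state.
-A                Show the vendor-specific SMART attributes the disk supports. Attributes are numbered from 1 to 253 and have assigned names.
-l  TYPE        Select the log to display.
                        TYPE is one of 4 choices:
                        error                Show only the error log.
                        selftest        Show only the self-test log.
                        selective        Show only the selective self-test log.
                        directory        Show only the Log Directory.
-v  N,OPTION        Use a vendor-specific display format when showing vendor-specific SMART attribute N.
-F  TYPE        Set how smartctl behaves when it encounters certain known but unfixed hardware or software bugs.
-P  TYPE        Set whether smartctl applies the parameters stored in its database for this disk.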
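3.1.5        SMART offline test and self-test options
-t  TEST        Start a test immediately. Can be combined with -C.
                        TEST is one of the following:
                        offline                Offline test. Can be used on a disk with mounted file systems.
                        short                Short self-test. Can be used on a disk with mounted file systems.
                        long                Long self-test. Can be used on a disk with mounted file systems.
                        conveyance        [ATA only] Conveyance self-test. Can be used on a disk with mounted file systems.
                        select,N-M
                        select,N+SIZE        [ATA only] Selective self-test of part of the disk's LBA range. N is the starting LBA, M is the ending LBA, and SIZE is the number of LBAs to test.
-C                Run the test in captive mode.
                Notes: (1) -C must be used together with -t, but it has no effect with -t offline.
                       (2) -C keeps the disk very busy, so it is best used on a disk with no mounted file systems.
-X                Abort a test running in non-captive mode.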
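To make the two selective-test forms concrete, here is a short Python sketch that converts a span written as N-M or N+SIZE into a first and last LBA and builds the matching command line. The helper names parse_span and select_args are hypothetical. It assumes that N+SIZE covers SIZE blocks starting at N (that is, N through N+SIZE-1), which is my reading of the option. Nothing is executed; the command is only assembled.

```python
def parse_span(spec: str) -> tuple[int, int]:
    """Turn 'N-M' or 'N+SIZE' into an inclusive (first_lba, last_lba) pair."""
    if "+" in spec:
        start, size = (int(part) for part in spec.split("+", 1))
        if size < 1:
            raise ValueError("SIZE must be at least 1")
        # Assumption: SIZE counts blocks, so the span ends at N + SIZE - 1.
        return start, start + size - 1
    start, end = (int(part) for part in spec.split("-", 1))
    if end < start:
        raise ValueError("M must not be smaller than N")
    return start, end


def select_args(spec: str, device: str) -> list[str]:
    """Build the argument list for a selective self-test (not executed here)."""
    parse_span(spec)  # validate the span before building the command
    return ["smartctl", "-t", f"select,{spec}", device]


print(parse_span("100-199"))
print(parse_span("100+100"))
print(" ".join(select_args("100+100", "/dev/sda")))
```

Under that assumption, both spans above describe the same 100 blocks.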
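3.2        Common examples
3.2.1        Check the overall health status
Check the current overall health status of /dev/sda. PASSED means the disk is healthy; any other result means the disk has already failed or is likely to fail soon.
smartctl  -H  /dev/sda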
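If you want to act on the health result from a script, something like the following Python sketch may help. It works on a sample of the text that an ATA disk typically prints for -H; the sample is illustrative rather than captured from a real disk, SCSI disks word the line differently, and health_passed is a made-up helper name.

```python
import re

# Illustrative -H output for an ATA disk (not captured from a real device).
SAMPLE = """\
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
"""


def health_passed(output: str) -> bool:
    """Return True only if the health line reports PASSED."""
    match = re.search(r"test result:\s*(\S+)", output)
    if match is None:
        raise ValueError("no health result line found")
    return match.group(1) == "PASSED"


print(health_passed(SAMPLE))
```

On a real system the text would come from running smartctl -H on the device as root, for example through subprocess.run with capture_output=True and text=True.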

3.2.2        Show all information
Print all SMART information for /dev/sda.
smartctl  -a  /dev/sda

For an ATA disk this is equivalent to running, in order:
smartctl  -H  /dev/sda
smartctl  -i  /dev/sda
smartctl  -c  /dev/sda
smartctl  -A  /dev/sda
smartctl  -l  error  /dev/sda
smartctl  -l  selftest  /dev/sda
smartctl  -l  selective  /dev/sda
3.2.3        Enable/disable SMART
Enable or disable SMART on /dev/sda.
smartctl  -s  on  /dev/sda
smartctl  -s  off  /dev/sda

To check whether SMART is currently enabled, use the -i option.
smartctl  -i  /dev/sda
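As a rough sketch, the SMART state can be picked out of the -i output like this in Python. The sample lines follow the wording smartctl normally uses for ATA disks, but they are illustrative and not taken from a real device; smart_enabled is a hypothetical helper name.

```python
import re

# Illustrative tail of -i output for an ATA disk (not from a real device).
SAMPLE = """\
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
"""


def smart_enabled(output: str):
    """Return True/False for Enabled/Disabled, or None if no such line exists."""
    match = re.search(r"^SMART support is:\s*(Enabled|Disabled)", output, re.MULTILINE)
    if match is None:
        return None
    return match.group(1) == "Enabled"


print(smart_enabled(SAMPLE))
```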
3.2.4        Offline test
Run an offline test on /dev/sda. Its results are mainly used to update the SMART attributes.
smartctl  -t  offline  /dev/sda
3.2.5        Short self-test
Run a short self-test on /dev/sda.
smartctl  -t  short  /dev/sda
3.2.5.1        Watching test progress
The -c option shows the progress of the test:
# smartctl -c    /dev/sda

Self-test execution status:      ( 242)        Self-test routine in progress...
                                                    20% of test remaining.
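For polling from a script, the "remaining" figure in the output above can be turned into a completion percentage. A minimal Python sketch, using the two status lines shown above as sample input (percent_done is a made-up helper name):

```python
import re

# The status lines from the -c output shown above.
SAMPLE = """\
Self-test execution status:      ( 242)        Self-test routine in progress...
                                                    20% of test remaining.
"""


def percent_done(output: str):
    """Return how much of the self-test has completed, or None if none is running."""
    match = re.search(r"(\d+)% of test remaining", output)
    if match is None:
        return None
    return 100 - int(match.group(1))


print(percent_done(SAMPLE))
```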

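3.2.5.2        Viewing test results
The -l selftest option shows the recorded test results for /dev/sda:
Entry "#1": Completed without error means the test finished and found no errors.
Entry "#2": Aborted by host means the user stopped the test with 90% still remaining.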

# smartctl -l selftest    /dev/sda
...
Num  Test_Description  Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error   00%        9535         -
# 2  Extended offline    Aborted by host          90%        9534         -
...
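The log rows above are easy to split into fields. Here is a Python sketch that assumes, as in the listing above, that columns are separated by two or more spaces; a status text containing such a run of spaces would not parse, and rows that do not split into six fields are simply skipped. parse_selftest_log is a hypothetical helper name.

```python
import re

# The self-test log rows shown above.
SAMPLE = """\
Num  Test_Description  Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error   00%        9535         -
# 2  Extended offline    Aborted by host          90%        9534         -
"""


def parse_selftest_log(output: str) -> list[dict]:
    """Parse '# N ...' rows into dictionaries, skipping anything else."""
    entries = []
    for line in output.splitlines():
        if not re.match(r"#\s*\d+\s", line):
            continue  # header or unrelated line
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) != 6:
            continue  # layout differs from what this sketch assumes
        num, test, status, remaining, hours, lba = fields
        entries.append({
            "num": int(num.lstrip("#")),
            "test": test,
            "status": status,
            "remaining_percent": int(remaining.rstrip("%")),
            "lifetime_hours": int(hours),
            "lba_of_first_error": None if lba == "-" else lba,
        })
    return entries


for entry in parse_selftest_log(SAMPLE):
    print(entry["num"], entry["test"], "|", entry["status"], "|", entry["remaining_percent"])
```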
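3.2.6        View SMART attribute values
The -A option shows the SMART attribute values of /dev/sda.
smartctl  -A  /dev/sda

Each row gives the information for one SMART attribute.
RAW_VALUE: The actual value of the attribute. For example, the row with ID 12 gives the actual number of disk power cycles.
VALUE: Ranges from 1 to 254. It is converted from RAW_VALUE by the disk firmware itself.
THRESH: Ranges from 0 to 255. This is the threshold that VALUE is compared against. If VALUE is less than or equal to THRESH, the attribute is abnormal.
TYPE: Pre-fail means that when VALUE is less than or equal to THRESH, a related disk failure is imminent.
      Old_age means that when VALUE is less than or equal to THRESH, the disk has worn out with respect to that attribute.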
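The VALUE/THRESH rule above can be applied mechanically. The Python sketch below does so on two attribute rows laid out the way smartctl -A prints them for ATA disks. The numbers are invented for illustration (the second row is deliberately at its threshold), and assess is a made-up helper name.

```python
# Illustrative -A rows; the values are invented, not read from a real disk.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   020   020   020    Old_age   Always       -       81234
"""


def assess(output: str) -> list[tuple[str, str]]:
    """Apply the rule 'VALUE <= THRESH means abnormal' to every attribute row."""
    findings = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 10 or not fields[0].isdigit():
            continue  # header or unrelated line
        name, value, thresh, kind = fields[1], int(fields[3]), int(fields[5]), fields[6]
        if value <= thresh:
            # Pre-fail predicts a failure; Old_age only signals wear.
            verdict = "failure expected soon" if kind == "Pre-fail" else "worn out"
        else:
            verdict = "ok"
        findings.append((name, verdict))
    return findings


for name, verdict in assess(SAMPLE):
    print(name, "->", verdict)
```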
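3.3        smartctl structure
The main structure of the smartctl tool is as follows: after parsing its options, it sets or queries SMART information according to the values they specify.
[Figure: main structure of smartctl]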

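3.4        SMART attributes
smartctl  -A  /dev/sda shows many SMART attributes of the disk, from which you can tell whether the disk is healthy.
The following list gives the meaning of each attribute:
ID        Hex        Attribute name        Description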
01        0x01        Read Error Rate        (Vendor specific raw value.) Stores data related to the rate of hardware read errors that occurred when reading data from a disk surface. The raw value has different structure for different vendors and is often not meaningful as a decimal number.
02        0x02        Throughput Performance        Overall (general) throughput performance of a hard disk drive. If the value of this attribute is decreasing there is a high probability that there is a problem with the disk.
03        0x03        Spin-Up Time        Average time of spindle spin up (from zero RPM to fully operational [millisecs]).
04        0x04        Start/Stop Count        A tally of spindle start/stop cycles. The spindle turns on, and hence the count is increased, both when the hard disk is turned on after having before been turned entirely off (disconnected from power source) and when the hard disk returns from having previously been put to sleep mode.
05        0x05        Reallocated Sectors Count        Count of reallocated sectors. When the hard drive finds a read/write/verification error, it marks that sector as "reallocated" and transfers data to a special reserved area (spare area). This process is also known as remapping, and reallocated sectors are called "remaps". The raw value normally represents a count of the bad sectors that have been found and remapped. Thus, the higher the attribute value, the more sectors the drive has had to reallocate. This allows a drive with bad sectors to continue operation; however, a drive which has had any reallocations at all is significantly more likely to fail in the near future. While primarily used as a metric of the life expectancy of the drive, this number also affects performance. As the count of reallocated sectors increases, the read/write speed tends to become worse because the drive head is forced to seek to the reserved area whenever a remap is accessed. A workaround which will preserve drive speed at the expense of capacity is to create a disk partition over the region which contains remaps and instruct the operating system to not use that partition.
06        0x06        Read Channel Margin        Margin of a channel while reading data. The function of this attribute is not specified.
07        0x07        Seek Error Rate        (Vendor specific raw value.) Rate of seek errors of the magnetic heads. If there is a partial failure in the mechanical positioning system, then seek errors will arise. Such a failure may be due to numerous factors, such as damage to a servo, or thermal widening of the hard disk. The raw value has different structure for different vendors and is often not meaningful as a decimal number.
08        0x08        Seek Time Performance        Average performance of seek operations of the magnetic heads. If this attribute is decreasing, it is a sign of problems in the mechanical subsystem.
09        0x09        Power-On Hours (POH)        Count of hours in power-on state. The raw value of this attribute shows total count of hours (or minutes, or seconds, depending on manufacturer) in power-on state.
10        0x0A        Spin Retry Count        Count of retry of spin start attempts. This attribute stores a total count of the spin start attempts to reach the fully operational speed (under the condition that the first attempt was unsuccessful). An increase of this attribute value is a sign of problems in the hard disk mechanical subsystem.
11        0x0B        Recalibration Retries or Calibration Retry Count        This attribute indicates the count that recalibration was requested (under the condition that the first attempt was unsuccessful). An increase of this attribute value is a sign of problems in the hard disk mechanical subsystem.
12        0x0C        Power Cycle Count        This attribute indicates the count of full hard disk power on/off cycles.
13        0x0D        Soft Read Error Rate        Uncorrected read errors reported to the operating system.
180        0xB4        Unused Reserved Block Count Total        "Pre-Fail" Attribute used at least in HP devices.
183        0xB7        SATA Downshift Error Count        Western Digital and Samsung attribute.
184        0xB8        End-to-End error / IOEDC        This attribute is a part of Hewlett-Packard's SMART IV technology, as well as part of other vendors' IO Error Detection and Correction schemas, and it contains a count of parity errors which occur in the data path to the media via the drive's cache RAM.
185        0xB9        Head Stability        Western Digital attribute.
186        0xBA        Induced Op-Vibration Detection        Western Digital attribute.
187        0xBB        Reported Uncorrectable Errors        The count of errors that could not be recovered using hardware ECC.
188        0xBC        Command Timeout        The count of aborted operations due to HDD timeout. Normally this attribute value should be equal to zero and if the value is far above zero, then most likely there will be some serious problems with power supply or an oxidized data cable.
189        0xBD        High Fly Writes        HDD producers implement a Fly Height Monitor that attempts to provide additional protections for write operations by detecting when a recording head is flying outside its normal operating range. If an unsafe fly height condition is encountered, the write process is stopped, and the information is rewritten or reallocated to a safe region of the hard drive. This attribute indicates the count of these errors detected over the lifetime of the drive. This feature is implemented in most modern Seagate drives and some of Western Digital's drives, beginning with the WD Enterprise WDE18300 and WDE9180 Ultra2 SCSI hard drives, and will be included on all future WD Enterprise products.
190        0xBE        Airflow Temperature (WDC) or Airflow Temperature Celsius (HP)        Airflow temperature on Western Digital HDs (same as temperature [0xC2], but the current value is 50 less for some models; marked as obsolete).
191        0xBF        G-sense Error Rate        The count of errors resulting from externally-induced shock & vibration.
192        0xC0        Power-off Retract Count or Emergency Retract Cycle Count (Fujitsu)        Count of times the heads are loaded off the media. Heads can be unloaded without actually powering off.
193        0xC1        Load Cycle Count or Load/Unload Cycle Count (Fujitsu)        Count of load/unload cycles into head landing zone position. The typical lifetime rating for laptop (2.5-in) hard drives is 300,000 to 600,000 load cycles. Some laptop drives are programmed to unload the heads whenever there has not been any activity for about five seconds. Many Linux installations write to the file system a few times a minute in the background. As a result, there may be 100 or more load cycles per hour, and the load cycle rating may be exceeded in less than a year.
194        0xC2        Temperature or Temperature Celsius        Current internal temperature.
195        0xC3        Hardware ECC Recovered        (Vendor specific raw value.) The raw value has different structure for different vendors and is often not meaningful as a decimal number.
196        0xC4        Reallocation Event Count        Count of remap operations. The raw value of this attribute shows the total count of attempts to transfer data from reallocated sectors to a spare area. Both successful & unsuccessful attempts are counted.
197        0xC5        Current Pending Sector Count        Count of "unstable" sectors (waiting to be remapped, because of read errors). If an unstable sector is subsequently read successfully, this value is decreased and the sector is not remapped. Read errors on a sector will not remap the sector (since it might be readable later); instead, the drive firmware remembers that the sector needs to be remapped, and remaps it the next time it's written.
198        0xC6        Uncorrectable Sector Count or Offline Uncorrectable or Off-Line Scan Uncorrectable Sector Count        The total count of uncorrectable errors when reading/writing a sector. A rise in the value of this attribute indicates defects of the disk surface and/or problems in the mechanical subsystem.
199        0xC7        UltraDMA CRC Error Count        The count of errors in data transfer via the interface cable as determined by ICRC (Interface Cyclic Redundancy Check).
200        0xC8        Multi-Zone Error Rate        The count of errors found when writing a sector. The higher the value, the worse the disk's mechanical condition is.
200        0xC8        Write Error Rate (Fujitsu)        The total count of errors when writing a sector.
201        0xC9        Soft Read Error Rate or TA Counter Detected        Count of off-track errors.
202        0xCA        Data Address Mark errors or TA Counter Increased        Count of Data Address Mark errors (or vendor-specific).
203        0xCB        Run Out Cancel        Count of ECC errors
204        0xCC        Soft ECC Correction        Count of errors corrected by software ECC
205        0xCD        Thermal Asperity Rate (TAR)        Count of errors due to high temperature.
206        0xCE        Flying Height        Height of heads above the disk surface. A flying height that's too low increases the chances of a head crash while a flying height that's too high increases the chances of a read/write error.
207        0xCF        Spin High Current        Amount of surge current used to spin up the drive.
208        0xD0        Spin Buzz        Count of buzz routines needed to spin up the drive due to insufficient power.
209        0xD1        Offline Seek Performance        Drive’s seek performance during its internal tests.
210        0xD2        Unknown        (found in Maxtor 6B200M0 200GB and Maxtor 2R015H1 15GB disks)
211        0xD3        Vibration During Write        Vibration During Write
212        0xD4        Shock During Write        Shock During Write
220        0xDC        Disk Shift        Distance the disk has shifted relative to the spindle (usually due to shock or temperature). Unit of measure is unknown.
222        0xDE        Loaded Hours        Time spent operating under data load (movement of magnetic head armature)
223        0xDF        Load/Unload Retry Count        Count of times head changes position.
224        0xE0        Load Friction        Resistance caused by friction in mechanical parts while operating.
225        0xE1        Load/Unload Cycle Count        Total count of load cycles
226        0xE2        Load 'In'-time        Total time of loading on the magnetic heads actuator (time not spent in parking area).
227        0xE3        Torque Amplification Count        Count of attempts to compensate for platter speed variations
228        0xE4        Power-Off Retract Cycle        The count of times the magnetic armature was retracted automatically as a result of cutting power.
230        0xE6        GMR Head Amplitude        Amplitude of "thrashing" (distance of repetitive forward/reverse head motion)
231        0xE7        Temperature        Drive Temperature
232        0xE8        Endurance Remaining        Number of physical erase cycles completed on the drive as a percentage of the maximum physical erase cycles the drive is designed to endure
232        0xE8        Available Reserved Space        Intel SSD reports the number of available reserved space as a percentage of reserved space in a brand new SSD.
233        0xE9        Power-On Hours        Number of hours elapsed in the power-on state.
233        0xE9        Media Wearout Indicator        Intel SSD reports a normalized value of 100 (when the SSD is new) and declines to a minimum value of 1. It decreases while the NAND erase cycles increase from 0 to the maximum-rated cycles.
240        0xF0        Head Flying Hours        Time while head is positioning
240        0xF0        Transfer Error Rate (Fujitsu)        Count of times the link is reset during a data transfer.
241        0xF1        Total LBAs Written        Total count of LBAs written
242        0xF2        Total LBAs Read        Total count of LBAs read. Some S.M.A.R.T. utilities will report a negative number for the raw value since in reality it has 48 bits rather than 32.
250        0xFA        Read Error Retry Rate        Count of errors while reading from a disk
254        0xFE        Free Fall Protection        Count of "Free Fall Events" detected
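The Load Cycle Count entry (ID 193) in the table above says that at 100 load cycles per hour the rating can be used up in under a year. A quick check of that arithmetic in Python, assuming the disk is powered on around the clock (days_until_rating is a made-up helper name):

```python
def days_until_rating(rated_cycles: int, cycles_per_hour: float) -> float:
    """Days of continuous operation until the load cycle rating is reached."""
    return rated_cycles / cycles_per_hour / 24


# The two ends of the rating range quoted in the table, at 100 cycles per hour.
for rating in (300_000, 600_000):
    print(rating, "cycles ->", days_until_rating(rating, 100), "days")
```

Even the upper end of the quoted range works out to well under 365 days at that rate, which is why a fast-climbing raw value for attribute 193 on a laptop drive deserves attention.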

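3.5        SMART self-test
smartctl  -t  offline/short/long  starts a self-test on the disk.
offline:
        This is the default self-test.
short:
        The purpose of the short self-test is to confirm quickly whether the disk is faulty.
        The test consists of many segments, all defined by the disk vendor, for example:
a)        An electrical segment, which tests the disk's internal circuitry. The details are specified by the disk vendor, for example:
A)        Buffer (cache) test.
B)        Read/write circuitry test.
C)        Read/write head test.
b)        A seek/servo segment, which tests the disk's ability to find and servo on data tracks.
c)        A read/verify segment, which tests the disk's ability to read part or all of the disk.
long:
        Known as the extended self-test. It runs the same kinds of tests as short, but takes much longer.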