

How to use the wget command

#1 · Posted 2010-07-02 13:26
As the title says.

For example, I want to download the files under

http://mirrors.kernel.org/opensuse/distribution/11.2/iso/

How do I get them?

#2 · Posted 2010-07-02 13:53

#3 · Posted 2010-07-02 14:07
Reply to #1 wazhl:
wget usage tips

2007-10-14, Toy, posted in Tips

wget is a command-line download tool. For those of us who use Linux, it is something we use almost every day. Here are a few handy wget tips that can make your use of wget more efficient and flexible.

    * $ wget -r -np -nd http://example.com/packages/

This command downloads every file in the packages directory on http://example.com. Here, -np tells wget not to ascend into the parent directory, and -nd tells it not to recreate the remote directory structure locally.

    * $ wget -r -np -nd --accept=iso http://example.com/centos-5/i386/

Similar to the previous command, but with an extra --accept=iso option, which tells wget to download only the files under the i386 directory whose extension is iso. You can also specify several extensions at once, separated by commas.
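For instance, a small sketch of accepting more than one extension (the host, path, and extension list are placeholders, not taken from this thread):

    # download both .iso and .md5 files from the directory
    $ wget -r -np -nd --accept=iso,md5 http://example.com/centos-5/i386/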

    * $ wget -i filename.txt

This command is commonly used for batch downloads: put the addresses of all the files you want into filename.txt, and wget will download every one of them for you.
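A minimal sketch of such a list file (the file name and URLs are placeholders):

    # filename.txt holds one URL per line, for example:
    #   http://example.com/a.iso
    #   http://example.com/b.iso
    $ wget -i filename.txt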

    * $ wget -c http://example.com/really-big-file.iso

The -c option used here resumes a partially downloaded file instead of starting over.
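On an unreliable connection, -c is often combined with wget's retry and timeout options; a sketch, with the URL and the particular values purely illustrative:

    # -t 0: retry indefinitely; -T 30: give up on a stalled connection after 30 seconds
    $ wget -c -t 0 -T 30 http://example.com/really-big-file.iso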

    * $ wget -m -k (-H) http://www.example.com/

This command mirrors a website, and wget converts the links so the local copy is browsable. If the site's images live on a different host, add the -H option.
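Since -H by itself lets wget follow links to any host, it is usually wise to bound the crawl with wget's -D/--domains option; a sketch with placeholder domains:

    # mirror the site, but only span to the listed domains
    $ wget -m -k -H -D example.com,images.example.com http://www.example.com/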
一路征程一路笑 (this user has been deleted)
#4 · Posted 2010-07-02 14:35
Note: the author has been banned or deleted; the content was automatically hidden.

#5 · Posted 2010-07-02 15:31
Dude, don't you know how to use man?!

#6 · Posted 2010-07-02 21:58
The poster at #3 is impressive... all I know is wget url.

#7 · Posted 2010-07-03 17:54
Last edited by expresss on 2010-07-05 08:09

Why can't I manage to download the files from the URL the OP posted?
wget -r -np -nd --accept=md5 http://mirrors.kernel.org/opensuse/distribution/11.2/iso/
This does not actually fetch the md5 files; all I end up with is a robots.txt file. Switching to other extensions makes no difference.
Is there something wrong with these options? I have tried several times and it fails every time; it never downloads files of the specified type.

#8 · Posted 2010-07-05 11:58
Reply to #7 expresss:
9.1 Robot Exclusion

It is extremely easy to make Wget wander aimlessly around a web site, sucking all the available data in progress. ‘wget -r site’, and you're set. Great? Not for the server admin.

As long as Wget is only retrieving static pages, and doing it at a reasonable rate (see the ‘--wait’ option), there's not much of a problem. The trouble is that Wget can't tell the difference between the smallest static page and the most demanding CGI. A site I know has a section handled by a CGI Perl script that converts Info files to html on the fly. The script is slow, but works well enough for human users viewing an occasional Info file. However, when someone's recursive Wget download stumbles upon the index page that links to all the Info files through the script, the system is brought to its knees without providing anything useful to the user (This task of converting Info files could be done locally and access to Info documentation for all installed GNU software on a system is available from the info command).

To avoid this kind of accident, as well as to preserve privacy for documents that need to be protected from well-behaved robots, the concept of robot exclusion was invented. The idea is that the server administrators and document authors can specify which portions of the site they wish to protect from robots and those they will permit access.

The most popular mechanism, and the de facto standard supported by all the major robots, is the “Robots Exclusion Standard” (RES) written by Martijn Koster et al. in 1994. It specifies the format of a text file containing directives that instruct the robots which URL paths to avoid. To be found by the robots, the specifications must be placed in /robots.txt in the server root, which the robots are expected to download and parse.

Although Wget is not a web robot in the strictest sense of the word, it can download large parts of the site without the user's intervention to download an individual page. Because of that, Wget honors RES when downloading recursively. For instance, when you issue:

     wget -r http://www.server.com/

First the index of ‘www.server.com’ will be downloaded. If Wget finds that it wants to download more documents from that server, it will request ‘http://www.server.com/robots.txt’ and, if found, use it for further downloads. robots.txt is loaded only once per each server.

Until version 1.8, Wget supported the first version of the standard, written by Martijn Koster in 1994 and available at http://www.robotstxt.org/wc/norobots.html. As of version 1.8, Wget has supported the additional directives specified in the internet draft ‘<draft-koster-robots-00.txt>’ titled “A Method for Web Robots Control”. The draft, which has as far as I know never made to an rfc, is available at http://www.robotstxt.org/wc/norobots-rfc.txt.

This manual no longer includes the text of the Robot Exclusion Standard.

The second, less known mechanism, enables the author of an individual document to specify whether they want the links from the file to be followed by a robot. This is achieved using the META tag, like this:

     <meta name="robots" content="nofollow">

This is explained in some detail at http://www.robotstxt.org/wc/meta-user.html. Wget supports this method of robot exclusion in addition to the usual /robots.txt exclusion.

If you know what you are doing and really really wish to turn off the robot exclusion, set the robots variable to ‘off’ in your .wgetrc. You can achieve the same effect from the command line using the -e switch, e.g. ‘wget -e robots=off url...’.
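To make that concrete, a minimal sketch of the two equivalent ways described above to switch robot exclusion off (the URL is a placeholder):

    # one-off, on the command line:
    $ wget -e robots=off -r -np http://example.com/some/dir/

    # or permanently, by adding this line to ~/.wgetrc:
    #   robots = off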

#9 · Posted 2010-07-05 12:01
Last edited by gamester88 on 2010-07-05 12:02

Reply to #7 expresss:

It is because of the robots.txt file that the options above have no effect, so:
    [gamester88@gamester88 iso]$ mkdir iso
    [gamester88@gamester88 iso]$ cd iso
    [gamester88@gamester88 iso]$ ls
    [gamester88@gamester88 iso]$ wget -e robots=off -r -np -nd --accept=md5 http://mirrors.kernel.org/opensuse/distribution/11.2/iso/
    [gamester88@gamester88 iso]$ ls
    openSUSE-11.2-Addon-Lang-i586.iso.md5
    openSUSE-11.2-DVD-x86_64.iso.md5
    openSUSE-11.2-KDE4-LiveCD-x86_64.iso.md5
    openSUSE-11.2-Addon-Lang-x86_64.iso.md5
    openSUSE-11.2-GNOME-LiveCD-i686.iso.md5
    openSUSE-11.2-NET-i586.iso.md5
    openSUSE-11.2-Addon-NonOss-BiArch-i586-x86_64.iso.md5
    openSUSE-11.2-GNOME-LiveCD-x86_64.iso.md5
    openSUSE-11.2-NET-x86_64.iso.md5
    openSUSE-11.2-DVD-i586.iso.md5
    openSUSE-11.2-KDE4-LiveCD-i686.iso.md5

#10 · Posted 2010-07-06 21:22
Last edited by expresss on 2010-07-07 09:16

Reply to #9 gamester88:

Thank you for the very helpful reply, I really appreciate it. It looks like to learn Linux properly you really do have to get your English up to scratch, heh.
Thanks again for the patient explanation. I think I understand now: because robots.txt contains disallow:/, crawling the whole directory tree is not allowed; with -e robots=off, wget ignores the robots rules, in other words it bypasses the restrictions in robots.txt.
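For reference, a "deny everything" robots.txt generally looks like the snippet below, and wget itself can be used to peek at a server's copy; the exact contents of this mirror's file are an assumption here, not quoted from it:

    # print the server's robots.txt to stdout
    $ wget -q -O - http://mirrors.kernel.org/robots.txt

    # a robots.txt that disallows everything typically reads:
    #   User-agent: *
    #   Disallow: /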