
Chinaunix

Title: Why does the output of re.findall contain extra ', " and similar characters?

Author: blackantt    Time: 2021-04-20 19:14
import requests
import re
url = 'http://www.shubang.net/book/66_2151.html'
headers = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.128 Safari/537.36'}
web_data = requests.get(url, headers=headers)
web_data.encoding = 'utf-8'
txt = web_data.text
items = re.findall(r'line_en\" \>(.*)<|line_cn\" title=\"(.*)\"', txt)   
for item in items:
    print(item)

The output looks like this:
...
('&#34;It doesn&#39;t look new. It looks old,&#34; one of the boys said.', '')
('', '“房子一點也不新,舊死了,”其中一個男孩說。')
('It just couldn&#39;t be.', '')
('', '絕對不可能。')
('The other members of his family turned to stare at me.', '')
('', '其他人都把目光轉向了我。')
...


Questions:
1. Where do the ') , ( in the output above come from?
2. Why did couldn't turn into couldn&#39; ?



Author: blackantt    Time: 2021-04-21 11:24
Figured it out: use the replace function to do the substitution.
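
For anyone landing here later, a minimal sketch of what seems to be going on, covering both questions. When a pattern has two capture groups, re.findall returns one tuple per match, and the group belonging to the alternative that did not match comes back as '' (hence the parentheses, commas and empty quotes). The &#39; and &#34; are HTML character references for ' and ", which html.unescape from the standard library decodes in one call, as an alternative to chaining replace. The sample string below is made up to stand in for the page source; the real markup at the URL in the first post may differ.

```python
import html
import re

# Hypothetical sample standing in for the downloaded page source;
# the actual markup on the site is an assumption here.
sample = (
    '<span class="line_en" >It just couldn&#39;t be.</span>\n'
    '<span class="line_cn" title="絕對不可能。"></span>\n'
)

# Same pattern as in the first post: two alternatives, two capture groups.
pattern = r'line_en\" \>(.*)<|line_cn\" title=\"(.*)\"'

# With more than one group, findall returns a list of tuples.
# Only one alternative matches at a time, so the other group is ''.
raw = re.findall(pattern, sample)

# Keep whichever group is non-empty, then decode entities like &#39;.
lines = [html.unescape(en or cn) for en, cn in raw]

for line in lines:
    print(line)
```

Picking the non-empty group with `en or cn` keeps the original pattern unchanged; splitting it into two separate findall calls, one per language, would be another reasonable way to avoid the tuples.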



