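Starting today, I'll be posting a series of web-scraping tutorials. If you spot anything wrong or missing, please point it out!
This program automatically scrapes brain teasers (脑筋急转弯) from 2345.com, moves from page to page on its own, and saves the results to a file.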
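import re

import requests

# The pages are numbered 1.htm through 73.htm.
for a in range(1, 74):
    url = "https://www.2345.com/inner/jzw/" + str(a) + ".htm"
    what = requests.get(url)
    # Capture everything from the riddle text up to the "点击显示答案" (click to show answer) link.
    message = re.findall('''<li><span class="table_left">(.*)">点击显示答案</a></span></li>''', what.text)
    # print(message)
    for i in range(len(message)):
        try:
            # Split each match into the riddle (before the markup) and the answer (after it).
            FenGe = message[i].split("""</span><span class="table_right"><a href="javascript:;" class="answer" onclick="MM_popupMsg""")
            # print(FenGe[0]+FenGe[1])
            # Append as UTF-8 so the Chinese text is written the same way on every platform.
            with open(r"56.txt", "a", encoding="utf-8") as f:
                f.write(FenGe[0] + FenGe[1] + "\n")
        except Exception:
            # A match that does not split into two parts raises IndexError; skip it and carry on.
            print("Page %d, line %d is malformed and was skipped automatically!" % (a, i))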
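The parsing step of the script above can be tried offline, without hitting the site. The sketch below is only an illustration: `SAMPLE_HTML`, `ITEM_RE`, and `parse_riddles` are hypothetical names, the sample markup is reconstructed from the patterns used in the script, and the `MM_popupMsg('...')` argument format is an assumption, so the real pages may differ. Instead of one greedy match followed by `split`, it uses a single non-greedy pattern with two groups, which returns the question and the answer separately and silently ignores entries that do not fit.

```python
import re

# Hypothetical sample markup, reconstructed from the patterns in the script.
# The real page layout (and the MM_popupMsg argument format) may differ.
SAMPLE_HTML = (
    '<li><span class="table_left">What has hands but cannot clap?</span>'
    '<span class="table_right"><a href="javascript:;" class="answer" '
    'onclick="MM_popupMsg(\'A clock\')">点击显示答案</a></span></li>\n'
    '<li><span class="table_left">Broken entry with no answer</span></li>\n'
    '<li><span class="table_left">What gets wetter as it dries?</span>'
    '<span class="table_right"><a href="javascript:;" class="answer" '
    'onclick="MM_popupMsg(\'A towel\')">点击显示答案</a></span></li>\n'
)

# Group 1 is the question, group 2 is the answer (assumed to be a quoted argument).
ITEM_RE = re.compile(
    r'''<li><span class="table_left">(.*?)</span>'''
    r'''<span class="table_right"><a href="javascript:;" class="answer" '''
    r'''onclick="MM_popupMsg\('(.*?)'\)">点击显示答案</a></span></li>'''
)


def parse_riddles(html):
    """Return a list of (question, answer) tuples found in html."""
    return [(q.strip(), ans.strip()) for q, ans in ITEM_RE.findall(html)]


if __name__ == "__main__":
    for question, answer in parse_riddles(SAMPLE_HTML):
        print(question, "->", answer)
```

Keeping the parsing in its own function makes it easy to check against a saved copy of one page before running the full 73-page loop.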