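Preparation first:
1. Install Python 3
2. Install requests
3. Install BeautifulSoup
Taking a Mac as the example: the Python that ships with a MacBook is Python 2, which cannot run this script, so Python 3 has to be installed. Installation guides are easy to find with a Baidu search, and there are video tutorials on Bilibili as well. Once it is installed, type python3 in the terminal to confirm that the interpreter starts, exit it, and then run pip3 install xxx in the terminal, where xxx is the package from steps 2 and 3: requests, and beautifulsoup4 (the name BeautifulSoup is published under on PyPI). Plain pip belongs to Python 2, so packages installed with it are not available to Python 3 and the script will not run.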
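To check that the setup above worked before running the scraper, something like the following can help. It is only a minimal sketch: the helper name missing_modules is made up for this example, and the 3.6 version floor is there because the script relies on f-strings.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the names from `names` that cannot be imported by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The script uses f-strings, which need Python 3.6 or newer.
    if sys.version_info < (3, 6):
        print("Python 3.6+ is required, found", sys.version.split()[0])

    # "bs4" is the import name of the beautifulsoup4 package.
    missing = missing_modules(["requests", "bs4"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("requests and bs4 are installed")
```

Save it under any file name and run it with python3. If it reports a missing module, install it with pip3 and run the check again.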
import os

import requests
from bs4 import BeautifulSoup


def getHtml(url):
    # Send a browser-like User-Agent and a Referer pointing at the site with every request.
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36",
        "Referer": "https://www.mm131.net",
    }
    html = requests.get(url, headers=headers)
    html.encoding = html.apparent_encoding
    return html


def getSoup(html):
    return BeautifulSoup(html.text, "html.parser")


def getTitle(soup):
    # The page title doubles as the folder name for the downloaded pictures.
    title = soup.title.contents[0]
    return str(title)


def getAllPage(soup):
    # The first span of the pager holds the page count; [1:-1] drops the
    # first and last character of its text, leaving only the number.
    allPage = soup.select('body > div.content > div.content-page > span:nth-child(1)')[0].string[1:-1]
    return allPage


def makedir(title):
    try:
        os.mkdir(title)
    except FileExistsError:
        print(f"{title} folder already exists!")


def downloadPic(title, allPage, htmlMark):
    # Pictures are numbered 1.jpg, 2.jpg, ... up to the page count.
    for number in range(1, int(allPage) + 1):
        picUrl = f"https://img1.mmmw.net/pic/{htmlMark}/{number}.jpg"
        pic = getHtml(picUrl)
        with open(f"{title}/{number}.jpg", "wb") as f:
            f.write(pic.content)
        print(f"{number}.jpg download successful!")


def main():
    # To pick the numbers: open the mm131 home page, open a category, then open
    # any entry. Its address ends in /xxxx.html, and xxxx is the number used here.
    # With 2343 and 5419 the script collects 2343.html through 5419.html
    # (range() excludes its upper bound, hence the + 1).
    for mark in range(2343, 5419 + 1):
        htmlMark = str(mark)
        try:
            # URL format of an entry page. "qingchun" is the 清纯美眉 category;
            # change that path segment to scrape a different category.
            html = getHtml(f"https://www.mm131.net/qingchun/{htmlMark}.html")
            soup = getSoup(html)
            title = getTitle(soup)
            allPage = getAllPage(soup)
            makedir(title)
        except Exception:
            # Skip numbers that have no page or cannot be parsed.
            continue
        downloadPic(title, allPage, htmlMark)


if __name__ == '__main__':
    main()
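The URL scheme the script relies on can be pulled out into small helpers, which makes it easier to check the addresses before downloading anything. This is a rough sketch rather than part of the original script: the names page_url, picture_urls and parse_page_count are invented here, and the sample pager text "共45页" is an assumption about what the page-count span contains.

```python
import re

PAGE_BASE = "https://www.mm131.net"
IMAGE_BASE = "https://img1.mmmw.net/pic"


def page_url(category, mark):
    """Address of one entry page, e.g. category "qingchun" and mark 2343."""
    return f"{PAGE_BASE}/{category}/{mark}.html"


def picture_urls(mark, page_count):
    """Addresses of pictures 1.jpg through <page_count>.jpg for one entry."""
    return [f"{IMAGE_BASE}/{mark}/{n}.jpg" for n in range(1, page_count + 1)]


def parse_page_count(text):
    """Pull the first run of digits out of the pager text (assumed like "共45页")."""
    match = re.search(r"\d+", text)
    if match is None:
        raise ValueError(f"no page count found in {text!r}")
    return int(match.group())
```

For example, page_url("qingchun", 2343) gives the same address the script requests for mark 2343. Searching for digits is a little more forgiving than slicing off the first and last character, since it still works if the surrounding text changes length.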
To run the script on a Mac, cd into the directory that contains it and run python3 mm.py. The results are shown below.
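Since the script writes each entry into a folder named after the page title, with files named 1.jpg, 2.jpg and so on, a quick way to see what a run produced is to count the pictures per folder. The helper below is a small sketch for that; count_downloads is a hypothetical name, not something the script provides.

```python
import os


def count_downloads(root="."):
    """Map each sub-folder of `root` that contains .jpg files to its picture count."""
    counts = {}
    for entry in sorted(os.listdir(root)):
        folder = os.path.join(root, entry)
        if not os.path.isdir(folder):
            continue
        jpgs = [name for name in os.listdir(folder) if name.lower().endswith(".jpg")]
        if jpgs:
            counts[entry] = len(jpgs)
    return counts
```

Calling count_downloads() from the directory where mm.py was run returns a dictionary of folder names and picture counts. Folders without any .jpg files are left out.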


Ready-made download for the lazy