Downloading all images and wallpapers from 天堂图片网 (ivsky.com)

The code is as follows.

rand_agent.py

from random import choice


class RandAgent:
    agents = [
        'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.835.163 Safari/535.1',
        'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0) Gecko/20100101 Firefox/6.0',
        'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/534.50 (KHTML, like Gecko) Version/5.1 Safari/534.50',
        'Opera/9.80 (Windows NT 6.1; U; zh-cn) Presto/2.9.168 Version/11.50',
        'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0; .NET CLR 2.0.50727; SLCC2; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; InfoPath.3; .NET4.0C; Tablet PC 2.0; .NET4.0E)',
        'Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB7.0)',
    ]

    @classmethod
    def random(cls):
        return choice(cls.agents)
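
To illustrate how this class is meant to be used, here is a minimal sketch that picks a random User-Agent and wraps it in a request headers dict. The `build_headers` helper and the shortened agent list are assumptions made for illustration; they are not part of the original files.

```python
from random import choice


class RandAgent:
    # Shortened list for illustration; the original file carries six entries.
    agents = [
        "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0) Gecko/20100101 Firefox/6.0",
        "Opera/9.80 (Windows NT 6.1; U; zh-cn) Presto/2.9.168 Version/11.50",
    ]

    @classmethod
    def random(cls):
        # Return one of the configured User-Agent strings at random.
        return choice(cls.agents)


def build_headers():
    """Hypothetical helper: wrap a random User-Agent in a headers dict."""
    return {"User-Agent": RandAgent.random()}


headers = build_headers()
print(headers["User-Agent"])
```

Each call may return a different string, so consecutive requests do not all present the same browser signature.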

IvskySpider.py

import os
import requests
from lxml import etree
from urllib.parse import urljoin
from urllib.request import urlretrieve
from rand_agent import RandAgent


class IvskySpider:

    @classmethod
    def run(cls):
        navs = cls.parse_nav("http://www.ivsky.com/")
        for nav_title, nav_href in navs:
            bigs = cls.parse_big_cate(nav_href)
            for big_title, big_href in bigs:
                smalls = cls.parse_small_cate(big_href)
                for small_title, small_href in smalls:
                    imgs = cls.parse_all_page(small_href)
                    dir_path = f"imgs/{nav_title}/{big_title}/{small_title}"
                    # Create the target folder
                    os.makedirs(dir_path, exist_ok=True)
                    for s_img_name, s_img_src, b_img_name, b_img_src in imgs:
                        # Download every image (thumbnail and large version).
                        # os.path.join supplies the separator that plain
                        # concatenation (dir_path + name) would omit.
                        urlretrieve(s_img_src, os.path.join(dir_path, s_img_name))
                        urlretrieve(b_img_src, os.path.join(dir_path, b_img_name))
                    print("*" * 100)

    @classmethod
    def get(cls, url, is_text=True):
        headers = {'User-Agent': RandAgent.random()}
        response = requests.get(url, headers=headers)
        response.encoding = response.apparent_encoding
        return response.text if is_text else response.content

    @classmethod
    def parse_nav(cls, url):
        root = etree.HTML(cls.get(url))
        a_list = root.xpath("//ul[@id='menu']/li/a")
        for a in a_list[1:]:
            title = a.xpath("text()")
            title = title[0] if title else None
            href = a.xpath("@href")
            href = href[0] if href else None
            if not title or not href:
                continue
            href = urljoin(url, href)
            print(title, href)
            # When data has to be returned from inside a loop, yield is recommended
            # 1. It returns data, much like return
            yield title, href

    @classmethod
    def parse_big_cate(cls, url):
        root = etree.HTML(cls.get(url))
        big_cates = root.xpath("//ul[contains(@class,'menu')]/li/a")
        for big_cate in big_cates[1:]:
            big_name = big_cate.xpath("text()")
            big_name = big_name[0] if big_name else None
            big_href = big_cate.xpath("@href")
            big_href = big_href[0] if big_href else None
            if not big_name or not big_href:
                continue
            big_href = urljoin(url, big_href)
            print(big_name, big_href)
            yield big_name, big_href

    @classmethod
    def parse_small_cate(cls, url):
        root = etree.HTML(cls.get(url))
        small_cates = root.xpath("//div[@class='sline']/div/a")
        for small_cate in small_cates:
            small_title = small_cate.xpath("text()")
            small_title = small_title[0] if small_title else None
            small_href = small_cate.xpath("@href")
            small_href = small_href[0] if small_href else None
            if not small_title or not small_href:
                continue
            small_href = urljoin(url, small_href)
            print(small_title, small_href)
            yield small_title, small_href

    @classmethod
    def parse_all_page(cls, url):
        page = 1
        while True:
            perpage_url = url + f"index_{page}.html"
            root = etree.HTML(cls.get(perpage_url))
            imgs = root.xpath("//img/@src")
            for img_src in imgs:
                # Protocol-relative URLs (//host/...) need a scheme prepended
                img_src = "http:" + img_src if not img_src.startswith("http") else img_src
                img_name = img_src.split("/")[-1]
                # The large image lives under /pre/ instead of /t/
                big_img_src = img_src.replace("/t/", "/pre/")
                big_img_name = "big_" + img_name
                yield img_name, img_src, big_img_name, big_img_src
            if not imgs:
                print("Reached the last page")
                break
            page += 1


if __name__ == "__main__":
    IvskySpider.run()
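
The string handling inside `parse_all_page` can be checked in isolation, without any network access. The sketch below pulls it into a standalone function and also shows how a save path could be built with `os.path.join`, since plain string concatenation of a directory and a file name would leave out the separator. The function name `derive_image_urls`, the sample URL, and the sample directory are hypothetical and exist only for illustration.

```python
import os


def derive_image_urls(img_src):
    """Hypothetical helper mirroring the per-image logic of parse_all_page."""
    # Protocol-relative URLs (//host/path) get an explicit scheme.
    if not img_src.startswith("http"):
        img_src = "http:" + img_src
    # The file name is the last path segment.
    img_name = img_src.split("/")[-1]
    # The large image is assumed to sit under /pre/ instead of /t/.
    big_img_src = img_src.replace("/t/", "/pre/")
    return img_name, img_src, "big_" + img_name, big_img_src


# Made-up thumbnail URL, not taken from the real site.
sample = "//img.example.com/img/tupian/t/201801/01/sample.jpg"
img_name, img_src, big_img_name, big_img_src = derive_image_urls(sample)

# Build the save path with a proper separator.
dir_path = "imgs/nav/big/small"
save_path = os.path.join(dir_path, img_name)

print(img_src)
print(big_img_src)
print(save_path)
```

Keeping this logic in a pure function makes it easy to verify the thumbnail-to-large-image mapping before running a full crawl.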
