This post was last edited by 摸鱼写代码 on 2023-3-2 16:39
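Honor of Kings is one of the most popular games right now and is enjoyed by all kinds of players. Some of them like to collect high-resolution Honor of Kings wallpapers for their desktops. So how do you gather usable HD wallpapers from the huge volume of images online? Here is how I did it with a web crawler.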
# coding=utf-8
import os
import re

import requests

url = 'https://pvp.qq.com/web201605/js/herolist.json'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.74 Safari/537.36 Edg/99.0.1150.55'
}

# Make sure the output directory exists before any file is written
os.makedirs('photo', exist_ok=True)

rsp = requests.get(url, headers=headers)
# print(rsp.text)
print(rsp.status_code)

for index in rsp.json():
    # Get the hero's name and id
    hero_name = index['cname']
    hero_id = index['ename']
    index_url = f'https://pvp.qq.com/web201605/herodetail/{hero_id}.shtml'
    # print(hero_name, hero_id, index_url)
    rsp1 = requests.get(url=index_url, headers=headers)
    # rsp1.encoding = 'gbk'
    rsp1.encoding = rsp1.apparent_encoding  # detect the encoding automatically
    # NOTE: the regex literal used here was an HTML snippet that got stripped
    # when this post was rendered. The pattern below is an ASSUMED stand-in
    # (skin names stored in a data-imgname attribute); check it against the
    # page source before relying on it.
    matches = re.findall(r'data-imgname="(.*?)"', rsp1.text)
    if not matches:
        # Skip hero pages that do not contain the skin-name list
        continue
    # Strip the "&<digits>" suffixes, then split the skin names on "|"
    title_list = re.sub(r'&\d+', '', matches[0]).split('|')
    for num in range(1, len(title_list) + 1):
        img_url = f'https://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/{hero_id}/{hero_id}-bigskin-{num}.jpg'
        img_title = title_list[num - 1]
        # print(img_title, img_url)
        img_data = requests.get(url=img_url, headers=headers).content
        with open('photo/' + img_title + '.jpg', 'wb') as f:
            print(f'===================== Downloading skins for {hero_name} ========================')
            f.write(img_data)
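The skin-name handling in the script can be tried offline, without hitting the site. Below is a minimal sketch, assuming the extracted string has the form `name&id|name&id` (which is what the `re.sub` and `split('|')` calls imply). The helper names `parse_skin_names` and `skin_image_urls` are hypothetical, and the sample string and hero id are made up for illustration.

```python
import re

# URL template taken from the script; num is the 1-based skin index
SKIN_URL = ('https://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/'
            '{hero_id}/{hero_id}-bigskin-{num}.jpg')


def parse_skin_names(raw):
    """Turn 'name&id|name&id' into ['name', 'name'] (hypothetical helper)."""
    cleaned = re.sub(r'&\d+', '', raw)  # drop the "&<digits>" suffixes
    return [name.strip() for name in cleaned.split('|')]


def skin_image_urls(hero_id, names):
    """Pair each skin name with the URL of its image (hypothetical helper)."""
    return [(name, SKIN_URL.format(hero_id=hero_id, num=num))
            for num, name in enumerate(names, start=1)]


# Made-up input, only to show the shape of the data
sample_names = parse_skin_names('Skin A&0|Skin B&12')
for name, url in skin_image_urls(105, sample_names):
    print(name, url)
```

Using `enumerate(..., start=1)` keeps the name and its image number together, so there is no `num - 1` indexing to get wrong.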
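One thing to watch when saving files named after skins: a name may contain characters that are not allowed in file names, and two heroes could in principle have skins with the same name. The following is a rough sketch of one way to handle that, not part of the script above. `safe_filename` and `save_image` are hypothetical helpers, and prefixing the hero name to the file name is my own assumption about a reasonable layout.

```python
import re
from pathlib import Path


def safe_filename(title, replacement='_'):
    """Replace characters that Windows forbids in file names."""
    cleaned = re.sub(r'[\\/:*?"<>|]', replacement, title).strip()
    return cleaned or 'untitled'


def save_image(data, directory, hero_name, skin_name):
    """Write image bytes to <directory>/<hero>_<skin>.jpg and return the path."""
    folder = Path(directory)
    folder.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    path = folder / f'{safe_filename(hero_name)}_{safe_filename(skin_name)}.jpg'
    path.write_bytes(data)
    return path
```

In the download loop, `save_image(img_data, 'photo', hero_name, img_title)` could stand in for the `with open(...)` block.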