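Preface

Heads-up: this post is a bait-and-switch. The real content is in Part 2; Part 1 exists purely to pad the word count and get past review.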


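Contents

  • Preface
    • 1 Overview of using the Stanford parsing libraries (parse trees, dependency graphs)
    • 2 A low-effort hack (possibly useful for those following seasonal anime)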

1 Overview of using the Stanford parsing libraries (parse trees, dependency graphs)

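The Stanford parsing modules in NLTK have recently started warning that they will be deprecated and replaced in newer releases by nltk.parse.corenlp.StanforCoreNLPParser (the name as printed in the warning; the class that actually exists in nltk.parse.corenlp is CoreNLPParser). The CoreNLP JAR packages can be downloaded from the Stanford software page. As far as I can tell, at least dependency parsing and parse trees still work, and these two are the most useful. NER works as well. Word segmentation and POS tagging raise errors (in the demos below it is the segmenter and tokenizer that fail, while POS tagging passes), but there is no need to insist on Stanford for these: plenty of other resources exist, such as jieba for Chinese and NLTK's built-in tokenizers and POS taggers for English. I have not yet worked out exactly how StanforCoreNLPParser is used; a detailed tutorial on using the Stanford JAR packages will follow soon.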

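The JAR packages downloaded from the link above are shown in the figure below:

Among them, stanford-parser-full-2020-11-17 is the most important: it can generate parse trees and dependency graphs. stanford-corenlp-4.4.0 appears to be an integration of the other packages, but from what I can see it ships far fewer models. For example, its parser models cover only English, whereas the former includes parser models for many languages, Chinese among them. The code for using these packages is shown below; part of it is adapted from https://www.cnblogs.com/baiboy/p/nltk1.html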

# -*- coding: utf-8 -*-
# @author: caoyang
# @email: caoyang@163.sufe.edu.cn

# 2022/06/10 13:16:34 currently NLTK 3.3.0
def segmenter_demo():
    # 2022/06/10 13:16:51 fails to run, reason unknown
    from nltk.tokenize.stanford_segmenter import StanfordSegmenter
    segmenter = StanfordSegmenter(
        path_to_jar=r'D:\data\stanford\software\stanford-segmenter-2020-11-17\stanford-segmenter-4.2.0.jar',
        # The slf4j jar cannot be found in stanford-segmenter-2020-11-17,
        # but both stanford-parser-full-2020-11-17 and stanford-corenlp-4.4.0 contain it
        path_to_slf4j=r'D:\data\stanford\software\stanford-parser-full-2020-11-17\slf4j-api.jar',
        path_to_sihan_corpora_dict=r'D:\data\stanford\software\stanford-segmenter-2020-11-17\data',
        path_to_model=r'D:\data\stanford\software\stanford-segmenter-2020-11-17\data\pku.gz',
        path_to_dict=r'D:\data\stanford\software\stanford-segmenter-2020-11-17\data\dict-chris6.ser.gz',
    )
    string = u'我在博客园开了一个博客,我的博客名叫伏草惟存,写了一些自然语言处理的文章。'
    result = segmenter.segment(string)
    print(result)
    return result


def tokenizer_demo():
    # 2022/06/10 13:15:03 fails to run: the Stanford tokenizer in nltk.tokenize has been deprecated
    from nltk.tokenize import StanfordTokenizer
    tokenizer = StanfordTokenizer(path_to_jar=r'D:\data\stanford\software\stanford-parser-full-2020-11-17\stanford-parser.jar')
    sent = 'Good muffins cost $3.88\nin New York.  Please buy me\ntwo of them.\nThanks.'
    result = tokenizer.tokenize(sent)
    return result


def ner_tagger_demo():
    # 2022/06/10 13:16:56 English works, but the Chinese model JAR is missing
    from nltk.tag import StanfordNERTagger
    eng_tagger = StanfordNERTagger(
        model_filename=r'D:\data\stanford\software\stanford-ner-2020-11-17\classifiers\english.all.3class.distsim.crf.ser.gz',
        path_to_jar=r'D:\data\stanford\software\stanford-ner-2020-11-17\stanford-ner.jar',
    )
    result = eng_tagger.tag('Rami Eid is studying at Stony Brook University in NY'.split())
    print(result)
    # chi_tagger = StanfordNERTagger(
    #     model_filename=r'D:\data\stanford\software\stanford-ner-2020-11-17\classifiers\chinese.misc.distsim.crf.ser.gz',
    #     path_to_jar=r'D:\data\stanford\software\stanford-ner-2020-11-17\stanford-ner.jar',
    # )
    # for word, tag in chi_tagger.tag(result.split()):
    #     print(word, tag)
    return result


def pos_tagger_demo():
    # 2022/06/10 13:17:35 test passed
    from nltk.tag import StanfordPOSTagger
    eng_tagger = StanfordPOSTagger(
        model_filename=r'D:\data\stanford\software\stanford-postagger-full-2020-11-17\models\english-bidirectional-distsim.tagger',
        path_to_jar=r'D:\data\stanford\software\stanford-postagger-full-2020-11-17\stanford-postagger.jar',
    )
    print(eng_tagger.tag('What is the airspeed of an unladen swallow ?'.split()))
    chi_tagger = StanfordPOSTagger(
        model_filename=r'D:\data\stanford\software\stanford-postagger-full-2020-11-17\models\chinese-distsim.tagger',
        path_to_jar=r'D:\data\stanford\software\stanford-postagger-full-2020-11-17\stanford-postagger.jar',
    )
    result = '四川省 成都 信息 工程 大学 我 在 博客 园 开 了 一个 博客 , 我 的 博客 名叫 伏 草 惟 存 , 写 了 一些 自然语言 处理 的 文章 。\r\n'
    print(chi_tagger.tag(result.split()))


def dependency_demo():
    # 2022/06/10 13:21:17 test passed
    from nltk.parse.stanford import StanfordDependencyParser
    eng_parser = StanfordDependencyParser(
        r'D:\data\stanford\software\stanford-parser-full-2020-11-17\stanford-parser.jar',
        r'D:\data\stanford\software\stanford-parser-full-2020-11-17\stanford-parser-4.2.0-models.jar',
        r'D:\data\stanford\software\stanford-parser-full-2020-11-17\englishPCFG.ser.gz',
    )
    res = list(eng_parser.parse('the quick brown fox jumps over the lazy dog'.split()))
    for row in res[0].triples():
        print(row)
    chi_parser = StanfordDependencyParser(
        r'D:\data\stanford\software\stanford-parser-full-2020-11-17\stanford-parser.jar',
        r'D:\data\stanford\software\stanford-parser-full-2020-11-17\stanford-parser-4.2.0-models.jar',
        # This file has to be extracted from stanford-parser-4.2.0-models.jar
        model_path=r'D:\data\stanford\software\stanford-parser-full-2020-11-17\chinesePCFG.ser.gz',
    )
    # Parse the Chinese sentence with the Chinese parser (the original call went through eng_parser)
    res = list(chi_parser.parse('我 和 他 是 朋友'.split()))
    print(list(res[0].triples()))
    print('#' * 64)
    for row in res[0].triples():
        print(row)


def parse_tree_demo():
    # 2022/06/10 13:21:17 test passed
    from nltk.parse.stanford import StanfordParser
    parser = StanfordParser(
        r'D:\data\stanford\software\stanford-parser-full-2020-11-17\stanford-parser.jar',
        r'D:\data\stanford\software\stanford-parser-full-2020-11-17\stanford-parser-4.2.0-models.jar',
        # This file has to be extracted from stanford-parser-4.2.0-models.jar
        model_path=r'D:\data\stanford\software\stanford-parser-full-2020-11-17\chinesePCFG.ser.gz',
    )
    parse_tree = list(parser.parse(['我', '和', '他', '是', '朋友']))
    print(parse_tree)
    return parse_tree


# segmenter_demo()
# tokenizer_demo()
# ner_tagger_demo()
# pos_tagger_demo()
# dependency_demo()
# parse_tree_demo()
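
The comments in dependency_demo and parse_tree_demo note that chinesePCFG.ser.gz has to be extracted from stanford-parser-4.2.0-models.jar. A JAR is an ordinary ZIP archive, so the standard library is enough for this. The following is only a minimal sketch and not part of the original script: the helper name extract_model is hypothetical, and it searches the archive by file name instead of assuming where the model sits inside the JAR.

```python
import os
import shutil
import zipfile


def extract_model(jar_path, model_name, out_dir):
    """Copy the first archive member whose base name equals `model_name` into `out_dir`.

    Hypothetical helper for illustration; returns the path of the extracted file.
    """
    with zipfile.ZipFile(jar_path) as jar:
        for member in jar.namelist():
            # Archive members always use '/' as the separator
            if member.rsplit('/', 1)[-1] == model_name:
                os.makedirs(out_dir, exist_ok=True)
                target = os.path.join(out_dir, model_name)
                # Stream the member out so the internal directory layout is not recreated
                with jar.open(member) as src, open(target, 'wb') as dst:
                    shutil.copyfileobj(src, dst)
                return target
    raise FileNotFoundError(f'{model_name} not found in {jar_path}')
```

With the directory layout used in the demos, one would call it with the path of stanford-parser-4.2.0-models.jar, the name 'chinesePCFG.ser.gz', and the stanford-parser-full-2020-11-17 folder as the output directory.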

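After upgrading nltk to the latest version (3.7.0), the corenlp module becomes usable. It turns out to work through an HTTP interface rather than by invoking the JARs directly, so my first assumption was that no local JAR download is needed; the catch is that the connection to the server fails easily. Note, however, that the default url in the docstring below is http://localhost:9000, i.e. the module expects a CoreNLP server to be reachable at that address, and such a server is normally started locally from the CoreNLP JARs. My impression was that Stanford intends to wrap its parsers behind an interface rather than expose them directly. Judging from the docstring, the output looks rather fancy: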

class CoreNLPParser(GenericCoreNLPParser)
 |  CoreNLPParser(url='http://localhost:9000', encoding='utf8', tagtype=None)
 |
 |  >>> parser = CoreNLPParser(url='http://localhost:9000')
 |
 |  >>> next(
 |  ...     parser.raw_parse('The quick brown fox jumps over the lazy dog.')
 |  ... ).pretty_print()  # doctest: +NORMALIZE_WHITESPACE
 |                       ROOT
 |                        |
 |                        S
 |         _______________|__________________________
 |        |                         VP               |
 |        |                _________|___             |
 |        |               |             PP           |
 |        |               |     ________|___         |
 |        NP              |    |            NP       |
 |    ____|__________     |    |     _______|____    |
 |   DT   JJ    JJ   NN  VBZ   IN   DT      JJ   NN  .
 |   |    |     |    |    |    |    |       |    |   |
 |  The quick brown fox jumps over the     lazy dog  .

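The stanza package is in a similar situation: it also needs network access before it can be used (as far as I know, because it downloads its models on first use). The API documentation is at https://stanfordnlp.github.io/stanza/index.html. Personally I think the parser package above is more or less enough; without a proxy, stanza also fails quite often.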
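
As for the corenlp module, the default url='http://localhost:9000' in the CoreNLPParser docstring points at a CoreNLP server. As far as I understand its HTTP interface, the text is sent as the POST body and the annotators are passed as a JSON-encoded `properties` query parameter. The sketch below only assembles such a request with the standard library, so the shape of the call is visible. The helper name build_corenlp_request and the chosen annotators are illustrative assumptions, and nothing is sent unless the result is passed to urllib.request.urlopen while a server is actually running.

```python
import json
import urllib.parse
import urllib.request


def build_corenlp_request(text, url='http://localhost:9000',
                          annotators='tokenize,ssplit,pos,parse'):
    """Build (but do not send) a POST request for a CoreNLP server.

    Hypothetical helper: the annotator list is an assumption for illustration.
    """
    properties = {'annotators': annotators, 'outputFormat': 'json'}
    # The properties dict travels as one JSON string in the query part of the URL
    query = urllib.parse.urlencode({'properties': json.dumps(properties)})
    return urllib.request.Request(
        url=f'{url}/?{query}',
        data=text.encode('utf-8'),  # the raw text is the request body
        method='POST',
    )


request = build_corenlp_request('The quick brown fox jumps over the lazy dog.')
print(request.get_method(), request.full_url)
```

If the connection is refused, the first thing to check is whether a server is really listening on that port.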


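2 A low-effort hack (possibly useful for those following seasonal anime)

Stealing a moment from a busy schedule to share a low-effort hack.

I have recently been following Kaguya-sama: Love Is War Season 3 (《辉夜大小姐想让人告白第三季》) and SPY×FAMILY (《间谍过家家》) on Bilibili. To be honest, Bilibili used to keep its seasonal anime in sync with the original release schedule, and non-paying users were only one week, i.e. one episode, behind premium members, which was tolerable. Nowadays Bilibili pulls all kinds of stunts. Painfully slow updates are the least of it: there is "holy light" and "dark shadow" censorship, cut content, and some sensitive scenes are even redrawn in-house. It is hard to accept; if it were not for what remains of the danmaku (bullet-comment) atmosphere, who the hell would still follow anime on Bilibili.

Then I found this: 蚂蚁Tube, animation section

At the moment essentially all of the April-season shows are being updated there, and older shows are reasonably well covered too. Besides anime there are also films, TV series and variety shows, so it is really quite nice.

Anyone who frequents free sites like this knows their common ailment: videos load painfully slowly and often die completely halfway through, which is really frustrating. So I wondered whether the video could simply be downloaded and watched locally.

This is actually not complicated, and far simpler than scraping Bilibili videos. While I am at it, here is my Bilibili video crawler script. I adapted it from someone else's code with some modifications; try running the examples in the main section, which should be fairly clear. It still worked as of this post, and the comments are reasonably detailed. A series can be downloaded directly by episode id; content that requires premium membership of course needs a premium account. The Cookie used here belongs to my own account and has probably expired by now, so if you need one, log in on the web and copy your own Cookie over. The script: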

# -*- coding: utf-8 -*-
# @author: caoyang
# @email: caoyang@163.sufe.edu.cn
# https://github.com/iawia002/annie

import os
import re
import json
import requests
from tqdm import tqdm


class BiliBiliCrawler(object):

    def __init__(self) -> None:
        self.user_agent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0'
        self.video_webpage_link = 'https://www.bilibili.com/video/{}'.format
        self.video_detail_api = 'https://api.bilibili.com/x/player/pagelist?bvid={}&jsonp=jsonp'.format
        self.video_playurl_api = 'https://api.bilibili.com/x/player/playurl?cid={}&bvid={}&qn=64&type=&otype=json'.format
        self.episode_playurl_api = 'https://api.bilibili.com/pgc/player/web/playurl?ep_id={}&jsonp=jsonp'.format
        self.episode_webpage_link = 'https://www.bilibili.com/bangumi/play/ep{}'.format
        self.anime_webpage_link = 'https://www.bilibili.com/bangumi/play/ss{}'.format
        self.chunk_size = 1024
        self.regexs = {
            'host': r'https://(.*\.com)',
            'episode_name': r'meta name="keywords" content="(.*?)"',
            'initial_state': r'<script>window.__INITIAL_STATE__=(.*?);',
            'playinfo': r'<script>window.*?__playinfo__=(.*?)</script>',
        }

    def easy_download_video(self, bvid, save_path=None) -> bool:
        """Tricky method with available api"""
        # Request for detail information of video
        response = requests.get(self.video_detail_api(bvid), headers={'User-Agent': self.user_agent})
        json_response = response.json()
        cid = json_response['data'][0]['cid']
        video_title = json_response['data'][0]['part']
        if save_path is None:
            save_path = f'{video_title}.mp4'
        print(f'Video title: {video_title}')

        # Request for playurl and size of video
        response = requests.get(self.video_playurl_api(cid, bvid), headers={'User-Agent': self.user_agent})
        json_response = response.json()
        video_playurl = json_response['data']['durl'][0]['url']
        # video_playurl = json_response['data']['durl'][0]['backup_url'][0]
        video_size = json_response['data']['durl'][0]['size']
        total = video_size // self.chunk_size
        print(f'Video size: {video_size}')

        # Download video
        headers = {
            'User-Agent': self.user_agent,
            'Origin': 'https://www.bilibili.com',
            'Referer': 'https://www.bilibili.com',
        }
        headers['Host'] = re.findall(self.regexs['host'], video_playurl, re.I)[0]
        headers['Range'] = f'bytes=0-{video_size}'
        response = requests.get(video_playurl, headers=headers, stream=True, verify=False)
        tqdm_bar = tqdm(response.iter_content(self.chunk_size), desc='Download process', total=total)
        with open(save_path, 'wb') as f:
            for byte in tqdm_bar:
                f.write(byte)
        return True

    def easy_download_episode(self, epid, save_path=None) -> bool:
        """Tricky method with available api"""
        # Request for playurl and size of episode
        # temp_headers = {
        #     "Host": "api.bilibili.com",
        #     "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:99.0) Gecko/20100101 Firefox/99.0",
        #     "Accept": "application/json, text/plain, */*",
        #     "Accept-Language": "zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2",
        #     "Accept-Encoding": "gzip, deflate, br",
        #     "Referer": "https://www.bilibili.com/bangumi/play/ep234407?spm_id_from=333.337.0.0",
        #     "Origin": "https://www.bilibili.com",
        #     "Connection": "keep-alive",
        #     "Cookie": "innersign=0; buvid3=3D8F234E-5DAF-B5BD-1A26-C7CDE57C21B155047infoc; i-wanna-go-back=-1; b_ut=7; b_lsid=1047C7449_1808035E0D6; _uuid=A4884E3F-BF68-310101-E5E6-10EBFDBCC10CA456283infoc; buvid_fp=82c49016c72d24614786e2a9e883f994; buvid4=247E3498-6553-51E8-EB96-C147A773B34357718-022050123-7//HOhRX5o4Xun7E1GZ2Vg%3D%3D; fingerprint=1b7ad7a26a4a90ff38c80c37007d4612; sid=jilve18q; buvid_fp_plain=undefined; SESSDATA=f1edfaf9%2C1666970475%2Cf281c%2A51; bili_jct=de9bcc8a41300ac37d770bca4de101a8; DedeUserID=130321232; DedeUserID__ckMd5=42d02c72aa29553d; nostalgia_conf=-1; CURRENT_BLACKGAP=1; CURRENT_FNVAL=4048; CURRENT_QUALITY=0; rpdid=|(u~||~uukl)0J'uYluRu)l|J",
        #     "Sec-Fetch-Dest": "empty",
        #     "Sec-Fetch-Mode": "cors",
        #     "Sec-Fetch-Site": "same-site",
        #     "TE": "trailers",
        # }
        # response = requests.get(self.episode_playurl_api(epid), headers=temp_headers)
        # 2022/05/01 23:31:08 The above is the download path with a premium account; it can fetch premium-only episodes
        response = requests.get(self.episode_playurl_api(epid))
        json_response = response.json()
        # episode_playurl = json_response['result']['durl'][0]['url']
        episode_playurl = json_response['result']['durl'][0]['backup_url'][0]
        episode_size = json_response['result']['durl'][0]['size']
        total = episode_size // self.chunk_size
        print(f'Episode size: {episode_size}')

        # Download episode
        # 2022/05/01 23:31:41 Premium members had better add the cookie below; I am not sure whether it still works without it
        headers = {
            'User-Agent': self.user_agent,
            'Origin': 'https://www.bilibili.com',
            'Referer': 'https://www.bilibili.com',
            # 'Cookie': "innersign=0; buvid3=3D8F234E-5DAF-B5BD-1A26-C7CDE57C21B155047infoc; i-wanna-go-back=-1; b_ut=7; b_lsid=1047C7449_1808035E0D6; _uuid=A4884E3F-BF68-310101-E5E6-10EBFDBCC10CA456283infoc; buvid_fp=82c49016c72d24614786e2a9e883f994; buvid4=247E3498-6553-51E8-EB96-C147A773B34357718-022050123-7//HOhRX5o4Xun7E1GZ2Vg%3D%3D; fingerprint=1b7ad7a26a4a90ff38c80c37007d4612; sid=jilve18q; buvid_fp_plain=undefined; SESSDATA=f1edfaf9%2C1666970475%2Cf281c%2A51; bili_jct=de9bcc8a41300ac37d770bca4de101a8; DedeUserID=130321232; DedeUserID__ckMd5=42d02c72aa29553d; nostalgia_conf=-1; CURRENT_BLACKGAP=1; CURRENT_FNVAL=4048; CURRENT_QUALITY=0; rpdid=|(u~||~uukl)0J'uYluRu)l|J",
        }
        headers['Host'] = re.findall(self.regexs['host'], episode_playurl, re.I)[0]
        headers['Range'] = f'bytes=0-{episode_size}'
        response = requests.get(episode_playurl, headers=headers, stream=True, verify=False)
        tqdm_bar = tqdm(response.iter_content(self.chunk_size), desc='Download process', total=total)
        if save_path is None:
            save_path = f'ep{epid}.mp4'
        with open(save_path, 'wb') as f:
            for byte in tqdm_bar:
                f.write(byte)
        return True

    def download(self, bvid, video_save_path=None, audio_save_path=None) -> bool:
        """General method by parsing page source"""
        if video_save_path is None:
            video_save_path = f'{bvid}.m4s'
        if audio_save_path is None:
            audio_save_path = f'{bvid}.mp3'

        common_headers = {
            'Accept': '*/*',
            'Accept-Encoding': 'gzip, deflate, br',
            'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8',
            'Cache-Control': 'no-cache',
            'Origin': 'https://www.bilibili.com',
            'Pragma': 'no-cache',
            'Host': 'www.bilibili.com',
            'User-Agent': self.user_agent,
        }

        # In fact we only need bvid
        # Each episode of an anime also has a bvid and a corresponding bvid-URL which is redirected to another episode link
        # e.g. https://www.bilibili.com/video/BV1rK4y1b7TZ is redirected to https://www.bilibili.com/bangumi/play/ep322903
        response = requests.get(self.video_webpage_link(bvid), headers=common_headers)
        html = response.text
        playinfos = re.findall(self.regexs['playinfo'], html, re.S)
        if not playinfos:
            raise Exception(f'No playinfo found in bvid {bvid}\nPerhaps VIP required')
        playinfo = json.loads(playinfos[0])

        # There exists four different URLs with observations as below
        # `baseUrl` is the same as `base_url` with string value
        # `backupUrl` is the same as `backup_url` with array value
        # Here hard code is employed to select playurl
        def _select_video_playurl(_videoinfo):
            if 'backupUrl' in _videoinfo:
                return _videoinfo['backupUrl'][-1]
            if 'backup_url' in _videoinfo:
                return _videoinfo['backup_url'][-1]
            if 'baseUrl' in _videoinfo:
                return _videoinfo['baseUrl']
            if 'base_url' in _videoinfo:
                return _videoinfo['base_url']
            raise Exception(f'No video URL found\n{_videoinfo}')

        def _select_audio_playurl(_audioinfo):
            if 'backupUrl' in _audioinfo:
                return _audioinfo['backupUrl'][-1]
            if 'backup_url' in _audioinfo:
                return _audioinfo['backup_url'][-1]
            if 'baseUrl' in _audioinfo:
                return _audioinfo['baseUrl']
            if 'base_url' in _audioinfo:
                return _audioinfo['base_url']
            raise Exception(f'No audio URL found\n{_audioinfo}')

        # with open(f'playinfo-{bvid}.js', 'w') as f:
        #     json.dump(playinfo, f)

        if 'durl' in playinfo['data']:
            video_playurl = playinfo['data']['durl'][0]['url']
            # video_playurl = playinfo['data']['durl'][0]['backup_url'][1]
            print(video_playurl)
            video_size = playinfo['data']['durl'][0]['size']
            total = video_size // self.chunk_size
            print(f'Video size: {video_size}')
            headers = {
                'User-Agent': self.user_agent,
                'Origin': 'https://www.bilibili.com',
                'Referer': 'https://www.bilibili.com',
            }
            headers['Host'] = re.findall(self.regexs['host'], video_playurl, re.I)[0]
            headers['Range'] = f'bytes=0-{video_size}'
            # headers['Range'] = f'bytes={video_size + 1}-{video_size + video_size + 1}'
            response = requests.get(video_playurl, headers=headers, stream=True, verify=False)
            tqdm_bar = tqdm(response.iter_content(self.chunk_size), desc='Download process', total=total)
            with open(video_save_path, 'wb') as f:
                for byte in tqdm_bar:
                    f.write(byte)
            return True
        elif 'dash' in playinfo['data']:
            videoinfo = playinfo['data']['dash']['video'][0]
            audioinfo = playinfo['data']['dash']['audio'][0]
            video_playurl = _select_video_playurl(videoinfo)
            audio_playurl = _select_audio_playurl(audioinfo)
        else:
            raise Exception(f'No data found in playinfo\n{playinfo}')

        # First make a fake request to get the `Content-Range` params in response headers
        fake_headers = {
            'Accept': '*/*',
            'Accept-Language': 'zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2',
            'Accept-Encoding': 'gzip, deflate, br',
            'Cache-Control': 'no-cache',
            'Origin': 'https://www.bilibili.com',
            'Pragma': 'no-cache',
            'Range': 'bytes=0-299',
            'Referer': self.video_webpage_link(bvid),
            'User-Agent': self.user_agent,
            'Connection': 'keep-alive',
        }
        response = requests.get(video_playurl, headers=fake_headers, stream=True)
        video_size = int(response.headers['Content-Range'].split('/')[-1])
        total = video_size // self.chunk_size

        # Next make a real request to download full video
        real_headers = {
            'Accept': '*/*',
            'Accept-Language': 'zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2',
            'Accept-Encoding': 'gzip, deflate, br',
            'Cache-Control': 'no-cache',
            'Origin': 'https://www.bilibili.com',
            'Pragma': 'no-cache',
            'Range': f'bytes=0-{video_size}',
            'Referer': self.video_webpage_link(bvid),
            'User-Agent': self.user_agent,
            'Connection': 'keep-alive',
        }
        response = requests.get(video_playurl, headers=real_headers, stream=True)
        tqdm_bar = tqdm(response.iter_content(self.chunk_size), desc='Download video', total=total)
        with open(video_save_path, 'wb') as f:
            for byte in tqdm_bar:
                f.write(byte)

        # The same way for downloading audio
        response = requests.get(audio_playurl, headers=fake_headers, stream=True)
        audio_size = int(response.headers['Content-Range'].split('/')[-1])
        total = audio_size // self.chunk_size // 2

        # Confusingly downloading full audio at one time is forbidden
        # We have to download audio in two parts
        with open(audio_save_path, 'wb') as f:
            audio_part = 0
            for (_from, _to) in [[0, audio_size // 2], [audio_size // 2 + 1, audio_size]]:
                headers = {
                    'Accept': '*/*',
                    'Accept-Language': 'zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2',
                    'Accept-Encoding': 'gzip, deflate, br',
                    'Cache-Control': 'no-cache',
                    'Origin': 'https://www.bilibili.com',
                    'Pragma': 'no-cache',
                    'Range': f'bytes={_from}-{_to}',
                    'Referer': self.video_webpage_link(bvid),
                    'User-Agent': self.user_agent,
                    'Connection': 'keep-alive',
                }
                audio_part += 1
                response = requests.get(audio_playurl, headers=headers, stream=True)
                tqdm_bar = tqdm(response.iter_content(self.chunk_size), desc=f'Download audio part{audio_part}', total=total)
                for byte in tqdm_bar:
                    f.write(byte)
        return True

    def easy_download(self, url) -> bool:
        """
        Download with page URL as below:
        >>> url = 'https://www.bilibili.com/video/BV1jf4y1h73r'
        >>> url = 'https://www.bilibili.com/bangumi/play/ep399420'
        """
        headers = {
            'Accept': '*/*',
            'Accept-Encoding': 'gzip, deflate, br',
            'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8',
            'Cache-Control': 'no-cache',
            'Origin': 'https://www.bilibili.com',
            'Pragma': 'no-cache',
            'Host': 'www.bilibili.com',
            'User-Agent': self.user_agent,
        }
        response = requests.get(url, headers=headers)
        html = response.text
        initial_states = re.findall(self.regexs['initial_state'], html, re.S)
        if not initial_states:
            raise Exception('No initial states found in page source')
        initial_state = json.loads(initial_states[0])

        episode_list = initial_state.get('epList')
        if episode_list is not None:
            # Download anime with several episodes
            name = re.findall(self.regexs['episode_name'], html, re.S)[0].strip()
            for episode in episode_list:
                if episode['badge'] != '会员':  # No VIP required
                    if not os.path.exists(name):
                        os.mkdir(name)
                    self.download(
                        bvid=str(episode['bvid']),
                        video_save_path=os.path.join(name, episode['titleFormat'] + episode['longTitle'] + '.m4s'),
                        audio_save_path=os.path.join(name, episode['titleFormat'] + episode['longTitle'] + '.mp3'),
                    )
                else:  # Unable to download VIP anime
                    continue
        else:
            # Download common videos
            video_data = initial_state['videoData']
            name = video_data['tname'].strip()
            if not os.path.exists(name):
                os.mkdir(name)
            self.download(
                # Assumes videoData carries a 'bvid' field; `episode` is not defined in this branch
                bvid=str(video_data['bvid']),
                video_save_path=os.path.join(name, video_data['title'] + '.m4s'),
                audio_save_path=os.path.join(name, video_data['title'] + '.mp3'),
            )
        return True


if __name__ == '__main__':
    bb = BiliBiliCrawler()
    # bb.easy_download_video('BV14T4y1u7ST', 'temp/BV14T4y1u7ST.mp4')
    # bb.easy_download_video('BV1z5411W7tX', 'temp/BV1z5411W7tX.mp4')
    # bb.easy_download_video('BV1HX4y1T7Bz', 'temp/BV1HX4y1T7Bz.mp4')
    bb.easy_download_episode('234407', 'temp/ep234407.mp4')
    # bb.easy_download_episode('321808', 'temp/ep321808.mp4')
    # bb.download('BV1PT4y137CA')
    # bb.download('BV14T4y1u7ST')
    # bb.easy_download('https://www.bilibili.com/video/BV1jf4y1h73r')
    # bb.easy_download('https://www.bilibili.com/bangumi/play/ep399420')
    # bb.easy_download('https://www.bilibili.com/bangumi/play/ss12548/')
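
A side note on the Range headers used by the crawler, which also fetches the audio track in two halves. HTTP byte ranges are zero-based and inclusive, so a file of `size` bytes ends at offset `size - 1`; servers treat an end offset past the end of the file as "up to the last byte", which is presumably why `bytes=0-{size}` works in practice. Likewise, the chunk count for a progress bar is the ceiling of size / chunk_size rather than the floor. The sketch below is not part of the crawler, and the helper names split_byte_ranges and chunk_count are hypothetical; it computes exact, non-overlapping ranges for any number of parts.

```python
def split_byte_ranges(size, parts=2):
    """Split `size` bytes into `parts` contiguous, inclusive (start, end) ranges."""
    if size <= 0 or parts <= 0:
        raise ValueError('size and parts must be positive')
    parts = min(parts, size)            # never produce empty ranges
    base, extra = divmod(size, parts)   # the first `extra` parts get one more byte
    ranges, start = [], 0
    for i in range(parts):
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length - 1))
        start += length
    return ranges


def chunk_count(size, chunk_size=1024):
    """Number of chunks needed to cover `size` bytes (ceiling division)."""
    return -(-size // chunk_size)


for start, end in split_byte_ranges(2049, parts=2):
    print({'Range': f'bytes={start}-{end}'})
print(chunk_count(2049))
```

For a 2049-byte file this yields the ranges 0-1024 and 1025-2048, and a chunk count of 3 at the default chunk size of 1024.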

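Back to the main topic. Taking Episode 10 of Kaguya-sama: Love Is War Season 3 as an example (the latest episode as of this post; Bilibili has only reached Episode 7), here is a brief description of how to download videos from the 蚂蚁Tube animation section: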

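  1. Open the video link: https://mayitube.com/v_O5vUcy/9

  2. Press F12 to open the developer tools and refresh the page. Under the Network tab, filter for XHR requests. Be sure to find a GET request whose initiator is hls.js (the position marked by the red arrow in the figure below), and look at the URL in its response (the red box on the right of the figure).

  3. Open that URL in a new tab:

  4. Copy the content of that page into the corresponding place in the code below (lines 7-290):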

    # -*- coding: UTF-8 -*-
    # @author: caoyang
    # @email: caoyang@163.sufe.edu.cn

    import requests

    string = """#EXTM3U
    #EXT-X-VERSION:3
    #EXT-X-TARGETDURATION:11
    #EXT-X-PLAYLIST-TYPE:VOD
    #EXT-X-MEDIA-SEQUENCE:0
    #EXTINF:10.474,
    https://server3.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL0licGJGWFlvLnRz.ts
    #EXTINF:10.428,
    https://server4.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL291RW5WV1I1LnRz.ts
    #EXTINF:10.428,
    https://server6.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvZk1WYTZFYnoudHM=.ts
    #EXTINF:10.428,
    https://server7.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvTUFkZWM2ODMudHM=.ts
    #EXTINF:10.428,
    https://server6.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzLzRqWm9laWxkLnRz.ts
    #EXTINF:10.428,
    https://server3.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL0RLdUpObnA1LnRz.ts
    #EXTINF:10.428,
    https://server2.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvajFNeE9ndmUudHM=.ts
    #EXTINF:10.428,
    https://server5.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvV0VrNnU3M3YudHM=.ts
    #EXTINF:10.428,
    https://server8.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL3hsY2hBZW1LLnRz.ts
    #EXTINF:10.428,
    https://server6.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzLzVzZktXOTF5LnRz.ts
    #EXTINF:10.428,
    https://server6.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvVm5PM3RYbHEudHM=.ts
    #EXTINF:10.428,
    https://server4.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvcXhkMUlDS3cudHM=.ts
    #EXTINF:10.428,
    https://server5.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL0tzb1VjaE4xLnRz.ts
    #EXTINF:10.428,
    https://server8.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL2tNZzVFd1FILnRz.ts
    #EXTINF:10.428,
    https://server8.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvaFN6SWdtZEMudHM=.ts
    #EXTINF:10.428,
    https://server7.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvRDY2anZBOXkudHM=.ts
    #EXTINF:10.428,
    https://server3.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL1JKVE5zZ1JoLnRz.ts
    #EXTINF:10.428,
    https://server4.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzLzZjTUExVkFiLnRz.ts
    #EXTINF:10.428,
    https://server3.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvVnRTZUVwY08udHM=.ts
    #EXTINF:10.428,
    https://server6.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvUEMyZzRWblIudHM=.ts
    #EXTINF:10.428,
    https://server2.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL0xZUWk3c3pxLnRz.ts
    #EXTINF:10.428,
    https://server8.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL0JWMmVVblJzLnRz.ts
    #EXTINF:10.428,
    https://server5.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvNVBxOHB5c0kudHM=.ts
    #EXTINF:10.428,
    https://server5.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvNnMyYVl0cE8udHM=.ts
    #EXTINF:10.428,
    https://server7.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzLzdYN21jMlM0LnRz.ts
    #EXTINF:10.428,
    https://server2.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzLzhlMWJGMzQ5LnRz.ts
    #EXTINF:10.428,
    https://server6.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvb0ZPYnhEZjkudHM=.ts
    #EXTINF:10.428,
    https://server2.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMveDI0UG5rV1oudHM=.ts
    #EXTINF:10.428,
    https://server4.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL0xhYkJzRWoyLnRz.ts
    #EXTINF:10.428,
    https://server2.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL2hEOFJBN0lRLnRz.ts
    #EXTINF:10.428,
    https://server4.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvaGxmdTZwSGMudHM=.ts
    #EXTINF:10.428,
    https://server5.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvZ21MdjRUajkudHM=.ts
    #EXTINF:10.428,
    https://server4.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL2NFVnRldlh1LnRz.ts
    #EXTINF:10.428,
    https://server7.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL1FvS2VNblNlLnRz.ts
    #EXTINF:10.428,
    https://server6.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvN1N6N25kU0wudHM=.ts
    #EXTINF:10.428,
    https://server5.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvMGhTQ1NvMjUudHM=.ts
    #EXTINF:10.428,
    https://server3.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL2V0cDAwb1BpLnRz.ts
    #EXTINF:10.428,
    https://server4.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL2R4MmZlS3RRLnRz.ts
    #EXTINF:10.428,
    https://server3.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvcWhSdzRzR3IudHM=.ts
    #EXTINF:10.428,
    https://server3.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvMjkwdzBkRkYudHM=.ts
    #EXTINF:10.428,
    https://server6.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL2RyS29PVzNULnRz.ts
    #EXTINF:10.428,
    https://server7.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL0Q0ZU93THJDLnRz.ts
    #EXTINF:10.428,
    https://server6.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvOVRtSXFBYTcudHM=.ts
    #EXTINF:10.428,
    https://server2.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvTXQ1QW05V2MudHM=.ts
    #EXTINF:10.428,
    https://server4.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL0FQYmpuNlhjLnRz.ts
    #EXTINF:10.428,
    https://server2.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzLzV6c0lIbTNoLnRz.ts
    #EXTINF:10.428,
    https://server4.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvZkZJb0FHdXEudHM=.ts
    #EXTINF:10.428,
    https://server3.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvOEI4MzdwRzUudHM=.ts
    #EXTINF:10.428,
    https://server3.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL21wUjgxbzZQLnRz.ts
    #EXTINF:10.428,
    https://server3.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzLzdhT1BlQXFNLnRz.ts
    #EXTINF:10.428,
    https://server8.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvVE9xR1BlZUgudHM=.ts
    #EXTINF:10.428,
    https://server2.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvSmtXNmF4Rm0udHM=.ts
    #EXTINF:10.428,
    https://server8.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzLzUxOFJQT1JDLnRz.ts
    #EXTINF:10.428,
    https://server4.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL1JQQjhiMG9KLnRz.ts
    #EXTINF:10.428,
    https://server5.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvQ1BJOGx3V1AudHM=.ts
    #EXTINF:10.428,
    https://server6.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvS2dGVUdTdW8udHM=.ts
    #EXTINF:10.428,
    https://server3.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL3NWYnMwRUNhLnRz.ts
    #EXTINF:10.428,
    https://server2.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL1pEZG1MWVBlLnRz.ts
    #EXTINF:10.428,
    https://server3.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvUTBwTkFBb2cudHM=.ts
    #EXTINF:10.428,
    https://server7.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvaXpCM3ZrcVMudHM=.ts
    #EXTINF:10.428,
    https://server3.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL09BUmRZRGtNLnRz.ts
    #EXTINF:10.428,
    https://server8.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL3h5WTRWNkR0LnRz.ts
    #EXTINF:10.428,
    https://server6.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvWVNHYTNPdjQudHM=.ts
    #EXTINF:10.428,
    https://server4.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvMzRhNmNmZEcudHM=.ts
    #EXTINF:10.428,
    https://server4.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzLzdIbXVTbXhDLnRz.ts
    #EXTINF:10.428,
    https://server8.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL0VNZGN1S0p4LnRz.ts
    #EXTINF:10.428,
    https://server5.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvaUtoUkxwNmwudHM=.ts
    #EXTINF:10.428,
    https://server2.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvaFBtMXMzWHUudHM=.ts
    #EXTINF:10.428,
    https://server5.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL2pLOXQ5ckpkLnRz.ts
    #EXTINF:10.428,
    https://server5.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzLzhOVlQzWk9jLnRz.ts
    #EXTINF:10.428,
    https://server6.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvc3BxN2dGdlMudHM=.ts
    #EXTINF:10.428,
    https://server4.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvWWM2cm8xbVIudHM=.ts
    #EXTINF:10.428,
    https://server2.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL3JpaDBha0ZPLnRz.ts
    #EXTINF:10.428,
    https://server5.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL3RPMXp5NVo2LnRz.ts
    #EXTINF:10.428,
    https://server2.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvc092Vko5TWMudHM=.ts
    #EXTINF:10.428,
    https://server8.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvWWJOQVlNQXcudHM=.ts
    #EXTINF:10.428,
    https://server7.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL2xxSG05cmhDLnRz.ts
    #EXTINF:10.428,
    https://server3.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzLzZ4dE44TER0LnRz.ts
    #EXTINF:10.428,
    https://server2.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvUmg3Z2pkSTAudHM=.ts
    #EXTINF:10.428,
    https://server7.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvRHJKTkJETG4udHM=.ts
    #EXTINF:10.428,
    https://server5.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzLzExVHRqVE96LnRz.ts
    #EXTINF:10.428,
    https://server7.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL0FZeUp4M093LnRz.ts
    #EXTINF:10.428,
    https://server7.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvbEtzSE9qTGgudHM=.ts
    #EXTINF:10.428,
    https://server4.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvc2xMa09uZTYudHM=.ts
    #EXTINF:10.428,
    https://server3.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL2QzTFVhakFmLnRz.ts
    #EXTINF:10.428,
    https://server2.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL1R4MkRCRXhRLnRz.ts
    #EXTINF:10.428,
    https://server7.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMva2tqMjlwUEQudHM=.ts
    #EXTINF:10.428,
    https://server3.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvQnJQNFBSYUEudHM=.ts
    #EXTINF:10.428,
    https://server6.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL3h6ZTh4QTlHLnRz.ts
    #EXTINF:10.428,
    https://server8.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL2JWMHhqUUJBLnRz.ts
    #EXTINF:10.428,
    https://server2.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvUUR5eG1IdmUudHM=.ts
    #EXTINF:10.428,
    https://server6.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvaDZLUURKdEUudHM=.ts
    #EXTINF:10.428,
    https://server5.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL3BPNFBMRktZLnRz.ts
    #EXTINF:10.428,
    https://server3.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL2lIeFBZdFJSLnRz.ts
    #EXTINF:10.428,
    https://server5.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvQXk3dU9MREwudHM=.ts
    #EXTINF:10.428,
    https://server5.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvYTJQUVkxU00udHM=.ts
    #EXTINF:10.428,
    https://server8.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL0FZb1B0blM1LnRz.ts
    #EXTINF:10.428,
    https://server3.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL1c1ME9Ba1RFLnRz.ts
    #EXTINF:10.428,
    https://server6.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvSW5mekVTSmYudHM=.ts
    #EXTINF:10.428,
    https://server3.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvUDVNU2c3UkgudHM=.ts
    #EXTINF:10.428,
    https://server2.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL3R2N0hIano2LnRz.ts
    #EXTINF:10.428,
    https://server2.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL3o4U2tPbFRLLnRz.ts
    #EXTINF:10.428,
    https://server5.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvWmJjcWpZWDIudHM=.ts
    #EXTINF:10.428,
    https://server4.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvT25kRENlY2IudHM=.ts
    #EXTINF:10.428,
    https://server3.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL25mSFpCWnN0LnRz.ts
    #EXTINF:10.428,
    https://server7.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL2VlTTR0TW91LnRz.ts
    #EXTINF:10.428,
    https://server7.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvWkxodEc0c2EudHM=.ts
    #EXTINF:10.428,
    https://server7.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvSGZjcktLc1EudHM=.ts
    #EXTINF:10.428,
    https://server2.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzLzhvN1NHQ3VULnRz.ts
    #EXTINF:10.428,
    https://server2.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL2d1bWJTWnBYLnRz.ts
    #EXTINF:10.428,
    https://server6.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvUGRaRVFRSXQudHM=.ts
    #EXTINF:10.428,
    https://server3.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvMGZVN1pUVFMudHM=.ts
    #EXTINF:10.428,
    https://server6.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL3RFSURHOFNoLnRz.ts
    #EXTINF:10.428,
    https://server6.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL0RnTXNnZHFGLnRz.ts
    #EXTINF:10.428,
    https://server8.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvNGYydUU5OTQudHM=.ts
    #EXTINF:10.428,
    https://server7.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvUmw0c3dYZjgudHM=.ts
    #EXTINF:10.428,
    https://server3.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL1NDWkZzaXZILnRz.ts
    #EXTINF:10.428,
    https://server2.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL0RVTElFT2g4LnRz.ts
    #EXTINF:10.428,
    https://server7.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvZG5TbjVqS2cudHM=.ts
    #EXTINF:10.428,
    https://server8.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvQXVUaFNJZnIudHM=.ts
    #EXTINF:10.428,
    https://server5.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL2I2aWQwTDVuLnRz.ts
    #EXTINF:10.428,
    https://server5.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL0RNR1dUZ3lvLnRz.ts
    #EXTINF:10.428,
    https://server3.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvaVg0Y21wRGIudHM=.ts
    #EXTINF:10.428,
    https://server3.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvbnNpYzJFcEcudHM=.ts
    #EXTINF:10.428,
    https://server7.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL0JYNkd6RzFpLnRz.ts
    #EXTINF:10.428,
    https://server5.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzLzhzZkUyTm4xLnRz.ts
    #EXTINF:10.428,
    https://server3.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvYTNjcWpnTEIudHM=.ts
    #EXTINF:10.428,
    https://server2.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvYjZ4TmRxUWwudHM=.ts
    #EXTINF:10.428,
    https://server5.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL21tMXQxQmtlLnRz.ts
    #EXTINF:10.428,
    https://server4.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL1JDQWN6bUNrLnRz.ts
    #EXTINF:10.428,
    https://server5.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvZmozd01LSHIudHM=.ts
    #EXTINF:10.428,
    https://server4.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvTVVLZHQ1Z1MudHM=.ts
    #EXTINF:10.428,
    https://server5.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL01QSk1Ja2NWLnRz.ts
    #EXTINF:10.428,
    https://server7.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL2lNUU5ORnY0LnRz.ts
    #EXTINF:10.428,
    https://server2.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvOHF6cDZRVDAudHM=.ts
    #EXTINF:10.428,
    https://server2.mayitube.com/video_source/aHR0cHM6Ly92LmR1Ym9rdS5jby8yMDIyMDYxMS9mVzhnVnlCQy9obHMvdG92bVkzN1kudHM=.ts
    #EXTINF:10.428,
    https://server2.mayitube.com/video_source/aHR0cHM6Ly90cy52Ym9rdS5jb20vMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL0pva1RIM3diLnRz.ts
    #EXTINF:10.428,
    https://server5.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5uZXQvMjAyMjA2MTEvZlc4Z1Z5QkMvaGxzL2FFa1N5d3ROLnRz.ts
    #EXTINF:1.252,
    https://server4.mayitube.com/video_source/aHR0cHM6Ly93LmR1Ym9rdS5tZS8yMDIyMDYxMS9mVzhnVnlCQy9obHMvVUZPRjRIWWcudHM=.ts
    #EXT-X-ENDLIST"""

    # Utility function that converts a raw request-header string into a dict
    def f(headers):
        new_headers = {}
        for _line in headers.splitlines():
            # Split on the first colon only, since header values may contain colons
            key, value = _line.split(':', 1)
            new_headers[key.strip()] = value.strip()
        return new_headers

    headers = f("""Host: server8.mayitube.com
    User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:101.0) Gecko/20100101 Firefox/101.0
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8
    Accept-Language: zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2
    Accept-Encoding: gzip, deflate, br
    Connection: keep-alive
    Cookie: _ga_NPW9Q6R88F=GS1.1.1655300402.2.1.1655300758.0; _ga=GA1.1.1556103515.1655291716; fpestid=OSMZ1DfZDFyJ5jijIXL1GWjkbvAbAsrZTrP91V-uIE25Nkd0zrEwhdZ0T9KIlsMTX1md5Q; _clck=1cp07q7|1|f2c|0; _clsk=1vuu0zr|1655300760540|4|1|f.clarity.ms/collect; bfp_sn_rf_8b2087b102c9e3e5ffed1c1478ed8b78=Direct; bfp_sn_rt_8b2087b102c9e3e5ffed1c1478ed8b78=1655293718453; bafp=1cadc1a0-ec9e-11ec-a6f4-25fd0ad0ea7e
    Upgrade-Insecure-Requests: 1
    Sec-Fetch-Dest: document
    Sec-Fetch-Mode: navigate
    Sec-Fetch-Site: none
    Sec-Fetch-User: ?1
    TE: trailers""")

    # Append every segment to video.ts in playlist order
    with open('video.ts', 'wb') as video_file:
        count = 0
        for line in string.splitlines():
            if line.startswith('http'):
                url = line.strip()
                count += 1
                print(_id, count, url)
                # Requests to these servers fail often, so retry until one succeeds
                while True:
                    try:
                        response = requests.get(url, headers=headers, timeout=60)
                        break
                    except requests.RequestException:
                        print('error')
                video_file.write(response.content)
    
  5. The video will be downloaded to the file video.ts.

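A brief explanation of the download logic: everything hinges on the page content from step 3. That page contains a number of URLs, each corresponding to roughly 10 seconds of video, and all we have to do is write the response bytes of every URL into video.ts, in order. Requests to these URLs fail fairly often (which is also why watching in the browser keeps crashing), so the code retries each failed request until it succeeds.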
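The playlist text pairs each `#EXTINF:` duration line with the segment URL on the following line. A minimal sketch of pulling those pairs out is given here; the function name `parse_playlist` and the `example.com` URLs are hypothetical, for illustration only, and the parsing covers only the simple layout seen in this playlist rather than the full HLS format.

```python
def parse_playlist(text):
    """Return a list of (duration_seconds, url) pairs from simple m3u8 text."""
    segments = []
    duration = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith('#EXTINF:'):
            # '#EXTINF:10.428,' -> 10.428
            duration = float(line[len('#EXTINF:'):].split(',', 1)[0])
        elif line.startswith('http'):
            segments.append((duration, line))
            duration = None
    return segments


if __name__ == '__main__':
    sample = """#EXTM3U
#EXTINF:10.428,
https://example.com/a.ts
#EXTINF:1.252,
https://example.com/b.ts
#EXT-X-ENDLIST"""
    parsed = parse_playlist(sample)
    print(len(parsed), 'segments')
    print(sum(d for d, _ in parsed), 'seconds in total')
```

Summing the durations gives a quick sanity check that the whole episode is listed before a long download is started.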

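As for playing ts videos: I use PotPlayer (strongly recommended, it is a very nice player), which plays ts files directly. If you want to convert the file to a regular mp4, I suggest looking for another method.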
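One commonly used option for the mp4 conversion, which the original write-up did not test, is to remux the file with ffmpeg. The sketch below assumes ffmpeg is installed and on the PATH; the helper names `build_remux_command` and `remux` are made up for illustration. Whether a plain stream copy is enough depends on the codecs in the downloaded segments.

```python
import shutil
import subprocess


def build_remux_command(src='video.ts', dst='video.mp4'):
    """Build an ffmpeg command that copies the streams into an mp4 container."""
    # '-c copy' copies the audio/video streams without re-encoding them
    return ['ffmpeg', '-i', src, '-c', 'copy', dst]


def remux(src='video.ts', dst='video.mp4'):
    """Run the remux; raises if ffmpeg is missing or exits with an error."""
    if shutil.which('ffmpeg') is None:
        raise RuntimeError('ffmpeg was not found on PATH')
    subprocess.run(build_remux_command(src, dst), check=True)
```

Calling `remux()` in the directory that holds video.ts would then attempt to produce video.mp4 next to it.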

In my tests, the method carries over to other anime series.

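Some may find this still too complicated. Did you notice the GET request in step 2 (highlighted in red) whose initiator is hls.js? If we knew how that request's URL is obtained, the download could be fully automated. I did spend ten-odd minutes reading hls.js (you can find it by filtering for JS requests in the Network tab), but it is too long (about 15,000 lines), and I could not work out how the file ID in the request URL is constructed. It is certainly possible to set breakpoints in the browser and debug it, but that is too much effort, so I stopped here for now and do not plan to dig deeper (mainly because, in the time I spent reading the JS, episodes 7-10 of Kaguya-sama had already finished downloading, so why bother reading the code).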

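The download speed is not great, but it is not terribly slow either. For batch downloads, I suggest rewriting the script to use multiple processes, or simply opening several windows and running it in each at the same time.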
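A minimal sketch of a concurrent variant follows. It swaps in a thread pool rather than multiple processes, since the work is network-bound; that is a substitution, not what the original script does. The names `download_segments` and `save_video` and the injectable `fetch` callable are hypothetical. `ThreadPoolExecutor.map` returns results in the order of the input URLs, so the segments still land in the file in playlist order.

```python
from concurrent.futures import ThreadPoolExecutor


def download_segments(urls, fetch, max_workers=4):
    """Fetch all URLs concurrently and return their bytes in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the same order as `urls`,
        # regardless of which request finishes first
        return list(pool.map(fetch, urls))


def save_video(urls, fetch, path='video.ts', max_workers=4):
    """Download every segment and append them to `path` in playlist order."""
    with open(path, 'wb') as video_file:
        for chunk in download_segments(urls, fetch, max_workers):
            video_file.write(chunk)
```

With the script above, `fetch` could be something like `lambda url: requests.get(url, headers=headers, timeout=60).content`, wrapped in whatever retry logic is preferred. Note that this sketch holds all segments in memory before writing, which is fine for a single episode but may need rethinking for larger batches.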

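In short, if no other source is available, I consider the animation section of 蚂蚁Tube (MayiTube) a workable stopgap. The downloaded videos are watermarked, but if it works, it works; what more do you want?!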

That wraps up this throwaway project. Pen down.
