Scraping Qunar data with Python etree

Scraping Qunar data

#!/usr/bin/env python
# encoding: utf-8
"""
@author: owen.cai
@contact: 1181698715@qq.com
@file: qunarspider.py
@time: 2019/9/30 15:01
"""
import pymysql
from lxml import etree


class qunaer(object):
    def __init__(self):
        # MySQL connection settings (replace with your own configuration)
        self.mysql_info = {
            'host': 'localhost',
            'port': 3306,
            'user': 'root',
            'password': '123456',
            'db': 'test',
            'charset': 'utf8',
            'createdbsql': '''create table if not exists test.qunar(time varchar(50), title varchar(50))''',
        }
        print(self.mysql_info['host'])
        # Paginated listing URL; {0} is filled with the page number.
        # The original left this commented out and used the bare list.htm URL,
        # so url.format(i) fetched the same page on every iteration.
        self.url = 'http://travel.qunar.com/travelbook/list.htm?page={0}&order=hot_heat'

    def mysql_(self, sql, params=None):
        # Open a database connection
        db = pymysql.connect(
            host=self.mysql_info['host'],
            port=self.mysql_info['port'],
            user=self.mysql_info['user'],
            password=self.mysql_info['password'],
            db=self.mysql_info['db'],
            charset=self.mysql_info['charset'],
        )
        try:
            # Create a cursor, execute the statement, and commit
            with db.cursor() as cursor:
                cursor.execute(sql, params)
            db.commit()
        finally:
            db.close()

    def parse(self, url):
        # Fetch the page and parse it with lxml's HTML parser
        response = etree.parse(url, etree.HTMLParser())
        times = response.xpath("//span[@class='days']/text()")
        titles = response.xpath("//h2/a/text()")
        for time, title in zip(times, titles):
            print(time, title)
            # Titles the author skips
            if title in ("@王鋆鋆［OCT主题乐园3日游］It's Show Time五彩缤纷周末乐悠游",
                         "拾童心去珠海长隆海洋王国-邂逅一场神奇的海洋奇缘VS看一场马戏新巨创《龙秀?》",
                         "俯天津之眼?，童年动物园?，民国特色馆?游海洋公园?天津亲子3日游?"):
                continue
            # Pass values as parameters so quotes in a title cannot break the SQL
            self.mysql_('insert into test.qunar values (%s, %s)', (str(time), str(title)))


if __name__ == "__main__":
    spider = qunaer()
    # Create the target table if it does not exist yet
    spider.mysql_(spider.mysql_info['createdbsql'])
    for i in range(1, 201):
        print("Page {i} started".format(i=i))
        spider.parse(spider.url.format(i))
    # Next step: multithreading
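The script's closing comment names multithreading as the next step. One possible way to do that, sketched here with the standard library's `concurrent.futures`, is to hand each page URL to a thread pool. The helper `crawl_pages`, its parameters, and the stub `fake_parse` are hypothetical names introduced for illustration and are not part of the original script; `parse_page` stands for any callable that takes a URL, such as the spider's `parse` method.

```python
from concurrent.futures import ThreadPoolExecutor

# Paginated listing URL taken from the script; {0} is the page number
URL_TEMPLATE = 'http://travel.qunar.com/travelbook/list.htm?page={0}&order=hot_heat'


def crawl_pages(parse_page, first_page, last_page, max_workers=8):
    """Call parse_page(url) for every page number using a thread pool.

    Returns a dict mapping each page number to the callable's return
    value, or to the exception it raised.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit one task per page and remember which future belongs to which page
        futures = {
            page: pool.submit(parse_page, URL_TEMPLATE.format(page))
            for page in range(first_page, last_page + 1)
        }
        for page, future in futures.items():
            try:
                results[page] = future.result()
            except Exception as exc:  # one failing page should not stop the rest
                results[page] = exc
    return results


if __name__ == "__main__":
    # Stub parser so the sketch runs without network access
    def fake_parse(url):
        return len(url)

    print(crawl_pages(fake_parse, 1, 3))
```

Because `mysql_` opens its own connection on every call, each worker thread would use a separate connection rather than sharing one. A modest `max_workers` value also keeps the request rate against the site reasonable.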

The scraped data is stored in MySQL.
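When storing scraped titles, passing the values as query parameters rather than formatting them into the SQL string avoids quoting problems with titles that contain quotation marks. The sketch below illustrates the idea with the standard library's `sqlite3` as a stand-in for MySQL, so it runs without a database server; this is a swap for illustration, not the setup used in the script. The helper name `save_rows` and the sample row are made up. Note that `sqlite3` uses `?` as its placeholder, while PyMySQL uses `%s`.

```python
import sqlite3


def save_rows(conn, rows):
    """Insert (time, title) pairs using placeholders instead of string formatting."""
    conn.execute(
        "create table if not exists qunar (time varchar(50), title varchar(50))"
    )
    # The driver escapes each value, so quotes inside a title are stored as-is
    conn.executemany("insert into qunar values (?, ?)", rows)
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Made-up sample row containing both single and double quotes
    save_rows(conn, [("3 days", 'It\'s "Show Time"')])
    print(conn.execute("select time, title from qunar").fetchall())
    conn.close()
```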
