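I'm writing a Scrapy spider that has to send POST requests in a loop to move to the next page, but it only sends one POST request. The "currentPage" entry of the querystring changes for each page, so I have to change the value of this key for every page and send the POST again. As I said, though, it stops after the first POST request.

import scrapy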

headers = {
    'accept': "*/*",
    'origin': "http://www.**********.com",
    'x-requested-with': "XMLHttpRequest",
    'user-agent': "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36",
    'referer': "http://www.**********.com/venta/",
    'accept-encoding': "gzip, deflate",
    'accept-language': "en-US,en;q=0.8,es;q=0.6",
    'cookie': "G_ENABLED_IDPS=google; cookieInterestedProject=416; visid_incap_434661=wjkf7tU+QPKDjpmWXz/BKSBz+1kAAAAAQUIPAAAAAAA7bs2fXOSL0JmeVSXo337M; incap_ses_223_434661=9zyRHwEdwGxtE8Ly00EYAxQw/VkAAAAAq7gkFJrJjsGdCgrRTwOfvg==; s_vnum=1512243236606%26vn%3D3; __utmz=other; s_cm=Natural%20Searchwww.google.com.co; s_v10=%5B%5B%27Natural%2520Search%27%2C%271509651236616%27%5D%2C%5B%27Natural%2520Search%27%2C%271509651249121%27%5D%2C%5B%27Natural%2520Search%27%2C%271509765142570%27%5D%2C%5B%27Natural%2520Search%27%2C%271509765184463%27%5D%5D; s_v8=%5B%5B%27natural%2520search%253A%2520google%253A%2520keyword%2520unavailable%27%2C%271509651236618%27%5D%2C%5B%27natural%2520search%253A%2520google%253A%2520keyword%2520unavailable%27%2C%271509651249123%27%5D%2C%5B%27natural%2520search%253A%2520google%253A%2520keyword%2520unavailable%27%2C%271509765142572%27%5D%2C%5B%27natural%2520search%253A%2520google%253A%2520keyword%2520unavailable%27%2C%271509765184465%27%5D%5D; ; s_cc=true; _ga=GA1.2.701497075.1509651237; _gid=GA1.2.1068485902.1509765143; NSC_nfuspdvbesbep-wt=ffffffff0975c87745525d5f4f58455e445a4a4229a2; OX_sd=1; OX_plg=pm; gpv_pn=metrocuadrado%3A%20buscar%3A%20resultados%20inmuebles%3A%20nuevo%20y%20usado; s_invisit=true; s_nr=1509765213941-Repeat; s_lv=1509765213944; s_lv_s=Less%20than%207%20days; s_sq=eltiempometrocuadradoprod%2Celtiempoglobal%3D%2526pid%253Dmetrocuadrado%25253A%252520buscar%25253A%252520resultados%252520inmuebles%25253A%252520nuevo%252520y%252520usado%2526pidt%253D1%2526oid%253Dhttp%25253A%25252F%25252Fwww.metrocuadrado.com%25252Fventa%25252F%252523%2526ot%253DA; madicionales=; mbarrio=; mciudad=; mgrupo=; mgrupoid=; mnrobanos=; mnrocuartos=; mnrogarajes=; msector=; mubicacion=; mvalorarriendo=; mzona=; orderBy=; selectedLocationCategory=; selectedLocationFilter=; sortType=; writtenFilters=mnrogarajes%3Bmnrobanos%3Bmnrocuartos%3Bmtiempoconstruido%3Bmarea%3Bmvalorarriendo%3Bmvalorventa%3Bmciudad%3Bmubicacion%3Bmtiponegocio%3Bmtipoinmueble%3Bmzona%3Bmsector%3Bmbarrio%3BselectedLocationCategory%3BselectedLocationFilter%3Bmestadoinmueble%3Bmadicionales%3BorderBy%3BsortType%3Bmestadoinmueble%3BcompanyType%3BcompanyName%3Bmidempresa%3Bmgrupo%3Bmgrupoid%3B; m2-srv=ffffffff0975c82e45525d5f4f58455e445a4a4229a2; mtiponegocio=venta; mtipoinmueble=; mvalorventa=; marea=; mtiempoconstruido=; companyType=; companyName=; midempresa=; mestadoinmueble=",
    'cache-control': "no-cache",
    'postman-token': "2e5f00b9-7c7c-32ed-1bdd-63cf2fed3cd8"
}

querystring = {
    "": "", "mnrogarajes": "", "mnrobanos": "", "mnrocuartos": "",
    "mtiempoconstruido": "", "marea": "", "mvalorarriendo": "", "mvalorventa": "",
    "mciudad": "", "mubicacion": "", "mtiponegocio": "venta", "mtipoinmueble": "",
    "mzona": "", "msector": "", "mbarrio": "",
    "selectedLocationCategory": "", "selectedLocationFilter": "",
    "mestadoinmueble": "", "madicionales": "", "orderBy": "", "sortType": "",
    "companyType": "", "companyName": "", "midempresa": "", "mgrupo": "", "mgrupoid": "",
    "currentPage": "2", "totalPropertiesCount": "115747",
    "totalUsedPropertiesCount": "113926", "totalNewPropertiesCount": "1821", "sfh": "1"
}

url = 'http://www.*******.com/search/list/ajax'

num = 0

class HouseseSpider(scrapy.Spider):
    name = "hoimom"
    start_urls = ['http://www.********.com/venta/']

    def parse(self,response):
        for num in range(2,100):
            for href in response.xpath('.//a[@class="data-details-id" and @itemprop="url"]/@href').extract():
                yield scrapy.Request(url = href ,callback = self.parsei)
            querystring["currentPage"] = str(num)
            yield scrapy.Request(url = 'http://www.*********.com/search/list/ajax',method="POST",headers=headers,meta=querystring)

    def parsei(self, response):
        yield {
            'latitude': response.xpath('//input[@id="latitude"]/@value').extract(),
            'longitud': response.xpath('//input[@id="longitude"]/@value').extract(),
            'precio de arriendo': response.xpath('.//dl/dt[h3/text()="Valor de arriendo"]/following-sibling::dd[1]/h4/text()').extract_first(),
            'precio de venta': response.xpath('.//dl/dt[h3/text()="Valor de venta"]/following-sibling::dd[1]/h4/text()').extract_first(),
            'Barrio_com': response.xpath('.//dl/dt[h3/text()="Nombre común del barrio "]/following-sibling::dd[1]/h4/text()').extract_first(),
            'Barrio_cat': response.xpath('.//dl/dt[h3/text()="Nombre del barrio catastral"]/following-sibling::dd[1]/h4/text()').extract_first(),
            'Estrato': response.xpath('.//dl/dt[h3/text()="Estrato"]/following-sibling::dd[1]/h4/text()').extract_first(),
            'id': response.xpath('//input[@id="propertyId"]/@value').extract_first(),
            'Habitaciones': response.xpath('.//dl/dt[h3/text()="Habitaciones"]/following-sibling::dd[1]/h4/text()').extract_first(),
            'Parqueadero': response.xpath('.//dl/dt[h3/text()="Parqueadero"]/following-sibling::dd[1]/h4/text()').extract_first(),
            'Tipo de calentador': response.xpath('.//dl/dt[h3/text()="Tipo de calentador"]/following-sibling::dd[1]/h4/text()').extract_first(),
            'Cuarto de servicio': response.xpath('.//dl/dt[h3/text()="Cuarto de servicio"]/following-sibling::dd[1]/h4/text()').extract_first(),
            'Tipo de acabado piso': response.xpath('.//dl/dt[h3/text()="Tipo de acabado piso"]/following-sibling::dd[1]/h4/text()').extract_first(),
            'Area_Cons': response.xpath('.//dl/dt[h3/text()="Área construida"]/following-sibling::dd[1]/h4/text()').extract_first()
        }
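
One likely explanation, assuming Scrapy's default duplicate filter is enabled: every POST in the loop goes to the same URL with the same method and an empty body, because the dict is passed as meta (which is only carried along to the callback) rather than sent as form data. Requests that look identical are dropped after the first one. The sketch below uses only the standard library to illustrate the idea. toy_fingerprint is a simplified stand-in and not Scrapy's real fingerprint code, and AJAX_URL, BASE_FORM and form_for_page are hypothetical names introduced for the illustration.

```python
import hashlib
from urllib.parse import urlencode

# Placeholder host; the real one is redacted in the post.
AJAX_URL = "http://www.example.com/search/list/ajax"

# Trimmed-down stand-in for the `querystring` dict from the post.
BASE_FORM = {"mtiponegocio": "venta", "currentPage": "2", "sfh": "1"}


def toy_fingerprint(method, url, body=b""):
    """Simplified duplicate-filter key built from method, URL and body.

    This is an illustration, not Scrapy's implementation. Note that
    request meta plays no part in the key.
    """
    digest = hashlib.sha1()
    for part in (method.encode(), url.encode(), body):
        digest.update(part)
        digest.update(b"\x00")  # separator between the hashed parts
    return digest.hexdigest()


def form_for_page(page):
    """Return a fresh copy of the form with currentPage set to `page`."""
    form = dict(BASE_FORM)  # copy, so queued requests do not share one dict
    form["currentPage"] = str(page)
    return form


# As posted: same method, same URL, empty body for every page.
as_posted = {toy_fingerprint("POST", AJAX_URL) for page in range(2, 6)}

# With the page number in the form-encoded body, each request is distinct.
with_body = {
    toy_fingerprint("POST", AJAX_URL, urlencode(form_for_page(page)).encode())
    for page in range(2, 6)
}

print(len(as_posted), len(with_body))  # prints: 1 4
```

In the spider itself, each dict from form_for_page could be passed as formdata to scrapy.FormRequest, together with a callback that extracts the listing links from the AJAX response. Setting dont_filter=True on the original request would also get past the filter, but the server would still receive no "currentPage" value, so sending the form in the body looks like the more useful change.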
