- Find the class name `MeituanPipeline` in pipelines.py.
- In settings.py, enable the `MeituanPipeline` item pipeline (it is registered under `ITEM_PIPELINES`, not as a middleware):

```
ITEM_PIPELINES = {
    'meituan.pipelines.MeituanPipeline': 300,
}
```

- cake.py

```
# -*- coding: utf-8 -*-
import scrapy
from ..items import MeituanItem  # import the Item class; data is passed along through items


class CakeSpider(scrapy.Spider):
    name = 'cake'
    allowed_domains = ['meituan.com']
    start_urls = ['http://i.meituan.com/s/changsha-蛋糕/']

    def parse(self, response):
        title_list = response.xpath('//*[@id="deals"]/dl/dd/dl/dd[1]/a/span[1]/text()').extract()
        money_list = response.xpath('//*[@id="deals"]/dl/dd[1]/dl/dd[2]/dl/dd[1]/a/div/div[2]/div[2]/span[1]/text()').extract()
        for i, j in zip(title_list, money_list):
            # print(i + "-------------" + j)
            mt = MeituanItem()  # instantiate a fresh item for each result
            mt['title'] = i  # hand the data to the item; mt['title'] matches title = scrapy.Field() in items.py
            mt['money'] = j
            yield mt
```

- items.py

```
# -*- coding: utf-8 -*-

# Define here the models for your scraped items
#
# See documentation in:
# https://doc.scrapy.org/en/latest/topics/items.html

import scrapy


class MeituanItem(scrapy.Item):
    # define the fields for your item here like:
    # name = scrapy.Field()
    title = scrapy.Field()
    money = scrapy.Field()
```

- Print in pipelines.py to verify that items reach the pipeline (a sketch of a pipeline that actually saves the items follows after the run command):

```
# -*- coding: utf-8 -*-

# Define your item pipelines here
#
# Don't forget to add your pipeline to the ITEM_PIPELINES setting
# See: https://doc.scrapy.org/en/latest/topics/item-pipeline.html


class MeituanPipeline(object):
    def process_item(self, item, spider):
        print(spider.name)
        return item
```

Run the spider:

```
scrapy crawl cake
```
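Beyond printing, the same `process_item` hook is where persistence usually goes. Below is a minimal, hypothetical sketch (not part of the original tutorial) of a pipeline that writes each item to a JSON Lines file; the class name `JsonWriterPipeline` and the file name `cakes.jl` are assumptions, and it would need its own entry in `ITEM_PIPELINES` to run.

```
# -*- coding: utf-8 -*-
import json


class JsonWriterPipeline(object):
    """Hypothetical example: persist items as JSON Lines instead of printing them."""

    def open_spider(self, spider):
        # Called once when the spider starts; open the output file (file name is an assumption).
        self.file = open('cakes.jl', 'w', encoding='utf-8')

    def close_spider(self, spider):
        # Called once when the spider finishes; release the file handle.
        self.file.close()

    def process_item(self, item, spider):
        # Serialize each MeituanItem as one JSON object per line.
        self.file.write(json.dumps(dict(item), ensure_ascii=False) + '\n')
        return item
```

If used alongside `MeituanPipeline`, it would get its own priority in settings.py, for example `'meituan.pipelines.JsonWriterPipeline': 400`.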