

Scrapy error: Unhandled error in Deferred

I'm writing a crawler with the Scrapy framework, but I keep hitting problems and haven't had a single successful run. I'd like to know where I went wrong.

items.py

# -*- coding: utf-8 -*-

# Define here the models for your scraped items
#
# See documentation in:
# http://doc.scrapy.org/en/latest/topics/items.html

import scrapy


class BnuzpjtItem(scrapy.Item):
    # define the fields for your item here like:
    # name = scrapy.Field()

    # announcement/notice title
    title = scrapy.Field()
    # announcement/notice link
    url = scrapy.Field()

pipelines.py

# -*- coding: utf-8 -*-
import codecs
import json

# Define your item pipelines here
#
# Don't forget to add your pipeline to the ITEM_PIPELINES setting
# See: http://doc.scrapy.org/en/latest/topics/item-pipeline.html


class BnuzpjtPipeline(object):
    def __init__(self):
        self.file = codecs.open(r"C:/Users/j/bnuzpjt","wb",encoding = "utf-8")
        
    def process_item(self, item, spider):
        i = json.dumps(dict(item),ensure_ascii=False)
        line = i+'\n'
        # write each item to the json file as one line
        self.file.write(line)
        return item

    def close_spider(self,spider):
        # close the output file
        self.file.close()
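Independent of Scrapy, the JSON-lines serialization this pipeline performs can be exercised on its own with just the standard library (a minimal sketch; the sample item dict below is made up):

```python
import io
import json

def write_items(items, fh):
    """Serialize each item dict as one JSON line, mirroring process_item."""
    for item in items:
        fh.write(json.dumps(item, ensure_ascii=False) + "\n")

# Write to an in-memory buffer instead of a real file
buf = io.StringIO()
write_items([{"title": "公告通知", "url": "/news/1.html"}], buf)
print(buf.getvalue())  # {"title": "公告通知", "url": "/news/1.html"}
```

`ensure_ascii=False` is what keeps the Chinese titles readable in the output file instead of `\uXXXX` escapes.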

settings.py

# -*- coding: utf-8 -*-

# Scrapy settings for bnuzpjt project
#
# For simplicity, this file contains only settings considered important or
# commonly used. You can find more settings consulting the documentation:
#
#     http://doc.scrapy.org/en/latest/topics/settings.html
#     http://scrapy.readthedocs.org/en/latest/topics/downloader-middleware.html
#     http://scrapy.readthedocs.org/en/latest/topics/spider-middleware.html

BOT_NAME = 'bnuzpjt'

SPIDER_MODULES = ['bnuzpjt.spiders']
NEWSPIDER_MODULE = 'bnuzpjt.spiders'


# Disable cookies (enabled by default)
COOKIES_ENABLED = False

# Configure item pipelines
# See http://scrapy.readthedocs.org/en/latest/topics/item-pipeline.html
ITEM_PIPELINES = {
    'bnuzpjt.pipelines.BnuzpjtPipeline': 300,
}

Spider code: bnuzspd.py

# -*- coding: utf-8 -*-
import scrapy
from bnuzpjt.items import BnuzpjtItem
from scrapy.http import Request


class BnuzspdSpider(scrapy.Spider):
    name = 'bnuzspd'
    allowed_domains = ['bnuz.edu.cn']
    start_urls = ['http://bnuz.edu.cn/']

    def parse(self, response):
        item = BnuzpjtItem()
        item["title"] = response.xpath("//span[@class='lefttitle']/text()").extract()
        item["url"] = response.xpath("//ul[@class='leftclick']/li/a/@href").extract()

        yield item
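The two XPath expressions can be sanity-checked offline against a made-up fragment of the page structure. A rough sketch using the standard library's ElementTree, which supports this simple subset of XPath (the markup below is hypothetical, not the real bnuz.edu.cn page):

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment mimicking the structure the spider expects
sample = """<div>
  <span class="lefttitle">Notice title</span>
  <ul class="leftclick">
    <li><a href="/news/1.html">first</a></li>
    <li><a href="/news/2.html">second</a></li>
  </ul>
</div>"""

root = ET.fromstring(sample)

# Rough equivalent of //span[@class='lefttitle']/text()
titles = [s.text for s in root.findall(".//span[@class='lefttitle']")]
# Rough equivalent of //ul[@class='leftclick']/li/a/@href
urls = [a.get("href") for a in root.findall(".//ul[@class='leftclick']/li/a")]

print(titles)  # ['Notice title']
print(urls)    # ['/news/1.html', '/news/2.html']
```

If the selectors return empty lists against the live page, the class names or nesting simply don't match the real HTML.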

After running it I get an error, as shown:

[screenshot: clipboard.png]

What exactly is going wrong here?

Answers
瘋子范

Learn to debug and to read the error output; you can set a breakpoint to see where the program is crashing.
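For example, a misspelled class name raises a NameError that Scrapy surfaces only as "Unhandled error in Deferred"; reproducing it in plain Python shows the underlying message (a hypothetical minimal example, not this project's actual code):

```python
class BnuzpjtItem(dict):
    """Stand-in for the real scrapy.Item subclass."""

msg = ""
try:
    item = BnuzpjdItem()  # misspelled: 'jd' instead of 'jt'
except NameError as e:
    msg = str(e)

print(msg)  # name 'BnuzpjdItem' is not defined
```

Running the spider with `scrapy crawl bnuzspd` and scrolling up in the log past the "Unhandled error in Deferred" line usually reveals the real traceback like this one.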

October 7, 2017, 12:56
氕氘氚

I've run into the same problem. Have you solved it yet?

July 2, 2018, 08:20