Bug #59

open

Middleware: External Links

Added by Lane Shaw over 5 years ago.

Status: New
Priority: Low
Assignee:
Start date:
Due date:
% Done: 0%
Estimated time:

Description

In previous crawls, no external links were successfully saved to any database. The ProductWebExternalLinkMiddleware is currently designed to act as a replacement for Scrapy's OffsiteMiddleware, even though the two technically sit in the same queue, one after the other. The statistics are being updated, but it is unclear by which middleware (possibly both, in the worst case).

Fix: It would probably be best to use ProductWebExternalLinkMiddleware only to record external links, instead of having it update statistics and try to drop requests itself, and to pass the requests on to the built-in OffsiteMiddleware, which would handle the stats and the drops.
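
A minimal sketch of what the proposed fix could look like, assuming the middleware is a Scrapy spider middleware and is ordered so that it sees spider output before the built-in OffsiteMiddleware. The class name ExternalLinkRecorder, the in-memory external_links list (a placeholder for the real database write), and the duck-typed request check are all hypothetical and not taken from the project code. Only process_spider_output and the spider's allowed_domains attribute are existing Scrapy conventions.

```python
from urllib.parse import urlparse


class ExternalLinkRecorder:
    """Hypothetical stand-in for ProductWebExternalLinkMiddleware.

    Records external links only. It never drops a request and never
    touches crawl statistics; both are left to OffsiteMiddleware.
    """

    def __init__(self):
        # Assumption: placeholder for the project's real database write.
        self.external_links = []

    @staticmethod
    def _is_external(url, allowed_domains):
        host = (urlparse(url).hostname or "").lower()
        if not host:
            return False  # no hostname to compare, so nothing to record
        # A host is internal if it equals an allowed domain or is a subdomain.
        return not any(
            host == domain.lower() or host.endswith("." + domain.lower())
            for domain in allowed_domains
        )

    def process_spider_output(self, response, result, spider):
        allowed = getattr(spider, "allowed_domains", None) or []
        for entry in result:
            # Assumption: duck-typed check for request-like objects;
            # real code would likely use isinstance(entry, scrapy.Request).
            is_request = hasattr(entry, "url") and hasattr(entry, "callback")
            if is_request and allowed and self._is_external(entry.url, allowed):
                self.external_links.append((response.url, entry.url))
            # Always pass the entry on unchanged.
            yield entry
```

Because every entry is yielded unchanged, offsite requests would then be dropped and counted by OffsiteMiddleware alone, which should remove the ambiguity about which middleware updates the statistics.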
