Bug #59
open
Middleware: External Links
Start date:
Due date:
% Done: 0%
Estimated time:
Description
In previous crawls, no external links were successfully saved to any database. ProductWebExternalLinkMiddleware is currently designed to act as a replacement for Scrapy's OffsiteMiddleware, even though both are enabled and run in the same middleware chain, one after the other. The offsite statistics are being updated, but it is unclear by which middleware (possibly both, in the worst case).
Fix: It would probably be best to use ProductWebExternalLinkMiddleware only to record external links, rather than having it update statistics and drop requests itself, and to pass every request on to the built-in OffsiteMiddleware, which handles the stats and the drops.
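A minimal sketch of what the proposed fix could look like, assuming the middleware is a Scrapy spider middleware that sees requests in process_spider_output. The class name ExternalLinkRecorder, the allowed_domains argument, and the in-memory store list are hypothetical stand-ins, not the project's actual code; the real middleware would write to the database instead. Requests are detected by duck typing on a url attribute so the sketch runs without Scrapy installed; in the project, an isinstance check against scrapy.Request would be the more usual choice.

```python
from urllib.parse import urlparse


class ExternalLinkRecorder:
    """Hypothetical record-only middleware: notes offsite URLs, drops nothing."""

    def __init__(self, allowed_domains, store=None):
        # Assumption: allowed domains are passed in directly; the real
        # middleware would likely read them from the spider.
        self.allowed_domains = {d.lower() for d in allowed_domains}
        # Assumption: a plain list stands in for the database.
        self.store = store if store is not None else []

    def is_external(self, url):
        host = (urlparse(url).hostname or "").lower()
        if not host:
            # No host to compare (e.g. a relative URL): do not record it.
            return False
        return not any(
            host == d or host.endswith("." + d) for d in self.allowed_domains
        )

    def process_spider_output(self, response, result, spider):
        for obj in result:
            url = getattr(obj, "url", None)  # items without .url are skipped
            if isinstance(url, str) and self.is_external(url):
                # Record (source page, external link); touch no stats.
                self.store.append((getattr(response, "url", None), url))
            # Always pass the object on; dropping offsite requests and
            # updating offsite stats is left to the built-in OffsiteMiddleware.
            yield obj
```

For this to work, the recorder has to see each request before the built-in filter drops it. In Scrapy versions where OffsiteMiddleware is a spider middleware (order 500 in the default SPIDER_MIDDLEWARES_BASE), that likely means giving the recorder a higher order value, since process_spider_output is called in decreasing order; this should be checked against the Scrapy version in use.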