Update broken links in a MySQL database column with Scrapy

Scrapy version: 0.24

It might be a Python issue, but I’m trying to flag dead links in a MySQL database and the pipeline only updates the first record. There has to be something wrong with the logic somewhere, or I’m missing something silly.

In the spider I set item['is_active'] to 0 if the link is dead. Then, in the database pipeline, I connect to the MySQL database and set the is_active column to 0 for that record.
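To make the intended flow concrete, here is a simplified sketch of the pipeline logic described above. It is not the actual gist. The class name DeadLinkPipeline, the table name links, and the url column are hypothetical, and sqlite3 stands in for MySQL only so the snippet runs on its own. With MySQLdb the placeholder would be %s instead of ?.

```python
import sqlite3  # stand-in for a MySQL driver so the sketch is self-contained


class DeadLinkPipeline(object):
    """Hypothetical pipeline: marks a link inactive when the item is flagged dead."""

    def __init__(self, conn):
        # Any DB-API connection; the real script would pass a MySQL connection.
        self.conn = conn

    def process_item(self, item, spider):
        if item.get('is_active') == 0:
            cur = self.conn.cursor()
            # 'links' and 'url' are assumed names; MySQLdb would use %s here.
            cur.execute("UPDATE links SET is_active = 0 WHERE url = ?",
                        (item['url'],))
            # Commit after every update so each change is persisted.
            self.conn.commit()
            cur.close()
        # Always return the item so later items and pipelines keep flowing.
        return item


# Small demo with three links, two of which are flagged as dead.
conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE links (url TEXT PRIMARY KEY, is_active INTEGER)")
urls = ['http://example.com/a', 'http://example.com/b', 'http://example.com/c']
conn.executemany("INSERT INTO links VALUES (?, 1)", [(u,) for u in urls])

items = [
    {'url': 'http://example.com/a', 'is_active': 0},
    {'url': 'http://example.com/b', 'is_active': 1},
    {'url': 'http://example.com/c', 'is_active': 0},
]

pipeline = DeadLinkPipeline(conn)
for item in items:
    pipeline.process_item(item, spider=None)

rows = conn.execute(
    "SELECT url, is_active FROM links ORDER BY url").fetchall()
```

In this sketch every dead item gets its own UPDATE and its own commit, and the item is returned each time, which is the behaviour I expect from my pipeline for every 404 and not just the first one.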

For some reason it only works for the first “404” URL. After that the spider keeps crawling but no longer updates the database: the is_active column is set to 0 only for the first item whose item['is_active'] was set to 0.

Here is all the code:


Note: I’m using a single-file script because I prefer it that way. If you want me to paste the gist contents here, let me know, but it’s 140 lines.

