
scheduled task of scrapy crawl did not work

I created a .py file to run scrapy crawl from a script, using the API and the example from the Scrapy documentation. Running this file from a bash command works. I followed the PythonAnywhere instructions to create a .sh file to schedule this run, but nothing happened. Is there something wrong with the bash file or some other part of the setup?

crawl.py file

import scrapy
from scrapy.crawler import CrawlerProcess

class MySpider(scrapy.Spider):
    # Your spider definition
    ...

process = CrawlerProcess()
process.crawl(MySpider)
process.start()

run.sh

#!/bin/bash
cd /home/username/scrapyprojects/projectname
"/home/username/.virtualenvs/virtualenvxx/bin/python" crawl.py

On the scheduled tasks screen, I put this path in the command field: /home/username/scrapyprojects/projectname/run.sh
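One general thing to check (it may not be the cause here): make sure the script is executable, and try running it by hand from a Bash console so any error is visible rather than silently swallowed by the task runner. Something like:

# Make the script executable so it can be run directly
chmod +x /home/username/scrapyprojects/projectname/run.sh

# Run it by hand; any error printed here explains a silent scheduled-task failure
/home/username/scrapyprojects/projectname/run.sh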

Found the reason: the bash file needs to be converted from Windows format (CRLF line endings) to Unix format.
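For anyone else who hits this, one way to do the conversion (assuming the dos2unix utility is available; sed works too):

# Convert CRLF (Windows) line endings to LF (Unix) in place
dos2unix /home/username/scrapyprojects/projectname/run.sh

# Alternative if dos2unix is not installed: strip the trailing carriage returns
sed -i 's/\r$//' /home/username/scrapyprojects/projectname/run.sh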

Glad to hear you worked it out!