Lucy's web crawler and web scraper.
Lucy is a tool that presents reliable information about the state of air pollution in a friendly, visual way.
Commands to install Scrapy:
$ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 627220E7
$ echo 'deb http://archive.scrapy.org/ubuntu scrapy main' | sudo tee /etc/apt/sources.list.d/scrapy.list
$ sudo apt-get update && sudo apt-get install scrapy-0.24 python-psycopg2
Install pip:
$ sudo easy_install pip
And then, install Scrapy:
$ sudo pip install scrapy
Run the crawler:
$ env LCDATABASE=yourdb LCUSER=youruser LCPASSWORD=yourpassword LCHOST=yourhost LCPORT=yourport scrapy crawl pollutants
Poor man's scheduler:
# Run the crawl once an hour, forever.
while true; do
    env LCDATABASE=yourdb LCUSER=youruser LCPASSWORD=yourpassword LCHOST=yourhost LCPORT=yourport scrapy crawl pollutants
    sleep 3600
done
Also, you can check Scrapy's documentation.
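The crawl command above passes the database settings through the LC* environment variables. This README does not show how the spider consumes them, so the following is only a minimal sketch of one way to do it, assuming the values end up as keyword arguments for a psycopg2 connection. The helper name `load_db_settings` and the `REQUIRED_VARS` tuple are hypothetical and are not part of this project.

```python
import os

# Variable names taken from the crawl command in this README.
REQUIRED_VARS = ("LCDATABASE", "LCUSER", "LCPASSWORD", "LCHOST", "LCPORT")


def load_db_settings(environ=None):
    """Hypothetical helper: map the LC* variables to connection keyword arguments."""
    environ = os.environ if environ is None else environ

    # Fail early with a readable message if any variable is unset or empty.
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    # Assumption: these keys are meant for psycopg2.connect(**settings).
    return {
        "dbname": environ["LCDATABASE"],
        "user": environ["LCUSER"],
        "password": environ["LCPASSWORD"],
        "host": environ["LCHOST"],
        "port": int(environ["LCPORT"]),
    }
```

With psycopg2 installed, the result could be used as `psycopg2.connect(**load_db_settings())`. Checking the variables up front means that a misconfigured scheduler run stops with a clear error instead of a connection failure partway through the crawl.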