A web scraping project built using Scrapy to extract data from https://www.dlapiperdataprotection.com.
This project is a solution to the Upwork job below.
```bash
git clone https://github.com/msenior85/dla_piper.git
cd dla_piper
```

If you are on Linux or macOS:

```bash
curl -Ls https://astral.sh/uv/install.sh | sh
```

Or using Homebrew (macOS):

```bash
brew install astral-sh/tap/uv
```

If you are on Windows:

```powershell
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
```

Run the spider:

```bash
uv run scrapy crawl laws
```

Run the spider and write the results to a CSV file:

```bash
uv run scrapy crawl laws -O laws.csv
```

The spider will generate a CSV file (laws.csv) in the root of the project.
Update the middlewares.py file to use your own proxy string, or disable the middleware if you don't have one.

```python
import os


class CustomProxyMiddleware:
    def __init__(self):
        # Read the proxy URL from the environment; use your own proxy string
        self.proxy = os.getenv("proxy_us")

    def process_request(self, request, spider):
        # Route every outgoing request through the configured proxy
        request.meta["proxy"] = self.proxy
```

Alternatively, if you have no proxy, disable the proxy middleware by commenting out the line below in the settings.py file:
```python
DOWNLOADER_MIDDLEWARES = {
    "dla_piper.middlewares.CustomProxyMiddleware": 543,  # comment out this line
}
```

| country | description | last_modified |
|---|---|---|
| Algeria | Law No. 18-07 of 10 June 2018 on protection of natural persons in personal data processing (“Law No. 18-07”). | 20 January 2025 |
| Armenia | Personal Data Protection Law as of 18.05.2015, number ՀՕ-49-Ն. | 28 January 2025 |
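
As a rough illustration of how the generated file might be consumed, the sketch below parses rows shaped like the sample above using only the Python standard library. It assumes the CSV header matches the three column names shown (country, description, last_modified) and that dates follow the "20 January 2025" pattern seen in the sample rows. The `load_laws` helper is hypothetical and not part of this project, and the inline descriptions are shortened stand-ins for the real values.

```python
import csv
import io
from datetime import datetime


def load_laws(csv_file):
    """Parse rows from an open laws.csv-style file into dictionaries.

    Hypothetical helper: assumes the columns country, description and
    last_modified, with dates formatted like "20 January 2025".
    """
    rows = []
    for row in csv.DictReader(csv_file):
        # Convert the date text into a datetime.date so rows can be compared
        row["last_modified"] = datetime.strptime(
            row["last_modified"].strip(), "%d %B %Y"
        ).date()
        rows.append(row)
    return rows


# Inline sample with shortened descriptions, so the sketch runs without the
# real file. For the real output you would instead use something like:
#     with open("laws.csv", newline="", encoding="utf-8") as f:
#         laws = load_laws(f)
sample = io.StringIO(
    "country,description,last_modified\n"
    "Algeria,Law No. 18-07 of 10 June 2018,20 January 2025\n"
    "Armenia,Personal Data Protection Law,28 January 2025\n"
)

laws = load_laws(sample)

# Find the most recently modified entry
latest = max(laws, key=lambda r: r["last_modified"])
print(latest["country"], latest["last_modified"].isoformat())
```

Parsing `last_modified` into a date object, rather than keeping it as text, is what makes sorting and comparisons such as `max` behave chronologically. If the site uses a different date format for some countries, `strptime` will raise a `ValueError` and the format string would need adjusting.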
MIT License – see LICENSE for details.
