This project is a web scraper designed to extract job listings from the Vendr website, specifically targeting various positions in the tech industry. The scraper retrieves data in JSON format and stores it in a PostgreSQL database.
- Fetches job listings for specified positions.
- Constructs URLs to access JSON data.
- Extracts relevant information such as company name, salary details, and job descriptions.
- Stores the extracted data in a PostgreSQL database.
- Utilizes multi-threading for efficient data retrieval.
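
The multi-threaded retrieval listed above could be sketched as follows. This is a minimal illustration, not the project's actual code: `fetch_pages` and its `fetch_page` argument are hypothetical names, and the callable is assumed to download and decode a single page.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_pages(fetch_page, page_numbers, max_workers=8):
    """Call fetch_page(page) for every page number on a thread pool.

    fetch_page is a hypothetical callable supplied by the caller (for
    example, a function that requests one page and returns its JSON).
    Results come back in the same order as page_numbers.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order even when pages finish out of order.
        return list(pool.map(fetch_page, page_numbers))
```

Threads suit this workload because each request spends most of its time waiting on the network. `max_workers=8` is an arbitrary default; lower it if the site starts rejecting requests.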
- Python 3.x
- Libraries:
  - requests
  - BeautifulSoup4
  - psycopg2
  - tqdm
CREATE DATABASE vendr;
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name_company VARCHAR(255),
    min_salary FLOAT,
    median_salary FLOAT,
    max_salary FLOAT,
    describe TEXT
);
export db_password=<your_db_password>
export db_port=<your_db_port>
export db_host=<your_db_host>
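
The exported variables and the `products` table might be used from Python roughly as shown here. This is a sketch under stated assumptions: the helper names (`connection_kwargs`, `connect`, `save_rows`) are hypothetical, the database name `vendr` is taken from the `CREATE DATABASE` statement, and no `user` is passed, so the driver falls back to its default role.

```python
import os

# Column order matches the products table definition (id is generated).
INSERT_SQL = (
    "INSERT INTO products "
    "(name_company, min_salary, median_salary, max_salary, describe) "
    "VALUES (%s, %s, %s, %s, %s)"
)


def connection_kwargs(env=os.environ):
    """Build psycopg2.connect() keyword arguments from the exported variables."""
    return {
        "dbname": "vendr",  # assumption: the database created above
        "password": env["db_password"],
        "host": env["db_host"],
        "port": int(env["db_port"]),
    }


def connect():
    """Open a connection; psycopg2 is imported lazily so the helpers load without it."""
    import psycopg2

    return psycopg2.connect(**connection_kwargs())


def save_rows(conn, rows):
    """Insert (name_company, min, median, max, describe) tuples, then commit."""
    with conn.cursor() as cur:
        cur.executemany(INSERT_SQL, rows)
    conn.commit()
```

A typical call would be `conn = connect()` followed by `save_rows(conn, rows)` and `conn.close()`. The `%s` placeholders let the driver handle quoting, which is safer than formatting values into the SQL string.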
pip install -r requirements.txt
This document outlines the process for extracting data from the Vendr website, specifically focusing on the "Application Development" category.
The target website is structured to provide various job listings and company information. Below are screenshots illustrating the site layout:
We aim to extract all relevant data from the "Application Development" section. The following screenshot highlights this category:
To fetch the necessary data, we need to construct a URL using the following parameters:
- position: This will be set to "DevOps".
- i: This represents the page number to parse.
The URL format is as follows:
{
"currentUrl":"https://www.vendr.com/categories/data-analytics-and-management/big-data?page=2",
"companies":[
{
"id":"82b34605-9ae4-41a4-9c97-4ee36fd3e898",
"slug":"kyligence",
"name":"Kyligence",
"legalName":"Kyligence, Inc.",
"icon":"//backoffice.vendr.com/public-assets/logos/1724519092301/httpssiteprodcdn.kyligence.iowpcontentthemeskyligencethemeimagesheaderfooterlogo.png",
"description":"Kyligence Zen's AI-powered self-service analytics gives you an AI copilot for your data and metrics. Discover new insights and improve decision-making.",
"isVendrVerified":false,
"stats":{
...
- Company name: Kyligence
- Slug used to build the URL: kyligence
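
Pulling the company name and slug out of a response like the one above could look like this. It is a minimal sketch: `extract_companies` is a hypothetical helper, and the sample payload is trimmed to the two fields used here.

```python
import json


def extract_companies(payload):
    """Return (name, slug) pairs from a decoded response with a 'companies' list."""
    return [(c["name"], c["slug"]) for c in payload.get("companies", [])]


# Trimmed copy of the sample response; only the fields read above are kept.
sample = json.loads('{"companies": [{"slug": "kyligence", "name": "Kyligence"}]}')
print(extract_companies(sample))  # [('Kyligence', 'kyligence')]
```

A payload without a `companies` key yields an empty list, so a page past the last one does not raise an error.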

