shoeb370/Ipi-Real-Estate-Agent-Scraper-Git-Hub-Repository

🏠 IPI Real Estate Agent Contact Scraper (Python)

A Python-based web scraping project to extract Belgian real estate agent contact details from the official IPI website (https://www.ipi.be/agent-immobilier).

This project demonstrates AJAX reverse‑engineering, dynamic URL generation, JSON‑LD parsing, and clean data export to Excel.


🚀 Features

  • 🔍 Extract agent BIV (contact) numbers via a hidden AJAX endpoint
  • 🔗 Generate SEO‑friendly agent profile URLs
  • 🧠 Parse structured JSON‑LD data
  • 📞 Scrape email, phone, website & Google Maps links
  • 📦 Export clean data to Excel (.xlsx)

📂 Project Structure

ipi-real-estate-scraper/
│
├── extract_contact_id.py      # Main scraping script
├── constants.py               # Headers, cookies & base URLs
├── ipi_contact_details.xlsx   # Output file (generated)
├── requirements.txt           # Python dependencies
└── README.md                  # Project documentation

🧰 Tech Stack

  • Python 3.x
  • requests
  • BeautifulSoup4
  • pandas
  • JSON / Regex / unicodedata

⚙️ Installation

Clone the repository:

git clone https://github.com/your-username/ipi-real-estate-scraper.git
cd ipi-real-estate-scraper

Install dependencies:

pip install -r requirements.txt

▶️ Usage

Run the scraper:

python extract_contact_id.py

After execution, you’ll get:

ipi_contact_details.xlsx

Containing:

  • Agent Name
  • Profile URL
  • Email
  • Phone Number
  • Website
  • Address details
  • Google Maps link
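As a rough illustration of how one row of that sheet could be assembled, the sketch below flattens a scraped record into the columns listed above. The helper names (`to_row`, `export_rows`) are hypothetical, and the input keys (`name`, `email`, `telephone`, `url`, `address`) are an assumption based on common schema.org properties; the actual script may use different field names.

```python
from typing import Any

# Column names mirror the list above; the exact headers in the real file may differ.
COLUMNS = [
    "Agent Name", "Profile URL", "Email", "Phone Number",
    "Website", "Address", "Google Maps Link",
]


def to_row(profile_url: str, data: dict[str, Any], maps_link: str = "") -> dict[str, str]:
    """Flatten one scraped record (assumed schema.org-style keys) into a sheet row."""
    address = data.get("address") or {}
    if not isinstance(address, dict):
        # Some pages may expose the address as a plain string.
        address = {"streetAddress": str(address)}
    parts = [
        address.get("streetAddress"),
        address.get("postalCode"),
        address.get("addressLocality"),
    ]
    return {
        "Agent Name": str(data.get("name", "")),
        "Profile URL": profile_url,
        "Email": str(data.get("email", "")),
        "Phone Number": str(data.get("telephone", "")),
        "Website": str(data.get("url", "")),
        "Address": ", ".join(str(p) for p in parts if p),
        "Google Maps Link": maps_link,
    }


def export_rows(rows: list[dict[str, str]], path: str = "ipi_contact_details.xlsx") -> None:
    """Write rows to Excel; needs pandas plus an Excel writer such as openpyxl."""
    import pandas as pd

    pd.DataFrame(rows, columns=COLUMNS).to_excel(path, index=False)
```

Missing fields fall back to empty strings, so every row has the same columns even when a profile page omits a value.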

🧠 How It Works (High Level)

  1. Sends a POST request to IPI’s internal AJAX endpoint
  2. Extracts agent names & BIV numbers
  3. Slugifies names to generate profile URLs
  4. Scrapes each profile page
  5. Extracts data from JSON‑LD + HTML
  6. Saves structured output to Excel
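Step 3 can be sketched as follows. This is a minimal, assumption-laden example rather than the repository's actual code: the function names are illustrative, and the URL pattern (slug followed by the BIV number under the base URL) is a guess at how profile links are formed.

```python
import re
import unicodedata

# Base URL taken from the project description; the path pattern below is assumed.
BASE_URL = "https://www.ipi.be/agent-immobilier"


def slugify(name: str) -> str:
    """Lowercase, strip accents, and join alphanumeric runs with hyphens."""
    # NFKD splits accented letters into a base letter plus combining marks.
    normalized = unicodedata.normalize("NFKD", name)
    # Dropping non-ASCII bytes removes the combining marks and keeps the base letter.
    ascii_only = normalized.encode("ascii", "ignore").decode("ascii")
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_only.lower())
    return slug.strip("-")


def profile_url(name: str, biv_number: str) -> str:
    """Build a profile URL from a name and BIV number (hypothetical pattern)."""
    return f"{BASE_URL}/{slugify(name)}-{biv_number}"
```

Normalizing before the regex step is what keeps the slug accent-safe: a name such as "Hélène" becomes "helene" instead of losing its accented letters entirely.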

🚧 Challenges Solved

  ✔ AJAX-based data loading
  ✔ Accent-safe slug generation
  ✔ Mixed JSON + HTML parsing
  ✔ Error handling & fallbacks
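The mixed JSON + HTML parsing with fallbacks might look something like the sketch below. It uses only the standard library (regex and `json`) so it stays self-contained; the real script may rely on BeautifulSoup instead. The function names and the `email` key are assumptions, not confirmed details of the IPI pages.

```python
import json
import re

# Matches <script type="application/ld+json"> ... </script> blocks.
JSON_LD_RE = re.compile(
    r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.IGNORECASE | re.DOTALL,
)
# Fallback: the first mailto: link in the raw HTML.
MAILTO_RE = re.compile(r'href=["\']mailto:([^"\'?]+)', re.IGNORECASE)


def extract_json_ld(html: str) -> list[dict]:
    """Return every JSON-LD object on the page, skipping malformed blocks."""
    blocks: list[dict] = []
    for raw in JSON_LD_RE.findall(html):
        try:
            parsed = json.loads(raw.strip())
        except json.JSONDecodeError:
            continue  # tolerate broken markup instead of aborting the run
        items = parsed if isinstance(parsed, list) else [parsed]
        blocks.extend(item for item in items if isinstance(item, dict))
    return blocks


def extract_email(html: str) -> str:
    """Prefer the JSON-LD email; fall back to a mailto: link; else empty string."""
    for block in extract_json_ld(html):
        email = block.get("email")
        if isinstance(email, str) and email:
            return email.removeprefix("mailto:")
    match = MAILTO_RE.search(html)
    return match.group(1) if match else ""
```

Returning an empty string rather than raising keeps a single malformed profile from stopping a long scrape; the gap can then be spotted in the exported sheet.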


📌 Use Cases

  • Real estate lead generation
  • Market & competitor research
  • CRM data enrichment
  • Python automation demos

📖 Related Article

📝 Medium walkthrough: https://nagmanahid27.medium.com/how-i-scraped-belgian-real-estate-agent-contact-details-from-ipi-be-using-python-19b415ca7ca3


⚠️ Disclaimer

This project is for educational purposes only. Always review and comply with a website’s robots.txt and terms of service before scraping.


🤝 Let’s Connect

If you’re looking for help with web scraping, Python automation, or data extraction, feel free to connect on LinkedIn or Upwork.

⭐ If this repo helped you, don’t forget to star it!
