This project addresses two main challenges using Airbnb data:
- Sentiment Analysis: Classifying Airbnb reviews as positive or negative to help hosts, guests, and platforms gain insights into user experiences.
- Price Prediction: Predicting the nightly price of Airbnb listings based on features such as location, amenities, and review sentiment.
The project is divided into two main parts: sentiment analysis and price prediction.
For sentiment analysis, we leverage transfer learning by pretraining models on publicly available datasets:
- IMDB Movie Reviews – clear polarity-labeled text data
This approach enables better generalization and more effective sentiment classification on Airbnb reviews.
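The pretrain-then-fine-tune idea can be illustrated with a minimal sketch. It assumes scikit-learn is installed and that a small set of labeled Airbnb reviews is available for the second stage; the toy texts, labels, and variable names below are hypothetical stand-ins, not project data, and the notebooks may do this differently.

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

# Toy stand-ins for the real corpora (hypothetical, for illustration only).
imdb_texts = [
    "a wonderful and moving film",
    "great acting and a great story",
    "terrible plot and awful pacing",
    "boring, bad, and far too long",
]
imdb_labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative

airbnb_texts = [
    "clean apartment and a wonderful host",
    "great location, very comfortable stay",
    "dirty room and awful smell",
    "bad communication, terrible check-in",
]
airbnb_labels = [1, 1, 0, 0]

# HashingVectorizer is stateless, so both domains share one feature space
# without having to fit a vocabulary on either corpus.
vectorizer = HashingVectorizer(n_features=2**16, alternate_sign=False)

clf = SGDClassifier(random_state=0)

# Stage 1: pretrain on the source domain (IMDB).
# The full set of classes must be declared on the first partial_fit call.
clf.partial_fit(vectorizer.transform(imdb_texts), imdb_labels, classes=[0, 1])

# Stage 2: continue training (fine-tune) on the target domain (Airbnb).
for _ in range(5):
    clf.partial_fit(vectorizer.transform(airbnb_texts), airbnb_labels)

new_reviews = ["wonderful host and great location", "awful, dirty, terrible stay"]
predictions = clf.predict(vectorizer.transform(new_reviews))
```

`partial_fit` updates the existing weights instead of restarting, which is what lets the second stage build on what was learned from IMDB.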
For price prediction, we use machine learning regression models to predict the nightly price of Airbnb listings. Features include:
- Listing characteristics (location, number of rooms, amenities, etc.)
- Aggregated sentiment scores from guest reviews
This enables more accurate and data-driven pricing strategies for hosts and platforms.
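As a rough illustration of how review sentiment might become a listing-level feature, the sketch below averages per-review predictions for each listing and joins the result onto listing records. The field names (`listing_id`, `bedrooms`, `num_amenities`) and the neutral default of 0.5 for listings without reviews are assumptions made for this example, not taken from the project.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-review predictions: (listing_id, sentiment),
# where 1 = positive and 0 = negative.
review_sentiments = [
    ("A", 1), ("A", 1), ("A", 0),
    ("B", 0), ("B", 0),
]

# Hypothetical listing records; the field names are illustrative only.
listings = [
    {"listing_id": "A", "bedrooms": 2, "num_amenities": 12},
    {"listing_id": "B", "bedrooms": 1, "num_amenities": 5},
    {"listing_id": "C", "bedrooms": 3, "num_amenities": 20},  # no reviews yet
]


def aggregate_sentiment(pairs):
    """Return the mean sentiment per listing (share of positive reviews)."""
    by_listing = defaultdict(list)
    for listing_id, sentiment in pairs:
        by_listing[listing_id].append(sentiment)
    return {lid: mean(values) for lid, values in by_listing.items()}


def build_feature_rows(listing_records, scores, default=0.5):
    """Build one numeric feature row per listing.

    Listings without reviews get `default` (an assumed neutral score).
    """
    rows = []
    for listing in listing_records:
        score = scores.get(listing["listing_id"], default)
        rows.append([listing["bedrooms"], listing["num_amenities"], score])
    return rows


scores = aggregate_sentiment(review_sentiments)
feature_rows = build_feature_rows(listings, scores)
```

Rows in this shape could then be passed, together with the observed nightly prices, to any regression model.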
- Data Preprocessing: Cleaning and transforming text and tabular data.
- Model Training:
  - Sentiment Analysis: Models such as SGDClassifier and Logistic Regression, trained on IMDB and fine-tuned on Airbnb reviews.
  - Price Prediction: Regression models (e.g., Linear Regression, Random Forest) trained on Airbnb listing data.
- Evaluation: Accuracy and F1-score for sentiment classification; RMSE and MAE for price prediction.
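For reference, the four evaluation metrics named above can be written out in plain Python. This is a minimal sketch of the standard definitions with made-up example values; the project itself may rely on library implementations instead.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the target (price)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error; penalizes large errors more than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Made-up example values, for illustration only.
sentiment_true = [1, 0, 1, 1, 0]
sentiment_pred = [1, 0, 0, 1, 1]
price_true = [100.0, 150.0, 200.0]
price_pred = [110.0, 140.0, 230.0]

sentiment_accuracy = accuracy(sentiment_true, sentiment_pred)
sentiment_f1 = f1_score(sentiment_true, sentiment_pred)
price_mae = mae(price_true, price_pred)
price_rmse = rmse(price_true, price_pred)
```

RMSE is always at least as large as MAE on the same predictions, so a wide gap between the two points to a few listings with large pricing errors.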
- Price Prediction/ – Contains the files and notebook to predict nightly prices based on listing features
- Sentiment Analysis/ – Contains the files and notebook to classify Airbnb reviews as positive or negative
This project is being developed by:
