This project aims to detect fake news and manipulative marketing content with machine learning. It uses a large real-world dataset and compares the performance of traditional ML models, a transformer-based model, and an ensemble voting approach.
In today's world of digital misinformation, it's critical to identify and combat fake news and paid PR propaganda. This project addresses the issue by applying multiple classification models to a labeled dataset of real and fake news articles.
- BERTweet Transformer Model
- Classical ML Models (Logistic Regression, Naive Bayes, SVM, Random Forest)
- Ensemble Voting Classifier (Majority voting across the above models)
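
As a rough illustration of the majority-voting idea, the sketch below combines the four classical models with scikit-learn's `VotingClassifier` in hard-voting mode on top of TF-IDF features. It is an assumption-laden sketch rather than the project's actual pipeline: the function name `build_voting_pipeline`, the TF-IDF settings, the hyperparameters, and the toy texts are all hypothetical, and the BERTweet model is left out because it is not a scikit-learn estimator.

```python
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def build_voting_pipeline():
    """Hypothetical helper: TF-IDF features feeding a hard-voting ensemble."""
    voter = VotingClassifier(
        estimators=[
            ("lr", LogisticRegression(max_iter=1000)),
            ("nb", MultinomialNB()),
            ("svm", LinearSVC()),
            ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ],
        voting="hard",  # each model casts one vote; the most common label wins
    )
    return make_pipeline(TfidfVectorizer(stop_words="english"), voter)


# Toy data for illustration only (1 = fake, 0 = real); not from the dataset.
texts = [
    "shocking miracle cure doctors hate revealed",
    "celebrity secretly replaced by clone insiders claim",
    "parliament passes annual budget after debate",
    "central bank holds interest rates steady",
]
labels = [1, 1, 0, 0]

model = build_voting_pipeline()
model.fit(texts, labels)
predictions = model.predict([
    "miracle cure revealed by insiders",
    "bank announces budget rates",
])
print(predictions)
```

With an even number of voters, ties are possible; scikit-learn resolves them by picking the class that comes first in ascending sort order, so an odd number of models may be preferable in practice.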
- Source: Fake and Real News Dataset on Kaggle
- Files:
  - Fake.csv: 23,502 articles
  - True.csv: 21,417 articles
- Columns:
  - title: Headline of the article
  - text: Full article content
  - subject: News category
  - date: Publication date
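
One possible way to turn the two files into a single labeled table is sketched below with pandas. The function name `load_labeled_news`, the `label` encoding (1 = fake, 0 = real), and the combined `content` column are assumptions made for illustration; the dataset itself ships only the four columns listed above.

```python
import pandas as pd


def load_labeled_news(fake_path="Fake.csv", true_path="True.csv"):
    """Hypothetical helper: load both CSV files and attach a binary label."""
    fake = pd.read_csv(fake_path)
    real = pd.read_csv(true_path)

    # Assumed encoding: 1 = fake, 0 = real.
    fake["label"] = 1
    real["label"] = 0

    df = pd.concat([fake, real], ignore_index=True)

    # Assumed feature: headline and body joined into one text field.
    df["content"] = df["title"].fillna("") + " " + df["text"].fillna("")
    return df
```

Calling `load_labeled_news()` from the directory that holds both files would return one DataFrame with the original columns plus `label` and `content`, ready to be split into training and test sets.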
- Python 3.9+
- Pandas, NumPy, Scikit-learn
- PyTorch, HuggingFace Transformers
- Matplotlib, Seaborn
- Jupyter Notebook
- Clone this repo:
git clone https://github.com/your-username/fake-news-marketing-detector.git
cd fake-news-marketing-detector