CS5664_TextPreProcessing_MetaData.ipynb contains:
- Parsing the metadata and producing the essential pandas DataFrames.
- Creating product co-purchase edge lists.
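As a rough illustration of the edge-list step, the sketch below pulls co-purchase pairs out of text laid out like amazon-meta.txt. It assumes each record has an `ASIN:` line followed by a `similar:` line whose first token is a count and whose remaining tokens are ASINs. The function name `copurchase_edges` and the sample records are hypothetical and are not taken from the notebook.

```python
from typing import Iterable, List, Tuple


def copurchase_edges(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (asin, similar_asin) pairs from amazon-meta.txt style lines."""
    edges: List[Tuple[str, str]] = []
    current_asin = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("ASIN:"):
            # Remember which product the following lines belong to.
            current_asin = line.split(":", 1)[1].strip()
        elif line.startswith("similar:") and current_asin is not None:
            # First token after "similar:" is the count; the rest are ASINs.
            tokens = line.split(":", 1)[1].split()
            for other in tokens[1:]:
                edges.append((current_asin, other))
    return edges


# Made-up records for illustration only (not real catalogue data).
sample = """\
Id:   1
ASIN: 0000000001
  title: Example Book: Part One
  group: Book
  similar: 2  0000000002  0000000003
Id:   2
ASIN: 0000000002
  title: Another Example
  group: Book
  similar: 1  0000000001
""".splitlines()

edges = copurchase_edges(sample)
for source, target in edges:
    print(source, target)
```

Writing each pair as one whitespace-separated line gives a plain edge list that most graph libraries can read.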
CS5664_NetworkAnalysis.ipynb contains:
- Network information such as degree centrality analysis.
- Node degree distribution.
- Power-law and heavy-tailed distributions.
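The quantities listed above can be sketched with the standard library alone. This is a minimal example, assuming an undirected graph with no duplicate edges or self-loops; the helper name `degree_stats` and the toy edges are hypothetical. Degree centrality is computed here as degree divided by (n - 1), the same normalization NetworkX uses.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def degree_stats(edges: Iterable[Tuple[str, str]]):
    """Return per-node degree, degree centrality, and the degree distribution."""
    degree: Counter = Counter()
    for u, v in edges:
        # Undirected: each edge adds one to both endpoints.
        degree[u] += 1
        degree[v] += 1
    n = len(degree)
    centrality: Dict[str, float] = (
        {node: d / (n - 1) for node, d in degree.items()} if n > 1 else {}
    )
    # Maps a degree value k to the number of nodes with that degree.
    distribution = Counter(degree.values())
    return degree, centrality, distribution


# Toy edges for illustration only.
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
degree, centrality, distribution = degree_stats(toy_edges)
print(dict(degree))
print(centrality)
print(sorted(distribution.items()))
```

Plotting the distribution on log-log axes is a common first check for a heavy tail before fitting a power law.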
CS5664_MetaData_and_ProductReviews_Analysis_EGO_Graph_Recommendations.ipynb contains:
- Text preprocessing.
- Metadata and product review processing.
- Sentiment Analysis.
- Topic Modeling.
- Ego-graph-based product recommendations using product titles.
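One way an ego-graph recommender could work is sketched below: collect the products within a given radius of a seed product in the co-purchase graph, then rank them by how many title words they share with the seed. The Jaccard word-overlap score, the function names, and the toy data are all assumptions made for illustration; the notebook may score titles differently.

```python
from typing import Dict, List, Set


def ego_nodes(adj: Dict[str, Set[str]], center: str, radius: int = 1) -> Set[str]:
    """Return all nodes within `radius` hops of `center`, including `center`."""
    seen = {center}
    frontier = {center}
    for _ in range(radius):
        # Expand one hop and keep only nodes not visited yet.
        frontier = {nbr for node in frontier for nbr in adj.get(node, ())} - seen
        seen |= frontier
    return seen


def jaccard(title_a: str, title_b: str) -> float:
    """Word-level Jaccard similarity between two titles."""
    words_a, words_b = set(title_a.lower().split()), set(title_b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def recommend(adj, titles, center, radius=1, k=3) -> List[str]:
    """Rank products in the ego graph of `center` by title similarity."""
    candidates = ego_nodes(adj, center, radius) - {center}
    scored = [(jaccard(titles[center], titles[c]), c) for c in candidates]
    # Highest similarity first; ties broken by product id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [c for _, c in scored[:k]]


# Toy co-purchase graph and titles for illustration only.
adj = {"p1": {"p2", "p3"}, "p2": {"p1", "p4"}, "p3": {"p1"}, "p4": {"p2"}}
titles = {
    "p1": "python data analysis",
    "p2": "python data science",
    "p3": "cooking basics",
    "p4": "data analysis with python",
}
print(recommend(adj, titles, "p1", radius=1, k=2))
print(recommend(adj, titles, "p1", radius=2, k=2))
```

A larger radius widens the candidate pool at the cost of pulling in products that are less directly co-purchased with the seed.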
CS5664_Product_Recommendations_MachineLearning.ipynb contains:
- Review rating analysis.
- KNN-based SVD experimentation.
- Hyperparameter tuning and training SVD with the best parameters.
- Product recommendations based on reviews.
Datasets:
- Appliances.json - Review data for electronic appliances.
- Magazine_Subscriptions.json - Review data for magazines.
- amazon-books.csv - Product (book) information generated from amazon-meta.txt.
- amazon-books-copurchase.edgelist - Purchase/co-purchase (similar products) edge list generated from amazon-meta.txt.
- amazon-meta.txt - Data downloaded from SNAP.
- products_copurchases_links.csv - Purchase/co-purchase list for all products, generated from amazon-meta.txt and used for network analysis.
- products_data.csv - All product information.
Supplementary Materials & Visualizations folders include:
- Network Analysis plots.
- WordClouds - Reviews, Sentiments, Topics.
- Sentiment Analysis plots.
- LDA topics.