Web Scraping and Word Cloud
Updated Jan 9, 2026 - Python
Learn Chinese word representations using subword and subcharacter information
Use PTT and Chinese Wiki corpora to build count-based and prediction-based word embeddings.
Source code for the paper "Incorporating prior knowledge into word embedding for Chinese word similarity measurement", accepted by ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP).