Recently Published
NLP Capstone
A real-time next-word prediction app built with n-gram language modelling and Stupid Backoff. Trained on 102M+ words from blogs, news, and Twitter (HC Corpora). Deployed as an interactive Shiny app with sub-10ms prediction latency.
NLP Capstone EDA
Exploratory data analysis of the HC Corpora dataset for the Johns Hopkins Data Science Capstone. Includes summary statistics, word frequency analysis, and plans for the n-gram prediction algorithm and Shiny app.
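
For readers unfamiliar with the Stupid Backoff technique named in the NLP Capstone entry, here is a minimal Python sketch of the idea. It is not the app's code; the function names, the toy sentence, and the trigram limit are all hypothetical, and the back-off factor of 0.4 is the value proposed in the original Stupid Backoff paper rather than a setting confirmed for this project. The score is the relative frequency of the longest matching n-gram, and each step down to a shorter context multiplies the score by the back-off factor.

```python
from collections import Counter

ALPHA = 0.4  # assumed back-off factor (value from the original paper)


def build_counts(tokens, max_n=3):
    """Count every n-gram of length 1..max_n as a tuple of tokens."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def score(word, context, counts, total_unigrams):
    """Stupid Backoff score of `word` following `context`.

    The result is a relative score, not a normalized probability.
    """
    penalty = 1.0
    context = tuple(context)
    while context:
        ngram = context + (word,)
        if counts[ngram] > 0:
            # Longest matching n-gram found: use its relative frequency.
            return penalty * counts[ngram] / counts[context]
        # Otherwise drop the oldest context word and apply the penalty.
        context = context[1:]
        penalty *= ALPHA
    # No context left: fall back to the unigram relative frequency.
    return penalty * counts[(word,)] / total_unigrams


def predict_next(context, counts, vocab, total_unigrams, max_n=3):
    """Return the highest-scoring vocabulary word for the given context."""
    context = tuple(context)[-(max_n - 1):] if max_n > 1 else ()
    return max(sorted(vocab), key=lambda w: score(w, context, counts, total_unigrams))


# Toy corpus, purely for illustration.
tokens = "the cat sat on the mat the cat ate".split()
counts = build_counts(tokens)
vocab = set(tokens)
print(predict_next(["on", "the"], counts, vocab, len(tokens)))  # prints "mat"
```

Scoring every vocabulary word on each call, as this sketch does, would be far too slow for a large corpus; a production app would more plausibly precompute and prune the top continuations for each context.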