David Fong

Recently Published

Intensive telephone-based case-finding of under-screened patients for cervical screening
Identify active eligible women who either have no recorded cervical screening or whose last recorded screening is very overdue (most recent result 45 or more months old)
Predicting Stock Returns with Cluster-Then-Predict
Predicting December stock returns from the preceding months' returns, using a cluster-then-predict methodology. (MITx homework assignment)
Separating Spam from Ham
Nearly every email user has at some point encountered a “spam” email, which is an unsolicited message often advertising a product, containing links to malware, or attempting to scam the recipient. In this homework problem, we will build and evaluate a spam filter. (MITx "Analytics Edge" homework assignment)
Automating Reviews in Medicine - analyzing abstracts
Reviews of medical literature are often performed manually, with multiple people reviewing each search result; this is tedious and time-consuming. In this problem, we will see how text analytics can be used to automate the process of information retrieval.
Predicting vandalism in Wikipedia
Wikipedia is a free online encyclopedia that anyone can edit and contribute to. It is available in many languages and is growing all the time. One of the consequences of being editable by anyone is that some people vandalize pages. In this assignment we will attempt to develop a vandalism detector that uses machine learning to distinguish between a valid edit and vandalism.
Investigating Enron e-mails
Predicting the 'responsiveness' of Enron e-mails during the investigation into electricity price manipulation
Turning Tweets into Knowledge
Predicting sentiment in tweets about Apple. An example from MITx "The Analytics Edge"
Predicting Earnings from Census Data
The United States government periodically collects demographic information by conducting a census. In this problem, we are going to use census information about an individual to predict how much a person earns – in particular, whether the person earns more than $50,000 per year. This data comes from the UCI Machine Learning Repository. The file census.csv contains 1994 census data for 31,978 individuals in the United States. (assignment for MITx “The Analytics Edge”)
Letter Recognition
Identifying letters from images. Assignment for MITx "The Analytics Edge"
Diabetes outcomes
Improving diabetes outcomes. Daily case-finding
Understanding Why People Vote
Predicting why people vote using logistic regression and CART models. An experiment in various types of social pressure. (Assignment for MITx The Analytics Edge)
Parole Violation Prediction
Predicting parole violations based on potential parolee characteristics. (Assignment for MITx Analytics Edge)
Predicting Loan Repayment
Predicting loan repayment from borrower and loan characteristics using logistic regression and receiver operating characteristic (ROC) curves. (assignment for MITx Analytics Edge)
Claims Data Medicare and Medicaid
Predicting Medicare (USA) and Medicaid claims using Classification and Regression Trees (CART). Part of Analytics Edge (MITx) exercise
Judge, Jury, Classifier
Predicting Justice Stevens's decisions in the Supreme Court of the United States
Zostavax (herpes zoster) immunisation, coHealth Kensington
Improving Zostavax immunisation coverage among the eligible population at coHealth Kensington.
Colorectal cancer screening - Kensington 2018
Colorectal cancer is under-screened in Australian primary care, including at the Kensington clinic of coHealth. A project to measure and improve the rate of colorectal cancer screening at the Kensington site of coHealth.
Weather Variability to Estimate the Response of Savings to Transitory Income in Thailand
Replicating the data analysis of "Using Weather Variability to Estimate the Response of Savings to Transitory Income in Thailand" (1992) by Christina Paxson
Results yet to be notified, coHealth Kensington 2015-2017
Proportion of results not yet notified (of results marked 'Discuss') at the Kensington Clinic of coHealth, 2015-2017
SpurAfrika 2018 short slideshow - pitch
For the Johns Hopkins Data Products course
Population and General Practitioner numbers in Australia
By Primary Health Network area. Population: 2014 data. General Practitioners: 2016 data.
Storm impacts on public health and economy
Analysis of the U.S. National Oceanic and Atmospheric Administration (NOAA) storm database, 2007 to 2011, at the request of the Johns Hopkins University Reproducible Data training course.