Identifying active, eligible women who either have no recorded cervical screening or whose last recorded screening is very overdue (most recent result at least 45 months old)
Predicting December stock returns from the preceding months' returns, using a cluster-then-predict methodology. (MITx homework assignment)
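The selection rule above can be sketched as follows. This is a minimal illustration, not the clinic's actual query: the `Patient` record, its field names, the helper functions and the sample data are all hypothetical, and it assumes "45 months" is counted in whole calendar months.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

OVERDUE_MONTHS = 45  # threshold taken from the description above


@dataclass
class Patient:
    # Hypothetical record layout; a real clinical system will differ.
    name: str
    active: bool
    eligible: bool
    last_screen: Optional[date]  # None means no screening on record


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months elapsed from `earlier` to `later`."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the final month is not yet complete
    return months


def needs_follow_up(p: Patient, today: date) -> bool:
    """True for active, eligible patients with no screen or a very overdue one."""
    if not (p.active and p.eligible):
        return False
    if p.last_screen is None:
        return True
    return months_between(p.last_screen, today) >= OVERDUE_MONTHS


# Made-up sample data for illustration only.
today = date(2017, 6, 30)
patients = [
    Patient("A", True, True, None),               # never screened
    Patient("B", True, True, date(2013, 9, 30)),  # exactly 45 months ago
    Patient("C", True, True, date(2013, 10, 1)),  # 44 whole months ago
    Patient("D", False, True, None),              # inactive, so excluded
    Patient("E", True, True, date(2016, 1, 15)),  # recently screened
]
flagged = [p.name for p in patients if needs_follow_up(p, today)]
```

Keeping the month arithmetic in its own function makes the threshold easy to test at the boundary, which is where a rule like this most often goes wrong.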
Nearly every email user has at some point encountered a “spam” email, which is an unsolicited message often advertising a product, containing links to malware, or attempting to scam the recipient. In this homework problem, we will build and evaluate a spam filter. (MITx "Analytics Edge" homework assignment)
Reviews of medical literature are often performed manually, with multiple people reviewing each search result, which is tedious and time-consuming. In this problem, we will see how text analytics can be used to automate the process of information retrieval.
Wikipedia is a free online encyclopedia that anyone can edit and contribute to. It is available in many languages and is growing all the time. On the English-language version of Wikipedia, one consequence of being editable by anyone is that some people vandalize pages. In this assignment we will attempt to develop a vandalism detector that uses machine learning to distinguish between a valid edit and vandalism.
Predicting 'responsiveness' of e-mails from Enron during the investigation into electricity price manipulation
Predicting sentiment in tweets about Apple. An example from MITx "The Analytics Edge"
The United States government periodically collects demographic information by conducting a census. In this problem, we are going to use census information about an individual to predict how much a person earns – in particular, whether the person earns more than $50,000 per year. This data comes from the UCI Machine Learning Repository. The file census.csv contains 1994 census data for 31,978 individuals in the United States. (Assignment for MITx "The Analytics Edge")
Identifying letters from images. Assignment for MITx "The Analytics Edge"
Improving diabetes outcomes. Daily case-finding
Predicting why people vote using logistic regression and CART models. An experiment in various types of social pressure. (Assignment for MITx The Analytics Edge)
Predicting parole violations based on potential parolee characteristics. (Assignment for MITx Analytics Edge)
Predicting loan repayment from borrower and loan characteristics using logistic regression and the receiver operating characteristic (ROC) curve, with the ROCR package. (Assignment for MITx Analytics Edge)
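As a rough illustration of what the ROC analysis mentioned above measures: the area under the ROC curve (AUC) equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The sketch below computes that quantity directly. The function name, labels and scores are made up for illustration and are not taken from the assignment.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie: counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = loan not repaid, 0 = repaid (illustrative labels only).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
example_auc = auc(labels, scores)
```

The pairwise form is quadratic in the number of cases, so it suits small examples; a rank-based computation gives the same value more efficiently on larger data.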
Predicting Medicare (USA) and Medicaid claims using Classification and Regression Trees (CART). Part of Analytics Edge (MITx) exercise
Predicting Justice Stevens' decisions in the Supreme Court of the United States
Improving Zostavax immunization coverage among the eligible population at coHealth Kensington.
Colorectal cancer is under-screened in Australian primary care, including at the Kensington clinic of coHealth. This project aims to measure and improve the rate of colorectal cancer screening at the Kensington site.
Replicating the data analysis of "Using Weather Variability to Estimate the Response of Savings to Transitory Income in Thailand (1992)" by Christina Paxson
Proportion of results not yet notified (of results marked 'Discuss') at the Kensington Clinic of coHealth, 2015-2017
For the Johns Hopkins Data Products course
By Primary Health Network area. Population: 2014 data. General practitioners: 2016 data.
Analysis of the U.S. National Oceanic and Atmospheric Administration (NOAA) storm database, 2007 to 2011, for the Johns Hopkins University Reproducible Data training course.