BigBangData

Marcelo Sanches

Recently Published

Time & Place
A timeline of places and distances between me and my fiancée before we first (intentionally) met.
Coronavirus Data Analysis
Titanic Survival Part 1: Exploratory Data Analysis
A project involving the Kaggle Titanic Competition. The first of two parts, this part consists of Exploratory Data Analysis in R, including mild pre-processing. The second part consists of building and evaluating machine-learning models in Python, creating a single pipeline for cleaning, modeling, and prediction.
The Asymptotic Normality of Means from an Exponential Distribution
This project explores, via simulation and comparison with theory, the asymptotic convergence to normality of a distribution of means of 40 random iid draws from an exponential distribution with rate 0.2. The simulation consists of taking 1,000 means of 40 randomly generated exponentials. We explore the mean and variance of this distribution of means. As expected from the Central Limit Theorem, the resulting distribution is approximately normal, despite its origins in the positively-skewed exponential distribution.
The Impact of Harmful Weather on the U.S. Population and Economy
An exploratory analysis of NOAA's Storm Data. Final project for 'Reproducible Research', Course 5 of the 9-course 'Data Science Specialization' from Johns Hopkins University on Coursera.
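
For readers curious about the simulation described under 'The Asymptotic Normality of Means from an Exponential Distribution', here is a minimal sketch in Python using only the standard library. It is not the project's own code. The function name `simulate_means`, the constant names, and the seed are assumptions made for illustration; the rate (0.2), the sample size (40), and the number of means (1,000) come from the description. For an exponential distribution with rate λ, theory gives a mean of 1/λ and a variance of the sample mean of (1/λ)²/n, which the sketch compares against the simulated values.

```python
import random
import statistics

RATE = 0.2        # rate of the exponential distribution (from the description)
N_DRAWS = 40      # iid draws averaged into each mean
N_MEANS = 1000    # number of simulated means


def simulate_means(rate=RATE, n_draws=N_DRAWS, n_means=N_MEANS, seed=42):
    """Return n_means sample means, each over n_draws exponential draws.

    The seed is an arbitrary choice made here for reproducibility.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(rate) for _ in range(n_draws))
        for _ in range(n_means)
    ]


means = simulate_means()

# Theoretical values for the distribution of sample means.
theoretical_mean = 1 / RATE                  # 5.0
theoretical_var = (1 / RATE) ** 2 / N_DRAWS  # 0.625

print(f"simulated mean:     {statistics.fmean(means):.3f} (theory {theoretical_mean})")
print(f"simulated variance: {statistics.variance(means):.3f} (theory {theoretical_var})")
```

A histogram of `means` should look roughly bell-shaped around 5, even though the individual exponential draws are strongly right-skewed.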