mazub91

Marina Zub

Recently Published

Plot
Soups Study
Studies of Storm Data
In this report, I aim to answer two questions: which natural disasters did the most harm to population health, and which did the most harm to economic conditions, in the period from 4/18/1950 to 11/28/2011? To address these questions, I used data from the National Oceanic and Atmospheric Administration, collected by the National Weather Service. The National Weather Service receives its information from a variety of sources, which include but are not limited to: county, state and federal emergency management officials, local law enforcement officials, Skywarn spotters, NWS damage surveys, newspaper clipping services, the insurance industry and the general public. From the data, I found that the most harmful natural disaster is the tornado, which accounted for 256 incidents with respect to population health and 3212258.16 USD of property damage. Hail appeared to be the most harmful for crops: over the stated period, it caused 579596.28 USD of damage.
Analyzing FitBit data
Using step-tracker data to find patterns in users' daily activity on weekdays and weekends.
Regression analysis for movies.
I completed exploratory data analysis (EDA), modeling, and prediction for movies. I picked a few variables and explored how they could affect a movie's popularity rating.
Prediction model for a house price in Ames, Iowa
My task was to develop a model to predict the selling price of a given home in Ames, Iowa. We found that the model can be applied to out-of-sample data, where 95.4% of predicted prices stay within the 95% confidence interval.
Studying pollution
Studying pollution
Studying pollution
Air pollution in the US
Case study of air pollution in the US
Clustering
More practice with clustering
Hierarchical clustering
Studying hierarchical clustering with Euclidean and Manhattan distances, dendrograms, and heat maps.
K-means clustering
Working with clustering using the kmeans function
Mastering the ggplot2 package in R
Exploring and mastering the ggplot and qplot commands of the ggplot2 package. Applying cut and quantile to prepare data for plotting.
Mastering skills with ggplot2
Exploring and mastering features of ggplot2, focusing on the ggplot function
Working with ggplot2
Exploring features of the ggplot2 package
Regression analysis for house market
The project was done for a real estate agency that was interested in the factors that predict the cost of a house.
EDA for house market
Exploratory Data Analysis for housing market of Ames, Iowa.
Bayesian Prediction for Movies
The model can be applied to predict audience scores for movies using a few variables, such as run time, Oscar nomination, whether the movie is a drama, and whether it is a feature film.
General Survey analysis
The analysis aimed to find whether educational level and receiving welfare are associated with time spent watching TV. As a result, we are 95% confident that people who receive welfare tend to spend 0.8 to 1.6 more hours watching TV than those who do not receive welfare. The data provide convincing evidence of a dependence between hours spent in front of the TV and level of education.
Document