RPubs

by RStudio

RSdata

R Singh

Recently Published

The final report should be presented in a more formal format. Consider your audience to be non-data analysts. Fellow data analysts (i.e. students) will be able to access your R Markdown file for details on the analysis. Submit a Zip file with your R Markdown file, the HTML output, and any supplementary files (e.g. data, figures, etc.). You must address the following five sections:

Introduction: What is your research question? Why do you care? Why should others care?

Data: Write about the data from your proposal in text form. Address the following points:
- Data collection: Describe how the data were collected.
- Cases: What are the cases? (Remember: case = units of observation or units of experiment)
- Variables: What are the two variables you will be studying? State the type of each variable.
- Type of study: What is the type of study, observational or an experiment? Explain how you’ve arrived at your conclusion using information on the sampling and/or experimental design.
- Scope of inference - generalizability: Identify the population of interest, and whether the findings from this analysis can be generalized to that population, or, if not, a subsection of that population. Explain why or why not. Also discuss any potential sources of bias that might prevent generalizability.
- Scope of inference - causality: Can these data be used to establish causal links between the variables of interest? Explain why or why not.

Exploratory data analysis: Perform relevant descriptive statistics, including summary statistics and visualization of the data. Also address what the exploratory data analysis suggests about your research question.

Inference: If your data fail some conditions and you can’t use a theoretical method, then you should use simulation. If you can use both methods, then you should use both. It is your responsibility to figure out the appropriate methodology.
- Check conditions
- Theoretical inference (if possible): hypothesis test and confidence interval
- Simulation-based inference: hypothesis test and confidence interval
- Brief description of methodology that reflects your conceptual understanding

Conclusion: Write a brief summary of your findings without repeating your statements from earlier. Also include a discussion of what you have learned about your research question and the data you collected. You may also want to include ideas for possible future research.

about 7 years ago

Data 607 Final Project

about 7 years ago

Data 607 Tidyverse Book contribution: forcats

about 7 years ago

Data 607_DS in Context

Data Science in Context Presentation for Data 607

about 7 years ago

Data 606 Lab8

about 7 years ago

Data 606 HW8

about 7 years ago

Data 606: Project Proposal

about 7 years ago

FinalProjectProposal

about 7 years ago

Data607 RecommenderSystems- LinkedIn

about 7 years ago

Data 606_Ch 7 Homework

Linear Regression

about 7 years ago

Data606_Lab7: Introduction to linear regression

about 7 years ago

Data 607 Project 4

For this project, you can start with a spam/ham dataset, then predict the class of new documents (either withheld from the training dataset or from another source such as your own spam folder).

about 7 years ago

Data 606: Project Proposal

about 7 years ago

Data 606_Ch 6 Homework

Inference for Categorical Data

about 7 years ago

Data 606: Lab6 Inference for Categorical Data

Inference for categorical data

about 7 years ago

Data 607- Assignment 9- WEB API's

The New York Times web site provides a rich set of APIs, as described here: http://developer.nytimes.com/docs
- You’ll need to start by signing up for an API key.
- Your task is to choose one of the New York Times APIs, construct an interface in R to read in the JSON data, and transform it to an R dataframe.

about 7 years ago

Data606_ Ch5 HW

Exercises from the following text: OpenIntro Statistics 3rd Ed. Chapter 5: Inference for Numerical Data

about 7 years ago

Data 606 Lab 5: Inference for numerical data

Inference for numerical data

about 7 years ago

Data 606 Presentation

about 7 years ago

Data 607: Project 2

The goal of this assignment is to give you practice in preparing different datasets for downstream analysis work. Your task is to:
1. Choose any three of the “wide” datasets. For each of the three chosen datasets:
- Create a .CSV file (or optionally, a MySQL database!) that includes all of the information included in the dataset. You’re encouraged to use a “wide” structure similar to how the information appears in the discussion item, so that you can practice tidying and transformations as described below.
- Read the information from your .CSV file into R, and use tidyr and dplyr as needed to tidy and transform your data. [Most of your grade will be based on this step!]
- Perform the analysis requested in the discussion item.
- Your code should be in an R Markdown file, posted to rpubs.com, and should include narrative descriptions of your data cleanup work, analysis, and conclusions.

about 7 years ago

DATA 607_week7_web technologies

Pick three of your favorite books on one of your favorite subjects. At least one of the books should have more than one author. For each book, include the title, authors, and two or three other attributes that you find interesting. Take the information that you’ve selected about these three books, and separately create three files which store the book’s information in HTML (using an html table), XML, and JSON formats (e.g. “books.html”, “books.xml”, and “books.json”). To help you better understand the different file structures, I’d prefer that you create each of these files “by hand” unless you’re already very comfortable with the file formats. Write R code, using your packages of choice, to load the information from each of the three sources into separate R data frames. Are the three data frames identical? Your deliverable is the three source files and the R code. If you can, package your assignment solution up into an .Rmd file and publish to rpubs.com. [This will also require finding a way to make your three text files accessible from the web].

about 7 years ago

Data606 Assignment5- Ch4: Foundations for Inference

Foundations for Inference

over 7 years ago

Data606 Lab4b: Confidence Levels

Confidence Levels

over 7 years ago

Data606_Lab4a: Sampling Distributions

In this lab, we investigate the ways in which the statistics from a random sample of data can serve as point estimates for population parameters. We’re interested in formulating a sampling distribution of our estimate in order to learn about the properties of the estimate, such as its distribution.

over 7 years ago
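
The sampling-distribution idea in the Lab4a description above can be sketched in a few lines. This is an illustrative Python sketch, not the lab's own R code: the synthetic population, the sample size of 50, and the helper name `sampling_distribution` are all assumptions made for the example.

```python
import random
import statistics


def sampling_distribution(population, sample_size, n_samples, seed=0):
    """Return the means of n_samples random samples (hypothetical helper)."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# Assumed synthetic, right-skewed population; the lab uses its own dataset.
pop_rng = random.Random(1)
population = [pop_rng.expovariate(1 / 50) for _ in range(10_000)]

# Each sample mean is one point estimate of the population mean.
sample_means = sampling_distribution(population, sample_size=50, n_samples=2_000)

pop_mean = statistics.mean(population)
center = statistics.mean(sample_means)   # should sit close to pop_mean
spread = statistics.stdev(sample_means)  # empirical standard error

print(f"population mean:          {pop_mean:.2f}")
print(f"mean of sample means:     {center:.2f}")
print(f"std dev of sample means:  {spread:.2f}")
```

The spread of the sample means is much smaller than the spread of the population itself, which is the property that makes the sample mean a useful point estimate.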

Data606_Assignment 4_Ch3_Distributions of Random Variables

over 7 years ago

Data606_Lab3: Normal Distributions

In this lab we’ll investigate the probability distribution that is most central to statistics: the normal distribution. If we are confident that our data are nearly normal, that opens the door to many powerful statistical methods. Here we’ll use the graphical tools of R to assess the normality of our data and also learn how to generate random numbers from a normal distribution.

over 7 years ago
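
The two activities named in the Lab3 description above, generating normal random numbers and assessing normality, can be sketched as follows. This is a Python sketch under stated assumptions, not the lab's R code: the lab relies on graphical tools, whereas this example swaps in a numeric check against the 68-95-99.7 rule, and the mean of 100, standard deviation of 15, and the helper `share_within` are hypothetical choices.

```python
import random
import statistics

# Draw random numbers from a normal distribution (assumed mean 100, sd 15).
rng = random.Random(42)
draws = [rng.gauss(100, 15) for _ in range(5_000)]

m = statistics.mean(draws)
s = statistics.stdev(draws)


def share_within(values, center, spread, k):
    """Fraction of values within k standard deviations of the center."""
    return sum(abs(v - center) <= k * spread for v in values) / len(values)


# Nearly normal data should put roughly 68%, 95%, and 99.7% of values
# within 1, 2, and 3 standard deviations of the mean.
shares = {k: share_within(draws, m, s, k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} sd: {share:.3f}")
```

Shares far from those three benchmarks would suggest the data are not nearly normal, which is the same question a normal probability plot answers visually.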

Data607_Assignment4 (week5): Tidy and Transform Data

1. Create a .CSV file (or optionally, a MySQL database!) that includes all of the information above. You’re encouraged to use a “wide” structure similar to how the information appears above, so that you can practice tidying and transformations as described below.
2. Read the information from your .CSV file into R, and use tidyr and dplyr as needed to tidy and transform your data.
3. Perform analysis to compare the arrival delays for the two airlines.

over 7 years ago

Data607_Project1-Chess

In this project, you’re given a text file with chess tournament results where the information has some structure. Your job is to create an R Markdown file that generates a .CSV file with the following information for all of the players: Player’s Name, Player’s State, Total Number of Points, Player’s Pre-Rating, and Average Pre Chess Rating of Opponents.

over 7 years ago

Data 606 Final Exam

Data 606 Final Project

Data 606_Lab2_Probability

Data607_Assignment3_R character manip_Date Processing

Data607_week2_movies

Data606_HW1

Data606_Lab1

Data607_Assignment1

Data606_Lab0
