RPubs

by RStudio

Recently Published

STM1001 Lecture 6 (Data Science Stream)

By LTU_STM1001

This is the lecture for the Functions topic in the Data Science stream

about 1 year ago

Interaction Effects and Clarify

By Yung

about 1 year ago

Predicting Concrete Strength: A Multivariate and Logistic Regression Approach to Classifying Compressive Strength Outcomes

By nduonochie

We wanted to know whether, given the ingredients of a concrete mix, we could accurately predict whether the resulting compressive strength would meet the industry standard of 4000 PSI. First, we performed EDA to refine a multiple linear regression model. Then we took that model and an engineered binary variable (above or below 4000 PSI), trained a logistic regression model on a subset of our data, and assessed its accuracy against a testing subset. Our final model was 87% accurate. Below, you can find our interpretation of this value.
Interesting insights and limitations:
Some concrete ingredients are not strictly necessary; they are additives that strengthen concrete by enhancing the effects of primary ingredients such as cement. These ingredients (Superplasticizer, Slag, and Fly Ash) interact strongly with other ingredients to improve overall strength. Once we added interaction terms between these ancillary ingredients and the primary ingredients to our multiple linear regression, the model R² value improved by 15%.
Some variables affect concrete strength directly, without any interaction term (or added ingredients). For example, cement content correlates closely with compressive strength. This is evident in the low p-value in the model summary and in a simple scatter plot of the two variables.
Our logistic regression model had an accuracy of 87%. In other words, if someone enters the ingredient quantities for a concrete mix into our model, it predicts whether the resulting compressive strength will be above or below 4000 PSI, and that prediction is correct 87% of the time.
Logistic regression models of this kind are likely widely used in the real world, because the consequences can be severe if concrete fails unexpectedly. While decent, our model could likely be improved. We estimate that more time would be needed to determine the exact interaction terms between variables. In this model, we captured a few obvious ones from our diagnostic plots and EDA. Given how much those interaction terms improved our model accuracy, more refined ones may improve it further.
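The workflow described in this abstract (engineer a binary above/below-threshold label, add an interaction term, fit a logistic regression on a training subset, and score accuracy on a testing subset) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the two ingredient columns and their coefficients are hypothetical, and the plain gradient-descent fit stands in for whatever fitting routine the authors actually used. It does not reproduce their dataset or their 87% figure.

```python
import numpy as np

def make_synthetic_mixes(n, seed=0):
    """Hypothetical data: two scaled ingredients and a strength in PSI.
    This is NOT the authors' dataset; coefficients are invented."""
    rng = np.random.default_rng(seed)
    cement = rng.uniform(0.0, 1.0, n)      # scaled primary ingredient
    additive = rng.uniform(0.0, 1.0, n)    # scaled additive (assumed)
    # Strength depends on cement directly, and on the additive only
    # through its interaction with cement, plus noise.
    strength = (2500 + 1500 * cement + 3000 * cement * additive
                + rng.normal(0.0, 200.0, n))
    return np.column_stack([cement, additive]), strength

def add_interaction(X):
    """Append the product of the two columns as an interaction term."""
    return np.column_stack([X, X[:, 0] * X[:, 1]])

def fit_logistic(X, y, lr=0.5, steps=5000):
    """Logistic regression by batch gradient descent.
    Returns the weight vector, intercept first."""
    Xb = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Xb @ w)))   # predicted probabilities
        w -= lr * (Xb.T @ (p - y)) / len(y)   # gradient of mean log-loss
    return w

def accuracy(w, X, y):
    """Share of rows whose predicted class (probability >= 0.5) matches y."""
    Xb = np.column_stack([np.ones(len(X)), X])
    return float(np.mean((Xb @ w >= 0).astype(int) == y))

def run_demo(threshold=4000.0, n=1000, seed=0):
    X, strength = make_synthetic_mixes(n, seed)
    y = (strength >= threshold).astype(int)   # engineered binary label
    split = int(0.8 * n)                      # rows are i.i.d., so a plain cut works
    results = []
    for features in (X, add_interaction(X)):  # without, then with interaction
        w = fit_logistic(features[:split], y[:split])
        results.append(accuracy(w, features[split:], y[split:]))
    return tuple(results)

if __name__ == "__main__":
    acc_main, acc_inter = run_demo()
    print(f"test accuracy, main effects only: {acc_main:.3f}")
    print(f"test accuracy, with interaction:  {acc_inter:.3f}")
```

Comparing the two printed accuracies shows how much, if at all, the interaction term helps on this synthetic data; on real data the gain depends on how strongly the ingredients actually interact.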

about 1 year ago

