Linear Regression
Fit a linear regression model with response variable sqrt(passing.distance) and predictors helmet, vehicle, and kerb. This includes an intercept by default.
bikedata <- read.csv("bikedata.csv")
model <- lm(sqrt(passing.distance) ~ helmet + vehicle + kerb,
data = bikedata)
Get the coefficient estimates, their standard errors, t statistics, and p values.
summary(model)
Call:
lm(formula = sqrt(passing.distance) ~ helmet + vehicle + kerb,
    data = bikedata)

Residuals:
    Min      1Q  Median      3Q     Max
-0.6761 -0.0932 -0.0063  0.0882  0.7455

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.21756    0.02204   55.26  < 2e-16 ***
helmetY     -0.02255    0.00603   -3.74  0.00019 ***
vehicleCar   0.11207    0.02164    5.18  2.4e-07 ***
vehicleHGV   0.04895    0.02674    1.83  0.06726 .
vehicleLGV   0.09134    0.02299    3.97  7.3e-05 ***
vehiclePTW   0.09773    0.03275    2.98  0.00287 **
vehicleSUV   0.11193    0.02456    4.56  5.4e-06 ***
vehicleTaxi  0.06884    0.02977    2.31  0.02083 *
kerb        -0.10326    0.00850  -12.15  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.145 on 2346 degrees of freedom
Multiple R-squared:  0.0906,    Adjusted R-squared:  0.0875
F-statistic: 29.2 on 8 and 2346 DF,  p-value: <2e-16
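As a sanity check on how the columns of the coefficient table relate, the t value and p value for helmetY can be recomputed from its estimate and standard error. This is a minimal sketch that types in the rounded numbers from the summary output above, so the results should agree with the printed values only up to rounding.

```r
# Values copied (rounded) from the helmetY row of the summary output.
est <- -0.02255   # Estimate
se  <- 0.00603    # Std. Error
df  <- 2346       # residual degrees of freedom

# t statistic: estimate divided by its standard error.
t_stat <- est / se

# Two-sided p value from the t distribution with df degrees of freedom.
p_val <- 2 * pt(-abs(t_stat), df)

print(round(t_stat, 2))
print(signif(p_val, 2))
```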
Check regression assumptions with diagnostic plots.
plot(model)
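Calling plot() on an lm object draws several diagnostic plots one after another. One common way to see the four default plots together is a 2 x 2 layout via par(mfrow = ...). The sketch below uses simulated stand-in data, not the real bikedata.csv; the names toy and toy_model, the factor levels, and all numbers in it are assumptions made only so the example runs on its own.

```r
# Simulated stand-in data (NOT the real bikedata.csv); column names mirror the text.
set.seed(1)
n <- 200
toy <- data.frame(
  helmet  = factor(sample(c("N", "Y"), n, replace = TRUE)),
  vehicle = factor(sample(c("Bus", "Car", "HGV"), n, replace = TRUE)),
  kerb    = sample(c(0.25, 0.5, 0.75, 1, 1.25), n, replace = TRUE)
)
# Hypothetical response, built so that its square root is roughly linear in kerb.
toy$passing.distance <- (1.2 - 0.1 * toy$kerb + rnorm(n, sd = 0.15))^2

toy_model <- lm(sqrt(passing.distance) ~ helmet + vehicle + kerb, data = toy)

# Show the four default diagnostic plots in one 2 x 2 grid.
old_par <- par(mfrow = c(2, 2))
plot(toy_model)
par(old_par)  # restore the previous plotting layout
```

With the real data, the same two par() calls can be wrapped around plot(model).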
Scatter plot of standardized residuals versus kerb.
plot(bikedata$kerb, rstandard(model))
Boxplots of standardized residuals versus helmet.
boxplot(rstandard(model) ~ bikedata$helmet)
99% confidence interval for the coefficient of helmetY.
confint(model, "helmetY", level = 0.99)
           0.5 %    99.5 %
helmetY -0.03809 -0.007009
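The interval reported by confint() for a linear model is the estimate plus or minus a t quantile times the standard error. The sketch below rebuilds the 99% interval for helmetY by hand from the rounded estimate and standard error in the summary table, so it should match the confint() output only to about four decimal places.

```r
# Values copied (rounded) from the helmetY row of the summary output.
est <- -0.02255   # Estimate
se  <- 0.00603    # Std. Error
df  <- 2346       # residual degrees of freedom

# A 99% interval leaves 0.5% in each tail, hence the 0.995 quantile.
t_crit <- qt(0.995, df)

# Lower and upper limits: estimate -/+ t quantile * standard error.
ci <- est + c(-1, 1) * t_crit * se
print(ci)
```

The interval lies entirely below zero, which is consistent with the small p value for helmetY in the summary table.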