https://www.rdocumentation.org/packages/datasets/versions/3.6.2/topics/airquality
head(airquality)
summary(airquality)
     Ozone           Solar.R           Wind             Temp           Month            Day
 Min.   :  1.00   Min.   :  7.0   Min.   : 1.700   Min.   :56.00   Min.   :5.000   Min.   : 1.0
 1st Qu.: 18.00   1st Qu.:115.8   1st Qu.: 7.400   1st Qu.:72.00   1st Qu.:6.000   1st Qu.: 8.0
 Median : 31.50   Median :205.0   Median : 9.700   Median :79.00   Median :7.000   Median :16.0
 Mean   : 42.13   Mean   :185.9   Mean   : 9.958   Mean   :77.88   Mean   :6.993   Mean   :15.8
 3rd Qu.: 63.25   3rd Qu.:258.8   3rd Qu.:11.500   3rd Qu.:85.00   3rd Qu.:8.000   3rd Qu.:23.0
 Max.   :168.00   Max.   :334.0   Max.   :20.700   Max.   :97.00   Max.   :9.000   Max.   :31.0
 NA's   :37       NA's   :7
# Drop every row that has a missing value (NA's occur in Ozone and Solar.R)
df <- na.omit(airquality)
summary(df)
     Ozone          Solar.R           Wind            Temp           Month            Day
 Min.   :  1.0   Min.   :  7.0   Min.   : 2.30   Min.   :57.00   Min.   :5.000   Min.   : 1.00
 1st Qu.: 18.0   1st Qu.:113.5   1st Qu.: 7.40   1st Qu.:71.00   1st Qu.:6.000   1st Qu.: 9.00
 Median : 31.0   Median :207.0   Median : 9.70   Median :79.00   Median :7.000   Median :16.00
 Mean   : 42.1   Mean   :184.8   Mean   : 9.94   Mean   :77.79   Mean   :7.216   Mean   :15.95
 3rd Qu.: 62.0   3rd Qu.:255.5   3rd Qu.:11.50   3rd Qu.:84.50   3rd Qu.:9.000   3rd Qu.:22.50
 Max.   :168.0   Max.   :334.0   Max.   :20.70   Max.   :97.00   Max.   :9.000   Max.   :31.00
plot(df$Temp, df$Ozone, xlab = "Temperature", ylab = "Ozone")
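Assumption: Ozone (y) depends exponentially on temperature (x):
y = exp(a + b * x)
Linearizing transformation with x' = x and y' = log(y), obtained by taking the logarithm of both sides:
log(y) = log(exp(a + b * x))
log(y) = a + b * x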
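The transformation can be checked on synthetic data before it is applied to the ozone measurements. This is only a minimal sketch: a_true, b_true, x, y and check are hypothetical names and values chosen for illustration, and the curve is noise-free, so a straight-line fit to log(y) should recover the two parameters.

```r
# Hypothetical parameters, chosen only for illustration
a_true <- 0.5
b_true <- 0.1

x <- seq(0, 10, by = 0.5)
y <- exp(a_true + b_true * x)   # noise-free exponential curve

# A straight-line fit to log(y) should return a_true and b_true
check <- lm(log(y) ~ x)
coef(check)
```

With real, noisy data the recovered values are estimates rather than exact matches.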
# Fit a straight line to log(Ozone) as a function of Temp
df$LogOzone <- log(df$Ozone)
linModel <- lm(formula = df$LogOzone ~ df$Temp)
linModel
Call:
lm(formula = df$LogOzone ~ df$Temp)
Coefficients:
(Intercept)      df$Temp
   -1.84852      0.06767
summary(linModel)
Call:
lm(formula = df$LogOzone ~ df$Temp)
Residuals:
     Min       1Q   Median       3Q      Max
-2.14417 -0.32555  0.02066  0.34234  1.49100
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.848518   0.455080  -4.062  9.2e-05 ***
df$Temp      0.067673   0.005807  11.654  < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.5804 on 109 degrees of freedom
Multiple R-squared: 0.5548, Adjusted R-squared: 0.5507
F-statistic: 135.8 on 1 and 109 DF, p-value: < 2.2e-16
linModel$coefficients
(Intercept)     df$Temp
-1.84851790  0.06767266
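Because the line is fitted on the log scale, the slope acts multiplicatively on Ozone: one extra degree of temperature multiplies the expected ozone level by exp(b). The sketch below assumes the same cleaned data and an equivalent model written with data = df; the names fit and b_hat are introduced here only for illustration. With the slope of about 0.0677 reported above, exp(b) is roughly 1.07, i.e. about 7% more ozone per additional degree Fahrenheit.

```r
df <- na.omit(airquality)

# Same log-linear model as linModel, written with the data argument
fit <- lm(log(Ozone) ~ Temp, data = df)
b_hat <- coef(fit)[["Temp"]]

exp(b_hat)               # multiplicative change in Ozone per +1 degree F
100 * (exp(b_hat) - 1)   # the same effect expressed in percent
```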
# Refit on the original scale, using the log-linear estimates as starting values
ozone <- df$Ozone
temp <- df$Temp
a <- coef(linModel)[["(Intercept)"]]
b <- coef(linModel)[["df$Temp"]]
model <- nls(ozone ~ exp(a + b * temp), start = list(a = a, b = b))
summary(model)
Formula: ozone ~ exp(a + b * temp)
Parameters:
   Estimate Std. Error t value Pr(>|t|)
a -1.175445   0.539823  -2.177   0.0316 *
b  0.061241   0.006215   9.854   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 22.9 on 109 degrees of freedom
Number of iterations to convergence: 5
Achieved convergence tolerance: 1.541e-06
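The two fits minimize different criteria: lm minimizes squared errors of log(Ozone), while nls minimizes squared errors of Ozone itself. One way to see the difference is to compare both on the original scale. This is a sketch under the assumption that the models are refitted with data = df; lin, nl, sse_lin and sse_nl are illustrative names, not part of the code above.

```r
df <- na.omit(airquality)

# Log-linear fit, then nonlinear fit started from its estimates
lin <- lm(log(Ozone) ~ Temp, data = df)
start <- list(a = coef(lin)[["(Intercept)"]], b = coef(lin)[["Temp"]])
nl <- nls(Ozone ~ exp(a + b * Temp), data = df, start = start)

# Sum of squared errors on the original Ozone scale for both fits
sse_lin <- sum((df$Ozone - exp(fitted(lin)))^2)
sse_nl <- sum(residuals(nl)^2)
c(log_linear = sse_lin, nls = sse_nl)
```

Since nls starts from the log-linear estimates and reduces the squared error on the Ozone scale at each step, its value should not exceed the back-transformed log-linear one. The log-linear fit may still be preferable when relative rather than absolute errors matter.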
plot(ozone ~ temp)
curve(exp(coef(model)["a"] + coef(model)["b"] * x), add = TRUE, col = "red")
# Observed vs. predicted values; points on the red line are predicted exactly
plot(ozone, predict(model), xlab = "True Ozone", ylab = "Ozone Forecast")
abline(a = 0, b = 1, col = "red")
# Residuals vs. predicted values; look for structure or changing spread
resid <- residuals(model)
pred <- predict(model)
plot(pred, resid)
abline(h = 0, col = "red")
qqnorm(resid)
qqline(resid, col="red")
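Once the diagnostics look acceptable, the fitted curve can be used to forecast ozone for temperatures of interest. The sketch below assumes the model is refitted with data = df so that predict can take a newdata argument; the starting values are rounded from the log-linear estimates above, and nl, new_temps and forecast are hypothetical names and values used only for illustration.

```r
df <- na.omit(airquality)

# Nonlinear fit on the original scale; start values rounded from the lm fit
nl <- nls(Ozone ~ exp(a + b * Temp), data = df,
          start = list(a = -1.85, b = 0.068))

# Hypothetical temperatures (degrees F) to forecast for
new_temps <- data.frame(Temp = c(60, 75, 90))
forecast <- predict(nl, newdata = new_temps)
forecast
```

The column name in new_temps has to match the variable name used in the model formula, otherwise predict cannot evaluate the formula on the new data.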