Homework 4: Data Analysis for Problems 2--5
===========================================
*Firstname LastName (Replace this part with your name)*
---
**NOTE:** In this assignment, we will use the "forecast" package for R, which adds
capabilities for identifying and fitting ARIMA models. You need to install
this package before you can complete the assignment.
To install the package, choose "Install Packages..." from the "Tools" menu in
RStudio. Type `forecast` into the packages field, and make
sure the "Install dependencies" box is checked. This will download and
install the package.
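
The same installation can also be done from the R console. The lines below are a minimal sketch, assuming a CRAN mirror is already configured (RStudio normally sets one). They are written as a plain `r` block so they are not run when the document is knit.

```r
# Install "forecast" and its dependencies only if it is not already available.
# Assumption: a CRAN mirror is configured for this R session.
if (!requireNamespace("forecast", quietly = TRUE)) {
  install.packages("forecast", dependencies = TRUE)
}
```
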
After you have installed the package, use the following line to load it into
R:
```{r}
library("forecast")
```
If you get an error about a missing package (possibly `quadprog`), then you will
need to install the missing package using "Tools > Install Packages...".
Problem 2
=========
Housing Starts
--------------
```{r}
# Read in data:
# New Privately Owned Housing Units Started (in thousands, for U.S.), Monthly,
# 1959 - Present.
# Seasonally adjusted annual rate.
#
# Source: https://research.stlouisfed.org/fred2/series/HOUST/downloaddata
#
hs.data <- read.csv("http://ptrckprry.com/course/forecasting/data/housingstarts.csv")
housing.starts <- hs.data$housing.starts
# Time series plot
plot(as.Date(hs.data$date), housing.starts, type="l",
     xlab="Date", ylab="Housing Starts")
# ACF and PACF
Acf(housing.starts)
Pacf(housing.starts)
```
**Identify an ARIMA(p, d, q) model. Give reasons for your choices of p, d, q.**
Log GDP
-------
```{r}
# Read in data:
# Real United States Gross Domestic Product,
# 1947 (Q1) - Present.
# Seasonally adjusted and inflation adjusted, in billions of 2009 dollars.
#
# Source: http://bea.gov/national/
#
gdp.data <- read.csv("http://ptrckprry.com/course/forecasting/data/gdp.csv")
gdp <- gdp.data$gdp
# Take logs
# log.gdp <- ???
# Time series plot
# ACF and PACF
```
**Identify p, d, q. Give reasons for your choices.**
Diff Log GDP
------------
```{r}
# Take first differences
# (uncomment the following line)
# diff.log.gdp <- c(NA, diff(log.gdp))
# Time series plot
# ACF and PACF
```
**Identify p, d, q. Give reasons for your choices.**
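
The `c(NA, diff(...))` idiom in the chunk above pads the differenced series so that it keeps the same length as the original and stays aligned with the dates. The following sketch illustrates this on a toy vector; the values and the names `x`, `log.x`, and `diff.log.x` are hypothetical and are not the GDP data.

```r
# Toy series (hypothetical values): each value is 10% larger than the last.
x <- c(100, 110, 121, 133.1)
log.x <- log(x)

# diff() returns one fewer value than its input ...
d <- diff(log.x)
length(d)            # 3

# ... so prepending NA restores the original length and alignment.
diff.log.x <- c(NA, d)
length(diff.log.x)   # 4

# Each difference of logs is the log growth rate, log(1.1) here.
round(diff.log.x, 4) # NA 0.0953 0.0953 0.0953
```

The leading `NA` marks the first period, for which no previous value exists to difference against.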
Diff Log CPI (Inflation)
------------------------
```{r}
# Read in data:
# U.S. Consumer Price Index, Monthly,
# December, 1946 - Present.
# Seasonally Adjusted. All Urban Consumers (1982 to 1984 as 100).
#
# Source: https://research.stlouisfed.org/fred2/series/CPIAUCSL/downloaddata
#
cpi.data <- read.csv("http://ptrckprry.com/course/forecasting/data/cpi.csv")
cpi <- cpi.data$cpi
# Take logs
# Take first differences
# Time series plot
# ACF and PACF
```
**Identify p, d, q. Give reasons for your choices.**
**For the inflation series (first difference of the log of the CPI), determine if additional
differencing results in over-differencing.**
```{r}
# Take differences of inflation (second differences of log CPI)
# Compute ACF
# Use this plot to determine if additional differencing results in over-differencing.
```
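
As a point of comparison, one common symptom of over-differencing is a lag-1 sample autocorrelation close to -0.5. The sketch below shows this on simulated white noise, which needs no differencing at all. It uses base R's `acf` and made-up data, and it is only one possible diagnostic, not necessarily the one intended for this problem.

```r
# Simulated white noise (hypothetical data): already stationary.
set.seed(42)
e <- rnorm(2000)

# Differencing a series that did not need it (over-differencing).
d <- diff(e)

# Sample autocorrelations of the over-differenced series.
r <- acf(d, lag.max = 5, plot = FALSE)

# Element 1 is lag 0, so element 2 is the lag-1 autocorrelation.
rho1 <- drop(r$acf)[2]
rho1  # theoretical value for differenced white noise is -0.5
```
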
Problems 3--5
=============
To complete Problems 3--5, we need to compute the sample autocorrelations for
the first difference of the log GDP series.
```{r}
# Hint: use the Acf function with plot=FALSE to get the estimated autocorrelations
```
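
The sketch below shows one way to pull numeric autocorrelations out of the returned object. It uses a simulated series as a hypothetical stand-in for `diff.log.gdp`, and base R's `acf` so that it runs without extra packages; `Acf(x, plot=FALSE)` from "forecast" is assumed here to return the same kind of object, with `lag` and `acf` components.

```r
# Simulated AR(1) series (hypothetical stand-in for the real data).
set.seed(1)
x <- arima.sim(model = list(ar = 0.5), n = 200)

# Compute the sample autocorrelations without drawing a plot.
r <- acf(x, lag.max = 5, plot = FALSE)

# Pair each lag with its estimated autocorrelation.
rho <- data.frame(lag = drop(r$lag), acf = drop(r$acf))
print(rho)

# Look values up by lag rather than by position, since lag 0 comes first.
rho1 <- rho$acf[rho$lag == 1]
rho2 <- rho$acf[rho$lag == 2]
```

Indexing by the `lag` column avoids off-by-one mistakes, because the first stored value is the lag-0 autocorrelation, which is always 1.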
**Use the estimated autocorrelations to complete Problems 3--5. Show your
work on a separate page.**
For Problem 5(c), we need the last few values of `diff.log.gdp` and `log.gdp`.
We can get these with the `tail` command:
```{r}
# Uncomment the following lines to get the last 6 values (the default for tail) of log.gdp and diff.log.gdp:
# tail(log.gdp)
# tail(diff.log.gdp)
```