Day 1 Part 2: Exploring Time Series Data
Load Libraries
Load the packages needed for data manipulation, financial data access, and time-series analysis. They let us handle and visualize time series efficiently and support the econometric steps used later, such as stationarity checks, model fitting, and forecasting.
library(tidyverse)
library(tidyfinance)
library(tsibble)
library(fable)
library(feasts)
library(scales)
Download Apple Stock Data
Download Apple stock prices from 2010 to 2020 and inspect the data. This series is the foundation for the rest of the analysis: we use it to explore price dynamics, examine volatility, and estimate models.
AAPL <- download_data(
  "stock_prices",
  symbols = "AAPL",
  start = "2010-01-01",
  end = "2020-01-01"
)
AAPL %>% glimpse()
Rows: 2,516
Columns: 8
$ symbol <chr> "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL",…
$ date <date> 2010-01-04, 2010-01-05, 2010-01-06, 2010-01-07, 2010-0…
$ volume <dbl> 493729600, 601904800, 552160000, 477131200, 447610800, …
$ open <dbl> 7.622500, 7.664286, 7.656429, 7.562500, 7.510714, 7.600…
$ low <dbl> 7.585000, 7.616071, 7.526786, 7.466071, 7.466429, 7.444…
$ high <dbl> 7.660714, 7.699643, 7.686786, 7.571429, 7.571429, 7.607…
$ close <dbl> 7.643214, 7.656429, 7.534643, 7.520714, 7.570714, 7.503…
$ adjusted_close <dbl> 6.447412, 6.458561, 6.355827, 6.344078, 6.386256, 6.329…
Prepare Closing Price Data
Rename the adjusted closing price to price, keep the relevant columns, and convert the data into a tsibble. A tsibble is a time-series data format in R with an explicit temporal index, which makes time-based operations such as differencing and model specification easier to apply. Because trading days skip weekends and holidays, the index is declared irregular with regular = FALSE.
closing_price <- AAPL %>%
  rename(price = adjusted_close) %>%
  select(symbol, date, price) %>%
  as_tsibble(
    index = date,
    regular = FALSE
  ) %>%
  glimpse()
Rows: 2,516
Columns: 3
$ symbol <chr> "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL",…
$ date <date> 2010-01-04, 2010-01-05, 2010-01-06, 2010-01-07, 2010-01-08, 20…
$ price <dbl> 6.447412, 6.458561, 6.355827, 6.344078, 6.386256, 6.329917, 6.2…
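As a minimal sketch of what regular = FALSE does, the toy example below builds a three-row tsibble whose dates jump over a weekend. The object names `toy` and `toy_ts` and the prices are purely illustrative and are not part of the Apple data.

```r
library(tsibble)

# Illustrative toy data (not Apple prices): a Friday, then Monday and Tuesday
toy <- tibble::tibble(
  date = as.Date(c("2010-01-08", "2010-01-11", "2010-01-12")),
  price = c(100, 101, 99)
)

# regular = FALSE tells tsibble not to assume a fixed interval between rows
toy_ts <- as_tsibble(toy, index = date, regular = FALSE)

# Reports whether the tsibble was declared with a regular interval
is_regular(toy_ts)
```

Declaring the index irregular keeps the rows exactly as observed instead of treating the weekend as missing observations.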
Calculate Logarithmic Returns
Calculate the log of prices and the daily log returns. Log returns are common in finance because they measure relative price changes and are additive over time: the log return over several days is the sum of the daily log returns, which reflects compounding and simplifies estimation and analysis.
log_returns <- closing_price %>%
  mutate(
    lprice = log(price),
    lreturn = difference(lprice, lag = 1, differences = 1)
  ) %>%
  glimpse()
Rows: 2,516
Columns: 5
$ symbol <chr> "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL"…
$ date <date> 2010-01-04, 2010-01-05, 2010-01-06, 2010-01-07, 2010-01-08, 2…
$ price <dbl> 6.447412, 6.458561, 6.355827, 6.344078, 6.386256, 6.329917, 6.…
$ lprice <dbl> 1.863679, 1.865407, 1.849372, 1.847522, 1.854148, 1.845287, 1.…
$ lreturn <dbl> NA, 0.001727791, -0.016034522, -0.001850293, 0.006626426, -0.0…
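The lreturn column is the first difference of log prices, log(P_t) - log(P_{t-1}). The base R sketch below reproduces the first two returns from the first three prices printed above. Because those prices are rounded as displayed, the results should only approximately match the lreturn values shown. The variable names are illustrative.

```r
# First three adjusted closing prices, copied from the glimpse() output above
prices <- c(6.447412, 6.458561, 6.355827)

# Log return: difference of consecutive log prices
lreturn <- diff(log(prices))

# Equivalent form: log of the gross return P_t / P_{t-1}
lreturn_alt <- log(prices[-1] / prices[-length(prices)])

# Log returns add up across days: their sum is the two-day log return
two_day <- log(prices[3] / prices[1])
all.equal(sum(lreturn), two_day)
```

Note that base `diff()` drops the first element, whereas `difference()` from tsibble keeps the series length and pads the first value with NA, which is why the first lreturn above is missing.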
Visualize Prices and Returns
Price Over Time
This plot will show the trend of Apple’s stock price over the given period, providing an overview of its growth or decline. This visualization is crucial for identifying non-stationarity, potential structural breaks, and trends that may necessitate differencing before model fitting.
log_returns %>% autoplot(.vars = price)
Log Price Over Time
The log price plot allows us to visualize the price changes on a logarithmic scale, which is useful for observing relative growth. This transformation helps in linearizing exponential growth patterns, making it more appropriate for econometric modeling and reducing potential heteroskedasticity.
log_returns %>% autoplot(.vars = lprice)
Log Returns Over Time
This plot will depict the daily log returns, highlighting the variability and volatility of Apple’s stock over time. Examining log returns over time can reveal volatility clustering, a common feature in financial time series that will inform our choice of econometric models, such as GARCH.
log_returns %>% autoplot(.vars = lreturn)
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_line()`).
Quantile Analysis
Calculate the 5th percentile of daily log returns. The 5th percentile describes the lower tail of the return distribution, the extreme negative returns that represent risk scenarios, and is often used in Value at Risk (VaR) calculations to assess the downside risk of holding the asset over a given horizon.
quantile_05 <- quantile(
  log_returns %>%
    remove_missing() %>%
    pull(lreturn),
  probs = 0.05
)
Warning: Removed 1 row containing missing values or values outside the scale
range.
quantile_05
5%
-0.02531911
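Read this way, on roughly the worst 5% of trading days in the sample the daily log return was at or below about -2.5%. The sketch below shows the same calculation on a small toy vector, with `na.rm = TRUE` as a base R alternative to dropping the missing row first. The data and names are illustrative, not Apple returns.

```r
# Illustrative toy returns (not Apple data); the leading NA mimics the
# missing first return produced by differencing
toy_returns <- c(NA, -0.03, -0.01, 0.00, 0.01, 0.02)

# na.rm = TRUE drops the NA inside quantile() itself
q05 <- quantile(toy_returns, probs = 0.05, na.rm = TRUE)

# Convert the log-return quantile into a simple (percentage) return
simple_q05 <- exp(q05) - 1
```

The conversion `exp(q) - 1` matters little for small daily returns but becomes more noticeable for larger moves or longer horizons.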
Plot Distribution of Daily Returns
Plot a histogram of daily log returns with a dashed line for the 5th percentile. The histogram shows the shape of the return distribution, which informs assumptions about normality or fat tails when selecting an econometric model, and the dashed line marks the 5th percentile as a risk threshold.
log_returns %>%
  ggplot(aes(x = lreturn)) +
  geom_histogram(bins = 100) +
  geom_vline(aes(xintercept = quantile_05),
    linetype = "dashed"
  ) +
  labs(
    x = NULL,
    y = NULL,
    title = "Distribution of daily Apple stock returns"
  ) +
  scale_x_continuous(labels = percent)
Warning: Removed 1 row containing non-finite outside the scale range
(`stat_bin()`).
Aggregate Weekly Log Returns
Summarize log returns by week and plot the results. Aggregating to a weekly level smooths out daily fluctuations and reduces the noise inherent in daily data, which can make underlying cyclical patterns, and possible links to macroeconomic conditions, easier to see.
log_returns_weekly <- log_returns %>%
  index_by(yearweek = ~ yearweek(.)) %>%
  summarise(lreturn = sum(lreturn)) %>%
  glimpse()
Rows: 522
Columns: 2
$ yearweek <week> 2010 W01, 2010 W02, 2010 W03, 2010 W04, 2010 W05, 2010 W06, …
$ lreturn <dbl> NA, -0.028955741, -0.040532669, -0.029195736, 0.017547702, 0.…
log_returns_weekly %>% autoplot()
Plot variable not specified, automatically selected `.vars = lreturn`
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_line()`).
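Summing within each week works because log returns are additive: the sum of the daily log returns equals the log return over the whole period. The first week is NA because `sum()` propagates the missing first daily return. A minimal base R sketch with illustrative toy prices (not Apple data):

```r
# Illustrative toy prices for five trading days (not Apple data)
prices <- c(100, 101, 99, 102, 103)

# Daily log returns, padded with a leading NA like the lreturn column
daily <- c(NA, diff(log(prices)))

# sum() propagates the NA, which is why the first aggregated period is NA
sum(daily)

# Dropping the NA gives the log return over the whole window
weekly <- sum(daily, na.rm = TRUE)
all.equal(weekly, log(prices[5] / prices[1]))
```

Using `sum(lreturn, na.rm = TRUE)` in the pipeline would avoid the NA, but it would implicitly treat the missing first return as zero, so whether to do so is a judgment call.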
Aggregate Monthly Log Returns
Summarize log returns by month and plot the results. Monthly aggregation is useful for observing longer-term patterns and checking for seasonality in Apple’s stock performance, both key considerations when specifying time series models such as ARIMA.
log_returns_monthly <- log_returns %>%
  index_by(yearmonth = ~ yearmonth(.)) %>%
  summarise(lreturn = sum(lreturn)) %>%
  glimpse()
Rows: 120
Columns: 2
$ yearmonth <mth> 2010 Jan, 2010 Feb, 2010 Mar, 2010 Apr, 2010 May, 2010 Jun, …
$ lreturn <dbl> NA, 0.063346631, 0.138430335, 0.105280303, -0.016255829, -0.…
log_returns_monthly %>% autoplot()
Plot variable not specified, automatically selected `.vars = lreturn`
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_line()`).
Exercises
In this exercise you will download data for a stock of your choice and apply the concepts you have learned to its data.
- Read the documentation of the download_data function using the command ?download_data and download the constituents of one of the supported indices: DAX, EURO STOXX 50, Dow Jones Industrial Average, Russell 1000, Russell 2000, Russell 3000, S&P 100, S&P 500, Nasdaq 100, FTSE 100, MSCI World, Nikkei 225, TOPIX.
- Pick one of the constituents, look it up on Yahoo Finance, download its data using download_data, and plot the adjusted closing price.
- Aggregate the series to the monthly level and plot the monthly log returns.