Day 1 Part 2: Exploring Time Series Data

Author

Dr Christian Engels

Last updated

November 16, 2024

Load Libraries

Load the packages used in this session: tidyverse for data manipulation and plotting, tidyfinance for downloading financial data, tsibble, fable, and feasts for time series analysis, and scales for axis label formatting. Together they let us handle and visualize time series efficiently and support later steps such as stationarity checks, model fitting, and forecasting.

library(tidyverse)
library(tidyfinance)
library(tsibble)
library(fable)
library(feasts)
library(scales)

Download Apple Stock Data

Download daily Apple (AAPL) stock prices between 2010-01-01 and 2020-01-01 and inspect the data. This series is the foundation for the rest of the session: we use it to explore price dynamics, look at volatility, and later estimate models.

AAPL <- 
  download_data(
    "stock_prices", 
    symbols = "AAPL", 
    start = "2010-01-01", 
    end = "2020-01-01"
  )

AAPL %>% glimpse()
Rows: 2,516
Columns: 8
$ symbol         <chr> "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL",…
$ date           <date> 2010-01-04, 2010-01-05, 2010-01-06, 2010-01-07, 2010-0…
$ volume         <dbl> 493729600, 601904800, 552160000, 477131200, 447610800, …
$ open           <dbl> 7.622500, 7.664286, 7.656429, 7.562500, 7.510714, 7.600…
$ low            <dbl> 7.585000, 7.616071, 7.526786, 7.466071, 7.466429, 7.444…
$ high           <dbl> 7.660714, 7.699643, 7.686786, 7.571429, 7.571429, 7.607…
$ close          <dbl> 7.643214, 7.656429, 7.534643, 7.520714, 7.570714, 7.503…
$ adjusted_close <dbl> 6.447412, 6.458561, 6.355827, 6.344078, 6.386256, 6.329…

Prepare Closing Price Data

Rename the adjusted closing price to price, keep the relevant columns, and convert the data to a tsibble. A tsibble is a time series data frame in R with an explicit time index (here date), which makes time-based operations such as differencing, aggregation, and model specification straightforward. We set regular = FALSE because trading days are not evenly spaced: weekends and holidays leave gaps.

closing_price <- 
  AAPL %>% 
  rename(price = adjusted_close) %>% 
  select(symbol, date, price) %>% 
  as_tsibble(
    index = date, 
    regular = FALSE
  ) %>% 
  glimpse()
Rows: 2,516
Columns: 3
$ symbol <chr> "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL",…
$ date   <date> 2010-01-04, 2010-01-05, 2010-01-06, 2010-01-07, 2010-01-08, 20…
$ price  <dbl> 6.447412, 6.458561, 6.355827, 6.344078, 6.386256, 6.329917, 6.2…
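
To see why regular = FALSE is a reasonable choice here, the following base R sketch computes the gaps between consecutive trading dates. The first five dates are taken from the output above; 2010-01-11 is an assumption (the Monday after 2010-01-08, taken to be the next trading day).

```r
# Trading dates: the first five come from the output above,
# 2010-01-11 is assumed to be the next trading day
dates <- as.Date(c(
  "2010-01-04", "2010-01-05", "2010-01-06",
  "2010-01-07", "2010-01-08", "2010-01-11"
))

# Number of calendar days between consecutive observations
gaps <- as.integer(diff(dates))
gaps

# More than one distinct gap length means the index is not evenly spaced
length(unique(gaps)) > 1
```

The weekend shows up as a three-day gap, so the series has no single fixed interval in calendar days.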

Calculate Logarithmic Returns

Calculate the log of prices and the daily log returns, defined as the first difference of log prices. Log returns are common in finance because they measure relative price changes and are additive over time: the log return over several days is the sum of the daily log returns, which reflects continuous compounding and simplifies aggregation and modeling. Taking logs of prices also tends to stabilize the variance of the series. The first observation has no previous price, so its return is NA.

log_returns <- 
  closing_price %>% 
  mutate(
    lprice = log(price),
    lreturn = difference(lprice, lag = 1, differences = 1)
  ) %>% 
  glimpse()
Rows: 2,516
Columns: 5
$ symbol  <chr> "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL"…
$ date    <date> 2010-01-04, 2010-01-05, 2010-01-06, 2010-01-07, 2010-01-08, 2…
$ price   <dbl> 6.447412, 6.458561, 6.355827, 6.344078, 6.386256, 6.329917, 6.…
$ lprice  <dbl> 1.863679, 1.865407, 1.849372, 1.847522, 1.854148, 1.845287, 1.…
$ lreturn <dbl> NA, 0.001727791, -0.016034522, -0.001850293, 0.006626426, -0.0…
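
The additivity of log returns can be checked on a handful of observations. The sketch below uses base R only and the first five prices from the output above; the variable names are illustrative.

```r
# First five adjusted closing prices, copied from the output above
price <- c(6.447412, 6.458561, 6.355827, 6.344078, 6.386256)

# Daily log returns: first difference of log prices
lreturn <- diff(log(price))

# Daily simple returns, for comparison
sreturn <- price[-1] / price[-length(price)] - 1

# Log returns add up over time: their sum equals the log of the
# total price ratio over the whole window
total_from_sum   <- sum(lreturn)
total_from_price <- log(price[length(price)] / price[1])
all.equal(total_from_sum, total_from_price)

# Log and simple returns are linked by lreturn = log(1 + sreturn)
all.equal(lreturn, log1p(sreturn))
```

Both comparisons return TRUE. Simple returns do not have this property: they compound multiplicatively rather than add.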

Visualize Prices and Returns

Price Over Time

This plot shows Apple’s stock price over the sample period. It is a first check for non-stationarity: a visible trend or a structural break suggests that the series may need differencing before model fitting.

log_returns %>% autoplot(.vars = price)

Log Price Over Time

The log price plot shows the price on a logarithmic scale, so equal vertical distances correspond to equal relative changes. The transformation turns exponential growth into a linear trend, which suits econometric modeling better and can reduce heteroskedasticity.

log_returns %>% autoplot(.vars = lprice)
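
A small base R sketch illustrates the linearizing effect of the log transformation. The price path is hypothetical: it grows by exactly 1% per period, which is an assumption made only for this example.

```r
# Hypothetical price that grows by exactly 1% per period
growth <- 0.01
price  <- 100 * (1 + growth)^(0:9)

# In levels, the increments keep getting larger ...
level_steps <- diff(price)

# ... while in logs every increment equals log(1 + growth),
# so the log series is a straight line
log_steps <- diff(log(price))

range(level_steps)
range(log_steps)
```

The range of log_steps collapses to a single value, while the range of level_steps widens as the price grows.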

Log Returns Over Time

This plot shows the daily log returns and how their variability changes over time. Look for volatility clustering, where calm and turbulent periods each tend to persist; it is a common feature of financial time series and motivates models such as GARCH. The warning below refers to the first observation, whose return is NA.

log_returns %>% autoplot(.vars = lreturn)
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_line()`).
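
One informal way to look for volatility clustering is to compare the autocorrelation of returns with that of squared returns. The sketch below is not based on the Apple data: it simulates a simple ARCH(1) process in base R, and the parameter values omega and alpha are hypothetical choices for illustration.

```r
set.seed(123)

n     <- 2000
omega <- 1e-5   # hypothetical baseline variance
alpha <- 0.5    # hypothetical weight on the previous squared return

r      <- numeric(n)
sigma2 <- numeric(n)

# Start at the unconditional variance of the process
sigma2[1] <- omega / (1 - alpha)
r[1]      <- sqrt(sigma2[1]) * rnorm(1)

# Today's variance depends on yesterday's squared return
for (i in 2:n) {
  sigma2[i] <- omega + alpha * r[i - 1]^2
  r[i]      <- sqrt(sigma2[i]) * rnorm(1)
}

# Autocorrelations at lags 1 to 5 (the lag-0 value is dropped)
acf_returns <- acf(r,   lag.max = 5, plot = FALSE)$acf[-1]
acf_squared <- acf(r^2, lag.max = 5, plot = FALSE)$acf[-1]

round(acf_returns, 3)
round(acf_squared, 3)
```

In a series with volatility clustering, one would expect the autocorrelations of the squared returns to be clearly positive at short lags while those of the returns themselves stay close to zero. The same comparison could be run on the lreturn column after dropping the missing first value.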

Quantile Analysis

Calculate the 5th percentile of daily log returns. It describes the lower tail of the return distribution: on 5% of trading days in the sample, the return was at or below this value. This is the basis of historical Value at Risk (VaR) calculations, which assess the downside risk of holding the asset over a given horizon.

quantile_05 <- 
  log_returns %>% 
  pull(lreturn) %>% 
  # na.rm = TRUE drops the missing first return
  quantile(probs = 0.05, na.rm = TRUE)
quantile_05
         5% 
-0.02531911 
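
To connect the quantile to VaR, the following base R sketch converts it into a one-day loss amount. The position size is a hypothetical value chosen for illustration, and this is only the simplest historical approach under the assumption that past returns are representative.

```r
# 5th percentile of daily log returns, copied from the output above
quantile_05 <- -0.02531911

# Hypothetical position size in currency units (assumption for illustration)
position <- 10000

# Convert the log return into a simple return, then into a loss amount.
# The sign flip reports VaR as a positive number.
simple_return_05 <- exp(quantile_05) - 1
var_95 <- -position * simple_return_05
var_95
```

The result is roughly 250 currency units, or about 2.5% of the position: on 95% of days in the sample, the one-day loss stayed below this amount.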

Plot Distribution of Daily Returns

Plot a histogram of daily log returns with a dashed line at the 5th percentile. The histogram shows the shape of the return distribution, in particular whether it looks approximately normal or has fat tails, which matters when choosing an econometric model. The dashed line marks the 5% threshold computed above.

log_returns %>% 
  ggplot(aes(x = lreturn)) +
  geom_histogram(bins = 100) +
  geom_vline(aes(xintercept = quantile_05),
             linetype = "dashed"
  ) +
  labs(
    x = NULL,
    y = NULL,
    title = "Distribution of daily Apple stock returns"
  ) +
  scale_x_continuous(labels = percent)
Warning: Removed 1 row containing non-finite outside the scale range
(`stat_bin()`).
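
A histogram gives a visual impression of fat tails; sample skewness and excess kurtosis put numbers on it. The helper functions below are illustrative (they are not part of any package loaded above) and use simple moment-based formulas without small-sample corrections. The data are simulated, not Apple returns.

```r
# Moment-based sample skewness (0 for a symmetric distribution)
sample_skewness <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Moment-based excess kurtosis (0 for a normal distribution,
# positive for fatter tails)
sample_excess_kurtosis <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2 - 3
}

# Simulated data for illustration only
set.seed(42)
normal_draws <- rnorm(5000)
fat_draws    <- rt(5000, df = 5)   # Student t with 5 degrees of freedom

sample_excess_kurtosis(normal_draws)
sample_excess_kurtosis(fat_draws)
```

With the data from this session, the same functions could be applied to log_returns$lreturn; the functions drop the missing first return themselves.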

Aggregate Weekly Log Returns

Sum the daily log returns within each week and plot the result. Because log returns are additive, the sum is the weekly log return. Weekly aggregation smooths out some of the noise in daily data and can make medium-term movements easier to see. The first week is NA because it contains the missing first daily return and sum() propagates NA.

log_returns_weekly <- 
  log_returns %>% 
    index_by(yearweek = ~yearweek(.)) %>% 
    summarise(lreturn = sum(lreturn)) %>% 
    glimpse()
Rows: 522
Columns: 2
$ yearweek <week> 2010 W01, 2010 W02, 2010 W03, 2010 W04, 2010 W05, 2010 W06, …
$ lreturn  <dbl> NA, -0.028955741, -0.040532669, -0.029195736, 0.017547702, 0.…
log_returns_weekly %>% autoplot()
Plot variable not specified, automatically selected `.vars = lreturn`
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_line()`).

Aggregate Monthly Log Returns

Sum the daily log returns within each month and plot the result. Monthly aggregation is useful for examining longer-term movements and possible seasonality, both of which matter when specifying time series models such as ARIMA. As with the weekly series, the first month is NA because it contains the missing first daily return.

log_returns_monthly <- 
  log_returns %>% 
    index_by(yearmonth = ~yearmonth(.)) %>% 
    summarise(lreturn = sum(lreturn)) %>% 
    glimpse()
Rows: 120
Columns: 2
$ yearmonth <mth> 2010 Jan, 2010 Feb, 2010 Mar, 2010 Apr, 2010 May, 2010 Jun, …
$ lreturn   <dbl> NA, 0.063346631, 0.138430335, 0.105280303, -0.016255829, -0.…
log_returns_monthly %>% autoplot()
Plot variable not specified, automatically selected `.vars = lreturn`
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_line()`).
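
The NA in the first aggregated period comes from how sum() treats missing values. The base R sketch below reproduces the effect on a tiny hypothetical data set (the dates and prices are made up for illustration) and shows that na.rm = TRUE keeps the remaining returns of that period.

```r
# Hypothetical daily prices spanning two months (illustrative values only)
dates  <- as.Date(c("2010-01-28", "2010-01-29", "2010-02-01", "2010-02-02"))
prices <- c(100, 102, 101, 104)

# Daily log returns; the first one is undefined
lreturn <- c(NA, diff(log(prices)))
month   <- format(dates, "%Y-%m")

# sum() propagates NA, so the first month is NA by default
monthly_default <- tapply(lreturn, month, sum)

# With na.rm = TRUE the first month keeps its one available return
monthly_na_rm <- tapply(lreturn, month, sum, na.rm = TRUE)

monthly_default
monthly_na_rm
```

Whether to drop the missing value is a judgment call: with na.rm = TRUE the first period covers fewer days than the others, so its return is not fully comparable.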

Exercises

In this exercise you will download data for a stock of your choice and apply the concepts you have learned to its data.

  1. Read the documentation of the download_data function using the command ?download_data and download the constituents of one of the supported indices: DAX, EURO STOXX 50, Dow Jones Industrial Average, Russell 1000, Russell 2000, Russell 3000, S&P 100, S&P 500, Nasdaq 100, FTSE 100, MSCI World, Nikkei 225, TOPIX.
  2. Pick one of the constituents, look it up on Yahoo Finance, download its data using download_data, and plot the adjusted closing price.
  3. Aggregate the series to the monthly level and plot the monthly log returns.