Tesla Stock Price Forecast By Using ARIMA and GARCH models

gracecamc168
Jan 2, 2021
7 min read

Updated: Jan 7, 2021

I have been intrigued by stock forecasting since I was 20 years old. I was naive enough to think that anyone who could forecast stock prices well would become the next billionaire. The time series topic in one of my Business Analytics courses gave me the chance to learn a bit about time series forecasting, so I said to myself: why not try a stock forecast on my own, even though it was not taught in class? This project is therefore entirely self-taught. I did it for fun, because building a decent stock forecasting model is extremely difficult, even with years of professional experience.

The stock I chose is Tesla, the newest S&P 500 member. The green car maker gained a lot of momentum in 2020. R is the programming language for this project.

Let's start.

# Load the stock price from Yahoo Finance.

library(quantmod)

tesla <- getSymbols(Symbols = "TSLA", src = "yahoo", from = "2015-01-01",
                    to = "2020-12-31", warnings = FALSE,
                    auto.assign = FALSE)

head(tesla)

The prices are already adjusted for the stock split; I had expected to still see the pre-split prices. This does not affect the forecasting.

dim(tesla)

It contains 1510 observations and 6 columns.

# Since I am only interested in the close price, I will only select the close price.

tesla_close <- tesla$TSLA.Close

class(tesla_close)

The data is a time series (an xts object).

We can explore the data further with deltat(), frequency(), and cycle().
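To get a feel for these three helpers, here is a minimal sketch on a made-up monthly ts object (not the Tesla data), since their output is easiest to read on a regularly spaced series. The series x is purely an assumption for illustration.

```r
# Hypothetical monthly series, used only to show what the helpers return
x <- ts(1:24, start = c(2019, 1), frequency = 12)

deltat(x)     # time between two observations, as a fraction of the period (1/12)
frequency(x)  # number of observations per period (12)
cycle(x)      # position of each observation within its period (1 to 12, repeating)
```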

# Plot it

chart_Series(tesla_close)

The price shows an exponential-like upward trend since early January 2020, with some downturns along the way. From the plot, we can see that the series has a trend, stretches of roughly linear growth and mild decay, some periodic behaviour, and non-constant variance. These problems are normal for stock prices. Because the 2020 prices are so different from the prices before 2020, I have a sense that the prediction might have a large error or variance.

# I will use log() to dampen the rapid, exponential-like growth

tesla_log <- log(tesla_close)

par(mfrow=c(1,2))

plot(tesla_close) # before log

plot(tesla_log)

In order to apply an ARIMA model, I need to check whether the data is autocorrelated.

Because stock prices are usually assumed to follow a log-normal distribution, it is better to check the ACF and PACF after taking the log.

par(mfrow=c(2,1))

acf(tesla_log)

pacf(tesla_log)

From the two plots above, this looks like an AR (autoregressive) process, because the ACF decreases gradually and the PACF drops sharply after lag 1. That points to an AR(1) model. Understanding the ACF and PACF plots is important for identifying which model to use, and it also helps in choosing the parameters. The plots also suggest that the series is not white noise, but we will test that later on.

The PACF plot tells us that the series does not have any significant seasonal pattern.

The very slow decay of the ACF is also a sign that the stock price is not stationary, meaning its mean and variance are not constant over time. In order to use an ARIMA model, we have to transform the series into a stationary one. Stationarity is very important for ARIMA: the model assumes it, and working with a stationary series helps it produce a more robust, less biased forecast.
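To see why a slowly decaying ACF hints at non-stationarity, here is a small sketch on simulated data rather than the Tesla prices. The random walk below is an assumption chosen for illustration: its ACF stays close to 1 for many lags, while the ACF of its first difference is close to 0.

```r
set.seed(42)
walk <- cumsum(rnorm(1000))   # simulated random walk (non-stationary)

# Autocorrelations at lags 1 to 10, without drawing the plots
acf_level <- acf(walk, lag.max = 10, plot = FALSE)$acf[-1]
acf_diff  <- acf(diff(walk), lag.max = 10, plot = FALSE)$acf[-1]

round(acf_level, 2)  # decays very slowly
round(acf_diff, 2)   # hovers around zero
```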

# Before fitting the model, let's double-check whether the series is white noise

Box.test(x=tesla_close, lag = 1, type = "Ljung-Box", fitdf = 0)

Box.test(x=tesla_close, lag = 1, type = "Box-Pierce", fitdf = 0)

The two tests above show that the series is not white noise, because the p-values are very small.
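For the curious, the two statistics can be reproduced by hand from the lag-1 autocorrelation. The sketch below uses simulated noise (an assumption, not the Tesla series) and only checks that the textbook formulas agree with Box.test() when lag = 1.

```r
set.seed(1)
x <- rnorm(200)               # simulated series, for illustration only
n <- length(x)
r1 <- acf(x, lag.max = 1, plot = FALSE)$acf[2]   # lag-1 autocorrelation

# Ljung-Box and Box-Pierce statistics for a single lag
q_ljung_manual <- n * (n + 2) * r1^2 / (n - 1)
q_pierce_manual <- n * r1^2

q_ljung <- unname(Box.test(x, lag = 1, type = "Ljung-Box")$statistic)
q_pierce <- unname(Box.test(x, lag = 1, type = "Box-Pierce")$statistic)

all.equal(q_ljung_manual, q_ljung)
all.equal(q_pierce_manual, q_pierce)
```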

# Differencing the series

tesla_diff <- diff(tesla_log,lag=1)

plot(tesla_diff)

As we can see, the plot looks much more stable, though there are still some sudden spikes up and down. We will test this later as well. Differencing costs us one observation: the first value of the differenced series has nothing to be subtracted from, so it becomes NA.

# Deal with the NAs

library(imputeTS)

tesla_diff <- na_locf(tesla_diff)
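As a side note, differencing the log prices gives the daily log returns. Here is a tiny sketch with made-up prices (an assumption, not Tesla data) showing that relationship. On a plain vector diff() simply returns one value fewer, whereas on an xts object it keeps the length and puts an NA in the first position, which is why the NA step is needed.

```r
prices <- c(100, 110, 99, 120)   # made-up prices

log_returns <- diff(log(prices))                           # what diff(tesla_log) computes
simple_returns <- prices[-1] / prices[-length(prices)] - 1

all.equal(log_returns, log(1 + simple_returns))  # log return = log(1 + simple return)
length(log_returns)                              # one fewer than length(prices)
```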

# Test if the data is stationary

library(tseries)

tesla_log_test <- adf.test(tesla_log,k=0)

We received a p-value of 0.99, so we cannot reject the null hypothesis that the series has a unit root. In other words, the log series of Tesla is not stationary.

tesla_diff_test <- adf.test(tesla_diff,k=0)

We received a p-value of 0.01, along with a warning that the true p-value is smaller than the printed one, meaning that after differencing the series is stationary. The argument k is the number of lags used in the test; with k=0 it is the plain (non-augmented) Dickey-Fuller test. You can absolutely try different k values to see the effect. For this dataset it makes no difference: the p-value is still 0.01.
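To illustrate the role of k, here is a sketch on a simulated random walk of the same length as the Tesla series (the data is an assumption for illustration). When k is not given, adf.test() uses trunc((length(x) - 1)^(1/3)) lags, which is 11 for 1510 observations. The warnings about p-values outside the printed range are suppressed here.

```r
library(tseries)

set.seed(168)
walk <- cumsum(rnorm(1510))   # simulated random walk, same length as the Tesla series
steps <- diff(walk)           # its first difference

# Lag order adf.test() falls back to when k is not supplied
k_default <- trunc((length(walk) - 1)^(1/3))

p_walk    <- suppressWarnings(adf.test(walk, k = 0)$p.value)   # level series
p_steps   <- suppressWarnings(adf.test(steps, k = 0)$p.value)  # differenced, k = 0
p_default <- suppressWarnings(adf.test(steps)$p.value)         # differenced, default k

c(k_default = k_default, p_walk = p_walk, p_steps = p_steps, p_default = p_default)
```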

# Let's recheck the ACF and PACF with the differenced series

par(mfrow=c(2,1))

acf(tesla_diff)

pacf(tesla_diff)

The ACF shows a sharp drop after lag 1 followed by a damped wave pattern, while the PACF decays geometrically. This suggests that an MA(q) model might fit the series. Finding the best ARIMA model by hand takes extensive knowledge, though, and unfortunately I am not that kind of expert. Fortunately, some handy functions can help: auto.arima() searches for the best values of p, d, and q.

library(forecast)

set.seed(168)

arima.model <- auto.arima(tesla_close, ic = c("aicc", "aic", "bic"),
                          seasonal = FALSE, trace = TRUE, test = "kpss")

auto.arima() chose ARIMA(0,2,2) for us. Let's diagnose the model.

checkresiduals(arima.model,lag=302)

tsdiag(arima.model)

One lag in the ACF plot is significant, but the rest look fine.

autoplot(arima.model)

This shows us that the model is not that stable, because one of the inverse roots (the dots) sits right on the unit circle.

arima.residuals <-arima.model$residuals

ggtsdisplay(arima.residuals ,main="Tesla ARIMA with (0,2,2) Residuals")

This tells us again that the residuals show volatility.

From all the plots above, we can see that the model is not doing great: the ACF still shows correlations, meaning the residuals are not white noise. Also, although we do not see a clear pattern in the residual plot, there is still some volatility that the model does not capture. The error terms do, however, look roughly normally distributed. The p-value from the Ljung-Box test is very small, which tells us that the model's residuals are not independent and are autocorrelated. Together with the changing spread of the residuals, this suggests that the suggested ARIMA model has a heteroscedasticity problem. We need to try another model later on, but let's still forecast with this one.

# Use (0,2,2) to fit to the ARIMA model

arima.fit <- arima(tesla_close, order = c(0,2,2))

summary(arima.fit)

We usually use RMSE, MAE, and MAPE to compare models: the lower these values, the better the model. However, since we are not going to use the ARIMA model for the final forecast, we do not need to worry about them here.
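In case these three measures are unfamiliar, here is a minimal sketch of how they are computed. The actual and fitted values are made up for illustration and have nothing to do with the Tesla fit.

```r
actual      <- c(100, 102, 101, 105)   # made-up observed values
fitted_vals <- c(99, 103, 100, 107)    # made-up model values

err <- actual - fitted_vals

rmse <- sqrt(mean(err^2))               # root mean squared error
mae  <- mean(abs(err))                  # mean absolute error
mape <- mean(abs(err / actual)) * 100   # mean absolute percentage error, in %

c(RMSE = rmse, MAE = mae, MAPE = mape)
```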

# Forecasting with ARIMA(0,2,2) for the next 20 days of stock price

arima.forecast <- forecast(arima.fit, h = 20) # separate name, so the forecast() function is not shadowed

plot(arima.forecast)

As mentioned before, the ARIMA model is not the best fit for this situation. We will use a GARCH model to handle the volatility of this dataset. Before fitting a GARCH model, we need to check whether there is an ARCH effect; if there is, we can go ahead with GARCH.

library(FinTS)

ArchTest(arima.residuals)

The p-value is very small, so we reject the null hypothesis that there are no ARCH effects and conclude that ARCH effects are present.

Since the ARIMA model is not ideal because of the changing variability, and ARCH effects are present, let's try a GARCH model, which models the conditional variance.
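The idea behind this kind of test can be sketched by hand: regress the squared residuals on their own lags and compare the number of observations times R-squared with a chi-squared distribution. The helper arch_lm() and the simulated ARCH(1) series below are my own assumptions for illustration; this is not the FinTS implementation.

```r
set.seed(7)
n <- 1000
omega <- 0.2
alpha <- 0.5

# Simulate an ARCH(1) series: today's variance depends on yesterday's squared shock
e <- numeric(n)
e[1] <- rnorm(1, sd = sqrt(omega / (1 - alpha)))
for (i in 2:n) {
  e[i] <- rnorm(1) * sqrt(omega + alpha * e[i - 1]^2)
}

# Hypothetical helper: LM-type test for ARCH effects
arch_lm <- function(x, lags = 12) {
  sq <- embed(x^2, lags + 1)      # column 1 = current value, remaining columns = lags
  fit <- lm(sq[, 1] ~ sq[, -1])   # regress squared values on their lags
  stat <- summary(fit)$r.squared * nrow(sq)
  p <- pchisq(stat, df = lags, lower.tail = FALSE)
  c(statistic = stat, p.value = p)
}

res <- arch_lm(e)
res
```

A small p-value here means the squared residuals are predictable from their own past, which is exactly what a GARCH model tries to capture.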

library(rugarch)

tesla.spec <- ugarchspec(mean.model = list(armaOrder = c(0, 0)),
                         variance.model = list(model = "sGARCH"),
                         distribution.model = "std")

# Estimate the model

garch.fit <- ugarchfit(data=tesla_close,spec=tesla.spec)

garch.fit

The garch.fit object contains a lot of information. The next plots show only part of it. We usually look at the Information Criteria for model selection: the lower the values, the better.

Let's try another set of order of the mean.model.

tesla.spec1 <- ugarchspec(mean.model = list(armaOrder = c(2, 1)),
                          variance.model = list(model = "sGARCH",
                                                garchOrder = c(1, 1)),
                          distribution.model = "std")

# estimate the model

garch.fit1 <- ugarchfit(data=tesla_close,spec=tesla.spec1)

garch.fit1

This model gives lower Information Criteria values. It is also statistically significant: in the Optimal Parameters and Robust Standard Errors sections, the p-values of the parameters are below 5%. Therefore, I trust this model, and we will use it to forecast. Feel free to try other parameter combinations.

With this model, we can observe the periods of high and low volatility. I did not put dates on the x-axis, but it is easy to get a sense of when the high-volatility periods occurred by checking back against the original price plot. The high-volatility period is 2020; before 2020, volatility was much lower and the series looks almost flat.

With this model, we can also compute the conditional variance for a given trading day. For example, I want to know the volatility for 2020-12-30, given the prices of 2020-12-29.

Formula: variance(t) = omega + alpha1 * (yesterday's residual)^2 + beta1 * (yesterday's variance). Strictly speaking, yesterday's variance is the conditional variance the model produced for the previous day; here I approximate it with the difference between the highest and lowest prices of that trading day.

0.06974 + 0.12062 * 0.5408994^2 + 0.87838 * (696.60 - 668.36) = 24.91048

Therefore, the conditional variance of the stock price for 2020-12-30 is about 24.91, which corresponds to a volatility (standard deviation) of about 4.99.
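The same arithmetic can be written as a small R function. This is only a sketch: garch11_next_var() is a hypothetical helper of my own, and prev_var reuses the high-minus-low stand-in from the calculation above. In a fitted model, the previous conditional variance would normally come from the model itself (for example from sigma(garch.fit1)^2).

```r
# Hypothetical helper: one-step GARCH(1,1) variance update
garch11_next_var <- function(omega, alpha1, beta1, prev_resid, prev_var) {
  omega + alpha1 * prev_resid^2 + beta1 * prev_var
}

next_var <- garch11_next_var(
  omega = 0.06974,
  alpha1 = 0.12062,
  beta1 = 0.87838,
  prev_resid = 0.5408994,
  prev_var = 696.60 - 668.36   # stand-in value used in the post, not a model output
)

next_var         # conditional variance
sqrt(next_var)   # volatility (standard deviation)
```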

Let's do the forecast for the next 20 days in 2021.

forecast2 <- ugarchboot(garch.fit1,n.ahead=20,method="Partial")

plot(forecast2,which=2) # Series Standard Error Plots

forecast2

Let's plot the fitted values to compare with the actual values from 2015 to 2020.

The red line represents the actual stock price, while the light blue line represents the predicted price. We can see that the predicted price is very close to actual price by using this GARCH model.

Let's take a closer look at 2020 only, since 2020 was a crazy year. We should see more variance in this year's forecast.

Let's look at the model's forecast error over the last 30 days. The error fluctuates within a certain range.

Let's plot the forecasting from the GARCH model for the next 20 days of Tesla stock price in 2021.

From the output and graph, we can see optimistic and pessimistic values for the stock price. The dark blue dots at the top of the graph represent the optimistic values, while the lighter blue dots at the bottom represent the pessimistic values. In the optimistic scenario, the stock price could soar to around $900 by day 20, while in the pessimistic scenario it could fall to around $600 by day 20. I believe the model can be improved to narrow the error and variance further. If you know how, please let me know. We should also backtest the model, but since I am doing this for fun, I will skip that part here. I have heard that neural network models can do a better job at stock forecasting, but that is beyond my ability at the moment.
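For anyone who wants to try the backtesting step, one common approach is a rolling-origin evaluation: refit the model on the data up to each day, forecast one step ahead, and compare with what actually happened. The sketch below is an assumption-heavy illustration on a simulated random walk with a simple ARIMA(0,1,1), not the GARCH model or the Tesla data from this post.

```r
set.seed(2021)
y <- cumsum(rnorm(120))   # simulated series, for illustration only
n_test <- 20              # number of hold-out days to forecast one at a time

errors <- numeric(n_test)
for (i in seq_len(n_test)) {
  train_end <- length(y) - n_test + i - 1            # last day used for fitting
  fit <- arima(y[1:train_end], order = c(0, 1, 1))   # refit on data up to that day
  pred <- predict(fit, n.ahead = 1)$pred[1]          # one-step-ahead forecast
  errors[i] <- y[train_end + 1] - pred               # actual minus forecast
}

backtest_rmse <- sqrt(mean(errors^2))
backtest_rmse
```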

Last thought

On December 31, 2020, Tesla's stock closed at $705. After an insane 2020 for Tesla, many professional analysts predict that 2021 will be another great year for the company, whereas many other people think the stock could be in a bubble. What do you think?
