VaR Estimation for Crude Oil Data via Different Approaches: Historical Simulations, EVT Model, and ACER Method
Abstract
This thesis implements several approaches to predicting the one-day-ahead Value at Risk (VaR) of crude oil returns. The Historical Simulation (HS) approach, a non-parametric model, randomly resamples past observations with replacement to estimate the next-day quantile. The Filtered HS (FHS) approach, a semi-parametric model, uses the same methodology but also attempts to capture the volatility dynamics. The Conditional Extreme Value Theory (EVT) approach, a parametric model based on the asymptotic behavior of the tail data, combines the Peaks-Over-Threshold (POT) method with a conditional variance model: the former extracts the extreme data and the latter estimates the conditional error variance, and together they yield the VaR of the next day. The Average Conditional Exceedance Rate (ACER) method, a parametric model targeting sub-asymptotic tail data, takes the statistical dependence between data points into account in order to predict the extreme value distribution, and hence the next day’s VaR, more accurately.
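As a rough illustration of the HS idea described above, the following Python sketch resamples past returns with replacement and averages the empirical lower-tail quantile over the resamples. It is only a minimal sketch under stated assumptions, not the thesis's implementation: the function names `empirical_quantile` and `hs_var`, the quantile convention, the number of resamples, and the synthetic Gaussian returns are all hypothetical choices made for the example. VaR is expressed here as a return quantile (a negative number), so a violation would be an actual return below it.

```python
import math
import random


def empirical_quantile(values, alpha):
    """Lower-tail empirical quantile: the ceil(alpha * n)-th smallest value."""
    ordered = sorted(values)
    k = max(1, math.ceil(alpha * len(ordered)))
    return ordered[k - 1]


def hs_var(returns, alpha=0.05, n_boot=500, seed=0):
    """Average alpha-quantile over bootstrap resamples of past returns.

    Assumption: VaR is reported as a return quantile, not as a positive loss.
    """
    rng = random.Random(seed)
    n = len(returns)
    total = 0.0
    for _ in range(n_boot):
        resample = rng.choices(returns, k=n)  # draw n values with replacement
        total += empirical_quantile(resample, alpha)
    return total / n_boot


# Synthetic stand-in for daily returns (hypothetical data, not the crude oil series).
demo_rng = random.Random(42)
demo_returns = [demo_rng.gauss(0.0, 0.02) for _ in range(1000)]
print(hs_var(demo_returns, alpha=0.05))
```

Because plain HS treats past returns as identically distributed, a sketch like this ignores volatility clustering, which is the gap the filtered and conditional approaches aim to close.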
The data, continuous futures contracts for NYMEX WTI crude oil from April 1985 to December 2015, were retrieved from the Quandl database. By dividing the data set into in-sample and out-of-sample periods, we evaluate the VaR estimates from the above approaches and identify VaR violations by comparing each estimate with the actual return of the next day. From these VaR violations, we backtest the VaR estimates via three tests. First, the unconditional coverage test checks whether the proportion of violations is statistically different from the predetermined VaR probability level. Second, the independence test checks whether the violations cluster in time. The final test, a combination of the two previous tests, checks both the accuracy and the independence of the results.
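The three backtests described above can be sketched as likelihood-ratio tests on a 0/1 violation sequence. The Python code below assumes the standard forms commonly attributed to Kupiec (unconditional coverage) and Christoffersen (independence and the combined test); the abstract does not name the exact variants used, so this is an assumption. The function names are hypothetical, and the combined statistic is taken as the simple sum of the two components, a common approximation.

```python
import math


def _xlogy(x, y):
    """x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def chi2_sf(x, dof):
    """Chi-square survival function, closed forms for 1 or 2 degrees of freedom."""
    x = max(0.0, x)  # guard against tiny negative values from rounding
    if dof == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if dof == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only dof 1 or 2 are supported in this sketch")


def violations(returns, var_forecasts):
    """1 where the actual return falls below the VaR forecast, else 0."""
    return [1 if r < v else 0 for r, v in zip(returns, var_forecasts)]


def lr_unconditional(hits, p):
    """LR statistic: is the violation rate different from p?"""
    n, x = len(hits), sum(hits)
    rate = x / n
    ll_null = _xlogy(n - x, 1 - p) + _xlogy(x, p)
    ll_alt = _xlogy(n - x, 1 - rate) + _xlogy(x, rate)
    return -2.0 * (ll_null - ll_alt)


def lr_independence(hits):
    """LR statistic: first-order Markov alternative against independence."""
    n00 = n01 = n10 = n11 = 0
    for prev, cur in zip(hits, hits[1:]):
        if prev == 0 and cur == 0:
            n00 += 1
        elif prev == 0 and cur == 1:
            n01 += 1
        elif prev == 1 and cur == 0:
            n10 += 1
        else:
            n11 += 1
    pi01 = n01 / (n00 + n01) if (n00 + n01) else 0.0
    pi11 = n11 / (n10 + n11) if (n10 + n11) else 0.0
    pi = (n01 + n11) / (n00 + n01 + n10 + n11)
    ll_null = _xlogy(n00 + n10, 1 - pi) + _xlogy(n01 + n11, pi)
    ll_alt = (_xlogy(n00, 1 - pi01) + _xlogy(n01, pi01)
              + _xlogy(n10, 1 - pi11) + _xlogy(n11, pi11))
    return -2.0 * (ll_null - ll_alt)


def backtest(hits, p):
    """p-values of the three tests (chi-square with 1, 1 and 2 dof)."""
    lr_uc = lr_unconditional(hits, p)
    lr_ind = lr_independence(hits)
    return {
        "unconditional": chi2_sf(lr_uc, 1),
        "independence": chi2_sf(lr_ind, 1),
        "combined": chi2_sf(lr_uc + lr_ind, 2),
    }
```

A small p-value in the first test suggests the violation rate differs from the target level, while a small p-value in the second suggests that violations cluster. The combined test can reject for either reason.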
The thesis concludes that the conditional EVT approach performs best among the tested approaches. We also learn that the approaches capturing the heteroscedastic features in the data generally perform better.