A deeper look at fat tails - 15 years of NSE Nifty EDA in Python

So the high priest of fat-tailed events and Black Swans, the president of the Republic of Extremistan (Nassim Nicholas Taleb), has started a YouTube channel on some common topics in probability and statistics, which you should watch if you're interested in financial markets. These topics have a lot in common with machine learning, so watch it if you're a data scientist too.

His video on fat-tailed events got me thinking about whether large swings of 200 - 500 points or so in equity markets are fat-tailed events. So I decided to explore this using the past 15 years of data for the NSE Nifty. All source code for the analysis below is on GitHub.

Data Download

While you can get the data from the NSE Historical data section, obtaining it is a pain since the site limits search requests to a 365 day period. I discovered a Python package, NSEPy, that allowed me to download historical data. You can get it from here.

I downloaded data for the past 15 years, from Jan 2005 to December 2020, saved it as a CSV file and loaded it into a separate Jupyter notebook for further analysis.
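A minimal sketch of the download step, assuming NSEPy's `get_history` function with `index=True` for index data. The helper name `download_nifty` and the output filename are placeholders of mine, not taken from the original notebook, and since the package scrapes the NSE website it may need adjusting if that site changes.

```python
from datetime import date


def download_nifty(start=date(2005, 1, 1), end=date(2020, 12, 31),
                   path="nifty_2005_2020.csv"):
    """Download daily Nifty index data and save it as a CSV (hypothetical helper)."""
    # Imported lazily so this module can be loaded without NSEPy installed.
    from nsepy import get_history

    # index=True tells NSEPy that the symbol is an index rather than a stock.
    df = get_history(symbol="NIFTY", start=start, end=end, index=True)
    df.to_csv(path)
    return df
```

Calling `download_nifty()` needs network access; the saved CSV can then be read back with `pandas.read_csv(path, parse_dates=["Date"], index_col="Date")`, assuming the date column is named `Date`.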

This is how the data looks after some basic preprocessing and calculating the cumulative daily returns of Rs 100 invested at the beginning of the period. Pandas functions on shifting values, percentage change and cumulative product have made this a simple task.
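The three pandas operations mentioned above can be sketched as follows. This is an illustration on made-up prices, assuming the closing price sits in a column named `Close`; the helper name `add_cumulative_value` is mine.

```python
import pandas as pd


def add_cumulative_value(df, close_col="Close", start_value=100.0):
    """Add previous close, daily return and growth of start_value to a copy of df."""
    out = df.copy()
    out["prev_close"] = out[close_col].shift(1)        # yesterday's close
    out["daily_return"] = out[close_col].pct_change()  # (close / prev_close) - 1
    # The first day's return is NaN; treat it as 0 so the series starts at start_value.
    out["value"] = start_value * (1 + out["daily_return"].fillna(0)).cumprod()
    return out


prices = pd.DataFrame({"Close": [100.0, 110.0, 99.0, 118.8]})
result = add_cumulative_value(prices)
print(result)
```

The `value` column tracks what Rs 100 invested on the first day would be worth on each later day.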


Simple index investing - ignore fat tails

If we take the view that volatility is something that sorts itself out over time, we find that Rs 100 invested has grown nearly 7X to Rs 694, which is a CAGR of roughly 14% over 15 years. Pretty good, I say. Code and results below.
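The standard compound annual growth rate formula, (end / start)^(1 / years) - 1, is a quick sanity check on a figure like this. A minimal sketch, taking the Rs 100 and Rs 694 values from the text and assuming a 15 year holding period:

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a fraction (0.10 means 10% a year)."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


growth = cagr(100.0, 694.0, 15)
print(f"CAGR: {growth:.1%}")
```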



The problem with these long-term return projections is that your returns are extremely sensitive to the time of entry and exit. Say you had invested at the peak of the market in 2008, when the index was at 6000 levels, and had to endure the pain of the subsequent crash. Assume that you had not sworn off equity and kept believing in long-term investing for the next 12 years, only to see the market crash to 7000 levels during the COVID crisis in March 2020. Getting spooked and redeeming in March would have left you with annualized returns of ~2.5%, much worse than a fixed deposit in a bank!

What this means - Volatility in the markets cannot be ignored.


How volatile are markets really?

Standard deviation (SD) is the simplest and most commonly (mis)taught measure of volatility in the world. Plugging this video by Taleb again. Simply calculating the SD of the closing value as below over a 15 year period is misleading: the index has trended upward over those 15 years, so there is no stable mean value from which to measure deviation.

A better way is to look at the change in closing values. We look at changes in closing values over 1day, 2day and 3day periods and plot them both over time and by frequency (histogram). We use 1day, 2day and 3day periods because extreme events often play out over a few days.
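The 1day, 2day and 3day changes can be computed with `Series.diff`. A small sketch on made-up closing values; the helper name `n_day_changes` and the column names are illustrative.

```python
import pandas as pd


def n_day_changes(close, periods=(1, 2, 3)):
    """Point change in the close over each look-back period, one column per period."""
    return pd.DataFrame({f"chg_{n}d": close.diff(n) for n in periods})


close = pd.Series([100.0, 102.0, 99.0, 105.0, 104.0])
changes = n_day_changes(close)
print(changes)
```

Each column starts with as many missing values as its look-back period, since there is no earlier close to compare against. A histogram of any column can then be drawn with `changes["chg_1d"].hist(bins=100)` if matplotlib is installed.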

We see the volatility in the daily changes and how closely their histograms resemble a normal distribution. To get a true sense of how volatile these changes are, we compare them with the values expected from a normal distribution.

The 68-95-99.7 rule gives the percentage of values of a normal distribution that lie within 1, 2 and 3 standard deviations of the mean respectively. This article has a good explanation of the rationale behind this if you are more mathematically inclined.
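The rule translates directly into an expected number of days beyond each threshold. A minimal sketch using only the standard library, with the two-sided normal tail probability written as `erfc(k / sqrt(2))` and the 4030 trading days analysed in this post as the sample size:

```python
import math


def expected_days_beyond(k_sd, n_days):
    """Expected number of days more than k_sd standard deviations from the mean."""
    tail_prob = math.erfc(k_sd / math.sqrt(2))  # P(|Z| > k) for a standard normal
    return tail_prob * n_days


for k in (1, 2, 3):
    print(k, round(expected_days_beyond(k, 4030), 1))
```

The exact tail probabilities are slightly smaller than the rounded 32%, 5% and 0.3% implied by the rule, so this gives about 1279, 183 and 11 days, a little below counts derived from the rounded percentages.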

Below, we compare the actual number of days on which the change in the Nifty close exceeded 1, 2 and 3 standard deviations with the number of days expected under a normal distribution.


Each column above is the number of days for a normal distribution and for 1day, 2day and 3day changes, while each row is the number of standard deviations.

We expect 1290 days (1 - 68% = 32% of 4030 days) to exceed 1 standard deviation (1SD) of the Nifty daily changes, while the actual values are 876, 906 and 927 days for 1day, 2day and 3day changes respectively.

What is really surprising is that while a normal distribution tells us to expect only 12 days over a 15 year (4030 days) period to be more than 3SD from the mean, the actual values are in the range of 50 days! This is roughly 4 times the expected number. While this difference is large in itself, it does not show up on the plot above.
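Counting the actual days beyond each threshold can be sketched as below. This assumes the thresholds are measured from the sample mean using the sample SD of the whole period (pandas' default `ddof=1`); the helper name `days_beyond_sd` and the toy data are mine.

```python
import pandas as pd


def days_beyond_sd(changes, max_k=3):
    """Count observations more than k sample SDs from the sample mean, for k = 1..max_k."""
    changes = changes.dropna()
    z = (changes - changes.mean()) / changes.std()
    return {k: int((z.abs() > k).sum()) for k in range(1, max_k + 1)}


# Toy series: 100 quiet days of +/-1 point and a single 20 point jump.
toy = pd.Series([1.0, -1.0] * 50 + [20.0])
counts = days_beyond_sd(toy)
print(counts)
```

In the toy series only the 20 point jump lies beyond each threshold, so every count is 1.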


To the skeptic in me, this seems too good to be true. I suspect it is still down to the fact that the Nifty has gone up 7 times in the last 15 years, and the mean of such a wide range is probably affecting the standard deviation calculation. Another question to ask is how many such days actually occur during the course of a year. Have the fat tail events increased or decreased in the past 15 years?

We repeat the calculations above for each year; calculating the mean, SD and deviations from a normal distribution for each year separately. In order to do this we write a function to calculate the values needed and run it for 1day, 2day and 3day changes separately. A sample of the results and code is shown below for 1 day changes while the notebook has the code and results for all time periods.
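A sketch of what such a per-year function might look like. It is not the notebook's code: the function name, column names and synthetic input are illustrative, and MAD is taken to be the mean absolute deviation from the mean.

```python
import numpy as np
import pandas as pd


def yearly_tail_stats(changes):
    """Per-year mean, SD, SD/MAD ratio and counts of days beyond 1, 2 and 3 SD.

    `changes` is a Series of n-day changes indexed by a DatetimeIndex.
    """
    changes = changes.dropna()
    rows = []
    for year, grp in changes.groupby(changes.index.year):
        mean, sd = grp.mean(), grp.std()
        dev = (grp - mean).abs()  # absolute deviation from that year's mean
        rows.append({
            "year": year,
            "mean": mean,
            "sd": sd,
            "sd_mad_ratio": sd / dev.mean(),
            "days_gt_1sd": int((dev > 1 * sd).sum()),
            "days_gt_2sd": int((dev > 2 * sd).sum()),
            "days_gt_3sd": int((dev > 3 * sd).sum()),
        })
    return pd.DataFrame(rows).set_index("year")


# Synthetic normally distributed changes on business days, for illustration only.
rng = np.random.default_rng(0)
idx = pd.bdate_range("2019-01-01", "2020-12-31")
demo = pd.Series(rng.normal(0, 50, len(idx)), index=idx)
stats = yearly_tail_stats(demo)
print(stats)
```

Because the mean and SD are recomputed inside each year, a rising index level in later years does not inflate the thresholds used for earlier years.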



Computed year by year, the SD of 1day changes is 47 points in 2005 and rises to 318 points by 2020. This confirms our earlier suspicion that the 7X increase in the Nifty affects the mean and SD calculation.

We see that the number of days over 1SD, 2SD and 3SD is more or less constant from year to year. The SD-MAD ratio is slightly higher than its expected value of 1.25 for a normal distribution, which indicates that the distribution of daily changes is not normal. If you've seen the Taleb video above then this would be clear to you.
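The benchmark of 1.25 is the value sqrt(pi / 2), about 1.2533, that the SD to mean absolute deviation ratio takes for a normal distribution; fatter tails push the ratio higher. A small simulation sketch, assuming MAD means mean absolute deviation and using a Student t distribution with 5 degrees of freedom purely as an example of a fat-tailed distribution:

```python
import numpy as np


def sd_mad_ratio(x):
    """Ratio of sample standard deviation to mean absolute deviation from the mean."""
    x = np.asarray(x, dtype=float)
    sd = x.std(ddof=1)
    mad = np.abs(x - x.mean()).mean()
    return sd / mad


rng = np.random.default_rng(42)
normal_ratio = sd_mad_ratio(rng.normal(size=200_000))
fat_ratio = sd_mad_ratio(rng.standard_t(df=5, size=200_000))
print(round(normal_ratio, 3), round(fat_ratio, 3))
```

The normal sample lands close to 1.25, while the fat-tailed sample comes out noticeably higher.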

Let's look at how the number of days exceeding 1SD, 2SD and 3SD, as well as the SD-MAD ratio, vary over the 15 year period.
We see from the chart above that the number of days the change exceeds a particular SD is similar for 1day, 2day and 3day changes. This would mean that while sudden moves play out over 2-3 days, a large part of the move happens within a day or so, which does not leave one with much time to react.

The SD-MAD ratio ranges between 1.25 and 1.52 with the value for the 1day change trending higher than the rest.

Given that the values in the above plots are pretty close to each other, we take an average and compare it with the values expected in a normal distribution.


Every year we have on average 2 days when the change exceeds 3SD of the mean change in the Nifty, compared to the 0.75 days per year expected under a normal distribution. This confirms non-normality in the returns of the index, since we have now worked with changes in values rather than levels and computed the mean and SD separately for each year!

Non-normality - Positive or Negative?

We typically associate black swan events with market crashes and black days. As researchers and truth seekers we do not go by conventional wisdom but let the data speak for itself. Let's compare the number of positive and negative 1day changes that exceed 2SD.
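One way to split the 2SD days by sign is sketched below, assuming the mean and SD are computed within each calendar year as in the yearly analysis. The helper name `signed_tail_counts` and the toy data are illustrative.

```python
import pandas as pd


def signed_tail_counts(changes, k=2):
    """Per-year counts of days more than k SD above and below that year's mean."""
    changes = changes.dropna()
    rows = []
    for year, grp in changes.groupby(changes.index.year):
        z = (grp - grp.mean()) / grp.std()
        rows.append({
            "year": year,
            "positive": int((z > k).sum()),
            "negative": int((z < -k).sum()),
        })
    return pd.DataFrame(rows).set_index("year")


# Toy data: 40 quiet business days in 2020 plus one large up move.
idx = pd.bdate_range("2020-01-01", periods=41)
toy = pd.Series([1.0, -1.0] * 20 + [15.0], index=idx)
tail_counts = signed_tail_counts(toy)
print(tail_counts)
```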


In any given year the number of positive events is much higher than the number of negative events! This means there is room for downside protection as well as upside capture in the fat tails.

Upside Capture & Downside protection

So now we know that financial markets have fat tails on the left side and on the right side. Let's play "What If?" 

What if we were to capture part of the upside, say 50%, or protect ourselves from 50% of the downside, or, in an ideal world, do both?

What would our 100 Rs invested 15 years ago look like today in each of the above scenarios?
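The text does not spell out how these scenarios are modelled, so the sketch below rests on assumptions of mine: tail days are those where the daily percentage return is more than 2SD from the mean over the whole sample, "capturing 50% of the upside" means earning an extra 50% of the return on positive tail days, and "protecting 50% of the downside" means taking only half the loss on negative tail days. Costs of achieving this are ignored.

```python
import pandas as pd


def what_if_values(returns, k=2, capture=0.5, protect=0.5, start_value=100.0):
    """Final value of start_value under baseline, upside, downside and combined scenarios."""
    returns = returns.dropna()
    z = (returns - returns.mean()) / returns.std()
    up, down = z > k, z < -k  # positive and negative tail days

    upside = returns.where(~up, returns * (1 + capture))      # boost gains on up-tail days
    downside = returns.where(~down, returns * (1 - protect))  # halve losses on down-tail days
    both = upside.where(~down, returns * (1 - protect))       # apply both adjustments

    scenarios = {"baseline": returns, "upside": upside,
                 "downside": downside, "both": both}
    return {name: start_value * float((1 + r).prod())
            for name, r in scenarios.items()}


# Toy returns: 40 quiet days of +/-1% plus one +10% day and one -10% day.
toy_returns = pd.Series([0.01, -0.01] * 20 + [0.10, -0.10])
vals = what_if_values(toy_returns)
print(vals)
```

Because the adjusted returns are compounded through the product, any extra gain or avoided loss on a tail day carries forward into every later day.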

Significantly higher, as we see below, though the true benefit comes from compounding the monies earned over a long timeframe.

It is important to protect the downside and capture additional alpha from the upside but that itself will not yield earth shattering returns till you are able to re-invest the additional earnings back into the markets for compounding returns.


How Real is this?

We've seen a fair amount of analysis on the presence of fat tails, the impact of black swan events and the benefits of capturing upside as well as downside protection. 

Being able to do this in practice is a different matter altogether, since in the real world we need to account for human emotion, market timing, transaction costs and the discipline to stay invested rather than book profits early.

There are option premiums, slippages and liquidity during extreme events that we need to allow for. We will explore these points in future posts. 

All the code for the analysis is in this GitHub page.

Connect with me on LinkedIn, Twitter or Medium to stay updated. That's all folks!

Note: All content is for research purposes and is not an investment recommendation. You can assume that I invest my personal funds based on this research and that I can change my views anytime. I recommend that you consult a registered financial advisor before making investment decisions.
