May 02, 2018

So I built a Deep Neural Network to predict the price of Bitcoin — and it’s astonishingly accurate.

Curious?

See the prediction results for yourself.

Looks pretty accurate, doesn’t it?

And before you ask: Yes, the above evaluation was performed on unseen test data — only prior data was used to train the model (more details later).

**So this is a money-making machine I can use to get rich!**

Right?

In fact, I am giving you the code for the above model so that you can use it yourself…

Well, *don’t do it!* Do not use it for trading.

Don’t be fooled.

There is something utterly deceptive about these results.

Let me explain.

During the last couple of weeks and months I’ve encountered many articles that take a similar approach to the one presented here and that show graphs of cryptocurrency price predictions that look like the one above.

The seemingly stunning accuracy of price predictions should immediately set off alarm bells.

*These results are obviously too good to be true.*

When something looks too good to be true, it usually is. — Emmy Rossum

In the following, I want to demonstrate why this is the case.

Don’t get me wrong — my intention is not to undermine the work put into those articles. They are good and deserve the claps they received. In fact, many of those approaches are very accurate — technically speaking.

The goal of this article is to show why those models are misleading in practice and why their predictions are not necessarily suitable for actual trading.

So why exactly is this the case? Let’s take a close look.

To explain, let me walk you through an example of building a multidimensional Long Short-Term Memory (LSTM) neural network to predict the price of Bitcoin, which yields the prediction results you saw above.

LSTMs are a special kind of Recurrent Neural Network (RNN) that is particularly suitable for time series problems. Hence, they have become popular for forecasting cryptocurrency prices as well as stock markets.

*For in-depth introductions to LSTMs I recommend this and this article.*

For the present implementation of the LSTM, I used Python and Keras. *(You can find the corresponding Jupyter Notebook with the complete code on my Github.)*

First, I fetched historic Bitcoin price data (you can do this for any other cryptocurrency as well). To do so I used the API from cryptocompare:

*A snapshot of historic Bitcoin price data.*

Voilà, historic daily BTC data for the last *2000* days, from *2012-10-10* until *2018-04-04*.

Then, I split the data into a *training* and a *test* set. I used the last 10% of the data for testing, which puts the split at *2017-09-14*. All data before this date was used for training, and all data from this date on was used to test the trained model. Below, I plotted the `close` column of our DataFrame, which is the daily closing price I intended to predict.
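For readers who want to reproduce this step, here is a minimal sketch of how such a download could look. The endpoint URL, the query parameters (`fsym`, `tsym`, `limit`) and the `Data` key in the response are assumptions about CryptoCompare's public API and may have changed. The helper names `payload_to_frame` and `fetch_daily_prices` are hypothetical, and the sample values are made up purely for illustration.

```python
import pandas as pd

# Assumed endpoint; check the CryptoCompare documentation before relying on it.
HISTODAY_URL = "https://min-api.cryptocompare.com/data/histoday"


def payload_to_frame(payload: dict) -> pd.DataFrame:
    """Turn a histoday-style JSON payload into a DataFrame indexed by date."""
    frame = pd.DataFrame(payload["Data"])
    frame["time"] = pd.to_datetime(frame["time"], unit="s")
    return frame.set_index("time").sort_index()


def fetch_daily_prices(symbol: str = "BTC", currency: str = "USD",
                       days: int = 2000) -> pd.DataFrame:
    """Download daily candles. This performs a network call when invoked."""
    import requests  # imported lazily so the parsing helper works without it

    params = {"fsym": symbol, "tsym": currency, "limit": days}
    response = requests.get(HISTODAY_URL, params=params, timeout=30)
    response.raise_for_status()
    return payload_to_frame(response.json())


# Offline example with a tiny hand-made payload (the prices are invented).
sample = {"Data": [
    {"time": 1522800000, "close": 6800.0, "high": 7500.0, "low": 6700.0, "open": 7400.0},
    {"time": 1522713600, "close": 7400.0, "high": 7500.0, "low": 7000.0, "open": 7050.0},
]}
hist = payload_to_frame(sample)
```

Keeping the parsing separate from the download makes it possible to try the rest of the pipeline on a saved response without hitting the API again.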

*Train-test split of historic Bitcoin price data.*
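A chronological split like the one described above could be sketched as follows. The function name `train_test_split` and the toy data are illustrative assumptions. The important point is that the rows are not shuffled, so the test set only contains dates after the training set.

```python
import pandas as pd


def train_test_split(df: pd.DataFrame, test_size: float = 0.1):
    """Split chronologically: the last `test_size` fraction becomes the test set."""
    split_row = len(df) - int(test_size * len(df))
    return df.iloc[:split_row], df.iloc[split_row:]


# Toy data: 20 consecutive days with an invented "close" column.
dates = pd.date_range("2018-01-01", periods=20, freq="D")
prices = pd.DataFrame({"close": range(20)}, index=dates)
train, test = train_test_split(prices, test_size=0.1)
```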

For training the LSTM, the data was split into windows of 7 days (this number is arbitrary, I simply chose a week here) and within each window I normalised the data to *zero base*, i.e. the first entry of each window is 0 and all other values represent the change with respect to the first value. Hence, I am predicting price *changes*, rather than absolute price.
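The windowing and zero-base normalisation can be sketched roughly like this. I am assuming here that "change with respect to the first value" means the relative change `value / first - 1`, and that the target is the close of the day right after each window, normalised by the first close of that window. The function names are hypothetical.

```python
import numpy as np


def extract_windows(values: np.ndarray, window_len: int = 7,
                    zero_base: bool = True) -> np.ndarray:
    """Slide a window over `values` (shape: days x features)."""
    windows = []
    for start in range(len(values) - window_len):
        window = values[start:start + window_len].astype(float)
        if zero_base:
            # First row becomes 0, later rows become relative changes.
            window = window / window[0] - 1.0
        windows.append(window)
    return np.array(windows)


def extract_targets(close: np.ndarray, window_len: int = 7) -> np.ndarray:
    """Next-day close relative to the first close of the preceding window."""
    return close[window_len:] / close[:-window_len] - 1.0


# Toy example: ten days of a single feature.
close = np.arange(100.0, 110.0)
X = extract_windows(close.reshape(-1, 1), window_len=7)
y = extract_targets(close, window_len=7)
```

With ten days and a window of seven, this gives three windows of shape `(7, 1)` and three matching targets.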

I used a simple neural network with a single LSTM layer consisting of 20 neurons, a dropout rate of 0.25, and a Dense layer with a single neuron and a linear activation function. In addition, I used Mean Absolute Error (MAE) as the loss function and the Adam optimiser.

I trained the network for 50 epochs with a batch size of 4.

*Note: The choice of the network architecture and all parameters is arbitrary and I didn’t optimise any of them, as this is not the focus of this article.*
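A network matching this description could be sketched in Keras as shown here. This uses the `tensorflow.keras` API, which is an assumption about the installed version. The number of input features (6) is also an assumption, and `build_lstm_model` is a hypothetical helper name.

```python
from tensorflow import keras


def build_lstm_model(window_len: int, n_features: int,
                     neurons: int = 20, dropout: float = 0.25) -> keras.Model:
    """Single LSTM layer, dropout, and one linear output neuron."""
    model = keras.Sequential([
        keras.Input(shape=(window_len, n_features)),
        keras.layers.LSTM(neurons),
        keras.layers.Dropout(dropout),
        keras.layers.Dense(1, activation="linear"),
    ])
    # Mean Absolute Error as loss, Adam as optimiser.
    model.compile(loss="mae", optimizer="adam")
    return model


model = build_lstm_model(window_len=7, n_features=6)

# Training would then look like this, with X_train and y_train being the
# windows and targets prepared beforehand:
# model.fit(X_train, y_train, epochs=50, batch_size=4)
```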

Using the trained model to predict on the held-out test set, we obtain the graph shown at the beginning of this article.

So what exactly is wrong with these results?

Why shouldn’t we use this model for actual trading?

Let’s take a closer look and zoom into the last 30 days of the plot.

See that?

You might have already correctly guessed that the fundamental flaw with this model *is that for the prediction of a particular day, it is mostly using the value of the previous day.*

**The prediction line doesn’t seem to be much more than a shifted version of the actual price.**

In fact, if we adjust the predictions and shift them by a day, this observation becomes even more obvious.

As you can see, we suddenly observe an almost perfect match between actual data and predictions, indicating that the model is essentially learning to reproduce the price of the previous day.
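This check is easy to run on any set of predictions. The sketch below compares three errors: the predictions against the day they are meant to forecast, the predictions against the previous day's actual price, and a naive "tomorrow equals today" forecast. The data is synthetic (a random walk plus a fake model that echoes yesterday's price), and the function names are hypothetical.

```python
import numpy as np


def mean_absolute_error(actual: np.ndarray, predicted: np.ndarray) -> float:
    return float(np.mean(np.abs(actual - predicted)))


def lag_diagnostics(actual: np.ndarray, predicted: np.ndarray) -> dict:
    """Compare predictions with the actual series on the same day and one day earlier."""
    return {
        # Error against the day the prediction is meant to forecast.
        "mae_same_day": mean_absolute_error(actual[1:], predicted[1:]),
        # Error after shifting the predictions back by one day, i.e. the
        # prediction for day t compared with the actual price on day t-1.
        "mae_shifted": mean_absolute_error(actual[:-1], predicted[1:]),
        # Error of the naive forecast "tomorrow equals today".
        "mae_naive": mean_absolute_error(actual[1:], actual[:-1]),
    }


# Synthetic illustration: a random walk and a "model" that echoes yesterday.
rng = np.random.default_rng(0)
actual = 100.0 + np.cumsum(rng.normal(0.0, 1.0, size=200))
predicted = np.concatenate([[actual[0]], actual[:-1]]) + rng.normal(0.0, 0.1, size=200)
report = lag_diagnostics(actual, predicted)
```

If `mae_shifted` is much smaller than `mae_same_day`, and `mae_same_day` is about as large as `mae_naive`, the model is adding little beyond repeating the last known price.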

These results are exactly what I’ve been seeing in many of the examples using single-point predictions with LSTMs.

To make this point clearer, let’s compute the expected *returns* as predicted by the model and compare those with the actual returns.

Looking at the actual and predicted returns, both in their original form and with the *1-day shift* applied to them, we make the same observation.

*Actual and predicted returns. In the left plot predictions are adjusted by a day.*

Actually, if we compute the correlation between actual and predicted returns both for the original predictions as well as for those adjusted by a day, we can make the following observation:

As you can see from the plots above, actual and predicted returns are uncorrelated. Only after applying the *1-day shift* to the predictions do we obtain highly correlated returns that resemble the returns of the actual Bitcoin data.

The goal of this blog post was to address the many examples of cryptocurrency and stock market price predictions using deep neural networks that I have encountered in the past couple of months. These take a similar approach to the one employed here: implementing an LSTM using historic price data to predict future outcomes. I have demonstrated why these models might not necessarily be viable for actual trading.
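The correlation check can be sketched as follows. I am assuming that returns are simple day-over-day percentage changes of each series, and I use synthetic data again (a random walk and a fake model that echoes yesterday's price). The function names are hypothetical.

```python
import numpy as np


def simple_returns(prices: np.ndarray) -> np.ndarray:
    """Day-over-day relative change."""
    return prices[1:] / prices[:-1] - 1.0


def return_correlations(actual: np.ndarray, predicted: np.ndarray) -> tuple:
    """Correlation of actual and predicted returns, unshifted and shifted by a day."""
    actual_ret = simple_returns(actual)
    predicted_ret = simple_returns(predicted)
    unshifted = np.corrcoef(actual_ret, predicted_ret)[0, 1]
    # Shift predictions back by one day: the predicted return for day t is
    # compared with the actual return for day t-1.
    shifted = np.corrcoef(actual_ret[:-1], predicted_ret[1:])[0, 1]
    return float(unshifted), float(shifted)


# Synthetic illustration: a random walk and a "model" that echoes yesterday.
rng = np.random.default_rng(1)
actual = 1000.0 + np.cumsum(rng.normal(0.0, 1.0, size=300))
predicted = np.concatenate([[actual[0]], actual[:-1]]) + rng.normal(0.0, 0.1, size=300)
unshifted_corr, shifted_corr = return_correlations(actual, predicted)
```

For a model that mostly copies the previous day, the unshifted correlation stays near zero while the shifted one is close to one.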

Yes, the network is effectively able to learn. But it ends up using a strategy in which predicting a value close to the previous one turns out to be successful in terms of minimising the mean absolute error.
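One way to see why this strategy minimises the loss is the following short derivation. It rests on a simplifying assumption that is mine and not something the model knows about: that the price follows a random walk, where each daily change has median zero and is independent of the past.

```latex
% Assumption: random walk with median-zero increments
p_t = p_{t-1} + \varepsilon_t, \qquad \operatorname{median}(\varepsilon_t) = 0

% The forecast that minimises the expected absolute error is the conditional median
\hat{p}_t
  = \arg\min_{c} \; \mathbb{E}\bigl[\, |p_t - c| \;\big|\; p_{t-1} \bigr]
  = \operatorname{median}(p_t \mid p_{t-1})
  = p_{t-1} + \operatorname{median}(\varepsilon_t)
  = p_{t-1}
```

Under that assumption, the best possible MAE forecast is simply yesterday's price. The closer real price changes are to this picture, the more a network trained on MAE will be pushed toward the same answer.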

However, no matter how low the loss is, useful forecasts remain hard to achieve with single-point prediction models based on *historic price data alone*, such as the one showcased here, and in practice their results are not particularly useful for trading.

Needless to say, more sophisticated approaches to implementing useful LSTMs for price prediction potentially do exist. Using more data and optimising the network architecture and hyperparameters is a start. In my opinion, however, there is more potential in incorporating data and features that go beyond historic prices alone. After all, the finance world has long known that “past performance is not an indicator for future outcomes”.

And the same might also hold for cryptocurrencies.

*Disclaimer: This is not financial advice. The article and the presented model are for educational purposes only. Do not use it for trading or making investment decisions.*
