We will explore concepts from both worlds, data science and trading, and make our first attempt at mixing them into one.
Hi, darlings! Last week Gaurav Agrawal, who works as an editor at Coinmonks, reached out to me. He invited me to publish there and added me as a writer, so you can find my content there. Coinmonks describes itself as “A Non-profit Crypto Educational Publication,” and they’re pretty much that; there are a lot of great publishers and thousands of interesting articles. The link embedded earlier is for you to check them out (the fact that I’ve been invited has no relation to this recommendation). By the way… if you haven’t read part 1 or part 2 of this series, what are you waiting for?
Ok, ok, I’m starting. Today we begin the exploratory data analysis (EDA), an essential part of the data science process.
I personally love this one! It’s exciting to understand new problems and see how the charts start speaking on their own, giving you beautiful and smart hints just waiting for you to follow them.
But why is it so important? Mainly because it allows us to understand how our variables behave. What I mean by that: by their inherent nature, they can guide us to better feature engineering and model selection.
To refresh our goal a little bit, let’s remember our questions: “When is a good moment to buy cryptos?”, “When is a good moment to sell?” (ok, maybe I didn’t write much about our questions in the other episodes).
Our goal is to earn money by trading in the crypto market. We will gain insights about our variables and try to link them with that goal.
I’ve already said that EDA will guide us to victory, but I want to clear up a point first. There are many times when machine learning ignores business sense, not necessarily on purpose; sometimes the features are just given with no context (in certain Kaggle competitions, for instance).
The conventional approach in this situation is to make the most of the features, juggling different strategies to gain model performance, which is fantastic.
Still, when there are professional traders who have been playing the same game that you are just trying to enter, you can’t compete with their intuition (or knowledge, in the case of scientific matters).
The crypto market is young, so it hasn’t had the same time as the others to be analyzed and understood by investors. So probably a lot of the people who trade here came from other markets.
This is super important because, given that this market is mainly described by its own price fluctuations (besides the obviously relevant news), and that the players are using strategies similar to those they applied in the stock market, maybe, and just maybe, technical analysis will have something to say in our models.
And it’s precisely because of those maybes that we are going to dig a little deeper into this. I’ve already said that when we face a new problem, it’s essential to ask the people who have been building intuition about it.
Still, in trading, the connection is even stronger because the decisions that traders make are fully connected with the price. This means that if everyone agrees that the crypto’s behavior predicts a price increase, and those same people bet on it, then the price will in fact rise, up to a certain point.
I will connect this with the self-fulfilling prophecy. The equation looks like this: if everyone thinks that something is going to happen, plus their actions matter, plus these actions push in the same direction as the outcome, then we can say that the event was caused by the preceding expectation that it was going to occur. As StatQuest says: BAM!
Obviously, all this narrative I’m constructing could be just that: me connecting dots that don’t exist. For that reason, I’m sure you are eager to see some proof of what I’m writing about. So, to motivate you (in case you are doubting this crossover of concepts between these worlds), I’m going to introduce a beautiful one. It’s my A-game to catch your attention, so fasten your seatbelts. Besides, I will go much deeper into the analysis of this feature than the others. I can’t, simply for lack of time, do the same for every feature. I’d love to write about and analyze all of our features in detail, but if I did that, we would end up with a book instead of this series. Enjoy, then…
In general, to buy any crypto, we need a signal that triggers our action. The GOLDEN CROSS is a well-known signal in trading. It happens when the 50-day simple moving average of the close price (SMA: the average of the closing prices over the last periods in the selected range) rises above the 200-day SMA and, most importantly, it’s considered a bullish sign. Ok, let’s take a second to understand what a bullish sign is. In the meantime, I’ll take the chance to say that there are also bearish signs (in this case, the DEATH CROSS, which occurs when the 50-day SMA drops below the 200-day SMA). The bullish sign stands for a bull’s horns thrusting upward (value rising), and the bearish one for a bear’s claws swiping down (value dropping).
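To make those two definitions concrete, here is a minimal sketch in plain Python. It is not the detector from my gist, just an illustration: the names `sma` and `detect_crosses` are hypothetical, the windows are parameters (50 and 200 by default), and the toy prices at the bottom are made up so the example fits on screen.

```python
from typing import List, Optional, Tuple


def sma(values: List[float], window: int) -> List[Optional[float]]:
    """Simple moving average; None until `window` values are available."""
    out: List[Optional[float]] = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window : i + 1]) / window)
    return out


def detect_crosses(
    closes: List[float], short: int = 50, long: int = 200
) -> Tuple[List[int], List[int]]:
    """Return (golden_cross_indexes, death_cross_indexes) for the close prices."""
    fast, slow = sma(closes, short), sma(closes, long)
    golden: List[int] = []
    death: List[int] = []
    for i in range(1, len(closes)):
        # Skip positions where either average is not defined yet.
        if fast[i - 1] is None or slow[i - 1] is None:
            continue
        prev_diff = fast[i - 1] - slow[i - 1]
        diff = fast[i] - slow[i]
        if prev_diff <= 0 and diff > 0:
            golden.append(i)  # short SMA rises above long SMA
        elif prev_diff >= 0 and diff < 0:
            death.append(i)  # short SMA drops below long SMA
    return golden, death


# Made-up toy prices with tiny windows (2 and 4) instead of 50 and 200.
toy_closes = [10, 10, 10, 10, 9, 8, 7, 8, 10, 12, 14]
golden, death = detect_crosses(toy_closes, short=2, long=4)
print("golden crosses at:", golden, "death crosses at:", death)
```

With real daily closes you would keep the default windows; the returned indexes are positions in the price list, so mapping them back to dates is up to you.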
I know that you might be tired of hearing the story of my life, so to regain your attention, let’s quickly analyze our first chart: has the golden cross actually happened for the BTC-USD pair?
Technical note: we will be using the cryptocompare package. In case you don’t remember, we didn’t use this one in our last episode because it doesn’t have the exchange parameter. But for this scenario, it works perfectly fine.
You can see in the legend which color relates to each concept. I built the 50-day and 200-day SMAs from the closing price. I also wrote a golden cross detector method and, as my special gift for you, those beautiful arrows pointing at the price and the date of each signal (here is the gist). In the last 5 years it has happened only 4 times. This part is easy; we can all see them (if you can’t, I zoomed in on the two unclear dates for you).
With the signals defined, the interesting part begins: understanding them. Can we know if the signals were good or bad? I encourage you to take two minutes to think about it. I have a quick answer, but I think it’s best if I share it with you after a little digestion. Look at your device’s clock: the current time is t. I hope that by now we are at t+2 minutes on your timeline. If not, reread this after 2 minutes.
Ok, now that I’m sure you waited because of that last looping sentence, you deserve an excellent discussion. If, after hitting your head against the table, you came up with “it depends on when you decide to sell,” AWESOME! You could have saved your integrity, but I decided to let that go. BASIC! How can we evaluate this situation if we don’t attach a selling scenario to it?
In trading, risk management is essential. We will talk about it in detail in the next episodes, but for now, I will only tell you that an important part of it is defining the prices at which you give up or take your wins. For example, for our first two signals (prices: 286.96 and 303.54), picture two scenarios (win/lose thresholds of 10% and 35%):
- Signal 1 First Scenario (S1-1): Imagine that you were planning a short-term investment. Your goal was to win/lose 10% of the initial value. In this scenario, what happened first is that on 2015-08-16 the BTC close price was 257.12, below your lower threshold (286.96 × 0.90 ≈ 258.26). You would have had to take your losses this time.
- Signal 1 Second Scenario (S1-2): In this scenario, you chose a long-term investment. Your goal was to win/lose 35% of the initial value. What happened first is that on 2015-11-03 the BTC close price was 396.49, above your upper threshold (286.96 × 1.35 ≈ 387.40)! You would have taken the gains this time.
- Signal 2 First Scenario (S2-1): Again, you were planning a short-term investment. Your goal was to win/lose 10% of the initial value. What happened first is that on 2015-11-02 the BTC close price was 359.28, above your upper threshold (303.54 × 1.10 ≈ 333.89). Great!
- Signal 2 Second Scenario (S2-2): Finally, you were planning a long-term investment. Your goal was to win/lose 35% of the initial value. In this scenario, what happened first is that on 2015-12-08 the BTC close price was 410.67, above your upper threshold (303.54 × 1.35 ≈ 409.78). You could have taken a bath in your money.
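The four scenarios above follow the same recipe, so here is a small sketch of it, assuming symmetric win/lose thresholds around the entry price. The function names `exit_thresholds` and `first_exit` are hypothetical, and the price lists contain only the single close quoted in each scenario, not the full BTC history between the signal and the exit.

```python
from typing import List, Optional, Tuple


def exit_thresholds(entry_price: float, pct: float) -> Tuple[float, float]:
    """Return (stop_loss, take_profit) for a symmetric win/lose percentage."""
    return entry_price * (1 - pct), entry_price * (1 + pct)


def first_exit(
    entry_price: float, closes: List[float], pct: float
) -> Optional[Tuple[str, int, float]]:
    """Walk the closes after the entry and report which threshold is hit first.

    Returns (outcome, index, close), or None if neither threshold is reached.
    """
    stop_loss, take_profit = exit_thresholds(entry_price, pct)
    for i, close in enumerate(closes):
        if close <= stop_loss:
            return ("loss", i, close)
        if close >= take_profit:
            return ("win", i, close)
    return None


# Entry prices and closes quoted in the scenarios above.
scenarios = {
    "S1-1": (286.96, [257.12], 0.10),
    "S1-2": (286.96, [396.49], 0.35),
    "S2-1": (303.54, [359.28], 0.10),
    "S2-2": (303.54, [410.67], 0.35),
}
for name, (entry, closes, pct) in scenarios.items():
    print(name, exit_thresholds(entry, pct), first_exit(entry, closes, pct))
```

Feeding the function the complete list of daily closes after each signal is what tells you which threshold was really hit first; with a single close per scenario it only confirms on which side of the threshold that close falls.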
The chart describes the four combinations, assuming that you bought on the date of the signal. This gives us an important takeaway: whether a signal is profitable or not depends on the gains/losses that you are willing to take. For the other 2 signals, you can draw your own conclusions, but it’s pretty much the same. What I didn’t tell you is that the 200-day SMA can be replaced by a 100-day one, depending on the analyst. Indeed, there are a lot of popular SMAs. Also, you can do this exercise over shorter or longer periods, because there are analysts who look at minute and even weekly data.
This means that our model should be capable of testing these combinations, and we also need to check all the other indicators, understand them, and try them out. But hey, stop there! You are getting ahead of yourself. Be patient, good things take time. All of that is coming in the next chapters. As a matter of fact, I will explain the risk management process to you in detail, and how our model will deal with the probabilistic part of it. This probabilistic part will set us apart from typical algorithmic trading, which is triggered just when a simple signal occurs. But again, slow down! That is coming, I promise.
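Without spoiling the next chapters, the combinations mentioned above can at least be listed. This is only a sketch of the parameter grid, not the model itself: the variable names and the timeframe labels are hypothetical, and the values are just the ones named in this episode.

```python
from itertools import product

# Values mentioned in this episode; extend the lists to try other popular SMAs.
short_windows = [50]
long_windows = [100, 200]
win_lose_pcts = [0.10, 0.35]
timeframes = ["minute", "day", "week"]  # hypothetical labels for the data granularity

grid = [
    {"short": s, "long": l, "pct": p, "timeframe": t}
    for s, l, p, t in product(short_windows, long_windows, win_lose_pcts, timeframes)
    if s < l  # the short window must be shorter than the long one
]

print(len(grid), "combinations to test")
for combo in grid[:3]:
    print(combo)
```

Each dictionary is one configuration that a backtest could evaluate; how the signals and exits are scored for each one is the part we will build later.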
I think we are good for today. Let’s check our takeaways:
- Before jumping directly into a problem, we must ask people experienced in the matter (traders, for us). You can save yourself a lot of work and even some embarrassment (like the data scientists forecasting COVID-19).
- Now we know that there can be bridges between the trading and data science worlds, so we are building them.
- You now know why bears and bulls are constantly fighting each other.
- We learned what an SMA is and how to identify a golden cross signal.
- We introduced the risk management concept.
- We learned that we need a signal to buy. We also need a higher and a lower price than our entry; those will be our triggers for selling.
Fuah! So many takeaways. I think you will have more than enough to entertain yourself for the next two weeks. In the next episodes, we will explore risk management in more detail and see how to combine it with our problem. Also, we still have a lot of work to do to create our variables, so we are closer to our model with each passing day. See you in 14 days!