
Order Books & Volatility: Part 1—Reality Check


The stop-loss wick and the wrong question most traders are asking

There’s a very specific kind of pain in trading: you catch the right trend, your entry is on point, the market moves exactly in the direction you predicted… and your stop-loss still gets wicked out by a tiny spike. You are kicked out of the position, you sigh, and then watch price run beautifully without you.

The first few times, it’s comforting to blame “market makers hunting stops”. After the third or fourth time, it gets harder to believe the universe personally hates you. Like most people (my past self included), you might be stuck on the wrong question:

“Will the price be higher or lower in 10 minutes?”

“Should I long or short here?”

But if you scalp, day trade, or even swing trade with tight stops, the market doesn’t kill you with direction. It kills you with volatility. It doesn’t need a full trend reversal. A quick shake, a few ticks up or down at the wrong moment, is enough to wipe you out.

So sometimes the more useful question isn’t “up or down?” but:

“Will there be drama in the next 10 seconds?”

If you know – even probabilistically – whether the next 10 seconds will be wild or sleepy, you change everything: your size, your stop distance, your spread, or even the decision to enter at all.

That’s where the idea of “predicting volatility” comes in. And of course, the moment you say “predict”, the ML people show up: “Let’s use AI for that.”

Indicators, rule-based logic, and the invisible ceiling of old-school approaches

To be fair, everyone starts with indicators. EMA, RSI, MACD, ATR, maybe Bollinger Bands, trendlines, support and resistance zones. Then you build some rules like “if ATR is rising, volume spikes, and price closes near the upper band, volatility is coming”.

This has real advantages. You understand exactly what your system does. You can explain it over coffee in ten minutes. No GPUs, no pipelines, no fear of overfitting or data leakage.

The problem is: indicator + simple rule-based logic usually only looks at a handful of things at once. A bit of price, a bit of volume, maybe spread or ATR, and that’s it.

But markets – especially crypto – respond to combinations of many things:

  • How the order book is tilted.
  • Time-of-day effects: Asia vs Europe vs US.
  • Whether this altcoin is being dragged by BTC or doing its own thing.
  • The current regime: normal, news-driven, short squeeze, panic dump…

The longer you trade, the more you feel an invisible ceiling. You can stack more indicators, but it doesn’t add much. The rules your human brain writes are usually straight lines: if A and B then C. The market is not that polite.

This is exactly where machine learning becomes attractive. Instead of manually writing all the rules, you let a model learn how to combine many signals to recognize the “smell” of upcoming volatility.

And this is also where crypto quietly gives you a superpower that most traditional markets don’t. In TradFi, there is a limit order book, but a lot of real size never shows up there: big trades are often negotiated off-book between institutions, printed later, and only become visible in reports or post-trade data. In crypto, you still have that on-chain world where you can track whales moving size between wallets, but for short-term, tick-by-tick volatility, the most immediate “camera” is the live limit order book on major exchanges. You can literally see where liquidity sits, where it disappears, and how the book leans just before things move.

What is the order book, and why is it the close-up camera on volatility?

Before we talk about ML, we need to talk about what we actually feed the model. For short-term volatility, the order book is the main ingredient.

Imagine a marketplace. The order book is the big board in the middle that says:

“At price 2.33, someone wants to buy 500 CAKE.
 At 2.34, someone wants to sell 300 CAKE.
 At 2.32, 800 CAKE are waiting to be bought.
 At 2.35, 1,000 CAKE are waiting to be sold…”

The left side is the bid: people who want to buy. The right side is the ask: people who want to sell. The highest bid is the best bid, the lowest ask is the best ask, and the difference between them is the spread.
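As a toy illustration (hard-coded example levels from the CAKE snapshot above, not live exchange data), here is how best bid, best ask, spread, and a simple depth imbalance fall out of one snapshot:

```python
# Toy order book snapshot: each side is a list of (price, size) levels,
# using the CAKE numbers from the example above.
bids = [(2.33, 500), (2.32, 800)]   # buyers
asks = [(2.34, 300), (2.35, 1000)]  # sellers

best_bid = max(price for price, _ in bids)
best_ask = min(price for price, _ in asks)
spread = best_ask - best_bid

bid_depth = sum(size for _, size in bids)
ask_depth = sum(size for _, size in asks)
# Imbalance in [-1, 1]: positive means the book leans toward the bid side.
imbalance = (bid_depth - ask_depth) / (bid_depth + ask_depth)

print(f"best bid {best_bid}, best ask {best_ask}, "
      f"spread {spread:.2f}, imbalance {imbalance:+.2f}")
```

Depth imbalance in this spirit is one of the simplest "book lean" features people feed into short-term models.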

The candles you see on TradingView are just a summary of what already traded. The order book is the behind-the-scenes camera. Before the price moves, something often happens there: one side’s size disappears, someone stacks a wall, the spread suddenly widens, depth vanishes on one side, and so on.

If you only look at candles to predict very short-term volatility (a few seconds to a few tens of seconds), you’re ignoring half the story. For ML, you can show the model both: the “movie” of order book snapshots and the “summary” of trades.

Figure 1 – A simple crypto order book

Volatility: don’t memorize the formula yet – think “mood swings”

In plain language, volatility is how moody price is.

If in 10 minutes price barely moves, that’s low vol.
 If in 30 seconds price is jumping up and down like crazy – even if it ends near where it started – that’s high vol.

Statistically, people often measure volatility using the standard deviation of returns. If P_t is the price at time t, a common choice is the log-return:

r_t = ln(P_t / P_{t-1})

Then volatility over some window is roughly the standard deviation σ of the r_t values in that window.
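A minimal NumPy sketch of that definition, using a few made-up prices sampled at a fixed interval:

```python
import numpy as np

# A few made-up prices, sampled at a fixed interval.
prices = np.array([2.330, 2.332, 2.331, 2.335, 2.334, 2.340])

# Log-returns: r_t = ln(P_t / P_{t-1})
log_returns = np.diff(np.log(prices))

# Volatility over the window: sample standard deviation of the returns.
sigma = log_returns.std(ddof=1)
print(f"sigma = {sigma:.6f}")
```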

But in practice, especially for very short horizons, we don’t always need a full-blown σ  for a whole day. Sometimes we just care about a much simpler question:

“In the next 10 seconds, is the max–min price move big enough to hit my stop or blow up my spread?”

That’s why, in the real project I ran, the label was defined in a very simple way. For each time t, look at the window from t+1 to t+10 seconds. Take the maximum and minimum of the last traded price in that window. If

(P_max - P_min) / P_min > threshold,

say threshold = 0.002 for CAKEUSDT, we mark it as an “event” (label 1). If not, it’s “normal” (label 0).

This definition ties volatility directly to trader pain. We don’t care whether it’s a pump or a dump; if the intra-window move is large enough to ruin a tight stop or a narrow spread, it counts as volatility that matters.
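Here is a hedged sketch of that labeling rule in Python. The relative-range formula (max minus min, divided by the window minimum) is one reasonable reading of the rule, and sampling the last traded price once per second is an assumption:

```python
import numpy as np

def label_events(prices: np.ndarray, horizon: int = 10,
                 threshold: float = 0.002) -> np.ndarray:
    """Label 1 if the relative max-min range of the next `horizon` seconds
    exceeds `threshold`, else 0. `prices` = last traded price, once per second."""
    labels = np.zeros(len(prices), dtype=int)
    for t in range(len(prices) - horizon):
        window = prices[t + 1 : t + 1 + horizon]
        if (window.max() - window.min()) / window.min() > threshold:
            labels[t] = 1  # enough intra-window movement to hurt a tight stop
    return labels

# Flat tape with one brief spike: only the seconds whose forward window
# "sees" the spike get label 1.
prices = np.full(20, 2.33)
prices[6] = 2.34
print(label_events(prices))
```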

 Figure 2 – Defining a 10-second volatility event

How does an ML model “learn” to sniff out volatility?

Now imagine you have a gigantic notebook. Each row is one snapshot in time:

  • current price, spread, bid/ask depth,
  • a short history of recent price moves over the last few seconds,
  • time of day,
  • and at the end of the row, a label: “big move in the next 10 seconds” or “nothing special”.

You let the model read hundreds of thousands of rows like that.

If the model is a tree-based one like Random Forest, XGBoost, or LightGBM, it will keep asking binary questions, over and over, such as:

“Is the current spread less than 0.001?”
 “If yes, is bid-side depth more than twice ask-side depth?”
 “If yes again, is this close to US session open?”

Each question splits the data into branches. Each branch is a “type” of market state, and in each type the model learns an estimated probability that an event will happen in the next 10 seconds.

When you pass in a new state (a new row), the model just walks down this forest of questions and spits out a number between 0 and 1: the predicted probability of a volatility event.
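A toy end-to-end sketch with scikit-learn's RandomForestClassifier. The feature columns and the synthetic data-generating rule are pure assumptions for illustration, not the real pipeline:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 5000

# Synthetic snapshot rows; the columns mimic the kinds of features
# discussed above (spread, depth imbalance, recent vol, hour of day).
X = np.column_stack([
    rng.uniform(0.0001, 0.003, n),    # spread
    rng.uniform(-1.0, 1.0, n),        # depth imbalance
    rng.exponential(0.001, n),        # short-term realized vol
    rng.integers(0, 24, n),           # hour of day (UTC)
])

# Toy ground truth: events cluster where vol is elevated AND the book is tilted.
p_event = 0.02 + 0.5 * ((X[:, 2] > 0.002) & (np.abs(X[:, 1]) > 0.5))
y = (rng.uniform(size=n) < p_event).astype(int)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X, y)

# A new snapshot walks down the forest and comes out as P(event in next 10 s).
risky = model.predict_proba([[0.0005, 0.8, 0.003, 14]])[0, 1]
calm = model.predict_proba([[0.0005, 0.0, 0.0001, 3]])[0, 1]
print(f"risky snapshot: {risky:.2f}, calm snapshot: {calm:.2f}")
```

Even on this fake data, the forest recovers the interaction (high vol plus a tilted book) without anyone writing that rule by hand.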

The nice part: you don’t need to manually design every rule and threshold. The model discovers combinations of conditions that separate “about to move” from “chill” better than your intuition could. Sometimes it finds patterns you would never think of, like “when short-term vol is already elevated, depth imbalance is extreme in a specific direction, and time-of-day is in this narrow band, the chance of a 10-second spike jumps”.

The not-so-nice part: if your data is messy and your labels are poorly defined, the model will very faithfully learn the wrong thing. It has no shame and no sense of “trading common sense”. It optimizes exactly what you tell it to optimize, nothing more, nothing less.

How do you know your model is better than random? Precision, recall, PR-AUC

Imagine you have a friend who loves sending you messages like: “Big storm coming, bring a raincoat.” If he texts you this every single day, you will eventually mute him.

Precision is basically asking: “Out of all the times he said ‘storm is coming’, how often did it actually storm?”
 High precision means when the model screams “volatility coming”, it’s usually right.

Recall is the reverse: “Out of all the days when there really was a storm, how many did he warn you about?”
 Low recall means the model misses most of the real events and is too lazy to shout.

Formally, if we define

  • TP (true positives): model predicts an event and an event actually happens,
  • FP (false positives): model predicts an event but nothing happens,
  • FN (false negatives): an event happens but the model stayed silent,

then

Precision = TP / (TP + FP)
Recall = TP / (TP + FN)

In our 10-second volatility problem, events are rare – just a few percent of all timestamps. If the model simply predicts “no event” all the time, its accuracy can be very high, but precision and recall for the event class are exactly zero.

That’s why people use PR-AUC (Precision–Recall Area Under Curve). Instead of fixing a single probability threshold like 0.5, you sweep it: 0.1, 0.2, 0.3… and see how precision and recall trade off. PR-AUC summarizes how well the model ranks real events above non-events.

If PR-AUC is only slightly above the base rate (the overall proportion of 1s in the data), your model is barely better than a clever coin flip. If it’s significantly higher, you have something that can meaningfully prioritize “these timestamps are dangerous, pay attention”.

In a live trading desk, the “best” threshold depends on your goals.

If you care about catching as many events as possible and don’t mind lots of false alarms, you choose a threshold that favors recall. If you only want to be warned when it really matters and you hate being spammed, you choose a higher threshold that favors precision.
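Both ideas – PR-AUC versus the base rate, and sweeping the threshold for the precision/recall trade-off – can be sketched with scikit-learn on synthetic scores (the labels and scores here are made up, not a real model's output):

```python
import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve

rng = np.random.default_rng(1)
n = 10_000

# ~3% positives, like a rare-event volatility label.
y_true = (rng.uniform(size=n) < 0.03).astype(int)
# Fake model scores that are mildly informative: events score higher on average.
scores = rng.normal(0.0, 1.0, size=n) + 1.2 * y_true

base_rate = y_true.mean()                        # PR-AUC of a random ranker
pr_auc = average_precision_score(y_true, scores)
print(f"base rate {base_rate:.3f} vs PR-AUC {pr_auc:.3f}")

# Sweep thresholds and pick, e.g., the strictest one keeping recall >= 0.5.
precision, recall, thresholds = precision_recall_curve(y_true, scores)
keep = recall[:-1] >= 0.5                        # align with `thresholds`
chosen = thresholds[keep][-1]
print(f"strictest threshold with recall >= 0.5: {chosen:.2f}")
```

The same sweep with a different recall (or precision) floor gives you the "lots of alarms" or "only shout when it matters" operating points described above.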

Wrapping up Part 1: ML is tempting, but not a magic wand

In this first part, the goal was to lay the foundation:

  • Why the question “Will there be drama in the next 10 seconds?” can be more useful than “Up or down?”.
  • Why classic indicators and hand-made rules hit a ceiling when dealing with short-term volatility in complex markets.
  • How the order book gives a much richer, “close-up” view of liquidity and risk than candles alone.
  • How you can think of volatility both intuitively (mood swings) and concretely (10-second range vs your stop).
  • How an ML model actually “learns” from historical snapshots, and why metrics like precision, recall and PR-AUC matter more than plain accuracy.

All of this is still the “friendly theory” layer. Once you plug into real data streams, things get much less romantic: dropped websocket messages, duplicated timestamps, 300–500 seconds of flat price that might be real or might be a bug, label distributions shifting over time, models that shine on the training period and collapse on the next day.

In Part 2, I’ll walk through that messy reality using a concrete failed experiment: trying to use Bitget and Binance order book data together to predict volatility on Bitget. On paper it looked smart. In practice, it turned into a long lesson about data quality, label design, and how easily ML can be tricked into learning the wrong problem.

Then in Part 3, we’ll shrink the sandbox: one exchange (Binance), one pair (CAKEUSDT), a clear 10-second event definition, carefully filtered data, and a set of models (from GARCH to tree-based ML) tested not just on “yesterday” but on a completely new day. That’s where we’ll see how far ML can realistically take you in answering the very simple but very useful question:

“In the next 10 seconds, is the market about to mess with me?”

 

 

Disclaimer: The content published on Cryptothreads does not constitute financial, investment, legal, or tax advice. We are not financial advisors, and any opinions, analysis, or recommendations provided are purely informational. Cryptocurrency markets are highly volatile, and investing in digital assets carries substantial risk. Always conduct your own research and consult with a professional financial advisor before making any investment decisions. Cryptothreads is not liable for any financial losses or damages resulting from actions taken based on our content.
Written by Lucas Nog