u/melon_crust

Feature engineering > model hacking

I recently learned about fractionally differenced features, from Marcos Lopez de Prado, and it really makes a difference in the microstructure strategy I'm exploring.

Fractional differentiation transforms non-stationary features such as prices into stationary ones while preserving some of their memory.

It helps ML models generalize better while remembering past data.
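A minimal sketch of the idea, assuming the fixed-window variant and a plain list of prices. The function names, the window length, and the sample prices are illustrative, not taken from any particular library. The weights come from the binomial expansion of (1 - B)^d, where B is the lag operator.

```python
def fracdiff_weights(d, size):
    """Binomial-series weights for (1 - B)^d, truncated to `size` terms."""
    weights = [1.0]
    for k in range(1, size):
        # Recurrence: w_k = -w_{k-1} * (d - k + 1) / k
        weights.append(-weights[-1] * (d - k + 1) / k)
    return weights


def fracdiff(series, d, window=20):
    """Fixed-window fractional difference; the first window - 1 points are dropped."""
    w = fracdiff_weights(d, window)
    out = []
    for t in range(window - 1, len(series)):
        # w[0] multiplies the newest observation, w[k] the one k steps back.
        out.append(sum(w[k] * series[t - k] for k in range(window)))
    return out


# Hypothetical prices, for illustration only.
prices = [100.0, 101.0, 103.0, 102.0, 105.0, 107.0, 106.0, 110.0]
print(fracdiff_weights(0.5, 4))            # weights decay slowly for fractional d
print(fracdiff(prices, d=1.0, window=2))   # d = 1 reduces to ordinary first differences
print(fracdiff(prices, d=0.5, window=4))
```

With d = 1 the weights collapse to [1, -1] and all memory beyond one step is lost. A fractional d keeps small non-zero weights on older observations, which is where the preserved memory comes from.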

u/melon_crust — 1 day ago

Keeping a trial balance during research

I've realized that fooling myself is surprisingly easy when looking for an edge in data.

I try many things and select whatever reports the best numbers. Then I iterate on that to further 'improve' it.

However, after stepping back, I realize those numbers are likely very inflated.

It's like finding an edge in a coin flip. If I flip each of 1,000 different fair coins 100 times, chances are I will find at least one coin that reports roughly 0.65 heads and 0.35 tails. If I perform a t-test on that coin's results, I will get a tiny p-value that seems to prove the edge is significant.

Then I start betting money on that coin and, to my surprise, it barely breaks even.

The problem is trial count. I performed 1,000 trials, so the threshold I need to pass to take the results seriously is higher.
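The coin experiment is easy to simulate. The sketch below is only an illustration: the seed, the function names, and the Sidak-style adjustment are my own assumptions, and an exact binomial test stands in for the t-test. It picks the most extreme of 1,000 fair coins and compares the naive p-value with one that accounts for the number of trials.

```python
import math
import random


def two_sided_p_value(heads, flips):
    """Exact two-sided binomial p-value under the fair-coin null."""
    k = max(heads, flips - heads)
    tail = sum(math.comb(flips, i) for i in range(k, flips + 1)) / 2 ** flips
    return min(1.0, 2 * tail)


def best_coin(n_coins=1000, flips=100, seed=0):
    """Flip n_coins fair coins and return the head count of the most extreme one."""
    rng = random.Random(seed)
    counts = [sum(rng.random() < 0.5 for _ in range(flips)) for _ in range(n_coins)]
    return max(counts, key=lambda h: abs(h - flips / 2))


heads = best_coin()
p_single = two_sided_p_value(heads, 100)
# Sidak-style adjustment: the chance that at least one of 1,000 fair coins
# looks at least this extreme purely by luck.
p_adjusted = 1 - (1 - p_single) ** 1000
print(f"best coin: {heads}/100 heads")
print(f"naive p-value: {p_single:.5f}, adjusted for 1,000 trials: {p_adjusted:.3f}")
```

The naive p-value treats the winning coin as if it were the only one ever flipped. The adjusted value asks the more honest question of how often the best of 1,000 fair coins would look this good.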

The coin flip case is clear and unambiguous: 1,000 trials. But things are more difficult when it comes to quant trading. What counts as a trial and how could we systematize it?

I thought about this definition:

"Given a strategy and a train, validation, and test split on the data, a trial is a distinct evaluation of the strategy against the validation set"

With this in mind, we can keep a trial balance on our strategy research pipeline. It would be a counter that starts at 0 and is incremented by 1 every time you run your evaluation function.

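The deflated Sharpe ratio threshold (roughly, the Sharpe ratio you would expect from the best of N trials with no real edge) gets updated in real time as the trial count grows, and you can't run your test function unless the observed Sharpe ratio is above that threshold.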
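Here is a rough sketch of what such a gate could look like. The `TrialBalance` class and its method names are hypothetical. The threshold uses the expected-maximum-Sharpe approximation from Bailey and Lopez de Prado's deflated Sharpe ratio work, and it deliberately leaves out the skewness, kurtosis, and sample-length adjustments of the full statistic.

```python
import math
from statistics import NormalDist, stdev

EULER_GAMMA = 0.5772156649015329


def expected_max_sharpe(n_trials, sharpe_std):
    """Approximate expected best Sharpe ratio among n_trials skill-less trials."""
    if n_trials < 2:
        return 0.0  # a single trial needs no multiple-testing adjustment
    inv = NormalDist().inv_cdf
    return sharpe_std * (
        (1 - EULER_GAMMA) * inv(1 - 1 / n_trials)
        + EULER_GAMMA * inv(1 - 1 / (n_trials * math.e))
    )


class TrialBalance:
    """Hypothetical counter that records validation runs and gates the test set."""

    def __init__(self):
        self.sharpes = []

    def record(self, validation_sharpe):
        # Every evaluation against the validation set counts as one trial.
        self.sharpes.append(validation_sharpe)

    @property
    def trials(self):
        return len(self.sharpes)

    def threshold(self):
        if self.trials < 2:
            return 0.0
        return expected_max_sharpe(self.trials, stdev(self.sharpes))

    def may_run_test(self, candidate_sharpe):
        return candidate_sharpe > self.threshold()


balance = TrialBalance()
for sharpe in [0.1, -0.2, 0.4, 0.0, 0.3]:  # made-up validation Sharpe ratios
    balance.record(sharpe)
print(balance.trials, round(balance.threshold(), 3))
print(balance.may_run_test(0.4), balance.may_run_test(0.2))
```

The threshold grows with both the number of trials and the dispersion of their Sharpe ratios, so trying more variations raises the bar that the chosen strategy has to clear.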

Enforcing this mechanically would make it much harder to overfit. I'm thinking about writing a Python library or maybe even productizing it, but I'm still unsure how.

The core idea is: 'an opinionated quant trading research framework where result significance is dictated by your trial balance and enforced systematically'.

What are your thoughts on this?

u/melon_crust — 7 days ago

I see two clear problems with the Sharpe ratio:

  1. It assumes a normal distribution, but it’s not uncommon to find fatter tails and skewness.
  2. It penalizes upside volatility.

Calmar ratio seems much more appropriate.

Why still use Sharpe?
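To make the second point concrete, here is a small sketch comparing the two ratios on made-up daily returns. It assumes a zero risk-free rate and 252 periods per year, and the function names are illustrative. The second series is identical to the first except for one large winning day.

```python
import math
from statistics import mean, stdev


def sharpe_ratio(returns, periods_per_year=252):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    return mean(returns) / stdev(returns) * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Largest peak-to-trough loss of the compounded equity curve, as a fraction."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1 + r
        peak = max(peak, equity)
        worst = max(worst, 1 - equity / peak)
    return worst


def calmar_ratio(returns, periods_per_year=252):
    """Annualized compounded return divided by maximum drawdown."""
    total = math.prod(1 + r for r in returns)
    annual = total ** (periods_per_year / len(returns)) - 1
    drawdown = max_drawdown(returns)
    return annual / drawdown if drawdown > 0 else math.inf


steady = [0.01, -0.005] * 50   # made-up returns, for illustration only
spiky = list(steady)
spiky[10] = 0.20               # same series with one large upside day

print(sharpe_ratio(steady), sharpe_ratio(spiky))
print(calmar_ratio(steady), calmar_ratio(spiky))
```

The second series is at least as good on every single day, yet its Sharpe ratio is lower because the winning day inflates the standard deviation. Its Calmar ratio is higher because the drawdown is unchanged.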

u/melon_crust — 16 days ago

Math undergraduate here, with a background in software engineering. I’ve always been interested in algo trading, though I haven’t been consistent. I built my first bot 7 years ago, and it was profitable for some time (until it wasn’t). Looking back, I don’t know if I had a statistical edge or it was just luck.

I started dabbling again and found something promising, though I don’t want to fool myself and I want to validate the numbers thoroughly before deploying real money.

Here’s what I’ve done:

  1. Checking for look ahead biases
  2. Factoring in trading fees
  3. Walk-forward testing of the mean return: calculating a p-value for each of k folds, then performing a binomial test on the number of folds whose mean is significantly worse than the full-data mean.
  4. Testing fields individually. For example, asking ‘are shorts on Friday significantly worse than other days?’ and using t-test p-values to decide whether to include filters.
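For reference, one possible reading of check 3 as code. This is a sketch under several assumptions: contiguous equal-sized folds, a normal (z) approximation in place of a t-test, and a 5% significance level. All names are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def fold_p_value(fold, full_mean):
    """One-sided p-value that the fold's mean return is below full_mean (z approximation)."""
    z = (mean(fold) - full_mean) / (stdev(fold) / math.sqrt(len(fold)))
    return NormalDist().cdf(z)


def binomial_tail(successes, n, p):
    """P(X >= successes) for X ~ Binomial(n, p)."""
    return sum(
        math.comb(n, i) * p ** i * (1 - p) ** (n - i)
        for i in range(successes, n + 1)
    )


def fold_consistency(returns, k=5, alpha=0.05):
    """Count folds significantly worse than the full sample, then test that count."""
    size = len(returns) // k  # leftover observations at the end are ignored
    folds = [returns[i * size:(i + 1) * size] for i in range(k)]
    full_mean = mean(returns)
    bad = sum(fold_p_value(fold, full_mean) < alpha for fold in folds)
    # Under the null, each fold is flagged with probability alpha.
    return bad, binomial_tail(bad, k, alpha)


returns = [0.01, -0.005, 0.002, 0.004] * 25  # made-up trade returns
bad_folds, p_value = fold_consistency(returns)
print(bad_folds, p_value)
```

A small binomial p-value here would suggest that more folds underperform than chance alone explains, meaning the overall result leans on a few good periods.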

I’m getting astronomical returns in a 4-year backtest.

What else should I check?

u/melon_crust — 21 days ago