FONFLO LAB

An autonomous research engine that invents, backtests, and validates ES futures trading strategies around the clock. AI generates the hypotheses. Historical data decides if they live or die. The survivors become live signals.

883 Hypotheses
96 Backtested
83 Killed
11 Observing
2 Live
86% Kill Rate

From hypothesis to production.

Every strategy starts as an idea and ends as a live signal or a logged kill. There's no middle ground. The pipeline is designed to be ruthless — most strategies don't survive. The ones that do have earned it.

01
AI Generates a New Hypothesis
Claude AI ingests every prior result: kill reasons, profit factor (PF) distributions, regime failures, parameter dead zones. It identifies unexplored microstructure patterns and writes Python detection logic targeting phenomena like delta divergence at initial balance (IB) boundaries, volume asymmetry on range extensions, or cumulative delta trend breaks at the session VPOC. The system runs continuously.
02
Walk-Forward Backtest
200,000+ bars of ES 1-min OHLCV from the Schwab API. Train on months 1-4, test on months 5-6 (strict out-of-sample). Entry: open of bar t+1, the bar after the signal (zero lookahead). Stops: checked intrabar against each bar's high/low, not its close. Slippage: 0.25 pts per side. Commission: $4.50 round trip. Minimum sample: 30 trades in-sample, 20 out-of-sample (OOS).
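A minimal sketch of how those fill rules could be applied to a single long trade. The `Bar` class and `simulate_long` function are hypothetical names, not the engine's actual code. Two behaviors are assumptions the text does not state: when a bar touches both stop and target, the stop is taken first (the conservative reading), and a trade still open at the end of the data exits at the last close. The $50 point value is the standard ES contract multiplier.

```python
from dataclasses import dataclass

ES_POINT_VALUE = 50.0   # USD per ES point (contract multiplier)
SLIPPAGE_PTS = 0.25     # per side, from the text
COMMISSION_RT = 4.50    # USD per round trip, from the text


@dataclass
class Bar:
    open: float
    high: float
    low: float
    close: float


def simulate_long(bars, signal_idx, stop_pts, target_pts):
    """Net USD P&L of one long trade, or None if no bar follows the signal."""
    entry_idx = signal_idx + 1            # fill on the next bar's open: no lookahead
    if entry_idx >= len(bars):
        return None
    entry = bars[entry_idx].open
    stop, target = entry - stop_pts, entry + target_pts
    exit_price = bars[-1].close           # assumption: open trades exit at the last close
    for bar in bars[entry_idx:]:
        if bar.low <= stop:               # stop checked on the bar's low, not its close
            exit_price = stop             # assumption: stop wins if both levels are touched
            break
        if bar.high >= target:
            exit_price = target
            break
    net_pts = (exit_price - entry) - 2 * SLIPPAGE_PTS   # slippage on entry and exit
    return net_pts * ES_POINT_VALUE - COMMISSION_RT
```

A 4-point winner nets 3.5 points after slippage, which is $175 before commission and $170.50 after it. Costs of that size matter most for strategies with tight targets.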
03
Monte Carlo Random Entry Control
200 iterations of randomized entries on the same trading days with identical stop/target distances. Computes the full distribution of random P&L. A signal must beat random by more than 2x for Grade A or more than 1x for Grade B; otherwise the apparent "edge" is just the payoff structure. This separates signal alpha from the mechanical reward-to-risk ratio (R:R).
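The control can be sketched roughly as follows. The names `random_control`, `compare_to_random`, and the `simulate(day, bar_idx)` callback are hypothetical; the callback stands in for whatever backtest fill logic is used with the strategy's own stop and target distances. The text does not define how "2x random" is measured, so the ratio below (signal P&L divided by mean random P&L) is one assumed reading, reported alongside the share of random runs the signal beats.

```python
import random
import statistics


def random_control(signal_days, bars_per_day, simulate, iterations=200, seed=0):
    """Total P&L of each random run, with one random entry per signal day."""
    rng = random.Random(seed)             # seeded so the control is reproducible
    totals = []
    for _ in range(iterations):
        total = 0.0
        for day in signal_days:
            bar_idx = rng.randrange(bars_per_day[day])   # random entry bar on the same day
            total += simulate(day, bar_idx)              # same stop/target as the strategy
        totals.append(total)
    return totals


def compare_to_random(signal_pnl, random_totals):
    """Summarize the signal against the random P&L distribution."""
    mean_random = statistics.mean(random_totals)
    share_beaten = sum(1 for t in random_totals if t < signal_pnl) / len(random_totals)
    # Assumed reading of "Nx random"; undefined when random entries lose on average.
    ratio = signal_pnl / mean_random if mean_random > 0 else None
    return {"mean_random": mean_random, "share_beaten": share_beaten, "ratio": ratio}
```

Keeping the trading days and the stop/target distances fixed means the only thing that varies between the signal and the control is entry timing.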
04
VIX Regime Decomposition
Win rate and profit factor decomposed across VIX <16 (low vol), VIX 16-22 (normal), VIX >22 (elevated). Wilson score 95% confidence intervals on each bucket. A strategy that only works in one regime is a conditional bet, not a robust edge. We encode the optimal regime as a production filter.
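A sketch of the decomposition, assuming each trade is recorded as a `(vix_at_entry, pnl)` pair. The function names are hypothetical. The Wilson score interval is the standard formula with z = 1.96 for 95% confidence, and profit factor is taken as gross wins divided by gross losses.

```python
import math


def wilson_interval(wins, n, z=1.96):
    """95% Wilson score interval for a win rate (z = 1.96)."""
    if n == 0:
        return (0.0, 1.0)                 # no trades: the interval is uninformative
    p = wins / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (center - half, center + half)


def vix_bucket(vix):
    """Bucket edges from the text: <16 low, 16-22 normal, >22 elevated."""
    if vix < 16:
        return "low"
    if vix <= 22:
        return "normal"
    return "elevated"


def regime_breakdown(trades):
    """trades: iterable of (vix_at_entry, pnl). Returns stats per VIX bucket."""
    raw = {}
    for vix, pnl in trades:
        b = raw.setdefault(vix_bucket(vix), {"n": 0, "wins": 0, "gain": 0.0, "loss": 0.0})
        b["n"] += 1
        if pnl > 0:
            b["wins"] += 1
            b["gain"] += pnl
        else:
            b["loss"] += -pnl
    out = {}
    for name, b in raw.items():
        # Profit factor = gross wins / gross losses; infinite when no losses are recorded.
        pf = b["gain"] / b["loss"] if b["loss"] > 0 else math.inf
        out[name] = {
            "n": b["n"],
            "win_rate": b["wins"] / b["n"],
            "profit_factor": pf,
            "wilson_95": wilson_interval(b["wins"], b["n"]),
        }
    return out
```

With 10 wins in 20 trades the interval runs from roughly 30% to 70%, which shows why small per-bucket samples cannot support strong claims about a regime.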
05
Day-Type Classification
Every session is classified into one of six day types: Trend, Trend From Open, Balance Narrow, Balance Wide, Reversal, or Late Break. A mean reversion strategy that works on balance days will get destroyed on a trend day. We test every strategy across all six types and only deploy it on day types where it's proven. In production, the classifier runs in real time, updating probabilities at 9:45, 10:30, 12:00, and 14:00 ET as the session develops.
06
Grade Assignment
A: Full PF >1.5, OOS PF >1.3, OOS WR >50% (Wilson CI), edge >2x random, n ≥ 30/20.
B: Full PF >1.3, OOS PF >1.0, OOS WR >45%, edge >1x, n ≥ 25/15.
F: No edge. Killed, logged, and fed back as a negative constraint.
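The thresholds above translate directly into a small decision function. This is a sketch: `assign_grade` and its parameter names are hypothetical, and "n ≥ 30/20" is read as in-sample/out-of-sample trade counts. The text does not say whether the win-rate test uses the point estimate or the lower Wilson bound, so the function simply takes whichever number the caller passes as `oos_wr`.

```python
def assign_grade(full_pf, oos_pf, oos_wr, edge_vs_random, n_in_sample, n_oos):
    """Return "A", "B", or "F" using the published thresholds.

    oos_wr is a fraction (0.50 means 50%); edge_vs_random is the
    multiple over the random-entry control.
    """
    if (full_pf > 1.5 and oos_pf > 1.3 and oos_wr > 0.50
            and edge_vs_random > 2.0 and n_in_sample >= 30 and n_oos >= 20):
        return "A"
    if (full_pf > 1.3 and oos_pf > 1.0 and oos_wr > 0.45
            and edge_vs_random > 1.0 and n_in_sample >= 25 and n_oos >= 15):
        return "B"
    return "F"    # no edge: killed and logged
```

For example, a strategy that clears every Grade A metric but has only 28 in-sample trades falls through to Grade B, because sample size is part of the grade.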
07
Deep Analysis + Production
Survivors enter contextual enrichment: every trade is tagged with opening type, IB classification, VIX regime, day of week, gap direction, session range percentile. Win rates are decomposed across 7 dimensions. The top conditions are extracted as production filters, so a strategy only fires on days matching its proven edge. The first 10 trades are on probation with a tighter kill threshold. A strategy is flagged dormant after 15 trading days without firing. Every strategy is audited daily and auto-killed if PF degrades.

The system knows what kind of day it is.

Most signal services treat every day the same. We don't. The day-type classifier runs in real time, updating its read on the session at four checkpoints as data accumulates:

9:45 ET — Opening type classified. If it's a drive open, trend probability increases. If price opens inside prior range, balance probability increases.
10:30 ET — Initial balance set. Narrow IB + auction open = likely balance day. Wide IB + drive = likely trend. Probabilities sharpen.
12:00 ET — Has IB been broken? One side or both? How many VWAP crosses? By noon, most day types are identifiable with 70%+ confidence.
14:00 ET — If still in balance, late break probability rises. If trending, trend day confirmed. Classification at 85%+ confidence.

Mean reversion strategies are suppressed when trend probability exceeds 50%. Continuation strategies are boosted when trend probability exceeds 40%. The system doesn't fight the tape — it reads the tape and adapts.
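The gating rule can be sketched as a simple lookup. The function name and the string labels are hypothetical, and the text does not say what a "boost" does in practice (sizing, priority, or something else), so the sketch only returns a label.

```python
def gate_strategy(strategy_kind, trend_probability):
    """Label a strategy for the current session given the trend probability.

    strategy_kind: "mean_reversion" or "continuation" (assumed labels).
    trend_probability: classifier output in the range 0.0 to 1.0.
    """
    if strategy_kind == "mean_reversion" and trend_probability > 0.50:
        return "suppressed"
    if strategy_kind == "continuation" and trend_probability > 0.40:
        return "boosted"
    return "normal"
```

Between 40% and 50% trend probability, continuation strategies are already boosted while mean reversion strategies are not yet suppressed, so both families can be active at once.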

Promotion is just the beginning.

Grade A/B strategies go straight to production with no waiting room. The first 10 trades are a probation period with a tighter kill threshold: if PF is below 1.0 after 5 trades, the strategy is killed immediately. After probation, the normal audit applies: PF below 0.8 marks the strategy as degraded, and PF below 0.6 after 30 trades triggers an auto-kill. If a strategy goes 15 trading days without firing a single signal, it's flagged as DORMANT, because the conditions it was built for may not exist in the current market.
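One way to express the audit as code, under stated assumptions: the function and status names are hypothetical, "after 30" is read as 30 live trades, probation covers audits with 10 or fewer completed trades, and kill or degrade checks take precedence over the dormancy flag. None of that ordering is confirmed by the text.

```python
def audit_status(n_trades, profit_factor, trading_days_since_signal):
    """Daily audit of a live strategy. Returns a status label."""
    in_probation = n_trades <= 10         # assumption: the first 10 trades are probation
    if in_probation:
        # Tighter threshold: PF below 1.0 once 5 trades are in.
        if n_trades >= 5 and profit_factor < 1.0:
            return "KILLED"
    else:
        # Assumption: "after 30" means after 30 live trades.
        if n_trades >= 30 and profit_factor < 0.6:
            return "KILLED"
        if profit_factor < 0.8:
            return "DEGRADED"
    if trading_days_since_signal >= 15:
        return "DORMANT"
    return "PROBATION" if in_probation else "ACTIVE"
```

Under this reading, a strategy with 20 trades and a PF of 0.5 is marked degraded rather than killed, because the 0.6 kill threshold only applies from 30 trades onward.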


Every signal has a paper trail.

Trace any signal back through the pipeline — the AI hypothesis, the backtest, the random control, the regime it's optimized for, and the live trades since promotion. Nothing hidden. The kills are published alongside the wins.

What feeds the engine.

Primary: Schwab API. 1-min OHLCV, 7 months continuous front-month ES.
Secondary: IBKR TWS. Level 2 DOM via reqMktDepth, tick-by-tick with BBO aggressor classification.
Analysis: cumulative delta, session VWAP, volume profile (VPOC/VAH/VAL), initial balance classification, opening type detection, regime detection via 10/30-bar ATR ratio.
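As an illustration of the last item, a 10/30-bar ATR ratio could be computed along these lines. The function names are hypothetical. The sketch uses a simple average of true range where classic ATR uses Wilder's smoothing, and the text does not say which the engine uses or which ratio thresholds separate regimes, so only the raw ratio is returned.

```python
def true_ranges(highs, lows, closes):
    """True range per bar, starting from the second bar (needs a prior close)."""
    trs = []
    for i in range(1, len(closes)):
        prev_close = closes[i - 1]
        trs.append(max(highs[i] - lows[i],
                       abs(highs[i] - prev_close),
                       abs(lows[i] - prev_close)))
    return trs


def atr_ratio(highs, lows, closes, fast=10, slow=30):
    """Fast ATR divided by slow ATR, or None if there is not enough data."""
    trs = true_ranges(highs, lows, closes)
    if len(trs) < slow:
        return None
    fast_atr = sum(trs[-fast:]) / fast    # simple average, not Wilder smoothing
    slow_atr = sum(trs[-slow:]) / slow
    return fast_atr / slow_atr if slow_atr > 0 else None
```

A ratio above 1 means the last 10 bars are moving more than the 30-bar baseline (volatility expanding), and a ratio below 1 means it is contracting.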