AutoTrader started the way a lot of side projects do: with a question I couldn't let go of. I'd been working with recommendation systems at my day job, and it struck me that the core problem—predicting what a person will want next based on noisy, incomplete signals—isn't that different from predicting where a stock will move next based on noisy, incomplete market data.
So I started small. A few tickers, a basic feature set, a model that ran on my M3 laptop. But as I dug in, the scope grew naturally. A single ticker needed multi-timeframe analysis. Multi-timeframe analysis needed richer features. Richer features needed a real data pipeline. A real data pipeline needed cloud infrastructure. And before long, I was building a system that ingests data for 600+ tickers every night, engineers 500+ features from eight distinct sources, trains over 1,800 models, and delivers ranked predictions to subscribers before the opening bell.
Every component was designed and built by me from scratch. It runs autonomously on a multi-cloud setup (GCP + Azure) for about $235/month, and it's become the most technically satisfying project I've worked on—a place where I get to combine ML modeling, data engineering, infrastructure design, and product thinking all in one system.
1,800+ Trained Models
500+ Engineered Features
600+ Tickers Covered
~$235/mo Total Infrastructure Cost
System Architecture
The system is split across multiple virtual machines—primarily on GCP, with additional Azure training capacity—coordinated through Google Cloud Storage and PostgreSQL. The separation isn't arbitrary: feature engineering is I/O-bound (lots of API calls and database writes), while model training is CPU-bound (lots of number crunching). Putting them on different VMs means I can right-size each machine's resources without overpaying for either workload.
Everything is orchestrated by cron jobs that hand off data downstream in sequence. There are no manual steps in the daily workflow—from raw market data to delivered email predictions, the system runs end-to-end without intervention.
Daily Workflow (EST)
12:10 AM
Data Collection
VM2 pulls OHLCV data for 600+ tickers via the EODHD API, computes 500+ features per ticker, and writes everything to PostgreSQL and GCS. Takes about 45–60 minutes.
3:00 AM (Sat)
Comprehensive Training
VM3 retrains all ~1,800 models with Optuna hyperparameter optimization and walk-forward validation. Incremental training (100–200 models) runs on weekdays.
5:00 AM
Inference
VM3 loads every active model and generates predictions for the upcoming trading day. Each prediction combines a directional call with a magnitude estimate and a confidence score.
5:30 AM
Email Delivery
Tiered emails go out to subscribers with ranked predictions, market sentiment context, and analysis reports—all before the 9:30 AM open.
Infrastructure
VM2: Data & Execution
2 vCPU, 8 GB RAM · ~$49/mo
Data collection & features
Inference & email delivery
Trading execution (Alpaca)
LLM event signals (Claude)
VM3: Training (GCP)
4 vCPU, 32 GB RAM · ~$50–80/mo
Dual model training (XGBoost)
Optuna hyperparameter search
Preemptible (auto-recovery)
GCS model sync
PostgreSQL + GCS
2 vCPU, 8 GB · ~$52/mo
TimescaleDB (market data)
Model registry & predictions
GCS model artifact storage
PgBouncer connection pooling
Data Collection & Feature Engineering
This is where the raw ingredients come from. Every night, the pipeline pulls fresh market data from the EODHD API for all S&P 500 constituents plus 184 ETFs, then transforms that data into a rich set of 500+ engineered features spanning technical, fundamental, sentiment, behavioral, and alternative data dimensions.
How It Works
Ingest OHLCV market data across multiple timeframes for all tracked tickers
Prioritize high-liquidity names to ensure freshest data for major positions
Store to PostgreSQL with GCS redundancy
Run the feature computation pipeline to generate 500+ features per ticker across eight signal-family categories
Collect multi-source sentiment and alternative data signals
Compute proprietary training labels designed to capture directional intent rather than simple close-to-close returns
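To ground the feature step, here is a sketch of what the simplest technical family might look like. The column names, windows, and specific features are illustrative assumptions, not the production feature set:

```python
import numpy as np
import pandas as pd

def compute_basic_features(ohlcv: pd.DataFrame) -> pd.DataFrame:
    """Derive a handful of illustrative technical features from daily OHLCV bars.

    Assumes `ohlcv` has columns open, high, low, close, volume indexed by date.
    The production pipeline computes 500+ features; these are stand-ins.
    """
    f = pd.DataFrame(index=ohlcv.index)
    # Momentum: percent change over 5 and 20 trading days
    f["mom_5d"] = ohlcv["close"].pct_change(5)
    f["mom_20d"] = ohlcv["close"].pct_change(20)
    # Volatility: rolling std of daily returns
    daily_ret = ohlcv["close"].pct_change()
    f["vol_20d"] = daily_ret.rolling(20).std()
    # Volume surprise: today's volume vs its 20-day average
    f["vol_ratio"] = ohlcv["volume"] / ohlcv["volume"].rolling(20).mean()
    # Intraday range as a fraction of the close
    f["hl_range"] = (ohlcv["high"] - ohlcv["low"]) / ohlcv["close"]
    return f
```

The real pipeline repeats this pattern across eight signal families and multiple timeframes before writing the result to PostgreSQL.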
Feature Sources (8 Families, 500+ Features)
Technical Indicators Trend, momentum, volatility, volume, and pattern-based signals across multiple timeframes
Fundamental Data Valuation metrics, earnings estimates, and corporate event signals
Price Microstructure Higher-order derivatives of price dynamics and structural pattern recognition
Behavioral Economics Cognitive bias indicators: anchoring, disposition effect, herding intensity, and loss aversion asymmetries
Alternative Data & Sentiment Multi-source sentiment aggregation, social trend analysis, and event-driven signals with lagged impact modeling
Statistical & Regime Features Mean reversion signals, market phase detection, and volatility regime classification
Proprietary Composite Signals Calibrated multi-factor combinations derived from ongoing research into market microstructure
Model Training
The core insight behind the training architecture is that direction and magnitude are fundamentally different prediction tasks and benefit from being modeled separately. Every ticker/timeframe combination gets two XGBoost models: a classifier that predicts whether the stock goes up or down, and a regressor that predicts by how much.
Dual Model Architecture
Training two models per ticker lets each be optimized for what it's best at:
Direction Model: Predicts bullish or bearish. Optimized on classification accuracy. Trained on filtered data that removes noise days where direction is essentially random.
Magnitude Model: Predicts expected move size. Optimized on magnitude error. Trained on the full dataset to capture the complete distribution of outcomes.
At inference time, the two predictions are combined into a single calibrated confidence score that captures both conviction and expected size of the move. Post-hoc calibration ensures the confidence values reflect true accuracy rates.
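A minimal sketch of how the two outputs might be blended, assuming the classifier emits an up-probability and the regressor an absolute expected move. The tanh normalization and the `typical_move_pct` parameter are illustrative stand-ins for the production calibration:

```python
import numpy as np

def combined_confidence(p_up: float, expected_move_pct: float,
                        typical_move_pct: float = 1.0) -> tuple[str, float]:
    """Blend classifier and regressor outputs into one ranked score.

    p_up:              direction model's probability of an up move
    expected_move_pct: magnitude model's predicted absolute move (%)
    typical_move_pct:  normalizer so conviction and size share a scale
    """
    direction = "bullish" if p_up >= 0.5 else "bearish"
    conviction = abs(p_up - 0.5) * 2                       # 0 = coin flip, 1 = certain
    size = np.tanh(expected_move_pct / typical_move_pct)   # squash to (0, 1)
    return direction, conviction * size
```

A 0.5 probability yields zero confidence regardless of predicted magnitude, which matches the intent: a coin-flip direction call should never rank highly, no matter how large the expected move.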
Training Process
Priority queue: Models queued by strategy (worst-performing first) so training time goes where it has the most impact
Data loading: Features and labels pulled from PostgreSQL with GCS fallback
Noise filtering: Low-movement days removed for direction model training to focus on meaningful signals
Walk-forward validation: Expanding-window folds that respect temporal ordering (no future data leakage)
Hyperparameter optimization: Automated search across model parameters using Bayesian optimization
Evaluation: Multiple accuracy metrics tracked per fold including directional accuracy
Lifecycle management: Top model versions retained per ticker/timeframe; older versions pruned automatically
Comprehensive training (all ~1,800 models) runs every Saturday and takes 2–4 hours. Incremental training (100–200 models) runs on weekdays in 20–60 minutes, focusing on new tickers and underperformers.
Inference & Prediction
Every weekday morning at 5:00 AM, the inference pipeline loads all active models and generates a prediction for each ticker/timeframe pair. The output is a ranked list of the day's highest-confidence predictions, ready for delivery.
How It Works
Query the model registry for all active models (status = active)
For each ticker/timeframe: load the classifier and regressor from GCS
Load the most recent features for the current prediction date
Generate a direction prediction (bullish/bearish) with probability
Generate a magnitude prediction (% expected move)
Combine into a single confidence-ranked score
Store all predictions in PostgreSQL and upload a snapshot to GCS
Rank by confidence and split into top bullish and top bearish lists
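The final ranking step might look like this, assuming a predictions DataFrame with `ticker`, `direction`, and `confidence` columns (names are illustrative):

```python
import pandas as pd

def rank_predictions(predictions: pd.DataFrame, top_n: int = 10):
    """Split a day's predictions into top bullish and bearish lists,
    each sorted by descending confidence."""
    ranked = predictions.sort_values("confidence", ascending=False)
    bulls = ranked[ranked["direction"] == "bullish"].head(top_n)
    bears = ranked[ranked["direction"] == "bearish"].head(top_n)
    return bulls, bears
```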
Current Production Scale
937 Predictions Generated Daily
358 Tickers with Active Models
~32/min Prediction Throughput
Email Delivery & Subscriptions
The delivery system takes predictions and wraps them in context: market sentiment, economic calendar events, and analysis reports. Subscribers receive content matched to their tier, delivered as polished HTML emails with optional attachments.
Delivery Workflow
Validate PostgreSQL tunnel connectivity (auto-start if needed)
Check data freshness via TradingDayValidator—trigger a sync if data is stale
Generate or load analysis reports for the current trading day
Load predictions from PostgreSQL
Collect market context: Put/Call ratio, Fear & Greed index, social sentiment (ApeWisdom), Forex Factory economic calendar
Load subscriber list and filter by tier
Render tier-specific HTML emails with appropriate attachments
Send via SMTP with a lock file to prevent duplicate sends
SMS notification to admin on success or failure
See it live: Browse the Daily Updates page for real examples of the Basic tier email output, published every trading day.
Subscription Tiers
Content scales with tier—everyone gets predictions, but the depth of analysis and number of picks increases as you move up.
Tier         | Predictions | Analysis                                | Extras
Basic        | SPY only    | F&G, headlines                          | —
Premium      | Top 50      | PCR, social, congress, full news        | —
Professional | All 600+    | LLM synthesis, entity tracker, alt-data | CSV + heatmaps
Secret       | All 600+    | Sonnet synthesis, raw model data        | CSV + heatmaps + API
Design Decisions
A system like this involves hundreds of small choices. Here are the ones that shaped the architecture most significantly—and the reasoning behind each.
Why dual models instead of one?
Early on I tried a single model that predicted signed returns directly. It was mediocre at both direction and magnitude. Splitting the problem into a classifier ("which way?") and a regressor ("how far?") lets each model focus on what it does best. The classifier trains on filtered data with noise days removed; the regressor sees the full distribution. The combined signal is stronger than either alone.
Why walk-forward validation?
Standard K-fold cross-validation would let the model see Tuesday's data while training on Thursday's. In financial data, that's cheating—any time-series pattern, regime change, or structural break gets leaked across the boundary. Walk-forward validation with expanding windows respects temporal ordering, which means the performance estimates I get are realistic rather than flattering.
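An expanding-window split generator makes the idea concrete. This is a sketch over integer index positions; fold counts and sizes are illustrative:

```python
def walk_forward_splits(n_samples: int, n_folds: int, min_train: int):
    """Generate expanding-window (train, validation) index ranges.

    Each fold trains on everything up to a cutoff and validates on the
    block immediately after it, so no future data leaks into training.
    """
    fold_size = (n_samples - min_train) // n_folds
    splits = []
    for k in range(n_folds):
        train_end = min_train + k * fold_size
        val_end = min(train_end + fold_size, n_samples)
        splits.append((range(0, train_end), range(train_end, val_end)))
    return splits
```

Unlike shuffled K-fold, every validation block here sits strictly after its training window, which is what makes the resulting accuracy estimates honest for time-series data.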
Why custom training labels instead of simple returns?
Simple close-to-close returns miss the intraday story—a stock can gap up 2% then sell off all day. The training labels are designed to capture where actual trading conviction lies, producing better signal for the models even if they're noisier to compute.
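To make the idea concrete, here is one possible label of this flavor — emphatically not the proprietary label, just an illustration of weighting intraday conviction over the overnight gap. The 0.3/0.7 weights are arbitrary:

```python
import pandas as pd

def intraday_conviction_label(ohlcv: pd.DataFrame) -> pd.Series:
    """Illustrative label that looks inside the bar, not just close-to-close.

    A stock that gaps up 2% and sells off all day should not be labeled
    the same as one that grinds higher into the close.
    """
    prev_close = ohlcv["close"].shift(1)
    gap = (ohlcv["open"] - prev_close) / prev_close              # overnight move
    intraday = (ohlcv["close"] - ohlcv["open"]) / ohlcv["open"]  # session move
    # Emphasize the session move: that's where trading conviction shows up
    return 0.3 * gap + 0.7 * intraday
```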
Why separate VMs?
Feature engineering spends most of its time waiting on API responses and writing to databases (I/O-bound). Model training spends most of its time in XGBoost's gradient computations (CPU-bound). Running both on a single VM would mean paying for 16 GB of RAM during data collection when I only need 8, or paying for beefy CPUs during the data pipeline when they'd sit idle. The multi-VM split lets me right-size each workload—GCP handles data pipelines and a dedicated PostgreSQL instance, while an Azure VM provides parallel training capacity.
Why filter noise days for the direction model?
On days when a stock barely moves, predicting "up" or "down" is essentially a coin flip—and training on coin flips adds noise without signal. Filtering low-movement days lets the direction model focus on days with actual directional commitment, while the magnitude model still sees the full distribution.
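The filter can be sketched as a volatility-relative threshold. The 0.25 multiplier and 20-day window are illustrative assumptions, not the production values:

```python
import pandas as pd

def filter_noise_days(returns: pd.Series, window: int = 20,
                      threshold: float = 0.25) -> pd.Series:
    """Keep only days whose move is large relative to recent volatility.

    Days where |return| falls below a fraction of the trailing rolling
    std are dropped as coin flips before direction-model training.
    """
    vol = returns.rolling(window).std()
    mask = returns.abs() >= threshold * vol  # NaN vol compares False, so warm-up days drop out
    return returns[mask]
```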
Why build everything from scratch?
Partly because I wanted to understand every piece of the system at a level that using off-the-shelf solutions wouldn't give me. But also because the constraints of a personal project—tight budget, single maintainer, zero tolerance for pager fatigue—reward simplicity. Cron jobs, PostgreSQL, and GCS are boring, well-understood technologies. That's the point. I'd rather spend my engineering time on feature research and model architecture than debugging Kubernetes manifests.
AutoTrader delivers ML-driven market predictions to your inbox every trading day before the opening bell. 1,800+ models, 600+ tickers, 500+ features — fully autonomous.
All paid tiers include a 7-day free preview of the Basic tier so you can see the system in action.
AutoTrader is a living system—it runs in production daily, but it's also my primary playground for exploring new ideas. A few things on the roadmap:
Ensemble methods: Exploring how to combine predictions across timeframes (daily, weekly, monthly) into a single multi-horizon signal, weighted by each model's recent accuracy.
Transformer-based models: The current XGBoost approach works well on tabular features, but I'm curious whether attention mechanisms over raw price sequences could capture patterns that hand-engineered features miss.
Portfolio optimization: Moving beyond individual ticker predictions to portfolio-level allocation—factoring in correlation, sector exposure, and risk constraints.
Real-time inference: Currently predictions run once daily. Exploring whether intraday feature updates and streaming inference could capture opportunities that the overnight pipeline misses.