AutoTrader

An ML-Powered Stock Prediction & Trading System

Personal Project · 2024–Present · Live in Production

The Story

AutoTrader started the way a lot of side projects do: with a question I couldn't let go of. I'd been working with recommendation systems at my day job, and it struck me that the core problem—predicting what a person will want next based on noisy, incomplete signals—isn't that different from predicting where a stock will move next based on noisy, incomplete market data.

So I started small. A few tickers, a basic feature set, a model that ran on my M3 laptop. But as I dug in, the scope grew naturally. A single ticker needed multi-timeframe analysis. Multi-timeframe analysis needed richer features. Richer features needed a real data pipeline. A real data pipeline needed cloud infrastructure. And before long, I was building a system that ingests data for 600+ tickers every night, engineers 400+ features from eight distinct sources, trains over 1,800 models, and delivers ranked predictions to subscribers before the opening bell.

Every component was designed and built by me from scratch. It runs autonomously on Google Cloud for about $55 a month, and it's become the most technically satisfying project I've worked on—a place where I get to combine ML modeling, data engineering, infrastructure design, and product thinking all in one system.

1,800+ Trained Models
400+ Engineered Features
600+ Tickers Covered
~$55/mo Total Infrastructure Cost

System Architecture

The system is split across two GCP virtual machines, coordinated through Google Cloud Storage and PostgreSQL. The separation isn't arbitrary: feature engineering is I/O-bound (lots of API calls and database writes), while model training is CPU-bound (lots of number crunching). Putting them on different VMs means I can right-size each machine's resources without overpaying for either workload.

Everything is orchestrated by cron jobs that hand off data downstream in sequence. There are no manual steps in the daily workflow—from raw market data to delivered email predictions, the system runs end-to-end without intervention.
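
As an illustration, the whole hand-off fits in a handful of crontab entries across the two VMs. The schedule mirrors the workflow below; the script paths are hypothetical:

    # VM2 -- nightly data collection & feature engineering (12:10 AM EST)
    10 0 * * * /opt/autotrader/collect_and_engineer.sh

    # VM3 -- comprehensive retrain on Saturdays, incremental on weekdays (3:00 AM)
    0 3 * * 6   /opt/autotrader/train_comprehensive.sh
    0 3 * * 1-5 /opt/autotrader/train_incremental.sh

    # VM3 -- inference at 5:00 AM, tiered email delivery at 5:30 AM (weekdays)
    0 5 * * 1-5  /opt/autotrader/run_inference.sh
    30 5 * * 1-5 /opt/autotrader/send_emails.sh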

Daily Workflow (EST)

12:10 AM
Data Collection
VM2 pulls OHLCV data for 600+ tickers from the EODHD API, computes 400+ features per ticker, and writes everything to PostgreSQL and GCS. Takes about 45–60 minutes.
3:00 AM (Sat)
Comprehensive Training
VM3 retrains all ~1,800 models with Optuna hyperparameter optimization and walk-forward validation. Incremental training (100–200 models) runs on weekdays.
5:00 AM
Inference
VM3 loads every active model and generates predictions for the upcoming trading day. Each prediction combines a directional call with a magnitude estimate and a confidence score.
5:30 AM
Email Delivery
Tiered emails go out to subscribers with ranked predictions, market sentiment context, and analysis reports—all before the 9:30 AM open.

Infrastructure

VM2: Feature Engineering

2 vCPU, 8 GB RAM · $15/mo


EODHD API data collection
Feature computation (400+)
VWAP label generation
Auxiliary data (PCR, VIX, sentiment)

GCS + PostgreSQL

Shared data layer · $0.30/mo


Raw OHLCV data & feature store
Trained model artifacts
Model registry & predictions
Subscriber management

VM3: Training & Inference

4 vCPU, 16 GB RAM · $40/mo


Dual model training (XGBoost)
Optuna hyperparameter search
Daily prediction generation
Tiered email delivery

Data flow: EODHD API → VM2 (collect & engineer) → GCS / PostgreSQL (store) → VM3 (train & predict) → Subscribers (email)

Pipeline Details

Data Collection & Feature Engineering

This is where the raw ingredients come from. Every night, the pipeline pulls fresh market data from the EODHD API for all S&P 500 constituents plus 184 ETFs, then transforms that data into a rich set of 400+ engineered features. The features come from eight distinct sources, each capturing a different dimension of market behavior.

How It Works

  • Check daily API usage against the 100K call limit (and queue any overage for backfill)
  • Prioritize high-signal tickers (SPY, QQQ, AAPL processed first)
  • Fetch OHLCV data across three timeframes (daily, weekly, monthly)
  • Store raw data to both PostgreSQL and GCS for redundancy
  • Run the HybridFeatureComputationPipeline to compute 400+ features per ticker
  • Collect auxiliary signals: CBOE Put/Call ratio, CNN Fear & Greed Index, ApeWisdom social sentiment, Forex Factory economic calendar
  • Compute VWAP-based labels that serve as training targets for downstream models
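
Of those steps, label generation is the one everything downstream depends on. Here's a minimal sketch of a VWAP-style label, assuming daily OHLCV bars in a pandas DataFrame; the pipeline's actual label definition may differ:

    import pandas as pd

    def vwap_labels(df: pd.DataFrame) -> pd.DataFrame:
        # df holds daily OHLCV bars with columns: high, low, close, volume
        vwap_proxy = (df["high"] + df["low"] + df["close"]) / 3  # typical price as a single-bar VWAP proxy
        fwd_move = vwap_proxy.shift(-1) / vwap_proxy - 1         # next day's move, aligned to today's features
        out = pd.DataFrame(index=df.index)
        out["magnitude"] = fwd_move.abs() * 100                  # regressor target: absolute % move
        out["direction"] = (fwd_move > 0).astype(int)            # classifier target: 1 = bullish
        return out.dropna()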

Feature Sources (8 Families)

  • TA-Lib Technical Indicators (~200): trend (EMA, MACD, ADX), momentum (RSI, Stochastic), volatility (ATR, Bollinger), volume (OBV, CMF), plus 61 candlestick patterns
  • EODHD Fundamentals (~100): valuation ratios (P/E, P/B, P/S), dividend yield, EPS estimates, and news sentiment polarity scores
  • Cross-Industry Signals (~100): sector rotation strength, defensive vs. cyclical momentum, market breadth, and capitulation detection
  • Kinematics (~80): derivatives of price movement (velocity, acceleration, jerk), turning point patterns, and momentum extremes
  • All-Time-High Analysis (~45): days since ATH, drought severity and frequency, statistical significance of proximity to highs
  • Enhanced VWAP (~38): calendar and event flags (FOMC meetings, earnings dates, options expiration), institutional activity signals
  • Auxiliary & Sentiment (~30): economic calendar events, Fear & Greed index, Reddit trending stocks (ApeWisdom), CBOE Put/Call ratios
  • Intraday (~20): 5-minute VWAP, intraday momentum profiles, hour-of-day effects
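
To give a flavor of the TA-Lib family, several of the indicators above take a single call each. A sketch on synthetic data, not the pipeline's code:

    import numpy as np
    import talib

    # Stand-in daily close series; the real pipeline feeds in EODHD data
    close = 100.0 + np.cumsum(np.random.default_rng(0).normal(0, 1, 252))

    rsi = talib.RSI(close, timeperiod=14)                      # momentum
    macd, macd_signal, macd_hist = talib.MACD(close)           # trend
    upper, middle, lower = talib.BBANDS(close, timeperiod=20)  # volatility (Bollinger)
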
Model Training

The core insight behind the training architecture is that direction and magnitude are fundamentally different prediction tasks and benefit from being modeled separately. Every ticker/timeframe combination gets two XGBoost models: a classifier that predicts whether the stock goes up or down, and a regressor that predicts by how much.

Dual Model Architecture

Training two models per ticker lets each be optimized for what it's best at:

  • Classifier (XGBClassifier): Predicts direction (bullish or bearish). Optimized on AUC. Trained on filtered data with low-movement noise days removed (bottom 20% by absolute VWAP move), so it focuses on days when the market is actually making a call.
  • Regressor (XGBRegressor): Predicts magnitude (% expected move). Optimized on directional accuracy. Trained on the full dataset to capture the complete range of outcomes, including quiet days.

At inference time, the two predictions are combined:
    signal = (direction * 2 - 1) * magnitude
    confidence = probability * (1 + magnitude)

This produces a single confidence-ranked score that captures both conviction and expected size of the move.
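
In code, the fusion step is tiny. A sketch of the formulas above, with one assumption made explicit: "probability" is read as the winning class's probability:

    def combine(prob_up: float, magnitude: float) -> tuple[float, float]:
        # prob_up: classifier's bullish probability; magnitude: regressor's predicted absolute % move
        direction = int(prob_up >= 0.5)
        probability = prob_up if direction else 1 - prob_up  # winning-class probability (my assumption)
        signal = (direction * 2 - 1) * magnitude             # signed expected move
        confidence = probability * (1 + magnitude)           # conviction weighted by expected size
        return signal, confidence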

Training Process

  • Priority queue: Models are queued by strategy (worst-performing first, new tickers first, etc.) so training time is spent where it has the most impact
  • Data loading: Features and VWAP labels pulled from PostgreSQL/GCS
  • Noise filtering: Bottom 20th percentile of movement days removed for classifier training
  • Walk-forward validation: 5 expanding-window folds that respect temporal ordering (no future data leakage)
  • Hyperparameter optimization: Optuna runs 30–50 trials per model, searching across learning rate, max depth, subsample, and regularization
  • Evaluation: AUC, directional accuracy, and separate bullish/bearish accuracy tracked per fold
  • Model storage: Artifacts saved to GCS, metadata registered in PostgreSQL
  • Lifecycle management: Top 3 model versions retained per ticker/timeframe; older versions pruned automatically
Comprehensive training (all ~1,800 models) runs every Saturday and takes 2–4 hours. Incremental training (100–200 models) runs on weekdays in 20–60 minutes, focusing on new tickers and underperformers.
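
To make the Optuna/walk-forward interaction concrete, here's a stripped-down sketch of one classifier's tuning loop. Noise filtering, the full search space, and model persistence are omitted, and sklearn's TimeSeriesSplit stands in for the expanding-window folds:

    import numpy as np
    import optuna
    import xgboost as xgb
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import TimeSeriesSplit

    def objective(trial: optuna.Trial, X: np.ndarray, y: np.ndarray) -> float:
        params = {
            "learning_rate": trial.suggest_float("learning_rate", 0.01, 0.3, log=True),
            "max_depth": trial.suggest_int("max_depth", 3, 9),
            "subsample": trial.suggest_float("subsample", 0.5, 1.0),
            "reg_lambda": trial.suggest_float("reg_lambda", 1e-3, 10.0, log=True),
        }
        aucs = []
        # Expanding-window folds: each fold trains strictly on the past
        for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(X):
            model = xgb.XGBClassifier(**params, n_estimators=200)
            model.fit(X[train_idx], y[train_idx])
            aucs.append(roc_auc_score(y[test_idx], model.predict_proba(X[test_idx])[:, 1]))
        return float(np.mean(aucs))

    # study = optuna.create_study(direction="maximize")
    # study.optimize(lambda t: objective(t, X, y), n_trials=50)
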
Inference & Prediction

Every weekday morning at 5:00 AM, the inference pipeline loads all active models and generates a prediction for each ticker/timeframe pair. The output is a ranked list of the day's highest-confidence predictions, ready for delivery.

How It Works

  • Query the model registry for all active models (status = active)
  • For each ticker/timeframe: load the classifier and regressor from GCS
  • Load the most recent features for the current prediction date
  • Generate a direction prediction (bullish/bearish) with probability
  • Generate a magnitude prediction (% expected move)
  • Combine into a single confidence-ranked score
  • Store all predictions in PostgreSQL and upload a snapshot to GCS
  • Rank by confidence and split into top bullish and top bearish lists
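
The ranking step at the end is deliberately simple. A sketch, assuming the day's predictions are already in a DataFrame with signal and confidence columns:

    import pandas as pd

    def rank_predictions(preds: pd.DataFrame, top_n: int = 10) -> tuple[pd.DataFrame, pd.DataFrame]:
        # Sort by conviction, then split into the day's best long and short ideas
        ranked = preds.sort_values("confidence", ascending=False)
        bullish = ranked[ranked["signal"] > 0].head(top_n)
        bearish = ranked[ranked["signal"] < 0].head(top_n)
        return bullish, bearish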

Current Production Scale

937 Predictions Generated Daily
358 Tickers with Active Models
~32/min Prediction Throughput

Email Delivery & Subscriptions

The delivery system takes predictions and wraps them in context: market sentiment, economic calendar events, and analysis reports. Subscribers receive content matched to their tier, delivered as polished HTML emails with optional attachments.

Delivery Workflow

  • Validate PostgreSQL tunnel connectivity (auto-start if needed)
  • Check data freshness via TradingDayValidator—trigger a sync if data is stale
  • Generate or load analysis reports for the current trading day
  • Load predictions from PostgreSQL
  • Collect market context: Put/Call ratio, Fear & Greed index, social sentiment (ApeWisdom), Forex Factory economic calendar
  • Load subscriber list and filter by tier
  • Render tier-specific HTML emails with appropriate attachments
  • Send via SMTP with a lock file to prevent duplicate sends
  • SMS notification to admin on success or failure
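
The duplicate-send guard is a classic exclusive-create lock file. A minimal sketch with a hypothetical lock path and credentials elided:

    import os
    import smtplib
    from email.message import EmailMessage

    LOCK = "/tmp/autotrader_email.lock"  # hypothetical path; one lock per trading day

    def send_once(msg: EmailMessage, host: str = "smtp.gmail.com") -> bool:
        try:
            fd = os.open(LOCK, os.O_CREAT | os.O_EXCL | os.O_WRONLY)  # atomic: fails if lock exists
        except FileExistsError:
            return False  # another run already delivered today's emails
        try:
            with smtplib.SMTP_SSL(host, 465) as smtp:
                # smtp.login(user, password)  # credentials elided
                smtp.send_message(msg)
            return True
        finally:
            os.close(fd)  # the lock file itself stays in place to block re-sends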

Subscription Tiers

Content scales with tier: everyone gets predictions, but the depth of analysis and the number of picks increase as you move up.

Tier          Predictions  Analysis              Attachments
Basic         Top 5        Summary               None
Premium       Top 10       Market context        TXT
Professional  All          Full reports          TXT + PDF
Secret        All          Full + econ calendar  TXT + PDF

Design Decisions

A system like this involves hundreds of small choices. Here are the ones that shaped the architecture most significantly—and the reasoning behind each.

Why dual models instead of one?

Early on I tried a single model that predicted signed returns directly. It was mediocre at both direction and magnitude. Splitting the problem into a classifier ("which way?") and a regressor ("how far?") lets each model focus on what it does best. The classifier trains on filtered data with noise days removed; the regressor sees the full distribution. The combined signal is stronger than either alone.

Why walk-forward validation?

Standard K-fold cross-validation would happily train the model on Thursday's data and then evaluate it on the previous Tuesday. In financial data, that's cheating: any time-series pattern, regime change, or structural break gets leaked across the boundary. Walk-forward validation with expanding windows respects temporal ordering, which means the performance estimates I get are realistic rather than flattering.

Why VWAP labels instead of close-to-close returns?

A stock can gap up 2% at the open, sell off all day, and still show a positive daily return. Close-to-close labels miss the intraday story. Volume-Weighted Average Price captures where trading activity actually concentrated, giving a more honest signal about directional intent. It takes more work to compute, but it produces better labels for training.
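
For reference, the standard session definition is VWAP = Σ(price_i × volume_i) / Σ volume_i, with the sum running over the session's bars or trades.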

Why separate VMs?

Feature engineering spends most of its time waiting on API responses and writing to databases (I/O-bound). Model training spends most of its time in XGBoost's gradient computations (CPU-bound). Running both on a single VM would mean paying for 16 GB of RAM during data collection when I only need 8, or paying for beefy CPUs during the data pipeline when they'd sit idle. The 2-VM split lets me right-size each workload and keeps total costs under $55/month.

Why filter low-movement days for the classifier?

On days when a stock barely moves (less than 0.1%), predicting "up" or "down" is basically a coin flip—and training on those coin flips adds noise without signal. Removing the bottom 20% of movement days by absolute VWAP change lets the classifier focus on days when the market is actually making a directional commitment. The regressor still sees all data, because even small-movement days carry information about magnitude distributions.
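
The filter itself is one line of pandas. A sketch, reusing the magnitude label from the data-collection section:

    import pandas as pd

    def drop_noise_days(labels: pd.DataFrame, pct: float = 0.20) -> pd.DataFrame:
        # Keep only days whose absolute VWAP move clears the bottom-20% threshold
        threshold = labels["magnitude"].quantile(pct)
        return labels[labels["magnitude"] > threshold]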

Why build everything from scratch?

Partly because I wanted to understand every piece of the system at a level that using off-the-shelf solutions wouldn't give me. But also because the constraints of a personal project—tight budget, single maintainer, zero tolerance for pager fatigue—reward simplicity. Cron jobs, PostgreSQL, and GCS are boring, well-understood technologies. That's the point. I'd rather spend my engineering time on feature research and model architecture than debugging Kubernetes manifests.

Tech Stack

Machine Learning

XGBoost · Optuna · Scikit-learn · TA-Lib · FAISS · Pandas · NumPy · SciPy

Data Sources

EODHD API · CBOE · CNN Fear & Greed · ApeWisdom · Forex Factory · AAII Sentiment

Infrastructure

GCP Compute Engine · Google Cloud Storage · PostgreSQL · SQLite · Cron · SSH Tunnels

Trading & Delivery

Alpaca API · SMTP / Gmail · SMS Alerts · Stripe

Languages & Tools

Python · Bash · SQL · Git · Selenium

What's Next

AutoTrader is a living system—it runs in production daily, but it's also my primary playground for exploring new ideas. A few things on the roadmap:

  • Ensemble methods: Exploring how to combine predictions across timeframes (daily, weekly, monthly) into a single multi-horizon signal, weighted by each model's recent accuracy.
  • Transformer-based models: The current XGBoost approach works well on tabular features, but I'm curious whether attention mechanisms over raw price sequences could capture patterns that hand-engineered features miss.
  • Portfolio optimization: Moving beyond individual ticker predictions to portfolio-level allocation—factoring in correlation, sector exposure, and risk constraints.
  • Real-time inference: Currently predictions run once daily. Exploring whether intraday feature updates and streaming inference could capture opportunities that the overnight pipeline misses.