FreightSense

A shipment triage tool that combines rule-based risk scoring with LLM recommendations to suggest delay interventions and record overrides.

GenAI · Feb 27, 2026 · 4 min read
Groq LLM · FastAPI · Docker · GCP

Problem

When delayed shipments start piling up, operations teams often end up reviewing each one manually. FreightSense was built to shorten that decision loop by combining deterministic scoring with LLM-based reasoning, then keeping a clear audit trail when a human overrides the recommendation.

How It Works

FreightSense uses two layers:

  1. Rule-based scoring to measure delay risk, financial exposure, and benchmark-based severity.
  2. LLM reasoning to recommend an intervention and estimate potential savings in a structured format.

The output is meant for triage rather than full automation. A human still sees the recommendation, the reasoning, and any disagreement between the two layers before taking action.
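Layer 2's "structured format" can be sketched as a small schema that the service validates before showing anything to an operator. The field names below are illustrative assumptions, not FreightSense's actual schema:

```python
import json
from dataclasses import dataclass

@dataclass
class LLMRecommendation:
    # Hypothetical fields; the real FreightSense schema may differ.
    action: str               # e.g. "EXPEDITE", "DISCOUNT", "MONITOR", "NO_ACTION"
    confidence: float         # 0.0 - 1.0
    reasoning: str            # short natural-language justification
    estimated_savings: float  # projected impact of the intervention

def parse_llm_output(raw: str) -> LLMRecommendation:
    """Parse and validate the model's JSON reply before it reaches the UI."""
    data = json.loads(raw)
    rec = LLMRecommendation(
        action=data["action"],
        confidence=float(data["confidence"]),
        reasoning=data["reasoning"],
        estimated_savings=float(data["estimated_savings"]),
    )
    if rec.action not in {"EXPEDITE", "DISCOUNT", "MONITOR", "NO_ACTION"}:
        raise ValueError(f"unexpected action: {rec.action}")
    return rec
```

Validating the response at the boundary is what keeps the LLM in a support role: anything malformed is rejected instead of silently acted on.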

System Overview

Browser dashboard
|
v
FastAPI service
|- POST /api/evaluate
|- POST /api/evaluate/{id}/override
|- GET /api/audit
|- GET /api/audit/{id}/overrides
`- GET /api/meta
|
+-- Layer 1: deterministic scoring
+-- Layer 2: Groq LLM evaluation
`-- SQLite audit log

Decision Logic

| Risk score | Recommendation | Typical trigger |
| --- | --- | --- |
| >= 75 | EXPEDITE | High delay and high exposure |
| 50-74 | DISCOUNT | Moderate delay with retention risk |
| 25-49 | MONITOR | Some delay, not urgent yet |
| < 25 | NO_ACTION | Within acceptable variance |

If the deterministic layer and the LLM disagree, the UI shows both outputs so the operator can make the final call explicitly.
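The threshold mapping and the disagreement check can be sketched in a few lines. The function names are illustrative, not FreightSense's internal API:

```python
def recommend_from_score(score: float) -> str:
    """Map the deterministic risk score to an action per the table above."""
    if score >= 75:
        return "EXPEDITE"
    if score >= 50:
        return "DISCOUNT"
    if score >= 25:
        return "MONITOR"
    return "NO_ACTION"

def triage(score: float, llm_action: str) -> dict:
    """Combine both layers; surface disagreement rather than hiding it."""
    rule_action = recommend_from_score(score)
    return {
        "rule_action": rule_action,
        "llm_action": llm_action,
        # The UI shows both outputs whenever the layers disagree.
        "needs_review": rule_action != llm_action,
    }
```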

API Surface

POST /api/evaluate

Evaluates a shipment and returns the risk score, recommended action, confidence, reasoning, and estimated intervention impact.

{
"order_id": "ORD-00123",
"customer_segment": "Corporate",
"market": "USCA",
"category_name": "Electronics",
"shipping_mode": "Standard Class",
"days_scheduled": 5,
"days_actual_estimate": 9,
"order_item_total": 1200.0,
"profit_ratio": 0.18
}
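The exact scoring formula is internal to the service, but the payload already contains what a deterministic layer needs. A sketch of the kind of features it could derive (the variable names are assumptions for illustration):

```python
payload = {
    "order_id": "ORD-00123",
    "days_scheduled": 5,
    "days_actual_estimate": 9,
    "order_item_total": 1200.0,
    "profit_ratio": 0.18,
}

# Illustrative derived features, not FreightSense's actual formula.
delay_days = payload["days_actual_estimate"] - payload["days_scheduled"]
delay_ratio = delay_days / payload["days_scheduled"]
profit_at_risk = payload["order_item_total"] * payload["profit_ratio"]
```

For this sample order that gives 4 days of delay, a 0.8 delay ratio, and $216 of profit at risk, which is the kind of exposure a high risk score would reflect.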

POST /api/evaluate/{id}/override

Stores a human decision against an evaluation, including custom reasoning.

GET /api/audit

Returns the audit log for previous evaluations.

GET /api/audit/{id}/overrides

Returns the full override history for one evaluation.

GET /api/meta

Returns available categories and markets for the dashboard.

Benchmark Layer

The deterministic layer uses precomputed benchmark statistics from roughly 180k supply-chain records. Those benchmarks are compiled into data/benchmarks.json and loaded at startup so inference stays simple and fast at request time.

Terminal window
uv run python scripts/build_benchmarks.py

Each benchmark group stores:

  • average scheduled days
  • average delay days
  • late-delivery rate
  • average profit ratio
  • sample size
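A pure-Python sketch of what `build_benchmarks.py` might compute per group (the project itself uses pandas; the field names and grouping keys here are assumptions):

```python
from collections import defaultdict
from statistics import mean

# Toy rows standing in for the ~180k-record dataset.
records = [
    {"market": "USCA", "category": "Electronics", "days_scheduled": 5, "days_actual": 9, "profit_ratio": 0.18},
    {"market": "USCA", "category": "Electronics", "days_scheduled": 4, "days_actual": 4, "profit_ratio": 0.22},
    {"market": "USCA", "category": "Electronics", "days_scheduled": 6, "days_actual": 8, "profit_ratio": 0.10},
]

def build_benchmarks(rows):
    """Group records and compute the per-group stats listed above."""
    groups = defaultdict(list)
    for r in rows:
        groups[(r["market"], r["category"])].append(r)
    out = {}
    for key, rs in groups.items():
        delays = [max(0, r["days_actual"] - r["days_scheduled"]) for r in rs]
        out[key] = {
            "avg_scheduled_days": mean(r["days_scheduled"] for r in rs),
            "avg_delay_days": mean(delays),
            "late_delivery_rate": sum(d > 0 for d in delays) / len(rs),
            "avg_profit_ratio": mean(r["profit_ratio"] for r in rs),
            "sample_size": len(rs),
        }
    return out
```

Precomputing these once and serializing them to `data/benchmarks.json` is what keeps request-time scoring a simple dictionary lookup.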

Local Setup

Requirements

  • Python 3.12+
  • uv or pip
  • A Groq API key

Install

Terminal window
git clone https://github.com/VedantAndhale/FreightSense.git
cd FreightSense
uv sync

Configure

Terminal window
cp .env.example .env

Then set these values in .env:

GROQ_API_KEY=gsk_...
DATABASE_URL=./freightsense.db
GROQ_MODEL=llama-3.3-70b-versatile

Run

Terminal window
uv run uvicorn main:app --reload

The operational dashboard is served at http://localhost:8000, and the API docs are available at http://localhost:8000/docs.

Deployment

FreightSense is set up for Google Cloud Run with a warm instance so the SQLite-backed audit log stays available between requests.
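One way to express the warm-instance requirement in a Cloud Run service spec is the Knative `minScale` annotation; this is a sketch, and the repo's actual service.yaml may differ:

```yaml
# Sketch: keep one Cloud Run instance warm so the SQLite audit log
# survives between requests. The repo's service.yaml may differ.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: freightsense
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "1"
```

Note the trade-off: SQLite in a container is instance-local, so a minimum instance count keeps the log available, but it is not durable across revisions without an external volume or database.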

One-time GCP setup

Terminal window
GITHUB_ORG=your-org GITHUB_REPO=freightsense bash scripts/setup_gcp.sh

The setup script:

  1. enables required GCP APIs
  2. creates the Artifact Registry repository
  3. creates the service account used by GitHub Actions
  4. configures Workload Identity Federation
  5. stores GROQ_API_KEY in Secret Manager

Continuous deployment

Each push to main builds the image, pushes it to Artifact Registry, and deploys a new Cloud Run revision through GitHub Actions.

Project Structure

freightsense/
|- main.py
|- service.yaml
|- Dockerfile
|- pyproject.toml
|- app/
| |- api/
| |- core/
| |- db/
| `- static/
|- data/
| |- benchmarks.json
| `- DataCoSupplyChainDataset.csv
|- scripts/
| |- build_benchmarks.py
| |- setup_gcp.sh
| `- test_groq.py
`- .github/workflows/deploy.yml

Tech Stack

| Layer | Technology |
| --- | --- |
| API | FastAPI |
| LLM inference | Groq API |
| Data + scoring | pandas + Python |
| Persistence | SQLite via aiosqlite |
| Containerization | Docker |
| Deployment | Google Cloud Run |

Outcome

FreightSense is a practical example of using an LLM as a decision support layer instead of a standalone answer engine. The value comes from pairing model output with explicit scoring, guardrails, and a human override path.