Project
FreightSense
A shipment triage tool that combines rule-based risk scoring with LLM recommendations to suggest delay interventions and record overrides.
Problem
When delayed shipments start piling up, operations teams often end up reviewing each one manually. FreightSense was built to shorten that decision loop by combining deterministic scoring with LLM-based reasoning, then keeping a clear audit trail when a human overrides the recommendation.
How It Works
FreightSense uses two layers:
- Rule-based scoring to measure delay risk, financial exposure, and benchmark-based severity.
- LLM reasoning to recommend an intervention and estimate potential savings in a structured format.
The output is meant for triage rather than full automation. A human still sees the recommendation, the reasoning, and any disagreement between the two layers before taking action.
System Overview
```
Browser dashboard
        |
        v
FastAPI service
 |- POST /api/evaluate
 |- POST /api/evaluate/{id}/override
 |- GET  /api/audit
 |- GET  /api/audit/{id}/overrides
 `- GET  /api/meta
        |
        +-- Layer 1: deterministic scoring
        +-- Layer 2: Groq LLM evaluation
        `-- SQLite audit log
```
Decision Logic
| Risk score | Recommendation | Typical trigger |
|---|---|---|
| >= 75 | EXPEDITE | High delay and high exposure |
| 50-74 | DISCOUNT | Moderate delay with retention risk |
| 25-49 | MONITOR | Some delay, not urgent yet |
| < 25 | NO_ACTION | Within acceptable variance |
If the deterministic layer and the LLM disagree, the UI shows both outputs so the operator can make the final call explicitly.
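The threshold mapping in the table above can be sketched as a small function. The function name is hypothetical; the real scoring lives inside the service's deterministic layer:

```python
def recommend(risk_score: float) -> str:
    """Map a 0-100 risk score to an intervention, per the decision table."""
    if risk_score >= 75:
        return "EXPEDITE"   # high delay and high exposure
    if risk_score >= 50:
        return "DISCOUNT"   # moderate delay with retention risk
    if risk_score >= 25:
        return "MONITOR"    # some delay, not urgent yet
    return "NO_ACTION"      # within acceptable variance
```

Keeping this mapping pure and deterministic is what makes disagreement with the LLM layer easy to detect and display.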
API Surface
POST /api/evaluate
Evaluates a shipment and returns the risk score, recommended action, confidence, reasoning, and estimated intervention impact.
```json
{
  "order_id": "ORD-00123",
  "customer_segment": "Corporate",
  "market": "USCA",
  "category_name": "Electronics",
  "shipping_mode": "Standard Class",
  "days_scheduled": 5,
  "days_actual_estimate": 9,
  "order_item_total": 1200.0,
  "profit_ratio": 0.18
}
```
POST /api/evaluate/{id}/override
Stores a human decision against an evaluation, including custom reasoning.
GET /api/audit
Returns the audit log for previous evaluations.
GET /api/audit/{id}/overrides
Returns the full override history for one evaluation.
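The project persists through aiosqlite; the same evaluation/override shape can be sketched synchronously with the stdlib sqlite3 module. Table and column names here are illustrative, not the service's actual schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE evaluations (
    id INTEGER PRIMARY KEY,
    order_id TEXT NOT NULL,
    risk_score REAL NOT NULL,
    recommendation TEXT NOT NULL
);
CREATE TABLE overrides (
    id INTEGER PRIMARY KEY,
    evaluation_id INTEGER NOT NULL REFERENCES evaluations(id),
    decision TEXT NOT NULL,
    reasoning TEXT NOT NULL,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
""")

# Record an evaluation, then a human override against it.
cur = conn.execute(
    "INSERT INTO evaluations (order_id, risk_score, recommendation) VALUES (?, ?, ?)",
    ("ORD-00123", 82.0, "EXPEDITE"),
)
eval_id = cur.lastrowid
conn.execute(
    "INSERT INTO overrides (evaluation_id, decision, reasoning) VALUES (?, ?, ?)",
    (eval_id, "MONITOR", "Customer already contacted; expedite not needed."),
)

# GET /api/audit/{id}/overrides would return rows shaped like these.
rows = conn.execute(
    "SELECT decision, reasoning FROM overrides WHERE evaluation_id = ?", (eval_id,)
).fetchall()
```

Storing overrides as append-only rows against an evaluation id is what gives the audit trail its full history rather than a single "latest decision".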
GET /api/meta
Returns available categories and markets for the dashboard.
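FastAPI would normally express the evaluate payload as a Pydantic model; the stdlib dataclass below sketches the same shape. Field names come from the example request; the `delay_days` helper is an assumption for illustration:

```python
from dataclasses import dataclass

@dataclass
class EvaluateRequest:
    # Fields mirror the POST /api/evaluate example payload.
    order_id: str
    customer_segment: str
    market: str
    category_name: str
    shipping_mode: str
    days_scheduled: int
    days_actual_estimate: int
    order_item_total: float
    profit_ratio: float

    @property
    def delay_days(self) -> int:
        # Estimated slip versus the scheduled transit time (hypothetical helper).
        return self.days_actual_estimate - self.days_scheduled

req = EvaluateRequest(
    order_id="ORD-00123", customer_segment="Corporate", market="USCA",
    category_name="Electronics", shipping_mode="Standard Class",
    days_scheduled=5, days_actual_estimate=9,
    order_item_total=1200.0, profit_ratio=0.18,
)
```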
Benchmark Layer
The deterministic layer uses precomputed benchmark statistics from roughly 180k supply-chain records. Those benchmarks are compiled into data/benchmarks.json and loaded at startup so inference stays simple and fast at request time.
```shell
uv run python scripts/build_benchmarks.py
```
Each benchmark group stores:
- average scheduled days
- average delay days
- late-delivery rate
- average profit ratio
- sample size
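A benchmarks.json entry carrying those fields might look like the sketch below. Only the field list comes from the project; the grouping key and the `severity` calculation are assumptions about how the deterministic layer could use the data:

```python
import json

# Hypothetical slice of data/benchmarks.json, keyed by market + category.
raw = json.dumps({
    "USCA|Electronics": {
        "avg_scheduled_days": 4.8,
        "avg_delay_days": 1.6,
        "late_delivery_rate": 0.41,
        "avg_profit_ratio": 0.12,
        "sample_size": 3120,
    }
})
benchmarks = json.loads(raw)

def severity(group: str, observed_delay_days: float) -> float:
    """Delay severity relative to the group's historical average (illustrative)."""
    b = benchmarks[group]
    return observed_delay_days / max(b["avg_delay_days"], 0.1)

ratio = severity("USCA|Electronics", 4.0)  # 4 days late vs a 1.6-day average
```

Precomputing these aggregates offline means a request-time lookup is a dictionary access rather than a scan over 180k rows.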
Local Setup
Requirements
- Python 3.12+
- uv or pip
- A Groq API key
Install
```shell
git clone https://github.com/VedantAndhale/FreightSense.git
cd FreightSense
uv sync
```
Configure
```shell
cp .env.example .env
```
```
GROQ_API_KEY=gsk_...
DATABASE_URL=./freightsense.db
GROQ_MODEL=llama-3.3-70b-versatile
```
Run
```shell
uv run uvicorn main:app --reload
```
The operational dashboard is served at http://localhost:8000, and the API docs are available at http://localhost:8000/docs.
Deployment
FreightSense is set up for Google Cloud Run with a warm instance so the SQLite-backed audit log stays available between requests.
One-time GCP setup
```shell
GITHUB_ORG=your-org GITHUB_REPO=freightsense bash scripts/setup_gcp.sh
```
The setup script:
- enables required GCP APIs
- creates the Artifact Registry repository
- creates the service account used by GitHub Actions
- configures Workload Identity Federation
- stores GROQ_API_KEY in Secret Manager
Continuous deployment
Pushes to main build the image, push it to Artifact Registry, and deploy a new Cloud Run revision through GitHub Actions.
Project Structure
```
freightsense/
|- main.py
|- service.yaml
|- Dockerfile
|- pyproject.toml
|- app/
|  |- api/
|  |- core/
|  |- db/
|  `- static/
|- data/
|  |- benchmarks.json
|  `- DataCoSupplyChainDataset.csv
|- scripts/
|  |- build_benchmarks.py
|  |- setup_gcp.sh
|  `- test_groq.py
`- .github/workflows/deploy.yml
```
Tech Stack
| Layer | Technology |
|---|---|
| API | FastAPI |
| LLM inference | Groq API |
| Data + scoring | pandas + Python |
| Persistence | SQLite via aiosqlite |
| Containerization | Docker |
| Deployment | Google Cloud Run |
Outcome
FreightSense is a practical example of using an LLM as a decision support layer instead of a standalone answer engine. The value comes from pairing model output with explicit scoring, guardrails, and a human override path.