Visual‑analytics & ML pipeline for Nova Scotia Open Data tenders - clustering, BERTopic, and a one‑click Docker deploy.

samshad/public-tender-analysis-dashboard

Public Tender Analysis & Visualization Dashboard (Nova Scotia)

A one‑stop, interactive dashboard that turns raw Nova Scotia Open Data tender records into actionable insights. Built as the final project for Dalhousie’s CSCI 6612 – Visual Analytics, it blends modern data‑processing, machine‑learning and visualization tooling into a single Docker‑deployable app.

✨ TL;DR

  • Dash + Plotly UI with two complementary lenses
    • Cluster View – macro patterns across 15 clusters (plus a curated Health super‑cluster).
    • Entity View – micro drill‑downs for a single public entity.
  • Machine Learning
    • K‑Means & Agglomerative clustering on entity behaviour.
    • Context‑aware BERTopic modelling of 125k+ tender descriptions to surface procurement themes.
  • Dynamic UX – modal pop‑ups, cross‑filtering, category toggles, topic timelines.
  • One‑click deploy via Docker Compose; hot‑reload for development.

Table of Contents

  1. Features
  2. Quick Start
  3. Local Development
  4. Folder Structure
  5. Data Pipeline
  6. License

1. Features

📊 Interactive Visual Analytics

  • Real‑time filters for cluster, entity, year, category (Goods | Services | Construction).
  • Linked bar & line charts (awarded amount, tender counts, vendor concentration).
  • Modal drill‑downs with full tender meta‑data.
  • Hover & click callbacks for instant contextual narratives.
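
The filter-then-chart flow described above can be sketched in plain Python. This is a minimal illustration, not the project's actual callback code: the function names and the row keys (`cluster`, `entity`, `year`, `category`, `awarded_amount`) are assumptions. The figure is returned as a plain dict in Plotly's figure format, which `dcc.Graph(figure=...)` accepts, so a Dash callback could return it directly.

```python
from collections import defaultdict


def filter_tenders(rows, cluster=None, entity=None, year=None, category=None):
    """Return the rows matching every filter that is set; None means 'no filter'."""
    wanted = {"cluster": cluster, "entity": entity, "year": year, "category": category}
    active = {key: value for key, value in wanted.items() if value is not None}
    return [row for row in rows if all(row[key] == value for key, value in active.items())]


def awarded_amount_by_year(rows):
    """Build a Plotly-style figure dict: total awarded amount per year as a bar chart."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["year"]] += row["awarded_amount"]
    years = sorted(totals)
    return {
        "data": [{"type": "bar", "x": years, "y": [totals[y] for y in years]}],
        "layout": {
            "title": {"text": "Awarded amount by year"},
            "xaxis": {"title": {"text": "Year"}},
            "yaxis": {"title": {"text": "Awarded amount"}},
        },
    }
```

In a callback, the dropdown values would be passed straight into `filter_tenders`, and the filtered rows would feed every linked chart so that all views stay in sync.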

🧠 Machine‑Learning Modules

Task            | Algorithm                                                   | Purpose
Clustering      | K‑Means (k = 15 via the elbow method) + Agglomerative check | Group entities by spend behaviour
Topic modelling | BERTopic (BERT embeddings + HDBSCAN)                        | Extract procurement themes and their trends over time
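
The README states that k = 15 was chosen with the elbow method but not how the elbow was read off the curve. One common heuristic, sketched below as an assumption rather than the project's actual procedure, picks the k whose point lies farthest below the straight line joining the first and last points of the (k, inertia) curve. The function name `pick_elbow` is hypothetical; the inertia values would typically come from fitting K‑Means once per candidate k (for example, the `inertia_` attribute of scikit-learn's `KMeans`).

```python
def pick_elbow(ks, inertias):
    """Return the k whose point lies farthest below the chord joining the
    first and last points of the (k, inertia) curve."""
    if len(ks) != len(inertias) or len(ks) < 3:
        raise ValueError("need at least three (k, inertia) pairs")
    k_span = ks[-1] - ks[0]
    inertia_span = inertias[0] - inertias[-1]
    if k_span <= 0 or inertia_span <= 0:
        raise ValueError("expected increasing k and decreasing inertia")

    best_k, best_gap = ks[0], 0.0
    for k, inertia in zip(ks, inertias):
        x = (k - ks[0]) / k_span                      # 0 at the first k, 1 at the last
        y = (inertia - inertias[-1]) / inertia_span   # 1 at the first k, 0 at the last
        gap = (1.0 - x) - y                           # vertical distance below the chord
        if gap > best_gap:
            best_k, best_gap = k, gap
    return best_k
```

Both axes are rescaled to [0, 1] first so that the result does not depend on the units of inertia. The agglomerative run can then serve as a sanity check on whether the chosen k yields similar groupings under a different algorithm.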

2. Quick Start

Docker (recommended)

git clone https://github.com/samshad/public-tender-analysis-dashboard.git
cd public-tender-analysis-dashboard
docker-compose up --build
# open http://localhost:8050

Local Python

python -m venv .venv && source .venv/bin/activate   # or `.\.venv\Scripts\activate` on Windows
pip install -r requirements.txt
python app.py

3. Local Development

  • Hot‑reload: edit code and Dash restarts automatically.
  • Linting: ruff .

4. Folder Structure

├── app.py                # Dash entry‑point
├── docker-compose.yml    # One‑command deployment
├── Dockerfile            # Light‑weight image (python:3.12‑slim)
├── data/                 # Raw & cleaned tender CSVs
├── data_cleaning/        # Pre‑processing scripts
├── utils/                # ML helpers (clustering, topic model)
├── layouts/              # Reusable Dash layout builders
├── callbacks/            # All Dash callback wiring
├── visualizations/       # Plotly figure factories
├── assets/               # Dash static assets (CSS, images, icons)
└── requirements.txt

5. Data Pipeline

  1. Cleaning & Standardisation
    • Resolve 12 594 vendor spellings → 5 600 unique names.
    • Reduce 225 entity labels → 215 standardised entities.
    • Drop rows with an awarded amount below $1 000 or with a missing description.
  2. Feature Engineering – tender duration, category dummies, inflation‑adjusted spend.
  3. ML – cluster entities, assign topics, persist artefacts.
  4. Dashboard – load artefacts, render interactive views.
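
Steps 1 and 2 can be sketched with the standard library alone. Everything named here is an assumption made for illustration: the row keys (`vendor`, `description`, `awarded_amount`, `issued`, `closed`, `category`), the suffix list, and the reading of the $1 000 rule as a floor on the awarded amount. The actual scripts in `data_cleaning/` may work quite differently, for instance with fuzzy matching for vendor names.

```python
import re

# Hypothetical list of legal suffixes to ignore when comparing vendor names.
LEGAL_SUFFIXES = {"inc", "ltd", "limited", "corp", "corporation", "co"}


def normalise_vendor(name: str) -> str:
    """Collapse case, punctuation and trailing legal suffixes so that
    spelling variants of one vendor share the same key."""
    words = re.sub(r"[^a-z0-9 ]+", " ", name.lower()).split()
    while words and words[-1] in LEGAL_SUFFIXES:
        words.pop()
    return " ".join(words)


def keep_row(row: dict, min_amount: float = 1000.0) -> bool:
    """Keep rows that have a description and an awarded amount >= min_amount."""
    description = (row.get("description") or "").strip()
    amount = row.get("awarded_amount")
    return bool(description) and amount is not None and amount >= min_amount


def engineer_features(row: dict, cpi: dict[int, float], base_year: int) -> dict:
    """Derive tender duration, category dummies and inflation-adjusted spend.

    `cpi` maps a year to a price index; amounts are restated in base_year prices.
    """
    issued, closed = row["issued"], row["closed"]
    return {
        "vendor": normalise_vendor(row["vendor"]),
        "duration_days": (closed - issued).days,
        "is_goods": int(row["category"] == "Goods"),
        "is_services": int(row["category"] == "Services"),
        "is_construction": int(row["category"] == "Construction"),
        "real_amount": row["awarded_amount"] * cpi[base_year] / cpi[issued.year],
    }
```

Rows would first pass through `keep_row`, then `engineer_features`; the resulting per-tender records can be aggregated per entity before clustering.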

6. License

MIT © 2025 Md Samshad Rahman
