🩺 Clinical Insight Engine

Clinical Decision Support for Preventive Diabetes Risk Assessment

Interpretable ML + Modern React — built for clinicians and patients alike

A full-stack clinical decision support system that surfaces early diabetes risk signals from routine patient data.
Combines an interpretable ML model with a modern React frontend, presenting results tailored for both clinicians and patients.

Warning

Medical Disclaimer — This system is intended for educational and research purposes only. It does not provide medical diagnoses and should not be used as a substitute for professional medical advice.

💡 Why Clinical Insight Engine?

Diabetes affects over 500 million adults worldwide, yet early risk signals are often buried in routine clinical data. Clinical Insight Engine bridges that gap:

Problem	Our Approach
Risk models are opaque black boxes	Interpretable Logistic Regression with per-feature impact scores
Results are one-size-fits-all	Dual-view output — detailed for clinicians, simplified for patients
Predictions lack context	Confidence-aware assessments with actionable follow-up recommendations
Patient data sits in silos	Longitudinal tracking with full assessment history

✨ Key Features

🧾 Risk Assessment Form

Collects clinically relevant inputs:

Age · Gender · Hypertension · Heart Disease · Smoking History · BMI · HbA1c · Blood Glucose

👥 Dual-View Results

🩻 Clinician View

Exact risk percentage (0–100%)
Top contributing factors with impact scores
Model confidence indicators
Suggested clinical follow-up actions
Interactive factor contribution charts

🧑‍⚕️ Patient View

Simplified category: LOW / MODERATE / HIGH
Plain-language explanation of risk drivers
Personalized preventive lifestyle guidance

🕒 Assessment History

Stores assessments with full timestamps
Enables longitudinal patient risk tracking over time

📊 Data Visualization

Interactive bar charts for factor contributions
Diabetes correlation heatmap for data exploration

🏗 Architecture

graph TB
    subgraph Client["🖥️ Client — React + TypeScript"]
        UI["Risk Assessment Form"]
        CV["Clinician View"]
        PV["Patient View"]
        VIZ["Data Visualizations"]
        HIST["Assessment History"]
    end

    subgraph Server["⚙️ Server — Express.js"]
        API["REST API Routes"]
        VAL["Zod Validation"]
        ORM["Drizzle ORM"]
        PY["Python Bridge"]
    end

    subgraph ML["🧠 ML Pipeline — Python"]
        PROC["Data Preprocessing"]
        MODEL["Logistic Regression"]
        INTERP["Feature Interpretation"]
        CACHE["Model Cache (pickle)"]
    end

    subgraph DB["🗄️ PostgreSQL"]
        ASSESS["Assessments Table"]
    end

    Client -->|"HTTP Requests"| API
    API --> VAL --> ORM
    API --> PY -->|"spawn process"| ML
    ORM --> DB
    ML -->|"risk scores + factors"| PY
    CACHE -.->|"load cached model"| MODEL

🛠 Tech Stack

Layer	Technology	Purpose
Frontend	React 18 + TypeScript	UI framework with type safety
	Vite	Lightning-fast dev server & bundler
	Tailwind CSS	Utility-first styling with dark mode
	TanStack Query	Server state & cache management
	React Hook Form + Zod	Form handling with schema validation
	Recharts	Interactive data visualizations
	Framer Motion	Smooth UI animations
Backend	Express.js	REST API server
	Drizzle ORM	Type-safe database queries
	PostgreSQL 14+	Relational data storage
	Zod	Runtime schema validation
ML Pipeline	Python 3.10+	ML runtime environment
	scikit-learn	Logistic Regression model
	pandas / NumPy	Data manipulation & preprocessing
	pickle	Model & scaler caching

🚀 Getting Started

Prerequisites

Tool	Version	Check	Download
Node.js	18+ LTS	`node -v`	nodejs.org
npm	9+	`npm -v`	bundled with Node
Python	3.10+	`python3 --version`	python.org
PostgreSQL	14+	`psql --version`	postgresql.org
Git	Any	`git --version`	git-scm.com
Docker	20+	`docker --version`	docker.com
Docker Compose	2+	`docker compose version`	bundled with Docker

🐳 Fast Setup with Docker (Recommended)

If you have Docker installed, you can skip the manual installation of Node.js, Python, and PostgreSQL entirely. Running the application requires just a single command.

1. Launching the App

Simply run the following command in the project root:

docker compose up

This command will:

Spin up a PostgreSQL 16 database container with persistent storage.
Build the app container including Node.js 20 and a Python 3 virtual environment with all scikit-learn/pandas dependencies.
Wait for the database to be healthy, then run migrations (npm run db:push).
Automatically seed the database with sample clinical assessments (in development mode).
Launch the full-stack server with live-reloading (HMR) enabled.

Once started, open your browser and navigate to:

Web App & REST API: http://localhost:3000

2. Stop the App

To stop the services while preserving your data:

docker compose down

To stop the services and completely reset the database (deleting persistent volumes):

docker compose down -v

3. Rebuilding after Updates

If you update package.json or requirements.txt dependencies, trigger a clean rebuild:

docker compose up --build

⚙️ Manual Installation & Setup

1. 📥 Clone & Install

git clone https://github.com/gopaljilab/Clinical-Insight-Engine.git
cd Clinical-Insight-Engine
npm install

2. 🔐 Environment Configuration

Linux / macOS

cp .env.example .env

Windows (PowerShell)

Copy-Item .env.example .env

Windows (Command Prompt)

copy .env.example .env

If .env.example doesn't exist, create .env manually and add:

DATABASE_URL=postgresql://postgres:postgres@localhost:5432/clinical_insight_engine

🧪 Developer Authentication Setup (optional)

For local frontend authentication testing, create a .env.local file (git-ignored):

NODE_ENV=development
NEXT_PUBLIC_APP_URL=http://localhost:3000

DEV_CLINICIAN_EMAIL=developer@cardioguard.local
DEV_CLINICIAN_PASSWORD=DevSecurePassword123!

NEXT_PUBLIC_LOCAL_ENCRYPTION_KEY=your_local_32_character_secret_key_here

Rules of thumb:

🔒 .env → database & server secrets only

🔒 .env.local → local seeded credentials only (never commit)

Restart the dev server after editing .env.local so Vite reloads variables

Never paste demo credentials into UI, docs, screenshots, or PRs

🖥️ Local Login Workflow

Start the app with npm run dev
Open http://localhost:5173
Click Login or Go to App
Enter your .env.local seeded credentials
Complete the simulated OTP step
You'll be redirected to /dashboard

In development mode, the login form shows a small amber notice reminding you to use local seeded credentials. This banner and the DEV_* variables are never exposed in production builds.

3. 🗄 Database Setup

🐧 Linux (Ubuntu / Debian)

# Install PostgreSQL
sudo apt update && sudo apt install postgresql postgresql-contrib

# Start & enable the service
sudo systemctl start postgresql
sudo systemctl enable postgresql

# Create database & set password
sudo -u postgres psql -c "ALTER USER postgres WITH PASSWORD 'postgres';"
sudo -u postgres psql -c "CREATE DATABASE clinical_insight_engine;"

🍎 macOS (Homebrew)

# Install PostgreSQL
brew install postgresql

# Start the service
brew services start postgresql

# Create database & set password
psql postgres -c "ALTER USER postgres WITH PASSWORD 'postgres';"
psql postgres -c "CREATE DATABASE clinical_insight_engine;"

🪟 Windows

Download and install PostgreSQL from postgresql.org/download/windows
During installation, use:
- Username: postgres
- Password: postgres
- Port: 5432
Create a database named clinical_insight_engine using pgAdmin or the PostgreSQL CLI.
Update your .env file:

DATABASE_URL=postgresql://postgres:postgres@localhost:5432/clinical_insight_engine

Push the database schema:

npm run db:push

The server runs a PostgreSQL preflight check on startup. If you see Database startup check failed, verify that:

PostgreSQL service is running

DATABASE_URL in .env is correct

The migration above has been run

Port 5432 is not blocked

4. 🐍 Python Environment

🐧 Linux / 🍎 macOS

# Create virtual environment
python3 -m venv .venv

# Activate
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

🪟 Windows (PowerShell)

# Create virtual environment
py -m venv .venv

# Activate
.\.venv\Scripts\Activate.ps1

# Install dependencies
pip install -r requirements.txt

5. 📊 Dataset Preparation

If the dataset already exists in the project:

# Linux / macOS
cp attached_assets/diabetes_dataset.csv ./diabetes_dataset.csv

# Windows (PowerShell)
Copy-Item attached_assets/diabetes_dataset.csv ./diabetes_dataset.csv

If the dataset is missing, generate synthetic data:

# Linux / macOS
python3 -c "from analyze import create_synthetic_data; create_synthetic_data()"

# Windows
py -c "from analyze import create_synthetic_data; create_synthetic_data()"

6. 🚀 Launch

# Start the full-stack dev server
npm run dev

Service	URL
Frontend	http://localhost:5173
Backend API	http://localhost:3000

7. 🛑 Shutting Down

Stop the dev server:

Ctrl + C

Deactivate the Python environment:

deactivate

📁 Project Structure

Clinical-Insight-Engine/
│
├── client/                        # React frontend
│   └── src/
│       ├── components/            # Reusable UI components
│       ├── pages/                 # Route-level page components
│       ├── hooks/                 # Custom React hooks
│       │   ├── use-assessments.ts # TanStack Query hooks for API calls
│       │   └── use-toast.ts       # Toast notification state
│       ├── lib/                   # Utilities & API client
│       │   ├── queryClient.ts     # Global fetch config + React Query setup
│       │   └── utils.ts           # cn() Tailwind class merge utility
│       └── utils/
│           ├── search_filters.ts  # Patient search & filter logic
│           └── date_fix.ts        # Safe date parser helper
│
├── server/                        # Express.js backend
│   ├── index.ts                   # Server entry point & startup
│   ├── routes.ts                  # API route definitions
│   ├── storage.ts                 # Data access layer (DB queries)
│   ├── db.ts                      # Drizzle ORM + PostgreSQL pool
│   ├── static.ts                  # Serves built React frontend
│   ├── vite.ts                    # Vite dev server integration (HMR)
│   └── db_fix.ts                  # Clean process exit on DB errors
│
├── shared/                        # Shared between client & server
│   ├── schema.ts                  # Drizzle DB schema + Zod types
│   └── routes.ts                  # Shared API request/response schemas
│
├── script/
│   └── build.ts                   # esbuild + Vite production build script
│
├── attached_assets/               # Static assets (dataset, images)
│   └── diabetes_dataset.csv
│
├── analyze.py                     # ML pipeline — training & inference
├── main.py                        # Python entry point
├── diabetes_dataset.csv           # Training dataset (root copy)
├── correlation_heatmap.png        # Diabetes feature correlation heatmap
├── patient.json                   # Sample patient input for CLI prediction
│
├── drizzle.config.ts              # Drizzle ORM configuration
├── vite.config.ts                 # Vite bundler configuration
├── tailwind.config.ts             # Tailwind CSS configuration
├── tsconfig.json                  # TypeScript configuration
├── postcss.config.js              # PostCSS configuration
├── components.json                # shadcn/ui component registry
├── pyproject.toml                 # Python project metadata
├── requirements.txt               # Python dependencies
├── package.json                   # Node.js dependencies & scripts
├── package-lock.json              # Locked dependency versions
├── uv.lock                        # uv Python lock file
│
├── README.md                      # Project documentation
├── ANALYSIS_README.md             # ML analysis documentation
├── CONTRIBUTING.md                # Contribution guidelines
└── CODE_OF_CONDUCT.md             # Community code of conduct

📡 API Reference

Method	Endpoint	Description
`GET`	`/health`	Application health check endpoint for monitoring
`POST`	`/api/assessments`	Submit a new risk assessment
`GET`	`/api/assessments`	Retrieve assessment history
`GET`	`/api/assessments/:id`	Get a specific assessment by ID
`POST`	`/api/ingest/fhir`	Ingest a FHIR R4 JSON bundle

Example Request

# Health Check
curl -X GET http://localhost:3000/health

# Submit Assessment
curl -X POST http://localhost:3000/api/assessments \
  -H "Content-Type: application/json" \
  -d '{
    "gender": "Female",
    "age": 52,
    "hypertension": true,
    "heartDisease": false,
    "smokingHistory": "former",
    "bmi": 30.1,
    "hba1cLevel": 6.4,
    "bloodGlucoseLevel": 148
  }'

FHIR Ingestion & Explainable Insights

Allows submitting standard FHIR R4 JSON bundles containing patient demographic details, clinical vitals/lab values, and clinical notes.

Supported Resources

Patient: Extracts id, name, gender (mapped to Male/Female), and calculates patient age from birthDate.
Observation: Extracts clinical values such as BMI, HbA1c, Blood Glucose, and flags hypertension and heartDisease using LOINC codes and display terms.
DocumentReference: Extracts note titles, descriptions, and decoded base64 attachments, merging them into a unified clinical note transcript.

💡 Explainable Insights (Source Citation & Highlighting)

To ensure clinical decisions are traceable and verifiable, the pipeline extracts source citations for key clinical features. When note text is found in DocumentReference entries, the parser:

Performs regex/vitals and keyword scanning for Hypertension (e.g. BP measurements like 145/90 or keywords like hypertension), Heart Disease (e.g. CAD, myocardial infarction), and Smoking History (e.g. former smoker, never smoked).
Extracts the exact sentence snippet enclosing the evidence (source_snippet).
Computes the zero-indexed character bounds [start, end] within the raw concatenated text (source_index).
If no evidence is found, these values are returned as null.

Example API Response Payload

A successful FHIR ingestion response returns the extracted clinical note and explainable insights:

{
  "status": "success",
  "id": 42,
  "clinical_note": "Routine visit. BP reading 145/95 noted. Quit smoking last year.",
  "explainable_insights": [
    {
      "insight": "Patient shows signs of hypertension",
      "source_snippet": "BP reading 145/95 noted",
      "source_index": [15, 38]
    },
    {
      "insight": "Patient shows signs of heart disease",
      "source_snippet": null,
      "source_index": null
    },
    {
      "insight": "Patient has a smoking history (former)",
      "source_snippet": "Quit smoking last year",
      "source_index": [40, 62]
    }
  ]
}

🖥️ Interactive Note Viewer

On the Clinician View tab of the results page, the clinical note is rendered in an interactive viewer:

Interactive Highlights: Clicking any cited insight automatically highlights the matching text in the note.
Auto-Scroll: The highlighted source text is scrolled smoothly into view.
Keyboard Navigation:
- Use Arrow Down / Arrow Right to move to the next cited insight.
- Use Arrow Up / Arrow Left to move to the previous cited insight.
- Press Escape to clear the selection and highlight.

Example Request

curl -X POST http://localhost:3000/api/ingest/fhir \
  -H "Content-Type: application/json" \
  -d '{
    "resourceType": "Bundle",
    "type": "collection",
    "entry": [
      {
        "resource": {
          "resourceType": "Patient",
          "id": "pat-123",
          "name": [
            {
              "use": "official",
              "given": ["John", "Edward"],
              "family": "Smith"
            }
          ],
          "gender": "male",
          "birthDate": "1980-01-01"
        }
      },
      {
        "resource": {
          "resourceType": "Observation",
          "code": {
            "coding": [
              {
                "system": "http://loinc.org",
                "code": "39156-5",
                "display": "Body Mass Index"
              }
            ]
          },
          "valueQuantity": {
            "value": 24.5,
            "unit": "kg/m2"
          }
        }
      }
    ]
  }'

🧠 ML Pipeline

The machine learning pipeline (analyze.py) implements an interpretable risk assessment model:

graph LR
    A["📂 Raw Data"] --> B["🧹 Cleaning & Validation"]
    B --> C["⚙️ Feature Engineering"]
    C --> D["📏 StandardScaler"]
    D --> E["📊 Logistic Regression"]
    E --> F["🎯 Risk Score 0–100%"]
    E --> G["📋 Feature Importance"]
    F --> H["💾 Cached Model"]
    G --> H

Step	Details
Data Cleaning	Filters unrealistic values (BMI < 10, glucose < 50, HbA1c < 3) and replaces with medians
Encoding	Gender → binary; Smoking history → one-hot encoding
Scaling	`StandardScaler` on age, BMI, HbA1c, blood glucose
Model	`LogisticRegression` with balanced class weights
Caching	Trained model + scaler serialized via `pickle` for fast inference

Train the Model (Optional)

# Linux/macOS
python3 analyze.py

# Windows
py analyze.py

🔬 Single-Patient Prediction (CLI)

Create a patient JSON file:

{
  "gender": "Female",
  "age": 52,
  "hypertension": true,
  "heartDisease": false,
  "smokingHistory": "former",
  "bmi": 30.1,
  "hba1cLevel": 6.4,
  "bloodGlucoseLevel": 148
}

Run prediction:

# Linux/macOS
python3 analyze.py predict_file patient.json

# Windows
py analyze.py predict_file patient.json

🔑 Environment Variables

Variable	File	Description
`DATABASE_URL`	`.env`	PostgreSQL connection string
`NODE_ENV`	`.env.local`	Set to `development` for local dev features
`SESSION_SECRET`	`.env`	Required in production for signed Express sessions
`DEV_CLINICIAN_EMAIL`	`.env.local`	Seeded clinician email (dev only)
`DEV_CLINICIAN_PASSWORD`	`.env.local`	Seeded clinician password (dev only)
`NEXT_PUBLIC_LOCAL_ENCRYPTION_KEY`	`.env.local`	Local encryption key (dev only)
`ENABLE_PHI_REDACTION`	`.env`	Enable privacy-preserving PHI redaction (defaults to `true`)

Security: .env.local is git-ignored and should never be committed. Production builds do not expose dev credentials.

Request limits: JSON and URL-encoded API payloads are limited to 256kb by default. Add route-specific upload handling before increasing this global limit. Production sessions: When the app runs behind a TLS-terminating reverse proxy or load balancer, Express trusts one proxy hop in production so secure session cookies are issued from X-Forwarded-Proto: https requests.

❓ Troubleshooting

"PostgreSQL is unreachable"

Verify PostgreSQL is running: sudo systemctl status postgresql (Linux) or brew services list (macOS)
Confirm DATABASE_URL in .env matches your local credentials
Ensure port 5432 is not blocked by another process
Check that the clinical_insight_engine database exists

"Database startup check failed"

Run npm run db:push to create/update the required tables
Verify your .env file is in the project root (not inside server/ or client/)

Python model errors

Ensure the virtual environment is activated: source .venv/bin/activate
Verify dependencies: pip install -r requirements.txt
If diabetes_dataset.csv is missing, copy it: cp attached_assets/diabetes_dataset.csv ./
Or generate synthetic data: python3 -c "from analyze import create_synthetic_data; create_synthetic_data()"

Port conflicts

The dev server defaults to port 5173 (Vite)
If occupied, Vite will automatically pick the next available port
Check for processes: lsof -i :5173 (Linux/macOS) or netstat -ano | findstr :5173 (Windows)

🗺 Roadmap

📈 Longitudinal patient risk tracking across visits
💡 Counterfactual reasoning — "What single change reduces risk most?"
🔬 Cohort discovery and population-level insights
🏥 Integration with Electronic Health Records (EHR)
⚖️ Advanced bias detection and ML fairness metrics
☁️ Cloud deployment (Vercel / Render)

🤝 Contributing

We love contributions! Whether it's a bug fix, a new feature, or improved docs — every PR makes a difference.

Fork the repository
Create your feature branch (git checkout -b feat/amazing-feature)
Commit your changes (git commit -m 'feat: add amazing feature')
Push to the branch (git push origin feat/amazing-feature)
Open a Pull Request

Please read our Contributing Guide and Code of Conduct before submitting.

👥 Contributors

👤 Author -

Gopal Gupta Computer Science Engineer · Full-Stack Developer · Data Science & ML Enthusiast

Built with ❤️ for better preventive healthcare

⭐ Star this repo if you find it useful — it helps others discover the project!

GSSoC Drizzle Migrations Policy

All schema changes must go through drizzle-kit generate.

✨ README Improvement Notes

📌 Formatting Enhancements Needed

Improve heading hierarchy for better readability
Ensure consistent spacing between sections
Use proper Markdown formatting for code blocks and lists
Align all installation and usage steps properly

🚀 Suggested Structure Upgrade

Introduction
Features
Tech Stack
Installation
Usage
Project Structure
Contribution Guidelines
License

🛠️ Documentation Improvements

Add badges (optional): build, license, contributors
Add screenshots for better UI understanding
Standardize code blocks for commands

🎯 Goal

Improve onboarding experience for new contributors and users by making README more structured, readable, and professional.

Name		Name	Last commit message	Last commit date
Latest commit History 641 Commits
.agents/rules		.agents/rules
.cursor		.cursor
.github		.github
.jetro		.jetro
app		app
attached_assets		attached_assets
client		client
coverage		coverage
diagnosed-issues		diagnosed-issues
docs		docs
migrations		migrations
script		script
scripts		scripts
server		server
services		services
shared		shared
tests		tests
validation		validation
.dockerignore		.dockerignore
.editorconfig		.editorconfig
.env.example		.env.example
.eslintrc.json		.eslintrc.json
.gitattributes		.gitattributes
.gitignore		.gitignore
.mcp.json		.mcp.json
.nvmrc		.nvmrc
.prettierrc		.prettierrc
.windsurfrules		.windsurfrules
AGENT.md		AGENT.md
ANALYSIS_README.md		ANALYSIS_README.md
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md
analyze.py		analyze.py
clinical_insight_banner.png		clinical_insight_banner.png
components.json		components.json
correlation_heatmap.png		correlation_heatmap.png
create_issues.sh		create_issues.sh
diabetes_dataset.csv		diabetes_dataset.csv
diabetes_model.pkl.sig		diabetes_model.pkl.sig
docker-compose.yml		docker-compose.yml
drizzle.config.ts		drizzle.config.ts
job_logs.txt		job_logs.txt
main.py		main.py
my_changes.patch		my_changes.patch
package-lock.json		package-lock.json
package.json		package.json
patch_backend.js		patch_backend.js
patient.json		patient.json
playwright.config.ts		playwright.config.ts
postcss.config.cjs		postcss.config.cjs
pr_body.md		pr_body.md
pyproject.toml		pyproject.toml
pytest.ini		pytest.ini
replit.md		replit.md
requirements.txt		requirements.txt
tailwind.config.cjs		tailwind.config.cjs
tsc-output-2.txt		tsc-output-2.txt
tsconfig.json		tsconfig.json
typedoc.json		typedoc.json
uv.lock		uv.lock
vercel.json		vercel.json
vite.config.ts		vite.config.ts
vitest.config.ts		vitest.config.ts
vitest.workspace.ts		vitest.workspace.ts

Folders and files

Latest commit

History

Repository files navigation

🩺 Clinical Insight Engine

Clinical Decision Support for Preventive Diabetes Risk Assessment

📑 Table of Contents

💡 Why Clinical Insight Engine?

✨ Key Features

🧾 Risk Assessment Form

👥 Dual-View Results

🕒 Assessment History

📊 Data Visualization

🏗 Architecture

🛠 Tech Stack

🚀 Getting Started

Prerequisites

🐳 Fast Setup with Docker (Recommended)

1. Launching the App

2. Stop the App

3. Rebuilding after Updates

⚙️ Manual Installation & Setup

1. 📥 Clone & Install

2. 🔐 Environment Configuration

🖥️ Local Login Workflow

3. 🗄 Database Setup

4. 🐍 Python Environment

5. 📊 Dataset Preparation

6. 🚀 Launch

7. 🛑 Shutting Down

📁 Project Structure

📡 API Reference

Example Request

FHIR Ingestion & Explainable Insights

Supported Resources

💡 Explainable Insights (Source Citation & Highlighting)

Example API Response Payload

🖥️ Interactive Note Viewer

Example Request

🧠 ML Pipeline

Train the Model (Optional)

🔬 Single-Patient Prediction (CLI)

🔑 Environment Variables

❓ Troubleshooting

🗺 Roadmap

🤝 Contributing

👥 Contributors

👤 Author -

GSSoC Drizzle Migrations Policy

✨ README Improvement Notes

📌 Formatting Enhancements Needed

🚀 Suggested Structure Upgrade

🛠️ Documentation Improvements

🎯 Goal

About

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages