feat(backend): add confidence score for each extracted field by Godatcode · Pull Request #66 · fireform-core/FireForm

Godatcode · 2026-02-25T10:39:17Z

Summary

Closes #60

Add a confidence score (0.0–1.0) to each LLM-extracted field, changing the output format from raw values to {"value": "...", "confidence": 0.92}
Update the Ollama prompt and payload ("format": "json") to request structured JSON responses
Add parse_llm_response() with clamping and graceful fallback for malformed responses
Log [WARNING] for fields with confidence < 0.5
Ensure Fill.fill_form() still passes plain strings to PDF filling
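
For reviewers who want a concrete picture of the parsing step, here is a minimal sketch of clamping with a graceful fallback. It reuses the name parse_llm_response() from this PR, but the signature, the tuple return type, and the fallback confidence of 0.0 are assumptions for illustration, not the code in src/backend.py.

```python
import json
import math


def parse_llm_response(raw, fallback_confidence=0.0):
    """Return (value, confidence) from a raw LLM reply.

    Hypothetical signature: the real function in src/backend.py may differ.
    """
    try:
        data = json.loads(raw)
    except (TypeError, ValueError):
        # Malformed reply: keep the raw text as the value, mark it untrusted.
        return str(raw).strip(), fallback_confidence

    if not isinstance(data, dict) or "value" not in data:
        # Valid JSON but not the {"value": ..., "confidence": ...} shape.
        return str(raw).strip(), fallback_confidence

    try:
        confidence = float(data.get("confidence"))
    except (TypeError, ValueError):
        # Missing or non-numeric confidence.
        confidence = fallback_confidence
    if math.isnan(confidence):
        confidence = fallback_confidence

    # Clamp into the documented 0.0-1.0 range.
    return data["value"], max(0.0, min(1.0, confidence))
```

Falling back to the raw text rather than raising keeps one bad reply from aborting the whole extraction; whether 0.0 is the right fallback score is a design choice this sketch simply assumes.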

Changes

src/backend.py: Updated build_prompt(), main_loop(), add_response_to_json(); added parse_llm_response() and get_confidence_report()
src/test/test_confidence.py: 17 unit tests covering JSON parsing, confidence clamping, fallback behavior, storage format, and warning output
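
The storage format and the report helper can be pictured roughly as below. This is a sketch only: store_extraction() and confidence_report() are hypothetical stand-ins for add_response_to_json() and get_confidence_report(), whose real signatures and return shapes are not shown in this PR description.

```python
def store_extraction(store, field, value, confidence):
    """Store one extracted field in the {"value", "confidence"} format.

    Hypothetical stand-in for add_response_to_json().
    """
    store[field] = {"value": value, "confidence": confidence}
    return store


def confidence_report(store):
    """Map each field name to its confidence score.

    Hypothetical stand-in for get_confidence_report(); the real method
    may return a different shape.
    """
    return {field: entry["confidence"] for field, entry in store.items()}


# Usage: two fields, one of them uncertain.
extracted = {}
store_extraction(extracted, "name", "John Doe", 0.92)
store_extraction(extracted, "incident_date", "unknown", 0.3)
report = confidence_report(extracted)
```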

Test plan

All 17 unit tests pass: PYTHONPATH=src python -m pytest src/test/test_confidence.py -v
End-to-end test with Ollama: verify [LOG] Resulting JSON shows {"value": ..., "confidence": ...} format
Verify [WARNING] appears for low-confidence fields
Verify filled PDF contains correct plain string values (not stringified dicts)
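
The last check guards against str() being applied to a whole {"value": ..., "confidence": ...} dict. A minimal sketch of the unwrapping step, assuming the extracted data is a dict keyed by field name; plain_values() is a hypothetical helper, not the actual Fill.fill_form() code:

```python
def plain_values(extracted):
    """Reduce {"value", "confidence"} entries to plain strings for PDF filling.

    Hypothetical helper; entries that are already plain values pass through.
    """
    plain = {}
    for field, entry in extracted.items():
        if isinstance(entry, dict) and "value" in entry:
            entry = entry["value"]
        # Treat a missing value as an empty form field.
        plain[field] = "" if entry is None else str(entry)
    return plain


# Usage: mixed new-format and legacy entries.
form_data = plain_values({
    "name": {"value": "John Doe", "confidence": 0.92},
    "notes": {"value": None, "confidence": 0.1},
    "date": "2026-02-25",
})
```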

…m-core#60)

Extend LLM extraction to return a confidence score (0-1) alongside each extracted value, enabling detection of uncertain extractions and supporting future human-review workflows.

- Update build_prompt() to request JSON {value, confidence} format
- Add "format": "json" to Ollama payload for reliable JSON output
- Add parse_llm_response() with clamping and graceful fallback
- Update add_response_to_json() to store {value, confidence} dicts
- Log [WARNING] for fields with confidence < 0.5
- Update Fill.fill_form() to extract .value so PDF filling still works
- Add get_confidence_report() convenience method
- Add 17 unit tests covering parsing, clamping, fallback, and storage
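
To illustrate the prompt and payload change named in this commit, here is a rough sketch of a request body for Ollama's generate endpoint with "format": "json" set. The prompt wording, the function names, and the model name "mistral" are assumptions; the PR's actual build_prompt() text is not shown here.

```python
import json


def build_field_prompt(field, text):
    """Ask for one field as JSON. Hypothetical wording, not the PR's prompt."""
    return (
        f"Extract the field '{field}' from the text below. "
        'Reply only with JSON of the form {"value": "...", "confidence": 0.0}, '
        "where confidence is a number between 0 and 1.\n\n"
        f"Text: {text}"
    )


def build_generate_payload(prompt, model="mistral"):
    """Build a request body for POST /api/generate (model name is assumed)."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,   # return one complete response instead of chunks
        "format": "json",  # ask Ollama to constrain the output to valid JSON
    }


# Usage: serialize the body that would be sent to the Ollama server.
payload = build_generate_payload(build_field_prompt("name", "Officer John Doe reported..."))
body = json.dumps(payload)
```

Even with "format": "json", the reply can still lack the expected keys, which is why a tolerant parser remains necessary.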

Godatcode · 2026-02-25T10:47:07Z

@juanalvv @vharkins @marcvergees — requesting your review on this PR. It adds confidence scores to the LLM extraction pipeline (issue #60). Happy to address any feedback!

Godatcode · 2026-02-25T11:11:16Z

@juanalvv @vharkins1 @marcvergees Could you review this when you get a chance? Happy to address any feedback.

- Route low-confidence and JSON parse failure warnings to stderr instead of stdout for cleaner output separation
- Add warning log when LLM response fails JSON parsing
- Update tests to check stderr, add parse failure warning test
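
A small sketch of the stderr routing described in this commit, assuming the 0.5 threshold from the PR summary. The function name, the message wording, and the optional stream parameter (added here so the behavior is easy to test) are all hypothetical:

```python
import io
import sys

LOW_CONFIDENCE_THRESHOLD = 0.5  # threshold taken from the PR summary


def warn_if_low_confidence(field, confidence, stream=None):
    """Write a [WARNING] line for a low-confidence field; return True if warned.

    Hypothetical helper: warnings go to stderr by default so that stdout
    carries only the regular [LOG] output.
    """
    stream = sys.stderr if stream is None else stream
    if confidence < LOW_CONFIDENCE_THRESHOLD:
        print(
            f"[WARNING] Low confidence ({confidence:.2f}) for field '{field}'",
            file=stream,
        )
        return True
    return False


# Usage: capture the warning in a buffer instead of the real stderr.
captured = io.StringIO()
warned = warn_if_low_confidence("incident_date", 0.3, stream=captured)
```

Keeping warnings off stdout means the resulting JSON can be piped or parsed without filtering out diagnostic lines.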

Godatcode mentioned this pull request on Feb 25, 2026, in issue #60, "[FEAT]: Add Confidence Score for Each Extracted Field" (Open, 7 tasks).

Godatcode wants to merge 2 commits into fireform-core:main from Godatcode:feat/60-confidence-score-extraction
