
feat(backend): add confidence score for each extracted field #66

Open
Godatcode wants to merge 2 commits into fireform-core:main from Godatcode:feat/60-confidence-score-extraction

Summary

Closes #60

  • Add a confidence score (0.0–1.0) to each LLM-extracted field, changing the output format from raw values to {"value": "...", "confidence": 0.92}
  • Update the Ollama prompt and payload ("format": "json") to request structured JSON responses
  • Add parse_llm_response() with clamping and graceful fallback for malformed responses
  • Log [WARNING] for fields with confidence < 0.5
  • Ensure Fill.fill_form() still passes plain strings to PDF filling
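
For reviewers who want a concrete picture of the parsing behavior described above, here is a minimal sketch of what a `parse_llm_response()` with clamping and fallback could look like. It is not the code from this PR: the signature, the fallback of keeping the raw text with confidence 0.0, and the NaN handling are all assumptions made for illustration.

```python
import json
import math


def parse_llm_response(raw):
    """Return {"value": ..., "confidence": float in [0.0, 1.0]}.

    Illustrative sketch only; the fallback values are assumptions.
    """
    # Assumed fallback: keep the raw text and mark it as fully uncertain.
    fallback = {"value": raw, "confidence": 0.0}

    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return fallback

    # The response must be a JSON object that carries a "value" key.
    if not isinstance(data, dict) or "value" not in data:
        return fallback

    try:
        confidence = float(data.get("confidence", 0.0))
    except (TypeError, ValueError):
        confidence = 0.0

    # json.loads accepts the literal NaN, which would defeat min/max clamping.
    if math.isnan(confidence):
        confidence = 0.0

    # Clamp into the documented 0.0-1.0 range.
    confidence = max(0.0, min(1.0, confidence))
    return {"value": data["value"], "confidence": confidence}
```

With this sketch, an out-of-range score such as 1.7 is clamped to 1.0, and a response that is not valid JSON comes back as the raw string with confidence 0.0.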

Changes

  • src/backend.py: Updated build_prompt(), main_loop(), add_response_to_json(); added parse_llm_response() and get_confidence_report()
  • src/test/test_confidence.py: 17 unit tests covering JSON parsing, confidence clamping, fallback behavior, storage format, and warning output

Test plan

  • All 17 unit tests pass: PYTHONPATH=src python -m pytest src/test/test_confidence.py -v
  • End-to-end test with Ollama: verify [LOG] Resulting JSON shows {"value": ..., "confidence": ...} format
  • Verify [WARNING] appears for low-confidence fields
  • Verify filled PDF contains correct plain string values (not stringified dicts)

…m-core#60)

Extend LLM extraction to return a confidence score (0-1) alongside each
extracted value, enabling detection of uncertain extractions and
supporting future human-review workflows.

- Update build_prompt() to request JSON {value, confidence} format
- Add "format": "json" to Ollama payload for reliable JSON output
- Add parse_llm_response() with clamping and graceful fallback
- Update add_response_to_json() to store {value, confidence} dicts
- Log [WARNING] for fields with confidence < 0.5
- Update Fill.fill_form() to extract .value so PDF filling still works
- Add get_confidence_report() convenience method
- Add 17 unit tests covering parsing, clamping, fallback, and storage
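
As a rough illustration of the prompt and payload changes listed above, the sketch below builds a request body for Ollama's generate endpoint with `"format": "json"` set. The prompt wording, the `build_prompt()` signature, the `build_payload()` helper, and the model name are assumptions for the example, not the code in this PR.

```python
import json


def build_prompt(field_name, transcript):
    """Hypothetical prompt asking for a {value, confidence} JSON object."""
    return (
        f"Extract the field '{field_name}' from the text below.\n"
        'Respond only with JSON of the form {"value": "...", "confidence": 0.0}, '
        "where confidence is a number between 0 and 1.\n\n"
        f"Text: {transcript}"
    )


def build_payload(prompt, model="mistral"):
    """Request body for Ollama's /api/generate; the model name is an assumption."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,   # ask for a single response object, not a stream
        "format": "json",  # constrain the model output to valid JSON
    }


if __name__ == "__main__":
    payload = build_payload(build_prompt("incident_date", "The fire started on May 3."))
    print(json.dumps(payload, indent=2))
```

Even with `"format": "json"`, the model can return JSON that lacks the expected keys, so a tolerant parser is still needed on the receiving side.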

Godatcode commented Feb 25, 2026

@juanalvv @vharkins @marcvergees — requesting your review on this PR. It adds confidence scores to the LLM extraction pipeline (issue #60). Happy to address any feedback!

@Godatcode

@juanalvv @vharkins1 @marcvergees Could you review this when you get a chance? Happy to address any feedback.

- Route low-confidence and JSON parse failure warnings to stderr
  instead of stdout for cleaner output separation
- Add warning log when LLM response fails JSON parsing
- Update tests to check stderr, add parse failure warning test
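
A minimal sketch of routing low-confidence warnings to stderr is shown below. The function name `warn_low_confidence` and the exact message text are hypothetical; only the `[WARNING]` prefix, the 0.5 threshold, and the use of stderr come from this PR's description.

```python
import sys


def warn_low_confidence(field, confidence, threshold=0.5, stream=None):
    """Write a [WARNING] line to stderr when confidence is below the threshold.

    Returns True if a warning was emitted. The message format is illustrative.
    """
    # Resolve sys.stderr at call time so tests can redirect or capture it.
    out = sys.stderr if stream is None else stream
    if confidence < threshold:
        print(
            f"[WARNING] Low confidence ({confidence:.2f}) for field '{field}'",
            file=out,
        )
        return True
    return False
```

Because the warning goes to stderr, stdout stays limited to the regular `[LOG]` output and can be piped or parsed without filtering.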
Development

Successfully merging this pull request may close these issues.

[FEAT]: Add Confidence Score for Each Extracted Field
