name: 🚀 Feature Request
about: Suggest an idea or a new capability for FireForm.
title: "[FEAT]: Batch LLM extraction to reduce N API calls to 1"
labels: enhancement
assignees: ''
📝 Description
Currently, LLM.main_loop() makes one Ollama API call per form field.
For a form with N fields, this results in:
- N prompts
- N network round-trips
- N model inference cycles
- N JSON parses
This creates a scalability bottleneck, especially for multi-form submissions.
We propose adding a batch extraction method that sends all target fields in a single structured prompt and parses a single JSON response.
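As a rough illustration of the single structured prompt, the sketch below builds one prompt covering every field. The helper name `build_batch_prompt`, its arguments, and the prompt wording are assumptions made for this example, not FireForm's existing prompt.

```python
import json


def build_batch_prompt(fields, source_text):
    """Build one prompt that asks for every target field at once.

    Hypothetical helper: the name, arguments, and prompt wording are
    illustrative assumptions, not FireForm's actual prompt.
    """
    # Show the model the exact JSON shape we expect back, one key per field.
    template = {name: None for name in fields}
    return (
        "Extract the following fields from the text below.\n"
        "Return ONLY a JSON object with exactly these keys; "
        "use null for any value that is not present.\n\n"
        f"Expected shape:\n{json.dumps(template, indent=2)}\n\n"
        f"Text:\n{source_text}\n"
    )


prompt = build_batch_prompt(
    ["name", "date", "location"],
    "Officer Lee reported a fire on 2024-05-01.",
)
```

Listing the keys in the prompt gives the parser a fixed set of names to look up, so a missing field can be detected rather than silently skipped.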
💡 Rationale
At scale, sequential per-field API calls significantly increase latency and infrastructure load.
Example:
- 1 form × 10 fields → 10 LLM calls
- 3 forms × 10 fields → 30 LLM calls
Reducing this to a single API call per form would:
- Improve performance
- Reduce response time
- Lower model inference overhead
- Improve scalability for multi-agency submissions
🛠️ Proposed Solution
Introduce a new method, e.g., LLM.main_loop_batch():
- Build one structured prompt containing all target fields
- Instruct the model to return strict JSON
- Parse once
- Strip markdown code fences (```json) if present
- Handle null and list values
- Gracefully fall back to main_loop() if JSON parsing fails

- Logic change in src/
- New prompt for Mistral/Ollama
- Unit tests in tests/test_llm.py
- Update src/filler.py to use batch method
✅ Acceptance Criteria
- docs/
📌 Additional Context
Before:
N fields → N API calls
After:
N fields → 1 API call
This change cuts LLM calls per form from N to 1 while maintaining backward compatibility: main_loop() stays available as the fallback when the batch response cannot be parsed.
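The parsing and fallback steps listed under Proposed Solution could look roughly like the sketch below. All names here (`parse_batch_response`, `extract`, `call_batch`, `call_per_field`) are hypothetical stand-ins; the two callables represent the batch request and the existing per-field path, and no real Ollama call is made.

```python
import json
import re

# Matches an optional markdown fence (three backticks, optionally tagged
# "json") wrapped around the whole response.
_FENCE_RE = re.compile(
    r"^\s*`{3}(?:json)?\s*(.*?)\s*`{3}\s*$", re.DOTALL | re.IGNORECASE
)


def parse_batch_response(raw, fields):
    """Return {field: value} from a model response, or None if unusable."""
    match = _FENCE_RE.match(raw)
    text = match.group(1) if match else raw.strip()
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    # Missing keys become None; JSON null and list values pass through as-is.
    return {name: data.get(name) for name in fields}


def extract(fields, call_batch, call_per_field):
    """Try one batch call; fall back to per-field calls if parsing fails."""
    parsed = parse_batch_response(call_batch(fields), fields)
    if parsed is not None:
        return parsed
    # Fallback path: one call per field, as main_loop() does today.
    return {name: call_per_field(name) for name in fields}
```

Returning None on any parse failure keeps the fallback decision in one place, so the per-field path only runs when the single batch response is unusable.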