📝 Description
Currently, the extraction pipeline in src/llm.py relies on sequential API calls (one per field) and fragile manual string parsing (e.g., value.strip().replace('"', "") inside add_response_to_json) to clean up the LLM's output.
This approach has two major drawbacks:
- Performance bottleneck: A form with 20 fields requires 20 separate inferences, sending the same transcript context to the local LLM repeatedly.
- Fragility: If the LLM hallucinates formatting, wraps its answer in markdown, or provides unhandled string variations, the string parsing fails, resulting in corrupted PDF data.
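A small sketch of the fragility point. The `legacy_clean` helper below is a hypothetical stand-in that mirrors only the expression quoted above (the real add_response_to_json may do more), and the sample replies are invented for illustration:

```python
# Hypothetical stand-in for the cleanup quoted in the description.
def legacy_clean(value: str) -> str:
    # Trim surrounding whitespace and drop every double quote.
    return value.strip().replace('"', "")


FENCE = "`" * 3  # a markdown code fence, built here to keep the example readable

# Invented examples of how a model might answer for a single "name" field.
plain = ' "Jane Doe" '
fenced = FENCE + '\n"Jane Doe"\n' + FENCE
chatty = 'The name is "Jane Doe".'

for reply in (plain, fenced, chatty):
    print(repr(legacy_clean(reply)))
```

Only the first reply comes out as the bare value. The other two keep the markdown fence or the surrounding sentence, and that text would then be written into the PDF field.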
💡 Rationale
To make FireForm production-ready and reliable, the system shouldn't guess the shape of the LLM's response. By integrating pydantic, we can dynamically generate a JSON schema based on the PDF's target fields and pass it directly to Ollama's native structured output capabilities. This forces the LLM to return a strictly typed, fully mapped JSON object in a single pass.
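A minimal sketch of the schema-generation idea, assuming Pydantic v2. The field names and the `build_extraction_model` helper are hypothetical; in the actual code the names would come from self._target_fields. Every field is modelled as a required, nullable string, which is an assumption about how missing values should be represented:

```python
from typing import Optional

from pydantic import create_model

# Hypothetical field names standing in for self._target_fields.
target_fields = ["incident_date", "reporting_officer", "location"]


def build_extraction_model(field_names):
    # (Optional[str], ...) declares a required field that may be null, so the
    # model must mention every field but can say "not found" explicitly.
    definitions = {name: (Optional[str], ...) for name in field_names}
    return create_model("ExtractionResult", **definitions)


ExtractionResult = build_extraction_model(target_fields)
schema = ExtractionResult.model_json_schema()

print(sorted(schema["properties"]))
print(sorted(schema["required"]))
```

PDF field names that are not valid Python identifiers would likely need sanitising or aliases before being passed to create_model; this sketch does not handle that.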
🛠️ Proposed Solution
- Add pydantic>=2.0.0 to the project dependencies.
- Overhaul src/llm.py to dynamically build a Pydantic model with create_model, using the keys from self._target_fields.
- Convert the Pydantic model to a JSON Schema and pass it via the format parameter in the Ollama API payload.
- Remove the legacy add_response_to_json string-cleaning logic entirely, as the API response will be constrained to the JSON dictionary shape expected by filler.py.
- Consolidate main_loop into a single, comprehensive API call rather than iterating over fields.
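The steps above could fit together roughly as follows. This is a sketch under stated assumptions, not the project's implementation: the function names, the prompt wording, the `llama3.1` model name, and the localhost URL are all hypothetical, and it assumes an Ollama version that accepts a JSON Schema in the format field of /api/generate.

```python
import json
import urllib.request
from typing import Optional

from pydantic import create_model

# Assumed address of a local Ollama server; adjust to the real deployment.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_model(field_names):
    # One required, nullable string per target field (an assumption).
    return create_model(
        "ExtractionResult", **{name: (Optional[str], ...) for name in field_names}
    )


def build_payload(model_name, transcript, schema):
    prompt = (
        "Extract the form fields from this transcript. "
        "Use null for anything that is not mentioned.\n\n" + transcript
    )
    # "format" carries the JSON Schema; "stream": False asks for one JSON body.
    return {"model": model_name, "prompt": prompt, "format": schema, "stream": False}


def parse_reply(result_model, api_body):
    # /api/generate returns the generated text under the "response" key.
    # Validation raises pydantic.ValidationError if the text does not fit.
    return result_model.model_validate_json(api_body["response"]).model_dump()


def extract_all_fields(field_names, transcript, model_name="llama3.1"):
    # A single request covers every field, replacing the per-field loop.
    result_model = build_model(field_names)
    payload = build_payload(model_name, transcript, result_model.model_json_schema())
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read())
    return parse_reply(result_model, body)
```

Validating the reply with the same Pydantic model that produced the schema means a malformed response fails loudly at the boundary instead of reaching filler.py.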
✅ Acceptance Criteria
- pydantic is integrated for schema generation.
- main_loop in src/llm.py executes a single API request for all fields simultaneously.
📌 Additional Context
This architectural shift removes the manual string-parsing step that lets malformed model output corrupt PDF data, and it should substantially reduce extraction latency by replacing one inference per field with a single batched call.