
feat: Refactor LLM extraction pipeline to use a single structured JSON generation request #250

Open
kushu30 wants to merge 1 commit into fireform-core:main from kushu30:feature-structured-extraction



@kushu30 commented Mar 14, 2026

Summary

This PR refactors the LLM extraction pipeline to generate structured JSON using a single LLM request instead of issuing one request per field.

Previously, the system iterated through each template field and queried the LLM separately to extract values from the transcript. This resulted in multiple inference calls per form submission, increasing latency and introducing potential inconsistencies across responses.

The new implementation sends a single structured prompt containing all required fields and expects a JSON response from the model. This reduces the number of LLM calls and simplifies the extraction pipeline.

Motivation

The previous extraction approach performed sequential LLM requests for each field in the template:

  • reporting_officer
  • incident_location
  • amount_of_victims
  • victim_name_s
  • assisting_officer

This meant the number of LLM calls scaled linearly with the number of fields in the template.
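
To make the scaling difference concrete, here is a minimal sketch that counts calls for both approaches. It is not the code in src/llm.py: `extract_per_field`, `extract_single`, and `CountingStub` are hypothetical names, and the stub stands in for a real model so the example runs offline.

```python
import json
from typing import Callable

FIELDS = [
    "reporting_officer",
    "incident_location",
    "amount_of_victims",
    "victim_name_s",
    "assisting_officer",
]


class CountingStub:
    """Stand-in for an LLM client (assumption): counts calls, returns empty JSON."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, prompt: str) -> str:
        self.calls += 1
        return "{}"


def extract_per_field(ask: Callable[[str], str], transcript: str, fields: list[str]) -> dict:
    # Old shape: one request per field, so the call count grows with len(fields).
    return {name: ask(f"Extract '{name}' from: {transcript}") for name in fields}


def extract_single(ask: Callable[[str], str], transcript: str, fields: list[str]) -> dict:
    # New shape: one request that names every field and expects a JSON object back.
    return json.loads(ask(f"Return JSON with keys {fields} from: {transcript}"))


old_stub, new_stub = CountingStub(), CountingStub()
extract_per_field(old_stub, "example transcript", FIELDS)
extract_single(new_stub, "example transcript", FIELDS)
print(old_stub.calls, new_stub.calls)
```

With the five fields listed above, the per-field loop issues five requests and the single-request version issues one.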

Using a single structured extraction request cuts the call count to one per form submission, simplifies the pipeline, and gives the model the full schema and transcript in one context when it generates the response.

Key Improvements

  • Replaced multiple sequential LLM requests with a single structured extraction call
  • Simplified extraction logic within the LLM class
  • Improved extraction consistency by providing the model with the full schema
  • Reduced inference overhead for form processing
  • Added structured prompt generation for JSON output

Implementation

The extraction workflow was refactored inside the LLM class.

Before:

Transcript
   ↓
Loop through fields
   ↓
LLM request per field
   ↓
Build JSON response
   ↓
Fill PDF

After:

Transcript
   ↓
Single LLM request
   ↓
Structured JSON response
   ↓
Validate and parse JSON
   ↓
Fill PDF

The structured prompt contains all required fields and instructs the model to return valid JSON.
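
As a rough illustration of what such a prompt could look like, the sketch below builds one from a list of field names. The helper name `build_extraction_prompt` and the exact wording of the instructions are assumptions for this example, not the prompt used in src/llm.py.

```python
import json


def build_extraction_prompt(fields: list[str], transcript: str) -> str:
    """Build one prompt that asks for every field at once (hypothetical helper)."""
    # A skeleton object shows the model the exact keys to return; values default to null.
    skeleton = json.dumps({name: None for name in fields}, indent=2)
    return (
        "Extract the following fields from the transcript.\n"
        "Return only a valid JSON object with exactly these keys. "
        "Use null for any value the transcript does not state.\n\n"
        f"Expected shape:\n{skeleton}\n\n"
        f"Transcript:\n{transcript}\n"
    )


prompt = build_extraction_prompt(
    ["reporting_officer", "incident_location", "assisting_officer"],
    "Officer Voldemort responded to 456 Oak Street.",
)
print(prompt)
```

Embedding a key skeleton in the prompt is one way to keep the returned keys aligned with the template; whether the PR does this is not stated in the description.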

Example extraction result:

{
  "reporting_officer": "Officer Voldemort",
  "incident_location": "456 Oak Street",
  "amount_of_victims": "2",
  "victim_name_s": ["Mark Smith", "Jane Doe"],
  "assisting_officer": null
}
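
The "Validate and parse JSON" step could be sketched as follows. `parse_extraction` is a hypothetical helper, and the policy it applies (fill missing keys with None, drop unexpected keys, tolerate text around the JSON object) is an assumption rather than the behavior confirmed in this PR.

```python
import json


def parse_extraction(raw: str, fields: list[str]) -> dict:
    """Parse model output into a dict with exactly the expected keys (hypothetical helper)."""
    # Models sometimes wrap JSON in prose or code fences, so keep only the outermost braces.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    # json.JSONDecodeError subclasses ValueError, so callers can catch one exception type.
    data = json.loads(raw[start : end + 1])
    # Missing keys become None; keys outside the template are dropped.
    return {name: data.get(name) for name in fields}


raw_output = 'Here is the result:\n{"reporting_officer": "Officer Voldemort", "note": "extra"}'
result = parse_extraction(raw_output, ["reporting_officer", "assisting_officer"])
print(result)
```

Normalizing to the template's key set means the PDF-filling step always receives the same shape, even when the model omits a field or adds one.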

Files Changed

src/llm.py

Testing

The updated pipeline was tested locally using the existing API endpoints.

  1. Created a template using:
     POST /templates/create
  2. Generated a filled form using:
     POST /forms/fill
  3. Verified that:
       • the LLM extraction runs successfully using a single request
       • structured JSON is generated correctly
       • the filled PDF form is produced as expected

Unit tests were executed with:

PYTHONPATH=. pytest

Result:

2 passed

Impact

This change reduces the number of LLM inference calls required for form extraction and simplifies the overall extraction workflow without altering the existing API interface.

The new approach improves performance and provides a cleaner foundation for future improvements to the extraction pipeline.

@utkarshqz

Hi @kushu30, great to see interest in optimizing the extraction pipeline! I actually implemented this single-request batching architecture on March 10 in PR #210 (later refined and consolidated into PR #241). My implementation currently supports dynamic field merging across multiple templates and is already integrated into the end-to-end frontend. Happy to hear your thoughts on how we can align these efforts so we don't have overlapping logic in the core!
