feat: Refactor LLM extraction pipeline to use a single structured JSON generation request #250
Open
kushu30 wants to merge 1 commit into
Hi @kushu30, great to see interest in optimizing the extraction pipeline! I actually implemented this single-request batching architecture on March 10 in PR #210 (later refined and consolidated into PR #241). My implementation currently supports dynamic field merging across multiple templates and is already integrated into the end-to-end frontend. Happy to hear your thoughts on how we can align these efforts so we don't have overlapping logic in the core!
Summary
This PR refactors the LLM extraction pipeline to generate structured JSON using a single LLM request instead of issuing one request per field.
Previously, the system iterated through each template field and queried the LLM separately to extract values from the transcript. This resulted in multiple inference calls per form submission, increasing latency and introducing potential inconsistencies across responses.
The new implementation sends a single structured prompt containing all required fields and expects a JSON response from the model. This reduces the number of LLM calls and simplifies the extraction pipeline.
Motivation
The previous extraction approach performed sequential LLM requests for each field in the template:
This meant the number of LLM calls scaled linearly with the number of fields in the template.
Using a single structured extraction request improves performance, simplifies the pipeline, and ensures that the model has full context when generating the structured response.
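To make the scaling argument concrete, here is a minimal sketch contrasting the two call patterns. It is not the code from `src/llm.py`: the function names, the prompt wording, and the `CountingStub` stand-in for the model are all assumptions made for illustration.

```python
import json
from typing import Any, Callable, Dict, List


def extract_per_field(ask: Callable[[str], str], transcript: str,
                      fields: List[str]) -> Dict[str, str]:
    """Per-field pattern: one model call for every template field."""
    return {
        name: ask(f"From the transcript below, give the value of '{name}'.\n\n{transcript}")
        for name in fields
    }


def extract_single_request(ask: Callable[[str], str], transcript: str,
                           fields: List[str]) -> Dict[str, Any]:
    """Single-request pattern: one model call that returns every field as JSON."""
    prompt = (
        "Return a JSON object with exactly these keys: "
        + ", ".join(fields)
        + ". Use null for missing values.\n\n"
        + transcript
    )
    return json.loads(ask(prompt))


class CountingStub:
    """Hypothetical stand-in for the model that only counts how often it is called."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, prompt: str) -> str:
        self.calls += 1
        if prompt.startswith("Return a JSON object"):
            # Canned JSON reply for the single-request prompt.
            return '{"reporting_officer": "Officer Voldemort", "incident_location": "456 Oak Street"}'
        return "stub value"
```

With a template of N fields, the first function calls the stub N times and the second calls it once, regardless of N.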
Key Improvements
LLM class
Implementation
The extraction workflow was refactored inside the LLM class.
Before:
After:
The structured prompt contains all required fields and instructs the model to return valid JSON.
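As an illustration of what such a prompt and its parsing step could look like, here is a small sketch. The helper names `build_extraction_prompt` and `parse_extraction_response`, the prompt wording, and the choice to map missing fields to `None` are assumptions, not the PR's actual implementation.

```python
import json
from typing import Any, Dict, List


def build_extraction_prompt(transcript: str, fields: List[str]) -> str:
    """Build one prompt that lists every required field and asks for JSON only."""
    field_lines = "\n".join(f"- {name}" for name in fields)
    return (
        "Extract the following fields from the transcript.\n"
        "Respond with a single valid JSON object whose keys are exactly the field names.\n"
        "Use null for any field that is not mentioned.\n\n"
        f"Fields:\n{field_lines}\n\n"
        f"Transcript:\n{transcript}\n"
    )


def parse_extraction_response(raw: str, fields: List[str]) -> Dict[str, Any]:
    """Parse the model reply, tolerating extra text around the JSON object."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model response")
    data = json.loads(raw[start:end + 1])
    # Keep only the requested fields; anything the model omitted becomes None.
    return {name: data.get(name) for name in fields}
```

Restricting the result to the requested field names means an unexpected extra key in the reply cannot leak into the form, and a missing key surfaces as an explicit `None` instead of a `KeyError` later on.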
Example extraction result:
{
  "reporting_officer": "Officer Voldemort",
  "incident_location": "456 Oak Street",
  "amount_of_victims": "2",
  "victim_name_s": ["Mark Smith", "Jane Doe"],
  "assisting_officer": null
}
Files Changed
src/llm.py
Testing
The updated pipeline was tested locally using the existing API endpoints.
Unit tests were executed with:
Result:
Impact
This change reduces the number of LLM inference calls required for form extraction and simplifies the overall extraction workflow without altering the existing API interface.
By removing the per-field round trips, the new approach lowers latency per form submission and provides a cleaner foundation for future improvements to the extraction pipeline.