
[FEAT]: Optimize LLM Generation from O(N) to O(1) batch requests in llm.py #196

@saurabh12nxf


📝 Description

Generating a PDF form currently takes a significant amount of time, particularly when a template has many fields. The main_loop() function in src/llm.py iterates over the fields sequentially and sends a separate HTTP request to the Ollama API for each one, so a template with N fields costs N requests. This O(N) request count is the performance bottleneck.

💡 Rationale

Sending one request per field is resource-intensive because the local LLM has to re-read and re-process the full incident context N times. If a firefighter uploads a form with 20 fields, they could be waiting over a minute.

🛠️ Proposed Solution

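We can reduce this to a single request (O(1) in the number of HTTP requests) by asking the model for all fields at once in one batch prompt. The model still has to generate every field value, but it only processes the incident context once.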

  • Logic change in src/
  • Update to requirements.txt
  • New prompt for Mistral/Ollama
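
To make the proposal concrete, here is a minimal sketch of what a single batch request could look like. It is not the code in src/llm.py: the function names (`build_batch_prompt`, `parse_batch_response`, `extract_fields`), the prompt wording, and the field names are all hypothetical, and it assumes Ollama is reachable at its default local address and that the `/api/generate` endpoint is called with `"stream": false` and `"format": "json"`.

```python
import json
import urllib.request

# Assumption: Ollama running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_batch_prompt(incident_text, field_names):
    """Build one prompt that asks for every field at once (hypothetical wording)."""
    field_list = "\n".join(f"- {name}" for name in field_names)
    return (
        "Extract the following fields from the incident report.\n"
        "Respond with a single JSON object whose keys are exactly these "
        "field names. Use null for any field that is not mentioned.\n\n"
        f"Fields:\n{field_list}\n\n"
        f"Incident report:\n{incident_text}\n"
    )


def parse_batch_response(raw_text, field_names):
    """Parse the model's JSON reply and keep only the requested fields."""
    data = json.loads(raw_text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object from the model")
    # Fields the model omitted become None; unrequested keys are dropped.
    return {name: data.get(name) for name in field_names}


def extract_fields(incident_text, field_names, model="llama3"):
    """Send ONE request for all fields instead of one request per field."""
    payload = {
        "model": model,
        "prompt": build_batch_prompt(incident_text, field_names),
        "stream": False,   # return a single JSON body, not a token stream
        "format": "json",  # ask Ollama to constrain the reply to JSON
    }
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    # The generated text is returned in the "response" key.
    return parse_batch_response(body["response"], field_names)
```

Keeping prompt construction and response parsing separate from the HTTP call means both can be unit-tested without a running model.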

✅ Acceptance Criteria

How will we know this is finished?

  • Feature works in Docker container.
  • Documentation updated in docs/.
  • JSON output validates against the schema.
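
For the last criterion, a check along these lines could be used in a test. This is only a sketch: the issue does not show the actual schema, so it assumes the simplest possible one (every expected field present, each value a string or null, no extra keys), and `validate_output` is a hypothetical helper name.

```python
def validate_output(output, expected_fields):
    """Return a list of problems; an empty list means the output passed.

    Assumed schema: a JSON object with exactly the expected field names,
    where each value is either a string or null.
    """
    if not isinstance(output, dict):
        return ["output is not a JSON object"]
    problems = []
    for name in expected_fields:
        if name not in output:
            problems.append(f"missing field: {name}")
        elif output[name] is not None and not isinstance(output[name], str):
            problems.append(f"field {name} is not a string or null")
    for name in output:
        if name not in expected_fields:
            problems.append(f"unexpected field: {name}")
    return problems
```

If the project already defines a formal JSON Schema, a dedicated validator library would be a better fit than this hand-rolled check.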

📌 Additional Context

I implemented this fix locally using llama3. For a standard 7-field form, generation time dropped to ~17 seconds total by eliminating the sequential looping overhead. I will open a PR for this shortly!
