Add provenance field to audit payload to distinguish missing vs. genuinely-zero prompt token counts #1953

## Context

In PR #1895 (Add a dedicated OpenAI-compatible LLM adapter), the `_record_usage` method in `unstract/sdk1/src/unstract/sdk1/llm.py` was updated to fall back to `prompt_tokens=0` when:
1. The provider does not return `usage.prompt_tokens` in its response, **and**
2. LiteLLM's `token_counter()` raises (e.g., for unmapped custom/OpenAI-compatible models).

A warning is logged in this case, but the recorded usage payload contains `prompt_llm_token_count=0` with no flag to distinguish "estimation failed / data unavailable" from "the prompt genuinely consumed zero tokens".
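The ambiguity can be reproduced with a minimal sketch of the fallback. This is not the actual `_record_usage` code: `resolve_prompt_tokens` and the `estimate` callable are hypothetical names, with `estimate` standing in for LiteLLM's `token_counter()`.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def resolve_prompt_tokens(
    provider_prompt_tokens: Optional[int],
    estimate: Callable[[], int],
) -> int:
    """Return a prompt token count, falling back to 0 when none is available.

    Hypothetical helper that mirrors the fallback order described above.
    """
    # 1. Prefer the count reported by the provider, if any.
    if provider_prompt_tokens is not None:
        return provider_prompt_tokens
    # 2. Otherwise try a local estimate (stand-in for token_counter()).
    try:
        return estimate()
    except Exception:
        # 3. Estimation failed: log a warning and record 0.
        logger.warning("Prompt token estimation failed; recording 0")
        return 0


def failing_estimate() -> int:
    # Simulates an unmapped custom/OpenAI-compatible model.
    raise ValueError("model not mapped")


missing = resolve_prompt_tokens(None, failing_estimate)   # data unavailable
genuine_zero = resolve_prompt_tokens(0, failing_estimate)  # provider said 0
```

Both calls yield the same integer, so a consumer that only sees `prompt_llm_token_count` cannot tell the two cases apart.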

This was acknowledged as a known limitation and deferred from PR #1895 because adding a provenance field requires a wider end-to-end contract change across the usage pipeline beyond sdk1.

> **Note (post-#1929):** The usage path has changed — `_record_usage` no longer calls `Audit().push_usage_data`. Usage now flows through `self._pending_usage` → worker `bulk_create_usage`. The core problem remains unchanged.

Relevant discussion: https://github.com/Zipstack/unstract/pull/1895#discussion_r3135244964

## Problem

Downstream consumers of the usage data (cost attribution, analytics, billing) cannot distinguish:
- **Missing data** — token estimation failed for an unmapped model
- **Genuinely zero** — the prompt actually consumed zero tokens

This silently understates prompt-token consumption in cost attribution and analytics for long-running workloads against OpenAI-compatible endpoints that do not return `usage.prompt_tokens`.

## Proposed Solution

Add a provenance / sentinel field to the usage payload, for example:

- `prompt_tokens_source`: an enum/string such as `"provider"`, `"estimated"`, or `"unknown"`
- or `estimation_failed`: a boolean flag
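As a rough sketch of the first option, the resolution step could return the count together with its source. `PromptTokensSource` and `resolve_prompt_tokens_with_source` are hypothetical names used for illustration only, and `estimate` again stands in for LiteLLM's `token_counter()`.

```python
from enum import Enum
from typing import Callable, Optional, Tuple


class PromptTokensSource(str, Enum):
    """Hypothetical values for the proposed `prompt_tokens_source` field."""

    PROVIDER = "provider"    # taken from usage.prompt_tokens
    ESTIMATED = "estimated"  # computed locally by a token counter
    UNKNOWN = "unknown"      # neither was available; the count is a placeholder


def resolve_prompt_tokens_with_source(
    provider_prompt_tokens: Optional[int],
    estimate: Callable[[], int],
) -> Tuple[int, PromptTokensSource]:
    """Return (count, source) so a recorded 0 is never ambiguous."""
    if provider_prompt_tokens is not None:
        return provider_prompt_tokens, PromptTokensSource.PROVIDER
    try:
        return estimate(), PromptTokensSource.ESTIMATED
    except Exception:
        # The count is still 0, but the source marks it as unavailable.
        return 0, PromptTokensSource.UNKNOWN
```

A string enum also covers the boolean variant: `estimation_failed` would be equivalent to the source being `"unknown"`, while the enum additionally separates provider-reported counts from local estimates.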

This would require changes to (re-anchored against the new shape post-#1929):
1. `unstract/sdk1/src/unstract/sdk1/llm.py` — populate the provenance field in `_record_usage` before appending to `self._pending_usage`
2. The OSS `Usage` model — extend the schema to include the provenance field
3. The worker `bulk_create_usage` path — consume and persist the new field
4. Downstream platform/usage-record schema — surface and store the new field
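Once the field is persisted end to end, a downstream consumer could keep placeholder zeros out of its totals. The sketch below assumes rows are exposed as plain mappings carrying `prompt_llm_token_count` and a `prompt_tokens_source` string; `summarize_prompt_tokens` is a hypothetical helper, not part of the existing pipeline.

```python
from typing import Any, Iterable, Mapping, Tuple


def summarize_prompt_tokens(
    rows: Iterable[Mapping[str, Any]],
) -> Tuple[int, int]:
    """Return (total tokens from known sources, number of rows with unknown source)."""
    known_total = 0
    unknown_rows = 0
    for row in rows:
        if row.get("prompt_tokens_source") == "unknown":
            # Placeholder count: report it separately instead of summing a 0.
            unknown_rows += 1
        else:
            known_total += int(row["prompt_llm_token_count"])
    return known_total, unknown_rows


rows = [
    {"prompt_llm_token_count": 120, "prompt_tokens_source": "provider"},
    {"prompt_llm_token_count": 0, "prompt_tokens_source": "unknown"},
    {"prompt_llm_token_count": 0, "prompt_tokens_source": "provider"},
    {"prompt_llm_token_count": 45, "prompt_tokens_source": "estimated"},
]
known_total, unknown_rows = summarize_prompt_tokens(rows)
```

Reporting the number of unknown rows next to the total lets cost attribution state that a figure is a lower bound rather than presenting it as complete.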

Optionally, increment an ops metric/counter when the fallback path is triggered so operations teams can detect silent drift without parsing logs.
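A minimal sketch of such a counter, using an in-process `collections.Counter` as a stand-in for whatever metrics client the platform actually uses. `fallback_counter` and `note_prompt_token_fallback` are hypothetical names.

```python
from collections import Counter

# Stand-in for a real metrics backend; keyed by model name.
fallback_counter: Counter = Counter()


def note_prompt_token_fallback(model_name: str) -> None:
    """Record that the zero-token fallback was hit for this model."""
    fallback_counter[model_name] += 1
```

Calling this from the same branch that logs the warning would let a dashboard or alert track how often each model hits the fallback.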

## References

- PR #1895 — Add a dedicated OpenAI-compatible LLM adapter
- PR #1929 — Changed usage flow from `Audit().push_usage_data` to `self._pending_usage` → `bulk_create_usage`
- Related issues: #1894, #856, #1443
- Opened by: @hari-kuriakose
