[codex] emit per-request TTFT completion telemetry #30883
Merged
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 36dadfb8bc
This was referenced Jul 2, 2026
dylan-hurd-oai approved these changes on Jul 2, 2026
Why
The Codex telemetry pipeline needs a per-request TTFT (time to first token) value. The existing codex.turn_ttft is recorded once per turn, so it cannot represent later inference requests in the same turn and can miss the beginning of hidden reasoning.
This restores the low-volume per-request signal proposed in bk-nvidia#3 without bringing back per-WebSocket-event TRACE logging.
What changed
- response.output_item.added, including an empty hidden-reasoning item
- ttft_ms to the existing codex.sse_event/response.completed telemetry record
Semantics
The value is per inference request, not per turn. It measures mapped-stream-to-first-output-item latency, matching the customer-proposed metric. For HTTP, the stream is already established before timing begins, so request setup and response-header latency are excluded.
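The per-request semantics can be illustrated with a minimal sketch. This is not the code from this PR: `StreamEvent`, `RequestTtft`, and `CompletedRecord` are hypothetical names, and the sketch assumes the timer is created once the mapped stream is already available, so request setup and response-header latency fall outside the measured interval. One timer is created per inference request, the first output item fixes the value, and the value is reported alongside completion.

```rust
use std::time::{Duration, Instant};

/// Hypothetical, simplified view of events yielded by a mapped response stream.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StreamEvent {
    /// Stands in for response.output_item.added (may be an empty reasoning item).
    OutputItemAdded,
    /// Any later content event; it never affects TTFT.
    OutputTextDelta,
    /// Stands in for response.completed.
    Completed,
}

/// Hypothetical completion record carrying the optional per-request TTFT.
#[derive(Debug)]
struct CompletedRecord {
    ttft_ms: Option<u128>,
}

/// One timer per inference request, not per turn.
#[derive(Debug)]
struct RequestTtft {
    started_at: Instant,
    ttft: Option<Duration>,
}

impl RequestTtft {
    /// Assumption: called when the mapped stream is already established.
    fn start(now: Instant) -> Self {
        Self { started_at: now, ttft: None }
    }

    /// Feeds one event; returns a record only when the request completes.
    fn on_event(&mut self, event: StreamEvent, now: Instant) -> Option<CompletedRecord> {
        match event {
            StreamEvent::OutputItemAdded => {
                // Only the first output item sets the value.
                if self.ttft.is_none() {
                    self.ttft = Some(now.duration_since(self.started_at));
                }
                None
            }
            StreamEvent::OutputTextDelta => None,
            StreamEvent::Completed => Some(CompletedRecord {
                ttft_ms: self.ttft.map(|d| d.as_millis()),
            }),
        }
    }
}

fn main() {
    // Synthetic timestamps keep the example deterministic.
    let t0 = Instant::now();
    let mut timer = RequestTtft::start(t0);
    timer.on_event(StreamEvent::OutputItemAdded, t0 + Duration::from_millis(120));
    timer.on_event(StreamEvent::OutputTextDelta, t0 + Duration::from_millis(400));
    let record = timer.on_event(StreamEvent::Completed, t0 + Duration::from_millis(900));
    println!("{:?}", record);
}
```

A second inference request in the same turn would get its own `RequestTtft`, which is what distinguishes this value from a once-per-turn measurement. If a request completes without any output item, the sketch reports no TTFT rather than a misleading number.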
response.output_item.added is a client-visible proxy for the start of hidden reasoning; this does not claim access to the server's internal first raw-token timestamp.
Validation
- just test -p codex-otel (47 passed)
- just test -p codex-core process_sse_emits_completed_telemetry (1 passed after the final timer-placement change)
- just test -p codex-core: 2,855 passed and 53 failed because of unrelated local-environment failures (missing test_stdio_server fixture binary, shell startup noise, and timing-sensitive tests); the focused telemetry test passed in that run as well