Skip to content

rfc(feature): SDK-symbolicated stack-frames#152

Open
supervacuus wants to merge 6 commits intogetsentry:mainfrom
supervacuus:rfc/sdk-symbolicated-frames
Open

rfc(feature): SDK-symbolicated stack-frames#152
supervacuus wants to merge 6 commits intogetsentry:mainfrom
supervacuus:rfc/sdk-symbolicated-frames

Conversation

@supervacuus
Copy link

@supervacuus supervacuus commented Feb 27, 2026

This is a feature RFC to find a solution for SDK/client-side symbolicated stack frames:

Rendered RFC

@supervacuus supervacuus force-pushed the rfc/sdk-symbolicated-frames branch from 7abc153 to 00c6d10 Compare February 27, 2026 13:15
@romtsn romtsn requested a review from a team March 3, 2026 10:57
Copy link
Member

@Dav1dde Dav1dde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall makes sense.

We definitely shouldn't re-Use symbolicator_status, this field is not part of the Event ingestion, it's purely a symbolicator/sentry field (like you call out).

Inferring from other data seems not too bad, but an explicit signal is probably preferred and much less confusing.

There is also an option of re-using the platform for this, since that already controls symbolication. Either by extending it, replacing it or setting a magic noop (not really a fan of this).

Clarify language and punctuation in the RFC for SDK symbolication of stack frames, improving readability and consistency.
@supervacuus
Copy link
Author

Thanks for taking the time, @Dav1dde!

There is also an option of re-using the platform for this, since that already controls symbolication. Either by extending it, replacing it or setting a magic noop (not really a fan of this).

Could I ask you to elaborate on what this would look like in practice?

Do you mean in the sense that the platform for each frame carries the logic for handling already symbolicated frames that failed symbolicator resolution in lang/native/processing.py? Because that would fall into the category of Option B for me ("Infer from existing frame data" is long for: handle in the backend).

Or do you mean the platform event payload entry should be extended to signal from the SDK to processing how particular frames would be handled? What would be the benefit of that vs per-frame properties?

@Dav1dde
Copy link
Member

Dav1dde commented Mar 3, 2026

So as an example, I think if you were to set the platform on a frame (overriding the default platform) to noop it would already skip symbolication of that frame.

Now obviously this isn't the greatest idea, but maybe there is a world where we go away from platform and instead define a field symbolicate_as.

Examples:

  • symbolicate_as: native <- Apply native symbolication
  • symboliicate_as: js <- This frame must have source maps (e.g. react native stack traces)
  • symbolicate_as: jvm <- Proguard etc.
  • symbolicate_as: none <- Skips symbolication

We currently infer the Symbolication decision from the platform, but we could also make it explicit. This is a pretty unfiltered thought and definitely needs a lot more thought and bikeshedding. I think I'd prefer the flag over this, but possibly worth a discussion.

@supervacuus
Copy link
Author

supervacuus commented Mar 3, 2026

We currently infer the Symbolication decision from the platform, but we could also make it explicit.

Yes, this is a relevant point I didn't explicitly highlight (edit: added): we also tie the rendering of each frame to the platform, which in turn leads to entirely different outcomes wrt "missing symbol/image" signalling.

That is also why I thought RFCing this makes sense: there are good reasons to do it entirely server-side as well as SDK-side.

The fact that the SDK-set platform currently implicitly carries how a frame is both rendered and symbolicated might be a good hint for the general direction this RFC could take: do we want to keep this implicit marker approach for both, or break it (partially/fully) open towards the SDK? And of course, what are the ups/downs of each?

This is a pretty unfiltered thought and definitely needs a lot more thought and bikeshedding. I think I'd prefer the flag over this, but possibly worth a discussion.

Agree and matches the format. Right now, I am looking for clarity in the proposal (and counterproposals, thx) and for issues I haven't thought of (or incorrectly framed). So, this is welcome feedback at that level of filtering.

@romtsn romtsn requested review from a team, JPeer264 and RulaKhaled and removed request for a team March 9, 2026 10:20

- **relay-event-schema**: Make `symbolicator_status` a recognized field that SDKs may set, and ensure it is not stripped during ingestion.
- **processing.py**: Check if `symbolicator_status` is already set before overwriting it with symbolicator's result.
- **SDKs**: Set `data.symbolicator_status` on symbolicated frames.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see an Option D where The SDK and symbolicator use a common format to annotate the symbolication status. This would be future-proof towards what software component does the symbolication. Something like

"symbolication": {
    "by": "symbolicator", // or "client"
    "status": "success"
}

Later we could even extend the protocol to log multiple unsuccessful attempts, if needed:

"symbolication": {
    "by": "symbolicator",
    "status": "success",
    "previous_attempts": [
        {"by": "client", "status": "failed", "reason": "unknown_image"}
    ]
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see how this could be useful from a diagnostic/audit perspective, and I also agree that it would be more in line with what symbolicator_status currently does: it records the outcome of symbolicator's handling of the frame.

However, I think this is almost orthogonal to my intention for this RFC (though it could still be in scope if we decide to implement it), and I might have communicated it poorly by introducing symbolicator_status as a potential solution to the problem (because it acts more as a counter pattern).

These are the core questions for which I seek a design proposal from this RFC:

  • Can we let the SDK decide whether processing/symbolicator should handle a particular stack frame or not?
  • How far do we want to parameterize this (is this an on/off switch per frame, or do we want to open the encoding vs platform as described here: rfc(feature): SDK-symbolicated stack-frames #152 (comment) and what about symbolicator responsibilities that go beyond symbols, like source-code and line numbers, etc.)?
  • Should SDKs have that power or not? While the two questions above are very much "does the shoe fit now?", this one is about a trajectory: do we already have capabilities lined up that would be inhibited by pushing this towards the SDK?

So, purely from an intent perspective, I see your proposal as a record after the fact, whereas I am currently looking for a mechanism to control that outcome. Maybe I misunderstood. Also, if anything in this comment wasn't clear from the RFC, let me know whether and where I should extend it. Thanks!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, would you argue that this is also intended as a mechanism for control, because native processing could implement not to send a frame to symbolicator (or overwrite the client symbol if a symbolicaton request is attempted anyway) if a stack frame contains:

 "symbolication": {
    "by": "client",
    "status": "success"
}

Then it could be both diagnostic and control. But we should decide on the control meaning getting such a symbolication tag from the SDK.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any case, I integrated the feedback I got until now: bbd301c

As you can see, I just massaged your "Option D" into a control mechanism. Let me know if that aligns with what you had in mind.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, would you argue that this is also intended as a mechanism for control

Yes, I would intend it mainly as a control mechanism which can also serve as a diagnostic tool or to give the user more visibility of what happened to their stack traces. Thank you for adding the option!

Copy link
Member

@jjbayer jjbayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I think Option A would work perfectly fine. Option D is just an attempt to foresee potential future use-cases / unify the newly proposed protocol with the existing one.

- **relay-event-schema**: Add a typed `symbolication` object to `Frame` with fields `by: enum("client", "symbolicator")`, `status: enum("success", "failed", "missing", "unknown")`, and optionally `reason: string` and `previous_attempts: list`.
- **SDKs**: Set `symbolication: { "by": "client", "status": "success" }` on frames the SDK has symbolicated.
- **processing.py**: Check for `symbolication.by == "client" && symbolication.status == "success"` early in frame handling; if present, skip symbolicator. Otherwise, proceed normally and write the symbolicator result into the same `symbolication` field.
- **symbolicator**: Write results into the `symbolication` object instead of (or in addition to) the current `symbolicator_status` field in `frame.data`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Symbolicator could double-write into both fields for a while, until the sentry backend is updated to use the new field for a while.


- **relay-event-schema**: Make `symbolicator_status` a recognized field that SDKs may set, and ensure it is not stripped during ingestion.
- **processing.py**: Check if `symbolicator_status` is already set before overwriting it with symbolicator's result.
- **SDKs**: Set `data.symbolicator_status` on symbolicated frames.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, would you argue that this is also intended as a mechanism for control

Yes, I would intend it mainly as a control mechanism which can also serve as a diagnostic tool or to give the user more visibility of what happened to their stack traces. Thank you for adding the option!

@chargome
Copy link
Member

For JS this this would not have implications I think as there is no client-side symbolication apart from nextjs dev server. We can adapt out stackframe type definition if we land on a type here but would likely leave that undefined in all cases that we currently cover. 👍

- How are the associated debug-meta images being resolved? If a frame is marked as "client-symbolicated" and thus doesn't show a "missing symbol" warning, images that are missing, but whose associated frames are entirely "client-symbolicated", shouldn't be marked as an error either (but should still be listed). Should this be resolved on the client (with a tag similar to the frame)? Or should this be resolved in native processing?
- Are there workflow interactions between symbolication, line-number, and source lookup that would be prevented by the presented approach?
- What is the interaction with demangling? If an SDK provides a mangled C++ symbol, should the backend still demangle it even if the frame is marked as client-symbolicated?
- Should the UI distinguish between server-symbolicated and client-symbolicated frames (e.g., a subtle indicator), or treat them identically?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe surfacing the origin of the symbolication could be beneficial for debugging absent symbols - which happens easily in case CI workflows uploading debug symbols soft-fail to do so. By comparing symbolicated issues to issues without symbols with indication of the source could indicate if it's an SDK issue or not.

This problem is not theoretical. It already manifests today in at least two concrete scenarios:

- **Tombstone / native crash reporting**: Native SDKs that symbolicate system library frames on-device before sending the event. The backend cannot distinguish these from unsymbolicated frames and flags them as broken since it cannot find any symbol information in its stores. On Android in particular, we usually have both situations: native user libraries will be packaged stripped and `Symbolicator` must resolve associated frames (i.e., the UI warning is sensible if symbol/debug-info is missing), but system and framework libraries usually will be symbolicated on-device and `Symbolicator` won't have access to data to further enrich the stack frame. In that case, the UI error is misleading and inactionable to the user.
- **.NET SDK**: The .NET SDK resolves function names, file paths, and line numbers locally using portable PDB metadata. When these frames arrive at the backend, symbolicator still attempts to process them. Because the user hasn't uploaded PDB files to Sentry (the SDK already handled this), every frame returns `symbolicator_status: "missing"` and the UI shows a misleading symbolication error banner. This is tracked in [getsentry/sentry#97054](https://github.com/getsentry/sentry/issues/97054).
Copy link
Member

@Flash0ver Flash0ver Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: applies to JIT-compiled apps ... for Native AOT compiled apps, we only got function names, but I believe file names and line numbers are missing ... but Native AOT maybe a whole different story in this context

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or Native AOT compiled apps, we only got function names, but I believe file names and line numbers are missing

Okay, but is the assumption that those filenames and line numbers can be resolved later because the user will upload them as debug info?

but Native AOT maybe a whole different story in this context

How so?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's pretty unlikely we'd be symbolicating AOT compiled code on the client (the subject of this RFC)... but we should be able to resolve symbols later.

AFAIK during AOT compilation, when converting the IL into assembly, the linker can create a map from the native instructions back to the original IL... and this map gets stored in the symbol file. So even though the app is AOT compiled and you no longer have the original IL, you can still symbolicate if you have the maps in the symbol files.

Testing this out by running dotnet publish -c Release -f net10.0 samples/Sentry.Samples.Console.Basic/Sentry.Samples.Console.Basic.csproj to get an AOT complied app I see that we sometimes get line numbers and sometimes don't.

For this exception I see full symbolication of the system frames but not for the InApp frames:

System.InvalidOperationException: Something happened that crashed the app!
  at Program.<>c__DisplayClass0_0.<<<Main>$>g__ThirdFunction|4>d.MoveNext()()
  at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()() in ExceptionDispatchInfo.cs:line 53
  at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task)() in TaskAwaiter.cs:line 154
  at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task, ConfigureAwaitOptions)() in TaskAwaiter.cs:line 118
  at Program.<<Main>$>d__0.MoveNext()()
  at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()() in ExceptionDispatchInfo.cs:line 53
  at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task)() in TaskAwaiter.cs:line 154
  at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task, ConfigureAwaitOptions)() in TaskAwaiter.cs:line 118
  at Program.<Main>(String[] args)()
  at ?()

That will probably be an issue with not resolving the symbols I uploaded for that app... none the less, even the symbolication of the system frames demonstrates that we can symbolicate (file names and line numbers) AOT compiled code.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the input from both of you. Just to clarify: this proposal is only relevant for scenarios where full symbol coverage via a backend store is not an option (the classic example being Android, because there are more platform OEMs than we can realistically collect from). So, if you know that the uploaded debug info can fully recover every frame, there is no need for this feature.

The .NET-related issue I linked in the RFC just highlights a problem that aligns with what we are trying to solve here:

  • The UI signals to the user that something is missing, and while that is technically true, it is not necessarily actionable.
  • In that case, the client was able to symbolicate frames from ConsoleOnboarding.dll, but the signal to the user is still that something is not right and needs to be fixed.
  • With a tag from the SDK indicating that symbolication happened on the client for a particular frame, the UI could check that all frames associated with that module have already been correctly symbolicated and, as such, lower the severity of the warning and call to action.
  • Of course, in this particular case, there might still be a realistic chance that this is an okay warning (but the issue was created in response, so I guess even here the error is overstating the fact and confusing to users)
  • In scenarios where the referenced module will never be uploaded, the warning is just factually wrong (and in the case of native, it would not only affect the module being marked as missing, but each associated frame would be marked, and the event would also get an error at the top, as shown in the screenshots)

- **Symbolicator**: May need awareness of the flag if processing delegates the decision.
- **UI**: Should stop showing misleading "missing symbols" errors for frames that are intentionally client-symbolicated.

# Options Considered
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Do I understand it correctly that when both sides are given (uploaded debug files & SDK-runtime-resolved metadata), the Client-provided metadata is preferred over the previously uploaded symbols?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As part of this proposal's options, or how it is currently implemented?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants