Skip to content

RUM-14471: Auto-detect interstitial activities and forward TTID#3184

Draft
hamorillo wants to merge 18 commits intohector.morilloprieto/RUM-14471from
hector.morilloprieto/RUM-14471-auto
Draft

RUM-14471: Auto-detect interstitial activities and forward TTID#3184
hamorillo wants to merge 18 commits intohector.morilloprieto/RUM-14471from
hector.morilloprieto/RUM-14471-auto

Conversation

@hamorillo
Copy link
Contributor

@hamorillo hamorillo commented Feb 17, 2026

What does this PR do?

Note: This PR is targeting hector.morilloprieto/RUM-14471, as it is an iteration of it.

Automatically detects when the startup activity is destroyed without drawing a frame (e.g., splash screens that call startActivity() + finish() in onCreate()) and forwards the TTID measurement to the next tracked activity, preserving the original startup initialTime. Supports chained interstitials (A → B → C).

Core changes:

  • RumAppStartupDetectorImpl tracks a pendingStartupScenario and on activity destruction checks whether TTID was reported. If not, retargets to the next tracked activity via a new onAppStartupRetargeted listener callback.
  • RumFeature re-subscribes to the first-frame draw on retarget without re-sending AppStartEvent. For forwarded activities that already drew, the callback is invoked directly.
  • RumStartupScenario subclasses converted to data classes with forwardTo() method.
  • RumTTIDInfo.wasForwarded reported in startup telemetry.

Motivation

When an interstitial activity calls startActivity() + finish() without setContentView(), the OnDrawListener never fires and TTID is never reported. The predicate-based solution (base branch) requires customer configuration. This adds a complementary "just works" solution that handles the common case automatically.

How to test

Apply the attached patch to the branch:

The patch modifies the sample app to add a SplashActivity as the launcher activity. This activity immediately calls startActivity(NavActivity) + finish() without setting a content view, simulating an interstitial splash screen. It also sets telemetry sampling to 100% and reduces session inactivity timeout to 15 seconds for easier testing.

Run the sample app and check the RUM dashboard — you should see TTID reported for NavActivity with was_forwarded: true in the startup telemetry metric.

Additional Notes

  • The predicate API (setAppStartupActivityPredicate) is still needed for activities that DO draw but should be excluded from TTID (e.g., a branded splash with a logo). Both mechanisms coexist.
  • Updated AppStartupActivityPredicateTest to use a new DrawingSplashActivity that renders content, properly testing the predicate use case now that auto-forwarding handles non-drawing interstitials.

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • Make sure you discussed the feature or bugfix with the maintaining team in an Issue
  • Make sure each commit and the PR mention the Issue number (cf the CONTRIBUTING doc)

@hamorillo hamorillo force-pushed the hector.morilloprieto/RUM-14471-auto branch from d848ef2 to 7b69d29 Compare February 17, 2026 13:47
@hamorillo
Copy link
Contributor Author

hamorillo commented Feb 17, 2026

@codex review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7b69d29ae6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@datadog-official

This comment has been minimized.

@hamorillo hamorillo force-pushed the hector.morilloprieto/RUM-14471-auto branch from 7b69d29 to 060f6f3 Compare February 17, 2026 14:40
@codecov-commenter
Copy link

codecov-commenter commented Feb 17, 2026

Codecov Report

❌ Patch coverage is 76.92308% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.33%. Comparing base (5538b89) to head (6cba3cf).

Files with missing lines Patch % Lines
...lin/com/datadog/android/rum/internal/RumFeature.kt 33.33% 3 Missing and 3 partials ⚠️
.../rum/internal/startup/RumAppStartupDetectorImpl.kt 85.29% 0 Missing and 5 partials ⚠️
...android/rum/internal/startup/RumStartupScenario.kt 83.33% 1 Missing ⚠️
Additional details and impacted files
@@                        Coverage Diff                         @@
##           hector.morilloprieto/RUM-14471    #3184      +/-   ##
==================================================================
+ Coverage                           71.30%   71.33%   +0.03%     
==================================================================
  Files                                 930      930              
  Lines                               34455    34497      +42     
  Branches                             5814     5829      +15     
==================================================================
+ Hits                                24566    24606      +40     
+ Misses                               8248     8244       -4     
- Partials                             1641     1647       +6     
Files with missing lines Coverage Δ
...roid/rum/internal/startup/RumAppStartupDetector.kt 100.00% <ø> (ø)
...rnal/startup/RumAppStartupTelemetryReporterImpl.kt 100.00% <100.00%> (ø)
...atadog/android/rum/internal/startup/RumTTIDInfo.kt 100.00% <100.00%> (ø)
...android/rum/internal/startup/RumStartupScenario.kt 96.00% <83.33%> (-4.00%) ⬇️
.../rum/internal/startup/RumAppStartupDetectorImpl.kt 91.25% <85.29%> (-2.75%) ⬇️
...lin/com/datadog/android/rum/internal/RumFeature.kt 89.68% <33.33%> (-0.76%) ⬇️

... and 33 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@hamorillo hamorillo force-pushed the hector.morilloprieto/RUM-14471 branch 2 times, most recently from eeb86be to a9af87b Compare February 18, 2026 08:35
@hamorillo hamorillo force-pushed the hector.morilloprieto/RUM-14471-auto branch from 6550faf to 126249f Compare February 18, 2026 08:36
@hamorillo hamorillo marked this pull request as ready for review February 18, 2026 09:35
@hamorillo hamorillo requested a review from a team as a code owner February 18, 2026 09:35
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 126249f396

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

pendingStartupScenario = forwarded
listener.onAppStartupRetargeted(forwarded)
} else {
pendingStartupScenario = null

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve pending startup when no retarget candidate exists yet

Clearing pendingStartupScenario as soon as the startup activity is destroyed drops the original startup context when the replacement activity is created later (for example, navigation posted to the main thread or triggered by an async callback after finish()). In that flow, the next tracked activity is treated as a new warm start, so TTID is no longer measured from the original app-start time and the forwarding feature silently under-reports launch time.

Useful? React with 👍 / 👎.

Copy link
Contributor

@aleksandr-gringauz aleksandr-gringauz Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is probably a valid issue. Suppose we have AuthActivity that starts MainActivity in AuthActivity.onCreate. I don't know if there is any guarantee that MainActivity.onCreate will be called before AuthActivity.onDestroy, especially in the case of using handler.post as described in the comment.

I wonder if we could do something like this:

  1. When we detect the launch we remember the RumStartupScenario as pendingScenario inside RumAppStartupDetectorImpl. We call listener.onAppStartupDetected(scenario) and subscribe to first frame.
  2. If another Activity is created while we have pendingScenario, we call some new callback listener.onNextActivityCreated(pendingScenario, nextActivity) and there subscribe to its first frame.
  3. In onFirstFrameDrawn we check that RumAppStartupDetector still has a pending scenario (maybe we need a getter RumAppStartupDetector.getPendingScenario) and if this is the same scenario that when we subscribed, then we call rumMonitor.sendTTIDEvent and also clean the pending scenario using some new method RumAppStartupDetector.clearPendingScenario.
  4. The Activity inside RumStartupScenario might be different from the Activity that actually drew the frame, so we can create a field in RumTTIDInfo called let's say firstFrameDrawActivity: WeakReference<Activity>. So probably forwardTo will not be needed anymore.

These aren't the hard requirements, just some thoughts from me that you might find useful. And I might be wrong somewhere, didn't extensively check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current code, the else branch no longer clears pendingStartupScenario. Instead, it records awaitingRetargetSinceNs = timeProvider().nanoTime, preserving the startup scenario for deferred retarget. When the next tracked activity is created (in onBeforeActivityCreated), it checks whether the elapsed time is within a 1-second timeout and, if so, forwards the preserved scenario via onAppStartupRetargeted. If the timeout expires, the stale scenario is discarded and normal startup detection takes over. This handles the async interstitial pattern (e.g., startActivity() + finish() posted via Handler.post) where the replacement activity's onCreate fires after the interstitial's onDestroy.

I chose this approach instead of subscribing to first frame for every new activity while a pending scenario exists because I think it's simpler and less risky — it avoids increasing coupling between RumAppStartupDetectorImpl and the listener, and eliminates the possibility of duplicate TTID events if multiple activities draw their first frame while the scenario is still pending. That said, happy to discuss if you think the alternative would be better.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the next tracked activity is created (in onBeforeActivityCreated), it checks whether the elapsed time is within a 1-second timeout

IMHO if these kinds of time based heuristics can be avoided (another solution that works in more cases exists), they should be avoided. What if in AuthActivity.onCreate we have postDelayed(2.seconds, { startActivity(MainActivity) })?

it avoids increasing coupling between RumAppStartupDetectorImpl and the listener

I don't understand why coupling (if I understood correctly what you meant) here is a bad thing. Also you already introduced notifyStartupTTIDReported, I don't see how it is different from my approach in terms of coupling.

and eliminates the possibility of duplicate TTID events if multiple activities draw their first frame while the scenario is still pending

Part 3 in my comment takes care of this case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are fair points. In fact, the time-based solution is what I especially don’t like about my proposed approach.

I’ve probably been too focused on my previous idea. Let me dig deeper into your initial suggestion, as it clearly has some benefits if we can come up with a good implementation.

I’ll pause this PR while I explore the other approach.

Thanks again for your insight, @aleksandr-gringauz!

pendingStartupScenario = forwarded
listener.onAppStartupRetargeted(forwarded)
} else {
pendingStartupScenario = null
Copy link
Contributor

@aleksandr-gringauz aleksandr-gringauz Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is probably a valid issue. Suppose we have AuthActivity that starts MainActivity in AuthActivity.onCreate. I don't know if there is any guarantee that MainActivity.onCreate will be called before AuthActivity.onDestroy, especially in the case of using handler.post as described in the comment.

I wonder if we could do something like this:

  1. When we detect the launch we remember the RumStartupScenario as pendingScenario inside RumAppStartupDetectorImpl. We call listener.onAppStartupDetected(scenario) and subscribe to first frame.
  2. If another Activity is created while we have pendingScenario, we call some new callback listener.onNextActivityCreated(pendingScenario, nextActivity) and there subscribe to its first frame.
  3. In onFirstFrameDrawn we check that RumAppStartupDetector still has a pending scenario (maybe we need a getter RumAppStartupDetector.getPendingScenario) and if this is the same scenario that when we subscribed, then we call rumMonitor.sendTTIDEvent and also clean the pending scenario using some new method RumAppStartupDetector.clearPendingScenario.
  4. The Activity inside RumStartupScenario might be different from the Activity that actually drew the frame, so we can create a field in RumTTIDInfo called let's say firstFrameDrawActivity: WeakReference<Activity>. So probably forwardTo will not be needed anymore.

These aren't the hard requirements, just some thoughts from me that you might find useful. And I might be wrong somewhere, didn't extensively check.

activity = activity,
callback = callback
)
if (wasForwarded && activity.window.peekDecorView()?.isLaidOut == true) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the first frame is already drawn, it means that we didn't manage to properly capture the time when it happend. How can it be that here the first frame is already drawn?

Introduce AppStartupActivityPredicate to allow customers to exclude interstitial Activities (e.g., auth screens) that launch another Activity in onCreate() and finish themselves. This fixes TTID not being reported
when such Activities never draw a frame.
Use WeakHashMap to cache shouldTrackStartup() result at activity creation
and reuse it at destruction, ensuring symmetric counter increment/decrement.

This prevents counter corruption when predicates depend on mutable state
(e.g., activity flags, runtime toggles) that may change between lifecycle
callbacks.
…interstitial activities from TTID measurement
Problem:
When running multiple startup tests together, Datadog.stopInstance() in
afterActivityFinished() triggered async cleanup operations that didn't
complete before the next test initialized. This caused
RumSessionScopeStartupManager state (ttidSentForSession = true) to leak
between tests, preventing TTID from being reported in subsequent tests.

Solution:
Added @after tearDown() method in RumTest base class that:
1. Calls Datadog.stopInstance() to trigger cleanup
2. Calls InstrumentationRegistry.getInstrumentation().waitForIdleSync()
   to ensure main thread becomes idle before proceeding

The waitForIdleSync() approach is superior to sleep() because it waits
for actual completion rather than an arbitrary timeout. This ensures:
- All pending UI operations complete
- Main thread work finishes
- Async cleanup operations have time to complete
- Proper test isolation

The double-stop (once in @after, once in afterActivityFinished) is safe
due to Datadog.stopInstance() being idempotent.

Verified: Both AppStartupActivityPredicateTest and
AppStartupMainActivityDirectTest pass reliably when run together.
Added viewArguments parameter to ExpectedVitalAppLaunchEvent to allow verifying view attributes in vital app launch events. Updated startup tests to verify that TTID and TTFD events originate from MainContentActivity by checking the view name attribute using type-safe class references.
…rImpl and add safe methods to "detetk_custom_safe_call.yml"
@hamorillo hamorillo force-pushed the hector.morilloprieto/RUM-14471 branch from a9af87b to 5538b89 Compare February 18, 2026 21:14
When an interstitial activity (e.g., a splash screen) calls
startActivity() + finish() in onCreate() without ever calling
setContentView(), the OnDrawListener never fires and TTID is never
reported. This change adds automatic detection of this pattern: when
the startup activity is destroyed without TTID being reported and
another tracked activity exists, the TTID measurement is "retargeted"
to that next activity while preserving the original startup initialTime.

The core mechanism lives in RumAppStartupDetectorImpl, which tracks a
pendingStartupScenario and a startupTTIDReported flag. On activity
destruction, if the startup activity was destroyed before drawing, the
detector looks for the next tracked activity and calls the new
onAppStartupRetargeted listener callback (which re-subscribes to the
first-frame draw without re-sending AppStartEvent). This supports
chained interstitials (A -> B -> C) where multiple non-drawing
activities forward in sequence. For forwarded activities that have
already drawn their first frame, the callback is invoked directly
instead of subscribing to OnDrawListener.

RumStartupScenario subclasses are now data classes with a forwardTo()
method that creates a copy with the new activity reference while
preserving all timing fields. RumTTIDInfo gains a wasForwarded field
reported in startup telemetry.

Integration tests cover three scenarios: auto-forwarding with a single
non-drawing interstitial, chained auto-forwarding through two
non-drawing activities, and the predicate-based exclusion of a drawing
splash activity. The AppStartupActivityPredicateTest was updated to use
a DrawingSplashActivity that actually renders content, properly testing
the predicate for activities that draw but should be excluded from TTID.
…y has drawn

Check isLaidOut on the decor view rather than just its existence when deciding
whether to report TTID immediately for forwarded activities. peekDecorView()
becomes non-null right after setContentView(), before the first frame is
rendered, which causes TTID to be under-reported. isLaidOut ensures the view
has completed at least one layout pass, which is a much more reliable signal
that the activity has actually rendered.
The previous WeakHashMap-backed set had non-deterministic iteration order,
which meant that when retargeting TTID after an interstitial activity was
destroyed, firstOrNull() could pick an arbitrary activity if multiple
tracked activities were alive simultaneously.

Replace it with a MutableList<WeakReference<Activity>> to guarantee
insertion-order iteration while preserving weak-reference semantics. Stale
references are cleaned up on every onActivityDestroyed call.

Unlike WeakHashMap, entries whose referents are GC'd won't be removed
automatically. This is acceptable because onActivityDestroyed always fires
for every activity, and the cleanup there also prunes any stale references,
so the list never accumulates orphaned entries in practice.

The size check also changes from O(1) to O(n) since we now use count()
instead of size. This is negligible in practice because the number of
concurrently tracked startup activities is extremely small (typically 1-2
during app launch).
…transitions

When an interstitial activity calls startActivity() + finish() asynchronously
(e.g., via Handler.post), the next activity's onCreate may fire after the
interstitial's onDestroy. The previous code cleared pendingStartupScenario in
onDestroy when no next candidate existed, losing the startup context. The
replacement activity was then treated as a new WarmAfterActivityDestroyed
instead of a forwarded cold start.

Fix: instead of clearing pendingStartupScenario, record the destroy timestamp
(awaitingRetargetSinceNs). When the next tracked activity is created within a
1-second timeout, forward the preserved scenario via onAppStartupRetargeted.
If the timeout expires, discard the stale scenario and fall through to normal
startup detection.

This approach was chosen over the alternative of subscribing to first frame
for every new activity while a pending scenario exists (with a new
onNextActivityCreated callback). That design would increase coupling between
RumAppStartupDetectorImpl and the listener — the detector would need to know
about frame subscriptions, and the listener would need to coordinate multiple
concurrent subscriptions. It also raises the risk of sending duplicate TTID
events if multiple activities draw their first frame while the scenario is
still pending. By keeping the explicit single-target forwarding model, only
one activity is ever subscribed at a time, eliminating that class of bugs.
@hamorillo hamorillo force-pushed the hector.morilloprieto/RUM-14471-auto branch from 7fc43d9 to 3783e55 Compare February 18, 2026 21:15
…akiness

Replace the isLaidOut branch in subscribeToFirstFrame with a single code
path: always subscribe via rumFirstDrawTimeReporter, then invalidate the
decor view for forwarded activities to ensure the OnDrawListener fires.
The previous approach called onFirstFrameDrawn() directly with the
current timestamp when isLaidOut was true, which reported an inaccurate
TTID duration that included retarget overhead rather than the actual
first draw time.
@hamorillo hamorillo force-pushed the hector.morilloprieto/RUM-14471-auto branch from 1709515 to 6cba3cf Compare February 19, 2026 15:44
callback = callback
)
if (wasForwarded) {
// The forwarded activity may have already drawn before retarget.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is it possible that the frame is already drawn when we are here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let’s imagine the following case:

A.onCreate → A.onStart → A.onResume → B.onCreate → B.onStart → B.onResume → ... → A.onDestroy

A doesn’t draw any frames, and it doesn’t finish until B has already been drawn. When A is destroyed, we move forward with the scenario so that B calls subscribeToFirstFrame, but it has already drawn its first frame.

I discovered this case while implementing the instrumented test, so I would say it’s a possible scenario.

pendingStartupScenario = forwarded
listener.onAppStartupRetargeted(forwarded)
} else {
pendingStartupScenario = null
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the next tracked activity is created (in onBeforeActivityCreated), it checks whether the elapsed time is within a 1-second timeout

IMHO if these kinds of time based heuristics can be avoided (another solution that works in more cases exists), they should be avoided. What if in AuthActivity.onCreate we have postDelayed(2.seconds, { startActivity(MainActivity) })?

it avoids increasing coupling between RumAppStartupDetectorImpl and the listener

I don't understand why coupling (if I understood correctly what you meant) here is a bad thing. Also you already introduced notifyStartupTTIDReported, I don't see how it is different from my approach in terms of coupling.

and eliminates the possibility of duplicate TTID events if multiple activities draw their first frame while the scenario is still pending

Part 3 in my comment takes care of this case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants