Skip to content

Conversation

@sylveon
Copy link
Contributor

@sylveon sylveon commented Aug 18, 2023

Fixes #1329

This fixes the issue by using m_state before suspending, rather than after. The issue occurs because the callback could fire before our thread of execution resumes, causing the timespan_awaiter/signal_awaiter to be destroyed inside the coroutine frame before m_state is accessed.

As a drive-by improvement, currently if await_suspend is called with a non-idle state, the threadpool object is closed (cancelling the timer/wait), the existing coroutine handle is just dropped, and resume on the new handle is fired immediately. This would cause the existing pending coroutine to hang forever. Instead, avoid doing anything and throw an exception when the awaiter is not idle. This is a very unlikely event, the test does some gymnastics (reused awaiter) to achieve this state, but better safe than sorry.

Fixes microsoft#1329

This fixes the issue by using m_state before suspending, rather than after. The issue occurs because the callback could fire before our thread of execution resumes, causing the timespan_awaiter/signal_awaiter to be destroyed inside the coroutine frame before m_state is accessed.

As a drive-by improvement, currently if await_suspend is called with a non-idle state, the threadpool object is closed (cancelling the timer/wait), the existing coroutine handle is just dropped, and resume on the new handle is fired immediately. This would cause the existing pending coroutine to hang forever. Instead, avoid doing anything and throw an exception when the awaiter is not idle. This is a very unlikely event, the test does some gymnastics (reused awaiter) to achieve this state, but better safe than sorry.
@kennykerr
Copy link
Collaborator

CI build failing with:

C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.35.32215\include\atomic(702) : Assertion failed: Invalid memory order

@sylveon
Copy link
Contributor Author

sylveon commented Sep 13, 2023

Yep, taking a look

@sylveon
Copy link
Contributor Author

sylveon commented Sep 14, 2023

A driver update hosed my system install (I love Windows), will take a bit more time than expected

@sylveon
Copy link
Contributor Author

sylveon commented Sep 21, 2023

I stopped trying to be clever and just used a mutex, that way we can be sure that await_suspend completes before the threadpool callback resumes the coroutine.

I also took the opportunity to refactor the code to use CRTP and share implementation between timers and waits, as well as insert some useful short circuits in case of cancellation.

{
}

#if defined(__GNUC__) && !defined(__clang__)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed this because the upstream bug has been fixed since GCC 12

@sylveon
Copy link
Contributor Author

sylveon commented Sep 21, 2023

Everything should work now :)

@kennykerr
Copy link
Collaborator

@oldnewthing are you able to review this?

@oldnewthing
Copy link
Member

@kennykerr Will try to get to it, but kind of busy with other API stuff.

@github-actions
Copy link

This pull request is stale because it has been open 10 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@sylveon
Copy link
Contributor Author

sylveon commented Oct 13, 2023

Bump

@github-actions
Copy link

This pull request is stale because it has been open 10 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@sylveon
Copy link
Contributor Author

sylveon commented Oct 29, 2023

Can someone reopen?

@sylveon
Copy link
Contributor Author

sylveon commented Feb 8, 2024

Bump

@github-actions
Copy link

This pull request is stale because it has been open 10 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions
Copy link

github-actions bot commented Mar 1, 2024

This pull request is stale because it has been open 10 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@sylveon
Copy link
Contributor Author

sylveon commented Mar 1, 2024

Is there anything preventing this from being merged?

@kennykerr
Copy link
Collaborator

I'm not able to review this code #1329 (comment). My preference is to strip coroutine support in C++/WinRT down to the basics, removing all of the helpers, cancellation, and progress support as it all appears somewhat unreliable and buggy and is too difficult to reason about.

@sylveon
Copy link
Contributor Author

sylveon commented Mar 2, 2024

Migrating all of that to WIL seems preferred (I'm not opposed to that), but then it would be a breaking change that requires C++/WinRT 3. Maybe @oldnewthing can review this until then

@oldnewthing
Copy link
Member

The PR description says "This fixes the issue by using m_state before suspending, rather than after. " But the change is much bigger than this sentence suggests.

Can we untangle the essential fix from the drive-by fix?

@sylveon
Copy link
Contributor Author

sylveon commented Mar 3, 2024

Making the fix separately in both classes would not lead to a much better PR diff unfortunately, so I merged them together into a single base class to reduce code duplication (and therefore less effort to review).

@github-actions
Copy link

This pull request is stale because it has been open 10 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@sylveon
Copy link
Contributor Author

sylveon commented Mar 17, 2024

In short, I don't think untangling the drive-by would make it much easier to read.

@github-actions
Copy link

This pull request is stale because it has been open 10 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions
Copy link

github-actions bot commented Apr 9, 2024

This pull request is stale because it has been open 10 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@kennykerr
Copy link
Collaborator

Closing this PR as there's been no real activity in nearly 8 months. If a project maintainer is available to review, they can always reopen this PR as needed.

@kennykerr kennykerr closed this Apr 16, 2024
@jaigak
Copy link
Contributor

jaigak commented Apr 20, 2024

Why hasn't Microsoft got anyone to look after C++/WinRT?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: Access violation when using winrt::resume_on_signal

6 participants