[!] sched/event: Replace semaphore with direct scheduler operations in event implementation to improve performance#17223

Closed
wangchdo wants to merge 2 commits into
apache:masterfrom
wangchdo:improve_event_1021

@wangchdo
Contributor

@wangchdo wangchdo commented Oct 21, 2025

Summary

The current event implementation uses semaphores for wait and post
operations. Since semaphores are relatively heavy-weight and intended
for resource-based synchronization, this is suboptimal.

So this patch replaces the semaphore-based mechanism with direct
scheduler operations to improve performance and reduce memory footprint.

This patch also introduces a new task state TSTATE_WAIT_EVENT to indicate
that the task is waiting for an event.
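
To illustrate the idea, here is a minimal, self-contained sketch of an event wait/post pair that manipulates task state directly instead of going through a semaphore. It is a toy model, not the code in this patch: the struct task, struct event, event_wait() and event_post() names are hypothetical, events are assumed to be consumed on delivery, and the critical section that would surround each function in the kernel is only indicated by comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t event_mask_t;

/* Toy task states; STATE_WAIT_EVENT plays the role of TSTATE_WAIT_EVENT */

enum task_state { STATE_READY, STATE_WAIT_EVENT };

struct task
{
  enum task_state state;
  event_mask_t expect;    /* Events the task is waiting for */
  event_mask_t received;  /* Events delivered to the task */
  struct task *next;      /* Link in the event's wait list */
};

struct event
{
  event_mask_t pending;   /* Posted but not yet consumed events */
  struct task *waiters;   /* Tasks blocked on this event */
};

/* Returns true if the task had to block, false if an expected event was
 * already pending. In a kernel this body would run inside a critical
 * section (assumption).
 */

bool event_wait(struct event *ev, struct task *t, event_mask_t expect)
{
  assert(ev != NULL && t != NULL && expect != 0);

  event_mask_t hit = ev->pending & expect;
  if (hit != 0)
    {
      ev->pending &= ~hit;      /* Consume the matching events */
      t->received = hit;
      return false;
    }

  t->expect   = expect;
  t->received = 0;
  t->state    = STATE_WAIT_EVENT;
  t->next     = ev->waiters;    /* The task itself is the list node */
  ev->waiters = t;
  return true;
}

/* Posts events and makes every matching waiter ready again. Returns the
 * number of tasks woken. Also assumed to run inside a critical section.
 */

int event_post(struct event *ev, event_mask_t events)
{
  assert(ev != NULL);

  int woken = 0;
  struct task **link = &ev->waiters;

  ev->pending |= events;
  while (*link != NULL)
    {
      struct task *t = *link;
      event_mask_t hit = ev->pending & t->expect;

      if (hit != 0)
        {
          *link        = t->next;     /* Unlink from the wait list */
          t->next      = NULL;
          t->received  = hit;
          t->state     = STATE_READY; /* Direct state change, no semaphore */
          ev->pending &= ~hit;
          woken++;
        }
      else
        {
          link = &t->next;
        }
    }

  return woken;
}
```

In this sketch the waiter is woken simply by changing its state and unlinking it from the event's list, which is the kind of direct scheduler operation the summary refers to.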

Impact

Improvement to the event module; no impact on other NuttX parts

Testing

ostest passed on board a2g-tc397-5v-tft (including event testcases)

image

@github-actions github-actions Bot added Area: OS Components OS Components issues Size: M The size of the change in this PR is medium labels Oct 21, 2025
@wangchdo wangchdo changed the title sched/event: improve sched/event implementation with sleep/wakeup paire sched/event: improve sched/event implementation with sleep/wakeup pair Oct 22, 2025
@wangchdo wangchdo force-pushed the improve_event_1021 branch 3 times, most recently from ff596cb to fa2e152 Compare October 22, 2025 03:10
Comment thread sched/event/event_wait.c Outdated
@wangchdo wangchdo changed the title sched/event: improve sched/event implementation with sleep/wakeup pair sched/event: Replace semaphore with direct scheduler operations in event implementation to improve performance Oct 23, 2025
@wangchdo wangchdo requested a review from anchao October 23, 2025 01:07
@wangchdo wangchdo force-pushed the improve_event_1021 branch 8 times, most recently from e3e98e8 to d37c771 Compare October 23, 2025 02:27
Comment thread include/nuttx/event.h
{
struct list_node list; /* Waiting list of nxevent_wait_t */
volatile nxevent_mask_t events; /* Pending Events */
spinlock_t lock; /* Spinlock */
Contributor

but spinlock could avoid the global big lock, why do we switch back?

Contributor Author

@wangchdo wangchdo Oct 23, 2025

This is mainly to remove the reliance on the semaphore. Once the semaphore is gone, a critical section protects the event, and the spinlock is no longer needed.

The semaphore is being removed because it is too heavy for event timeout:

  1. The sema_wait/sema_post API takes the global big lock internally, so there are in fact two locks here: the spinlock for the event and the critical section lock for the semaphore
  2. The semaphore object costs more memory
  3. sema_wait and sema_post contain lots of logic unrelated to event timeout, and that logic is even more complicated than the event code itself

The lock scope in event is already very small, and it is even smaller after removing the semaphore, so I think this would be better.

By the way, the current spinlock is also a global one for event post: it uses the flags = spin_lock_irqsave_nopreempt(&event->lock) API.

I also submitted #17244 to remove the event's dependency on the wait object; please check the detailed information below, thanks

Contributor

I think there needs to be a balance here with the SMP mode, I prefer to use spinlock

Contributor Author

@wangchdo wangchdo Oct 28, 2025

I think there needs to be a balance here with the SMP mode, I prefer to use spinlock

Hi @anchao

My key point here is removing the semaphore, because:

The current event implementation uses a spinlock plus a semaphore: the spinlock protects the event object, while the semaphore manages synchronization between the event waiter and poster. However, this design feels unnecessarily heavy. The semaphore internally relies on enter_critical_section()/leave_critical_section() to protect its own object, and its internal logic is overly complex for such a simple synchronization scenario. In addition, the semaphore object itself consumes extra memory.

Therefore, I decided to remove the semaphore dependency and instead use enter_critical_section()/leave_critical_section() directly, both to protect the event object and to synchronize between the waiter and poster. This should be more efficient.
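
The difference in locking can be sketched with a small counting model. This is only an illustration under stated assumptions: the toy_* helpers are hypothetical stand-ins that count lock acquisitions and do not disable interrupts or spin, and the old path is assumed to post the semaphore while the event spinlock is still held.

```c
#include <assert.h>
#include <stddef.h>

static int g_lock_ops;  /* Lock acquisitions seen by the last post call */

/* Hypothetical stand-ins for the real locking primitives */

static void toy_spin_lock(void)      { g_lock_ops++; }
static void toy_spin_unlock(void)    { }
static void toy_enter_critical(void) { g_lock_ops++; }
static void toy_leave_critical(void) { }

/* Stand-in for a semaphore post, which takes the global lock internally */

static void toy_sem_post(int *count)
{
  toy_enter_critical();
  (*count)++;
  toy_leave_critical();
}

/* Old shape: spinlock for the event object, semaphore to wake the waiter.
 * Returns the number of lock acquisitions.
 */

int post_with_semaphore(unsigned *pending, unsigned events, int *sem)
{
  assert(pending != NULL && sem != NULL);

  g_lock_ops = 0;
  toy_spin_lock();
  *pending |= events;
  toy_sem_post(sem);
  toy_spin_unlock();
  return g_lock_ops;
}

/* New shape: one critical section covers both the update of the event
 * object and the wake-up of the waiter.
 */

int post_direct(unsigned *pending, unsigned events, int *waiter_ready)
{
  assert(pending != NULL && waiter_ready != NULL);

  g_lock_ops = 0;
  toy_enter_critical();
  *pending |= events;
  *waiter_ready = 1;
  toy_leave_critical();
  return g_lock_ops;
}
```

The model says nothing about contention on SMP, which is the concern raised above; it only shows that the semaphore path takes a second lock for the same post.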

Contributor Author

Besides, if you check PR #17244, you’ll see that removing the semaphore also provides an opportunity to eliminate the wait object used by event-waiting tasks. Removing the wait object not only reduces memory usage but also makes the event mechanism safer and the API cleaner and more straightforward.

Comment thread include/nuttx/event.h
Comment thread include/nuttx/sched.h
TSTATE_TASK_INACTIVE, /* BLOCKED - Initialized but not yet activated */
TSTATE_WAIT_SEM, /* BLOCKED - Waiting for a semaphore */
TSTATE_WAIT_SIG, /* BLOCKED - Waiting for a signal */
TSTATE_WAIT_EVENT, /* BLOCKED - Waiting for a event */
Contributor

why not continue to use TSTATE_WAIT_SIG with an alias?

Contributor Author

@wangchdo wangchdo Oct 23, 2025

I think event is different from sleep. The scheduler needs to treat a task waiting on an event separately for things like task cancellation, so a new task state is really needed. You can refer to the function nxevent_wait_irq() that I implemented; it is called from nxnotify_cancellation():

image

You can also refer to PR17244 for a fuller explanation; in that PR I go on to remove the separate wait object in the event implementation
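
As a rough sketch of why a distinct state helps, the cancellation path can dispatch on the task state and run an event-specific wake-up that also unlinks the task from the event's wait list. Everything below is a toy model: the toy_* names are hypothetical, the body of nxevent_wait_irq() is not reproduced here, and the use of ECANCELED as the error code is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum toy_tstate { TOY_READY, TOY_WAIT_SEM, TOY_WAIT_SIG, TOY_WAIT_EVENT };

struct toy_tcb
{
  enum toy_tstate state;
  int errcode;         /* Error reported to the woken task */
  int on_event_list;   /* Nonzero while linked into an event's wait list */
};

/* Toy counterpart of an event-specific wake-up: unlink the task from the
 * event and make it ready with an error code.
 */

static void toy_event_wait_irq(struct toy_tcb *tcb, int errcode)
{
  tcb->on_event_list = 0;
  tcb->errcode       = errcode;
  tcb->state         = TOY_READY;
}

/* Toy counterpart of a cancellation notifier. Returns 1 if a blocked task
 * was woken, 0 if the task was not blocked.
 */

int toy_notify_cancellation(struct toy_tcb *tcb)
{
  assert(tcb != NULL);

  switch (tcb->state)
    {
      case TOY_WAIT_EVENT:

        /* Needs event-specific cleanup, hence its own state */

        toy_event_wait_irq(tcb, ECANCELED);
        return 1;

      case TOY_WAIT_SEM:
      case TOY_WAIT_SIG:

        /* Semaphore and signal waits have their own wake-up paths,
         * reduced to a plain state change in this sketch.
         */

        tcb->errcode = ECANCELED;
        tcb->state   = TOY_READY;
        return 1;

      default:
        return 0;
    }
}
```

If event waits shared a state with signal waits, the notifier could not tell from the state alone that the task still has to be removed from an event's list.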

Contributor Author

@wangchdo wangchdo Oct 25, 2025

@xiaoxiang781216 @anchao

This PR removes the event's reliance on the semaphore. I also submitted a PR to remove the event's reliance on a separate wait object, please also check: PR17244

FYI:
#17223 plus this #17224 change summary is:

Refactors the event module by removing its dependency on semaphores and separate wait objects, and introduces a new task state TSTATE_WAIT_EVENT to simplify scheduling and improve maintainability.

Detailed changes:

1. Remove semaphore dependency

Reason:

  • Semaphore objects consume more memory than necessary for event synchronization.
  • Semaphore interfaces are relatively complex, involving global locks and logic that exceeds
    the needs of the event mechanism.

Benefit:

  • Simplifies the event module and reduces runtime and memory overhead.

2. Remove wait object dependency

Reason:

  • Wait objects introduce additional memory usage.
  • The current design either uses a local wait object in the waiting task (which is unsafe because the posting task also accesses it) or requires users to define global wait objects and call event_tickwait_wait(). This leads to complicated and error-prone usage.
  • By removing wait objects, the event module can be implemented more cleanly.

Benefit:

  • Simplifies API usage.
  • Improves safety and code maintainability.

3. Introduce TSTATE_WAIT_EVENT and move the scheduling list to the event object

Reason:

  • Makes the event module implementation more concise.
  • Allows the scheduler to handle tasks blocked on events more flexibly in special cases (e.g., task deletion).

Benefit:

  • Improves modularity and better integrates event handling with the scheduler.

    Restore the use of critical sections to provide mutual exclusion
    between event wait and post operations. This allows replacing the
    heavier semaphore-based mechanism with direct scheduler operations
    for synchronization.

Signed-off-by: Chengdong Wang wangchengdong@lixiang.com
Contributor

@jerpelea jerpelea left a comment

please signal the breaking change with [!] before the commit and PR title
ex:
[!] sched/event: Replace semaphore with direct scheduler operations in event

@jerpelea jerpelea changed the title sched/event: Replace semaphore with direct scheduler operations in event implementation to improve performance [!] sched/event: Replace semaphore with direct scheduler operations in event implementation to improve performance Oct 30, 2025
…n event

    The current event implementation uses semaphores for wait and post
    operations. Since semaphores are relatively heavy-weight and intended
    for resource-based synchronization, this is suboptimal.

    So this patch replaces the semaphore-based mechanism with direct
    scheduler operations to improve performance and reduce memory footprint.

    This patch also introduces a new task state TSTATE_WAIT_EVENT to indicate
    that the task is waiting for an event.

    BREAKING CHANGE:  This commit introduces a new task state TSTATE_WAIT_EVENT,
    so apps/nshlib/, procfs/ and tools/pynuttx/nxgdb/ need to be updated accordingly.

Signed-off-by: Chengdong Wang wangchengdong@lixiang.com
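
The reason state consumers need updating can be sketched as follows. Inserting a value in the middle of an enum shifts every later value by one, so any table indexed by task state must gain an entry at the same position. The enum below is a shortened, hypothetical stand-in for the real task state list, and the display strings are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Shortened toy version of a task state enum */

enum toy_tstate
{
  TOY_TSTATE_TASK_INACTIVE,
  TOY_TSTATE_WAIT_SEM,
  TOY_TSTATE_WAIT_SIG,
  TOY_TSTATE_WAIT_EVENT,        /* New entry: later values shift by one */
  TOY_TSTATE_WAIT_MQNOTEMPTY,
  TOY_NUM_TASK_STATES
};

/* Display names indexed by state, as a ps-like tool might keep them */

static const char *const g_toy_statenames[] =
{
  "Inactive",
  "Waiting,Semaphore",
  "Waiting,Signal",
  "Waiting,Event",              /* Must be added together with the enum */
  "Waiting,MQ empty"
};

/* Fails the build if the table and the enum get out of step */

static_assert(sizeof(g_toy_statenames) / sizeof(g_toy_statenames[0]) ==
              TOY_NUM_TASK_STATES,
              "state name table does not match the task state enum");

const char *toy_state_name(int state)
{
  if (state < 0 || state >= TOY_NUM_TASK_STATES)
    {
      return "Unknown";
    }

  return g_toy_statenames[state];
}
```

A compile-time check like this only protects code built against the same header; a tool that hard-codes the numeric values separately still has to be updated by hand.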
@wangchdo wangchdo requested a review from jerpelea October 30, 2025 14:39
@wangchdo
Contributor Author

please signal the breaking change with [!] before the commit and PR title ex: [!] sched/event: Replace semaphore with direct scheduler operations in event

Hi @jerpelea

Done, please check:

image

@xiaoxiang781216
Contributor

@wangchdo why dup with #17244

@wangchdo
Contributor Author

@wangchdo why dup with #17244

This PR is on the same branch as #17244; let me close this duplicate

@wangchdo wangchdo closed this Oct 30, 2025
@wangchdo wangchdo deleted the improve_event_1021 branch January 11, 2026 07:41
Labels

Area: OS Components OS Components issues Size: M The size of the change in this PR is medium

4 participants