[!] sched/event: Replace semaphore with direct scheduler operations in event implementation to improve performance#17223
[!] sched/event: Replace semaphore with direct scheduler operations in event implementation to improve performance#17223wangchdo wants to merge 2 commits into
Conversation
f1bbf2d to
307dcbd
Compare
ff596cb to
fa2e152
Compare
fa2e152 to
453e9dc
Compare
e3e98e8 to
d37c771
Compare
| { | ||
| struct list_node list; /* Waiting list of nxevent_wait_t */ | ||
| volatile nxevent_mask_t events; /* Pending Events */ | ||
| spinlock_t lock; /* Spinlock */ |
There was a problem hiding this comment.
but spinlock could avoid the global big lock, why do we switch back?
There was a problem hiding this comment.
This is mainly for removing reliance on semaphore, after removing semaphore, we will use critical section to protect event, and spinlock is not needed any more.
The reason to remove semaphore is that it is too heavy for event timeout:
- It has global big lock inside the sema_wait/sema_post api,so in fact there are double lock here: spinlock for event and critical section lock for semaphore
- Semaphore object costs more memory
- It has lost of logic inside sema_wait and sema_post that are not related to event timeout, these logic is even more complicated than event itself
Indeed the lock scope is very small in event, and after removing semaphore it is even smaller, so i think this would be better
By the way The current spinlock is also a global one for event post, it uses flags = spin_lock_irqsave_nopreempt(&event->lock) api.
I also submitted #17244 to remove the event's dependency on wait object, please check detailed information below, thanks
There was a problem hiding this comment.
I think there needs to be a balance here with the SMP mode, I prefer to use spinlock
There was a problem hiding this comment.
I think there needs to be a balance here with the SMP mode, I prefer to use spinlock
Hi @anchao
My key point here is removing semaphore, becasue:
The current event implementation uses a spinlock plus a semaphore — the spinlock protects the event object, while the semaphore manages synchronization between the event waiter and poster. However, this design feels unnecessarily heavy. The semaphore internally relies on enter_critical_section()/leave_critical_section() to protect its own object, and its internal logic is overly complex for such a simple synchronization scenario. In addition, the semaphore object itself consumes extra memory.
Therefore, I decided to remove the semaphore dependency and instead directly use enter_critical_section()/leave_critical_section() to both protect the event object and synchronize between the waiter and poster. This should be more efficient
There was a problem hiding this comment.
Besides, if you check PR #17244
, you’ll see that removing the semaphore also provides an opportunity to eliminate the wait object used by event-waiting tasks. Removing the wait object not only reduces memory usage but also makes the event mechanism safer and the API cleaner and more straightforward.
| TSTATE_TASK_INACTIVE, /* BLOCKED - Initialized but not yet activated */ | ||
| TSTATE_WAIT_SEM, /* BLOCKED - Waiting for a semaphore */ | ||
| TSTATE_WAIT_SIG, /* BLOCKED - Waiting for a signal */ | ||
| TSTATE_WAIT_EVENT, /* BLOCKED - Waiting for a event */ |
There was a problem hiding this comment.
why not continue use TSTATE_WAIT_SIG with alias
There was a problem hiding this comment.
I think event is different than sleep... the scheduler need to treat task waiting on event separately for things like task cancellation, so a new task state is really needed... you can refer to the function nxevent_wait_irq() I implemented and it is called from nxnotify_cancellation():
You can also refer to PR17244 for more information I explained, in this PR i continued to remove the separate wait object in event implementation
There was a problem hiding this comment.
This PR removed event's reliance on semaphore, I also submited a PR to remove event's reliance on a separate wait object, please also check : PR17244
FYI:
#17223 plus this #17224 change summary is:
Refactors the event module by removing its dependency on semaphores and separate wait objects, and introduces a new task state TSTATE_WAIT_EVENT to simplify scheduling and improve maintainability.
Detailed Changes are
1. Remove semaphore dependency
Reason:
- Semaphore objects consume more memory than necessary for event synchronization.
- Semaphore interfaces are relatively complex, involving global locks and logic that exceeds
the needs of the event mechanism.
Benefit:
- Simplifies the event module and reduces runtime and memory overhead.
2. Remove wait object dependency
Reason:
- Wait objects introduce additional memory usage.
- The current design either uses a local wait object in the waiting task (which is unsafe because the posting task also accesses it) or requires users to define global wait objects and call event_tickwait_wait(). This leads to complicated and error-prone usage.
- By removing wait objects, the event module can be implemented more cleanly.
Benefit:
- Simplifies API usage.
- Improves safety and code maintainability.
3. Introduce TSTATE_WAIT_EVENT and move the scheduling list to the event object
Reason:
- Makes the event module implementation more concise.
- Allows the scheduler to handle tasks blocked on events more flexibly in special cases (e.g., task deletion).
Benefit:
- Improves modularity and better integrates event handling with the scheduler.
077e0be to
3afe62e
Compare
Restore the use of critical sections to provide mutual exclusion
between event wait and post operations. This allows replacing the
heavier semaphore-based mechanism with direct scheduler operations
for synchronization.
Signed-off-by: Chengdong Wang wangchengdong@lixiang.com
3afe62e to
72642c4
Compare
jerpelea
left a comment
There was a problem hiding this comment.
please signal the breaking change with [!] before the commit and PR title
ex:
[!] sched/event: Replace semaphore with direct scheduler operations in event
…n event
The current event implementation uses semaphores for wait and post
operations. Since semaphores are relatively heavy-weight and intended
for resource-based synchronization, this is suboptimal.
So this patch replaced the semaphore-based mechanism with direct
scheduler operations to improve performance and reduce memory footprint.
This patch also introduce a new task state TSTATE_WAIT_EVENT to indicate
the task is waiting for a event.
BREAKING CHANGE: This commit introduced a new task state TSTATE_WAIT_EVENT
so apps/nshlib/, procfs/ and tools/pynuttx/nxgdb/ are needed to be updated accordingly.
Signed-off-by: Chengdong Wang wangchengdong@lixiang.com
72642c4 to
7fcfeae
Compare
HI @jerpelea Done, please check:
|

Summary
The current event implementation uses semaphores for wait and post
operations. Since semaphores are relatively heavy-weight and intended
for resource-based synchronization, this is suboptimal.
So this patch replaced the semaphore-based mechanism with direct
scheduler operations to improve performance and reduce memory footprint.
This patch also introduce a new task state TSTATE_WAIT_EVENT to indicate
the task is waiting for a event.
Impact
improvement for the event module, no impact to other nuttx parts
Testing
ostest passed on board a2g-tc397-5v-tft (including event testcases)