Commit 6936ff7
Fix debugger hangs due to runtime deadlock by using patch DJI (dotnet#123651)
# Description
This PR implements a targeted fix for the debugger deadlock issue where
the debugger hangs while waiting for a deadlocked debuggee process. The
deadlock occurs when `DebuggerController::BindPatch` calls `GetJitInfo`
while holding the DebuggerController lock. `GetJitInfo` can trigger
HashMap lookups that perform GC mode transitions, leading to lock
ordering issues.
The fix modifies `BindPatch` to check if the patch already has a
`DebuggerJitInfo` (DJI) stored on it via `patch->HasDJI()`. When the DJI
is available, it uses `patch->GetDJI()` instead of calling the
problematic `GetJitInfo` method. This approach is cleaner than adding a
new parameter since the patch already stores the DJI.
## Changes Made
- Updated `BindPatch` implementation to check `patch->HasDJI()` before
calling `GetJitInfo`
- When DJI exists on the patch, use `patch->GetDJI()` to avoid the
deadlock-prone `GetJitInfo` call
- Maintains backward compatibility - patches without DJI continue to
call `GetJitInfo` as before
The implementation uses a ternary operator for concise, idiomatic C++
code:
```cpp
DebuggerJitInfo *info = patch->HasDJI() ? patch->GetDJI() : g_pDebugger->GetJitInfo(pMD, (const BYTE *)startAddr);
```
# Customer Impact
Without this fix, customers can experience debugger hangs when debugging
applications with ReadyToRun methods under specific timing conditions.
This occurs when:
1. A breakpoint is set on a ReadyToRun method that hasn't executed yet
2. One thread is stepping through code while another thread first
executes the ReadyToRun method with the breakpoint
The hang requires the debugger and IDE to be force-closed, resulting in
loss of debugging session state and developer productivity.
# Regression
This is not a regression from the most recent release. The root cause
has existed for a while, but timing changes in the code may have
impacted the likelihood of hitting the race condition.
# Testing
- Full CoreCLR build completed successfully with no errors or warnings
- Code review passed with no issues identified
- No security vulnerabilities introduced
The changes are minimal (1 file changed, 3 insertions, 1 deletion) and
surgical to reduce regression risk for potential backporting to prior
.NET releases.
# Risk
**Low Risk**. The changes are minimal and highly targeted:
- No API changes - the `BindPatch` signature remains unchanged
- The logic flow remains unchanged - we're just avoiding an unnecessary
lookup when the DJI is already available on the patch
- The patch already stores the DJI via `AddPatchForMethodDef`, so we're
using existing data
- Full build validation confirms no compilation issues
- The fix addresses the specific deadlock scenario without affecting
other debugger functionality
- Maintains backward compatibility - patches without DJI still call
`GetJitInfo` as before
<!-- START COPILOT ORIGINAL PROMPT -->
<details>
<summary>Original prompt</summary>
----
*This section details on the original issue you should resolve*
<issue_title>Debugger hangs due to runtime deadlock. HashMap takes
ThreadStore out of order</issue_title>
<issue_description>### Description
The debugger is blocked in mscordbi code at:
```
[Inline Frame] mscordbi.dll!SafeWaitForSingleObject(CordbProcess *) Line 257 C++
> mscordbi.dll!CordbProcess::StopInternal(unsigned long dwTimeout, VMPTR_Base<AppDomain,void> pAppDomainToken) Line 3611 C++
[Inline Frame] mscordbi.dll!StopContinueHolder::Init(CordbProcess *) Line 208 C++
mscordbi.dll!CordbModule::GetFunctionFromToken(unsigned int token, ICorDebugFunction * * ppFunction) Line 1387 C++
```
mscordbi is waiting on the debuggee to respond which it won't do because
it is deadlocked.
Thread 0x9314 in the debuggee (trying to acquire DebuggerController
lock):
```
[Inline Frame] coreclr.dll!CrstBase::AcquireLock(CrstBase *) Line 174 C++
[Inline Frame] coreclr.dll!FunctionBase<CrstBase *,&CrstBase::AcquireLock,&CrstBase::ReleaseLock>::DoAcquire() Line 694 C++
[Inline Frame] coreclr.dll!BaseHolder<CrstBase *,FunctionBase<CrstBase *,&CrstBase::AcquireLock,&CrstBase::ReleaseLock>,0,&CompareDefault<CrstBase *>>::Acquire() Line 266 C++
[Inline Frame] coreclr.dll!BaseHolder<CrstBase *,FunctionBase<CrstBase *,&CrstBase::AcquireLock,&CrstBase::ReleaseLock>,0,&CompareDefault<CrstBase *>>::{ctor}(CrstBase *) Line 233 C++
[Inline Frame] coreclr.dll!Holder<CrstBase *,&CrstBase::AcquireLock,&CrstBase::ReleaseLock,0,&CompareDefault<CrstBase *>,1>::{ctor}(CrstBase *) Line 729 C++
> coreclr.dll!DebuggerController::DispatchPatchOrSingleStep(Thread * thread, _CONTEXT * context, const unsigned char * address, SCAN_TRIGGER which, DebuggerSteppingInfo * pDebuggerSteppingInfo) Line 3063 C++
coreclr.dll!DebuggerController::DispatchNativeException(_EXCEPTION_RECORD * pException, _CONTEXT * pContext, unsigned long dwCode, Thread * pCurThread, DebuggerSteppingInfo * pDebuggerSteppingInfo) Line 4618 C++
coreclr.dll!Debugger::FirstChanceNativeException(_EXCEPTION_RECORD * exception, _CONTEXT * context, unsigned long code, Thread * thread, int fIsVEH) Line 5470 C++
coreclr.dll!IsDebuggerFault(_EXCEPTION_RECORD * pExceptionRecord, _CONTEXT * pContext, unsigned long exceptionCode, Thread * pThread) Line 5811 C++
And the DebuggerController lock is held by thread 0x1E38 which is here:
```
[Inline Frame] coreclr.dll!CLREventBase::Wait(unsigned long) Line
412 C++
coreclr.dll!Thread::WaitSuspendEventsHelper() Line 4458 C++
coreclr.dll!Thread::RareDisablePreemptiveGC() Line 2185 C++
[Inline Frame] coreclr.dll!Thread::DisablePreemptiveGC() Line
1288 C++
[Inline Frame]
coreclr.dll!GCHolderBase::EnterInternalCoop_HackNoThread(bool) Line
4619 C++
[Inline Frame] coreclr.dll!GCCoopHackNoThread::{ctor}(bool) Line
4834 C++
[Inline Frame] coreclr.dll!HashMap::LookupValue(unsigned __int64)
Line 552 C++
[Inline Frame] coreclr.dll!PtrHashMap::LookupValue(unsigned __int64
key, void *) Line 607 C++
[Inline Frame]
coreclr.dll!ReadyToRunInfo::GetMethodDescForEntryPointInNativeImage(unsigned
__int64) Line 376 C++
[Inline Frame]
coreclr.dll!ReadyToRunInfo::GetMethodDescForEntryPoint(unsigned __int64
entryPoint) Line 104 C++
coreclr.dll!ReadyToRunJitManager::JitCodeToMethodInfo(RangeSection
* pRangeSection, unsigned __int64 currentPC, MethodDesc * *
ppMethodDesc, EECodeInfo * pCodeInfo) Line 6451 C++
coreclr.dll!EECodeInfo::Init(unsigned __int64 codeAddress,
ExecutionManager::ScanFlag scanFlag) Line 15025 C++
[Inline Frame] coreclr.dll!EECodeInfo::{ctor}(unsigned __int64)
Line 2877 C++
[Inline Frame]
coreclr.dll!ExecutionManager::GetCodeStartAddress(unsigned __int64) Line
4980 C++
coreclr.dll!EEDbgInterfaceImpl::GetNativeCodeStartAddress(unsigned
__int64 address) Line 416 C++
coreclr.dll!Debugger::GetJitInfoWorker(MethodDesc * fd, const
unsigned char * pbAddr, DebuggerMethodInfo * * pMethInfo) Line
2711 C++
[Inline Frame] coreclr.dll!Debugger::GetJitInfo(MethodDesc *) Line
2650 C++
coreclr.dll!DebuggerController::BindPatch(DebuggerControllerPatch *
patch, MethodDesc * pMD, const unsigned char *) Line 1432 C++
> coreclr.dll!DebuggerController::AddBindAndActivatePatchForMethodDesc(MethodDesc
* fd, DebuggerJitInfo * dji, unsigned __int64 nativeOffset,
DebuggerPatchKind kind, FramePointer fp, AppDomain * pAppDomain) Line
2242 C++
coreclr.dll!DebuggerController::AddBindAndActivateILReplicaPatch(DebuggerControllerPatch
* primary, DebuggerJitInfo * dji) Line 2006 C++
coreclr.dll!Debugger::MapPatchToDJI(DebuggerControllerPatch * dcp,
DebuggerJitInfo * djiTo) Line 4963 C++
coreclr.dll!Debugger::MapAndBindFunctionPatches(DebuggerJitInfo *
djiNew, MethodDesc * fd, const unsigned char *) Line 4879 C++
coreclr.dll!Debugger::JIT...
</details>
<!-- START COPILOT CODING AGENT SUFFIX -->
- Fixes dotnet#123650
<!-- START COPILOT CODING AGENT TIPS -->
---
✨ Let Copilot coding agent [set things up for
you](https://github.com/dotnet/runtime/issues/new?title=✨+Set+up+Copilot+instructions&body=Configure%20instructions%20for%20this%20repository%20as%20documented%20in%20%5BBest%20practices%20for%20Copilot%20coding%20agent%20in%20your%20repository%5D%28https://gh.io/copilot-coding-agent-tips%29%2E%0A%0A%3COnboard%20this%20repo%3E&assignees=copilot)
— coding agent works faster and does higher quality work when set up for
your repo.
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: noahfalk <6243776+noahfalk@users.noreply.github.com>
Co-authored-by: AaronRobinsonMSFT <30635565+AaronRobinsonMSFT@users.noreply.github.com>1 parent 46daa57 commit 6936ff7
1 file changed
Lines changed: 3 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1415 | 1415 | | |
1416 | 1416 | | |
1417 | 1417 | | |
1418 | | - | |
| 1418 | + | |
| 1419 | + | |
| 1420 | + | |
1419 | 1421 | | |
1420 | 1422 | | |
1421 | 1423 | | |
| |||
0 commit comments