[SPARK-47577][SPARK-47579] Correct misleading usage of log key TASK_ID #46951
gengliangwang wants to merge 3 commits into
  // re-execute it.
- logError(log"Task ${MDC(TASK_ID, info.id)} in stage ${MDC(STAGE_ID, taskSet.id)} " +
-   log"(TID ${MDC(TID, tid)}) can not write to output file: " +
+ logError(log"Task ${MDC(TASK_INFO_ID, info.id)} in stage ${MDC(STAGE_ID, taskSet.id)} " +
the taskSet.id is s"$stageId.$stageAttemptId"
for cases like TASK_INFO_ID and TASK_SET_ID, I am wondering if we can just expose
- TASK_INDEX
- TASK_ATTEMPT_NUM
- STAGE_ID
- STAGE_ATTEMPT_ID
to the MDC?
Good point, updated.
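The suggestion above can be sketched as follows. This is a minimal, self-contained illustration and not Spark's logging code: `LogEntry`, `TaskLogSketch`, and the lower-case key strings are hypothetical names chosen for the example. It shows one context key per field, so a log consumer can filter on the stage ID or the task index without parsing a composite string such as `s"$stageId.$stageAttemptId"`.

```scala
// Hypothetical model for illustration only; these are not Spark's logging classes.
final case class LogEntry(message: String, context: Map[String, String])

object TaskLogSketch {
  // Builds a log entry with one context key per field,
  // instead of a single composite "stageId.stageAttemptId" value.
  def describeTask(
      stageId: Int,
      stageAttemptId: Int,
      taskIndex: Int,
      attemptNumber: Int): LogEntry = {
    val context = Map(
      "stage_id" -> stageId.toString,
      "stage_attempt_id" -> stageAttemptId.toString,
      "task_index" -> taskIndex.toString,
      "task_attempt_num" -> attemptNumber.toString
    )
    // The human-readable text can still use the familiar dotted form.
    val message = s"Task $taskIndex.$attemptNumber in stage $stageId.$stageAttemptId"
    LogEntry(message, context)
  }

  def main(args: Array[String]): Unit = {
    val entry = describeTask(stageId = 3, stageAttemptId = 0, taskIndex = 7, attemptNumber = 1)
    println(entry.message)
    println(entry.context("stage_id"))
  }
}
```

The message text stays readable while every identifier is also available as its own field, which is the property the four proposed keys are meant to provide.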
  isExecutorDecommissioningOrDecommissioned(taskScheduler, bmAddress)
  if (ignoreStageFailure) {
- logInfo(log"Ignoring fetch failure from ${MDC(TASK_ID, task)} of " +
+ logInfo(log"Ignoring fetch failure from ${MDC(TASK_NAME, task)} of " +
Do we really want to call this TASK_NAME?
From the code below, it seems inconsistent:
spark/core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala
Lines 583 to 587 in 33a9c5d
well, I can't find a better name here..
  log"${MDC(ERROR, ef.description)}; not retrying")
logError(
  log"Task ${MDC(TASK_INDEX, info.index)}.${MDC(TASK_ATTEMPT_ID, info.attemptNumber)} " +
  log"in stage ${MDC(STAGE_ID, taskSet.stageId)}." +
This one only shows once...I was thinking about not having a new method
override def toString: String = "TaskSet " + id

// Identifier used in the structured logging framework.
lazy val logId: MessageWithContext = {
So we intentionally don't treat it as a log parameter?
what do you mean by "log parameter"?
nvm, it's a MessageWithContext
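To illustrate the distinction in this thread: a message that carries its own context is concatenated into a larger log message rather than passed in as a single key-value parameter. The sketch below is a simplified stand-in, not Spark's `MessageWithContext`; `ContextMessage`, `LogIdSketch`, and the key strings are assumptions made for the example.

```scala
// Simplified stand-in for a message fragment that carries its own context.
// The names are hypothetical; this is not Spark's implementation.
final case class ContextMessage(text: String, context: Map[String, String]) {
  // Concatenating two fragments joins the text and merges the key-value pairs.
  def +(other: ContextMessage): ContextMessage =
    ContextMessage(text + other.text, context ++ other.context)
}

object LogIdSketch {
  // A reusable identifier fragment for a task set, analogous in spirit to logId.
  def taskSetLogId(stageId: Int, stageAttemptId: Int): ContextMessage =
    ContextMessage(
      s"TaskSet $stageId.$stageAttemptId",
      Map(
        "stage_id" -> stageId.toString,
        "stage_attempt_id" -> stageAttemptId.toString))

  def main(args: Array[String]): Unit = {
    val prefix = ContextMessage("Finished ", Map.empty)
    val combined = prefix + taskSetLogId(2, 1)
    println(combined.text)
    println(combined.context)
  }
}
```

Because the fragment already holds several keys, appending it to another message brings all of them along, which a single log parameter could not do.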
@pan3793 @panbingkun @cloud-fan Thanks for the review.

What changes were proposed in this pull request?
Correct misleading usage of log key TASK_ID from #45834 and #46739
Why are the changes needed?
Provide more accurate log keys in the structured logging framework.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Existing GA tests
Was this patch authored or co-authored using generative AI tooling?
No