# Severe slowdown on _temp with k8s-novolume hooks (v0.13.0) and actions/setup-go
#179085
Replies: 1 comment
**Why are you starting this discussion?** Bug

**What GitHub Actions topic or product is this about?** ARC (Actions Runner Controller)
Discussion Details
### Environment

- Container mode: `kubernetes` with `type: novolume`
- `ACTIONS_RUNNER_CONTAINER_HOOKS=/home/runner/k8s-novolume/index.js`
- `k8s-novolume` hooks version: 0.13.0
- `overlay` filesystem, no PVC, ephemeral storage only
- Actions involved: `actions/setup-go@v5`, `go test` + coverage, `actions/upload-artifact@v4`, `dorny/test-reporter@v2`

There are two pods involved per job:

- the runner pod, and
- the workflow pod (sharing the job workspace under `/__w/...`)

### Symptoms
During the job, each step logs the container hook:
For some steps (especially later ones like `actions/upload-artifact@v4` and `dorny/test-reporter@v2`), the workflow appears to hang for tens of minutes while this Node process is running, even though the actual user commands (e.g. `go test`) have already completed.

Example around `actions/upload-artifact@v4`:

And then again for `dorny/test-reporter@v2`:

The overall job runtime was ~50+ minutes, with a large portion of time spent around these hook invocations.
### What `index.js` is doing (`find` + `stat` on `_temp`)

#### In the workflow pod

From the workflow pod (as `root`):

Shows:
#### In the runner pod

From the runner pod (as `runner`):

Shows:
So `index.js` is:

- `cd`-ing into `_temp` in each pod, and
- running:
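The exact command is elided above, but given the section title, a scan of this shape — one `stat` per file under `_temp` — reproduces the cost profile. This is my reconstruction, not the hook's actual code; a temp demo directory stands in for `/__w/_temp`:

```shell
# Sketch of a find+stat disk-usage scan (a reconstruction, not the
# hook's actual command). Cost is one stat(2) per file, so it grows
# linearly with the number of files in the tree.
demo=$(mktemp -d)                      # stand-in for /__w/_temp
printf 'aaaa' > "$demo/a.txt"          # 4 bytes
mkdir -p "$demo/sub"
printf 'bbbbbb' > "$demo/sub/b.txt"    # 6 bytes
total=$(cd "$demo" && find . -type f -exec stat -c '%s' {} + \
        | awk '{t+=$1} END {print t+0}')
echo "$total"                          # sum of file sizes in bytes
rm -rf "$demo"
```

With ~14k files this means ~14k `stat` calls (plus the directory traversal) on every hook invocation, which matches the observed slowdown.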
This looks like a disk-usage / quota check for `novolume` mode, but it becomes very expensive when `_temp` is large.

### Size of `_temp` and I/O characteristics

In the workflow pod under `/__w/_temp`:

So `_temp` has ~300MB and ~14k files.

I/O on the underlying storage is actually quite fast (tested in the runner pod):
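The benchmark output is elided above; as an illustration, raw sequential write throughput can be checked with something like the following (generic commands, not the exact ones used in the original test):

```shell
# Generic sequential-write check (illustrative; not the original
# benchmark commands). Writes 64 MiB and fsyncs before reporting.
f=$(mktemp)
dd if=/dev/zero of="$f" bs=1M count=64 conv=fsync 2>/dev/null
written=$(stat -c '%s' "$f")           # bytes on disk after fsync
echo "$written"
rm -f "$f"
```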
So the underlying disk is not the bottleneck. The expensive part is the repeated `find`/`stat` over a large `_temp` tree.

### How `actions/setup-go` fills `_temp`

The Go setup step logs:
So `actions/setup-go@v5`:

- downloads the Go archive into `/__w/_temp/<guid>/...`,
- installs the toolchain under `/__w/_tool/go/1.21.8/x64`, and
- leaves files in `_temp` as part of its normal operation.

This explains why `_temp` grows significantly during the job.

### Key experiment: deleting the Go setup tmp dir under `_temp` makes the job fast

While a job was in a "slow" phase (around upload-artifact / test-reporter, with `index.js` active), I tried manually cleaning `_temp` in both pods.

#### In the runner pod
#### In the workflow pod
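The cleanup commands themselves are elided above; their general shape was to empty `_temp` without removing the directory itself, roughly like this (a sketch, with a temp demo directory standing in for `/__w/_temp`):

```shell
# Sketch of the manual-cleanup workaround: delete everything inside the
# directory but keep the directory itself (demo dir stands in for
# /__w/_temp; run the real thing with care, it deletes hook temp data).
tmp=$(mktemp -d)
mkdir -p "$tmp/sub"
touch "$tmp/a" "$tmp/sub/b"
find "$tmp" -mindepth 1 -delete        # empty the tree, keep $tmp
left=$(find "$tmp" -mindepth 1 | wc -l)
echo "$left"                           # remaining entries
rm -rf "$tmp"
```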
After doing this:
So:

- `_temp` at ~300MB / ~14k files → `index.js` spends a long time scanning `_temp` with `find`/`stat`.
- `_temp` emptied → the same hooks complete very quickly and the job runtime drops dramatically.

This strongly suggests the disk-usage scan in `_temp` is the main bottleneck in `novolume` mode for this scenario.

### Workflow action (for context)
The main composite action used for tests:
### Questions / Requests
Is it expected that the `k8s-novolume` hooks (v0.13.0) run the `find`/`stat` scan described above on every hook invocation?

Is there any configuration / environment variable to:
Since `actions/setup-go` and other actions naturally populate `_temp` with many files, this scan becomes a major bottleneck in `novolume` mode.

Is there a recommended way to:

- keep the disk-usage scan away from `_temp`, or
- split `_temp` and `_work` onto different mounts/paths specifically for the hook logic?

Right now, the only reliable workaround I have found is to manually delete the contents of `_temp` in both pods during the job, which immediately speeds up the workflow; there is probably a "copy" of that temp dir kept in sync between the runner pod and the workload pod. Obviously this is not ideal, so any guidance or improvements around the disk-usage checks in the `k8s-novolume` hooks would be very helpful.
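For context on why the scan is so sensitive to file count: a filesystem-level free-space check via `statfs` (what `df` uses) costs a single syscall regardless of how many files exist, whereas a `find`/`stat` walk is linear in the number of files. A minimal contrast, purely as illustration (not a proposed patch to the hooks):

```shell
# Illustration: a statfs-based check is O(1) in the number of files,
# unlike a per-file find/stat walk over the tree.
avail=$(df -B1 --output=avail /tmp | tail -n1 | tr -d ' ')
echo "$avail"                          # available bytes on /tmp's filesystem
```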