fix: hami-webui-be-oss startup panic due to fetchContainerInfo function #35

Merged: archlitchi merged 1 commit into Project-HAMi:main from chinaran:fix/priority-panic on May 9, 2025

@chinaran chinaran commented May 1, 2025

Contributor

(This might fix #31 as well.)
With HAMi-WebUI v1.0.5 and HAMi master (7ee5dce), hami-webui-be-oss panics on startup.
Judging from the logs and the code, the panic comes from the assignment of the Priority field in the fetchContainerInfo function.

E0430 20:33:25.844455       1 runtime.go:79] Observed a panic: runtime.boundsError{x:1, y:1, signed:true, code:0x0} (runtime error: index out of range [1] with length 1)
goroutine 86 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x1d3e560, 0xc000bfcdb0})
	/go/pkg/mod/k8s.io/apimachinery@v0.30.1/pkg/util/runtime/runtime.go:75 +0x85
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc000553e20?})
	/go/pkg/mod/k8s.io/apimachinery@v0.30.1/pkg/util/runtime/runtime.go:49 +0x6b
panic({0x1d3e560?, 0xc000bfcdb0?})
	/usr/local/go/src/runtime/panic.go:785 +0x132
vgpu/internal/data.(*podRepo).fetchContainerInfo(0xc000553e00, 0xc000b7a008)
	/src/internal/data/pod.go:145 +0x96b
vgpu/internal/data.(*podRepo).addPod(0xc000553e00, 0xc000b7a008, {0xc00036e320, 0xf}, 0xc000b09350)
	/src/internal/data/pod.go:93 +0x98
vgpu/internal/data.(*podRepo).onAddPod(0xc000553e00, {0x1e3b4a0?, 0xc000b7a008?})
	/src/internal/data/pod.go:70 +0x189

Take a pod with two containers as an example (the first uses GPU resources, the second does not).
Before the Project-HAMi/HAMi#1012 change, the pod's hami.io/vgpu-devices-allocated annotation was:
GPU-b61419d8-fa53-4c5f-124c-5b23ec39dfe4,NVIDIA,4096,50:;,,0,0:;
After the change, the annotation is:
GPU-b61419d8-fa53-4c5f-124c-5b23ec39dfe4,NVIDIA,4096,50:;;
As a result, the expression Priority: bizContainerDevices[i][0].Priority reads an element that does not exist in bizContainerDevices for the second container, which is consistent with the "index out of range [1] with length 1" error in the log above.
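To make the difference between the two annotation forms concrete, here is a minimal Go sketch that counts the device entries per container for each string. The helpers splitContainers and deviceEntries are hypothetical names written for this illustration; they are not HAMi's or HAMi-WebUI's actual decoder, which may treat separators differently.

```go
package main

import (
	"fmt"
	"strings"
)

// splitContainers is a hypothetical helper (not HAMi's decoder). It returns one
// segment per container and drops the empty string produced by the trailing ";".
func splitContainers(annotation string) []string {
	return strings.Split(strings.TrimSuffix(annotation, ";"), ";")
}

// deviceEntries is a hypothetical helper that returns the device entries of one
// container segment, skipping the empty strings left by the trailing ":".
func deviceEntries(segment string) []string {
	var devices []string
	for _, d := range strings.Split(segment, ":") {
		if d != "" {
			devices = append(devices, d)
		}
	}
	return devices
}

func main() {
	before := "GPU-b61419d8-fa53-4c5f-124c-5b23ec39dfe4,NVIDIA,4096,50:;,,0,0:;"
	after := "GPU-b61419d8-fa53-4c5f-124c-5b23ec39dfe4,NVIDIA,4096,50:;;"

	for _, annotation := range []string{before, after} {
		for i, segment := range splitContainers(annotation) {
			fmt.Printf("container %d: %d device entries\n", i, len(deviceEntries(segment)))
		}
	}
}
```

Under these assumptions, the old form gives the second container a placeholder entry (",,0,0"), while the new form gives it no entry at all, so any code that assumes at least one device per container has nothing to index.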

So I made a change that stays compatible with both forms of the hami.io/vgpu-devices-allocated annotation.
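One way to tolerate both annotation forms is to bounds-check before reading the first device of a container. The sketch below is an assumption about what such a guard could look like, not the code merged in this PR: containerDevice, its string Priority field, and priorityFor are all hypothetical stand-ins for the real types in internal/data/pod.go.

```go
package main

import "fmt"

// containerDevice is a hypothetical stand-in for the real device type.
// Only the Priority field matters here, and its string type is an assumption.
type containerDevice struct {
	Priority string
}

// priorityFor returns the priority of container i's first device. It returns
// fallback when container i has no device list or the list is empty, instead
// of indexing out of range.
func priorityFor(devices [][]containerDevice, i int, fallback string) string {
	if i < 0 || i >= len(devices) || len(devices[i]) == 0 {
		return fallback
	}
	return devices[i][0].Priority
}

func main() {
	// Mirrors the situation in the log: the pod has two containers,
	// but only one device list was decoded.
	devices := [][]containerDevice{{{Priority: "1"}}}

	for i := 0; i < 2; i++ {
		fmt.Printf("container %d priority: %q\n", i, priorityFor(devices, i, ""))
	}
}
```

The guard covers both a missing outer element (new annotation form) and an empty inner list, so the caller does not need to know which form produced the data.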

Signed-off-by: 王然 <ranwang@alauda.io>

chinaran commented May 1, 2025

Contributor Author

Also, do the ; that separates containers and the : that separates devices need to be appended after the last item?
With a trailing separator, strings.Split() returns one more element than the actual number of items.
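The count mismatch is easy to reproduce with the standard library alone. In this small sketch, countSplit and countTrimmed are hypothetical helper names used only for the demonstration; the second one shows one possible way to get the real item count by trimming a single trailing separator first.

```go
package main

import (
	"fmt"
	"strings"
)

// countSplit counts the elements strings.Split returns, including the
// empty string that follows a trailing separator.
func countSplit(s, sep string) int {
	return len(strings.Split(s, sep))
}

// countTrimmed removes one trailing separator before splitting, so the
// result matches the number of items. Empty items in the middle are kept.
func countTrimmed(s, sep string) int {
	return len(strings.Split(strings.TrimSuffix(s, sep), sep))
}

func main() {
	s := "a;b;" // two items, each followed by a separator

	fmt.Println(countSplit(s, ";"))   // prints 3
	fmt.Println(countTrimmed(s, ";")) // prints 2
}
```

Trimming only one trailing separator, rather than filtering out every empty element, keeps empty positions in the middle of the string, which matters when an empty entry stands for a container without devices.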

@archlitchi archlitchi merged commit c43785b into Project-HAMi:main May 9, 2025
1 check passed
Development

Successfully merging this pull request may close these issues.

Hami-webui stuck in crashloopback.
