bump github.com/NVIDIA/k8s-device-plugin to v0.15.0 #298

Merged: hami-robott[bot] merged 1 commit into Project-HAMi:master from morlay:k8s-device-plugin-upgrade on Jul 25, 2024

morlay commented May 6, 2024

Contributor

What type of PR is this?

What this PR does / why we need it:

Bumps the dependency to support WSL (see NVIDIA/k8s-device-plugin#646).

Which issue(s) this PR fixes:
No

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

github-actions Bot commented May 6, 2024

Hi @morlay,
Thanks for your pull request!
If the PR is ready, use the /auto-cc command to assign Reviewer to Review.
We will review it shortly.

morlay commented May 6, 2024

Contributor Author

/auto-cc

github-actions Bot requested review from archlitchi and wawa0210 on May 6, 2024 06:55

wawa0210 commented May 7, 2024

Member

Thanks for the contribution. Can you upload the test results you ran after building the new version?

HAMi does not yet have a complete e2e test system, though one is planned.

morlay commented May 7, 2024

Contributor Author

> Thanks for the contribution. Can you upload the test results you ran after building the new version?

It works well with my custom image build (a multi-arch image).

image

WSL has not been tested.

I will test arm64 when a device is ready.

By the way, the v2.3.10 monitor container panics with the error below, so I based my patch on v2.3.9:

I0507 02:42:26.842928 3374747 metrics.go:324] Initializing metrics for vGPUmonitor
I0507 02:42:26.843779 3374747 pathmonitor.go:159] server listening at [::]:9395
I0507 02:42:31.914854 3374747 pathmonitor.go:126] Adding ctr dirname /usr/local/vgpu/containers/2c9e5e40-317d-4daf-8b24-14bff598fb3b_svc in monitorpath
I0507 02:42:31.914885 3374747 pathmonitor.go:56] Checking path /usr/local/vgpu/containers/2c9e5e40-317d-4daf-8b24-14bff598fb3b_svc
unexpected fault address 0x7f0cb036073c
fatal error: fault
[signal SIGBUS: bus error code=0x2 addr=0x7f0cb036073c pc=0x16219b1]

goroutine 75 [running]:
runtime.throw({0x1a7037b?, 0x7f0cb023c000?})
	/usr/local/go/src/runtime/panic.go:1077 +0x5c fp=0xc00046db48 sp=0xc00046db18 pc=0x448cfc
runtime.sigpanic()
	/usr/local/go/src/runtime/signal_unix.go:858 +0x116 fp=0xc00046dba8 sp=0xc00046db48 pc=0x45ee36
main.mmapcachefile({0xc0000261c0, 0x6e}, 0xc00046dd68)
	/k8s-vgpu/cmd/vGPUmonitor/cudevshr.go:146 +0x151 fp=0xc00046dc90 sp=0xc00046dba8 pc=0x16219b1
main.getvGPUMemoryInfo(0xc00046dd68)
	/k8s-vgpu/cmd/vGPUmonitor/cudevshr.go:154 +0x31 fp=0xc00046dcc0 sp=0xc00046dc90 pc=0x1621b51
main.checkfiles({0xc0008895e0, 0x43})
	/k8s-vgpu/cmd/vGPUmonitor/pathmonitor.go:79 +0x23b fp=0xc00046dd98 sp=0xc00046dcc0 pc=0x16255bb
main.monitorpath(0x12a05f200?)
	/k8s-vgpu/cmd/vGPUmonitor/pathmonitor.go:127 +0x3ba fp=0xc00046df60 sp=0xc00046dd98 pc=0x1625cba
main.watchAndFeedback()
	/k8s-vgpu/cmd/vGPUmonitor/feedback.go:277 +0x91 fp=0xc00046dfe0 sp=0xc00046df60 pc=0x16229f1
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc00046dfe8 sp=0xc00046dfe0 pc=0x47afa1
created by main.main in goroutine 1
	/k8s-vgpu/cmd/vGPUmonitor/main.go:36 +0x137

@wawa0210

Member

/lgtm

/cc @archlitchi

github-actions Bot added the lgtm label May 10, 2024
@wawa0210

Member

/approve

hami-robott Bot added the approved label May 17, 2024
@wawa0210

Member

/approve cancel

hami-robott Bot removed the approved label May 17, 2024
@wawa0210

Member

/approve

hami-robott Bot commented May 17, 2024

Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: morlay, wawa0210

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

hami-robott Bot added the approved label May 17, 2024
wawa0210 removed the request for review from archlitchi May 17, 2024 02:11
@calvin0327

Contributor

/kind cleanup

morlay force-pushed the k8s-device-plugin-upgrade branch from 612d4ec to da17368 on May 31, 2024 10:46
@archlitchi

Member

Please resolve the conflicting files, and then we are ready to merge.

Signed-off-by: Morlay <morlay.null@gmail.com>
morlay force-pushed the k8s-device-plugin-upgrade branch from da17368 to b36835b on July 3, 2024 01:40

morlay commented Jul 3, 2024

Contributor Author

@archlitchi done.

@archlitchi

Member

/lgtm

hami-robott Bot added the lgtm label Jul 25, 2024
hami-robott Bot merged commit 92acc44 into Project-HAMi:master Jul 25, 2024