Added submit_keep_args_alive #1395
View rendered docs @ https://intelpython.github.io/dpctl/pulls/1395/index.html
Array API standard conformance tests for dpctl=0.14.6dev5=py310ha25a700_9 ran successfully.
This is an adaptation of the pipelining technique shared by @mbecker in https://github.com/IntelPython/numbda_dpex/issues/147. It is built to work with the async-ref-count-increment branch (IntelPython/dpctl#1395), which implements asynchronous memcpy, asynchronous submit, and asynchronous keep_args_alive task submission.
Usage:

    q = dpctl.SyclQueue()
    ...
    e = q.submit(krn, args, ranges)
    ht_e = q._submit_keep_args_alive(args, [e])
    ....
    ht_e.wait()
Lifetime management of Python objects is instead delegated to the user via the `_submit_keep_args_alive` method.
`SyclQueue.submit` has become synchronizing, although it still returns a `SyclEvent` (whose `execution_status` is always complete).
This is the copy operation for which one can specify a list of events that must complete before the copy starts executing.

    DPCTLQueue_MemcpyWithEvents(
        __dpctl_keep DPCTLSyclQueueRef QRef,
        void *dst,
        const void *src,
        size_t nbytes,
        const DPCTLSyclEventRef *depEvents,
        size_t nDE
    )

The function is used in the tests.
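As a rough host-only analogy of a copy with dependent events (this is not the dpctl or SYCL implementation), the sketch below starts a copy only after every event in a dependency list has been signalled. The function name `memcpy_with_events` and the use of `threading.Event` are assumptions made for illustration.

```python
import threading


def memcpy_with_events(dst, src, nbytes, dep_events):
    """Hypothetical analogue: copy nbytes only after all dep_events are set."""
    done = threading.Event()

    def task():
        # Block until every dependency has signalled completion.
        for ev in dep_events:
            ev.wait()
        dst[:nbytes] = src[:nbytes]
        done.set()

    threading.Thread(target=task, daemon=True).start()
    return done  # plays the role of the event returned by the copy


src = bytearray(b"abcdefgh")
dst = bytearray(8)
producer_done = threading.Event()

copy_done = memcpy_with_events(dst, src, 4, [producer_done])
finished_early = copy_done.is_set()  # dependency is not signalled yet

producer_done.set()  # dependency completes, the copy may now run
copy_done.wait()
```

The caller gets an event back immediately and decides when to wait on it, which is what lets several such operations be chained into a pipeline.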
Also extends `dpctl.SyclQueue.memcpy` to accept arguments that expose the buffer protocol, so that `dpctl.SyclQueue.memcpy` and `dpctl.SyclQueue.memcpy_async` can be used to copy from/to either a USM allocation or a host buffer.
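For readers unfamiliar with the buffer protocol, here is a minimal host-only sketch of what accepting "any object that exposes the buffer protocol" means. It uses only the standard library; `host_memcpy` is a hypothetical helper and does not reflect how dpctl implements the copy.

```python
import array


def host_memcpy(dst, src, nbytes):
    """Copy nbytes between two objects exposing the buffer protocol."""
    # Reinterpret both buffers as flat sequences of unsigned bytes.
    dst_view = memoryview(dst).cast("B")
    src_view = memoryview(src).cast("B")
    if nbytes > len(dst_view) or nbytes > len(src_view):
        raise ValueError("nbytes exceeds the size of a buffer")
    dst_view[:nbytes] = src_view[:nbytes]


src = array.array("i", [1, 2, 3, 4])      # typed host buffer
dst = bytearray(src.itemsize * len(src))  # raw writable host buffer
host_memcpy(dst, src, len(dst))

# Reinterpret the copied bytes as integers to check the round trip.
roundtrip = array.array("i")
roundtrip.frombytes(bytes(dst))
```

The same call works for `bytes`, `bytearray`, `array.array`, NumPy arrays, and other exporters, which is the convenience the extended `memcpy` is after.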
```
In [9]: timer = dpctl.SyclTimer()
In [10]: with timer(q):
    ...:     y = dpt.linspace(1, 2, num=10**6, sycl_queue=q)
    ...:
In [11]: timer.dt
Out[11]: (0.0022024469999450957, 0.002116712)
In [12]: with timer(q):
    ...:     x = dpt.linspace(0, 1, num=10**6, sycl_queue=q)
    ...:
In [13]: timer.dt
Out[13]: (0.004531950999989931, 0.004239664000000001)
```
The object can be unpacked into a tuple, as before, but it prints
with an annotation of what each number means, and provides named
getters.

    with timer(q):
        code
    dur = timer.dt
    print(dur)      # outputs (host_dt=..., device_dt=...)
    dur.host_dt     # get host-timer delta
    dur.device_dt   # get device-timer delta
    hdt, ddt = dur  # unpack into a tuple
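One way to get an object with these three properties (tuple unpacking, annotated printing, named getters) is a named tuple with a custom `__repr__`. The sketch below is a standalone illustration, not the code in this PR, and the class name `HostDeviceDuration` is used here purely for illustration.

```python
from typing import NamedTuple


class HostDeviceDuration(NamedTuple):
    """Illustrative pair of timings that still behaves like a tuple."""

    host_dt: float
    device_dt: float

    def __repr__(self):
        # Annotate each number so the printed value is self-describing.
        return f"(host_dt={self.host_dt}, device_dt={self.device_dt})"


dur = HostDeviceDuration(0.0022, 0.0021)
hdt, ddt = dur    # tuple unpacking keeps working
text = repr(dur)  # "(host_dt=0.0022, device_dt=0.0021)"
```

Because the result is still a tuple, existing code that wrote `hdt, ddt = timer.dt` would keep working under this design.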
Added `dpctl.SyclQueue._submit_keep_args_alive(args, events)` that increments the reference count of the `args` object (typically a sequence of arguments an asynchronous task is operating on), ensuring that the `args` object is not garbage collected until after `events` signal that the tasks working on these objects have completed their execution.
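The keep-alive idea can be pictured with a host-only sketch. It assumes CPython reference counting and uses a thread in place of a SYCL host task. The helper `submit_keep_args_alive` and the `Payload` class below are hypothetical stand-ins, not dpctl code.

```python
import gc
import threading
import weakref


class Payload:
    """Stand-in for a kernel argument; a plain class supports weak references."""


def submit_keep_args_alive(args, events):
    """Hold a strong reference to args until every event has been set."""

    def host_task(held_args, deps):
        for ev in deps:
            ev.wait()
        # held_args is released when this function returns.

    t = threading.Thread(target=host_task, args=(args, events), daemon=True)
    t.start()
    return t  # join() plays the role of ht_e.wait()


payload = Payload()
ref = weakref.ref(payload)  # observe the lifetime without extending it
kernel_done = threading.Event()

ht = submit_keep_args_alive([payload], [kernel_done])
del payload  # the caller drops its own reference
alive_while_pending = ref() is not None

kernel_done.set()  # the "kernel" finishes
ht.join()
gc.collect()
alive_after = ref() is not None
```

In this sketch the argument survives the caller's `del` while the task is pending and becomes collectable once the dependency has fired and the helper task has finished.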