v0.16.1
This release includes bug fixes and provides a change needed by the `numba_dpex` project to support dispatching kernels
that consume instances of the `sycl::local_accessor` template type.
Changed
- Changed behavior of the `dpctl.tensor.usm_ndarray.__dlpack_device__` method to return the device id of the parent unpartitioned device if the array is allocated on a sub-device, instead of raising an exception: #1604
- Array creation functions and the `usm_ndarray` constructor in the `dpctl.tensor` submodule now use a cached default-selected device to improve performance: #1606
- Changed treatment of the `axis` keyword for `dpctl.tensor.tensordot` and `dpctl.tensor.vecdot` to align with the Python Array API 2023.12 specification: #1608
- Changed implementation of `DPCTLQueue_SubmitRange` and `DPCTLQueue_SubmitNDRange` in the DPCTLSyclInterface library to support `sycl::local_accessor` arguments needed by `numba_dpex`; also changed the enum `DPCTLKernelArgType` to correspond to disjoint C++ types: #1609, #1611, #1612
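
As an illustration of the `axis` change for `vecdot`, here is a minimal sketch of the rule as we read it in the Array API 2023.12 specification: `axis` is a negative integer in `[-N, -1]`, where `N` is the smaller of the two input ranks, and it is counted backward from the last dimension. The sketch uses NumPy as a stand-in so that it runs without a SYCL device. The helper name `vecdot_sketch` is hypothetical, and raising `ValueError` for an out-of-range `axis` is a choice made for this sketch, not a statement about what `dpctl.tensor.vecdot` does.

```python
import numpy as np


def vecdot_sketch(x1, x2, axis=-1):
    """Illustrative stand-in for the 2023.12 `vecdot` axis rule (not dpctl code)."""
    x1 = np.asarray(x1)
    x2 = np.asarray(x2)
    # N is the smaller rank; the axis must lie in [-N, -1].
    n = min(x1.ndim, x2.ndim)
    if not (-n <= axis <= -1):
        raise ValueError(f"axis must be in [-{n}, -1], got {axis}")
    # Counting from the end, move the contracted axis of each input last,
    # then let broadcasting align the remaining (batch) dimensions.
    a = np.moveaxis(x1, axis, -1)
    b = np.moveaxis(x2, axis, -1)
    # The specification conjugates the first argument for complex inputs.
    return np.sum(np.conj(a) * b, axis=-1)


x1 = np.array([[1, 2, 3], [4, 5, 6]])
x2 = np.array([1, 0, 1])
# Contract over the last axis; x2 is broadcast against each row of x1.
result = vecdot_sketch(x1, x2, axis=-1)  # row-wise dot products: 4 and 10
```

With a working dpctl installation, the corresponding call would presumably be `dpctl.tensor.vecdot(x1, x2, axis=-1)` on `usm_ndarray` inputs.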
Fixed
- Fixed a crash on the Windows platform during execution of the getter of the `dpctl.SyclPlatform.default_context` property: #1604
- Fixed a kernel submission error on NVIDIA CUDA GPUs during the `dpctl.tensor.matmul` operation: #1605
- Fixed corruption of context cache table entries: #1607
- Fixed incorrect result from `dpctl.tensor.tensordot` reported in issue #1570: #1608
- Fixed output of `python -m dpctl --library` so that it specifies the correct library name: #1615