
Release v22.10.00 #652

Merged
marcinz merged 82 commits into main from branch-22.10
Oct 12, 2022

Conversation

Collaborator

@marcinz marcinz commented Oct 11, 2022

No description provided.

ipdemes and others added 30 commits August 3, 2022 19:48
…nced indexing (#486)

* fixing logic for some advanced_indexing test cases

* Reformatting of new testcases by @bryevdv

* Add new required test packages to conda env files

Co-authored-by: Manolis Papadakis <manopapad@gmail.com>
* Refactor test driver for cpu/gpu sharding

* fix -cunumeric:test

* Add system info to top-level banner

* make some methods functions for easier testing

* add --debug to CPU jobs

* don't special case verbose mode

* add debug output to all jobs

Co-authored-by: Manolis Papadakis <manopapad@gmail.com>
Conda packages now build with support for curand both in the CPU and the GPU builds.

Co-authored-by: Marcin Zalewski <mzalewski@nvidia.com>
)

* report times in summary lines

* fix typo

* Add an overall test suite summary

* defer test output until test completion

* remove -j 1 argument to test.sh

* try bloat factor = 1.25

* fix default fbsize and bloat factor

* specify fbmem in MB
* Unify the template for device reduction tree and do some cleanup

* Fix performance bugs in scalar reduction kernels:

* Use unsigned 64-bit integers instead of signed integers wherever
  possible; CUDA hasn't added an atomic intrinsic for the latter yet.

* Move reduction buffers from zero-copy memory to framebuffer. This
  makes the slow atomic update code path in reduction operators
  run much more efficiently.

* Use the new scalar reduction buffer in binary reductions as well

* Use only the RHS type in the reduction buffer as we never call apply

* Minor clean up per review

* Rename the buffer class and method to make the intent explicit

* Flip the polarity of reduce's template parameter
Pre-commit update and the necessary fixes.

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Marcin Zalewski <mzalewski@nvidia.com>
updates:
- [github.com/PyCQA/flake8: 5.0.2 → 5.0.4](PyCQA/flake8@5.0.2...5.0.4)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Fix for an off-by-one bug
* Shared memory size had not been passed to the kernel launch
* Ensure test.py --use flag fully overrides USE_* envvars

* Update a test-tools unit test

Co-authored-by: Manolis Papadakis <mpapadakis@nvidia.com>
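
The override rule in the `--use` commit above can be sketched as follows. This is a minimal illustration, not the test driver's code: `resolve_features`, the `FEATURES` tuple, the `USE_<NAME>=1` convention, and the CPU default are all assumptions made for the example.

```python
from typing import List, Mapping, Optional, Sequence

# Hypothetical feature names; the real test driver may use a different set.
FEATURES = ("cpus", "cuda", "openmp", "eager")


def resolve_features(
    use_flag: Optional[Sequence[str]], env: Mapping[str, str]
) -> List[str]:
    """Return enabled features, letting --use fully override USE_* envvars."""
    if use_flag is not None:
        # An explicit --use wins outright: USE_* variables are not consulted.
        return [f for f in FEATURES if f in use_flag]
    # Otherwise fall back to USE_<FEATURE>=1 environment variables.
    enabled = [f for f in FEATURES if env.get(f"USE_{f.upper()}") == "1"]
    # Assumption for this sketch: CPU is the default when nothing is selected.
    return enabled or ["cpus"]
```

The point of the design is that the flag replaces the environment rather than merging with it, so a stray `USE_CUDA=1` in a shell cannot silently widen an explicit `--use cpus` run.
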
* Enhance two integration tests

Enhance test_append and test_array_creation
1.  add negative tests
2.  add more test cases
3.  refactor test code

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Address comments
1. Create test class for negative testing
2. Refactor out test functions
3. Use parameterize

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Address comments - part2
1. update run_test name to check_array_method
2. use parameterize for step zero cases of arange

* Address comments - Part 3
1. add pytest.mark.xfail for cases with expected failure
2. Small Fix: replace Assert with raising ValueError in deferred.py

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Address comments - fix a typo

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
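
As a rough illustration of the "replace Assert with raising ValueError" item above: the source does not say which check in deferred.py was changed, so `check_step` and its zero-step condition are hypothetical. The general motivation is that `assert` statements are removed when Python runs with `-O`, while an explicit `raise` always executes and can be matched by `pytest.raises`.

```python
def check_step(step: int) -> int:
    """Validate user input with an exception rather than an assert."""
    if step == 0:
        # Unlike `assert step != 0`, this check survives `python -O`.
        raise ValueError("step must not be zero")
    return step
```
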
* checkpoint array

* Clean up cunumeric.tile

* Disallow kind=None on cn.ndarray.argsort, to match cn.argsort

* Avoid a cast

* Update a docstring to match the inferred type signature

* Fix handling of out=cn.ndarray in clip

* add new missing types

* Values of type CastingKind should never be None

Values of this type are eventually fed to np.can_cast, which
doesn't accept None.

* Use np.ndarray.tobytes over the deprecated tostring

* Minor fixes

* Don't compare dtypes with `is`, but with ==

Doing the former can result in unexpected behavior, in the common case
where one value is a proper np.dtype object, while the other is something
that is not itself a np.dtype, but something convertible to it:

>>> np.dtype(np.int64) is np.int64
False
>>> np.dtype(np.int64) == np.int64
True

Some of the uses in the modified code were actually safe, because a
pre-existing array's dtype is always wrapped in a np.dtype object,
but I changed them too, for the sake of consistency.

* No need to call dtype.type if using ==

* Fix dtype= handling in _diag_helper

* Copy NumPy's type signature for an unimplemented function

It doesn't matter now, but we might have forgotten to change it to the more general
signature when we got around to implementing this.

Co-authored-by: Manolis Papadakis <manopapad@gmail.com>
* Update test runner for osx

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* lint

* fix up tests, simplify manager creation

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Don't blindly trust user-supplied bincount.minlength

* Change parameter name to match docstring
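
The commit above does not describe the failure it fixes. For context, here is a pure-Python sketch of the sizing rule NumPy documents for `bincount`, where `minlength` is only a lower bound on the output length. The function below is a reference model written for this note, not cuNumeric's implementation.

```python
from typing import List, Sequence


def bincount(values: Sequence[int], minlength: int = 0) -> List[int]:
    """Pure-Python reference for np.bincount's output sizing."""
    if minlength < 0:
        raise ValueError("minlength must not be negative")
    if any(v < 0 for v in values):
        raise ValueError("values must be non-negative")
    # minlength is only a lower bound: size the output from the data too.
    size = max(minlength, max(values, default=-1) + 1)
    counts = [0] * size
    for v in values:
        counts[v] += 1
    return counts
```

Sizing the output from `minlength` alone would write out of bounds whenever the data contain a value at or above `minlength`.
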
This is working around some recent changes to OpenBLAS. Previously we were using
the internal names for functions, e.g. "spotrf_". OpenBLAS changed the
definitions of these internal functions, so in a previous PR we switched to
using the public functions, e.g. "LAPACK_spotrf". These used to be function
symbols, but in the latest update OpenBLAS changed these to be macros.
…tion (#467) (#537)

* fix reciprocal tests and add unary test customization

* make tests deterministic and enforce root inputs are non-negative

Co-authored-by: Jeremy <jjwilke@users.noreply.github.com>
* Refactor test runner to support more pinning options

* add --gpu-delay option
* Make the validation condition for random distributions lenient

* Fix typo

* Catch too small standard variations against theoretical values as well

* Replace unnecessary NumPy calls with Python primitives

* Tighten the tolerance
…astype` (#549)

* Fix buggy complex-to-bool conversions and add correctness tests for np.astype

* Typo

* Fix the bug in the eager implementation as well
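
The commit above does not say how the conversion was wrong. The expected rule is that a complex value converts to True when either component is nonzero, which is also what Python's built-in `bool` does. The sketch below contrasts that rule with a hypothetical real-part-only conversion, shown purely as an example of the kind of bug a correctness test would catch.

```python
def complex_to_bool(z: complex) -> bool:
    """A complex value is truthy when either component is nonzero."""
    return z.real != 0 or z.imag != 0


def real_part_only(z: complex) -> bool:
    """Hypothetical buggy variant: drops the imaginary part before testing."""
    return z.real != 0


samples = [0j, 1 + 0j, 1j, -2.5j]
expected = [bool(z) for z in samples]  # Python's own rule for complex values
```

Purely imaginary inputs such as `1j` are the cases where the two functions disagree.
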
* src/cunumeric: handle high number of bins in GPU bincount

The existing bincount implementation on GPUs attempts to allocate
a workspace for all bins within the shared memory available on each
SM. This commit updates the implementation to fall back to a slower
kernel that reduces to global memory when there are too many bins to
fit into shared memory.

Fixes #503.

* src/cunumeric/stat: fix launch parameters in bincount kernels

* tests/integration: refactor test_bincount.py for better readability

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: Rohan Yadav <rohany@cs.stanford.edu>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
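
The fallback described in the GPU bincount commit above comes down to a capacity check at launch time. The sketch below is only a host-side model in Python: `choose_bincount_kernel`, its parameters, and the kernel labels are hypothetical, and the actual kernels live in the C++/CUDA sources.

```python
def choose_bincount_kernel(num_bins: int, bin_bytes: int, shmem_bytes: int) -> str:
    """Pick a kernel variant based on whether all bins fit in shared memory."""
    if num_bins * bin_bytes <= shmem_bytes:
        # Fast path: each block accumulates a private histogram in shared memory.
        return "shared"
    # Slow path: too many bins, so reduce directly into global memory.
    return "global"
```

For example, with an assumed 48 KiB shared-memory budget and 8-byte bins, 1,024 bins (8 KiB) take the fast path while 10,000 bins (about 78 KiB) fall back to global memory.
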
fixing advanced indexing operation for empty arrays
…esults can diverge from numpy (#528)

* Added note to prefix documentation for corner cases where cunumeric results can diverge from numpy. Also other minor fixes to prefix documentation.

* Minor changes to documentation phrasing.
…gion fields (#551)

* Handle inline allocations from 0D stores correctly (spoiler: they are not 0D)

* Add a test case for 0D region-backed stores
jjwilke and others added 24 commits September 21, 2022 09:43
…ckage installation (#514)

* add initial CMake build

* fix compile error

* point to nv-legate repo

* use realm_defines and legion_defines from the build dir if it's defined

* update version

* guard against RealmRuntime and LegionRuntime targets not existing

* fix version number

* fully support building without CUDA and OpenMP, detect support for both from legate_core target

* use compiler cache to speed up tblis builds

* toggle tblis openmp via CUNUMERIC_USE_OPENMP

* print messages for CI

* adjust -isystem flag to support clangd

* Toggle CUDA, OpenMP, and bounds checking based on the found legate.core package's config

* print message when legate_core is found

* fix typo

* use CMAKE_SHARED_LIBRARY_SUFFIX for tblis shared library

* remove dot

* handle case where build_shared_libs is off

* Speed up FetchContent_Populate by downloading a tarball (if possible) instead of cloning

* cleanup

* make required CMake version match conda-forge's CMake

* Use CPM to find or build OpenBLAS

* only create alias targets if OpenBLAS was added

* initial commit of CMake-based install.py

* make install-2.py work with legate build dirs

* place libraries in build/lib

* add target to preprocess cunumeric_c.h for use with Python CFFI

* ignore install dir

* use the preprocessed cunumeric_c.h.i generated by CMake instead of doing it in Python

* remove unused vars

* fix gitlab archive URI for branches with slashes in the name

* update rapids-cmake version

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add scikit-build

* make install.py call pip install .

* fix lint

* remove debugging lines

* clean up

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix default legate branch

* assume legion_core is already installed in build-install.sh example script

* export LIBRARY_PATH if not set

* resolve relative path in build scripts

* formatting

* apply Bryan's fixes for tests

* don't use defaults

* fix lint

* use Readline so tab completion works

* set CMAKE_BUILD_PARALLEL_LEVEL

* build tblis on cmake --build instead of cmake configure

* fix get_libpath

* fix separate tblis configure/build stages to correctly link to libtblis.so

* use add_custom_command so tblis isn't always rebuilt

* use my legate.core branch temporarily in CI

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* default branch and url in install.py temporarily

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* set optimization level -O2

* ensure CUDA architectures are detected correctly

* add searchsorted sources

* fix typos

* install tblis if we built it

* clean out tblis lib and include dirs

* use --upgrade instead of --force-install

* remove todo

* find exact legate_core and cunumeric package versions

* do pip install --upgrade if not editable

* set REQUIRED if legate_core_ROOT is defined

* Update conda recipe to use CMake

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add new source files

* fix bad merge

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* test legate_core_DIR/ROOT for truthy-ness

* fix lint

* add back in optional --legate argument to test.py

* fix legate_path to be str instead of Path

* fix lint

* move make/cmake/ninja to build requirements

* add build and runtime dependencies to dev conda envs

* fix lint

* remove legion_helpers.cmake

* Enable using tblis_ROOT to find external tblis installations

* add build and install export sets to rapids_cpm_find

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add separate build scripts to build with/without prebuilt legate.core

* update cunumeric_cpp.cmake for new files

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix bad merge

* export tblis_BINARY_DIR to PARENT_SCOPE

* do not reference undefined env "_" in tests

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix tblis flag in install.py

* fix flake8 issues on test_patch

* mypy fixes

* update conda-build/build.sh

* add initial build directions

* formatting fixes

* more build information

* add conda directions

* exclude legate files from mypy again

* update default legate core branch and repos

* ensure sccache is used in conda build

* allow sccache envvars from external environment

* ensure SETUPTOOLS_ENABLE_FEATURES is set to "legacy-editable"

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* translate gpu name to cuda architectures

* remove unnecessary cmake define

* link to curand

* fix install.py --with-core arg

* Apply suggestions from code review

* fix gitlab tgz urls

* Apply suggestions from code review

* don't link curand

* use if(POLICY)

* enable cmake policy 0135

* add extra target to update build.ninja mtime so rebuilding doesn't re-run CMake

* remove easy-install.pth

* ensure libcunumeric.so is found if installed into a non-standard install location

* better handle --prefix flag, remove --python-only flag

* infer legate_dir from an existing legate.core python install (including editable installs) when the user omits the --with-core flag

* don't remove easy-install.pth

* mirror flags in legate.core example build scripts

* add argwhere sources

* update mypy paths to ignore new location of install_info

* ensure build dir is cleaned if the value of --build-isolation is different from last time we built

* cmake cleanup

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add --max_dim --max_fields --spy --openmp --llvm --hdf --gasnet --gasnet_dir and --conduit flags in case cunumeric builds legate_core instead of finding it

* define CUDAHOSTCXX envvar

* define flags for debug and minsizerel build types

* update package version

* parse BUILD_MARCH and/or BUILD_MCPU configuration flags

* add openmpi to conda envs

* use correct dynamic library extension for other OS's

* fix typo

* add py.typed for mypy, fix typings

* add wrap to sources list

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove unused # type: ignore comment

* Update get_legate_core.cmake

* Update install.py

* Update install.py

Co-authored-by: ptaylor <paul.e.taylor@me.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Bryan Van de Ven <bryan@bokeh.org>
* Enhance test_block.py and test_eye.py

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix case for list in check/compare methods.

* Fix typos.

* Fix another typo.

* Address comments
only use pytest.raises to handle exceptions

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Address comments

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* add negative test case

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add negative test case

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add negative test case

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* correct bugs by upgrading the code path for the array_split function

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* correct bugs by upgrading the code path for the array_split function

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update the code path for the array_split function

* add negative test case in test_array_split.py

* add negative test case in test_array_split.py

* add testcase for test_array_split.py

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add test case for test_array_split.py

* add test cases for test_flip and test_indices

* fix Eager execution test error

* add test case for test_flip.py and test_indices.py

* add test cases for test_fill.py and test_ndim.py

* add test cases for test_fill.py and test_ndim.py

* add test cases for test_fill.py and test_ndim.py

* add test cases for test_fill.py and test_ndim.py

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Remove unneeded dependency on curand in conda build.

Co-authored-by: Marcin Zalewski <mzalewski@nvidia.com>
Label checking delay is set to 5 minutes.

Co-authored-by: Marcin Zalewski <mzalewski@nvidia.com>
Co-authored-by: Manolis Papadakis <mpapadakis@nvidia.com>
* Provenance tracking for cuNumeric operators

* Use decorators for provenance tracking

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Filter out legate frames in the logic finding the last user frame

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
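
The frame filtering mentioned in the provenance commit above can be sketched like this. It is an illustration under assumptions: `last_user_frame`, the `Frame` tuple, and the path markers are hypothetical and do not come from the cuNumeric sources.

```python
from typing import Optional, Sequence, Tuple

Frame = Tuple[str, int]  # (filename, line number), outermost call first


def last_user_frame(
    stack: Sequence[Frame], internal_markers: Sequence[str]
) -> Optional[Frame]:
    """Return the innermost frame that does not belong to library code."""
    for filename, lineno in reversed(stack):
        if not any(marker in filename for marker in internal_markers):
            return (filename, lineno)
    return None
```

In practice the stack could be built from `traceback.extract_stack()`, taking `filename` and `lineno` from each entry, with markers such as `"/cunumeric/"` and `"/legate/"` so that the reported provenance points at the user's call site.
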
* Invoke eye with read-write privilege, not write-discard

We cannot create tight region requirements, that include just the diagonal we
are writing, so necessarily there will be elements in the regions we pass to the
eye call whose values must remain. Write-discard privilege, then, is not
appropriate for this call, as it essentially tells the runtime that it can throw
away the previous contents of the entire region.

* Add a comment explaining the eye task privilege
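
A toy model of the privilege argument in the eye commit above. This is not the Legion or Legate API: `run_eye` and the string privilege names are hypothetical, and "discarded" contents are modelled as `None` to make the lost values visible.

```python
from typing import List, Optional

Matrix = List[List[Optional[float]]]


def run_eye(region: Matrix, privilege: str) -> Matrix:
    """Toy model: write ones on the diagonal under the given privilege."""
    if privilege == "write-discard":
        # The runtime may drop prior contents; model that as unknown (None).
        out: Matrix = [[None] * len(row) for row in region]
    elif privilege == "read-write":
        # Prior contents are preserved for elements the task does not touch.
        out = [list(row) for row in region]
    else:
        raise ValueError(f"unknown privilege: {privilege}")
    n = min(len(out), len(out[0])) if out else 0
    for i in range(n):
        out[i][i] = 1.0
    return out
```

Because the task writes only the diagonal, the off-diagonal zeros survive under read-write but are left undefined under write-discard, which is the situation the commit message describes.
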
* Fix tests utils to make --directory work correctly.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Use relative path to compare against skipped tests

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Change self.root_dir to Path type.

* remove PurePath

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
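
The relative-path comparison from the commit above can be sketched with `pathlib`. The helper name `is_skipped` and the shape of the skip list (POSIX-style strings relative to the test root) are assumptions made for this example.

```python
from pathlib import Path
from typing import Set


def is_skipped(test_file: Path, root_dir: Path, skipped: Set[str]) -> bool:
    """Compare a test against the skip list by its path relative to root_dir."""
    try:
        rel = test_file.relative_to(root_dir)
    except ValueError:
        # The file is outside root_dir, so it cannot match a relative entry.
        return False
    return rel.as_posix() in skipped
```

Comparing relative paths keeps the skip list valid no matter where the repository is checked out or which directory is passed via `--directory`.
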
* Enhance test_diag_indices.py and test_flatten.py.

* Address comments.

* Skip msg match.
* Enhance mask_indices and move_axis

* Address comments.
This file was included unnecessarily, and led to build issues on
distributed machines. In particular, including coll.h pulls in mpi.h,
which is an unresolved header to NVCC.

Signed-off-by: Rohan Yadav <rohany@alumni.cmu.edu>

Signed-off-by: Rohan Yadav <rohany@alumni.cmu.edu>
@marcinz marcinz added the category:task label (PR is a simple task and will not be included in release notes) Oct 11, 2022
@marcinz marcinz merged commit 81ad156 into main Oct 12, 2022
manopapad pushed a commit that referenced this pull request Mar 17, 2025
mag1cp1n pushed a commit to mag1cp1n/cupynumeric that referenced this pull request Apr 11, 2025