Commit 67ab975 (parent: 21784ce)
Add `_sharded_cpu_state_dict` for distributed checkpointing (#5288)
* initial commit
* Add test workflow for `xrt` branch (#5241)
* Add test workflow for `xrt` branch
* Only run for PRs targeting XRT branch
* Add function to generate stablehlo based callable from pytorch model (#5216)
* Add function to generate stablehlo based callable from pytorch model
Added the function
`torch_xla.experimental.stablehlo_saved_model.export_pytorch_model`.
This function takes a PyTorch Module and converts it into StableHLO
bytecode.
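A minimal usage sketch of the new helper, assuming it accepts a Module plus sample inputs and returns StableHLO bytecode (the exact signature and return type are not spelled out above and may differ):
```python
import torch
import torch_xla.experimental.stablehlo_saved_model as stablehlo_saved_model

class TinyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 2)

    def forward(self, x):
        return self.linear(x)

model = TinyModel().eval()
sample_args = (torch.randn(8, 4),)

# Assumed call shape: (module, example inputs) -> StableHLO bytecode.
stablehlo_bytecode = stablehlo_saved_model.export_pytorch_model(model, sample_args)
```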
* Only run the main CI workflow on PRs targeting master and release branches (#5244)
* Only run main CI for master and release branches.
* Disabling XRT tests on main CI
* AMP for TPUs v3 (#5161)
* remove duplicate autocast_test (#5246)
* Remove `test_experimental_pjrt_tpu.py` from TPU CI (#5247)
* Install `expecttest` in xla_test_job.yaml (#5252)
* Add IAM roles for cloudbuild_editors (#5251)
* [Functionalization] Remove view in view_symint (#5231)
* [Functionalization] Remove view in view_symint
Summary:
This pull request removes views in tensor_method::view_symint.
Test Plan:
XLA_DISABLE_FUNCTIONALIZATION=1 PJRT_DEVICE=TPU python ../test/test_view_ops.py -v -k TestViewOpsXLA.test_view_view
PJRT_DEVICE=TPU python ../test/test_view_ops.py -v -k TestViewOpsXLA.test_view_view
* Fix linters
* fixed the test
* ran the linter
---------
Co-authored-by: Xiongfei Wei <[email protected]>
* Delete XRT from the main branch (#5240)
* Delete XRT from the main branch
* Remove dead import
* formatting
* Remove disable_xrt build option
* Fix runtime init
* Revert "Remove disable_xrt build option"
This reverts commit ba312e7.
* Add disable XRT option back
* formatting
* Prune mesh service
* Remove obsolete test
* Remove other run server script
* Remove XRT config
* Update PJRT default device test
* Add a file I forgot to save
* if using_pjrt -> @requires_pjrt
* Remove irrelevant test case
* Remove XRT env vars
* fix md link
* formatting
* Remove extra `requires_pjrt`
* merge conflicts
* Add other autocast back
* Add nightly build for cuda 12 (#5253)
* Fix the linter command in the CI (#5254)
* fix linter command
* ran linter
* Jack cao g/fix spmd buff is null (#5256)
* Fix that non-tensor scalar can't be handled by virtual device
* add test
* comment
* Skip calling as_strided in empty_strided_symint if the input has dynamic dimensions. (#5239)
* Skip calling as_strided in empty_strided_symint.
* only return empty_symint conditionally.
* add a comment
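A Python-level sketch of the conditional described in this entry; the real change lives in the C++ tensor implementation, and the helper name and dynamic-dimension check below are illustrative only:
```python
import torch

def empty_strided_symint_sketch(size, stride, dtype=torch.float32):
    # Stand-in for empty_symint: allocate with the requested (possibly symbolic) sizes.
    t = torch.empty(size, dtype=dtype)
    if any(isinstance(s, torch.SymInt) for s in size):
        # Dynamic dimensions present: return the empty_symint result directly
        # instead of calling as_strided on it.
        return t
    # Static case keeps the original behavior.
    return torch.as_strided(t, size, stride)
```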
* Add XRT nightly builds (#5261)
* Add XRT nightly builds
* remove space
* [OpenXLA] Migrate to pull XLA from OpenXLA (#5202)
PyTorch/XLA migrates to pull XLA from OpenXLA, replacing TensorFlow with OpenXLA after deprecating XRT usage, and replacing the TensorFlow pin with an OpenXLA pin at May 09.
* Add ToString method for both PjrtData and PjrtShardedData (#5265)
* Add ToString method for both PjrtData and PjrtShardedData
* on CPU the same config becomes replicated; don't check the actual op sharding type
* Update Sharded graph HLO dumping (#5266)
* Enable PjRt Client Compilation with StableHLO (#5233)
* Enable xla PjRt client compilation with StableHLO
* add XLA_STABLEHLO_COMPILE to configuration.yaml (see the sketch after this entry)
* fix merge conflict
* dummy commit to trigger ci
* Revert "dummy commit to trigger ci"
This reverts commit f7aec23.
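A hedged sketch of opting in to the new StableHLO compilation path. The only detail given above is the `XLA_STABLEHLO_COMPILE` configuration name, so treating it as a truthy environment variable set before import is an assumption:
```python
import os

# Assumption: setting the flag to "1" before importing torch_xla enables the path.
os.environ["XLA_STABLEHLO_COMPILE"] = "1"

import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()
x = torch.ones(2, 2, device=device) + 1
xm.mark_step()  # compile and execute the pending graph
print(x.cpu())
```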
* Disable Bazel remote cache for forked PR (#5259) (see the sketch after this entry)
* disable bazel remote cache if gcloud key is empty
* remove remote cache from setup.py
* experiment with debug msg
* fix flag
* add more logs
* skip remote cache if credential file is empty
* add comment
* add logs
* add check in test and coverage script
* fix condition in coverage test
* advance branch pr
* allow remote cache if gcloud file isn't specified explicitly
* remove dummy comment
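A hypothetical Python sketch of the gating logic this entry describes; the real checks live in the CI/build scripts, and the credential path and cache URL below are placeholders, not the project's actual values:
```python
import os

def remote_cache_bazel_flags(credential_file, cache_url):
    """Return Bazel remote-cache flags only when a non-empty gcloud key file exists."""
    if not os.path.isfile(credential_file) or os.path.getsize(credential_file) == 0:
        # Forked PRs have no gcloud key available, so build without the remote cache.
        return []
    return [f"--remote_cache={cache_url}", f"--google_credentials={credential_file}"]

# Example (placeholder values):
# remote_cache_bazel_flags("/tmp/gcloud-key.json", "https://example-cache.googleapis.com")
```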
* Suppress debug symbols in OpenXLA code (#5269)
* [SPMD] Sharding n-d tensor on (n+1)-d Mesh (#5268)
* Make TPU detection more robust (#5271)
* Clean bazel stuff on distutils clean. (#5274)
* Clean bazel stuff on distutils clean
* Fix python formatting
* Delete unused .so file, and .lds files (#5275)
* [OpenXLA] Delete unused .so file and .lds files
* Fix the error when export_torch_model is given a non-tensor (#5277)
However, the generated StableHLO graph still hardcodes the
non-tensor value; this is not correct and will be fixed later.
* Disable test_simple_model_with_different_input_shape since it is currently broken by pytorch (#5282)
* Always do build_ext in python setup.py develop (#5273)
Bazel should figure out whether _XLAC.so is up to date,
and trigger a rebuild if any cpp files changed.
* Remove or improve several hardcoded TPU test conditions (#5272)
* Remove or improve several hardcoded TPU test conditions
* Fix test condition
* Add `runtime.host_index` (#5283)
* Make it an error if calling sizes() on a dynamic tensor. (#4998)
* Err if calling sizes() on dynamic tensor
* try to set has_symbolic_sizes_strides_
* resolve merge conflict
* enable CONTINUE_ON_ERROR
* fixed the python test test_SizeEq_should_not_compile_for_identical_symints
* fix test_index_types
* set CONTINUE_ON_ERROR to true
* remove some unwanted code.
* add a print
* directly set has_symbolic_sizes_strides_ = true
* make some fixes.
* fix empty_strided_symint
* ran linter
* change error type in the test.
* fix comments
* ran linter
* Fix the error where mark_step does not materialize tensors on SPMD:0 (#5281)
* Fix the error where mark_step does not materialize tensors on SPMD:0
* typo
* fix test_non_tensor_scalar
* Disable torch._dynamo.config.automatic_dynamic_shapes (#5285)
* Set torch._dynamo.config.automatic_dynamic_shapes to False (see the sketch after this entry)
* Enable DynamoInferenceBasicTest.test_simple_model_with_different_input_shape
* run linter
* wrap only if sharding type is non-replicated
* Handle non-tensors
* run linter
* Call wrap_if_sharded first
* Add exception in test for unsharded tensor
* fix test
* Use torch.Tensor instead of torch.tensor
* use .cpu() only for tensors
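A minimal illustration of the setting named in this entry; `automatic_dynamic_shapes` is an existing torch._dynamo config knob, but how PyTorch/XLA's dynamo bridge consumes it is not shown here:
```python
import torch
import torch._dynamo as dynamo

# Force dynamo to treat shapes as static rather than marking them dynamic on recompile.
dynamo.config.automatic_dynamic_shapes = False

@torch.compile(backend="eager")
def double(x):
    return x * 2

print(double(torch.randn(4)))
```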
---------
Co-authored-by: Will Cromar <[email protected]>
Co-authored-by: qihqi <[email protected]>
Co-authored-by: Meghan Cowan <[email protected]>
Co-authored-by: Mateusz Lewko <[email protected]>
Co-authored-by: Jiewen Tan <[email protected]>
Co-authored-by: Xiongfei Wei <[email protected]>
Co-authored-by: Wonjoo Lee <[email protected]>
Co-authored-by: JackCaoG <[email protected]>
Co-authored-by: Manfei <[email protected]>
Co-authored-by: Siyuan Liu <[email protected]>
Co-authored-by: stgpetrovic <[email protected]>
Co-authored-by: Mohit Khatwani <[email protected]>
File tree: 3 files changed (+66, -18 lines)
- test/spmd
- torch_xla/experimental

(Per-file diffs omitted: only line-number columns remain; one file shows 42 additions and 4 deletions.)