fixing behaviour for group parameter in open_datatree#9666
TomNicholas merged 26 commits into pydata:main
@TomNicholas, I think this solution will work, but the |
I think the reason why you don't get the desired result is that you compute paths relative to the immediate parent of the group, not the global parent. I don't have a deeply nested tree ready for testing (so I can't be sure this actually works), but with the suggestion below I don't get the empty root node anymore.
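A minimal sketch of the distinction described above, using only the standard library's `PurePosixPath` rather than xarray's internals. The helper names `reroot` and `reroot_by_immediate_parent` and the sample paths are assumptions made for illustration; this is not the code from the suggestion, only one way to see how an empty root node could arise.

```python
from pathlib import PurePosixPath


def reroot(paths, group):
    """Re-express absolute group paths relative to the requested group.

    The requested group itself becomes "/" (hypothetical helper).
    """
    root = PurePosixPath(group)
    rerooted = []
    for p in paths:
        # relative_to raises ValueError if p is not located under `group`
        rel = PurePosixPath(p).relative_to(root)
        rerooted.append(str(PurePosixPath("/") / rel))
    return rerooted


def reroot_by_immediate_parent(paths):
    """Re-express each path relative to its own parent (the problematic variant)."""
    return [
        str(PurePosixPath("/") / PurePosixPath(p).relative_to(PurePosixPath(p).parent))
        for p in paths
    ]


paths = ["/group2/subg1", "/group2/subg1/subsub1", "/group2/subg1/subsub2"]

# Relative to the requested group: the group itself maps to the root.
print(reroot(paths, "/group2/subg1"))

# Relative to each immediate parent: no entry maps to "/", so a tree built
# from these paths would need an extra, empty root node above "/subg1".
print(reroot_by_immediate_parent(paths))
```

With the first helper the requested group lands on `/` and its children directly below it; with the second, nothing maps to `/`, which is consistent with the empty root node mentioned in the comment.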
I think we are getting close. However, there are still discrepancies between the datatree opened with the group parameter and the same node selected directly via its path. Example:
print(dtree["/group2/subg1"])
Group: /group2/subg1 <----- different root paths here
│ Dimensions: (x: 2, y: 3)
│ Inherited coordinates:
│ * x (x) int64 16B -1 -2
│ * y (y) int64 24B 0 1 2
│ Data variables:
│ blah (x) int64 16B 2 3
├── Group: /group2/subg1/subsub1 <----- different paths here
│ Dimensions: (y: 3)
│ Data variables:
│ var (y) int64 24B 4 5 6
└── Group: /group2/subg1/subsub2 <----- different paths here
dt2 = xr.open_datatree("test.zarr", group="/group2/subg1")
print(dt2)
Group: / <----- different root paths here
│ Dimensions: (x: 2)
│ Dimensions without coordinates: x
│ Data variables:
│ blah (x) int64 16B ...
├── Group: /subsub1 <----- different paths here
│ Dimensions: (y: 3)
│ Dimensions without coordinates: y
│ Data variables:
│ var (y) int64 24B ...
└── Group: /subsub2 <----- different paths here
Any comments on this, @TomNicholas @keewis?
this is by design, I think? I'd interpret (if you know unix commands, this would be similar to |
The reprs are different because the objects are different: |
TomNicholas
left a comment
We just need a test, to remove the encoding bit, and then this is good to go!
The test should look very much like the ones in #9669 - create a tiny nested tree, save it to zarr/netcdf, open it with the |
open_datatree
TomNicholas
left a comment
Thank you so much @aladinor !
keewis
left a comment
I've got two comments: one on our strategy on DataTree whats-new entries, and one on the way we compare node datasets.
Co-authored-by: Justus Magin <keewis@users.noreply.github.com>
Sorry @aladinor - in fact could we just do this? #9666 (comment)
This looks good! @keewis's comments are addressed so I'm going to merge it.
Otherwise @TomNicholas can do that while preparing for the release.
Amazing, thank you! And thanks for pointing that out so everyone involved can get credit for these great contributions!
Thanks @TomNicholas and @keewis for your guidance! |
Hi all.
This might be more complex than pruning the path in the open_groups_as_dict function, because when we use _iter_zarr_groups, it yields the root group. I am still working on it.
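To illustrate why a traversal that yields the root group complicates simple path pruning, here is a rough sketch over a plain nested dict. The names `iter_groups` and `groups_under` and the sample layout are hypothetical stand-ins; this is not xarray's `_iter_zarr_groups`, only an assumed traversal with the same "root first" behaviour.

```python
from pathlib import PurePosixPath


def iter_groups(tree, parent="/"):
    """Yield the starting group first, then every descendant group path.

    Hypothetical stand-in for a store-walking group iterator.
    """
    yield parent
    for name, child in tree.items():
        yield from iter_groups(child, str(PurePosixPath(parent) / name))


def groups_under(tree, group):
    """Keep only the requested group and its descendants."""
    target = PurePosixPath(group)
    selected = []
    for path in iter_groups(tree):
        p = PurePosixPath(path)
        # Skip the root and any sibling branches outside the requested group.
        if p == target or target in p.parents:
            selected.append(path)
    return selected


# Assumed layout, loosely mirroring the group names used in this thread.
store = {"group1": {}, "group2": {"subg1": {"subsub1": {}, "subsub2": {}}}}

print(list(iter_groups(store)))
print(groups_under(store, "/group2/subg1"))
```

Because the walk always starts by yielding `/`, the root has to be filtered out explicitly before the remaining paths can be re-expressed relative to the requested group.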