refactor: Import fewer private modules#1069
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1069 +/- ##
==========================================
+ Coverage 65.88% 65.90% +0.02%
==========================================
Files 44 45 +1
Lines 6762 6772 +10
Branches 1137 1138 +1
==========================================
+ Hits 4455 4463 +8
- Misses 1874 1875 +1
- Partials 433 434 +1
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
This pull request refactors import statements to consolidate version-dependent compatibility code into a new _compat.py module, addressing issue #1061. The changes improve maintainability by encapsulating fragile version-specific imports in a single location.
- Moved version-dependent scanpy and anndata imports to
squidpy/_compat.py - Replaced private pandas imports (
pandas._libs.lib.infer_dtype) with public API (pandas.api.types.infer_dtype) - Replaced deprecated
scanpy._utils.check_presence_downloadwith a newdownload_filefunction usingpooch
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
src/squidpy/_compat.py |
New compatibility module that handles version-dependent imports from scanpy and anndata with appropriate version checks |
src/squidpy/pl/_var_by_distance.py |
Updated to import scanpy plotting functions from _compat module instead of directly from scanpy private APIs |
src/squidpy/pl/_utils.py |
Replaced private pandas import with public API (pandas.api.types.infer_dtype) |
src/squidpy/pl/_spatial_utils.py |
Updated to import add_categorical_legend from _compat module |
src/squidpy/pl/_interactive/_utils.py |
Replaced private pandas import with public API and imports scanpy function from _compat |
src/squidpy/pl/_color_utils.py |
Updated to import scanpy function from _compat module |
src/squidpy/gr/_utils.py |
Removed local version checking logic and imports anndata views from _compat |
src/squidpy/datasets/_utils.py |
Added new download_file function using pooch as replacement for scanpy's deprecated download utility |
src/squidpy/datasets/_10x_datasets.py |
Updated to use new download_file function instead of scanpy's check_presence_download |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Looks great! Idk if _SET_DEFAULT_COLORS_FOR_CATEGORICAL_OBS_CHANGED has value over just try:/except ImportError since this a private API, otherwise looks great!
Some thoughts about pooch (all optional improvements, but since you’re switching to it anyway):
- If the data you download tends to be big, consider setting
progressbar=...inpooch.download, maybe even usingtqdm.autofor notebook support. - Don’t you want to use caching? The way you use it now re-downloads stuff every time, no?
|
I wouldn't want to use a try block since the ImportError could be from another module theoratically. You are right about your points with pooch. I will have a second look |
|
@flying-sheep do you think its fine to merge this and then create a new issue for adding the hashes? |
7eb2588 to
1dac201
Compare
|
I think that’s fine! |
|
@selmanozleyen Does this touch the zarr.zip download function as well? Or will it automatically also use the new behaviour? |
|
@timtreis until I add the hashes it will redownload everytime. But I think this is better because if the download was interupped in the old implementation you'd have to remove the existing file since it only checked for the existance of the file. But I can add the hashes very quickly |
timtreis
left a comment
There was a problem hiding this comment.
Fine for me, you can add with a subsequent PR
|
merged so we don't have to wait for that zombie action |
fixes #1061 partially with a workaround. I have moved everything we import privately to
_compat.py. If there is a version mismatch it's handled there. This approach is like theunsafeblock of Rust. At least it encapsulates the possible fragile code in one file.Another function that we should reconsider is
_create_sparse_df. It uses lots of protected members of pandas and might cause pandas version mismatch in the future.In fact only
ligrecis using this function which caused reproducibility errors in the past. It wasn't directly caused by it but maybe if there was a safer code written the bug wouldn't surface. Or at least there would be a warning before the pandas behavior changed.