Skip to content

top-level plot collections: fail on empty dict#10233

Merged
skshetry merged 2 commits into
treeverse:mainfrom
skshetry:plot-skip-empty-dict
Jan 12, 2024
Merged

top-level plot collections: fail on empty dict#10233
skshetry merged 2 commits into
treeverse:mainfrom
skshetry:plot-skip-empty-dict

Conversation

@skshetry

@skshetry skshetry commented Jan 12, 2024

Copy link
Copy Markdown
Collaborator

If you have a top-level plots with empty dict as follows:

# dvc.yaml
plots:
- {}

dvc would fail with traceback that looks like:

StopIteration                             Traceback (most recent call last)
Cell In[3], line 1
----> 1 repo.index._plot_sources

File ~/Projects/dvc/.venv/lib/python3.12/site-packages/funcy/objects.py:25, in cached_property.__get__(self, instance, type)
     23 if instance is None:
     24     return self
---> 25 res = instance.__dict__[self.fget.__name__] = self.fget(instance)
     26 return res

File ~/Projects/dvc/dvc/repo/index.py:448, in Index._plot_sources(self)
    445 from dvc.repo.plots import _collect_pipeline_files
    447 sources: List[str] = []
--> 448 for data in _collect_pipeline_files(self.repo, [], {}).values():
    449     for plot_id, props in data.get("data", {}).items():
    450         if isinstance(props.get("y"), dict):

File ~/Projects/dvc/dvc/repo/plots/__init__.py:490, in _collect_pipeline_files(repo, targets, props, onerror)
    488         dvcfile_defs_dict[elem] = None
    489     else:
--> 490         k, v = next(iter(elem.items()))
    491         dvcfile_defs_dict[k] = v
    493 resolved = _resolve_definitions(
    494     repo.dvcfs,
    495     targets,
   (...)
    499     onerror=onerror,
    500 )

StopIteration:

The empty dict is not supported by dvc, and we should fail validation in this case.

@codecov

codecov Bot commented Jan 12, 2024

Copy link
Copy Markdown

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (d711ecd) 90.47% compared to head (3a82fba) 90.23%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #10233      +/-   ##
==========================================
- Coverage   90.47%   90.23%   -0.25%     
==========================================
  Files         493      493              
  Lines       37595    37596       +1     
  Branches     5455     5455              
==========================================
- Hits        34014    33924      -90     
- Misses       2953     3022      +69     
- Partials      628      650      +22     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@shcheklein shcheklein left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @skshetry for a quick turnaround and for empty YAML catch ...

  • can we make it a validation error though? Otherwise it's easy to miss this kind of conditions
  • I think we need a test for this ... it's easy to add and it's easy to get the same regression later in this (edge) case

@shcheklein

Copy link
Copy Markdown
Contributor

Also, please add some description to PR :)

@shcheklein shcheklein added bug Did we break something? A: plots Related to the plots product: Studio and removed product: Studio labels Jan 12, 2024
@skshetry

Copy link
Copy Markdown
Collaborator Author

There are just too many edge cases with schema and too many negative cases, that I don't find it worth it to cover it with tests. :)

And the schema is very declarative.

@shcheklein

Copy link
Copy Markdown
Contributor

There are just too many edge cases with schema and too many negative cases, that I don't find it worth it to cover it with tests. :)

that's exactly what we need I think, especially as we hit more and more edge cases like this. It's totally fine to have 1000s of tests for this. And even consider generating different combinations, etc if needed.

@skshetry

Copy link
Copy Markdown
Collaborator Author

I'd look for property-based testing than writing 1000s of tests.

@skshetry

skshetry commented Jan 12, 2024

Copy link
Copy Markdown
Collaborator Author

that's exactly what we need I think, especially as we hit more and more edge cases like this. It's totally fine to have 1000s of tests for this. And even consider generating different combinations, etc if needed.

Negative tests will just test voluptuous. Here, we'll be testing if Required works or not.

@shcheklein

Copy link
Copy Markdown
Contributor

Negative tests will just test voluptuous. Here, we'll be testing if Required works or not.

I would argue that you are testing the DVC schema in this case (e.g. that someone would not drop the Required). But I agree if this is becomes part of the validation it's way better and less error prone.

Comment thread dvc/repo/plots/__init__.py Outdated
@skshetry skshetry force-pushed the plot-skip-empty-dict branch from b91ff5d to 3a82fba Compare January 12, 2024 04:00
@skshetry skshetry changed the title top-level plot collections: skip empty dict top-level plot collections: fail on empty dict Jan 12, 2024

@shcheklein shcheklein left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would still add a test. Your call on this.

@skshetry

Copy link
Copy Markdown
Collaborator Author

I would still add a test. Your call on this.

The problem here is that there is a mismatch between schema validation and parsing/deserialization.
Even though we might add a test for schema validation, there is no guarantee that the deserialization logic will be correct.

I don't think they are a different thing. They should be just one single thing. And we should just depend on validation to rule those edgecases out.

@skshetry skshetry merged commit b46bd9c into treeverse:main Jan 12, 2024
@skshetry skshetry deleted the plot-skip-empty-dict branch January 12, 2024 04:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A: plots Related to the plots bug Did we break something?

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants