Skip to content

PCG serialization, rapidcheck, dtgen, and shape inference#1394

Merged
lockshaw merged 46 commits intoflexflow:repo-refactorfrom
lockshaw:pcg-serialization
Jun 4, 2024
Merged

PCG serialization, rapidcheck, dtgen, and shape inference#1394
lockshaw merged 46 commits intoflexflow:repo-refactorfrom
lockshaw:pcg-serialization

Conversation

@lockshaw
Copy link
Collaborator

@lockshaw lockshaw commented May 26, 2024

Description of changes:

  • Move many visitable types over to dtgen
  • Add new-style reduction dims and parallel tensor shape
  • Add parallel shape inference for a subset of operators
    • Add
    • ReLU
    • Linear
    • Conv2D
    • BatchMM
    • Embedding
    • Repartition
    • Combine
    • Replicate
    • Reduction
  • De-template substitutions
  • Refactor PCG, ComputationGraph, etc. do be serializable and based on new DataflowGraph abstraction
  • Add rapidcheck Arbitrary instances for many types, primarily through moving many types over to dtgen

Note: this PR temporarily breaks substitutions, which will be moved to a follow-up PR to prevent this PR from growing too large

Related Issues:

Supersedes #988

Linked Issues:

Issues closed by this PR:


This change is Reviewable

@lockshaw lockshaw added repo-refactor op-attrs Op-attrs library substitutions Substitutions library labels May 26, 2024
@lockshaw lockshaw self-assigned this May 26, 2024
@lockshaw lockshaw requested a review from reyna-abhyankar May 30, 2024 01:11
@lockshaw lockshaw marked this pull request as ready for review June 3, 2024 05:44
Copy link
Collaborator

@wmdi wmdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 82 files at r4, all commit messages.
Reviewable status: 633 of 713 files reviewed, 1 unresolved discussion (waiting on @lockshaw and @reyna-abhyankar)


lib/op-attrs/include/op-attrs/parallel_tensor_shape.h line 15 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I think "sharding" is actually the right name for the operation, I probably used it incorrectly above (maybe "instances" would be better for sharding+replicas? not sure). CombineParallelDim and PartitionParallelDim both sound more like actions than a type of dim (as Combine and Partition are both verbs, and Combine isn't a great name anyway), and ConcatParallelDim is even more confusing as we have a non-parallel concat operator. I agree there's room for improved naming, but at least until I get around to renaming a bunch of other stuff to be less ambiguous I think "shard" is the best I can think of? (the main contender I can think of is PartitionParallelDim, but I'm not sure that's any better than ShardParallelDim, and I'm not sure we want to explicitly tie the concept of their being multiple disjoint shards exactly to the Partition operator). @reyna-abhyankar do you have any thoughts on this?

I think it is fine to keep it and find a better name later.

Copy link
Collaborator

@wmdi wmdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 80 of 82 files at r4.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @reyna-abhyankar)

@lockshaw lockshaw requested a review from wmdi June 3, 2024 22:21
@codecov
Copy link

codecov bot commented Jun 3, 2024

Codecov Report

Attention: Patch coverage is 13.41872% with 4394 lines in your changes missing coverage. Please review.

Project coverage is 18.81%. Comparing base (c97f63f) to head (6230d99).

Additional details and impacted files
@@                Coverage Diff                 @@
##           repo-refactor    #1394       +/-   ##
==================================================
- Coverage          41.32%   18.81%   -22.52%     
==================================================
  Files                181      263       +82     
  Lines               5265    12631     +7366     
  Branches             271        0      -271     
==================================================
+ Hits                2176     2377      +201     
- Misses              3089    10254     +7165     
Flag Coverage Δ
unittests 18.81% <13.41%> (-22.52%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
lib/op-attrs/include/op-attrs/dim_ordered/slice.h 100.00% <100.00%> (ø)
.../op-attrs/include/op-attrs/dim_ordered/transform.h 100.00% <100.00%> (ø)
lib/op-attrs/include/op-attrs/get_output_shapes.h 0.00% <ø> (ø)
...p-attrs/src/op-attrs/computation_graph_op_attrs.cc 100.00% <100.00%> (ø)
...rs/src/op-attrs/ops/conv_2d/conv_2d_input_shape.cc 100.00% <100.00%> (ø)
lib/op-attrs/src/op-attrs/ops/reduction.cc 100.00% <100.00%> (ø)
lib/op-attrs/src/op-attrs/ops/repartition.cc 100.00% <100.00%> (ø)
lib/op-attrs/src/op-attrs/ops/replicate.cc 100.00% <100.00%> (ø)
lib/op-attrs/src/op-attrs/ops/batch_norm.cc 0.00% <0.00%> (ø)
lib/op-attrs/src/op-attrs/ops/dropout.cc 0.00% <0.00%> (ø)
... and 82 more

... and 209 files with indirect coverage changes

Copy link
Collaborator Author

@lockshaw lockshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dismissed @wmdi from a discussion.
Reviewable status: 700 of 736 files reviewed, all discussions resolved (waiting on @wmdi)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

op-attrs Op-attrs library substitutions Substitutions library

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants