@JackCaoG
Collaborator

PT_XLA_DEBUG_LEVEL=1 will not output the execution frame analysis. Also update the post-compilation analysis to report sizes in GB instead of MB. Sample output:

Compilation Analysis: ================================================================================
Compilation Analysis: Compilation Cause
Compilation Analysis:   mark_step in parallel loader at step end
Compilation Analysis: Graph Info: 
Compilation Analysis:   Graph Hash: c74c3b91b855b2b123f833b0d5f86943
Compilation Analysis:   Number of Graph Inputs: 35
Compilation Analysis:   Number of Graph Outputs: 107
Compilation Analysis: Python Frame Triggered Execution: 
Compilation Analysis:   mark_step (/workspaces/dk3/pytorch/xla/torch_xla/core/xla_model.py:1055)
Compilation Analysis:   next (/workspaces/dk3/pytorch/xla/torch_xla/distributed/parallel_loader.py:44)
Compilation Analysis:   __next__ (/workspaces/dk3/pytorch/xla/torch_xla/distributed/parallel_loader.py:32)
Compilation Analysis:   train_loop_fn (/workspaces/dk3/pytorch/xla/examples/train_decoder_only_base.py:48)
Compilation Analysis:   start_training (/workspaces/dk3/pytorch/xla/examples/train_decoder_only_base.py:65)
Compilation Analysis:   <module> (/workspaces/dk3/pytorch/xla/examples/train_decoder_only_base.py:73)
Compilation Analysis: --------------------------------------------------------------------------------
Compilation Analysis: ================================================================================

Post Compilation Analysis: ================================================================================
Post Compilation Analysis: Graph input size: 1.548000 GB
Post Compilation Analysis: Graph output size: 7.922460 GB
Post Compilation Analysis: Aliased Input size: 1.547871 GB
Post Compilation Analysis: Intermediate tensor size: 12.124478 GB
Post Compilation Analysis: Compiled program size: 0.028210 GB
Post Compilation Analysis: --------------------------------------------------------------------------------
Post Compilation Analysis: ================================================================================
epoch: 1, step: 0, loss: 7.349868297576904, rate: 2.489864196404525
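
For anyone who wants to consume the new GB-based report programmatically, here is a minimal sketch that parses the "Post Compilation Analysis" lines from a captured log into a dictionary. The helper name `parse_post_compilation_sizes` and the regular expression are illustrative assumptions, not part of torch_xla; the sample lines are copied from the output above.

```python
import re

# Lines copied from the sample output in this PR description.
SAMPLE = """\
Post Compilation Analysis: ================================================================================
Post Compilation Analysis: Graph input size: 1.548000 GB
Post Compilation Analysis: Graph output size: 7.922460 GB
Post Compilation Analysis: Aliased Input size: 1.547871 GB
Post Compilation Analysis: Intermediate tensor size: 12.124478 GB
Post Compilation Analysis: Compiled program size: 0.028210 GB
Post Compilation Analysis: --------------------------------------------------------------------------------
"""

# Hypothetical pattern: "<prefix>: <metric name>: <number> GB".
# Separator lines contain no second colon, so they do not match.
LINE_RE = re.compile(
    r"^Post Compilation Analysis:\s+(?P<name>[^:]+):\s+(?P<value>[0-9.]+)\s+GB\s*$"
)


def parse_post_compilation_sizes(text):
    """Return {metric name: size in GB} for every matching log line."""
    sizes = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            sizes[match.group("name").strip()] = float(match.group("value"))
    return sizes


if __name__ == "__main__":
    for name, gb in parse_post_compilation_sizes(SAMPLE).items():
        print(f"{name}: {gb} GB")
```

The pattern is tied to the exact line format shown in the sample, so it would need adjusting if the log format changes.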

@JackCaoG JackCaoG marked this pull request as ready for review May 30, 2024 01:49
@JackCaoG
Collaborator Author

This should be ready for review.

@JackCaoG JackCaoG merged commit 8c2234e into master May 30, 2024
@will-cromar
Collaborator

Thanks!
