
Conversation

@jlamypoirier (Collaborator) commented on Jan 28, 2026

✨ Description

  • Tweak entropy losses to improve readability.
  • Extract a reusable fused_predicted_logits_from_labels helper from _fused_cross_entropy_base_from_labels.
  • Fix broken entropy loss tests for loss masking.
  • Support arbitrary tensor dimensions in entropy losses.
  • Add tensor-parallel (TP) support for the GRPO loss (see the sketch below).
  • Add TP support for the Z loss.
  • Generalize and simplify entropy loss tests with support for Z loss and GRPO, improve distributed coverage, and add fp16 and bf16 tests.

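For context, here is a minimal sketch of how a vocab-parallel (tensor-parallel) cross-entropy and Z loss can be computed when each TP rank holds only a slice of the vocabulary dimension. This is an illustration of the general technique only, not the code in this PR; the helper name, signature, and arguments below (predicted_logits_from_labels, vocab_start, group) are assumptions.

```python
# Hypothetical sketch of vocab-parallel cross-entropy + z-loss, not the Fast-LLM implementation.
# Assumes logits are sharded over the vocabulary dimension within a tensor-parallel process group.
import torch
import torch.distributed as dist


def predicted_logits_from_labels(
    logits: torch.Tensor,                    # (..., local_vocab), this rank's vocab shard
    labels: torch.Tensor,                    # (...), global vocab indices (long)
    vocab_start: int,                        # first global vocab index owned by this rank
    group: dist.ProcessGroup | None = None,
) -> torch.Tensor:
    """Fetch the logit of each label token when the vocab is sharded across TP ranks."""
    local_labels = labels - vocab_start
    in_shard = (local_labels >= 0) & (local_labels < logits.size(-1))
    # Out-of-shard labels contribute 0; the rank that owns the label supplies the value.
    safe = local_labels.clamp(0, logits.size(-1) - 1)
    predicted = torch.gather(logits, -1, safe.unsqueeze(-1)).squeeze(-1)
    predicted = torch.where(in_shard, predicted, torch.zeros_like(predicted))
    if group is not None:
        dist.all_reduce(predicted, op=dist.ReduceOp.SUM, group=group)
    return predicted


def tp_cross_entropy_and_z_loss(logits, labels, vocab_start, group=None):
    """Cross-entropy and z-loss from vocab-parallel logits, for any batch shape."""
    # Numerically stable logsumexp over the full (sharded) vocabulary.
    local_max = logits.max(dim=-1).values
    if group is not None:
        dist.all_reduce(local_max, op=dist.ReduceOp.MAX, group=group)
    sum_exp = (logits - local_max.unsqueeze(-1)).exp().sum(dim=-1)
    if group is not None:
        dist.all_reduce(sum_exp, op=dist.ReduceOp.SUM, group=group)
    log_z = sum_exp.log() + local_max        # logsumexp over the full vocab

    predicted = predicted_logits_from_labels(logits, labels, vocab_start, group)
    cross_entropy = log_z - predicted        # -log p(label)
    z_loss = log_z.pow(2)                    # penalizes large softmax normalizers
    return cross_entropy, z_loss
```

The same two pieces (the label-logit lookup and the all-reduced logsumexp) are what a GRPO-style loss needs, since it only requires the per-token log-probability of the sampled tokens; once those are available from vocab-parallel logits, the rest of the GRPO objective is rank-local.
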
Base automatically changed from jlp_grpo_sample to jlp_pipeline_rl on January 28, 2026 23:36
@jlamypoirier marked this pull request as ready for review on January 28, 2026 23:43
@jlamypoirier changed the title from "Tensor-parallel GRPO loss" to "[Pipeline RL] Tensor-parallel GRPO loss" on Jan 29, 2026
@jlamypoirier merged commit caaa9f8 into jlp_pipeline_rl on Jan 29, 2026
@jlamypoirier deleted the jlp_grpo_tp branch on January 29, 2026 20:13