Update GBS32 RCPs #449

Open
ShriyaRishab wants to merge 1 commit into mlcommons:master from ShriyaRishab:shriya/fix-gptoss-gbs32-rcps
@ShriyaRishab
Contributor

import json
import os

# Convergence target: a run counts as converged once the logged value is <= target.
target = 3.34

for gbs in ['gbs32']:
    print(f'=== {gbs} ===')
    results = []
    for i in range(20):
        logfile = os.path.join(gbs, f'run_{i}.log')
        samples = None  # stays None if the run never reaches the target
        with open(logfile) as f:
            for line in f:
                if '"key": "eval_accuracy"' not in line:
                    continue
                # extract the JSON payload after the :::MLLOG prefix
                json_str = line.split(':::MLLOG ', 1)[-1]
                entry = json.loads(json_str)
                if entry['value'] <= target:
                    samples = entry['metadata']['samples_count']
                    break
        results.append(samples)
        print(f'  run_{i}: {samples}')
    print(f'\nAll values: {results}')
    print()
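
For anyone who wants to check the per-line logic without the real logs, here is a minimal sketch of the same test pulled into a function. The name `first_converged_sample` and the demo log lines are made up for illustration; the lines only mimic the fields the script reads (`key`, `value`, `metadata.samples_count`) and are not taken from the actual run logs.

```python
import json


def first_converged_sample(lines, target):
    """Return samples_count of the first eval_accuracy entry at or below target, else None."""
    for line in lines:
        if '"key": "eval_accuracy"' not in line:
            continue
        # Same extraction as the script: take everything after the :::MLLOG prefix.
        entry = json.loads(line.split(':::MLLOG ', 1)[-1])
        if entry['value'] <= target:
            return entry['metadata']['samples_count']
    return None


# Hypothetical log lines, for illustration only.
demo_lines = [
    ':::MLLOG {"key": "eval_accuracy", "value": 3.50, "metadata": {"samples_count": 1000}}',
    ':::MLLOG {"key": "train_loss", "value": 3.30, "metadata": {"samples_count": 1500}}',
    ':::MLLOG {"key": "eval_accuracy", "value": 3.34, "metadata": {"samples_count": 2000}}',
    ':::MLLOG {"key": "eval_accuracy", "value": 3.20, "metadata": {"samples_count": 3000}}',
]

print(first_converged_sample(demo_lines, target=3.34))      # first eval at or below target
print(first_converged_sample(demo_lines[:2], target=3.34))  # never converges in this slice
```

The second call shows the case where a run never reaches the target: the function returns `None`, which is also what the script records for such a run.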

Running the parsing script at the top of this comment from small_llm_moe_pretraining/primus/rcp_logs gives a different samples-to-converge count for GBS32 than the current RCPs, and the new value looks more realistic.

GBS16 takes about 195k samples to converge.
The old GBS32 RCPs converged in ~179k samples, which doesn't make sense: GBS32 would not be expected to converge in fewer samples than GBS16.
The new GBS32 RCPs converge in ~235k samples, which is much more reasonable and matches the logs provided.
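
To put the three figures side by side, a rough sketch that converts them into step counts and ratios against GBS16. It uses only the approximate numbers quoted above and assumes each step consumes exactly one global batch (steps = samples / GBS); the helper name `optimizer_steps` is hypothetical.

```python
# Approximate samples-to-converge figures quoted in this comment.
samples_to_converge = {
    "GBS16": (16, 195_000),
    "GBS32 (old RCPs)": (32, 179_000),
    "GBS32 (new RCPs)": (32, 235_000),
}


def optimizer_steps(gbs, samples):
    # Assumption: one step consumes one global batch of `gbs` samples.
    return samples / gbs


gbs16_samples = samples_to_converge["GBS16"][1]
for name, (gbs, samples) in samples_to_converge.items():
    steps = optimizer_steps(gbs, samples)
    ratio = samples / gbs16_samples
    print(f"{name}: ~{steps:,.0f} steps, {ratio:.2f}x the GBS16 sample count")
```

With these inputs the old GBS32 figure comes out below the GBS16 sample count (ratio under 1), while the new figure comes out above it, which is the discrepancy this PR addresses.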

@ShriyaRishab ShriyaRishab requested review from a team as code owners March 6, 2026 13:54
@ShriyaRishab
Contributor Author

@suachong - can you please review this?

@ShriyaRishab ShriyaRishab marked this pull request as draft March 6, 2026 13:56
@ShriyaRishab ShriyaRishab marked this pull request as ready for review March 6, 2026 13:56
Contributor

@suachong suachong left a comment

LGTM

@ShriyaRishab
Contributor Author

ShriyaRishab commented Mar 6, 2026

@pgmpablo157321 @pbaumstarck can you please approve and merge this?

5 participants