Preserve StreamerInfos when reopening output files in UPDATE mode #567
Open
cmargalejo wants to merge 2 commits into
When a ROOT file is closed, TFile::WriteStreamerInfo() replaces the file's StreamerInfo record with only the infos of the classes streamed during that session. TRestRun::MergeToOutputFile merges the threads' output files (which preserves the infos) but then reopens the merged file in UPDATE mode to write the metadata: closing that session wiped the StreamerInfos of all event classes, keeping only the metadata classes.

As a consequence no restManager output file contains event-class StreamerInfos, which breaks ROOT schema evolution whenever an event class definition changes. This is what made files written before the TRestDetectorSignal vector<Float_t> -> vector<Double_t> change unreadable (rest-for-physics/detectorlib#125): without the on-disk layout description, ROOT misreads the float payload as doubles.

Add TRestTools::PreserveStreamerInfos(TFile*), which re-tags every StreamerInfo already stored in a file opened in UPDATE mode so that TFile::WriteStreamerInfo writes them out again on close (same marking as the deprecated TStreamerInfo::TagFile). Call it at the UPDATE sessions that write into REST data files:
- TRestRun::MergeToOutputFile (the main restManager output path)
- TRestRun::UpdateOutputFile
- TRestProcessRunner split-file metadata update of the main file
- TRestDataSet::Export

Verified by merging two files containing TRestDetectorSignalEvent trees via TRestRun::MergeToOutputFile: the event-class StreamerInfos now survive the metadata session and the merged file reads back correctly.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The validation workflows cache the build with key BRANCH_NAME-sha, so a re-run cannot pick up the restG4 fix (rest-for-physics/restG4#148); only a new commit invalidates the cache.
Fixes #568
The problem
When a ROOT file is closed, TFile::WriteStreamerInfo() replaces the file's
StreamerInfo record, keeping only the infos of the classes streamed during that
session. TRestRun::MergeToOutputFile merges the threads' output files (which
preserves the infos), but then reopens the merged file in UPDATE mode to write
the metadata: closing that session wipes the StreamerInfos of all event classes,
keeping only the metadata classes.
As a consequence, no restManager output file contains event-class
StreamerInfos, which silently breaks ROOT schema evolution whenever an event
class definition changes. This is what made files written before the
TRestDetectorSignal vector<Float_t> → vector<Double_t> change unreadable
(rest-for-physics/detectorlib#125): without the on-disk layout description,
ROOT misreads the float payload as doubles and allocates GBs of garbage.
(restG4 files are unaffected: they are written in a single session.)
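The float/double misreading described above can be illustrated with a toy buffer that stores element bytes but no record of the element type. This is only a sketch of the general failure mode, not ROOT's actual I/O code; the helper names writeFloats, readAsFloats and readAsDoubles are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Toy serialization (assumption, not ROOT's format): raw element bytes only,
// with no description of the element type that was written.
std::vector<unsigned char> writeFloats(const std::vector<float>& values) {
    std::vector<unsigned char> buffer(values.size() * sizeof(float));
    if (!buffer.empty()) std::memcpy(buffer.data(), values.data(), buffer.size());
    return buffer;
}

// Reader that knows the on-disk element type: recovers the original values.
std::vector<float> readAsFloats(const std::vector<unsigned char>& buffer) {
    assert(buffer.size() % sizeof(float) == 0);
    std::vector<float> out(buffer.size() / sizeof(float));
    if (!out.empty()) std::memcpy(out.data(), buffer.data(), buffer.size());
    return out;
}

// Reader that assumes the current in-memory type (double): the same bytes
// yield fewer elements, and none of them match what was written.
std::vector<double> readAsDoubles(const std::vector<unsigned char>& buffer) {
    std::vector<double> out(buffer.size() / sizeof(double));
    if (!out.empty()) std::memcpy(out.data(), buffer.data(), out.size() * sizeof(double));
    return out;
}
```

Writing {1.0f, 2.0f, 3.0f, 4.0f} and reading the buffer back with readAsFloats returns the input unchanged, while readAsDoubles returns two values that bear no relation to it. The stored StreamerInfo is what tells the reader which of the two interpretations applies.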
The fix
New TRestTools::PreserveStreamerInfos(TFile*): called right after opening a
file in UPDATE mode, it re-tags every StreamerInfo already stored in the file
(same marking as the deprecated TStreamerInfo::TagFile) so that
TFile::WriteStreamerInfo writes them out again on close. Called at the UPDATE
sessions that write into REST data files:
- TRestRun::MergeToOutputFile (the main restManager output path)
- TRestRun::UpdateOutputFile
- TRestProcessRunner split-file metadata update of the main file
- TRestDataSet::Export
Verification
Merging two files containing TRestDetectorSignalEvent trees via
TRestRun::MergeToOutputFile: without the fix the merged file keeps only
TRestRun/TRestMetadata infos; with it, TRestDetectorSignalEvent, TRestEvent and
TRestDetectorSignal survive the metadata session and the file reads back
correctly.
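The behaviour described in this PR can be modeled without ROOT as a minimal sketch. ToyFile, preserveStreamerInfos and metadataSession are hypothetical stand-ins, not the TFile or TRestTools API; the model only encodes the rule stated above, that closing a session replaces the stored record with the classes tagged during that session.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy stand-in for a ROOT file (assumption: NOT the TFile API).
struct ToyFile {
    std::set<std::string> storedInfos;  // StreamerInfo record on disk
    std::set<std::string> tagged;       // classes marked for writing this session
    bool open = false;

    void openUpdate() { open = true; tagged.clear(); }
    // Streaming an object tags its class for the current session.
    void stream(const std::string& cls) { assert(open); tagged.insert(cls); }
    // On close, the stored record is replaced by the tagged classes only.
    void close() { assert(open); storedInfos = tagged; tagged.clear(); open = false; }
};

// Hypothetical analogue of the fix: re-tag everything already stored in the file.
void preserveStreamerInfos(ToyFile& f) {
    assert(f.open);
    f.tagged.insert(f.storedInfos.begin(), f.storedInfos.end());
}

// Metadata-only UPDATE session on a copy of the merged file;
// returns the StreamerInfo record left after closing.
std::set<std::string> metadataSession(ToyFile f, bool preserve) {
    f.openUpdate();
    if (preserve) preserveStreamerInfos(f);
    f.stream("TRestRun");
    f.stream("TRestMetadata");
    f.close();
    return f.storedInfos;
}
```

Starting from a ToyFile whose storedInfos holds TRestDetectorSignalEvent, TRestEvent and TRestDetectorSignal, metadataSession(file, false) leaves only TRestRun and TRestMetadata, while metadataSession(file, true) leaves all five classes, matching the two outcomes reported in the verification.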
Note: this protects files written from now on. Existing files have already
lost their StreamerInfos; for the detector signal case the data is recoverable
with the tool in #566.