Native serialization for DynamicFlat index #281

Merged: razdoburdin merged 4 commits into intel:dev/razdoburdin_streaming from razdoburdin:streaming/dynamicflat on Mar 6, 2026

@razdoburdin
Contributor

This PR introduces native serialization for the DynamicFlat index.

Main changes are:

  1. `auto_dynamic_assemble` now accepts a lazy loader. This is required for buffer-free deserialization.
  2. A new class, `Deserializer`, is introduced. It handles conditional reading of overhead data (such as the names of temporary files) for legacy models.
  3. `IDTranslator` is refactored to support saving to and loading from a stream.
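
To illustrate why a lazy loader enables buffer-free deserialization, here is a minimal sketch. All names in it (`ToyStorage`, `LazyStreamLoader`, `toy_assemble`, `write_toy_vectors`) and the byte layout are hypothetical and are not the SVS API. The loader only remembers the stream; bytes are read straight into the final storage when the assemble step asks for them, so no intermediate copy of the data is held.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical stand-in for the storage owned by an index.
struct ToyStorage {
    std::uint64_t count = 0;
    std::uint64_t dims = 0;
    std::vector<float> values;
};

// Hypothetical toy layout: count, dims, then count * dims floats,
// all in native byte order.
inline void write_toy_vectors(
    std::ostream& os,
    std::uint64_t count,
    std::uint64_t dims,
    const std::vector<float>& values
) {
    os.write(reinterpret_cast<const char*>(&count), sizeof(count));
    os.write(reinterpret_cast<const char*>(&dims), sizeof(dims));
    os.write(
        reinterpret_cast<const char*>(values.data()),
        static_cast<std::streamsize>(values.size() * sizeof(float))
    );
}

// Hypothetical lazy loader: construction reads nothing from the stream.
class LazyStreamLoader {
  public:
    explicit LazyStreamLoader(std::istream& is) : is_{is} {}

    // Reads directly into the storage that will be returned to the caller.
    ToyStorage load() const {
        ToyStorage out;
        is_.read(reinterpret_cast<char*>(&out.count), sizeof(out.count));
        is_.read(reinterpret_cast<char*>(&out.dims), sizeof(out.dims));
        out.values.resize(out.count * out.dims);
        is_.read(
            reinterpret_cast<char*>(out.values.data()),
            static_cast<std::streamsize>(out.values.size() * sizeof(float))
        );
        return out;
    }

  private:
    std::istream& is_;
};

// Hypothetical assemble step: accepts anything with a load() method and
// triggers the read only at the point where the storage is needed.
template <typename Loader> ToyStorage toy_assemble(const Loader& loader) {
    return loader.load();
}
```

Passing `LazyStreamLoader{stream}` to `toy_assemble` defers all reads until the assemble call, which is the property the first item above relies on.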

Member

@rfsaliev rfsaliev left a comment

LGFM

@razdoburdin
Contributor Author

The integration test failure is unrelated to the content of this PR. The same failure appears in other PRs.

@razdoburdin razdoburdin merged commit 4c0f19b into intel:dev/razdoburdin_streaming Mar 6, 2026
16 of 19 checks passed
razdoburdin added a commit that referenced this pull request Apr 16, 2026
This PR adds native stream serialization to all SVS index types, as an
alternative to the existing (legacy) directory-based serialization. It
avoids filesystem round-trips of the data. Native serialization does not
require the stream to be seekable, so it introduces no additional
restrictions on the stream.
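
One common way to avoid needing a seekable stream is to prefix every section with its size, so the reader only ever moves forward. The sketch below assumes that approach; `ForwardOnlyBuf`, `write_blob`, and `read_blob` are hypothetical names, and the layout is not the actual SVS format. `ForwardOnlyBuf` does not override `seekoff` or `seekpos`, so the default `std::streambuf` behavior applies and every seek attempt fails.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>
#include <utility>

// Hypothetical input buffer that cannot seek: the default
// std::streambuf::seekoff and seekpos report failure.
class ForwardOnlyBuf : public std::streambuf {
  public:
    explicit ForwardOnlyBuf(std::string bytes) : bytes_{std::move(bytes)} {
        char* begin = bytes_.data();
        setg(begin, begin, begin + bytes_.size());
    }

  private:
    std::string bytes_;
};

// Writes a size prefix (native byte order) followed by the payload.
inline void write_blob(std::ostream& os, const std::string& payload) {
    const std::uint64_t size = payload.size();
    os.write(reinterpret_cast<const char*>(&size), sizeof(size));
    os.write(payload.data(), static_cast<std::streamsize>(payload.size()));
}

// Reads one size-prefixed payload using only forward reads.
inline std::string read_blob(std::istream& is) {
    std::uint64_t size = 0;
    is.read(reinterpret_cast<char*>(&size), sizeof(size));
    std::string payload(size, '\0');
    is.read(payload.data(), static_cast<std::streamsize>(size));
    return payload;
}
```

Two blobs written back to back with `write_blob` can be read in order with `read_blob` from a `std::istream` built on `ForwardOnlyBuf`, even though `tellg()` on that stream reports failure.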

See the following PRs for details:
#280,
#281,
#285,
#286,
#289,
#292,
#294,
#296,
#299

Main changes are:
1. A CRTP base `Archiver` extracts the binary I/O primitives (`write_size`,
`read_size`, `write_name`, `read_name`, `read_from_istream`) from
`DirectoryArchiver`. `DirectoryArchiver` and the new `StreamArchiver` class
both inherit from `Archiver`. `StreamArchiver` has its own magic number
("SVS_STRM") to distinguish native streams from directory archives.
2. The monolithic `Writer` is split via CRTP into two derived classes.
`FileWriter` owns a `std::ofstream`, writes a header, and flushes on
destruction; `StreamWriter` wraps an external `std::ostream&` and does no
header or lifecycle management. This allows `io::save(data, os)` to write
vector data directly to any stream.
3. `save(stream)` in the orchestrator `Impl` classes no longer writes to a
temporary directory and packs it. Instead it calls `impl().save(stream)`
directly.
4. Dispatch between the new (native) and old (legacy) deserialization
paths happens in the orchestrators. `Deserializer::build(is)` reads the
magic number and exposes `is_native()` to choose the path.
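
Items 1 and 4 can be sketched roughly as follows. This is an illustration under assumptions, not the SVS implementation: the `Toy*` classes, `write_magic`, the 8-byte magic width, and the `"TOY_DIR_"` legacy magic are hypothetical. Only the `"SVS_STRM"` string and the method names `write_size`, `read_size`, `write_name`, `read_name`, `build`, and `is_native` come from the text above, and their real signatures may differ.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical CRTP base: shared binary I/O primitives live here, while
// each derived archiver supplies its own magic number.
template <typename Derived> struct ToyArchiver {
    static constexpr std::size_t magic_size = 8;

    static void write_magic(std::ostream& os) {
        os.write(Derived::magic, static_cast<std::streamsize>(magic_size));
    }
    static void write_size(std::ostream& os, std::uint64_t n) {
        os.write(reinterpret_cast<const char*>(&n), sizeof(n));
    }
    static std::uint64_t read_size(std::istream& is) {
        std::uint64_t n = 0;
        is.read(reinterpret_cast<char*>(&n), sizeof(n));
        return n;
    }
    static void write_name(std::ostream& os, const std::string& name) {
        write_size(os, name.size());
        os.write(name.data(), static_cast<std::streamsize>(name.size()));
    }
    static std::string read_name(std::istream& is) {
        std::string name(read_size(is), '\0');
        is.read(name.data(), static_cast<std::streamsize>(name.size()));
        return name;
    }
};

struct ToyStreamArchiver : ToyArchiver<ToyStreamArchiver> {
    static constexpr char magic[] = "SVS_STRM"; // value quoted in the PR text
};

struct ToyDirectoryArchiver : ToyArchiver<ToyDirectoryArchiver> {
    static constexpr char magic[] = "TOY_DIR_"; // placeholder, not the real one
};

// Hypothetical dispatcher: reads the leading magic once, moving forward
// only, and records which deserialization path the caller should take.
class ToyDeserializer {
  public:
    static ToyDeserializer build(std::istream& is) {
        std::array<char, 8> head{};
        is.read(head.data(), static_cast<std::streamsize>(head.size()));
        const bool native =
            !is.fail() &&
            std::memcmp(head.data(), ToyStreamArchiver::magic, head.size()) == 0;
        return ToyDeserializer{native};
    }

    bool is_native() const { return native_; }

  private:
    explicit ToyDeserializer(bool native) : native_{native} {}
    bool native_;
};
```

An orchestrator would call `ToyDeserializer::build(is)` first and branch on `is_native()`. Because the magic is consumed by a plain forward read, this style of dispatch does not need to rewind the stream.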

---------

Co-authored-by: Dmitry Razdoburdin <drazdobu@intel.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Rafik Saliev <rafik.f.saliev@intel.com>
Co-authored-by: ethanglaser <42726565+ethanglaser@users.noreply.github.com>