Because Catalyst is optimized for 0d and 1d tensors, all tensors should be stored this way. Users can still pass nested arrays (arrays of arrays) as inputs, but the outputs should be optimized for 1d arrays. This should be the recommended output for anything above 3d tensors.
This can be done only with a more flexible interpretation of the metadata.
One concern is that the data storage as seen by SQL may differ from the interpretation seen by tensorframes. On the positive side, it will simplify the low-level operations.
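To make the storage-versus-interpretation split concrete, here is a minimal pure-Python sketch of what SQL would see (a flat 1d array) next to what the metadata would carry (a shape). The function names `flatten_row_major`, `unflatten_row_major` and `row_major_offset` are hypothetical and are not part of tensorframes. The sketch assumes rectangular inputs with non-empty dimensions.

```python
from math import prod


def flatten_row_major(nested):
    """Return (flat_values, shape) for a rectangular nested list or a scalar."""
    # Infer the shape by walking down the first element of each level.
    shape = []
    level = nested
    while isinstance(level, list):
        shape.append(len(level))
        if not level:
            break
        level = level[0]
    # Peel off one nesting level per dimension; the last axis varies fastest.
    flat = [nested]
    for _ in shape:
        flat = [value for sub in flat for value in sub]
    return flat, tuple(shape)


def unflatten_row_major(flat, shape):
    """Rebuild the nested view from the flat storage and the shape metadata."""
    if prod(shape) != len(flat):
        raise ValueError("shape does not match the number of stored values")
    if not shape:
        return flat[0]  # 0d tensor: a single scalar
    out = list(flat)
    # Group values from the innermost dimension outwards.
    for dim in reversed(shape[1:]):
        out = [out[i:i + dim] for i in range(0, len(out), dim)]
    return out


def row_major_offset(index, shape):
    """Position in the flat array of the element at a multi-dimensional index."""
    offset = 0
    for i, n in zip(index, shape):
        offset = offset * n + i
    return offset


flat, shape = flatten_row_major([[1, 2, 3], [4, 5, 6]])
# SQL would see `flat` as a plain array column; `shape` would live in metadata.
nested_again = unflatten_row_major(flat, shape)
```

With this layout, a low-level operation only ever touches a 1d array, and the nested view is recovered on demand from the shape.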
Expected modifications:
- default storage layout is row-major (with consideration for a potential column-major option)
- all operations should accept either nested arrays or flattened tensors at ingest
- all operations should output flattened tensors for tensors >= 2 dimensions -> this is a user-facing change
- analyze will be the conversion point between flattened and nested representations, with an extra option compact_storage. This option will accept either a single boolean (all numerical types), the letter 'R' (all columns compacted in Row order), or a list of column names (only these columns are compacted in Row order). A dictionary could be supported later.
- printschema will differentiate between tensors stored in 1d and n
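
The three accepted forms of compact_storage could be normalized into a set of column names before analyze does any conversion. The sketch below is only an illustration: `resolve_compact_columns` and the `numeric_by_column` mapping are hypothetical names, and it assumes that a boolean `True` selects only numerical columns while `'R'` selects every column, which is one possible reading of the description above.

```python
from typing import Dict, List, Set, Union

CompactOption = Union[bool, str, List[str]]


def resolve_compact_columns(option: CompactOption,
                            numeric_by_column: Dict[str, bool]) -> Set[str]:
    """Return the names of the columns to store as flattened, row-major arrays.

    `numeric_by_column` maps each column name to whether its type is numerical
    (a hypothetical stand-in for the real schema).
    """
    # Check bool first so that True/False are not confused with other types.
    if isinstance(option, bool):
        if not option:
            return set()
        return {name for name, is_numeric in numeric_by_column.items() if is_numeric}
    if isinstance(option, str):
        if option != "R":
            raise ValueError("only 'R' (row order) is supported for now")
        return set(numeric_by_column)
    # Otherwise, treat the option as an explicit list of column names.
    unknown = [name for name in option if name not in numeric_by_column]
    if unknown:
        raise KeyError(f"unknown columns: {unknown}")
    return set(option)


schema = {"x": True, "y": True, "label": False}
selected = resolve_compact_columns(["x"], schema)
```

Resolving the option once, up front, would keep the rest of the conversion code independent of which form the user supplied. A dictionary form could later be added as a fourth branch.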