The DictSet is problematic, particularly in how it handles simdjson objects and arrays. A Relation is faster and uses less memory. The DictSet is still useful for legacy datasets that PyArrow cannot load.
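As a rough illustration of the memory point, the sketch below compares a dictionary-per-row layout with a layout that stores the column names once and keeps each row as a tuple. It does not use the actual DictSet or Relation classes. The field names, the sample data, and the assumption that the two constructs resemble these layouts are all hypothetical.

```python
import sys

# Hypothetical sample data; the field names are illustrative only.
records = [
    {"id": i, "name": f"user{i}", "active": i % 2 == 0}
    for i in range(1000)
]

# Dictionary-per-row layout: every row carries its own hash table of keys.
dict_rows = records

# Schema-plus-tuples layout (assumed here): column names are stored once.
schema = ("id", "name", "active")
tuple_rows = [tuple(record[column] for column in schema) for record in records]


def container_bytes(rows):
    """Sum the shallow size of each row container.

    The values themselves are shared between both layouts, so they are
    deliberately excluded from the comparison.
    """
    return sum(sys.getsizeof(row) for row in rows)


dict_bytes = container_bytes(dict_rows)
tuple_bytes = container_bytes(tuple_rows)

print(f"dict rows:  {dict_bytes} bytes")
print(f"tuple rows: {tuple_bytes} bytes")
```

In CPython the tuple rows come out smaller because each dictionary carries hash-table overhead that a fixed-width tuple avoids. The exact figures vary by Python version.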