I recently posted an issue about an exception that is raised unnecessarily when chromosomes are not sorted in a specific way (though I admit sorting is not that hard). There, I was advised to use `rearrange.py` to fix the problem without much recomputation. However, when I used it, I found that half of our data went missing. A short investigation revealed that this is due to an implicit float-to-int cast when the cooler is created. This happens because `rearrange.py` does not account for the input datatype, which has to be passed to `cooler.create_cooler` explicitly via the `dtypes` argument. If this is `None`, cooler simply uses the default pixel datatype, which is `int32`. In our case, casting floats < 0 to int turns them into 0, which in turn removes part of our data from the file (part of our data is floats < 0).
I suggest determining the input pixel datatype and passing it to `cooler.create_cooler` explicitly.
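
To illustrate the suggestion, here is a minimal sketch using only NumPy. The pixel values and the helper `infer_value_dtypes` are hypothetical and not part of cooler or `rearrange.py`. It shows how a cast to `int32` discards fractional values (NumPy truncates toward zero) and how a `dtypes` mapping could be derived from the input columns instead.

```python
import numpy as np

# Hypothetical pixel table with a float-valued "count" column (illustrative data).
pixels = {
    "bin1_id": np.array([0, 0, 1], dtype=np.int64),
    "bin2_id": np.array([0, 1, 1], dtype=np.int64),
    "count": np.array([-0.75, 0.5, 2.25], dtype=np.float64),
}

# Effect of an implicit cast to int32: NumPy truncates toward zero,
# so every value with magnitude below 1 becomes 0.
as_int = pixels["count"].astype(np.int32)


def infer_value_dtypes(pixel_columns):
    """Map every value column (everything except the bin indices) to its dtype.

    Hypothetical helper, named here only for illustration.
    """
    index_cols = {"bin1_id", "bin2_id"}
    return {
        name: col.dtype
        for name, col in pixel_columns.items()
        if name not in index_cols
    }


dtypes = infer_value_dtypes(pixels)
```

The resulting mapping could then be handed to `cooler.create_cooler` through its `dtypes` argument, so that the output keeps the value type of the input rather than falling back to the default.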