
Different byte strings decode into the same UTF-8 string, is that correct? #49

@Sajjon


Hello! We at Radix are using CBOR. I just found strange behavior when encoding and decoding this UTF-8 string:
"radix.particles.message"

The CBOR playground cbor.me encodes it into the following bytes:

77
72616469782e7061727469636c65732e6d657373616765

SwiftCBOR and node-cbor both produce the same encoding.

But Java's Jackson CBOR encodes it to:

7817
72616469782e7061727469636c65732e6d657373616765

Notice the difference, the prefix 77 versus 7817.

The interesting part is that both cbor.me and SwiftCBOR decode 781772616469782e7061727469636c65732e6d657373616765 into the same UTF-8 string: "radix.particles.message".

In other words, two different byte strings decode into the same UTF-8 string.

Is that correct?

According to RFC 7049, the major type is 3, i.e. 0b011, followed by the length of the string, which is 23 in decimal, i.e. 0b10111 in binary. Together that gives 0b01110111, which is 0x77 in hex.
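
The arithmetic for the initial byte can be checked with a small sketch. This is not taken from any of the libraries mentioned here; `encode_text_shortest` is a hypothetical helper that only covers strings of up to 255 bytes, and it assumes the shortest-form rule (lengths 0 to 23 stored directly in the low five bits).

```python
def encode_text_shortest(s: str) -> bytes:
    """Hypothetical helper: encode a short text string with the shortest CBOR head."""
    payload = s.encode("utf-8")
    n = len(payload)
    major = 3  # major type 3 = text string
    if n <= 23:
        # Length fits in the 5-bit additional information field.
        head = bytes([(major << 5) | n])
    elif n <= 0xFF:
        # Additional information 24: the length follows in one extra byte.
        head = bytes([(major << 5) | 24, n])
    else:
        raise ValueError("longer strings are out of scope for this sketch")
    return head + payload


encoded = encode_text_shortest("radix.particles.message")
print(f"{encoded[0]:08b}")  # bit pattern of the initial byte
print(encoded.hex())
```

For the 23-byte string above, the initial byte comes out as 0b01110111, i.e. 0x77, matching what cbor.me produces.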

So where does 7817 come from? And why does it correctly decode into the same UTF string?
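
One possible reading, sketched under my understanding of RFC 7049 section 2.1: 0x78 is 0b01111000, which is major type 3 with additional information 24, meaning "the length is in the next byte", and that next byte is 0x17 = 23. The `decode_text` function below is a hypothetical minimal decoder written for illustration only; it handles just these two head forms and is not how any of the libraries above are implemented.

```python
def decode_text(data: bytes) -> str:
    """Hypothetical minimal decoder for a single short CBOR text string."""
    initial = data[0]
    major = initial >> 5   # top 3 bits: major type
    info = initial & 0x1F  # low 5 bits: additional information
    if major != 3:
        raise ValueError("not a text string")
    if info <= 23:
        # Length is stored directly in the initial byte.
        length, offset = info, 1
    elif info == 24:
        # Length is stored in the single byte that follows.
        length, offset = data[1], 2
    else:
        raise ValueError("longer length encodings are out of scope for this sketch")
    payload = data[offset:offset + length]
    if len(payload) != length:
        raise ValueError("truncated input")
    return payload.decode("utf-8")


PAYLOAD = "72616469782e7061727469636c65732e6d657373616765"
short_form = bytes.fromhex("77" + PAYLOAD)
long_form = bytes.fromhex("7817" + PAYLOAD)

print(decode_text(short_form))
print(decode_text(long_form))
```

Both inputs yield the same string under this sketch, which would explain why a decoder accepts either prefix. As far as I can tell, the shortest form (77) is what the canonical CBOR rules in RFC 7049 section 3.9 call for, while the longer form is still well-formed.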
