More expressive string.measure results? #15

@wingo

Right now string.measure returns a value in [0, 2^31-1] on success, and -1 otherwise. This conflates two kinds of failure-to-encode: one because of e.g. a bad USV sequence, the other because the result would be longer than 2^31 bytes (admittedly rare and even impossible on some hosts; see #12 (comment)). Should we differentiate the two cases?

We could in fact use any negative value to indicate which codepoint couldn't be encoded. A return value of -cursor could indicate the cursor after the codepoint at which the error occurred. It fits because a string with 0 codepoints can't fail, so the cursor is always at least 1 and there are only 2^31 cursor values to encode. It would still be a bit gnarly to distinguish overflow from can't-encode-codepoint, but perhaps that's OK.
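To make the idea concrete, here is a rough Python model of the -cursor encoding. It is only a sketch under assumptions: `measure_utf8`, `classify`, and the `max_len` parameter are hypothetical names made up for illustration, the string is modelled as a list of codepoints in which a surrogate stands for an unencodable value, and the cursor is assumed to count codepoints. The overflow handling (return -cursor for the codepoint that pushed the total past the limit) is one possible reading, not something the proposal specifies.

```python
MAX_LEN = 2**31 - 1  # largest successful result, per the range above


def is_surrogate(cp: int) -> bool:
    """True for values in the surrogate range, which are not USVs."""
    return 0xD800 <= cp <= 0xDFFF


def utf8_width(cp: int) -> int:
    """Number of UTF-8 bytes needed for a Unicode scalar value."""
    if cp < 0x80:
        return 1
    if cp < 0x800:
        return 2
    if cp < 0x10000:
        return 3
    return 4


def measure_utf8(codepoints, max_len=MAX_LEN):
    """Hypothetical measure: byte length on success, -cursor on failure.

    The cursor is the position *after* the offending codepoint, so it is
    always >= 1 and the failure value is always strictly negative.
    """
    total = 0
    for index, cp in enumerate(codepoints):
        cursor = index + 1
        if is_surrogate(cp):
            return -cursor  # can't-encode-codepoint
        total += utf8_width(cp)
        if total > max_len:
            return -cursor  # overflow (assumed encoding, see lead-in)
    return total


def classify(result, codepoints):
    """Recover the failure kind by re-inspecting the codepoint before the cursor."""
    if result >= 0:
        return ("ok", result)
    cursor = -result
    cp = codepoints[cursor - 1]
    kind = "unencodable" if is_surrogate(cp) else "overflow"
    return (kind, cursor)
```

The `classify` helper shows the gnarly part: with a single negative range, the caller has to look at the codepoint before the cursor again to tell the two failures apart.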

But, perhaps it's overkill!

cc @lars-t-hansen
