[jsapi] inconsistent utf-8 decoding

Two UTF-8 decoders are specified:
- Wasm's binary format name decoding: https://webassembly.github.io/spec/core/binary/values.html#binary-name.
- [`WebAssembly.Module.customSections`](https://webassembly.github.io/spec/js-api/index.html#dom-module-customsections)'s name decoding (defined in the WHATWG Encoding Standard): https://encoding.spec.whatwg.org/#utf-8.
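
To illustrate how a WHATWG-style decoder treats malformed input, here is a minimal sketch using the standard `TextDecoder` API. The byte values and the `decodeStrict` helper are assumptions chosen for illustration; `fatal: true` only approximates a decoder that rejects malformed names outright, and nothing here is taken from an engine's implementation.

```javascript
// Bytes for "na", then 0xED 0xB0 0x81 (what a naive encoder would emit for the
// lone surrogate U+DC01, which is not valid UTF-8), then "me".
const malformed = new Uint8Array([0x6e, 0x61, 0xed, 0xb0, 0x81, 0x6d, 0x65]);

// Default WHATWG behaviour: malformed bytes are replaced with U+FFFD.
const lossy = new TextDecoder("utf-8").decode(malformed);

// Strict behaviour: the same input is rejected with a TypeError.
// `decodeStrict` is a hypothetical helper; null means "malformed".
function decodeStrict(bytes) {
  try {
    return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  } catch (e) {
    return null;
  }
}

console.log(lossy.includes("\uFFFD")); // true
console.log(decodeStrict(malformed)); // null
```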

In the WHATWG algorithm, the rules
> 5. Set UTF-8 lower boundary to 0x80 and UTF-8 upper boundary to 0xBF.

> 6. Set UTF-8 code point to (UTF-8 code point << 6) | (byte & 0x3F)

are not used in Wasm.
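
The two quoted steps can be sketched by hand for a three-byte sequence. This is a simplified illustration of the WHATWG steps, not the full decoder: `decodeThreeByte` is a hypothetical helper, and the boundary adjustments for the lead bytes 0xE0 and 0xED are taken from the same WHATWG algorithm.

```javascript
// Decode a single three-byte UTF-8 sequence, following the WHATWG steps.
// `decodeThreeByte` is a hypothetical helper, not part of any spec or API.
function decodeThreeByte(b0, b1, b2) {
  if (b0 < 0xe0 || b0 > 0xef) throw new RangeError("not a three-byte lead");

  // Default boundaries for the next byte, narrowed for two special leads.
  let lower = 0x80;
  let upper = 0xbf;
  if (b0 === 0xe0) lower = 0xa0; // rejects overlong encodings
  if (b0 === 0xed) upper = 0x9f; // rejects surrogates U+D800..U+DFFF

  let codePoint = b0 & 0x0f;
  for (const byte of [b1, b2]) {
    if (byte < lower || byte > upper) {
      throw new RangeError("continuation byte out of range");
    }
    // Step 5: reset the boundaries to their defaults.
    lower = 0x80;
    upper = 0xbf;
    // Step 6: shift in the low six bits of the continuation byte.
    codePoint = (codePoint << 6) | (byte & 0x3f);
  }
  return codePoint;
}

console.log(decodeThreeByte(0xe2, 0x82, 0xac).toString(16)); // "20ac" (U+20AC)
```

With these boundaries, the bytes `0xED 0xB0 0x81` (a naive encoding of U+DC01) are rejected at the second byte, because `0xB0` exceeds the narrowed upper boundary `0x9F`.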

My understanding is that `U+DC01` and `U+FFFD` should be equal in the JS API, as tested here: https://github.com/WebAssembly/spec/blob/5aaea96eceb1e1a3956d7cbb499920e5b8c1109f/test/js-api/module/customSections.any.js#L156-L160. In Wasm, however, they would be considered two different sections, which could cause subtle mismatches.

Note that this is the only occurrence of UTF-8 decoding in the JS API spec.

	assert_sections(WebAssembly.Module.customSections(module, "na\uFFFDme"), [
	  bytes,
	]);
	assert_sections(WebAssembly.Module.customSections(module, "na\uDC01me"), [
	  bytes,
	]);
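
One way to observe the equivalence this test relies on, outside of WebAssembly, is the standard `TextEncoder` API. It cannot represent a lone surrogate such as U+DC01 in UTF-8 and emits the bytes for U+FFFD instead. The sketch below only demonstrates that replacement; `sameUtf8` is a hypothetical helper, and whether `customSections` goes through the same conversion is decided by the JS API spec, not by this snippet.

```javascript
const encoder = new TextEncoder();

// Hypothetical helper: compare two JS strings by their UTF-8 encodings.
function sameUtf8(a, b) {
  const x = encoder.encode(a);
  const y = encoder.encode(b);
  return x.length === y.length && x.every((value, i) => value === y[i]);
}

// As JS strings (sequences of UTF-16 code units) the two names differ...
console.log("na\uDC01me" === "na\uFFFDme"); // false

// ...but once encoded as UTF-8, the lone surrogate has become U+FFFD.
console.log(sameUtf8("na\uDC01me", "na\uFFFDme")); // true
```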

[jsapi] inconsistent utf-8 decoding #915

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

[jsapi] inconsistent utf-8 decoding #915

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions