perf: avoid cloning record fields during decode #711

Merged: lwshang merged 3 commits into master from perf-2-avoid-cloning-record-fields on Mar 15, 2026

@sasa-tomic sasa-tomic commented Mar 13, 2026

Contributor

Overview
Reduce per-record allocation churn during Candid record decoding.

Requirements
Preserve existing wire compatibility and keep record decoding behavior unchanged.

Solution
Track expected and wire record fields by type plus index instead of cloning field queues for each decoded record. Add a compatibility test covering backward and forward compatible record vectors.
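
To illustrate the idea, here is a minimal sketch, not the code from this PR. The names `Field`, `QueueCursor`, and `IndexCursor` are hypothetical and the real decoder's types differ; the sketch only contrasts cloning a field queue per record with sharing the type metadata and advancing an index.

```rust
use std::collections::VecDeque;
use std::rc::Rc;

/// Hypothetical field descriptor: a field id plus a type name.
#[derive(Clone, Debug, PartialEq)]
struct Field {
    id: u32,
    ty: &'static str,
}

/// Before (assumed shape): every decoded record owns a cloned queue of fields.
struct QueueCursor {
    fields: VecDeque<Field>,
}

impl QueueCursor {
    fn new(record_type: &[Field]) -> Self {
        // Allocates a new queue and copies every field for every record value.
        Self { fields: record_type.iter().cloned().collect() }
    }

    fn next_field(&mut self) -> Option<Field> {
        self.fields.pop_front()
    }
}

/// After (assumed shape): every decoded record shares the type metadata
/// and keeps only an index into it.
struct IndexCursor {
    record_type: Rc<[Field]>,
    next: usize,
}

impl IndexCursor {
    fn new(record_type: &Rc<[Field]>) -> Self {
        // Rc::clone bumps a reference count; no field is copied.
        Self { record_type: Rc::clone(record_type), next: 0 }
    }

    fn next_field(&mut self) -> Option<&Field> {
        let field = self.record_type.get(self.next)?;
        self.next += 1;
        Some(field)
    }
}
```

When decoding a vector of N records of the same type, `QueueCursor::new` copies the field list N times, whereas `IndexCursor::new` only increments a reference count N times.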

Considerations
This keeps the wire format unchanged and is intended to remain fully compatible with existing Candid data. Series-level benchmark context is tracked in #710.
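
For readers unfamiliar with what "backward and forward compatible" means for records, the following is a simplified sketch under stated assumptions, not the decoder's actual logic. It assumes both field lists are sorted by ascending field id and that a missing expected field is acceptable only when it is optional; `FieldSpec`, `Step`, and `plan_record` are hypothetical names. Note that it walks both lists with two indices and never clones them.

```rust
/// Hypothetical, simplified field descriptor; ids are assumed sorted ascending.
#[derive(Clone, Copy, Debug, PartialEq)]
struct FieldSpec {
    id: u32,
    optional: bool,
}

#[derive(Debug, PartialEq)]
enum Step {
    /// Field present in both the expected type and the wire type.
    Decode(u32),
    /// Field only on the wire (newer sender, older reader): skip it.
    SkipWire(u32),
    /// Optional field only in the expected type (older sender, newer reader).
    DefaultNull(u32),
}

/// Walk both field lists with two indices and decide what to do per field.
fn plan_record(expected: &[FieldSpec], wire: &[FieldSpec]) -> Result<Vec<Step>, String> {
    let (mut e, mut w) = (0usize, 0usize);
    let mut steps = Vec::new();
    while e < expected.len() || w < wire.len() {
        match (expected.get(e), wire.get(w)) {
            (Some(ef), Some(wf)) if ef.id == wf.id => {
                steps.push(Step::Decode(ef.id));
                e += 1;
                w += 1;
            }
            (Some(ef), Some(wf)) if ef.id > wf.id => {
                steps.push(Step::SkipWire(wf.id));
                w += 1;
            }
            (None, Some(wf)) => {
                steps.push(Step::SkipWire(wf.id));
                w += 1;
            }
            (Some(ef), _) => {
                // The expected field is absent from the wire data.
                if !ef.optional {
                    return Err(format!("missing required field {}", ef.id));
                }
                steps.push(Step::DefaultNull(ef.id));
                e += 1;
            }
            (None, None) => unreachable!("loop condition guarantees a remaining field"),
        }
    }
    Ok(steps)
}
```

For example, with expected ids 1, 2 (optional), 4 and wire ids 1, 3, 4, the plan decodes 1, defaults 2, skips 3, and decodes 4.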

Track record decode progress with shared type metadata and indices so struct and tuple decoding stop rebuilding field queues for every value.
@sasa-tomic sasa-tomic requested a review from a team as a code owner March 13, 2026 11:39
github-actions Bot commented Mar 13, 2026

| Name | Max Mem (Kb) | Encode | Decode |
|------|--------------|--------|--------|
| blob | 4_224 | 4_207_487 | 2_122_432 |
| btreemap | 73_856 | 531_975_925 ($\textcolor{green}{-0.00\%}$) | 12_984_691_960 ($\textcolor{green}{-0.43\%}$) |
| nns | 192 | 2_021_253 | 5_663_909 ($\textcolor{green}{-0.90\%}$) |
| nns_list_proposal | 1_216 | 7_017_181 ($\textcolor{red}{0.09\%}$) | 64_183_554 ($\textcolor{green}{-5.36\%}$) |
| option_list | 64 ($\textcolor{green}{-50.00\%}$) | 716_007 ($\textcolor{green}{-0.00\%}$) | 21_855_613 ($\textcolor{green}{-7.19\%}$) |
| text | 6_336 | 4_204_384 | 7_877_759 |
| variant_list | 64 ($\textcolor{green}{-50.00\%}$) | 710_989 ($\textcolor{green}{-0.01\%}$) | 20_436_541 ($\textcolor{green}{-8.09\%}$) |
| vec_int16 | 16_704 | 123_694_298 | 998_268_455 ($\textcolor{green}{-1.65\%}$) |

  • Parser cost: 17_069_949
  • Extra args: 2_851_812 ($\textcolor{green}{-16.53\%}$)
Raw report:
---------------------------------------------------

Benchmark: blob
  total:
    instructions: 6.33 M (no change)
    heap_increase: 66 pages (no change)
    stable_memory_increase: 0 pages (no change)

  1. Encoding (scope):
    calls: 1 (no change)
    instructions: 4.21 M (no change)
    heap_increase: 66 pages (no change)
    stable_memory_increase: 0 pages (no change)

  2. Decoding (scope):
    calls: 1 (no change)
    instructions: 2.12 M (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: btreemap
  total:
    instructions: 13.52 B (-0.41%) (change within noise threshold)
    heap_increase: 1154 pages (no change)
    stable_memory_increase: 0 pages (no change)

  1. Encoding (scope):
    calls: 1 (no change)
    instructions: 531.98 M (-0.00%) (change within noise threshold)
    heap_increase: 159 pages (no change)
    stable_memory_increase: 0 pages (no change)

  2. Decoding (scope):
    calls: 1 (no change)
    instructions: 12.98 B (-0.43%) (change within noise threshold)
    heap_increase: 995 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: extra_args
  total:
    instructions: 2.85 M (improved by 16.53%)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: nns
  total:
    instructions: 25.59 M (-0.20%) (change within noise threshold)
    heap_increase: 3 pages (no change)
    stable_memory_increase: 0 pages (no change)

  0. Parsing (scope):
    calls: 1 (no change)
    instructions: 17.07 M (no change)
    heap_increase: 3 pages (no change)
    stable_memory_increase: 0 pages (no change)

  1. Encoding (scope):
    calls: 1 (no change)
    instructions: 2.02 M (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

  2. Decoding (scope):
    calls: 1 (no change)
    instructions: 5.66 M (-0.90%) (change within noise threshold)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: nns_list_proposal
  total:
    instructions: 71.20 M (improved by 4.85%)
    heap_increase: 19 pages (no change)
    stable_memory_increase: 0 pages (no change)

  1. Encoding (scope):
    calls: 1 (no change)
    instructions: 7.02 M (0.09%) (change within noise threshold)
    heap_increase: 5 pages (no change)
    stable_memory_increase: 0 pages (no change)

  2. Decoding (scope):
    calls: 1 (no change)
    instructions: 64.18 M (improved by 5.36%)
    heap_increase: 14 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: option_list
  total:
    instructions: 22.57 M (improved by 6.97%)
    heap_increase: 1 pages (improved by 50.00%)
    stable_memory_increase: 0 pages (no change)

  1. Encoding (scope):
    calls: 1 (no change)
    instructions: 716.01 K (-0.00%) (change within noise threshold)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

  2. Decoding (scope):
    calls: 1 (no change)
    instructions: 21.86 M (improved by 7.19%)
    heap_increase: 1 pages (improved by 50.00%)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: text
  total:
    instructions: 12.08 M (no change)
    heap_increase: 99 pages (no change)
    stable_memory_increase: 0 pages (no change)

  1. Encoding (scope):
    calls: 1 (no change)
    instructions: 4.20 M (no change)
    heap_increase: 66 pages (no change)
    stable_memory_increase: 0 pages (no change)

  2. Decoding (scope):
    calls: 1 (no change)
    instructions: 7.88 M (no change)
    heap_increase: 33 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: variant_list
  total:
    instructions: 21.15 M (improved by 7.84%)
    heap_increase: 1 pages (improved by 50.00%)
    stable_memory_increase: 0 pages (no change)

  1. Encoding (scope):
    calls: 1 (no change)
    instructions: 710.99 K (-0.01%) (change within noise threshold)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

  2. Decoding (scope):
    calls: 1 (no change)
    instructions: 20.44 M (improved by 8.09%)
    heap_increase: 1 pages (improved by 50.00%)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Benchmark: vec_int16
  total:
    instructions: 1.12 B (-1.47%) (change within noise threshold)
    heap_increase: 261 pages (no change)
    stable_memory_increase: 0 pages (no change)

  1. Encoding (scope):
    calls: 1 (no change)
    instructions: 123.69 M (no change)
    heap_increase: 261 pages (no change)
    stable_memory_increase: 0 pages (no change)

  2. Decoding (scope):
    calls: 1 (no change)
    instructions: 998.27 M (-1.65%) (change within noise threshold)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)

---------------------------------------------------

Summary:
  instructions:
    status:   Improvements detected 🟢
    counts:   [total 9 | regressed 0 | improved 4 | new 0 | unchanged 5]
    change:   [max 0 | p75 -52.20K | median -1.69M | p25 -3.63M | min -55.57M]
    change %: [max 0.00% | p75 -0.20% | median -1.47% | p25 -6.97% | min -16.53%]

  heap_increase:
    status:   Improvements detected 🟢
    counts:   [total 9 | regressed 0 | improved 2 | new 0 | unchanged 7]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min -1]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min -50.00%]

  stable_memory_increase:
    status:   No significant changes 👍
    counts:   [total 9 | regressed 0 | improved 0 | new 0 | unchanged 9]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

---------------------------------------------------

Only significant changes:
| status | name                           | calls |    ins |  ins Δ% | HI |   HI Δ% | SMI |  SMI Δ% |
|--------|--------------------------------|-------|--------|---------|----|---------|-----|---------|
|   -    | nns_list_proposal              |       | 71.20M |  -4.85% | 19 |   0.00% |   0 |   0.00% |
|   -    | nns_list_proposal::2. Decoding |     1 | 64.18M |  -5.36% | 14 |   0.00% |   0 |   0.00% |
|   -    | option_list                    |       | 22.57M |  -6.97% |  1 | -50.00% |   0 |   0.00% |
|   -    | option_list::2. Decoding       |     1 | 21.86M |  -7.19% |  1 | -50.00% |   0 |   0.00% |
|   -    | variant_list                   |       | 21.15M |  -7.84% |  1 | -50.00% |   0 |   0.00% |
|   -    | variant_list::2. Decoding      |     1 | 20.44M |  -8.09% |  1 | -50.00% |   0 |   0.00% |
|   -    | extra_args                     |       |  2.85M | -16.53% |  0 |   0.00% |   0 |   0.00% |

ins = instructions, HI = heap_increase, SMI = stable_memory_increase, Δ% = percent change

---------------------------------------------------
Successfully persisted results to canbench_results.yml
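
As a reading aid for the numbers above, this small sketch shows how the report's figures relate. It relies on two facts: a Wasm page is 64 KiB, so the "Max Mem (Kb)" column equals the total `heap_increase` in pages times 64, and a percent change can be inverted to recover the baseline. The function names are hypothetical and are not part of canbench.

```rust
/// Convert Wasm pages to KiB (one Wasm page is 64 KiB).
fn pages_to_kib(pages: u64) -> u64 {
    pages * 64
}

/// Recover the baseline value from a new measurement and its percent change,
/// e.g. a new value of 1 page at -50.00% implies a baseline of 2 pages.
fn baseline(new_value: f64, percent_change: f64) -> f64 {
    new_value / (1.0 + percent_change / 100.0)
}
```

For instance, `pages_to_kib(66)` matches the 4_224 Kb reported for `blob`, and `baseline(1.0, -50.0)` shows that `option_list` and `variant_list` previously grew the heap by 2 pages.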

@lwshang lwshang merged commit 3149c13 into master Mar 15, 2026
11 checks passed
@lwshang lwshang deleted the perf-2-avoid-cloning-record-fields branch March 15, 2026 17:03