ZJIT: lobsters perf burndown #833

@tekknolagi

This issue maintains a list of TODOs for speeding up the lobsters benchmark.

In Progress

Fallbacks

Exits

Action items

Things that are already actionable:

Fallbacks

Exits

Backlog

Fallbacks

  • send_without_block_polymorphic: We need to add proper polymorphic call support in HIR.
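
For illustration only, here is a minimal Ruby sketch of a polymorphic call site. The classes `Story` and `Comment` and the method `describe` are hypothetical, not taken from lobsters. The sketch assumes a call site counts as polymorphic once profiling has seen more than one receiver class there.

```ruby
# Hypothetical classes; the names are illustrative, not from lobsters.
class Story
  def title
    "story"
  end
end

class Comment
  def title
    "comment"
  end
end

# The single `item.title` call site below receives two different classes,
# so a profile of it would record more than one receiver class
# (assumption: this is the situation the polymorphic counter describes).
def describe(item)
  item.title
end

results = [Story.new, Comment.new].map { |item| describe(item) }
p results
```

One possible lowering for such a site is a short chain of class guards, each followed by a direct call, with the generic send as the final fallback. Whether HIR ends up doing this is not stated in this issue.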

Exits

  • unhandled_hir_insn invokebuiltin: We need to update the backend to support CCall with 6+ args.
  • unhandled_hir_insn throw: Because we use call/ret, we need to implement it differently from YJIT.
  • compile_error exception_handler: Similarly, because of call/ret, this needs to be implemented differently from YJIT.
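
A small Ruby sketch of source shapes that could lead to these exits. In CRuby, `break` out of a block compiles to the YARV `throw` instruction, and a method with a `rescue` clause carries an exception handler. The method names are hypothetical. That lobsters hits these exact shapes, and that `exception_handler` covers `rescue`, are assumptions.

```ruby
# Hypothetical helpers, for illustration only.

# `break` inside a block is compiled to a `throw` instruction in CRuby,
# because control has to unwind out of the block's frame.
def first_negative(numbers)
  numbers.each do |n|
    break n if n < 0
  end
end

# A `rescue` clause gives the method an exception handler
# (assumption: this is the kind of code the exception_handler counter covers).
def safe_div(a, b)
  a / b
rescue ZeroDivisionError
  nil
end

p first_negative([3, -1, 2])
p safe_div(6, 3)
p safe_div(1, 0)
```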

ZJIT stats

As of 2026-02-25:

***ZJIT: Printing ZJIT statistics on exit***
Top-20 not inlined C methods (57.7% of total 10,973,558):
                                               Hash#fetch: 2,353,072 (21.4%)
                                                Hash#key?:   363,731 ( 3.3%)
                                            Regexp#match?:   359,073 ( 3.3%)
                                       String#start_with?:   347,091 ( 3.2%)
                                              Hash#delete:   301,592 ( 2.7%)
                                    ERB::Util.html_escape:   273,864 ( 2.5%)
                                              String#sub!:   226,516 ( 2.1%)
                               ObjectSpace::WeakKeyMap#[]:   217,047 ( 2.0%)
                                             Set#include?:   211,700 ( 1.9%)
                                       Kernel#respond_to?:   204,384 ( 1.9%)
                                                String#<<:   190,554 ( 1.7%)
                                               String.new:   167,017 ( 1.5%)
                                              Integer#===:   165,219 ( 1.5%)
                                             Kernel#is_a?:   162,376 ( 1.5%)
                                           Class#allocate:   152,819 ( 1.4%)
                                    Process.clock_gettime:   141,859 ( 1.3%)
                                         Class#superclass:   133,637 ( 1.2%)
                                         Symbol#end_with?:   123,152 ( 1.1%)
                                               Kernel#dup:   119,305 ( 1.1%)
                                                  Hash#[]:   114,801 ( 1.0%)
Top-20 calls to C functions from JIT code (77.3% of total 64,267,029):
                             rb_vm_opt_send_without_block: 11,939,850 (18.6%)
                                             rb_hash_aref:  5,400,094 ( 8.4%)
                                        rb_vm_invokeblock:  4,453,370 ( 6.9%)
                     rb_zjit_writebarrier_check_immediate:  4,279,888 ( 6.7%)
                                rb_vm_getinstancevariable:  3,504,925 ( 5.5%)
                                               rb_vm_send:  3,058,847 ( 4.8%)
                           rb_ivar_get_at_no_ractor_check:  2,864,766 ( 4.5%)
                                        rb_obj_is_kind_of:  2,313,479 ( 3.6%)
                                             rb_hash_aset:  1,903,359 ( 3.0%)
                                               Hash#fetch:  1,639,937 ( 2.6%)
                                rb_vm_setinstancevariable:  1,596,808 ( 2.5%)
                               rb_vm_opt_getconstant_path:  1,328,761 ( 2.1%)
                                          rb_jit_ary_push:    960,570 ( 1.5%)
                                rb_ec_ary_new_from_values:    722,923 ( 1.1%)
                               rb_class_allocate_instance:    721,492 ( 1.1%)
                                                    fetch:    713,135 ( 1.1%)
                                        rb_str_buf_append:    667,547 ( 1.0%)
                                              rb_ivar_get:    585,817 ( 0.9%)
                                    rb_hash_new_with_size:    520,347 ( 0.8%)
                                        rb_vm_sendforward:    479,029 ( 0.7%)
Top-1 not optimized method types for send (100.0% of total 1,340):
  null: 1,340 (100.0%)
Top-3 not optimized method types for send_without_block (100.0% of total 415,535):
        optimized_send: 386,265 (93.0%)
                  null:  27,025 ( 6.5%)
  optimized_block_call:   2,245 ( 0.5%)
Top-1 not optimized method types for super (100.0% of total 1,860):
  attrset: 1,860 (100.0%)
Top-4 instructions with uncategorized fallback reason (100.0% of total 4,999,830):
             invokeblock: 4,453,370 (89.1%)
             sendforward:   479,029 ( 9.6%)
      invokesuperforward:    49,505 ( 1.0%)
  opt_send_without_block:    17,926 ( 0.4%)
Top-20 send fallback reasons (100.0% of total 20,184,609):
                                    singleton_class_seen: 5,718,129 (28.3%)
                                           uncategorized: 4,999,830 (24.8%)
                          send_without_block_no_profiles: 3,034,295 (15.0%)
                          send_without_block_polymorphic: 1,927,420 ( 9.5%)
                                        send_no_profiles: 1,725,095 ( 8.5%)
                            one_or_more_complex_arg_pass:   987,488 ( 4.9%)
                          send_without_block_megamorphic:   391,590 ( 1.9%)
  send_without_block_not_optimized_method_type_optimized:   388,510 ( 1.9%)
                                        send_polymorphic:   274,376 ( 1.4%)
                                   too_many_args_for_lir:   212,820 ( 1.1%)
        send_without_block_not_optimized_need_permission:   202,957 ( 1.0%)
                                       super_polymorphic:   108,032 ( 0.5%)
                                 super_complex_args_pass:    47,297 ( 0.2%)
                                     argc_param_mismatch:    30,214 ( 0.1%)
                                        super_from_block:    27,886 ( 0.1%)
            send_without_block_not_optimized_method_type:    27,025 ( 0.1%)
                                obj_to_string_not_string:    25,666 ( 0.1%)
              send_without_block_direct_keyword_mismatch:    19,358 ( 0.1%)
                          super_target_complex_args_pass:    14,388 ( 0.1%)
                                        send_megamorphic:    13,701 ( 0.1%)
Top-4 setivar fallback reasons (100.0% of total 1,797,706):
            not_monomorphic: 1,690,283 (94.0%)
               not_t_object:    60,891 ( 3.4%)
                too_complex:    46,511 ( 2.6%)
  new_shape_needs_extension:        21 ( 0.0%)
Top-2 getivar fallback reasons (100.0% of total 4,090,742):
  not_monomorphic: 3,976,450 (97.2%)
      too_complex:   114,292 ( 2.8%)
Top-3 definedivar fallback reasons (100.0% of total 260,479):
  not_monomorphic: 256,435 (98.4%)
      too_complex:   2,444 ( 0.9%)
     not_t_object:   1,600 ( 0.6%)
Top-6 invokeblock handler (100.0% of total 4,453,370):
        polymorphic: 2,832,028 (63.6%)
  monomorphic_other:   931,200 (20.9%)
   monomorphic_iseq:   501,716 (11.3%)
        no_profiles:   134,639 ( 3.0%)
  monomorphic_ifunc:    49,772 ( 1.1%)
        megamorphic:     4,015 ( 0.1%)
Top-6 getblockparamproxy handler (100.0% of total 1,804,066):
  polymorphic: 1,268,283 (70.3%)
          nil:   286,526 (15.9%)
         iseq:   139,040 ( 7.7%)
  no_profiles:    87,524 ( 4.9%)
         proc:    19,306 ( 1.1%)
  megamorphic:     3,387 ( 0.2%)
Top-8 popular complex argument-parameter features not optimized (100.0% of total 1,150,590):
        param_block: 393,915 (34.2%)
  param_forwardable: 390,995 (34.0%)
         param_rest: 184,229 (16.0%)
       param_kwrest:  73,021 ( 6.3%)
    caller_kw_splat:  43,000 ( 3.7%)
       caller_splat:  29,522 ( 2.6%)
       caller_kwarg:  18,648 ( 1.6%)
    caller_blockarg:  17,260 ( 1.5%)
Top-1 compile error reasons (100.0% of total 93,876):
  exception_handler: 93,876 (100.0%)
Top-5 unhandled YARV insns (100.0% of total 47,214):
        splatkw: 44,594 (94.5%)
  setblockparam:  1,355 ( 2.9%)
     checkmatch:    929 ( 2.0%)
           once:    171 ( 0.4%)
    expandarray:    165 ( 0.3%)
Top-3 unhandled HIR insns (100.0% of total 132,978):
          throw: 95,915 (72.1%)
  invokebuiltin: 35,772 (26.9%)
      array_max:  1,291 ( 1.0%)
Top-20 side exit reasons (100.0% of total 8,029,259):
                  guard_shape_failure: 3,748,353 (46.7%)
                   guard_type_failure: 3,605,031 (44.9%)
  block_param_proxy_not_iseq_or_ifunc:   279,732 ( 3.5%)
                   unhandled_hir_insn:   132,978 ( 1.7%)
                        compile_error:    93,876 ( 1.2%)
                  unhandled_yarv_insn:    47,214 ( 0.6%)
            block_param_proxy_not_nil:    38,242 ( 0.5%)
                 fixnum_mult_overflow:    24,622 ( 0.3%)
     patchpoint_stable_constant_names:    19,630 ( 0.2%)
           block_param_proxy_modified:    12,337 ( 0.2%)
               fixnum_lshift_overflow:    10,085 ( 0.1%)
         unhandled_newarray_send_pack:     6,954 ( 0.1%)
                  unhandled_block_arg:     6,815 ( 0.1%)
        patchpoint_no_singleton_class:     1,131 ( 0.0%)
          patchpoint_method_redefined:       979 ( 0.0%)
             guard_greater_eq_failure:       570 ( 0.0%)
               obj_to_string_fallback:       274 ( 0.0%)
             guard_super_method_entry:       220 ( 0.0%)
                   guard_less_failure:       163 ( 0.0%)
                            interrupt:        52 ( 0.0%)
                             send_count: 99,620,043
                     dynamic_send_count: 20,184,609 (20.3%)
                   optimized_send_count: 79,435,434 (79.7%)
                  dynamic_setivar_count:  1,797,706 ( 1.8%)
                  dynamic_getivar_count:  4,090,742 ( 4.1%)
              dynamic_definedivar_count:    260,479 ( 0.3%)
              iseq_optimized_send_count: 28,366,772 (28.5%)
      inline_cfunc_optimized_send_count: 37,232,136 (37.4%)
       inline_iseq_optimized_send_count:  2,425,838 ( 2.4%)
non_variadic_cfunc_optimized_send_count:  6,140,799 ( 6.2%)
    variadic_cfunc_optimized_send_count:  5,269,889 ( 5.3%)
compiled_iseq_count:                               5,526
failed_iseq_count:                                     0
compile_time:                                    1,400ms
compile_side_exit_time:                             75ms
compile_side_exit_time_ratio:                       5.4%
compile_hir_time:                                  460ms
compile_hir_build_time:                            189ms
compile_hir_strength_reduce_time:                  170ms
compile_hir_fold_constants_time:                    22ms
compile_hir_clean_cfg_time:                         23ms
compile_hir_eliminate_dead_code_time:               27ms
compile_lir_time:                                  888ms
profile_time:                                       11ms
gc_time:                                           201ms
invalidation_time:                                  14ms
vm_write_pc_count:                            76,958,995
vm_write_sp_count:                            76,958,995
vm_write_locals_count:                        74,773,507
vm_write_stack_count:                         74,773,507
vm_write_to_parent_iseq_local_count:             447,921
vm_read_from_parent_iseq_local_count:                  0
guard_type_count:                             87,956,697
guard_type_exit_ratio:                              4.1%
guard_shape_count:                            36,211,822
guard_shape_exit_ratio:                            10.4%
side_exit_size:                               10,030,896
code_region_bytes:                            30,769,152
side_exit_size_ratio:                              32.6%
zjit_alloc_bytes:                             39,448,675
total_mem_bytes:                              70,217,827
side_exit_count:                               8,029,259
total_insn_count:                            565,072,433
vm_insn_count:                                95,586,394
zjit_insn_count:                             469,486,039
ratio_in_zjit:                                     83.1%
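
The percentage lines in the dump above look like simple quotients of the raw counters. The Ruby sketch below checks that reading. `pct` is a hypothetical helper, and the numerator/denominator pairings are inferred from the numbers in this dump, not from the ZJIT source.

```ruby
# Hypothetical helper: percentage of `part` in `whole`, rounded to one decimal.
def pct(part, whole)
  (100.0 * part / whole).round(1)
end

# Raw counters copied from the stats dump above.
send_count          = 99_620_043
dynamic_send_count  = 20_184_609
total_insn_count    = 565_072_433
zjit_insn_count     = 469_486_039
guard_shape_count   = 36_211_822
guard_shape_failure = 3_748_353
guard_type_count    = 87_956_697
guard_type_failure  = 3_605_031
side_exit_size      = 10_030_896
code_region_bytes   = 30_769_152

# Assumed pairings (inferred, not confirmed by the source).
ratios = {
  dynamic_send:          pct(dynamic_send_count, send_count),
  ratio_in_zjit:         pct(zjit_insn_count, total_insn_count),
  guard_shape_exit:      pct(guard_shape_failure, guard_shape_count),
  guard_type_exit:       pct(guard_type_failure, guard_type_count),
  side_exit_size_ratio:  pct(side_exit_size, code_region_bytes)
}

ratios.each { |name, value| puts "#{name}: #{value}%" }
```

Under these pairings the computed values match the reported 20.3%, 83.1%, 10.4%, 4.1%, and 32.6%. That suggests the two guard exit ratios are the corresponding side exit counts divided by the guard execution counts.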
