Skip to content

Conversation

@corydolphin
Copy link
Contributor

@corydolphin corydolphin commented Jan 5, 2026

Hello @Cito,

We make use of GraphQL-core (via graphene as our definition layer) at Nextdoor, and a core bottleneck for us is that query parsing is slow, and query serialization/deserializationperformance is even slower than parsing which makes caching remote-caching the queries using pickle provide no advantage (this changes dependent on python version and serialization protocol).

I spent some time over the holidays profiling and working on performance improvements and would like to get your input.

I have a proposal for a series of changes, which I have made as a series of logically independent commits with a goal of providing the ability to serialize a query using Messagepack via the msgspec library, which provides highly optimized serialization as well as struct-like objects. This new serialization is exposed on the DocumentNode as: to_bytes_unstable(self) -> bytes and from_bytes_unstable(cls, data: bytes) -> DocumentNode to indicate the evolving nature, and potential for version incompatibility. I can plan to break this up into a series of PRS so the are easier to review and merge independently, but right now they are structured as commits with the goal that each commit can be landed individually (and should pass all tests etc).

Downsides:
Msgspec does not support PyPy, so we'd need to either drop support for PyPy, or I can work on providing the msgspec variant as an extras. I believe it should be possible to support both simultaneously without duplication.

Alternatives
We can get material gains by converting to Dataclasses (mostly due to slowness in our current setattr). We could then conditionally make use of msgspec structs if the library is available.
Additionally and separately, we can make parsing roughly twice as fast by compiling a mypyc extension due to the overhead of function calls in parsing loops. We could build a wheel such that this is only used for non pypy users. As future work, Using Mypyc is likely a big lever for speeding up the query execution codepaths.

I plan to run this fork in production at Nextdoor to verify these changes have the impact level I expect with our specific workload, but based upon introduced performance tests, the current results are substantial:

General performance: (requires only the classes use msgspec.Structs, no reliance on serialization protocol)

Performance improvement on large query (~117KB):
- Parse:         15.4ms (was 81ms, 5.3x faster)
- Pickle encode:  5.2ms (was 24ms, 4.6x faster)
- Pickle decode:  7.5ms (was 42ms, 5.6x faster)

With the final addition of msgpack serialization protocol.

Serialization Performance on large query (~117KB, no locations):
  Msgpack Size:        124KB vs 383KB pickle (3.1x smaller)
  Msgpack Serialize:   0.4ms vs 5.2ms pickle (13x faster)
  Msgpack Deserialize: 1.6ms vs 7.5ms pickle (4.7x faster)

Performance vs parsing:
  Msgpack Deserialize: 1.6ms vs 15.4ms parse (10x faster)

Introduce benchmarks using a large (~117KB) GraphQL query to measure
parse and pickle serialization performance. These provide a baseline
for comparing serialization approaches in subsequent commits.

Baseline performance (measrued on an m4 max mac):
- Parse:         81ms
- Pickle encode: 24ms
- Pickle decode: 42ms
- Roundtrip:     71ms
Prepares AST for immutability by using tuples instead of lists for
collection fields. This aligns with the JavaScript GraphQL library
which uses readonly arrays, and enables future frozen datastructures.
…andling

Python 3.9 reaches end-of-life October 2025. Python 3.10 adoption is
now mainstream - major frameworks (strawberry, Django 5.0, FastAPI) require it,
and 3.10+ accounts for >70% of PyPI downloads.

This enables modern Python features:
- Union types with `|` syntax (PEP 604)
- isinstance() with union types directly
- match statements for pattern matching

Also prepares for msgspec dependency which requires Python 3.9+ (and
we target 3.10+ for the union type features it enables).

Removes CI jobs for Python 3.7-3.9.
Replaces deep class inheritance with TypeAlias unions, matching the
pattern used in graphql-js. This enables:
- Direct isinstance() checks with union types (Python 3.10+)
- Simpler type definitions without intermediate base classes
- Better alignment with the JavaScript reference implementation

Intermediate classes like ExecutableDefinitionNode are now TypeAliases
rather than base classes, simplifying the inheritance hierarchy while
maintaining type safety.

Test changes: Removed type_system_definition_node and nullability_assertion_node
from predicate tests because these were abstract base classes that no longer
exist. The predicates now check against union types directly.
Modifies the AST visitor to use copy-on-write semantics when applying
edits. Instead of mutating nodes in place, the visitor now creates new
node instances with the edited values. This prepares for frozen AST
nodes while maintaining backwards compatibility.

The visitor accumulates edits and applies them by constructing new
nodes, enabling the transition to immutable data structures.
Tests were using incomplete node construction that relied on
default None values. This change makes tests more representative
in preparation for non-nullable required fields.
…formance improvement

Replace dataclass-based AST nodes with msgspec.Struct for better
performance and memory efficiency.

Performance improvement on large query (~117KB):
- Parse:         15.4ms (was 81ms, 5.3x faster)
- Pickle encode:  5.2ms (was 24ms, 4.6x faster)
- Pickle decode:  7.5ms (was 42ms, 5.6x faster)
…aster than pickle)

Add to_bytes_unstable() and from_bytes_unstable() methods to DocumentNode
for fast binary serialization using msgpack.

Performance on large query (~117KB, no locations):
  Size:        124KB vs 383KB pickle (3.1x smaller)
  Serialize:   0.4ms vs 5.2ms pickle (13x faster)
  Deserialize: 1.6ms vs 7.5ms pickle (4.7x faster)

Performance vs parsing:
  Deserialize: 1.6ms vs 15.4ms parse (10x faster)

Warning: The serialization format is unstable and may change between
versions. Use only for short-lived caches or same-version IPC.
@corydolphin corydolphin requested a review from Cito as a code owner January 5, 2026 00:29
patrick91 added a commit to strawberry-graphql/strawberry that referenced this pull request Jan 5, 2026
Test graphql-core PR #249 which introduces significant performance
improvements using msgspec.Struct for AST nodes. The PR claims:
- 5.3x faster parsing
- 13x faster serialization with msgpack
- 4.7x faster deserialization

See: graphql-python/graphql-core#249
@corydolphin
Copy link
Contributor Author

corydolphin commented Jan 5, 2026

Hey @patrick91 thank you for taking a look. I'll look at your strawberry PR and see if I can figure out why the results don't reproduce. My initial thought is that the queries in the benchmarks are too small for parse time to matter, and the end to end execution time is entirely dominated by execution. I'll look into it to make sure the results I shared aren't purely synthetic.

@patrick91
Copy link
Member

@corydolphin thank you! yes I don't think we have a complex query example, if you have one you can share I'd happy to include in our test suite 😊

@corydolphin
Copy link
Contributor Author

I'm also exploring a slightly different approach which might make it easier to support PyPy concurrently, creating an intermediary class decorator which will allow us to create Msgspec based structs or dataclasses-backed objects. I am a little bit worried about the complexity, but I am exploring to reduce the scope of the change.

Since you are familiar with this ecosystem far better than I am, I'd love your advice for the best way to proceed. If you are comfortable and supportive of the other cleanup changes (excluding what I expect to be the more controversial removal of PyPy and migration to msgspec.structs), I can separate those into a series of stacked PRs.

I also noticed that graphene cannot upgrade to the pre-release version of graphql-core due to a subtle change in subscriptions.
commit 689a70d (“refactor: move subscribe code to execute file”, 2022-11-20) rewrote subscribe as a regular function that can return an AsyncIterator directly, rather than an async def that always returned an awaitable.

@patrick91
Copy link
Member

I'm also exploring a slightly different approach which might make it easier to support PyPy concurrently, creating an intermediary class decorator which will allow us to create Msgspec based structs or dataclasses-backed objects. I am a little bit worried about the complexity, but I am exploring to reduce the scope of the change.

Since you are familiar with this ecosystem far better than I am, I'd love your advice for the best way to proceed. If you are comfortable and supportive of the other cleanup changes (excluding what I expect to be the more controversial removal of PyPy and migration to msgspec.structs), I can separate those into a series of stacked PRs.

For GraphQL-core I'm not fully sure to be honest, I know Cito wants it to follow GraphQL.js closely. One option could be to make a custom executor? not sure if that would be enough

I'd definitely suggest to keep the PR as small as possible, I see changes in linting related stuff (I guess for the removal of python 3.9), that's going to make this PR more difficult to review

Also, did you consider parsing in Rust? Erik did PoC here: https://github.com/erikwrede/rustberry

I also noticed that graphene cannot upgrade to the pre-release version of graphql-core due to a subtle change in subscriptions. commit 689a70d (“refactor: move subscribe code to execute file”, 2022-11-20) rewrote subscribe as a regular function that can return an AsyncIterator directly, rather than an async def that always returned an awaitable.

maybe @erikwrede can check that? 😊

@Cito
Copy link
Member

Cito commented Jan 6, 2026

Hi @corydolphin. Thanks for sharing your ideas and this PR.

As you probably know, my main goal is to be aligned with and stay on track with current development of GraphQL-js, and I have already fallen behind. So I hope you understand that I'm always reluctant to include changes that take me off track regarding that goal - because they take time to review and incorporate and potentially make the implementation deviate more and more from the original, making it harder to keep in sync.

However, your proposal makes sense to me and I see that you tried to stay compatible with GraphQL-js, which I appreciate. So I'm open to discuss it here.

Your first six commits here are pretty uncontroversial. It will be helpful if you create a PR for each one separately. The first commit could be the removal of support for Py 3.7 to 3.9. I wanted to postpone this a bit, but we can do this also now. In this step, we should also modernize all type hints and make use of the newer Py features where obvious. I can help with this. The "flatten AST node hierarchy" commit should be split into two PRs, one for removing CCN - see here - and one for using a type instead of class for ValueNode (good and in line with upstream).

Regarding the last two commits, this is of course where I'm reluctant again, although I understand they are the heart of this PR. But they make us dependent on msgspec which is something I tried to avoid (graphql-core has no dependencies so far) and, as you already pointed out, exclude PyPy (although I don't know how useful/used GraphQL-core with PyPy is). If we could make the use of msgspec optional without increasing the complexity too much that could be a solution.

Regarding Graphene, it seems it is lacking active maintainers. We would need to create a stable Graphene branch that uses Core 3.2 and a dev branch based on Core 3.3 (I think the current version tries to be compatible with both which is not a good idea). Maybe you can talk with Erik - I decided to stay out of Graphene because of lack of time.

@corydolphin
Copy link
Contributor Author

Hey @Cito,

First, thank you for the quick and very detailed response. I totally understand where you are coming from, and after some more exploration yesterday, I lean towards avoiding msgspec as a direct dependency, at least for now, and instead optimizing within pure Python, at least for now. By avoiding msgspec I can also defer the type union changes, since they were required within msgspec to achieve polymorphic deserialization, but are not for Dataclasses.

Plan
I will add a series of PRs to achieve roughly the following:

  1. Increase minimum Python version to 3.10 (this is required for dataclasses kw_only, which is critical). But, I will change this PR to only do the minimum, and defer the structural cleanup later. I am happy to do it, but want to reduce the scope of these changes.
  2. Strictly enforce types of the Node objects: Use tuples for all AST collection fields (instead of lists)
  3. Strictly enforce types of the Node objects: Fix tests to use proper AST node construction
  4. Strictly enforce types of the Node objects: Correct nullability in parser.py construction
  5. Update the Node objects implementation to use frozen dataclasses.

I believe I can defer the type-hierarchy rework, though I am happy to do it, I want to try to reduce the scope of these changes to make it easier to review and merge. Once the type-hierarhcy is reworked, msgspec is actually a fairly small diff, and at worst is something we could maintain as a fork, or try to work as an extension.

Details

At a high level, there are a few core hot spots during parsing

  1. Converting lists to tuples on initialization.
  2. During init, each call to settattr cleaches the cached hash value.
  3. The iteration on kwargs and fields in the init
  4. Pure overhead of Initialization of objects in Python due to the very large number of allocations
  5. Internal optimization opportunities within the parse path, largely due to a very large number of function calls, and Python's comparatively large per function call overhead.

For deserialization, the biggest bottleneck is the override of setattr ressetting the hash, in addition to the cost of construction the Python objects.

These are not actually the same hot spots on Pickle deserialization, since init is not called. I jumped to using msgspec.Struct because it provides a big lever against multiple of the hotspots:

  1. It provides a lower cost for immutable structures than dataclasses, with much faster alloc/init.
  2. It allows for msgspec serde which I've seen be an order of magnitude faster than Pickle in many cases

After spending some more time profiling and simplifying to reduce changes, here are the largest levers:

  1. Making the AST Nodes immutable, such that we can avoid the overhead of setattr in both init and unpickle
  2. Making the AST Nodes require Tuples, rather than transform lists in the init method.
  3. Removing the iteration within the initialization either by hand-unrolling assignment into each class (verbose, not idiomatic), or using Dataclasses. Interestingly, Dataclasses are slightly slower in a few cases, but overall are a good compromise in that they are idiomatic.

I've reworked the change to use Dataclasses locally to achieve the following results

Parse Speed (no_location=True)

  - Original: 42.03 ms
  - Dataclasses: 24.60 ms (41% faster)
  - Msgspec: 18.45 ms (56% faster)

  Deserialize Speed

  - Original + pickle: 22.70 ms
  - Dataclasses + pickle: 9.59 ms (58% faster)
  - Msgspec + pickle: 6.74 ms (70% faster)
  - Msgspec + msgpack: 1.56 ms (93% faster)

Here is the underlying challenge:

  1. With Dataclasses, marking them as frozen makes them slower (~20% slower initialization in my tests)
  2. MyPy currently doesn't correctly handle marking Dataclass fields as Final

Msgspec.struct solves both of these problems for us, and gives us fast msgpack serialization.

@Cito
Copy link
Member

Cito commented Jan 6, 2026

@corydolphin the plan looks good to me. Let's start with the uncontroversial changes.

Thanks for the erformance testing with data classes. The slower initialization of frozen Dataclasses is expected, since the immutability needs to be emulated by creating additional setattr methods. Regarding MyPy and Final - I think we can live with that. We might switch to Pyright or Ty one day anyway.

@corydolphin
Copy link
Contributor Author

corydolphin commented Jan 6, 2026

It looks like managing a PR stack in an OSS repo where you don't have push access is cumbersome, and I couldn't figure out a way to change the merge bases. Here is the stack of uncontroversial changes. I'll finalize the dataclasses PR and add below.

@corydolphin
Copy link
Contributor Author

Ok, I've restacked them -- my tooling is built for different repo structures, so I am not sure the best way to make these easily reviewed -- I defer to you entirely -- if you would prefer to rebase them and restack yourself via e.g. git-spice or jj, please feel free -- I have no attachment to the structure :-)

@Cito
Copy link
Member

Cito commented Jan 7, 2026

Thank you @corydolphin, that's good enough already, will merge them this week. Instead of creating change requests, I will probably just make changes myself in follow-up commits, if needed.

@Cito
Copy link
Member

Cito commented Jan 8, 2026

Thanks again @corydolphin. The PRs are now merged to main. Can you check that everything looks ok?

I have also updated the codebase to make use of Python 3.10 everywhere, so you should pull the current main branch.

My plan is to do some more housekeeping and port upstream changes to get in line with the next js version which is 17.0.0a7, and then cut another alpha release.

Shall we close this PR? You can create a new one for optional msgspec support on top of current main if you think it's useful and doable.

@corydolphin
Copy link
Contributor Author

corydolphin commented Jan 9, 2026

Hey @Cito,

First, thank you for integrating the changes and fixing them up where necessary, I really appreciate it, I can imagine how your time is stretched and I know maintaining an OSS repo is a lot of work.

I'll close this PR for now, and on the backend, I'll work to integrate your changes into our environment. I'd love to figure out the best way to help move the ecosystem forward in a co-ordinated way, e.g. with graphene and the gql client library and I'm happy to play a role in helping where possible.

@corydolphin corydolphin closed this Jan 9, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants