Faster selector execution by pre-parsing fields#5869
Faster selector execution by pre-parsing fields#5869jcoglan wants to merge 2 commits intoapache:mainfrom
Conversation
janl
left a comment
There was a problem hiding this comment.
This looks good overall, thanks!
We do not want to add erlperf and argparse as dependencies just yet, could you revert those bits? You can keep the benchmark, just disable it for now and leave instructions for how to run it.
6b1d6ef to
095aca7
Compare
7ddba0f to
d66b35c
Compare
nickva
left a comment
There was a problem hiding this comment.
Wondering if this would affect cases when selectors are sent between cluster nodes during a release upgrade. One way to check would be to run a dev cluster node on a version without the change and others with the change and then make mango filtered changes feed and mango index/selector queries.
|
I have tried the things @nickva suggest via the following steps. I believe the results look fine but would appreciate others double checking them.
diff --git a/rel/overlay/etc/vm.args b/rel/overlay/etc/vm.args
index b4167be7b..ca8f562dd 100644
--- a/rel/overlay/etc/vm.args
+++ b/rel/overlay/etc/vm.args
@@ -43,6 +43,8 @@
# Which interfaces should the node listen on?
-kernel inet_dist_use_interface {127,0,0,1}
+-kernel inet_dist_listen_min 9100
+-kernel inet_dist_listen_max 9200
# Tell kernel and SASL not to log anything
-kernel error_logger silent
#!/bin/bash
source ~/.shell/couchdb
username='a'
password='a'
CDB_AUTH="$username:$password"
pwhash='"-pbkdf2:sha256-761a5edaa8635ae1fce1185d477cff666e137e5f2fe4d65037a8e08f380f67ab,b0e91baaa60de5da82be8bcd7acb4ab9,600000"'
uuid1='"019ce2e0305a7a9c9c8b4b50e992c097"'
uuid2='"019ce2e0305a7cd0861784fd28c84d67"'
for n in $(seq 1 3) ; do
CDB_PORT="${n}5984"
cdb '/_node/_local/_config/admins/a' -X PUT -d "$pwhash"
cdb '/_node/_local/_config/chttpd/bind_address' -X PUT -d '"0.0.0.0"'
cdb '/_node/_local/_config/couchdb/uuid' -X PUT -d "$uuid1"
cdb '/_node/_local/_config/chttpd_auth/secret' -X PUT -d "$uuid2"
cdb '/_node/_local/_config/cluster/q' -X PUT -d '"1"'
cdb '/_node/_local/_config/cluster/n' -X PUT -d '"3"'
done
CDB_PORT=15984
ip='127.0.0.1'
cdb '/_cluster_setup' -X POST -d @- <<JSON
{
"action": "enable_cluster",
"bind_address": "0.0.0.0",
"username": "$username",
"password": "$password",
"node_count": "3"
}
JSON
for n in $(seq 2 3) ; do
cdb '/_cluster_setup' -X POST -d @- <<JSON
{
"action": "enable_cluster",
"bind_address": "0.0.0.0",
"username": "$username",
"password": "$password",
"port": $CDB_PORT,
"node_count": "3",
"remote_node": "$ip",
"remote_current_user": "$username",
"remote_current_password": "$password"
}
JSON
cdb '/_cluster_setup' -X POST -d @- <<JSON
{
"action": "add_node",
"host": "$ip",
"port": ${n}5984,
"username": "$username",
"password": "$password"
}
JSON
done
cdb '/_cluster_setup' -X POST -d '{ "action": "finish_cluster" }'
cdb '/_membership' | jq
cdb () { curl -gsu "$CDB_AUTH" -H 'content-type: application/json' "http://127.0.0.1:${CDB_PORT}$1" "${@:2}" ; }
CDB_PORT=15984
cdb '/mydb' -X DELETE
cdb '/mydb' -X PUT
for n in {1..100} ; do cdb "/mydb/doc-$n" -X PUT -d '{ "value": '$n' }' ; done
CDB_PORT=15984 # or 25984, 35984
cdb '/mydb/_find' -X POST -d '{ "selector": { "value": { "$gt": 3, "$lt": 7 } } }'
cdb '/mydb/_explain' -X POST -d '{ "selector": { "value": { "$gt": 3, "$lt": 7 } } }'
CDB_PORT=15984 # or 25984, 35984
cdb '/mydb/_changes?filter=_selector' -X POST -d '{ "selector": { "value": { "$gt": 3, "$lt": 7 } } }' |
d66b35c to
b03f59e
Compare
|
That looks great, James. Thank you. Let me also give it a try locally with similar setup to double-check it. And then it should be good to go |
nickva
left a comment
There was a problem hiding this comment.
Looking at mixed cluster:
(I used a bit of a simpler way to do it)
OLD (node1, node2, ...)
./dev/run --admin=adm:pass -n3 --degrade-cluster=1
NEW (node3)
./dev/run --admin=adm:pass -n1 --no-join --node-number-seed=3
With this script
#!/bin/sh
http delete $DB/db1
http put $DB/db1
http put $DB/db1/a v:='1' type=x
http put $DB/db1/b v:='2' type=x
http put $DB/db1/c v:='3' type=y
http put $DB/db1/d v:='4' type=y
http put $DB/db1/e v:='3' type=z
http put $DB/db1/f v:='2' type=x
http put $DB/db1/g v:='1' type=z
http put $DB/db1/_design/q language=query views:='{"v": {"map": {"type":"x", "v":{"$ge":2}, "fields": {"v":"asc"}}}}'
Then querying from node3 I see this error:
http post $DB3/db1/_find selector:='{"v":{"$gt":2}}' fields:='["v", "_id"]' execution_stats:='true'
HTTP/1.1 500 Internal Server Error
{
"error": "unknown_error",
"reason": "undef",
"ref": 3538619864
}
Querying from DB1 and DB2 does work though:
http post $DB1/db1/_find selector:='{"v":{"$gt":2}}' fields:='["v", "_id"]' execution_stats:='true'
HTTP/1.1 200 OK
{
"bookmark": "g1AAAAAyeJzLYWBgYMpgSmHgKy5JLCrJTq2MT8lPzkzJBYozpoIkOGASEKEsAE9eDYE",
"docs": [
{
"_id": "c",
"v": 3
},
{
"_id": "d",
"v": 4
},
{
"_id": "e",
"v": 3
}
],
"execution_stats": {
"execution_time_ms": 1.316,
"results_returned": 3,
"total_docs_examined": 8,
"total_keys_examined": 8,
"total_quorum_docs_examined": 0
},
"warning": "No matching index found, create an index to optimize query time."
}
The error in the log looks like:
[error] 2026-03-13T04:18:15.961351Z node3@127.0.0.1 <0.33714.0> 17120bdde3 req_err(3538619864) unknown_error : undef
[<<"binary:join/2">>,<<"mango_idx_view:-field_ranges/1-lc$^0/1-0-/1 L327">>,<<"mango_cursor_special:create/4 L35">>,<<"mango_crud:find/5 L49">>,<<"mango_httpd:handle_find_req/2 L213">>
,<<"mango_httpd:handle_req/2 L34">>,<<"chttpd:handle_req_after_auth/2 L432">>,<<"chttpd:process_request/1 L410">>]
[notice] 2026-03-13T04:18:15.961542Z node3@127.0.0.1 <0.33714.0> 17120bdde3 127.0.0.1:35984 127.0.0.1 adm POST /db1/_find 500 ok 1
[error] 2026-03-13T04:18:20.567988Z node3@127.0.0.1 <0.33894.0> -------- Invalid Index Def [{<<"map">>,{[{<<"type">>,<<"x">>},{<<"v">>,{[{<<"$ge">>,2}]}},{<<"fields">>,{[{<<"v">>,<<"asc">>
}]}}]}}]. Error: error, Reason: {badmatch,undefined}
I didn't dive in yet to see why it's happening but figure we'd want to take a better look before merging
b03f59e to
1f6cc39
Compare
Currently, selectors like `{ "a.b": X }` or `{ "a": { "b": X } }` are
normalized into the internal form:
{[{<<"a.b">>, X}]}
When the selector is evaluated, we call `mango_doc:get_field(Doc,
<<"a.b">>)`, which then has to parse `<<"a.b">>` into the list
`[<<"a">>, <<"b">>]` so it can follow the path to the field inside the
doc. This path is re-parsed every time the document is evaluated.
To save time, here we change the normalized form to:
{[{[<<"a">>, <<"b">>], X}]}
i.e. the field path is already parsed before the selector is evaluated.
This has been shown to improve selector evaluation time by 30-40%.
However, it does mean the normalized form is no longer a valid ejson
structure as it contains a key that is not a binary. We therefore need
to turn it back into `<<"a.b">>` in cases where the selector is
serialized back to JSON, for example when it is returned as a part of
`_explain` output. This is done using the `join_field` and `join_keys`
functions in `mango_util`.
We also need to update anywhere that hard-codes a particular document
field so that the field is in the parsed array representation.
In the previous commit we changed lots of code to deal with normalized selector fields now being per-parsed lists, not binaries, i.e. instead of `<<"a.b">>` a field now looks like `[<<"a">>, <<"b">>]`. This commit winds most of that back and instead makes the functions in `mango_selector` hide the fact the field is now pre-parsed from callers. So for example, the code that deals with matching selectors to indexes can continue to pass the unparsed form into `mango_selector` functions, and `mango_selector` will take care of parsing/serializing inputs as necessary.
1f6cc39 to
a91592f
Compare
nickva
left a comment
There was a problem hiding this comment.
Look better, I'll still have to re-run my mixed cluster test a few times though. But for now, I wonder if we've switched to a less optimized join if we'd still see a speedup in indexing or filtering?
| join_field(Field) -> | ||
| Field. | ||
|
|
||
| % binary:join/2 is not available on all Erlang versions we support; it was |
There was a problem hiding this comment.
Since binary:join/2 is probably a faster implementation I wonder if we still get a noticeably indexing or filtering boost from this PR?
Try indexing a few million docs and see if notice a difference
| % binary:join/2 is not available on all Erlang versions we support; it was | ||
| % added in 28.0. For now, we use this function in its place, c.f. | ||
| % https://www.erlang.org/doc/apps/stdlib/binary.html#join/2 | ||
| binary_join(Binaries, Separator) -> |
There was a problem hiding this comment.
Could add a guard that Binaries is a list and Separator is a binary. Sometimes those asserts could also help the compiler to optimize some snippets.
Overview
While working on
mango_selector:match()to implement detailed match failure information, I noticed that every time a non-operator field is evaluated, we callmango_doc:get_field(), which callsmango_util:parse_field()if its input is a binary. We can avoid re-parsing fields so often by changingnorm_fieldsso that it normalizes to the parsed form[<<"a">>, <<"b">>]instead of<<"a.b">>.The changes have made the benchmarks I added about 30-40% faster. (I wasn't sure where benchmarks like this should live, if anywhere... they're mostly here for me to check the effect and I'm happy to move or remove them before merging.)
In the first commit here I make this change and then change all the other logic that interacts with selectors to also use this form. This ended up having a bigger blast radius than I would have liked, so I undid most of it and changed the
mango_selectorfunctions and a couple others so they hide how normalized selector fields actually look from other callers. There's only a couple of places where these normalized selectors need to be turned back into valid JSON, and themango_util:join_keys()function turns the array keys back into binaries as needed.There are a couple of places in the logic for choosing indexes for queries where we now have to parse fields to match the form they appear in in selectors, but I figure doing this once per query during index selection is better than doing it once per selector-field evaluation during index updates or filtering.
I assume the latter commit is the state we'd actually want to keep so I'm happy to squash all the other edits away. I also need to write proper commits messages; I'll tidy the history up once I know what should be kept in the history.
Related Issues or Pull Requests
Checklist
rel/overlay/etc/default.inisrc/docsfolder