Migrate to encoding/json/v2 #292
base: master
|
For the |
a4b6871 to bdce391
I fixed the test failure. |
|
/assign @BenTheElder @liggitt |
|
/assign @jpbetz |
I suspect using a json marshal function (like MarshalWrite) that doesn't append a newline would be a more efficient way to accomplish that |
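To illustrate the trailing-newline point, here is a minimal sketch. It uses the stable `encoding/json` (v1) package as a stand-in, since `encoding/json/v2` is still experimental; the helper names are hypothetical and this is not the code from the PR. In v1, `json.Encoder.Encode` always writes a newline after the value, while `json.Marshal` does not.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// encodeWithEncoder (hypothetical helper) uses json.Encoder, which always
// writes a trailing newline after the encoded value.
func encodeWithEncoder(v any) ([]byte, error) {
	var buf bytes.Buffer
	if err := json.NewEncoder(&buf).Encode(v); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// encodeWithMarshal (hypothetical helper) uses json.Marshal, which emits
// the value with no trailing newline, so nothing has to be trimmed later.
func encodeWithMarshal(v any) ([]byte, error) {
	return json.Marshal(v)
}

func main() {
	v := map[string]int{"a": 1}
	withNewline, _ := encodeWithEncoder(v)
	without, _ := encodeWithMarshal(v)
	fmt.Printf("%q\n%q\n", withNewline, without)
}
```

Choosing a call that never writes the newline avoids a second pass to strip it, which is presumably the efficiency gain the comment has in mind.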
fieldpath/serialize-pe.go
Outdated
| return nil, fmt.Errorf("parsing JSON: %v", err) | ||
| } | ||
|
|
||
| k := rawKey.String() |
is rawKey.String() the same as decoding to a string, in terms of interpreting escape sequences, etc?
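The distinction being asked about can be sketched with the stable `encoding/json` (v1) package: the raw bytes of a JSON string literal keep the quotes and escape sequences, while decoding into a Go `string` interprets them. Whether the `String()` call in the diff behaves like the first or the second is exactly what the question asks, so this sketch makes no claim about json/v2; `rawAndDecoded` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rawAndDecoded (hypothetical helper) returns the raw JSON text of a string
// literal alongside the Go string obtained by decoding it.
func rawAndDecoded(lit string) (raw string, decoded string, err error) {
	var r json.RawMessage
	if err = json.Unmarshal([]byte(lit), &r); err != nil {
		return "", "", err
	}
	if err = json.Unmarshal([]byte(lit), &decoded); err != nil {
		return "", "", err
	}
	return string(r), decoded, nil
}

func main() {
	raw, decoded, err := rawAndDecoded(`"a\u0062\n"`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("raw:     %s\n", raw)     // quotes and escapes kept verbatim
	fmt.Printf("decoded: %q\n", decoded) // escapes interpreted
}
```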
| { | ||
| JSON: `1.0`, | ||
| IntoType: reflect.TypeOf(json.Number("")), | ||
| Want: json.Number("1.0"), | ||
| }, | ||
| { | ||
| JSON: `1`, | ||
| IntoType: reflect.TypeOf(json.Number("")), | ||
| Want: json.Number("1"), | ||
| }, |
curious if it's ok to drop these... were they added to try to catch a specific issue?
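For context on what test cases like these can catch, a small sketch using the stable `encoding/json` (v1) package: decoding into `json.Number` preserves the literal text, so `1.0` and `1` stay distinct, whereas going through `float64` loses that distinction. The helper names are hypothetical, and this says nothing about why the original cases were added.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// asNumber (hypothetical helper) decodes a JSON number into json.Number,
// which keeps the literal exactly as written.
func asNumber(s string) (json.Number, error) {
	var n json.Number
	err := json.Unmarshal([]byte(s), &n)
	return n, err
}

// viaFloat (hypothetical helper) decodes into float64 and formats the value
// back, which drops the distinction between "1.0" and "1".
func viaFloat(s string) (string, error) {
	var f float64
	if err := json.Unmarshal([]byte(s), &f); err != nil {
		return "", err
	}
	return strconv.FormatFloat(f, 'g', -1, 64), nil
}

func main() {
	for _, in := range []string{"1.0", "1"} {
		n, _ := asNumber(in)
		f, _ := viaFloat(in)
		fmt.Printf("input=%s json.Number=%s via float64=%s\n", in, n, f)
	}
}
```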
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: inteon The full list of commands accepted by this bot can be found here.
|
The serialize benchmarks still look pretty rough. Need to see what we can improve there. |
|
did you run the full set of benchmarks to see how we looked across all of them? |
|
Thanks for the updates, how are the overall benchmarks looking (not just the subset in the description)? As you were adjusting the implementation, were there any unit tests it would make sense to add to catch edges the previous implementations handled we want to ensure the new one does as well? I'm thinking specifically of things like:
|
|
First off- it's amazing to see this happening and the benchmarks are VERY promising. Thanks @inteon! To get this to the finish line, and merge, what should our criteria be? I chatted offline with @liggitt briefly and some of the criteria we discussed was:
Intuitively, it seems like the deserialization is already sufficiently fast. I suspect we need to optimize serialization a bit further since we serialize managed fields on all updates (not just patches). That said, I'm willing to be data driven here. If we can show downstream scale and performance is acceptable, I'm willing to accept a higher serialization perf regression in order to migrate to json/v2. Thoughts, concerns? |
|
Update: check the new numbers in my PR description; upgrading encoding/json/v2 did improve performance! |
|
Did some further tuning and got the # allocations lower than on master. |
Even with stable json/v2, we might need to temporarily use a fork; otherwise Kubernetes branches that aren't on that Go version yet can't update SMD. But we should encapsulate it and plan to eliminate it when we're ready to require that minimum Go version. Even Kubernetes master isn't on 1.25 yet. |
Am I reading correctly that B/op and allocs/op are ~equivalent or better than master on pretty much all benchmarks? If so, that's amazing progress! Paired with a close review and functional/correctness test coverage to make sure the new approach behaves identically to the old version (especially in terms of what it accepts/rejects/produces in edge cases like leading/trailing/non-normalized/invalid inputs), this looks really promising. |
|
Amazing work @inteon in reducing the number of allocs per operation to 1. I did a similar analysis over your change, and saw similar performance. The change should have zero to negligible impact on Kube API server performance. We just have to make sure this new library behaves the same as the existing implementation functionally, which I think existing tests should be able to do(?). |
I'm not sure how detailed the existing tests are at all the edge cases of valid and invalid variants on input (handling of escaped values in keys, whitespace before/after/between tokens, valid and invalid syntax, etc), and byte-for-byte assertions about output. Since this needed to effectively rewrite some of the encoding/decoding paths, we need to make sure we have test coverage for those things. |
There are not enough tests for Unicode and escape characters. I have added those tests in #300. With these new tests, we should be able to detect regressions in the serialization and deserialization code in the future. |
|
Excellent, #300 looks like a great step forward for test coverage of normalized encoding. We'll probably want similar additions for:
|
|
After upgrading |
|
Huh… did something change on s-m-d master? The latest benchmark update looks like some of the relative improvement came from master getting worse... |
|
oh, maybe the test changes in #300 impacted the master benchmark numbers |
|
k/k master is at golang v1.25.1 fyi |
Signed-off-by: Tim Ramlot <[email protected]>
Signed-off-by: Tim Ramlot <[email protected]>
|
@inteon, do we want to pick this up when the k8s 1.36 cycle reopens?
Yes. I updated the code, fixed some bugs and updated the benchmark results. |
|
Looks like we can drop the "Performance is not yet great" comment from the description now... this looks ~neutral or an improvement for memory use at this point 🎉 |
We want to pick it up as soon as possible, but we need to wait until json/v2 lands in stdlib in a non-experimental way. I'm not sure if that's happening in Go 1.26 or not (the status of golang/go#76406 and the json/v2 working group makes me think not). api-machinery folks have a detailed code review of this PR scheduled for later this week. Once we're happy with the shape of it, and do a final round of benchmarks, we'll submit this as positive community feedback on the current shape of json/v2, and then wait patiently for json/v2 to land in stdlib 🤞 |
| } | ||
|
|
||
| // DeserializePathElement parses a serialized path element | ||
| func DeserializePathElement(s string) (PathElement, error) { |
@jpbetz @lalitc375: let's augment the unit tests that exercise successful DeserializePathElement calls (TestPathElementRoundTrip?) to also assert the in-memory state returned, not just the round-trip
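A toy sketch of why a round-trip assertion alone can miss bugs: if the parser and serializer share the same mistake, the round trip still passes while the in-memory value is wrong. `buggyParse` and `buggySerialize` are deliberately broken, hypothetical functions and have nothing to do with the real `DeserializePathElement`.

```go
package main

import (
	"fmt"
	"strconv"
)

// buggyParse is a deliberately wrong parser: it strips the surrounding
// quotes but never interprets escape sequences.
func buggyParse(s string) string {
	return s[1 : len(s)-1]
}

// buggySerialize mirrors the bug by re-adding quotes without escaping.
func buggySerialize(v string) string {
	return `"` + v + `"`
}

func main() {
	in := `"a\nb"`
	got := buggyParse(in)

	// The round trip passes because both directions share the same bug.
	fmt.Println("round-trip ok:", buggySerialize(got) == in)

	// Asserting the in-memory value exposes the problem.
	want, err := strconv.Unquote(in)
	if err != nil {
		panic(err)
	}
	fmt.Println("in-memory ok:", got == want)
}
```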
| fields.Sort() | ||
| return PathElement{Key: &fields}, iter.Error | ||
| var fields value.FieldList | ||
| if err := json.UnmarshalRead(strings.NewReader(s[2:]), &fields); err != nil { |
@jpbetz @lalitc375: can we add test-cases to exercise
- zero-length
- EOF
- leading / trailing whitespace in value
- multi-token should fail identically
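The cases listed above could be laid out as a table-driven check along these lines. This is only a sketch: it uses the stable `encoding/json` (v1) `json.Unmarshal` as a stand-in decoder, the type and function names are hypothetical, and the expected results for json-iterator or json/v2 would need to be confirmed separately.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// edgeCase (hypothetical type) pairs an input with whether decoding it as a
// single JSON value is expected to fail.
type edgeCase struct {
	name    string
	input   string
	wantErr bool
}

// edgeCases lists the inputs named in the review comment.
func edgeCases() []edgeCase {
	return []edgeCase{
		{"zero-length", ``, true},
		{"EOF in the middle of a value", `{"a":`, true},
		{"leading and trailing whitespace", " \n{\"a\":1}\t ", false},
		{"multi-token", `{"a":1} {"b":2}`, true},
	}
}

// decodeOne decodes input as exactly one JSON value using v1 json.Unmarshal.
func decodeOne(input string) error {
	var v any
	return json.Unmarshal([]byte(input), &v)
}

func main() {
	for _, tc := range edgeCases() {
		err := decodeOne(tc.input)
		fmt.Printf("%-32s wantErr=%-5v gotErr=%v\n", tc.name, tc.wantErr, err != nil)
	}
}
```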
| v, err := readJSONIter(iter) | ||
| if err != nil { | ||
| var v any | ||
| if err := json.UnmarshalRead(strings.NewReader(s[2:]), &v); err != nil { |
@jpbetz @lalitc375: can we add test cases to ensure readJSONIter and json.UnmarshalRead behave identically:
- zero-length - json-iter masks, json/v2 returns EOF error? https://go.dev/play/p/D80jALZ7ZRq
- non-zero length with EOF (not sure this is possible)
- leading / trailing whitespace in value
- multi-token should fail identically
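As a sketch of how a reader-based decoder can distinguish these cases, the following uses the stable `encoding/json` (v1) streaming `Decoder` as a stand-in: empty input surfaces as `io.EOF`, and trailing data is detected by attempting a second decode. `classify` is a hypothetical helper; how json-iterator and json/v2 report each case is the open question in the comment and is not asserted here.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

// classify (hypothetical helper) reads one JSON value from s and reports:
//   "eof"      the input held no value at all
//   "error"    the first value was malformed or truncated
//   "trailing" something other than whitespace followed the first value
//   "ok"       exactly one value, optionally surrounded by whitespace
func classify(s string) string {
	dec := json.NewDecoder(strings.NewReader(s))
	var v any
	if err := dec.Decode(&v); err != nil {
		if errors.Is(err, io.EOF) {
			return "eof"
		}
		return "error"
	}
	// A second Decode must hit io.EOF; any other outcome means extra input.
	var extra any
	if err := dec.Decode(&extra); !errors.Is(err, io.EOF) {
		return "trailing"
	}
	return "ok"
}

func main() {
	for _, in := range []string{``, `{"a":`, " 1 \n", `1 2`} {
		fmt.Printf("%q -> %s\n", in, classify(in))
	}
}
```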
| return fmt.Errorf("parsing JSON: %v", err) | ||
| } | ||
|
|
||
| fields = append(fields, Field{Name: k, Value: NewValueInterface(v)}) |
This looks like it would accept duplicate key entries. Did the original ReadObjectCB prevent duplicates or accept them? Is there a test case to make sure these behave identically to json-iterator?
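To make the duplicate-key concern concrete, here is a sketch with the stable `encoding/json` (v1) package: unmarshalling into a map silently keeps the last value for a repeated key, while a token loop that appends fields only rejects duplicates if it tracks the keys it has seen. Both helpers are hypothetical; what json-iterator's ReadObjectCB or json/v2 actually do is not asserted here.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// unmarshalMap (hypothetical helper) decodes an object into a map. With v1
// encoding/json a repeated key is accepted and the last value wins.
func unmarshalMap(s string) (map[string]any, error) {
	var m map[string]any
	err := json.Unmarshal([]byte(s), &m)
	return m, err
}

// readKeysStrict (hypothetical helper) walks the object token by token and
// rejects a key that has already been seen.
func readKeysStrict(s string) ([]string, error) {
	dec := json.NewDecoder(strings.NewReader(s))
	tok, err := dec.Token()
	if err != nil {
		return nil, err
	}
	if d, ok := tok.(json.Delim); !ok || d != '{' {
		return nil, errors.New("expected a JSON object")
	}
	seen := map[string]bool{}
	var keys []string
	for dec.More() {
		tok, err := dec.Token()
		if err != nil {
			return nil, err
		}
		k, ok := tok.(string)
		if !ok {
			return nil, fmt.Errorf("expected string key, got %T", tok)
		}
		if seen[k] {
			return nil, fmt.Errorf("duplicate key %q", k)
		}
		seen[k] = true
		var v any // decode and discard the value that belongs to this key
		if err := dec.Decode(&v); err != nil {
			return nil, err
		}
		keys = append(keys, k)
	}
	if _, err := dec.Token(); err != nil { // consume the closing '}'
		return nil, err
	}
	return keys, nil
}

func main() {
	in := `{"a":1,"a":2}`
	m, _ := unmarshalMap(in)
	fmt.Println("map decode:", m)
	_, err := readKeysStrict(in)
	fmt.Println("strict decode error:", err)
}
```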
| } else if err != nil { | ||
| return fmt.Errorf("parsing JSON: %v", err) | ||
| } | ||
|
|
is it possible to encounter a token other than EndObject or String here? will ReadToken() error in that case?
@jpbetz @lalitc375: can we make sure there's a test that some other token type instead of a string key errors correctly here?
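A sketch of the shape such a test could take, using the stable `encoding/json` (v1) token API as a stand-in: after the opening brace, the next token must be either the end of the object or a string key, and anything else must produce an error. `keyOrEnd` is a hypothetical helper, and the exact error behaviour of the json/v2 `ReadToken` call in the diff still needs to be verified against that library.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// keyOrEnd (hypothetical helper) reads the token after the opening '{' and
// reports the key, or end=true for an empty object. Any other token, or a
// tokenizer error, is returned as an error.
func keyOrEnd(s string) (key string, end bool, err error) {
	dec := json.NewDecoder(strings.NewReader(s))
	tok, err := dec.Token()
	if err != nil {
		return "", false, err
	}
	if d, ok := tok.(json.Delim); !ok || d != '{' {
		return "", false, fmt.Errorf("expected '{', got %v", tok)
	}
	tok, err = dec.Token()
	if err != nil {
		return "", false, err
	}
	switch t := tok.(type) {
	case string:
		return t, false, nil
	case json.Delim:
		if t == '}' {
			return "", true, nil
		}
		return "", false, fmt.Errorf("unexpected delimiter %v", t)
	default:
		return "", false, fmt.Errorf("expected string key, got %T", tok)
	}
}

func main() {
	for _, in := range []string{`{"a":1}`, `{}`, `{1:2}`, `{[1]:2}`, `{null:1}`} {
		k, end, err := keyOrEnd(in)
		fmt.Printf("%-10s key=%q end=%v err=%v\n", in, k, end, err != nil)
	}
}
```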
| break | ||
| } | ||
|
|
||
| k := rawKey.String() |
@jpbetz @lalitc375: can we ensure we have test cases that hit this point to compare behavior between json-iterator and json/v2 and we end up with the same in-memory values:
- escaped/encoded keys ("\u... \n ...")
- multi-byte keys ("Iñtërnâtiônàlizætiøn,💝🐹🌇⛔")
- keys containing bytes that are invalid unicode characters
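The three key categories above can be exercised with inputs like the following. This sketch uses the stable `encoding/json` (v1) package, which documents that invalid UTF-8 in strings is replaced with U+FFFD rather than rejected; json-iterator and json/v2 may differ, which is what the requested tests would pin down. `soleKey` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// soleKey (hypothetical helper) decodes a one-entry JSON object and returns
// its key as held in memory after decoding.
func soleKey(data []byte) (string, error) {
	var m map[string]any
	if err := json.Unmarshal(data, &m); err != nil {
		return "", err
	}
	if len(m) != 1 {
		return "", errors.New("expected exactly one key")
	}
	for k := range m {
		return k, nil
	}
	return "", nil // not reached: the map holds exactly one key
}

func main() {
	inputs := [][]byte{
		[]byte(`{"\u00f1\n":1}`),           // escaped key
		[]byte(`{"\ud83d\udc9d":1}`),       // escaped surrogate pair
		[]byte(`{"Iñtërnâtiônàlizætiøn":1}`), // multi-byte key
		[]byte("{\"\xff\":1}"),             // byte that is not valid UTF-8
	}
	for _, in := range inputs {
		k, err := soleKey(in)
		fmt.Printf("%q -> key=%q err=%v\n", in, k, err)
	}
}
```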
| return children, isMember, nil | ||
| } | ||
|
|
||
| // FromJSON clears s and reads a JSON formatted set structure. |
nit: move this back up to where FromJSON was previously to minimize the diff
| func readIterV1(iter *jsoniter.Iterator) (children *Set, isMember bool) { | ||
| iter.ReadMapCB(func(iter *jsoniter.Iterator, key string) bool { | ||
| if key == "." { | ||
| func (s *setContentsV1) readIterV1(parser *jsontext.Decoder) (children *Set, isMember bool, err error) { |
the receiver isn't used, change this back to a function:
| func (s *setContentsV1) readIterV1(parser *jsontext.Decoder) (children *Set, isMember bool, err error) { | |
| func readIterV1(parser *jsontext.Decoder) (children *Set, isMember bool, err error) { |
|
|
||
| // Append the member to the members list, we will sort it later | ||
| m := &children.Members.members | ||
| // Since we expect that most of the time these will have been |
let's leave the insert-in-sorted-order code as it was here and below if there's not a reason it has to be connected to this PR
@jpbetz, let's separately consider whether to switch this to a post-accumulation sort
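For readers comparing the two strategies, a generic sketch on a plain `[]string` (not the actual SMD member types): insert-in-sorted-order appends directly when input already arrives sorted and otherwise shifts elements, while a post-accumulation sort appends everything and sorts once. Both function names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// insertSorted (hypothetical helper) keeps s sorted on every insert. When the
// new value is not smaller than the last element, which is the common case
// for input that was serialized in sorted order, it is a plain append.
func insertSorted(s []string, v string) []string {
	if n := len(s); n == 0 || s[n-1] <= v {
		return append(s, v)
	}
	i := sort.SearchStrings(s, v) // first index with s[i] >= v
	s = append(s, "")
	copy(s[i+1:], s[i:])
	s[i] = v
	return s
}

// accumulateThenSort (hypothetical helper) appends all values and sorts once.
func accumulateThenSort(in []string) []string {
	out := append([]string(nil), in...)
	sort.Strings(out)
	return out
}

func main() {
	in := []string{"b", "a", "d", "c"}
	var incremental []string
	for _, v := range in {
		incremental = insertSorted(incremental, v)
	}
	fmt.Println(incremental)
	fmt.Println(accumulateThenSort(in))
}
```

Both produce the same ordering; they differ in cost profile. Sorted input makes the incremental version a sequence of appends, while unsorted input favours sorting once at the end, which is presumably why the reviewers want that trade-off evaluated separately.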
| if children == nil { | ||
| children = &Set{} | ||
| } | ||
| // Since we expect that most of the time these will have been |
revert this bit as well
| }) | ||
| } | ||
|
|
||
| // Sort the members and children |
revert the post-accumulation sort
|
@inteon, thanks so much for this PR, we spent several hours today reviewing the details deeply now that the performance has converged to something we could actually merge. There were four main categories of feedback:
Don't feel like you have to be the one to resolve all the comments, @jpbetz and @lalitc375 will be jumping in to help as well. We can gradually work on this and continue getting this ready for when json/v2 lands in stdlib (which looks like it will now be Go 1.27 at the earliest) |
Replaces the github.com/json-iterator/go dependency with encoding/json/v2.
Closes #202