This was referenced Jan 7, 2019
Contributor
Author
Here's a large-scale adaptation of this API in rust-analyzer: rust-lang/rust-analyzer#449
flodiebold
reviewed
Jan 7, 2019
Member
flodiebold
left a comment
It seems fine to me, but as I said, I don't know much about unsafe :)
Contributor
Author
Migrated rust-analyzer to this API, everything seems to work fine!
Contributor
Author
@flodiebold thanks for the review!
Contributor
Author
Published: https://crates.io/crates/rowan/0.2.0
This PR greatly ~~breaks~~ simplifies the API by getting rid of `RefRoot` / `OwnedRoot` and leveraging native Rust references instead.

To explain what's going on, it is helpful to start with the history of the current implementation.
Originally, the red-green tree approach was popularized by Roslyn. Red-green allows one to have an immutable tree with cheap updates and with offsets and parent pointers on nodes. At first sight, this seems impossible: the offset of a node depends on the offsets of all of the previous nodes, so if you insert a new node at the beginning of the tree, then all offsets must be recalculated, resulting in O(N) performance. The same reasoning applies to parent links.
The sword to cut this knot is laziness. First, we build a green tree -- an ordinary immutable tree without parent pointers and with lengths instead of offsets (lengths can be maintained cheaply by summing the lengths of children). Then, we build a lazy red tree. Each red node stores a green node, a parent red node, a start offset, and an array of red children which are calculated lazily. Each traversal step is O(1): we either use a cached node (constant) or create one new node (constant as well). When modifying the tree, we keep the green nodes, but tear down all the red nodes. This is still amortized O(1) though, if we count traversals as well: to tear down a red node, you first need to traverse it.
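To make the green/red split concrete, here is a minimal sketch of the idea, not the code from this PR. The use of `Rc`, `Weak` and `OnceCell`, the field names, and the `leaf`, `branch`, `root` and `child` helpers are all assumptions made for illustration. Roslyn relies on a garbage collector for the parent links; this sketch uses a `Weak` parent pointer instead so that the `Rc` graph has no cycles.

```rust
use std::cell::OnceCell;
use std::rc::{Rc, Weak};

/// Green node: immutable, no parent pointer, stores a length, not an offset.
pub struct GreenNode {
    pub text_len: usize,
    pub children: Vec<Rc<GreenNode>>,
}

impl GreenNode {
    pub fn leaf(text_len: usize) -> Rc<GreenNode> {
        Rc::new(GreenNode { text_len, children: Vec::new() })
    }

    /// The length of a branch is the sum of the lengths of its children.
    pub fn branch(children: Vec<Rc<GreenNode>>) -> Rc<GreenNode> {
        let text_len = children.iter().map(|c| c.text_len).sum();
        Rc::new(GreenNode { text_len, children })
    }
}

/// Red node: wraps a green node and adds a parent link and an absolute offset.
pub struct RedNode {
    pub green: Rc<GreenNode>,
    pub parent: Weak<RedNode>,
    pub start_offset: usize,
    /// One slot per green child, filled in on first access.
    children: Vec<OnceCell<Rc<RedNode>>>,
}

impl RedNode {
    fn new(green: Rc<GreenNode>, parent: Weak<RedNode>, start_offset: usize) -> Rc<RedNode> {
        let children = green.children.iter().map(|_| OnceCell::new()).collect();
        Rc::new(RedNode { green, parent, start_offset, children })
    }

    pub fn root(green: Rc<GreenNode>) -> Rc<RedNode> {
        RedNode::new(green, Weak::new(), 0)
    }

    /// Returns the cached red child, creating it on first access.
    pub fn child(self: &Rc<Self>, index: usize) -> Rc<RedNode> {
        Rc::clone(self.children[index].get_or_init(|| {
            // Offset = own offset + lengths of the preceding siblings.
            let preceding: usize =
                self.green.children[..index].iter().map(|c| c.text_len).sum();
            RedNode::new(
                Rc::clone(&self.green.children[index]),
                Rc::downgrade(self),
                self.start_offset + preceding,
            )
        }))
    }
}
```

Note that this `child` sums the lengths of the preceding siblings, so creating a node costs time proportional to its index; a real implementation could avoid that by caching offsets. Editing the tree means building new green nodes (sharing unchanged subtrees) and starting over from a fresh `RedNode::root`.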
Roslyn is implemented in C#, so their red-green relies on garbage collection for memory management.
An implementation of red-green in C++ exists in Swift. To solve memory management, they added a third layer of
`SyntaxNode`s. In Swift's libsyntax, green nodes are ordinary immutable trees, red nodes own a green node and children, and store a raw pointer to the parent node. `SyntaxNode`s bind together a reference-counted pointer to the root of the tree and a raw pointer to the red node. Only `SyntaxNode` is a public API, and it guarantees that the underlying red node is alive. To be more concrete, here are equivalent Rust type definitions:

The problem with this approach is that traversing the syntax tree constantly creates new
`SyntaxNode`s, bumping the `Arc`. Ideally, reading the tree should just traverse the pointers, without write operations. It would also be cool to be able to work with `Copy` syntax nodes! And that's where the old implementation of `rowan` comes from. The idea of the old implementation is to make the `root` field of the `SyntaxNode` generic: `root: R`, so that `R` can be either `&'a RedNode` or an `Arc<RedNode>`. That way, you get to choose between an owned version (`Arc` bumps, `.clone` required, no lifetimes) and a borrowed version (cheap, `Copy`, requires lifetime annotations). Unfortunately, being generic over ownership is not the most pleasant thing to do in Rust, and the resulting API was quite mind-bending and verbose.

This new API started from the desire to leverage usual references and use
`&'a SyntaxNode` for the borrowed variant and `SyntaxNode` for the owned variant. However, this doesn't quite work, due to the requirement of sharing data between `SyntaxNode`s. A similar situation happens if you are writing an application, and you have some `Ctx` struct which you'd like to store behind an `Arc`. If you do this, some parts of the app will work with `Arc<Ctx>` and some with `&'a Ctx`: note how the owned variant has a smart pointer around it. So, in this version a borrowed node looks like `&'a SyntaxNode` and an owned one like `TreePtr<SyntaxNode>`.

Specifically, we merge
`RedTree` and `SyntaxNode` into one struct, and add a `root: *const TreeRoot` field to a node. Physically, this `root` is always stored inside an `Arc`. An owned `TreePtr` stores a `*const SyntaxNode` inside, but it bumps the root's `Arc` on creation/clone. That is, `TreePtr<T>` is basically an `Arc` to the tree root, except that it points not to the root itself, but to some node within the tree. This "manual" management of ref counts forms one bit of unsafety here.

The other bit of unsafety comes from the desire to be able to use
`TreePtr` not only with raw `SyntaxNode`s, but with ast wrappers as well. Because `ast` and raw syntax use exactly the same binary representation (it's only compile-time types that differ), there's actually no implementation required for this to work: just some casts. So, the `TransparentNewType` trait establishes this "it's safe to cast between newtypes" relation.
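The two unsafe pieces can be sketched together as follows. This is an illustration under stated assumptions, not the code in this PR. `TreeRoot`, `SyntaxNode`, `TreePtr` and `TransparentNewType` are the names used above, but the flat `Vec` of nodes, the `kind` field, the `FnDef` wrapper, the trait methods `from_repr` / `as_repr`, and the helpers `sibling`, `tree_ref_count` and `build_tree` are all hypothetical.

```rust
use std::mem::ManuallyDrop;
use std::ops::Deref;
use std::sync::Arc;

/// Owns every node of one tree; always lives inside an `Arc`.
pub struct TreeRoot {
    nodes: Vec<SyntaxNode>, // flat storage is an assumption of this sketch
}

pub struct SyntaxNode {
    root: *const TreeRoot, // points at the payload of the `Arc<TreeRoot>`
    pub kind: u16,
}

/// Safety: implementors must have exactly the same layout as `Repr`.
pub unsafe trait TransparentNewType: Sized {
    type Repr;
    fn from_repr(repr: &Self::Repr) -> &Self {
        unsafe { &*(repr as *const Self::Repr as *const Self) }
    }
    fn as_repr(&self) -> &Self::Repr {
        unsafe { &*(self as *const Self as *const Self::Repr) }
    }
}

unsafe impl TransparentNewType for SyntaxNode { type Repr = SyntaxNode; }

/// Hypothetical ast wrapper; `repr(transparent)` guarantees the same layout.
#[repr(transparent)]
pub struct FnDef(pub SyntaxNode);

unsafe impl TransparentNewType for FnDef { type Repr = SyntaxNode; }

impl SyntaxNode {
    /// Another node of the same tree, borrowed for as long as `self` is.
    pub fn sibling(&self, index: usize) -> &SyntaxNode {
        let root: &TreeRoot = unsafe { &*self.root };
        &root.nodes[index]
    }

    /// Strong count of the tree's `Arc`, for demonstration only.
    pub fn tree_ref_count(&self) -> usize {
        // `ManuallyDrop` keeps this temporary `Arc` from releasing a count.
        let arc = ManuallyDrop::new(unsafe { Arc::from_raw(self.root) });
        Arc::strong_count(&arc)
    }
}

/// Owned pointer to a node (or wrapper) that keeps the whole tree alive.
pub struct TreePtr<T: TransparentNewType<Repr = SyntaxNode>> {
    inner: *const T,
}

impl<T: TransparentNewType<Repr = SyntaxNode>> TreePtr<T> {
    pub fn new(node: &T) -> TreePtr<T> {
        // The node is borrowed, so its tree is alive and the count is >= 1.
        unsafe { Arc::increment_strong_count(node.as_repr().root) };
        TreePtr { inner: node }
    }
}

impl<T: TransparentNewType<Repr = SyntaxNode>> Clone for TreePtr<T> {
    fn clone(&self) -> Self { TreePtr::new(&**self) }
}

impl<T: TransparentNewType<Repr = SyntaxNode>> Deref for TreePtr<T> {
    type Target = T;
    fn deref(&self) -> &T { unsafe { &*self.inner } }
}

impl<T: TransparentNewType<Repr = SyntaxNode>> Drop for TreePtr<T> {
    fn drop(&mut self) {
        let root = (**self).as_repr().root;
        // Releases this pointer's count; frees the tree if it was the last.
        unsafe { Arc::decrement_strong_count(root) };
    }
}

/// Builds a flat tree and returns an owned pointer to its first node.
pub fn build_tree(kinds: &[u16]) -> TreePtr<SyntaxNode> {
    assert!(!kinds.is_empty());
    let nodes = kinds.iter().map(|&kind| SyntaxNode { root: std::ptr::null(), kind }).collect();
    let mut root = Arc::new(TreeRoot { nodes });
    let root_ptr = Arc::as_ptr(&root);
    for node in Arc::get_mut(&mut root).unwrap().nodes.iter_mut() {
        node.root = root_ptr;
    }
    // The initial strong count is handed over to the returned `TreePtr`.
    let tree: &TreeRoot = unsafe { &*Arc::into_raw(root) };
    TreePtr { inner: &tree.nodes[0] }
}
```

With this sketch, `TreePtr::new(FnDef::from_repr(node))` yields a `TreePtr<FnDef>` from a borrowed `&SyntaxNode` with no per-wrapper code, and the tree stays alive after the original root pointer is dropped as long as any `TreePtr` into it remains.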