go: go/ast: provide AST Rewrite functionality (ast.Walk is not good enough)

This has come up again and again. While it’s easy to traverse the AST (ast.Walk, ast.Inspect), it’s not easily possible to use those for general tree rewrites w/o significant amount of work.

For instance, if we wanted to rewrite all expressions (ast.Expr) of a certain kind into something else, ast.Walk and ast.Inspect would need to consider every node that contains ast.Expr fields separately, rather than just look at nodes that satisfy ast.Expr.

AST rewriters exist in other packages; most notably perhaps in gofmt (for gofmt -r); that one is reflection-based. Reflection-based approaches tend to be general, but also hard to understand, and slower than necessary.

API starting point:

// An ApplyFunc is invoked by Apply for each node, before and/or after
// the node's children. See Apply for the interpretation of the return
// value.
//
// The node is given by the node pointer's address addr which makes it
// possible for an ApplyFunc to rewrite the node. The node's parent is
// the node containing the *addr. If the node is part of a list, index
// identifies its position in that list. Index is < 0 otherwise.
type ApplyFunc func(parent Node, index int, addr interface{}) bool

// Apply traverses a syntax tree recursively, starting with the node
// identified by parent, index, and addr. See Apply for the meaning
// of these arguments.
// 
// If pre is not nil, it is called for each node before its children
// are traversed (pre-order). If the result of calling pre is false,
// no children are traversed, and post is not called for the node.
//
// If post is not nil, it is called for each node after its children
// were traversed (post-order). If the result of calling post is false,
// traversal is terminated and Apply returns immediately.
func Apply(parent Node, index int, addr interface{}, pre, post ApplyFunc) {

About this issue

  • Original URL
  • State: closed
  • Created 8 years ago
  • Reactions: 33
  • Comments: 41 (37 by maintainers)

Commits related to this issue

Most upvoted comments

@fatih Thanks for looking into this. A few observations:

  1. When you use Apply, you don’t need to type-assert on the parent - you can simply use the new SetField function (also documented in the API): ast.SetField(parent, name, index, <newnode>). With that, both versions will be approximately the same amount of code to write.
  2. In this specific case where you simply change a string, there’s no need to use a complicated rewrite in the first place: You could just walk the tree and change that string directly.
  3. Your implementation of astrewrite is easier to use in simple cases, but it is almost always more costly: The implementation of astrewrite always needs to write back the result returned by rewriteFunc, and because it’s always an ast.Node there’s always a type assertion needed for the assignment to work (in the astrewrite implementation) - that’s relatively costly. You have to always do this for every single node even though most nodes are not rewritten. On the other hand, with ast.Apply, the ast.SetField operation is relatively expensive but you only pay when you actually rewrite something.
  4. Finally, with ast.Apply you have the choice to do the rewriting in pre- or post-order when traversing the tree, which is often important to be able to control - this is not possibly with astrewrite.

To summarize: If you only want to change the name of things, you may not need a rewrite at all. Secondly, while astrewrite may be fine for your (more general) use case, it’s probably not sufficiently powerful as a general rewrite mechanism, and likely quite a bit slower than the suggested ast.Apply. Hope that helps.

Anyway, thanks again for investigating!

@smasher164 Because ast.Apply may change the type of the node which cannot be done by modifying a node in place. For instance, an ast.Apply working on an constant expression tree may replace that tree with a single node which is the constant result.

I’m struggling with deleting elements of an AST.

Sample:

func f() {
  a1()
  a2()
  b()
}

Task: eliminate all statements in f that are calls to functions starting with a.

I did the obvious thing and wrote a Delete helper, analogous to SetField (not thoroughly tested yet):

// Delete deletes or sets to nil the named field from the parent node.
// Delete performs the following assignment:
//
//   if index < 0 {
//     parent.name = nil
//   } else {
//     copy(parent.name[index:], parent.name[index+1:])
//     parent.name[len(parent.name)-1] = nil
//     parent.name = parent.name[:len(parent.name)-1]
//   }
//
// The parent node may be a pointer to the struct containing the named
// field, or it may be the struct itself.
//
// TODO: do parents of Package type need special handling?
func Delete(parent ast.Node, name string, index int) {
	v := reflect.Indirect(reflect.ValueOf(parent)).FieldByName(name)
	if index < 0 {
		v.Set(reflect.Zero(v.Type()))
		return
	}
	last := v.Len()
	reflect.Copy(v.Slice(index, last), v.Slice(index+1, last))
	v.Index(last - 1).Set(reflect.Zero(v.Type().Elem()))
	v.SetLen(last - 1)
}

But when you delete an element from a slice, it disrupts the walk that Apply does. In the sample above, the obvious implementation of pre deletes a1() but misses a2(), because after the deletion of a1(), Apply’s index into the BlockStmt is off by one.

One option is give ApplyFunc access to Apply’s internal state, perhaps something like this:

type ApplyLocation struct {
  name string // export?
  index int // export?
  parent Node // export?
  a *application
}

type ApplyFunc func(loc ApplyLocation, n Node) bool

func (x ApplyLocation) Set(n ast.Node)

func (x ApplyLocation) Delete()  // adjusts application's internal state as necessary

@josharian Since it’s basically there (but for tests), I made this a 1.9 item. It should just be in the ast package. We can put it in once the tree opens (which is hopefully soon).