Any software for pattern-matching and -rewriting source code?

Posted by Steven A. Lowe on Stack Overflow
Published on 2009-01-06T04:27:24Z

I have some old software (in a language that's not dead but is dead to me ;-)) that implements a basic pattern-matching and -rewriting system for source code. I am considering resurrecting this code, translating it into a modern language, and open-sourcing the project as a refactoring power-tool. Before I go much further, I want to know if anything like this exists already (my google-fu is fanning air on this tonight).

Here's how it works:

  • The pattern-matching part matches source-code patterns spanning multiple lines of code, using a template with binding variables.
  • The pattern-rewriting part uses a template to rewrite the matched code, inserting the contents of the variables bound by the matching template.
  • Matching and rewriting templates are associated (1:1) by a simple (unconditional) rewrite rule.

The software operates on the abstract syntax tree (AST) of the input application and outputs a modified AST, from which new source code can then be regenerated.
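To make the mechanism concrete, here is a minimal Python sketch of matching and rewriting on a tree with binding variables. It is an illustration only, not the original software: the tuple-based toy AST, the node names ("add", "mul", "call", "assign"), and the helper names `is_var`, `match`, `substitute`, and `rewrite` are all assumptions made for this example.

```python
# Toy AST (an assumption for this sketch): a node is a tuple (kind, child, ...),
# and leaves are plain strings. A leaf written "@name@" in a template is a
# binding variable.

def is_var(node):
    return (isinstance(node, str) and len(node) > 2
            and node.startswith("@") and node.endswith("@"))

def match(pattern, node, bindings):
    """Return the bindings under which pattern matches node, or None."""
    if is_var(pattern):
        if pattern in bindings:
            # A variable used twice must bind to the same subtree both times.
            return bindings if bindings[pattern] == node else None
        return {**bindings, pattern: node}
    if isinstance(pattern, tuple) and isinstance(node, tuple):
        if len(pattern) != len(node):
            return None
        for sub_pattern, sub_node in zip(pattern, node):
            bindings = match(sub_pattern, sub_node, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == node else None

def substitute(template, bindings):
    """Fill the rewrite template with the subtrees bound during matching."""
    if is_var(template):
        return bindings[template]
    if isinstance(template, tuple):
        return tuple(substitute(t, bindings) for t in template)
    return template

def rewrite(node, pattern, template):
    """Apply one unconditional rule (pattern --> template) bottom-up."""
    if isinstance(node, tuple):
        node = tuple(rewrite(child, pattern, template) for child in node)
    bindings = match(pattern, node, {})
    return node if bindings is None else substitute(template, bindings)

# Hypothetical rule: x + x  -->  2 * x
pattern = ("add", "@x@", "@x@")
template = ("mul", "2", "@x@")
tree = ("assign", "y", ("add", ("call", "f"), ("call", "f")))
result = rewrite(tree, pattern, template)
```

Here `result` is the input tree with the `add` node replaced by a `mul` node. A real tool would of course need a parser and a code generator around this core, which the sketch leaves out.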

For example, suppose we find a bunch of while-loops that really should be for-loops. The following template will match the while-loop pattern:

Template oldLoopPtrn
	int @cnt@ = 0;
	while (@cnt@ < @max@)
	{
		… @body@
		++@cnt@;
	}
End_Template

while the following template will specify the output rewrite pattern:

Template newLoopPtrn
	for(int @cnt@ = 0; @cnt@ < @max@; @cnt@++)
	{
		@body@
	}
End_Template

and a simple rule to associate them:

Rule oldLoopPtrn --> newLoopPtrn

so code that looks like this:

int i=0;
while(i<arrlen)
{
    printf("element %d: %f\n",i,arr[i]);
    ++i;
}

gets automatically rewritten to look like this:

for(int i = 0; i < arrlen; i++)
{
    printf("element %d: %f\n",i,arr[i]);
}
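One detail this example brings out is that the match spans two sibling statements (the declaration and the loop), and that the body variable has to absorb a run of statements. The Python sketch below hand-codes this one rule over a toy statement list to show the shape of the problem; it is not template-driven like the system described above, and the tuple layout and the names `match_counting_loop` and `rewrite_block` are assumptions made for illustration.

```python
# Toy representation (an assumption): a block is a list of statement tuples,
# e.g. ("decl", "int", "i", "0") or ("while", cond, [body statements]).

def match_counting_loop(pair):
    """Return (cnt, max, body) if pair is 'int cnt = 0;' followed by
    'while (cnt < max) { body...; ++cnt; }', otherwise None."""
    if len(pair) != 2:
        return None
    decl, loop = pair
    if len(decl) != 4 or decl[:2] != ("decl", "int") or decl[3] != "0":
        return None
    cnt = decl[2]
    if len(loop) != 3 or loop[0] != "while":
        return None
    cond, body = loop[1], loop[2]
    if len(cond) != 3 or cond[:2] != ("<", cnt):
        return None
    if not body or body[-1] != ("preinc", cnt):
        return None
    return cnt, cond[2], body[:-1]  # the body excludes the trailing ++cnt

def rewrite_block(stmts):
    """Slide a two-statement window over the block and rewrite each match."""
    out, i = [], 0
    while i < len(stmts):
        bound = match_counting_loop(stmts[i:i + 2])
        if bound is None:
            out.append(stmts[i])
            i += 1
        else:
            cnt, limit, body = bound
            out.append(("for", ("decl", "int", cnt, "0"), ("<", cnt, limit),
                        ("postinc", cnt), rewrite_block(body)))
            i += 2  # the declaration and the loop were both consumed
    return out

old = [
    ("decl", "int", "i", "0"),
    ("while", ("<", "i", "arrlen"),
     [("call", "printf", "i", "arr[i]"), ("preinc", "i")]),
]
new = rewrite_block(old)
```

The sketch only recurses into the bodies of loops it rewrites, and it applies the rule unconditionally. In C, that is not always safe: a `continue` in the body, or a use of the counter after the loop, would change the program's meaning, which is one argument for the rewrite-rule conditions mentioned below.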

The closest things I've seen to this are some of the code-refactoring tools, but they seem to be geared towards interactive rewriting of selected snippets, not wholesale automated changes.

I believe that this kind of tool could supercharge refactoring, and would work on multiple languages (even HTML/CSS). I also believe that converting and polishing the code base would be a huge project that I simply cannot do alone in any reasonable amount of time.

So, anything like this out there already? If not, any obvious features (besides rewrite-rule conditions) to consider?

EDIT: The one feature of this system that I like very much is that the template patterns are fairly obvious and easy to read because they're written in the same language as the target source code, not in some esoteric mutated regex/BNF format.
