How do I strip multiple (optional) parts of a SQL string using .NET Regular Expressions?

Posted by Luc on Stack Overflow
Published on 2010-04-22T01:14:22Z Indexed on 2010/04/22 1:23 UTC

I've been working on this for a few hours now and can't find any help on it. Basically, I'm trying to split a SQL string into its various parts (fields, from, where, having, groupBy, orderBy). I refuse to believe that I'm the first person to ever try to do this, so I'd like to ask for some advice from the StackOverflow community. :)

To understand what I need, assume the following SQL string:

select * from table1 inner join table2 on table1.id = table2.id 
where field1 = 'sam' having table1.field3 > 0 
group by table1.field4 order by table1.field5 

I created a regular expression to group the parts accordingly:

select\s+(?<fields>.+)\s+from\s+(?<from>.+)\s+where\s+(?<where>.+)\s+having\s+(?<having>.+)\s+group\sby\s+(?<groupby>.+)\s+order\sby\s+(?<orderby>.+)

This gives me the following results:

fields => *
from => table1 inner join table2 on table1.id = table2.id
where => field1 = 'sam'
having => table1.field3 > 0
groupby => table1.field4
orderby => table1.field5 

The problem I'm facing is that if any clause after the 'from' clause is missing from the SQL string, the regular expression doesn't match at all.
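
The match and the failure described above can be reproduced with a short script. This is only a sketch, and it assumes Python's re module rather than .NET: Python writes named groups as (?P<name>...) where .NET uses (?<name>...), but the rest of the pattern is unchanged. The variable names and the second test string (with 'having' removed) are made up for illustration.

```python
import re

# The pattern from the question. Assumption: ported to Python, so named
# groups are written (?P<name>...) instead of the .NET form (?<name>...).
PATTERN = re.compile(
    r"select\s+(?P<fields>.+)\s+from\s+(?P<from>.+)\s+where\s+(?P<where>.+)"
    r"\s+having\s+(?P<having>.+)\s+group\sby\s+(?P<groupby>.+)"
    r"\s+order\sby\s+(?P<orderby>.+)"
)

FULL_SQL = (
    "select * from table1 inner join table2 on table1.id = table2.id "
    "where field1 = 'sam' having table1.field3 > 0 "
    "group by table1.field4 order by table1.field5"
)

# Hypothetical variant with the 'having' clause left out.
NO_HAVING_SQL = (
    "select * from table1 inner join table2 on table1.id = table2.id "
    "where field1 = 'sam' group by table1.field4 order by table1.field5"
)

# With every clause present, each named group captures its own part.
full_match = PATTERN.search(FULL_SQL)
for name, value in full_match.groupdict().items():
    print(f"{name} => {value}")

# Every keyword is mandatory in the pattern, so a single missing clause
# means there is no match at all.
print(PATTERN.search(NO_HAVING_SQL))  # None
```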

To fix that, I've tried putting each optional part in its own (...)? group, but that doesn't work. It simply puts all the optional parts (where, having, groupBy, and orderBy) into the 'from' group.
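
A likely explanation, sketched under the assumption that Python's re module behaves like .NET here: the greedy .+ in the 'from' group runs to the end of the string, and because every group after it is optional, the engine never needs to backtrack. One common workaround is to make each capture lazy (.+?) and anchor the pattern with $, so a lazy capture is forced to grow until the next keyword or the end of the string. The names GREEDY, LAZY and split_sql are hypothetical; in .NET the same idea should carry over with (?<name>...) in place of (?P<name>...). This is not a real SQL parser: keywords inside string literals or subqueries will confuse it, and it expects the clauses in the same order as the example above.

```python
import re

# The attempt described above: optional groups, but greedy captures.
GREEDY = re.compile(
    r"select\s+(?P<fields>.+)\s+from\s+(?P<from>.+)"
    r"(\s+where\s+(?P<where>.+))?"
    r"(\s+having\s+(?P<having>.+))?"
    r"(\s+group\sby\s+(?P<groupby>.+))?"
    r"(\s+order\sby\s+(?P<orderby>.+))?"
)

# Sketch of a fix: lazy captures, non-capturing optional groups, and
# ^...$ anchors so each capture stops at the next keyword that is present.
LAZY = re.compile(
    r"^select\s+(?P<fields>.+?)\s+from\s+(?P<from>.+?)"
    r"(?:\s+where\s+(?P<where>.+?))?"
    r"(?:\s+having\s+(?P<having>.+?))?"
    r"(?:\s+group\sby\s+(?P<groupby>.+?))?"
    r"(?:\s+order\sby\s+(?P<orderby>.+?))?"
    r"\s*$"
)


def split_sql(sql):
    """Return a dict of clause name -> text (None if absent), or None on no match."""
    match = LAZY.match(sql.strip())
    return match.groupdict() if match else None


FULL_SQL = (
    "select * from table1 inner join table2 on table1.id = table2.id "
    "where field1 = 'sam' having table1.field3 > 0 "
    "group by table1.field4 order by table1.field5"
)

# Greedy version: everything after 'from' lands in the 'from' group.
print(GREEDY.match(FULL_SQL).group("from"))

# Lazy version: each clause is captured separately, and missing clauses
# come back as None instead of breaking the match.
print(split_sql(FULL_SQL))
print(split_sql("select * from table1 where field1 = 'sam' order by table1.field5"))
print(split_sql("select a, b from t"))
```

The trailing \s*$ is what makes the lazy captures work: without an end anchor, each .+? would be satisfied with a single character.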

Any ideas?

