What's wrong with this regex (VBScript/JavaScript flavor)?

Posted by OtherMichael on Stack Overflow
Published on 2010-05-20T16:16:51Z Indexed on 2010/05/20 16:30 UTC

I'm trying to run a regular expression in VBA code that uses Microsoft VBScript Regular Expressions 5.5 (which should behave the same as JavaScript regex).

regex: ^[0-9A-Z]?[0-9A-Z]{3}[A-Z]?([0-9A-Z]{6})-?([0-9])?$
input: X123A1234567
match: 123456

The six characters I'm interested in come back as a good match of 123456, ignoring the last (check) digit. Perfect. (The check digit is captured in a second group, but that's not a major concern to me.)

But when BOTH of the optional alphas are missing, the match grabs the last digit.

GOOD input: 123A1234567
match: 123456

Leave in the optional middle alpha, take out the optional leading alpha, and we still get the good match of 123456

GOOD input: X1231234567
match: 123456

Leave in the optional leading alpha, take out the middle optional alpha, and we still get a good match of 123456

BAD input: 1231234567
match: 234567

Take out BOTH optional alphas, and we get a bad match of 234567
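
The four cases above can be reproduced with a short script. This is only a sketch: it assumes the JavaScript engine behaves like VBScript Regular Expressions 5.5 for this pattern, and the helper name `captures` is made up for illustration. It also shows what seems to be going on in the bad case: the leading `[0-9A-Z]?` is greedy and accepts digits, so it consumes the first `1`, the six-character group shifts one position to the right, and because the check-digit group is optional the overall match still succeeds with that group left empty.

```javascript
// Pattern copied verbatim from the question.
const pattern = /^[0-9A-Z]?[0-9A-Z]{3}[A-Z]?([0-9A-Z]{6})-?([0-9])?$/;

// Hypothetical helper: returns both capture groups for one input,
// or null when the pattern does not match at all.
function captures(input) {
  const m = pattern.exec(input);
  return m ? { core: m[1], check: m[2] } : null;
}

const inputs = ["X123A1234567", "123A1234567", "X1231234567", "1231234567"];
for (const input of inputs) {
  // For the last input, `check` is undefined: the greedy leading
  // optional took the first digit, so nothing was left for the check digit.
  console.log(input, captures(input));
}
```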

Have a looksee @ the regex testers on http://www.regular-expressions.info/javascriptexample.html or http://www.regular-expressions.info/vbscriptexample.html

What am I missing here? How can I get the regex to ignore the last digit when both optional alphas are missing? The regex is used to feed a lookup system, so that no matter what format the input data is in, we can match it to a complete value.
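
One possible adjustment, sketched here under the assumption that the optional leading character is always a letter when present (the question calls it the "leading alpha"), is to narrow the first class from `[0-9A-Z]?` to `[A-Z]?`. That way a leading digit can no longer be swallowed by the optional prefix. The names `stricterPattern` and `coreOf` are hypothetical, and this has not been checked against the VBScript engine itself. If the leading character can legitimately be a digit, this variant would not apply.

```javascript
// Assumption: the optional leading character, when present, is a letter.
// Only the first character class differs from the pattern in the question.
const stricterPattern = /^[A-Z]?[0-9A-Z]{3}[A-Z]?([0-9A-Z]{6})-?([0-9])?$/;

// Hypothetical helper: returns the six-character group, or null on no match.
function coreOf(input) {
  const m = stricterPattern.exec(input);
  return m ? m[1] : null;
}

// The four sample inputs from the question.
for (const input of ["X123A1234567", "123A1234567", "X1231234567", "1231234567"]) {
  console.log(input, coreOf(input));
}
```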

© Stack Overflow or respective owner
