Regular expression for dividing country calling codes

Posted by RickiG on Stack Overflow See other posts from Stack Overflow or by RickiG
Published on 2010-04-07T12:00:41Z Indexed on 2010/04/07 12:03 UTC
Read the original article Hit count: 304

Filed under:

regex

Hi

I have a list of calling codes for all countries(the phone number prefixes), I would like to split them up in the country name and the actual code so I can put then into an xml.

I have tried back and forth but can not get a regexp going that takes all cases into account. I think it is fairly simple for someone with a bit of experience.

The codes have these formats:

Afghanistan 93
Anguilla 1 264
Antarctica 6721
Antigua and Barbuda 1 268
Bosnia and Herzegovina 387
Canada 1
Congo, Republic of the 242
Cote d'Ivoire 225
Ireland (Eire) 353
United States of America 1

There are around 235 of them in total, but these are the regulars and the exceptions.

^[a-zA-Z]\s,'()] for between 1 and X words and then it is [0-9\s]{1,5}$ for the numbers:

X
XX
XXX
XXXX
X XXX

So if I should express it as a sentence it would be: "from beginning of a line, take all characters (1) including space,'() until you encounter digits, then take all of these including space(2) until you encounter a line break."

I am using TextMate, and the docs says:

TextMate uses the Oniguruma regular expression library by K. Kosako.

I would appreciate any help given:) Thank you.

Related posts about regex

Find multiple regex in each line and skip result if one of the regex doesn't match

as seen on Stack Overflow - Search for 'Stack Overflow'
I have a list of variables: variables = ['VariableA', 'VariableB','VariableC'] which I'm going to search for, line by line ifile = open("temp.txt",'r') d = {} match = zeros(len(variables)) for line in ifile: emptyCells=0 for i in range(len(variables)): regex = r'('+variables[i]+r')[:|=|\(](-… >>> More
OWASP Regex Repository: Is this regex correct?

as seen on Stack Overflow - Search for 'Stack Overflow'
I was looking at the regular expression for validating various data types from the (OWASP Regex Repository). One of the regular expressions in there is called safetext and looks like: ^[a-zA-Z0-9\s.\-]+$ My first question is: Is this regular expression correct? complementary question If this… >>> More
Make a Perl-style regex interpreter behave like a basic or extended regex interpreter

as seen on Stack Overflow - Search for 'Stack Overflow'
I am writing a tool to help students learn regular expressions. I will probably be writing it in Java. The idea is this: the student types in a regular expression and the tool shows which parts of a text will get matched by the regex. Simple enough. But I want to support several different regex… >>> More
JS regex isn't matching, even thought it works with a regex tester

as seen on Stack Overflow - Search for 'Stack Overflow'
I'm writing a piece of client-side javascript code that takes a function and finds the derivative of it, however, the regex that's supposed to match with the power rule fails to work in the context of the javascript program, even though it sucessfully matches when it's used with an independent regex… >>> More
c# RegEx with "|"

as seen on Stack Overflow - Search for 'Stack Overflow'
I need to be able to check for a pattern with | in them. For example an expression like d*|*t should return true for a string like "dtest|test". I'm no regex hero so I just tried a couple of things, like: Regex Pattern = new Regex("s*\|*d"); //unable to build because of single backslash Regex Pattern… >>> More

Developer IT