Java regex skipping matches
Posted by Mihail Burduja on Stack Overflow
        Published on 2012-11-11T10:42:57Z
I have some text, and I want to extract pairs of words that are not separated by punctuation. This is the code:
//n-grams
Pattern p = Pattern.compile("[a-z]+");
if (n == 2) {
    p = Pattern.compile("[a-z]+ [a-z]+");
}
if (n == 3) {
    p = Pattern.compile("[a-z]+ [a-z]+ [a-z]+");
}
Matcher m = p.matcher(text.toLowerCase());
ArrayList<String> result = new ArrayList<String>();
while (m.find()) {
    String temporary = m.group();
    System.out.println(temporary);
    result.add(temporary);
}
The problem is that it skips some matches. For example, for the input "My name is James" with n = 3, it should match both "my name is" and "name is james", but it matches only the first. Is there a way to solve this?
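A likely cause is that Matcher.find() resumes searching after the end of the previous match, so successive matches never overlap: once "my name is" has been consumed, only " james" is left to scan. One possible workaround, sketched below, is to put the n-gram inside a lookahead so the match itself consumes no characters, and to read the text from capture group 1. The class name OverlappingNGrams and the helper ngrams are made up for this illustration and are not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used only for this sketch.
class OverlappingNGrams {

    // Returns all n-grams of lowercase words separated by a single space,
    // including the ones that overlap each other.
    static List<String> ngrams(String text, int n) {
        // Build "[a-z]+ [a-z]+ ..." with n words.
        StringBuilder gram = new StringBuilder("[a-z]+");
        for (int i = 1; i < n; i++) {
            gram.append(" [a-z]+");
        }
        // \b makes sure a match starts and ends on a word boundary.
        // (?=( ... )) captures the n-gram without consuming it, so the
        // next find() can start inside the text of the previous n-gram.
        Pattern p = Pattern.compile("\\b(?=(" + gram + ")\\b)");
        Matcher m = p.matcher(text.toLowerCase());

        List<String> result = new ArrayList<String>();
        while (m.find()) {
            result.add(m.group(1)); // group(1) holds the captured n-gram
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [my name is, name is james]
        System.out.println(ngrams("My name is James", 3));
    }
}
```

Because each match is zero-width, the matcher moves forward one character at a time instead of jumping past the whole n-gram, and the leading \b keeps it from starting in the middle of a word. Words separated by punctuation are still not paired, since the pattern only allows a single space between words.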
© Stack Overflow or respective owner