Search Results

Search found 5919 results on 237 pages for 'regex matching'.

Page 115/237 | < Previous Page | 111 112 113 114 115 116 117 118 119 120 121 122 | Next Page >

What is the Java 1.4.2 equivalent of Pattern.quote()

- by Luke

What would be a Java 1.4.2 equivalent of Pattern.quote? I was using Pattern.quote() on a URI but now need to make it 1.4.2 compatible.

Read the article
C# regular expression

- by vert

How would I write a regular expression (C#) which will check a given string to see if any of its characters are characters OTHER than the following: a-z A-Z Æ æ Å å Ø ø - ' Thanks!

Read the article
Regular expressions in python unicode

- by Remy

I need to remove all the html tags from a given webpage data. I tried this using regular expressions: import urllib2 import re page = urllib2.urlopen("http://www.frugalrules.com") from bs4 import BeautifulSoup, NavigableString, Comment soup = BeautifulSoup(page) link = soup.find('link', type='application/rss+xml') print link['href'] rss = urllib2.urlopen(link['href']).read() souprss = BeautifulSoup(rss) description_tag = souprss.find_all('description') content_tag = souprss.find_all('content:encoded') print re.sub('<[^>]*>', '', content_tag) But the syntax of the re.sub is: re.sub(pattern, repl, string, count=0) So, I modified the code as (instead of the print statement above): for row in content_tag: print re.sub(ur"<[^>]*>",'',row,re.UNICODE But it gives the following error: Traceback (most recent call last): File "C:\beautifulsoup4-4.3.2\collocation.py", line 20, in <module> print re.sub(ur"<[^>]*>",'',row,re.UNICODE) File "C:\Python27\lib\re.py", line 151, in sub return _compile(pattern, flags).sub(repl, string, count) TypeError: expected string or buffer What am I doing wrong?

Read the article
Regular expression to match last item without delimiter

- by Steven

I have a string of names like this "J. Smith; B. Jones; O. Henry" I can match all but the last name with \w+.*?; Is there a regular expression that will match all the names, including the last one?

Read the article
Weird error using preg_match and unicode

- by Thorpe Obazee

if (preg_match('(\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+)', '2010/02/14/this-is-something')) { // do stuff } The above code works. However this one doesn't. if (preg_match('/\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+/u', '2010/02/14/this-is-something')) { // do stuff } Maybe someone could shed some light as to why the one below doesn't work. This is the error that is being produced: A PHP Error was encountered Severity: Warning Message: preg_match() [function.preg-match]: Unknown modifier '\'

Read the article
PHP & Regular expression: keyword just occurs once

- by lauthiamkok

Hi, how can I make sure a certain keyword just occurs once in the input with regular expression? I think there is some mistakes in the expression below as I can repeat the same keywords, if (!preg_match('/\b(.php?){1}\b/', $cfg_path)) { $error = true; echo '<error elementid="cfg_path" message="PATH - make sure you have a \'.php?\' in the path."/>'; } I just want this to be true, form.php?category=something or form.php? but not this, form.php?.php?category=something or form.php?.php? please let me know how to fix it. thanks.

Read the article
regular expression and escaping

- by pstanton

Sorry if this has been asked, my search brought up many off topic posts. I'm trying to convert wildcards from a user defined search string (wildcard is "*") to postgresql like wildcard "%". I'd like to handle escaping so that "%" => "\%" and "\*" => "*" I know i could replace \* with something else prior to replacing * and then swap it back, but i'd prefer not to and instead only convert * using a pattern that selects it when not proceeded by \. String convertWildcard(String like) { like = like.replaceAll("%", "\\%"); like = like.replaceAll("\\*", "%"); return like; } Assert.assertEquals("%", convertWildcard("*")); Assert.assertEquals("\%", convertWildcard("%")); Assert.assertEquals("*", convertWildcard("\*")); // FAIL Assert.assertEquals("a%b", convertWildcard("a*b")); Assert.assertEquals("a\%b", convertWildcard("a%b")); Assert.assertEquals("a*b", convertWildcard("a\*b")); // FAIL ideas welcome.

Read the article
preg_replace only replaces first occurrence then skips to next line

- by Dom

Got a problem where preg_replace only replaces the first match it finds then jumps to the next line and skips the remaining parts on the same line that I also want to be replaced. What I do is that I read a CSS file that sometimes have multiple "url(media/pic.gif)" on a row and replace "media/pic.gif" (the file is then saved as a copy with the replaced parts). The content of the CSS file is put into the variable $resource_content: $resource_content = preg_replace('#(url\((\'|")?)(.*)((\'|")?\))#i', '${1}'.url::base(FALSE).'${3}'.'${4}', $resource_content); Does anyone know a solution for why it only replaces the first match per line?

Read the article
Help with Regular Expression

- by shivesh

Hello I need help with Regular Expression, I want to match each section (number and it's text - 2 groups), the text can be multi line, each section ends when another section starts (another number) or when .END is reached or EOF. Demo Expression: \(\d{1,3}\) ([\s\S]*?)(\.END|\(\d{1,3}\)) Input text: (1) some text some text some text some text some text some text (2) some text some textsome text (3) some textsome text some textsome textsome text (4) some text .END first group should match number (with brackets) and second group should match corresponded text.

Read the article
best REGEXP friendly Text Editors + most powerful REGEXP syntax?

- by John

I am fluent with Microsoft Visual 2005 regular expressions and they are a big time saver. I seem to learn them best by having a vaguely organized cheat sheet thrown at me, at which point I read just a little and play with them until I understand what's going on. That learning approach has worked well for me, for now. I would really like to take this to the next level though. Basically -- What is the REGEXP convention that is generally regarded as the most open-ended and powerful? VS2005 Regexps seem kind of gimped, so maybe I'm a kid playing in a sandbox. Are there text editors out there that can perform a highlight all matches, list lines containing string, or some kind of powerful function like that in conjunction with the very strongest REGEXP language? If not I can just use multiple programs and a weird technique but I'd like to avoid that. I wonder if a stronger REGEXP language or a "stronger" regEXP writer might be able to have his search match all results on all lines even by clicking a "find next" by adding some simple criteria to the search. Anyway, please provide advice!

Read the article
How to process this string via regular expression

- by iiduce

my string style like this: expression1/field1+expression2*expression3+expression4/field2*expression5*expression6/field3 a real style mybe like this: computer/(100)+web*mail+explorer/(200)*bbs*solution/(300) "+" and "*" represent operator "computer","web"...represent expression (100),(200) represent field num . field num may not exist. I want process the string to this: /(100)+web*+explorer/(200)bbs/(300) rules like this: if expression length is more than 3 and its field is not (200), then add brackets to it.

Read the article
Add link around img tags with regexp

- by rdanee

I would like to add links around image tags with preg_replace(). Before: <img href="" src="" alt="" /> After: <a href="" ..><img href="" src="" alt="" /></a> I would greatly appreciate any help. Thank you very much.

Read the article
How to get N random string from a {a1|a2|a3} format string?

- by Pentium10

Take this string as input: string s="planets {Sun|Mercury|Venus|Earth|Mars|Jupiter|Saturn|Uranus|Neptune}" How would I choose randomly N from the set, then join them with comma. The set is defined between {} and options are separated with | pipe. The order is maintained. Some output could be: string output1="planets Sun, Venus"; string output2="planets Neptune"; string output3="planets Earth, Saturn, Uranus, Neptune"; string output4="planets Uranus, Saturn";// bad example, order is not correct Java 1.5

Read the article
Regular Expression to isolate an html tag

- by orit cohen

I'm looking for a regular expression to isolate an html tag. This includes the TAG the ATTRIBUTES and the CONTNET inside. Let's say I have this: <html> <body> aajsdfkjaskd <TAGNAME name="bla" context="non">hfdfhdj </TAGNAME> </body> </html> I need a regular expression that would return: <TAGNAME name="bla" context="non">hfdfhdj </TAGNAME> Thank, Joe

Read the article
Are .NET's regular expressions Turing complete?

- by Robert

Regular expressions are often pointed to as the classical example of a language that is not Turning complete. For example "regular expressions" is given in as the answer to this SO question looking for languages that are not Turing complete. In my, perhaps somewhat basic, understanding of the notion of Turning completeness, this means that regular expressions cannot be used check for patterns that are "balanced". Balanced meaning have an equal number of opening characters as closing characters. This is because to do this would require you to have some kind of state, to allow you to match the opening and closing characters. However the .NET implementation of regular expressions introduces the notion of a balanced group. This construct is designed to let you backtrack and see if a previous group was matched. This means that a .NET regular expressions: ^(?<p>a)*(?<-p>b)*(?(p)(?!))$ Could match a pattern that: ab aabb aaabbb aaaabbbb ... etc. ... Does this means .NET's regular expressions are Turing complete? Or are there other things that are missing that would be required for the language to be Turing complete?

Read the article
string substitution regular expression not working in tcl

- by Puneet Mittal

i am trying to replace all the special characters including white space, hyphen, etc, to underscore, from a string variable in tcl. I wrote the code below but it doesn't seem to be working. set varname $origVar puts "Variable Name :>> $varname" if {$varname != ""} { regsub -all {[\s-\]\[$^?+*()|\\%&#]} $varname "_" $newVar } puts "New Variable :>> $newVar" one issue is that, instead of replacing the string in $varname, it is replacing the data inside $origVar. No idea why, and also i read the example code (for proper syntax) in my tcl book and according to that it should be something like this regsub -all {[\s-][$^?+*()|\\%&#]} $varname "_" newVar so i used the same syntax but it didn't work and gave the same result as modifying the $origVar instead of required $varname value.

Read the article
How to avoid resetting the java Scanner position

- by Derek

I have some code that looks more or less like this: while(scanner.hasNext()) { if(scanner.findInLine("Test") !=null) { //do some things }else{ scanner.nextLine(); } } I am using this to parse an ~10MB text file. The problem is, if I put a breakpoint on the while() and the scanner.nextLine(), I can see that sometimes the scanners position (in the debug window) goes back to zero. I think this is causing me some kind of loop blow up, because the regext in findInLine() starts at zero, looks through some amount of text, advancing the position, and then it randomly gets set back to zero, so it has to re-parse all that text again. Any ideas what can be causing that? Am I even doing this the right way? Thanks Some additional info: The Scanner is instantiated from an InputStream. After diubg sine debugging, it appears that there is a HearCharBuffer that Scanner uses and it only allows 1024 characters at a time, and then resets. Is there a way to avoid this, or do things differently? That seems like a small amount of characters to be able to scan. Derek

Read the article
Writing a PHP web crawler using cron

- by Horse

Hi all I have written myself a web crawler using simplehtmldom, and have got the crawl process working quite nicely. It crawls the start page, adds all links into a database table, sets a session pointer, and meta refreshes the page to carry onto the next page. That keeps going until it runs out of links That works fine however obviously the crawl time for larger websites is pretty tedious. I wanted to be able to speed things up a bit though, and possibly make it a cron job. Any ideas on making it as quick and efficient as possible other than setting the memory limit / execution time higher?

Read the article
Problem with regular expression for some special parttern.

- by SpawnCxy

Hi all, I got a problem when I tried to find some characters with following code: preg_match_all('/[\w\uFF10-\uFF19\uFF21-\uFF3A\uFF41-\uFF5A]/',$str,$match); //line 5 print_r($match); And I got error as below: Warning: preg_match_all() [function.preg-match-all]: Compilation failed: PCRE does not support \L, \l, \N, \U, or \u at offset 4 in E:\mycake\app\webroot\re.php on line 5 I'm not so familiar with reg expression and have no idea about this error.How can I fix this?Thanks.

Read the article
Markdown implementation in PHP parses text within <a> tags — how does one disable this behavior?

- by Kyle

I'm using the Markdown library for PHP by Michel Fortin. I started noticing that it formats the text in tags with markdown rules, like so: http://foo.com/My_Url_With_Underscores essentially becomes: <a href="...">http://foo.com/My<em>Url</em>With_Underscores</a> How do I disable that behavior or otherwise prevent the library from doing that?

Read the article
RegExp to validate a formula (string/boolean/numeric expression)?

- by JSteve

I have used regExp quit a bit of times but still far from being an expert. This time I want to validate a formula (or math expression) by regExp. The difficult part here is to validate proper starting and ending parentheses with in the formula. I believe, there would be some sample on the web but I could not find it. Can somebody post a link for such example? or help me by some other means?

Read the article
Regular expression for dd/mm

- by swapna

Hi Please help me with a regular expression to validate the following format dd/mm This is for validating a Birthday field and the year is not required. Thanks

Read the article
re.sub emptying list

- by jmau5

def process_dialect_translation_rules(): # Read in lines from the text file specified in sys.argv[1], stripping away # excess whitespace and discarding comments (lines that start with '##'). f_lines = [line.strip() for line in open(sys.argv[1], 'r').readlines()] f_lines = filter(lambda line: not re.match(r'##', line), f_lines) # Remove any occurances of the pattern '\s*<=>\s*'. This leaves us with a # list of lists. Each 2nd level list has two elements: the value to be # translated from and the value to be translated to. Use the sub function # from the re module to get rid of those pesky asterisks. f_lines = [re.split(r'\s*<=>\s*', line) for line in f_lines] f_lines = [re.sub(r'"', '', elem) for elem in line for line in f_lines] This function should take the lines from a file and perform some operations on the lines, such as removing any lines that begin with ##. Another operation that I wish to perform is to remove the quotation marks around the words in the line. However, when the final line of this script runs, f_lines becomes an empty lines. What happened? Requested lines of original file: ## English-Geek Reversible Translation File #1 ## (Moderate Geek) ## Created by Todd WAreham, October 2009 "TV show" <=> "STAR TREK" "food" <=> "pizza" "drink" <=> "Red Bull" "computer" <=> "TRS 80" "girlfriend" <=> "significant other"

Read the article
Regular expression that finds and replaces non-ascii characters with Python

- by prosseek

I need to change some characters that are not ASCII to '_'. For example, Tannh‰user - Tann_huser If I use regular expression with Python, how can I do this? Is there better way to do this not using RE?

Read the article
regexp target last main li in list

- by veilig

I need to target the starting tag of the last top level LI in a list that may or may-not contain sublists in various positions - without using CSS or Javascript. Is there a simple/elegant regexp that can help with this? I'm no guru w/ them, but it appears the need for greedy/non-greedy selectors when I'm selecting all the middle text (.*) / (.+) changes as nested lists are added and moved around in the list - and this is throwing me off. $pattern = '/^(<ul>.*)<li>(.+<\/li><\/ul>)$/'; $replacement = '$1<li id="lastLi">$3'; Perhaps there is an easier approach?? converting to XML to target the LI and then convert back? ie: Single Element <ul> <li>TARGET</li> </ul> Multiple Elements <ul> <li>foo</li> <li>TARGET</li> </ul> Nested Lists before end <ul> <li> foo <ul> <li>bar</li> </ul> <li> <li>TARGET</li> </ul> Nested List at end <ul> <li>foo</li> <li> TARGET <ul> <li>bar</li> </ul> </li> </ul>

Read the article

< Previous Page | 111 112 113 114 115 116 117 118 119 120 121 122 | Next Page >