regex matching - Page 117

Regular Expression to isolate an html tag

- by orit cohen

I'm looking for a regular expression to isolate an HTML tag. This includes the tag, the attributes, and the content inside. Let's say I have this: <html> <body> aajsdfkjaskd <TAGNAME name="bla" context="non">hfdfhdj </TAGNAME> </body> </html> I need a regular expression that would return: <TAGNAME name="bla" context="non">hfdfhdj </TAGNAME> Thanks, Joe
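The question does not name a language, so here is a minimal sketch in Python of one way to do it: a non-greedy match from the opening tag to the first matching closing tag. The helper name `isolate_tag` is made up for illustration, and the sketch assumes the tag is never nested inside itself.

```python
import re

def isolate_tag(html, tag):
    """Return the first <tag ...>...</tag> element found in html, or None.

    Assumes the tag is never nested inside itself; a regex cannot
    track nesting depth, so a real HTML parser is safer in that case.
    """
    pattern = re.compile(
        r"<{0}\b[^>]*>.*?</{0}\s*>".format(re.escape(tag)),
        re.DOTALL,  # let '.' span line breaks inside the element
    )
    match = pattern.search(html)
    return match.group(0) if match else None

html = '<html> <body> aajsdfkjaskd <TAGNAME name="bla" context="non">hfdfhdj </TAGNAME> </body> </html>'
print(isolate_tag(html, "TAGNAME"))
```

The `\b` after the tag name keeps `<TAGNAME2>` from matching when `TAGNAME` is requested.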

Read the article

Markdown implementation in PHP parses text within <a> tags — how does one disable this behavior?

- by Kyle

I'm using the Markdown library for PHP by Michel Fortin. I started noticing that it formats the text in tags with markdown rules, like so: http://foo.com/My_Url_With_Underscores essentially becomes: <a href="...">http://foo.com/My<em>Url</em>With_Underscores</a> How do I disable that behavior or otherwise prevent the library from doing that?

Read the article

String parsing with regular expressions

- by ed1t

I have the following string that I would like to parse into either a List or a String[]: (Test)(Testing (Value)) The end result should be Test and Testing (Value)
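Because the groups can contain nested parentheses, a depth counter is often simpler than a regular expression here. A minimal sketch in Python (the question mentions Java types, but the loop translates directly); `split_top_level_groups` is a hypothetical name:

```python
def split_top_level_groups(text):
    """Return the contents of each top-level (...) group in text."""
    groups = []
    depth = 0
    start = None
    for index, char in enumerate(text):
        if char == "(":
            if depth == 0:
                start = index + 1  # content begins after the opening paren
            depth += 1
        elif char == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                groups.append(text[start:index])
    return groups

print(split_top_level_groups("(Test)(Testing (Value))"))
# ['Test', 'Testing (Value)']
```

Unbalanced closing parentheses are silently ignored in this sketch; stricter input checking would be a design choice.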

Read the article

jquery textarea custom tags replacement

- by Tim

Hi all, I'm basically trying to create my own tags and replace them with the right HTML tags, so {B} {/B} would turn into <b> </b>. I have only got so far with this: http://www.nacremedia.com/text2.htm Use the [B] button to bold the current selection. It creates two bold tags and one closing tag for some reason. I'm so close! I just need a bit of direction to get the final bugs out. Can anyone please help? Also, if there is a better way of doing this altogether, I am more than open to new ideas.
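The replacement step on its own can be done with one substitution that captures the optional slash. This is only a sketch of that step in Python, not a diagnosis of the linked page; `custom_tags_to_html` is a made-up name and only the `{B}` tag from the question is handled.

```python
import re

def custom_tags_to_html(text):
    # {B} -> <b> and {/B} -> </b>; the optional slash is captured and reused.
    return re.sub(r"\{(/?)B\}", r"<\1b>", text)

print(custom_tags_to_html("plain {B}bold{/B} plain"))
# plain <b>bold</b> plain
```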

Read the article

Regular expression that finds and replaces non-ascii characters with Python

- by prosseek

I need to change some characters that are not ASCII to '_'. For example, Tannh‰user -> Tannh_user If I use a regular expression with Python, how can I do this? Is there a better way to do this without using RE?
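Both variants can be sketched briefly, assuming Python 3 `str` input. The function names are made up for illustration.

```python
import re

def replace_non_ascii(text, replacement="_"):
    # Any character outside the 7-bit ASCII range is replaced.
    return re.sub(r"[^\x00-\x7F]", replacement, text)

def replace_non_ascii_no_regex(text, replacement="_"):
    # Same result without the re module.
    return "".join(ch if ord(ch) < 128 else replacement for ch in text)

print(replace_non_ascii("Tannh\u2030user"))           # Tannh_user
print(replace_non_ascii_no_regex("Tannh\u2030user"))  # Tannh_user
```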

Read the article

How Do I grep For non-ASCII Characters in UNIX

- by Peter Conrey

I have several very large XML files and I'm trying to find the lines that contain non-ASCII characters. I've tried the following: grep -e "[\x{00FF}-\x{FFFF}]" file.xml But this returns every line in the file, regardless of whether the line contains a character in the range specified. Do I have the syntax wrong or am I doing something else wrong? I've also tried: egrep "[\x{00FF}-\x{FFFF}]" file.xml (with both single and double quotes surrounding the pattern).
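As an alternative to getting the grep syntax right, the same search can be sketched in Python by working on raw bytes, which sidesteps locale and encoding questions. `lines_with_non_ascii` is a hypothetical helper; it accepts any iterable of byte lines, so a file opened in binary mode (`open("file.xml", "rb")`) can be passed directly.

```python
import re

NON_ASCII = re.compile(rb"[^\x00-\x7F]")  # any byte above 0x7F

def lines_with_non_ascii(lines):
    """Yield (line_number, line) for byte lines containing a non-ASCII byte."""
    for number, line in enumerate(lines, start=1):
        if NON_ASCII.search(line):
            yield number, line

sample = [b"<a>plain</a>\n", "<b>caf\u00e9</b>\n".encode("utf-8")]
for number, line in lines_with_non_ascii(sample):
    print(number, line)
```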

Read the article

Finding the complement of a regular expression

- by Bo Tian

There's a question on my exercise sheet to find the complement of r = (a|b)*ab(a|b)* I've come up with a solution, but I'm not sure if it's correct. Please help me to check, and correct my errors. Thanks in advance.
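The asker's own solution is not shown, so the following is only a way to sanity-check a candidate. Over the alphabet {a, b}, r matches exactly the strings that contain "ab", so a natural candidate for the complement is b*a*. The brute-force check below compares both expressions on every string up to a given length; it is evidence, not a proof, and the name `complement_holds` is made up.

```python
import re
from itertools import product

original = re.compile(r"(a|b)*ab(a|b)*")
candidate = re.compile(r"b*a*")  # assumed complement: no 'a' is ever followed by 'b'

def complement_holds(max_length):
    """Check that every word up to max_length is in exactly one language."""
    for length in range(max_length + 1):
        for letters in product("ab", repeat=length):
            word = "".join(letters)
            in_original = original.fullmatch(word) is not None
            in_candidate = candidate.fullmatch(word) is not None
            if in_original == in_candidate:
                return False  # word is in both languages or in neither
    return True

print(complement_holds(10))  # True
```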

Read the article

Properly match a Java string literal

- by Fork

Hi, I am looking for a Regular expression to match string literals in Java source code. Is it possible?
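A commonly used pattern for double-quoted literals is "either a character that is not a quote, backslash, or line break, or a backslash followed by any character". Here is a sketch of it in Python; the pattern itself uses only basic syntax. It does not validate escape sequences, and it knows nothing about comments, char literals such as '"', or text blocks, so it is an approximation rather than a full lexer.

```python
import re

# Opening quote, then ordinary characters or backslash escapes, then closing quote.
JAVA_STRING = re.compile(r'"(?:[^"\\\r\n]|\\.)*"')

def find_java_strings(source):
    return JAVA_STRING.findall(source)

source = r'String s = "a \"quoted\" word"; String t = "";'
print(find_java_strings(source))
```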

Read the article

Regular expressions in python unicode

- by Remy

I need to remove all the HTML tags from the data of a given web page. I tried this using regular expressions:

    import urllib2
    import re
    page = urllib2.urlopen("http://www.frugalrules.com")
    from bs4 import BeautifulSoup, NavigableString, Comment
    soup = BeautifulSoup(page)
    link = soup.find('link', type='application/rss+xml')
    print link['href']
    rss = urllib2.urlopen(link['href']).read()
    souprss = BeautifulSoup(rss)
    description_tag = souprss.find_all('description')
    content_tag = souprss.find_all('content:encoded')
    print re.sub('<[^>]*>', '', content_tag)

But the signature of re.sub is:

    re.sub(pattern, repl, string, count=0)

So I modified the code (replacing the print statement above):

    for row in content_tag:
        print re.sub(ur"<[^>]*>",'',row,re.UNICODE)

But it gives the following error:

    Traceback (most recent call last):
      File "C:\beautifulsoup4-4.3.2\collocation.py", line 20, in <module>
        print re.sub(ur"<[^>]*>",'',row,re.UNICODE)
      File "C:\Python27\lib\re.py", line 151, in sub
        return _compile(pattern, flags).sub(repl, string, count)
    TypeError: expected string or buffer

What am I doing wrong?
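Two things stand out in the code above. First, `find_all` returns element objects, not strings, and `re.sub` only accepts strings. Second, the fourth positional argument of `re.sub` is `count`, so flags have to be passed by keyword. A minimal sketch in Python 3 syntax follows; `FakeTag` is a stand-in class invented here so the example runs without BeautifulSoup installed.

```python
import re

TAG = re.compile(r"<[^>]*>", flags=re.UNICODE)  # flags given by keyword, not as count

def strip_tags(node):
    # Convert the element to text first; re functions reject non-string objects.
    return TAG.sub("", str(node))

class FakeTag:
    """Stand-in for a parsed element; only its string form matters here."""
    def __init__(self, markup):
        self.markup = markup

    def __str__(self):
        return self.markup

rows = [FakeTag("<p>Hello <b>world</b></p>"), FakeTag("<div>bye</div>")]
for row in rows:
    print(strip_tags(row))
```

Since the content is already parsed, asking the parser for the element's text is usually more robust than a regex, but that goes beyond what the question asks.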

Read the article

preg_match_all problems

- by NeoNmaN

I use preg_match_all and need to grab all a href="" tags in my code, but I don't really understand how it works. I have this regular expression: ( /(<([\w]+)[^])(.?)(<\/\2)/ ) It matches all HTML tags, but I only need the a href tags. I hope I can get help :)
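The question is about PHP's preg_match_all. The sketch below swaps in a different technique, in Python: it uses the standard library's HTML parser instead of a regular expression, because attribute order and quoting vary too much for a simple pattern. `HrefCollector` and `extract_hrefs` are made-up names.

```python
from html.parser import HTMLParser

class HrefCollector(HTMLParser):
    """Collect the href value of every <a> start tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names before calling this.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)

def extract_hrefs(html):
    collector = HrefCollector()
    collector.feed(html)
    return collector.hrefs

print(extract_hrefs('<p><a href="http://a.example/">A</a> <b>x</b> <A HREF=\'/b\'>B</A></p>'))
# ['http://a.example/', '/b']
```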

Read the article

Regular expression to match a name

- by zoom_pat277

What would be the regular expression in JavaScript to match a name field that allows only letters, apostrophes, and hyphens, so that jhon's avat-ar or Josh is valid? Thanks
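A sketch of one possible pattern, written in Python; it uses only basic syntax. Two assumptions are made: single spaces between words are allowed, because the first example contains one, and "letters" means ASCII letters. `is_valid_name` is a made-up helper.

```python
import re

# One or more words of letters, apostrophes, or hyphens, separated by single spaces.
NAME = re.compile(r"[A-Za-z'-]+(?: [A-Za-z'-]+)*")

def is_valid_name(value):
    return NAME.fullmatch(value) is not None

for candidate in ["jhon's avat-ar", "Josh", "Josh2", ""]:
    print(repr(candidate), is_valid_name(candidate))
```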

Read the article

Help with this reg. exp. in PHP

- by Jonathan

Hi, I don't know much about regular expressions. I asked here for one that gets either anything up to the first parenthesis/colon or the first word inside the first parenthesis. This was the answer:

    preg_match('/(?:^[^(:]+|(?<=^\\()[^\\s)]+)/', $var, $match);

I need an improvement: I need to get either anything up to the first parenthesis/colon/quotation mark or the first word inside the first parenthesis. So if I have something like:

    $var = 'story "The Town in Hell"s Backyard'; // I get this: $match = 'story';
    $var = "screenplay (based on)"; // I get this: $match = 'screenplay';
    $var = "(play)"; // I get this: $match = 'play';
    $var = "original screen"; // I get this: $match = 'original screen';

Thanks!
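The requested change amounts to adding the double quote to the set of stop characters. The Python sketch below is a variation rather than the PHP answer itself: it uses two capture groups instead of a lookbehind and strips surrounding whitespace so that the results match the expected values listed above. `leading_part` is a hypothetical name.

```python
import re

# Either the first word inside a leading parenthesis,
# or everything up to the first '(', ':' or '"'.
LEADING_PART = re.compile(r'\(([^\s)]+)|([^(:"]+)')

def leading_part(value):
    match = LEADING_PART.match(value)  # match() anchors at the start of the string
    if match is None:
        return None
    return (match.group(1) or match.group(2)).strip()

for value in ['story "The Town in Hell"s Backyard', "screenplay (based on)",
              "(play)", "original screen"]:
    print(leading_part(value))
```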

Read the article

Perl Regular expression remove double tabs, line breaks, white spaces

- by Scoox

Hi guys, I want to write a Perl script that removes double tabs, line breaks, and white space. What I have so far is:

    $txt=~s/\r//gs;
    $txt=~s/ +/ /gs;
    $txt=~s/\t+/\t/gs;
    $txt=~s/[\t\n]*\n/\n/gs;
    $txt=~s/\n+/\n/gs;

But:

1. It's not beautiful. It should be possible to do that with far fewer regexps.
2. It just doesn't work and I really do not know why. It leaves some double tabs, white spaces, and empty lines (i.e. lines with only a tab or whitespace).

I could solve it with a while loop, but that is very slow and ugly. Any suggestions?
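One likely cause of the leftover blank lines is that the class `[\t\n]` contains no space, so a line holding only a space survives. A sketch of the same clean-up in Python (the substitutions map one-to-one onto Perl's `s///`), with the blank-line step handled by filtering lines instead of a regex. `tidy_whitespace` is a made-up name, and mixed runs such as space-tab-space are left alternating in this sketch.

```python
import re

def tidy_whitespace(text):
    text = text.replace("\r", "")
    text = re.sub(r"\t+", "\t", text)  # runs of tabs -> one tab
    text = re.sub(r" +", " ", text)    # runs of spaces -> one space
    # Keep only lines that contain something other than whitespace.
    lines = [line for line in text.split("\n") if line.strip()]
    return "\n".join(lines)

print(repr(tidy_whitespace("a\t\tb  c\r\n\n \t\nd\n")))
# 'a\tb c\nd'
```

Note that the trailing newline is dropped along with the empty lines.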

Read the article

Dealing with regular expressions, Python

- by Gusto

I want to remove some symbols from a string using a regular expression, for example: == (occurring both at the beginning and at the end of a line) and * (at the beginning of a line ONLY).

    def some_func():
        clean = re.sub(r'= {2,}', '', clean) # Removes 2 or more occurrences of = at the beginning and at the end of a line.
        clean = re.sub(r'^\* {1,}', '', clean) # Removes 1 or more occurrences of * at the beginning of a line.

What's wrong with my code? It seems like the expressions are wrong. How do I remove a character/symbol if it's at the beginning or at the end of the line (with one or more occurrences)?
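In the patterns above, the quantifier follows a space, so `= {2,}` means "an equals sign followed by two or more spaces". The patterns are also not anchored to line ends, and `^` only matches at every line start when `re.MULTILINE` is set. A corrected sketch, with `clean_markup` as a made-up name:

```python
import re

def clean_markup(text):
    # Runs of two or more '=' at the start or end of any line.
    text = re.sub(r"^={2,}|={2,}$", "", text, flags=re.MULTILINE)
    # Runs of '*' at the start of any line only.
    text = re.sub(r"^\*+", "", text, flags=re.MULTILINE)
    return text

print(repr(clean_markup("== Title ==\n* item * one\na = b")))
# ' Title \n item * one\na = b'
```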

Read the article

What is the Regular Expression For "Not Whitespace and Not a hyphen"

- by rudimenter

I tried this but it doesn't work: [^\s-] Any ideas?
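The question does not say which regex flavor is involved, so the cause cannot be pinned down from here. For what it is worth, Python's `re` accepts the class as written; escaping the hyphen, as in this sketch, removes any doubt about it being read as a range. `strip_spaces_and_hyphens` is a made-up helper.

```python
import re

NOT_SPACE_OR_HYPHEN = re.compile(r"[^\s\-]")  # one character that is neither whitespace nor '-'

def strip_spaces_and_hyphens(text):
    return "".join(NOT_SPACE_OR_HYPHEN.findall(text))

print(strip_spaces_and_hyphens("a-b c\td"))  # abcd
```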

Read the article

Convert a complicated string into an array in php

- by Patrick Beardmore

I have a PHP variable that comes from a form and needs tidying up. I hope you can help. The variable contains a list of items (possibly two- or three-word items with a space between words). I want to convert it to a comma-separated list with no superfluous white space. I want the divisions to fall only at commas, semicolons, or new lines. Blank cannot be an item. Here's a comprehensive example (with a deliberately messy input): Variable In: "dog, cat ,car,tea pot,, ,,, ;;(++NEW LINE++)fly, cake" Variable Out: "dog,cat,car,tea pot,fly,cake" Can anyone help?
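The steps are split, trim, drop empties, join. A sketch in Python rather than PHP; the same three steps exist in PHP as well. `tidy_list` is a made-up name, and repeated spaces inside an item are left alone.

```python
import re

def tidy_list(raw):
    parts = re.split(r"[,;\n]", raw)            # divide only at commas, semicolons, new lines
    items = [part.strip() for part in parts]    # trim surrounding white space
    return ",".join(item for item in items if item)  # blank cannot be an item

print(tidy_list("dog, cat ,car,tea pot,, ,,, ;;\nfly, cake"))
# dog,cat,car,tea pot,fly,cake
```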

Read the article

Regular expression: who's greedier?

- by polygenelubricants

My primary concern is with the Java flavor, but I'd also appreciate information regarding others. Let's say you have a subpattern like this: (.*)(.*) Not very useful as is, but let's say these two capture groups (say, \1 and \2) are part of a bigger pattern that matches with backreferences to these groups, etc. So both are greedy, in that they try to capture as much as possible, only taking less when they have to. My question is: who's greedier? Does \1 get first priority, giving \2 its share only if it has to? What about: (.*)(.*)(.*) Let's assume that \1 does get first priority. Let's say it got too greedy, and then spit out a character. Who gets it first? Is it always \2 or can it be \3? Let's assume it's \2 that gets \1's rejection. If this still doesn't work, who spits out now? Does \2 spit to \3, or does \1 spit out another to \2 first?
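A small experiment makes the priority order visible. The sketch below uses Python's `re`, which is a backtracking engine like `java.util.regex`; the results are expected to carry over, but only Python is demonstrated here. The third pattern uses `+` instead of `*` so that each group is forced to hold at least one character. `groups_for` is a made-up helper.

```python
import re

def groups_for(pattern, text):
    """Return the captured groups when pattern matches all of text."""
    match = re.fullmatch(pattern, text)
    return match.groups() if match else None

print(groups_for(r"(.*)(.*)", "abc"))       # ('abc', '')
print(groups_for(r"(.*)(.*)(.*)", "abc"))   # ('abc', '', '')
print(groups_for(r"(.+)(.+)(.+)", "abcd"))  # ('ab', 'c', 'd')
```

The leftmost group takes everything it can. When the overall match fails, the most recently entered quantifier gives back first, so group 2 hands characters to group 3 before group 1 is asked to give up another one.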

Read the article

How to validate an IP address with a regular expression in Objective-C

- by Rose

How can I validate an IP address in Objective-C?
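A sketch of a dotted-quad IPv4 pattern, written in Python. The pattern uses only basic syntax and should port to other engines, though that is untested here. It assumes IPv4 only and rejects leading zeros; `is_valid_ipv4` is a made-up name.

```python
import re

# One octet: 0-255 without leading zeros.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])"
IPV4 = re.compile(OCTET + r"(?:\." + OCTET + r"){3}")

def is_valid_ipv4(value):
    return IPV4.fullmatch(value) is not None

for candidate in ["192.168.0.1", "256.1.1.1", "1.2.3", "01.2.3.4"]:
    print(candidate, is_valid_ipv4(candidate))
```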

Read the article

regexp to detect that the url doesn't end with an extension

- by devnieL

Hello. I'm using this regular expression to detect whether a URL ends with jpg:

    var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]*^\.jpg)/ig;

It detects a URL such as http://www.blabla.com/sdsd.jpg, but now I want to detect that the URL doesn't end with a jpg extension. I tried this:

    var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]*[^\.jpg]\b)/ig;

but I only get http://www.blabla.com/sdsd. Then I used this:

    var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]*[^\.jpg]$)/ig;

It works if the URL is alone, but it doesn't work if the text is, e.g.: http://www.blabla.com/sdsd.jpg text
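`[^\.jpg]` is a character class: it matches one character that is not `.`, `j`, `p`, or `g`, not "anything but the sequence .jpg". Excluding a sequence needs a negative lookaround. Here is a sketch in Python with a simplified URL body (`\S+` instead of the character classes above, which is an assumption), so trailing punctuation would be included in a match. `urls_without_jpg` is a made-up name.

```python
import re

NOT_JPG_URL = re.compile(
    r"\b(?:https?|ftp|file)://\S+"  # scheme, then everything up to the next whitespace
    r"(?<!\.jpg)"                   # the text just consumed must not end in .jpg
    r"(?=\s|$)",                    # and the URL must really end here
    re.IGNORECASE,
)

def urls_without_jpg(text):
    return NOT_JPG_URL.findall(text)

print(urls_without_jpg("http://www.blabla.com/sdsd.jpg text http://www.blabla.com/page more"))
# ['http://www.blabla.com/page']
```

The final lookahead is what stops the engine from backtracking to a shorter match such as `http://www.blabla.com/sdsd`.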

Read the article

Regexs in Ruby getting filename

- by user1290757

I am extracting the file names of HTML files using the line: filename = File.basename(input_filename, ".*") which currently prints the full file name without the .html extension. All files are stored in the form http^x.x.edu^1^2. All file names begin with http^ and contain edu^. What I want is to extract the 2 (which changes); it is always the second element after .edu. I have attempted a destructive gsub!, but I'm weak with regular expressions.
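The pattern needed is "edu^, one field, ^, then capture the next field". A sketch in Python rather than Ruby; the pattern uses only basic syntax. `second_field_after_edu` is a made-up name.

```python
import re

# 'edu^', skip one caret-delimited field, then capture the following field.
AFTER_EDU = re.compile(r"edu\^[^^]*\^([^^]+)")

def second_field_after_edu(name):
    match = AFTER_EDU.search(name)
    return match.group(1) if match else None

print(second_field_after_edu("http^x.x.edu^1^2"))  # 2
```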

Read the article

Regular Expression Sanitize (PHP)

- by atif089

Hello, I would like to sanitize a string into a URL, so this is basically what I need: everything must be removed except alphanumeric characters, spaces, and dashes. Spaces should be converted into dashes. E.g. This, is the URL! must return this-is-the-url Thanks
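A sketch of the two substitutions in Python rather than PHP. Lowercasing is an assumption read off the example output, and "alphanumeric" is taken to mean ASCII; `slugify` is a made-up name.

```python
import re

def slugify(text):
    text = text.lower()
    text = re.sub(r"[^a-z0-9 -]", "", text)  # drop everything but letters, digits, spaces, dashes
    text = re.sub(r"[ -]+", "-", text)       # runs of spaces or dashes -> a single dash
    return text.strip("-")

print(slugify("This, is the URL!"))  # this-is-the-url
```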

Read the article

regular expression: extract last 2 characters

- by dotnet-practitioner

What is the best way to extract the last 2 characters of a string using a regular expression? For example, I want to extract the state code from "A_IL": I want to extract IL as a string. Please provide C# code for this: string fullexpression = "A_IL"; string StateCode = some regular expression code.... Thanks
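The pattern for "the last two characters" is `.{2}$`. The sketch below shows it in Python rather than C#; `last_two` is a made-up name. For a fixed-width suffix like this, plain slicing (`text[-2:]` in Python) is simpler than a regex.

```python
import re

def last_two(text):
    match = re.search(r".{2}$", text)  # any two characters right before the end
    return match.group(0) if match else None

print(last_two("A_IL"))  # IL
```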

Read the article

allow only [a-z][A-Z][0-9] in string using php

- by zahir hussain

Hi, I want to accept a string only if it contains a to z, A to Z, 0 to 9, and some symbols. Thanks in advance.
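The question does not say which symbols are allowed, so the sketch below assumes `_`, `.`, and `-` purely as placeholders. It is written in Python rather than PHP, and `is_allowed` is a made-up name.

```python
import re

# Letters, digits, and the assumed symbols '_', '.', '-' (placeholders).
ALLOWED = re.compile(r"[A-Za-z0-9_.-]+")

def is_allowed(value):
    return ALLOWED.fullmatch(value) is not None

for candidate in ["abc_123", "file-name.txt", "abc 123", ""]:
    print(repr(candidate), is_allowed(candidate))
```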

Read the article

How to replace only part of the match with python re.sub

- by Arty

I need to match two cases with one regular expression and do a replacement: 'long.file.name.jpg' -> 'long.file.name_suff.jpg' 'long.file.name_a.jpg' -> 'long.file.name_suff.jpg' I'm trying the following: re.sub('(\_a)?\.[^\.]*$' , '_suff.',"long.file.name.jpg") But this cuts off the extension '.jpg', and I get long.file.name_suff. instead of long.file.name_suff.jpg. I understand that this is because of the [^.]*$ part, but I can't exclude it, because I have to find the last occurrence of '_a' to replace, or the last '.'. Is there a way to replace only part of the match?
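One way is to capture the part that should survive and write it back in the replacement. A minimal sketch; `add_suffix` is a made-up name, and a function is used as the replacement so the suffix is never interpreted as a template.

```python
import re

def add_suffix(filename, suffix="_suff"):
    # Optional '_a', then the extension (captured) at the very end of the name.
    return re.sub(
        r"(?:_a)?(\.[^.]*)$",
        lambda match: suffix + match.group(1),  # keep the extension, swap the rest
        filename,
    )

print(add_suffix("long.file.name.jpg"))    # long.file.name_suff.jpg
print(add_suffix("long.file.name_a.jpg"))  # long.file.name_suff.jpg
```

A lookahead such as `(?=\.[^.]*$)` would also work, since text matched inside a lookahead is not consumed and therefore not replaced.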

Read the article

Writing a PHP web crawler using cron

- by Horse

Hi all, I have written myself a web crawler using simplehtmldom and have got the crawl process working quite nicely. It crawls the start page, adds all links into a database table, sets a session pointer, and meta refreshes the page to carry on to the next page. That keeps going until it runs out of links. That works fine; however, the crawl time for larger websites is obviously pretty tedious. I want to speed things up a bit and possibly make it a cron job. Any ideas on making it as quick and efficient as possible, other than setting the memory limit / execution time higher?

Search Results

Search found 5919 results on 237 pages for 'regex matching'.
