Search Results

Search found 3956 results on 159 pages for 'regex cookbook'.


How to capture strings using * or ? with groups in python regular expressions

- by user1334085

When the regular expression has a capturing group followed by "*" or "?", no value is captured. If you use "+" on the same string instead, you can see the capture. I need to be able to capture the same value using "?".

    >>> str1='This string has 29 characters'
    >>> re.search(r'(\d+)*', str1).group(0)
    ''
    >>> re.search(r'(\d+)*', str1).group(1)
    >>>
    >>> re.search(r'(\d+)+', str1).group(0)
    '29'
    >>> re.search(r'(\d+)+', str1).group(1)
    '29'

A more specific question is added below for clarity: I have str1 and str2 below, and I want to use just one regexp that will match both. In the case of str1, I also want to be able to capture the number of QSFP ports.

    >>> str1='''4 48 48-port and 6 QSFP 10GigE Linecard 7548S-LC'''
    >>> str2='''4 48 48-port 10GigE Linecard 7548S-LC'''
    >>>

When I do not use a metacharacter, the capture works:

    >>> re.search(r'^4\s+48\s+.*(?:(\d+)\s+QSFP).*-LC', str1, re.I|re.M).group(1)
    '6'
    >>>

It works even when I use "+" to indicate one or more occurrences:

    >>> re.search(r'^4\s+48\s+.*(?:(\d+)\s+QSFP)+.*-LC', str1, re.I|re.M).group(1)
    '6'
    >>>

But when I use "?" to match 0 or 1 occurrences, the capture fails even for str1:

    >>> re.search(r'^4\s+48\s+.*(?:(\d+)\s+QSFP)?.*-LC', str1, re.I|re.M).group(1)
    >>>
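
A possible reading of the behaviour, offered as a sketch rather than the accepted answer: a group followed by "*" or "?" is allowed to match nothing, so the search succeeds at the first position without capturing, and in the longer pattern the greedy ".*" in front of the optional group gets to consume the digits first. One way around it is to move a lazy ".*?" inside the optional group, so the engine tries to reach "<digits> QSFP" before giving up on the group. The name PATTERN and the two result variables are illustrative only.

```python
import re

str1 = '4 48 48-port and 6 QSFP 10GigE Linecard 7548S-LC'
str2 = '4 48 48-port 10GigE Linecard 7548S-LC'

# The lazy .*? sits inside the optional group, so the engine first tries to
# reach "<digits> QSFP" and only skips the whole group if that attempt fails.
PATTERN = re.compile(r'^4\s+48\s+(?:.*?(\d+)\s+QSFP)?.*-LC', re.I | re.M)

qsfp_ports_1 = PATTERN.search(str1).group(1)  # '6'
qsfp_ports_2 = PATTERN.search(str2).group(1)  # None: str2 still matches, group unset
```

Both strings match with the single pattern; only str1 fills group 1.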

Regular Expression Pattern for C# with matches

- by Sumit Gupta

I am working on a project where I need to find a frequency in a given text. I wrote a regular expression that tries to detect the frequency; however, I am stuck on how C# handles it and how exactly to use it in my software. My regular expression is (\d*)(([,\.]?\s*((k|m)?hz)*)|(\s*((k|m)?hz)*))$ and I am trying to find the value in: 23,2 Hz 24,4Hz 25,0 Hzsadf 26 Hz 27Khz 28hzzhzhzhdhdwe 29 30.4Hz 31.8 Hz 4343.34.234 Khz 65SD

Further explanation: the system needs to work for US and Belgian cultures, hence 23.2 (US) = 23,2 (Be). I am trying to find a digit followed by either khz, mhz, hz, a space, "," or ".". If it is "," or ".", it should be followed by another digit and then khz, mhz, or hz. Any help is appreciated.

Parsing HTML with XPath and PHP

- by Peter

Is there a way (using XPath and PHP) to do the following (WITHOUT external XSLT files)?

- Remove all tables and their contents
- Remove everything after the first h1 tag
- Keep only paragraphs (INCLUDING their inner HTML (links, lists, etc))

I received an XSLT answer here, but I'm looking for XPath queries that don't require external files. Currently, I've got the HTML in question loaded into a SimpleXmlElement via:

    $doc = @DOMDocument::loadHTML($xml);
    $data = simplexml_import_dom($doc);

Now I need help with:

    $data = $data->xpath('??????');

I've been working on this one for several days to no avail. I really appreciate the help.

Edit: I don't particularly care what's inside the paragraphs, as I can use strip_tags to eliminate what I don't want. All I need to do is isolate the paragraphs from the rest of the source. I suppose a more specific, accurate requirement would be this: return only paragraphs (and their HTML contents) that aren't contained in tables, and only before the first h1 tag.

How to replace plain URLs with links?

- by Sergio del Amo

I am using the function below to match URLs inside a given text and replace them with HTML links. The regular expression is working great, but currently I am only replacing the first match. How can I replace all the URLs? I guess I should be using the exec command, but I did not really figure out how to do it.

    function replaceURLWithHTMLLinks(text) {
        var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/i;
        return text.replace(exp,"<a href='$1'>$1</a>");
    }
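
For context: in JavaScript, String.replace with a regex that lacks the g flag replaces only the first match, so changing the flags to gi is the usual adjustment. The same substitution is sketched below in Python, where re.sub replaces every match by default. The names URL_RE and replace_urls_with_links are illustrative, and the pattern is the question's own with a non-capturing scheme group.

```python
import re

# Same character classes as the question's pattern; only the outer group captures.
URL_RE = re.compile(
    r"(\b(?:https?|ftp|file)://[-A-Z0-9+&@#/%?=~_|!:,.;]*[-A-Z0-9+&@#/%=~_|])",
    re.IGNORECASE,
)

def replace_urls_with_links(text: str) -> str:
    # re.sub replaces all non-overlapping matches unless count= is given.
    return URL_RE.sub(r"<a href='\1'>\1</a>", text)

linked = replace_urls_with_links("see http://a.com and https://b.org/x?y=1.")
```

The trailing full stop stays outside the second link because the last character class excludes punctuation such as "." and ",".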

How to find which delimiter was used during string split (VB.NET)

- by typoknig

Hi all, let's say I have a string that I want to split based on several characters, like ".", "!", and "?". How do I figure out which one of those characters split my string so I can add that same character back onto the end of the split segments in question?

    Dim linePunctuation As Integer = 0
    Dim myString As String = "some text. with punctuation! in it?"

    For i = 1 To Len(myString)
        If Mid$(myString, i, 1) = "." Then linePunctuation += 1
    Next
    For i = 1 To Len(myString)
        If Mid$(myString, i, 1) = "!" Then linePunctuation += 1
    Next
    For i = 1 To Len(myString)
        If Mid$(myString, i, 1) = "?" Then linePunctuation += 1
    Next

    Dim delimiters(3) As Char
    delimiters(0) = "."
    delimiters(1) = "!"
    delimiters(2) = "?"
    currentLineSplit = myString.Split(delimiters)

    Dim sentenceArray(linePunctuation) As String
    Dim count As Integer = 0
    While linePunctuation > 0
        ' Here I want to add whatever delimiter was used to make the split
        ' back onto the string before it is stored in the array.
        sentenceArray(count) = currentLineSplit(count)
        count += 1
        linePunctuation -= 1
    End While
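
One way to sidestep the bookkeeping is to match each segment together with the delimiter that ends it, instead of splitting and re-attaching afterwards. The sketch below shows the idea in Python rather than VB.NET; the variable names are illustrative.

```python
import re

my_string = "some text. with punctuation! in it?"

# Each match is a run of non-delimiter characters plus the delimiter that
# ended it (optional, so trailing text without punctuation is still kept).
sentences = [s.strip() for s in re.findall(r"[^.!?]+[.!?]?", my_string)]
# ['some text.', 'with punctuation!', 'in it?']
```

Because the delimiter is part of each match, no separate count of punctuation marks is needed.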

Python RegExp exception

- by Jasie

How do I split on all nonalphanumeric characters, EXCEPT the apostrophe? re.split('\W+',text) works, but will also split on apostrophes. How do I add an exception to this rule? Thanks!
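
A minimal sketch of one option: replace \W with a negated character class that excludes both word characters and the apostrophe. The sample text is made up for illustration.

```python
import re

text = "Don't stop, it's 5 o'clock -- really!"

# [^\w']+ matches runs of characters that are neither word characters nor
# apostrophes; empty strings left at the edges are filtered out.
words = [w for w in re.split(r"[^\w']+", text) if w]
# ["Don't", 'stop', "it's", '5', "o'clock", 'really']
```

Like \W, this class still treats the underscore as part of a word.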

combining dynamic text with regular expressions in php

- by pfunc

I am experimenting with finding popular keywords using curl, PHP and regular expressions. I have an array of non-specific words that I am matching my keyword search against. So I am looking for words like "the", "and", "that" etc. and taking them out of the keyword search. I have an array of words like so:

    $wordArr = [the, and, at,....];

and then I am running something like:

    && preg_match('(\bmyword\w*\b)', $key) == false

How do I combine these two so that it loops through the array and finds out whether any of the words in the array match the regular expression? I guess I could just do a for loop, but thought maybe I could use in_array($wordArr, $key) or something like that.
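
One common approach is to build a single alternation from the word list, so one regex test replaces the loop. The sketch below uses Python for illustration; the sample lists and the names stop_words, stop_re, and kept are assumptions, and the trailing \w* is kept from the question's pattern, so words that merely start with a stop word are filtered too.

```python
import re

stop_words = ["the", "and", "at", "that"]  # hypothetical sample list

# Escape each word and join into one alternation: \b(?:the|and|at|that)\w*\b
stop_re = re.compile(
    r"\b(?:" + "|".join(map(re.escape, stop_words)) + r")\w*\b",
    re.IGNORECASE,
)

keywords = ["The", "regex", "andes", "cookbook", "at"]
kept = [k for k in keywords if not stop_re.search(k)]
# ['regex', 'cookbook']  ("andes" is dropped because of the \w* suffix)
```

In PHP the analogous pieces would presumably be implode() and preg_quote(), though that translation is not shown here.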

How Do I Remove The First 4 Characters From A String If It Matches A Pattern In Ruby

- by James

I have the following string: "h3. My Title Goes Here". I basically want to remove the first 4 characters from the string so that I just get back: "My Title Goes Here". The thing is, I am iterating over an array of strings and not all of them have the h3. part in front, so I can't just ditch the first 4 characters blindly. I have checked the docs and the closest thing I could find was chomp, but that only works for the end of a string. Right now I am doing this:

    "h3. My Title Goes Here".reverse.chomp(" .3h").reverse

This gives me my desired output, but there has to be a better way, right? I mean, I don't want to reverse a string twice for no reason. I am new to programming so I might have missed something obvious, but I didn't see the opposite of chomp anywhere in the docs. Is there another method that will work? Thanks!
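
The usual alternative to reversing twice is a substitution anchored at the start of the string, which leaves strings without the prefix untouched. A sketch in Python with made-up sample titles (in Ruby, String#sub with a similarly anchored pattern should play the same role):

```python
import re

titles = ["h3. My Title Goes Here", "No Prefix Here"]

# ^ anchors the pattern at the start, so strings that lack the "h3. "
# prefix pass through unchanged.
cleaned = [re.sub(r"^h3\. ", "", t) for t in titles]
# ['My Title Goes Here', 'No Prefix Here']
```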

Mod rewrite with multiple query strings

- by Boris

Hi, I'm a complete n00b when it comes to regular expressions. I need these redirects:

(1) www.mysite.com/products.php?id=001&product=Product-Name&source=Source-Name
    should become -> www.mysite.com/Source-Name/001-Product-Name

(2) www.mysite.com/stores.php?id=002&name=Store-Name
    should become -> www.mysite.com/002-Store-Name

Any help much appreciated :)

string abbreviation in matlab

- by Ali

Is there an easy way to abbreviate strings in MATLAB? E.g. 'Superior Temporal Gyrus' = 'STG'
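
The underlying idea is to keep the first letter of every word. It is sketched here in Python rather than MATLAB; the function name abbreviate is an illustrative choice.

```python
def abbreviate(phrase: str) -> str:
    # Split on whitespace and keep the upper-cased first letter of each word.
    return "".join(word[0].upper() for word in phrase.split())

abbr = abbreviate("Superior Temporal Gyrus")  # 'STG'
```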

Classic asp comparison of comma separated lists

- by Reiwoldt

Hello, I have two comma separated lists:

    36,189,47,183,65,50
    65,50,189,47

The question is how to compare the two in classic ASP in order to identify and return any values that exist in list 1 but don't exist in list 2, bearing in mind that associative arrays aren't available. E.g., in the above example I would need the return value to be 36,183. Thanks
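
The comparison itself is an order-preserving list difference. The sketch below shows the logic in Python, not classic ASP; in ASP the membership test would need a different mechanism, which is not shown. Variable names are illustrative.

```python
list1 = "36,189,47,183,65,50"
list2 = "65,50,189,47"

# Values from list 2, used only for membership tests.
exclude = set(list2.split(","))

# Keep list 1's order and drop anything that also appears in list 2.
missing = ",".join(v for v in list1.split(",") if v not in exclude)
# '36,183'
```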

Find last match with python regular expression

- by SDD

I want to match the last occurrence of a simple pattern in a string, e.g.

    list = re.findall(r"\w+ AAAA \w+", "foo bar AAAA foo2 AAAA bar2")
    print "last match: ", list[len(list)-1]

However, if the string is very long, a huge list of matches is generated. Is there a more direct way to match the second occurrence of "AAAA", or should I use this workaround?
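
One option that avoids building the whole list is to iterate with re.finditer and keep only the most recent match. This is a sketch with illustrative variable names. It returns the same value as the last element of the findall list, because both scan for non-overlapping matches from left to right.

```python
import re

text = "foo bar AAAA foo2 AAAA bar2"

last = None
# finditer yields match objects lazily, so only the latest one is retained.
for last in re.finditer(r"\w+ AAAA \w+", text):
    pass

last_match = last.group(0) if last else None
```

Note that with this sample the only non-overlapping match is "bar AAAA foo2", since "foo2" is consumed by the first match.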

How can I match a null byte (0x00) in the Visual Studio binary editor with a find using a re

- by Paul K

Open a file in the Visual Studio binary editor that contains a null byte (0x00), then use the Quick Find feature (Ctrl+F) to find null bytes. I would have thought I could use a regular expression such as \x00 to match null bytes, but it doesn't work. Searching for any other hex value using this method works fine. Is this a VS bug, a 'feature', or am I just missing something? Is there a workaround?

Delete all characters in a multiline string up to a given pattern

- by biffabacon

Using Python I need to delete all characters in a multiline string up to the first occurrence of a given pattern. In Perl this can be done using regular expressions with something like:

    #remove all chars up to first occurrence of cat or dog or rat
    $pattern = 'cat|dog|rat'
    $pagetext =~ s/(.*?)($pattern)/$2/xms;

What's the best way to do it in Python?
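
A minimal sketch of one way to do this: search for the pattern once and slice the string from the start of the match. The sample text and variable names are made up for illustration.

```python
import re

page_text = "line one\nline two with a dog\nand a cat later"

match = re.search(r"cat|dog|rat", page_text)

# Keep everything from the first hit onward; leave the text alone if the
# pattern does not occur at all.
trimmed = page_text[match.start():] if match else page_text
# 'dog\nand a cat later'
```

Slicing avoids the need for a DOTALL flag, because nothing has to match across the newlines that are being discarded.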

What regular expression(s) would I use to remove escaped html from large sets of data.

- by Elizabeth Buckwalter

Our database is filled with articles retrieved from RSS feeds. I was unsure of what data I would be getting, and how much filtering was already set up (WP-O-Matic Wordpress plugin using the SimplePie library). This plugin does some basic encoding before insertion using Wordpress's built-in post insert function, which also does some filtering. I've figured out most of the filters before insertion, but now I have whacko data that I need to remove.

This is an example of the whacko data. One field has the content I want at the front, followed by this part at the end, which I want removed:

    <img src="http://feeds.feedburner.com/~ff/SoundOnTheSound?i=xFxEpT2Add0:xFbIkwGc-fk:V_sGLiPBpWU" border="0"></img>
    <img src="http://feeds.feedburner.com/~ff/SoundOnTheSound?d=qj6IDK7rITs" border="0"></img>
    <img src="http://feeds.feedburner.com/~ff/SoundOnTheSound?i=xFxEpT2Add0:xFbIkwGc-fk:D7DqB2pKExk"

Notice how some of the images are escaped and some aren't. I believe this has to do with the last part being cut off so as to be unrecognizable as an HTML tag, which then caused it to be HTML encoded. Another field has only this, which is now filtered before insertion, but I have to get rid of the others:

    <img src="http://farm3.static.flickr.com/2183/2289902369_1d95bcdb85.jpg" alt="post_img" width="80"

(All examples are on one line, but broken up for readability.)

Question: What is the best way to work with the above escaped HTML (or portion of an HTML tag)? I can do it in Perl, PHP, SQL, Ruby, and even Python. I believe Perl to be the best at text parsing, so that's why I used the Perl tag. And PHP times out on large database operations, so that's pretty much out unless I wanted to do batch processing and whatnot.

PS One of the nice things about using Wordpress's insert post function is that if you use PHP's strip_tags function to strip out all HTML, the insert post function will insert <p> at the paragraph points. Let me know if there's anything more that I can answer.

Some articles that didn't quite answer my questions:
(http://stackoverflow.com/questions/2016751/remove-text-from-within-a-database-text-field)
(http://stackoverflow.com/questions/462831/regular-expression-to-escape-html-ampersands-while-respecting-cdata)

Django - urls.py - Filenames with a hash/pound (#) sign?

- by miya

I'm using Django and realized that when the filename the user wants to access (let's say a photo) contains the pound sign, the entry in urls.py does not match. Any ideas?

    url(r'^static/(?P<path>.*)$', 'django.views.static.serve', {'document_root': MEDIA_ROOT},

It just says:

    "/home/user/project/static/upload/images/hello" does not exist

when the name of the file is actually hello#world.jpg.

Thanks, Nico
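
A likely explanation is that an unescaped "#" starts the URL fragment, which is never part of the path the server sees, so the link has to carry "%23" instead. The sketch below illustrates this with the standard library. The path prefix is taken from the error message, and the variable names are illustrative.

```python
from urllib.parse import quote, urlsplit

filename = "hello#world.jpg"
raw_url = "/static/upload/images/" + filename

# Everything after an unescaped '#' is parsed as the fragment, not the path.
path_seen = urlsplit(raw_url).path        # '/static/upload/images/hello'

# Percent-encoding the filename keeps the '#' inside the path.
safe_url = "/static/upload/images/" + quote(filename)
# '/static/upload/images/hello%23world.jpg'
```

So the URL pattern itself may be fine. The fix would then belong wherever the link to the file is generated.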

Flex 3 Regular Expression Problem

- by Tommy

I've written a URL validator for a project I am working on. For my requirements it works great, except that when the last part of the URL is longer than 22 characters it breaks. My expression:

    /((https?):\/\/)([^\s.]+.)+([^\s.]+)(:\d+\/\S+)/i

It expects input that looks like "http(s)://hostname:port/location". When I give it the input:

    https://demo10:443/111112222233333444445

it works, but if I pass the input

    https://demo10:443/1111122222333334444455

it breaks. You can test it out easily at http://ryanswanson.com/regexp/#start. Oddly, I can't reproduce the problem with just the relevant (I would think) part /(:\d+\/\S+)/i. I can have as many characters after the required / and it works great. Any ideas or known bugs?
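
A plausible cause, offered as a guess rather than a confirmed diagnosis, is the unescaped "." in ([^\s.]+.)+. With it, the nested quantifiers can carve up the same text in a great many ways, and some engines slow down or give up as the input grows. The sketch below, in Python rather than ActionScript, escapes the dot and keeps ":" and "/" out of host labels so that only one split is possible. URL_RE is a hypothetical tightened pattern, not the author's.

```python
import re

# Hypothetical tightened pattern: dots are escaped and each host label
# excludes '.', ':' and '/', so the host can be split in only one way.
URL_RE = re.compile(r"^https?://(?:[^\s.:/]+\.)*[^\s.:/]+:\d+/\S+$", re.IGNORECASE)

short_ok = bool(URL_RE.match("https://demo10:443/111112222233333444445"))
long_ok = bool(URL_RE.match("https://demo10:443/1111122222333334444455"))
```

Both inputs from the question are accepted, and the length of the final path segment no longer affects how much work the match takes.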

parse youtube video id using preg_match

- by Webbo

Hi, I am attempting to parse the video ID of a YouTube URL using preg_match. I found a regular expression on this site that appears to work:

    (?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=[0-9]/)[^&\n]+|(?<=v=)[^&\n]+

as shown in this pic: http://i.imgur.com/SQJW2.jpg

My PHP is as follows, but it doesn't work (it gives an Unknown modifier '[' error):

    <?
    $subject = "http://www.youtube.com/watch?v=z_AbfPXTKms&NR=1";
    preg_match("(?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=[0-9]/)[^&\n]+|(?<=v=)[^&\n]+", $subject, $matches);
    print "<pre>";
    print_r($matches);
    print "</pre>";
    ?>

Cheers
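
On the error itself: PHP's preg functions expect the pattern to be wrapped in delimiters. Here the leading "(" is taken as the opening delimiter and the first ")" as the closing one, so "[" is read as a modifier. Wrapping the whole pattern in a delimiter that does not occur inside it is the usual adjustment. Separately, for a watch URL like the one in the question, the ID can be read from the query string without a regex. The sketch below shows that alternative in Python with illustrative variable names.

```python
from urllib.parse import urlparse, parse_qs

subject = "http://www.youtube.com/watch?v=z_AbfPXTKms&NR=1"

# parse_qs maps each query parameter to a list of its values.
query = parse_qs(urlparse(subject).query)

# Take the first "v" value, or None when the parameter is absent.
video_id = query.get("v", [None])[0]   # 'z_AbfPXTKms'
```

This only covers URLs that carry the ID in the v parameter. Other YouTube URL shapes would need separate handling.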

string substitution regular expression not working in tcl

- by Puneet Mittal

I am trying to replace all the special characters, including white space, hyphen, etc., with underscores in a string variable in Tcl. I wrote the code below but it doesn't seem to be working.

    set varname $origVar
    puts "Variable Name :>> $varname"
    if {$varname != ""} {
        regsub -all {[\s-\]\[$^?+*()|\\%&#]} $varname "_" $newVar
    }
    puts "New Variable :>> $newVar"

One issue is that, instead of replacing the string in $varname, it is replacing the data inside $origVar. I have no idea why. I also read the example code (for proper syntax) in my Tcl book, and according to that it should be something like this:

    regsub -all {[\s-][$^?+*()|\\%&#]} $varname "_" newVar

So I used the same syntax, but it didn't work and gave the same result, modifying $origVar instead of the required $varname value.
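
Two things may be worth checking here. The last argument of regsub is the name of the variable that receives the result, so it is normally written newVar rather than $newVar. Also, a hyphen in the middle of a bracket expression can be read as a range, so it is safer to escape it or put it first or last. The sketch below shows the intended replacement in Python, not Tcl, with a made-up sample value and illustrative names.

```python
import re

orig_var = "my var-name(1)*"

# Whitespace and each listed punctuation character become one underscore.
# The hyphen and the brackets are escaped so they are taken literally.
new_var = re.sub(r"[\s\-\[\]$^?+*()|\\%&#]", "_", orig_var)
# 'my_var_name_1__'
```

The source string is left untouched, and the result lands in a separate variable.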

Nullability (Regular Expressions)

- by danportin

In Brzozowski's "Derivatives of Regular Expressions" and elsewhere, the function d(R), returning λ if R is nullable and Ø otherwise, includes clauses such as the following:

    d(R1 + R2) = d(R1) + d(R2)
    d(R1 · R2) = d(R1) ∧ d(R2)

Clearly, if both R1 and R2 are nullable then (R1 · R2) is nullable, and if either R1 or R2 is nullable then (R1 + R2) is nullable. It is unclear to me what the above clauses are supposed to mean, however. My first thought, mapping (+), (·), or the Boolean operations to regular sets, is nonsensical, since in the base case,

    d(a) = Ø (for all a ∈ Σ)
    d(λ) = λ
    d(Ø) = Ø

and λ is not a set (nor is the return type of d, which is a regular expression). Furthermore, this mapping isn't indicated, and there is a separate notation for it. I understand nullability, but I'm lost on the definition of the sum, product, and Boolean operations in the definition of d: how are λ or Ø returned from d(R1) ∧ d(R2), for instance, in the definition of d(R1 · R2)?
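
One way to read these clauses is to note that d only ever produces λ or Ø, so the two values behave like "true" and "false". On them, + acts as OR (λ + Ø = λ) and ∧ acts as AND (λ ∧ Ø = Ø). The sketch below encodes that reading in Python. The class names and the boolean helper nullable are illustrative assumptions, not Brzozowski's notation, and only the clauses quoted above are covered.

```python
from dataclasses import dataclass

class Regex:
    """Base class for the tiny regular-expression AST used in this sketch."""

@dataclass(frozen=True)
class Empty(Regex):      # Ø, the empty set
    pass

@dataclass(frozen=True)
class Lambda(Regex):     # λ, the empty string
    pass

@dataclass(frozen=True)
class Sym(Regex):        # a single symbol a ∈ Σ
    char: str

@dataclass(frozen=True)
class Sum(Regex):        # R1 + R2
    left: Regex
    right: Regex

@dataclass(frozen=True)
class Cat(Regex):        # R1 · R2
    left: Regex
    right: Regex

def nullable(r: Regex) -> bool:
    if isinstance(r, Lambda):
        return True
    if isinstance(r, (Empty, Sym)):
        return False
    if isinstance(r, Sum):   # d(R1 + R2) = d(R1) + d(R2): OR
        return nullable(r.left) or nullable(r.right)
    if isinstance(r, Cat):   # d(R1 · R2) = d(R1) ∧ d(R2): AND
        return nullable(r.left) and nullable(r.right)
    raise TypeError(f"unsupported node: {r!r}")

def d(r: Regex) -> Regex:
    # Map the boolean back to the two regular expressions d can return.
    return Lambda() if nullable(r) else Empty()
```

For example, d(Cat(Lambda(), Sym("a"))) is Ø, while d(Sum(Lambda(), Sym("a"))) is λ.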

Weird error using preg_match and unicode

- by Thorpe Obazee

    if (preg_match('(\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+)', '2010/02/14/this-is-something')) {
        // do stuff
    }

The above code works. However, this one doesn't:

    if (preg_match('/\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+/u', '2010/02/14/this-is-something')) {
        // do stuff
    }

Maybe someone could shed some light on why the second one doesn't work. This is the error that is produced:

    A PHP Error was encountered
    Severity: Warning
    Message: preg_match() [function.preg-match]: Unknown modifier '\'

Confusion in RegExp Reluctant quantifier? Java

- by Dusk

Hi, could anyone please tell me why I get ab as the output of the following RegExp code using a reluctant quantifier?

    Pattern p = Pattern.compile("abc*?");
    Matcher m = p.matcher("abcfoo");
    while (m.find())
        System.out.println(m.group()); // ab

And why do I get empty indices for the following code?

    Pattern p = Pattern.compile(".*?");
    Matcher m = p.matcher("abcfoo");
    while (m.find())
        System.out.println(m.group());
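
The short version is that a reluctant quantifier matches as few characters as it can while still letting the overall pattern succeed. When nothing follows it, "as few as possible" is zero. The sketch below shows the same effect in Python rather than Java. It only checks the first match for ".*?", because engines differ in how they continue after an empty match.

```python
import re

lazy = re.findall(r"abc*?", "abcfoo")    # ['ab']: c*? settles for zero 'c's
greedy = re.findall(r"abc*", "abcfoo")   # ['abc']: c* takes the 'c'

# With nothing after it, .*? is satisfied by the empty string at position 0.
first_empty = re.match(r".*?", "abcfoo").group()   # ''

# Once something must follow, the lazy part grows only as far as needed.
forced = re.match(r".*?foo", "abcfoo").group()     # 'abcfoo'
```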

Switch statement for string matching in JavaScript

- by yaya3

How do I write a switch for the following conditional? If the url contains "foo", then settings.base_url is "bar". The following achieves the required effect, but I've a feeling this would be more manageable in a switch:

    var doc_location = document.location.href;
    var url_strip = new RegExp("http:\/\/.*\/");
    var base_url = url_strip.exec(doc_location)
    var base_url_string = base_url[0];

    //BASE URL CASES

    // LOCAL
    if (base_url_string.indexOf('xxx.local') > -1) {
        settings = { "base_url" : "http://xxx.local/" };
    }

    // DEV
    if (base_url_string.indexOf('xxx.dev.yyy.com') > -1) {
        settings = { "base_url" : "http://xxx.dev.yyy.com/xxx/" };
    }

Thanks

Regular expression for matching words between <blockquote> & </blockquote>

- by Senthil

Basically I want to strip the document of words between blockquotes. I'm a regular expression newb and even after using rubular, I'm no closer to the answer. Any help is appreciated.
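
For simple, non-nested markup, a lazy match between the opening and closing tags is often enough. The sketch below is in Python with a made-up sample document. Nested blockquotes or malformed HTML would call for a real HTML parser instead.

```python
import re

html = "<p>Keep me.</p><blockquote>Drop these words.</blockquote><p>Keep me too.</p>"

# re.S lets '.' cross newlines; the lazy .*? stops at the first closing tag,
# so separate blockquotes are removed one at a time.
stripped = re.sub(
    r"<blockquote\b[^>]*>.*?</blockquote>", "", html, flags=re.I | re.S
)
# '<p>Keep me.</p><p>Keep me too.</p>'
```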

Markdown implementation in PHP parses text within <a> tags — how does one disable this behavior?

- by Kyle

I'm using the Markdown library for PHP by Michel Fortin. I started noticing that it formats the text in <a> tags with Markdown rules, like so:

    http://foo.com/My_Url_With_Underscores

essentially becomes:

    <a href="...">http://foo.com/My<em>Url</em>With_Underscores</a>

How do I disable that behavior or otherwise prevent the library from doing that?

