Search Results

Search found 5919 results on 237 pages for 'regex matching'.

Page 114/237

  • Regular expressions in python unicode

    - by Remy
    I need to remove all the HTML tags from a given web page's data. I tried this using regular expressions:

        import urllib2
        import re
        page = urllib2.urlopen("http://www.frugalrules.com")
        from bs4 import BeautifulSoup, NavigableString, Comment
        soup = BeautifulSoup(page)
        link = soup.find('link', type='application/rss+xml')
        print link['href']
        rss = urllib2.urlopen(link['href']).read()
        souprss = BeautifulSoup(rss)
        description_tag = souprss.find_all('description')
        content_tag = souprss.find_all('content:encoded')
        print re.sub('<[^>]*>', '', content_tag)

    But the signature of re.sub is:

        re.sub(pattern, repl, string, count=0)

    So I modified the code (replacing the print statement above) to:

        for row in content_tag:
            print re.sub(ur"<[^>]*>",'',row,re.UNICODE)

    But it gives the following error:

        Traceback (most recent call last):
          File "C:\beautifulsoup4-4.3.2\collocation.py", line 20, in <module>
            print re.sub(ur"<[^>]*>",'',row,re.UNICODE)
          File "C:\Python27\lib\re.py", line 151, in sub
            return _compile(pattern, flags).sub(repl, string, count)
        TypeError: expected string or buffer

    What am I doing wrong?
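
    A minimal sketch of the two likely problems, written for Python 3 rather than the Python 2 used in the question. The fourth positional argument of re.sub is count, so a flag passed there is read as a replacement limit, and re.sub only accepts a string, so a parsed element has to be converted first. The helper name strip_tags and the sample rows are made up for illustration.

```python
import re

# Flags belong to the compiled pattern (or the flags= keyword of re.sub),
# not to the fourth positional argument, which is `count`.
TAG_PATTERN = re.compile(r"<[^>]*>", flags=re.UNICODE)


def strip_tags(markup):
    """Remove anything that looks like an HTML tag from `markup`."""
    # re.sub needs a string, so any other object is converted with str().
    return TAG_PATTERN.sub("", str(markup))


# Hypothetical sample rows standing in for the feed content.
rows = ["<p>Hello <b>world</b></p>", "<div>caf\u00e9</div>"]
for row in rows:
    print(strip_tags(row))
```

    A regular expression of this kind is only a rough tag stripper; it does not handle comments, scripts, or a ">" inside an attribute value.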


  • Confusion in RegExp Reluctant quantifier? Java

    - by Dusk
    Hi, could anyone please explain why the following RegExp code, which uses a reluctant quantifier, outputs ab?

        Pattern p = Pattern.compile("abc*?");
        Matcher m = p.matcher("abcfoo");
        while (m.find())
            System.out.println(m.group()); // ab

    And why do I get empty matches for the following code?

        Pattern p = Pattern.compile(".*?");
        Matcher m = p.matcher("abcfoo");
        while (m.find())
            System.out.println(m.group());
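
    The same behaviour can be reproduced in Python, whose re module treats reluctant quantifiers the same way for these two patterns. This is only an illustrative sketch; it looks at the first match alone, because engines differ in how they advance after an empty match.

```python
import re

text = "abcfoo"

# c*? is reluctant: it takes as few 'c' characters as possible, and zero
# is already enough for the overall match to succeed, so only "ab" is used.
print(re.search(r"abc*?", text).group())   # ab

# The greedy form takes the 'c' as well.
print(re.search(r"abc*", text).group())    # abc

# .*? on its own is satisfied by zero characters, so the first match is
# the empty string at index 0.
first = re.search(r".*?", text)
print(repr(first.group()), first.span())   # '' (0, 0)
```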


  • Need a regular expression for an Irish phone number

    - by Eoghan O'Brien
    I need to validate an Irish phone number, but I don't want to make it too user-unfriendly. Many people are used to writing their phone number with brackets around the area code, followed by 5 to 7 digits for the number; some add a space after the area code or mobile operator prefix. Irish landline numbers consist of an area code of 1 to 4 digits and a number of 5 to 8 digits, e.g.:

        (021) 9876543
        (01)9876543
        01 9876543
        (0402)39385

    I'm looking for a regular expression for JavaScript/PHP. Thanks.
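
    A minimal sketch of one possible pattern, shown in Python for brevity. It encodes only the rule stated in the question (area code of 1 to 4 digits, optionally in brackets, then 5 to 8 digits) and is not a check against any official numbering plan. The names IRISH_LANDLINE and looks_like_irish_landline are invented for this example.

```python
import re

# Assumed rule, taken from the description above: an area code of 1-4
# digits, optionally wrapped in brackets, optional whitespace, then a
# subscriber number of 5-8 digits.
IRISH_LANDLINE = re.compile(r"(?:\(\d{1,4}\)|\d{1,4})\s*\d{5,8}")


def looks_like_irish_landline(number):
    """Return True if the whole string fits the assumed format."""
    return IRISH_LANDLINE.fullmatch(number.strip()) is not None


for sample in ["(021) 9876543", "(01)9876543", "01 9876543", "(0402)39385"]:
    print(sample, looks_like_irish_landline(sample))
```

    Writing the bracketed and unbracketed area codes as two alternatives rejects unbalanced input such as "(021 9876543". In JavaScript or PHP the same expression would need ^ and $ anchors, since fullmatch supplies them implicitly here.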


  • Regular Expression to find the job id in a string

    - by Jamie
    Hi all, could someone please help me? I will be forever appreciative. I'm trying to create a regular expression that will extract 797 from "Your job 797 ("job_name") has been submitted", or 9212 from "Your Job 9212 ("another_job_name") has been submitted", etc. Any ideas? Thanks, guys!
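
    One way to do this, sketched in Python: anchor on the words "Your job" (case-insensitively, since both "job" and "Job" appear) and capture the digits that follow. The function name extract_job_id is hypothetical.

```python
import re

# Capture the run of digits that follows "your job", ignoring case.
JOB_ID = re.compile(r"your job (\d+)", re.IGNORECASE)


def extract_job_id(message):
    """Return the job id as an int, or None if the message has none."""
    match = JOB_ID.search(message)
    return int(match.group(1)) if match else None


print(extract_job_id('Your job 797 ("job_name") has been submitted'))
print(extract_job_id('Your Job 9212 ("another_job_name") has been submitted'))
```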


  • How to export the matches only in a pattern search in vim?

    - by Mert Nuhoglu
    Is there a way to grab and export only the matched part of a pattern search without changing the current file? For example, from a file containing:

        57","0","37","","http://www.thisamericanlife.org/Radio_Episode.aspx?episode=175"
        58","0","37","","http://www.thisamericanlife.org/Radio_Episode.aspx?episode=170"

    I want to export a new file containing:

        http://www.thisamericanlife.org/Radio_Episode.aspx?episode=175
        http://www.thisamericanlife.org/Radio_Episode.aspx?episode=170

    I can do this by using substitution like this:

        :s/.\{-}\(http:\/\/.\{-}\)".\{-}/\1/g
        :%w>>data

    But the substitution command changes the current file. Is there a way to do this without changing the current file?


  • Regular expression to retrieve everything before first slash

    - by alex
    I need a regular expression to get the first part of a string, before the first slash (\). For example, in the following:

        C:\MyFolder\MyFile.zip

    the part I need is "C:". Another example:

        somebucketname\MyFolder\MyFile.zip

    Here I would need "somebucketname". I also need a regular expression to retrieve the "right-hand" part, i.e. everything after the first slash (excluding the slash itself). For example, somebucketname\MyFolder\MyFile.zip would return MyFolder\MyFile.zip.
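
    A short Python sketch of a single expression that yields both parts at once: group 1 is everything before the first backslash and group 2 is everything after it. The helper name split_at_first_backslash is made up for this example.

```python
import re

# [^\\]* cannot cross a backslash, so group 1 stops at the first one;
# group 2 then takes the rest of the line.
FIRST_BACKSLASH = re.compile(r"^([^\\]*)\\(.*)$")


def split_at_first_backslash(path):
    """Return (left, right) around the first backslash in `path`."""
    match = FIRST_BACKSLASH.match(path)
    if match is None:
        return path, ""   # no backslash at all
    return match.group(1), match.group(2)


print(split_at_first_backslash("C:\\MyFolder\\MyFile.zip"))
print(split_at_first_backslash("somebucketname\\MyFolder\\MyFile.zip"))
```

    In Python specifically, "path".partition("\\") gives the same split without a regular expression; the pattern is shown because the question asks for one.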


  • How to capture strings using * or ? with groups in python regular expressions

    - by user1334085
    When the regular expression has a capturing group followed by "*" or "?", no value is captured. If you use "+" instead for the same string, you can see the capture. I need to be able to capture the same value using "?".

        >>> str1='This string has 29 characters'
        >>> re.search(r'(\d+)*', str1).group(0)
        ''
        >>> re.search(r'(\d+)*', str1).group(1)
        >>>
        >>> re.search(r'(\d+)+', str1).group(0)
        '29'
        >>> re.search(r'(\d+)+', str1).group(1)
        '29'

    A more specific question is added below for clarity. I have str1 and str2 below, and I want to use just one regexp that will match both. In the case of str1, I also want to be able to capture the number of QSFP ports.

        >>> str1='''4 48 48-port and 6 QSFP 10GigE Linecard 7548S-LC'''
        >>> str2='''4 48 48-port 10GigE Linecard 7548S-LC'''
        >>>

    When I do not use a metacharacter, the capture works:

        >>> re.search(r'^4\s+48\s+.*(?:(\d+)\s+QSFP).*-LC', str1, re.I|re.M).group(1)
        '6'
        >>>

    It works even when I use "+" to indicate one or more occurrences:

        >>> re.search(r'^4\s+48\s+.*(?:(\d+)\s+QSFP)+.*-LC', str1, re.I|re.M).group(1)
        '6'
        >>>

    But when I use "?" to match 0 or 1 occurrences, the capture fails even for str1:

        >>> re.search(r'^4\s+48\s+.*(?:(\d+)\s+QSFP)?.*-LC', str1, re.I|re.M).group(1)
        >>>
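
    A sketch of why this happens and one possible rewrite. With "?" the group is allowed to match nothing, and the greedy ".*" in front of it has already consumed the number, so the engine never needs to backtrack into the group. Moving the scan inside the optional group makes the engine try the "<digits> QSFP" branch first. The name LINECARD is invented here, and the rewritten pattern is a suggestion rather than the only fix.

```python
import re

str1 = "4 48 48-port and 6 QSFP 10GigE Linecard 7548S-LC"
str2 = "4 48 48-port 10GigE Linecard 7548S-LC"

# (\d+)* may match zero times, so the search succeeds with an empty
# match at index 0 before it ever reaches the digits.
print(repr(re.search(r"(\d+)*", "This string has 29 characters").group(0)))

# Hypothetical rewrite: the lazy scan up to the number lives inside the
# optional group, so the group is skipped only when no "<digits> QSFP"
# can be found at all.
LINECARD = re.compile(r"^4\s+48\s+(?:.*?(\d+)\s+QSFP)?.*-LC", re.I | re.M)

print(LINECARD.search(str1).group(1))   # 6
print(LINECARD.search(str2).group(1))   # None
```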


  • jquery sortable with regexp

    - by Chris Lively
    I am trying to figure out the right regexp to match on list item IDs. For example:

        <ul id="MyList" class="connectedSortable">
            <li id="id=1-32">Item 1</li>
            <li id="id=2_23">Item 2</li>
            <li id="id=3">Item 3</li>
            <li id="id=4">Item 4</li>
            <li id="id=5">Item 5</li>
            <li id="id=6">Item 6</li>
        </ul>

    On the serialize method, I want it to pull everything after the equals sign (=):

        $(function () {
            $("#MyList, #OtherList").sortable({
                connectWith: '.connectedSortable',
                update: function () {
                    $("#MyListOrder").val($("#MyList").sortable('serialize', { regexp: '/(.+)[=](.+)/)' }));
                }
            }).disableSelection();
        });

    I tried the above, but that didn't quite work. My regexp is wrong and I don't know what it should be. Ideas?


  • List files with two dots in their names using java regular expressions

    - by Nivas
    I was trying to match files in a directory that have two dots in their name, something like theme.default.properties. I thought .\\..\\.. should be the required pattern [. matches any character and \. matches a dot], but it matches both oneTwo.txt and theme.default.properties.

    I tried the following [resources/themes has two files, oneTwo.txt and theme.default.properties]:

    1.
        public static void loadThemes() {
            File themeDirectory = new File("resources/themes");
            if(themeDirectory.exists()) {
                File[] themeFiles = themeDirectory.listFiles();
                for(File themeFile : themeFiles) {
                    if(themeFile.getName().matches(".\\..\\.."));
                    {
                        System.out.println(themeFile.getName());
                    }
                }
            }
        }

    This prints nothing, and the following

        File[] themeFiles = themeDirectory.listFiles(new FilenameFilter() {
            public boolean accept(File dir, String name) {
                return name.matches(".\\..\\..");
            }
        });
        for (File file : themeFiles) {
            System.out.println(file.getName());
        }

    prints both

        oneTwo.txt
        theme.default.properties

    I am unable to find why these two give different results, and which pattern I should be using to match two dots. Can someone help?
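
    A minimal Python sketch of a pattern for "exactly two dots", for comparison. It uses fullmatch because Java's String.matches also requires the whole string to match. The names EXACTLY_TWO_DOTS and has_two_dots are illustrative only.

```python
import re

# Three runs of non-dot characters separated by two literal dots.
EXACTLY_TWO_DOTS = re.compile(r"[^.]*\.[^.]*\.[^.]*")


def has_two_dots(filename):
    """True if `filename` contains exactly two dots."""
    # fullmatch mirrors Java's String.matches (whole-string match).
    return EXACTLY_TWO_DOTS.fullmatch(filename) is not None


for name in ["oneTwo.txt", "theme.default.properties"]:
    print(name, has_two_dots(name))
```

    As a Java string literal the same expression would be written "[^.]*\\.[^.]*\\.[^.]*". Separately, the semicolon directly after the if condition in the first Java listing ends the if statement, so the braces that follow are an ordinary block that runs regardless of the match result.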


  • RegularExpressionValidator - Windows ID Validation

    - by Albert
    I'd like to set up a RegularExpressionValidator to ensure users are entering valid Windows IDs in a textbox. Specifically, I'd like to ensure each ID is any three capital letters (for our range of domains), followed by a backslash, followed by any number of letters and numbers. Does anyone know where I can find some examples of this type of validation, or can somebody whip one up for me? :)
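
    A small Python sketch of a pattern matching that description. It assumes "any number of letters and numbers" means at least one character and ASCII letters only; both are assumptions, as are the names WINDOWS_ID and is_valid_windows_id.

```python
import re

# Three capital letters, a literal backslash, then one or more ASCII
# letters or digits (the "one or more" is an assumption).
WINDOWS_ID = re.compile(r"[A-Z]{3}\\[A-Za-z0-9]+")


def is_valid_windows_id(value):
    """True if the whole string looks like DOM\\username."""
    return WINDOWS_ID.fullmatch(value) is not None


print(is_valid_windows_id("ABC\\jsmith1"))   # True
print(is_valid_windows_id("abc\\jsmith1"))   # False
```

    The expression itself, ^[A-Z]{3}\\[A-Za-z0-9]+$, uses no Python-specific syntax, so it should carry over to a .NET validation expression, though that has not been tested here.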


  • Weird error using preg_match and unicode

    - by Thorpe Obazee
        if (preg_match('(\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+)', '2010/02/14/this-is-something')) {
            // do stuff
        }

    The above code works. However, this one doesn't:

        if (preg_match('/\p{Nd}{4}/\p{Nd}{2}/\p{Nd}{2}/\p{L}+/u', '2010/02/14/this-is-something')) {
            // do stuff
        }

    Maybe someone could shed some light on why the second one doesn't work. This is the error that is produced:

        A PHP Error was encountered
        Severity: Warning
        Message: preg_match() [function.preg-match]: Unknown modifier '\'
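
    In PHP the first character of the pattern string is its delimiter. In the working version the parentheses serve as delimiters; in the failing version "/" is the delimiter, so the pattern ends at the first unescaped "/" inside it and the remainder is read as modifiers, hence the warning. Escaping the inner slashes or choosing another delimiter such as "#" avoids this. For comparison, here is a rough Python equivalent of the pattern itself, where no delimiters exist; the standard re module has no \p{...} classes, so the substitutes below are approximations.

```python
import re

# Python patterns have no delimiters, so "/" is an ordinary character.
# For str patterns, \d matches Unicode decimal digits (category Nd), and
# [^\W\d_] is a common stand-in for "any letter", since the standard re
# module does not support \p{L}.
DATE_SLUG = re.compile(r"\d{4}/\d{2}/\d{2}/[^\W\d_]+")

match = DATE_SLUG.search("2010/02/14/this-is-something")
print(match.group())   # 2010/02/14/this
```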


  • PHP regular expression for positive number with 0 or 2 decimal places

    - by Peter
    Hi, I am trying to use the following regular expression to check whether a string is a positive number with either zero decimal places or two:

        ^\d+(\.(\d{2}))?$

    When I try to match this using preg_match, I get the error:

        Warning: preg_match(): No ending delimiter '^' found in /Library/WebServer/Documents/lib/forms.php on line 862

    What am I doing wrong?
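
    The warning says PHP took "^" as the pattern delimiter, because preg_match expects the expression to be wrapped in delimiters, for example '/^\d+(\.\d{2})?$/'. The expression itself looks sound; a quick Python check of the same logic is sketched below. Python needs no delimiters, and the function name is made up. Note that \d in Python 3 also accepts non-ASCII decimal digits.

```python
import re

# Whole digits, optionally followed by a dot and exactly two digits.
AMOUNT = re.compile(r"\d+(\.\d{2})?")


def is_amount(value):
    """True for strings like '12' or '12.50', False for '12.5'."""
    return AMOUNT.fullmatch(value) is not None


for sample in ["12", "12.50", "12.5", "12.505", "-3"]:
    print(sample, is_amount(sample))
```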


  • Java Matcher groups: Understanding The difference between "(?:X|Y)" and "(?:X)|(?:Y)"

    - by user358795
    Can anyone explain:

    Why do the two patterns used below give different results? (answered below)
    Why does the 2nd example give a group count of 1 but say the start and end of group 1 is -1?

        public void testGroups() throws Exception {
            String TEST_STRING = "After Yes is group 1 End";
            {
                Pattern p;
                Matcher m;
                String pattern="(?:Yes|No)(.*)End";
                p=Pattern.compile(pattern);
                m=p.matcher(TEST_STRING);
                boolean f=m.find();
                int count=m.groupCount();
                int start=m.start(1);
                int end=m.end(1);
                System.out.println("Pattern=" + pattern + "\t Found=" + f + " Group count=" + count + " Start of group 1=" + start + " End of group 1=" + end );
            }
            {
                Pattern p;
                Matcher m;
                String pattern="(?:Yes)|(?:No)(.*)End";
                p=Pattern.compile(pattern);
                m=p.matcher(TEST_STRING);
                boolean f=m.find();
                int count=m.groupCount();
                int start=m.start(1);
                int end=m.end(1);
                System.out.println("Pattern=" + pattern + "\t Found=" + f + " Group count=" + count + " Start of group 1=" + start + " End of group 1=" + end );
            }
        }

    Which gives the following output:

        Pattern=(?:Yes|No)(.*)End Found=true Group count=1 Start of group 1=9 End of group 1=21
        Pattern=(?:Yes)|(?:No)(.*)End Found=true Group count=1 Start of group 1=-1 End of group 1=-1
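
    The difference comes from operator precedence: "|" binds more loosely than concatenation, so the second pattern reads as "(?:Yes)" OR "(?:No)(.*)End". The match succeeds through the first alternative, which contains no capturing group, so group 1 exists in the pattern (count 1) but took no part in the match (-1). Python's re module behaves the same way, as this sketch shows.

```python
import re

TEST_STRING = "After Yes is group 1 End"

grouped = re.compile(r"(?:Yes|No)(.*)End")
split = re.compile(r"(?:Yes)|(?:No)(.*)End")

m1 = grouped.search(TEST_STRING)
print(grouped.groups, m1.start(1), m1.end(1))     # 1 9 21

m2 = split.search(TEST_STRING)
# The left alternative "(?:Yes)" matched on its own, so group 1 is
# defined in the pattern but did not participate in this match.
print(split.groups, m2.group(0), m2.start(1))     # 1 Yes -1
```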


  • How do I write this URL in Django?

    - by alex
        (r'^/(?P<the_param>[a-zA-Z0-9_-]+)/$','myproject.myapp.views.myview'),

    How can I change this so that "the_param" accepts an (encoded) URL as a parameter? In other words, I want to pass a URL to it:

        mydomain.com/http%3A//google.com
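
    One possible approach, sketched with the standard library only: percent-encode the whole URL, including its slashes, so that it forms a single path segment, and let the parameter accept anything except "/". The pattern and helper below are illustrative and not taken from a Django project; whether an encoded slash reaches the application unchanged depends on the web server in front of it, which is not verified here.

```python
import re
from urllib.parse import quote, unquote

# Hypothetical route pattern: the parameter is any run of non-slash
# characters, which covers a fully percent-encoded URL.
URL_PARAM = re.compile(r"^(?P<the_param>[^/]+)/$")


def encode_for_path(url):
    """Percent-encode `url` so it fits in one path segment."""
    # safe="" makes quote() encode "/" as well as ":".
    return quote(url, safe="")


encoded = encode_for_path("http://google.com")
print(encoded)                                   # http%3A%2F%2Fgoogle.com

match = URL_PARAM.match(encoded + "/")
print(unquote(match.group("the_param")))         # http://google.com
```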


  • Writing a PHP web crawler using cron

    - by Horse
    Hi all. I have written myself a web crawler using simplehtmldom, and have got the crawl process working quite nicely. It crawls the start page, adds all links to a database table, sets a session pointer, and meta-refreshes the page to carry on to the next page. That keeps going until it runs out of links. That works fine; however, the crawl time for larger websites is obviously pretty tedious. I would like to speed things up a bit, and possibly make it a cron job. Any ideas on making it as quick and efficient as possible, other than setting the memory limit / execution time higher?


  • Jakarta Regexp 1.5 Backreferences?

    - by Matt Smith
    Why does this match:

        String str = "099.9 102.2" + (char) 0x0D;
        RE re = new RE("^([0-9]{3}.[0-9]) ([0-9]{3}.[0-9])\r$");
        System.out.println(re.match(str));

    But this does not:

        String str = "099.9 102.2" + (char) 0x0D;
        RE re = new RE("^([0-9]{3}.[0-9]) \1\r$");
        System.out.println(re.match(str));

    The back references don't seem to be working. What am I missing?
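
    Two things are probably at play, illustrated here in Python rather than Jakarta Regexp. First, a backreference repeats the text that the group captured, not the group's pattern, so "\1" after "099.9" can only match another "099.9". Second, inside an ordinary Java string literal "\1" is an octal escape for the character U+0001, so the regex engine never sees a backreference unless the backslash is doubled; Python's non-raw strings behave the same way.

```python
import re

line = "099.9 102.2\r"

# Two independent groups: each may match a different number.
two_groups = re.compile(r"^([0-9]{3}.[0-9]) ([0-9]{3}.[0-9])\r$")
print(bool(two_groups.match(line)))              # True

# \1 repeats the *text* captured by group 1, not its pattern.
backref = re.compile(r"^([0-9]{3}.[0-9]) \1\r$")
print(bool(backref.match(line)))                 # False
print(bool(backref.match("099.9 099.9\r")))      # True

# In a non-raw string literal, "\1" is the single character U+0001.
print("\1" == "\x01")                            # True
```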


  • Dreamweaver regular expression substitution followed by number

    - by mark
    Hi. I'm using Dreamweaver to update copyright dates across my site. I want to preserve the existing spacing (or lack thereof) between years. Examples:

        © 2002-2008 should update to © 2002-2009
        © 2003 - 2008 should update to © 2003 - 2009

    This is the regular expression I'm using to accomplish this in Dreamweaver's find & replace function:

        Find: ©\s*(\d{4}\s*-\s*)\d{3}[^9]
        Replace: © $1 2009

    Here's the problem: this expression works, but leaves an extra space between the hyphen and 2009. If I write the replace expression without the space, as © $12009, then Dreamweaver looks for the 12,009th substitution in the find expression and, not finding one, prints $12009. Any ideas?
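
    The same ambiguity (a group reference followed directly by a digit) exists in other regex flavours. As an illustration only, this is how Python resolves it with the explicit \g<1> form; Dreamweaver's replacement syntax may offer a different mechanism, which is not covered here. The function name bump_year is invented.

```python
import re

# The find expression from the question, unchanged.
FIND = re.compile(r"©\s*(\d{4}\s*-\s*)\d{3}[^9]")


def bump_year(text):
    """Replace the end year with 2009, keeping the original spacing."""
    # \g<1> ends the group reference explicitly, so the "2009" that
    # follows is read as literal text rather than more digits.
    return FIND.sub(r"© \g<1>2009", text)


print(bump_year("© 2002-2008"))     # © 2002-2009
print(bump_year("© 2003 - 2008"))   # © 2003 - 2009
```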


  • Regexp to detect that a URL doesn't end with an extension

    - by devnieL
    Hello. I'm using this regular expression to detect whether a URL ends with .jpg:

        var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]*^\.jpg)/ig;

    It detects the URL, e.g. http://www.blabla.com/sdsd.jpg. But now I want to detect that the URL doesn't end with a .jpg extension. I tried this:

        var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]*[^\.jpg]\b)/ig;

    but I only get http://www.blabla.com/sdsd. Then I used this:

        var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]*[^\.jpg]$)/ig;

    It works if the URL is alone, but doesn't work if the text is, e.g.: http://www.blabla.com/sdsd.jpg text
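
    A character class such as [^\.jpg] excludes single characters (".", "j", "p", "g"), not the sequence ".jpg", which is why the attempts above misbehave. One simple alternative, sketched in Python: match every URL first, then filter out those ending in ".jpg" with an ordinary string test. The URL expression is adapted from the question; the function name is made up, and trailing punctuation handling is left as in the original expression.

```python
import re

# URL body adapted from the question; the final class keeps the match
# from ending in punctuation such as "." or ",".
URL = re.compile(
    r"\b(?:https?|ftp|file)://[-A-Z0-9+&@#/%?=~_|!:,.;]*[-A-Z0-9+&@#/%=~_|]",
    re.IGNORECASE,
)


def urls_without_jpg(text):
    """Return the URLs in `text` that do not end with .jpg."""
    return [u for u in URL.findall(text) if not u.lower().endswith(".jpg")]


sample = "http://www.blabla.com/sdsd.jpg text http://www.blabla.com/page.html"
print(urls_without_jpg(sample))   # ['http://www.blabla.com/page.html']
```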

