Search Results

Search found 14260 results on 571 pages for 'regex group'.

Page 149/571 | < Previous Page | 145 146 147 148 149 150 151 152 153 154 155 156 | Next Page >

Extracting a string between specified characters in python

- by Seth

I'm a newbie to regular expressions and I have the following string: sequence = ["{\"First\":\"Belyuen,NT,0801\",\"Second\":\"Belyuen,NT,0801\"}","{\"First\":\"Larrakeyah,NT,0801\",\"Second\":\"Larrakeyah,NT,0801\"}"] I am trying to extract the text Belyuen,NT,0801 and Larrakeyah,NT,0801 in python. I have the following code which is not working: re.search('\:\\"...\\', ''.join(sequence)) I.e. I want to get the string between characters :\ and \.

Read the article
How to replace <span style="font-weight: bold;">foo</span> by <strong>foo</strong> using PHP and reg

- by anon

I have a string like this: <span style="font-weight: bold;">Foo</span> I want to use PHP to make it <strong>Foo</strong> …without affecting other spans. How can I do this?

Read the article
php - get content from second pair of quotes in string

- by Aaron Turecki

I'm trying to get the contents of the second quotes and only the second quotes from a string. Right now I'm able to get the contents of all three quotes. What am I doing wrong? Is it possible to just print the second value in the output array? Text 2014-06-02 11:48:41.519 -0700 Information 94 NICOLE Client "[WebDirect] (207.230.229.204) [207.230.229.204]" opening database "FMServer_Sample" as "Admin". PHP if (preg_match_all('~(["\'])([^"\']+)\1~', $line, $matches)) $database_names = $matches[2]; print_r($database); Output [WebDirect] (207.230.229.204) [207.230.229.204], FMServer_Sample, Admin

Read the article
Format string using RegEx

- by user99322

String to be formatted "new Date(2009,0,1)" String after formatting "'01-Jan-2009'"

Read the article
Regular expression for pipe delimited and double quoted string

- by Hiren Amin

I have a string something like this: "2014-01-23 09:13:45|\"10002112|TR0859657|25-DEC-2013>0000000000000001\"|10002112" I would like to split by pipe apart from anything wrapped in double quotes so I have something like (similar to how csv is done): [0] => 2014-01-23 09:13:45 [1] => 10002112|TR0859657|25-DEC-2013>0000000000000001 [2] => 10002112 I would like to know if there is a regular expression that can do this?

Read the article
Replacing multiple patterns in a block of data

- by VikrantY

Hi All, I need to find the most efficient way of matching multiple regular expressions on a single block of text. To give an example of what I need, consider a block of text: "Hello World what a beautiful day" I want to replace Hello with "Bye" and "World" with Universe. I can always do this in a loop ofcourse, using something like String.replace functions availiable in various languages. However, I could have a huge block of text with multiple string patterns, that I need to match and replace. I was wondering if I can use Regular Expressions to do this efficiently or do I have to use a Parser like LALR. I need to do this in JavaScript, so if anyone knows tools that can get it done, it would be appreciated.

Read the article
Regular Expression relace Myname _Mysurname with "" in PHP

- by streetparade

I need a regular expression which replaces a string like this "Myname _MySurename" with "Myname" that means i just need Myname, so _MySurename should be cut. i tryed something like "/_*/" but that replaces just the _ (underscore) how can i do that in PHP ?

Read the article
perl parentheses in regular expression

- by iamrohitbanga

I have been trying several regular expressions $str =~ s/^0+(.)/$1/; converts 0000 to 0 and 0001 to 1 $str =~ s/^0+./$1/; converts 0000 to empty string, 000100 to 00, 0001100 to 100. what difference is the parentheses making?

Read the article
Is it possible to use re2 from Python?

- by flow

i just discovered http://code.google.com/p/re2, a promising library that uses a long-neglected way (Thompson NFA) to implement a regular expression engine that can be orders of magnitudes faster than the available engines of awk, Perl, or Python. so i downloaded the code and did the usual sudo make install thing. however, that action had seemingly done little more than adding /usr/local/include/re2/re2.h to my system. there seemed to be some `*.a file in addition, but then what is it with this *.a extension? i would like to use re2 from Python (preferrably Python 3.1) and was excited to see files like make_unicode_groups.py in the distro (maybe just used during the build process?). those however were not deployed on my machine. how can i use re2 from Python?

Read the article
javascript: a regular expression for "somestring A.B.C(anything can be here)"

- by Paul

Given a string like, "xyz A.B.C.(anything)" (there's at least one space/tab/newline between z and A.) I'd like to find "A.B.C".

Read the article
get city, state or zip from a string in python

- by Joe

I'd like to be able to parse out the city, state or zip from a string in python. So, if I entered Boulder, Co 80303 Boulder, Colorado Boulder, Co 80303 ... any variation of these it would return the city, state or zip. This is all going to be user inputted data and inputted in one text field.

Read the article
preg_replace function to append a string to all the hyperlinks of a page

- by KoolKabin

hi guys, i want to append my own value to all hyperlinks in a page... e.g if there are links: <a href="abc.htm?val=1">abc 1</a> <br/> <a href="abc.htm?val=2">abc 1</a> <br/> <a href="abc.htm?val=3">abc 1</a> <br/> <a href="abc.htm?val=4">abc 1</a> <br/> I want to add next var like "type=int" to all hyperlinks output should be: <a href="abc.htm?val=1&type=int">abc 1</a> <br/> <a href="abc.htm?val=2&type=int">abc 1</a> <br/> <a href="abc.htm?val=3&type=int">abc 1</a> <br/> <a href="abc.htm?val=4&type=int">abc 1</a> <br/> I hope it can be done quite easily with preg_replace function

Read the article
How to use regular expressions to pull a substring? (screen scraping)

- by Diego

Hey guys, i'm really trying to understand regular expressions while scraping a site, i've been using it in my code enough to pull the following, but am stuck here. I need to quickly grab this: http://www.example.com/online/store/TitleDetail?detail&sku=123456789 from this: ('<a href="javascript:if(handleDoubleClick(this.id)){window.location=\'http://www.example.com/online/store/TitleDetail?detail&sku=123456789\';}" id="getTitleDetails_123456789">\r\n\t\t\t \tcheck store inventory\r\n\t\t\t </a>', 1) This is where I got confused. any ideas?

Read the article
Regular expression for getting any letter, symbol, number from 1 to 100 maxlength.

- by rolando

As you can read in the tittle i need a regular expression for getting any letter, symbol, number from 1 to 100 maxlength (any text posible). Can someone provide that for me and maybe a good link to understand how it works. Thank you.

Read the article
perl ENV value avoid escape

- by Michael

In my makefile I have command in variable like this substitute := perl -p -e 's/@([^@]+)@/"$(update_url)"/ge' > output.txt update_url := em:updateURL=\"http:\/\/bla\/update.rdf\"\n this works fine when I run command in target and I have newline, quotes however I need to replace $(update_url)" with environment variable, using expression like this #substitute := perl -p -e 's/@([^@]+)@/defined $$ENV{$$1} ? $$ENV{$$1} : $$1/ge' I am exporting those variables from makefile. This gives me literally em:updateURL=\"http:\/\/bla\/update.rdf\"\n on output file... so how to make the second version to give output like first version?

Read the article
What is the RFC complicant and working regular expression to check if a string is a valid URL

- by bestis

There is question by the almost the same name already: What is the best regular expression to check if a string is a valid URL I don't understand this stackoverflow. It seems like I need reputation to comment an answer. As I don't have it, I don't know how to tell/ask that the proposed solution doesn't seem to work. So I'm forced to make a new question and ask for the solution this way? But that regexp seems to fail in input which has IPv6 address in it: For example facebook's IPv6 address: http://2620:0:1cfe:face:b00c::3/ Also link to localhost fails: http://::1/ Or is PHP to blame? /** * Validate URL - RFC 3987 (IRI) * * http://stackoverflow.com/questions/161738/what-is-the-best-regular-expression-to-check-if-a-string-is-a-valid-url * * @param string $str_url * @return boolean */ function is_url($str_url) { // RFC 3987 For absolute IRIs (internationalized): // @todo FIXME - Has bugs in IPv6 (http://2620:0:1cfe:face:b00c::3/) fails return (bool) preg_match('/^[a-z](?:[-a-z0-9\+\.])*:(?:\/\/(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:])*@)?(?:\[(?:(?:(?:[0-9a-f]{1,4}:){6}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|::(?:[0-9a-f]{1,4}:){5}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:[0-9a-f]{1,4})?::(?:[0-9a-f]{1,4}:){4}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:[0-9a-f]{1,4}:[0-9a-f]{1,4})?::(?:[0-9a-f]{1,4}:){3}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,2}[0-9a-f]{1,4})?::(?:[0-9a-f]{1,4}:){2}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,3}[0-9a-f]{1,4})?::[0-9a-f]{1,4}:(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,4}[0-9a-f]{1,4})?::(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,5}[0-9a-f]{1,4})?::[0-9a-f]{1,4}|(?:(?:[0-9a-f]{1,4}:){0,6}[0-9a-f]{1,4})?::)|v[0-9a-f]+[-a-z0-9\._~!\$&\'\(\)\*\+,;=:]+)\]|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3}|(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=@])*)(?::[0-9]*)?(?:\/(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:@]))*)*|\/(?:(?:(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:@]))+)(?:\/(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:@]))*)*)?|(?:(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:@]))+)(?:\/(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:@]))*)*|(?!(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:@])))(?:\?(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:@])|[\x{E000}-\x{F8FF}\x{F0000}-\x{FFFFD}|\x{100000}-\x{10FFFD}\/\?])*)?(?:\#(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&\'\(\)\*\+,;=:@])|[\/\?])*)?$/iu',$str_url); } Here is the test for it: $urls=array('http://www.example.org/','http://www.example.org:80/','example.org','ftp://user:[email protected]/','http://example.org/?cat=5&test=joo','http://www.fi/?cat=5&test=joo','http://::1/','http://2620:0:1cfe:face:b00c::3/','http://2620:0:1cfe:face:b00c::3:80/'); foreach ($urls as $a) { echo $a."\n"; $a=is_url($a); var_dump($a); } And that outputs: > `http://www.example.org/` bool(true) > `http://www.example.org:80/` bool(true) > example.org bool(false) > `ftp://user:[email protected]/` > bool(true) > `http://example.org/?cat=5&test=joo` > bool(true) > `http://www.fi/?cat=5&test=joo` > bool(true) `http://::1/` bool(false) > `http://2620:0:1cfe:face:b00c::3/` > bool(false) > `http://2620:0:1cfe:face:b00c::3:80/` > bool(false) And it also seems that stackoverflow's code is miss behaving on those :) So what is the RFC compilicant and working regexp? ps. If you close this, please then tell me how this situation should be handled? I don't think that the answer is, just earn your reputation. Who wants to do that if they cannot even tell that some proposed solution isn't working correctly. pps. "we're sorry, but as a spam prevention mechanism, new users can only post a maximum of one hyperlink. Earn more than 10 reputation to post more hyperlinks.". Oh C'mon, I'm fine with plain text :D

Read the article
How do you validate a URL with a regular expression in Python?

- by Zachary Spencer

I'm building a Google App Engine app, and I have a class to represent an RSS Feed. I have a method called setUrl which is part of the feed class. It accepts a url as an input. I'm trying to use the re python module to validate off of the RFC 3986 Reg-ex (http://www.ietf.org/rfc/rfc3986.txt) Below is a snipped which should work, right? I'm incredibly new to Python and have been beating my head against this for the past 3 days. p = re.compile('^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?') m = p.match(url) if m: self.url = url return url

Read the article
questions on nfa and dfa..

- by Loop

Hi Guys... Hope you help me with this one.... I have a main question which is ''how to judge whether a regular expression will be accepted by NFA and/or DFA? For eg. My question says that which of the regular expressions are equivalent? explain... 1.(a+b)*b(a+b)*b(a+b)* 2.a*ba*ba* 3.a*ba*b(a+b)* do we have to draw the NFA and DFA and then find through minimisation algorithm? if we do then how do we come to know that which regular expression is accepted by NFA/DFA so that we can begin with the answer? its so confusing.... Second is a very similar one, the question asks me to show that the language (a^nb^n|n1} is not accepted by DFA...grrrrr...how do i know this? (BTW this is a set of all strings of where a number of a's is followed by the same number of b's).... I hope I explained clearly well....

Read the article
Issue with my regular expression?

- by Rubans

I'm trying to locate the number matches in a relative path for directory up references("..\"). So I have the following pattern : "(..\)" which works as expected for the path "....\a\b" where it will give me 2 successfull groups ("..\") but when I try the path "..\a\b" it will also return 2 when it should be 1. I tried this in a reg ex tool such Expresso and it seems to work as expected in there but not in in .net, any ideas?

Read the article
Regular Expression find a phrase not inside an HTML tag

- by James Buckingham

Hi there, I'm struggling a bit with this regular expression and wondered if anyone was about to help me please? What I need to do is isolate the 1st phrase inside a string which is NOT inside an HTML tag. So the examples I have at the moment are: This is some test text about ITS for the ITS department. Also worth mentioning ABS as well I guess.ITS, ... and ... This is some ITS test text about ITS for the ITS department. Also worth mentioning ABS as well I guess So in the first example I want it to ignore the wrapped ITS and give me the ITS at the end of the 1st sentence. In the second example I want it to return the ITS at the start of the 2nd sentence. The aim is to replace these with my own custom wrapped acronym tags in a ColdFusion application I'm writing. Thanks a lot, James

Read the article
Is it possible to group validation?

- by lambdabutz

I am using a lot of my own validation methods to compare the data from one association to the other. I've noticed that I'm constantly checking that my associations aren't nil before trying to call anything on them, but I am also validating their presence, and so I feel that my nil checks are redundant. Here's an example: class House < ActiveRecord::Base has_one :enterance, :class => Door has_one :exit, :class => Door validates_presence_of :enterance, :exit validate :not_a_fire_hazard def not_a_fire_hazard if enterance && exit && enterance.location != exit.location errors.add_to_base('If there is a fire you will most likely die') return false end end end I feel like I am repeating myself by checking the existence of enterance and exit within my own validation. Is there a more "The Rails Way" to do this?

Read the article
php regular expression to match specific url pattern

- by zen

I'd like to "grab" a few hundred urls from a few hundred html pages. Pattern: <h2><a href="http://www.the.url.might.be.long/urls.asp?urlid=1" target="_blank">The Website</a></h2>

Read the article
what is the return value of BeautifulSoup.find ?

- by prosseek

I run to get some value as score. score = soup.find('div', attrs={'class' : 'summarycount'}) I run 'print score' to get as follows. <div class=\"summarycount\">524</div> I need to extract the number part. I used re module but failed. m = re.search("[^\d]+(\d+)", score) TypeError: expected string or buffer function search in re.py at line 142 return _compile(pattern, flags).search(string) What's the return type of the find function? How to get the number from the score variable? Is there any easy way to let BeautifulSoup to return the value(in this case 524) itself?

Read the article
A "smart" (forgiving) date parser?

- by jdmuys

I have to migrate a very large dataset from one system to another. One of the "source" column contains a date but is really a string with no constraint, while the destination system mandates a date in the format yyyy-mm-dd. Many, but not all, of the source dates are formatted as yyyymmdd. So to coerce them to the expected format, I do (in Perl): return "$1-$2-$3" if ($val =~ /(\d{4})[-\/]*(\d{2})[-\/]*(\d{2})/); The problem arises when the source dates moves away from the "generic" yyyymmdd. The goal is to salvage as many dates as possible, before giving up. Example source strings include: 21/3/1998, March 2004, 2001, 3/4/97 I can try to match as many of the examples I can find with a succession of regular expressions such as the one above. But is there something smarter to do? Am I not reinventing the wheel? Is there a library somewhere doing something similar? I couldn't find anything relevant googling "forgiving date parser". (any language is OK).

Read the article
Repeated regular expression

- by javaguy

How can I parse a strings like : name1="val1" name2="val2" name3="val3" I cannot use split(\s+) as it can be name = "val 1". I am doing java but any laguage is okay.

Read the article

< Previous Page | 145 146 147 148 149 150 151 152 153 154 155 156 | Next Page >