Search Results

Search found 1746 results on 70 pages for 'expressions'.

Page 8/70 | < Previous Page | 4 5 6 7 8 9 10 11 12 13 14 15  | Next Page >

  • generalizing the pumping lemma for UNIX-style regular expressions

    - by Avi
    Most UNIX regular expressions have, besides the usual *, +, ? operators, a backslash operator where \1, \2, ... match whatever's in the last parentheses, so for example L=(a*)b\1 matches the (non-regular) language a^n b a^n. On one hand, this seems to be pretty powerful, since you can create (a*)b\1b\1 to match the language a^n b a^n b a^n, which can't even be recognized by a stack automaton. On the other hand, I'm pretty sure a^n b^n cannot be expressed this way. Two questions: 1. Is there any literature on this family of languages (UNIX-y regular)? In particular, is there a version of the pumping lemma for these? 2. Can someone prove (or perhaps disprove) that a^n b^n cannot be expressed this way? Thanks
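
    The backreference behaviour described in this question can be tried out directly. The following is a minimal Python sketch, assuming Python's `re` module behaves like the UNIX-style engines the asker has in mind; the function name `matches_anban` is made up for illustration.

```python
import re

# Group 1 captures a run of a's; \1 must then repeat exactly that run.
PATTERN = re.compile(r"(a*)b\1")


def matches_anban(s: str) -> bool:
    """Return True if the whole string has the form a^n b a^n."""
    return PATTERN.fullmatch(s) is not None


if __name__ == "__main__":
    for candidate in ["b", "aba", "aabaa", "aaba", "abaa"]:
        print(candidate, matches_anban(candidate))
```

    This only demonstrates the matching behaviour; it says nothing about whether a^n b^n is expressible, which is the open part of the question.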


  • Using regular expressions to do mass replace in Notepad++

    - by user638820
    I've been trying to replace (and translate) this text, and I don't know what formula I should use for the thousands of places that I need to translate to Spanish. Okay, this is what I want to do: I want to use regular expressions in Notepad++. I give 4 variations, and in bold is what's supposed to go after the name of the place, in lower case and not to be confused with e.g. Agency Village, because that's its name. Missouri 5,988,927 Adrian City city 1,677 Advance city 1,347 Affton CDP 20,307 Agency Village village 684 Airport Drive village 698 To | [[Adrian City (Misuri)|Adrian City]] || ciudad || 1677 |- | [[Advance (Misuri)|Advance]] || ciudad || 1347 |- | [[Afton (Misuri)|Afton]] || CDP || 20307 |- | [[Agency Village (Misuri)|Agency Village]] || villa || 684 |- | [[Airport Drive (Misuri)|Airport Drive]] || villa || 698
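
    As an illustration of the transformation being asked for, here is a rough Python sketch rather than a Notepad++ recipe. It assumes each place sits on its own line as "name type population", that the type word is always lower-case "city" or "village" or upper-case "CDP", and that the Spanish terms are the ones shown in the example output. `TYPE_ES` and `to_wiki_row` are hypothetical names.

```python
import re

# Spanish terms taken from the example output in the question.
TYPE_ES = {"city": "ciudad", "village": "villa", "CDP": "CDP"}

# Lazy name, then a case-sensitive type word, then the population.
ROW_RE = re.compile(r"^(.+?) (city|village|CDP) ([\d,]+)$")


def to_wiki_row(line: str) -> str:
    """Convert 'Adrian City city 1,677' into a wiki table row."""
    m = ROW_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised line: {line!r}")
    name, kind, population = m.groups()
    population = population.replace(",", "")  # 1,677 -> 1677
    return f"| [[{name} (Misuri)|{name}]] || {TYPE_ES[kind]} || {population}"


if __name__ == "__main__":
    for row in ["Adrian City city 1,677", "Agency Village village 684"]:
        print(to_wiki_row(row))
```

    Because the type word is matched case-sensitively, "City" and "Village" inside a place name are left as part of the name.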


  • Regular Expressions

    - by Rocky
    Hello Everyone, I am new to Stackoverflow and I have a quick question. Let's assume we are given a large number of HTML files (large as in theoretically infinite). How can I use Regular Expressions to extract the list of Phone Numbers from all those files? Explanation/expression will be really appreciated. The Phone numbers can be any of the following formats: (123) 456 7899 (123).456.7899 (123)-456-7899 123-456-7899 123 456 7899 1234567899 Thanks a lot for all your help and have a good one!
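
    One possible pattern for the six formats listed can be sketched in Python. This is an assumption-laden sketch: it treats the files as plain text, accepts a space, dot or hyphen (or nothing) as a separator, and the name `find_phone_numbers` is invented for the example.

```python
import re

PHONE_RE = re.compile(
    r"(?<!\d)"                 # not preceded by another digit
    r"(?:\(\d{3}\)|\d{3})"     # area code, with or without parentheses
    r"[ .-]?"                  # optional separator
    r"\d{3}"
    r"[ .-]?"
    r"\d{4}"
    r"(?!\d)"                  # not followed by another digit
)


def find_phone_numbers(text: str) -> list[str]:
    """Return every substring that looks like one of the listed formats."""
    return PHONE_RE.findall(text)


if __name__ == "__main__":
    sample = (
        "<p>Call (123) 456 7899 or (123).456.7899, (123)-456-7899, "
        "123-456-7899, 123 456 7899, 1234567899.</p>"
    )
    print(find_phone_numbers(sample))
```

    The pattern is deliberately permissive and will also accept mixed separators such as 123.456-7899.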


  • Python regular expressions assigning to named groups

    - by None
    When you use variables (is that the correct word?) in python regular expressions like this: "blah (?P<value>\w+)" ("value" would be the variable), how could you make the variable's value be the text after "blah " to the end of the line or to a certain character not paying any attention to the actual content of the variable. For example, this is pseudo-code for what I want: >>> import re >>> p = re.compile("say (?P<value>continue_until_text_after_assignment_is_recognized) endsay") >>> m = p.match("say Hello hi yo endsay") >>> m.group('value') 'Hello hi yo' Note: The title is probably not understandable. That is because I didn't know how to say it. Sorry if I caused any confusion.
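
    One way to get the behaviour in the pseudo-code is a lazy "match anything" inside the named group, so the group stops at the first following " endsay". A small sketch, assuming the delimiter text is fixed:

```python
import re

# .*? is lazy: it grows only until the literal " endsay" can match.
SAY_RE = re.compile(r"say (?P<value>.*?) endsay")

m = SAY_RE.match("say Hello hi yo endsay")
value = m.group("value") if m else None

if __name__ == "__main__":
    print(value)
```

    To capture up to the end of the line instead, `(?P<value>.*)` followed by `$` would be the analogous form.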


  • Building 'flat' rather than 'tree' LINQ expressions

    - by Ian Gregory
    I'm using some code (available here on MSDN) to dynamically build LINQ expressions containing multiple OR 'clauses'. The relevant code is var equals = values.Select(value => (Expression)Expression.Equal(valueSelector.Body, Expression.Constant(value, typeof(TValue)))); var body = equals.Aggregate<Expression>((accumulate, equal) => Expression.Or(accumulate, equal)); This generates a LINQ expression that looks something like this: (((((ID = 5) OR (ID = 4)) OR (ID = 3)) OR (ID = 2)) OR (ID = 1)) I'm hitting the recursion limit (100) when using this expression, so I'd like to generate an expression that looks like this: (ID = 5) OR (ID = 4) OR (ID = 3) OR (ID = 2) OR (ID = 1) How would I modify the expression building code to do this?


  • combining dynamic text with regular expressions in php

    - by pfunc
    I am experimenting with finding popular keywords using curl, PHP and regular expressions. I have an array of non-specific words that I am matching my keyword search against. So I am looking for words like "the", "and", "that" etc. and taking them out of the keyword search. So I have an array of words like so: $wordArr = [the, and, at,....]; and then I am running something like: && preg_match('(\bmyword\w*\b)', $key) == false How do I combine these two so that it loops through the array, finding out if any of the words in the array match the regular expression? I guess I could just do a for loop, but thought maybe I could use in_array($wordArr, $key) or something like that.


  • Python comparing string against several regular expressions

    - by maerics
    I'm pretty experienced with Perl and Ruby but new to Python so I'm hoping someone can show me the Pythonic way to accomplish the following task. I want to compare several lines against multiple regular expressions and retrieve the matching group. In Ruby it would be something like this: STDIN.each_line do |line| case line when /^A:(.*?)$/ then puts "FOO: #{$1}" when /^B:(.*?)$/ then puts "BAR: #{$1}" # when ... else puts "NO MATCH: #{line}" end end My attempts in Python are turning out pretty ugly because the matching group is returned from a call to match/search on a regular expression and Python has no assignment in conditionals or switch statements. What's the Pythonic way to do (or think!) about this problem?
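
    A common Python idiom for this kind of dispatch is a table of compiled patterns that is walked in order. The sketch below mirrors the Ruby example; `RULES` and `classify` are names chosen for illustration.

```python
import re

# (pattern, label) pairs, tried in order; the first match wins.
RULES = [
    (re.compile(r"^A:(.*)$"), "FOO"),
    (re.compile(r"^B:(.*)$"), "BAR"),
]


def classify(line: str) -> str:
    """Return the label and captured group for the first matching rule."""
    for pattern, label in RULES:
        m = pattern.match(line)
        if m:
            return f"{label}: {m.group(1)}"
    return f"NO MATCH: {line}"


if __name__ == "__main__":
    for text in ["A:hello", "B:world", "C:other"]:
        print(classify(text))
```

    Adding another case then means adding one more entry to the table rather than another branch.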


  • JAVA: Build XML document using XPath expressions

    - by snoe
    I know this isn't really what XPath is for but if I have a HashMap of XPath expressions to values how would I go about building an XML document. I've found dom-4j's DocumentHelper.makeElement(branch, xpath) except it is incapable of creating attributes or indexing. Surely a library exists that can do this? Map xMap = new HashMap(); xMap.put("root/entity/@att", "fooattrib"); xMap.put("root/array[0]/ele/@att", "barattrib"); xMap.put("root/array[0]/ele", "barelement"); xMap.put("root/array[1]/ele", "zoobelement"); would result in: <root> <entity att="fooattrib"/> <array><ele att="barattrib">barelement</ele></array> <array><ele>zoobelement</ele></array> </root>


  • What are block expressions actually good for?

    - by Helper Method
    I just solved the first problem from Project Euler in JavaFX for the fun of it and wondered what block expressions are actually good for. Why are they superior to functions? Is it because of the narrowed scope? Less to write? Performance? Here's the Euler example. I used a block here but I don't know if it actually makes sense // sums up all number from low to high exclusive which are divisible by a or b function sumDivisibleBy(a: Integer, b: Integer, high: Integer) { def low = if (a <= b) a else b; def sum = { var result = 0; for (i in [low .. <high] where i mod 3 == 0 or i mod 5 == 0) { result += i } result } } Does a block make sense here?


  • Dealing with regular expressions, Python

    - by Gusto
    I want to remove some symbols from a string using a regular expression, for example: == (that occur both at the beginning and at the end of a line), * (at the beginning of a line ONLY). def some_func(): clean = re.sub(r'= {2,}', '', clean) #Removes 2 or more occurrences of = at the beg and at the end of a line. clean = re.sub(r'^\* {1,}', '', clean) #Removes 1 or more occurrences of * at the beginning of a line. What's wrong with my code? It seems like expressions are wrong. How do I remove a character/symbol if it's at the beginning or at the end of the line (with one or more occurrences)?
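
    In the patterns as shown, the quantifier follows a space, so `= {2,}` means "an equals sign followed by two or more spaces" (the space may also be an artifact of how the snippet was extracted). A sketch of what the comments describe, assuming the input may span several lines; `clean_markup` is a hypothetical name:

```python
import re


def clean_markup(text: str) -> str:
    """Strip '==' runs at line edges and leading '*' runs."""
    # Two or more '=' at the start or at the end of any line.
    text = re.sub(r"^={2,}|={2,}$", "", text, flags=re.MULTILINE)
    # One or more '*' at the start of any line only.
    text = re.sub(r"^\*+", "", text, flags=re.MULTILINE)
    return text


if __name__ == "__main__":
    print(clean_markup("==Title==\n*item\na*b"))
```

    `re.MULTILINE` makes `^` and `$` apply to every line rather than only to the start and end of the whole string.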


  • Regular expressions in python unicode

    - by Remy
    I need to remove all the HTML tags from a given webpage's data. I tried this using regular expressions: import urllib2 import re page = urllib2.urlopen("http://www.frugalrules.com") from bs4 import BeautifulSoup, NavigableString, Comment soup = BeautifulSoup(page) link = soup.find('link', type='application/rss+xml') print link['href'] rss = urllib2.urlopen(link['href']).read() souprss = BeautifulSoup(rss) description_tag = souprss.find_all('description') content_tag = souprss.find_all('content:encoded') print re.sub('<[^>]*>', '', content_tag) But the syntax of the re.sub is: re.sub(pattern, repl, string, count=0) So, I modified the code as (instead of the print statement above): for row in content_tag: print re.sub(ur"<[^>]*>",'',row,re.UNICODE) But it gives the following error: Traceback (most recent call last): File "C:\beautifulsoup4-4.3.2\collocation.py", line 20, in <module> print re.sub(ur"<[^>]*>",'',row,re.UNICODE) File "C:\Python27\lib\re.py", line 151, in sub return _compile(pattern, flags).sub(repl, string, count) TypeError: expected string or buffer What am I doing wrong?
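
    Two things stand out in the call shown. The fourth positional argument of `re.sub` is `count`, so `re.UNICODE` is being passed as a count rather than as a flag, and `row` is a BeautifulSoup element rather than a string, which is what the TypeError complains about. A minimal Python 3 sketch of the string-based approach, with no BeautifulSoup dependency so it stays self-contained; `strip_tags` is a hypothetical helper:

```python
import re

TAG_RE = re.compile(r"<[^>]*>")


def strip_tags(fragment) -> str:
    """Remove anything that looks like an HTML tag from the fragment."""
    # str() converts non-string objects (such as parsed elements) first.
    return TAG_RE.sub("", str(fragment))


if __name__ == "__main__":
    print(strip_tags("<p>Hi <b>there</b></p>"))
```

    If flags are needed with `re.sub`, they go in by keyword, for example `flags=re.UNICODE`. With BeautifulSoup already in use, calling `get_text()` on each element is likely a simpler route than a regex.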


  • Mutually exclusive regular expressions

    - by CaptnCraig
    If I have a list of regular expressions, is there an easy way to determine that no two of them will both return a match for the same string? That is, the list is valid if and only if for all strings a maximum of one item in the list will match the entire string. It seems like this will be very hard (maybe impossible?) to prove definitively, but I can't seem to find any work on the subject. The reason I ask is that I am working on a tokenizer that accepts regexes, and I would like to ensure only one token at a time can match the head of the input.


  • .NET Regular Expressions - Shorter match

    - by Xavier
    Hi Guys, I have a question regarding .NET regular expressions and how it defines matches. I am writing: var regex = new Regex("<tr><td>1</td><td>(.+)</td><td>(.+)</td>"); if (regex.IsMatch(str)) { var groups = regex.Match(str).Groups; var matches = new List<string>(); for (int i = 1; i < groups.Count; i++) matches.Add(groups[i].Value); return matches; } What I want is get the content of the two following tags. Instead it returns: [0]: Cell 1</td><td>Cell 2</td>... [1]: Last row of the table Why is the first match taking </td> and the rest of the string instead of stopping at </td>?
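
    The behaviour described is what greedy quantifiers do: `.+` takes as much as it can while still letting the rest of the pattern match. The sketch below reproduces the effect in Python on a made-up table row (the sample HTML is an assumption, not the asker's data); .NET also supports the lazy form `.+?`.

```python
import re

html = "<tr><td>1</td><td>Cell 1</td><td>Cell 2</td><td>Cell 3</td></tr>"

GREEDY = re.compile(r"<tr><td>1</td><td>(.+)</td><td>(.+)</td>")
LAZY = re.compile(r"<tr><td>1</td><td>(.+?)</td><td>(.+?)</td>")

# Greedy groups swallow intermediate </td><td> pairs.
greedy_groups = GREEDY.search(html).groups()
# Lazy groups stop at the first </td> that lets the match succeed.
lazy_groups = LAZY.search(html).groups()

if __name__ == "__main__":
    print(greedy_groups)
    print(lazy_groups)
```

    A negated class such as `([^<]+)` is another common way to keep a group inside a single cell.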


  • Saving substrings using Regular Expressions

    - by user362971
    I'm new to regular expressions in Java (or any language, for that matter) and I'm wanting to do a find using them. The tricky part that I don't understand how to do is replace something inside the string that matches. For example, if the line I'm looking for is Person item6 [can {item thing [wrap]}] I'm able to write a regex that finds that line, but finding what the word "thing" is (as it may differ among different lines) is my problem. I may want to either replace that word with something else or save it in a variable for later. Is there any easy way to do this using Java's regex engine?


  • javascript regular expressions

    - by Zhasulan Berdybekov
    Help me with regular expressions. I need to validate text as an hour and as a minute. In the first case, the value can be from 0 to 12. In the second case, the value can be from 1 to 60. This is my code: var hourRegEx = /^([0-9]{2})$/; //You can fix this line of code? $(document).ready( function(){ $('form.form').submit(function(){ if( $('input.hour').val().match(hourRegEx) ){ return true; } return false; }); }); In my case, the code accepts, for example, 52 as a correct hour too.
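
    The pattern `[0-9]{2}` accepts any two digits, which is why 52 passes. Range checks need the alternatives spelled out. The sketch below uses Python for testing, but the pattern syntax is the same in JavaScript; it assumes no leading zeros are allowed, which the question does not specify.

```python
import re

# 0-9, or 10-12
HOUR_RE = re.compile(r"(?:[0-9]|1[0-2])")
# 1-9, or 10-59, or 60
MINUTE_RE = re.compile(r"(?:[1-9]|[1-5][0-9]|60)")


def is_hour(text: str) -> bool:
    """True for the strings '0' through '12'."""
    return HOUR_RE.fullmatch(text) is not None


def is_minute(text: str) -> bool:
    """True for the strings '1' through '60'."""
    return MINUTE_RE.fullmatch(text) is not None


if __name__ == "__main__":
    print(is_hour("12"), is_hour("52"), is_minute("60"), is_minute("61"))
```

    In JavaScript the equivalent literals would be anchored with `^` and `$`, for example `/^(?:[0-9]|1[0-2])$/`. Parsing the value as an integer and comparing it against the bounds is often simpler than either.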


  • offsetWidth or CSS expression problem for IE6

    - by Bipul
    I need to set the width of textboxes to 80% of their parent. So first I used td input[type="text"] { width: 80%; } But it was not rendering properly if the input is the child of td. So, I used CSS expressions td input[type="text"] { width: expression(this.parentNode.offsetWidth*0.8); } It is working as I wanted in every browser except IE 6. Can anybody help me see where I am going wrong? I know that expressions are allowed in IE 6. So, is it a problem with using a CSS expression, or something to do with offsetWidth? Thanks in advance.


  • What RegEx should I use to return parameter names wrapped within brackets in an expression?

    - by burak ozdogan
    Hi, I have a set of expressions representing some formula with some parameters inside. Like: "[parameter1] * [parameter2] * [multiplier]" and many others like this. I want to use a regex so that I can get a list of strings (List<string>) which will have [parameter1] [parameter2] [multiplier] inside. I don't use regular expressions very often; if you have already used something like this, I would appreciate it if you could share. Thanks!
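
    A pattern that matches an opening bracket, then anything that is not a bracket, then a closing bracket should be enough here. A Python sketch of the idea, assuming brackets never nest; `bracketed_names` is a made-up name, and the same pattern text should carry over to .NET's regex engine:

```python
import re

# '[' + one or more non-bracket characters + ']'
PARAM_RE = re.compile(r"\[[^\[\]]+\]")


def bracketed_names(expression: str) -> list[str]:
    """Return each [name] token in order of appearance."""
    return PARAM_RE.findall(expression)


if __name__ == "__main__":
    print(bracketed_names("[parameter1] * [parameter2] * [multiplier]"))
```

    Wrapping the inner part in a capture group, `\[([^\[\]]+)\]`, would return the names without the brackets.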


  • Mathematica regular expressions on unicode strings.

    - by dreeves
    This was a fascinating debugging experience. Can you spot the difference between the following two lines? StringReplace["–", RegularExpression@"[\\s\\S]" -> "abc"] StringReplace["-", RegularExpression@"[\\s\\S]" -> "abc"] They do very different things when you evaluate them. It turns out it's because the string being replaced in the first line consists of a unicode en dash, as opposed to a plain old ascii dash in the second line. In the case of the unicode string, the regular expression doesn't match. I meant the regex "[\s\S]" to mean "match any character (including newline)" but Mathematica apparently treats it as "match any ascii character". How can I fix the regular expression so the first line above evaluates the same as the second? Alternatively, is there an asciify filter I can apply to the strings first? PS: The Mathematica documentation says that its string pattern matching is built on top of the Perl-Compatible Regular Expressions library (http://pcre.org) so the problem I'm having may not be specific to Mathematica.


  • Parsing Lisp S-Expressions with known schema in C#

    - by Drew Noakes
    I'm working with a service that provides data as a Lisp-like S-Expression string. This data is arriving thick and fast, and I want to churn through it as quickly as possible, ideally directly on the byte stream (it's only single-byte characters) without any backtracking. These strings can be quite lengthy and I don't want the GC churn of allocating a string for the whole message. My current implementation uses CoCo/R with a grammar, but it has a few problems. Due to the backtracking, it assigns the whole stream to a string. It's also a bit fiddly for users of my code to change if they have to. I'd rather have a pure C# solution. CoCo/R also does not allow for the reuse of parser/scanner objects, so I have to recreate them for each message. Conceptually the data stream can be thought of as a sequence of S-Expressions: (item 1 apple)(item 2 banana)(item 3 chainsaw) Parsing this sequence would create three objects. The type of each object can be determined by the first value in the list, in the above case "item". The schema/grammar of the incoming stream is well known. Before I start coding I'd like to know if there are libraries out there that do this already. I'm sure I'm not the first person to have this problem.


  • Combine regular expressions for splitting camelCase string into words

    - by stou
    I managed to implement a function that converts camel case to words, by using the solution suggested by @ridgerunner in this question: Split camelCase word into words with php preg_match (Regular Expression) However, I want to also handle embedded abbreviations like this: 'hasABREVIATIONEmbedded' translates to 'Has ABREVIATION Embedded' I came up with this solution: <?php function camelCaseToWords($camelCaseStr) { // Convert: "TestASAPTestMore" to "TestASAP TestMore" $abreviationsPattern = '/' . // Match position between UPPERCASE "words" '(?<=[A-Z])' . // Position is after group of uppercase, '(?=[A-Z][a-z])' . // and before group of lowercase letters, except the last upper case letter in the group. '/x'; $arr = preg_split($abreviationsPattern, $camelCaseStr); $str = implode(' ', $arr); // Convert "TestASAP TestMore" to "Test ASAP Test More" $camelCasePattern = '/' . // Match position between camelCase "words". '(?<=[a-z])' . // Position is after a lowercase, '(?=[A-Z])' . // and before an uppercase letter. '/x'; $arr = preg_split($camelCasePattern, $str); $str = implode(' ', $arr); $str = ucfirst(trim($str)); return $str; } $inputs = array( 'oneTwoThreeFour', 'StartsWithCap', 'hasConsecutiveCAPS', 'ALLCAPS', 'ALL_CAPS_AND_UNDERSCORES', 'hasABREVIATIONEmbedded', ); echo "INPUT"; foreach($inputs as $val) { echo "'" . $val . "' translates to '" . camelCaseToWords($val). "'\n"; } The output is: INPUT'oneTwoThreeFour' translates to 'One Two Three Four' 'StartsWithCap' translates to 'Starts With Cap' 'hasConsecutiveCAPS' translates to 'Has Consecutive CAPS' 'ALLCAPS' translates to 'ALLCAPS' 'ALL_CAPS_AND_UNDERSCORES' translates to 'ALL_CAPS_AND_UNDERSCORES' 'hasABREVIATIONEmbedded' translates to 'Has ABREVIATION Embedded' It works as intended. My question is: Can I combine the 2 regular expressions $abreviationsPattern and $camelCasePattern so I can avoid running the preg_split() function twice?
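
    Since both patterns are zero-width position matches, one option is to join them with alternation and split once. The sketch below checks that idea in Python (3.7 or later, where `re.split` accepts empty matches) rather than PHP. The lookarounds use the same syntax as in PCRE, so the combined pattern would presumably work with a single `preg_split()` call as well, though that is untested here. `camel_case_to_words` is a hypothetical name.

```python
import re

SPLIT_RE = re.compile(
    r"(?<=[a-z])(?=[A-Z])"          # between a lowercase and an uppercase
    r"|(?<=[A-Z])(?=[A-Z][a-z])"    # before the last capital of a caps run
)


def camel_case_to_words(text: str) -> str:
    """Split camelCase (with embedded caps runs) into space-separated words."""
    words = " ".join(SPLIT_RE.split(text)).strip()
    # Upper-case only the first character, like PHP's ucfirst().
    return words[:1].upper() + words[1:]


if __name__ == "__main__":
    for value in ["oneTwoThreeFour", "hasConsecutiveCAPS", "hasABREVIATIONEmbedded"]:
        print(camel_case_to_words(value))
```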


  • SQL with Regular Expressions vs Indexes with Logical Merging Functions

    - by geeko
    Hello Lads, I am trying to develop a complex textual search engine. I have thousands of textual pages from many books. I need to search for pages that match specified complex logical criteria. These criteria can contain virtually any combination of the following: A: Full words. B: Word roots (similar to stems; i.e. all words with certain key letters). C: Word templates (in some languages, templates are filled in to form various parts of speech such as adjectives, past/present verbs...). D: Logical connectives: AND/OR/XOR/NOT/IF/IFF and parentheses to state priorities. Now, would it be faster to keep the pages' full text in the database (not indexed) and search through them all using SQL and regular expressions? Or would it be better to construct indexes of word/root/template-page-location tuples, so that we can speed up searching for individual words/roots/templates? However, it gets tricky as we introduce logical connectives into our query. I thought of doing the following steps in such cases: 1: Separately search for each individual word/root/template in the specified query. 2: On a priority basis, merge two result lists (from step 1) at a time, depending on the logical connective. For example, if we are searching for "he AND (is OR was)": 1: We search for "he", "is" and "was" separately and get a result list for each word. 2: Merge the result lists of "is" and "was" using the merging function OR-MERGE. 3: Merge the result list from the OR-MERGE function with the one for "he" using the merging function AND-MERGE. The result of step 3 is then returned as the result of the specified query. What do you think, gurus? Which is faster? Any better ideas? Thank you all in advance.
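
    The index-and-merge steps described in the question can be sketched with an in-memory inverted index where each word maps to a set of page ids. This is only an illustration under simplifying assumptions (whole words only, tiny made-up pages, no roots or templates); `build_index`, `or_merge` and `and_merge` are names invented to mirror the OR-MERGE and AND-MERGE functions in the text.

```python
from collections import defaultdict


def build_index(pages: dict[int, str]) -> dict[str, set[int]]:
    """Map each lower-cased word to the set of page ids containing it."""
    index: dict[str, set[int]] = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index


def or_merge(a: set[int], b: set[int]) -> set[int]:
    return a | b  # pages matching either operand


def and_merge(a: set[int], b: set[int]) -> set[int]:
    return a & b  # pages matching both operands


# Made-up sample pages, purely for illustration.
pages = {1: "he ran", 2: "he is here", 3: "he was there", 4: "she is", 5: "it was"}
index = build_index(pages)

# "he AND (is OR was)": evaluate the parenthesised part first.
result = and_merge(index["he"], or_merge(index["is"], index["was"]))

if __name__ == "__main__":
    print(sorted(result))
```

    Whether this beats scanning full text with regular expressions depends on the data and the database, which the sketch does not measure.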


  • Algorithm(s) for rearranging simple symbolic algebraic expressions

    - by Gabe Johnson
    Hi, I would like to know if there is a straightforward algorithm for rearranging simple symbolic algebraic expressions. Ideally I would like to be able to rewrite any such expression with one variable alone on the left hand side. For example, given the input: m = (x + y) / 2 ... I would like to be able to ask about x in terms of m and y, or y in terms of x and m, and get these: x = 2*m - y y = 2*m - x Of course we've all done this algorithm on paper for years. But I was wondering if there was a name for it. It seems simple enough but if somebody has already cataloged the various "gotchas" it would make life easier. For my purposes I won't need it to handle quadratics. (And yes, CAS systems do this, and yes I know I could just use them as a library. I would like to avoid such a dependency in my application. I really would just like to know if there are named algorithms for approaching this problem.)


  • I need to remove Java Script tags using regular expressions and JRegex

    - by piotr
    I need to remove all the JavaScript tags and the content in between, as well as style tags, from the HTML code of web pages. So far I've come up with this expression: "(<[ \r\n\t]script([ \r\n\t]|){1,}([ \r\n\t]|.)?)|(<[ \r\n\t]noscript([ \r\n\t]|){1,}([ \r\n\t]|.)?)|(<[ \r\n\t]style([ \r\n\t]|){1,}([ \r\n\t]|.)?)" I use the JRegex library to work with regular expressions. When I test it in any regex tester it works just fine, but once I run my program it all crashes with this error report: Exception in thread "Thread-0" java.lang.StackOverflowError at java.util.regex.Pattern$BranchConn.match(Unknown Source) at java.util.regex.Pattern$BmpCharProperty.match(Unknown Source) at java.util.regex.Pattern$Branch.match(Unknown Source) at java.util.regex.Pattern$GroupHead.match(Unknown Source) at java.util.regex.Pattern$LazyLoop.match(Unknown Source) at java.util.regex.Pattern$GroupTail.match(Unknown Source) at java.util.regex.Pattern$BranchConn.match(Unknown Source) at java.util.regex.Pattern$CharProperty.match(Unknown Source) at java.util.regex.Pattern$Branch.match(Unknown Source) at java.util.regex.Pattern$GroupHead.match(Unknown Source) at java.util.regex.Pattern$LazyLoop.match(Unknown Source) .................................. And it keeps on going forever. If anyone can give me advice on this one, I'll be very grateful.


  • Python Regular Expressions: Capture lookahead value (capturing text without consuming it)

    - by Lattyware
    I wish to use regular expressions to split words into groups of (vowels, not_vowels, more_vowels), using a marker to ensure every word begins and ends with a vowel. import re MARKER = "~" VOWELS = {"a", "e", "i", "o", "u", MARKER} word = "dog" if word[0] not in VOWELS: word = MARKER+word if word[-1] not in VOWELS: word += MARKER re.findall("([%]+)([^%]+)([%]+)".replace("%", "".join(VOWELS)), word) In this example we get: [('~', 'd', 'o')] The issue is that I wish the matches to overlap - the last set of vowels should become the first set of the next match. This appears possible with lookaheads, if we replace the regex as follows: re.findall("([%]+)([^%]+)(?=[%]+)".replace("%", "".join(VOWELS)), word) We get: [('~', 'd'), ('o', 'g')] Which means we are matching what I want. However, it now doesn't return the last set of vowels. The output I want is: [('~', 'd', 'o'), ('o', 'g', '~')] I feel this should be possible (if the regex can check for the second set of vowels, I see no reason it can't return them), but I can't find any way of doing it beyond the brute force method, looping through the results after I have them and appending the first character of the next match to the last match, and the last character of the string to the last match. Is there a better way in which I can do this? The two things that would work would be capturing the lookahead value, or not consuming the text on a match, while capturing the value - I can't find any way of doing either.
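
    One approach that seems to fit is putting a capture group inside the lookahead: the lookahead still consumes nothing, but the group inside it is reported by `findall`. The sketch below assumes the same marker convention as the question and uses a fixed vowel string instead of a set so the character class has a stable order; `vowel_groups` is a hypothetical name.

```python
import re

MARKER = "~"
VOWELS = "aeiou" + MARKER

# Groups 1 and 2 are consumed; group 3 sits inside a lookahead, so the
# trailing vowels are captured but left in place for the next match.
PATTERN = re.compile(r"([{v}]+)([^{v}]+)(?=([{v}]+))".format(v=VOWELS))


def vowel_groups(word: str) -> list[tuple[str, str, str]]:
    """Return overlapping (vowels, non_vowels, vowels) triples."""
    if word[0] not in VOWELS:
        word = MARKER + word
    if word[-1] not in VOWELS:
        word += MARKER
    return PATTERN.findall(word)


if __name__ == "__main__":
    print(vowel_groups("dog"))
```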

