Search Results

Search found 7251 results on 291 pages for 'pdf parsing'.

Page 100/291 | < Previous Page | 96 97 98 99 100 101 102 103 104 105 106 107  | Next Page >

  • Regular expression in C#

    - by user340015
    i have text something like this. @@MMIVLoader@[email protected]@BCM_7400S_LE@Product@Aug 21 2009@ @@MMIVLib@[email protected]@BCM_7400S_LE@Product@Aug 21 2009@ @@HuaweFGDLDrv@[email protected]@7324@PRODUCT@Aug 20 2009@ @@ProtectVer@[email protected] @BCM_SDE5.03@PRODUCT@Aug 4 2009 06:56:19@ @@KernelSw@[email protected]@BCM-7454@PRODUCT@ Dec 19 2007@ @@ReceiverSw@[email protected]@HWBC01ZS@PRODUCT@May 3 2010@ i want the out put in an array like MMIVLoader 4.1.2 MMIVLib 4.1.2 HuaweFGDLDrv 01.00.09 ProtectVer 127.8.1 KernelSw 0.0.1 ReceiverSw E.5.6.001 Can any one suggest me how to do this in c# using regular expression or is there a any sophisticated way to do this thanks in advance

    Read the article

  • "The left hand side of an assignment must be a variable" due to extra parentheses

    - by polygenelubricants
    I know why the following code doesn't compile: public class Main { public static void main(String args[]) { main((null)); // this is fine! (main(null)); // this is NOT! } } What I'm wondering is why my compiler (javac 1.6.0_17, Windows version) is complaining "The left hand side of an assignment must be a variable". I'd expect something like "Don't put parentheses around a method invokation, dummy!", instead. So why is the compiler making a totally unhelpful complaint about something that is blatantly irrelevant? Is this the result of an ambiguity in the grammar? A bug in the compiler? If it's the former, could you design a language such that a compiler would never be so off-base about a syntax error like this?

    Read the article

  • Weird URL parse issue. (Android)

    - by Tarmon
    I am attempting to parse in a URL to a KML file from maps.google.com. When I try and use this link: http://maps.google.com/maps/ms?ie=UTF8&hl=en&msa=0&msid=112748174025829638330.000483ad6315714cc941d&z=13&output=kml` I am unable to overlay this KML file on my MapView. If I were to take the KML file that I get from following this link and upload it to my Dropbox it will work just fine. I think there may be something about the URL from Google that it doesn't like? Link from dropbox: http://dl.dropbox.com/u/1037184/Blue_original.kml Also it would be better if we could just save these KML files locally and pass them in the same way but I can't figure out a way to do this. Here is the code I am using: Intent mapIntent = new Intent(Intent.ACTION_VIEW); Uri uri1 = Uri.parse("geo:0,0?q=http://code.google.com/apis/kml/ documentation/KML_Samples.kml"); mapIntent.setData(uri1); startActivity(Intent.createChooser(mapIntent, "Test")); The URL used in this example also works. So to recap: I am curious as to why some URLs work and others don't. Is there a way to place this KML file locally on the device and pass it to a Uri object? Any other suggestions? Thanks, Rob

    Read the article

  • How to write regex that searches for a dynamic amount of pairs?

    - by citronas
    Lets say a have a string such as this one: string txt = "Lore ipsum {{abc|prop1=\"asd\";prop2=\"bcd\";}} asd lore ipsum"; The information I want to extract "abc" and pairs like ("prop1","asd") , ("prop3", "bcd") where each pair used a ; as delimeter. Edit1: (based on MikeB's) code Ah, getting close. I found out how to parse the following: string txt = "Lore ipsum {{abc|prop1=\"asd\";prop2=\"http:///www.foo.com?foo=asd\";prop3=\"asd\";prop4=\"asd\";prop5=\"asd\";prop6=\"asd\";}} asd"; Regex r = new Regex("{{(?<single>([a-z0-9]*))\\|((?<pair>([a-z0-9]*=\"[a-z0-9.:/?=]*\";))*)}}", RegexOptions.Singleline | RegexOptions.IgnoreCase); Match m = r.Match(txt); if (m.Success) { Console.WriteLine(m.Groups["single"].Value); foreach (Capture cap in m.Groups["pair"].Captures) { Console.WriteLine(cap.Value); } } Question 1: How must I adjust the regex to say 'each value of a pair in delimited by \" only? I added chars like '.',';' etc, but I can't think of any char that I want to permit. The other way around would be much nicer. Question 2: How must I adjust this regex work with this thing here? string txt = "Lore ipsum {{abc|prop1=\"asd\";prop2=\"http:///www.foo.com?foo=asd\";prop3=\"asd\";prop4=\"asd\";prop5=\"asd\";prop6=\"asd\";}} asd lore ipsum {{aabc|prop1=\"asd\";prop2=\"http:///www.foo.com?foo=asd\";prop3=\"asd\";prop4=\"asd\";prop5=\"asd\";prop6=\"asd\";}}"; Therefore I'd probably try to get groups of {{...}} and use the other regex?

    Read the article

  • PHP SAX parser for HTML?

    - by Daniel
    Hi. I need HTML SAX (not DOM!) parser for PHP able to process even invalid HTML code. The reason i need it is to filter user entered HTML (remove all attributes and tags except allowed ones) and truncate HTML content to specified length. Any ideas?

    Read the article

  • How to parse ISO formatted date in python?

    - by Big 40wt Svetlyak
    I need to parse strings like that "2008-09-03T20:56:35.450686Z" into the python's datetime? I have found only strptime in the python 2.5 std lib, but it not so convinient. Which is the best way to do that? Update: It seems, that python-dateutil works very well. I have found that solution: d1 = '2008-09-03T20:56:35.450686Z' d2 = dateutil.parser.parse(d1) d3 = d2.astimezone(dateutil.tx.tzutc())

    Read the article

  • Regex for retrieving the parameter of the css url function...

    - by Kieron
    Hi, I'm trying to get the url portion of the following string: url(images/ui-bg_highlight-soft_75_cccccc_1x100.png) So the required part if images/ui-bg_highlight-soft_75_cccccc_1x100.png. Currently I've got this: url\((?<url>.*)\) But it seems to be choking on the following example: url(images/ui-bg_flat_0_aaaaaa_40x100.png) 50% 50% repeat-x; opacity: .30;filter:Alpha(Opacity=30) Which results in images/ui-bg_flat_0_aaaaaa_40x100.png) 50% 50% repeat-x; opacity: .30;filter:Alpha(Opacity=30... I'd like to make sure that it supports as many variations as possible (additional whitespace etc). Thanks! Kieron

    Read the article

  • making arrays from tab-delimited text file column

    - by absolutenewbie
    I was wondering if anyone could help a desperate newbie with perl with the following question. I've been trying all day but with my perl book at work, I can't seem to anything relevant in google...or maybe am genuinely stupid with this. I have a file that looks something like the following: Bob April Bob April Bob March Mary August Robin December Robin April The output file I'm after is: Bob April April March Mary August Robin December April So that it lists each month in the order that it appears for each person. I tried making it into a hash but of course it wouldn't let me have duplicates so I thought I would like to have arrays for each name (in this example, Bob, Mary and Robin). I'm afraid to upload the code I've been trying to tweak because I know it'll be horribly wrong. I think I need to define(?) the array. Would this be correct? Any help would be greatly appreciated and I promise I will be studying more about perl in the meantime. Thank you for your time, patience and help. #!/usr/bin/perl -w while (<>) { chomp; if (defined $old_name) { $name=$1; $month=$2; if ($name eq $old_name) { $array{$month}++; } else { print "$old_name"; foreach (@array) { push (@array, $month); print "\t@array"; } print "\n"; @array=(); $array{$month}++; } } else { $name=$1; $month=$2; $array{month}++; } $old_name=$name; } print "$old_name"; foreach (@array) { push (@array, $month); print "\t@array"; } print "\n";

    Read the article

  • .NET regex: Match.nextMatch() never returns

    - by Jimmy
    I have a regex that seems to have worked fine for the past year or so, and all of a sudden today with a new slightly different text to match against, Match.nextMatch() never returns. I'm no regex expert and I'm sure the regex can be optimized, but previous data sets weren't much more complex than what I've tried today. Furthermore, the regex works fine against the offending data set in a tool like RegexBuddy; it's only in .net (running in debug in Visual Studio) that it seems to hang. Nevertheless, if anyone can figure out how to tweak the regex to make it work, I'd really appreciate it. This is the regex: <tr>(<td[^>]*><a[^>]*>(?<callOptionTicker>[A-Z]{1,5}\d{6}C\d{8})</a></td>)(<td[^>]*>.*?</td>){6}(<td[^>]*><b><a[^>]*>(?<strikePrice>\d*\.\d*)</a></b></td>)(<td[^>]*><a[^>]*>(?<putOptionTicker>[A-Z]{1,5}\d{6}P\d{8})</a></td>) It's meant to extract put and call option tickers from a Yahoo option chain page (i.e., raw HTML). It works fine for IBM http://finance.yahoo.com/q/os?s=IBM&m=2010-05-21 It doesn't work for SPX options (this is the offending data set) http://finance.yahoo.com/q/os?s=I:SPX.W&m=2010-05

    Read the article

  • How do I process a nested list?

    - by ddbeck
    Suppose I have a bulleted list like this: * list item 1 * list item 2 (a parent) ** list item 3 (a child of list item 2) ** list item 4 (a child of list item 2 as well) *** list item 5 (a child of list item 4 and a grand-child of list item 2) * list item 6 I'd like to parse that into a nested list or some other data structure which makes the parent-child relationship between elements explicit (rather than depending on their contents and relative position). For example, here's a list of tuples containing an item and a list of its children (and so forth): [('list item 1',), ('list item 2', [('list item 3',), [('list item 4', [('list item 5'),]] ('list item 6',)] I've attempted to do this with plain Python and some experimentation with Pyparsing, but I'm not making progress. I'm left with two major questions: What's the strategy I need to employ to make this work? I know recursion is part of the solution, but I'm having a hard time making the connection between this and, say, a Fibonacci sequence. I'm certain I'm not the first person to have done this, but I don't know the terminology of the problem to make fruitful searches for more information on this topic. What problems are related to this so that I can learn more about solving these kinds of problems in general?

    Read the article

  • Read alphanumeric characters from csv file in C#

    - by Prasad
    I am using the following code to read my csv file: public DataTable ParseCSV(string path) { if (!File.Exists(path)) return null; string full = Path.GetFullPath(path); string file = Path.GetFileName(full); string dir = Path.GetDirectoryName(full); //create the "database" connection string string connString = "Provider=Microsoft.ACE.OLEDB.12.0;" + "Data Source=\"" + dir + "\\\";" + "Extended Properties=\"text;HDR=Yes;FMT=Delimited;IMEX=1\""; //create the database query string query = "SELECT * FROM " + file; //create a DataTable to hold the query results DataTable dTable = new DataTable(); //create an OleDbDataAdapter to execute the query OleDbDataAdapter dAdapter = new OleDbDataAdapter(query, connString); //fill the DataTable dAdapter.Fill(dTable); dAdapter.Dispose(); return dTable; } But the above doesn't reads the alphanumeric value from the csv file. it reads only i either numeric or alpha. Whats the fix i need to make to read the alphanumeric values? Please suggest.

    Read the article

  • Any Java library for address extraction from emails?

    - by Hans Klock
    I'm looking for an Java open-source library which is able to extract address information from a (German) email (signature). The library should find name street city, city code/postal code email tel/fax address-parser.com is an commercial product, but a free (albeit simple) library would be great. stackoverflow.com/questions/16413/parse-usable-street-address-city-state-zip-from-a-string is asking for something similar, but my problem is broader because the address information is hidden in a complete email. And there isn't a solution either... Any ideas?

    Read the article

  • How to read XML using XPath in Java

    - by kaibuki
    Hi guys, I want to read XML data using XPath in Java, so for the information I have gathered I am not able to parse XML according to my requirement. here is what I want to do: Get XML file from online via its URL, then use XPath to parse it, I want to create two methods in it. One is in which I enter a specific node attribute id, and I get all the child nodes as result, and second is suppose I just want to get a specific child node value only <?xml version="1.0"?><howto> <topic name="Java"> <url>http://www.rgagnonjavahowto.htm</url> <car>taxi</car> </topic> <topic ame="PowerBuilder"> <url>http://www.rgagnon/pbhowto.htm</url> <url>http://www.rgagnon/pbhowtonew.htm</url> </topic> <topic name="Javascript"> <url>http://www.rgagnon/jshowto.htm</url> </topic> <topic name="VBScript"> <url>http://www.rgagnon/vbshowto.htm</url> </topic></howto> In above example I want to read all the elements if I search via @name and also one function in which I just want the url from @name 'Javascript' only return one node element. I hope I cleared my question :) Thanks. Kai

    Read the article

  • How to get entire input string in Lex and Yacc?

    - by DevDevDev
    OK, so here is the deal. In my language I have some commands, say XYZ 3 5 GGB 8 9 HDH 8783 33 And in my Lex file XYZ { return XYZ; } GGB { return GGB; } HDH { return HDH; } [0-9]+ { yylval.ival = atoi(yytext); return NUMBER; } \n { return EOL; } In my yacc file start : commands ; commands : command | command EOL commands ; command : xyz | ggb | hdh ; xyz : XYZ NUMBER NUMBER { /* Do something with the numbers */ } ; etc. etc. etc. etc. My question is, how can I get the entire text XYZ 3 5 GGB 8 9 HDH 8783 33 Into commands while still returning the NUMBERs? Also when my Lex returns a STRING [0-9a-zA-Z]+, and I want to do verification on it's length, should I do it like rule: STRING STRING { if (strlen($1) < 5 ) /* Do some shit else error */ } or actually have a token in my Lex that returns different tokens depending on length?

    Read the article

  • Xerces SAX parser ignore the xmlxs:xsi attribute as an attribute of an element

    - by user603301
    Hi, Using Xerces SAX parser I try to retrieve all elements and their attributes of this XML file: -------------- Begin XML file to parse ---------------- <?xml version="1.0" encoding="UTF-8"?> <invoice xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="my.xsd"> <parties> (...) -------------- End XML file to parse ---------------- When getting the attributes for the element 'invoice', Xerces++ does not insert the 'xmlns:xsi' attribute in the list of 'Attributes' for the element 'invoice'. However, the attribute 'xsi:noNamespaceSchemaLocation' is inserted in the list. Why? Is there a specific reason from an XML standard point of view ? Is there a way to configure Xerces++ SAX parser so that it inserts this attribute as well? (The documentation on setting the parser properties does not tell how). Thanks for your help.

    Read the article

  • Problem with eastern european characters when scraping data from the European Parliaments Website

    - by Thomas Jensen
    Dear Experts I am trying to scrape a lot of data from the European Parliament website for a research project. Ther first step is the create a list if all parliamentarians, however due to the many Eastern European names and the accents they use i get a lot of missing entries. Here is an example of what is giving me troubles (notice the accents at the end of the family name): ANDRIKIENE, Laima Liucija Group of the European People's Party (Christian Democrats) So far I have been using PyParser and the following code: parser_names name = Word(alphanums + alphas8bit) begin, end = map(Suppress, "<") names = begin + ZeroOrMore(name) + "," + ZeroOrMore(name) + end for name in names.searchString(page): print(name) However this does not catch the name from the html above. Any advice in how to proceed? Best, Thomas

    Read the article

  • pyparsing ambiguity

    - by Claudiu
    I'm trying to parse some text using PyParser. The problem is that I have names that can contain white spaces. So my input might look like this: Joe Bob Jimmy Foo Joe decides to eat. Bob decides to not eat. Jimmy Foo decides to eat. How can I create a parser for the decides to eat line? If I create my name parser naively, meaning with alphabetic characters plus space characters, then it will match the entire line.

    Read the article

  • Extracting email addresses in an html block in ruby/rails

    - by corroded
    I am creating a parser that wards off against spamming and harvesting of emails from a block of text that comes from tinyMCE (so it may or may not have html tags in it) I've tried regexes and so far this has been successful: /\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/i problem is, i need to ignore all email addresses with mailto hrefs. for example: <a href="mailto:[email protected]">[email protected]</a> should only return the second email add. To get a background of what im doing, im reversing the email addresses in a block so the above example would look like this: <a href="mailto:[email protected]">moc.liam@tset</a> problem with my current regex is that it also replaces the one in href. Is there a way for me to do this with a single regex? Or do i have to check for one then the other? Is there a way for me to do this just by using gsub or do I have to use some nokogiri/hpricot magicks and whatnot to parse the mailtos? Thanks in advance! Here were my references btw: so.com/questions/504860/extract-email-addresses-from-a-block-of-text so.com/questions/1376149/regexp-for-extracting-a-mailto-address im also testing using this: http://rubular.com/ edit here's my current helper code: def email_obfuscator(text) text.gsub(/\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/i) { |m| m = "<span class='anti-spam'>#{m.reverse}</span>" } end which results in this: <a target="_self" href="mailto:<span class='anti-spam'>moc.liamg@tset</span>"><span class="anti-spam">moc.liamg@tset</span></a>

    Read the article

  • Objective-C Implementation Pointers

    - by Dwaine Bailey
    Hi, I am currently writing an XML parser that parses a lot of data, with a lot of different nodes (the XML isn't designed by me, and I have no control over the content...) Anyway, it currently takes an unacceptably long time to download and read in (about 13 seconds) and so I'm looking for ways to increase the efficiency of the read. I've written a function to create hash values, so that the program no longer has to do a lot of string comparison (just NSUInteger comparison), but this still isn't reducing the complexity of the read in... So I thought maybe I could create an array of IMPs so that, I could then go something like: for(int i = 0; i < [hashValues count]; i ++) { if(currHash == [[hashValues objectAtIndex:i] unsignedIntValue]) { [impArray objectAtIndex:i]; } } Or something like that. The only problem is that I don't know how to actually make the call to the IMP function? I've read that I perform the selector that an IMP defines by going IMP tImp = [impArray objectAtIndex:i]; tImp(self, @selector(methodName)); But, if I need to know the name of the selector anyway, what's the point? Can anybody help me out with what I want to do? Or even just some more ways to increase the efficiency of the parser...

    Read the article

< Previous Page | 96 97 98 99 100 101 102 103 104 105 106 107  | Next Page >