Search Results

Search found 19217 results on 769 pages for 'log parser'.

Page 44/769 | < Previous Page | 40 41 42 43 44 45 46 47 48 49 50 51  | Next Page >

  • Text extraction with java html parsers

    - by zenmonkey
    I want to use an html parser that does the following in a nice, elegant way Extract text (this is most important) Extract links, meta keywords Reconstruct original doc (optional but nice feature to have) From my investigation so far jericho seems to fit. Any other open source libraries you guys would recommend?

    Read the article

  • Learning Treetop

    - by cmartin
    I'm trying to teach myself Ruby's Treetop grammar generator. I am finding that not only is the documentation woefully sparse for the "best" one out there, but that it doesn't seem to work as intuitively as I'd hoped. On a high level, I'd really love a better tutorial than the on-site docs or the video, if there is one. On a lower level, here's a grammar I cannot get to work at all: grammar SimpleTest rule num (float / integer) end rule float ( (( '+' / '-')? plain_digits '.' plain_digits) / (( '+' / '-')? plain_digits ('E' / 'e') plain_digits ) / (( '+' / '-')? plain_digits '.') / (( '+' / '-')? '.' plain_digits) ) { def eval text_value.to_f end } end rule integer (( '+' / '-' )? plain_digits) { def eval text_value.to_i end } end rule plain_digits [0-9] [0-9]* end end When I load it and run some assertions in a very simple test object, I find: assert_equal @parser.parse('3.14').eval,3.14 Works fine, while assert_equal @parser.parse('3').eval,3 raises the error: NoMethodError: private method `eval' called for # If I reverse integer and float on the description, both integers and floats give me this error. I think this may be related to limited lookahead, but I cannot find any information in any of the docs to even cover the idea of evaluating in the "or" context A bit more info that may help. Here's pp information for both those parse() blocks. The float: SyntaxNode+Float4+Float0 offset=0, "3.14" (eval,plain_digits): SyntaxNode offset=0, "" SyntaxNode+PlainDigits0 offset=0, "3": SyntaxNode offset=0, "3" SyntaxNode offset=1, "" SyntaxNode offset=1, "." SyntaxNode+PlainDigits0 offset=2, "14": SyntaxNode offset=2, "1" SyntaxNode offset=3, "4": SyntaxNode offset=3, "4" The Integer... note that it seems to have been defined to follow the integer rule, but not caught the eval() method: SyntaxNode+Integer0 offset=0, "3" (plain_digits): SyntaxNode offset=0, "" SyntaxNode+PlainDigits0 offset=0, "3": SyntaxNode offset=0, "3" SyntaxNode offset=1, "" Update: I got my particular problem working, but I have no clue why: rule integer ( '+' / '-' )? plain_digits { def eval text_value.to_i end } end This makes no sense with the docs that are present, but just removing the extra parentheses made the match include the Integer1 class as well as Integer0. Integer1 is apparently the class holding the eval() method. I have no idea why this is the case. I'm still looking for more info about treetop.

    Read the article

  • Performance of tokenizing CSS in PHP

    - by Boldewyn
    This is a noob question from someone who hasn't written a parser/lexer ever before. I'm writing a tokenizer/parser for CSS in PHP (please don't repeat with 'OMG, why in PHP?'). The syntax is written down by the W3C neatly here (CSS2.1) and here (CSS3, draft). It's a list of 21 possible tokens, that all (but two) cannot be represented as static strings. My current approach is to loop through an array containing the 21 patterns over and over again, do an if (preg_match()) and reduce the source string match by match. In principle this works really good. However, for a 1000 lines CSS string this takes something between 2 and 8 seconds, which is too much for my project. Now I'm banging my head how other parsers tokenize and parse CSS in fractions of seconds. OK, C is always faster than PHP, but nonetheless, are there any obvious D'Oh! s that I fell into? I made some optimizations, like checking for '@', '#' or '"' as the first char of the remaining string and applying only the relevant regexp then, but this hadn't brought any great performance boosts. My code (snippet) so far: $TOKENS = array( 'IDENT' => '...regexp...', 'ATKEYWORD' => '@...regexp...', 'String' => '"...regexp..."|\'...regexp...\'', //... ); $string = '...CSS source string...'; $stream = array(); // we reduce $string token by token while ($string != '') { $string = ltrim($string, " \t\r\n\f"); // unconsumed whitespace at the // start is insignificant but doing a trim reduces exec time by 25% $matches = array(); // loop through all possible tokens foreach ($TOKENS as $t => $p) { // The '&' is used as delimiter, because it isn't used anywhere in // the token regexps if (preg_match('&^'.$p.'&Su', $string, $matches)) { $stream[] = array($t, $matches[0]); $string = substr($string, strlen($matches[0])); // Yay! We found one that matches! continue 2; } } // if we come here, we have a syntax error and handle it somehow } // result: an array $stream consisting of arrays with // 0 => type of token // 1 => token content

    Read the article

  • How to switch a view after i parse data?

    - by charles Graffeo
    OK my problem is this, i parse a document and after the document is parsed then i want to load to the next view sounds simple but ive been here for like 4 hours playing with code and id appreciete any help u can give me atm. k heres my parserDidEndDocumentCode - (void)parserDidEndDocument:(NSXMLParser *)parser{ IpadSlideShowViewController temp=(IpadSlideShowViewController)[[IpadSlideShowViewController alloc]init]; [temp performSelectorOnMainThread:@selector(switchViews) withObject:nil waitUntilDone:TRUE]; [temp release]; } now what i think should happen is my code will switch the view after it parses the documentation, n its not doing it idk why

    Read the article

  • lexers vs parsers

    - by Naveen
    Are lexers and parsers really that different in theory ? It seems fashionable to hate regular expressions: coding horror, another blog post. However, popular lexing based tools: pygments, geshi, or prettify, all use regular expressions. They seem to lex anything... When is lexing enough, when do you need EBNF ? Has anyone used the tokens produced by these lexers with bison or antlr parser generators?

    Read the article

  • Splitting a list in python

    - by blob8108
    Hi, I'm writing a parser in Python. I've converted an input string into a list of tokens, such as: ['(', '2', '.', 'x', '.', '(', '3', '-', '1', ')', '+', '4', ')', '/', '3', '.', 'x', '^', '2'] I want to be able to split the list into multiple lists, like the str.split('+') function. But there doesn't seem to be a way to do my_list.split('+'). Any ideas? Thanks!

    Read the article

  • NSConcreteData leaked object in objective c ?

    - by Madan Mohan
    Hi Guys, I am getting the NSConcreteData leaked object while testing the leaks in the instruments.It showing in the parser, - (void)parseXMLFileAtURL:(NSURL *)URL { [urlList release]; urlList = [[NSMutableArray alloc] init]; myParser = [[NSXMLParser alloc] initWithContentsOfURL:URL] ;// it showing this line as leaking [myParser setDelegate:self]; [myParser setShouldProcessNamespaces:NO]; [myParser setShouldReportNamespacePrefixes:NO]; [myParser setShouldResolveExternalEntities:NO]; [myParser parse]; [myParser release]; }

    Read the article

  • why the sexp has array in the end

    - by dorelal
    RubyParser.new.parse "1+1" s(:call, s(:lit, 1), :+, s(:array, s(:lit, 1))) Above code is from this link Why there is array after + in the Sexp. I am just trying to learn ruby parser and the whole AST thing. I have been programming for a while but have no formal education in computer science. So do point to good article which explains AST etc. Please no dragon book. I tried couple of times but couldn't understand much of that book

    Read the article

  • Cannot log to Proxmox GUI

    - by greg
    I'm running proxmox on debian 6.0.8 kernel 2.6.32-18-pve. When I try to log into the GUI with the root password, it lets me in for about 2 seconds then asks for the password again: <hostname>:8006/api2/json/cluster/resources 401 (permission denied - invalid ticket) This is not the same behavior as when I give an incorrect password. I suspect a hostname problem, but I can't sort it out. My /etc/hosts contains the following line: <ip> <shorthostname> <longhostname> pvelocalhost The folder /etc/pve/nodes contains a folder name The https certificates matches the hostname. Any idea? TIA greg

    Read the article

  • gwt+xml- can i read through incomplete XML using the GWT XML Parser

    - by Arvind
    I have a requirement where a user is typing in XML in a text area, and I want to show the various nodes in a tree...But as the user is typing in the xml, it wont be a complete xml (since he is still typing in the XML)... How do I read an incomplete XML and correctly generate the tree? I understand how to read an xml and generate nodes in the tree, what I want to know is, how to read the incomplete XML as it is being typed in

    Read the article

  • Log analyzer that calculates "time on page"?

    - by netvope
    I need to get an idea of the average "time on page" or "page view duration" for each page on my websites without client-side scripting (such as using onunload event handler). Is any of the free log analyzers capable of doing this? I looked at Webalizer, AWStats and Analog, but they don't seem to have such a function. The closest thing is "visits duration" in AWStats, but I'd like to see "page view duration" instead. I know that visitor tracking is inaccurate without client-side scripting, but I can bear with it. Google Analytics seems to provide a "time on page" metric without hooking the onunload event (but correct me if I'm wrong), so I believe this is possible.

    Read the article

  • How to log invalid client SSL certificate in SSL

    - by matra
    I have a IIS web site which requires client certificate. I have turned off CRL checking. The client is unable to access the web site - he gets 403.17 (certificate expired) error. I would like to log the certificate he is using, becaue I think he is using the wrong certificate. Is there a way to do this? I probably can not use WireShark, because client certificatethat is passed from the client is probably already encryped. I am running a WIndows 2003 server. Matra

    Read the article

  • perl xml parser get xml content within xml

    - by user391986
    How can I use XMLParser to get the item-@url, item-@replace and item-"value inside" for the content as a string of the node where item-@cone="one"? <cstep> <item cone="one" url="http://google.com/{ccc}/cthree" replace="{ccc}"> <itemsub conesub="conesub"> <itemsubsub conesubsub="conesubsub" /> </itemsub> </item> <item cone="two" url="http://google.com/{ccc}/cthree" replace="{ccc}"> <itemsub conesub="conesub"> <itemsubsub conesubsub="conesubsub" /> </itemsub> </item> </cstep>

    Read the article

  • Log in/out of Gmail chat programmatically, clicking Gmail's span "links"

    - by endolith
    At work, I use Gmail's chat, since it's encrypted and logs chats without installing or saving anything to the hard drive. At home, I use Pidgin. When I log into GMail at home, I have to log out of chat, or messages will end up in the wrong place. When I log into GMail at work, I have to log back in to chat. In other words, when I start Firefox at home, I want Gmail's chat disabled automatically. When I start Firefox at work, I want Gmail's chat enabled automatically. Is there a way to use a Greasemonkey script or similar to force logging in and logging out on specific machines? It would seem simple enough; just follow a URL or simulate clicking a link. Unfortunately, Gmail doesn't use actual links. While logged out: <span tabindex="0" role="link" action="si" class="az9OKd">Sign into chat</span> While logged in, in drop-down menu: <div tabindex="-1" id=":1mj" role="menuitem" class="oA" value="si"><div class="uQ c6"/>Sign into chat</div> <div tabindex="-1" id=":8f" role="menuitem" class="oA" value="sia"><div class="uQ c5"/>Sign into AIM®</div> <div tabindex="-1" id=":8e" role="menuitem" class="oA" value="so"><div class="uQ df"/>Sign out of chat</div> At bottom of page: <span id=":im" class="l8 ou" tabindex="0" role="link">turn off chat</span> <span id=":im" class="l8 ou" tabindex="0" role="link">turn on chat</span> Anyone know how to "click" these non-links with JavaScript or access their functions? I would imagine that "so" means "sign out", "si" means "sign in", and "sia" means "sign in AIM". Can I somehow call these actions directly? Is there some other alternative for disabling chat?

    Read the article

  • OpenLDAP and Samba, can't log onto Samba share from Windows

    - by Jakobud
    The former jackass IT-guy that I'm taking over for had a Samba share setup on a Fedora server that uses our OpenLDAP server to authenticate users who want to log in from Windows. We recently added a new employee and I jumped through the LDAP hoops to add them to the system. However, I can't seem to use their login to access the Samba share. I'm looking through the LDAP settings and Groups and comparing the new user account to existing ones, and I can't figure out what settings in LDAP are required for this user to be able to access the Samba share. Of course the former idiotic IT-guy didn't document a single thing and has all sorts of weird setups on the network. So I'm at a bit of a loss on knowing what to look for here. Where should I start? On the server that is hosting the Samba share, he has samba running obviously but also has smbldap-tools loaded as well.

    Read the article

  • C# Julian Date Parser

    - by Jake Pearson
    I have a cell in a spreadsheet that is a date object in Excel but becomes a double (something like 39820.0 for 1/7/2009) when it comes out of C1's xls class. I read this is a Julian date format. Can someone tell me how to parse it back into a DateTime in C#? Update: It looks like I might not have a Julian date, but instead the number of days since Dec 30, 1899.

    Read the article

  • Since updating to today's version of Skype on Win8, Skype won't even log in

    - by Warren P
    I am using the desktop (win32 not appstore) Skype on Windows 8. I accepted an auto-update today which now will not let me log into skype at all. I wiped out the AppData\Roaming\Skype folder because that's how you get Skype back in the past when it has gone stupid, but that didn't help either. I do not get any error message, just the circular progress indicator that lasts forever. (Okay, so far it's been 30 minutes. It may eventually time out yet.) Other ideas please? I tried the Windows 8 App Store Skype app (WinRT), which I hate, I guess I'll go back and install that for now. Update Skype 6.0.126 seems to be completely broken on my Win8 machine.

    Read the article

  • Can Windows log CryptoAPI CRL timouts?

    - by makerofthings7
    We have several .NET applications that occasionally "act slow" with no CPU or disk access. I suspect that they are hung up on authentication when trying to validate the certificate, since the timeout is almost 20 seconds. As per this MSFT article Most applications do not specify to CryptoAPI to use a cumulative time-out. If the cumulative time-out option is not enabled, CryptoAPI uses the CryptoAPI default setting which is a time-out of 15 seconds per URL. If the cumulative time-out option specified by the application, then CryptoAPI will use a default setting of 20 seconds as the cumulative timeout. The first URL receives a maximum timeout of 10 seconds. Each subsequent URL timeout is half of the remaining balance in the cumulative timeout value. Since this is a service, how can I detect and log CryptoAPI hangs for applications I have sourcecode to, and also 3rd party

    Read the article

  • Retrieving well formed HTML using Jericho HTML parser in Java

    - by Raj
    Hello, I've looked at jTidy for converting a snipped of malformed/real-world HTML into well-formed HTML/XHTML. However, there's a bug in the latest version due to which I'm not able to use it. I'm looking at Jericho since it has a lot of positive reviews around the net. However, its not immediately obvious to me how one would go about implementing a method like: public String getValidHTML(String messedUpHTML) For instance, if it was passed <div>bar, it would return <div>bar</div> Any pointers would be helpful. Thanks in advance!

    Read the article

  • .NET Excel File Parser

    - by Russak
    So the company I'm working for is looking for a means to verify that a given .xls/.xlsx file is valid. Which means checking columns and rows and other data. He's having me evaluate GrapeCity Spread and SpreadsheetGear, but I'm wondering if anyone else has any other suggestions of external tools to check out. We don't need a means to export .xls files or anything like that, just the ability to import them and verify they are valid based on a set of criteria I create. Thanks.

    Read the article

< Previous Page | 40 41 42 43 44 45 46 47 48 49 50 51  | Next Page >