Search Results

Search found 3176 results on 128 pages for 'parsing'.


Python: Is there a way to get HTML that was dynamically created by Javascript?

- by Joschua

As far as I can tell, this is the case for LyricWikia. The lyrics (example) can be accessed from the browser, but can't be found in the source code (can be opened with CTRL + U in most browsers) or reading the contents of the site with Python: from urllib.request import urlopen URL = 'http://lyrics.wikia.com/Billy_Joel:Piano_Man' r = urlopen(URL).read().decode('utf-8') And the test: >>> 'Now John at the bar is a friend of mine' in r False >>> 'John' in r False But when you select and look at the source code of the box in which the lyrics are displayed, you can see that there is: <div class="lyricbox">[...]</div> Is there a way to get the contents of that div-element with Python?

MalformedURLException with file URI

- by Paul Reiners

While executing the following code: doc = builder.parse(file); where doc is an instance of org.w3c.dom.Document and builder is an instance of javax.xml.parsers.DocumentBuilder, I'm getting the following exception: Exception in thread "main" java.net.MalformedURLException: unknown protocol: c at java.net.URL.<init>(Unknown Source) at java.net.URL.<init>(Unknown Source) at java.net.URL.<init>(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source) at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source) at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source) at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source) at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source) at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source) at javax.xml.parsers.DocumentBuilder.parse(Unknown Source) at com.acme.ItemToThetaValues.createFiles(ItemToThetaValues.java:47) It's choking on this line of the file: <!DOCTYPE questestinterop SYSTEM "C:\Program Files\Acme\parsers\acme_full.dtd"> I am not getting this error on my machine, while a user is getting it on 
his machine. We are both using version 6 of the Sun JRE. This error also occurs when he uses double backslashes in the path instead of single backslashes and when he uses forward slashes instead of backslashes. First of all, is the XML correct? Is the path expressed correctly? Second of all, why is this error occurring on one computer but not on another?

Why does double.Parse throw an error on the live server, and how can I track it?

- by Kovu

Hi, I built a website that: reads data from a website via HttpWebRequest; sorts all the data; parses values from the data and outputs them in a new form. On my local server it works perfectly, but when I push it to my live server, double.Parse fails with an error. So: how can I track what double.Parse is trying to parse, and how can I debug the live server? The language is ASP.NET / C# (.NET 2.0).

How to display an image in a grid view, reading imageUrl from XML using a SAX parser in Android

- by Pramod kuamr

Thanks for the answer. I am able to read the XML file from a URL, but if the XML contains an image URL (the logo element), I need to show that image in a grid view. This is my XML file, as read from the URL: <?xml version="1.0" encoding="UTF-8"?> <channels> <channel> <name>ndtv</name> <logo>http://a3.twimg.com/profile_images/670625317/aam-logo--twitter.png</logo> <description>this is a news Channel</description> <rssfeed>ndtv.com</rssfeed> </channel> <channel> <name>star news</name> <logo>http://a3.twimg.com/profile_images/740897825/AndroidCast-350_normal.png</logo> <description>this is a newsChannel</description> <rssfeed>starnews.com</rssfeed> </channel> </channels>

Lexing partial SQL in C#

- by Chris T

I need to parse partial SQL queries (it's for a SQL injection auditing tool). For example, '1' AND 1=1-- should break down into tokens like [0] => [SQL_STRING, '1'] [1] => [SQL_AND] [2] => [SQL_INT, 1] [3] => [SQL_AND] [4] => [SQL_INT, 1] [5] => [SQL_COMMENT] [6] => [SQL_QUERY_END] Are there any lexers for SQL that I could base mine on, or any good tools like bison for C#? (I'd rather not write my own grammar, as I need to support most if not all of the grammar of MySQL 5.)

how to detect an escape sequence in a string

- by mix

Given a string named line whose raw version has this value: \rRAWSTRING how can I detect if it has the escape character \r? What I've tried is: if repr(line).startswith('\r'): blah... but it doesn't catch it. I also tried find, such as: if repr(line).find('\r') != -1: blah which doesn't work either. What am I missing? Thanks! EDIT: Thanks for all the replies and the corrections regarding terminology, and sorry for the confusion. OK, if I do this: print repr(line) then what it prints is: '\rSET ENABLE ACK\n' (including the single quotes). I have tried all the suggestions, including: line.startswith(r'\r') line.startswith('\\r') each of which returns False. I also tried: line.find(r'\r') line.find('\\r') each of which returns -1
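Since print repr(line) shows '\rSET ENABLE ACK\n', the string most likely starts with a real carriage-return character rather than with a backslash followed by the letter r. A minimal sketch under that assumption (the sample value is copied from the repr output quoted above; the helper names are made up for illustration):

```python
# Sample value reconstructed from the repr() output quoted in the question.
line = '\rSET ENABLE ACK\n'

def starts_with_cr(s):
    """True if s begins with a real carriage-return character (ASCII 13)."""
    return s.startswith('\r')

def starts_with_literal_backslash_r(s):
    """True if s begins with the two characters backslash and 'r'."""
    return s.startswith('\\r')

print(starts_with_cr(line))                   # checks the actual character
print(starts_with_literal_backslash_r(line))  # checks for the two-character text
print(repr(line).startswith("'\\r"))          # repr() adds a leading quote
```

Note that repr() wraps the value in quotes and spells the carriage return as a backslash plus r, so a test against repr(line) has to account for the leading quote; testing line itself is simpler.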

SQL error - Cannot convert nvarchar to decimal

- by jakesankey

I have a C# application that simply parses all of the txt documents within a given network directory and imports the data to a SQL server db. Everything was cruising along just fine until about the 1800th file when it happend to have a few blanks in columns that are called out as DBType.Decimal (and the value is usually zero in the files, not blank). So I got this error, "cannot convert nvarchar to decimal". I am wondering how I could tell the app to simply skip the lines that have this issue?? Perhaps I could even just change the column type to varchar even tho values are numbers (what problems could this create?) Thanks for any help! using System; using System.Data; using System.Data.SQLite; using System.IO; using System.Text.RegularExpressions; using System.Threading; using System.Collections.Generic; using System.Linq; using System.Data.SqlClient; namespace JohnDeereCMMDataParser { internal class Program { public static List<string> GetImportedFileList() { List<string> ImportedFiles = new List<string>(); using (SqlConnection connect = new SqlConnection(@"Server=FRXSQLDEV;Database=RX_CMMData;Integrated Security=YES")) { connect.Open(); using (SqlCommand fmd = connect.CreateCommand()) { fmd.CommandText = @"SELECT FileName FROM CMMData;"; fmd.CommandType = CommandType.Text; SqlDataReader r = fmd.ExecuteReader(); while (r.Read()) { ImportedFiles.Add(Convert.ToString(r["FileName"])); } } } return ImportedFiles; } private static void Main(string[] args) { Console.Title = "John Deere CMM Data Parser"; Console.WriteLine("Preparing CMM Data Parser... 
done"); Console.WriteLine("Scanning for new CMM data..."); Console.ForegroundColor = ConsoleColor.Gray; using (SqlConnection con = new SqlConnection(@"Server=FRXSQLDEV;Database=RX_CMMData;Integrated Security=YES")) { con.Open(); using (SqlCommand insertCommand = con.CreateCommand()) { Console.WriteLine("Connecting to SQL server..."); SqlCommand cmdd = con.CreateCommand(); string[] files = Directory.GetFiles(@"C:\Documents and Settings\js91162\Desktop\CMM WENZEL\", "*_*_*.txt", SearchOption.AllDirectories); List<string> ImportedFiles = GetImportedFileList(); insertCommand.Parameters.Add(new SqlParameter("@FeatType", DbType.String)); insertCommand.Parameters.Add(new SqlParameter("@FeatName", DbType.String)); insertCommand.Parameters.Add(new SqlParameter("@Axis", DbType.String)); insertCommand.Parameters.Add(new SqlParameter("@Actual", DbType.Decimal)); insertCommand.Parameters.Add(new SqlParameter("@Nominal", DbType.Decimal)); insertCommand.Parameters.Add(new SqlParameter("@Dev", DbType.Decimal)); insertCommand.Parameters.Add(new SqlParameter("@TolMin", DbType.Decimal)); insertCommand.Parameters.Add(new SqlParameter("@TolPlus", DbType.Decimal)); insertCommand.Parameters.Add(new SqlParameter("@OutOfTol", DbType.Decimal)); foreach (string file in files.Except(ImportedFiles)) { var FileNameExt1 = Path.GetFileName(file); cmdd.Parameters.Clear(); cmdd.Parameters.Add(new SqlParameter("@FileExt", FileNameExt1)); cmdd.CommandText = @" IF (EXISTS (SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'RX_CMMData' AND TABLE_NAME = 'CMMData')) BEGIN SELECT COUNT(*) FROM CMMData WHERE FileName = @FileExt; END"; int count = Convert.ToInt32(cmdd.ExecuteScalar()); con.Close(); con.Open(); if (count == 0) { Console.WriteLine("Preparing to parse CMM data for SQL import..."); if (file.Count(c => c == '_') > 5) continue; insertCommand.CommandText = @" INSERT INTO CMMData (FeatType, FeatName, Axis, Actual, Nominal, Dev, TolMin, TolPlus, OutOfTol, PartNumber, CMMNumber, Date, 
FileName) VALUES (@FeatType, @FeatName, @Axis, @Actual, @Nominal, @Dev, @TolMin, @TolPlus, @OutOfTol, @PartNumber, @CMMNumber, @Date, @FileName);"; string FileNameExt = Path.GetFullPath(file); string RNumber = Path.GetFileNameWithoutExtension(file); int index2 = RNumber.IndexOf("~"); Match RNumberE = Regex.Match(RNumber, @"^(R|L)\d{6}(COMP|CRIT|TEST|SU[1-9])(?=_)", RegexOptions.IgnoreCase); Match RNumberD = Regex.Match(RNumber, @"(?<=_)\d{3}[A-Z]\d{4}|\d{3}[A-Z]\d\w\w\d(?=_)", RegexOptions.IgnoreCase); Match RNumberDate = Regex.Match(RNumber, @"(?<=_)\d{8}(?=_)", RegexOptions.IgnoreCase); string RNumE = Convert.ToString(RNumberE); string RNumD = Convert.ToString(RNumberD); if (RNumberD.Value == @"") continue; if (RNumberE.Value == @"") continue; if (RNumberDate.Value == @"") continue; if (index2 != -1) continue; DateTime dateTime = DateTime.ParseExact(RNumberDate.Value, "yyyyMMdd", Thread.CurrentThread.CurrentCulture); string cmmDate = dateTime.ToString("dd-MMM-yyyy"); string[] lines = File.ReadAllLines(file); bool parse = false; foreach (string tmpLine in lines) { string line = tmpLine.Trim(); if (!parse && line.StartsWith("Feat. Type,")) { parse = true; continue; } if (!parse || string.IsNullOrEmpty(line)) { continue; } Console.WriteLine(tmpLine); foreach (SqlParameter parameter in insertCommand.Parameters) { parameter.Value = null; } string[] values = line.Split(new[] { ',' }); for (int i = 0; i < values.Length - 1; i++) { if (i = "" || i = null) continue; SqlParameter param = insertCommand.Parameters[i]; if (param.DbType == DbType.Decimal) { decimal value; param.Value = decimal.TryParse(values[i], out value) ? 
value : 0; } else { param.Value = values[i]; } } insertCommand.Parameters.Add(new SqlParameter("@PartNumber", RNumE)); insertCommand.Parameters.Add(new SqlParameter("@CMMNumber", RNumD)); insertCommand.Parameters.Add(new SqlParameter("@Date", cmmDate)); insertCommand.Parameters.Add(new SqlParameter("@FileName", FileNameExt)); insertCommand.ExecuteNonQuery(); insertCommand.Parameters.RemoveAt("@PartNumber"); insertCommand.Parameters.RemoveAt("@CMMNumber"); insertCommand.Parameters.RemoveAt("@Date"); insertCommand.Parameters.RemoveAt("@FileName"); } } } Console.WriteLine("CMM data successfully imported to SQL database..."); } con.Close(); } } } }

RegEx: h1 followed by h2 without p in between

- by voodoo555

Hey everyone, I need a regular expression to find out whether or not a h1 tag is followed by a h2 tag, without any paragraph elements in between. I tried to use a negative lookahead but it doesn't work: <h1(.+?)</h1>(\s|(?!<p))*<h2(.+?)</h2>
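One reason the attempt above cannot skip over other elements is that a lookahead consumes no characters, so (\s|(?!<p))* can only ever advance over whitespace. A hedged sketch of the usual workaround, written here in Python's re syntax (the function name is hypothetical, and regex-based HTML matching is only reasonable for simple, well-formed input):

```python
import re

# Each "tempered dot" consumes one character only if the forbidden tag
# does not start at that position.
H1_THEN_H2 = re.compile(
    r"<h1\b[^>]*>(?:(?!</h1>).)*</h1>"   # one complete h1 element
    r"(?:(?!<p\b|<h2\b).)*"              # anything except the start of a <p> or <h2>
    r"<h2\b",                            # the h2 that follows
    re.IGNORECASE | re.DOTALL,
)

def h1_followed_by_h2(html):
    """Return True if some h1 is followed by an h2 with no p element in between."""
    return H1_THEN_H2.search(html) is not None

print(h1_followed_by_h2("<h1>Title</h1>\n<h2>Subtitle</h2>"))
print(h1_followed_by_h2("<h1>Title</h1><p>text</p><h2>Subtitle</h2>"))
```

The \b after <p keeps tags such as <pre> from being mistaken for a paragraph.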

ICalendar parser in PHP that supports timezones

- by Vincent Robert

I am looking for a PHP class that can parse an ICalendar (ICS) file and correctly handle timezones. I already created an ICS parser myself, but it can only handle timezones known to PHP (like 'Europe/Paris'). Unfortunately, ICS files generated by Evolution (the default calendar software of Ubuntu) do not use the default timezone IDs. Evolution exports events with its own specific timezone ID and also exports the full definition of the timezone: daylight saving dates, recurrence rule and all the hard-to-understand stuff about timezones. This is too much for me. Since it was only a small utility for my girlfriend, I won't have time to investigate the ICalendar specification further and create a full-blown ICalendar parser myself. So is there any known PHP implementation of the ICalendar file format that can parse timezone definitions?

How do I extract a substring from a string until the second space is encountered?

- by gbprithvi

I have a string like this: "o1 1232.5467 1232.5467 1232.5467 1232.5467 1232.5467 1232.5467" How do I extract only "o1 1232.5467"? The number of characters to extract is not always the same, so I want to extract everything up to the second space.
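The question does not name a language, so here is a minimal sketch in Python (the function name is made up for illustration). It splits on at most two spaces and rejoins the first two pieces:

```python
def up_to_second_space(text):
    """Return the part of text that precedes the second space.

    If text contains fewer than two spaces, it is returned unchanged.
    """
    parts = text.split(' ', 2)      # at most three pieces
    return ' '.join(parts[:2])      # keep the first two

sample = "o1 1232.5467 1232.5467 1232.5467 1232.5467 1232.5467 1232.5467"
print(up_to_second_space(sample))
```

The same idea (find the index of the second space, then take a substring) carries over to most languages.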

How do I get Bison/YACC to not recognize a command until it parses the whole string?

- by chucknelson

I have some bison grammar: input: /* empty */ | input command ; command: builtin | external ; builtin: CD { printf("Changing to home directory...\n"); } | CD WORD { printf("Changing to directory %s\n", $2); } ; I'm wondering how I get Bison to not accept (YYACCEPT?) something as a command until it reads ALL of the input. So I can have all these rules below that use recursion or whatever to build things up, which either results in a valid command or something that's not going to work. One simple test I'm doing with the code above is just entering "cd mydir mydir". Bison parses CD and WORD and goes "hey! this is a command, put it to the top!". Then the next token it finds is just WORD, which has no rule, and then it reports an error. I want it to read the whole line and realize CD WORD WORD is not a rule, and then report an error. I think I'm missing something obvious and would greatly appreciate any help - thanks! Also - I've tried using input command NEWLINE or something similar, but it still pushes CD WORD to the top as a command and then parses the extra WORD separately.

How to get Nokogiri to ignore HTML elements that don't exist

- by user296507

Any idea how I can get the code below to produce this output? 1 - 2 - B I'm getting this error: "undefined method `text' for nil:NilClass (NoMethodError)", I think because table 1 does not have the element 'td class=r2' in it. require 'rubygems' require 'nokogiri' require 'open-uri' doc = Nokogiri::HTML.parse(<<-eohtml) <table class="t1"> <tbody> <tr> <td class="r1">1</td> </tr> </tbody> </table> <table class="t2"> <tbody> <tr> <td class="r1">2</td> <td class="r2">B</td> </tr> </tbody> </table> eohtml doc.css('tbody > tr').each do |n| r1 = n.at_css(".r1").text r2 = n.at_css(".r2").text puts "#{r1} - #{r2}" end

Counting total sum of each value in one column w.r.t another in Perl

- by sfactor

I have tab delimited data with multiple columns. I have OS names in column 31 and data bytes in columns 6 and 7. What I want to do is count the total volume of each unique OS. So, I did something in Perl like this: #!/usr/bin/perl use warnings; my @hhfilelist = glob "*.txt"; my %count = (); for my $f (@hhfilelist) { open F, $f || die "Cannot open $f: $!"; while (<F>) { chomp; my @line = split /\t/; # counting volumes in col 6 and 7 for 31 $count{$line[30]} = $line[5] + $line[6]; } close (F); } my $w = 0; foreach $w (sort keys %count) { print "$w\t$count{$w}\n"; } So, the result would be something like Windows 100000 Linux 5000 Mac OSX 15000 Android 2000 But there seems to be some error in this code because the resulting values I get aren't as expected. What am I doing wrong?
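One likely cause is that the Perl loop assigns with = where it should accumulate with +=, so each OS ends up with only the last row's value. Separately, open F, $f || die ... binds || to $f, so the die never fires on a failed open. A small Python sketch of the accumulation, using made-up sample rows instead of the real files (make_row and the sample values are purely illustrative):

```python
from collections import defaultdict

def total_volume_by_os(lines, os_col=30, byte_cols=(5, 6)):
    """Sum the byte columns per OS name; column indexes are zero-based."""
    totals = defaultdict(int)
    for raw in lines:
        fields = raw.rstrip('\n').split('\t')
        if len(fields) <= os_col:
            continue  # skip short or malformed rows
        # Accumulate (+=) instead of overwriting (=).
        totals[fields[os_col]] += sum(int(fields[c]) for c in byte_cols)
    return dict(totals)

def make_row(os_name, bytes_a, bytes_b):
    """Build a hypothetical 31-column tab-delimited row for the demo."""
    fields = ['x'] * 31
    fields[5], fields[6], fields[30] = str(bytes_a), str(bytes_b), os_name
    return '\t'.join(fields) + '\n'

sample = [make_row('Linux', 100, 50),
          make_row('Windows', 10, 5),
          make_row('Linux', 1, 2)]
print(total_volume_by_os(sample))
```

In the Perl script the equivalent change would be $count{$line[30]} += $line[5] + $line[6];.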

getElementsByClassName not working on parsed html data in greasemonkey

- by Sid

Hi my code is as such var xhReq = new XMLHttpRequest(); xhReq.open("GET", linksRaw, false); xhReq.send(null); var serverResponse = xhReq.responseText; var tempDiv = document.createElement('div'); tempDiv.innerHTML = serverResponse.replace(/<script(.|\s)*?\/script>/g, ''); var plzWork = tempDiv.getElementsByClassName('organizationID').innerHTML; console.log(plzWork); The value of 'plzWork' :-) which is logged to the firebug console is always 'undefined' while the link code is <a class="organisationID" href="orglists.htm">Partner Organisations</a> I'm writing this script in the latest versions of Greasemonkey and FF 3.6 Thanks

PHP Regex to match all-caps lines with occasional hyphens.

- by Yaaqov

I'm trying to convert an existing PHP Regular Expression match case to apply to a slightly different style of document. Here's the original style of the document: **FOODS - TYPE A** ___________________________________ **PRODUCT** 1) Mi Pueblito Queso Fresco Authentic Mexican Style Fresh Cheese; 2) La Fe String Cheese **CODE** Sell by date going back to February 1, 2009 And here is the successfully running PHP Regex match code, which only returns "true" if the line is surrounded by asterisks, and stores each side of the "-" as $m[1] and $m[2], respectively. if ( preg_match('#^\*\*([^-]+)(?:-(.*))?\*\*$#', $line, $m) ) { // only for **header - subheader** $m[2] is set. if ( isset($m[2]) ) { return array(TYPE_HEADER, array(trim($m[1]), trim($m[2]))); } else { return array(TYPE_KEY, array($m[1])); } } So, for line 1: $m[1] = "FOODS" AND $m[2] = "TYPE A"; Line 2 would be skipped; Line 3: $m[1] = "PRODUCT", etc. The question: How would I rewrite the above regex match if the headers did not have the asterisks, but were still all-caps and at least 4 characters long? For example: FOODS - TYPE A ___________________________________ PRODUCT 1) Mi Pueblito Queso Fresco Authentic Mexican Style Fresh Cheese; 2) La Fe String Cheese CODE Sell by date going back to February 1, 2009 Thank you.
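A possible pattern for the asterisk-free headers, sketched in Python rather than PHP so it can be run on its own; it uses only constructs (lookahead, lazy quantifier, optional group) that PCRE also supports. It assumes "all-caps" means uppercase ASCII letters and spaces only, and that the 4-character minimum applies to the whole trimmed line; the classify function and its return tags are illustrative stand-ins for TYPE_HEADER and TYPE_KEY:

```python
import re

HEADER = re.compile(
    r"^(?=.{4,}$)"                  # whole line is at least 4 characters
    r"([A-Z][A-Z ]*?)"              # header text: capitals and spaces
    r"(?:\s*-\s*([A-Z][A-Z ]*))?$"  # optional "- SUBHEADER" part
)

def classify(line):
    """Return ('header', h, sub), ('key', h), or None for non-header lines."""
    m = HEADER.match(line.strip())
    if m is None:
        return None
    if m.group(2) is not None:
        return ('header', m.group(1), m.group(2))
    return ('key', m.group(1))

for text in ["FOODS - TYPE A", "_" * 35, "PRODUCT",
             "1) Mi Pueblito Queso Fresco", "CODE"]:
    print(repr(text), classify(text))
```

Headers containing digits or punctuation would need the character classes widened.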

Best way to get back to using the power of lxml after having to use a regex to find something in an

- by PyNEwbie

I am trying to rip some text out of a large number of html documents (numbers in the hundreds of thousands). The documents are really forms but they are prepared by a very large group of different organizations so there is significant variation in how they create the document. For example, the documents are divided into chapters. I might want to extract the contents of Chapter 5 from every document so I can analyze the content of the chapter. Initially I thought this would be easy but it turns out that the authors might use a set of non-nested tables throughout the document to hold the content so that Chapter n could be displayed using td tags inside a table. Or they might use other elements such as p tags H tags, div tags or any other block level element. After trying repeatedly to use lxml to help me identify the beginning and end of each chapter I have determined that it is a lot cleaner to use a regular expression because in every case, no matter what the enclosing html element is the chapter label is always in the form of >Chapter # It is a little more complicated in that there might be some white space or non-breaking space represented in different ways ( or or just spaces). Nonetheless it was trivial to write a regular expression to identify the beginning of each section. (The beginning of one section is the end of the previous section.) But now I want to use lxml to get the text out. My thought is that I have really no choice but to walk along my string to find the close tag for the element that encloses the text I am using to find the relevant section. That is here is one example where the element holding the Chapter name is a div <div style="DISPLAY: block; MARGIN-LEFT: 0pt; TEXT-INDENT: 0pt; MARGIN-RIGHT: 0pt" align="left"><font style="DISPLAY: inline; FONT-WEIGHT: bold; FONT-SIZE: 10pt; FONT-FAMILY: Times New Roman">Chapter 1.   
Our Beginnings.</font></div> So I am imagining that I would begin at the location where I found the match for chapter 1 and set up a regular expressions to find the next </div|</td|</p|</h1 . . . So at this point I have identified the type of element holding my chapter heading I can use the same logic to find all of the text that is within that element that is set up a regular expression to help me mark from >Chapter 1.   Our Beginnings.< So I have identified where my Chapter 1 begins I can do the same for chapter 2 (which is where Chapter 1 ends) Now I am imagining that I am going to snip the document beginning at the opening of the element that I identified as the element the indicates where chapter 1 begins and ending just before the opening of the element that I identified as the element that indicates where Chapter 2 begins. The string that I have identified will then be fed to lxml to use its power to get the content. I am going to all of this trouble because I have read over and over - never use a regular expression to extract content from html documents and I have not hit on a way to be as accurate with lxml to identify the starting and ending locations for the text I want to extract. For example, I can never be certain that the subtitle of Chapter 1 is Our Beginnings it could be Our Red Canary. Let me say that I spent two solid days trying with lxml to be confident that I had the beginning and ending elements and I could only be accurate <60% of the time but a very short regular expression has given me better than 95% success. I have a tendency to make things more complicated than necessary so I am wondering if anyone has seen or solved a similar problems and if they had an approach (not the details mind you) that they would like to offer.

vb.net - how do I parse a percentage value from a grid cell?

- by Bob Palin

I'm trying to parse a formatted percentage value back from a datagridviewcell that has been set with the "P" formatter: double percent = 0.96 cell.value = percent.tostring("p") gives me a displayed value of 96 % which is what I want. Now what I'm looking for is something like what is provided for the other formatting strings - NumberStyles.HexNumber, Currency etc so that I can do this double percent= double.parse( cell.value, NumberStyles.Percent ) which would give me a percent value of .96 I have scoured the .net documentation but can't find any sort of AllowPercent style like the others - is there one? Bob Palin p.s. I see there is another question here like this and tried to expand on it in that thread, but was deleted by a moderator and told to post a new question.

Linux: shell builtin string matching

- by gmatt

I am trying to become more familiar with the builtin string matching available in shells on Linux. I came across this guy's post, and he showed an example: a="abc|def" echo ${a#*|} # will yield "def" echo ${a%|*} # will yield "abc" I tried it out and it does what it's advertised to do, but I don't understand what the $,{},#,*,| are doing. I tried looking for some reference online or in the manuals but I couldn't find anything. Can anyone explain to me what's going on here?
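For reference, this is shell parameter expansion: ${a#pattern} removes the shortest match of pattern from the start of $a, ${a%pattern} removes the shortest match from the end, * is the usual glob wildcard, and | is just a literal character here. As a rough illustration of the two operations in the example, here is a Python sketch (the function names are made up, and it only covers the "wildcard plus a fixed separator" case, not general glob patterns):

```python
def strip_shortest_prefix_through(value, sep):
    """Mimic ${value#*sep}: drop everything up to and including the first sep."""
    _head, found, tail = value.partition(sep)
    return tail if found else value   # no match leaves the value unchanged

def strip_shortest_suffix_from(value, sep):
    """Mimic ${value%sep*}: drop the last sep and everything after it."""
    head, found, _tail = value.rpartition(sep)
    return head if found else value   # no match leaves the value unchanged

a = "abc|def"
print(strip_shortest_prefix_through(a, "|"))
print(strip_shortest_suffix_from(a, "|"))
```

The doubled forms ${a##pattern} and ${a%%pattern} remove the longest match instead of the shortest.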

Parse HTML with CSS or XPath selectors?

- by ovolko

My goal is to parse HTML with lxml, which supports both XPath and CSS selectors. I can tie my model properties either to CSS or XPath, but I'm not sure which one would be the best, e.g. less fuss when HTML layout is changed, simpler expressions, greater extraction speed. What would you choose in such a situation?

Jquery to find a name on html page and add hyperlink

- by mikejones12

Here is my example: I have a website that contains the following: <body> Jim Nebraska zipcode 65437 Tony lives in California his zipcode is 98708 </body> I would like to be able to search for zip codes on the page and wrap them with hyperlinks like: <body> Jim Nebraska zipcode <a href="/65437.htm">65437</a> Tony lives in California his zipcode is <a href="/98708.htm">98708</a> </body> Could I use a regex selector to find the string and then wrap the string, or replace it with the new hyperlink? I am new to jQuery and looking for someone to point me in the right direction. Thank you, Mike
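The core of this is a regex replace over the text. A minimal sketch of that step in Python (the question asks about jQuery; the same \b\d{5}\b pattern works with JavaScript's String.replace). It assumes the input is plain text with no existing markup, and link_zip_codes is a hypothetical helper name:

```python
import re

# Exactly five digits, not embedded in a longer run of digits or letters.
ZIP = re.compile(r"\b\d{5}\b")

def link_zip_codes(text):
    """Wrap every standalone 5-digit number in a link to /<zip>.htm."""
    return ZIP.sub(
        lambda m: '<a href="/{0}.htm">{0}</a>'.format(m.group(0)),
        text,
    )

print(link_zip_codes("Jim Nebraska zipcode 65437 "
                     "Tony lives in California his zipcode is 98708"))
```

Running a replace like this over HTML that already contains attributes or links would need extra care, since five-digit numbers inside tags would be rewritten too.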

What's an easy and fast way to put returned XML data into a dict?

- by ensnare

I'm trying to take the data returned from http://ipinfodb.com/ip_query.php?ip=74.125.45.100&timezone=true and put it into a dict in a fast and easy way. What's the best way to do this? Thanks.
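If the response is a flat XML document (one root element whose children each hold a single value), the standard library is enough. A minimal sketch; the SAMPLE document below is invented for illustration and is not the service's actual response format:

```python
import xml.etree.ElementTree as ET

# Hypothetical response body; the real element names may differ.
SAMPLE = """<Response>
  <Ip>74.125.45.100</Ip>
  <Status>OK</Status>
  <CountryCode>US</CountryCode>
</Response>"""

def xml_to_dict(xml_text):
    """Map each direct child tag of the root element to its text content."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or '').strip() for child in root}

print(xml_to_dict(SAMPLE))
```

Nested elements or repeated tags would need a recursive version, since a flat dict keeps only one value per tag.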

Parse usable Street Address, City, State, Zip from a string

- by Rob Allen

Problem: I have an address field from an Access database which has been converted to SQL Server 2005. This field has everything all in one field. I need to parse out the individual sections of the address into their appropriate fields in a normalized table. I need to do this for approximately 4,000 records and it needs to be repeatable. Here are the rules for this exercise: 1 - no whining about how this should have been separate fields in the first place; we are often confronted with less than ideal situations and have to make the best of them 2 - for this post, use any language you want 3 - feel free to play code golf 4 - assume an address in the US (for now) 5 - assume that the input string will sometimes contain an addressee (the person being addressed) and/or a second street address (i.e. Suite B) 6 - states may be abbreviated 7 - zip code could be standard 5 digit or zip+4 8 - there are typos in some instances UPDATE: In response to the questions posed: standards were not universally followed, I need to store the individual values, not just geocode, and "errors" means typos (corrected above). Sample Data: A. P. Croll & Son 2299 Lewes-Georgetown Hwy, Georgetown, DE 19947 11522 Shawnee Road, Greenwood DE 19950 144 Kings Highway, S.W. Dover, DE 19901 Intergrated Const. Services 2 Penns Way Suite 405 New Castle, DE 19720 Humes Realty 33 Bridle Ridge Court, Lewes, DE 19958 Nichols Excavation 2742 Pulaski Hwy Newark, DE 19711 2284 Bryn Zion Road, Smyrna, DE 19904 VEI Dover Crossroads, LLC 1500 Serpentine Road, Suite 100 Baltimore MD 21 580 North Dupont Highway Dover, DE 19901 P.O. Box 778 Dover, DE 19903

How to write a Compiler in C for C

- by Kerb_z

I want to write a compiler for C. This is a college project required by my university. I am an intermediate programmer in C, with an understanding of data structures. I know a compiler has the following parts: 1. Lexer 2. Parser 3. Intermediate Code Generator 4. Optimizer 5. Code Generator I want to begin with the lexer and move on to the parser. I am consulting the following book: Compilers: Principles, Techniques, and Tools by Alfred V. Aho, Ravi Sethi, Jeffrey D. Ullman. The thing is that this book is highly theoretical and perplexing to me. I really appreciate the authors, but I am not able to begin my project; it is as if I am blind to where to go. I need guidance, please help.

How to write a bison grammar for WDI?

- by Rizo

I need some help in bison grammar construction. From my another question: I'm trying to make a meta-language for writing markup code (such as xml and html) wich can be directly embedded into C/C++ code. Here is a simple sample written in this language, I call it WDI (Web Development Interface): /* * Simple wdi/html sample source code */ #include <mySite> string name = "myName"; string toCapital(string str); html { head { title { mySiteTitle; } link(rel="stylesheet", href="style.css"); } body(id="default") { // Page content wrapper div(id="wrapper", class="some_class") { h1 { "Hello, " + toCapital(name) + "!"; } // Lists post ul(id="post_list") { for(post in posts) { li { a(href=post.getID()) { post.tilte; } } } } } } } Basically it is a C source with a user-friendly interface for html. As you can see the traditional tag-based style is substituted by C-like, with blocks delimited by curly braces. I need to build an interpreter to translate this code to html and posteriorly insert it into C, so that it can be compiled. The C part stays intact. Inside the wdi source it is not necessary to use prints, every return statement will be used for output (in printf function). The program's output will be clean html code. So, for example a heading 1 tag would be transformed like this: h1 { "Hello, " + toCapital(name) + "!"; } // would become: printf("<h1>Hello, %s!</h1>", toCapital(name)); My main goal is to create an interpreter to translate wdi source to html like this: tag(attributes) {content} = <tag attributes>content</tag> Secondly, html code returned by the interpreter has to be inserted into C code with printfs. Variables and functions that occur inside wdi should also be sorted in order to use them as printf parameters (the case of toCapital(name) in sample source). 
Here are my flex/bison files: id [a-zA-Z_]([a-zA-Z0-9_])* number [0-9]+ string \".*\" %% {id} { yylval.string = strdup(yytext); return(ID); } {number} { yylval.number = atoi(yytext); return(NUMBER); } {string} { yylval.string = strdup(yytext); return(STRING); } "(" { return(LPAREN); } ")" { return(RPAREN); } "{" { return(LBRACE); } "}" { return(RBRACE); } "=" { return(ASSIGN); } "," { return(COMMA); } ";" { return(SEMICOLON); } \n|\r|\f { /* ignore EOL */ } [ \t]+ { /* ignore whitespace */ } . { /* return(CCODE); Find C source */ } %% %start wdi %token LPAREN RPAREN LBRACE RBRACE ASSIGN COMMA SEMICOLON CCODE QUOTE %union { int number; char *string; } %token <string> ID STRING %token <number> NUMBER %% wdi : /* empty */ | blocks ; blocks : block | blocks block ; block : head SEMICOLON | head body ; head : ID | ID attributes ; attributes : LPAREN RPAREN | LPAREN attribute_list RPAREN ; attribute_list : attribute | attribute COMMA attribute_list ; attribute : key ASSIGN value ; key : ID {$$=$1} ; value : STRING {$$=$1} /*| NUMBER*/ /*| CCODE*/ ; body : LBRACE content RBRACE ; content : /* */ | blocks | STRING SEMICOLON | NUMBER SEMICOLON | CCODE ; %% I am having difficulties on defining a proper grammar for the language, specially in splitting WDI and C code . I just started learning language processing techniques so I need some orientation. Could someone correct my code or give some examples of what is the right way to solve this problem?

Evaluating mathematical expressions in Python

- by vander

Hi, I want to tokenize a given mathematical expression into a binary tree like this: ((3 + 4 - 1) * 5 + 6 * -7) / 2

            '/'
           /   \
          +     2
        /   \
       *     *
      / \   / \
     -   5 6   -7
    / \
   +   1
  / \
 3   4

Is there any pure Python way to do this? Like passing it as a string to Python and then getting it back as a tree like the one above. Thanks.
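One pure-Python route is the standard library's ast module, which already parses arithmetic expressions into a tree. A minimal sketch that converts that tree into nested tuples of the form (operator, left, right); the function names and the tuple layout are choices made for this illustration, and unary minus appears as ('neg', 7) rather than as a -7 leaf:

```python
import ast

# Map the ast operator classes we accept to their symbols.
_BINARY = {ast.Add: '+', ast.Sub: '-', ast.Mult: '*', ast.Div: '/'}

def to_tree(node):
    """Convert an ast node into nested tuples: (op, left, right)."""
    if isinstance(node, ast.Expression):
        return to_tree(node.body)
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
        return (_BINARY[type(node.op)], to_tree(node.left), to_tree(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return ('neg', to_tree(node.operand))
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    raise ValueError('unsupported expression element: %r' % node)

def parse_expression(text):
    """Parse an arithmetic expression string into a nested-tuple tree."""
    return to_tree(ast.parse(text, mode='eval'))

print(parse_expression("((3 + 4 - 1) * 5 + 6 * -7) / 2"))
```

Because anything other than numbers and the four basic operators raises ValueError, the sketch only parses the input and never evaluates it.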

