pdf parsing - Page 30 - Developer IT

Parsing data without HTML tags

- by user296507

Hi, I need to extract the actual phone number from the HTML listed below, but I'm not really sure how to do it using Nokogiri CSS since there are no HTML tags around it. When I use at_css(.phonetitle), it only parses "Phone" and not the number.

    <div class="detail"> Corner of Toorak Road and Chapel Street, South Yarra Phone 95435 34341 </div>

Read the article

Parsing a String into date with pattern:"dd/MM/yyyy"

- by kawtousse

Hi, I want to convert a date from the format MM/dd/yyyy (for example, 04/29/2010) to 29/04/2010 so that it can be inserted into a MySQL database field of type DATE. I have this code:

    String dateimput = request.getParameter("datepicker");
    DateFormat df = new SimpleDateFormat("dd/MM/yyyy");
    Date dt = null;
    try {
        dt = df.parse(dateimput);
        System.out.println("date imput is:" + dt);
    } catch (ParseException e) {
        e.printStackTrace();
    }

but it gives me these errors:

1. "date imput is:Fri May 04 00:00:00 CEST 2012" (this is not the value that was entered).
2. A mismatch with the MySQL DATE type.

I cannot pinpoint the error. Please help.
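The output in the question is consistent with a pattern mismatch: the input is month-first (04/29/2010) but the pattern is day-first, and Java's SimpleDateFormat is lenient by default, so "month 29" rolls over into May 2012. A minimal Python sketch of the same conversion, assuming the input really is MM/dd/yyyy; the function name `to_mysql_date` is made up for illustration:

```python
from datetime import datetime

def to_mysql_date(text, input_format="%m/%d/%Y"):
    """Parse a date string strictly and return it as YYYY-MM-DD,
    the layout MySQL expects for DATE literals."""
    parsed = datetime.strptime(text, input_format)  # raises ValueError on a mismatch
    return parsed.strftime("%Y-%m-%d")

print(to_mysql_date("04/29/2010"))  # 2010-04-29
```

Unlike the lenient Java parser, `strptime` rejects the day-first pattern for this input instead of silently rolling the month over, which makes the mismatch visible immediately.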

Read the article

Parsing XML data with Namespaces in PHP

- by osbmedia

I'm trying to work with this XML feed that uses namespaces, and I'm not able to get past the colon in the tags. Here's what the XML feed looks like:

    <r25:events pubdate="2010-05-19T13:58:08-04:00">
      <r25:event xl:href="event.xml?event_id=328" id="BRJDMzI4" crc="00000022" status="est">
        <r25:event_id>328</r25:event_id>
        <r25:event_name>Testing 09/2005-08/2006</r25:event_name>
        <r25:alien_uid/>
        <r25:event_priority>0</r25:event_priority>
        <r25:event_type_id xl:href="evtype.xml?type_id=105">105</r25:event_type_id>
        <r25:event_type_name>CABINET</r25:event_type_name>
        <r25:node_type>C</r25:node_type>
        <r25:node_type_name>cabinet</r25:node_type_name>
        <r25:state>1</r25:state>
        <r25:state_name>Tentative</r25:state_name>
        <r25:event_locator>2005-AAAAMQ</r25:event_locator>
        <r25:event_title/>
        <r25:favorite>F</r25:favorite>
        <r25:organization_id/>
        <r25:organization_name/>
        <r25:parent_id/>
        <r25:cabinet_id xl:href="event.xml?event_id=328">328</r25:cabinet_id>
        <r25:cabinet_name>cabinet 09/2005-08/2006</r25:cabinet_name>
        <r25:start_date>2005-09-01</r25:start_date>
        <r25:end_date>2006-08-31</r25:end_date>
        <r25:registration_url/>
        <r25:last_mod_dt>2008-02-27T14:22:43-05:00</r25:last_mod_dt>
        <r25:last_mod_user>abc00296004</r25:last_mod_user>
      </r25:event>
    </r25:events>

And here is what I'm using for code. I'm trying to throw these into a bunch of arrays so I can format the output however I want:

    <?php
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, "http://somedomain.com/blah.xml");
    curl_setopt ($ch, CURLOPT_HTTPHEADER, Array("Content-Type: text/xml"));
    curl_setopt($ch, CURLOPT_USERPWD, "username:password");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $output = curl_exec($ch);
    curl_close($ch);
    $xml = new SimpleXmlElement($output);
    foreach ($xml->events->event as $entry){
        $dc = $entry->children('http://www.collegenet.com/r25');
        echo $entry->event_name . " ";
        echo $entry->event_id . " ";
    }

Read the article

IE 8 html parsing error message.

- by user48408

I'm experiencing the problem outlined in this KB article: http://support.microsoft.com/kb/927917 . Sorry, I can't hyperlink because I don't have enough points!

"This problem occurs because a child container HTML element contains script that tries to modify the parent container element of the child container. The script tries to modify the parent container element by using either the innerHTML method or the appendChild method."

The difficulty I'm having diagnosing the source of my problem is twofold:

1) This is only happening on some client machines (all are running IE8) and not others. How/why only some?
2) I don't have any scripts which modify the innerHTML or call appendChild on any DOM elements. I do have server-side code which modifies properties on ASP.NET server controls. (Essentially all that's happening is that a panel control containing some more controls is made visible or invisible on a button click.) Would these in turn set the innerHTML property of the client-rendered control?

Read the article

Parsing a website

- by Phenom

I want to make a program that takes a website address as user input. The program then goes to that website, downloads it, and parses the information inside. It outputs a new HTML file using the information from the website. Specifically, the program will take certain links from the website, put those links in the output HTML file, and discard everything else. Right now I just want to make it work for websites that don't require a login, but later on I want it to work for sites where you have to log in, so it will have to be able to deal with cookies. Later on I'll also want the program to be able to explore certain links and download information from those other sites. What are the best programming languages or tools to do this?

Read the article

Haskell - Parsec Parsing element

- by Martin

I'm using Text.ParserCombinators.Parsec and Text.XHtml to parse an input like this:

    This is the first paragraph example\n with two lines\n \n And this is the second paragraph\n

And my output should be:

    This is the first paragraph example\n with two lines\n And this is the second paragraph\n

I defined:

    line = do { ; t <- manyTill (anyChar) newline
              ; return t }
    paragraph = do { t <- many1 (line)
                   ; return ( p << t ) }

But it returns:

    This is the first paragraph example\n with two lines\n\n And this is the second paragraph\n

What is wrong? Any ideas? Thanks!

Read the article

How to generate a PDF from a view using media=print for styles

- by Riderman de Sousa Barbosa

Most of the questions on Stack Overflow and other forums show how to generate views and send them by email. My goal, however, is to generate a PDF from a view rendered with the media=print styles and send it as an email attachment. I have a view that displays a report, and I use print CSS to display this report in a print format (basically, I display some elements and hide others). How can I generate a PDF from this view (with media=print formatting) and send it by email as an attachment? I am using ActionMailer to send emails and iTextSharp to generate PDFs.

Read the article

Parsing Chunk of Data into Hash of Array With Perl

- by neversaint

I have data that looks like this:

    #info
    #info2
    1:SRX004541
    Submitter: UT-MGS, UT-MGS
    Study: Glossina morsitans transcript sequencing project(SRP000741)
    Sample: Glossina morsitans(SRS002835)
    Instrument: Illumina Genome Analyzer
    Total: 1 run, 8.3M spots, 299.9M bases
    Run #1: SRR016086, 8330172 spots, 299886192 bases

    2:SRX004540
    Submitter: UT-MGS
    Study: Anopheles stephensi transcript sequencing project(SRP000747)
    Sample: Anopheles stephensi(SRS002864)
    Instrument: Solexa 1G Genome Analyzer
    Total: 1 run, 8.4M spots, 401M bases
    Run #1: SRR017875, 8354743 spots, 401027664 bases

    3:SRX002521
    Submitter: UT-MGS
    Study: Massive transcriptional start site mapping of human cells under hypoxic conditions.(SRP000403)
    Sample: Human DLD-1 tissue culture cell line(SRS001843)
    Instrument: Solexa 1G Genome Analyzer
    Total: 6 runs, 27.1M spots, 977M bases
    Run #1: SRR013356, 4801519 spots, 172854684 bases
    Run #2: SRR013357, 3603355 spots, 129720780 bases
    Run #3: SRR013358, 3459692 spots, 124548912 bases
    Run #4: SRR013360, 5219342 spots, 187896312 bases
    Run #5: SRR013361, 5140152 spots, 185045472 bases
    Run #6: SRR013370, 4916054 spots, 176977944 bases

What I want to do is create a hash of arrays, with the first line of each chunk as the key and the SR## part of the lines matching "^Run" as the array members:

    $VAR = { 'SRX004541' => ['SRR016086'], # etc }

But my construct doesn't work, and I don't see why. There must also be a better way to do it.

    use Data::Dumper;
    my %bighash;
    my $head = "";
    my @temp = ();
    while ( <> ) {
        chomp;
        next if (/^\#/);
        if ( /^\d{1,2}:(\w+)/ ) {
            print "$1\n";
            $head = $1;
        } elsif (/^Run \#\d+: (\w+),.*/){
            print "\t$1\n";
            push @temp, $1;
        } elsif (/^$/) {
            push @{$bighash{$head}}, [@temp];
            @temp =();
        }
    }
    print Dumper \%bighash ;
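Two things in the Perl above look like plausible culprits: `push @{$bighash{$head}}, [@temp]` pushes an array reference, so each value becomes an array of arrays, and the last chunk is only stored if the input ends with a blank line. The grouping idea can be sketched in Python by appending each run directly under the current header, so no blank-line flush is needed. The one-field-per-line layout and the name `runs_by_experiment` are assumptions for illustration:

```python
import re
from collections import defaultdict

def runs_by_experiment(lines):
    """Map each chunk header (e.g. SRX004541) to the run IDs listed under it."""
    groups = defaultdict(list)
    head = None
    for line in lines:
        line = line.strip()
        if line.startswith("#"):        # skip the "#info" comment lines
            continue
        m = re.match(r"\d+:(\w+)", line)
        if m:                           # a new chunk starts here
            head = m.group(1)
            groups[head]                # register the key even if no runs follow
            continue
        m = re.match(r"Run #\d+: (\w+),", line)
        if m and head is not None:      # append directly; no end-of-chunk flush
            groups[head].append(m.group(1))
    return dict(groups)

sample = """#info
#info2
1:SRX004541
Submitter: UT-MGS, UT-MGS
Run #1: SRR016086, 8330172 spots, 299886192 bases

2:SRX004540
Run #1: SRR017875, 8354743 spots, 401027664 bases
"""
print(runs_by_experiment(sample.splitlines()))
```

Because runs are attached as soon as they are seen, the result does not depend on whether the file ends with a blank line.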

Read the article

Creating a gradient fill in a PDF file using reportlab

- by Adam Tegen

Is it possible to create a gradient fill in a PDF using ReportLab (python)?

Read the article

protocol parsing in c

- by nomad.alien

I have been playing around with trying to implement some protocol decoders, but each time I run into a "simple" problem, and I feel the way I am solving it is not optimal and there must be a better way to do things. I'm using C. Currently I'm using some canned data and reading it in as a file, but later on it would come in via TCP or UDP.

Here's the problem. I'm currently playing with a binary protocol at work. All fields are 8 bits long. The first field (8 bits) is the packet type. So I read in the first 8 bits and, using a switch/case, I call a function to read in the rest of the packet, as I then know its size and structure. BUT... some of these packets have nested packets inside them, so when I encounter that specific packet I have to read another 8-16 bytes and have another switch/case to see what the next packet type is, and on and on. (Luckily the packets are only nested 2 or 3 deep.) Only once I have the whole packet decoded can I hand it over to my state machine for processing.

I guess this can be a more general question as well. How much data do you have to read at a time from the socket? As much as possible? As much as what is "similar" in the protocol headers? So even though this protocol is fairly basic, my code is a whole bunch of switch/case statements and I do a lot of reading from the file/socket, which I feel is not optimal. My main aim is to make this decoder as fast as possible. To the more experienced people out there: is this the way to go, or is there a better way which I just haven't figured out yet? Is there an elegant solution to this problem?

Read the article

string parsing and substring in c

- by Josh

I'm trying to parse the string below in a good way so I can get the substring stringI-wantToGet:

    const char *str = "Hello \"FOO stringI-wantToGet BAR some other extra text";

str will vary in length but always follows the same pattern (FOO and BAR). What I had in mind was something like:

    const char *str = "Hello \"FOO stringI-wantToGet BAR some other extra text";
    char *probe, *pointer;
    probe = str;
    while(probe != '\n'){
        if(probe = strstr("\"FOO")!=NULL) probe++
        else probe = "";
        // Nulterm part
        if(pointer = strchr(probe, ' ')!=NULL) pointer = '\0';
        // not sure here, I was planning to separate it with \0's
    }

Any help will be appreciated.
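The extraction itself needs no loop: find the start marker, skip past it, then find the end marker from that position. A minimal Python sketch of that two-search idea follows; the function name `between` and the exact marker strings (including the surrounding spaces) are assumptions. In C the same shape would be two `strstr` calls plus a bounded copy.

```python
def between(text, start='"FOO ', end=" BAR"):
    """Return the text between the first `start` marker and the next `end`
    marker, or None if either marker is missing."""
    begin = text.find(start)
    if begin == -1:
        return None
    begin += len(start)            # skip past the start marker itself
    stop = text.find(end, begin)   # search only after the start marker
    if stop == -1:
        return None
    return text[begin:stop]

s = 'Hello "FOO stringI-wantToGet BAR some other extra text'
print(between(s))  # stringI-wantToGet
```

Returning None for a missing marker keeps the "not found" case distinct from an empty match.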

Read the article

Transform PDF to HTML, keep layout

- by Tgr

What methods are there to transform a PDF to HTML? It could be anything: an online service, software, or a library. (Open source preferred. For a library, PHP or Python would be preferred.) It has to keep the original layout (including page numbers, footnotes and such), keep the images (combining them into one single background image per page is acceptable) and keep the links. It should preferably output valid XHTML and clean up PDF features such as ligatures, but if some post-processing is required, I can live with that. Something with clean, relatively semantic HTML output would be great. The closest one I found was zamzar.org, but it choked on links. (Also, the HTML output is an ugly heap of absolutely positioned divs and needs post-processing because of encoding problems.)

Read the article

Parsing the Youtube API with DOM

- by Kirk

I'm using the YouTube API and I can retrieve the date information without a problem, but I don't know how to retrieve the description information. My code:

    <?php
    $v = "dQw4w9WgXcQ";
    $url = "http://gdata.youtube.com/feeds/api/videos/". $v;
    $doc = new DOMDocument;
    $doc->load($url);
    $pub = $doc->getElementsByTagName("published")->item(0)->nodeValue;
    $desc = $doc->getElementsByTagName("media:description")->item(0)->nodeValue;
    echo "Video Uploaded: ";
    echo date( "F jS, Y", strtotime( $pub ) );
    echo ' ';
    if (isset ($desc)) {
        echo "Description: ";
        echo $desc;
        echo ' ';
    }
    ?>

Here's a link to the feed: http://gdata.youtube.com/feeds/api/videos/dQw4w9WgXcQ?prettyprint=true

And the excerpt I don't know how to retrieve data from:

    <media:group>
      <media:description type='plain'>Music video by Rick Astley performing Never Gonna Give You Up. (C) 1987 PWL</media:description>
    </media:group>

Thanks in advance.
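Prefixed elements such as media:description are usually addressed by namespace URI rather than by the literal prefixed name; in PHP's DOM the analogous call is getElementsByTagNameNS. A small Python sketch of the namespace-aware lookup is below. The inline feed is a trimmed stand-in rather than the real response, the published value is a placeholder, and the Media RSS and Atom namespace URIs are assumed to be the ones the feed declares.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map used for lookups; the prefixes here are local to this script.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "media": "http://search.yahoo.com/mrss/",
}

# Trimmed stand-in for the feed; the published timestamp is a placeholder value.
feed = """<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:media="http://search.yahoo.com/mrss/">
  <published>2000-01-01T00:00:00.000Z</published>
  <media:group>
    <media:description type='plain'>Music video by Rick Astley performing Never Gonna Give You Up. (C) 1987 PWL</media:description>
  </media:group>
</entry>"""

root = ET.fromstring(feed)
published = root.findtext("atom:published", namespaces=NS)
description = root.findtext("media:group/media:description", namespaces=NS)
print(published)
print(description)
```

The lookup depends on the namespace URI, not on the prefix the document happens to use, so it keeps working if the feed renames its prefixes.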

Read the article

string parsing help

- by sprugman

I've got a string like this:

    #################### Section One ####################
    Data A
    Data B
    #################### Section Two ####################
    Data C
    Data D
    etc.

I want to parse it into something like:

    $arr(
        'Section One' => array('Data A', 'Data B'),
        'Section Two' => array('Data C', 'Data D')
    )

At first I tried this:

    $sections = preg_split("/(\r?\n)(\r?\n)#/", $file_content);

The problem is, the file isn't perfectly clean: sometimes there are different numbers of blank lines between the sections, or blank spaces between data rows. The section head pattern itself seems to be relatively consistent:

    #################### Section Title ####################

The number of #'s is probably consistent, but I don't want to count on it. The white space on the title line is pretty random. Once I have it split into sections, I think it'll be pretty straightforward, but any help writing a killer regex to get it there would be appreciated. (Or if there's a better approach than regex...)
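One alternative to a single splitting regex is a line-by-line pass that tolerates stray blank lines and uneven whitespace. The Python sketch below assumes each header sits on one line as "#### Title ####" with any number of #'s; the name `parse_sections` is hypothetical.

```python
import re

# One or more '#', optional spaces, the title, optional spaces, one or more '#'.
HEADER = re.compile(r"^#+\s*(.*?)\s*#+$")

def parse_sections(text):
    """Group non-blank data lines under the most recent section title."""
    sections = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue                      # ignore blank lines anywhere
        m = HEADER.match(stripped)
        if m:
            if m.group(1):                # a titled header opens a new section
                current = m.group(1)
                sections[current] = []
            continue                      # lines made only of '#' are skipped
        if current is not None:
            sections[current].append(stripped)
    return sections

messy = """
######## Section One   ####################

  Data A

Data B


#################### Section Two ####
Data C
   Data D
"""
print(parse_sections(messy))
```

Because blank lines are dropped before matching, the number of empty lines between sections no longer matters.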

Read the article

XML Parsing in Groovy strips attribute new lines

- by Bill James

I'm writing code where I retrieve XML from a web API, then parse that XML using Groovy. Unfortunately, it seems that both XmlParser and XmlSlurper for Groovy strip newline characters from the attributes of nodes when .text() is called. How can I get at the text of the attribute including the newlines? Sample code:

    def xmltest = '''
    <snippet>
      <preSnippet att1="testatt1" code="This is line 1
    This is line 2
    This is line 3" >
        <lines count="10" />
      </preSnippet>
    </snippet>'''
    def parsed = new XmlParser().parseText( xmltest )
    println "Parsed"
    parsed.preSnippet.each { pre -> println pre.attribute('code'); }
    def slurped = new XmlSlurper().parseText( xmltest )
    println "Slurped"
    slurped.children().each { preSnip -> println preSnip.@code.text() }

the output of which is:

    Parsed
    This is line 1 This is line 2 This is line 3
    Slurped
    This is line 1 This is line 2 This is line 3
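This behaviour is most likely not specific to Groovy: the XML specification's attribute-value normalization has conforming parsers replace literal line breaks inside attribute values with spaces, while a character reference such as `&#10;` survives as a real newline. The Python sketch below shows the same effect with the standard library parser; whether the web API can be changed to emit `&#10;` is an open assumption.

```python
import xml.etree.ElementTree as ET

# A literal newline inside the attribute value is normalized to a space by the parser.
literal = ET.fromstring('<preSnippet code="This is line 1\nThis is line 2"/>')

# A newline written as a character reference is preserved.
escaped = ET.fromstring('<preSnippet code="This is line 1&#10;This is line 2"/>')

print(repr(literal.get("code")))
print(repr(escaped.get("code")))
```

If the producer cannot be changed, moving the multi-line text into element content instead of an attribute is the other common way to keep line breaks intact.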

Read the article

String Parsing in C#

- by Betamoo

What is the most efficient way to parse a C# string of the form "(params (abc 1.3)(sdc 2.0)....)" into a struct of the form struct Params { double abc,sdc....; }? Thanks.

EDIT: The structure always has the same parameters (number and names), but the order is not guaranteed.
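Since the names are fixed but the order is not, one workable approach is to scan for "(name number)" pairs and assign each to the matching field by name. The Python sketch below illustrates that; only the two field names shown in the question are modelled, and `Params` and `parse_params` are illustrative names. No claim is made that this is the fastest option in C#.

```python
import re
from dataclasses import dataclass

@dataclass
class Params:
    abc: float = 0.0
    sdc: float = 0.0

# "(name number)" where number may carry a sign, fraction, or exponent.
PAIR = re.compile(r"\(\s*(\w+)\s+([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)\s*\)")

def parse_params(text):
    """Fill a Params instance from '(name value)' pairs in any order."""
    params = Params()
    for name, value in PAIR.findall(text):
        if hasattr(params, name):        # ignore names the struct does not have
            setattr(params, name, float(value))
    return params

print(parse_params("(params (sdc 2.0)(abc 1.3))"))
```

The outer "(params" opener never matches the pair pattern because it is followed by another parenthesis rather than a number, so it needs no special handling.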

Read the article

Parsing multiple files at a time in Perl

- by sfactor

I have a large data set (around 90GB) to work with. There are tab-delimited data files for each hour of each day, and I need to perform operations on the entire data set, for example getting the share of OSes, which are given in one of the columns. I tried merging all the files into one huge file and performing a simple count operation, but it was simply too big for the server's memory. So I guess I need to perform the operation one file at a time and then add the results up at the end. I am new to Perl and am especially naive about performance issues. How do I do such operations in a case like this? As an example, two columns of the file are:

    ID  OS
    1   Windows
    2   Linux
    3   Windows
    4   Windows

Let's do something simple: counting the share of the OSes in the data set. Each .txt file has millions of these lines, and there are many such files. What would be the most efficient way to operate on all of the files?
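The per-file plan described above works because only the running tallies need to stay in memory, never the rows themselves. A minimal Python sketch of that streaming count follows; it assumes each file starts with a header row naming the columns, and uses in-memory strings in place of the real hourly files. The function names are illustrative.

```python
import csv
import io
from collections import Counter

def count_column(streams, column="OS"):
    """Tally one column across many tab-delimited streams, one row at a time."""
    counts = Counter()
    for stream in streams:
        for row in csv.DictReader(stream, delimiter="\t"):
            counts[row[column]] += 1     # only the tallies are kept in memory
    return counts

def shares(counts):
    """Convert raw tallies into fractions of the total."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}

# In-memory stand-ins for two hourly files; real code would pass
# open(path, newline="") objects instead.
hour_1 = io.StringIO("ID\tOS\n1\tWindows\n2\tLinux\n")
hour_2 = io.StringIO("ID\tOS\n3\tWindows\n4\tWindows\n")
totals = count_column([hour_1, hour_2])
print(shares(totals))
```

Memory use grows with the number of distinct OS names rather than with the number of rows, which is what makes a 90GB input tractable.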

Read the article

Parsing XML won't display all items.

- by Nauman A

I have this code, but the toast won't display any message. What is wrong with my code? I can get the values from link and linknext, but title doesn't return any value. (I am not very experienced at writing code, so please suggest anything you like.)

    final Button button = (Button) findViewById(R.id.Button01);
    button.setOnClickListener(new View.OnClickListener() {
        public void onClick(View v) {
            // Perform action on click
            try {
                URL url = new URL( "http://somelink.com=" + Link.setFirst_link);
                DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
                DocumentBuilder db = dbf.newDocumentBuilder();
                Document doc = db.parse(new InputSource(url.openStream()));
                doc.getDocumentElement().normalize();
                NodeList nodeList = doc.getElementsByTagName("item");
                /** Assign textview array lenght by arraylist size */
                for (int i = 0; i < nodeList.getLength(); i++) {
                    Node node = nodeList.item(i);
                    Element fstElmnt = (Element) node;
                    NodeList nameList = fstElmnt.getElementsByTagName("link");
                    Element nameElement = (Element) nameList.item(0);
                    nameList = nameElement.getChildNodes();
                    String img = (((Node) nameList.item(0)).getNodeValue());
                    NodeList websiteList = fstElmnt.getElementsByTagName("linknext");
                    Element websiteElement = (Element) websiteList.item(0);
                    websiteList = websiteElement.getChildNodes();
                    String nextlink = (((Node) websiteList.item(0)).getNodeValue());
                    Link.setFirst_link = nextlink;
                    Drawable drawable = LoadImageFromWebOperations(img);
                    imgView.setImageDrawable(drawable);
                    NodeList titleList = fstElmnt.getElementsByTagName("title");
                    Element titleElement = (Element) titleList.item(0);
                    websiteList = titleElement.getChildNodes();
                    String title = (((Node) titleList.item(0)).getNodeValue());
                    Context context = getApplicationContext();
                    CharSequence text = title;
                    int duration = Toast.LENGTH_SHORT;
                    Toast toast = Toast.makeText(context, text, duration);
                    toast.show();
                }
            } catch (Exception e) {
                System.out.println("XML Pasing Excpetion = " + e);
            }
        }
    });
    /** Set the layout view to display */
    }

Here is the XML file:

    <?xml version="1.0"?>
    <maintag>
      <item>
        <link>http://image.com/357769.jpg?40</link>
        <linknext>http://www.image.com</linknext>
        <title>imagename</title>
      </item>
    </maintag>

Read the article

Parsing a simple file

- by Mike Graham

I have a file consisting of lines of the form Foo="Some information" Bar="More" Starting with such a string, what is the best way to extract "Some information" and "More" as strings? Foo and Bar are always exactly those names.
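For lines of this shape a single regular expression that captures name="value" pairs is usually enough. The Python sketch below assumes values never contain an escaped double quote; `parse_line` is an illustrative name.

```python
import re

# A word, an equals sign, then everything up to the next double quote.
PAIR = re.compile(r'(\w+)="([^"]*)"')

def parse_line(line):
    """Return a dict of name -> quoted value for one line."""
    return dict(PAIR.findall(line))

fields = parse_line('Foo="Some information" Bar="More"')
print(fields["Foo"])
print(fields["Bar"])
```

Looking values up by name means the code keeps working if Foo and Bar ever appear in the opposite order.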

Read the article

Parsing a text file with a fixed format in Java

- by EugeneP

Suppose I know a text file format, say, each line contains 4 fields like this:

    firstword secondword thirdword fourthword
    firstword2 secondword2 thirdword2 fourthword2
    ...

and I need to read it fully into memory. I can use this approach:

    open a text file
    while not EOF
        read line by line
        split each line by a space
        create a new object with four fields extracted from each line
        add this object to a Set

OK, but is there anything better, such as a dedicated third-party Java library? Something that lets us define the structure of each text line beforehand and parse the file with some function, like:

    thirdpartylib.setInputTextFileFormat("format.xml");
    thirdpartylib.parse(Set, "pathToFile")

?

Read the article

Parsing HTML with XPath and PHP

- by Peter

Is there a way (using XPath and PHP) to do the following (WITHOUT external XSLT files)?

- Remove all tables and their contents
- Remove everything after the first h1 tag
- Keep only paragraphs (INCLUDING their inner HTML (links, lists, etc))

I received an XSLT answer here, but I'm looking for XPath queries that don't require external files. Currently, I've got the HTML in question loaded into a SimpleXmlElement via:

    $doc = @DOMDocument::loadHTML($xml);
    $data = simplexml_import_dom($doc);

Now I need help with:

    $data = $data->xpath('??????');

I've been working on this one for several days to no avail. I really appreciate the help.

Edit: I don't particularly care what's inside the paragraphs, as I can use strip_tags to eliminate what I don't want. All I need to do is isolate the paragraphs from the rest of the source. I suppose a more specific, accurate requirement would be this: return only paragraphs (and their HTML contents) that aren't contained in tables, and only before the first h1 tag.

Read the article

PDF Viewer Showing Last Page...

- by steve

I have an ASP.NET app that writes a PDF to a file. Later, that file is opened in a window (standard Acrobat Reader) for viewing. No problems there. The weird part: the entire document loads as it should, but the Reader initially shows the last page of the document on the screen. The user must then scroll up to the first page. It doesn't happen all the time (about 50%) and occurs across several test computers. Is there a switch I'm supposed to use, either when creating the file or when displaying it, to tell the Reader to start displaying the document on the first page? Environment particulars: ASP.NET 3.5 (VB), WebSupergoo's ABCpdf .NET Pro 7 (the assembly that creates the PDF file), Windows Server 2008, IIS7. Thanks.

Read the article

String parsing with regular expressions

- by ed1t

I have the following string that I would like to parse into either a List or a String[]:

    (Test)(Testing (Value))

The end result should be "Test" and "Testing (Value)".
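Nested parentheses are awkward for a plain regular expression, so a depth counter is a common alternative: record where a top-level group opens and emit it when the depth returns to zero. A minimal Python sketch, with `top_level_groups` as an illustrative name:

```python
def top_level_groups(text):
    """Return the contents of each top-level (...) group, keeping nested
    parentheses inside a group untouched."""
    groups, depth, start = [], 0, None
    for i, ch in enumerate(text):
        if ch == "(":
            if depth == 0:
                start = i + 1            # content begins after the opening paren
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced ')' at index %d" % i)
            if depth == 0:
                groups.append(text[start:i])
    if depth != 0:
        raise ValueError("unbalanced '('")
    return groups

print(top_level_groups("(Test)(Testing (Value))"))  # ['Test', 'Testing (Value)']
```

The same loop translates directly to Java with a `List<String>` and `String.substring`.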

Read the article

Parsing a file with column data in Python

- by rejinacm

I have a file that contains the symbol table details. It's in the form of rows and columns. I need to extract the first and last columns. How can I do that?
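If the columns are separated by whitespace, splitting each row and taking the first and last fields is usually sufficient. The sketch below makes that assumption; the sample rows are invented placeholders, not real symbol table output, and `first_and_last_columns` is an illustrative name.

```python
def first_and_last_columns(lines):
    """Return (first, last) field pairs for every non-blank row."""
    pairs = []
    for line in lines:
        fields = line.split()            # split on any run of whitespace
        if fields:                       # skip blank rows
            pairs.append((fields[0], fields[-1]))
    return pairs

# Hypothetical rows for illustration only.
rows = ["main  FUNC  GLOBAL  0x0040", "", "count OBJECT LOCAL 0x0804"]
print(first_and_last_columns(rows))
```

For a real file, passing the open file object (`with open(path) as f: first_and_last_columns(f)`) processes it line by line without loading it all at once.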

Read the article

Android - Parsing XML with XPath

- by Ruben Deig Ramos

First of all, thanks to all the people who are going to spend a little time on this question. Second, sorry for my English (not my first language! :D). Well, here is my problem. I'm learning Android and I'm making an app which uses an XML file to store some info. I have no problem creating the file, but when trying to read the XML tags with XPath (DOM, XMLPullParser, etc. only gave me problems) I've only been able to read the first one. Let's see the code. Here is the XML file the app generates:

    <dispositivo>
      <id>111</id>
      <nombre>Name</nombre>
      <intervalo>300</intervalo>
    </dispositivo>

And here is the function which reads the XML file:

    private void leerXML() {
        try {
            XPathFactory factory=XPathFactory.newInstance();
            XPath xPath=factory.newXPath();
            // Load the XML into memory
            File xmlDocument = new File("/data/data/com.example.gps/files/devloc_cfg.xml");
            InputSource inputSource = new InputSource(new FileInputStream(xmlDocument));
            // Define expressions to find the value.
            XPathExpression tag_id = xPath.compile("/dispositivo/id");
            String valor_id = tag_id.evaluate(inputSource);
            id=valor_id;
            XPathExpression tag_nombre = xPath.compile("/dispositivo/nombre");
            String valor_nombre = tag_nombre.evaluate(inputSource);
            nombre=valor_nombre;
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

The app correctly gets the id value and shows it on the screen (the "id" and "nombre" variables are each assigned to a TextView), but "nombre" is not working. What should I change? :) Thanks for all your time and help. This site is quite helpful! P.S.: I've searched the whole site for an answer but didn't find any.
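A likely cause is that both evaluate calls share one InputSource backed by a FileInputStream: the first evaluation reads the stream to the end, so the second has nothing left to parse. Parsing the document once and running every query against the parsed tree avoids this; in Java that would mean building a Document first and evaluating against it. The Python sketch below shows the parse-once pattern, with the file content inlined as a string for illustration.

```python
import xml.etree.ElementTree as ET

xml_text = """<dispositivo>
  <id>111</id>
  <nombre>Name</nombre>
  <intervalo>300</intervalo>
</dispositivo>"""

# Parse once; every lookup below reuses the same in-memory tree.
root = ET.fromstring(xml_text)
device_id = root.findtext("id")
nombre = root.findtext("nombre")
intervalo = root.findtext("intervalo")
print(device_id, nombre, intervalo)
```

Once the tree is in memory, the number and order of lookups no longer matter, unlike queries that each consume a one-shot stream.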

Search Results

Search found 7251 results on 291 pages for 'pdf parsing'.
