Search Results

Search found 13 results on 1 page for 'xmlslurper'.

Page 1/1

  • replace XmlSlurper tag with arbitrary XML

    - by Misha Koshelev
    Dear All: I am trying to replace specific XmlSlurper tags with arbitrary XML strings. The best way I have managed to come up with to do this is: #!/usr/bin/env groovy import groovy.xml.StreamingMarkupBuilder def page=new XmlSlurper(new org.cyberneko.html.parsers.SAXParser()).parseText(""" <html> <head></head> <body> <one attr1='val1'>asdf</one> <two /> <replacemewithxml /> </body> </html> """.trim()) import groovy.xml.XmlUtil def closure closure={ bind,node-> if (node.name()=="REPLACEMEWITHXML") { bind.mkp.yieldUnescaped "<replacementxml>sometext</replacementxml>" } else { bind."${node.name()}"(node.attributes()) { mkp.yield node.text() node.children().each { child-> closure(bind,child) } } } } println XmlUtil.serialize( new StreamingMarkupBuilder().bind { bind-> closure(bind,page) } ) However, the only problem is the text() element seems to capture all child text nodes, and thus I get: <?xml version="1.0" encoding="UTF-8"?> <HTML>asdf<HEAD/> <BODY>asdf<ONE attr1="val1">asdf</ONE> <TWO/> <replacementxml>sometext</replacementxml> </BODY> </HTML> Any ideas/help much appreciated. Thank you! Misha p.s. Also, out of curiosity, if I change the above to the "Groovier" notation as follows, the groovy compiler thinks I am trying to access the ${node.name()} member of my test class. Is there a way to specify this is not the case while still not passing the actual builder object? Thank you! :) def closure closure={ node-> if (node.name()=="REPLACEMEWITHXML") { mkp.yieldUnescaped "<replacementxml>sometext</replacementxml>" } else { "${node.name()}"(node.attributes()) { mkp.yield node.text() node.children().each { child-> closure(child) } } } } println XmlUtil.serialize( new StreamingMarkupBuilder().bind { closure(page) } )
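
    A likely culprit: on a GPathResult, text() returns the concatenated text of the node and all of its descendants, which is why "asdf" reappears at the HTML and BODY levels in the output above. A minimal sketch of one workaround, reusing the names from the script in the question, is to yield text() only for leaf elements (note this drops mixed-content text on elements that also have element children). For the p.s.: the unqualified call resolves against the closure's owner (the script class); one common fix is to set closure.delegate = delegate and closure.resolveStrategy = Closure.DELEGATE_FIRST inside the bind block before calling it, so the builder handles the dynamic method names.

        def closure
        closure = { bind, node ->
            if (node.name() == "REPLACEMEWITHXML") {
                bind.mkp.yieldUnescaped "<replacementxml>sometext</replacementxml>"
            } else {
                bind."${node.name()}"(node.attributes()) {
                    // text() concatenates ALL descendant text, so only yield it
                    // for childless (leaf) elements to avoid the duplication
                    if (node.children().size() == 0) {
                        mkp.yield node.text()
                    }
                    node.children().each { child -> closure(bind, child) }
                }
            }
        }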

    Read the article

  • Groovy pretty print XmlSlurper output from HTML?

    - by Misha Koshelev
    Dear All: I am using several different versions to do this but all seem to result in this error: [Fatal Error] :1:171: The prefix "xmlns" cannot be bound to any namespace explicitly; neither can the namespace for "xmlns" be bound to any prefix explicitly. I load html as: // Load html file def fis=new FileInputStream("2.html") def html=new XmlSlurper(new org.cyberneko.html.parsers.SAXParser()).parseText(fis.text) Versions I've tried: http://johnrellis.blogspot.com/2009/08/hmmm_04.html import groovy.xml.StreamingMarkupBuilder import groovy.xml.XmlUtil def streamingMarkupBuilder=new StreamingMarkupBuilder() println XmlUtil.serialize(streamingMarkupBuilder.bind{mkp.yield html}) http://old.nabble.com/How-to-print-XmlSlurper%27s-NodeChild-with-indentation--td16857110.html // Output import groovy.xml.MarkupBuilder import groovy.xml.StreamingMarkupBuilder import groovy.util.XmlNodePrinter import groovy.util.slurpersupport.NodeChild def printNode(NodeChild node) { def writer = new StringWriter() writer << new StreamingMarkupBuilder().bind { mkp.declareNamespace('':node[0].namespaceURI()) mkp.yield node } new XmlNodePrinter().print(new XmlParser().parseText(writer.toString())) } Any advice? Thank you! Misha
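
    One workaround that is often suggested for this serialization error: turn off namespace processing on the NekoHTML SAX parser before handing it to XmlSlurper, so the slurped tree carries no xmlns declarations that StreamingMarkupBuilder then has to re-serialize. A sketch (the feature URI is the standard SAX namespaces feature):

        import groovy.xml.StreamingMarkupBuilder
        import groovy.xml.XmlUtil

        def parser = new org.cyberneko.html.parsers.SAXParser()
        // standard SAX feature: stop the parser from reporting namespaces,
        // so the parsed tree has no xmlns bindings to round-trip
        parser.setFeature('http://xml.org/sax/features/namespaces', false)

        def html = new XmlSlurper(parser).parseText(new File('2.html').text)
        println XmlUtil.serialize(new StreamingMarkupBuilder().bind { mkp.yield html })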

    Read the article

  • XmlSlurper/NekoHTML document fragment parsing - No HTML or BODY tags wanted

    - by Misha Koshelev
    Dear All, I am trying to parse the following HTML fragment, and I would like to get the same fragment as output (without HTML and BODY tags). Is this possible? If so, how? Thank you Misha p.s. I am reading here: http://nekohtml.sourceforge.net/faq.html#fragments and I believe I have added the correct options below. However, the output is still incorrect :( Thank you Misha import groovy.xml.MarkupBuilder import groovy.xml.StreamingMarkupBuilder import groovy.util.XmlNodePrinter import groovy.util.slurpersupport.NodeChild def text=""" <div><h2>Test</h2> <div>Hi</div> </div> """ // Parse def config=new org.cyberneko.html.HTMLConfiguration() config.setFeature("http://cyberneko.org/html/features/balance-tags/document-fragment",true) def html=new XmlSlurper(new org.cyberneko.html.parsers.SAXParser()).parseText(text) // Output def printNode(NodeChild node) { def writer = new StringWriter() writer << new StreamingMarkupBuilder().bind { mkp.declareNamespace('':node[0].namespaceURI()) mkp.yield node } new XmlNodePrinter().print(new XmlParser().parseText(writer.toString())) } printNode(html) Output: <HTML> <tag0:HEAD xmlns:tag0="http://www.w3.org/1999/xhtml"/> <BODY> <DIV> <H2> Test </H2> <DIV> Hi </DIV> </DIV> </BODY> </HTML>
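
    One thing that stands out in the snippet above: the HTMLConfiguration with the document-fragment feature is created but never passed to the SAXParser, so the feature has no effect on the parse. A sketch of setting the feature on the parser instance that XmlSlurper actually uses:

        def parser = new org.cyberneko.html.parsers.SAXParser()
        // ask NekoHTML to balance tags as a fragment instead of wrapping
        // the input in HTML/BODY elements
        parser.setFeature('http://cyberneko.org/html/features/balance-tags/document-fragment', true)

        def html = new XmlSlurper(parser).parseText(text)
        printNode(html)   // printNode as defined in the question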

    Read the article

  • Groovy XmlSlurper

    - by Langali
    I am trying to parse an HTML file using Groovy XmlSlurper. <div id="users"> <h1>Name: Joe Doe</h1> <div id="user"> <div id="user_summary">Game: 1</div> <object width="640" height="385"><param name="movie" value="http://www.youtube.com/v/DApLO_HDhD0&hl=en_US&fs=1&"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/DApLO_HDhD0&hl=en_US&fs=1&" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="640" height="385"></embed></object> </div> <div id="user"> <div id="user_summary">Game: 2</div> ... </div> <div id="user"> .... </div> </div> <div id="featured_users"> <div id="user"> ... </div> <div id="user"> .... </div> </div> I need to grab each user (and not featured user) with his name, summary and object tag (which is the video embed code). Anybody wanna give it a shot? Here's a start: def parser = new XmlSlurper(new org.ccil.cowan.tagsoup.Parser()) def response = parser.parseText(htmlString) def users = response.depthFirst().collect { it }.findAll { it.@id == "users" } users.each { ...... } I can't seem to get much further than this:
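
    One way to take it further, sticking with the TagSoup parser from the snippet above (assuming TagSoup lower-cases tag names and that StreamingMarkupBuilder.bindNode() is available to turn a single node back into markup):

        import groovy.xml.StreamingMarkupBuilder

        def page = new XmlSlurper(new org.ccil.cowan.tagsoup.Parser()).parseText(htmlString)

        // the container whose id is "users" (skips "featured_users")
        def usersDiv = page.depthFirst().find { [email protected]() == 'users' }
        def name = usersDiv.h1.text()                       // "Name: Joe Doe"

        usersDiv.div.findAll { [email protected]() == 'user' }.each { user ->
            def summary = user.div.find { [email protected]() == 'user_summary' }.text()
            // serialize the <object> element itself rather than taking its text
            def embedCode = new StreamingMarkupBuilder().bindNode(user.object).toString()
            println "$name | $summary | $embedCode"
        }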

    Read the article

  • Groovy XmlSlurper

    - by Langali
    <div id="videos"> <div id="video"> <embedcode>....</embedcode> </div> </div> I need to grab the video embed code, and not just the text inside a XMl tag. Any idea how can I grab the snippet of XML using XmlSlurper? I need the whole line: <embedcode>....</embedcode>

    Read the article

  • HTTP Builder/Groovy - get source text _and_ XmlSlurper output?

    - by Misha Koshelev
    Dear All: I am reading here: http://groovy.codehaus.org/modules/http-builder/doc/get.html I seem to be able to get i) XMLSlurper output as parsed by NekoHTML using: def http = new HTTPBuilder('http://www.google.com') def html = http.get( path : '/search', query : [q:'Groovy'] ) ii) Raw text using: http.get( path : '/search', contentType : TEXT, query : [q:'Groovy'] ) { resp, reader -> println "response status: ${resp.statusLine}" println 'Headers: -----------' resp.headers.each { h -> println " ${h.name} : ${h.value}" } println 'Response data: -----' System.out << reader println '\n--------------------' } I am having some trouble and would like to get BOTH (i) and (ii) to debug my XmlSlurper code on the actual html I am getting. Any suggestions how I might go about doing this? I can easily instantiate an XmlSlurper object with the relevant string using the parseString(string) method or the parse(reader) method, but I cannot seem to get the Neko processing step correct. Any hints? Thank you! Misha
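
    One simple way to get both from a single request: ask for TEXT, keep reader.text as the raw source, and run the same NekoHTML/XmlSlurper step yourself on that string. A sketch:

        import groovyx.net.http.HTTPBuilder
        import static groovyx.net.http.ContentType.TEXT

        def http = new HTTPBuilder('http://www.google.com')
        def source = null
        def html = null

        http.get(path: '/search', contentType: TEXT, query: [q: 'Groovy']) { resp, reader ->
            source = reader.text                         // (ii) the raw page source
            // (i) the same Neko parsing step HTTPBuilder would otherwise do for you
            html = new XmlSlurper(new org.cyberneko.html.parsers.SAXParser()).parseText(source)
        }

        println source        // debug the markup...
        println html.name()   // ...and the parsed GPathResult side by side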

    Read the article

  • XML Parsing in Groovy strips attribute new lines

    - by Bill James
    I'm writing code where I retrieve XML from a web api, then parse that XML using Groovy. Unfortunately, it seems that both XmlParser and XmlSlurper for Groovy strip newline characters from the attributes of nodes when .text() is called. How can I get at the text of the attribute including the newlines? Sample code: def xmltest = ''' <snippet> <preSnippet att1="testatt1" code="This is line 1 This is line 2 This is line 3" > <lines count="10" /> </preSnippet> </snippet>''' def parsed = new XmlParser().parseText( xmltest ) println "Parsed" parsed.preSnippet.each { pre -> println pre.attribute('code'); } def slurped = new XmlSlurper().parseText( xmltest ) println "Slurped" slurped.children().each { preSnip -> println [email protected]() } the output of which is: Parsed This is line 1 This is line 2 This is line 3 Slurped This is line 1 This is line 2 This is line 3
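
    This is attribute-value normalization from the XML 1.0 spec rather than a Groovy quirk: a conforming parser replaces literal line breaks inside attribute values with spaces before the application ever sees them. Character references survive normalization, so one workaround is to encode the newlines as &#10; in the source (or move the text into element content). A sketch against the same document:

        def xmltest = '''
        <snippet>
          <preSnippet att1="testatt1" code="This is line 1&#10;This is line 2&#10;This is line 3">
            <lines count="10" />
          </preSnippet>
        </snippet>'''

        def parsed = new XmlParser().parseText(xmltest)
        parsed.preSnippet.each { pre ->
            println pre.attribute('code')   // newlines preserved this time
        }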

    Read the article

  • Pros and cons of using Grails compared to pure Groovy

    - by shabunc
    Say you (by you I mean an abstract guy, any guy on your team) have experience writing and building Java web apps and know about filters, servlet mappings and so on. Also, let us assume you know some SQL database well, no matter which one exactly, whether it is MySQL, Oracle or PostgreSQL. Finally, let's pretend we know Groovy and its standard libraries, for example all that JsonBuilder and XmlSlurper stuff, so we don't need Grails converters. The question is: what are the benefits of using Grails in this case? I'm not trying to start a flame war, I'm just asking to compare - what are the ups and downs of Grails development compared to a pure Groovy one? For instance, off the top of my head I can name two pluses - automatic DB mapping and custom GSP tags. But when I want to write a modest app which provides a small API for handling some well-defined set of data, I'm totally OK with Groovy's awesome SQL support. As for GSP, we do not use it at all, so we are not interested in custom tags either.

    Read the article

  • Interfacing HTTPBuilder and HTMLUnit... some code

    - by Misha Koshelev
    Ok, this isn't even a question: import com.gargoylesoftware.htmlunit.HttpMethod import com.gargoylesoftware.htmlunit.WebClient import com.gargoylesoftware.htmlunit.WebResponseData import com.gargoylesoftware.htmlunit.WebResponseImpl import com.gargoylesoftware.htmlunit.util.Cookie import com.gargoylesoftware.htmlunit.util.NameValuePair import static groovyx.net.http.ContentType.TEXT import java.io.File import java.util.logging.Logger import org.apache.http.impl.cookie.BasicClientCookie /** * HTTPBuilder class * * Allows Javascript processing using HTMLUnit * * @author Misha Koshelev */ class HTTPBuilder { /** * HTTP Builder - implement this way to avoid underlying logging output */ def httpBuilder /** * Logger */ def logger /** * Directory for storing HTML files, if any */ def saveDirectory=null /** * Index of current HTML file in directory */ def saveIdx=1 /** * Current page text */ def text=null /** * Response for processJavascript (Complex Version) */ def resp=null /** * URI for processJavascript (Complex Version) */ def uri=null /** * HttpMethod for processJavascript (Complex Version) */ def method=null /** * Default constructor */ public HTTPBuilder() { // New HTTPBuilder httpBuilder=new groovyx.net.http.HTTPBuilder() // Logging logger=Logger.getLogger(this.class.name) } /** * Constructor that allows saving output files for testing */ public HTTPBuilder(saveDirectory,saveIdx) { this() this.saveDirectory=saveDirectory this.saveIdx=saveIdx } /** * Save text and return corresponding XmlSlurper object */ public saveText() { if (saveDirectory) { def file=new File(saveDirectory.toString()+File.separator+saveIdx+".html") logger.finest "HTTPBuilder.saveText: file=\""+file.toString()+"\"" file<<text saveIdx++ } new XmlSlurper(new org.cyberneko.html.parsers.SAXParser()).parseText(text) } /** * Wrapper around supertype get method */ public Object get(Map<String,?> args) { logger.finer "HTTPBuilder.get: args=\""+args+"\"" args.contentType=TEXT httpBuilder.get(args) { resp,reader-> text=reader.text this.resp=resp this.uri=args.uri this.method=HttpMethod.GET saveText() } } /** * Wrapper around supertype post method */ public Object post(Map<String,?> args) { logger.finer "HTTPBuilder.post: args=\""+args+"\"" args.contentType=TEXT httpBuilder.post(args) { resp,reader-> text=reader.text this.resp=resp this.uri=args.uri this.method=HttpMethod.POST saveText() } } /** * Load cookies from specified file */ def loadCookies(file) { logger.finer "HTTPBuilder.loadCookies: file=\""+file.toString()+"\"" file.withObjectInputStream { ois-> ois.readObject().each { cookieMap-> def cookie=new BasicClientCookie(cookieMap.name,cookieMap.value) cookieMap.remove("name") cookieMap.remove("value") cookieMap.entrySet().each { entry-> cookie."${entry.key}"=entry.value } httpBuilder.client.cookieStore.addCookie(cookie) } } } /** * Save cookies to specified file */ def saveCookies(file) { logger.finer "HTTPBuilder.saveCookies: file=\""+file.toString()+"\"" def cookieMaps=new ArrayList(new LinkedHashMap()) httpBuilder.client.cookieStore.getCookies().each { cookie-> def cookieMap=[:] cookieMap.version=cookie.version cookieMap.name=cookie.name cookieMap.value=cookie.value cookieMap.domain=cookie.domain cookieMap.path=cookie.path cookieMap.expiryDate=cookie.expiryDate cookieMaps.add(cookieMap) } file.withObjectOutputStream { oos-> oos.writeObject(cookieMaps) } } /** * Process Javascript using HTMLUnit (Simple Version) */ def processJavascript() { logger.finer "HTTPBuilder.processJavascript (Simple)" def webClient=new 
WebClient() def tempFile=File.createTempFile("HTMLUnit","") tempFile<<text def page=webClient.getPage("file://"+tempFile.toString()) webClient.waitForBackgroundJavaScript(10000) text=page.asXml() webClient.closeAllWindows() tempFile.delete() saveText() } /** * Process Javascript using HTMLUnit (Complex Version) * Closure, if specified, used to determine presence of necessary elements */ def processJavascript(closure) { logger.finer "HTTPBuilder.processJavascript (Complex)" // Convert response headers def headers=new ArrayList() resp.allHeaders.each() { header-> headers.add(new NameValuePair(header.name,header.value)) } def responseData=new WebResponseData(text.bytes,resp.statusLine.statusCode,resp.statusLine.toString(),headers) def response=new WebResponseImpl(responseData,uri.toURL(),method,0) // Transfer cookies def webClient=new WebClient() httpBuilder.client.cookieStore.getCookies().each { cookie-> webClient.cookieManager.addCookie(new Cookie(cookie.domain,cookie.name,cookie.value,cookie.path,cookie.expiryDate,cookie.isSecure())) } def page=webClient.loadWebResponseInto(response,webClient.getCurrentWindow()) // Wait for condition if (closure) { for (i in 1..20) { if (closure(page)) { break; } synchronized(page) { page.wait(500); } } } // Return text text=page.asXml() webClient.closeAllWindows() saveText() } } Allows one to interface HTTPBuilder with HTMLUnit! Enjoy Misha
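
    For context, a hypothetical usage sketch of the wrapper above (the directory, URL and file names are placeholders, not from the original post):

        // assumes the wrapper class above is on the classpath
        def http = new HTTPBuilder(new File('/tmp/pages'), 1)

        def page = http.get(uri: 'http://www.example.com/')   // parsed GPathResult; HTML saved as 1.html
        http.processJavascript()                               // simple version: re-render the page via HTMLUnit

        http.saveCookies(new File('/tmp/cookies.bin'))
        // later: http.loadCookies(new File('/tmp/cookies.bin'))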

    Read the article

  • Show progress during a long Ajax call

    - by kousen
    I have a simple web site (http://www.kousenit.com/twitterfollowervalue) that computes a quantity based on a person's Twitter followers. Since the Twitter API only returns followers 100 at a time, a complete process may involve lots of calls. At the moment, I have an Ajax call to a method that runs the Twitter loop. The method looks like (Groovy): def updateFollowers() { def slurper = new XmlSlurper() followers = [] def next = -1 while (next) { def url = "http://api.twitter.com/1/statuses/followers.xml?id=$id&cursor=$next" def response = slurper.parse(url) response.users.user.each { u -> followers << new TwitterUser(... process data ...) } next = response.next_cursor.toBigInteger() } return followers } This is invoked from a controller called renderTTFV.groovy. I call the controller via an Ajax call using the prototype library: On my web page, in the header section (JavaScript): function displayTTFV() { new Ajax.Updater('ttfv','renderTTFV.groovy', {}); } and there's a div in the body of the page that updates when the call is complete. Everything is working, but the updateFollowers method can take a fair amount of time. Is there some way I could return a progress value? For example, I'd like to update the web page on every iteration. I know ahead of time how many iterations there will be. I just can't figure out a way to update the page in the middle of that loop. Any suggestions would be appreciated.
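
    One low-tech option is polling: have the loop record how far it has gotten somewhere a second, cheap request can read (the HTTP session, say), and let the page poll that endpoint, for example with Prototype's Ajax.PeriodicalUpdater, while the main Ajax.Updater call is still running. A sketch with made-up attribute and file names, assuming the session and Twitter id are passed into the method:

        // inside the long-running call -- pass the HttpSession in from the controller
        def updateFollowers(session, id) {
            def slurper = new XmlSlurper()
            def followers = []
            def pagesDone = 0
            def next = -1
            while (next) {
                def url = "http://api.twitter.com/1/statuses/followers.xml?id=$id&cursor=$next"
                def response = slurper.parse(url)
                response.users.user.each { u -> followers << new TwitterUser(/* ... process data ... */) }
                next = response.next_cursor.toBigInteger()
                session.setAttribute('pagesDone', ++pagesDone)   // progress marker, one per 100 followers
            }
            return followers
        }

        // progress.groovy -- a tiny groovlet the page polls every second or two
        println "Fetched ${request.getSession(true).getAttribute('pagesDone') ?: 0} pages of followers so far"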

    Read the article

  • HTTP Builder/Groovy - lost 302 (redirect) handling?

    - by Misha Koshelev
    Dear All: I am reading here http://groovy.codehaus.org/modules/http-builder/doc/handlers.html "In cases where a response sends a redirect status code, this is handled internally by Apache HttpClient, which by default will simply follow the redirect by re-sending the request to the new URL. You do not need to do anything special in order to follow 302 responses." This seems to work fine when I simply use the get() or post() methods without a closure. However, when I use a closure, I seem to lose 302 handling. Is there some way I can handle this myself? Thank you p.s. Here is my log output showing it is a 302 response [java] FINER: resp.statusLine: "HTTP/1.1 302 Found" Here is the relevant code: // Copyright (C) 2010 Misha Koshelev. All Rights Reserved. package com.mksoft.fbbday.main import groovyx.net.http.ContentType import java.util.logging.Level import java.util.logging.Logger class HTTPBuilder { def dataDirectory HTTPBuilder(dataDirectory) { this.dataDirectory=dataDirectory } // Main logic def logger=Logger.getLogger(this.class.name) def closure={resp,reader-> logger.finer("resp.statusLine: \"${resp.statusLine}\"") if (logger.isLoggable(Level.FINEST)) { def respHeadersString='Headers:'; resp.headers.each() { header->respHeadersString+="\n\t${header.name}=\"${header.value}\"" } logger.finest(respHeadersString) } def text=reader.text def lastHtml=new File("${dataDirectory}${File.separator}last.html") if (lastHtml.exists()) { lastHtml.delete() } lastHtml<<text new XmlSlurper(new org.cyberneko.html.parsers.SAXParser()).parseText(text) } def processArgs(args) { if (logger.isLoggable(Level.FINER)) { def argsString='Args:'; args.each() { arg->argsString+="\n\t${arg.key}=\"${arg.value}\"" } logger.finer(argsString) } args.contentType=groovyx.net.http.ContentType.TEXT args } // HTTPBuilder methods def httpBuilder=new groovyx.net.http.HTTPBuilder () def get(args) { httpBuilder.get(processArgs(args),closure) } def post(args) { args.contentType=groovyx.net.http.ContentType.TEXT httpBuilder.post(processArgs(args),closure) } } Here is a specific tester: #!/usr/bin/env groovy import groovyx.net.http.HTTPBuilder import groovyx.net.http.Method import static groovyx.net.http.ContentType.URLENC import java.util.logging.ConsoleHandler import java.util.logging.Level import java.util.logging.Logger // MUST ENTER VALID FACEBOOK EMAIL AND PASSWORD BELOW !!! def email='' def pass='' // Remove default loggers def logger=Logger.getLogger('') def handlers=logger.handlers handlers.each() { handler->logger.removeHandler(handler) } // Log ALL to Console logger.setLevel Level.ALL def consoleHandler=new ConsoleHandler() consoleHandler.setLevel Level.ALL logger.addHandler(consoleHandler) // Facebook - need to get main page to capture cookies def http = new HTTPBuilder() http.get(uri:'http://www.facebook.com') // Login def html=http.post(uri:'https://login.facebook.com/login.php?login_attempt=1',body:[email:email,pass:pass]) assert html==null // Why null? html=http.post(uri:'https://login.facebook.com/login.php?login_attempt=1',body:[email:email,pass:pass]) { resp,reader-> assert resp.statusLine.statusCode==302 // Shouldn't we be redirected??? // http://groovy.codehaus.org/modules/http-builder/doc/handlers.html // "In cases where a response sends a redirect status code, this is handled internally by Apache HttpClient, which by default will simply follow the redirect by re-sending the request to the new URL. You do not need to do anything special in order to follow 302 responses. 
" } Here are relevant logs: FINE: Receiving response: HTTP/1.1 302 Found Jun 4, 2010 4:37:22 PM org.apache.http.impl.conn.DefaultClientConnection receiveResponseHeader FINE: << HTTP/1.1 302 Found Jun 4, 2010 4:37:22 PM org.apache.http.impl.conn.DefaultClientConnection receiveResponseHeader FINE: << Cache-Control: private, no-store, no-cache, must-revalidate, post-check=0, pre-check=0 Jun 4, 2010 4:37:22 PM org.apache.http.impl.conn.DefaultClientConnection receiveResponseHeader FINE: << Expires: Sat, 01 Jan 2000 00:00:00 GMT Jun 4, 2010 4:37:22 PM org.apache.http.impl.conn.DefaultClientConnection receiveResponseHeader FINE: << Location: http://www.facebook.com/home.php? Jun 4, 2010 4:37:22 PM org.apache.http.impl.conn.DefaultClientConnection receiveResponseHeader FINE: << P3P: CP="DSP LAW" Jun 4, 2010 4:37:22 PM org.apache.http.impl.conn.DefaultClientConnection receiveResponseHeader FINE: << Pragma: no-cache Jun 4, 2010 4:37:22 PM org.apache.http.impl.conn.DefaultClientConnection receiveResponseHeader FINE: << Set-Cookie: datr=1275687438-9ff6ae60a89d444d0fd9917abf56e085d370277a6e9ed50c1ba79; expires=Sun, 03-Jun-2012 21:37:24 GMT; path=/; domain=.facebook.com Jun 4, 2010 4:37:22 PM org.apache.http.impl.conn.DefaultClientConnection receiveResponseHeader FINE: << Set-Cookie: lxe=koshelev%40post.harvard.edu; expires=Tue, 28-Sep-2010 15:24:04 GMT; path=/; domain=.facebook.com; httponly Jun 4, 2010 4:37:22 PM org.apache.http.impl.conn.DefaultClientConnection receiveResponseHeader FINE: << Set-Cookie: lxr=deleted; expires=Thu, 04-Jun-2009 21:37:23 GMT; path=/; domain=.facebook.com; httponly Jun 4, 2010 4:37:22 PM org.apache.http.impl.conn.DefaultClientConnection receiveResponseHeader FINE: << Set-Cookie: pk=183883c0a9afab1608e95d59164cc7dd; path=/; domain=.facebook.com; httponly Jun 4, 2010 4:37:22 PM org.apache.http.impl.conn.DefaultClientConnection receiveResponseHeader FINE: << Content-Type: text/html; charset=utf-8 Jun 4, 2010 4:37:22 PM org.apache.http.impl.conn.DefaultClientConnection receiveResponseHeader FINE: << X-Cnection: close Jun 4, 2010 4:37:22 PM org.apache.http.impl.conn.DefaultClientConnection receiveResponseHeader FINE: << Date: Fri, 04 Jun 2010 21:37:24 GMT Jun 4, 2010 4:37:22 PM org.apache.http.impl.conn.DefaultClientConnection receiveResponseHeader FINE: << Content-Length: 0 Jun 4, 2010 4:37:22 PM org.apache.http.client.protocol.ResponseProcessCookies processCookies FINE: Cookie accepted: "[version: 0][name: datr][value: 1275687438-9ff6ae60a89d444d0fd9917abf56e085d370277a6e9ed50c1ba79][domain: .facebook.com][path: /][expiry: Sun Jun 03 16:37:24 CDT 2012]". Jun 4, 2010 4:37:22 PM org.apache.http.client.protocol.ResponseProcessCookies processCookies FINE: Cookie accepted: "[version: 0][name: lxe][value: koshelev%40post.harvard.edu][domain: .facebook.com][path: /][expiry: Tue Sep 28 10:24:04 CDT 2010]". Jun 4, 2010 4:37:22 PM org.apache.http.client.protocol.ResponseProcessCookies processCookies FINE: Cookie accepted: "[version: 0][name: lxr][value: deleted][domain: .facebook.com][path: /][expiry: Thu Jun 04 16:37:23 CDT 2009]". Jun 4, 2010 4:37:22 PM org.apache.http.client.protocol.ResponseProcessCookies processCookies FINE: Cookie accepted: "[version: 0][name: pk][value: 183883c0a9afab1608e95d59164cc7dd][domain: .facebook.com][path: /][expiry: null]". 
Jun 4, 2010 4:37:22 PM org.apache.http.impl.client.DefaultRequestDirector execute FINE: Connection can be kept alive indefinitely Jun 4, 2010 4:37:22 PM groovyx.net.http.HTTPBuilder doRequest FINE: Response code: 302; found handler: post302$_run_closure2@7023d08b Jun 4, 2010 4:37:22 PM groovyx.net.http.HTTPBuilder doRequest FINEST: response handler result: null Jun 4, 2010 4:37:22 PM org.apache.http.impl.conn.SingleClientConnManager releaseConnection FINE: Releasing connection org.apache.http.impl.conn.SingleClientConnManager$ConnAdapter@605b28c9 You can see there is clearly a location argument. Thank you Misha
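
    For what it is worth, Apache HttpClient's default redirect handling only resends GET and HEAD requests automatically, so a 302 answering a POST (the login case above) lands in the response closure either way. A sketch of following it by hand from inside the closure, using the groovyx.net.http.HTTPBuilder instance from the tester:

        html = http.post(uri: 'https://login.facebook.com/login.php?login_attempt=1',
                         body: [email: email, pass: pass]) { resp, reader ->
            if (resp.statusLine.statusCode == 302) {
                // cookies set by the POST stay in http's cookie store, so a plain
                // GET to the Location header finishes the redirect
                def location = resp.getFirstHeader('Location')?.value
                return http.get(uri: location)
            }
            new XmlSlurper(new org.cyberneko.html.parsers.SAXParser()).parseText(reader.text)
        }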

    Read the article
