Is there a way to get YQL to return HTML?

Posted by Joe Shaw on Stack Overflow
Published on 2010-04-02T16:35:38Z Indexed on 2010/04/04 1:53 UTC

I am trying to use YQL to extract a portion of HTML from a series of web pages. The pages themselves have slightly different structures (so a Yahoo Pipes "Fetch Page" with its "Cut content" feature does not work well), but the fragment I am interested in always has the same class attribute.

If I have an HTML page like this:

<html>
  <body>
    <div class="foo">
      <p>Wolf</p>
      <ul>
        <li>Dog</li>
        <li>Cat</li>
      </ul>
    </div>
  </body>
</html>

and use a YQL expression like this:

SELECT * FROM html 
WHERE url="http://example.com/containing-the-fragment-above" 
AND xpath="//div[@class='foo']"

what I get back are the (apparently unordered?) DOM elements, whereas what I want is the HTML content itself. I've also tried SELECT content, but that returns only the text content, not the markup. I want the HTML. Is this possible?
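
To make the desired result concrete, here is a minimal local sketch in Python rather than YQL. It is not a YQL solution and says nothing about what the YQL html table can return; it only shows the difference between the serialized markup of the matched div (what I want) and its text content (what SELECT content seems to give). The function names extract_fragment and text_only are made up for this illustration, and the sketch assumes the page is well-formed XML, which the sample above happens to be but real pages often are not.

```python
import xml.etree.ElementTree as ET

# The sample page from the question. It is well-formed XML, which is an
# assumption of this sketch; real-world HTML often is not.
PAGE = """<html>
  <body>
    <div class="foo">
      <p>Wolf</p>
      <ul>
        <li>Dog</li>
        <li>Cat</li>
      </ul>
    </div>
  </body>
</html>"""


def _find_div(page: str, css_class: str) -> ET.Element:
    """Locate the first div whose class attribute equals css_class."""
    root = ET.fromstring(page)
    node = root.find(f".//div[@class='{css_class}']")
    if node is None:
        raise ValueError(f"no div with class {css_class!r}")
    return node


def extract_fragment(page: str, css_class: str) -> str:
    """Return the matched div serialized back to markup, children in order."""
    node = _find_div(page, css_class)
    node.tail = None  # drop the whitespace that follows the closing tag
    return ET.tostring(node, encoding="unicode", method="html")


def text_only(page: str, css_class: str) -> str:
    """Return only the text inside the matched div, with markup stripped."""
    node = _find_div(page, css_class)
    return " ".join("".join(node.itertext()).split())


if __name__ == "__main__":
    print(extract_fragment(PAGE, "foo"))  # the markup I am after
    print(text_only(PAGE, "foo"))         # the text-only result I do not want
```

The first call yields the div with its p and ul children intact and in document order; the second collapses everything to plain text, which is the behaviour I am trying to avoid.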

© Stack Overflow or respective owner
