Search Results

Search found 1 results on 1 pages for 'phpwns'.

Page 1/1 | 1 

  • Anyone have a good solution for scraping the HTML source of a page with content (in this case, HTML

    - by phpwns
    Anyone have a good solution for scraping the HTML source of a page with content (in this case, HTML tables) generated with Javascript? An embarrassingly simple, though workable solution using Crowbar: <?php function get_html($url) // $url must be urlencode(d) { $context = stream_context_create(array( 'http' => array('timeout' => 120) // HTTP timeout in seconds )); $html = substr(file_get_contents('http://127.0.0.1:10000/?url=' . $url . '&delay=3000&view=browser', 0, $context), 730, -32); // substr removes HTML from the Crowbar web service, returning only the $url HTML return $html; } ?> The advantage to using Crowbar is that the tables will be rendered (and accessible) thanks to the headless mozilla-based browser. The problem, of course, is being dependent on on an external web service, especially given that SIMILE seems to undergo regular server maintenance. :( A pure php solution would be nice, but any functional (and reliable) alternatives would be great.

    Read the article

1