Nokogiri find only inbound links

Posted by astropanic on Stack Overflow
Published on 2010-05-26T14:21:17Z Indexed on 2010/05/27 1:51 UTC

I have an HTML document located at http://somedomain.com/somedir/example.html

The document contains four links:

http://otherdomain.com/other.html

http://somedomain.com/other.html

/only.html

test.html

How can I get the full URLs for the links that point to the current domain?

I mean I should get:

http://somedomain.com/other.html

http://somedomain.com/only.html

http://somedomain.com/somedir/test.html

The first link should be ignored because it doesn't match my domain.
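
To make the expected result concrete, here is a minimal sketch of one possible approach, not a definitive answer. It resolves each href against the page URL with Ruby's standard `URI.join` and keeps only the results whose host equals the page's host. The helper name `inbound_links` is hypothetical, and treating "same domain" as "exactly the same host" is an assumption (subdomains such as www.somedomain.com would be treated as foreign). The hrefs are hard-coded here; with Nokogiri they could presumably be collected with something like `Nokogiri::HTML(html).css('a[href]').map { |a| a['href'] }`.

```ruby
require 'uri'

# Hypothetical helper: resolve every href against the page URL and
# keep only the absolute URLs that live on the same host as the page.
def inbound_links(page_url, hrefs)
  base_host = URI.parse(page_url).host

  resolved = hrefs.map do |href|
    begin
      # URI.join turns "/only.html" and "test.html" into absolute URLs
      # and leaves already-absolute URLs unchanged.
      absolute = URI.join(page_url, href.strip)
    rescue URI::Error
      next # skip hrefs that cannot be parsed as URIs
    end

    # Assumption: "same domain" means an exact host match.
    absolute.to_s if absolute.host == base_host
  end

  resolved.compact
end

page_url = 'http://somedomain.com/somedir/example.html'
hrefs = [
  'http://otherdomain.com/other.html',
  'http://somedomain.com/other.html',
  '/only.html',
  'test.html'
]

puts inbound_links(page_url, hrefs)
# http://somedomain.com/other.html
# http://somedomain.com/only.html
# http://somedomain.com/somedir/test.html
```

Links with no host at all (for example mailto: addresses) are dropped by the same host comparison.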

© Stack Overflow or respective owner
