The proper way to script periodically pulling a page from an https site

Posted by DarthShader on Stack Overflow
Published on 2010-05-20T15:39:46Z Indexed on 2010/05/20 15:40 UTC

I want to create a command-line script for Cygwin/Bash that logs into a site, navigates to a specific page and compares it with the results of the last run. So far, I have it working with Lynx like so:

----snipped, just setting variables----
echo "# Command logfile created by Lynx 2.8.5rel.5 (29 Oct 2005)
----snipped the recorded keystrokes-------
key Right Arrow
key p
key Right Arrow
key ^U" >> "$tmp1"  # "p" followed by Right Arrow initiates the page save

# "type" the filename into the "where to save" dialog, one key per character
for ((i = 0; i < ${#tmp2}; i++))
do
    echo "key ${tmp2:i:1}" >> "$tmp1"
done

# press Enter to confirm the filename, then quit
echo "key ^J
key y
key q
key y
" >> "$tmp1"

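# replay the recorded keystrokes against the login page
lynx -accept_all_cookies -cmd_script="$tmp1" https://thewebpage.com/login

# compare the fresh copy with the previous run, then keep it for next time
diff "$tmp2" "$oldComp"
mv "$tmp2" "$oldComp"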

It definitely does not feel "right": the cmd_script consists of relative user actions (recorded keystrokes) instead of exact link names and actions. So if anything on the site ever changes, switches places, or a new link is added, I will have to re-record the actions.

Also, I can't check for errors, so I can't abort the script if something goes wrong (login failed, etc.).
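
For illustration, here is a rough sketch of what addressing pages by exact URL and checking for errors could look like if the fetch were done with curl instead of recorded Lynx keystrokes. Everything specific in it is an assumption and not something the site is known to use: the form field names (username, password), the SITE_USER and SITE_PASS variables, and the function names fetch_page and compare_and_rotate are all hypothetical, and a real login form may need other fields or hidden tokens.

```shell
#!/usr/bin/env bash
# Sketch only: URLs, form field names, and variable names are hypothetical.

# Log in, then fetch one page by its exact URL.
# Args: login URL, page URL, output file. Reads SITE_USER and SITE_PASS.
fetch_page() {
    local login_url=$1 page_url=$2 out=$3 jar status=0
    jar=$(mktemp) || return 1
    # --fail turns HTTP 4xx/5xx responses into a non-zero exit status;
    # -c writes the session cookies to the jar, -b sends them back.
    curl --silent --show-error --fail --location -c "$jar" \
         --data-urlencode "username=$SITE_USER" \
         --data-urlencode "password=$SITE_PASS" \
         -o /dev/null "$login_url" &&
    curl --silent --show-error --fail --location -b "$jar" \
         -o "$out" "$page_url" || status=$?
    rm -f "$jar"
    return "$status"
}

# Compare a new snapshot with the previous one, then keep it for next time.
# Returns 0 if unchanged, 1 if changed or first run, 2 on error.
compare_and_rotate() {
    local new=$1 old=$2 status=0
    if [ ! -s "$new" ]; then
        echo "snapshot missing or empty: $new" >&2
        return 2
    fi
    if [ -f "$old" ]; then
        diff "$old" "$new" || status=$?
    else
        status=1   # first run: nothing to compare against
    fi
    # diff exits with 2 on trouble; keep the old snapshot in that case.
    if [ "$status" -lt 2 ]; then
        mv "$new" "$old" || return 2
    fi
    return "$status"
}
```

A caller could then run fetch_page with the login URL, the page URL and "$tmp2", exit if it returns non-zero, and pass "$tmp2" and "$oldComp" to compare_and_rotate. Note that many sites answer a failed login with HTTP 200, so --fail alone would not catch that case; checking the saved page for some text that only appears when logged in (for example with grep -q) would likely still be needed.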

Another alternative I have been looking at is Mechanize with Ruby (note: I have no experience with Ruby).

What would be the best way to improve or rewrite this?

© Stack Overflow or respective owner