Search Results

Search found 101 results on 5 pages for 'mechanize'.


  • form submitting with mechanize and Python

    - by MATELIN Alexis
    I'm trying to scrap a website that requires to submit two forms : a first one to loggin and a second one to specify my research. I'm using Python and the mechanize package. No problem with the first one, but i just can't figure out how to pass through the second one. Here is the part of my code related to the firm above-mentionned agemin=18 agemax=25 by='region' country='France' region=2 newcustomers=1 browser.select_form(nr=0) browser['age[min]']=agemin browser['age[max]']=agemax browser['country']=country browser['region']=region browser['by']=by browser['new-customers']=newcustomers response=browser.submit() content=response.read() but when I submit the variable 'age[min]' by example, I get the following error message : TypeError: object of type 'int' has no len() to give you some more informations, here is what I get with 'print br.form' <POST http://www.adopteunmec.com/qsearch/ajax_quick application/x-www-form-urlencoded <SelectControl(age[min]=[, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, *30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99])> <SelectControl(age[max]=[, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, *45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99])> <SelectControl(by=[*region, distance])> <SelectControl(country=[*fr, be, ch, ca])> <SelectControl(region=[*1, 2, 3, 4, 5, 6, 7, 8, 22, 23, 9, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 11])> <SelectControl(distance[min]=[*, 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 
250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 610, 620, 630, 640, 650, 660, 670, 680, 690, 700, 710, 720, 730, 740, 750, 760, 770, 780, 790, 800, 810, 820, 830, 840, 850, 860, 870, 880, 890, 900, 910, 920, 930, 940, 950, 960, 970, 980, 990, 1000])> <SelectControl(distance[max]=[, 0, 10, 20, 30, 40, 50, 60, 70, *80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 610, 620, 630, 640, 650, 660, 670, 680, 690, 700, 710, 720, 730, 740, 750, 760, 770, 780, 790, 800, 810, 820, 830, 840, 850, 860, 870, 880, 890, 900, 910, 920, 930, 940, 950, 960, 970, 980, 990, 1000])> <CheckboxControl(new=[*1])>> My guess is that the form needs an object (like a list) containing all the variables to accept it ; that's why it refuses the variables submited one by one. Thank you in advance for any help ! Alexis
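
    One possible reading of the TypeError above is that mechanize list controls (SelectControl, CheckboxControl) expect a list of option strings rather than a bare int. The sketch below is stdlib-only and only prepares the values; `as_item_list` and `search_fields` are hypothetical names, and the option values ('fr', the checkbox named 'new') are read off the form dump in the post, not checked against the site.

```python
def as_item_list(value):
    """Wrap a value as the list of strings a select/checkbox control expects."""
    if isinstance(value, (list, tuple)):
        return [str(v) for v in value]
    return [str(value)]


# Field names and option values are copied from the form dump in the post.
search_fields = {
    "age[min]": as_item_list(18),
    "age[max]": as_item_list(25),
    "by": as_item_list("region"),
    "country": as_item_list("fr"),  # the dump lists fr, be, ch, ca (not 'France')
    "region": as_item_list(2),
    "new": as_item_list(1),         # the checkbox is named 'new' in the dump
}

for name, items in search_fields.items():
    print(name, "->", items)
```

    With a mechanize browser whose form is already selected, each pair would presumably be assigned as `browser[name] = items` before calling `submit()`.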

    Read the article

  • python mechanize.browser submit() related problem

    - by paul
    Hello all. I'm writing a script with the mechanize.Browser module. Everything else works, but submitting the form with submit() does not. Looking through the HTML source, I found this suspicious part:

        <form method="post" onsubmit="return loginCheck(this)" name="FRMLOGIN"/>

    I think loginCheck(this) is causing the problem when the form is submitted. How can I handle this kind of JavaScript function with mechanize, so that I can submit the form and receive the result? My current script is below. Any help is much appreciated!

        # -*- coding: cp949 -*-
        import sys, os
        import mechanize, urllib
        import cookielib
        from BeautifulSoup import BeautifulSoup, BeautifulStoneSoup, Tag
        import datetime, time, socket
        import re

        br = mechanize.Browser()
        cj = cookielib.LWPCookieJar()
        br.set_cookiejar(cj)

        # Browser options
        br.set_handle_equiv(True)
        br.set_handle_gzip(True)
        br.set_handle_redirect(True)
        br.set_handle_referer(True)
        br.set_handle_robots(False)

        # Follows refresh 0 but does not hang on refresh > 0
        br.set_handle_refresh(mechanize._http.HTTPRefreshProcessor(), max_time=1)

        # Want debugging messages?
        br.set_debug_http(True)
        br.set_debug_redirects(True)
        br.set_debug_responses(True)

        # User-Agent (this is cheating, ok?)
        br.addheaders = [('User-agent', 'Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.6')]

        br.open('http://user.buddybuddy.co.kr/Login/LoginForm.asp?URL=')
        html = br.response().read()
        print html

        br.select_form(name='FRMLOGIN')
        print br.viewing_html()
        br.form['ID'] = 'zero1zero2'
        br.form['PWD'] = '012045'
        br.submit()
        print br.response().read()

    Read the article

  • Python and mechanize login script

    - by Perun
    Hi fellow programmers! I am trying to write a script to log into my university's "food balance" page using Python and the mechanize module. This is the page I am trying to log into: http://www.wcu.edu/11407.asp

    The website has the following login form:

        <FORM method=post action=https://itapp.wcu.edu/BanAuthRedirector/Default.aspx>
        <INPUT value=https://cf.wcu.edu/busafrs/catcard/idsearch.cfm type=hidden name=wcuirs_uri>
        <P><B>WCU ID Number<BR></B><INPUT maxLength=12 size=12 type=password name=id> </P>
        <P><B>PIN<BR></B><INPUT maxLength=20 type=password name=PIN> </P>
        <P></P>
        <P><INPUT value="Request Access" type=submit name=submit> </P></FORM>

    From this we know that I need to fill in the following fields:

    1. name=id
    2. name=PIN

    With the action: action=https://itapp.wcu.edu/BanAuthRedirector/Default.aspx

    This is the script I have written thus far:

        #!/usr/bin/python2 -W ignore
        import mechanize, cookielib
        from time import sleep

        url = 'http://www.wcu.edu/11407.asp'
        myId = '11111111111'
        myPin = '22222222222'

        # Browser
        #br = mechanize.Browser()
        #br = mechanize.Browser(factory=mechanize.DefaultFactory(i_want_broken_xhtml_support=True))
        br = mechanize.Browser(factory=mechanize.RobustFactory())  # Use this because of bad html tags in the html...

        # Cookie Jar
        cj = cookielib.LWPCookieJar()
        br.set_cookiejar(cj)

        # Browser options
        br.set_handle_equiv(True)
        br.set_handle_gzip(True)
        br.set_handle_redirect(True)
        br.set_handle_referer(True)
        br.set_handle_robots(False)

        # Follows refresh 0 but does not hang on refresh > 0
        br.set_handle_refresh(mechanize._http.HTTPRefreshProcessor(), max_time=1)

        # User-Agent (fake agent to google-chrome linux x86_64)
        br.addheaders = [('User-agent','Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11'),
                         ('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'),
                         ('Accept-Encoding', 'gzip,deflate,sdch'),
                         ('Accept-Language', 'en-US,en;q=0.8'),
                         ('Accept-Charset', 'ISO-8859-1,utf-8;q=0.7,*;q=0.3')]

        # The site we will navigate into
        br.open(url)

        # Go through all the forms (for debugging only)
        for f in br.forms():
            print f

        # Select the login form (index two)
        br.select_form(nr=2)

        # User credentials
        br.form['id'] = myId
        br.form['PIN'] = myPin
        br.form.action = 'https://itapp.wcu.edu/BanAuthRedirector/Default.aspx'

        # Login
        br.submit()

        # Wait 10 seconds
        sleep(10)

        # Save to a file
        f = file('mycatpage.html', 'w')
        f.write(br.response().read())
        f.close()

    Now the problem: for some odd reason the page I get back (in mycatpage.html) is the login page and not the expected page that displays my "cat cash balance" and "number of block meals" left. Does anyone have any idea why? Keep in mind that everything is correct with the header files, and while the id and pass are not really 111111111 and 222222222, the correct values do work with the website (using a browser). Thanks in advance.

    EDIT: Another script I tried:

        from urllib import urlopen, urlencode
        import urllib2
        import httplib

        url = 'https://itapp.wcu.edu/BanAuthRedirector/Default.aspx'
        myId = 'xxxxxxxx'
        myPin = 'xxxxxxxx'

        data = {
            'id': myId,
            'PIN': myPin,
            'submit': 'Request Access',
            'wcuirs_uri': 'https://cf.wcu.edu/busafrs/catcard/idsearch.cfm'
        }

        opener = urllib2.build_opener()
        opener.addheaders = [('User-agent','Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11'),
                             ('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'),
                             ('Accept-Encoding', 'gzip,deflate,sdch'),
                             ('Accept-Language', 'en-US,en;q=0.8'),
                             ('Accept-Charset', 'ISO-8859-1,utf-8;q=0.7,*;q=0.3')]

        request = urllib2.Request(url, urlencode(data))
        # write the response body, not the response object
        open("mycatpage.html", 'w').write(opener.open(request).read())

    This has the same behavior...
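
    For readers on Python 3, the direct POST described by the login form can be sketched with the standard library alone. This is a minimal illustration, not a fix for the login problem: `build_login_request` is a hypothetical helper, the credentials are the placeholders from the post, and the request is only built, never sent. The Accept-Encoding header is left out on purpose, because urllib does not decompress gzip responses on its own.

```python
from urllib.parse import urlencode
from urllib.request import Request

LOGIN_URL = "https://itapp.wcu.edu/BanAuthRedirector/Default.aspx"


def build_login_request(wcu_id, pin):
    """Build (but do not send) the POST request described by the login form."""
    fields = {
        "wcuirs_uri": "https://cf.wcu.edu/busafrs/catcard/idsearch.cfm",
        "id": wcu_id,
        "PIN": pin,
        "submit": "Request Access",
    }
    body = urlencode(fields).encode("ascii")  # POST data must be bytes in Python 3
    return Request(LOGIN_URL, data=body, headers={"User-Agent": "Mozilla/5.0"})


request = build_login_request("xxxxxxxx", "xxxxxxxx")
print(request.get_method(), request.full_url)
print(request.data.decode("ascii"))
```

    Sending it would look roughly like `urllib.request.build_opener(urllib.request.HTTPCookieProcessor()).open(request).read()`; the cookie processor is an assumption here, on the guess that the redirect chain relies on cookies.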

    Read the article

  • Can python mechanize handle HTTP auth?

    - by Shekhar
    Mechanize (Python) fails with a 401 when I try to open URLs protected by HTTP digest authentication. I googled and tried debugging, but with no success. My code looks like this:

        import mechanize

        project = "test"
        baseurl = "http://trac.somewhere.net"
        loginurl = "%s/%s/login" % (baseurl, project)

        b = mechanize.Browser()
        b.add_password(baseurl, "user", "secret", "some Realm")
        b.open(loginurl)
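
    One thing worth ruling out with stored credentials is a realm string that does not match what the server sends. The sketch below uses the standard library's digest handler rather than mechanize, and registers the credentials under the default realm (None) so they apply whatever realm is announced. `build_digest_opener` is a hypothetical helper; the URL and credentials are the placeholders from the post, and nothing is fetched.

```python
import urllib.request

BASE_URL = "http://trac.somewhere.net"


def build_digest_opener(base_url, user, password):
    """Return an opener with digest auth plus its password manager."""
    manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # Realm None means: use these credentials for any realm under base_url.
    manager.add_password(None, base_url, user, password)
    handler = urllib.request.HTTPDigestAuthHandler(manager)
    return urllib.request.build_opener(handler), manager


opener, password_mgr = build_digest_opener(BASE_URL, "user", "secret")

# Check which credentials would be offered for the login URL and realm.
found = password_mgr.find_user_password("some Realm", BASE_URL + "/test/login")
print(found)
```

    Whether the 401 in the post comes from a realm mismatch is not known; the lookup at the end only shows that the credentials are found for a URL below the base URL.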

    Read the article

  • WWW::Mechanize trouble with meta refresh from bank login

    - by J Miller
    I am trying to use perl's WWW::Mechanize to login to my bank and pull transaction information. After logging in through a browser to my bank (Wells Fargo), it briefly displays a temporary web page saying something along the lines of "please wait while we verify your identity". After a few seconds it proceeds to the bank's webpage where I can get my bank data. The only difference is that the URL contains several more "GET" parameters appended to the URL of the temporary page, which only had a sessionID parameter.

    I was able to successfully get WWW::Mechanize to login from the login page, but it gets stuck on the temporary page. There is a <meta http-equiv="Refresh"... tag in the header, so I tried $mech->follow_meta_redirect but it didn't get me past that temporary page either. Any help to get past this would be appreciated. Thanks in advance.

    Here is the barebones code that gets me stuck at the temporary page:

        #!/usr/bin/perl -w
        use strict;
        use WWW::Mechanize;

        my $mech = WWW::Mechanize->new();
        $mech->agent_alias( 'Linux Mozilla' );
        $mech->get( "https://www.wellsfargo.com" );
        $mech->submit_form(
            form_number => 2,
            fields      => { userid => "$userid", password => "$password" },
            button      => "btnSignon"
        );
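
    To make the meta-refresh mechanism concrete, here is a small sketch (in Python rather than Perl) of how a refresh tag can be read and resolved to the next URL. `MetaRefreshParser` and `find_meta_refresh` are hypothetical names and the sample page and URLs are made up; if the real page continues through JavaScript instead of the tag, this approach would not help.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class MetaRefreshParser(HTMLParser):
    """Record the first <meta http-equiv="refresh"> tag as (delay, url)."""

    def __init__(self):
        super().__init__()
        self.refresh = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag != "meta" or self.refresh is not None:
            return
        if (attrs.get("http-equiv") or "").lower() != "refresh":
            return
        delay, _, rest = (attrs.get("content") or "").partition(";")
        rest = rest.strip()
        url = None
        if rest.lower().startswith("url="):
            url = rest[4:].strip().strip("'\"")
        try:
            seconds = float(delay.strip() or 0)
        except ValueError:
            seconds = 0.0
        self.refresh = (seconds, url)


def find_meta_refresh(html, base_url):
    """Return (delay_seconds, absolute_url), or None when no refresh tag exists."""
    parser = MetaRefreshParser()
    parser.feed(html)
    if parser.refresh is None:
        return None
    delay, url = parser.refresh
    target = urljoin(base_url, url) if url else base_url
    return delay, target


page = '<html><head><meta http-equiv="Refresh" content="3; URL=/verify?sessionID=abc"></head></html>'
result = find_meta_refresh(page, "https://bank.example/login")
print(result)
```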

    Read the article

  • How to set the mechanize page encoding?

    - by Juan Medín
    Hi, I'm trying to get a page with ISO-8859-1 encoding by clicking on a link, so the code is similar to this:

        page_result = page.link_with( :text => 'link_text' ).click

    So far I get the result with the wrong encoding, so I see characters like 'T?tulo:' instead of 'Título:'. I've tried several approaches, including:

    Stating the encoding in the first request using the agent:

        @page_search = @agent.get( :url => 'http://www.server.com', :headers => { 'Accept-Charset' => 'ISO-8859-1' } )

    Stating the encoding for the page itself:

        page_result.encoding = 'ISO-8859-1'

    But I must be doing something wrong: a simple puts always shows the wrong characters. Do you know how to state the encoding? Thanks in advance.

    Added: executable example:

        require 'rubygems'
        require 'mechanize'

        WWW::Mechanize::Util::CODE_DIC[:SJIS] = "ISO-8859-1"
        @agent = WWW::Mechanize.new
        @page = @agent.get( :url => 'http://www.mcu.es/webISBN/tituloSimpleFilter.do?cache=init&layout=busquedaisbn&language=es', :headers => { 'Accept-Charset' => 'utf-8' } )
        puts @page.body
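
    The symptom described (a placeholder where 'í' should be) is what decoding ISO-8859-1 bytes with a different charset typically produces. The following sketch shows the effect in Python rather than Ruby; it illustrates the encoding mismatch only and makes no claim about how Mechanize picks a page's encoding.

```python
# Bytes as a server using ISO-8859-1 would send them.
raw = "Título:".encode("iso-8859-1")

# Decoding with the wrong charset loses the accented character.
wrong = raw.decode("utf-8", errors="replace")

# Decoding with the charset the page actually uses restores it.
right = raw.decode("iso-8859-1")

print(repr(wrong))
print(repr(right))
```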

    Read the article

  • Mechanize Submit Form Error: Insufficient items with name '10427'

    - by maneh
    I'm trying to submit a form with Mechanize. I have tried different ways, but the problem persists. Can anyone help me with this? Thank you in advance!

    This is the form I want to submit: http://www.stpairways.st/

    This is the code that I'm using:

        def stp_airways(url):
            import re
            import mechanize

            br = mechanize.Browser()
            br.open(url)
            print br.title()
            br.select_form(name = "frmbook")
            br.form['TypeTrajet'] = ["1"]
            br.form['id_depart'] = ["11967"]
            br.form['id_arrivee'] = ["10427"]
            br.form['txtDateAller'] = "5/7/2014"
            br.form['txtDateRetour'] = "12/7/2014"
            br.form['TypePassager1u1000r0b1'] = ["1"]
            br.form['TypePassager2u1000r0b1'] = ["0"]
            br.form['TypePassager3u1000r0b1'] = ["0"]
            br.form['CodeIsoDeviseClient'] = ["17,20,23,24,25,26,27,28,29,30,31,33,34,36,37,64,65,67,68,70,73,80,81,95,96,103,147,151,152,159,160,162,169,170TP1TPF"]
            br.form['CodeIsoDeviseClient'] = ["EUR"]

            # submit
            response1 = br.submit()
            print response1.read()

    Read the article

  • WWW::Mechanize for Objective C / iPhone?

    - by dan
    Hi, I want to port a Python app that uses mechanize to the iPhone. The app needs to log in to a webpage and use the site cookie to go to other pages on that site to get some data. In my Python app I was using mechanize for automatic cookie management. Is there something similar for Objective-C that is portable to the iPhone? Thanks for any help.

    Read the article

  • Mechanize on HTTPS site.

    - by Grzegorz Kazulak
    Have any of you used Ruby's Mechanize library on a site that requires SSL? The problem I'm experiencing at the moment is that when I try to access such a website, Mechanize tries to use the standard HTTP protocol, which results in endless redirections between http:// and https://

    Read the article

  • Mechanize complex form input name

    - by ADAM
    Hi there, I am trying to access a form field in Mechanize that has ugly characters in its name, similar to this:

        agent = Mechanize.new
        page = agent.get('http://domain.com')
        form = page.forms[0]
        form.ct600$Main$LastNameTextBox = "whatever"
        page = agent.submit(form)

    The problem is that the $ in the HTML name is messing with Ruby. Is there another method I could use, e.g.:

        form.element_by_name("ct600$Main$LastNameTextBox") = "whatever"

    Unfortunately I can't change the HTML.

    Read the article

  • Ruby Mechanize - Basic Get Failing

    - by hutch
        a = WWW::Mechanize.new { |agent|
          agent.user_agent_alias = 'Mac Safari'
          agent.history.max_size = 0
        }
        page = a.get('http://livingsocial.com/deals?preferred_city=18')

    I'm trying a very basic GET request using Mechanize but get a 500, yet when I use curl I have no problems. Is there a problem with including parameters in a get() call? I know I am missing something simple.

    Read the article

  • WWW::Mechanize-Question

    - by sid_com
    Hello! Are both of these versions OK, or is one of them preferable?

        #!/usr/bin/env perl
        use strict;
        use warnings;
        use feature 'say';    # say() is not enabled by default
        use WWW::Mechanize;

        my $mech = WWW::Mechanize->new();
        my $content;

        # 1
        $mech->get( 'http://www.kernel.org' );
        $content = $mech->content;
        say $content;

        # 2
        my $res = $mech->get( 'http://www.kernel.org' );
        $content = $res->content;
        say $content;

    Read the article

  • Using Python and Mechanize with ASP Forms

    - by tchaymore
    I'm trying to submit a form on an .asp page but Mechanize does not recognize the name of the control. The form code is:

        <form id="form1" name="frmSearchQuick" method="post">
        ....
        <input type="button" name="btSearchTop" value="SEARCH" class="buttonctl" onClick="uf_Browse('dledir_search_quick.asp');" >

    My code is as follows:

        br = mechanize.Browser()
        br.open(BASE_URL)
        br.select_form(name='frmSearchQuick')
        resp = br.click(name='btSearchTop')

    I've also tried the last line as:

        resp = br.submit(name='btSearchTop')

    The error I get is:

        raise ControlNotFoundError("no control matching "+description)
        ControlNotFoundError: no control matching name 'btSearchTop', kind 'clickable'

    If I print br I get this:

        IgnoreControl(btSearchTop=)

    But I don't see that anywhere in the HTML. Any advice on how to submit this form?

    Read the article

  • How to set timeout with python-mechanize?

    - by Michal Cihar
    I'm using python-mechanize to scrape some web sites, which sometimes simply don't respond to requests. These requests stay open too long, so I need to limit the timeout for them. With the urlopen method, the timeout can be set using the timeout parameter, but I have not found an easy way to do it with the high-level API, such as the submit or click methods. Ideally the timeout would be set just once for the whole browser class and all calls would honor it. It would probably be possible to customize this by passing a custom request_class to every click and submit call, but this would just pollute the code, so I'm looking for a nicer solution for setting the timeout for mechanize's browser class (and no, I don't want to change the default socket timeout using socket.setdefaulttimeout).
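
    The "set it once, honor it everywhere" idea can be sketched with the standard library's opener class: override open() so that a default timeout is filled in whenever the caller gives none. This is plain urllib, not mechanize, and whether the same override carries over to mechanize's Browser is not verified here. `DefaultTimeoutOpener`, `EchoTimeoutHandler` and the 'demo' scheme are hypothetical and exist only so the example runs without network access.

```python
import io
import urllib.request


class DefaultTimeoutOpener(urllib.request.OpenerDirector):
    """OpenerDirector that applies one default timeout to every open() call."""

    def __init__(self, default_timeout):
        super().__init__()
        self.default_timeout = default_timeout

    def open(self, fullurl, data=None, timeout=None):
        # Fall back to the opener-wide default when no timeout is given.
        if timeout is None:
            timeout = self.default_timeout
        return super().open(fullurl, data, timeout)


class EchoTimeoutHandler(urllib.request.BaseHandler):
    """Stand-in handler for a made-up 'demo' scheme; reports the timeout it saw."""

    def demo_open(self, req):
        return io.BytesIO(str(req.timeout).encode("ascii"))


opener = DefaultTimeoutOpener(default_timeout=5.0)
opener.add_handler(EchoTimeoutHandler())

seen_default = opener.open("demo://host/page").read().decode("ascii")
seen_override = opener.open("demo://host/page", timeout=1.5).read().decode("ascii")
print(seen_default, seen_override)
```

    The design choice is to keep the per-call timeout parameter working, so individual slow endpoints can still be given more or less time than the default.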

    Read the article

  • Twill/Mechanize access to html content...

    - by Shaheeb Roshan
    Hello! A couple of questions regarding Twill and Mechanize:

    1) Is Twill still relevant as a web-automation tool? If yes, then why is it not currently maintained? If no, has Mechanize matured further to support Twill-style simple scripting? Or is there another package that has stepped up to fill the gap?

    2) I was able to very quickly set up a couple of test suites in Python using Twill, but I'm a little confused about how to access the information that Twill spits out in my Python program. That is, I can do showforms() and see the form values neatly listed, and I can use fv to update the form values and submit. But how do I access one of those form values as a Python variable? How can I say something like:

        someField1Value = fv("1","someField1")

    Thanks! Shaheeb R.

    Read the article

  • process all links but external ones (ruby + mechanize)

    - by Radek
    I want to process all links on a whole web site except the external ones. Is there an easy way to identify that a link is external and skip it? My code so far looks like this (the site URL is passed as a command-line argument):

        require 'mechanize'

        def process_page(page)
          puts
          puts page.title
          STDIN.gets
          page.links.each do |link|
            process_page($agent.get(link.href))
          end
        end

        $agent = WWW::Mechanize.new
        $agent.user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.1.4) Gecko/20091016 Firefox/3.5.4'
        process_page($agent.get(ARGV[0]))
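
    One common way to tell internal from external links is to resolve each href against the page URL and compare hosts. The sketch below shows that idea in Python rather than Ruby; `is_internal` and `internal_links` are hypothetical helpers and the URLs are examples. It also tracks visited URLs, since a recursive crawl without that would revisit pages indefinitely.

```python
from urllib.parse import urljoin, urlsplit


def is_internal(base_url, href):
    """Return True when href, resolved against base_url, stays on the same host."""
    target = urlsplit(urljoin(base_url, href))
    if target.scheme not in ("http", "https"):
        return False  # skips mailto:, javascript: and similar links
    return target.hostname == urlsplit(base_url).hostname


def internal_links(base_url, hrefs, visited=None):
    """Return absolute internal URLs not seen before, recording them in visited."""
    visited = set() if visited is None else visited
    result = []
    for href in hrefs:
        absolute = urljoin(base_url, href)
        if is_internal(base_url, href) and absolute not in visited:
            visited.add(absolute)
            result.append(absolute)
    return result


links = internal_links(
    "http://example.com/index.html",
    ["/about", "contact.html", "http://other.org/x", "mailto:me@example.com", "/about"],
)
print(links)
```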

    Read the article

  • Python Mechanize unable to avoid redirect when Post

    - by Enric Geijo
    I am trying to crawl a site using mechanize. The site provides search results in different pages. When posting to get the next set of results, something is wrong and the server redirects me to the first page, asking mechanize to update the SearchSession Cookie. I have been debugging the requests using Firefox and they look quite the same, and I am unable to find the problem. Any suggestion? Below the requests: ----------- FIRST THE RIGHT SEQUENCE, USING TAMPER IN FIREFOX ------------------------- POST XXX/JobSearch/Results.aspx?Keywords=Python&LTxt=London%2c+South+East&Radius=0&LIds2=ZV&clid=1621&cltypeid=2&clName=London Load Flags[LOAD_DOCUMENT_URI LOAD_INITIAL_DOCUMENT_URI ] Content Size[-1] Mime Type[text/html] Request Headers: Host[www.cwjobs.co.uk] User-Agent[Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100401 Ubuntu/9.10 (karmic) Firefox/3.5.9] Accept[text/html,application/xhtml+xml,application/xml;q=0.9,/;q=0.8] Accept-Language[en-us,en;q=0.5] Accept-Encoding[gzip,deflate] Accept-Charset[ISO-8859-1,utf-8;q=0.7,*;q=0.7] Keep-Alive[300] Connection[keep-alive] Referer[XXX/JobSearch/Results.aspx?Keywords=Python&LTxt=London%2c+South+East&Radius=0&LIds2=ZV&clid=1621&cltypeid=2&clName=London] Cookie[ecos=774803468-0; AnonymousUser=MemberId=acc079dd-66b6-4081-9b07-60d6955ee8bf&IsAnonymous=True; PJBIPPOPUP=; WT_FPC=id=86.181.183.106-2262469600.30073025:lv=1272812851736:ss=1272812789362; SearchSession=SessionGuid=71de63de-3bd0-4787-895d-b6b9e7c93801&LogSource=NAT] Post Data: __EVENTTARGET[srpPager%24btnForward] __EVENTARGUMENT[] 
hdnSearchResults[BV%2CA%2CC0P5x%2COou-%2CB4S-%2CBuC-%2CDzx-%2CHwn-%2CKPP-%2CIVA-%2CC9D-%2CH6X-%2CH7x-%2CJ0x-%2CCvX-%2CCra-%2COHa-%2CHhP-%2CCoj-%2CBlM-%2CE9W-%2CIm8-%2CBqG-%2CPFy-%2CN%2Fm-%2Ceaa%2CCvj-%2CCtJ-%2CCr7-%2CBpu-%2Cmh%2CMb6-%2CJ%2Fk-%2CHY8-%2COJ7-%2CNtF-%2CEya-%2CErT-%2CEo4-%2CEKU-%2CDnL-%2CC5M-%2CCyB-%2CBsD-%2CBrc-%2CBpU-%2Col%2C30%2CC1%2Cd4N%2COo8-%2COi0-%2CLz%2F-%2CLxP-%2CFyp-%2CFVR-%2CEHL-%2CPrP-%2CLmE-%2CK3H-%2CKXJ-%2CFyn%2CIcq-%2CIco-%2CIK4-%2CIIg-%2CH2k-%2CH0N-%2CHwp-%2CHvF-%2CFij-%2CFhl-%2CCwj-%2CCb5-%2CCQj-%2CCQh-%2CB%2B2-%2CBc6-%2ChFo%2CNLq-%2CNI%2F-%2CFzM-%2Cdu-%2CHg2-%2CBug-%2CBse-%2CB9Q-] __VIEWSTATE[%2FwEPDwUKLTkyMzI2ODA4Ng9kFgYCBA8WBB4EaHJlZgWJAWh0dHA6Ly93d3cuY3dqb2JzLmNvLnVrL0pvYlNlYXJjaC9SU1MuYXNweD9LZXl3b3Jkcz1QeXRob24mTFR4dD1Mb25kb24lMmMrU291dGgrRWFzdCZSYWRpdXM9MCZMSWRzMj1aViZjbGlkPTE2MjEmY2x0eXBlaWQ9MiZjbE5hbWU9TG9uZG9uHgV0aXRsZQUkTGF0ZXN0IFB5dGhvbiBqb2JzIGZyb20gQ1dKb2JzLmNvLnVrZAIGDxYCHgRUZXh0BV48bGluayByZWw9ImNhbm9uaWNhbCIgaHJlZj0iaHR0cDovL3d3dy5jd2pvYnMuY28udWsvSm9iU2Vla2luZy9QeXRob25fTG9uZG9uX2wxNjIxX3QyLmh0bWwiIC8%2BZAIIEGRkFg4CBw8WAh8CBV9Zb3VyIHNlYXJjaCBvbiA8Yj5LZXl3b3JkczogUHl0aG9uOyBMb2NhdGlvbjogTG9uZG9uLCBTb3V0aCBFYXN0OyA8L2I%2BIHJldHVybmVkIDxiPjg1PC9iPiBqb2JzLmQCCQ8WAh4HVmlzaWJsZWhkAgsPFgIfAgUoVGhlIG1vc3QgcmVsZXZhbnQgam9icyBhcmUgbGlzdGVkIGZpcnN0LmQCEw8PFgIeC05hdmlnYXRlVXJsBQF%2BZGQCFQ9kFgYCBQ8PFgYfAgUGUHl0aG9uHgtEZWZhdWx0VGV4dAUMZS5nLiBhbmFseXN0HhNEZWZhdWx0VGV4dENzc0NsYXNzZWRkAgsPDxYGHwIFEkxvbmRvbiwgU291dGggRWFzdB8FBQllLmcuIEJhdGgfBmVkZAIRDxAPFgYeDURhdGFUZXh0RmllbGQFClJhZGl1c05hbWUeDkRhdGFWYWx1ZUZpZWxkBQZSYWRpdXMeC18hRGF0YUJvdW5kZ2QQFREHMCBtaWxlcwcyIG1pbGVzBzUgbWlsZXMIMTAgbWlsZXMIMTUgbWlsZXMIMjAgbWlsZXMIMjUgbWlsZXMIMzAgbWlsZXMIMzUgbWlsZXMINDAgbWlsZXMINDUgbWlsZXMINTAgbWlsZXMINjAgbWlsZXMINzAgbWlsZXMIODAgbWlsZXMIOTAgbWlsZXMJMTAwIG1pbGVzFREBMAEyATUCMTACMTUCMjACMjUCMzACMzUCNDACNDUCNTACNjACNzACODACOTADMTAwFCsDEWdnZ2dnZ2dnZ2dnZ2dnZ2dnZGQCFw9kFgQCAQ9kFgQCBA8QZA8WA2YCAQICFgMQBQhBbGwgam9icwUBMGcQBRlEaXJlY3QgZW1wbG95ZXIgam9icyBvbmx5BQEyZx
AFEEFnZW5jeSBqb2JzIG9ubHkFATFnZGQCBg8QZA8WA2YCAQICFgMQBQlSZWxldmFuY2UFATFnEAUERGF0ZQUBMmcQBQZTYWxhcnkFATNnZGQCBQ8PFgYeClBhZ2VOdW1iZXICAh4PTnVtYmVyT2ZSZXN1bHRzAlUeDlJlc3VsdHNQZXJQYWdlAhRkZAIZDxYCHwNoZGQ%3D] Refinesearch%24txtKeywords[Python] Refinesearch%24txtLocation[London%2C+South+East] Refinesearch%24ddlRadius[0] ddlCompanyType[0] ddlSort[1] Response Headers: Cache-Control[private] Date[Sun, 02 May 2010 16:09:27 GMT] Content-Type[text/html; charset=utf-8] Expires[Sat, 02 May 2009 16:09:27 GMT] Server[Microsoft-IIS/6.0] X-SiteConHost[P310] X-Powered-By[ASP.NET] X-AspNet-Version[2.0.50727] Set-Cookie[SearchSession=SessionGuid=71de63de-3bd0-4787-895d-b6b9e7c93801&LogSource=NAT; path=/] Content-Encoding[gzip] Vary[Accept-Encoding] Transfer-Encoding[chunked] -------- NOW WHAT I'AM SENDING USING MECHANIZE, SOME HEADERS ADDED, ETC ----------- POST /JobSearch/Results.aspx?Keywords=Python&LTxt=London%2c+South+East&Radius=0&LIds2=ZV&clid=1621&cltypeid=2&clName=London HTTP/1.1\r\nContent-Length: 2424\r\n Accept-Language: en-us,en;q=0.5\r\n Accept-Encoding: gzip\r\n Host: www.cwjobs.co.uk\r\n Accept: text/html,application/xhtml+xml,application/xml;q=0.9,/;q=0.8\r\n Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\n Connection: keep-alive\r\n Cookie: AnonymousUser=MemberId=8fa5ddd7-17ed-425e-b189-82693bfbaa0c&IsAnonymous=True; SearchSession=SessionGuid=33e4e439-c2d6-423f-900f-574099310d5a&LogSource=NAT\r\n Referer: XXX/JobSearch/Results.aspx?Keywords=Python&LTxt=London%2c+South+East&Radius=0&LIds2=ZV&clid=1621&cltypeid=2&clName=London\r\n Content-Type: application/x-www-form-urlencoded\r\n\r\n' '__EVENTTARGET=srpPager%24btnForward& __EVENTARGUMENT=& 
hdnSearchResults=BV%2CA%2CC0eif%2CMwc%2CM6s%2COou%2CK09%2CG4H%2CEZf%2CGTu%2CLrr%2CGuX%2CGs9%2CEz9%2CL5X%2CL9U%2ChU%2CHHf%2CMAL%2CNDi%2CJrY%2CGBy%2CM%2Bo%2CdE-%2CpI%2CtDI%2CL5L%2CL7l%2CL8z%2CM%2FA%2CPPP%2CCM0%2CEpK%2CHPy%2Cez%2C7p%2CJ2U%2CJ9b%2CJ%2F2%2CKea%2CLBj%2CLvi%2CL2t%2CM8r%2CM9S%2CM%2Fa%2CPRT%2CPgi%2Csg7%2CF6%2CI2F%2CJTd%2CO-%2CC0v%2CC3f%2CDCq%2CDxn%2CERl%2CUbV%2CGME%2CGMG%2CGd2%2CGgO%2CGyK%2CG0h%2CG4F%2CG5p%2CJGL%2CJHJ%2CKhj%2CL4L%2CMM1%2CMYL%2CMYN%2CMp4%2CNL0%2COrj%2CvuW%2CBdE%2CBfv%2CI1i%2CBCh-%2COLA%2CHH4%2CM6O%2CM8Q%2CMre& __VIEWSTATE=%2FwEPDwUKLTkyMzI2ODA4Ng9kFgYCBA8WBB4EaHJlZgWJAWh0dHA6Ly93d3cuY3dqb2JzLmNvLnVrL0pvYlNlYXJjaC9SU1MuYXNweD9LZXl3b3Jkcz1QeXRob24mTFR4dD1Mb25kb24lMmMrU291dGgrRWFzdCZSYWRpdXM9MCZMSWRzMj1aViZjbGlkPTE2MjEmY2x0eXBlaWQ9MiZjbE5hbWU9TG9uZG9uHgV0aXRsZQUkTGF0ZXN0IFB5dGhvbiBqb2JzIGZyb20gQ1dKb2JzLmNvLnVrZAIGDxYCHgRUZXh0BV48bGluayByZWw9ImNhbm9uaWNhbCIgaHJlZj0iaHR0cDovL3d3dy5jd2pvYnMuY28udWsvSm9iU2Vla2luZy9QeXRob25fTG9uZG9uX2wxNjIxX3QyLmh0bWwiIC8%2BZAIIEGRkFg4CBw8WAh8CBV9Zb3VyIHNlYXJjaCBvbiA8Yj5LZXl3b3JkczogUHl0aG9uOyBMb2NhdGlvbjogTG9uZG9uLCBTb3V0aCBFYXN0OyA8L2I%2BIHJldHVybmVkIDxiPjg1PC9iPiBqb2JzLmQCCQ8WAh4HVmlzaWJsZWhkAgsPFgIfAgUoVGhlIG1vc3QgcmVsZXZhbnQgam9icyBhcmUgbGlzdGVkIGZpcnN0LmQCEw8PFgIeC05hdmlnYXRlVXJsBQF%2BZGQCFQ9kFgYCBQ8PFgYfAgUGUHl0aG9uHgtEZWZhdWx0VGV4dAUMZS5nLiBhbmFseXN0HhNEZWZhdWx0VGV4dENzc0NsYXNzZWRkAgsPDxYGHwIFEkxvbmRvbiwgU291dGggRWFzdB8FBQllLmcuIEJhdGgfBmVkZAIRDxAPFgYeDURhdGFUZXh0RmllbGQFClJhZGl1c05hbWUeDkRhdGFWYWx1ZUZpZWxkBQZSYWRpdXMeC18hRGF0YUJvdW5kZ2QQFREHMCBtaWxlcwcyIG1pbGVzBzUgbWlsZXMIMTAgbWlsZXMIMTUgbWlsZXMIMjAgbWlsZXMIMjUgbWlsZXMIMzAgbWlsZXMIMzUgbWlsZXMINDAgbWlsZXMINDUgbWlsZXMINTAgbWlsZXMINjAgbWlsZXMINzAgbWlsZXMIODAgbWlsZXMIOTAgbWlsZXMJMTAwIG1pbGVzFREBMAEyATUCMTACMTUCMjACMjUCMzACMzUCNDACNDUCNTACNjACNzACODACOTADMTAwFCsDEWdnZ2dnZ2dnZ2dnZ2dnZ2dnZGQCFw9kFgQCAQ9kFgQCBA8QZA8WA2YCAQICFgMQBQhBbGwgam9icwUBMGcQBRlEaXJlY3QgZW1wbG95ZXIgam9icyBvbmx5BQEyZxAFEEFnZW5jeSBqb2JzIG9ubHkFATFnZGQCBg8QZA8WA2YCAQICFgMQBQlSZWxldmFuY2UFATFnEAUE
RGF0ZQUBMmcQBQZTYWxhcnkFATNnZGQCBQ8PFgYeClBhZ2VOdW1iZXICAR4PTnVtYmVyT2ZSZXN1bHRzAlUeDlJlc3VsdHNQZXJQYWdlAhRkZAIZDxYCHwNoZGQ%3D& Refinesearch%24txtKeywords=Python& Refinesearch%24txtLocation=London%2CSouth+East& Refinesearch%24ddlRadius=0& Refinesearch%24btnSearch=Search& ddlCompanyType=0& ddlSort=1'

    Read the article

  • WWW::Mechanize Perl login only works after relaunch

    - by Klaus
    Hello, I'm trying to log in automatically to a website using Perl WWW::Mechanize. What I do is:

        $bot = WWW::Mechanize->new();
        $bot->cookie_jar(
            HTTP::Cookies->new(
                file           => "cookies.txt",
                autosave       => 1,
                ignore_discard => 1,
            )
        );
        $response = $bot->get( 'http://blah.foo/login' );
        $bot->form_number(1);
        $bot->field( usern => 'user' );
        $bot->field( pass => 'pass' );
        $response = $bot->click();
        print $response->content();
        $response = $bot->get( 'http://blah.foo' );
        print $response->content();

    The login works, but when I load the page it tells me that I am not connected. As you can see, I store cookies in a file. Now if I relaunch the script without the login part, it says that I am connected. Does anyone understand this strange behaviour?

    Read the article

  • Get Mechanize to handle cookies from an arbitrary POST (to log into a website programmatically)

    - by Horace Loeb
    I want to log into https://www.t-mobile.com/ programmatically. My first idea was to use Mechanize to submit the login form: However, it turns out that this isn't even a real form. Instead, when you click "Log in" some javascript grabs the values of the fields, creates a new form dynamically, and submits it. "Log in" button HTML: <button onclick="handleLogin(); return false;" class="btnBlue" id="myTMobile-login"><span>Log in</span></button> The handleLogin() function: function handleLogin() { if (ValidateMsisdnPassword()) { // client-side form validation logic var a = document.createElement("FORM"); a.name = "form1"; a.method = "POST"; a.action = mytmoUrl; // defined elsewhere as https://my.t-mobile.com/Login/LoginController.aspx var c = document.createElement("INPUT"); c.type = "HIDDEN"; c.value = document.getElementById("myTMobile-phone").value; // the value of the phone number input field c.name = "txtMSISDN"; a.appendChild(c); var b = document.createElement("INPUT"); b.type = "HIDDEN"; b.value = document.getElementById("myTMobile-password").value; // the value of the password input field b.name = "txtPassword"; a.appendChild(b); document.body.appendChild(a); a.submit(); return true } else { return false } } I could simulate this form submission by POSTing the form data to https://my.t-mobile.com/Login/LoginController.aspx with Net::HTTP#post_form, but I don't know how to get the resultant cookie into Mechanize so I can continue to scrape the UI available when I'm logged in. Any ideas?

    Read the article

  • beautifulsoup and mechanize to get ajax call result

    - by nabizan
    Hi, I'm building a scraper using Python 2.5 and BeautifulSoup, but I've stumbled upon a problem: part of the web page is generated after the user clicks a button, which starts an AJAX request by calling a specific JavaScript function with the proper parameters. Is there a way to simulate the user interaction and get this result? I came across the mechanize module, but it seems to me that it is mostly used to work with forms. I would appreciate any links or code samples. Thanks.

    Read the article

  • Python mechanize to follow image links?

    - by Shark
    mechanize's Browser class is great, and its follow_link() function is great too. But what should I do with links like this: <a href="http://example.com"><img src="…"></a> Is there any way to follow such links? The text attribute of this type of link is simply '[IMG]', so as far as I know there is no way to differentiate between them. Any help would be appreciated.
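
    Since the link text is the same for every image link, one option is to tell them apart by URL or by the image they wrap. The sketch below is stdlib-only and simply collects (href, img src) pairs from a page; `ImageLinkFinder` and `find_image_links` are hypothetical names and the sample markup is made up.

```python
from html.parser import HTMLParser


class ImageLinkFinder(HTMLParser):
    """Collect (href, img src) pairs for <a> elements that wrap an <img>."""

    def __init__(self):
        super().__init__()
        self._current_href = None
        self.image_links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            self._current_href = attrs.get("href")
        elif tag == "img" and self._current_href is not None:
            self.image_links.append((self._current_href, attrs.get("src")))

    def handle_endtag(self, tag):
        if tag == "a":
            self._current_href = None


def find_image_links(html):
    finder = ImageLinkFinder()
    finder.feed(html)
    return finder.image_links


sample = (
    '<a href="http://example.com"><img src="logo.png"></a>'
    '<a href="/text">plain</a><img src="loose.png">'
)
image_links = find_image_links(sample)
print(image_links)
```

    With the href in hand, the page could be opened directly; to my understanding mechanize's link lookup also accepts `url`, `url_regex` and `nr` arguments, which may be enough to pick out one image link without any extra parsing.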

    Read the article

  • Mechanize Javascript ...

    - by Horace Ho
    I try to submit a form by Mechanize, however, I am not sure how to add necessary form valuables which are done by some Javascript. Since Mechanize does not support Javascript yet, and so I try to add the variables manually. The form source: <form name="aspnetForm" method="post" action="list.aspx" language="javascript" onkeypress="javascript:return WebForm_FireDefaultButton(event, '_ctl0_ContentPlaceHolder1_cmdSearch')" id="aspnetForm"> <input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value="" /> <input type="hidden" name="__EVENTARGUMENT" id="__EVENTARGUMENT" value="" /> <input type="hidden" name="__LASTFOCUS" id="__LASTFOCUS" value="" /> <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/..." /> <script type="text/javascript"> <!-- var theForm = document.forms['aspnetForm']; if (!theForm) { theForm = document.aspnetForm; } function __doPostBack(eventTarget, eventArgument) { if (!theForm.onsubmit || (theForm.onsubmit() != false)) { theForm.__EVENTTARGET.value = eventTarget; theForm.__EVENTARGUMENT.value = eventArgument; theForm.submit(); } } // --> </script> <script language="javascript"> <!-- var _linkpostbackhit = 0; function _linkedClicked(id, key, str, a, b) { if (!b || !_linkpostbackhit) { if (!a) { __doPostBack(key, id); _linkpostbackhit = 1; } else { if (window.confirm(str)) { __doPostBack(key, id); _linkpostbackhit = 1; } } } return void(0); } // --> </script> ... <a href="JavaScript:_linkedClicked('123456','_ctl0:ContentPlaceHolder1:Link', '',0,1);">123456</a> ... </form> I tried to add the 2 variables: page.forms.first['__EVENTTARGET'] = '_ctl0:ContentPlaceHolder1:Link' page.forms.first['__EVENTARUGMENT'] = '123456' and submit the form: page.forms.first.click_button(page.forms.first.buttons.first) The result returned only (re)show the current list of links as if I have not clicked on any of the links. Any help will be appreciated. Thanks!
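
    Reading the page's JavaScript, `_linkedClicked(id, key, ...)` ends in `__doPostBack(key, id)`, which sets `__EVENTTARGET` to the key and `__EVENTARGUMENT` to the id. The sketch below derives those two fields from a link's href; it is in Python rather than Ruby, and `postback_fields` is a hypothetical helper. Note that the snippet in the post spells the second field `__EVENTARUGMENT`; if that is what the script really sets, the server would presumably never see `__EVENTARGUMENT`.

```python
import re

# Captures the first two quoted arguments of _linkedClicked(id, key, ...).
LINK_RE = re.compile(r"_linkedClicked\('([^']*)',\s*'([^']*)'")


def postback_fields(href):
    """Map a _linkedClicked(id, key, ...) href to the two hidden postback fields."""
    match = LINK_RE.search(href)
    if match is None:
        raise ValueError("not a _linkedClicked link: %r" % href)
    link_id, key = match.groups()
    # __doPostBack(key, id) in the page sets target = key and argument = id.
    return {"__EVENTTARGET": key, "__EVENTARGUMENT": link_id}


href = "JavaScript:_linkedClicked('123456','_ctl0:ContentPlaceHolder1:Link', '',0,1);"
fields = postback_fields(href)
print(fields)
```

    The page's own `__doPostBack` calls `theForm.submit()` rather than clicking a button, so submitting the form without a button value may be closer to what the browser does; that is an inference from the script shown, not something tested against the site.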

    Read the article

  • Python Mechanize select a form with no name

    - by mvid
    I am attempting to have mechanize select a form from a page, but the form in question has no "name" attribute in the HTML. What should I do? When I try to use br.select_form(name = "") I get errors saying that no form is declared with that name, and the function requires a name input. There is only one form on the page. Is there some other way I can select that form?

    Read the article
