Paginating requests to an API


I'm consuming (via urllib/urllib2) an API that returns XML results. The API always returns the total_hit_count for my query, but only lets me retrieve results in batches of, say, 100 or 1000. It requires me to specify a start_pos and end_pos as offsets in order to walk through the results.

Say the urllib request looks like "http://someservice?query='test'&start_pos=X&end_pos=Y".

If I send an initial 'taster' query with minimal data transfer, such as http://someservice?query='test'&start_pos=1&end_pos=1, to get back a result of, say, total_hits = 1234, I'd like to work out the cleanest approach to requesting those 1234 results in batches of, again say, 100 or 1000 or...
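
For concreteness, here's roughly what my taster request looks like. I'm assuming the count comes back in a <total_hit_count> element; the exact element name obviously depends on the service's schema:

import urllib2
from BeautifulSoup import BeautifulStoneSoup  # BSoup's XML-mode parser

base_url = "http://someservice?query='test'"

# 'Taster' query: ask for a single result just to learn the total hit count.
taster_xml = urllib2.urlopen(base_url + "&start_pos=1&end_pos=1").read()
soup = BeautifulStoneSoup(taster_xml)

# Assumes the count lives in a <total_hit_count> element; adjust to the real schema.
total_hits = int(soup.find('total_hit_count').string)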

This is what I came up with so far, and it seems to work, but I'd like to know if you would have done things differently or if I could improve upon this:

hits_per_page = 1000  # or 100 or 200 or whatever, adjustable
total_hits = 1234     # retrieved with BSoup from the 'taster' query
base_url = "http://someservice?query='test'"
# total_hits + 1 so the last hit isn't dropped when it falls just past a batch boundary
startdoc_positions = range(1, total_hits + 1, hits_per_page)
enddoc_positions = [start + hits_per_page - 1 for start in startdoc_positions]
for start, end in zip(startdoc_positions, enddoc_positions):
    if end > total_hits:
        end = total_hits
    print "url to request is:\n ",
    print "%s&start_pos=%s&end_pos=%s" % (base_url, start, end)
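
Eventually the loop would actually fetch each batch rather than just print the URLs; a rough sketch of that next step (no error handling or retries yet):

import urllib2

pages = []
for start, end in zip(startdoc_positions, enddoc_positions):
    end = min(end, total_hits)
    url = "%s&start_pos=%s&end_pos=%s" % (base_url, start, end)
    # Each response is one XML page of up to hits_per_page results,
    # ready to be parsed with BSoup.
    pages.append(urllib2.urlopen(url).read())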

P.S. I'm a long-time consumer of Stack Overflow, especially the Python questions, but this is my first posted question. You guys are just brilliant.
