04 Mar 2010

Python Parameter Parsing to Array

While working on a Google App Engine project in Python, I’ve needed to grab data from the POST or GET request.

I’m coming from a mostly PHP scripting background, so I might be missing something, but I will go over what I think is happening…

Raw Tools in App Engine

You can get hold of simple data easily enough by querying the request object that’s attached to self, like this:

from google.appengine.ext import webapp

class Home(webapp.RequestHandler):
  def get(self):
    x = self.request.get('x')  # read the 'x' parameter from the request

For most cases, that works. However, I have a few cases where it doesn’t. The main one is multi-dimensional arrays as GET data. My query string looks a bit like this:

?names[0][firstname]=foo&names[0][surname]=bar&names[1][firstname]=first&names[1][surname]=last

With PHP you would simply do $names = $_REQUEST['names'] and PHP works out the array for you. Not so in Python. Calling self.request.get('names') doesn’t give you the array, because no parameter is literally called names.

Python does have other helpers which are more useful, like self.request.GET.items(), which lets you loop over the entire GET data set as key/value pairs:

for key, value in self.request.GET.items():
  print key
  print value

Success, or so you’d think. However, the key is not very useful: it comes out as the full string. So the first key is names[0][firstname], which is about as useful as a chocolate teapot when I’m trying to build a multi-dimensional array.
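For what it’s worth, keys like that can be pulled apart by hand. This is only a rough sketch, not something App Engine provides: nest_params is a made-up helper name, and it assumes well-formed keys in the bracket style from the query string above. In a handler you would feed it self.request.GET.items().

```python
import re

def nest_params(pairs):
    """Fold ('names[0][firstname]', 'foo') style pairs into nested dicts.

    Hypothetical helper for illustration; assumes well-formed bracket keys.
    """
    result = {}
    for key, value in pairs:
        # 'names[0][firstname]' -> ['names', '0', 'firstname']
        parts = re.findall(r'[^\[\]]+', key)
        if not parts:
            continue  # skip empty keys
        node = result
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return result

pairs = [
    ('names[0][firstname]', 'foo'),
    ('names[0][surname]', 'bar'),
    ('names[1][firstname]', 'first'),
    ('names[1][surname]', 'last'),
]
nested = nest_params(pairs)
# nested['names']['0'] is {'firstname': 'foo', 'surname': 'bar'}
```

Note that the indexes stay as strings ('0', '1') rather than becoming list positions, so this is not quite what PHP gives you.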

In the end, I couldn’t find a built-in way to handle incoming data structured like that. Instead I had to revert to using single names like this:

?firstname=foo&surname=bar&firstname=first&surname=last

This is not something I like. What do you do if some of the fields are blank? Or if the order is jumbled, with the second surname belonging to the first firstname? It doesn’t handle either case.

The best I could do to take care of this and get a nicely formatted array of data was to parse it based on the keys I want and use self.request.get_all in a nested loop.

The Python Class

# Class to regroup repeated POST/GET parameters into a list of dicts
class ParamSort(object):
  def __init__(self, structure, request_obj):
    self.structure = structure  # parameter names to collect
    self.request = request_obj  # the handler's request (needs get_all)

  def sort(self):
    grouped = {}
    results = []
    # collect every value submitted for each key, in order
    for key in self.structure:
      grouped[key] = list(self.request.get_all(key))
    # length of the longest list (0 if no keys were given)
    longest = max([len(v) for v in grouped.values()] or [0])
    # convert the grouped lists into one dict per row
    for x in range(longest):
      tmp = {}
      for key in self.structure:
        if len(grouped[key]) > x:
          tmp[key] = grouped[key][x]
      # if everything mapped properly, save this sucker
      if len(tmp) == len(self.structure):
        results.append(tmp)
    return results


Example use:

# inside a request handler; keys match the query string above
dbkeys = ['firstname', 'surname']
sorter = ParamSort(dbkeys, self.request)
res = sorter.sort()
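The regrouping step itself doesn’t need App Engine, so it can be tried on plain lists. Here is a small sketch of the same idea using zip. The regroup function and the sample data are made up for illustration, and the grouped dict stands in for what get_all would return for each key.

```python
def regroup(structure, grouped):
    """Turn {'firstname': [...], 'surname': [...]} into a list of row dicts."""
    columns = [grouped.get(name, []) for name in structure]
    # zip stops at the shortest column, so incomplete rows are dropped
    return [dict(zip(structure, row)) for row in zip(*columns)]

# stand-in for the per-key lists collected from the request
grouped = {
    'firstname': ['foo', 'first'],
    'surname': ['bar', 'last', 'extra'],
}
rows = regroup(['firstname', 'surname'], grouped)
# rows == [{'firstname': 'foo', 'surname': 'bar'},
#          {'firstname': 'first', 'surname': 'last'}]
```

The stray 'extra' surname is dropped because it has no matching firstname, which is the same "only keep complete rows" rule the class applies.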

As I’m fairly new to Python, I might be totally wrong and missing something that everyone else knows… Please tell me if I am!

The Files

All the files that are used…

EDIT (08/06/2010): It seems I was indeed wrong. There is a utility for this built in to the cgi module (and, as of 2.6, the urlparse module) called parse_qs. It parses query strings (GET data) and groups repeated keys into lists. I’ll presume there is something similar that does the same for POST data.
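A quick sketch of what parse_qs returns for the query strings in this post. The import fallback is only there so the snippet runs on both Python 2.6+ (urlparse) and Python 3 (urllib.parse); the variable names are just for illustration.

```python
try:
    from urllib.parse import parse_qs  # Python 3
except ImportError:
    from urlparse import parse_qs      # Python 2.6+

# repeated keys are grouped into lists, in the order they appear
flat = parse_qs('firstname=foo&surname=bar&firstname=first&surname=last')
# flat == {'firstname': ['foo', 'first'], 'surname': ['bar', 'last']}

# bracketed keys are NOT unpacked; they stay as literal strings
bracketed = parse_qs('names[0][firstname]=foo&names[0][surname]=bar')
# keys are 'names[0][firstname]' and 'names[0][surname]'

# blank values are dropped unless keep_blank_values is set
blanks = parse_qs('firstname=&surname=bar', keep_blank_values=True)
# blanks == {'firstname': [''], 'surname': ['bar']}
```

So parse_qs covers the flat firstname/surname style, but it still doesn’t build PHP-style nested arrays from names[0][firstname].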
