Trouble with copying dictionaries and using deepcopy on an SQLAlchemy ORM object

Posted by Az on Stack Overflow See other posts from Stack Overflow or by Az
Published on 2010-06-03T23:13:48Z Indexed on 2010/06/06 7:32 UTC
Read the original article Hit count: 291

Hi there,

I'm doing a Simulated Annealing algorithm to optimise a given allocation of students and projects.

This is language-agnostic pseudocode from Wikipedia:

s ? s0; e ? E(s)                                // Initial state, energy.
sbest ? s; ebest ? e                            // Initial "best" solution
k ? 0                                           // Energy evaluation count.
while k < kmax and e > emax                     // While time left & not good enough:
  snew ? neighbour(s)                           // Pick some neighbour.
  enew ? E(snew)                                // Compute its energy.
  if enew < ebest then                          // Is this a new best?
    sbest ? snew; ebest ? enew                  // Save 'new neighbour' to 'best found'.
  if P(e, enew, temp(k/kmax)) > random() then   // Should we move to it?
    s ? snew; e ? enew                          // Yes, change state.
  k ? k + 1                                     // One more evaluation done
return sbest                                    // Return the best solution found.

The following is an adaptation of the technique. My supervisor said the idea is fine in theory.

First I pick up some allocation (i.e. an entire dictionary of students and their allocated projects, including the ranks for the projects) from entire set of randomised allocations, copy it and pass it to my function. Let's call this allocation aOld (it is a dictionary). aOld has a weight related to it called wOld. The weighting is described below.

The function does the following:

  • Let this allocation, aOld be the best_node
  • From all the students, pick a random number of students and stick in a list
  • Strip (DEALLOCATE) them of their projects ++ reflect the changes for projects (allocated parameter is now False) and lecturers (free up slots if one or more of their projects are no longer allocated)
  • Randomise that list
  • Try assigning (REALLOCATE) everyone in that list projects again
  • Calculate the weight (add up ranks, rank 1 = 1, rank 2 = 2... and no project rank = 101)
  • For this new allocation aNew, if the weight wNew is smaller than the allocation weight wOld I picked up at the beginning, then this is the best_node (as defined by the Simulated Annealing algorithm above). Apply the algorithm to aNew and continue.
  • If wOld < wNew, then apply the algorithm to aOld again and continue.

The allocations/data-points are expressed as "nodes" such that a node = (weight, allocation_dict, projects_dict, lecturers_dict)

Right now, I can only perform this algorithm once, but I'll need to try for a number N (denoted by kmax in the Wikipedia snippet) and make sure I always have with me, the previous node and the best_node.

So that I don't modify my original dictionaries (which I might want to reset to), I've done a shallow copy of the dictionaries. From what I've read in the docs, it seems that it only copies the references and since my dictionaries contain objects, changing the copied dictionary ends up changing the objects anyway. So I tried to use copy.deepcopy().These dictionaries refer to objects that have been mapped with SQLA.


Questions:

I've been given some solutions to the problems faced but due to my über green-ness with using Python, they all sound rather cryptic to me.

  1. Deepcopy isn't playing nicely with SQLA. I've been told thatdeepcopy on ORM objects probably has issues that prevent it from working as you'd expect. Apparently I'd be better off "building copy constructors, i.e. def copy(self): return FooBar(....)." Can someone please explain what that means?

  2. I checked and found out that deepcopy has issues because SQLAlchemy places extra information on your objects, i.e. an _sa_instance_state attribute, that I wouldn't want in the copy but is necessary for the object to have. I've been told: "There are ways to manually blow away the old _sa_instance_state and put a new one on the object, but the most straightforward is to make a new object with __init__() and set up the attributes that are significant, instead of doing a full deep copy." What exactly does that mean? Do I create a new, unmapped class similar to the old, mapped one?

  3. An alternate solution is that I'd have to "implement __deepcopy__() on your objects and ensure that a new _sa_instance_state is set up, there are functions in sqlalchemy.orm.attributes which can help with that." Once again this is beyond me so could someone kindly explain what it means?

  4. A more general question: given the above information are there any suggestions on how I can maintain the information/state for the best_node (which must always persist through my while loop) and the previous_node, if my actual objects (referenced by the dictionaries, therefore the nodes) are changing due to the deallocation/reallocation taking place? That is, without using copy?

© Stack Overflow or respective owner

Related posts about python

Related posts about beginner