Python hashable dicts

Posted by TokenMacGuy on Stack Overflow See other posts from Stack Overflow or by TokenMacGuy
Published on 2009-07-20T04:04:30Z Indexed on 2010/04/24 18:33 UTC
Read the original article Hit count: 424

Filed under:

As an exercise, and mostly for my own amusement, I'm implementing a backtracking packrat parser. The inspiration for this is i'd like to have a better idea about how hygenic macros would work in an algol-like language (as apposed to the syntax free lisp dialects you normally find them in). Because of this, different passes through the input might see different grammars, so cached parse results are invalid, unless I also store the current version of the grammar along with the cached parse results. (EDIT: a consequence of this use of key-value collections is that they should be immutable, but I don't intend to expose the interface to allow them to be changed, so either mutable or immutable collections are fine)

The problem is that python dicts cannot appear as keys to other dicts. Even using a tuple (as I'd be doing anyways) doesn't help.

>>> cache = {}
>>> rule = {"foo":"bar"}
>>> cache[(rule, "baz")] = "quux"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'dict'
>>>

I guess it has to be tuples all the way down. Now the python standard library provides approximately what i'd need, collections.namedtuple has a very different syntax, but can be used as a key. continuing from above session:

>>> from collections import namedtuple
>>> Rule = namedtuple("Rule",rule.keys())
>>> cache[(Rule(**rule), "baz")] = "quux"
>>> cache
{(Rule(foo='bar'), 'baz'): 'quux'}

Ok. But I have to make a class for each possible combination of keys in the rule I would want to use, which isn't so bad, because each parse rule knows exactly what parameters it uses, so that class can be defined at the same time as the function that parses the rule. But combining the rules together is much more dynamic. In particular, I'd like a simple way to have rules override other rules, but collections.namedtuple has no analogue to dict.update().

Edit: An additional problem with namedtuples is that they are strictly positional. Two tuples that look like they should be different can in fact be the same:

>>> you = namedtuple("foo",["bar","baz"])
>>> me = namedtuple("foo",["bar","quux"])
>>> you(bar=1,baz=2) == me(bar=1,quux=2)
True
>>> bob = namedtuple("foo",["baz","bar"])
>>> you(bar=1,baz=2) == bob(bar=1,baz=2)
False

tl'dr: How do I get dicts that can be used as keys to other dicts?

Having hacked a bit on the answers, here's the more complete solution I'm using. Note that this does a bit extra work to make the resulting dicts vaguely immutable for practical purposes. Of course it's still quite easy to hack around it by calling dict.__setitem__(instance, key, value) but we're all adults here.

class hashdict(dict):
    """
    hashable dict implementation, suitable for use as a key into
    other dicts.

        >>> h1 = hashdict({"apples": 1, "bananas":2})
        >>> h2 = hashdict({"bananas": 3, "mangoes": 5})
        >>> h1+h2
        hashdict(apples=1, bananas=3, mangoes=5)
        >>> d1 = {}
        >>> d1[h1] = "salad"
        >>> d1[h1]
        'salad'
        >>> d1[h2]
        Traceback (most recent call last):
        ...
        KeyError: hashdict(bananas=3, mangoes=5)

    based on answers from
       http://stackoverflow.com/questions/1151658/python-hashable-dicts

    """
    def __key(self):
        return tuple(sorted(self.items()))
    def __repr__(self):
        return "{0}({1})".format(self.__class__.__name__,
            ", ".join("{0}={1}".format(
                    str(i[0]),repr(i[1])) for i in self.__key()))

    def __hash__(self):
        return hash(self.__key())
    def __setitem__(self, key, value):
        raise TypeError("{0} does not support item assignment"
                         .format(self.__class__.__name__))
    def __delitem__(self, key):
        raise TypeError("{0} does not support item assignment"
                         .format(self.__class__.__name__))
    def clear(self):
        raise TypeError("{0} does not support item assignment"
                         .format(self.__class__.__name__))
    def pop(self, *args, **kwargs):
        raise TypeError("{0} does not support item assignment"
                         .format(self.__class__.__name__))
    def popitem(self, *args, **kwargs):
        raise TypeError("{0} does not support item assignment"
                         .format(self.__class__.__name__))
    def setdefault(self, *args, **kwargs):
        raise TypeError("{0} does not support item assignment"
                         .format(self.__class__.__name__))
    def update(self, *args, **kwargs):
        raise TypeError("{0} does not support item assignment"
                         .format(self.__class__.__name__))
    def __add__(self, right):
        result = hashdict(self)
        dict.update(result, right)
        return result

if __name__ == "__main__":
    import doctest
    doctest.testmod()

© Stack Overflow or respective owner

Related posts about python