Imagine you have a class like Coordinate that implements a two-dimensional coordinate pair. You might want to use instances of that class in a dictionary, but the problem is that instance keys compare by identity, not equality:
>>> D = {Coordinate(2, 3): "something"} # Coordinate is a custom class.
>>> D.has_key(Coordinate(2, 3)
False
This is not what you expect: even though the two instances of Coordinate(2, 3) have the same value, they don't have the same ID and therefore Python won't treat them as the same dictionary key.
The answer to this problem is to give the class __hash__ and __eq__ methods:
class Coordinate(object):
def __init__(self, x, y):
self.x = x
self.y = y
def __hash__(self):
return hash(self.x) ^ hash(self.y)
def __eq__(self, other):
try:
return self.x == other.x and self.y == other.y
except AttributeError:
return False
Now two instances that compare equal will also have the same hash, and Python will recognise them as the same dictionary key.
But there's a gotcha: unlike built-in types like int and tuple, classes in Python are mutable. That's generally what you want, but in this case it can bite you. If the instance which is the key is changed, the hash will also change and your code will probably experience difficult to track down bugs.
The solution is to make Coordinate immutable, or at least as immutable as any Python class can be. To make a class immutable, have the __setattr__ and __delattr__ methods raise exceptions. (But watch out -- that means that you can no longer write something like self.x = x, you have to delegate that to the superclass.)
class Coordinate(object):
def __setattr__(*args):
raise TypeError("can't change immutable class")
__delattr__ = __setattr__
def __init__(self, x, y):
super(Coordinate, self).__setattr__('x', x)
super(Coordinate, self).__setattr__('y', y)
def __hash__(self):
return hash(self.x) ^ hash(self.y)
def __eq__(self, other):
try:
return self.x == other.x and self.y == other.y
except AttributeError:
return False
There are a few other things you can do as well: as a memory optimization, you can use __slots__ = ('x', 'y') to allocate memory for the two attributes you do use and avoid giving each instance an attribute dictionary it can't use. If the superclass defines in-place operators like __iadd__ etc. you should over-ride them to raise exceptions. If your class is a container, you must also make sure that __setitem__ etc. either don't exist at all or raise exceptions.
(I am indebted to Python guru Alex Martelli's explanation about immutable instances.)
[Update, 2007-04-02: fixed a stupid typo where I called super(Immutable, ...) instead of super(Coordinate, ...).]