You could write:
files = [open(f) for f in sys.argv[1:]]
instead of:
files = map(lambda f: open(f), sys.argv[1:])
(i.e. use a "list comprehension"; I think it reads a bit more easily than the map/lambda version).
Ivan
Closures in Python (part 2)
This assumes that you've read Martin Fowler's article on closures. Part 1 shows a translation of Martin Fowler's Ruby code into Python, both a direct translation and a more idiomatic translation using Python's "list comprehensions" (which are arguably neater for doing lots of the sorts of things that you use closures for in Smalltalk or Ruby). From this, you might think that Python can handle closures like Ruby or Smalltalk can, but this isn't quite the case.
Limitation of lambda
In the non-list comprehension examples, the "lambda" keyword for creating a closure in Python can only be used with an expression, not arbitrary code. This happens to be OK for the examples in Martin's article, but consider something just a tiny bit more complicated:
Let's say you wanted to do:
map(lambda each: if each.isManager: each.salary = 2000, employees)
You can't. "if each.isManager: each.salary = 2000" isn't an expression.
Instead, you'd have to define a function (which doesn't take much syntax):
def anonymousFunction(employee):
    if employee.isManager: employee.salary = 2000
then you can do:
map(anonymousFunction, employees)
(As an aside, "map" returns a collection of the results of executing the function on every element of a collection. We just want to execute the function and don't care about the results; there's no direct equivalent of that in Python other than ignoring the return value, which is what we'll do. It's not a problem.)
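A hedged note on the aside above: the examples in this article rely on Python 2, where "map" runs the function immediately and builds a list. In Python 3, "map" is lazy and does nothing until its result is consumed, so a plain "for" loop is the more reliable way to call a function purely for its side effects. A minimal sketch, in which the "Employee" class, the "giveManagerRaise" name and the sample data are made up for illustration:

```python
class Employee:
    """Hypothetical stand-in for the article's employee objects."""
    def __init__(self, name, isManager, salary):
        self.name = name
        self.isManager = isManager
        self.salary = salary

def giveManagerRaise(employee):
    # A statement-based body like this cannot be written as a lambda.
    if employee.isManager:
        employee.salary = 2000

employees = [Employee("tim", True, 5), Employee("ivan", False, 2)]

# In Python 3 this only builds a lazy iterator; giveManagerRaise has not run yet.
pending = map(giveManagerRaise, employees)
salaries_before = [e.salary for e in employees]

# A plain loop runs the function for its side effect, with nothing to ignore.
for employee in employees:
    giveManagerRaise(employee)
salaries_after = [e.salary for e in employees]
```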
Assignment considered awkward
There are other problems too. Consider the following code:
def totalCostOfManagers(emps):
    total = 0
    def anonymousFunction(employee):
        if employee.isManager: total = total + employee.salary
    map(anonymousFunction, emps)
    return total
This looks like it should give you the total of the managers' salaries. (Ignore the fact that there are other ways to do this; it's just an example.) Try to execute this and you get:
UnboundLocalError: local variable 'total' referenced before assignment
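This is because the "total" inside "anonymousFunction" is a different variable from the "total" inside "totalCostOfManagers". When you assign to a variable inside a function, it is created in that function's scope if it didn't already exist there. So "total" is local to "anonymousFunction" throughout its body, and the right-hand side of the assignment reads it before it has been given a value.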
(If I find a suitable reference I'll edit this and put it here).
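The rule can be seen in isolation with a small sketch (written for Python 3; the function names are invented for illustration). Reading an outer variable from a nested function works; it is only the assignment that makes the name local:

```python
def readsOuter():
    total = 10
    def inner():
        # No assignment to "total" here, so the outer variable is visible.
        return total + 1
    return inner()

def assignsOuter():
    total = 10
    def inner():
        # The assignment makes "total" local to inner() for the whole body,
        # so the read on the right-hand side fails.
        total = total + 1
        return total
    try:
        return inner()
    except UnboundLocalError as error:
        return "UnboundLocalError: %s" % error

read_result = readsOuter()
assign_result = assignsOuter()
print(read_result)
print(assign_result)
```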
Work-around for assignment
One way around this would be not to try to assign to "total" itself, but rather have "total" refer to a list and assign to an element in that:
def totalCostOfManagers(emps):
    total = [0]
    def anonymousFunction(employee):
        if employee.isManager: total[0] = total[0] + employee.salary
    map(anonymousFunction, emps)
    return total[0]
This is the sort of thing you might also do with an anonymous inner class in Java,
where you also can't do assignments to variables in an outer scope.
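In Python 3 (an assumption, since the rest of this article uses Python 2) the "nonlocal" statement removes the need for the list trick: it declares that an assignment should rebind the variable in the enclosing function. A minimal sketch, with a hypothetical "Employee" class, an invented "addIfManager" name and made-up sample data:

```python
class Employee:
    """Hypothetical stand-in for the article's employee objects."""
    def __init__(self, isManager, salary):
        self.isManager = isManager
        self.salary = salary

def totalCostOfManagers(emps):
    total = 0
    def addIfManager(employee):
        nonlocal total  # rebind the enclosing function's "total"
        if employee.isManager:
            total = total + employee.salary
    for employee in emps:  # a loop, because Python 3's map is lazy
        addIfManager(employee)
    return total

result = totalCostOfManagers([Employee(True, 5), Employee(False, 2), Employee(True, 7)])
print(result)  # prints 12
```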
Making functions look like objects
A slightly subtle thing that I haven't mentioned at all so far is the difference between the "closures" you've seen in Python and those in Smalltalk or Ruby. In Smalltalk a closure is an object defining a "value" method. That is, to execute the code of a Smalltalk closure, you'd send it the message "value", with parameters as appropriate. The equivalent in Python would be something like this (given that "emps" has a method "do" that accepts an object with a "value" method):
def totalCostOfManagers(emps):
    total = [0]
    class AnonymousClass:
        def value(self, employee):
            if employee.isManager: total[0] = total[0] + employee.salary
    emps.do(AnonymousClass())
    return total[0]
You can do the equivalent in Java using an anonymous inner class. (Note that in Python "self" (the equivalent of "this") is explicit, and also has to be included as the first parameter in method definitions.)
(If you want to try this out, you could use the following:
class Employee:
    pass

ivan = Employee()
ivan.name = "ivan"
ivan.isManager = False
ivan.salary = 2

tim = Employee()
tim.name = "tim"
tim.isManager = True
tim.salary = 5

class Employees:
    def __init__(self):
        self.emps = [tim, ivan]
    def do(self, block):
        for e in self.emps:
            block.value(e)
and execute:
print totalCostOfManagers(Employees())
to see it work.)
You might also consider:
def totalCostOfManagers(emps):
    class AnonymousClass:
        def __init__(self):
            self.total = 0
        def value(self, employee):
            if employee.isManager: self.total = self.total + employee.salary
    block = AnonymousClass()
    emps.do(block)
    return block.total
Note that "__init__" defines the constructor for AnonymousClass, which is called by doing "AnonymousClass()" (there's no "new" keyword needed).
Making objects look like functions
In Smalltalk, closures look like objects with a "value" method. In Python it is more idiomatic to use a function instead, as you've seen earlier. To invoke a Python function, you put "()" after it. So, back to basics; if you have a function "foo":
def foo():
    return "hi mum"
then "foo" is a reference to the function, and "foo()" executes the function, i.e. evaluating "print foo" results in something like "<function foo at 0x008F7970>" and evaluating "print foo()" results in "hi mum".
So, rather than defining "do" to accept an object with a "value" method, it would be more idiomatic to use the built-in function "map" and pass it a function (as shown earlier). In Python, you can make any object look like a function by defining a "__call__" method. So, back to the example, another way to implement it would be:
def totalCostOfManagers(emps):
    class AnonymousClass:
        def __init__(self):
            self.total = 0
        def __call__(self, employee):
            if employee.isManager: self.total = self.total + employee.salary
    block = AnonymousClass()
    map(block, emps)
    return block.total
(Execute "print totalCostOfManagers([tim, ivan])", with "tim" and "ivan" defined as before, to see it work.)
Note the regularity in Python of having functions/methods callable (e.g. "foo()"), classes callable (e.g. the constructor "AnonymousClass()") and instances callable (e.g. the instance of "AnonymousClass").
To clarify, try:
class Foo:
    def __call__(self):
        return "hi mum"
    def bar(self):
        return "yo dude"

someFoo = Foo()
print someFoo.bar()
print someFoo()
Admitting defeat for now, and moving on to something more difficult
OK - let's admit the truth - it's looking like a closure style isn't working so great here. Let's just revert to a simple loop:
def totalCostOfManagers(emps):
    total = 0
    for employee in emps:
        if employee.isManager: total = total + employee.salary
    return total
So - what was all that effort for? Why are closures such a Good Thing? Where a closure becomes really neat is if you have to do something more complicated. For example, consider if you have to do something like this:
def totalCostOfManagers(emps):
    total = 0
    try:
        emps.startSomething()
        employee = emps.next()
        while employee is not None:
            if employee.isManager: total = total + employee.salary
            employee = emps.next()
    finally:
        emps.endSomething()
    return total
where "emps" is some object defining "startSomething", "endSomething" and "next" methods that have to be called as they are in this method, and "endSomething" has to be called whether you finish looping through all the employees or not.
(You can use:
class Employees:
    def __init__(self):
        self.emps = [tim, ivan]
    def startSomething(self):
        pass
    def endSomething(self):
        pass
    def next(self):
        if len(self.emps) == 0:
            return None
        return self.emps.pop()

print totalCostOfManagers(Employees())
to try this out. Of course, this is not how a real implementation would look; it's just to illustrate the calling method "totalCostOfManagers".)
Now for some duplication, and how to remove it.
This pattern is typical of some types of code, e.g. database related code. If you had to do lots of things similar to this, but slightly different, you can easily end up with lots of duplicated code.
For example:
def totalCostOfNonManagers(emps):
    total = 0
    try:
        emps.startSomething()
        employee = emps.next()
        while employee is not None:
            if not employee.isManager: total = total + employee.salary
            employee = emps.next()
    finally:
        emps.endSomething()
    return total
In Java, I've seen lots of code like this that is mostly duplicated, with only a small part differing. At first sight, it might look to many Java developers like too much effort to remove the duplication, but actually it's not too hard. This is a case where closures really help (or anonymous inner classes in Java). You can put the bulk of this code where it belongs, in the Employees class:
class Employees:
    #whatever ...
    def do(self, fun):
        try:
            self.startSomething()
            employee = self.next()
            while employee is not None:
                fun(employee)
                employee = self.next()
        finally:
            self.endSomething()
Then the definition of "totalCostOfManagers" doesn't need to worry about most of that stuff and looks just like the code from earlier (but "emps" is an instance of "Employees" rather than a simple list):
def totalCostOfManagers(emps):
    total = [0]
    def anonymous(employee):
        if employee.isManager: total[0] = total[0] + employee.salary
    emps.do(anonymous)
    return total[0]
However, as we've seen earlier, this would be neater if we could write it as just a simple loop.
Generators
Python has a trick up its sleeve, called a "generator". If we define "do" as:
class Employees:
    #whatever ...
    def do(self):
        self.startSomething()
        employee = self.next()
        while employee is not None:
            yield employee
            employee = self.next()
        self.endSomething()
The "yield" keyword turns "do" into a "generator": as far as a "for" loop is concerned, it looks like a method that returns a list, while inside the method itself it looks as if you "return" one element at a time. Unfortunately, the Python I'm using complains that "'yield' not allowed in a 'try' block with a 'finally' clause" (a restriction that Python 2.5 and later no longer have), so I've deleted the try/finally until someone can tell me a way around that limitation (arghhh!). Anyway, back to the "generator" version of "do"; the calling code now becomes:
def totalCostOfManagers(emps):
    total = 0
    for employee in emps.do():
        if employee.isManager: total = total + employee.salary
    return total
which is nice and simple again.
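From Python 2.5 onwards, "yield" is allowed inside "try"/"finally", so the clean-up call can be put back. The sketch below is written for Python 3 and makes several assumptions: the "Employees" class is a toy version holding plain salary numbers, the "log" list exists only to show the order of calls, and the method is named "nextEmployee" rather than "next" to avoid confusion with the built-in of that name.

```python
class Employees:
    """Toy collection that records when start/end are called."""
    def __init__(self, emps):
        self.emps = list(emps)
        self.log = []

    def startSomething(self):
        self.log.append("start")

    def endSomething(self):
        self.log.append("end")

    def nextEmployee(self):
        # Returns None when there is nothing left, as in the article.
        if not self.emps:
            return None
        return self.emps.pop()

    def do(self):
        try:
            self.startSomething()
            employee = self.nextEmployee()
            while employee is not None:
                yield employee
                employee = self.nextEmployee()
        finally:
            # Runs when the loop finishes, and also when the generator is closed early.
            self.endSomething()

emps = Employees([5, 2, 7])
seen = list(emps.do())

early = Employees([5, 2, 7])
generator = early.do()
first = next(generator)
generator.close()  # abandon the loop early; the finally clause still runs
```

If the caller breaks out of a "for" loop without closing the generator explicitly, the clean-up only runs once the generator object is finalised, so code that needs prompt clean-up may prefer to call "close()" itself.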
Generators as Iterators
Another tweak is also available. If we change the name of "do" to "__iter__":
class Employees:
    #whatever ...
    def __iter__(self):
        self.startSomething()
        employee = self.next()
        while employee is not None:
            yield employee
            employee = self.next()
        self.endSomething()
Then, our calling code becomes:
def totalCostOfManagers(emps):
    total = 0
    for employee in emps:
        if employee.isManager: total = total + employee.salary
    return total
and we've gone back to something that looks really simple!
Our other method now becomes:
def totalCostOfNonManagers(emps):
    total = 0
    for employee in emps:
        if not employee.isManager: total = total + employee.salary
    return total
which removes much of the duplication. Removing the last of the duplication is left as an exercise for the reader.
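One possible answer to that exercise, sketched in Python 3 under the assumption that "emps" is any iterable of objects with "isManager" and "salary" attributes (the "totalCost" helper and the sample "Employee" class are invented here): pass the condition in as a function.

```python
class Employee:
    """Hypothetical stand-in for the article's employee objects."""
    def __init__(self, isManager, salary):
        self.isManager = isManager
        self.salary = salary

def totalCost(emps, include):
    # Sum the salaries of the employees for which include(employee) is true.
    return sum(employee.salary for employee in emps if include(employee))

def totalCostOfManagers(emps):
    return totalCost(emps, lambda employee: employee.isManager)

def totalCostOfNonManagers(emps):
    return totalCost(emps, lambda employee: not employee.isManager)

staff = [Employee(True, 5), Employee(False, 2)]
managers_total = totalCostOfManagers(staff)
others_total = totalCostOfNonManagers(staff)
```

Here the lambdas are enough, because each condition is a single expression.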
Conclusion
I miss the really neat syntax of closures in Smalltalk, and the fact that everything just works like it should in Smalltalk with syntax that is truth and beauty. However, despite thinking that Smalltalk is a better language, I use Python in my work for various reasons.
I have never been able to choose the main implementation language for a commercial project. (Yes, now I think of it, really, never.) It's always been dictated either by what's already been done or, for projects that I've joined from the start, it's been chosen by someone else (someone else less technical than me, that is!). These days, that means Java or C#. However, even on these projects, there's always the need for little scripts for automating parts of the development process or doing other useful things. Python is better than Smalltalk for this sort of thing, as it's easy to share Python programs with other people, and to move them from machine to machine. You just copy the relevant text file, edit it in your favourite text editor if required, run "python whatever.py", and it works. Smalltalk doesn't have that convenience.
Some people use shell scripts for that sort of thing, but Python is a very much better language, and also works cross platform. Compared to Perl or Ruby, Python has very little syntax, which makes it easy to read even if you haven't been doing much Python recently. I'm a "bear of very little brain" (or something like that), so I want to be able to read code without having to remember what the syntax means. With Python, that's very easy. I can also read other people's Python code, which helps!
Python now has a very large following, and it's easy to create Python wrappers for "C" programs, so the quality and quantity of libraries is fantastic. The main Python implementation itself is also stable and reasonably performant.
From a language point of view, Python is less scary to the majority of Java developers than Smalltalk (although that's a deficiency on their part rather than Smalltalk's; sometimes you have to make compromises to get something accepted). Also, people have heard of Python and aren't frightened by it. The common reaction is "oh, that's like Perl but more readable, isn't it?", whereas the reaction to Smalltalk tends to be fear. Python also has some nice language features, such as list comprehensions and generators, which make it easy to write code that's easy to read.
If you want to learn a language with good support for closures, and one of the best languages ever invented, try Smalltalk. If you want to use a language that's easy to learn and useful in your day-to-day work, try Python. If you want to improve as a developer, learn both.
Closures in Python (part 3)
Just a short addition to part 2 - I was writing some code the other day and came across another closure related problem with Python. I wanted to create some buttons, with event callbacks, dynamically; something like this:
class Foo:
    def __init__(self):
        self.callbacks = [lambda event: self.callback(name) for name in ['bob','bill','jim']]
    def callback(self, name):
        print name
    def run(self):
        for callback in self.callbacks:
            callback("some event")
Now, what do you suppose happens when you run "Foo().run()"?
You get:
jim
jim
jim
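This is because there is only one "name" variable (so Nat Pryce tells me): every lambda looks "name" up when it is called, by which time the loop has finished and "name" is "jim". A simple solution is to do: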
class Foo:
    def __init__(self):
        self.callbacks = [self.makeCallback(name) for name in ['bob','bill','jim']]
    def makeCallback(self, name):
        return lambda event: self.callback(name)
    def callback(self, name):
        print name
    def run(self):
        for callback in self.callbacks:
            callback("some event")
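Two other common fixes, sketched here in Python 3 (the class names "FooWithDefault" and "FooWithPartial" are made up, and "callback" returns the name instead of printing it so that the result is easy to check): bind the current value as a default argument, or use "functools.partial".

```python
import functools

class FooWithDefault:
    def __init__(self):
        # name=name evaluates "name" when each lambda is created,
        # so every lambda keeps its own value.
        self.callbacks = [lambda event, name=name: self.callback(name)
                          for name in ['bob', 'bill', 'jim']]
    def callback(self, name):
        return name
    def run(self):
        return [callback("some event") for callback in self.callbacks]

class FooWithPartial:
    def __init__(self):
        # partial() stores the current value of "name" alongside the method.
        self.callbacks = [functools.partial(self.callback, name)
                          for name in ['bob', 'bill', 'jim']]
    def callback(self, name, event):
        return name
    def run(self):
        return [callback("some event") for callback in self.callbacks]

default_result = FooWithDefault().run()
partial_result = FooWithPartial().run()
```

Both variants capture the value of "name" at the moment the callback is created, which is what "makeCallback" achieves with an extra function call.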
I'm sure there are many other solutions too. For the curious, in Ruby the equivalent code would be:
class Foo
  def initialize()
    @callbacks = ['bob','bill','jim'].map {|name| Proc.new {|event| callback(name)}}
  end
  def callback(name)
    puts name
  end
  def run()
    @callbacks.each { |callback| callback.call("some event") }
  end
end
Foo.new.run()
which works just fine. However, I still chose Python over Ruby for pragmatic reasons (a bit ironic I know) - the libraries, tools and support are all superior.