Consider this simple function:
def foo(): print(foo)
It may be tempting to assume that
def foo makes available the name
foo to the function block every time it is run, similar to how
for x in ... would make the name
x available inside a
for block. This is not the case, however:
def foo(): print(foo) foo() bar = foo foo = None bar()
<function foo at 0x0000000003437D90> None
This case wanders into a peculiar fact about scope within a function: There is nothing inherently special about a function's name within the function itself. The name
foo the function is not a magical invariant means for the function to refer to itself. It merely results in a lookup to something in the visible scope with the name
In the first example (and prior to reassigning
foo in the second example) the name
foo is visible within the module which contains
foo, so it is also visible inside
foo. However if it is removed as an available name from the module's scope, as in the second example, it is also no longer available inside the function
Perhaps a clearer view of this point is seen using a method rather than a function:
class Foo: def bar(self): print(bar) f = Foo() f.bar()
Output (portion of traceback omitted):
line 3, in bar print(bar) NameError: name 'bar' is not defined
Inside the method
bar, the name
bar by itself is never available.
A Practical Example: Recursion
A function referring to itself is of course not a crazy edge case by any means; every recursive function must do it at some point. Consider this example:
def linear_search(seq, x, i=0): if i == len(seq): return None elif seq[i] == x: return i else: return linear_search(seq, x, i + 1)
This function's purpose is to be simple and recursive, so ignore its crumminess.
A function that refers to itself by name, like
linear_search above, is relatively easy to break by doing something that shouldn't be inherently dangerous:
# Function not changed from previous example def linear_search(seq, x, i=0): if i == len(seq): return None elif seq[i] == x: return i else: return linear_search(seq, x, i + 1) linsearch = linear_search linear_search = None linsearch(('0', '1', '2'), '2')
Output (portion of traceback omitted):
line 7, in linear_search return linear_search(seq, x, i + 1) TypeError: 'NoneType' object is not callable
The line in which
linear_search recursively calls itself feels surprisingly fragile for being so simple and common a task. Do note, however, the specificity of the circumstances necessary for this problem to arise: First, the function must not only be reassigned to a different name, the original name must be
del'd or overwritten, and second this must occur within the same scope as the function, i.e. if
linear_search was imported from a different module there would be no problem, because its name would remain available in the original module.
It could go without saying that renaming a function and reusing its former name for another purpose is confusing and odd (if not flat-out bad) practice, but it could happen, so how can the function-referring-to-itself problem be prevented?
The issue being discussed is of course the reason that instance methods have an implicit argument containing the instance, by convention
self, and class methods have
cls. An instance, or the class itself, can be renamed at will without ill effect. Functions however, have no such builtin means of referencing themselves. The simplest technique to allow a function to reference itself is to pass it itself as an argument, as in the following:
def foo(self): print(self.__name__) bar = foo del foo bar(bar)
In the previous example, note that the
__name__ attribute of
foo does not change even though the name
foo is no longer within the module's scope. I find this entirely intuitive; to see why, refer back to the
line 7, in linear_search bit of the traceback above.1 To add to the strangeness, however,
__name__ is writable:
def foo(self): print(self.__name__) bar = foo del foo bar(bar) bar.__name__ = 'spam' bar(bar)
One might wonder if
spam would be callable at the end of the previous example. It would not. Attempting e.g.
print(spam) would result in a
NameError. I find that intuitive, but it's worth mentioning.
Automating the Solution
Getting back to recursive functions, passing the function itself fixes the problem:
def linear_search(self, seq, x, i=0): if i == len(seq): return None elif seq[i] == x: return i else: return self(self, seq, x, i + 1) linsearch = linear_search linear_search = None i = linsearch(linsearch, ('0', '1', '2'), '2') print(i) # Output: 2
With an explicit reference to itself, the function's name is no longer an issue. However, forcing the caller to pass a function itself is very tedious and thankfully, unnecessary. A decorator could improve the situation by automating the self argument, like so:
def with_self(func): """ Add a keyword argument 'self' to a decorated function which contains the function itself. """ def wrapper(*args, **kwargs): return func(*args, self=func, **kwargs) return wrapper @with_self def linear_search(seq, x, i=0, **kwargs): self = kwargs['self'] if i == len(seq): return None elif seq[i] == x: return i else: return self(seq, x, i + 1, **kwargs) linsearch = linear_search linear_search = None i = linsearch(('0', '1', '2'), '2') print(i) # Output: 2
The decorator automatically adds a keyword argument named
self to the wrapped function which contains the function itself. The use of
kwargs hides this bit of implementation from the user, as they shouldn't have control of it anyway. It also keeps the function's signature uncluttered, at the expense of an extra line in the function to grab the argument from the kwargs dictionary.
There is something I do not like about this solution: the function is reliant on being decorated to execute properly. If it is not decorated by
with_self (or something else that provides the
self kwarg) then it will crash and burn with a
KeyError. That makes this pattern a little smelly, but with the risk properly documented it would do fairly nicely.
An alternate implementation, which avoids the reliant-on-the-decorator problem, explicitly requires the
def with_self(func): """ Automatically passes the function object `func` as the first argument to `func`, similar to a `self` instance variable. """ def wrapper(*args, **kwargs): return func(func, *args, **kwargs) return wrapper # Note `self` is now required and named, rather than a keyword argument @with_self def linear_search(self, seq, x, i=0, **kwargs): if i == len(seq): return None elif seq[i] == x: return i else: return self(self, seq, x, i + 1, **kwargs) linsearch = linear_search linear_search = None i = linsearch(('0', '1', '2'), '2') print(i) # Output: 2
In this second case, if the decorator is not used the user would have to pass the function as the first argument themself. This solution eliminates the smell of the first solution, but has its own problems. The signature of
linear_search, which now begins with
self, is not the signature a user would actually use, as
self is automatically provided by the decorator. This is, again, confusing at a glance and would need proper documentation. Which of the two solutions is preferred would be a judgement call.
I find the idea that there was a "hole" in every recursive function I've ever written in Python fairly amusing (and slightly frightening). Although, as described earlier, there is little significant danger; it is very much an odd edge case.
I don't particularly recommend using the
with_self solution on every recursive function the reader may write from this point forward. Still, it's an interesting lesson in not taking scope for granted.