What are the implications of registering an instance method with atexit in Python?
Assume I've got some really big Python class that might consume a fair amount of memory. The class has some method that is responsible for cleaning up some things when the interpreter exits, and it gets registered with the atexit module:
```python
import atexit
import os

class ReallyBigClass(object):
    def __init__(self, cache_file):
        self.cache_file = open(cache_file)
        self.data = <some large chunk of data>
        atexit.register(self.cleanup)

    <insert other methods for manipulating self.data>

    def cleanup(self):
        os.remove(self.cache_file.name)
```
Various instances of this class might come and go throughout the life of the program. My questions are:
1. Is registering the instance method with atexit safe if I, say, del all my other references to the instance? In other words, does atexit.register() increment the reference count the same way a traditional binding would?
2. If so, does the entire instance now have to hang around in memory until exit because one of its methods has been registered with atexit, or can portions of the instance be garbage collected?
3. What would be the preferred way to structure such an at-exit cleanup for transient class instances like this so that garbage collection can happen effectively?
Registering an instance method with atexit makes the whole instance persist until the interpreter exits: atexit holds a strong reference to the bound method, and a bound method holds a reference to its instance. The solution is to decouple the function registered with atexit from the class, for example by registering a plain function or a closure that captures only what the cleanup actually needs. The instance can then be garbage collected normally. For example,
```python
import atexit
import gc
import os
import random

class BigClass1(object):
    """atexit function tied to an instance method"""
    def __init__(self, cache_filename):
        self.cache_filename = cache_filename
        self.cache_file = open(cache_filename, 'wb')
        self.data = [random.random() for i in range(10000000)]
        atexit.register(self.cleanup)

    def cleanup(self):
        self.cache_file.close()
        os.remove(self.cache_filename)

class BigClass2(object):
    """atexit function decoupled from the instance"""
    def __init__(self, cache_filename):
        self.cache_filename = cache_filename
        cache_file = open(cache_filename, 'wb')
        self.cache_file = cache_file
        self.data = [random.random() for i in range(10000000)]

        def cleanup():
            cache_file.close()
            os.remove(cache_filename)

        atexit.register(cleanup)

if __name__ == "__main__":
    import pdb; pdb.set_trace()

    big_data1 = BigClass1('cache_file1')
    del big_data1
    gc.collect()

    big_data2 = BigClass2('cache_file2')
    del big_data2
    gc.collect()
```
Stepping through this line by line while monitoring the process memory shows that the memory consumed by big_data1 is held until the interpreter exits, while big_data2 is successfully garbage collected right after the del. Running each test case alone (with the other commented out) gives the same results.
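Process memory is a coarse signal; a weak reference makes the difference directly observable. Here is a minimal sketch of the same comparison (the class names are illustrative and small lists stand in for the large data): the instance that registered a bound method survives del, while the decoupled one is collected.

```python
import atexit
import gc
import weakref

class Bound(object):
    """Registers a bound method: atexit keeps a reference to the instance."""
    def __init__(self):
        self.data = [0.0] * 1000
        atexit.register(self.cleanup)

    def cleanup(self):
        pass

class Decoupled(object):
    """Registers a closure that captures nothing from the instance."""
    def __init__(self):
        self.data = [0.0] * 1000

        def cleanup():
            pass

        atexit.register(cleanup)

b, d = Bound(), Decoupled()
ref_b, ref_d = weakref.ref(b), weakref.ref(d)
del b, d
gc.collect()
print(ref_b() is not None)  # True: the registered bound method pins the instance
print(ref_d() is None)      # True: the instance was garbage collected
```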
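As an aside, on Python 3 weakref.finalize offers another way to structure this (this variant is my own sketch, not part of the original answer): the finalizer holds no reference to the instance, so the cleanup runs either when the instance is garbage collected or at interpreter exit, whichever comes first.

```python
import os
import weakref

class BigClass3(object):
    """Cleanup via weakref.finalize: does not keep the instance alive."""
    def __init__(self, cache_filename):
        self.cache_filename = cache_filename
        self.cache_file = open(cache_filename, 'wb')
        self.data = [0.0] * 1000  # stand-in for the large data
        # Pass the file object and filename, never self or a bound method,
        # or the finalizer would pin the instance just as atexit did.
        self._finalizer = weakref.finalize(
            self, BigClass3._cleanup, self.cache_file, cache_filename)

    @staticmethod
    def _cleanup(cache_file, cache_filename):
        cache_file.close()
        os.remove(cache_filename)
```

With this structure the cache file is removed as soon as the instance is collected, rather than lingering until exit.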