Resource Cleanup Idioms
by John Hurst, Feb 2003
(updated John Hurst, Aug 2005)
Overview
This brief article compares idioms for resource cleanup in a selection of programming languages.
Introduction
Object-oriented programming involves the creation of a lot of objects. Every object is stored in memory, and many objects also contain handles to external resources. Examples of external resources are:
- file handles
- network handles
- database connections
- synchronization objects, such as semaphores
It is important during a long-running program, e.g. a server, for resources to be appropriately freed when they are no longer used.
There are a variety of techniques available for this. Almost all object-oriented languages include the concept of a constructor, which is typically used to create and allocate resources for an object. Many also include a destructor of one form or another, which can be used to clean up the object and free resources to the system. At a high level, there are many differences in approaches to constructing objects (1). But at the level of basic language features, the languages differ more, and more crucially, in how objects are destroyed than in how they are created.
Explicit destruction requires the programmer to code explicitly for an object to be destroyed—nothing is assumed or automatic. A completely explicit language is C.
OO languages generally provide some degree of implicit resource cleanup, where some aspects are handled automatically by the system.
Deterministic approaches are completely predictable as to when cleanup occurs. Fully explicit approaches are, by definition, deterministic. Implicit approaches can vary.
The standard technique for deterministic cleanup of objects is reference counting. A count of live references to each object is maintained by the system, and the object is destroyed when this count falls to zero.
In some systems, for example COM, reference counting can be done explicitly via function calls. Alternatively, it can be done via classes or built-in language support.
The strengths of reference counting are:
- Simple to understand
- Relatively simple to implement
- Completely deterministic
- Relatively good performance
The main weakness of reference counting is that it cannot easily deal with circular references, which arise frequently in OO programming.
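As a rough illustration of both the determinism and the circular-reference weakness, here is a minimal C++ sketch using std::shared_ptr, a standard library class that maintains a reference count. The Node type and the function names are invented for this example. The weak_ptr variant shows one common way to break a cycle, not the only one.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type for illustration: tracks how many Nodes are alive.
struct Node {
    static int live;               // number of Node objects currently alive
    std::shared_ptr<Node> strong;  // owning reference: adds to the count
    std::weak_ptr<Node> weak;      // non-owning reference: does not
    Node() { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

// One object, two references: it is destroyed exactly when the last one goes.
int live_after_last_reference_dropped() {
    int before = Node::live;
    auto a = std::make_shared<Node>();
    auto b = a;                    // count is now 2
    assert(a.use_count() == 2);
    a.reset();                     // count is 1, object still alive
    assert(Node::live == before + 1);
    b.reset();                     // count is 0, destructor runs right here
    return Node::live - before;
}

// Two objects that own each other: the counts never reach zero by themselves.
int live_after_strong_cycle() {
    int before = Node::live;
    std::weak_ptr<Node> observer;  // lets us reach the cycle afterwards
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->strong = b;
        b->strong = a;             // cycle: a -> b -> a
        observer = a;
    }                              // both local references are gone ...
    int stranded = Node::live - before;  // ... but both Nodes are still alive
    // Break the cycle by hand so that this example does not itself leak.
    if (auto a = observer.lock()) {
        a->strong.reset();
    }
    return stranded;
}

// Same shape, but the back reference is weak, so no cycle of owners exists.
int live_after_weak_back_reference() {
    int before = Node::live;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->strong = b;
        b->weak = a;
    }
    return Node::live - before;
}
```

Under these assumptions, the first and third functions leave no Node behind, while the second finds both Nodes still alive after every reference visible to the program has been dropped.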
A nondeterministic alternative to reference counting is a mark-and-sweep algorithm, which usually runs periodically in a background thread and can clean up objects when it finds no live references to them.
In practice, nondeterministic approaches are only taken with memory management. Other resources, such as file handles, network handles and database connections are generally freed (at some level) via explicit API calls, and thus are completely deterministic.
Another common feature of OO languages with bearing on resource cleanup is exceptions. Exceptions allow for flexible handling of runtime errors at a level appropriate to an application. However, they tend to move error handling some distance away from the point where an error arises, and so it is important to have some mechanism to ensure that resources allocated in intermediate layers are appropriately freed.
The rest of this article discusses approaches to resource cleanup in these commonly used OO programming languages: C++, Java, C#, Delphi, Perl and Ruby.
C++
In C++ virtually everything, including object destruction, is completely deterministic. (An example of something that is not fully determined in C++ is the order in which static, or global, objects in different translation units are created.)
In C++, all objects are created with a constructor, and all objects are freed via a destructor.
C++ provides three kinds of storage for objects:
- Static objects are defined at file scope, outside of any class or
function. These objects are created at program startup, and destroyed
when the program shuts down. They are stored in a global, static area
of memory. The constructors and destructors are called automatically
by the system. The programmer can select a particular constructor by passing arguments.
Example:
    Logger gLogger(std::cerr); // global logger, stderr is a parameter.
- Automatic objects are created within a scope. Local objects are stored on
the stack; member objects are stored inside the object that contains them.
The scope may be a class, function or block, for example. The
scope has a well-defined beginning and end. The object is created at
the point it is defined, and automatically destroyed at the end of its
scope.
Example:
    class X {
        Logger logger_; // this object is created when an X is created,
                        // and destroyed automatically when X is destroyed.
        // other members ...
    };
Example:
    void f() {
        Logger logger(hSyslog); // destroyed automatically at end of f().
        // do work ...
    }
Example:
    // somewhere...
    { // starts a block...
        Logger quickLog(std::cerr); // destroyed automatically at end of block.
        // work with quickLog ...
    }
- Dynamic objects are allocated using new, and freed with delete. They are stored on the heap, and must be both created and destroyed explicitly.
C++ thus supports different approaches to memory management for different tasks. For objects with a well-defined global, class, function or block scope, automatic memory management is used. For objects requiring more flexibility, dynamic storage is used, and the programmer must design the code so that objects are guaranteed to be freed appropriately.
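The difference between automatic and dynamic lifetimes can be made visible with a small tracing class. This is only a sketch: Tracer, Log and trace_lifetimes are names made up for the illustration, and static objects are left out because their lifetime spans the whole program.

```cpp
#include <string>
#include <utility>
#include <vector>

using Log = std::vector<std::string>;

// Hypothetical helper: records its own creation and destruction in a log.
class Tracer {
public:
    Tracer(Log& log, std::string name) : log_(log), name_(std::move(name)) {
        log_.push_back("create " + name_);
    }
    ~Tracer() { log_.push_back("destroy " + name_); }
private:
    Log& log_;
    std::string name_;
};

Log trace_lifetimes() {
    Log log;
    {
        Tracer outer(log, "outer");              // automatic object
        Tracer* heap = new Tracer(log, "heap");  // dynamic object
        {
            Tracer inner(log, "inner");          // automatic, nested block
        }                                        // inner is destroyed here
        delete heap;   // destroyed only because the code says so
    }                  // outer is destroyed here
    return log;
}
```

The returned log reads: create outer, create heap, create inner, destroy inner, destroy heap, destroy outer. Without the delete, the "destroy heap" entry would never appear.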
The key to managing dynamic objects, and other explicitly allocated resources, in C++ is to wrap them in classes that manage them.
For example, multithreaded programs often need to obtain locks to ensure serialized access to some resource. A platform API may provide lock() and unlock() functions to obtain and free locks.
A C program using these APIs to serialize work in function sync_routine() might look like this:
    void sync_routine() {
        lock(handle);
        // do work ...
        unlock(handle);
    }
In contrast, a C++ program would encapsulate the resource, i.e. the object the lock is obtained on, in a class, and use the class to manage the resource:
    class Lock {
    public:
        Lock(HANDLE handle): handle_(handle) {
            lock(handle_);
        }
        ~Lock() {
            unlock(handle_);
        }
    private:
        HANDLE handle_;
    };
In C++, sync_routine() looks like this:
    void sync_routine() {
        Lock lock(handle);
        // do work ...
    }
There is no need to release the lock explicitly. It is released automatically after normal completion of the function. More importantly, it is also released automatically if the function returns or exits prematurely, whether by the return statement or an exception.
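To check the claim about premature exits, the following self-contained sketch replaces the platform API with a stand-in that merely counts calls. Handle, the counting lock() and unlock() functions, the fail parameter and released_on_every_path() are all assumptions made for this illustration. In real code the standard library's std::lock_guard with std::mutex plays the same role as the Lock class.

```cpp
#include <stdexcept>

// Stand-in for a platform lock API (hypothetical): it only counts calls.
struct Handle {
    int locks = 0;
    int unlocks = 0;
};
void lock(Handle& h)   { ++h.locks; }
void unlock(Handle& h) { ++h.unlocks; }

// Resource-managing class: acquires in the constructor,
// releases in the destructor.
class Lock {
public:
    explicit Lock(Handle& h) : handle_(h) { lock(handle_); }
    ~Lock() { unlock(handle_); }
    Lock(const Lock&) = delete;             // copying would unlock twice
    Lock& operator=(const Lock&) = delete;
private:
    Handle& handle_;
};

void sync_routine(Handle& h, bool fail) {
    Lock guard(h);
    if (fail) {
        throw std::runtime_error("work failed");  // guard still releases
    }
    // normal work ...
}

// Runs one normal and one failing call, then compares the counters.
bool released_on_every_path() {
    Handle h;
    sync_routine(h, false);
    try {
        sync_routine(h, true);
    } catch (const std::runtime_error&) {
        // the lock has already been released by the time we get here
    }
    return h.locks == 2 && h.unlocks == 2;
}
```

Both calls leave the counters balanced, because the destructor of guard runs during stack unwinding as well as on normal return.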
C++ provides no finally keyword, in contrast to the similar languages Java, C# and Delphi. That is because the proper use of resource classes and destructors in C++ makes finally blocks unnecessary.
Code such as the following:
    {
        r = AllocateResource();
        try {
            // do something ...
        } catch (...) {
            Deallocate(r);
            throw; // re-throw, so the error is not swallowed and r is not freed twice
        }
        Deallocate(r);
    }
is unwieldy and unnecessary, and a sign of bad design. Resources which are required for a well-defined scope should use automatic storage, with resource-wrapping classes as appropriate.
It is common to use special resource management classes such as smart pointers in C++ to provide the semblance of automatic memory management for dynamic objects. Programs using such techniques are less prone to memory leaks.
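As one possible illustration, std::unique_ptr (part of the standard library since C++11) is a smart pointer that deletes the object it owns when the pointer itself goes out of scope. Buffer, process() and live_buffers_after_process() are hypothetical names used only for this sketch.

```cpp
#include <memory>
#include <stdexcept>

// Hypothetical resource type: counts how many instances are alive.
struct Buffer {
    static int live;
    Buffer() { ++live; }
    ~Buffer() { --live; }
};
int Buffer::live = 0;

void process(bool fail) {
    // The Buffer lives on the heap, but its lifetime is tied to buf,
    // which is an automatic object.
    std::unique_ptr<Buffer> buf(new Buffer);
    if (fail) {
        throw std::runtime_error("processing failed");
    }
    // normal work with *buf ...
}   // buf's destructor deletes the Buffer on every exit path

int live_buffers_after_process(bool fail) {
    try {
        process(fail);
    } catch (const std::runtime_error&) {
        // error handled here; the Buffer is already gone
    }
    return Buffer::live;
}
```

With a raw pointer and a hand-written delete at the end of process(), the failing path would skip the delete and leak the Buffer. Here both paths leave no Buffer alive.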
Java
Java provides implicit, nondeterministic memory management. Other resources, however, are handled explicitly, and care must be taken with them.
The nearest thing Java provides to a C++ destructor is the Object.finalize() method. However, it is important to note that this method is not deterministic, in that there is no way to predict when it will be called by the system.
Therefore, for cases requiring deterministic resource cleanup, it is typical to use try {} finally {} constructs to simulate C++ destructor behaviour.
For example:
    public void load() throws Exception {
        Statement query = null;
        try {
            query = getConnection().createStatement();
            // work with query ...
        } finally {
            if (query != null) {
                query.close();
            }
        }
    }
This approach can get very awkward quickly when multiple resources are involved, as they typically are in JDBC use:
    public Customer load(String id) throws SQLException {
        Connection connection = null;
        PreparedStatement query = null;
        ResultSet rs = null;
        try {
            connection = getDataSource().getConnection();
            query = connection.prepareStatement(
                "SELECT name FROM customer WHERE id = ? "
            );
            query.setString(1, id);
            // ...
            rs = query.executeQuery();
            if (!rs.next()) {
                return null; // no matching row
            }
            Customer result = new Customer(id);
            result.setName(rs.getString(1));
            // ...
            return result;
        } finally {
            // close in reverse order of acquisition
            SqlUtils.close(rs);
            SqlUtils.close(query);
            SqlUtils.close(connection);
        }
    }

    // ...

    public class SqlUtils {
        // ...
        public static void close(Connection connection) {
            if (connection != null) {
                try {
                    connection.close();
                } catch (Exception ex) {
                    // log it ...
                }
            }
        }

        public static void close(Statement statement) {
            if (statement != null) {
                try {
                    statement.close();
                } catch (Exception ex) {
                    // log it ...
                }
            }
        }

        public static void close(ResultSet resultSet) {
            if (resultSet != null) {
                try {
                    resultSet.close();
                } catch (Exception ex) {
                    // log it ...
                }
            }
        }
    }
An approach demonstrated by the excellent Spring Framework is to encapsulate the resource handling in a so-called Template class, and use callbacks to do the work.
    public Customer load(final String id) {
        JdbcTemplate template = new JdbcTemplate(getDataSource());
        return (Customer) template.queryForObject(
            "SELECT name FROM customer WHERE id = ? ",
            new Object[] {id},
            new RowMapper() {
                public Object mapRow(ResultSet rs, int rowNum) throws SQLException {
                    Customer result = new Customer(id);
                    result.setName(rs.getString(1));
                    return result;
                }
            }
        );
    }
Note that when relying on automatic memory management in Java, it can be important to explicitly set long-lived object references (for example, fields or entries in static collections) to null, so that the garbage collector can free the objects they refer to.
C#
We use C# as the main language for this section, but what we are really discussing is the .NET framework. C# is promoted as the premier .NET language, and it is similar in look and feel to Java.
C# uses essentially the same resource management methods as Java, though the syntax is a little different.
As in Java, it is possible to define a method which will be called automatically by the system at some point after the object is no longer visible to the program.
VB.NET very sensibly follows the Java naming, and calls this method Finalize(). Very confusingly, C# (and .NET C++) use the C++ destructor syntax for this method, even though it does not behave like a C++ destructor.
C# also offers a standard interface, IDisposable, and a control structure, using(), which can make deterministic resource cleanup more straightforward.
For resources where deterministic cleanup is desirable, implement the IDisposable interface and provide a Dispose() method.
Users of your class can follow the Java pattern like so:
    public void Load() {
        Statement query = null;
        try {
            query = getConnection().createStatement();
            // work with query ...
        } finally {
            if (query != null) {
                query.Dispose();
            }
        }
    }
Alternatively, and rather better, they can use the using() construct:
    public void Load() {
        Statement query = getConnection().createStatement();
        using (query) {
            // work with query ...
        } // query.Dispose() automatically called here.
    }
Many of the .NET Framework classes implement IDisposable; where classes do, this construct should be preferred.
It is important to note that Dispose is optional, and does not guarantee deterministic resource clean-up if not properly used by the client. For example:
    public void Load() {
        Statement query = getConnection().createStatement();
        // work with query ...
    }
After this function is called, the query is not freed immediately, but sometime later by the garbage collector. Therefore, class designers should provide fallback cleanup in Finalize() (or ~Statement() in this case) so that resources do get cleaned up at some point.
Delphi
Delphi has some similarities to C++, and to Java, and also some differences.
For example, Delphi provides destructors. However, they are typically not called automatically.
Delphi somewhat confusingly provides class objects and interface objects, with different resource cleanup semantics. Partly this is to accommodate seamless use of interface objects with COM.
Objects which are declared of class type are managed explicitly. They are created explicitly, and their destructors must be called explicitly.
On the other hand, objects which are declared of interface type are destroyed automatically, through a destructor call, when their reference count drops to zero. (It could be seen as another point in favour of interface-based programming!)
One pleasant difference between Delphi and Java, when calling destructors explicitly, is that it is OK to call Free() on a nil object.
Thus, we typically get code which looks like this:
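    CreateSQL := nil; // local variables are not initialised automatically
    try
      CreateSQL := TQuery.Create(nil);
      // work with query
    finally
      CreateSQL.Free; // safe even if Create raised and CreateSQL is still nil
    end;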
Delphi's try block syntax is weirdly restrictive, compared to that of Java/C#.
It is not valid to combine finally and except clauses in the same try block. Instead, if you want to handle an exception as well as clean up a resource, you must have something like this:
    QueueTable := nil;
    try
      try
        QueueTable := TTable.Create(nil);
        // work with table ...
      except
        // handle exception
      end;
    finally
      QueueTable.Free; // clean up resource
    end;
Since Delphi provides implicit cleanup of interface objects, it is possible to wrap resources that are acquired for a particular scope in an interface object and have it handled automatically.
For example, a critical section lock object could be defined like this:
    type
      ILock = interface
      end;

      TLock = class(TInterfacedObject, ILock)
        constructor Create(CS: TCriticalSection);
        destructor Destroy; override;
      private
        fCS: TCriticalSection;
      end;

    constructor TLock.Create(CS: TCriticalSection);
    begin
      inherited Create;
      fCS := CS;
      fCS.Acquire;
    end;

    destructor TLock.Destroy;
    begin
      fCS.Release;
      inherited;
    end;
A routine needing protection against more than one thread running at once could acquire a lock on the critical section as follows:
    procedure SyncRoutine;
    var
      lock: ILock;
      // ...
    begin
      lock := TLock.Create(aCS); // aCS is a TCriticalSection.
      // ... rest of routine is run in only one thread at a time.
    end;
It is not necessary to explicitly destroy or release the lock variable, since it is declared as an interface type.
Unfortunately, Delphi's ugly separation between a variable's declaration and its first use makes this approach unnecessarily verbose compared to C++-styled languages.
Note also that with Delphi, unlike most OO languages, it's necessary to be explicit about constructor and destructor chaining.
Perl
Perl was not originally an object-oriented language; it was a powerful UNIX scripting language, and then object-oriented features were added with version 5. As a result, some of them work in weird and wonderful ways compared to more typical object-oriented languages.
Perl features automatic memory management and deterministic resource clean-up. Perl objects are automatically reference-counted.
Memory is reclaimed automatically when the last reference to an object goes out of scope. To clean up other resources, define a destructor for your class. In Perl, the destructor is the class's DESTROY subroutine.
Perl does not automatically chain destructor calls (or constructor calls for that matter), so if your class derives from other classes, it is a good idea to invoke their destructors explicitly in your destructor.
Perl memory management relies on reference counting during program execution, and then uses a mark-and-sweep algorithm when the interpreter exits, to release remaining resources to the system.
The upshot is that the programmer usually does not need to worry about memory management, unless the program has objects with circular references.
For example:
    package SearchPage;

    sub new {
        my $class = shift;
        my $self = bless {}, $class;
        $self->{'_WorkingPanel'} = $self->GetDefaultWorkingPanel($self);
        return $self;
    }

    # other methods ...

    package WorkingPanel;

    sub new {
        my $class = shift;
        my ($parent) = @_;
        my $self = bless {}, $class;
        $self->{'_Parent'} = $parent;
        return $self;
    }

    # other methods ...
In this example, the SearchPage class includes a reference to a WorkingPanel object, which in turn refers back to the containing SearchPage.
Because of the circular reference, Perl will never clean up such a pair of objects automatically.
Instead, the programmer must explicitly break the circularity when the object/data are no longer required.
This can be done by placing the self-referential data structures inside a container class.
Ruby
Ruby is the newest OO scripting language reaching widespread use.
Ruby features automatic garbage collection, using a mark-and-sweep algorithm to reclaim unused objects.
Objects are garbage-collected nondeterministically. It is possible to specify a block of code to be run when an object is garbage-collected, using ObjectSpace.define_finalizer. This finalizer acts like a destructor, except that it is not called until some indeterminate time after the object is no longer referenced.
Ruby provides similar exception trapping control structures to Java, using begin/ensure/end:
    f = File.open(filename) # opened before begin, so ensure never sees a nil f
    begin
      # process file ...
    ensure
      f.close
    end
However, the standard idiom in Ruby for managing resources is much cooler: use blocks of code as transactions.
    File.open(filename) do |f|
      # process file ...
    end
When File.open is passed a block of code, as it is here, it opens the file, runs the block, and guarantees that the file is closed when the block returns.
Again, this means the file will be closed after the block, regardless of how it ends: normal completion or exception.
This magic is achieved by File.open knowing what to do when it is passed a block of code. It might be implemented like this:
    def File.open(name, mode="r")
      f = os_file_open(name, mode)
      if block_given?
        begin
          yield f
        ensure
          f.close
        end
        return nil
      else
        return f
      end
    end
Here's another example, showing the elegance of Ruby's resource management with DBI, the standard database interface module:
    DBI.connect(data_source, username, password) do |dbh|
      dbh.select_all(sql_customers) do |customer_id,|
        puts "Analysing customer [#{customer_id}]"
        dbh.select_all(sql_customer_jobs(customer_id)) do |job_id,|
          print "#{job_id},"
        end
      end
    end
There's no need to close the database connection explicitly, or to release the SQL statement handles, as has to be done in the equivalent Perl code using Perl's DBI.
The standard Ruby classes that manage external resources generally provide constructors that work this way with blocks.
Conclusion
This article has examined some common idioms for resource cleanup in a selection of programming languages. It has provided some comparison of the different approaches commonly in use today.
The most important thing is to know the mechanisms your language and environment provide for resource cleanup. Know the scopes of your objects, and know how they are cleaned up after they are no longer used.
Notes
References
Bjarne Stroustrup, "The C++ Programming Language", 3rd Ed., Addison-Wesley, 1997
James Gosling, Bill Joy, Guy Steele, "The Java Language Specification", Addison-Wesley, 1996
Microsoft Corp, ".NET Framework SDK Documentation", http://msdn.microsoft.com/library (In particular, see the section Reference > Design Guidelines for Class Library Developers > Common Design Patterns > Implementing Finalize and Dispose to Clean Up Unmanaged Resources)
Borland Corp, "Delphi Language Guide", 2002
Larry Wall, Tom Christiansen, Jon Orwant, "Programming Perl", 3rd Ed., O'Reilly & Associates, 2000
Tom Christiansen, Nathan Torkington, "Perl Cookbook", O'Reilly & Associates, 1998
David Thomas, Andrew Hunt, "Programming Ruby", Addison-Wesley, 2001