This is a translation of the first part of the article. The article was written in 2008. Ten years on, it has lost almost none of its relevance.
Deterministic release of resources - a necessity
In more than 20 years of programming, I have occasionally designed my own languages to solve particular problems. They ranged from simple imperative languages to specialized regular expressions for trees. Language design comes with many recommendations, and with a few simple rules that should never be broken. One of them:
Never create a language with exceptions in which there is no deterministic release of resources.
Guess which recommendation the .NET runtime, and therefore every language built on it, does not follow?
The reason this rule exists is that deterministic release of resources is necessary for writing maintainable programs. It gives the programmer a definite point at which a resource is known to have been released. There are two ways to write reliable programs: the traditional approach releases resources as early as possible, and the modern approach releases them at some indeterminate time. The advantage of the modern approach is that the programmer does not need to free resources explicitly. The disadvantage is that writing a reliable application becomes much harder, because many subtle errors creep in. Unfortunately, the .NET runtime was built on the modern approach.
.NET supports non-deterministic release of resources through the Finalize method, which has a special meaning to the runtime. For deterministic release, Microsoft also added the IDisposable interface (and other classes that we will look at later). To the runtime, however, IDisposable is an ordinary interface like any other. This "second-class" status creates some difficulties.
In C#, a "poor man's deterministic release" can be implemented with try and finally or with the using statement (which is almost the same thing). Microsoft debated at length whether to implement reference counting, and it seems to me that the wrong decision was made. As a result, deterministic release requires either the clumsy try/finally or using constructs, or a direct call to IDisposable.Dispose, which is error-prone. For a C++ programmer who is used to shared_ptr<T>, neither option is attractive. (The last sentence makes it clear where the author's sympathies lie.)
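As a rough illustration of why the two forms are "almost the same", here is a minimal sketch. TraceResource and UsingDemo are hypothetical names invented for this example; they do not come from the article or from the .NET libraries.

```csharp
using System;

// Hypothetical resource type, used only for illustration.
sealed class TraceResource : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() => IsDisposed = true;
}

static class UsingDemo
{
    // Explicit form: Dispose runs even if the body throws.
    public static TraceResource WithTryFinally()
    {
        var resource = new TraceResource();
        try
        {
            // ... work with the resource ...
        }
        finally
        {
            resource.Dispose();
        }
        return resource;
    }

    // Shorthand form: the compiler generates a comparable try/finally.
    public static TraceResource WithUsing()
    {
        var resource = new TraceResource();
        using (resource)
        {
            // ... work with the resource ...
        }
        return resource;
    }
}
```

In both methods the returned object has already been disposed, which is the whole point: the release happens at a known place in the code rather than whenever the garbage collector gets around to it.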
IDisposable
IDisposable is Microsoft's solution for deterministic release of resources. It is intended for the following cases:
- Any type that owns managed (IDisposable) resources. The type must own the resources, that is, control their lifetime, and not merely reference them.
- Any type that owns unmanaged resources.
- Any type that owns both managed and unmanaged resources.
- Any type derived from a class that implements IDisposable. I do not recommend deriving from classes that own unmanaged resources; composition is a better choice.
IDisposable helps release resources deterministically, but it has problems of its own.
IDisposable Difficulties - Usability
IDisposable objects are rather cumbersome to use. Every use of such an object has to be wrapped in a using construct. The bad part is that C# does not allow using with a type that does not implement IDisposable. So the programmer must either consult the documentation every time to find out whether using is needed, or simply write using everywhere and then remove it wherever the compiler complains.
Managed C++ is much better in this respect. It supports stack semantics for reference types, which works like using only for those types that need it. C# would benefit from the ability to write using with any type.
This problem can be addressed with code analysis tools. What makes matters worse is that if you forget using, the program may pass its tests and then fail in the field.
Using IDisposable instead of reference counting brings another problem: determining the owner. In C++, when the last copy of a shared_ptr<T> goes out of scope, the resource is released immediately, and nobody has to think about who should release it. IDisposable, by contrast, forces the programmer to decide who "owns" the object and is responsible for releasing it. Sometimes ownership is obvious: when one object encapsulates another and implements IDisposable, it is responsible for releasing its child objects. Sometimes the lifetime of an object is defined by a block of code, and the programmer simply wraps that block in using. Nevertheless, there are many cases where an object is used in several places and its lifetime is hard to determine (although reference counting would handle such cases just fine).
IDisposable Difficulties - Backward Compatibility
Adding IDisposable to a class, or removing it from the list of implemented interfaces, is a breaking change. If you add IDisposable to one of your classes, client code that does not expect it will not release the resources, particularly when instances are passed around as references to an interface or a base class.
Microsoft itself has run into this problem. IEnumerator does not derive from IDisposable, but IEnumerator<T> does. If code that accepts an IEnumerator is handed an IEnumerator<T>, Dispose will not be called.
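A minimal sketch of the situation, assuming a hypothetical helper FirstOrNull written against the non-generic interface. The helper can only dispose the enumerator if it checks for IDisposable at run time; the iterator Numbers and the CleanupRan flag are also made up for this example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

static class EnumeratorDemo
{
    public static bool CleanupRan;

    // Hypothetical generic enumerator whose cleanup lives in a finally block.
    // For an iterator suspended at a yield inside try, Dispose runs the finally.
    public static IEnumerator<int> Numbers()
    {
        try
        {
            yield return 1;
            yield return 2;
        }
        finally
        {
            CleanupRan = true;
        }
    }

    // Hypothetical legacy helper that only knows about IEnumerator.
    // It stops after the first item, so cleanup depends on Dispose.
    public static object FirstOrNull(IEnumerator enumerator)
    {
        try
        {
            return enumerator.MoveNext() ? enumerator.Current : null;
        }
        finally
        {
            // IEnumerator does not promise Dispose, so check at run time.
            (enumerator as IDisposable)?.Dispose();
        }
    }
}
```

Without the run-time check in the finally block, the iterator's cleanup code would never run when enumeration stops early.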
This is not the end of the world, but it does say something about the second-class nature of IDisposable.
IDisposable Difficulties - Designing a Class Hierarchy
The biggest drawback IDisposable brings to hierarchy design is that every class and interface must predict whether its descendants will need IDisposable.
If an interface does not inherit from IDisposable, but classes implementing that interface also implement IDisposable, then the consuming code will either ignore deterministic release or have to check whether the object implements IDisposable. In the latter case the using construct cannot be applied to the object directly, and you end up writing an ugly try and finally.
In short, IDisposable complicates the development of reusable software. The key reason is that it violates one of the principles of object-oriented design: separation of interface and implementation. Releasing resources should be an implementation detail. Microsoft decided to make deterministic release of resources a second-class interface.
One possible workaround is to make every class implement IDisposable, but in the vast majority of classes IDisposable.Dispose would do nothing, which is hardly elegant.
Another IDisposable complication is collections. Some collections "own" the objects they contain and some do not, yet the collection classes themselves do not implement IDisposable. The programmer must remember to call IDisposable.Dispose on the objects in the collection, or create custom subclasses of the collection classes that implement IDisposable to signify "ownership".
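One way to express such "ownership" is a small collection subclass. The sketch below is only an illustration: OwningList and Probe are hypothetical names, and deriving from List<T> is just one possible design.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical list that declares ownership of its items by disposing them.
sealed class OwningList<T> : List<T>, IDisposable where T : class, IDisposable
{
    public void Dispose()
    {
        foreach (T item in this)
        {
            item?.Dispose(); // tolerate null entries
        }
        Clear(); // drop references so a second Dispose is a no-op
    }
}

// Hypothetical item type, used only to observe disposal.
sealed class Probe : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() => IsDisposed = true;
}
```

A caller can then wrap the whole collection in using, and every item it holds is released when the block ends. A collection that merely references its items would stay a plain List<T>.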
IDisposable Difficulties - An Additional "Error" State
IDisposable.Dispose can be called explicitly at any time, regardless of the object's lifetime. This adds a "disposed" state to every object, in which it is recommended to throw ObjectDisposedException. Checking the state and throwing exceptions is an additional cost.
Instead of checking at every turn, it is better to treat any access to an object in the "disposed" state as undefined behavior, just like access to freed memory.
IDisposable Difficulties - No Guarantees
IDisposable is just an interface. A class that implements IDisposable supports deterministic release but cannot guarantee it. It is perfectly legal for client code never to call Dispose. Therefore, a class that implements IDisposable must support both deterministic and non-deterministic release.
IDisposable Difficulties - Complex Implementation
Microsoft offers a pattern for implementing IDisposable. (The earlier pattern was downright horrible, but relatively recently, after .NET 4 came out, the documentation was corrected, partly under the influence of this article. Old editions of .NET books still show the old version. - translator's note)
- IDisposable.Dispose may never be called at all, so the class must include a finalizer to release resources.
- IDisposable.Dispose may be called several times and must work without visible side effects, so a check for whether the method has already been called is needed.
- Finalizers run on a separate thread and can be invoked before IDisposable.Dispose completes. You must use GC.SuppressFinalize to avoid such "races".
In addition:
- Finalizers are also called for objects whose constructor threw an exception. Therefore, the release code must be able to handle partially initialized objects.
- Implementing IDisposable in a class derived from CriticalFinalizerObject requires non-trivial constructs. void Dispose(bool disposing) is a virtual method and must run in a Constrained Execution Region, which requires calling RuntimeHelpers.PrepareMethod.
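The requirements listed above can be sketched roughly as follows. This is a simplified illustration and not the complete recommended pattern: NativeBuffer and Touch are hypothetical, the class owns a single unmanaged memory block, and the CriticalFinalizerObject and Constrained Execution Region concerns are deliberately left out.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical class that owns one unmanaged memory block.
class NativeBuffer : IDisposable
{
    private IntPtr _memory;
    private bool _disposed;

    public NativeBuffer(int size)
    {
        _memory = Marshal.AllocHGlobal(size);
    }

    public bool IsDisposed => _disposed;

    // Deterministic path: release now and tell the GC to skip the finalizer.
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    // Non-deterministic path: runs only if Dispose was never called.
    ~NativeBuffer()
    {
        Dispose(false);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (_disposed) return; // repeated calls are harmless

        // When disposing is false we are on the finalizer thread and must not
        // touch other managed objects; this class has none to release.
        // The zero check also covers a constructor that failed before allocating.
        if (_memory != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_memory);
            _memory = IntPtr.Zero;
        }
        _disposed = true;
    }

    // Example member showing the extra "disposed" state check.
    public void Touch()
    {
        if (_disposed) throw new ObjectDisposedException(nameof(NativeBuffer));
        Marshal.WriteByte(_memory, 0);
    }
}
```

Even this reduced version needs a flag, a finalizer, a virtual helper, and a call to GC.SuppressFinalize, which gives a sense of how much ceremony a single owned resource requires.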
IDisposable Difficulties - Not Suitable for Shutdown Logic
Shutdown (completion) logic for an object is often needed in programs with parallel or asynchronous threads. For example, a class runs a separate thread and wants to stop it by signaling a ManualResetEvent. This can be done in IDisposable.Dispose, but it can lead to an error if that code is invoked from the finalizer.
To understand the limitations that apply inside a finalizer, you need to understand how the garbage collector works. Below is a simplified outline that omits many details related to generations, weak references, object resurrection, background garbage collection, and so on.
The .NET garbage collector uses the mark-and-sweep algorithm. In general, the logic is as follows:
- Suspend all threads.
- Take all "root" objects: variables on the stack, static fields, GCHandle objects, the finalization queue. When an application domain is being unloaded (or the program is exiting), stack variables and static fields are not considered roots.
- Recursively follow all references from those objects and mark everything found as "reachable".
- Go through all remaining objects that have destructors (finalizers), declare them reachable, and put them in the finalization queue (GC.SuppressFinalize tells the GC not to do this). Objects end up in the queue in an unpredictable order.
A finalizer thread (or several) works in the background:
- Takes an object from the queue and runs its finalizer. Finalizers of different objects may run at the same time.
- Removes the object from the queue; if nothing else references it, it is reclaimed at the next garbage collection.
Now it should be clear why managed resources must not be accessed from a finalizer: you do not know in what order finalizers are called. Even calling IDisposable.Dispose on another object from a finalizer can lead to an error, because that object's release code may be running on a different thread.
There are several exceptions when you can access managed resources from the finalizer:
- Objects derived from CriticalFinalizerObject are finalized after objects that are not derived from this class. This means that you can use a ManualResetEvent from a finalizer, as long as your own class is not derived from CriticalFinalizerObject.
- Some objects and methods are special, such as Console and some Thread methods. They can be called from finalizers even while the program is exiting.
In general, it is better not to use managed resources from finalizers. However, shutdown logic is necessary for non-trivial software. Windows.Forms contains its shutdown logic in the Application.Exit method. When you develop your own component library, the best approach is to tie shutdown logic to IDisposable: a normal shutdown if IDisposable.Dispose is called, and an emergency one otherwise.
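A minimal sketch of shutdown logic tied to IDisposable, assuming a hypothetical Worker component. The normal path runs in Dispose; the "emergency" path here is simply that the thread is a background thread, so it cannot keep the process alive if Dispose is never called. There is intentionally no finalizer, in line with the advice not to touch managed objects from one.

```csharp
using System;
using System.Threading;

// Hypothetical component that runs a background worker thread.
sealed class Worker : IDisposable
{
    private readonly ManualResetEvent _stop = new ManualResetEvent(false);
    private readonly Thread _thread;
    private int _disposed;

    public Worker()
    {
        // Background threads do not block process exit (the emergency path).
        _thread = new Thread(Run) { IsBackground = true };
        _thread.Start();
    }

    public bool IsRunning => _thread.IsAlive;

    private void Run()
    {
        // Wait up to 10 ms per iteration; leave the loop once shutdown is signaled.
        while (!_stop.WaitOne(10))
        {
            // ... one unit of work would go here ...
        }
    }

    // Normal shutdown: signal the thread, wait for it, then release the event.
    public void Dispose()
    {
        if (Interlocked.Exchange(ref _disposed, 1) != 0) return;
        _stop.Set();
        _thread.Join();
        _stop.Dispose();
    }
}
```

The event is disposed only after Join returns, so the worker thread never waits on a released handle.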
Microsoft ran into this problem too. The StreamWriter class owns a Stream object (in recent versions this depends on the constructor parameters - translator's note). StreamWriter.Close flushes the buffer and calls Stream.Close (the same happens if you wrap it in using - translator's note). If the StreamWriter is not closed, the buffer is not flushed and data is lost. Microsoft simply did not give StreamWriter a finalizer, thereby "solving" the shutdown problem. A great example of why shutdown logic is needed.
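The data loss is easy to observe with an in-memory stream. In this sketch, StreamWriterDemo and BytesWritten are hypothetical names; the writer is created with leaveOpen set to true so that the stream's length can still be read after the writer is disposed.

```csharp
using System;
using System.IO;
using System.Text;

static class StreamWriterDemo
{
    // Returns how many bytes reached the underlying stream.
    public static long BytesWritten(bool dispose)
    {
        var stream = new MemoryStream();
        // UTF-8 without a byte order mark, 1024-character buffer, stream left open.
        var writer = new StreamWriter(stream, new UTF8Encoding(false), 1024, leaveOpen: true);
        writer.Write("hello");

        if (dispose)
        {
            writer.Dispose(); // flushes the buffered characters to the stream
        }
        return stream.Length;
    }
}
```

With dispose set to false the five characters stay in the writer's internal buffer and the stream remains empty; with dispose set to true all five bytes arrive.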
Recommended reading
Much of the information about .NET internals in this article comes from the book "CLR via C#" by Jeffrey Richter. If you do not have it yet, buy it. Seriously. It is essential knowledge for any C# programmer.
Translator's conclusion
Most .NET programmers will never run into the problems described in this article. .NET is evolving toward higher levels of abstraction and less need to "juggle" unmanaged resources. Nevertheless, the article is useful because it describes the deep details of simple things and how they affect code design.
The next part will be a detailed analysis of how to work with managed and unmanaged resources in .NET, with plenty of examples.