IDisposable: what your mother never told you about releasing resources. Part 1

This is a translation of the first part of the article. The original was written in 2008 and, ten years later, has lost almost none of its relevance.


Deterministic release of resources - a necessity


In more than 20 years of programming I have occasionally designed my own languages to solve particular problems. They ranged from simple imperative languages to specialized regular expressions for trees. There are many recommendations for language design, and a few simple rules that should never be broken. One of them:


Never create a language that has exceptions but no deterministic release of resources.

Guess which recommendation the .NET runtime, and therefore every language built on it, does not follow?


The rule exists because deterministic release of resources is necessary for writing maintainable programs. Deterministic release gives the programmer a specific point at which they can be sure the resource has been released. There are two approaches to writing reliable programs: the traditional one, which releases resources as early as possible, and the modern one, which releases them at some unspecified later time. The advantage of the modern approach is that the programmer does not have to free resources explicitly. The disadvantage is that writing a reliable application becomes much harder, because many subtle errors become possible. Unfortunately, the .NET runtime was built around the modern approach.


.NET supports non-deterministic release of resources through the Finalize method, which the runtime treats specially. For deterministic release, Microsoft also added the IDisposable interface (and other classes that we will look at later). To the runtime, however, IDisposable is an ordinary interface like any other. This second-class status creates some difficulties.


In C#, a "poor man's deterministic release" can be implemented with try/finally or with the using statement (which is almost the same thing). Microsoft debated for a long time whether to implement reference counting, and it seems to me that the wrong decision was made. As a result, deterministic release requires either clumsy try/finally or using constructs, or a direct call to IDisposable.Dispose, which is error-prone. For a C++ programmer who is used to shared_ptr<T>, neither option is attractive. (The last sentence makes it clear where the author is coming from.)
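As a rough sketch of the two C# forms mentioned above, the following compares an explicit try/finally with the equivalent using statement. TempResource and UsingDemo are hypothetical names invented for this illustration; they do not come from the original article.

```csharp
using System;

// Hypothetical resource type, used only for illustration.
sealed class TempResource : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() => IsDisposed = true;
}

static class UsingDemo
{
    // Explicit form: Dispose runs even if the body throws.
    public static TempResource WithTryFinally()
    {
        var resource = new TempResource();
        try
        {
            // ... work with resource ...
        }
        finally
        {
            resource.Dispose();
        }
        return resource; // returned only so the caller can inspect its state
    }

    // The compiler expands this into roughly the try/finally above.
    public static TempResource WithUsing()
    {
        var resource = new TempResource();
        using (resource)
        {
            // ... work with resource ...
        }
        return resource; // returned only so the caller can inspect its state
    }
}
```

In both methods the resource is released at a known point, the end of the block, which is exactly the guarantee that a finalizer alone cannot give.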


IDisposable


IDisposable is the solution Microsoft offers for deterministic release of resources. It is intended for the following cases:



IDisposable helps release resources deterministically, but it has problems of its own.


IDisposable Difficulties - Usability


Objects that implement IDisposable are rather cumbersome to use. Every use of such an object has to be wrapped in a using statement. Worse, C# does not allow using with a type that does not implement IDisposable. The programmer therefore has to check the documentation every time to find out whether using is needed, or simply write using everywhere and then remove it wherever the compiler complains.


Managed C++ is much better in this respect. It supports stack semantics for reference types, which behaves like using only for the types that need it. C# would benefit from the ability to write using with any type.


This problem can be addressed with code analysis tools. What makes it worse is that if you forget a using, the program may pass its tests and then fail in the field.


Using IDisposable instead of reference counting brings another problem: determining the owner. In C++, when the last copy of a shared_ptr<T> goes out of scope, the resource is released immediately, and nobody has to think about who should release it. IDisposable, by contrast, forces the programmer to decide who "owns" the object and is responsible for releasing it. Sometimes ownership is obvious: when one object encapsulates another and implements IDisposable, it is responsible for releasing its child objects. Sometimes the lifetime of an object is defined by a block of code, and the programmer simply wraps that block in using. Nevertheless, there are many cases where an object is used in several places and its lifetime is hard to determine (although reference counting would handle such cases just fine).


IDisposable Difficulties - Backward Compatibility


Adding IDisposable to a class, or removing it from the list of implemented interfaces, is a breaking change. If you add IDisposable to one of your classes, client code that receives it through a reference to an interface or base class does not expect IDisposable and will not release the resources.


Microsoft itself has run into this problem. IEnumerator does not inherit from IDisposable, but IEnumerator<T> does. If code written for IEnumerator is handed an IEnumerator<T>, Dispose will not be called.
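The enumerator mismatch can be sketched as follows. EnumeratorDemo and its members are hypothetical names; the example assumes an iterator method whose cleanup lives in a finally block, which the compiler-generated enumerator runs from Dispose when iteration stops early.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

static class EnumeratorDemo
{
    public static int CleanupCount;

    // The compiler-generated enumerator implements IEnumerator<int>, so its
    // Dispose runs the finally block if iteration is abandoned part-way.
    static IEnumerable<int> Numbers()
    {
        try
        {
            yield return 1;
            yield return 2;
        }
        finally
        {
            CleanupCount++;
        }
    }

    // Written against the non-generic interface: nothing in IEnumerator
    // says that the enumerator may need disposing, so it is not disposed.
    static int FirstLegacy(IEnumerable source)
    {
        IEnumerator e = source.GetEnumerator();
        e.MoveNext();
        return (int)e.Current;
    }

    // Written against the generic interface: using calls Dispose.
    static int FirstGeneric(IEnumerable<int> source)
    {
        using (IEnumerator<int> e = source.GetEnumerator())
        {
            e.MoveNext();
            return e.Current;
        }
    }

    // Returns the cleanup count after the legacy call and after the generic call.
    public static (int AfterLegacy, int AfterGeneric) Run()
    {
        CleanupCount = 0;
        FirstLegacy(Numbers());
        int afterLegacy = CleanupCount;
        FirstGeneric(Numbers());
        int afterGeneric = CleanupCount;
        return (afterLegacy, afterGeneric);
    }
}
```

The foreach statement checks for IDisposable at run time even for non-generic enumerators, so the problem appears mainly in hand-written loops such as FirstLegacy.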


This is not the end of the world, but it does illustrate the second-class status of IDisposable.


IDisposable Difficulties - Designing a Class Hierarchy


The biggest drawback IDisposable brings to hierarchy design is that every class and interface has to predict whether its descendants will need IDisposable.


If an interface does not inherit from IDisposable, but some classes that implement the interface also implement IDisposable, then client code must either ignore deterministic release or check whether the object implements IDisposable. The using statement cannot be applied directly to such a reference, so you end up writing an ugly try/finally.
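A minimal sketch of that situation, assuming a hypothetical IMessageSink interface that was designed without IDisposable and a hypothetical FileSink implementation that later turned out to need it:

```csharp
using System;

// Hypothetical interface that was designed without IDisposable.
interface IMessageSink
{
    void Send(string message);
}

// One implementation happens to hold a resource.
sealed class FileSink : IMessageSink, IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Send(string message) { /* would write to a file */ }
    public void Dispose() => IsDisposed = true;
}

static class SinkClient
{
    // The client sees only IMessageSink, so it has to probe for IDisposable.
    public static void SendAndRelease(IMessageSink sink)
    {
        try
        {
            sink.Send("hello");
        }
        finally
        {
            if (sink is IDisposable disposable)
            {
                disposable.Dispose();
            }
        }
    }
}
```

Every client of IMessageSink has to repeat this probe, or silently skip deterministic release for the implementations that need it.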


In short, IDisposable complicates the development of reusable software. The key reason is that it violates one of the principles of object-oriented design: the separation of interface and implementation. Releasing resources ought to be an implementation detail. Microsoft instead decided to expose deterministic release of resources as a second-class interface.


One not very elegant workaround is to make every class implement IDisposable, even though in the vast majority of classes IDisposable.Dispose will do nothing.


Collections are another complication. Some collections "own" the objects they contain and some do not, yet the collections themselves do not implement IDisposable. The programmer must remember to call IDisposable.Dispose on the objects in the collection, or write derived collection classes that implement IDisposable to express "ownership".
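One way to express such ownership is a derived collection that disposes its items. The sketch below is only an illustration; OwningCollection is a hypothetical name, and the class is built on the standard Collection<T> base class.

```csharp
using System;
using System.Collections.ObjectModel;

// Hypothetical collection that declares ownership of its items
// by disposing them when it is disposed itself.
sealed class OwningCollection<T> : Collection<T>, IDisposable
    where T : IDisposable
{
    public void Dispose()
    {
        // Note: if one item's Dispose throws, the remaining items are skipped.
        foreach (T item in this)
        {
            item.Dispose();
        }
        Clear();
    }
}
```

A caller can then wrap the whole collection in a single using statement instead of disposing each element by hand.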


IDisposable Difficulties - An Additional "Error" State


IDisposable.Dispose can be called explicitly at any time, regardless of the lifetime of the object. This adds a "disposed" state to every object, in which it is recommended to throw ObjectDisposedException. Checking the state and throwing exceptions is an additional cost.
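The extra state looks roughly like this in practice. Connection is a hypothetical class used only for illustration.

```csharp
using System;

// Hypothetical class used only to illustrate the extra "disposed" state.
sealed class Connection : IDisposable
{
    private bool _disposed;

    public void Send(string data)
    {
        // Every public member pays for this check.
        if (_disposed)
        {
            throw new ObjectDisposedException(nameof(Connection));
        }
        // ... send data ...
    }

    public void Dispose() => _disposed = true;
}
```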


Instead of checking at every turn, it is better to treat any access to an object in the "disposed" state as undefined behavior, like an access to freed memory.


IDisposable Difficulties - No Guarantees


IDisposable is just an interface. A class that implements IDisposable supports deterministic release but cannot guarantee it. It is perfectly legal for client code never to call Dispose. Therefore a class that implements IDisposable must support both deterministic and non-deterministic release.


IDisposable Difficulties - Complex Implementation


Microsoft offers a pattern for implementing IDisposable. (The earlier pattern was downright horrible, but relatively recently, after .NET 4 appeared, the documentation was corrected, in part under the influence of this article. Old editions of .NET books still show the old version. - translator's note)
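For orientation, here is a sketch of the widely documented form of that pattern: a public Dispose, a protected virtual Dispose(bool), and a finalizer. It is not necessarily the listing the original article showed. NativeBuffer is a hypothetical class, and the 64-byte allocation is an arbitrary stand-in for a real unmanaged resource.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical class that owns a block of unmanaged memory.
class NativeBuffer : IDisposable
{
    private IntPtr _memory = Marshal.AllocHGlobal(64);
    private bool _disposed;

    public bool IsDisposed => _disposed;

    // Deterministic path: called by client code or by a using statement.
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this); // the finalizer is no longer needed
    }

    protected virtual void Dispose(bool disposing)
    {
        if (_disposed) return;

        if (disposing)
        {
            // Called from Dispose(): safe to dispose managed objects owned here.
        }

        // Runs on both paths: release the unmanaged resource.
        Marshal.FreeHGlobal(_memory);
        _memory = IntPtr.Zero;
        _disposed = true;
    }

    // Non-deterministic path: runs only if Dispose was never called.
    ~NativeBuffer()
    {
        Dispose(false);
    }
}
```

The disposing flag is what separates the two paths: managed objects are touched only when the call comes from Dispose, never from the finalizer.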



Besides:



IDisposable Difficulties - Not Suited to Shutdown Logic


Shutting down an object is a common need in programs with parallel or asynchronous threads. For example, a class runs a separate thread and wants to stop it by signaling a ManualResetEvent. This can be done in IDisposable.Dispose, but it can lead to an error if the same code is called from the finalizer.


To understand the restrictions that apply inside a finalizer, you need to understand how the garbage collector works. Below is a simplified description that omits many details: generations, weak references, object resurrection, background garbage collection, and so on.


The .NET garbage collector uses a mark-and-sweep algorithm. In general terms, the logic is as follows:


  1. Suspend all threads.
  2. Take all the root objects: variables on the stack, static fields, GCHandle objects, and the finalization queue. When the application domain is being unloaded (the program is exiting), variables on the stack and static fields are not treated as roots.
  3. Recursively follow all references from those objects and mark everything found as reachable.
  4. Go through all remaining objects that have destructors (finalizers), declare them reachable, and put them in the finalization queue (GC.SuppressFinalize tells the GC not to do this). Objects are placed in the queue in an unpredictable order.

A finalizer thread (or several) runs in the background:


  1. It takes an object from the queue and runs its finalizer. Finalizers of different objects may run at the same time.
  2. The object is removed from the queue, and if nothing else references it, it is reclaimed at the next garbage collection.

Now it should be clear why managed resources must not be accessed from a finalizer: you do not know the order in which finalizers are called. Even calling IDisposable.Dispose on another object from a finalizer can cause an error, because that object's cleanup code may be running on a different thread.


There are a few exceptions where managed resources can be accessed from a finalizer:


  1. Objects derived from CriticalFinalizerObject are finalized after objects that are not derived from it. This means you can use a ManualResetEvent from a finalizer, as long as the class containing that finalizer is not itself derived from CriticalFinalizerObject.
  2. Some objects and methods are special, such as Console and some Thread methods. They can be called from finalizers even when the program is exiting.

In general, it is better not to touch managed resources in finalizers. Shutdown logic, however, is necessary for any non-trivial software. Windows.Forms keeps its shutdown logic in the Application.Exit method. When you develop your own component library, the best option is to tie shutdown logic to IDisposable: a graceful shutdown when IDisposable.Dispose is called, and an abortive one otherwise.
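A minimal sketch of tying an object's shutdown to IDisposable, assuming a hypothetical Worker component that owns a background thread and stops it through a ManualResetEvent. The 10 ms polling interval is arbitrary.

```csharp
using System;
using System.Threading;

// Hypothetical component that owns a worker thread.
sealed class Worker : IDisposable
{
    private readonly ManualResetEvent _stop = new ManualResetEvent(false);
    private readonly Thread _thread;
    private bool _disposed;

    public bool FinishedCleanly { get; private set; }

    public Worker()
    {
        _thread = new Thread(Run) { IsBackground = true };
        _thread.Start();
    }

    private void Run()
    {
        // Wake up periodically to do work until a stop is requested.
        while (!_stop.WaitOne(10))
        {
            // ... one unit of work ...
        }
        FinishedCleanly = true;
    }

    // Graceful shutdown: signal the thread, wait for it, then release the event.
    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;

        _stop.Set();
        _thread.Join();
        _stop.Dispose();
    }

    // No finalizer on purpose: if Dispose is never called, the background
    // thread is simply abandoned when the process exits (abortive shutdown).
}
```

Because all shutdown work happens in Dispose, no finalizer ever has to touch the ManualResetEvent or the thread.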


Microsoft ran into this problem as well. The StreamWriter class owns a Stream object (in recent versions this depends on the constructor parameters - translator's note). StreamWriter.Close flushes the buffer and calls Stream.Close (the same happens if you wrap it in using - translator's note). If the StreamWriter is not closed, the buffer is not flushed and data is lost. Microsoft simply did not give StreamWriter a finalizer, thereby "solving" the shutdown problem. A great example of why shutdown logic is necessary.


Recommended reading


Much of the information about .NET internals in this article comes from the book "CLR via C#" by Jeffrey Richter. If you do not have it yet, buy it. Seriously. It is essential knowledge for any C# programmer.


Translator's conclusion


Most .NET programmers will never run into the problems described in this article. .NET will keep evolving toward higher levels of abstraction and less need to juggle unmanaged resources. Nevertheless, the article is useful because it describes the deep details of seemingly simple things and how they influence code design.


The next part will be a detailed look at how to work with managed and unmanaged resources in .NET, with plenty of examples.

Source: https://habr.com/ru/post/414873/

