C#: backward compatibility and overloading

Hello colleagues!

We remind everyone that we have a great book by Mark Price, "C# 7 and .NET Core. Cross-platform development for professionals." Please note that this is the third edition: the first edition was written for version 6.0 and did not appear in Russian, while the third edition was published in the original in November 2017 and covers version 7.1.


After the release of this compendium, which went through a separate scientific review to verify backward compatibility and the general correctness of the material, we decided to translate an interesting article by Jon Skeet about the known and little-known backward compatibility problems that can arise in C#. Enjoy reading.

Back in July 2017, I set out to write an article about versioning. I soon abandoned it, because the topic was far too broad to cover in a single post. A topic like this deserves a whole site / wiki / repository. I hope to return to it someday, because I consider it extremely important and I think it receives much less attention than it deserves.

In the .NET ecosystem, semantic versioning is generally encouraged. It sounds great, but it requires everyone to agree on what counts as a "breaking change". That is something I have been thinking about for a long time. One aspect that has struck me recently is how difficult it is to avoid breaking changes when overloading methods. That is (mostly) what this post is about; it is a very interesting topic, after all.
To begin with, a brief definition...

Source and binary compatibility

If I can recompile my client code with a new version of the library and everything works fine, the change is source compatible. If I can redeploy my existing client binary with the new version of the library without recompiling, the change is binary compatible. Neither of these is a superset of the other:
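To make "neither is a superset of the other" concrete, here is a small sketch of one direction: a change to a hypothetical Settings class (not used anywhere else in this post) that is source compatible but not binary compatible. The overload examples in the rest of the post mostly show the opposite direction.

// Hypothetical library v1.0
public class Settings
{
    public int Timeout;                  // a public field
}

// Hypothetical library v1.1
public class Settings
{
    public int Timeout { get; set; }     // now a property
}

// Client code
var settings = new Settings();
settings.Timeout = 30;
// Source compatible: this line recompiles unchanged against v1.1.
// Not binary compatible: a client compiled against v1.0 accesses the field
// directly in IL, and that field no longer exists, so the old binary fails
// at runtime against v1.1.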


So what are we talking about?

Suppose we have a public library version 1.0, and we want to add a few overloads to it for version 1.1. We follow semantic versioning, so we need backward compatibility. What does that mean for what we can and cannot do, and can every question here be answered with a clear "yes" or "no"?

In the various examples I will show code for versions 1.0 and 1.1, and then "client" code (that is, code that uses the library) that may break as a result of the changes. There will be no method bodies or class declarations, because they are essentially irrelevant here - we focus on signatures. However, if you are interested, all these classes and methods can easily be reproduced. Assume that all the methods described here live in the Library class.

The simplest change imaginable, spiced up with a method group conversion to a delegate

The simplest example that comes to mind is adding a parameterized method where a parameterless method already exists:

// Library version 1.0
public void Foo()

// Library version 1.1
public void Foo()
public void Foo(int x)


Even here, compatibility is incomplete. Consider the following client code:

// Client code
static void Method()
{
    var library = new Library();
    HandleAction(library.Foo);
}

static void HandleAction(Action action) {}
static void HandleAction(Action<int> action) {}

In the first version of the library, everything is fine. The HandleAction call converts the method group library.Foo to a delegate, and an Action is created. In version 1.1 the situation becomes ambiguous: the method group can be converted to either an Action or an Action<int>. So, strictly speaking, this change is not source compatible.
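If a client does run into this, it can usually resolve the ambiguity on its own side; a minimal sketch of such a workaround (not part of the original example):

// Client code, disambiguating explicitly
static void Method()
{
    var library = new Library();

    // Name the delegate type explicitly...
    HandleAction(new Action(library.Foo));

    // ...or use a lambda, which only ever matches Action.
    HandleAction(() => library.Foo());
}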

At this stage it is tempting to simply give up and promise yourself never to add any overloads again. Or you could declare such a case improbable enough not to worry about. Let's call method group conversions out of scope for now.

Unrelated reference types

Let's consider another situation where overloads have the same number of parameters. One might assume that this library change is non-breaking:

// Library version 1.0
public void Foo(string x)

// Library version 1.1
public void Foo(string x)
public void Foo(FileStream x)

At first glance, everything looks reasonable. We keep the original method, so we do not break binary compatibility. The easiest way to break source compatibility is to write a call that works in v1.0 but does not compile in v1.1, or that compiles in both versions but behaves differently.
What kind of call could expose an incompatibility between v1.0 and v1.1? We would need an argument that is compatible with both string and FileStream. But those are unrelated reference types...

The first failure is possible if we introduce a user-defined implicit conversion to both string and FileStream:

// Client code
class OddlyConvertible
{
    public static implicit operator string(OddlyConvertible c) => null;
    public static implicit operator FileStream(OddlyConvertible c) => null;
}

static void Method()
{
    var library = new Library();
    var convertible = new OddlyConvertible();
    library.Foo(convertible);
}

I hope the problem is obvious: code that was previously unambiguous and worked with string is now ambiguous, since the OddlyConvertible type can be implicitly converted to both string and FileStream (both overloads are applicable, neither of them is better than the other).

Maybe it is reasonable to rule out user-defined conversions in this case... but the code can be broken much more easily:

// Client code
static void Method()
{
    var library = new Library();
    library.Foo(null);
}

The null literal is implicitly convertible to any reference type and to any nullable value type... so, once again, the situation in version 1.1 is ambiguous. Let's try again...
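A client that genuinely wants to pass null can keep compiling against v1.1 by naming the target type; a tiny sketch of the workaround (again, not part of the original example):

// Client code
static void Method()
{
    var library = new Library();
    library.Foo((string)null);   // unambiguous: only Foo(string) is applicable
}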

Reference type and non-nullable value type parameters

Suppose we do not care about user-defined conversions, but we do worry about the null literal problem. What if we instead add an overload with a non-nullable value type parameter?

// Library version 1.0
public void Foo(string x)

// Library version 1.1
public void Foo(string x)
public void Foo(int x)

At first glance this looks fine - library.Foo(null) will still work in v1.1. So is it safe? Not since C# 7.1...

// Client code
static void Method()
{
    var library = new Library();
    library.Foo(default);
}

The default literal behaves like the null literal, except that it is applicable to any type. Very convenient - and a headache when it comes to overloading and compatibility :(

Optional parameters

Optional parameters are a problem in themselves. Suppose we have one optional parameter and want to add a second. We have three options, shown below as versions 1.1a, 1.1b and 1.1c.

// Library version 1.0
public void Foo(string x = "")

// Library version 1.1a
// Keep the existing method and add an overload with a second optional parameter
public void Foo(string x = "")
public void Foo(string x = "", string y = "")

// Library version 1.1b
// Just add the second optional parameter to the existing method
public void Foo(string x = "", string y = "")

// Library version 1.1c
// Keep the existing method but remove the default value from its parameter,
// and add an overload with both parameters optional
public void Foo(string x)
public void Foo(string x = "", string y = "")


And what if the client makes two calls:

// Client code
static void Method()
{
    var library = new Library();
    library.Foo();
    library.Foo("xyz");
}

Library 1.1a maintains binary compatibility but breaks source compatibility: library.Foo() is now ambiguous. The C# overload resolution rules prefer methods that do not require the compiler to "fill in" any optional parameters, but they say nothing about how many optional parameters get filled in.

Library 1.1b maintains source compatibility but breaks binary compatibility. Existing compiled code expects to call a method with a single parameter - and that method no longer exists.

Library 1.1c maintains binary compatibility but holds possible surprises at the source level. The call library.Foo() now resolves to the two-parameter method, while library.Foo("xyz") resolves to the one-parameter method (from the compiler's point of view it is preferable to the two-parameter method, mainly because no optional parameters need to be filled in). This may be perfectly acceptable if the one-parameter version simply delegates to the two-parameter version and the same default value is used in both cases. Still, it seems odd that the resolution of the first call changes even though the method it previously resolved to still exists.
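For reference, a minimal sketch of that delegating arrangement (an assumption about how version 1.1c might be written, not code from the original example):

// Library version 1.1c, with the one-parameter overload delegating
public void Foo(string x)
{
    // Use the same default for y as the two-parameter overload does.
    Foo(x, "");
}

public void Foo(string x = "", string y = "")
{
    // actual implementation
}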

The situation with optional parameters becomes even more confusing if you want to add a new parameter not at the end but in the middle - for example, to follow the convention of keeping an optional CancellationToken parameter at the very end. I will not go into that...

Generic methods

Type inference is not an easy business at the best of times. Once overload resolution is involved, it turns into a proper nightmare.

Suppose we have a single non-generic method in v1.0, and we add a generic method in v1.1.

// Library version 1.0
public void Foo(object x)

// Library version 1.1
public void Foo(object x)
public void Foo<T>(T x)

At first glance, not so scary ... but let's see what happens in the client code:

// Client code
static void Method()
{
    var library = new Library();
    library.Foo(new object());
    library.Foo("xyz");
}

In the v1.0 library, both calls are resolved to Foo(object) - the only method available.

The v1.1 library is binary compatible: take a client executable compiled against v1.0 and run it with the new library, and both calls will still use Foo(object). But after recompilation, the second call (and only the second) switches to the generic method. Both methods are applicable to both calls.

For the first call, type inference determines that T would be object, so the argument-to-parameter conversion is object to object for both methods. Fine. The compiler applies the rule that non-generic methods are preferred over generic ones.

For the second call, type inference determines that T would be string, so the argument-to-parameter conversion is string to object for the original method and string to string for the generic method. The second conversion is "better", so the generic method is chosen.

If the two methods behave the same way, great. If not, you have broken compatibility in a very subtle way.
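Two ways of keeping the behaviour consistent, sketched under the same assumptions (not from the original example): have the generic overload delegate to the non-generic one inside the library, or force the old overload from the client with a cast.

// Library version 1.1: the generic overload delegates to the non-generic one,
// so both resolutions behave identically.
public void Foo(object x)
{
    // actual implementation
}

public void Foo<T>(T x)
{
    // The (object) cast makes the non-generic overload the better match,
    // so this does not recurse into itself.
    Foo((object)x);
}

// Client code: forcing the non-generic overload explicitly.
library.Foo((object)"xyz");   // argument typed as object, so Foo(object) is preferred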

Inheritance and dynamic typing

Sorry, I am running out of steam by now. Both inheritance and dynamic typing can affect overload resolution in the most "wonderful" and mysterious ways.
If we add a method at some level of an inheritance hierarchy that overloads a method of a base class, the new method is considered first and preferred over the base class method, even if the base class method is a better match for the argument. There is plenty of room to get thoroughly confused.
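A small sketch of that effect, with hypothetical BaseLibrary / DerivedLibrary types (not from the original article):

class BaseLibrary
{
    public void Foo(string x) { }    // the better match for a string argument
}

class DerivedLibrary : BaseLibrary
{
    public void Foo(object x) { }    // added later, on the derived type
}

// Client code
static void Method()
{
    var library = new DerivedLibrary();
    // Resolves to DerivedLibrary.Foo(object): because an applicable method is
    // declared on the derived type, BaseLibrary.Foo(string) is not even
    // considered, despite being the better conversion for "xyz".
    library.Foo("xyz");
}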

Something similar happens with dynamic typing (in client code); at some point things simply become unpredictable. You have already sacrificed a great deal of compile-time safety... so don't be surprised when something breaks.
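As a rough illustration of the dynamic case (assuming the Foo(object) / Foo<T>(T) pair from the previous section): with dynamic, overload resolution moves to execution time, so even an already-deployed client binary can change which overload it calls when the library is upgraded.

// Client code
static void Method()
{
    var library = new Library();
    dynamic value = "xyz";

    // Binding happens at runtime, using the runtime type (string).
    // Against library v1.0 this can only bind to Foo(object);
    // against v1.1 the runtime binder prefers Foo<string>(string),
    // even though the client was never recompiled.
    library.Foo(value);
}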

Summary

I tried to keep the examples in this article fairly simple. Things get very complicated, very quickly, once many optional parameters are involved. Versioning is a tricky business; my head hurts.

Source: https://habr.com/ru/post/414223/

