Implementing Git in a corporate development system

Developers who know and know how to work with git have recently increased by an order of magnitude. You get used to the speed of command execution. You get used to the convenience of branches and easy rollback of changes. Conflict resolution is so commonplace that programmers are used to heroic conflict resolution where they shouldn't be.

Our team at Directum develops a development tool for platform solutions. If you have seen 1C, then approximately you will be able to imagine the working environment of our "clients" - application developers. With this development tool itself, an application developer creates an application solution for customers.

Our team was faced with the task of simplifying the lives of our applied people. We are spoiled by modern chips from Visual Studio, ReSharper and IDEA. Prikladniki demanded from us to introduce into the tool work with git "out of the box".

The difficulty is what it is. In the tool for each type of entity (contract, report, directory, module) there could be a lock. One developer started editing the entity type and blocked it until it completed the changes and commits them to the server. The rest of the developers at this time looking at the same type of entity read-only. The development was somewhat reminiscent of working in SVN or sending a Word document by mail between several users. I want everything at once, and maybe only one.

Each type of entity can have many processors (opening a document, validation before saving, writing to the database), in which you need to write code that works with a specific entity instance. For example, block buttons, display a message to the user, or create a new task for performers. All code is within the API provided by the platform. Handlers are classes in which there are many methods. When two people needed to correct the same code file, it was not possible to do this, because the platform blocked the type of the entity as a whole along with the dependent code.

Our practitioners went to hell. They quietly forged an "illegal" copy of our development environment, commented out a part with locks and merged our commits. The application code was kept under git, commits or through third-party tools (git bash, SourceTree and others). We made our conclusions:

Our team underestimated the willingness of application developers to get into the platform. Huge respect and honor!
The solution proposed by them is not suitable for production. A man has his hands free with git and he is able to create anything. Maintain all the diversity will be stupid, do not hate. In addition, it is necessary to train the platform customers. Documenting all the git commands applied to the platform would drive the documentation team crazy.

What do you want from Git

So giving a git out to production is no good. We decided to somehow encapsulate the logic of the main operations and limit their number. At least for the first release. The list of teams was reduced as they could and remained:

status
commit
pull
push
reset --hard to HEAD
reset to the last "server" commit

For the first release, they decided to refuse working with branches. Not that it is very difficult, just the team did not fit the time resource.

Periodically, our partners send their application development and ask: "Something does not work with us. What are we doing wrong?". In this case, the application handles someone else’s development and looks at the code. Previously, it worked like this:

The developer took away an archive with the development;
Changed the local database in the configs;
Filled someone else's development to his base;
Debugged, found errors;
Gave recommendations;
Returned the development back.

The new methodology did not fit the old approach. I had to break my head. The team proposed two approaches to solve this problem:

Store all developments in one git repository. If you need to work with someone else's decision to create a temporary branch.
Keep the development of different teams in different repositories. Deliver the settings of folders uploaded to the environment to the configuration file.

We decided to go the second way. The first one seemed more difficult to implement and, moreover, it was easier to shoot oneself in the leg with switching branches.

But the second is also not sweet. The commands described above should work not just within the same repository, but with several at once. Are there changes in entity types from different repositories? We show them in one window. It is more convenient and transparent for an application developer. By clicking the commit button, the tool commits the changes to each of the repositories. Accordingly, the command pull / push / reset "under the hood" work with physically different repositories.

Libgit2sharp

To work with git choose from two options:

To work with git installed in the system, pulling it through Process.Start and parsing the output.
Use libgit2sharp, which pulls the libgit2 library through pinvoke.

It seemed to us that using a ready-made library is a reasonable solution. In vain. A little later, tell you why. At first, the library gave us the opportunity to quickly roll out a working prototype.

First iteration of development

It was possible to realize in about a month. Actually, the fastening of the guitars was quick, and most of the time we tried to cure the opened wounds because we sawed out the old source file storage mechanism. The interface just gave everything that returned git status . Clicking on each file displays diff. It looked like a git gui interface.

Second iteration of development

The first option turned out to be overly informative. Many types of files are associated with each type of entity at once. These files created noise, and it became unclear what types of entities changed and what exactly.

Grouped files by entity type. Each file was given a human readable name, the same as in the GUI. The entity type metadata is described in JSON. They also needed to be presented in a human-readable format. Analysis of the changes in the json versions "before" and "after" began using the jsondiffpatch library, and then they wrote their own implementation of the JSON comparison (hereinafter I will call jsondiff). The results of the comparison are run through analyzers that produce human-readable records. Many files were hidden from view, leaving a simple entry in the change tree.

The end result was:

Difficulties with libgit2

Libgit2 gave a lot of unexpected surprises. Dealing with some was not possible at a reasonable time. I'll tell you what I remember.

Unexpected and hardly reproducible falls on some standard operations. "No error provided by native library" tells us the wrapper. Perfectly. You swear, rebuild the native library in debug, repeat the case that fell before, but it does not fall in debug mode. You rebuild in release and fall again.

If a third-party tool is running in parallel with libgit2sharp, say SourceTree, then commit may fail to commit some files. Or freezes when displaying diffs on some files. As soon as you try to debug, it is impossible to reproduce.

For one of our applied developers, the implementation of the git status analog took 40 seconds. Forty, Karl! At the same time, the git launched from the console worked as expected for a second. I spent a couple of days to figure it out. When looking for changes, Libgit2 looks at the file attributes of folders and compares them with the entry in the index. If the modification time is different, then something has changed inside the folder and you need to look inside and / or search in files. And if nothing has changed, then you should not go inside. This optimization seems to be in the console git. I do not know for what reason, but it was for one person that the mtime index was changed in the git index. Because of this, git every time checked for the presence of changes in the contents of ALL files in the repository.

Closer to the release, our team caved in under the wishes of applied fetch + rebase + autostash and replaced git pull with fetch + rebase + autostash . And then a lot of bugs came to us, including with "No error provided by native library".

status, pull and rebase work noticeably longer than invoking console commands.

Automatic Merzh

Files in development are divided into two types:

Files that the application sees in the development tool. For example, code, images, resources. Such files need to be merged in the way git does.
JSON files that are created by the development environment, but the application developer sees them only as a GUI. They are required to automatically resolve conflicts.
Generated files that are automatically recreated when working with the development tool. These files do not fall into the repository, the tool immediately carefully puts .gitignore.

With the new way, two different applicationists were able to change the same type of entity.

For example, Sasha will change the information on how to store an entity type in the database and write a save event handler, and Sergey will stylize the entity representation. From the point of view of git, this will not be a conflict, and both changes will merge without difficulty.

And then Sasha changed the Property1 property and asked him the handler. Sergey created Property2 and set the handler. If you look at the situation from above, their changes do not conflict, although from the point of view of git the same files are affected.
I wanted the tool to be able to independently resolve such a situation.

Approximate algorithm for merging two JSON when a conflict occurs:

Download from JSON base git.
We load ours from JSON.
We load theirs from JSON.
Using jsondiff, we form software patches base-> ours and apply to theirs. The resulting JSON is called P1.
Using jsondiff, we form software patches base-> theirs and apply to ours. The resulting JSON is called P2.
Ideally, after applying the P1 === P2 patches. If so, then write P1 to disk.
In the non-ideal case (when there really was a conflict) we offer the user to choose between P1 and P2 with the ability to finish their hands. Write the choice to disk.

After the merge, we check whether we have reached the state without validation errors. If they didn’t come, then cancel this merge and ask the user to repeat. This is not the best solution, but at least it guarantees that the second or third attempt at a merger will occur without unpleasant consequences.

Results

Prikladniki satisfied that they can legally use.
The introduction of git has accelerated development.
Automatic merges generally look like magic.
Putting the future rejection of libgit2 in favor of calling the git process.

Source: https://habr.com/ru/post/412787/

All Articles