Code review: you do it wrong


Today, many people in development use code review . Practice useful, necessary. Even if you do not make a review, you probably know what it is.

There are a lot of tools on the market for reviewing code with ready-made usage scenarios, recommendations and rules. GitHub, Phabricator, FishEye / Crucible, GitLab, Bitbucket, Upsource - the list goes on and on. We also worked with them at Badoo: in my previous article, I told our story about the review of the code and how we came up with the invention of our own “bicycle” - the Codeisok solution.

There is plenty of information, you can google a bunch of articles about the review of the code with real examples, practices, approaches, telling about how good, how bad, how to do, and how - do not need, what should be taken into account, and what is not, and so on. In general, the theme is "sucked to the bone."

That is why the other part of the iceberg can not be overlooked.

I hope that you will take seriously the things that I will describe in this article, in some cases deliberately exaggerating. A review, like any other process, requires careful and deliberate implementation, whereas the “Let's do it like everyone else’s approach” would not only fail, but even harm.

After reading this introduction, you may have a question: what is difficult in the review of the code? Things are thirsty. Moreover, most of the tools listed above immediately offer a flow review (out of the box: set, commit / push - and go!).

But practice shows that everything is not as obvious as it seems at first glance. Including due to the alleged simplicity of the process. When everything is adjusted and works, there is a risk that the manager will calm down and stop diving into the process, passing it on to the developers. And they will either follow the process, or will do something else at the expense of solving the problems that the review process is intended to solve. The manager may not realize that something is going wrong, because it does not give installations or does not impose rules. While for business it is very important to solve problems as quickly as possible, without wasting time on unplanned activities .

Once upon a time, when I was working at another company, I had a deep conversation with one of the leading developers in memory of the review of the code. It was p1lot . Perhaps some readers are familiar with it (p1lot, hello :)). He is now working with us at Badoo, which is great.

So, PILOT then said: “The most difficult thing for me in the review process is to compromise and skip the finished and correctly working task further, even if I know another way to solve it and even if I like my method more.” In my opinion, this thought is key. And the main message of my entire article somehow revolves around this postulate.

Why do I need a code review process?


Do you have a review in the company? Do you do it right? I have a test that may make you doubt it.

Only three questions need to be answered:

  1. How long does it take to review a code for an average (spherical in vacuum) problem?
  2. How do you minimize review time?
  3. How do you determine that a review of a specific task is done correctly?

If you do not have clear answers to some (or all) questions, if you doubt your answers, I dare to suggest that something is going wrong.

If you do not know how long it will take to review, and do not minimize it all the time, then how do you carry out planning? Is it possible that the performer who made the review for four hours did not drink tea half this time? Is it possible to be sure that the time for tea drinking does not increase from task to task? Or maybe the reviewer does not look at the code at all, but simply clicks "Code is OK"?

And, even if the review is done in good faith, how to ensure that with the growth of the code base and the company the time for the implementation of tasks does not increase? After all, management, as a rule, expects processes to accelerate, not slow down.

If the answer to the third question does not immediately come to mind, then it makes sense to ask one more. Do you know why you really need this process? Because "that's the way all big guys do"? Maybe you do not need it at all?

If you ask your leads and developers about what constitutes the “correct” review of the code, you will be very surprised at how many different things you will hear. Answers may vary depending on personal experience, beliefs, personal confidence in the correctness or incorrectness of things, and even on the technological stack and development language used.

Remember about the ready tools for review of the code, offering their approaches and flow? For example, I wonder how the developers of these tools would answer the question. Will their answers about the “correctness” of the review correlate with the answers of your employees? Not sure. Sometimes I sadly observe how someone implements review tools, not doubting for a minute that they are necessary. Whether people are doing this “for show”, or to report that “they have implemented the review of the code, everything is fine.” And in the end forget about it.


I do not want to be a cap, but nonetheless. Communicating with employees, pay attention to the answers like "Because it is so instituted" or "This is the best practice, everyone is doing it." You yourself know very well that you do not need to thoughtlessly introduce some process, not understanding why it is needed. Any process should be justified and implemented (if necessary), adjusted for the needs of the business and the project, as well as the problems that really exist and that really want to be solved. Cargo-cult in a company with good engineering culture is not the place.

Accordingly, it is important for the manager to convey to the people the “correct” review of the code, which is necessary in his project. And before that, of course, there is a criterion of “correctness” to formulate for oneself, choose suitable tools and establish appropriate rules. And there is already close to control.

Bad Code Review


So what is the right process? Let's figure it out. Below there will be a lot of my personal reasoning, and for those who are impatient to find out what I am writing all this, I propose to go straight to the “Good Review of the Code” part.

I say thank you to those who stayed, and nevertheless, I propose to determine the correctness of the review independently, based on the needs of your project, having thought it over carefully. And I just try to help you with this.

In order to highlight important and necessary things for yourself in any process, it is useful to start cutting off obviously unnecessary ones that harm the engineering culture. So do.

Knowledge Exchange


To the question “Why is the code review process needed?” I heard the answer “ To share knowledge ” many times. Indeed, useful and necessary thing. And many agree that it is important to have a process in the team for sharing knowledge and expertise. But are you sure that revision code is suitable for this?


I have repeatedly asked people what they mean by “knowledge sharing.” And in response, I heard different things.

First, it is a demonstration to newcomers (in a team or an affected component) of accepted rules and practices: this is how we do it, and we don’t do it, which is why.

Secondly, this is a fresh look of a novice (again, in a team or an affected component) on how ordinary things are done. Suddenly, they are done inefficiently, and a new person will help to detect it?

Thirdly, this is just an introduction to some part of the code. After all, if in the future the reviewer faces the need to change this part of the code, it will be much easier for him.

Let's go through all the points and try to assess how relevant they are in the review process.

Code review as a tool for learning newbies


In this case, we are talking mainly about this scenario: the beginner is given a task, he performs it, and then an experienced member of the team (the owner of the component) indicates errors that he made. The process is important and necessary, I do not argue - beginners need to somehow show the rules adopted in the team.

But let's be honest: isn't it too late? Wouldn't it be more correct to tell a newbie about these rules before admitting him to the code? After all, the cost of an error is very high - rarely beginners immediately work the way you need. So, the task will come back and come back for revision. So, the product will be available to the user later than it could.

It turns out that in the process, the duration of which is difficult to estimate, we add another one, the time spent on which is even less predictable. Thus, we increase the time required to deliver the task to the user by an even more unknown amount.

Of course, sometimes learning newbies through workflows has the right to exist. But sometimes we do not think about the shortcomings of the approaches that have become habit. And to make a mistake here is not just easy, but very easy.

For example, if a company does not have a streamlined process of training and coaching , the manager is forced to “throw into the water” a novice. The latter has no choice but to “swim” or leave the company. Someone really learns in such situations, but someone does not stand up. I think many have come across this throughout their professional career. My message is that the most precious time may not be spent optimally due to such a phenomenon. As well as the process of adapting a new employee in a team may not be optimal. Just because no one got around to organize an effective process of introducing new employees into the team, and the manager thought that the newcomer would learn everything at the review code stage. But in reality these are two different processes, very necessary and important.

Of course, even under the conditions that the rules were initially explained to the newcomer and that there are other learning processes in the company, the newcomer's adaptation process must somehow be controlled. A review is one of the processes that can help. But control and training are different things, aren't they? Moreover, all members of the team need control, not just beginners. After all, we all can make mistakes, forget about the agreements and somehow influence the part of the system that the whole team owns (in this case, the code).

What can be concluded? The process described above is aimed at achieving a different goal: not for training, but for control. And this very control is necessary not only for beginners, but for the team as a whole.

All this is easily stated in the following thesis: "A review of the code is necessary in order to control the observance of agreements and general rules by all team members ." And the training of newcomers in this case will be nothing more than an artificial substitution of the goal, which will only confuse people and direct the process in a different direction.

But even with this conclusion, which seemed to be correctly formulated, it is possible to break wood. A code review is a stage that occurs when the code is already written. And it may happen that the code written without observance of the general rules will be very expensive to fit into the general pattern. Our task is to inform the performer that something went wrong as soon as possible. And therefore, I argue that sometimes reviewing a code is not the most appropriate process for monitoring compliance with common rules. In many cases, compliance checks can and should be moved to earlier stages.

For example, you can check the formatting of the code in the hooks of the version control system (as well as the use of predefined classes, agreements on the location of files in the appropriate folders, the use of external dependencies and third-party libraries, etc.). This minimizes the time required for code review.

Continuing the conversation about training beginners in the process of reviewing the code, we draw an analogy. Suppose you produce some product, for example, cakes. Let's say you received an order for a wedding cake for VIPs. Do you entrust the “review” of the finished cake to the novice? Are you ready for the new pastry chef to take responsibility for the shame or success of the entire bakery at this wedding? Or maybe you want him at this stage to get acquainted with the rules adopted in the team (how to bake cakes, make cream and insist on cranberries)? Obviously, the process of acquaintance of a novice with the rules and control from his side are rather strange things at this stage: the cake has already been baked. So why we can do differently with features and code? Or the features that we launch do not bear the shame or success of our team in the eyes of users and customers?

You can argue by saying that we do not prepare tasks “for important people” every day and that it is possible to entrust the newcomer with a not very urgent or not very important task. And you will be right.

But first, what will a beginner learn from trivial tasks in which there are few code changes (and these changes are not in critical places and are certainly not so important for business, since the task is not urgent)? How can he learn to bake cakes, if all the while whipping cream? The transition from simple to complex, of course, necessary. But this must be done by carefully controlling the process. Beginners need to be taught, and not leave to learn on their own.

And secondly, even such a seemingly useful and correct approach can harm companies. To understand this is very simple, using the method of bringing to the point of absurdity , to make the conclusion as simple and clear as possible.

Imagine that we openly declare that the tasks that the product manager gives us are needed for training newcomers. Why? And just like with the review of the code: we are after all entrusting some simple and non-urgent tasks to the beginners, so that they learn how to work as we have done.

What can this result in? A zealous performer who conscientiously does his job and believes that everything he does should be done as well as possible, will be taken as a learning process. He can set the task to do several implementations instead of one, then to show the student the disadvantages and advantages of different approaches. He can give lectures comparing approaches and best practices applied in the company and beyond. He can do a lot more to train a beginner. As a result, the time required for the implementation of the task will increase, because instead of developing, we focus on training .

In the case of code review, such an ardent employee initiates a long discussion in the comments to the review about the benefits and harms of certain implementations, posts pieces of code with Stack Overflow, showing that there are other ideas in other heads, gives links to articles and books that the learner should read to understand that its implementation is imperfect.

This is a normal learning process that may exist, but which should be done correctly. And it's not always worth integrating it into the process of solving a business problem. Because these are two different processes. Let the novice make different components of the cake, let him make several options, let him spoil something and throw it away. Let someone more experienced show him how to act and retell ancient recipes. But to learn "on the combat conveyor", which produces your product for customers - not the best idea. And if you decide on this, then “drive the handle” of a beginner, not allowing him to interfere with the production process during his training.

If your company is not an institute, not a school, not a technical school, or any other educational institution, then training is not your direct responsibility. The business hired you to solve other problems and achieve other results.

By the way, that’s why we don’t add functionality to Codeisok for uploading images, styling text, highlighting code in comments, although there have been many requests for similar features. We promote the idea that code review is not a personal blog, not a messenger or a service for learning. The tool is needed for another. After all, to indicate the shortcomings of the code of a plain text comment is more than enough.

Code review as a fresh look at the code


Go ahead. The following scenario of “exchange of experience” is this: we are doing something in the team, observing internal agreements, and our eyes have become blurred, and then a new person has come (from another company or from another component) - and maybe he will see shortcomings in our organization of work.

The idea is good, it sounds reasonable. Really, what if we do something wrong?

But again we will return to the life of the task and the onset of the review of the code stage. Is it too late? The cake has already been baked, the cakes are smeared with cream, the roses are glued. The cost of the error is very high. And here we learn that in another bakery, soda, slaked with lemon juice, is added to the cakes for pomp. So what? What to do? Redo?

Obviously not, because: 1) we always did without soda, and the result was quite good; 2) it is possible that buyers take cakes from us, and not in another bakery, because they do not like the taste of soda; 3) the cake is ready, the wedding is tonight, and we will not have time to bake a new cake.

Then why all this? Sharing experience is necessary. A fresh look is needed. But, it seems to me, at a different stage. For example, before baking cakes, you can check with a beginner: “How did you do it before? And why is this way better? ”And, probably, it’s worth baking a couple of cakes in the old and new way to give our regular customers a try and get their opinion.

Thoughtless introduction of the best practices seen in articles or reports can harm the company and the product. “Soda is added to the cakes by all the big players:“ Bugl ”,“ Traisbook ”,“ Ryl.ru. ”,“ Yumdeks ”. Everybody does that. ” So what? Because of this, Peter must immediately take up the task of reworking, which is already ready?

I'm sure not. If initially, before solving the problem, you did not agree that “there will be soda in the cake”, then it is very short-sighted to demand its addition during the review of the code. Your business has no goal to have soda in the cake. If your cakes are selling so well, then it doesn’t matter whether they have soda or not.

It is necessary to understand that the exchange of experience is a different process that is not tied to the review of the code. It may occur earlier, later or in parallel with it, but it is different. You can arrange mutual workshops with competitors. Some try to steal a super-secret recipe and a super modern technique of beating cream with a perforator. Someone tries to spy on other people's ideas and improve their processes with their help. But it is strange to do it on a working pipeline, at the stage when your product is almost ready, and the cost of rework is very high.

After all, we are talking about the review stage. Review of the code that is already written and ready for the next stage. This code is waiting for the “Code is OK” button to be clicked by the viewer. And your customers are not at all prepared for the fact that you will prepare a baked cake with a new chef to show him how you bake cakes and listen to his critical remarks.

Code review as familiarity with part of the code


Perhaps it looks like the most reasonable and understandable part, with which many agree. The scenario here is meant as follows: one programmer wrote the task code, the others looked at it, studied it - and now they know how to work with it. Thus, we reduce the risk of bus factor : if a developer leaves, others will be able to successfully work with his code. And also we are ready for the fact that next time this code can be touched by someone else, in which case he will already know how to work with him.

Let's think about whether this really works as planned, and whether or not it really helps people to share competencies and knowledge about specific parts of the code.


Let's look at the situation through the eyes of the person who wrote the code. Does each task he applies the approaches and practices that must be learned by all team members?

Suppose he invented some sort of cool sorting algorithm (maybe, he has invented a long time ago and constantly uses, and maybe just that). Do all team members need to know about this? Of course yes. It is necessary that people find out that cool programmer Vasya has created an unrealistically fast and efficient sorting algorithm that will be useful to everyone.

However, in my opinion, there are not so many cool things that should certainly be integrated into general rules and standards. And they are not used in every task. So is it worth stopping the entire pipeline each time, tearing everyone off from work and showing them the code of all Vasya's tasks?

It is obvious that in many companies the whole team cannot know about Vasya's know-how at the review stage. After all, the review is not made by the whole team, but by a certain number of reviewers. If you do it all, it's too early to rejoice: with the growth of the team, it will be more and more inconvenient, and ultimately impossible. This process does not scale, and it is important to remember that the time resources that can be spent on reviewing the code are limited.

Now let's look at the situation from other developers who could review the Vasya code and evaluate its work. Do they sit idle and wait for Vasya to show him something cool in the code? Surely not - they are working on their tasks. Everyone has their own business, and the developer is not always ready to drop everything and instantly switch to a discussion of the achievements of colleagues, no matter how stunning they are.

And, probably, it would be reasonable, if Vasya had informed everyone about his invention in some other way. Made a presentation, for example. Or just wrote a letter with a piece of code and an explanation of why this is cool, why its algorithm is really cool and why it should now become a standard for everyone. The rest will read it, each in its own time, ask questions, agree or disagree with the decision. And only after that you can accept or not accept the new standard. Perhaps someone will have a better offer: a more optimal, cheaper, more quickly realizable.

Do you feel? What I am describing is not a code review process — it is a process of creating and approving common practices and approaches. Completely different process.

And at this stage, the best practices from colleagues and “In my previous company, they did it differently” or “I read that they do it differently”. So what? Best practices in general, for a spherical company in a vacuum, may be suitable. Does this mean that they will all be useful in your company? At least not always. And it certainly does not mean that someone should immediately rewrite their code, which has already been spent on writing.

Compliance with any abstract or external best practices, guidelines, fashion trends, etc. - to some rules that are divorced from the company / context - for the sake of conformity no one needs. “Static methods are considered bad form”, “Such things should now be put into treits”, “Singleton is a bad practice, it is better to use a factory and DI”. Who needs? What is considered?

Sometimes you can hear sound arguments, but fair for abstract projects and tasks, and not for real ones. Even useful and necessary things to discuss and recommend at a stage when the code has already been written are late, because the implementation will be unnecessarily expensive.

In addition, Vasya’s code may not be a cool new algorithm or know-how, but a “crutch,” a dirty hack, which he had to use for some reason. Do everyone really need to know about this in the team? If a dirty hack is justified and there was no other way to do it, then probably yes. But even in this case, you should not distract the whole team by stopping the conveyor. I doubt that Vasya himself would have liked this idea.

First, few people are ready to “distract” right now and appreciate the hack’s “dirtiness”, because everyone is busy with their tasks. And secondly, no one will remember anything. Yes, and people in the team change, and if someone in the future stumbles upon this hack, the person one way or another, the question arises: why it was done that way? And the best solution would be to add a comment directly to the code. A comment explaining the reason that compelled to resort to such a hack (remember the rules of good form when commenting on code: we describe in the comments not what we do in the code, but why we do it that way).

And if you can avoid using this dirty hack, then it is obvious that the task is to rediscover and force Vasya to redo it. This is a direct purpose of reviewing the code, and it is clear that it has no relation to familiarizing the entire team with the code (why should everyone know about the code that will still be rewritten?).

The third case is when there is nothing supernew in the task and there are no dirty hacks. It is based on all the principles and rules adopted in the team, framed in accordance with general standards, etc. So why should the whole team be distracted by it? After all, there is no bus factor, and if someone else has to work with this piece of code in the future, he will easily figure it out.

And finally, if I still could not convince you, then I have a question: how to understand that the exchange of knowledge in the review process took place?

- Petya, did you read the code Vasya wrote?
- Read.
- Got it?
- Got it.

That's the whole verification process. And where is the confidence that Petya understood everything correctly? Where is the confidence that when Petya will change this section of the code in the future, he will not break the wood? , , , — ?

, , , , , ?

, , , . , , .

, . , , , ( ), , ( , , , — , . .). , .

— , , , . ( , , . .) . — — .

Code review


« ?» . , , .

: - , , - , , . git blame , , , , , .

, , — , , , ( ) — . , , . . It sounds strange, doesn’t it?


. What to do in this case?

, , , . , , , . , - ( , ). , , — , , , .

, , . ? , .

, , , . , . , ? .

, , — , . ( )?

« - - » . , . .

: . .


, , . , . , .

, , — . , .

— . , . , . , .

, , :


? , , ? ? , , ? - , ? ? ? And so on.

, . , . , .

, , . , , , . , , , .

, best practices, . , , , , . . , , . , .

,


. , , . - , , - .

, , . , API- . , bus factor, .

, . .


- ? , . , . null , « ». .

, . , ? ?

-


, .

, code maintainability, , . , maintainability? , , , , , , code maintainability. .

. , , , , . — . ( ). .

Conclusion


That's all. — , . , , , . .

.

, . . : , , (? , -). , , . .

. , . , , , .

, , , , . . , , -.

: , , , . , , . , . , , , — , .

, , .

Source: https://habr.com/ru/post/413965/


All Articles