The attentive reader scrolls through the feed and asks: “What, another article about agile?” Yes.
This article is about processes, technical details, and a little about how agile lives and is implemented at Yandex.Money. If you are at least halfway to real agile, some things may seem obvious to you, and that is fine.
Under the cut: test stands, locking people in meeting rooms, and how to manage a department when everyone has dispersed into teams and is enjoying life.
The attentive reader will also ask: "Why the dark side? Is this about Darth Vader?" Alas, no. It is about the dark side of the Moon, which was unknown to mankind until a probe photographed it and showed it to everyone. When you introduce agile, you are planning a Moon expedition without knowing what is on the other side.
It all starts with an attempt to introduce new development processes. Whether it is Scrum, kanban, scrumban, or some homegrown reinvention of the wheel is not that important.
A classic development department is usually headed by a resource manager. He tells the outside world: "Do not go to the developers, come to me, I will distribute everything." One day this manager allocates a dedicated stream for the first time, because a special customer has appeared. Then more such customers appear inside and outside the company, conflicts start, the struggle for resources begins, and the manager has to settle all of it. This is also stream allocation: Java to the left, JavaScript to the right.
This game continues up to some critical point, after which the company accepts the idea that agile is definitely needed right now. Product teams appear, because nothing is more valuable to a product owner (PO) than dedicated resources and a team of their own. Product owners are happy, because with a dedicated team it is easier to take responsibility for functionality and to carry the burden of P&L, traffic, and other KPIs.
It sounds right, but in real life it works out a little differently.
In most classic development departments it is more convenient to work with a monolith. It comes with all the usual attributes: a release cycle of 3-4 weeks, long testing, and long builds. Sometimes a monolith is fine.
But dedicated teams do not work that way. In general, the world came to microservices because everyone started switching to small teams. Yes, this means the code sprawls and everything becomes harder to control.
On the other hand, we speed up product delivery and roll out releases more often, but run into testing problems. Those also have to be solved somehow.
Reforming testing
If you have one team and one test stand, everything is fine and you need not worry (or you can worry, but about something else). Often in such cases the stand is not even considered critical: it is treated as an auxiliary tool, like mail or corporate chat. Everyone keeps a close eye on production, and that is fine too.
If you have already set off for the agile Moon, the test stand is what will slow down the whole process, and here is why.
Life story: at one company, the entropy around agile began to grow too fast. At that point the testers started booking the single test stand in a calendar: they split the time into half-hour slots and tried to somehow control the chaos.
The stand is supposed to be used by 20 teams, but they cannot use it, because one of them broke everything.
About test stands
Once upon a time we had several monoliths, each with its own test stand, and everyone was happy. Then we ran a difficult project to split the stands, formed dedicated teams, and ended up with 20 stands.
Now there are 70, and we are aiming for 200, so that everyone gets one for free and no one leaves offended.
From a conversation with an admin:
- Tell me, where is the deployment automation?
- A rollout every two weeks takes an hour, what is there to automate?
In reality it looks like this: (200 stands + production) * (50+ components) = a lot of rollout effort. We now have more than 50 components, and a robot rolls them out. Without it, things would be bad.
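The arithmetic above can be made concrete with a small sketch. The stand and component counts come from the text; the function names and the figure of 5 minutes per manual rollout are assumptions chosen only for illustration.

```python
def rollout_targets(stands: int, components: int, production_envs: int = 1) -> int:
    """Number of (environment, component) pairs that have to be kept deployed."""
    return (stands + production_envs) * components


def manual_hours(targets: int, minutes_per_rollout: float) -> float:
    """Hours needed to roll every pair out once by hand."""
    return targets * minutes_per_rollout / 60


today = rollout_targets(stands=70, components=50)
goal = rollout_targets(stands=200, components=50)
print(today, goal)  # 3550 10050

# Assumed cost: 5 minutes of admin time per manual rollout.
print(manual_hours(goal, minutes_per_rollout=5))  # 837.5
```

Even with an optimistic per-rollout cost, refreshing every stand once by hand is measured in hundreds of hours, which is why the rollout has to be done by a robot.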
At this stage, a company moving toward agile gets automated build and delivery to production, teamwork gradually improves, and the numbers grow as well: up to 60-80 releases per week, and each component will be released 2-3 times a day. At this point everyone realizes that the system has become too big for one person to hold in their head.
In any team that supports a monolith there will be a couple of old-timers. They have been here from the very beginning and remember every bug of all time, along with the strange logic that the business once pushed through.
Life story: “In general we try to reach a client 3 times, but this client is special, so we try 100 times. There is a correction factor for that. Do not touch it, it is there for a reason.”
Keeping the stands running takes so many resources that operations become “golden”: multiply the whole farm by the number of test stands, add production, get confused, and finally call the admins.
Other monitoring
Admins will come and say: "We have everything covered by monitoring." That is fine, but with one caveat: the monitoring should be custom.
“Hardware” metrics, the amount of memory consumed by Java, the temperature of every core of every processor: all of this is useful, but it does not always help when there is an incident with a client. Business metrics will be just as dumb unless you make them custom. The world is complicated: it rarely happens that all your ideal clients use your ideal API perfectly. Everything is built by people, and everyone has to adapt, sometimes clients to you, sometimes you to clients. Like at a nuclear power plant.
Life story: we spent a long time hunting down and fixing a bug in production. After the fix, several of one client's processes broke, because they had been built around that bug.
At moments like this you have to add custom monitoring, because without per-client exceptions and aggregation it simply will not work. This, by the way, is why people so often talk and write about machine learning and complex systems that detect problems instead of a person.
Every six months we have to review our monitoring, because business expectations grow over time. It goes like this: everything in the company is built and under control, and then the business brings in a new client who needs an SLA 10 times stricter. And the whole story starts again.
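As a rough illustration of what "exceptions and aggregation" can mean, here is a minimal sketch of a per-client error-rate check. It is not the monitoring used at Yandex.Money: the class name, the 5% default threshold, and the client names are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ErrorRateMonitor:
    # Assumed default: alert when more than 5% of a client's requests fail.
    default_threshold: float = 0.05
    # Per-client exceptions for clients known to behave differently.
    client_thresholds: dict = field(default_factory=dict)
    total: Counter = field(default_factory=Counter)
    failed: Counter = field(default_factory=Counter)

    def record(self, client: str, ok: bool) -> None:
        """Aggregate one request outcome under its client."""
        self.total[client] += 1
        if not ok:
            self.failed[client] += 1

    def alerts(self) -> list:
        """Return clients whose error rate exceeds their own threshold."""
        result = []
        for client, count in self.total.items():
            rate = self.failed[client] / count
            limit = self.client_thresholds.get(client, self.default_threshold)
            if rate > limit:
                result.append(client)
        return sorted(result)


# "legacy-shop" is a hypothetical client that is allowed a higher error rate.
monitor = ErrorRateMonitor(client_thresholds={"legacy-shop": 0.5})
for i in range(10):
    monitor.record("legacy-shop", ok=(i % 5 != 0))  # 2 failures out of 10
    monitor.record("new-shop", ok=(i % 5 != 0))     # 2 failures out of 10
print(monitor.alerts())  # ['new-shop']
```

Both clients have the same 20% error rate, but only the one without an explicit exception raises an alert. A single global threshold would either miss the problem or fire constantly for the special client.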
If you get past this, the system will work quite well in all cases except “umbrella” projects.
Umbrella projects
One example is the introduction of 54-, when the state says: “Now restructure all cash register processes in the country.” Or when marketing has already paid for the campaign, the product still needs a lot of work, the deadline is real, and heads will roll if it is missed. Or when someone from top management simply shows up, for whatever reason, with a project and a deadline of their own.
Spoiler: few people on the market understand how to run such projects. You can buy various add-ons on top of Scrum and kanban, you can read success stories, but practice shows that running such projects by the book costs more. In addition, all these SAFe and Lean frameworks are expensive in administration and resources, and they also require expensive, complex competencies that are hard to find on the market.
Life story: Spotify is one of the exemplary agile companies. At some point they came up with a family subscription but could not ship it, because synchronization and planning were too hard across teams that each had their own roadmap and plans. A year later, Google and Apple rolled out family subscriptions.
Synchronization and scheduling conflicts
The main problem with umbrella projects is synchronizing all the participating teams. It comes down to the fact that people do not like to negotiate.
This shows up in many places, starting with Scrum, when people cannot agree even within one resource department. In agile, you have to synchronize and coordinate everything that happens across several teams. And if at some point you stop demanding collaboration from everyone, each team returns to its favorite dark corner and works on its own. This leads to failure.
Life hack: if two months are left before the next law takes effect, or before the advertising campaign, or before the boss's deadline, take people from 4 teams, lock them in one room, give them food and water, and supervise. It is crude, but it works. If you try to synchronize the usual way with limited time, you will fail the project.
In general, synchronization is necessary, and you cannot move forward without it. It complicates life in projects with hard deadlines and high criticality: schedules slip by 10% to 50%, and that is often unacceptable.
How to manage this?
A classic manager in a distributed department does not understand what their role is, because they were taught the paradigm “I hand out tasks to everyone”, and now have to work with “I have no people at all, so why am I in the company?”.
Worst of all are the control freaks, who do not let a single task solved by the department slip past them, arrange double public code reviews, and control literally everything. When their people are handed out to teams, they ask: “Why am I here?” The answer: so that developers across all teams exchange information, grow in the same direction, and the system does not sprawl.
In general, when a manager asks that question, the manager has to be either replaced or trained.
Trained, because many managers (ours included) grow out of engineers, and no one has ever taught them soft skills. We believe this matters, so we once went to HR and asked for a large two-year course for managers, from the fundamentals to performance and non-financial motivation.
Culture in IT
In agile there is another subtle point, which concerns how teams are organized. When developers agree on something inside the team, they can start defending the team's interests and forget about the company's.
Ideally, people realize that there is someone else around their agile Moon: a security service with its own requirements; an architecture that was designed that way for a reason; other teams whose interests need to be considered. We try to identify, cultivate, and encourage such behavior.
Agile - tip of the iceberg
This path has three important characteristics.
It is long. For example, DevOps appeared on the market five years ago, and introducing it now takes 1-2 years, depending on the size of the company. If you start only when you already have queues for the test stands, you are guaranteed six months of hell, because the admins will be torn between everything at once.
It is expensive. Implementing agile and moving along this path is only possible when the business and the company are strong, and you understand that you will still have to grow in the future.
There are no people. Agile requires new competencies, and not many people have them. The result is a vicious circle: no people -> everything is done poorly -> no money -> no way to hire people.
Three conclusions
Do not touch "classic" development departments unless you have to. Yandex.Money has a hybrid system: there are product teams, but there are also departments that cope with their work effectively without agile.
If you do not need to rebuild the whole company, but want or need to build a new product faster using new approaches, it is sometimes easier to hire outsourcers who work in agile and hand the product over to that external team.
If an IT transformation is inevitable, it is better to agree on everything in advance, "on shore". Conclude a kind of “gentlemen's agreement” with management: what the budget will be for hardware and for people (new positions for system administrators, testers, and developers). If anything goes wrong, you can periodically return to this agreement and discuss what was done and how.
All of the above has one catch. Walking this whole path != arriving at success. Not walking it = guaranteed failure.
But if you're already on the road - good luck to you!
For those who learn better by ear
This text is a retelling of a talk given at Agile Days by Dmitry Kruglov, technical director of Yandex.Money. If you would rather listen, here is the video.