Cloud tales



A warning right away: these are not stories like the one about the rat shredded in the trash room. Just small stories about how people do things, which may be useful to administrators. Or may not be. We have quite a few fairly large customers, and accordingly they have competent admins. From experience I will say: the most experienced ones are often found where the budget constraints are strictest.

We have a customer who resells our cloud (as it turned out), there is a bread-baking line controlled from the cloud, there is even a deployed CTF for training hackers. But let's start with the moves, including those caused by Moscow's roads being dug up by excavators.

Highest data transfer rate


A customer was moving from a data center in the regions to our cloud. Since there was a lot of data, he decided to simply load the disks into an ordinary truck. Things did not quite work out with the truck, so in the end he booked a compartment on a train. I can picture it: he bought out the whole compartment, packed the drives into boxes, locked himself in, and sat there being paranoid. As a result, he set our record for data transfer speed, just like the old joke about the bandwidth of a truck full of tapes.

Mass migration


When Moscow started digging, we received a whole wave of migration requests. Someone's power cable got cut, and while it was being repaired, people moved into the cloud (since we migrate quickly under a guarantee, before the contract is even signed). They took a closer look, stayed, and gradually transferred applications from their own servers to us as their hardware came off warranty.

The most interesting story involved one small server room in Khimki. There, the financial director would not allocate money for upgrades; as a matter of principle, they ran hardware in 5–6-year cycles.

The logical consequence of this approach: at the end of the cycle everything was on its last legs, and there were problems even with the disks for the production database. Every week something failed, so they would pick the least critical applications and shut them down. In the end, the Khimki excavators gave the admins a rare chance to get a move to the cloud approved.

A similar story happened during the reconstruction of the Kaluga highway, but there the company moved its office at the same time and decided not to build a server room in the new location, going straight to our data center instead.

Urgent Relocation


If you urgently need to move, you can simply bring over a physical storage system. There are such stories, and not without oddities. The customer's admin handed us the hardware and said: "Copy it over." "What is on it?" we asked. "Very important documents," he replied. We started copying the files, and the admin, simply reading the file names, slid down the wall laughing. On the file share with "important documents", which came to almost 9 TB, was the TV series House M.D. Naturally, they later cleaned up what the customer had failed to delete in the rush.

Forgotten flash drive


We launched the new environment after a planned move. We were bringing up the applications with the customer's admins when one of them slapped his forehead: "I forgot the flash drive." Well, forgotten is forgotten, bring it later, no big deal. It turned out this was not just a flash drive but an authorization key for a business-critical application. At night he had to rush to the old data center, find the USB key in one of the servers, pull it out, and bring it to us. We plugged it into a standard USB hub in the cloud, presented it to the virtual machine as if it sat in a physical port, and everything came up.

Monitor in the cloud


A call comes in: "Do you have a monitor in the cloud?" We: "A WHAT?" Customer: "Well, I'm bringing a server over, and remote access isn't configured on it. I need a monitor in the cloud." As it turned out, they had bought a new desktop computer (roughly gaming-grade) and deployed services on it. The configuration really was almost server-class; they had maxed out the memory. So they brought it to the cloud, plugged it into the VLAN with their virtual machines, and left it there. When it is fully depreciated they will get rid of it, but for now it just stands there awkwardly upright, holding its own against the servers. We did have a monitor, so setting it up was simple.

Autoscaling


One of our large retail customers has auto-scaling, that is, resources are consumed as they are needed. They run Zabbix for monitoring, with triggers configured on the load of each service. Take a web node: when the load average reaches 0.8, Zabbix calls an external Terraform script that creates a new virtual machine through the API. The machine is provisioned with Ansible, packages are installed, the release is deployed. The load balancer picks it up and updates its config. Deployment takes 5–10 minutes. When the total load drops below a certain level, that same node is removed.
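The decision logic behind such a trigger is simple; here is a minimal sketch of it in Python. The thresholds and the minimum node count are illustrative assumptions, and the actual actions (Terraform run, Ansible provisioning) happen outside this function, as described above.

```python
"""Sketch of the scale-up/scale-down decision a Zabbix trigger
makes for a web tier. Thresholds and min_nodes are invented for
illustration; the real actions go through Terraform and Ansible."""

SCALE_UP_THRESHOLD = 0.8    # load average that triggers a new node
SCALE_DOWN_THRESHOLD = 0.3  # total load below which the extra node goes away

def scaling_action(load_average: float, node_count: int, min_nodes: int = 2) -> str:
    """Return which action the monitoring trigger should take."""
    if load_average >= SCALE_UP_THRESHOLD:
        return "scale_up"    # Zabbix runs the external Terraform script
    if load_average <= SCALE_DOWN_THRESHOLD and node_count > min_nodes:
        return "scale_down"  # the extra node is destroyed
    return "none"

print(scaling_action(0.85, 3))  # scale_up
print(scaling_action(0.20, 3))  # scale_down
print(scaling_action(0.20, 2))  # none: never drop below the minimum
```

The 5–10-minute deployment time means the thresholds have to be conservative enough that a new node is ready before the existing ones saturate.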

Their database is configured master-master, so it also scales easily. Incidentally, disk performance with us can be scaled with a single API request, and they, along with a number of other clients, use that actively.

In the end it looks like a crutch, but a beautiful one. The savings are roughly 25–30%, with full readiness for load peaks.

Autoscaling for legacy


A state-owned company does its scaling differently. They have a legacy architecture (read: something runs on Fortran, something on similarly old stacks, and here and there, for compatibility, an old hypervisor has to run inside a new one). They cannot scale horizontally across machines. But they shut down their VMs at night and restart them on a lighter resource tier. During the day they run the most powerful cloud machines; at midnight some of them are stopped and identical but much cheaper ones are started instead, with slower disk access and different core quotas. This is scheduled via Exapark, our system that runs externally. At 6–7 in the morning everything happens in reverse. This functionality is available to all cloud customers, but these admins knew exactly what they wanted and how. The result is solid savings, since cloud resources are billed hourly.
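Because billing is hourly, the whole optimization boils down to a clock-driven flavor switch. A minimal sketch, with invented flavor names and the boundary hours taken from the description above:

```python
"""Sketch of the day/night flavor schedule. Flavor names are
hypothetical; the boundaries (midnight down, ~7:00 up) follow
the story above."""

def flavor_for_hour(hour: int) -> str:
    """Pick a VM flavor by local hour: powerful machines during
    the working day, cheap slow-disk ones overnight."""
    if 7 <= hour < 24:           # daytime: full-power machines
        return "large-fast-disk"
    return "small-slow-disk"     # 00:00-07:00: cheap tier

# An external scheduler stops one flavor and starts the other
# at the boundary hours; with hourly billing, every night hour
# on the small flavor is direct savings.
print(flavor_for_hour(12))  # large-fast-disk
print(flavor_for_hour(2))   # small-slow-disk
```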

An unusual access pattern


We have AWS-compatible object storage. As a rule, it is used like ordinary S3. But one customer writes to it directly from a mobile application, with no intermediate tier. The app exists for iOS and Android, and thousands of merchandisers work through it, uploading all their photos and reports straight from the phone to object storage. The apps, by the way, are written with the AWS SDK; only the endpoint is different.
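"Only the endpoint is different" is literal: with the AWS SDKs you point the client at another URL and everything else stays standard S3 usage. A sketch with boto3 (the Python SDK); the endpoint, bucket, and credentials here are made up:

```python
"""Illustrative sketch of targeting S3-compatible storage with an
AWS SDK: ordinary client configuration plus a custom endpoint_url.
All names and credentials below are invented."""

def s3_client_kwargs(endpoint: str, key_id: str, secret: str) -> dict:
    """Arguments you would pass to boto3.client() to target an
    S3-compatible service instead of AWS itself."""
    return {
        "service_name": "s3",
        "endpoint_url": endpoint,          # the only real difference
        "aws_access_key_id": key_id,
        "aws_secret_access_key": secret,
    }

kwargs = s3_client_kwargs("https://s3.example-cloud.ru", "KEY", "SECRET")
print(kwargs["endpoint_url"])  # https://s3.example-cloud.ru

# With boto3 installed, an upload is then one call:
#   client = boto3.client(**kwargs)
#   client.put_object(Bucket="reports", Key="photo.jpg", Body=data)
```

The iOS and Android SDKs expose the same endpoint override, which is what lets the mobile apps talk to the storage directly.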

The breaker gets pulled in a month


There is a company that buys up businesses that are about to die and prolongs their life. One French company was about to leave Russia because of the sanctions, and our customer bought the business. The entire French infrastructure sat in an ordinary Moscow office, in one tightly packed server rack. Everything had to be moved to the cloud within a month: miss the deadline, and the power simply gets cut, end of story. Meanwhile there were shipments from warehouses, trucks waiting. And some things could not be verified until the old infrastructure was switched off completely. Naturally, nobody wanted to be left without shipments on the first of the month, so we agreed with the admins that they would power down the office server rack on the Sunday before the end of the month while we watched the new environment come up. It came up. Another difficulty was that the parent company would not give access to the native database migration tools, so they suffered through that as well.

When leaving, turn off the VMs


One of our customers, an aggregator, consists almost entirely of developers. They are very fast and very heavily scripted. They moved to the cloud mainly for test environments. Previously their problem was that many different development teams kept coming to the admin asking for resources, and it was unclear what anything cost: there was no budget split, everything was counted by hand. They moved to our cloud, wrapped everything in automation scripts, and sorted it out. After the first day they made no more calls to the GUI and only a couple to the console: everything runs in a highly automated infrastructure, and their tooling can deploy whatever they want at any time. Now the push from the finance people is "turn off the machines when you leave", so that everyone cleans up after their tests. Given that they write their infrastructure entirely as code (IaC), I assume they will automate this too, if they have not already.
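A common way to automate "turn off the machines when you leave" is a TTL sweep over tagged test VMs. A minimal sketch; the inventory format, tag names, and TTL are all assumptions, and the actual delete call to the cloud API is left out:

```python
"""Sketch of a 'clean up after your tests' policy: find any
test-tagged VM older than its TTL. Inventory shape, tags, and
the 8-hour TTL are hypothetical."""

from datetime import datetime, timedelta, timezone

TEST_VM_TTL = timedelta(hours=8)  # roughly one working day

def expired_test_vms(inventory, now):
    """Return names of test-tagged VMs whose TTL has run out."""
    return [
        vm["name"]
        for vm in inventory
        if vm.get("env") == "test" and now - vm["created"] > TEST_VM_TTL
    ]

now = datetime(2018, 6, 1, 18, 0, tzinfo=timezone.utc)
inventory = [
    {"name": "ci-runner-1", "env": "test",
     "created": datetime(2018, 6, 1, 8, 0, tzinfo=timezone.utc)},
    {"name": "billing-prod", "env": "prod",
     "created": datetime(2018, 1, 1, tzinfo=timezone.utc)},
]
print(expired_test_vms(inventory, now))  # ['ci-runner-1']
# A cron job would then call the cloud API to delete each name;
# production VMs are never touched because they lack the test tag.
```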

A similar picture emerged with another development shop: they used our standard project accounts feature. These are separate roles for each group of admins, so it is clear how much money is spent on which activity. A project account is essentially a role through which you can create private clouds and deploy resources inside them. Rights and access can be carved up however you like, there are upper limits, and billing is separate.
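The "upper limits" part is just a per-project quota check on every resource request. A sketch of the idea; the resource names and numbers are invented:

```python
"""Sketch of per-project upper limits: a request is granted only
if it keeps the project under its quota for every resource.
Resource names and quota values are illustrative."""

def within_quota(usage: dict, request: dict, quota: dict) -> bool:
    """Would granting `request` keep the project under its limits?"""
    return all(
        usage.get(res, 0) + amount <= quota.get(res, 0)
        for res, amount in request.items()
    )

quota = {"vcpu": 64, "ram_gb": 256}
usage = {"vcpu": 60, "ram_gb": 200}
print(within_quota(usage, {"vcpu": 2, "ram_gb": 16}, quota))  # True
print(within_quota(usage, {"vcpu": 8, "ram_gb": 16}, quota))  # False
```

Combined with separate billing per project account, this is what makes the "who spends what" question answerable without manual counting.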

Fun uses


We usually do not know how customers use our cloud. But sometimes the admins themselves tell and show us. So: there is a blockchain node; there is a CTF from one security company, essentially a simulator of a corporate network that you connect to and try to break, used for training employees and end customers. Another customer coordinates cleaning services through the cloud; there is the bread-baking control system (industrial automation); there is a medical service with video consultations for patients (they have a very complicated personal-data story, so they sit in a dedicated, specially protected segment). There are a couple more customers: one sells a service, and the other writes the software for the first. They are from different countries, and both sit side by side in our cloud. Another retail company is with us solely to fight off raiders, who cut the power to its server room about once a month. There were also requests for spammer-like services: "We want to be located in Russia. Send several million emails per hour. It must be in the Russian Federation. And Russian software." That last one we refused. The lighting controller of one office (dynamic lighting based on the weather and people's presence in the rooms) is connected directly to the cloud so that the contractor can get into it for maintenance. That is our first IoT in the cloud.

Source: https://habr.com/ru/post/414039/

