
What prevents you from successfully combining math and business?
This is the first in a series of articles on how to properly embed big data tools in a business.
A small spoiler: everything will work out if you keep the business itself in mind.
Five years ago, large companies wanted to adopt newly fashionable big data, but few of them actually experimented with it. The exceptions were those that clearly had a wealth of data: telecom, banking, internet companies. In 2018, businesses come looking for big data expertise themselves, and from the most unexpected sectors: metallurgy, insurance, aviation.
Where does a model start?
Big data is no longer a magic mantra (blockchain wears that crown now). But it has not yet shaken off its main myth:
“A more or less competent mathematician can sketch a model on a piece of paper, quickly implement it, and after that you can sip a cocktail and watch sales grow.”
I am exaggerating, of course, but not by much. Here is an example from our practice.
Take a manufacturer of building bricks: not a small one, with experience and well-established sales. At this stage companies often ask themselves: how else can we cut costs and increase profits?
The candidate for improvement was logistics. Brick delivery was chaotic and customer demand was hard to estimate in advance, so fuel costs and vehicle wear were painfully high. Having learned about big data, the company decided: we will predict when bricks are about to run out at a client's construction site so that we can deliver there just in time. They analyzed historical data and built a model that promised attractive optimization percentages.
The established way of working spoiled all the joy. First, trucks had to be found for prompt delivery and routes had to be planned. Second, those trucks could come to the warehouse for loading only in strictly defined time slots, because the arrival schedule for customers' own trucks was drawn up several weeks in advance, and customers could not be moved. So the efficiency gains went up in smoke.
It turned out that we started with the usual “let's predict” and ended up with a transformation of the business process.
A big data task has two formulations: a business one and a mathematical one, in exactly that order. Before you sit an analyst down to build a model, you need to go through three stages.
1. Define the task from a business perspective.
Suppose we want to tackle customer churn. We decide to predict which group of customers is close to leaving for a competitor, and we will put together all sorts of perks to keep them.
At first glance the task is trivial. The analyst builds a model on historical data, covering both departed and loyal customers, to identify the features that distinguish the two groups. For example, in a real case at a mobile operator, churn of an anonymous subscriber meant that the subscriber stopped using the service. But how long (a week, a month, a year) must the subscriber stay inactive before being recorded as “churned”?
This can be defined in different ways. One option is a ready-made business rule. Another is historical data: how often do subscribers who have not used the service for a month come back? What if as many as 10% do? For example, the subscriber may have been on a long business trip, or may have fallen for another operator's limited-time promotion.
The important point here: who counts as a “churner” is entirely a business decision.
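The question of how often silent subscribers come back can be checked against an activity history. Below is a minimal sketch, not the operator's actual method: the function name, the input layout (a dictionary of subscriber IDs to activity dates) and the sample data are all assumptions made for illustration. For each candidate inactivity threshold it counts silent spells of at least that length and the share of them that ended with the subscriber returning.

```python
from datetime import date


def return_rate_by_threshold(activity, observation_end, thresholds_days):
    """Share of silent spells of at least N days that ended with a return.

    activity: dict of subscriber id -> list of dates with any usage
              (hypothetical layout, chosen for this sketch).
    observation_end: last day covered by the history.
    """
    rates = {}
    for threshold in thresholds_days:
        silent = 0    # spells of inactivity at least `threshold` days long
        returned = 0  # those spells after which the subscriber was active again
        for dates in activity.values():
            if not dates:
                continue
            dates = sorted(dates)
            # A gap between two activity days means the subscriber came back.
            for prev, nxt in zip(dates, dates[1:]):
                if (nxt - prev).days >= threshold:
                    silent += 1
                    returned += 1
            # The final gap runs to the end of the history: no return observed.
            if (observation_end - dates[-1]).days >= threshold:
                silent += 1
        rates[threshold] = returned / silent if silent else None
    return rates


# Made-up history for three subscribers.
activity = {
    "a": [date(2018, 1, 1), date(2018, 1, 5), date(2018, 3, 1)],
    "b": [date(2018, 1, 1), date(2018, 1, 10)],
    "c": [date(2018, 1, 2), date(2018, 2, 20), date(2018, 2, 25)],
}
observation_end = date(2018, 6, 30)
print(return_rate_by_threshold(activity, observation_end, [7, 30, 90]))
# {7: 0.5, 30: 0.4, 90: 0.0}
```

A table like this does not pick the threshold for you. It only shows the business what each definition of “churned” would cost in subscribers who were wrongly written off.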
The bare minimum for any big data unit is two roles. The first is a data scientist, who handles the mathematics and model building. The second is named differently from team to team: product owner, product manager, business analyst. This person is responsible for formulating the problem correctly. Their mission is to dig into the subtleties of the customer's business, select the tools it needs, and stay in active communication with all parties.
2. Check the business case.
Fine, we will settle on the model. But how much will the optimization cost us?
Take churn again. To retain customers who may leave, you can call them or send them the right message. Or, if resources allow, offer a bonus. You can also offer the customer a more economical tariff based on an analysis of their spending.
But once we are talking about bonuses, this is money we spend on such customers. That would be fine if we knew for sure that the customer would leave if nothing were done. But models are not perfect in their predictions. Some customers we will retain correctly. But, say, 20% of the predicted “churners” will not actually be churners, and we will offer bonuses to them as well. How much money that will take, and whether it is acceptable in our case, depends on the size of the customer base and the scale of churn, so you need to calculate absolute numbers.
These are known as errors of the first and second kind: false alarms and missed churners. We must be sure that deploying the model will give more than it takes away, and that the difference is acceptable to us. Requirements for the model are set before it is built. They may turn out to be such that there is no point in spending the data scientist's time at all.
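The arithmetic behind this check fits in a few lines. The sketch below is an illustration under stated assumptions, not a real business case: the function name, its parameters and all the numbers are hypothetical, and it covers only the cost of false alarms, not the losses from churners the model misses.

```python
def retention_campaign_net(flagged, precision, save_rate, customer_value, bonus_cost):
    """Rough net effect of offering a bonus to every flagged customer.

    flagged:        customers the model marks as likely to leave
    precision:      share of flagged customers who would really leave
    save_rate:      share of real churners who stay after the offer (assumed)
    customer_value: money kept per retained customer (assumed)
    bonus_cost:     cost of one bonus offer
    """
    true_churners = flagged * precision
    false_alarms = flagged - true_churners
    retained = true_churners * save_rate
    gain = retained * customer_value
    cost = flagged * bonus_cost  # everyone flagged gets the bonus
    return {
        "gain": gain,
        "cost": cost,
        "wasted_on_false_alarms": false_alarms * bonus_cost,
        "net": gain - cost,
    }


# Made-up inputs: 20% of the flagged customers are false alarms.
result = retention_campaign_net(
    flagged=10_000, precision=0.8, save_rate=0.3,
    customer_value=50.0, bonus_cost=10.0,
)
print(result)
```

With these made-up inputs the campaign comes out about 20,000 ahead, and exactly as much is spent on customers who were never going to leave. Lower the precision or the save rate a little and the net result goes negative, which is why the requirements for the model come before the model.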
3. Plan how the results will be used.
“The economics add up,” the business case tells us. “Can we finally build a model?”
Not yet. You need to think about what will happen to the results.
Say the model gives us 200,000 people every month who may turn into “churners”, and we decide to call them. Will we have time to get through them all? The contact center's capacity is not elastic.
Another point: you need to understand the time lag between the prediction of churn and the customer's actual departure. What use is a prediction if the customer is going to leave in the very near future? We may simply not have time to contact them. But the further ahead of the departure we make the prediction, the lower its accuracy. Here again we have to find the optimum between the benefits and the risks.
And the third point: how quickly can we introduce changes into our business processes? Otherwise it may turn out as it did for the brick manufacturer.
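The first two questions, contact center capacity and lead time, lend themselves to a quick estimate. The sketch below is only an illustration: the function names, the staffing figures, the scores and the precision and reachability numbers are all made up, and multiplying precision by reachability is one simple way to compare horizons, not an established rule.

```python
import heapq


def monthly_call_capacity(operators, calls_per_operator_per_day, working_days):
    """How many customers the contact center can call in a month."""
    return operators * calls_per_operator_per_day * working_days


def pick_call_list(scores, capacity):
    """Keep only as many customers as we can call, highest churn score first.

    scores: dict of customer id -> predicted churn probability.
    """
    return heapq.nlargest(capacity, scores, key=scores.get)


def best_horizon(options):
    """Pick the prediction horizon with the best precision * reachability.

    options: dict of horizon in days -> (precision, share reachable in time).
    """
    return max(options, key=lambda h: options[h][0] * options[h][1])


# Hypothetical contact center versus the 200,000 flagged customers.
capacity = monthly_call_capacity(operators=20, calls_per_operator_per_day=60,
                                 working_days=21)
coverage = capacity / 200_000
print(capacity, f"{coverage:.1%}")

# With limited capacity, call the most likely churners first.
scores = {"a": 0.9, "b": 0.2, "c": 0.7, "d": 0.4}
call_list = pick_call_list(scores, 2)
print(call_list)  # ['a', 'c']

# Longer horizons are less accurate but leave more time to react.
options = {7: (0.80, 0.30), 30: (0.65, 0.90), 90: (0.40, 1.00)}
horizon = best_horizon(options)
print(horizon)  # 30
```

If the coverage comes out at a small fraction of the list, either the model has to rank customers well enough for a short call list to be worth it, or the retention channel has to change.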
Finally
The path to a clear task for a data scientist is a task in itself.
If we have checked all three points, everything has worked out and a model has appeared, the next fun stage awaits us: integration. Model building and the related mathematics usually take about 20% of the time. The remaining 80% (and sometimes much more, depending on the company's flexibility) goes to putting the model into production, which can take up to several months.
The model is only an MVP. Everyone loves building models, because everyone likes hypothetical results. But then their introduction into real business processes stalls in most companies. After all, the hardest thing is to change an established order.
Therefore, any big data project needs a data scientist who handles the math, a product manager in charge of the business side, and a project manager with a project team. The latter handle the implementation and shake up the business process, which is sometimes painful and hard. But only in this configuration can working with big data bring benefits.
We teach this and other aspects of data analytics in business at our School of Data, in courses for analysts and for managers.
This post was prepared by the School of Data, based on a publication by the School's founder in the Business HUB of PJSC "Kyivstar".