Database in a commercial project: what to do?

Happy end of the holidays, everyone! Some will say that is a so-so greeting. But many of you are surely about to go on vacation, so hang in there a little longer. Meanwhile, we are not slowing down, and on this warm day we are sharing our partners' experience. The topic is optimizing work with a database. Details below!



Over to the author.

Greetings, readers of Habr! We are the WaveAccess team, and in this article we share our experience of using the Azure Cosmos DB database service in a commercial project. We explain what the database is for and describe the nuances we had to deal with during development.

What is Azure Cosmos DB


Azure Cosmos DB is a commercial, globally distributed, multi-model database service provided as a PaaS solution. It is the next generation of Azure DocumentDB.

Microsoft released the database in 2017. It was developed with the participation of computer scientist Dr. Leslie Lamport (winner of the 2013 Turing Award for fundamental contributions to the theory of distributed systems, developer of LaTeX, and creator of the TLA+ specification language).

The main characteristics of Azure Cosmos DB are:


The graph shows how the different consistency levels trade off availability, performance, and data consistency.




The task that we solved


Thousands of sensors located around the world transmit information (hereinafter, notifications) every N seconds. These notifications must be saved in the database, where they can later be searched and displayed in the system operator's UI.

Customer requirements:


Given the customer's requirements, a non-relational, globally distributed, reliable commercial database suited us ideally.



Among databases similar to Cosmos DB, Amazon DynamoDB and Google Cloud Spanner come to mind. But Amazon DynamoDB is not globally distributed, and Google Cloud Spanner offers fewer consistency levels and fewer data models (only a relational, table-based one).

For these reasons, we settled on Azure Cosmos DB. We used the Azure Cosmos DB SDK for .NET to interact with the database, since the backend was written in .NET.

The nuances we faced


1. Database management

To start using the database, you first need to choose a tool to manage it. We used Azure Cosmos DB Data Explorer in the Azure portal and DocumentDbExplorer. There is also the Azure Storage Explorer utility.



2. Configure DB Collections

In Cosmos DB, each database consists of collections and documents.

Configurable collection settings that deserve attention:


Typical index example:

{
  "id": "datas",
  "indexingPolicy": {
    "indexingMode": "consistent",
    "automatic": true,
    "includedPaths": [
      {
        "path": "/*",
        "indexes": [
          { "kind": "Range", "dataType": "Number", "precision": -1 },
          { "kind": "Hash", "dataType": "String" },
          { "kind": "Spatial", "dataType": "Point" }
        ]
      }
    ],
    "excludedPaths": []
  }
}

For substring search to work on string fields, you need to use a Hash index ("kind": "Hash").

3. Database transactions

Transactions are implemented in the database at the stored procedure level: the execution of a stored procedure is an atomic operation. Stored procedures are written in JavaScript:

var helloWorldStoredProc = {
  id: "helloWorld",
  body: function () {
    // getContext() is provided by the database's server-side runtime
    var context = getContext();
    var response = context.getResponse();
    response.setBody("Hello, World");
  }
};
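To illustrate the atomicity mentioned above, here is a minimal sketch of a stored procedure that inserts several documents as one unit: if any insert fails, the procedure throws and no document is kept. The procedure name `insertAll` is an assumption made for this example. `runLocally` is a hypothetical stand-in for the server runtime, included only so the sketch can run under Node.js. It imitates rollback by discarding staged documents and is not part of Cosmos DB.

```javascript
// Stored procedure body: inserts every document or none of them.
// It relies on the server-side getContext() API of Cosmos DB.
function insertAll(docs) {
  var context = getContext();
  var collection = context.getCollection();
  var response = context.getResponse();
  var index = 0;

  if (!docs || docs.length === 0) {
    response.setBody(0);
    return;
  }
  insertNext();

  function insertNext() {
    var accepted = collection.createDocument(
      collection.getSelfLink(),
      docs[index],
      function (err) {
        // Throwing aborts the procedure, so writes made earlier
        // in this execution are rolled back.
        if (err) throw err;
        index++;
        if (index < docs.length) insertNext();
        else response.setBody(index);
      }
    );
    if (!accepted) throw new Error("Request not accepted; nothing is committed");
  }
}

// Hypothetical local stand-in for the server runtime (NOT Cosmos DB API).
// Documents are staged and copied into `committed` only if nothing throws.
function runLocally(procedure, args, committed) {
  var staged = [];
  var body;
  globalThis.getContext = function () {
    return {
      getCollection: function () {
        return {
          getSelfLink: function () { return "dbs/demo/colls/notifications"; },
          createDocument: function (link, doc, callback) {
            if (!doc.id) callback(new Error("id is required"));
            else { staged.push(doc); callback(null, doc); }
            return true;
          }
        };
      },
      getResponse: function () {
        return { setBody: function (value) { body = value; } };
      }
    };
  };
  procedure.apply(null, args); // a throw leaves `committed` untouched
  staged.forEach(function (doc) { committed.push(doc); });
  return body;
}
```

Calling `runLocally(insertAll, [[{ id: "a" }, { id: "b" }]], store)` commits both documents, while a batch containing a document without an `id` throws and leaves `store` unchanged.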

4. Database change feed

Change Feed listens for changes in a collection. When documents in the collection change, the database emits a change event to all subscribers of the feed.

We used Change Feed to track collection changes. Before creating a feed, you must first create an auxiliary AUX collection that coordinates processing of the change feed across several worker roles.
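The coordination role of the auxiliary collection can be pictured with a toy in-memory model. This is only a sketch under the assumption that the auxiliary store keeps, for each partition, the owning worker and the position already processed. All function names are hypothetical and none of them belong to the Cosmos DB SDK.

```javascript
// Toy in-memory model; every name below is hypothetical, not SDK API.

// The feed: one ordered log of changed documents per partition.
function createFeed(partitionCount) {
  return Array.from({ length: partitionCount }, () => []);
}

// The auxiliary store: one lease per partition, recording
// who owns the partition and how far it has been read.
function createAuxStore(partitionCount) {
  return Array.from({ length: partitionCount }, () => ({ owner: null, checkpoint: 0 }));
}

function write(feed, partition, doc) {
  feed[partition].push(doc);
}

// A worker claims free leases, up to `limit`, so that
// partitions end up split between the workers.
function acquireLeases(aux, worker, limit) {
  let owned = aux.filter(lease => lease.owner === worker).length;
  for (const lease of aux) {
    if (owned >= limit) break;
    if (lease.owner === null) {
      lease.owner = worker;
      owned++;
    }
  }
  return owned;
}

// A worker reads only its own partitions, starting at the saved checkpoint.
function processChanges(feed, aux, worker, handler) {
  aux.forEach((lease, partition) => {
    if (lease.owner !== worker) return;
    const log = feed[partition];
    while (lease.checkpoint < log.length) {
      handler(log[lease.checkpoint], partition);
      lease.checkpoint++; // move the checkpoint after each handled change
    }
  });
}

// Usage: two workers share two partitions.
const feed = createFeed(2);
const aux = createAuxStore(2);
acquireLeases(aux, "worker-A", 1);
acquireLeases(aux, "worker-B", 1);
write(feed, 0, { id: "n1" });
write(feed, 1, { id: "n2" });
const seenByA = [];
processChanges(feed, aux, "worker-A", doc => seenByA.push(doc.id));
```

Because the checkpoint lives in the shared store rather than in the worker, a restarted worker resumes where the previous one stopped instead of reprocessing the whole feed.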

5. Database limitations:


6. Handling the database's 429 error

When requests exceed the collection's provisioned throughput, the database starts returning the error "429 Too Many Requests". To handle it, you can use the RetryOptions setting in the SDK, where MaxRetryAttemptsOnThrottledRequests is the maximum number of retry attempts and MaxRetryWaitTimeInSeconds is the maximum total time, in seconds, spent waiting across those retries.
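The meaning of the two settings can be sketched as a small retry loop. This is an illustration, not the SDK's implementation: the option names mirror the SDK settings, the values are arbitrary examples, and `withThrottleRetry` is a hypothetical wrapper that assumes a throttled call throws an error carrying `code === 429` and a suggested delay in `retryAfterMs`.

```javascript
// Example values only; the names mirror the two SDK settings.
const retryOptions = {
  maxRetryAttemptsOnThrottledRequests: 5,
  maxRetryWaitTimeInSeconds: 10,
};

// Pure decision: may we retry once more after being throttled?
// attemptsMade - retries already performed
// waitedMs     - total time already spent waiting between retries
// retryAfterMs - delay suggested for the next retry
function shouldRetry(attemptsMade, waitedMs, retryAfterMs, options) {
  if (attemptsMade >= options.maxRetryAttemptsOnThrottledRequests) return false;
  return waitedMs + retryAfterMs <= options.maxRetryWaitTimeInSeconds * 1000;
}

// Hypothetical wrapper around any async operation that throws an error
// with `code === 429` and `retryAfterMs` when the request is throttled.
async function withThrottleRetry(operation, options = retryOptions) {
  let attempts = 0;
  let waitedMs = 0;
  for (;;) {
    try {
      return await operation();
    } catch (err) {
      if (err.code !== 429) throw err; // only throttling is retried
      const delay = err.retryAfterMs || 0;
      if (!shouldRetry(attempts, waitedMs, delay, options)) throw err;
      await new Promise(resolve => setTimeout(resolve, delay));
      attempts++;
      waitedMs += delay;
    }
  }
}
```

Keeping the decision in a pure function makes both limits easy to check in isolation: the loop gives up either when the attempt count is exhausted or when the next wait would exceed the total time budget.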

7. Forecasting the cost of using the database

To forecast the cost of using the database, we used the online RU/s calculator. As a baseline, one request unit (RU) corresponds to a simple GET of a 1 KB item by its self-link or its identifier.
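For a first rough estimate before opening the calculator, the arithmetic can be sketched as below. Every number here is a made-up example, not a measurement from the project: the per-write and per-read RU charges in particular are assumptions that must be replaced with values measured for your own documents. Rounding up to a multiple of 100 RU/s is also an assumption about the provisioning step.

```javascript
// Rough throughput estimate; all inputs are hypothetical example values.
function estimateRequestUnitsPerSecond(load) {
  // Each sensor sends one notification per interval.
  const writesPerSecond = load.sensors / load.intervalSeconds;
  const raw =
    writesPerSecond * load.ruPerWrite +
    load.readsPerSecond * load.ruPerRead;
  // Round up to a multiple of 100 RU/s (assumed provisioning step).
  return Math.ceil(raw / 100) * 100;
}

const estimate = estimateRequestUnitsPerSecond({
  sensors: 5000,        // number of sensors (example)
  intervalSeconds: 10,  // the "N seconds" between notifications (example)
  ruPerWrite: 6,        // assumed charge for writing one notification
  readsPerSecond: 50,   // assumed operator UI read rate
  ruPerRead: 1,         // baseline: GET of a 1 KB item by identifier
});

console.log(estimate + " RU/s");
```

With these example inputs, 500 writes per second at 6 RU plus 50 reads per second at 1 RU give 3050 RU/s, which rounds up to 3100 RU/s.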

Conclusions


Azure Cosmos DB is easy to use and can be configured simply and flexibly through the Azure portal. Its many data access APIs make for a quick transition to Cosmos DB. There is no need to involve a database administrator to maintain it. Financially backed SLAs and global horizontal scaling make this database very attractive on the market. It suits enterprise and global applications that place high demands on resiliency and throughput. We at WaveAccess continue to use Cosmos DB in our projects.

About the author


The WaveAccess team creates technically sophisticated, high-load, fault-tolerant software for companies all over the world. Alexander Azarov, senior vice president of software development at WaveAccess, comments:

A problem that looks complex at first glance can often be solved with relatively simple methods. It is important not only to learn new tools but also to perfect your knowledge of familiar technologies.


Source: https://habr.com/ru/post/413757/