Adaptive Waveform for your audio service



When I was building the audio archive for a radio program's website, I needed an audio player in addition to the admin area. Each broadcast ran 40 minutes plus two musical breaks. A waveform is especially convenient for such long formats, so, like many music services, I decided to use one in the player design.

With a site redesign planned and mobile applications possibly to follow, a raster waveform simply got in the way. It is not adaptive, and a waveform stored as a raster image is extremely resource intensive to redo for every new design.

The well-known SoundCloud solved this on small screens by moving the entire waveform relative to a static center. I did not want that.

The radio programs were uploaded through the admin area, where I immediately made more compressed copies of the audio files with ffmpeg. It would have been foolish not to use its capabilities to generate the waveform as well.

The plan:

1. Generate the waveform in the smallest possible storage size
2. Convert it to a vector (JSON)
3. Draw the player from this array
4. Make it adaptive: uniformly shrink the array and return to step 3

Waveform generation



As far as I remember, when I implemented this approach the folks at the BBC had not yet added JSON output to their utility. Even today I would recommend rebuilding their utility to strip the useless negative values and the extra information about bits, channels and other clutter.
For now, let's continue:

If we take my player design (it is reduced in width here), we see that each bar is 2 pixels wide (plus a 1-pixel separator). This means a 600px-wide source image gives us 1200px of bars, or 1800px once the separators are counted.
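The arithmetic can be checked with a small sketch. The function name and its parameters are hypothetical and not part of the player; the 2px bar and 1px separator come from the design described above, and counting the separators is my reading of how the bars are laid out.

```javascript
// Hypothetical helper: pixel width needed to draw `samples` bars.
// barPx and gapPx default to the 2px bar and 1px separator of the design.
function waveformWidth(samples, barPx = 2, gapPx = 1) {
  return {
    barsOnly: samples * barPx,                 // width taken by the bars alone
    withSeparators: samples * (barPx + gapPx), // each bar followed by a separator
  };
}

console.log(waveformWidth(600)); // { barsOnly: 1200, withSeparators: 1800 }
```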



I think it is extremely unlikely that a larger representation of the audio file will ever be needed. If you plan to stretch the design across the full width of a 4K monitor, you should think about it, but I settled on a size of 600x60px.

And now closer to the code:

shell_exec("ffmpeg -y -i '$name.mp3' -filter_complex 'aformat=channel_layouts=mono,compand,showwavespic=s=600x120,crop=in_w:in_h/2:0:0' -c:v png -pix_fmt monob -frames:v 1 '$png_path.png' > /dev/null 2>/dev/null &"); 

-filter_complex - chains the filters together

aformat=channel_layouts=mono - audio format settings; the channel layout is set to mono

compand - a compressor and expander. In this mode quiet and loud sounds are evened out in volume, which gives a waveform without peaks or clipping on both quiet and loud recordings. The waveform is always stretched to the maximum.

showwavespic=s=600x120 - renders the waveform as a picture; s sets the image size.

crop=in_w:in_h/2:0:0 - crops the resulting image. The waveform is normally mirrored around the x axis, so we cut it in half and keep only the top 600x60 part, the tip of the iceberg.

-c:v png -pix_fmt monob -frames:v 1 - the output image format, a black-and-white palette, and only the first frame (we do not need animation). PNG gives an excellent quality-to-size ratio here (lossless in our case).

> /dev/null 2>/dev/null & - sends the output and diagnostics into the void. The '&' lets PHP continue without waiting for the console command to finish.

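For comparison, here is a minimal sketch of the same ffmpeg call assembled from Node.js. The function names, the file names and the use of Node are my assumptions for illustration; only the ffmpeg flags and filters come from the command above. Passing the arguments as an array avoids shell quoting problems with file names.

```javascript
const { spawn } = require("child_process");

// Build the ffmpeg argument list for a mono, companded, half-cropped waveform PNG.
// width/height are the final image size; showwavespic renders at double height
// because the crop filter keeps only the top half.
function buildWaveformArgs(input, output, width = 600, height = 60) {
  const filters = [
    "aformat=channel_layouts=mono",
    "compand",
    `showwavespic=s=${width}x${height * 2}`,
    "crop=in_w:in_h/2:0:0",
  ].join(",");
  return [
    "-y", "-i", input,
    "-filter_complex", filters,
    "-c:v", "png", "-pix_fmt", "monob", "-frames:v", "1",
    output,
  ];
}

// Hypothetical wrapper: starts ffmpeg without waiting for it, output discarded
// (the counterpart of "> /dev/null 2>/dev/null &" in the shell version).
function renderWaveform(input, output) {
  const child = spawn("ffmpeg", buildWaveformArgs(input, output), {
    stdio: "ignore",
    detached: true,
  });
  child.unref();
  return child;
}

console.log(buildWaveformArgs("show.mp3", "show.png").join(" "));
```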
At the output we get the following image:


The final file is 2.4 KB.

Funnily enough, a couple of years ago the waveform came out red instead of white. The developers apparently changed the default values.

Waveform to vector translation


The resulting image has amplitude on the Y axis and time on the X axis. It is easy to convert into a one-dimensional JSON array in which the values are the amplitudes and time is simply their index.

I decided to do the conversion on the fly, without caching the result, because it is very fast.
For each column, count the pixels along Y from the top down to the first waveform pixel, then move on to the next pixel along X.

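$a = imagecreatefrompng("test.png");
$h = 60;   // image height in pixels
$w = 600;  // image width in pixels
$arr = [];

// walk the columns from left to right
for ($i = 0; $i < $w; $i++) {
    // default for a column with no waveform pixel: the bottom row
    $arr[$i] = $h - 1;
    // scan the column from the top down to the first waveform pixel
    for ($c = 0; $c < $h; $c++) {
        // echo imagecolorat($a, $i, $c); // uncomment to check the color value
        if (imagecolorat($a, $i, $c) == 255) { // 255 marks a waveform pixel
            $arr[$i] = $c;
            break;
        }
    }
}

echo json_encode($arr);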

The resulting array consists of 600 values.

[46,28,34,35,34,35,26,33,39,29,29,30,30,30,33,33,28...]
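The column scan is easy to try on a tiny grid. Below is a minimal JavaScript sketch of the same idea; the function name and the 0/1 grid representation are assumptions for illustration, while the real conversion runs in PHP on the PNG.

```javascript
// pixels: array of rows, each row an array of 0 (background) or 1 (waveform).
// Returns, for each column, the y of the first lit pixel from the top,
// or height - 1 when the column is empty (same fallback as the PHP loop).
function columnTops(pixels) {
  const height = pixels.length;
  const width = height ? pixels[0].length : 0;
  const tops = [];
  for (let x = 0; x < width; x++) {
    let top = height - 1;
    for (let y = 0; y < height; y++) {
      if (pixels[y][x] === 1) {
        top = y;
        break;
      }
    }
    tops.push(top);
  }
  return tops;
}

const image = [
  [0, 0, 0, 0],
  [0, 1, 0, 0],
  [1, 1, 0, 0],
  [1, 1, 1, 0],
];
console.log(JSON.stringify(columnTops(image))); // [2,1,3,3]
```

Note that a smaller value means a taller bar, because the value is a distance from the top edge.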

Player drawing by JSON


For a convenient progress bar I took progressor.js by Elliot Bentley. He wrote it for an audio transcription service.

github.com/ejb/progressor.js 2.76 KB

Take another look at our player.



The progress bar consists of two layers: a background layer with gray bars and a foreground layer with green ones.

Both images are drawn by the getGraph function below.

Its job is to draw bars of the desired thickness and color, with separators between them.

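// width, height and json (the array of values) are defined elsewhere on the page
var c = document.createElement("canvas");
c.width = width;
c.height = height;
var ctx = c.getContext("2d");

// fillStyle1: bar color, fillStyle2: separator color.
// If fillStyle3 is given, the bars get a gradient from fillStyle1 to fillStyle2
// and fillStyle3 becomes the separator color.
function getGraph(fillStyle1, fillStyle2, fillStyle3) {
    if (fillStyle3) {
        var grd = ctx.createLinearGradient(0, 120, 0, 0);
        grd.addColorStop(0.5, fillStyle1);
        grd.addColorStop(1, fillStyle2);
        fillStyle1 = grd;
        fillStyle2 = fillStyle3;
    }
    json.forEach(function (item, i) {
        // 2px bar, drawn upward from the bottom edge (negative height)
        ctx.fillStyle = fillStyle1;
        ctx.fillRect(i * 3, height, 2, item - height);
        // 1px separator, as tall as the shorter of the two neighboring bars
        ctx.fillStyle = fillStyle2;
        var next = json[i + 1];
        var h2 = item <= next ? next : item;
        ctx.fillRect(i * 3 + 2, height, 1, h2 - height);
    });
    return c.toDataURL();
}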
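The geometry behind those fillRect calls can be isolated in a pure function, which is handy for checking sizes without a canvas. This is a sketch with a hypothetical helper name; it assumes, as in getGraph, that each value is the distance from the top edge to the waveform.

```javascript
// Hypothetical helper mirroring the bar geometry of the drawing code:
// values[i] is the distance from the top to the waveform, so the bar
// height is (height - values[i]) and the bar sits on the bottom edge.
function barRects(values, height, barPx = 2, gapPx = 1) {
  return values.map((top, i) => ({
    x: i * (barPx + gapPx), // bars are spaced by bar width plus separator
    y: top,                 // top edge of the bar
    w: barPx,
    h: height - top,        // same area as fillRect(x, height, w, top - height)
  }));
}

console.log(barRects([46, 28], 60));
```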

Here is a working example without adaptability.

4. Implementation of adaptability


Now all we need is to shrink the JSON array on the client to the desired size, and that gives us adaptability.

Plan A


The very first method that comes to mind is to thin out the array in a loop, keeping only every second, third or fourth value... but wait: that way the array cannot be reduced by less than a factor of two, so pixel precision is out of reach.

Shrinking a waveform by deleting array values is a dead end. When you do this, you see how much the wave loses its character, because you throw away the extremes instead of averaging the heights of neighboring values.
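A small sketch makes the difference visible. Both function names are hypothetical, and the sample values are made up for illustration; remember that a smaller value means a taller bar.

```javascript
// Keep every n-th value: cheap, but a peak that falls between kept indices vanishes.
function decimate(values, step) {
  return values.filter((_, i) => i % step === 0);
}

// Average each run of `step` neighbors instead, so every value contributes.
function averageBuckets(values, step) {
  const out = [];
  for (let i = 0; i < values.length; i += step) {
    const bucket = values.slice(i, i + step);
    out.push(bucket.reduce((sum, v) => sum + v, 0) / bucket.length);
  }
  return out;
}

// The 10 at index 1 is a loud peak among quiet 50s.
const samples = [50, 10, 50, 50, 50, 50];
console.log(decimate(samples, 2));       // [ 50, 50, 50 ]
console.log(averageBuckets(samples, 2)); // [ 30, 50, 50 ]
```

Decimation loses the peak entirely, while averaging at least keeps a trace of it. Neither allows an arbitrary target length, which is what the resampling approach addresses.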

What we need is a resampling algorithm. There is a JavaScript implementation of one:

largestTriangleThreeBuckets

It works well, but it expects an input array whose elements hold XY coordinates. We have a one-dimensional array, so I had to think a bit and rework the function. It works like this:
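A minimal sketch of such a rework, assuming the array index serves as the x coordinate and the value as y. This is an illustration of the Largest-Triangle-Three-Buckets idea applied to a plain array, not the exact function used in the demo; the function name and the lower limit on the threshold are my own choices.

```javascript
// Largest-Triangle-Three-Buckets for a plain array: x is the index, y the value.
// Returns `threshold` values, always keeping the first and the last one.
function downsample(values, threshold) {
  const n = values.length;
  if (threshold >= n) return values.slice();
  if (threshold < 3) throw new RangeError("threshold must be at least 3");

  const every = (n - 2) / (threshold - 2); // bucket size, first/last point excluded
  const sampled = [values[0]];
  let a = 0; // index of the previously selected point

  for (let i = 0; i < threshold - 2; i++) {
    // Average point of the next bucket: the third corner of the triangle.
    const avgStart = Math.floor((i + 1) * every) + 1;
    const avgEnd = Math.min(Math.floor((i + 2) * every) + 1, n);
    let avgX = 0;
    let avgY = 0;
    for (let j = avgStart; j < avgEnd; j++) {
      avgX += j;
      avgY += values[j];
    }
    avgX /= avgEnd - avgStart;
    avgY /= avgEnd - avgStart;

    // In the current bucket, pick the point that forms the largest triangle
    // with the previously selected point and the next bucket's average.
    const from = Math.floor(i * every) + 1;
    const to = Math.floor((i + 1) * every) + 1;
    let maxArea = -1;
    let best = from;
    for (let j = from; j < to; j++) {
      const area = Math.abs(
        (a - avgX) * (values[j] - values[a]) - (a - j) * (avgY - values[a])
      ) / 2;
      if (area > maxArea) {
        maxArea = area;
        best = j;
      }
    }
    sampled.push(values[best]);
    a = best;
  }

  sampled.push(values[n - 1]);
  return sampled;
}

// The loud peak (10) survives the reduction from 10 values to 4.
console.log(downsample([50, 50, 50, 10, 50, 50, 50, 50, 50, 50], 4)); // [ 50, 10, 50, 50 ]
```

Unlike dropping every n-th value, the target length can be any number from 3 up to the original length, which is what pixel-precise resizing needs.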



And here you can play with the adaptive version shown in the lead image.

Switch the view mode so that the frame with the HTML is on the right. Then you can change the width of that pane.

Plan B - PHP


However, I would still rather not burden the client. Say I want 1000 to 5000 points, stretched across the full width of the screen. With that many points, how will this behave on mobile? On the one hand, there should be no problem at all: judging by the algorithm's demos it is not that expensive and chews through 5000 points easily. On the other hand, the server should hand out exactly as much as is asked for. It is a design question.

The elementary answer: if you have Node.js, you can move this code to the server. And if you have PHP, you can find a PHP implementation of the algorithm, but... why bother, I thought.

Where else are there resampling algorithms? In the same native GD library that we used to generate the JSON. We simply pass the required width in pixels from the client and resize our waveform image before converting it to JSON.

So I will extend the code written earlier.

$h = 60;
$width_new = 600; // required width in pixels; in real use this comes from the client
$a = imagecreatefrompng("$id.png");
$width_old = imagesx($a);

// resize the waveform to the requested width and reduce it back to two colors
$aa = imagecreatetruecolor($width_new, $h);
imagecopyresized($aa, $a, 0, 0, 0, 0, $width_new, $h, $width_old, $h);
imagetruecolortopalette($aa, false, 2);

$arr = [];
// walk the columns from left to right
for ($i = 0; $i < $width_new; $i++) {
    $arr[$i] = $h - 1; // default: no waveform pixel in this column
    // scan the column from the top down to the first waveform pixel
    for ($c = 0; $c < $h; $c++) {
        // echo imagecolorat($aa, $i, $c); // uncomment to find which color index to match
        if (imagecolorat($aa, $i, $c) == 1) {
            $arr[$i] = $c;
            break;
        }
    }
}

echo json_encode($arr);
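On the client side, the width to request could be derived from the container like this. This is only a sketch: the endpoint name waveform.php, the parameter names id and w, and the cap of 600 values (the width of the source image) are assumptions for illustration.

```javascript
// Hypothetical helper: how many bars fit the container, capped at the
// 600 columns of the source image so the server is never asked to upscale.
function barsForWidth(containerPx, barPx = 2, gapPx = 1, maxBars = 600) {
  const bars = Math.floor(containerPx / (barPx + gapPx));
  return Math.min(Math.max(bars, 1), maxBars);
}

// Hypothetical endpoint and parameter names ("waveform.php", "id", "w").
function waveformUrl(id, containerPx) {
  const params = new URLSearchParams({
    id: String(id),
    w: String(barsForWidth(containerPx)),
  });
  return `waveform.php?${params}`;
}

console.log(waveformUrl(42, 320)); // waveform.php?id=42&w=106
```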

After that, you no longer need to worry about changing the design or the width of the player, or about expanding into a mobile application. Everything turns out quite flexible and very fast.

The code is here.
Easter eggs.

It must have been a sunny day. The window of our room looked out onto two old nine-story brick buildings that I remember from my teens. I know that behind them lies a tram turning loop and, a little further on, an old hospital right behind the school. The building with the office where I am now trying to dig through my memories is a former unfinished hospital, now pure office space. I remember special forces training here when I was a child; they were shown on TV, cheerfully storming the concrete structure and firing everything they had. And now, it turns out, I cheerfully get static shocks from the shiny railings as I go down the stairs, admiring how this building is distorted in the reflection of the nearest residential complex. (The wall of the big old cemetery runs right along the tram line. On it, in green paint, are the inscriptions “While Boris is in power” and “Labor Russia.” The devil knows who made them and when, but a couple of decades later they are still legible while remaining completely unnoticed. I have not seen an older monument to the legacy of the 90s anywhere in the city.)

Our top floor is empty, the way an opened pack of buckwheat is empty at the top: down below there is a lot of everything and it is tightly packed (some special-purpose crutches, a 2GIS office, then ordinary SEO people), while up top there are almost no grains. You would think something ought to sprout up through the floors, but in these 5 years the only visitor from the transcendent has been the window washer, and from the immanent a wild-eyed accountant knocking on every door on the floor in search of someone who can explain how to sign documents through the internet bank's crazy plugin after yet another browser update.

Source: https://habr.com/ru/post/412629/

