Adblock for radio

The author of this article, Polish programmer Tomek Rękawek, develops the Jackrabbit Oak project at the Apache Software Foundation on behalf of Adobe. The article was published on the author's personal blog on February 24, 2016.

Polish Radio Three (known as Trójka) is famous for its good music and intelligent presenters. On the other hand, it suffers from loud and annoying commercial blocks, which usually advertise some kind of electronics or medicine. I listen to Trójka almost constantly at work and at home, so I wondered: how can I remove the ads? It seems I have managed to find a solution.

Digital signal processing

My goal is to create an application that mutes ads. Each commercial block starts and ends with a jingle, so the program must recognize these particular sounds and mute the audio between them.

I know that this area of mathematics and computer science is called digital signal processing, but DSP always seemed like magic to me. Well, this was a great opportunity to learn something new. I spent a day or two trying to figure out which mechanism to use for analyzing the audio stream, and in the end I found what I needed: cross-correlation.

Octave

Most references point to the MATLAB implementation. MATLAB simplifies complex mathematical operations, including DSP, but it is an expensive application. Fortunately, there is a free alternative called Octave. Running a cross-correlation on two audio files in Octave turns out to be easy. You just need to run the following commands:

 pkg load signal
 jingle = wavread('jingle.wav')(:,1);
 audio = wavread('audio.wav')(:,1);
 [R, lag] = xcorr(jingle, audio);
 plot(R);

The result is this plot:

A peak is clearly visible; it marks the position of jingle.wav in audio.wav. What surprised me was the simplicity of the method: xcorr() does all the work, and the rest of the code only reads the files and displays the result.
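To make the idea behind the peak concrete, here is a minimal sketch of cross-correlation computed directly in the time domain. It is not the author's code and not what xcorr() does internally (Octave uses the FFT); the class and method names are hypothetical. It simply slides the jingle over the audio, sums the products of overlapping samples, and reports the offset with the highest score.

```java
/** Naive time-domain cross-correlation (illustrative sketch, hypothetical names). */
class NaiveXcorr {

    /**
     * Returns the 0-based offset in audio where the jingle correlates best,
     * or -1 if the audio is shorter than the jingle.
     */
    static int bestOffset(double[] audio, double[] jingle) {
        int best = -1;
        double bestScore = Double.NEGATIVE_INFINITY;
        // Try every position at which the jingle fits entirely inside the audio.
        for (int lag = 0; lag + jingle.length <= audio.length; lag++) {
            double score = 0;
            for (int i = 0; i < jingle.length; i++) {
                score += audio[lag + i] * jingle[i];
            }
            if (score > bestScore) {
                bestScore = score;
                best = lag;
            }
        }
        return best;
    }
}
```

This version costs O(N * M) multiplications, which is why FFT-based implementations are preferred for long signals. The score is also not normalized, so a very loud passage could in principle outscore a quiet but exact match.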

I wanted to implement the same algorithm in Java, which would give me a tool that:

reads an audio stream from a standard input (for example, from ffmpeg),
analyzes it in search of jingles,
prints the same stream to stdout and/or mutes it.

Using stdin and stdout makes it possible to connect the new analyzer to other applications that handle the audio stream and play back the result.
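The stdin-to-stdout design described above could be sketched roughly as follows. This is an assumption about the shape of such a tool, not the author's implementation: the MutingPipe class, the 4096-byte chunk size and the BooleanSupplier flag are all hypothetical. The sketch copies raw PCM bytes through unchanged, and replaces them with zeros (silence for signed PCM) while the flag reports that a commercial block is playing.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.function.BooleanSupplier;

/** Pass-through filter that silences the stream while muted (hypothetical sketch). */
class MutingPipe {

    static void copy(InputStream in, OutputStream out, BooleanSupplier muted) throws IOException {
        byte[] chunk = new byte[4096]; // arbitrary chunk size chosen for this sketch
        int n;
        while ((n = in.read(chunk)) != -1) {
            if (muted.getAsBoolean()) {
                // Signed PCM silence is all zero bytes, so the stream keeps its timing.
                Arrays.fill(chunk, 0, n, (byte) 0);
            }
            out.write(chunk, 0, n);
        }
        out.flush();
    }
}
```

A call such as MutingPipe.copy(System.in, System.out, () -> false) would behave like a plain pipe; in a real analyzer the flag would be toggled by the jingle detector.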

Reading sound files

First of all, the Java program must read the jingle (saved as a .wav file) into an array. The file contains some additional information, such as headers and metadata, but we need only the sound. A suitable format is raw PCM, which is just a list of numbers representing the samples. ffmpeg can convert WAV to PCM:

 ffmpeg -i input.wav -f s16le -acodec pcm_s16le output.raw

Here each sample is saved as a 16-bit signed number in little-endian byte order (least significant byte first). In Java, this type is called short, and to convert the input bytes to short values you can use the ByteBuffer class:

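 import java.nio.ByteBuffer;
 import java.nio.ByteOrder;

 ByteBuffer buf = ByteBuffer.allocate(4); // one stereo frame: 2 channels x 2 bytes
 buf.order(ByteOrder.LITTLE_ENDIAN);
 buf.put(bytes); // bytes holds the 4 bytes of one frame
 buf.flip(); // switch the buffer from writing to reading
 short leftChannel = buf.getShort(); // stereo stream
 short rightChannel = buf.getShort();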
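For a whole file or chunk rather than a single frame, the same conversion can be done in one step. The following is a minimal sketch with hypothetical names (PcmDecoder, toSamples); it assumes the input is raw s16le data as produced by the ffmpeg command above and returns the samples in stream order, so for stereo input the left and right channels are interleaved.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Converts raw little-endian 16-bit PCM bytes to samples (hypothetical sketch). */
class PcmDecoder {

    static short[] toSamples(byte[] raw) {
        ByteBuffer buf = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
        // A trailing odd byte, if any, is ignored because it is not a full sample.
        short[] samples = new short[raw.length / 2];
        buf.asShortBuffer().get(samples);
        return samples;
    }
}
```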

Reverse Engineering xcorr

To implement the xcorr() function in Java, I studied the Octave source code. Without changing the final result, I was able to replace the xcorr() call with the following lines, which then need to be rewritten in Java:

 N = length(audio);
 M = 2 ^ nextpow2(2 * N - 1);
 pre = fft(postpad(prepad(jingle(:), length(jingle) + N - 1), M));
 post = fft(postpad(audio(:), M));
 cor = ifft(pre .* conj(post));
 R = real(cor(1:2 * N));

It looks scary, but most of the functions are trivial array operations. The cross-correlation itself is based on applying the fast Fourier transform to the sound samples.
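As an illustration of how trivial those array helpers are, here is a rough Java equivalent of nextpow2, postpad and prepad for plain vectors. This is a sketch based on the documented Octave behaviour as I understand it (zero padding, with truncation when the vector is longer than the requested length); the PadHelpers class is hypothetical and not taken from the author's sources.

```java
import java.util.Arrays;

/** Vector-only stand-ins for Octave's nextpow2, postpad and prepad (hypothetical sketch). */
class PadHelpers {

    /** Smallest n such that 2^n >= x (returns 0 for x <= 1). */
    static int nextPow2(int x) {
        int n = 0;
        while ((1L << n) < x) {
            n++;
        }
        return n;
    }

    /** Appends zeros (or drops trailing elements) so the result has length l. */
    static double[] postpad(double[] x, int l) {
        return Arrays.copyOf(x, l);
    }

    /** Prepends zeros (or drops leading elements) so the result has length l. */
    static double[] prepad(double[] x, int l) {
        double[] out = new double[l];
        int n = Math.min(x.length, l);
        // Keep the last n elements of x, aligned to the end of the output.
        System.arraycopy(x, x.length - n, out, l - n, n);
        return out;
    }
}
```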

Fast Fourier Transform

As someone with no DSP experience, I simply treat the FFT as a function that takes an array of sound samples and returns an array of complex numbers representing frequencies. This minimalist approach worked well: I ran the FFT implementation from the JTransforms package and got the same results as in Octave. I think this is partly a cargo cult, but damn, it works!
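For readers who want to see the whole pipeline without an external library, here is a self-contained sketch of FFT-based cross-correlation. It is not the author's code: the article uses JTransforms, while this example substitutes a small hand-written radix-2 FFT, and the FftXcorr class and its methods are hypothetical. It is also not a line-by-line port of the Octave snippet; it computes ifft(fft(audio) .* conj(fft(jingle))) on zero-padded inputs, so that index k of the result is the correlation of the jingle with the audio starting at sample k. It assumes the jingle is non-empty and no longer than the audio.

```java
import java.util.Arrays;

/** FFT-based cross-correlation with a minimal radix-2 FFT (hypothetical sketch). */
class FftXcorr {

    /** In-place iterative FFT; the array length must be a power of two. */
    static void fft(double[] re, double[] im, boolean inverse) {
        int n = re.length;
        // Reorder the input by bit-reversed index.
        for (int i = 1, j = 0; i < n; i++) {
            int bit = n >> 1;
            for (; (j & bit) != 0; bit >>= 1) {
                j ^= bit;
            }
            j ^= bit;
            if (i < j) {
                double t = re[i]; re[i] = re[j]; re[j] = t;
                t = im[i]; im[i] = im[j]; im[j] = t;
            }
        }
        // Combine sub-transforms of growing length (butterflies).
        for (int len = 2; len <= n; len <<= 1) {
            double ang = 2 * Math.PI / len * (inverse ? 1 : -1);
            for (int start = 0; start < n; start += len) {
                for (int k = 0; k < len / 2; k++) {
                    double wr = Math.cos(ang * k), wi = Math.sin(ang * k);
                    int a = start + k, b = a + len / 2;
                    double xr = re[b] * wr - im[b] * wi;
                    double xi = re[b] * wi + im[b] * wr;
                    re[b] = re[a] - xr; im[b] = im[a] - xi;
                    re[a] += xr; im[a] += xi;
                }
            }
        }
        if (inverse) {
            for (int i = 0; i < n; i++) { re[i] /= n; im[i] /= n; }
        }
    }

    /** 0-based offset in audio at which the jingle correlates best. */
    static int bestOffset(double[] audio, double[] jingle) {
        // Pad to a power of two large enough to avoid circular wrap-around.
        int m = 1;
        while (m < audio.length + jingle.length - 1) {
            m <<= 1;
        }
        double[] ar = Arrays.copyOf(audio, m), ai = new double[m];
        double[] jr = Arrays.copyOf(jingle, m), ji = new double[m];
        fft(ar, ai, false);
        fft(jr, ji, false);
        for (int f = 0; f < m; f++) {
            // Multiply the audio spectrum by the complex conjugate of the jingle spectrum.
            double r = ar[f] * jr[f] + ai[f] * ji[f];
            double i = ai[f] * jr[f] - ar[f] * ji[f];
            ar[f] = r; ai[f] = i;
        }
        fft(ar, ai, true);
        // Only consider offsets where the jingle fits entirely inside the audio.
        int best = 0;
        for (int k = 1; k + jingle.length <= audio.length; k++) {
            if (ar[k] > ar[best]) {
                best = k;
            }
        }
        return best;
    }
}
```

The padded length plays the same role as M in the Octave code: without it, the FFT would compute a circular correlation and the end of the signal would wrap around onto the beginning.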

Running xcorr on a stream

The algorithm above assumes that audio is an array in which we are looking for a jingle. This is not quite suitable for a radio broadcast, where we have a continuous stream of sound. To run the analysis, I created a circular buffer slightly longer than the jingle that needs to be recognized. The incoming stream fills the buffer, and as soon as it is full, the cross-correlation test is run. If nothing is found, the oldest part of the buffer is discarded and we wait for it to fill up again.

I experimented a bit with the length of the buffer and got the best results with the buffer size 1.5 times the size of the jingle.
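The buffering scheme can be sketched as follows. This is an illustrative simplification, not the author's implementation: it uses a plain array that shifts its contents instead of a true circular buffer, and the SlidingBuffer class is hypothetical. The article does not say how much of the buffer is discarded after a failed test; discarding at most (buffer length minus jingle length) samples, which is half a jingle for a 1.5x buffer, is an assumption that guarantees every complete jingle falls entirely inside some window.

```java
import java.util.Arrays;

/** Fixed-size analysis window sized at 1.5x the jingle (hypothetical sketch). */
class SlidingBuffer {
    private final double[] data;
    private int filled = 0;

    SlidingBuffer(int jingleLength) {
        // 1.5 times the jingle length, the ratio reported to work best in the article.
        data = new double[jingleLength + jingleLength / 2];
    }

    /** Appends one sample; returns true once the buffer is full and ready for analysis. */
    boolean add(double sample) {
        if (filled == data.length) {
            throw new IllegalStateException("buffer full, discard the oldest samples first");
        }
        data[filled++] = sample;
        return filled == data.length;
    }

    /** Copy of the samples currently held, oldest first. */
    double[] contents() {
        return Arrays.copyOf(data, filled);
    }

    /** Drops the oldest count samples and shifts the rest to the front. */
    void discardOldest(int count) {
        if (count < 0 || count > filled) {
            throw new IllegalArgumentException("count out of range");
        }
        System.arraycopy(data, count, data, 0, filled - count);
        filled -= count;
    }
}
```

In use, each incoming sample would be passed to add(); when it returns true, contents() is handed to the cross-correlation test, and if no jingle is found, discardOldest() makes room for new samples.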

Putting it all together

Getting a stream in PCM format is easy. This can be done using the aforementioned ffmpeg. The command below pipes the stream to the standard input of the Java program, which then prints Got jingle 0 or Got jingle 1 when the corresponding pattern is found in the stream.

 ffmpeg -loglevel -8 \
   -i http://stream3.polskieradio.pl:8904/\;stream \
   -f s16le -acodec pcm_s16le - \
   | java -jar target/analyzer-1.0.0-SNAPSHOT-jar-with-dependencies.jar \
     2 \
     src/test/resources/commercial-start-44.1k.raw 500 \
     src/test/resources/commercial-end-44.1k.raw 700

Standalone version

I also prepared a simple standalone version of the analyzer, which connects to the Trójka stream by itself (without an external ffmpeg) and plays the result using javax.sound. Everything fits into one JAR file and includes a basic user interface with Start and Stop buttons. It can be downloaded here. If you don't like running other people's JARs on your machine (which is absolutely correct), all the sources are on GitHub.

It seems that everything works as it should :)

Further work

The ultimate goal is to mute advertising at the level of a hardware amplifier that receives a "real" FM signal rather than an Internet stream. This is covered in the next article.

Update (June 2018)

Discussion on Hacker News
Discussion on Wykop
Discussion on Reddit

Source: https://habr.com/ru/post/415469/
