The third version of reCAPTCHA works in the background, unnoticed by users

In May 2018, the third version of the reCAPTCHA technology, reCAPTCHA v3 (beta), was presented at the Google I/O 2018 conference. reCAPTCHA is the most popular CAPTCHA system, designed to block bots, that is, automated actions on various services.

The system has been criticized for exploiting free human labor (in the case of the first version, which Google used to digitize books) and for making life difficult for people with visual impairments and conditions such as dyslexia. reCAPTCHA is also criticized for being too complex: sometimes it is difficult or impossible for a person to answer correctly, and the test becomes simply absurd. The illustration on the left shows some examples from the first version of reCAPTCHA. The situation did not improve much with the release of the second version (where you need to select the pictures containing a specified object).

But the third version is a completely different matter. It definitely does not inconvenience anyone, because it runs invisibly to users and relies on behavioral analysis.

First, a little history. The first version of reCAPTCHA appeared back in 2007 and served a good purpose: while blocking spam and bots, it also helped to digitize books. By 2011 it had helped refine the OCR results for the digitized archives of The New York Times (more than 13 million articles since 1851) and for Google Books.

In 2012, fragments of house photos from the Google Street View service were added to the system. Around 2013, Google began using behavioral analysis of user actions in the browser (advanced risk analysis), and in 2014 it introduced the second version of the system, where you had to select several "correct" pictures from a set of nine images, but could also pass the test in one click. If the actions looked human, the user passed without solving any puzzle at all, simply by pressing the "I am not a robot" button (the so-called NoCAPTCHA). If the actions looked like those of a bot, the user was given a complicated test that required recognizing objects in images.


NoCAPTCHA

The problem was that, besides the behavioral analysis, the cookies on the computer were also checked, so NoCAPTCHA was practically unavailable in private browsing mode or to users who clear their cookies after each session.

Third version



Presentation of the third version of the system at the Google I/O 2018 conference

In the third version of reCAPTCHA, the behavioral analysis, that is, the aforementioned advanced risk analysis, has been improved (or Google's surveillance of users has, if you prefer to see it in that light).

Now the system works in the background, unnoticed by users. It is enough to load the reCAPTCHA library together with the page and call grecaptcha.execute at a chosen moment or immediately when the page loads. That is all. The user notices nothing, while the JavaScript API returns a token that your server exchanges with the reCAPTCHA server for a score of this user, based on their interaction with the site and other parameters.

  <script src="https://www.google.com/recaptcha/api.js?render=reCAPTCHA_site_key"></script>
  <script>
    grecaptcha.ready(function() {
      grecaptcha.execute('reCAPTCHA_site_key', {action: 'homepage'}).then(function(token) {
        ...
      });
    });
  </script>
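The same two calls can also be made on demand, for example when a form is submitted rather than at page load. The following is only a sketch: the helper name requestToken and the idea of passing the grecaptcha object in as an argument are assumptions made for illustration, not part of the reCAPTCHA API.

```javascript
// Requests a fresh token for the given action once the library is ready.
// `api` is the global `grecaptcha` object in a browser; passing it in
// explicitly is an assumption of this sketch that keeps the helper testable.
function requestToken(api, siteKey, action) {
  return new Promise((resolve, reject) => {
    api.ready(() => {
      // execute() resolves with a token string for this action.
      api.execute(siteKey, { action }).then(resolve, reject);
    });
  });
}
```

In a page this might be called as requestToken(grecaptcha, 'reCAPTCHA_site_key', 'login') inside a submit handler, with the resulting token sent to the server along with the form data.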

There are suggestions that, besides mouse cursor movement, the system has begun to track other parameters, such as mouse clicks, but this can only be guessed at. Google discloses nothing about the system's inner workings, so as not to help spammers and bot owners.

From the webmaster's point of view, perhaps the main difference in the third version is that, when queried via the API, the reCAPTCHA server returns not a binary value but a score from 0.0 (probably a bot) to 1.0 (probably a human) for this particular request. The response is sent in JSON format:

  {
    "success": true|false,      // whether this request was a valid reCAPTCHA token for your site
    "score": number,            // the score for this request (0.0 - 1.0)
    "action": string,           // the action name for this request (important to verify)
    "challenge_ts": timestamp,  // timestamp of the challenge load (ISO format yyyy-MM-dd'T'HH:mm:ssZZ)
    "hostname": string,         // the hostname of the site where the reCAPTCHA was solved
    "error-codes": [...]        // optional
  }
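As a rough server-side sketch, this response could be requested as follows. It assumes Google's documented siteverify endpoint and a runtime with a global fetch (for example Node.js 18 or later); the function names and the way the secret key is passed in are illustrative and not taken from the article.

```javascript
// Endpoint of the reCAPTCHA verification API (per Google's documentation).
const VERIFY_URL = 'https://www.google.com/recaptcha/api/siteverify';

// Builds the form-encoded POST request for a token received from the browser.
// `secretKey` is the server-side key of your site; `remoteIp` is optional.
function buildVerifyRequest(secretKey, token, remoteIp) {
  const body = new URLSearchParams({ secret: secretKey, response: token });
  if (remoteIp) body.set('remoteip', remoteIp);
  return { url: VERIFY_URL, options: { method: 'POST', body } };
}

// Sends the request and returns the parsed JSON response.
// Assumes a global fetch is available in the runtime.
async function verifyToken(secretKey, token, remoteIp) {
  const { url, options } = buildVerifyRequest(secretKey, token, remoteIp);
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`siteverify returned HTTP ${res.status}`);
  return res.json();
}
```

Keeping the request construction separate from the network call is a design choice of this sketch: the first function can be checked without contacting Google at all.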

As the server's response shows, reCAPTCHA v3 introduces a new concept of "actions". If you define different action names in different parts of the site, the system begins to adjust to the needs of each of them: the risk analysis becomes adaptive (adaptive risk analysis).
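Since the response comment marks the action name as important to verify, a server would typically compare it with the action it expected before trusting the score. A minimal sketch of such a check, where the function name and the shape of the expected object are hypothetical:

```javascript
// Returns a list of problems found in a parsed verification response.
// `expected` is a hypothetical shape: { action: 'login', hostname: 'example.com' }.
function findVerificationProblems(result, expected) {
  const problems = [];
  if (result.success !== true) {
    // "error-codes" is optional, so fall back to an empty list.
    const codes = result['error-codes'] || [];
    problems.push('invalid token: ' + codes.join(', '));
    return problems;
  }
  if (result.action !== expected.action) {
    problems.push('unexpected action: ' + result.action);
  }
  // The hostname check is applied only when an expected hostname is given.
  if (expected.hostname && result.hostname !== expected.hostname) {
    problems.push('unexpected hostname: ' + result.hostname);
  }
  return problems;
}
```

An empty list means the token was valid and was issued for the expected action and host, so the score can then be evaluated.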

In other words, the site owner chooses a cut-off score and decides what to do with users above or below it on different pages. The default cut-off is 0.5. For example, on the main page it is recommended to block only obvious scrapers (say, only a score of 0.0). On the login form, you can filter out everyone below 0.5 by offering them two-factor authentication or email verification in order to protect against brute-force attacks. The check can now be run several times on the same page, at the right moments and without the user noticing.
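The per-page cut-offs described here could be expressed as a small policy table. This is only an illustration: the action names, the policy fields, and the three outcomes ('block', 'challenge', 'allow') are assumptions of the sketch, with the numbers taken from the examples in the text.

```javascript
// Hypothetical per-action policies. A score at or below `blockAtOrBelow`
// is rejected (null disables blocking); a score below `challengeBelow`
// triggers an extra check such as two-factor authentication.
const DEFAULT_POLICY = { blockAtOrBelow: null, challengeBelow: 0.5 };
const POLICIES = new Map([
  ['homepage', { blockAtOrBelow: 0.0, challengeBelow: 0.0 }],
  ['login', { blockAtOrBelow: null, challengeBelow: 0.5 }],
]);

// Maps an action name and its score to one of 'block', 'challenge', 'allow'.
function decide(action, score) {
  const policy = POLICIES.get(action) || DEFAULT_POLICY;
  if (policy.blockAtOrBelow !== null && score <= policy.blockAtOrBelow) {
    return 'block';
  }
  if (score < policy.challengeBelow) return 'challenge';
  return 'allow';
}
```

With these values, decide('homepage', 0.0) blocks only the most obvious bots, while decide('login', 0.4) asks for an additional verification step instead of rejecting the user outright.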

To take part in beta testing of the third version of the system, you must register on this page.



Source: https://habr.com/ru/post/415075/

