MIT has developed a new type of interface that allows you to turn "thoughts" into a voice.


Arnav Kapur, one of the developers of the new interface, demonstrates the device

MIT engineers have created a system that transcribes silently spoken words and sentences into text. For the system to work, the wearer needs to articulate words and phrases clearly to themselves, without speaking aloud. When they do, the muscles of the face, throat and tongue that are responsible for speech are engaged. They do not work at full strength but are only slightly activated, which is quite enough for the new system to "read" them.

From the outside, it looks like this: the person simply remains silent, and the system “speaks”, or rather, types. The development consists of two parts: a gadget that is worn on the face and a specially “trained” neural network that analyzes the incoming signals and maps them to letters and words. In addition, the interface lets you control other devices, for example to switch TV channels or keep track of expenses, while otherwise carrying on with normal activity.

The gadget, which hooks over the ear, includes a bone-conduction earpiece, that is, an earpiece that transmits sound through the bone to the inner ear. The ear canal remains open, so the wearer hears everything that is happening around them.

The system is highly portable and can be worn both outdoors and at home. Some of its uses are unusual. For example, you can play chess, silently reciting the opponent's moves to yourself and getting advice from the computer in return.

The development can be used not only by people with physical impairments, but also by ordinary users in a variety of situations. The developers set out to create a system that enhances a person's abilities, complementing their intellect and, to some extent, their senses.

“We are no longer able to live without smartphones and other digital devices,” said Pattie Maes, one of the project participants. “But using these gadgets is disruptive: you have to interrupt what you are doing in order to work with them. For example, you are in a conversation, and suddenly you need to use the phone. You have to find it, pick it up, enter the password and open the application. That is why my students and I have long been experimenting with new types of systems and form factors that let people use the advantages of modern technologies and services without being distracted by gadgets.”

The results of the work were presented at the Association for Computing Machinery's ACM Intelligent User Interface conference.

In principle, the idea proposed by the scientists is not new. It appeared sometime in the 19th century, and with the advent of new technologies, serious work on implementing it began. In the 1960s, pronouncing words and phrases to oneself while reading came to be seen as an extraneous factor that hinders speed reading (which, in fact, it is). But this silent pronunciation also has its advantages: it can be used in the development of computer interfaces. The system described above is one example.


While building the system, the scientists first needed to understand which muscles of the face are used most actively during speech. After that, work began on a prototype device for converting “thoughts into text”. The main sensing element of the system was a set of 16 electrodes.

Readings were taken from the electrodes and checked against what the person was saying to themselves. Then, based on the resulting data set, the developers began to train the neural network. Initially, the device covered both sides of the face. It then turned out that the neural network converts signals into text without problems even if the electrodes are on only one side of the face, so the device was halved to reduce its size.

The neural network started small, with a vocabulary of just 20 words. Over time, the vocabulary was expanded and the neural network became "smarter". According to the scientists, it can be personalized for any individual, which increases the accuracy with which "thoughts" are recognized. The more training it gets, the better the system can work.
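To make the training idea more concrete, here is a minimal sketch in Python. It is not the MIT team's method: the article describes a neural network, while this toy swaps in a much simpler nearest-centroid classifier and runs on simulated data. The only figures taken from the article are the 16 electrodes and the 20-word starting vocabulary; the word names, noise level, sample counts and all function names are assumptions made up for illustration.

```python
import random

NUM_ELECTRODES = 16   # from the article: 16 electrodes
VOCAB_SIZE = 20       # from the article: the first vocabulary had 20 words


def make_prototypes(rng):
    """Invent one 'ideal' 16-channel pattern per word (purely simulated)."""
    return {
        f"word{i}": [rng.uniform(-1.0, 1.0) for _ in range(NUM_ELECTRODES)]
        for i in range(VOCAB_SIZE)
    }


def make_sample(prototype, noise, rng):
    """Simulate one reading: the word's pattern plus Gaussian noise per channel."""
    return [p + rng.gauss(0.0, noise) for p in prototype]


def train_centroids(samples_by_word):
    """'Training' here is just averaging each word's readings into a centroid."""
    centroids = {}
    for word, samples in samples_by_word.items():
        n = len(samples)
        centroids[word] = [
            sum(s[i] for s in samples) / n for i in range(NUM_ELECTRODES)
        ]
    return centroids


def classify(reading, centroids):
    """Return the word whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda w: sq_dist(reading, centroids[w]))


def accuracy(centroids, prototypes, noise, trials, rng):
    """Fraction of fresh simulated readings that are classified correctly."""
    words = list(prototypes)
    correct = 0
    for _ in range(trials):
        word = rng.choice(words)
        reading = make_sample(prototypes[word], noise, rng)
        if classify(reading, centroids) == word:
            correct += 1
    return correct / trials


def demo(seed=0, noise=0.3):
    """Compare a model trained on 1 reading per word with one trained on 30."""
    rng = random.Random(seed)
    prototypes = make_prototypes(rng)
    for per_word in (1, 30):
        data = {
            w: [make_sample(p, noise, rng) for _ in range(per_word)]
            for w, p in prototypes.items()
        }
        model = train_centroids(data)
        acc = accuracy(model, prototypes, noise, 500, rng)
        print(f"{per_word} training readings per word: accuracy {acc:.2f}")


if __name__ == "__main__":
    demo()
```

In this toy setup, "personalization" would simply mean building the centroids from one particular wearer's readings. A real system would have to work from raw electrode signals over time rather than one tidy vector per word, which is part of why the developers used a neural network.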

The developers did not set out to perfect the system; it is only a proof of concept. The technology can be used in many areas, including manufacturing. Imagine an industrial plant where the noise level makes it hard for employees to discuss work matters. Such a system could be used there. The situation is similar for firefighters or divers: they would not need to talk, because the system would voice their “thoughts”.

So far, there is no talk of commercializing the technology, but that is not ruled out either.


Source: https://habr.com/ru/post/411651/

