toString: Great and Awful

The toString function in JavaScript is probably the most "implicitly" discussed one, both among JS developers themselves and among outside observers. It is the cause of countless jokes and memes about suspicious arithmetic and about conversions that leave you staring at [object Object] in a stupor. It is second, perhaps, only to the surprises of working with float64.

Interesting cases that I had to observe, use, or overcome motivated me to write a proper debriefing. We will gallop through the language specification and use examples to analyze the non-obvious features of toString.

If you are looking for a practical, self-contained guide, then this, this, and that material will suit you better. If your curiosity still prevails over pragmatism, read on.

All you need to know

The toString function is a property of the Object prototype object; in other words, it is its method. It is used when converting an object to a string and, by convention, should return a primitive value. Other built-in prototype objects have their own implementations: Function, Array, String, Boolean, Number, Symbol, Date, RegExp, Error. If you implement your own prototype object (class), defining toString for it is good form.

JavaScript is a language with a weak type system, which means it lets us mix different types and performs many operations implicitly. In conversions, toString works in tandem with valueOf to reduce an object to the primitive the operation needs. For example, the addition operator turns into concatenation if at least one of the operands is a string. Some of the language's standard functions convert their argument to a string before doing their work: parseInt, decodeURI, JSON.parse, btoa, and so on.
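To make the toString/valueOf tandem concrete, here is a minimal sketch. The `price` object and the `results` names are hypothetical, invented for illustration; which hook is consulted first depends on the kind of primitive the operation asks for.

```javascript
// Hypothetical object that defines both conversion hooks.
const price = {
  valueOf() { return 42; },
  toString() { return '42 USD'; },
};

const results = {
  // Addition asks for a "default" primitive, so valueOf is tried first.
  sum: price + 1,
  // Concatenation with a string also goes through valueOf first.
  concat: 'total: ' + price,
  // Template literals ask for a string, so toString wins.
  template: `${price}`,
  // parseInt converts its argument to a string before parsing it.
  parsed: parseInt({ toString() { return '7px'; } }, 10),
};

console.log(results);
```

Note that concatenation and string interpolation can give different results for the same object, which is one reason to keep the two hooks consistent.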

Plenty has already been said, and joked, about implicit type conversion. We will look at the toString implementations of the language's key prototype objects.

Object.prototype.toString

If we turn to the relevant section of the specification, we find that the main task of the default toString is to obtain the so-called tag and concatenate it into the resulting string:

"[object " + tag + "]"

To do this:

The well-known symbol Symbol.toStringTag (or the internal [[Class]] property in older editions) is read: many of the built-in objects have it (Map, Math, JSON, and others).
If it is absent or not a string, a number of other internal pseudo-properties and methods that indicate the object's type are consulted: [[Call]] for Function, [[DateValue]] for Date, and so on.
And if nothing at all matches, the tag is "Object".

Those who are fond of reflection will immediately spot a way to obtain an object's type with one simple operation (not recommended by the specification, but possible):
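The three steps above can be observed directly. This is a small sketch; `tagOf` and the variable names are hypothetical helpers, not part of the language.

```javascript
// Hypothetical helper: call the default toString on any value.
const tagOf = (value) => Object.prototype.toString.call(value);

// Built-ins that carry Symbol.toStringTag.
const withTag = [new Map(), Math, JSON].map(tagOf);

// Built-ins recognized through internal checks ([[Call]], [[DateValue]], ...).
const byInternals = [() => {}, new Date(0), [], new Error('x')].map(tagOf);

// Nothing matched: the tag falls back to "Object".
const plain = tagOf({});

// A Symbol.toStringTag that is not a string is ignored.
const odd = new Date(0);
odd[Symbol.toStringTag] = 42;
const ignored = tagOf(odd);

console.log(withTag, byInternals, plain, ignored);
```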

 const getObjT = obj => Object.prototype.toString.call(obj).match(/\[object\s(\w+)]/)[1];

A peculiarity of the default toString is that it works with any value of this. If it is a primitive, it is converted to an object (null and undefined are checked separately). No TypeError is thrown:

 [Infinity, null, x => 1, new Date, function*(){}].map(getObjT);
 < ["Number", "Null", "Function", "Date", "GeneratorFunction"]

How can this come in handy? For example, when developing dynamic code analysis tools. With an ad hoc pool of the variables an application uses while it runs, you can collect useful, uniform statistics at run time.

This approach has one major drawback: custom types. As you might guess, for their instances we simply get "Object".

Custom Symbol.toStringTag and Function.name

OOP in JavaScript is based on prototypes rather than classes (as in Java, for example), and we have no ready-made getClass() method. Explicitly defining the toStringTag symbol for a custom type helps solve the problem:

 class Cat { get [Symbol.toStringTag]() { return 'Cat'; } }

or in prototype style:

 function Dog(){} Dog.prototype[Symbol.toStringTag] = 'Dog';
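A quick check of both styles against the tag extractor, as a sketch: `getObjT` repeats the helper defined earlier in the article, and the `Plain` class is a hypothetical control case without a tag.

```javascript
const getObjT = obj => Object.prototype.toString.call(obj).match(/\[object\s(\w+)]/)[1];

class Cat { get [Symbol.toStringTag]() { return 'Cat'; } }

function Dog() {}
Dog.prototype[Symbol.toStringTag] = 'Dog';

// Hypothetical control case: no custom tag defined.
class Plain {}

const tags = [new Cat(), new Dog(), new Plain()].map(getObjT);
const asString = String(new Cat());

console.log(tags, asString);
```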

There is an alternative solution through the read-only Function.name property, which has been part of the specification since ES2015 and is supported by all modern engines. Every instance of a prototype object or class has a reference to the constructor function that created it, so we can find out the name of the type:

 class Cat {}
 (new Cat).constructor.name
 < 'Cat'

or in prototype style:

 function Dog() {}
 (new Dog).constructor.name
 < 'Dog'

Of course, this solution does not work for objects created by an anonymous function (whose name is empty, or "anonymous" for functions built with the Function constructor) or by Object.create(null), nor for primitives that have no wrapper object (null, undefined).

Thus, to handle variable types reliably, it is worth combining the well-known techniques, guided first of all by the problem at hand. In most cases typeof and instanceof are sufficient.
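One possible way to combine the two techniques is sketched below. `typeName` and `tagOf` are hypothetical helpers, and the fallback order (constructor name first, toString tag second) is an assumption, not a recommendation from the specification.

```javascript
// Hypothetical helper: extract the tag from "[object Tag]".
const tagOf = (value) => Object.prototype.toString.call(value).slice(8, -1);

// Hypothetical helper: prefer constructor.name, fall back to the tag.
function typeName(value) {
  // null and undefined have no properties at all.
  if (value === null || value === undefined) return tagOf(value);
  const ctor = value.constructor;
  // Anonymous constructors have an empty name; Object.create(null) has none.
  if (typeof ctor === 'function' && ctor.name) return ctor.name;
  return tagOf(value);
}

class Cat {}
const names = [
  new Cat(), [], 5, null, undefined,
  Object.create(null), new (function () {})(),
].map(typeName);

console.log(names);
```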

Function.prototype.toString

We got a little sidetracked, but as a result we have arrived at functions, which have an interesting toString of their own. First, take a look at the following code:

 (function() { console.log('(' + arguments.callee.toString() + ')()'); })()

Many have probably guessed that this is an example of a quine. If you load a script with this content into the body of a page, an exact copy of its source code is printed to the console. This happens because toString is called on the function referenced by arguments.callee.

The toString implementation of the Function prototype object returns a string representation of the function's source code, preserving the syntax used to define it: FunctionDeclaration, FunctionExpression, ClassDeclaration, ArrowFunction, and so on.
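A short sketch of how the defining syntax survives the round trip. All four definitions are hypothetical examples, and the checks assume an engine that returns the original source text (as current versions of V8 and SpiderMonkey do).

```javascript
function declared(a, b) { return a + b; }          // FunctionDeclaration
const expressed = function (x) { return x * 2; };  // FunctionExpression
const arrow = (x) => x + 1;                        // ArrowFunction
class Point { constructor(x) { this.x = x; } }     // ClassDeclaration

// Each source string starts with the syntax that was used to define it.
const sources = [declared, expressed, arrow, Point].map((f) => f.toString());

sources.forEach((s) => console.log(s));
```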

For example, we have an arrow function:

 const bind = (f, ctx) => function() { return f.apply(ctx, arguments); }

Calling bind.toString() gives us the string representation of the ArrowFunction:

 "(f, ctx) => function() { return f.apply(ctx, arguments); }"

And calling toString on the function it returns gives the string representation of the FunctionExpression:

 "function() { return f.apply(ctx, arguments); }"

The bind example is no accident: we already have a ready-made solution for context binding, Function.prototype.bind, and Function.prototype.toString behaves in a special way with native bound functions. Depending on the implementation, you may get the representation of either the wrapper function itself or the wrapped (target) function. In V8 and SpiderMonkey, in the latest versions of Chrome and Firefox:

 function getx() { return this.x; }
 getx.bind({ x: 1 }).toString()
 < "function () { [native code] }"

Thus, you need to be careful with natively decorated functions.
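If code depends on reading function sources, it can at least detect when the source is unavailable. Below is a minimal sketch; `looksNative` is a hypothetical helper, and it relies on the "[native code]" body that current engines produce for bound and built-in functions.

```javascript
// Hypothetical helper: true when toString hides the real source.
const looksNative = (f) =>
  /\{\s*\[native code\]\s*\}\s*$/.test(Function.prototype.toString.call(f));

function getx() { return this.x; }
const bound = getx.bind({ x: 1 });

const report = {
  plain: looksNative(getx),       // user-defined: source is available
  bound: looksNative(bound),      // bound function: source is hidden
  builtin: looksNative(Math.max), // built-in: source is hidden
  stillWorks: bound(),            // the bound function itself behaves normally
};

console.log(report);
```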

Practice using f.toString

There are many ways to use toString, but only as a tool for metaprogramming or debugging. Relying on it in the business logic of a typical application will sooner or later lead to an unmaintainable mess.

The simplest thing that comes to mind is determining the length of a function's source:

 f.toString().replace(/\s+/g, ' ').length

Editions of the specification before ES2019 left the placement and number of whitespace characters in the toString result to the mercy of the particular implementation, so to be safe we first collapse runs of whitespace into a normalized form. By the way, in older versions of the Gecko engine the function took a special indentation parameter that helped with formatting indents.

Extracting the names of a function's parameters also comes to mind immediately, which can be useful for reflection:

 f.toString().match(/^function(?:\s+\w+)?\s*\(([^\)]+)/m)[1].split(/\s*,\s*/)

This quick-and-dirty solution works for the FunctionDeclaration and FunctionExpression syntax. If you need something more thorough and accurate, I recommend looking for examples in the source code of your favorite framework, which surely has some dependency injection under the hood based precisely on the names of the declared parameters.

A dangerous but interesting option is to redefine a function with eval:
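A slightly more defensive variant of the same idea, as a sketch. `paramNames` is a hypothetical helper; it still handles only the FunctionDeclaration and FunctionExpression forms and will be confused by default values that contain commas or parentheses.

```javascript
// Hypothetical helper: list parameter names, or null for unsupported syntax.
function paramNames(f) {
  const m = f.toString().match(/^function(?:\s+\w+)?\s*\(([^)]*)\)/m);
  if (!m) return null;              // arrow functions, classes, methods
  const list = m[1].trim();
  return list === '' ? [] : list.split(/\s*,\s*/);
}

const named = paramNames(function add(a, b) { return a + b; });
const empty = paramNames(function () {});
const unsupported = paramNames((a) => a);

console.log(named, empty, unsupported);
```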

 const sum = (a, b) => a + b;
 const prod = eval(sum.toString().replace(/\+(?=\s*(?:a|b))/gm, '*'));
 sum(5, 10)
 < 15
 prod(5, 10)
 < 50

Knowing the structure of the original function, we created a new one by replacing the addition operator that precedes the arguments in its body with multiplication. With programmatically generated code, or when a function offers no extension interface, this can be magically useful. For example, when you are exploring some mathematical model and searching for a suitable function by playing with operators and coefficients.

A more practical use is compiling and distributing templates. Many template engines compile the template source and return a function of the data that produces the final HTML (or other output). Below is an example using the _.template function:

 const helloJst = "Hello, <%= user %>"
 _.template(helloJst)({ user: 'admin' })
 < "Hello, admin"

But what if compiling a template is resource-intensive, or the client is very thin? In that case we can compile the template on the server and send clients not the template text but the string representation of the finished function. What is more, the client then does not need to load the template engine at all.

 const helloStr = _.template(helloJst).toString()
 helloStr
 < "function(obj) { obj || (obj = {}); var __t, __p = ''; with (obj) { __p += 'Hello, ' + ((__t = ( user )) == null ? '' : __t); } return __p }"

Now we need to evaluate this code on the client before use. An anonymous function at the very start of the evaluated code is parsed as a declaration without a name, which throws a SyntaxError. To avoid that:

 const helloFn = eval(helloStr.replace(/^function\(obj\)/, 'obj=>'));

or so:

 const helloFn = eval(`const f = ${helloStr};f`);

Or any other way you like. Either way:

 helloFn({ user: 'admin' })
 < "Hello, admin"

This may not be the best practice for compiling templates on the server and distributing them to clients. It is just an example of using Function.prototype.toString together with eval.

Finally, the old task of determining a function's name (before the Function.name property appeared) via toString:
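The round trip can be tried without a template engine. In this sketch the `compiled` function is a hand-written, hypothetical stand-in for what a template engine would produce, and `revive` is a hypothetical helper that wraps the source in parentheses so that it is parsed as an expression.

```javascript
// Hypothetical stand-in for a template compiled on the server.
const compiled = function (obj) {
  return 'Hello, ' + (obj.user == null ? '' : obj.user);
};

// What the server would send to the client.
const source = compiled.toString();

// Evaluating the bare source fails: a function statement needs a name.
let bareError = null;
try { (0, eval)(source); } catch (e) { bareError = e; }

// Hypothetical helper: parentheses force expression parsing.
const revive = (src) => (0, eval)('(' + src + ')');
const helloFn = revive(source);

console.log(bareError && bareError.name, helloFn({ user: 'admin' }));
```

The usual eval caveats apply: only evaluate source that comes from a server you trust.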

 f.toString().match(/function\s+(\w+)(?=\s*\()/m)[1]

Of course, this works well only for the FunctionDeclaration syntax. A smarter solution would require a cleverer regular expression or pattern matching.

The Internet is full of interesting solutions based on Function.prototype.toString; you only need to search. Share your experience in the comments: I would be very interested.

Array.prototype.toString

The toString implementation of the Array prototype object is generic and can be called on any object. If the object has a join method, the result of toString is the result of calling it; otherwise Object.prototype.toString is used.

Array, logically, has a join method, which concatenates the string representations of all its elements using the separator passed as a parameter (a comma by default).
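Both branches are easy to observe. In this sketch the `joinable` object is a hypothetical example with its own join method.

```javascript
// Hypothetical object that is not an array but has a join method.
const joinable = { join() { return 'custom join result'; } };

const viaJoin = Array.prototype.toString.call(joinable); // uses joinable.join
const fallback = Array.prototype.toString.call({});      // no join: Object.prototype.toString
const nested = [1, [2, 3], 'x'].toString();              // nested arrays are joined recursively
const custom = [1, 2, 3].join(' | ');                    // explicit separator

console.log(viaJoin, fallback, nested, custom);
```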

Suppose we need to write a function that serializes its argument list. If all parameters are primitives, then in many cases we can do without JSON.stringify :

 function seria() { return Array.from(arguments).toString(); }

or so:

 const seria = (...a) => a.toString();

Just remember that the string '10' and the number 10 are serialized identically. This solution was used at one stage of the task about the shortest memoizer.

The native join walks the array in a plain counting loop from 0 to length and does not skip missing elements (nor null and undefined). Each of them contributes an empty string, but the separator is still added. This leads to the following:
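The collisions are worth seeing before relying on this as a cache key. A small sketch using the `seria` function from above; the variable names are hypothetical.

```javascript
const seria = (...a) => a.toString();

const collisions = {
  // A number and its string form are indistinguishable.
  numberVsString: seria(10, 'a') === seria('10', 'a'),
  // Two arguments collide with one string that contains a comma.
  arityLost: seria(1, 2) === seria('1,2'),
  // null and undefined both become empty strings.
  empties: seria(null, undefined),
  // JSON.stringify keeps the type difference.
  jsonDiffers: JSON.stringify([10]) !== JSON.stringify(['10']),
};

console.log(collisions);
```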

 const ar = new Array(1000);
 ar.toString()
 < ",,,...,,," // 999 commas

Therefore, if for one reason or another you add an element with a large index to an array (for example, a generated natural id), by no means call join on it and, accordingly, do not convert it to a string without preparation. Otherwise there may be consequences: Invalid string length, out of memory, or simply a hung script. Use Object.values and Object.keys to iterate over only the object's own enumerable properties:

 const k = [];
 k[2**10] = 1;
 k[2**20] = 2;
 k[2**30] = 3;
 Object.values(k).toString()
 < "1,2,3"
 Object.keys(k).toString()
 < "1024,1048576,1073741824"

But it is much better to avoid such handling of an array: most likely a simple key-value object would suit you as a storage.

By the way, the same danger exists when serializing via JSON.stringify. It is even more serious there, since empty and unsupported elements are represented as "null":

 const ar = new Array(1000);
 JSON.stringify(ar);
 < "[null,null,null,...,null,null,null]" // 1000 times

To conclude the section, let me remind you that you can define a join method of your own for a custom type and call Array.prototype.toString.call as an alternative way of converting to a string, but I doubt this has any practical use.
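The same effects at a scale that is safe to run, as a sketch; the `sparse` array and the index 5 are arbitrary choices for illustration.

```javascript
// A small sparse array: only index 5 is actually set.
const sparse = [];
sparse[5] = 'x';

const joined = sparse.toString();            // holes become empty strings
const json = JSON.stringify(sparse);         // holes become null
const values = Object.values(sparse);        // only existing elements
const keys = Object.keys(sparse);            // only existing indices, as strings
const commas = new Array(1000).toString().length; // separators between 1000 holes

console.log(joined, json, values, keys, commas);
```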

Number.prototype.toString and parseInt

One of my favorite JS quiz questions: what will the following parseInt call return?

 parseInt(10**30, 2)

The first thing parseInt does is implicitly convert its argument to a string by calling the abstract operation ToString, which follows the conversion branch appropriate to the argument's type. For the Number type it does the following:

If the value is NaN, 0 or Infinity, the corresponding string is returned.
Otherwise, the algorithm returns the most human-friendly notation of the number: decimal or exponential.

I will not reproduce here the algorithm that chooses the preferred form and will only note the following: if the number of digits in the decimal notation exceeds 21, the exponential form is chosen. This means that in our case parseInt works not with "100...000" but with "1e+30". So the answer is not the expected 2^30. If you know where this magic number 21 comes from, let me know!

Next, parseInt looks at the radix of the numeral system in use (10 by default; ours is 2) and checks the characters of the resulting string for compatibility with it. On encountering 'e', it cuts off the whole tail, leaving only "1". The result is the integer obtained by converting from base radix to decimal; in our case it is 1.
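Each step of this explanation can be checked separately. A minimal sketch; the variable names are hypothetical.

```javascript
// Up to 21 digits the decimal form is used; beyond that, exponential.
const decimalForm = String(1e20);
const exponentialForm = String(1e21);
const asString = String(1e30);

// parseInt sees a string starting with "1" followed by characters that are
// not binary digits, so it stops after the first one.
const quizAnswer = parseInt(10 ** 30, 2);

// Passing the digits as a string gives the "expected" result.
const fromDigits = parseInt('1' + '0'.repeat(30), 2);

console.log(decimalForm, exponentialForm, asString, quizAnswer, fromDigits);
```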

Reverse procedure:

 (2**30).toString(2)

Here the toString function is called from the Number prototype object, and it uses the same algorithm for converting a number to a string. It also has an optional radix parameter. The difference is that it throws a RangeError for an invalid value (the radix must be an integer from 2 to 36 inclusive), whereas parseInt returns NaN.

Keep the upper bound of the radix in mind if you plan to implement an exotic hash function: this toString may not suit you.

A puzzle to take your mind off things for a moment:
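The difference in error handling is easy to demonstrate. A short sketch; the variable names are hypothetical.

```javascript
const binary = (2 ** 30).toString(2);  // "1" followed by 30 zeros
const hex = (255).toString(16);
const maxRadix = (35).toString(36);    // the largest single digit in base 36

// Number.prototype.toString rejects an out-of-range radix...
let radixError = null;
try { (10).toString(37); } catch (e) { radixError = e; }

// ...while parseInt quietly returns NaN.
const parsed = parseInt('10', 37);

console.log(binary, hex, maxRadix, radixError && radixError.name, parsed);
```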

 '3113'.split('').map(parseInt)

What will it return, and how do you fix it?

Neglected

We have not covered toString for all objects, not even for all the native prototype objects. Partly because I personally never got into trouble with them, and there is not much of interest about them. We also did not touch the toLocaleString function, since it deserves a separate discussion. If I have unfairly overlooked, missed, or misunderstood something, be sure to write!

Call for inaction

The examples I have given are by no means ready-made recipes, only food for thought. Besides, I find it pointless and a little silly to discuss this in technical interviews: for that there are the eternal topics of closures, hoisting, the event loop, the module/facade/mediator patterns, and "of course" questions about [the framework in use].

This article turned out to be a hodgepodge, and I hope you found something interesting in it. P.S. JavaScript is an amazing language!

Bonus

In preparing this material for publication, I used Google Translate. And quite by chance I discovered an amusing effect. If you choose translation from Russian to English, type "toString" and start erasing it with the Backspace key, you will observe:

[image: bonus]

Such is the irony! I doubt I am the first to notice it, but just in case I sent them a screenshot with the reproduction steps. It looks like a harmless self-XSS, so I am sharing it.

Source: https://habr.com/ru/post/414495/
