Richard Hamming: Chapter 27. Unreliable Data

“The goal of this course is to prepare you for your technical future.”

Hi, Habr. Remember the awesome article "You and Your Work" (+219, 2394 bookmarks, 386k reads)?

So Hamming (yes, yes, the same Hamming of the self-checking and self-correcting Hamming codes) has a whole book based on his lectures. We are translating it, because the man talks sense.

This book is not just about IT, it is a book about the thinking style of incredibly cool people. "This is not just a pep talk of positive thinking; it describes the conditions that increase the chances of doing great work."

We have already translated 21 of the 30 chapters, and we are working on a print edition.

Chapter 27. Unreliable Data


(Thanks to Valentin Pinchuk for the translation; he responded to my call in the previous chapter.) If you want to help with the translation, layout and publication of the book, write a private message or email magisterludi2016@yandex.ru

In my experience, and in the experience of many other researchers, data is usually far less accurate than it is claimed to be. This is not a minor point: we depend on initial data both when making decisions directly and when feeding it into models whose results those decisions are based on. Since the nature of the errors is very diverse and I have no unified theory to explain them all, I have to resort to individual examples and the generalizations that follow from them.

Let me start with life testing. A good example is my experience with life-cycle tests of vacuum tubes intended for the first submarine telephone cables, which were designed for a service life of 20 years (after 22 years the cable was simply taken out of service because it had become too expensive to operate, which gives a good idea of how fast technology was moving in those days).

The tubes for the cable became available only about 18 months before the cable itself was to be laid. I had a medium-sized computing facility built around the IBM 101 specialized statistical machine, which I made available to the people processing the data, and I also helped them, mainly with the technical side of the calculations; I was in no way involved in the project itself. One day, however, a project manager showed me the test equipment stored in the attic. As usual, I asked: "Why are you sure the test equipment is as reliable as the equipment being tested?" His answer convinced me that he had not thought about it at all. Seeing no point in pressing the details, I let the matter drop. But I did not forget the question myself!

Life tests are becoming ever more important and ever more difficult as we need more and more reliable components for ever larger and more complex systems. One of the basic tricks is to accelerate the test, relying on the fact that for every 17 °C rise in temperature many, but not all, chemical reactions roughly double their rate. Raising the operating voltage is also used to speed up the discovery of weak spots, and raising the clock frequency has a similar effect when testing chips. But even applying these methods in combination does not give a solid basis for conclusions about longevity. In reply the experts say: "What else can we do under the constraints of time and money?" After all, the interval between a scientific discovery and its engineering application keeps shrinking, so that virtually no time is left for genuine life testing of a new device before it goes into widespread use. And if you insist on making sure anyway, you will forever lag behind.
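To see what this acceleration buys, and what it assumes, here is a rough back-of-the-envelope sketch in Python. The 17 °C doubling figure is the rule of thumb quoted above; the temperatures and test duration are invented purely for illustration.

```python
# Rough acceleration-factor arithmetic for a temperature-accelerated life test,
# using the rule of thumb quoted above: every extra 17 C roughly doubles the
# rate of many (but not all) chemical reactions.

def acceleration_factor(delta_t_celsius, doubling_step_c=17.0):
    """Assumed speed-up of degradation when running delta_t_celsius hotter."""
    return 2.0 ** (delta_t_celsius / doubling_step_c)

# Hypothetical example: bench test at 85 C for a part that operates at 45 C.
factor = acceleration_factor(85 - 45)        # about 5.1x
bench_years = 2.0                            # time actually spent testing
claimed_service_years = bench_years * factor

print(f"acceleration factor: {factor:.1f}x")
print(f"{bench_years} bench years are claimed to cover ~{claimed_service_years:.0f} service years")
# The weak link is the rule of thumb itself: a failure mechanism that does not
# follow it makes the whole extrapolation unreliable.
```

The arithmetic is trivial; the point, as the paragraph above says, is that the doubling assumption carries the entire weight of the conclusion.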

Of course, there are other test methods besides these, designed to probe other aspects. I remained convinced of how shaky these foundations of life testing are, but there are no others! Some time ago at Bell Telephone Laboratories I argued that a department for life testing should be created, whose task would be to prepare tests for a new device while it was still only being planned, rather than scrambling once the finished device appeared and results were needed. I did not succeed, although I made a few admittedly weak suggestions about where to start. There was no time for basic research in life testing: the people doing it were under the strongest pressure to deliver the required results tomorrow. As the saying goes, "there is never enough time to do it right, but there is always time to fix it later", especially in computer software!

Here is the question I put to you: how do you intend to test a device (or a component of one) that must be highly reliable, when the test equipment is less reliable, the time available for testing is severely limited, and yet a very long service life is required? This problem will surely come back to torment you, so you had better start thinking about it now and sketch out ways to attack it for the day when it falls to you to deliver life-test results.

Let me turn now to some aspects of measurement. A friend of mine at Bell Telephone Laboratories, a very good statistician, thought that some of the data he was analyzing was inaccurate. His arguments for re-measuring it did not convince the head of the department concerned, who was sure of the reliability of his people; besides, all the measuring instruments carried plates certifying their accuracy. Then one fine Monday morning my friend came to work and announced that he had left his briefcase on the train on the way home on Friday and lost everything. The department head had to order the measurements repeated, after which my friend produced the original records and showed how different the two sets were! Of course this did not add to his popularity, but it exposed an inaccuracy in measurements that were to play a crucial role later.

The same statistician friend once studied, for an outside company, the pattern of telephone calls placed from its switchboard. The data was recorded by the central-office equipment, the same equipment that set up the calls and produced the billing records. At one point he stumbled on a call to a non-existent office! He then examined the data more closely and found a noticeable percentage of calls connected for several minutes to offices that did not exist! The data had been recorded by the very machines that placed the calls, and yet it was wrong. So you cannot even rely on a machine to record data about itself correctly!

My brother, who worked for many years in the Los Angeles air pollution control agency, once told me that they found they had to take apart, recalibrate and reassemble every new instrument they acquired; otherwise there were endless problems with accuracy, and this despite the supplier's assurances!

I once did a large study of equipment for Western Electric. They supplied the raw data: 18 months of records on more than 100 pieces of equipment. I asked why I should believe the data was internally consistent; could the records, for example, show the retirement of equipment that had never been installed? They assured me they had thought about this, had gone through all the data and had added a few pseudo-transactions to remove such cases. I rashly believed them, and only later, in the course of the work, did I find that contradictions still remained in the data, so I had to hunt them down, remove them, and then recompute everything. From that experience I learned never to begin processing data until it has been thoroughly screened for errors. People complained about my slowness, but I almost always found errors in the data, and when I showed them, they had to admit that my caution was justified. No matter how unimpeachable the data seems or how urgently the answer is needed, I learned to pre-check it for consistency and to deal first with the sharply deviating values (the outliers).
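Below is a minimal sketch of the kind of consistency pre-check described here, in Python. The record layout and field names are invented for illustration; the point is only that contradictions such as retiring equipment that was never installed are cheap to detect before any real processing starts.

```python
# Toy consistency check in the spirit of the Western Electric example:
# flag "retirements" of equipment that was never recorded as installed.
# The record layout here is hypothetical.

records = [
    ("A17", "install", "1950-03"),
    ("A17", "retire",  "1951-07"),
    ("B02", "retire",  "1950-11"),   # retired but never installed: inconsistent
    ("C45", "install", "1951-01"),
    ("C45", "install", "1951-02"),   # installed twice without a retirement
]

installed = set()
problems = []
for item, action, date in records:
    if action == "install":
        if item in installed:
            problems.append(f"{date}: {item} installed twice")
        installed.add(item)
    elif action == "retire":
        if item not in installed:
            problems.append(f"{date}: {item} retired but never installed")
        installed.discard(item)

for p in problems:
    print(p)
# Only after such contradictions are found, resolved and the totals recomputed
# does it make sense to start the real analysis.
```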

Another time I took part, first as initiator and then as adviser, in a large study of AT&T personnel in New York using rented UNIVAC equipment. The data had to come from many places, so I decided it would be wise to run a pilot study first, to make sure every source understood what was going on and knew how to prepare the IBM punched cards with the required data. We did so. Yet when the main study began, some sources did not fill in the punched cards according to the instructions they had received. It soon became clear to me that the small pilot study had gone entirely through a local, trained group of punched-card specialists, while the main study went through the general punched-card pools, which, to my regret, had never heard of the pilot study! Once again I was not as clever as I imagined: I had underestimated the internal workings of a large organization.

But what about basic scientific data? A National Bureau of Standards publication on ten fundamental physical constants (the speed of light, Avogadro's number, the electron charge, and so on) gives two sets of values, for 1929 and 1973, together with the corresponding stated errors (see Fig. 27.I). It is easy to see that if you:

  1. take the 1973 values to be correct (consistent with the fact that the table shows the accuracy of the constants improving by a factor of thousands over the 44 years between the two editions),
  2. compute the deviation of each new value from the old one,
  3. compute how many times that deviation exceeds the error stated for the old value,
  4. then, on average, the deviation turns out to be 5.267 times the stated error (the values in the last column, R, were added to the table by the author); a small computational sketch follows this list.
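Here is the promised sketch of that procedure in Python. The three entries are placeholder numbers, not the actual values from Fig. 27.I; only the arithmetic of steps 2-4 is being illustrated.

```python
# Sketch of the ratio computed in the list above: how many times does the shift
# between the 1929 and 1973 values of a constant exceed the error quoted in 1929?
# The entries below are PLACEHOLDER numbers, not the real table from Fig. 27.I.

constants = [
    # (value_1929, quoted_error_1929, value_1973)
    (1.000, 0.002, 1.009),
    (2.500, 0.010, 2.530),
    (7.300, 0.050, 7.100),
]

ratios = []
for old, err_old, new in constants:
    deviation = abs(new - old)           # step 2: shift of the accepted value
    ratios.append(deviation / err_old)   # step 3: shift in units of the old error

mean_ratio = sum(ratios) / len(ratios)   # step 4: Hamming reports about 5.267
print(f"mean deviation / quoted error = {mean_ratio:.3f}")
```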

You would assume that the values of the physical constants had been determined with great care, but now you can see how inaccurate they actually were! A later selection of physical constants (see Fig. 27.II) shows an average discrepancy about half as large. One can only guess what will happen to the stated accuracy after another 20 years. Want to bet?


Figure 27.I

Captions: "Unreliable Data"; measurement accuracy (in parts per million)

Sources
Birge, R. T. (1929). "Probable Values of the General Physical Constants". Reviews of Modern Physics. 1 (1): 1.
Cohen, E. Richard; Taylor, Barry N. (1973). "The 1973 Least-Squares Adjustment of the Fundamental Constants". Journal of Physical and Chemical Reference Data. 2 (4): 663-734. Bibcode: 1973JPCRD...2..663C. doi:10.1063/1.3253130
Cohen, E. Richard; Taylor, Barry N. (1987). "The 1986 CODATA Recommended Values of the Fundamental Physical Constants". Journal of Research of the National Bureau of Standards. 92 (2): 1-13. doi:10.6028/jres.092.010

This is not at all surprising. I recently saw a table of measurements of the Hubble constant (the slope of the redshift-versus-distance relation), which is fundamental to modern cosmology. Many of the values fall outside the error bounds claimed for most of the other values.

Thus a direct statistical check shows that even the most accurately known physical constants in the tables are not nearly as accurate as claimed. How can this be? Carelessness and optimism are two of the main factors. Closer examination reveals that the standard experimental practice we are all trained in also contributes to the underestimation of errors. Consider how you actually set up an experiment, as opposed to how it works in theory. You assemble the equipment and switch it on, and of course it does not behave as it should. So you spend some time, often weeks, getting it to work properly. Now you are ready to take data, but first you fine-tune the apparatus. How? By adjusting it until it gives consistent readings. In plain words, you achieve low scatter; what else could you do? But it is precisely this low-scatter data that you hand to the statisticians, and it is this scatter that goes into the estimate of variability. You are not handing over data whose calibration is right, because you do not know how to do that; you are handing over low-variance data, and from the statistics you get the high reliability you wanted to claim! That is ordinary laboratory practice, so it is no surprise that the reliability of data rarely matches what is stated.
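A tiny simulation makes the point concrete. It assumes, purely for illustration, a fixed calibration offset that the tuning never removes, together with the small scatter the experimenter worked so hard to achieve.

```python
import random

# Illustration of the point above: tuning the apparatus for *consistent* readings
# shrinks the scatter you report, but does nothing about a systematic offset.
# The offset and noise levels here are invented for the demonstration.

random.seed(1)
true_value = 10.0
systematic_offset = 0.30      # mis-calibration left in after "tuning"
noise_after_tuning = 0.02     # the nice, small scatter actually achieved

readings = [true_value + systematic_offset + random.gauss(0, noise_after_tuning)
            for _ in range(100)]

mean = sum(readings) / len(readings)
std = (sum((x - mean) ** 2 for x in readings) / (len(readings) - 1)) ** 0.5
std_of_mean = std / len(readings) ** 0.5

print(f"reported value : {mean:.3f} +/- {std_of_mean:.3f}")
print(f"actual error   : {mean - true_value:+.3f}")
# The quoted uncertainty (a few thousandths) looks superb; the real error
# (about 0.3) is two orders of magnitude larger, because only the scatter
# was ever measured.
```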


Figure 27.II

Let me remind you of Hamming's rule:

in 90% of cases the result of the next independent measurement will fall outside the 90% confidence limits set by the previous one!

The rule, of course, somewhat exaggerates the facts, but in this form it is easier to remember: most published statements about measurement accuracy are not as good as claimed. This is borne out by the history of experiments themselves and by the discrepancies with the stated accuracy that emerge later. I never tried to get a grant for a proper large-scale study of the question, but I have little doubt about what it would show.
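One crude way to see how such a rule can arise: assume each independent experiment carries its own unreported systematic bias that is several times larger than the statistical error it quotes. All the distributions and magnitudes below are invented for the demonstration; only the mechanism matters.

```python
import random

# Crude illustration of Hamming's rule: if each lab's result carries its own
# systematic bias larger than the statistical error it quotes, the next
# independent result will usually fall outside the previous 90% interval.
# Bias and noise magnitudes are invented for the demonstration.

random.seed(2)
true_value = 0.0
quoted_sigma = 1.0       # statistical error each experiment honestly reports
hidden_bias_sigma = 4.0  # unmodelled systematic error, different for each lab

def one_measurement():
    bias = random.gauss(0, hidden_bias_sigma)
    return true_value + bias + random.gauss(0, quoted_sigma)

misses = 0
trials = 10_000
for _ in range(trials):
    previous = one_measurement()
    lo, hi = previous - 1.645 * quoted_sigma, previous + 1.645 * quoted_sigma
    if not (lo <= one_measurement() <= hi):
        misses += 1

print(f"next measurement outside the previous 90% interval: {misses / trials:.0%}")
# With systematic errors several times the quoted sigma, the miss rate is far
# above the nominal 10% -- the spirit, if not the exact figure, of the rule.
```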

Another striking phenomenon you will meet is the use of data in a model when there are errors both in the data and in the model itself. For example, a normal distribution is assumed, but the actual tails may be heavier or lighter than the model predicts; or negative values are physically impossible even though the normal distribution allows them. There are then two sources of error, the measurements and the model, and as you learn to measure more and more accurately, the share of the total error due to the model's mismatch with reality only grows.
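A numerical illustration of a model error that better instruments cannot fix: a strictly positive quantity is modelled as normal. The lognormal "truth" below is assumed only for the demonstration.

```python
import math, random

# Illustration of model error: a strictly positive quantity is modelled with a
# normal distribution fitted to a sample. The "true" process here is taken to
# be lognormal purely for the demonstration.

random.seed(3)
sample = [random.lognormvariate(0.0, 0.75) for _ in range(100_000)]

mu = sum(sample) / len(sample)
sigma = (sum((x - mu) ** 2 for x in sample) / (len(sample) - 1)) ** 0.5

# Probability of a negative value under the fitted normal (impossible in reality)
z = (0.0 - mu) / sigma
p_negative = 0.5 * math.erfc(-z / math.sqrt(2))

# Tail beyond mu + 3*sigma: what the fitted model says vs. what the data show
p_tail_model = 0.5 * math.erfc(3 / math.sqrt(2))
p_tail_data = sum(x > mu + 3 * sigma for x in sample) / len(sample)

print(f"model probability of a negative value : {p_negative:.3f}")
print(f"tail beyond mu+3sigma, model vs data  : {p_tail_model:.4f} vs {p_tail_data:.4f}")
# No amount of extra measurement precision removes these discrepancies;
# they belong to the model, not to the instrument.
```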

I recall my time as a member of the board of directors of a computer company. We were about to move to a new family of computers and had prepared very careful estimates of the cost of the new models. The salesman then stated that at one price he could get orders for 10 machines, at another for 15, and at a third for 20. His guesses, and I am not saying they were wrong, were combined with the carefully verified engineering figures to set the price of the new model! In effect, the final number was treated as if it had the reliability of the engineering calculations, while the uncertainty of the salesman's guesses was ignored. This is typical of large organizations: careful estimates are combined with arbitrary assumptions, and the reliability of the whole is taken to equal that of the engineering component. You may well ask why bother with careful engineering estimates if they are going to be combined with arbitrary guesses, but such is the widespread practice in many fields!
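A toy version of that arithmetic, with invented numbers: a cost figure known to a couple of percent is multiplied by a sales forecast that is little better than a guess, and the precision of the careful part simply disappears (here the independent relative errors are combined in quadrature, which is one simple convention, not the company's actual method).

```python
# Toy version of the pricing decision described above, with invented numbers:
# a manufacturing cost known quite precisely, combined with a sales forecast
# that is little better than a guess. The uncertainty of the product of the
# two is dominated entirely by the forecast.

unit_cost = 100_000          # careful engineering estimate
unit_cost_rel_err = 0.02     # +/- 2%

forecast_units = 15          # the salesman's figure ("10, 15 or 20")
forecast_rel_err = 0.33      # spread of the guesses, roughly +/- 33%

total_cost = unit_cost * forecast_units

# Independent relative errors combine roughly in quadrature.
total_rel_err = (unit_cost_rel_err ** 2 + forecast_rel_err ** 2) ** 0.5

print(f"total program cost: {total_cost:,} +/- {total_rel_err:.0%}")
# The 2% engineering precision is invisible in the answer: the overall
# reliability is set by the weakest number that went in.
```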

I talked about science and engineering first so that you will not be too scornful when we turn to economic data. I have read the book On the Accuracy of Economic Observations by Oskar Morgenstern (Princeton University Press, 2nd ed.), a highly respected economist.
My favorite example from his book is the official figures for the flow of gold from one country to another, as reported by the two sides: the numbers sometimes differ by more than a factor of two! If even the gold flows cannot be recorded correctly, what data can be? I can see how an electrical device shipped to a third-world country might be declared as medical equipment because of differences in customs duties, but gold is gold; it is hard to call it anything else.

Morgenstern also notes that at one time DuPont owned roughly 23% of General Motors stock. Do you think this was taken into account when the gross national product (GNP) was calculated? Of course not, so you get double counting!

For another example: not long ago, when the tax rules for reporting inventories changed, many companies switched their accounting methods to benefit from the new rules. Doing so meant reporting smaller inventories and hence paying less tax. I searched in vain, at least in the Wall Street Journal, for any mention of this fact; there was none. Yet inventory levels are one of the main indicators we use to judge producers' expectations of whether the economy is growing or shrinking: manufacturers are thought to run inventories down when they expect sales to fall and to build them up when they expect sales to rise, so as not to miss the potential revenue. As far as I could tell, the change in the reporting rules and its effect on the economic measurements was not taken into account at all.

All time series share a common problem: the definition of the thing being measured keeps changing. Poverty is perhaps the best example. We keep raising the poverty line, so poverty can never be eliminated; the definition will always be adjusted by officials with an interest in preserving the programs they run, which require a sufficient number of people below the line. What we now call "poverty" is in many respects better than what the King of England had not so very long ago!

In the U.S. Navy the meaning of terms such as "yeoman" (an office clerk), "ship" and so on has changed over the years, so in any time series you study to identify trends in the Navy this extra factor will muddle your conclusions. This is not to say you should not try to understand the situation from past data (using the sophisticated signal-processing techniques of Chapters 14-17), but there remain problems caused by changes in definitions that are nowhere spelled out in the official documents!

( , ! . 1936 «The Literary Digest», , . , : . , , , , . 55% , 41%. «The Literary Digest»: . . 61% , — 37%. , . : , , «» «The Literary Digest». , . . , 1938 «The Literary Digest» «Time Magazine». , , , «The Literary Digest» . , , . , : «The Literary Digest» 1936 . , . , – , 1936 . – .).


To be continued...

If you want to help with the translation, layout and publication of the book, write a private message or email magisterludi2016@yandex.ru

By the way, we have also started translating another terrific book, "The Dream Machine: The History of the Computer Revolution".

Book content and translated chapters
Foreword
  1. Intro to the Art of Doing Science and Engineering: Learning to Learn (March 28, 1995) Translation: Chapter 1
  2. Foundations of the Digital (Discrete) Revolution (March 30, 1995) Chapter 2. Basics of the digital (discrete) revolution
  3. “History of Computers - Hardware” (March 31, 1995) Chapter 3. Computer History - Hardware
  4. History of Computers - Software (April 4, 1995) Chapter 4. Computer History - Software
  5. History of Computers - Applications (April 6, 1995) Chapter 5. Computer History — A Practical Application
  6. “Artificial Intelligence - Part I” (April 7, 1995) Chapter 6. Artificial Intelligence - 1
  7. «Artificial Intelligence — Part II» (April 11, 1995)
  8. Artificial Intelligence III (April 13, 1995) Chapter 8. Artificial Intelligence-III
  9. N-Dimensional Space (April 14, 1995) Chapter 9. N-Dimensional Space
  10. «Coding Theory — The Representation of Information, Part I» (April 18, 1995) (translator disappeared :((( )
  11. «Coding Theory — The Representation of Information, Part II» (April 20, 1995)
  12. «Error-Correcting Codes» (April 21, 1995)
  13. Information Theory (April 25, 1995) (translator disappeared :((( )
  14. Digital Filters, Part I (April 27, 1995) Chapter 14. Digital Filters - 1
  15. Digital Filters, Part II (April 28, 1995) Chapter 15. Digital Filters - 2
  16. Digital Filters, Part III (May 2, 1995) Chapter 16. Digital Filters - 3
  17. «Digital Filters, Part IV» (May 4, 1995)
  18. «Simulation, Part I» (May 5, 1995)
  19. «Simulation, Part II» (May 9, 1995)
  20. Simulation, Part III (May 11, 1995)
  21. Fiber Optics (May 12, 1995) Chapter 21. Fiber Optics
  22. Computer Aided Instruction (May 16, 1995) (translator disappeared :((( )
  23. "Mathematics" (May 18, 1995) Chapter 23. Mathematics
  24. Quantum Mechanics (May 19, 1995) Chapter 24. Quantum Mechanics
  25. Creativity (May 23, 1995). Translation: Chapter 25. Creativity
  26. Experts (May 25, 1995) Chapter 26. Experts
  27. «Unreliable Data» (May 26, 1995)
  28. Systems Engineering (May 30, 1995) Chapter 28. System Engineering
  29. "You Get What You Measure" (June 1, 1995) Chapter 29. You Get What You Measure
  30. “How do we know what we know” (June 2, 1995) (translator disappeared :((( )
  31. Hamming, “You and Your Research” (June 6, 1995). Translation: You and Your Work


Source: https://habr.com/ru/post/413255/

