Recently, I had the pleasure of attending a Distinguished Lecture given by Dr. Barbara Liskov of MIT, the 2008 ACM A.M. Turing Award winner. The lecture, entitled “The Power of Abstraction,” gave a fascinating historical perspective on how modern software abstractions emerged and why mainstream programming languages look the way they do.
I was particularly intrigued by the observation Dr. Liskov made about the relative time a computer program is written vs. read. Early in the development of programming languages, language designers optimized their designs to make programs easy to write. However, Dr. Liskov has observed that the amount of time spent reading the source code of a computer program tends to be much greater than the amount of time it takes to write it. In other words, during the lifetime of a computer program, if t(w) is the time spent writing the program, and t(r) is the time spent reading it, then
t(r) > t(w). This observation has fundamentally transformed the optimization criteria for programming language design, and nowadays programming languages are optimized to make programs easy to read.
It has occurred to me that the t(r) > t(w) inequality is not true for all computer programs. To define t(w) more precisely, let it be a metric that includes both the time it takes to write a program and the time spent subsequently modifying it as part of various maintenance tasks. However, only useful programs are read and modified continuously.
As a specific example, consider programs written as assignments in CS classes. I don’t think such programs are read more than they are written. Once a student program does what it is assigned to do, the student may never execute the program again. (The program is likely to be executed one more time during the grading process.) Similarly, professional developers write plenty of code that never makes it into production. This code is discarded and never read again. For those kinds of programs, t(r) < t(w).
To better qualify the t(r) > t(w) inequality, let us introduce another metric, t(e), the total time a computer program is executed. The relationship between t(r) and t(w) cannot be accurately assessed without taking t(e) into account.
To support this claim quantitatively, let me offer the following observation. For a program to be read more than it is written, the total time that a program is executed should be at least an order of magnitude larger than it has taken to write it. That is, t(r) > t(w) => t(w) * 10 < t(e).
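The heuristic above can be expressed as a short sketch. The function name and all the numbers below are illustrative assumptions of mine, not measurements; the code merely checks the necessary condition the observation implies, namely that t(w) * 10 < t(e) must hold before t(r) > t(w) is even possible.

```python
# A toy model of the heuristic in the text: by the contrapositive of
# t(r) > t(w) => t(w) * 10 < t(e), a program whose total execution time
# does not exceed ten times its writing time cannot (per this heuristic)
# be read more than it is written.

def may_be_read_more_than_written(t_w_hours: float, t_e_hours: float) -> bool:
    """Check the necessary condition: t(w) * 10 < t(e)."""
    return t_w_hours * 10 < t_e_hours

# A class assignment: 20 hours to write, executed for perhaps 1 hour total.
print(may_be_read_more_than_written(20, 1))      # False

# A long-lived production service: 200 hours to write,
# executed for thousands of hours.
print(may_be_read_more_than_written(200, 5000))  # True
```

The sketch only tests a necessary condition, not a sufficient one: a program may run for years and still never be read, but under this heuristic a rarely executed program cannot satisfy t(r) > t(w).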
I am not sure whether “order of magnitude” is the right quantification for this observation, but it is obvious that only useful programs are executed extensively. A computer program’s practical utility directly affects how often it is executed. Furthermore, the more a program is executed, the more likely it is that its users will identify the need to change it by fixing bugs, adding new features, and modifying existing features. In other words, using a program engenders the need to maintain it. To maintain a program, the programmer first must read and understand its source code. Therefore, there is a direct dependency between the total time a program is executed (i.e., used), read, and written.
In summary, only useful programs are read more than they are written, and the total amount of time spent on executing a program directly affects how often the program’s source code is read and written. I am not sure though whether this observation has any bearing on programming language design and software abstraction.