I will never forget the time I discovered how web pages were made. The year was 1995. I was using Windows 3.1 like most everyone, and it didn’t even support the TCP/IP network stack out of the box. I had to install a program called Trumpet Winsock and a hot new web browser called Netscape to connect to the Internet.


Netscape Navigator 1.2 on Windows 3.1 - Source: Wikipedia

In Netscape, there was an option called “Source…” under the View menu, and when I clicked it, I was half expecting to see some sort of cryptic binary file format that would require special software to generate. What I got instead was the actual text of the web page I was viewing, with a bunch of HTML tags thrown in for styling. That was a revelation – with a simple text editor, anyone could create a web page!

I didn’t know it at the time, but using text streams to exchange data between programs is part of the Unix philosophy. Had Microsoft or some other company managed to invent the web instead of Tim Berners-Lee, I bet they would have opted for a proprietary binary format for web page delivery, as was customary in the non-Unix world.

In those early days, surfing the web was an amazing experience for me. I had, for the first time in my life, instant access to people anywhere in the world. There was little content back then of course, and connection speeds were very slow. “Image-heavy” web pages could easily take several minutes to download on my 0.014 megabit landline phone/Internet connection.

These days, we have incomparably fast Internet speeds – multi-megabyte web pages download in seconds, and a lot of people watch TV online thanks to the wide availability of on-demand TV series and movies in high resolution. To top it off, all these feats are possible with no wires in sight. How far we’ve come, and yet one thing hasn’t changed – web pages are still delivered in plain text.

Text is the universal interface all humans can understand. You can use literally thousands of different programs to view and process text files. Binary files, on the other hand, require specialized programs that greatly reduce their utility. Granted, not everything can be a text file. Images and sound recordings are better stored in binary, but there’s usually no good reason to use proprietary binary file formats when a simple text file is enough, especially when large text files can easily be compressed on the fly.
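To make the compression point concrete, here is a minimal sketch, assuming `gzip` and `mktemp` are available (they are on most Linux and macOS systems). The sample text and the variable names are made up for illustration. It compresses a repetitive text file, compares the sizes, and then shows that ordinary text tools keep working when the data is decompressed on the fly:

```shell
#!/bin/sh
# Build a repetitive text file (illustrative sample data, 1000 identical lines).
tmp=$(mktemp)
i=0
while [ "$i" -lt 1000 ]; do
    echo "<p>Hello, plain text web!</p>"
    i=$((i + 1))
done > "$tmp"

# Compare the raw size with the gzip-compressed size.
# tr strips the padding that some wc implementations add.
raw=$(wc -c < "$tmp" | tr -d ' ')
packed=$(gzip -c "$tmp" | wc -c | tr -d ' ')
echo "raw: $raw bytes, gzipped: $packed bytes"

# Compress and decompress in a stream; grep still sees plain text.
matches=$(gzip -c "$tmp" | gzip -dc | grep -c 'Hello')
echo "lines matching after round trip: $matches"

rm -f "$tmp"
```

Real web pages are less repetitive than this sample, so the ratio you see here is better than what you should expect in practice.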

Many software developers fall into the trap of writing a program for every problem they encounter, even when there are better alternatives to achieve the same outcome. Modern operating systems based on Unix, such as Linux and macOS, come with a large number of built-in command-line interface (CLI) programs that can be chained together to solve seemingly complex tasks with ease, often with nothing more than one-liners. But first, let’s hear what Master Foo has to say about the subject :)

Master Foo and the Ten Thousand Lines

Master Foo once said to a visiting programmer: “There is more Unix-nature in one line of shell script than there is in ten thousand lines of C”.

The programmer, who was very proud of his mastery of C, said: “How can this be? C is the language in which the very kernel of Unix is implemented!”

Master Foo replied: “That is so. Nevertheless, there is more Unix-nature in one line of shell script than there is in ten thousand lines of C”.

The programmer grew distressed. “But through the C language we experience the enlightenment of the Patriarch Ritchie! We become as one with the operating system and the machine, reaping matchless performance!”

Master Foo replied: “All that you say is true. But there is still more Unix-nature in one line of shell script than there is in ten thousand lines of C”.

The programmer scoffed at Master Foo and rose to depart. But Master Foo nodded to his student Nubi, who wrote a line of shell script on a nearby whiteboard, and said: “Master programmer, consider this pipeline. Implemented in pure C, would it not span ten thousand lines?”

The programmer muttered through his beard, contemplating what Nubi had written. Finally he agreed that it was so.

“And how many hours would you require to implement and debug that C program?” asked Nubi.

“Many”, admitted the visiting programmer. “But only a fool would spend the time to do that when so many more worthy tasks await him”.

“And who better understands the Unix-nature?” Master Foo asked. “Is it he who writes the ten thousand lines, or he who, perceiving the emptiness of the task, gains merit by not coding?”

Upon hearing this, the programmer was enlightened.

I don’t know which script Nubi wrote on the whiteboard, but the following one-liner, when run on Linux or macOS, finds the 5 most frequently used words in a file (in this case LICENSE) and prints them, sorted by frequency, along with their counts:

cat LICENSE | grep -oE "\w+" | awk '{print tolower($0)}' | sort | uniq -c | sort -rn | head -n5
  14 the
   9 software
   9 or
   8 to
   8 of

Some of you might recognize the script I wrote above as a somewhat modernized version of Doug McIlroy’s shell pipeline, which was itself a response to Donald Knuth’s 10+ page Pascal program. I’m certainly no Donald Knuth, but I think it’s fair to say that even the most brilliant minds sometimes make things more complicated than they need to be.
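To see what each stage contributes, here is the same word-frequency pipeline written as a small function with one stage per line. This is only a sketch: the function name `top_words` and the sample sentence are my own inventions for illustration, and I spell the word pattern as `[[:alnum:]_]+`, which I am assuming behaves like `\w+` for plain English text.

```shell
#!/bin/sh
# top_words N: read text on stdin, print the N most frequent words.
# (top_words is a hypothetical helper name used only for this example.)
top_words() {
    grep -oE '[[:alnum:]_]+' |            # emit one word per line
        awk '{ print tolower($0) }' |     # fold case so "The" and "the" match
        sort |                            # bring identical words together
        uniq -c |                         # collapse each run into "count word"
        sort -rn |                        # highest count first
        head -n "${1:-5}"                 # keep the top N (default 5)
}

# Illustrative input: "the" appears three times, every other word once.
printf 'The cat and the dog.\nThe end.\n' | top_words 1
```

Each stage only knows about lines of text, which is why you can remove, reorder, or replace any one of them without touching the rest.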

The power of shell scripts is a direct manifestation of the Unix Philosophy, as succinctly summarized by Doug McIlroy, the inventor of Unix pipes:

  • Write programs that do one thing and do it well.
  • Write programs to work together.
  • Write programs to handle text streams, because that is a universal interface.
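A small sketch of what these three rules look like in practice. The function names (`words`, `lowercase`, `distinct`) are hypothetical and exist only for this example. Each one does a single job and speaks plain text on standard input and output, so the same parts can be recombined to answer different questions:

```shell
#!/bin/sh
# Three tiny filters; each does one thing and reads/writes plain text.
words()     { tr -cs '[:alnum:]' '\n'; }        # split input into one word per line
lowercase() { tr '[:upper:]' '[:lower:]'; }     # fold everything to lower case
distinct()  { sort -u; }                        # drop duplicate lines

# Composition 1: how many distinct words are there?
printf 'To be or not to be\n' | words | lowercase | distinct | wc -l

# Composition 2: the same parts, reused to find the longest distinct word.
printf 'To be or not to be\n' | words | lowercase | distinct |
    awk '{ print length($0), $0 }' | sort -rn | head -n 1
```

None of the three filters knows about the others, and none of them had to change when the question changed.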

As software developers, what can we learn from the Unix philosophy? Isaac Z. Schlueter, the creator of npm, suggests the following principles in the context of Node.js modules, but I think they are general enough to cover software modules written in other programming languages as well:

  • Write modules that do one thing well. Write a new module rather than complicate an old one.
  • Write modules that encourage composition rather than extension.
  • Write modules that handle data streams, because that is the universal interface.
  • Write modules that are agnostic about the source of their input or the destination of their output.
  • Write modules that solve a problem you know, so you can learn about the ones you don’t.
  • Write modules that are small. Iterate quickly. Refactor ruthlessly. Rewrite bravely.
  • Write modules quickly, to meet your needs, with just a few tests for compliance. Avoid extensive specifications. Add a test for each bug you fix.
  • Write modules for publication, even if you only use them privately. You will appreciate documentation in the future.
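The principle about being agnostic about input and output is easy to demonstrate in the shell, even though the list above was written with Node.js in mind. In this sketch, `strip_comments` is a hypothetical filter I made up for illustration. It reads standard input and writes standard output, and it neither knows nor cares what is on either end:

```shell
#!/bin/sh
# strip_comments: hypothetical filter that drops lines starting with "#"
# (optionally indented). It only touches stdin and stdout.
strip_comments() {
    grep -v '^[[:space:]]*#'
}

# Sample config file, created here so the example is self-contained.
tmp=$(mktemp)
printf '# note\nport=8080\n' > "$tmp"

strip_comments < "$tmp"                                   # source: a file
printf '  # indented\nhost=example\n' | strip_comments    # source: another program
strip_comments < "$tmp" | wc -l                           # sink: another program

rm -f "$tmp"
```

Because the filter makes no assumptions about where its data comes from, the caller decides how to wire it up, and the filter itself never needs a new option for each new use.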

Unix has been around in one form or another for over half a century now, and the philosophy behind it remains as relevant today as when it first emerged, not just for operating systems but for software development in general. While combining different modules in software development isn’t quite as straightforward as snapping Lego bricks together, the Unix philosophy is guiding us towards this goal, one brick at a time.
