September 28, 2023•1,390 words
or, here's why some real living humans enjoy PHP
I've been writing a fair amount of Bash lately, mostly to automate some repetitive manual processes at my day job, and it's got me thinking.
Bash is not a pretty language. In fact, as far as languages go, at least those which are actively used for real work today, it's got to be one of the ugliest. If you don't believe me, here's what a case statement looks like:
case EXPRESSION in
  PATTERN_1)
    STATEMENTS
    ;;
  PATTERN_2)
    STATEMENTS
    ;;
  *)
    DEFAULT STATEMENTS
    ;;
esac
Charming, right? Here's a typical example of an if statement, taken directly from some code I've recently written:
if [ -n "$remote" ] && [[ $(echo "$remote" | wc -l) -eq 1 ]]; then
  merge_branch $remote
fi
If you've never worked with Bash before, could you even begin to comprehend what this code is actually doing? Why are some (but not all!) statements ended by reversing the characters in the keyword that opened them? Why are some clauses surrounded with a single layer of square brackets and others with a double layer?[1] What do the dollar signs actually signify and why are some variable names wrapped in quotes while others are bare?[2]
Anyway, this if statement checks whether the variable $remote has a value assigned, and then checks that the value is a single line long. I know this because I wrote this code. Makes total sense, right? In this case, we expect $remote to be a string, but variables in Bash don't really have types, so at any time they could be empty, or a single one-line string, or a multiline string (which works kind of like a collection of strings that can be iterated through, sometimes, under conditions I cannot describe), or a number, or perhaps a small but irritating gremlin. And the language will happily play along when you try to treat a variable as if it's any one of those things, or even multiple of them at once.
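For the curious, here is a minimal sketch of how that check behaves across the cases just listed. The helper names describe_remote and count_args are made up for illustration and are not part of the original script; the sketch assumes it is run with bash rather than a plain POSIX sh.

```shell
#!/usr/bin/env bash

# Hypothetical helper: classify a value using the same tests as the
# if statement above (empty, exactly one line, or more than one line).
describe_remote() {
  local remote="$1"
  if [ -z "$remote" ]; then
    echo "empty"
  elif [[ $(echo "$remote" | wc -l) -eq 1 ]]; then
    echo "one line"
  else
    echo "multiple lines"
  fi
}

describe_remote ""                    # prints: empty
describe_remote "origin"              # prints: one line
describe_remote $'origin\nupstream'   # prints: multiple lines

# Hypothetical helper: report how many arguments it received.
count_args() { echo "$#"; }

remotes=$'origin\nupstream'
count_args $remotes     # prints: 2 (unquoted, so the value is split on whitespace)
count_args "$remotes"   # prints: 1 (quoted, so the value stays one string)
```

The last two calls are probably the "collection of strings, sometimes" behaviour in action: an unquoted expansion is split into separate words on whitespace, while a quoted one is passed along as a single string. On the bracket question, [ is the POSIX test command and [[ is a Bash keyword that does not word-split the expansions inside it.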
All of which leads to an obvious question: why would anyone use this language on purpose? I have asked myself this on multiple occasions! And I think I've figured out something like an answer.
There are two meaningful dimensions of the quality of any language: Elegance and Utility. Elegance is the obvious one that Real Programmers tend to care about -- it's easy to look at a language and see that it's elegant, and working with the language feels pleasant and refreshing. Keywords have obvious names, the syntax is useful and straightforward, and it feels natural to implement supported patterns like inheritance or typechecking or abstraction or iteration. Working with one part of the language gives you a good grasp on how the rest of it works, because it's designed in a consistent way.
By contrast, utility is harder to spot immediately. In rough terms, it describes the ability of a language to natively solve problems in the domain it's commonly used in, using available tooling. Any time you can call a library function to handle a specific task instead of having to write something yourself, that's a language demonstrating utility to you. Languages with a lot of utility tend to have extensive and time-tested standard libraries, or some other system which makes it as easy as possible to solve common tasks in the core domain.
The best languages, of course, have plenty of both. But past a certain point, elegance and utility can conflict -- the need to keep things consistent and simple can make certain tasks harder to express quickly in the language, while obviously filling the global library with a bunch of hacks to solve every domain problem you can think of is a great way to ruin elegance. In fact, elegance and utility lie along a Pareto Frontier, roughly speaking, where it's impossible to maximize one quality past a certain point without reducing the other.
Of course, there's nothing that says a language must have either of these qualities, and there are real languages that exist with neither. But generally speaking, there's no reason for anyone to use a language like that in the presence of alternatives, and so they mostly don't get used in the wild.
So if we imagine active real-world languages, they'll tend to lie along a spectrum between "elegant but not useful" and "useful but not elegant", with most ending up somewhere in the middle. I think most functional programming languages lie towards the elegant end: beautiful and well worth learning, but it turns out that typical applications force us to handle things like side effects and mutable data.
But out on the other end of the spectrum are languages with a great deal of utility, but no elegance at all. And this is where Bash comes in; it's a syntactically atrocious glue language for hacking together commands on the *Nix command line, and those commands are some of the handiest utility programs in the world. Seriously, in isolation, programs like grep and curl and sed are amazing in terms of power and flexibility (though of course, the syntax to call them always ends up being some form of a mystical incantation known only to beardy initiates). This gives Bash a "standard library" of great depth and effectiveness.
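As a rough illustration of the glue role, here is a small sketch that chains a few of those utilities together. The log format, the sample data, and the function names sample_log and status_counts are all invented for the example; awk is pulled in alongside grep and sed purely for the final formatting step.

```shell
#!/usr/bin/env bash

# Hypothetical sample data; the log format is made up for illustration.
sample_log() {
  cat <<'EOF'
GET /index.html status=200
GET /missing status=404
POST /login status=200
GET /favicon.ico status=200
GET /about status=200
EOF
}

# Count GET requests per status code, most frequent first.
status_counts() {
  grep '^GET ' \
    | sed 's/.*status=//' \
    | sort \
    | uniq -c \
    | sort -rn \
    | awk '{ print $2 ": " $1 }'
}

sample_log | status_counts
# prints:
# 200: 3
# 404: 1
```

None of the individual steps is doing much; Bash contributes only the pipes, and each small program handles one piece of the job.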
Other languages towards this end of the spectrum include PHP. It was one of the first languages created specifically for server-side web programming, it was initially quite straightforward to hack into the static HTML files people were serving up through physical file paths in the early days of the web, and as an anarchic open-source language it rapidly grew a flourishing ecosystem of specific hacks meant to address common use cases that web programmers encountered.[3] From what I understand, COBOL is also in this category; nobody likes the syntax, but it was designed from the ground up as a business language meant to define and model financial transactions, and it's actually really good at doing exactly that.
I think this also explains why a certain type of person gets attached to ugly utility languages; some people prioritize "ability to solve problems quickly" over "ability to solve problems Philosophically Correctly" and that mindset lends itself to these languages. And of course, once you get to know the warts they stop being such a source of hostility and can start being a source of pride: "yeah I know people complain about case statements in Bash but they're not that hard once you've memorized the syntax like me, a real wizard!" And of course, there's no shortage of the opposite perspective floating around; people who prioritize Philosophical Correctness over problem solving to the point where if a solution isn't sufficiently elegant, we might as well not deploy it at all. I've worked with a couple of those people and it is not a fun experience.
Anyway, if there's any broader conclusion to be drawn here, it's that software design is a complicated place full of hard trade-offs between desirable qualities, and we should maybe try a little harder to see how systems which are conspicuously lacking in certain qualities make up for it in others. But also, I'm looking forward to not having to write any more Bash for a while.
[1] I don't actually know the answer to this question. I tried to figure it out once, but every answer I found online made a slight trickle of fluid leak out my ears while the edges of my vision distorted and my nostrils filled with the faint but unmistakable scent of brimstone. So, like a Real Programmer, I gave up and now I just paste in the bracketing that I found on whichever code example I'm currently working off.
[2] See above.
[3] I love a good rant about PHP as much as anyone, but this one is my favorite, partially because of how comprehensive it is, and partially because it works beautifully as a sort of unintentional meditation on the "right way" to design a programming language by focusing on PHP's flaws. If you read beneath the surface, it also gives a surprising level of insight as to how and why the PHP ecosystem developed through the actions of thousands of problem-focused, context-blind developers adding in a little piece at a time.