Hating the web, now on the web

Posted by David on May 21st, 2004

I hate the web. Penny Arcade’s Tycho once said of web forums that they “appear to incubate and empower a twisted menagerie of fucking assholes,” but this epithet is not limited to forums. The web breeds assholes. The freedom and the anonymity serve as the means and opportunity, but there is some other quality to the web that drives people to a life of assholery. I don’t know what this je-ne-sais-something is, and it’s not limited to the web (mailing lists and Usenet were crowded with people with something to prove long before anyone considered hypertext feasible), but the combination of these factors and the web’s ease of entry has left the world-wide web a polluted wasteland of anger and theft.

Besides providing the temptations that tug at the restraints of etiquette and decency, the web fosters a culture of poor programming and half-assed hacks. Tim Berners-Lee himself was a physicist, not a computer scientist, and as a result the first iterations of HTML were simply unparseable. Problems were fixed in later revisions, but thanks to the startling popularity of his invention, in addition to the general sloppiness of web authors, every modern web browser is by necessity a twisted conglomeration of heuristics and rules external to any standard. A basic understanding of parsing, or even typography, could have prevented this disaster. Even the W3C, the self-proclaimed web standards committee, is known for releasing standards that are self-contradictory or computationally impossible.

Where the standards do not cause sufficient pain and confusion, the programs that implement them are more than ready to take up the banner. Take Apache, for instance. The Apache httpd is widely accepted as the most robust among the various web servers, and it sucks. It’s confusing at best, and configuration seems downright backwards at times. The config file sort of looks like HTML, in that it uses angle brackets with things inside them, but in reality all of those brackets are only creating scopes that other programs accomplish with a single curly brace. Every interesting configuration directive resides within some module, and every module brings in its own tiny language, requiring that cryptic strings of punctuation be placed in the proper positions in order to accomplish something seemingly unrelated. Nothing is where you would expect, arguments to half of the options seem backwards, and God save you if you forget to misspell “Referrer”. For most, the entrance cost to publish on the web is minimal, but for those with the do-it-yourself spirit, hoping to set up a server to match their desires for web page hosting, publishing on the web is a nightmare.
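To make the misspelling concrete: the HTTP request header is spelled `Referer`, with a single "r" in the middle, so asking for the correctly spelled `Referrer` finds nothing. The following is only a minimal Python sketch of that lookup. The function name `referring_page` and the sample header values are made up for illustration, and it uses the standard library's `email.message.Message` as a stand-in for whatever header container a real server would hand you.

```python
from email.message import Message


def referring_page(headers):
    """Return the value of the Referer request header, or None if absent."""
    # HTTP spells this header "Referer". Message.get() matches header
    # names case-insensitively, but it will not fix the spelling for you.
    return headers.get("Referer")


# Hypothetical request headers, built by hand for illustration.
request_headers = Message()
request_headers["Host"] = "example.org"
request_headers["Referer"] = "http://example.com/some/page.html"

print(referring_page(request_headers))   # the misspelled name is found
print(request_headers.get("Referrer"))   # the correct spelling is not
```

The same trap exists in server configuration: any rule keyed on the header has to use the single-r spelling to match anything.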

The complexity of httpd is due mostly to the complexity of HTTP, but there are more than enough things to hate where the fault lies entirely in the things themselves. PHP, originally the Personal Home Page Tools and later rebranded with an obviously forced recursive acronym, had an opportunity to create something amazing. The idea of creating an HTML page that was both preprocessed /and/ dynamic was not yet widespread, and PHP was able to do it quickly and cleanly. So what did they do with the language? Nothing. There was never a language design. New features were bolted onto the side; new functions were all placed in the single global namespace; libraries, where they had an opportunity to create a new representation of the data and actions, suitable for a high level language, instead created simple wrappers around C functions. So PHP does its job, but it does it poorly. It had an opportunity to shine, to rise above the restraints of the web, but instead it took the ideas of a score of languages and copied their mistakes.

To get back to my original point, I’ve been struggling recently with the assholes on the web. I’ve mentioned before that I’ve had trouble with people using the w00t.jpg image in their own pages, but wasting my bandwidth instead of their own. In my efforts not to be merely a repository for that one image, I’ve been struggling with the problems of the web and its programs, and I’ve had enough. Not everything belongs on the web. There has been a disturbing trend in recent years to replace all programs with a web browser. I hate it. If I want to use a different editor, I have to launch it separately and carefully copy and paste my text into the browser. If I want to use a different means of searching a page, such as using regular expressions instead of merely substrings, I have to save the page and search with an external program. Even this isn’t perfect, since the external program is given only the HTML source instead of the parse tree held by the browser. If I want to do anything interesting, a complex ordeal of interaction between a web browser and another program becomes necessary. All of this because someone thought it would be easier to do something as a web page.
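The gap between the HTML source and what the browser actually shows is easy to demonstrate. Below is a rough Python sketch, not a substitute for a browser's parse tree: it uses the standard library's `html.parser` to strip markup and then runs a regular expression over the remaining text. The class `TextExtractor`, the function `search_page_text`, and the sample markup are all invented for illustration.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect character data, skipping the contents of script and style."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def search_page_text(html, pattern):
    """Return all regex matches in the page's text with the tags removed."""
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    return re.findall(pattern, "".join(extractor.chunks))


# Hypothetical page fragment: the first date is split by a <b> tag.
sample = "<p>Posted on May 21st, <b>2004</b>; updated June 3rd, 2004.</p>"
date_pattern = r"\w+ \d+(?:st|nd|rd|th), \d{4}"

print(re.findall(date_pattern, sample))        # raw source: finds one date
print(search_page_text(sample, date_pattern))  # stripped text: finds both
```

Searching the raw source misses the first date because the markup interrupts it. Even the stripped text is only an approximation, since it knows nothing about layout, hidden elements, or anything a script changed after the page loaded.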

Dynamic content doesn’t make sense on the web. Web pages are inherently static, and the best that can be achieved is to generate a different static page for different situations. Conversations have no place on the web, making the task of web forums an eternal battle between reality and the nature of their chosen medium. Other aspects of computing and the Internet do not belong on the web simply because these problems have already been solved elsewhere. Email should not be on the web. Newsgroups and forums should not be on the web. And from this day forward, my public journal will not be on the web.