Wednesday, May 09, 2007

Why not Excel?

I read Rory Blyth's blog post "Excel as a Database" (which is now available here) not on the web but in old-fashioned hardback form in Joel Spolsky's collection The Best Software Writing I. I flipped to it right off after seeing the title in the table of contents because I thought, "Ah! I bet this guy has written about a topic that I've often thought about . . . why people use Excel as a database." But, unfortunately, it took exactly the opposite direction I expected and was in fact a rather mean-spirited lampooning of people who build complicated collections of data in Excel. The tenor of the piece is essentially, "anyone who creates a database in Excel is an idiot, and they would only use a spreadsheet like this because they are too stupid to understand a thing about computers." I expected a thoughtful piece on why people find it so useful to treat Excel like a database; instead, I got techy bile.

But instead of sneering at semi-technical users who create convoluted, difficult spreadsheets, perhaps we should ask what lessons we can learn from Excel to understand why it is such a pervasive and popular tool. Saying "because business and marketing people are a bunch of morons" is cathartic but not very helpful.

I was also disappointed in John Gruber's post "The Location Field is the New Command Line" not because I disagree with his points nor because I think it's a bad post. I don't and it isn't. It's just not what I expected from reading the title. I expected a meditation on the Location field of a web browser itself, while Gruber actually writes about web apps vs. desktop apps.

Now, as a little background: I approach this subject as a database guy, having served many years as a hardcore database developer, DBA, and BI developer. I've designed the data models for complex enterprise applications, written more lines of stored procedures than I care to remember, built large data warehouses, and created complex frameworks for data analysis. For much of my career I was in the business of taking "spreadmarts" created by legions of database-challenged but Excel-happy business people and transforming them into structured databases. Many times I've been handed Rory's proverbial crammed-full Excel file and been told "do something with it." So, by all means I should despise Excel.

But I don't. I think it's one of the most wonderful applications ever written.

Also, over the past few years I've been involved in migrating enterprise applications to the Web and, in fact, now work for a 100% browser-based software-as-a-service company. So, I appreciate the many complexities of the web model vs. the desktop model. John Gruber's title, however, hints at another interesting question that his article doesn't address: why has the URL/Location bar persisted across all the evolving variations of web browsers and web applications?

And what, you may ask, do Excel and a web browser's Location field have to do with each other?

Quite a lot, actually. They are both very effective and very under-appreciated modes of user interface.

Let's start with Excel. Why is it so popular? Well, in part because you don't have to know how to create complex databases and write complex SQL queries in order to create a basic database. You don't have to write code and compile it. You don't have to build out all the frameworks of a database application in order to get a job done. Look at it from the marketing analyst's side of things. Let's call him Timmy.

Timmy has a job to do. The boss has given Timmy an assignment: I need to know how many of our existing customers have purchased Product A, how many have purchased Product B, and who the best candidates are for cross-selling based on company revenue and budget.

Timmy fires up Excel and gets the job done in an afternoon. Why does it work so well? A few possibilities:

The UI and the business logic and the data layers are not (contrary to good n-tier design principles) separated at all. Not a good model for building a complex Enterprise app, but for Timmy's 4-hour job it's just the ticket. He can click on a cell and immediately see the formula that produced it. There's no debugger required (Excel does have a built-in formula debugger, but I doubt many people actually use it--there's no need).

He also doesn't have to compile, deploy, and run. He sees the application right away. It's also remarkably easy to replicate data. How do you copy one table to another in a database (INSERT INTO T1(F1,F2) SELECT . . .)? How do you do it in Excel? Drag mouse, CTRL-C, click, CTRL-V. And you can instantly see the result without having to run a query.
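For comparison, here's roughly what the database side of that copy looks like. This is a minimal sketch using Python's built-in sqlite3 module; the table and column names are invented for illustration:

```python
import sqlite3

# In-memory database with a source table and an empty destination table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE customers (name TEXT, product TEXT)")
con.execute("CREATE TABLE customers_copy (name TEXT, product TEXT)")
con.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Acme", "Product A"), ("Globex", "Product B")],
)

# The database equivalent of drag mouse, CTRL-C, click, CTRL-V:
con.execute(
    "INSERT INTO customers_copy (name, product) "
    "SELECT name, product FROM customers"
)

rows = con.execute(
    "SELECT name, product FROM customers_copy ORDER BY name"
).fetchall()
print(rows)
```

And note that even this tidy version hides the part Timmy would find hardest: you have to know the destination table's schema before you can copy anything into it.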

And for old Timmy, data entry is a breeze. How many hours have you spent as a programmer creating forms and grids with the sole purpose of allowing data to get easily into a database? Sometimes I think I've spent the better part of my life wrestling with data grid controls and trying to do basic things like putting drop-down combo boxes in cells to allow list-based data entry. Or, when things got a little better, wrestling with complex 3rd party data grids that had all that stuff built in if you could just figure out the object model enough to make it work.

Excel is hardly perfect, and too many people don't take advantage of some of the built-in data-validation features, but, boy, is it fast to get data in. It even has that magic auto-complete feature that will guess cell content based on previous keystrokes. And you can copy and paste entire lines.

A gentle reminder to my fellow database developers: there are a lot of smart people out there who don't know anything about database development. And they don't need to. They have other things to focus their energies on.

On the subject of the location bar: how many times do you find yourself cutting and pasting a URL into the Location bar? I do it all the time. And I love it.

If you click a link from another site and land on a page eighty layers deep into a new website and want to get back to the top (maybe you follow a link to an old blog post, like it, and want to read the blogger's most recent posts), you can try to follow links on the page and navigate back using the web designer's predefined routes. Or, you can seize control and whack off all but the first part of the URL in the Location field and, voila!, back to the top-level home page.
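That "whack off all but the first part" trick is really just trimming a URL down to its scheme and host. A quick sketch of the same move using Python's standard urllib.parse (the example URL is made up):

```python
from urllib.parse import urlsplit, urlunsplit

def home_page(url: str) -> str:
    """Trim a deep URL back to the site's top-level page."""
    parts = urlsplit(url)
    # Keep only scheme and host; drop path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/", "", ""))

result = home_page("https://example.com/2007/05/09/why-not-excel.html")
print(result)  # https://example.com/
```

The point isn't that anyone would script this; it's that the operation is so simple and regular that users can do it by hand, in their heads, in the Location field.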

You can also guess URLs. I want to go to Old Navy's website and look for clothes. Hmm, let me guess: "OldNavy.com"? Yep. That's it. How easy was that? No search engine, no directory, no looking anything up or barely even thinking about it. I've got Google's toolbar installed in my browser, but I rarely use it. And how about this: I don't have to know whether I want to see a web page or view a Word doc on the file system or launch an application. I just type or paste in a URL and, "bing", the browser magically shifts to display the resource I want.

What's the common thread here? Unstructured, ad-hoc, immediate, easy to copy and paste. Control in the hands of the user and not the application designer. It flies in the face of everything an enterprise software developer would recommend, but in terms of serving the immediate needs of end users it's ideal.

There are a few lessons here somewhere, but I'm not exactly sure what they are. One is that if you are designing Enterprise software, do everything you can to support ad-hoc data input and retrieval modes--including exporting to and importing from Excel. Yes, that's right: enable the dumping and retrieving of data from Excel. In fact, with clever use of VBA and hidden properties, I bet there are some really powerful user interface options you could work out that leverage Excel as the primary user interface to an application.
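One cheap way to honor that advice is to make every grid in your app round-trippable through a file Excel opens natively. A minimal sketch with Python's standard csv module, using an in-memory buffer and invented field names:

```python
import csv
import io

# Data as it might come out of the application's database.
customers = [
    {"company": "Acme", "product": "Product A", "revenue": "125000"},
    {"company": "Globex", "product": "Product B", "revenue": "98000"},
]

# Export: write CSV, which Excel opens with a double-click.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["company", "product", "revenue"])
writer.writeheader()
writer.writerows(customers)

# Import: read back whatever the user edited in Excel.
buf.seek(0)
edited = list(csv.DictReader(buf))
print(edited[0]["company"])
```

CSV loses formulas and formatting, of course; the point is only that a few lines of export code buy your users the whole ad-hoc power of the spreadsheet.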

And, most of all, don't restrict the power of the user to efficiently adapt and interact with your application in ways you never expected they would. That is the real holy grail of application development.

Tuesday, May 08, 2007

Great User Interface Design

Not too long ago the snack machine at my company was replaced by a new model. No big deal. It still had the same basic assortment of chips, candy bars, and peanut butter crackers. But, there was one big difference: the numbering for the items had changed.

In the old machine, everything had a two-digit number kept in simple sequence: 23 for Doritos, 24 for the Snickers bar next door. The new machine had a different number scheme, and a three-digit one to boot. Why on earth start numbering the snack cells at 100? What a waste of finger power.

Ah . . . but not so fast. With the old machines, every now and again I would look to buy the crackers in slot 19, see they were priced at 55 cents, and type 55 on the keypad (substituting the price for the slot number). And instead, item #55--Pringles, yuck!--would come spinning out of the machine.

Not possible with the new machine. Every item number began with 1 and no price did, so the moment you typed the first digit of a price instead of a slot number, you got an error from the little screen.

Ingenious! And good user interface design, too, because it anticipates the user's mistakes and, rather than saying in the classic socially-maladjusted programmer way, "too bad, you screwed up, live with the Pringles", it says, "Hey, buddy--are you sure about that?" and allows you to correct the error of your ways.
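The machine's trick generalizes nicely: reject a keystroke the instant it can no longer be the prefix of any valid code. A toy sketch of that idea (the slot codes are invented):

```python
# Hypothetical slot codes -- note they all begin with 1,
# while prices (55, 75, ...) do not.
VALID_ITEMS = {"101", "102", "119", "123"}

def accepts_keystroke(typed_so_far: str, key: str) -> bool:
    """Allow a key only if some valid item code starts with the new prefix."""
    prefix = typed_so_far + key
    return any(code.startswith(prefix) for code in VALID_ITEMS)

print(accepts_keystroke("", "1"))  # True: an item code can start this way
print(accepts_keystroke("", "5"))  # False: a price, rejected on the first key
```

The clever part is in the numbering, not the code: by choosing codes whose prefixes can't collide with prices, the designers made the user's most likely mistake detectable on the very first keystroke.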

And that's good interface design.

Wednesday, April 18, 2007

Higher Order Endeavors

I started out working in the computer industry as a contractor at NCR, which was going through an interesting corporate shift. As its full name, National Cash Register, illustrates, NCR began life as a hardware vendor in the pre-computer days. I began working for them in their Columbia, SC, location, which at the time was still largely a hardware manufacturing facility, producing midrange servers and PCs.

NCR started by making the hardware, then at some point moved into making operating systems for the hardware (including their own flavor of Unix), then acquired Teradata (which made a high-end database management system) and moved up the stack a little more. From the box to the OS and drivers to core database software.

In the process, NCR was discarding the lower links of the chain as they became commodities. In the three years I worked at the Columbia facility, NCR sold the manufacturing line to Solectron and OEM'ed the servers and PCs back from them. Not long after, it began selling these OEM'ed servers with Sun's Solaris installed on them, easing out of the Unix business. By the time of my last NCR project, the company had turned the Columbia facility into an enterprise software development shop, creating large-scale BI and CRM applications running atop the Teradata database.

Why? To move higher up the food chain.

How many software developers these days learn anything about the OSI Reference Model? Few, thank goodness.

How many software companies write their own proprietary database engines? Few, and the ones who do are usually in a very specialized line of high-performance software.

And why the hell should they need to bother with any of these things, any more than a master chef needs to waste his time making his own knives or hand-hammering fry pans? The goal in software development is to spend less time on the foundation and plumbing and more time on producing useful applications.

Unless, of course, the stuff available on the open market isn't up to your high standards and highly specialized needs. Few chefs I know of weave their own cloth for towels or sew their own napkins (the linen service does a fine job with that), but plenty do grow their own herbs to guarantee they always have a fresh, high-quality supply on hand.

And that's the mantra I keep repeating to myself when I find myself tempted to build a custom database application from scratch rather than buy something off the shelf, or decide to write my own socket layer or object model rather than building atop someone else's frameworks: Higher Order Endeavors.

Yikes!

A comment and link from Matt Dickman on Rajesh Setty's Life Beyond Code blog is illustrative of something, but I'm not exactly sure what. The comment is as follows:

This graphic (which I came across just yesterday) helps illustrate this point [about competition among open source vendors] very well. Just look at the number of splits in Linux alone. Each slightly different and improving on one another. It's fascinating to me how quickly these iterations happen to deliver a new product to consumers.

While I agree with the general sense of "isn't this graphic amazing?", Dickman seems to view it in a much more positive light than I do. Does it not strike anyone else that the graphic illustrates something just a little bit dysfunctional and overly complex?