Tommy’s Blog

Photography, technology, and a little bit more from Tommy Williams

Archive for the 'Technology' Category


Reminding me that Vista is exciting

3rd February 2007

Lots of news this week about the disappointment of Vista. Yes, the security is leaps and bounds better than XP (and that’s enough reason by itself that we should upgrade, but then we should eat more vegetables, we should exercise more, we should stay at our desks when the email goes around about free doughnuts in the break room). And the driver model is a lot more stable, but it’s also new and so Vista generally doesn’t match XP for games performance right now, if you can even get the drivers for your video card. The rest of it? Well, most people think it’s just some fancy graphics.

But hold on.

I’ve been running Vista for a few months now and have gotten used to it. So when all this press started coming out about, well, the mere competence of Vista I didn’t disagree that much.

Until last night, when I installed it on Dawn’s computer and I saw it through her eyes.

The OS does look a lot nicer than XP. SuperFetch makes it faster to use: Dawn had been using IE7, Office 2007, and MSN Messenger just a couple of hours previously on XP and, without my prompting, talked about how Word and Excel seemed to launch more quickly and that everything just seemed snappier. She can actually put her one-month-old computer with two monitors to sleep and have it wake up properly. The system-wide searching, and the ability to search the Start menu, are going to transform the way she works with her computer. Seriously. We finally have reliable backup. And she loves the convenience and extra protection of Shadow Copies. She’s not yet excited about all the metadata that Explorer supports on files, but she was impressed when I showed her how to apply keywords to photos and then how to find that photo with a simple search from the Start menu. I’m not going to list all the new features but I was surprised to see the operating system through fresh eyes and realize that Vista is not just an incremental improvement over XP. It’s a serious change in the usability of Windows.

Maybe we need to take Chris Pirillo up on his offer.


Posted in Technology | 2 Comments »

It’s hard to have scalable, reliable databases on the Web

21st February 2006

Putting databases on the Web is hard. It’s hard if you do it right, anyway.

Dare Obasanjo collects a set of posts about the challenges of using databases on the Web. These are written for people who have already tried to build systems for millions of users. Let me explain the problems for those of you who haven’t done so but, for some odd reason, are still interested in learning about it.

The people designing and supporting online systems hear this all the time from the people running the company: we want no down time — keep this thing available 24 hours a day, every day, now and evermore.*

There are basically two different ways of sharing the data in a database on the Web: the users of the Web site can read the data and query it, but they can’t change it (this is read-only); or they not only read the data, but they make changes to it as well (this is read-write).

Let’s start with a read-only system. Some examples? Imagine a “knowledge base” for a product, where customers look for answers to problems. They’re not changing the answers, they’re just finding them and reading them. Or even a search engine like Google. Sure, there are the “robots” that grab the data from all the pages on the Web and need to write it into the search engine database, but for all the people who use the search engine, it’s a read-only kind of deal.

The simplest setup (though not a very reliable one) is to have one database server for the Web site. It’s easy to set up, but there are all kinds of limitations. And I’m not even going to talk about scale here — a system needs to be able to scale when the number of users goes from a few hundred to several thousand, or from several thousand to several million. So what are some of the limitations of a single server? What happens when a hard drive fails? Or when the server needs to be taken offline to apply a security patch? What happens when the company needs to update its knowledge base — how does it apply the updates without breaking the content for its users while the updates are being loaded?**

There’s a pretty easy solution here: set up multiple databases that all have the same information, and put them behind a load balancer. This is hardware (or software in some cases) that distributes incoming requests to servers in a group. If one machine fails or needs to be taken offline, the load balancer just stops sending requests to it and the rest of the machines carry the load.
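Here's a minimal sketch of that idea in Python. It assumes a simple round-robin policy, and the `LoadBalancer` class and the server names are made up for illustration; a real load balancer would also run its own health checks rather than wait to be told a machine is down.

```python
from itertools import count


class LoadBalancer:
    """Round-robin over identical read-only replicas, skipping any marked down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._counter = count()

    def mark_down(self, server):
        # Called when a machine fails or is taken offline for patching.
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        """Return the next healthy server in the rotation."""
        if not self.healthy:
            raise RuntimeError("no healthy database servers")
        # At least one server is healthy, so one full pass will find it.
        for _ in range(len(self.servers)):
            server = self.servers[next(self._counter) % len(self.servers)]
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy database servers")


lb = LoadBalancer(["db1", "db2", "db3"])  # hypothetical server names
first_round = [lb.pick() for _ in range(3)]
lb.mark_down("db2")                       # db2 goes offline
after_failure = [lb.pick() for _ in range(4)]
```

Because every server holds the same data, it doesn't matter which one `pick()` returns, and that's exactly why this works for read-only systems.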

This design can provide a lot of resilience for read-only databases. In fact, with some fancy network setups, it’s possible for these servers to be distributed all around the world. If one datacenter goes offline thanks to a backhoe through a fiber optic cable, the requests are routed to other databases in another part of the world. The customers don’t notice.

Read-only databases are easy. What about read-write? What does a place like Amazon do? They don’t just let you browse their inventory, they let you buy it. They let you create an account, a profile, a wish list — they let you write to their database.

The design for a read-only database doesn’t work when users need to change the data. For a load-balanced pool of read-only databases, each server has the same content. It doesn’t matter which one serves the request: the result will be the same. But as soon as users can write to the database, throw synchronized data out the window. There are naive solutions to this problem, such as having the clients (the Web servers) duplicate their write requests across each database server. But there’s no way to guarantee that every request will be completed on all the servers. Remember that this is the Web. There may be thousands of users who want to write to the database during the same second. Even if there are only a few (even if there is simply more than one user), there’s a risk that data would get out of sync or would even be in conflict when users make changes to the same data, but on different servers.
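To make the failure concrete, here's a toy sketch of the naive "write to every server" approach. The `Replica` class, the `naive_write_all` function, and the keys are all hypothetical, and plain dictionaries stand in for the real databases.

```python
class Replica:
    """In-memory stand-in for one database server."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.online = True

    def write(self, key, value):
        if not self.online:
            raise ConnectionError(f"{self.name} is unreachable")
        self.data[key] = value


def naive_write_all(replicas, key, value):
    """Send the same write to every replica; return the names that failed."""
    failed = []
    for replica in replicas:
        try:
            replica.write(key, value)
        except ConnectionError:
            failed.append(replica.name)
    return failed


# Problem 1: one server misses a write and silently falls behind.
a, b = Replica("db-a"), Replica("db-b")
naive_write_all([a, b], "wishlist:42", ["book"])
b.online = False
failed = naive_write_all([a, b], "wishlist:42", ["book", "camera"])
b.online = True
in_sync = a.data == b.data  # False: db-b still has the old wish list

# Problem 2: two Web servers change the same key, and their writes
# reach each replica in a different order.
c, d = Replica("db-c"), Replica("db-d")
c.write("rating:7", 5)
c.write("rating:7", 1)
d.write("rating:7", 1)
d.write("rating:7", 5)
in_conflict = c.data != d.data  # True: each replica kept a different "last" value
```

Nothing in this code can repair either situation after the fact, which is why the real solutions are so much more involved.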

Solutions get a lot tougher here, but one technique separates the read-only parts of the system from those that need to be written to. Imagine an online magazine. Each article has an option for users to rate it — tell the author whether they loved it, hated it, or didn’t care. The articles could be served from a read-only database, while the rating section could be hooked up to a separate database system. This offers some additional flexibility and a chance to clarify the “always on, never down” requirement. Perhaps the articles must always be online, but it’s OK for the rating section of the page to sometimes be offline.
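A rough sketch of the magazine example, assuming the article text comes from a read-only store and the ratings live in a separate, writable one. `ARTICLES`, `RatingStore`, and `render_page` are names I made up for illustration.

```python
# Stand-in for the read-only, load-balanced article database.
ARTICLES = {"article-1": "Full text of the article."}


class RatingStore:
    """Stand-in for the separate read-write rating database."""

    def __init__(self):
        self.online = True
        self._ratings = {}

    def add(self, article_id, score):
        if not self.online:
            raise ConnectionError("rating database is offline")
        self._ratings.setdefault(article_id, []).append(score)

    def average(self, article_id):
        if not self.online:
            raise ConnectionError("rating database is offline")
        scores = self._ratings.get(article_id, [])
        return sum(scores) / len(scores) if scores else None


def render_page(article_id, ratings):
    """Always show the article; show ratings only if that system is up."""
    body = ARTICLES[article_id]
    try:
        avg = ratings.average(article_id)
        rating_line = "No ratings yet" if avg is None else f"Average rating: {avg:.1f}"
    except ConnectionError:
        rating_line = "Ratings are temporarily unavailable"
    return f"{body}\n{rating_line}"
```

The point of the split is in `render_page`: a failure in the writable system degrades one part of the page instead of taking the whole article offline.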

There are ways to make writable databases reliable and highly available (online 24 hours a day) but they’re complicated to deploy and even more complicated to explain, especially if no one out there is interested in hearing me do it. Have a look at this article about SQL Server Clusters (or lack of them) at Microsoft.com. I think Brad LeRoss is the author. For more about the things the Microsoft.com Ops team does, there’s a whole track at Windows Connections 2006 if you’re going.

After reading this, and the other articles, if you still have even the least bit of curiosity about databases and the Web, leave a comment and tell me what you would like to know.

*Well, they hear that request until they tell the management how much it will cost.

**After all, broken content is just as bad to the users as a server that’s down. Too bad so many people who support Web sites don’t see it that way. If the server is up, they reason, things are good. It’s easy to set up a system that simply pings a server to see if it’s alive. It’s much harder to design and maintain a monitor that can tell that the content is valid.
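The difference between the two kinds of monitor can be sketched like this. Here `fetch` is a hypothetical callable that returns the page text or raises `OSError` when the server can't be reached, and the "required phrases" test is only one crude, assumed way to decide that content is valid.

```python
def is_alive(fetch):
    """Liveness check: did the server answer at all?"""
    try:
        fetch()
        return True
    except OSError:
        return False


def content_is_valid(fetch, required_phrases):
    """Content check: did the server answer, and does the page say what it should?"""
    try:
        page = fetch()
    except OSError:
        return False
    return all(phrase in page for phrase in required_phrases)


def broken_server():
    # The server responds, but the knowledge base behind it is broken.
    return "<html>Error: could not load article</html>"


alive = is_alive(broken_server)                                    # True
valid = content_is_valid(broken_server, ["How to reset", "Step 1"])  # False
```

The ping-style check reports that all is well while the users see an error page. The content check catches it, but someone has to keep the list of expected phrases current, and that's where the real maintenance cost lies.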


Posted in Technology | 2 Comments »