The Economist has a special pull-out section on information overload this week. It’s a useful non-technical overview of where things are.
Economist: Information Overload
Written by Dan Rabin on March 3rd, 2010
Thoughts on XML namespaces from James Clark
Written by Dan Rabin on January 2nd, 2010
Influential XML personage James Clark has posted a very carefully thought-out essay on XML namespaces.
Everybody loves to bash XML namespaces, but this essay is the most careful and dispassionate I have seen to date. You should read the whole thing if you deal with XML (or even if you just like to read the product of clear thinking).
Here’s a key quote:
I would claim that the aspect of XML Namespaces that causes pain is the URI/prefix duality: the thing that occurs in the document (the prefix + local name) is not the same as the thing that is semantically significant (the namespace URI + local name). As soon as you accept this duality, I believe you are doomed to a significant extra layer of complexity.
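To make the duality concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The two documents and the namespace URI http://example.com/ns are made up for illustration. They use different prefixes for the same URI, so they differ as text but should come out identical once the parser expands each name to namespace URI plus local name.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents: same namespace URI, different prefixes.
DOC_A = '<a:note xmlns:a="http://example.com/ns"><a:body>hi</a:body></a:note>'
DOC_B = '<b:note xmlns:b="http://example.com/ns"><b:body>hi</b:body></b:note>'


def expanded_names(xml_text):
    """Return each element's name as ElementTree stores it: {namespace URI}local."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]


# The prefixes "a" and "b" appear in the text but not in the parsed names.
print(expanded_names(DOC_A))
print(expanded_names(DOC_B))
print(expanded_names(DOC_A) == expanded_names(DOC_B))
```

The prefix is what you type and read; the URI is what the software compares. Any tool that works on the text rather than the parsed tree has to bridge that gap itself.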
New York Times article on mining local government data
Written by Dan Rabin on December 6th, 2009
The New York Times has an article on mining local-government data for unforeseen purposes.
Nothing new here, but its being in the Times means my mom reads about it, and yours might too.
Dueling GPSes: Garmin vs. Android G1
Written by Dan Rabin on December 6th, 2009
For the last 2.5 years I’ve been using a Garmin GPSmap 60CSx device to record my travels. It’s not bad overall, but sometimes if I don’t replace the micro-SD card just right it doesn’t record my tracks.
Recalling that my Android G1 phone has GPS, I installed MyTracks from the online Android apps store.
Yesterday I took both devices for a spin together, and they recorded the same paths (within a few feet), with more differences while walking or indoors than while driving. Today I tried a comparison bike ride, which the Android app flunked by only seeing satellites intermittently from the pocket where I was carrying it. I’ll have to try again with the phone away from my body.
Assuming that works, the real difference between the two devices is cultural.
The Garmin is a classic device from a hardware company. It’s marketed to runners, bicyclists, boaters, hikers, and other outdoorists. The battery compartment, display, and data ports are protected by rubber fittings: you can use it in the rain. On the other hand, the software sucks, and it’s hooked to a business model that makes maps from Garmin the only ones you can install.
The G1 is an open smartphone with a GPS receiver in it. The MyTracks software is pretty good, and the mapping comes from Google Maps on the device. This is the classic computer platform style: the platform vendor stands back and lets third parties compete to make the best app in each niche. On the other hand, the hardware is Just a Phone: as suggested above, I have my doubts about the antenna, and there’s certainly no special provision for the elements.
Historical evidence from analogous market situations strongly suggests that the open platform model will win. Hardware-company culture will suck up energy by trying to segment the market with too many SKUs (have a look through Garmin’s web site to see what I mean), and software improvement will always be an afterthought as they try to move boxes through channels as diverse as Best Buy, Crutchfield, REI, and auto parts stores.
Android GPS software vendors, on the other hand, only have one channel to worry about: the app store. They only have to worry about being able to clearly state which devices their apps run on. This frees them to concentrate on acquiring and making interesting use of GPS data. They have no physical boxes to move (the hardware vendors take that risk).
On the hardware side, the niche for rain-resistance can be satisfied by accessory makers making rubber sleeves for entire devices or by hardware vendors wrapping rugged housings around electronics designed and built by another company that knows how. In neither case does the hardware outfit need to know that the customer wants to do GPS: it’s sufficient just to know that they anticipate getting rained on for whatever reason.
The same arguments apply to iPhone, of course, but I don’t have one of those to write about.
Good talk on scaling data services
Written by Dan Rabin on November 3rd, 2009
Werner Vogels of Amazon talks about availability and consistency at their kind of scale. What he says resonates with my experience from working at another big Web company.
¡Data Liberation, sí!
Written by Dan Rabin on September 14th, 2009
Google [disclosure: a former employer] introduces the Data Liberation Front to ensure that all its services have simple data export functionality.
Brian Fitzpatrick says:
What does product liberation look like? Said simply, a liberated product is one which has built-in features that make it easy (and free) to remove your data from the product in the event that you’d like to take it elsewhere.
Note the emphasis on free and easy. Online services tend to stop at possible, which falls short of easy by the proverbial Simple Matter of Programming.
Facebook, for example, has a programming interface for each kind of information you store that lets a Facebook application extract it into a file. It’s possible for me to write an application that does so for everything I have on Facebook, but that isn’t the same as having a predefined procedure that archives my entire Facebook presence into a standard file format that most social networking sites can import from.
Good luck to Google in this effort! I myself would go further in asking that the entire behavior (as well as data) of my social-network presence be standardizable and portable, as Ramón Cáceres [disclosure: personal friend] and his collaborators are proposing with their work on virtual individual servers.
Genome data at NCBI
Written by Dan Rabin on September 13th, 2009
The National Center for Biotechnology Information (U.S.) has a nice online viewer for the genomes of many organisms, including
The human genome has just over 3 billion base pairs and about 25 thousand genes. This is a large enough data set that it gets algorithms of its own.
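For a rough sense of scale, here is a back-of-the-envelope sketch of how much storage the raw sequence would need. It assumes a round 3 billion base pairs and a fixed-width encoding: 2 bits per base (enough for A, C, G, T) or one byte per base as in a plain text file. The function name is made up for illustration, and real genome file formats carry extra information that this ignores.

```python
def packed_size_bytes(base_pairs, bits_per_base=2):
    """Bytes needed to store one strand at a fixed number of bits per base."""
    total_bits = base_pairs * bits_per_base
    return (total_bits + 7) // 8  # round up to whole bytes


HUMAN_GENOME_BP = 3_000_000_000  # round figure from the text, not an exact count

packed = packed_size_bytes(HUMAN_GENOME_BP)                  # 2 bits per base
plain = packed_size_bytes(HUMAN_GENOME_BP, bits_per_base=8)  # 1 byte per base

print(f"{packed / 1e6:.0f} MB packed, {plain / 1e9:.0f} GB as plain text")
```

Under those assumptions the sequence comes to about 750 MB packed or 3 GB as text: small enough to fit on one disk, but big enough that naive string searching over it gets slow.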
Making public data APIs is a business now
Written by Dan Rabin on July 6th, 2009
Jon Udell blogs about a company that builds interfaces to public-sector data.
Udell points out, quite rightly, that
“Give us the data” is an easy slogan to chant. And there’s no doubt that much good will come from poking through what we are given. But we also need to have ideas about what we want the data for, and communicate those ideas to the providers who are gathering it on our behalf.
New York City government seeks data miners
Written by Dan Rabin on June 29th, 2009
Sewell Chan and Patrick McGeehan report today in the New York Times that the New York City government is out to make its piles of public data actually usable:
In what is planned to become an annual competition known as NYC Big Apps, the city will make available about 80 data sets from 32 city agencies and commissions. The winners of the competition will get a cash prize, recognition at a dinner with the mayor, and marketing opportunities.
One has to be wary of competitions: they can be a way of trying to get some work for free, or a sign that the project doesn’t have realistic funding behind it. On the other hand, it shows that the sponsor wants to tap a wider range of imagination than it would get with the usual contracting process.
Dinner with the mayor!
Hat tip: "tootie" at reddit.