<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Information in Rotation</title>
	<atom:link href="http://appliedrotation.com/Techblog/?feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://appliedrotation.com/Techblog</link>
	<description>Dan Rabin writes on metadata, data, the information they represent and how.</description>
	<lastBuildDate>Wed, 03 Mar 2010 18:34:12 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Economist: Information Overload</title>
		<link>http://appliedrotation.com/Techblog/?p=96</link>
		<comments>http://appliedrotation.com/Techblog/?p=96#comments</comments>
		<pubDate>Wed, 03 Mar 2010 18:33:38 +0000</pubDate>
		<dc:creator>Dan Rabin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://appliedrotation.com/Techblog/?p=96</guid>
		<description><![CDATA[The Economist has a special pull-out section on information overload this week.  It&#8217;s a useful non-technical overview of where things are.
]]></description>
			<content:encoded><![CDATA[<p><em>The Economist</em> has a <a href="http://www.economist.com/specialreports/displayStory.cfm?story_id=15557443">special pull-out section on information overload</a> this week.  It&#8217;s a useful non-technical overview of where things are.</p>
]]></content:encoded>
			<wfw:commentRss>http://appliedrotation.com/Techblog/?feed=rss2&amp;p=96</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Thoughts on XML namespaces from James Clark</title>
		<link>http://appliedrotation.com/Techblog/?p=91</link>
		<comments>http://appliedrotation.com/Techblog/?p=91#comments</comments>
		<pubDate>Sat, 02 Jan 2010 09:06:34 +0000</pubDate>
		<dc:creator>Dan Rabin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://appliedrotation.com/Techblog/?p=91</guid>
		<description><![CDATA[Influential XML personage James Clark has posted a very carefully thought-out essay on XML namespaces.
Everybody loves to bash XML namespaces, but this essay is the most careful and dispassionate I have seen to date.  You should read the whole thing if you deal with XML (or even if you just like to read the [...]]]></description>
			<content:encoded><![CDATA[<p>Influential XML personage James Clark has posted a very carefully thought-out <a href="http://blog.jclark.com/2010/01/xml-namespaces.html">essay on XML namespaces</a>.</p>
<p>Everybody loves to bash XML namespaces, but this essay is the most careful and dispassionate I have seen to date.  You should read the whole thing if you deal with XML (or even if you just like to read the product of clear thinking).</p>
<p>Here&#8217;s a key quote:</p>
<blockquote><p>
I would claim that the aspect of XML Namespaces that causes pain is the URI/prefix duality: the thing that occurs in the document (the prefix + local name) is not the same as the thing that is semantically significant (the namespace URI + local name).  As soon as you accept this duality, I believe you are doomed to a significant extra layer of complexity.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://appliedrotation.com/Techblog/?feed=rss2&amp;p=91</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>New York Times article on mining local government data</title>
		<link>http://appliedrotation.com/Techblog/?p=89</link>
		<comments>http://appliedrotation.com/Techblog/?p=89#comments</comments>
		<pubDate>Mon, 07 Dec 2009 04:54:29 +0000</pubDate>
		<dc:creator>Dan Rabin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://appliedrotation.com/Techblog/?p=89</guid>
		<description><![CDATA[The New York Times has an article on mining local-government data for unforeseen purposes.  
Nothing new here, but its being in the Times means my mom reads about it, and yours might too.
]]></description>
			<content:encoded><![CDATA[<p>The New York Times has an <a href="http://www.nytimes.com/2009/12/07/technology/internet/07cities.html">article</a> on mining local-government data for unforeseen purposes.  </p>
<p>Nothing new here, but its being in the Times means my mom reads about it, and yours might too.</p>
]]></content:encoded>
			<wfw:commentRss>http://appliedrotation.com/Techblog/?feed=rss2&amp;p=89</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Dueling GPSes: Garmin vs. Android G1</title>
		<link>http://appliedrotation.com/Techblog/?p=85</link>
		<comments>http://appliedrotation.com/Techblog/?p=85#comments</comments>
		<pubDate>Mon, 07 Dec 2009 04:01:49 +0000</pubDate>
		<dc:creator>Dan Rabin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://appliedrotation.com/Techblog/?p=85</guid>
		<description><![CDATA[For the last 2.5 years I&#8217;ve been using a Garmin GPSmap 60CSx device to record my travels.  It&#8217;s not bad overall, but sometimes if I don&#8217;t replace the micro-SD card just right it doesn&#8217;t record my tracks.
Recalling that my Android G1 phone has GPS, I installed MyTracks from the online Android apps store.
Yesterday I [...]]]></description>
			<content:encoded><![CDATA[<p>For the last 2.5 years I&#8217;ve been using a <a href="https://buy.garmin.com/shop/shop.do?cID=145&#038;pID=310">Garmin GPSmap 60CSx</a> device to record my travels.  It&#8217;s not bad overall, but sometimes if I don&#8217;t replace the micro-SD card just right it doesn&#8217;t record my tracks.</p>
<p>Recalling that my Android G1 phone has GPS, I installed <a href="http://mytracks.appspot.com/">MyTracks</a> from the online Android apps store.</p>
<p>Yesterday I took both devices for a spin together, and they recorded the same paths (within a few feet), with more differences while walking or indoors than while driving.  Today I tried a comparison bike ride, which the Android app flunked by only seeing satellites intermittently from the pocket where I was carrying it.  I&#8217;ll have to try again with the phone away from my body.</p>
<p>Assuming that works, the real difference between the two devices is cultural.  </p>
<p>The Garmin is a classic device from a hardware company.  It&#8217;s marketed to runners, bicyclists, boaters, hikers, and other outdoorists.  The battery compartment, display, and data ports are protected by rubber fittings: you can use it in the rain.  On the other hand, the software sucks, and it&#8217;s hooked to a business model that makes maps from Garmin the only ones you can install.  </p>
<p>The G1 is an open smartphone with a GPS receiver in it.  The MyTracks software is pretty good, and the mapping comes from Google Maps on the device.  This is the classic computer platform style: the platform vendor stands back and lets third parties compete to make the best app in each niche.  On the other hand, the hardware is Just a Phone:  as suggested above, I have my doubts about the antenna, and there&#8217;s certainly no special provision for the elements.</p>
<p>Historical evidence from analogous market situations strongly suggests that the open platform model will win.  Hardware-company culture will suck up energy by trying to segment the market with too many SKUs (have a look through Garmin&#8217;s web site to see what I mean), and software improvement will always be an afterthought as they try to move boxes through channels as diverse as Best Buy, Crutchfield, REI, and auto parts stores.</p>
<p>Android GPS software vendors, on the other hand, only have one channel to worry about: the app store.  They only have to worry about being able to clearly state which devices their apps run on.  This frees them to concentrate on acquiring and making interesting use of GPS data.  They have no physical boxes to move (the hardware vendors take that risk).  </p>
<p>On the hardware side, the niche for rain-resistance can be satisfied by accessory makers making rubber sleeves for entire devices or by hardware vendors wrapping rugged housings around electronics designed and built by another company that knows how.  In neither case does the hardware outfit need to know that the customer wants to do GPS: it&#8217;s sufficient just to know that they anticipate getting rained on for whatever reason.</p>
<p>The same arguments apply to iPhone, of course, but I don&#8217;t have one of those to write about.</p>
]]></content:encoded>
			<wfw:commentRss>http://appliedrotation.com/Techblog/?feed=rss2&amp;p=85</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Good talk on scaling data services</title>
		<link>http://appliedrotation.com/Techblog/?p=83</link>
		<comments>http://appliedrotation.com/Techblog/?p=83#comments</comments>
		<pubDate>Wed, 04 Nov 2009 00:42:58 +0000</pubDate>
		<dc:creator>Dan Rabin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://appliedrotation.com/Techblog/?p=83</guid>
		<description><![CDATA[Werner Vogels of Amazon talks about availability and consistency at their kind of scale.  What he says resonates with my experience from working at another big Web company.
]]></description>
			<content:encoded><![CDATA[<p>Werner Vogels of Amazon talks about <a href="http://www.infoq.com/presentations/availability-consistency">availability and consistency</a> at their kind of scale.  What he says resonates with my experience from working at another big Web company.</p>
]]></content:encoded>
			<wfw:commentRss>http://appliedrotation.com/Techblog/?feed=rss2&amp;p=83</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>¡Data Liberation, si!</title>
		<link>http://appliedrotation.com/Techblog/?p=78</link>
		<comments>http://appliedrotation.com/Techblog/?p=78#comments</comments>
		<pubDate>Mon, 14 Sep 2009 20:03:22 +0000</pubDate>
		<dc:creator>Dan Rabin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://appliedrotation.com/Techblog/?p=78</guid>
		<description><![CDATA[Google [disclosure: a former employer] introduces the Data Liberation Front to ensure that all its services have simple data export functionality.
Brad Fitzpatrick says:
What does product liberation look like? Said simply, a liberated product is one which has built-in features that make it easy (and free) to remove your data from the product in the event [...]]]></description>
			<content:encoded><![CDATA[<p>Google [disclosure: a former employer] introduces the <a href="http://googlepublicpolicy.blogspot.com/2009/09/introducing-dataliberationorg-liberate.html">Data Liberation Front</a> to ensure that all its services have simple data export functionality.</p>
<p>Brad Fitzpatrick says:</p>
<blockquote><p>What does product liberation look like? Said simply, a liberated product is one which has built-in features that make it easy (and free) to remove your data from the product in the event that you&#8217;d like to take it elsewhere.</p></blockquote>
<p>Note the emphasis on <em>free</em> and <em>easy</em>.  Online services tend to stop at <em>possible</em>, which falls short of easy by the proverbial <a href="http://en.wikipedia.org/wiki/Small_matter_of_programming">Simple Matter of Programming</a>.  </p>
<p><a href="http://www.facebook.com">Facebook</a>, for example, has a programming interface for each kind of information you store that lets a Facebook application extract it into a file.  It&#8217;s possible for me to write an application that does so for everything I have on Facebook, but that isn&#8217;t the same as having a predefined procedure that archives my entire Facebook presence into a standard file format that most social networking sites can import from.  </p>
<p>Good luck to Google in this effort!  I myself would go further in asking that the entire behavior (as well as data) of my social-network presence be standardizable and portable, as <a href="http://www.kiskeya.net/ramon/">Ramón Cáceres</a> [disclosure: personal friend] and his collaborators are proposing with their work on <a href="http://www.kiskeya.net/ramon/work/pubs/topic.html#vis">virtual individual servers</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://appliedrotation.com/Techblog/?feed=rss2&amp;p=78</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Genome data at NCBI</title>
		<link>http://appliedrotation.com/Techblog/?p=71</link>
		<comments>http://appliedrotation.com/Techblog/?p=71#comments</comments>
		<pubDate>Sun, 13 Sep 2009 23:27:47 +0000</pubDate>
		<dc:creator>Dan Rabin</dc:creator>
				<category><![CDATA[Areas of application]]></category>
		<category><![CDATA[Biology]]></category>

		<guid isPermaLink="false">http://appliedrotation.com/Techblog/?p=71</guid>
		<description><![CDATA[The National Center for Biotechnology Information (U.S.) has a nice online viewer for the genomes of many organisms, including Homo sapiens.
The human genome has just over 3 billion base pairs in about 25 thousand genes.  This is a large enough data set that it gets algorithms of its own.
]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://www.ncbi.nlm.nih.gov/">National Center for Biotechnology Information</a> (U.S.) has a nice <a href="http://www.ncbi.nlm.nih.gov/Genomes/">online viewer</a> for the genomes of many organisms, including <em>Homo sapiens</em>.</p>
<p>The human genome has just over 3 billion base pairs in about 25 thousand genes.  This is a large enough data set that it gets algorithms of its own.</p>
]]></content:encoded>
			<wfw:commentRss>http://appliedrotation.com/Techblog/?feed=rss2&amp;p=71</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Geoff Nunberg on Google Books metadata</title>
		<link>http://appliedrotation.com/Techblog/?p=64</link>
		<comments>http://appliedrotation.com/Techblog/?p=64#comments</comments>
		<pubDate>Thu, 03 Sep 2009 16:39:30 +0000</pubDate>
		<dc:creator>Dan Rabin</dc:creator>
				<category><![CDATA[Information Philosophy]]></category>
		<category><![CDATA[Information usage patterns]]></category>
		<category><![CDATA[Metadata]]></category>

		<guid isPermaLink="false">http://appliedrotation.com/Techblog/?p=64</guid>
		<description><![CDATA[Linguist Geoff Nunberg comments on the poor general quality of metadata in Google Books, and why that&#8217;s a problem.
It&#8217;s a tough problem: if you do things (like scanning entire libraries) at Google-scale, you just can&#8217;t pay attention to the details.  One partial way out (which Geoff mentions) is to allow users to submit corrections, [...]]]></description>
			<content:encoded><![CDATA[<p>Linguist Geoff Nunberg <a href="http://languagelog.ldc.upenn.edu/nll/?p=1701#more-1701">comments on the poor general quality of metadata</a> in Google Books, and why that&#8217;s a problem.</p>
<p>It&#8217;s a tough problem: if you do things (like scanning entire libraries) at Google-scale, you just can&#8217;t pay attention to the details.  One partial way out (which Geoff mentions) is to allow users to submit corrections, as Google Maps does for positions of placemarks.</p>
<p>The article addresses a number of important points about the provenance and usefulness of metadata, and Google employees provide some great comments and discussion.</p>
<p>(Via <a href="http://delong.typepad.com/sdj/2009/09/links-for-2009-09-03.html">Brad DeLong</a>).</p>
]]></content:encoded>
			<wfw:commentRss>http://appliedrotation.com/Techblog/?feed=rss2&amp;p=64</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Making public data APIs is a business now</title>
		<link>http://appliedrotation.com/Techblog/?p=54</link>
		<comments>http://appliedrotation.com/Techblog/?p=54#comments</comments>
		<pubDate>Mon, 06 Jul 2009 14:50:59 +0000</pubDate>
		<dc:creator>Dan Rabin</dc:creator>
				<category><![CDATA[Data and society]]></category>

		<guid isPermaLink="false">http://appliedrotation.com/Techblog/?p=54</guid>
		<description><![CDATA[Jon Udell blogs about a company that builds interfaces to public-sector data.  

Udell points out, quite rightly, that
“Give us the data” is an easy slogan to chant. And there’s no doubt that much good will come from poking through what we are given. But we also need to have ideas about what we want [...]]]></description>
			<content:encoded><![CDATA[<p>Jon Udell <a href="http://blog.jonudell.net/2009/07/06/influencing-the-production-of-public-data/">blogs about</a> a <a href="http://www.3scale.net/">company</a> that builds interfaces to public-sector data.  </p>
<p>
Udell points out, quite rightly, that</p>
<blockquote><p>“Give us the data” is an easy slogan to chant. And there’s no doubt that much good will come from poking through what we are given. But we also need to have ideas about what we want the data for, and communicate those ideas to the providers who are gathering it on our behalf. </p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://appliedrotation.com/Techblog/?feed=rss2&amp;p=54</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>New York City government seeks data miners</title>
		<link>http://appliedrotation.com/Techblog/?p=43</link>
		<comments>http://appliedrotation.com/Techblog/?p=43#comments</comments>
		<pubDate>Mon, 29 Jun 2009 19:00:02 +0000</pubDate>
		<dc:creator>Dan Rabin</dc:creator>
				<category><![CDATA[Data and society]]></category>

		<guid isPermaLink="false">http://appliedrotation.com/Techblog/?p=43</guid>
		<description><![CDATA[Sewell Chan and Patrick McGeehan report today in the New York Times that the New York City government is out to make its piles of public data actually usable:
In what is planned to become an annual competition known as NYC Big Apps, the city will make available about 80 data sets from 32 city agencies [...]]]></description>
			<content:encoded><![CDATA[<p>Sewell Chan and Patrick McGeehan <a href="http://cityroom.blogs.nytimes.com/2009/06/29/city-invites-software-developers-to-crunch-big-data-sets/">report today in the New York Times</a> that the New York City government is out to make its piles of public data actually usable:</p>
<blockquote><p>In what is planned to become an annual competition known as NYC Big Apps, the city will make available about 80 data sets from 32 city agencies and commissions. The winners of the competition will get a cash prize, recognition at a dinner with the mayor, and marketing opportunities.</p></blockquote>
<p>One has to be wary of competitions: they can be a way of trying to get some work for free, or a sign that the project doesn&#8217;t have realistic funding behind it.  On the other hand, it shows that the sponsor wants to tap a wider range of imagination than it would get with the usual contracting process.</p>
<p>Dinner with the mayor!</p>
<p><em>Hat tip: <a href="http://www.reddit.com/r/programming/comments/8wozw/new_york_city_invites_software_developers_to/">&quot;tootie&quot; at reddit</a>.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://appliedrotation.com/Techblog/?feed=rss2&amp;p=43</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Disk drive reliability in detail</title>
		<link>http://appliedrotation.com/Techblog/?p=37</link>
		<comments>http://appliedrotation.com/Techblog/?p=37#comments</comments>
		<pubDate>Mon, 29 Jun 2009 17:56:15 +0000</pubDate>
		<dc:creator>Dan Rabin</dc:creator>
				<category><![CDATA[Storing data]]></category>

		<guid isPermaLink="false">http://appliedrotation.com/Techblog/?p=37</guid>
		<description><![CDATA[I tend to get abstract and philosophical about data here, and it&#8217;s good to have an occasional splash in the cold water of how the stuff gets stored.
Jon Elerath&#8217;s article on hard disk reliability in the June 2009 Communications of the ACM (may require ACM login) gives a lot of detail about the different [...]]]></description>
			<content:encoded><![CDATA[<p>I tend to get abstract and philosophical about data here, and it&#8217;s good to have an occasional splash in the cold water of how the stuff gets stored.</p>
<p>Jon Elerath&#8217;s <a href="http://portal.acm.org/citation.cfm?id=1516059">article on hard disk reliability in the June 2009 <em>Communications of the ACM</em></a> (may require ACM login) gives a lot of detail about the different kinds of disk storage error and appropriate countermeasures.  He gets down to the level of things that scratch vs. things that smear (kind of like H.P. Lovecraft for the archivally-minded).</p>
<p>The big takeaway is that there are big obvious crash-failures like the bearing getting wobbly or servo tracks being trashed: these make the drive stop working, and you rebuild from something you still trust.  And then there are insidious quiet read/write failures that you can only counteract with a policy of &quot;scrubbing&quot; drives proactively.</p>
<p>Now that I buy storage by the 1.5 terabytes/spindle, I really should do something to dissuade Those Whose Names Are Random from assimilating my data to the Outer Abyss of Maximum Entropy.  If you should happen to find some of your goats missing, better not to ask&#8230;</p>
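<p>As a rough illustration of the scrubbing idea, here is a minimal file-level sketch in Python.  It is an analogy only: the scrubbing discussed in the article happens at the drive or array level, below the filesystem.  The function names (<code>checksum</code>, <code>snapshot</code>, <code>scrub</code>) and the demo file are assumptions made up for this example.</p>

```python
import hashlib
import os
import tempfile


def checksum(path):
    """SHA-256 of a file's contents, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(paths):
    """Record a checksum for each path while the data is still trusted."""
    return {p: checksum(p) for p in paths}


def scrub(recorded):
    """Re-read every file and return the paths whose contents have changed."""
    return [p for p, digest in recorded.items() if checksum(p) != digest]


# Demo with a throwaway file (hypothetical name and contents).
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "archive.dat")
    with open(path, "wb") as f:
        f.write(b"precious data")
    recorded = snapshot([path])
    clean = scrub(recorded)            # nothing has changed yet
    with open(path, "wb") as f:
        f.write(b"precious dbta")      # simulate one silently corrupted byte
    damaged = scrub(recorded)          # the quiet failure is now detected

print(clean, damaged)
```

<p>The point of the design is that detection requires reading the data proactively: a quiet failure stays invisible until something compares what is on the disk with what should be there.</p>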
]]></content:encoded>
			<wfw:commentRss>http://appliedrotation.com/Techblog/?feed=rss2&amp;p=37</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The two cultures</title>
		<link>http://appliedrotation.com/Techblog/?p=31</link>
		<comments>http://appliedrotation.com/Techblog/?p=31#comments</comments>
		<pubDate>Wed, 24 Jun 2009 21:25:18 +0000</pubDate>
		<dc:creator>Dan Rabin</dc:creator>
				<category><![CDATA[Information Philosophy]]></category>
		<category><![CDATA[Information usage patterns]]></category>

		<guid isPermaLink="false">http://127.0.0.1/Techblog/?p=31</guid>
		<description><![CDATA[Jon Stokes has an excellent description of the two contrasting philosophies of information management in his comparison of the Palm Pre and the iPhone.  
He names the two approaches &#8220;structure-and-browse&#8221; and &#8220;collect-and-query&#8221;.  I feel like I&#8217;ve been groping for these terse descriptions for years!
]]></description>
			<content:encoded><![CDATA[<p>Jon Stokes has an excellent description of the two contrasting philosophies of information management in his <a href="http://arstechnica.com/gadgets/reviews/2009/06/ars-palm-pre-review.ars" target="_self">comparison of the Palm Pre and the iPhone</a>.  </p>
<p>He names the two approaches &#8220;structure-and-browse&#8221; and &#8220;collect-and-query&#8221;.  I feel like I&#8217;ve been groping for these terse descriptions for years!</p>
]]></content:encoded>
			<wfw:commentRss>http://appliedrotation.com/Techblog/?feed=rss2&amp;p=31</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Chris Anderson: One size metadata doesn&#8217;t fit all</title>
		<link>http://appliedrotation.com/Techblog/?p=30</link>
		<comments>http://appliedrotation.com/Techblog/?p=30#comments</comments>
		<pubDate>Wed, 21 Mar 2007 17:04:23 +0000</pubDate>
		<dc:creator>Dan Rabin</dc:creator>
				<category><![CDATA[Areas of application]]></category>
		<category><![CDATA[Information Philosophy]]></category>
		<category><![CDATA[Information usage patterns]]></category>
		<category><![CDATA[Metadata]]></category>
		<category><![CDATA[Musical]]></category>

		<guid isPermaLink="false">http://appliedrotation.com/Techblog/?p=30</guid>
		<description><![CDATA[Misfits of Metadata
Chris Anderson of The Long Tail has an important post about how the metadata used in some music-listening applications doesn&#8217;t satisfy the listeners&#8217; needs:
[...] classical is a genre that the one-size-fits-all music aggregators such as iTunes don&#8217;t handle particularly well. They&#8217;re oriented around pop music, with its artist, album, track data format. Meanwhile [...]]]></description>
			<content:encoded><![CDATA[<h3>Misfits of Metadata</h3>
<p>Chris Anderson of <a href="http://www.longtail.com/the_long_tail/" title="The Long Tail">The Long Tail</a> has an <a href="http://www.longtail.com/the_long_tail/2007/03/one_size_aggreg.html" title="One size metadata doesn't fit all">important post</a> about how the metadata used in some music-listening applications doesn&#8217;t satisfy the listeners&#8217; needs:</p>
<blockquote><p>[...] classical is a genre that the one-size-fits-all music aggregators such as iTunes don&#8217;t handle particularly well. They&#8217;re oriented around pop music, with its artist, album, track data format. Meanwhile classical music organizes around composer, conductor, performer, soloist</p></blockquote>
<p>He also voices my exact peeve about how jazz is treated:</p>
<blockquote><p>However, neither of them does a very good job with Jazz, where the individual musicians are often more meaningful than the band.</p></blockquote>
<p>Yup.  No reasonable cataloguer of jazz recordings separates &#8220;Thelonious Monk Trio&#8221; from &#8220;Thelonious Monk Quartet&#8221; from &#8220;Thelonious Monk&#8221;.  At the same time, it&#8217;s important to be able to locate all appearances of Thelonious Monk, regardless of whether he was the leader of the session (note that &#8220;leader&#8221; and &#8220;session&#8221; are appropriate terms in jazz discography, but not for pop or classical).
</p>
<h3>When your only tool is a hammer&#8230;</h3>
<p>I can&#8217;t help but wonder if the problems Chris calls out in iTunes come from the poor selection of data tools in most applications programmers&#8217; toolkits.  Relational databases, the current orthodox storage technique, favor using one or more tables, each consisting of records having the same selection of attributes.  There are hacks you can use to simulate having, say, jazz tracks and pop tracks in the same Tracks relation, but hacks and simulations tend to twist one&#8217;s code, so most programmers resist going there.
</p>
<h3>An XML database in every toolbox!</h3>
<p>
We don&#8217;t really have to live this way anymore.  With the popularity of <a href="http://www.w3.org/XML/">XML</a> for data interchange, the tools ecology has given us a <a href="http://www.rpbourret.com/xml/XMLDatabaseProds.htm">variety</a> of <a href="http://www.rpbourret.com/index.htm">XML database systems</a>.  The XML data model has the flexibility to represent varying record structures: in fact, it has much more flexibility than we need for the purpose!</p>
<p>Heretical as it may seem to put the cart of an interchange format before the horse of data abstraction, the XML situation is very useful in practice, at least for databases of moderate size.  The <a href="http://www.w3.org/">W3C</a> has come up with the <a href="http://www.w3.org/Style/XSL/">XPath</a> and <a href="http://www.w3.org/XML/Query/">XML Query</a> specifications that provide excellent query mechanisms for data represented in the XML model.  XML Query in particular is designed to look somewhat familiar to the hardened <a href="http://www.jcc.com/sql.htm">SQL</a> user.  There&#8217;s data typing taken from the <a href="http://www.w3.org/XML/Schema">XML Schema</a> <a href="http://www.w3.org/TR/xmlschema-2/">datatype recommendation</a> as well.</p>
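<p>To make the point about varying record structures concrete, here is a small sketch in Python.  It uses the standard library&#8217;s ElementTree and its limited path syntax rather than a real XML database or XML Query, and the element names, attribute names, titles, and every musician other than Thelonious Monk are invented for illustration.</p>

```python
import xml.etree.ElementTree as ET

# A hypothetical catalogue in which jazz and pop records have different shapes.
catalogue = ET.Element("catalogue")

# Jazz record: a session with a leader and sidemen.
jazz1 = ET.SubElement(catalogue, "recording", genre="jazz")
ET.SubElement(jazz1, "title").text = "Example Trio Date"
s1 = ET.SubElement(jazz1, "session")
ET.SubElement(s1, "musician", role="leader").text = "Thelonious Monk"
ET.SubElement(s1, "musician", role="sideman").text = "Sideman A"

# Another jazz record, where Monk is not the leader.
jazz2 = ET.SubElement(catalogue, "recording", genre="jazz")
ET.SubElement(jazz2, "title").text = "Example Quintet Date"
s2 = ET.SubElement(jazz2, "session")
ET.SubElement(s2, "musician", role="leader").text = "Leader B"
ET.SubElement(s2, "musician", role="sideman").text = "Thelonious Monk"

# Pop record: artist and album, with no session at all.
pop = ET.SubElement(catalogue, "recording", genre="pop")
ET.SubElement(pop, "title").text = "Example Single"
ET.SubElement(pop, "artist").text = "Example Band"
ET.SubElement(pop, "album").text = "Example Album"


def appearances(root, name):
    """Titles of recordings on which `name` plays, as leader or sideman."""
    titles = []
    for rec in root.findall("recording"):
        players = [m.text for m in rec.findall("./session/musician")]
        if name in players:
            titles.append(rec.findtext("title"))
    return titles


monk_titles = appearances(catalogue, "Thelonious Monk")
print(monk_titles)
```

<p>The pop record simply has no <code>session</code> element, so the same query skips it without any placeholder columns or special cases.</p>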
<h3>Better nails</h3>
<p>Anyhow, let&#8217;s learn to design with a more flexible hammer, and maybe we&#8217;ll be able to hit a wider class of nails, rather than our users&#8217; thumbs!</p>
<p><em>March is International Runaway Metaphor Month.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://appliedrotation.com/Techblog/?feed=rss2&amp;p=30</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>OpenStreetMap constructs maps from GPS tracks!</title>
		<link>http://appliedrotation.com/Techblog/?p=29</link>
		<comments>http://appliedrotation.com/Techblog/?p=29#comments</comments>
		<pubDate>Wed, 21 Feb 2007 17:06:04 +0000</pubDate>
		<dc:creator>Dan Rabin</dc:creator>
				<category><![CDATA[Areas of application]]></category>
		<category><![CDATA[Geographic]]></category>
		<category><![CDATA[Information Philosophy]]></category>
		<category><![CDATA[Information usage patterns]]></category>

		<guid isPermaLink="false">http://appliedrotation.com/Techblog/?p=29</guid>
		<description><![CDATA[Sources and uses of digital information are in-scope for this blog, and a great example just showed up in my RSS reader today.
OpenStreetMap is a wiki-like project to build a world map using contributed GPS tracks [OpenGeoData pointed me there].  Their map of Baghdad is here.  
This project is truly a product of [...]]]></description>
			<content:encoded><![CDATA[<p>Sources and uses of digital information are in-scope for this blog, and a great example just showed up in my RSS reader today.</p>
<p><a href="http://wiki.openstreetmap.org/index.php/Main_Page" title="OpenStreetMap home page">OpenStreetMap</a> is a wiki-like project to build a world map using contributed GPS tracks [<a href="http://www.opengeodata.org/?p=167">OpenGeoData</a> pointed me there].  Their map of Baghdad is <a href="http://wiki.openstreetmap.org/index.php/Image:Baghdad.png">here</a>.  </p>
<p>This project is truly a product of the early 21st century: it requires GPS satellites, cheap but accurate GPS receivers, the World Wide Web, inexpensive computers with fast color graphics, and so forth.</p>
<p>And like all modern geographic applications, it also exploits a special property of GPS&#8217;s information domain: everyone agrees on the meaning of geographical location; only dates and times have a similar level of standardization.  In relational-database terminology, this means that any table with a date or location column has a meaningful join with any other table that has such a column.  </p>
<p>This doesn&#8217;t work with most data.  I&#8217;ve had driver&#8217;s licenses in four U.S. states, but you can&#8217;t aggregate my driving record from the state records because they all use different ID numbering schemes (nice for my privacy in this case).</p>
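<p>Here is a minimal sketch of the kind of join described above, using Python&#8217;s built-in sqlite3 module.  The two tables, their column names, and all of the rows are made up for the example; the only thing they share is a date column, and that is enough to combine them.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two unrelated, hypothetical data sets that both happen to record a date.
conn.executescript("""
    CREATE TABLE rides (day TEXT, miles REAL);
    CREATE TABLE weather (day TEXT, sky TEXT);
    INSERT INTO rides VALUES ('2007-02-19', 12.5), ('2007-02-20', 8.0);
    INSERT INTO weather VALUES ('2007-02-19', 'rain'),
                               ('2007-02-20', 'clear'),
                               ('2007-02-21', 'clear');
""")

# The join is meaningful because both tables agree on what a date is.
rows = conn.execute("""
    SELECT rides.day, rides.miles, weather.sky
    FROM rides JOIN weather ON rides.day = weather.day
    ORDER BY rides.day
""").fetchall()

print(rows)
```

<p>Swap the date column for a per-state license number and the same join returns nothing useful, because the two sides no longer agree on what the values mean.</p>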
<p>Also noteworthy is the fact that GPS information can be used to put a time dimension into maps, since we can tell <em>when</em> the street is used as well as <em>where</em> it is.  There are some very pretty examples at <a href="http://cabspotting.org/timelapse.html">Cabspotting</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://appliedrotation.com/Techblog/?feed=rss2&amp;p=29</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Should metadata be stored in the file it describes?  Jon Udell wonders&#8230;</title>
		<link>http://appliedrotation.com/Techblog/?p=28</link>
		<comments>http://appliedrotation.com/Techblog/?p=28#comments</comments>
		<pubDate>Wed, 21 Feb 2007 04:20:09 +0000</pubDate>
		<dc:creator>Dan Rabin</dc:creator>
				<category><![CDATA[Information Philosophy]]></category>
		<category><![CDATA[Metadata]]></category>

		<guid isPermaLink="false">http://appliedrotation.com/Techblog/?p=28</guid>
		<description><![CDATA[In &#8220;Who’s got the tag? Database truth versus file truth, part 3&#8221;, Jon Udell contrasts the Microsoft Vista and Mac OS X ways of associating metadata tags with image files: Vista tends to store them into the image files, and Mac OS X tends to leave the files untouched and use a separate database to [...]]]></description>
			<content:encoded><![CDATA[<p>In <a href="http://blog.jonudell.net/2007/02/20/whos-got-the-tag-database-truth-versus-file-truth-part-3/">&#8220;Who’s got the tag? Database truth versus file truth, part 3&#8221;</a>, <a href="http://blog.jonudell.net/">Jon Udell</a> contrasts the Microsoft Vista and Mac OS X ways of associating metadata tags with image files: Vista tends to store them into the image files, and Mac OS X tends to leave the files untouched and use a separate database to store the tags (or at least Jon was under this impression).</p>
<p>There&#8217;s a great discussion about the relative advantages of the two approaches on the blog.  Basically, storing the tags in the file makes the association harder to lose as you move the file around, and storing the tags separately avoids modifying the user&#8217;s data file.  Neither one is obviously in accord with the user&#8217;s intention in all cases.</p>
<p>I think the issue has whole extra layers of subtlety.  We perceive metadata that is stored within a data file as being what Jon Udell calls &#8220;file truth&#8221;.  Since there&#8217;s only one set of metadata stored in the file, it becomes the One True Metadata.  On the other hand, metadata stored in a separate database reads as the opinion of the maintainer of the database.  This is exactly what social bookmarking systems such as <a href="http://del.icio.us">del.icio.us</a> do: each attribution of a tag to a URL is also associated with a user making that attribution.</p>
<h4>A pluralistic society <em>requires</em> a separate metadatabase!</h4>
<p>This isn&#8217;t just another engineering tradeoff, though.  The truth about &#8220;file truth&#8221; is that it&#8217;s still an opinion—the opinion of the last agent to modify the metadata within the file.  When there&#8217;s One True Metadata, we can only represent disagreements by obliterating the last guy&#8217;s assertion.</p>
<p>Imagine trying to tag a scan of a photo taken at your parents&#8217; wedding of someone you don&#8217;t recognize.  You think it&#8217;s Dad&#8217;s college roommate, but your sister thinks it&#8217;s Mom&#8217;s second cousin.  You have one &#8220;person depicted&#8221; slot: do you fight over it?  Do you leave it blank and explain the situation in a semantically bland catch-all description field?  Or do you each tag it as you will in your respective databases?</p>
<p>Not only is it unrealistic to allow for only one true description of a file, it&#8217;s also time we stopped regarding metadata as lost forever just because it&#8217;s not stored in the file.  We could set up a distributed database that works like <a href="http://www.gracenote.com">Gracenote</a>&#8217;s CD identification database, but for all files instead of just music files.  As with CDs, the lookup key for a file can be generated by anyone who possesses the file (by applying a secure hash), but the particular metadata obtained depends on which tagger&#8217;s part of the repository is consulted.  It&#8217;s all doable, and it would eliminate blogstorms about how evil application X erases user metadata.</p>
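<p>The lookup scheme sketched above might look something like the following in Python.  This is an in-memory toy, not a distributed database: the <code>TagRepository</code> class, its methods, and the choice of SHA-256 as the secure hash are all assumptions made for illustration.</p>

```python
import hashlib
from collections import defaultdict


def file_key(data):
    """Lookup key that anyone holding the file's bytes can compute."""
    return hashlib.sha256(data).hexdigest()


class TagRepository:
    """Toy store of tags, kept outside the file and separated by tagger."""

    def __init__(self):
        # Maps file key to tagger to that tagger's set of tags.
        self._tags = defaultdict(lambda: defaultdict(set))

    def tag(self, tagger, data, label):
        self._tags[file_key(data)][tagger].add(label)

    def lookup(self, data, tagger):
        """Tags that one particular tagger has applied to this file."""
        by_tagger = self._tags.get(file_key(data), {})
        return set(by_tagger.get(tagger, set()))


# The wedding-photo disagreement: both opinions survive side by side.
photo = b"bytes of the scanned wedding photo"
repo = TagRepository()
repo.tag("me", photo, "Dad's college roommate")
repo.tag("sister", photo, "Mom's second cousin")

print(repo.lookup(photo, "me"))
print(repo.lookup(photo, "sister"))
```

<p>Because the key is derived from the file&#8217;s contents, the file itself is never modified, and nobody&#8217;s assertion has to be obliterated to record someone else&#8217;s.</p>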
]]></content:encoded>
			<wfw:commentRss>http://appliedrotation.com/Techblog/?feed=rss2&amp;p=28</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
