Tuesday, February 15, 2005

Unstructured Thoughts on Unstructured Search

If you were wondering what's on TV in California, check out Google Video. Apparently they're indexing the supertext captions that are transmitted in the black bits of our TV channels. Man - those guys are developing a finely tuned indexing radar. If there's content anywhere (even flying through the atmosphere!), it seems they'll find a way to index it.

This makes me speculate on the future of companies that have profited for years by providing structured storage (FileNET, Documentum, OpenText, TOWER Software, etc.).

I mean, when you can triumphantly return with exactly what you wanted from a big pile of random stuff, do you really need to structure it properly? Let's use the following half-baked analogy:

My study is full of vaguely important pieces of paper. Because deep down I hate all of them, I tend to treat them fairly carelessly. This means I generally just chuck them through the study door in order to minimize the amount of time I have to spend thinking about them. Sadly, every now and again I have to retrieve one for some random bureaucratic purpose. This usually involves swearing and yelling, and lots of looking at papers that I don't want.

If I could use these two approaches (structured vs. unstructured storage) on my study, one would shake a bony librarian finger at me and say "You should've filed all your papers in folders, and categorized them appropriately. I have no idea where your car insurance policy is." After a bunch of searching, I'd eventually find it - probably with lots of swearing and yelling.

Whereas the unstructured, index-mad loons at Google would just hand it to me (along with a bunch of other unrelated cat insurance policies and so on). Minimal swearing and no chastising.

So - is unstructured storage the way of the future? If that means less filing, I'm all for it. Now, if I could only get a Googlebot to index my study...


  1. You have heard of data mining, haven't you?

  2. Yeah - we used to joke that it's what you did when you screwed up everything else. But perhaps those are the words of frightened software developers. It seems to me that structured systems are less relevant now than they once were. Add on the metadata extensibility in WinFS (which will eventually ship) and I'm not surprised that every DM/RM company now advertises itself as a collaboration and content management company. The market is shifting, and the key players are starting to see the writing on the wall.