Friday, November 21, 2008

Data segregation and the Stockholm syndrome

Dreaming in Code, the book I mentioned in this post from last week, follows the creation of a software tool named Chandler. At one point, the author connects the development team's effort to make Chandler's data structures as flexible as possible with intertwingularity, a term coined by Ted Nelson (who also came up with the term hypertext) to express the complexity of interrelations. Ted Nelson said that "people keep pretending they can make things hierarchical, categorizable and sequential when they can't".

Image by Nattu

I often find myself organizing files into folders on my computer, but as time goes by, the data tends to get dispersed anyway. Then I end up putting off cleaning up the files and directories on my hard drive yet again. On the other hand, I use Delicious every day to save and retrieve my bookmarks, without worrying about structure.

We're not supposed to help computers maintain the data; we're supposed to put them to work. I probably wouldn't spend time creating folders on my hard drive if retrieving information by searching were instantaneous.

I think that trying to maintain hierarchical structures turns us into data victims of a sort or, as Patrick Mueller said, makes us suffer from Stockholm syndrome.

Image by cambodia4kidsorg

As I mentioned above, I keep my bookmarks on Delicious. I have about 1500, and the organization is based on tags. If I had built a hierarchy for them, I would be lost. It's easier to go to a tag and see which other tags are used along with it, and so on. The same approach of archiving and searching is liberating, and it is what makes Gmail so easy to use.
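To make the "go to a tag and see the other tags used with it" idea concrete, here is a minimal sketch in Python. It is not how Delicious works internally; the bookmark URLs, the tags, and the related_tags helper are all made up for illustration. It only assumes that each bookmark carries a flat set of tags.

```python
from collections import Counter

# Hypothetical bookmarks: each URL maps to a flat set of tags, no folders.
bookmarks = {
    "https://example.com/folksonomy": {"tagging", "folksonomy", "web"},
    "https://example.com/mind-maps": {"mindmapping", "productivity"},
    "https://example.com/tag-clouds": {"tagging", "web", "visualization"},
    "https://example.com/gmail-labels": {"tagging", "email", "productivity"},
}


def related_tags(bookmarks, tag):
    """Count how often every other tag appears together with `tag`."""
    counts = Counter()
    for tags in bookmarks.values():
        if tag in tags:
            # Count the companions of `tag`, but not `tag` itself.
            counts.update(tags - {tag})
    return counts


# The tag most often used together with "tagging" in the sample data.
print(related_tags(bookmarks, "tagging").most_common(1))  # [('web', 2)]
```

Nothing here had to be decided in advance: the "related" view is computed at the moment you ask for it, from whatever tags happen to exist.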

Clay Shirky states the following in this interesting article about categories, links and tags:
One reason Google was adopted so quickly when it came along is that Google understood there is no shelf, and that there is no file system. Google can decide what goes with what after hearing from the user, rather than trying to predict in advance what it is you need to know.
He mentions that categorizing is similar to predicting the future, which turns out to be hard:
Consider the following statements:

A: "This is a book about Dresden."
B: "This is a book about Dresden, and it goes in the category 'East Germany'."

That second sentence seems so obvious, but East Germany actually turned out to be an unstable category. Cities are real. They are real, physical facts. Countries are social fictions. It is much easier for a country to disappear than for a city to disappear, so when you're saying that the small thing is contained by the large thing, you're actually mixing radically different kinds of entities. We pretend that 'country' refers to a physical area the same way 'city' does, but it's not true, as we know from places like the former Yugoslavia.

What about you? Do you feel more comfortable building hierarchical categories or tagging information? Which tools do you use that offer either of these functions, or maybe both?


Eduard said...

First of all, interesting subject. I don't think we can draw a straight line between category and tag... those are two mixing concepts. Migrating away from a hierarchical form of organizing information to a tag-cloud form poses one issue in retrieving it... relevance. There are multiple ways of solving this, and one of them is the hierarchical way... a snake-biting-its-own-tail problem :). To reply to your question... I prefer a radial form of organization, and my favorite example is mind mapping, an extremely useful way of organizing your thoughts, for example (and since our brain is capable of storing a huge amount of data, it is pretty.. pretty.. pretty relevant :) ). Useful tools... squid sketching. PS: Unfortunately, comments don't support the target attribute.

Paul Marculescu said...

Eduard, thanks for the comment.

You're right to point out relevance, and I think this is exactly where the main issue resides. Relevance is relative and subject to change over time, as in the Dresden example from the post.

I'm a fan of the mind-mapping concept; I use it most of the time to compose the posts on this blog, for instance. I've been using a bit, but in the end paper and pen turned out to be more productive. :)

You're right, Blogger doesn't accept the target attribute in comments. Thanks for trying. :)

kanter said...

Thanks for using my tagging image
thought you'd like to see my tagging screencast and wiki

Nicusor said...

Hm… what you are saying is partially right.

But, for instance, which is easier now: creating a folder with a timestamp and a short description to hold all 500 pictures you took during an event, or going through each picture and adding labels?
Labels certainly have their advantages, but they don't cover every situation.

In my opinion, the perfect solution is far from being discovered.

Coming back to your question: I use labels for small data collections, hierarchies for most of my data, and the search function for very heterogeneous information or when I’m too lazy to go through hierarchies and labels.

Paul Marculescu said...

Nicusor, your example with pictures is very good.

Please consider also this scenario:

Let's say the event is a party on the beach.

Create a folder, as you say, with a name composed of the date of the event and a short description, like "20081127 Party on the beach", and put all the images inside.

Then label all the images with the following keywords: "event", "party", "beach", "outside", "people", "fun", "sand", "Jamaica", by saving them in the IPTC fields of the images. Some image viewers, like Picasa, do this automatically for all images, so it's just as easy as creating the folder and copying the images.

The next time you go to another event and take pictures, label them in the same way, according to the setting, let's say: "barbecue", "grass", "outside", "party", "people", "fun".

You're now all set to retrieve all the party images across your folders and, as you can see, there are even more options. The more keywords you assign to the images, the more possibilities you have for filtering.

I don't see hierarchies as an easy alternative for this kind of filtering.
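A minimal sketch of that filtering in Python, assuming the keywords have already been read out of the image metadata into a dictionary (that reading step is left out here). The file names, the barbecue folder name, and the find_photos helper are made up for illustration.

```python
# Hypothetical index: image path -> set of keywords saved with the image.
photos = {
    "20081127 Party on the beach/img_001.jpg": {
        "event", "party", "beach", "outside", "people", "fun", "sand", "Jamaica",
    },
    "20081127 Party on the beach/img_002.jpg": {
        "event", "party", "beach", "outside", "people", "fun", "sand", "Jamaica",
    },
    # Folder name below is invented; only the keywords come from the example.
    "20081205 Barbecue/img_001.jpg": {
        "barbecue", "grass", "outside", "party", "people", "fun",
    },
}


def find_photos(photos, *required):
    """Return the paths whose keywords include every required keyword."""
    wanted = set(required)
    # `wanted <= tags` is a subset test: all wanted keywords must be present.
    return sorted(path for path, tags in photos.items() if wanted <= tags)


# All party pictures, regardless of which folder they live in.
print(find_photos(photos, "party"))

# Narrow it down by combining keywords.
print(find_photos(photos, "party", "beach"))
```

The folders stay exactly as they are; the keywords just add a second way in that cuts across them.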

Nicusor said...

Dear Paul, it seems you have studied this subject of pictures and labels. Are you a photographer?

But you cannot put the same label on a picture of a palm tree and on another one of some wet bikini, even if they were taken at the same event.

What I want to highlight is that, from my perspective, neither of them, hierarchy or labeling, is THE PERFECT SOLUTION.

Paul Marculescu said...

Nicusor, I'm a camera user. :)

True, there's no perfect solution, or rather no universal one. You must find the one that works best for you. Or a combination, as we discussed.