Tag Hierarchies

I would like to explain an idea I have before I try to implement it. I wouldn't doubt if this idea has been done before but I can't find any Rails implementations of it.

I want to support tag hierarchies on Hey! Heads Up. Tag hierarchies will result in implicit tagging, which most people do in their heads anyway. Examples of implicit tagging will probably help:

1. Something tagged 'hockey' should also be tagged 'sport'
2. Something tagged 'MacBook' should also be tagged 'Apple'
3. Something tagged 'Rails' should also be tagged 'Ruby'
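
A minimal sketch of the three examples above, in plain Ruby rather than Rails. The `PARENT_OF` map and the `expand_tags` helper are names made up for illustration, not part of any existing plugin. The idea is to walk from each explicit tag up to the top of its hierarchy and collect every tag passed along the way.

```ruby
require "set"

# Hypothetical child => parent map built from the three examples above.
PARENT_OF = {
  "hockey"  => "sport",
  "MacBook" => "Apple",
  "Rails"   => "Ruby"
}.freeze

# Returns the explicit tags plus every implicit tag found by walking up
# the hierarchy. Tags without a parent (like 'book') are kept as they are.
def expand_tags(explicit_tags, parent_of = PARENT_OF)
  expanded = Set.new
  explicit_tags.each do |tag|
    current = tag
    # Set#add? returns nil when the tag is already present, so the walk
    # stops at a root, at a tag we have already expanded, or on a cycle.
    while current && expanded.add?(current)
      current = parent_of[current]
    end
  end
  expanded
end

p expand_tags(["hockey", "book"]).to_a
# => ["hockey", "sport", "book"]
```

The hockey book only needs to be tagged 'hockey' and 'book' by hand; 'sport' comes along for free.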

"Why would I want all of these generalized implicit tags on my stuff?", you might ask. In Hey! Heads Up you can view your item lists by tag. If I wanted to read all of the sports items I'd have to go through hockey, football, baseball, golf lists individually.

I could support AND-ing these tags together in the URL -- and I want to support that, as well as OR -- but just being able to say "give me everything tagged 'sport'" would be much cleaner.
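
To make the AND/OR idea concrete, here is a rough sketch over an in-memory list. The item titles, the `ITEMS` hash and both helper names are invented for the example, and how the URL would encode the tag list is left open. A real version would presumably query the database instead.

```ruby
# Hypothetical items, each with the tags the user typed in by hand.
ITEMS = {
  "Playoff recap"   => %w[hockey],
  "The Hockey Book" => %w[hockey book],
  "Masters preview" => %w[golf],
  "Rails Recipes"   => %w[Rails book]
}.freeze

# AND: keep items that carry every requested tag.
def tagged_with_all(items, tags)
  items.select { |_title, item_tags| (tags - item_tags).empty? }.keys
end

# OR: keep items that carry at least one requested tag.
def tagged_with_any(items, tags)
  items.select { |_title, item_tags| (tags & item_tags).any? }.keys
end

p tagged_with_all(ITEMS, %w[hockey book])
# => ["The Hockey Book"]

p tagged_with_any(ITEMS, %w[hockey football baseball golf])
# => ["Playoff recap", "The Hockey Book", "Masters preview"]
```

The OR query works, but the caller has to list every sport by name, which is exactly the chore a single 'sport' tag would remove.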

Tagging is a personal thing and it seems like everyone does it differently. That's one of the advantages of tagging over a predefined hierarchy. But a lot of people don't like to be redundant. If I tag something with 'hockey' I'm not going to explicitly tag it with 'sport' as well.

When tagging replaced hierarchies as the new way to classify/organize things in the "Web 2.0" world, we lost something. We lost the implicit nature of hierarchies because tags are a flat taxonomy. The upside is that now it's easier to classify things with tags that are orthogonal subjects -- like 'hockey' and 'book'. But the hierarchical nature of each subject that was so handy is lost, so my hockey book wouldn't be classified with 'sport' and 'book'.

How do I support a tag hierarchy taxonomy? I could make a global taxonomy for Hey! Heads Up and force implicit tags on people but that doesn't sound right.

A better solution is allowing people to create their own tag hierarchies. Then they can make the hierarchies for certain subjects as detailed as they want them to be. Every orthogonal subject would have its own tag hierarchy -- the subjects wouldn't be linked together at the top of the taxonomy.
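
One possible shape for per-user hierarchies, sketched in plain Ruby. The `TagHierarchy` class, its method names and the user names are all assumptions made for the example, not a design decision. Each user owns a separate child-to-parent map, and each orthogonal subject ends up as its own root with nothing joining the roots together.

```ruby
# Hypothetical per-user hierarchy: a forest stored as a child => parent map.
class TagHierarchy
  def initialize
    @parent_of = {}
  end

  # Declares that `child` implies `under`. Refuses links that would loop.
  def nest(child, under:)
    if child == under || ancestors(under).include?(child)
      raise ArgumentError, "nesting #{child} under #{under} would create a cycle"
    end
    @parent_of[child] = under
    self
  end

  # Ancestors of a tag, nearest first; empty for a root or an unknown tag.
  def ancestors(tag)
    chain = []
    while (tag = @parent_of[tag])
      chain << tag
    end
    chain
  end

  # Tags that are parents but have no parent themselves: one per subject.
  def roots
    (@parent_of.values - @parent_of.keys).uniq
  end
end

# Every user gets an empty hierarchy the first time they are looked up.
hierarchies = Hash.new { |store, user| store[user] = TagHierarchy.new }

hierarchies["ryan"].nest("hockey", under: "sport").nest("NHL", under: "hockey")
hierarchies["ryan"].nest("Rails", under: "Ruby")
hierarchies["patrick"].nest("hockey", under: "winter")

p hierarchies["ryan"].ancestors("NHL")       # => ["hockey", "sport"]
p hierarchies["ryan"].roots                  # => ["sport", "Ruby"]
p hierarchies["patrick"].ancestors("hockey") # => ["winter"]
```

Because nothing is shared between users, one person can go three levels deep on sports while another never sets up a hierarchy at all.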

Sure it's more work for people to set up but people who care about it will think it is worth it. People already set up tagging (labelling) for web applications like Gmail in advance. For everyone else, tag hierarchies will be optional and regular tagging will still be supported.

If Gmail had label hierarchies, I would definitely use them. But perhaps I'm in the minority.

How I implement tag hierarchies on the back-end isn't that important to this discussion but it could result in a new Rails plugin, if there isn't already one out there.

Posted at September 27, 2007 at 11:29 AM EST
Last updated September 27, 2007 at 11:29 AM EST

Have you read Clay Shirky's Ontology is Overrated? It's worth taking a look at.

I think your solution to hierarchical tagging is the best one, but do you really think it's necessary? I haven't quite decided how I feel about the idea of hierarchical tags. I see how it could be useful, but I also worry that it leads to compulsive meta-organization. Sometimes imposed simplicity is the best choice.

» Posted by: Patrick D at September 27, 2007 04:21 PM

Whoops, here's the link, since your comment form stripped it out:


» Posted by: Patrick D at September 27, 2007 04:22 PM

Yes, I suppose people could go overboard but if everyone has their own personal taxonomy then it doesn't matter, right? Who am I to say how picky other people should be? :)

» Posted by: Ryan Lowe at September 27, 2007 06:18 PM

I thought about something similar awhile back, but doing it in a centralized way.

Rather than Flickr and Delicious and every other Web 2.0 site using tags, and trying to relate them, why is there not a Web 2.0 company that offers a free, open tagging repository? When I tag something "hockey", maybe sites should automatically offer me related things to tag with. The aggregate knowledge and the relationships contained in tags become more valuable and gain traction more quickly.

The problems that I saw are two-fold.

Doing it automatically violates the nature of information. If I tag something "capri", do I mean the Ford Capri? Capri pants? The Italian island? It's unclear. So automating it seems....difficult.

Getting the community to do it assumes that tagging is a thing that people do for the social good. In other words, I tag things not only so I can find them, but so that everyone else can find them, too. Perhaps I'm a little left-leaning in assuming that people would actually do so. (But making it trivial to do would help.)

But maybe once I tag something as Capri, and it offers me a few other potentially related choices (like "car", which I select), it will automatically apply "ford" or anything else it deems appropriate given the tag clustering...

Just some thoughts...


» Posted by: Brad Mazurek at September 28, 2007 12:10 AM

I think you're reinventing categories, not tags.

"Categorizing is like taking all of your socks and putting them into drawers based on colours. Tagging is like sewing a little label on your socks that says when you bought them, how to wash them, and “if lost please return to the dude with the fat cat.” Categories add organization and tags add semantic information. A category can be a tag, but if you use your tags as categories you’ll eventually have a right old mess."


There's also a script to add hierarchies to Gmail labels:

» Posted by: engtech at September 28, 2007 03:12 AM

A single "correct" categorization is a bad thing. It implies that there is a single, correct world view. All other world views suffer from the tyranny of that categorization.

An arbitrary set of categorizations speaks of the relationships between items. It can tell me if a particular tag "capri" refers to a car, the island or a pair of pants. How does making this semantic information or context result in a "right old mess"?

If I tag a tag with a number of GUIDs that allow me to quickly refer to, assemble and dissemble an arbitrary number of hierarchical categorizations and relationships, how has this resulted in something worse?

A political-level categorization of Toronto may identify it as a city in Upper Canada, a British Colony prior to 1867 and as a city in the province of Ontario, in the country of Canada, a member of the British Commonwealth after 1867. The context is important...and tags and arbitrary categorizations can help us navigate it.

» Posted by: Brad Mazurek at September 28, 2007 08:12 AM

Brad> Recommending tags is something sites are doing well already, like delicious. I could certainly let people use tag hierarchies as well as recommend tags from a cumulative tag hierarchy and let the user decide whether to take the recommendation or not. If someone tags something with 'hockey', the implicit tag 'sport' could be recommended.

When everyone has their own tag hierarchy, they can decide where to put the 'Capri' tag.


engtech said: "...if you use your tags as categories you’ll eventually have a right old mess."

I'm not sure I agree with that statement. My Gmail labels do a great job categorizing my mail, I'd just like to put them in a hierarchy because it would be even better. Besides, since each user has their own taxonomy it's their own mess to deal with. I'm not trying to impose my site's tag hierarchy on anyone else.

But you have a point about going too far: people can go overboard with tags and that's up to them. I'm just giving them the rope -- it's not my job to prevent them from hanging themselves. Wordpress probably thinks the same thing.

From the "end user" perspective those words are so close they are almost synonyms: tags, categories, labels, folders. They all do pretty much the same thing. I'm not sure semantics and exact definitions will help the situation, people will do what they want with them. :)

Your sock analogy: you could have 'categories' for sock color, size, warmth, material. If you put the socks in real physical drawers, you'd have to pick one of those categories to organize by. If we're dealing with virtual things we can use all of the categories and decide when we list them which sorting category we want to group by.

As for the Gmail thing: it's a nice hack! But it's probably just OR'ing them together. Can I do more than 2 levels?

» Posted by: Ryan at September 28, 2007 08:28 AM

Although the tags mean something to the tagger, there is a value beyond that individual's selfish action. But it would seem to me that that value will really only arise after the number of tags has grown to a point where machines can start making inferences about the data.

If I want to start a new website, and it services a niche but only has 500 users, why should the tags and taggings from sites like delicious and flickr be locked up there? If by analyzing the tags on delicious and flickr I know that there is a common relationship between Capri and "Ford" or "Island" or "Pants" and a personal relationship between the tagger and their dog, when they tag Capri, it can make the best guess and say "do you mean your dog?".

That's when the value of a system like this really starts to have an impact, doesn't it?

But, sadly, my niche site doesn't get that value when the first person visits and tags something. Why not? Other people have added tags to the "tagosphere". I have added tags to other sites. Inferences could be made...it's just that our data is locked up. (Hmmm...okay, some OpenID stuff may be needed...)

Maybe a lot of this already exists...I have no idea...I'm not as up on things as I should be...

» Posted by: Brad Mazurek at September 28, 2007 08:56 AM

You touched on at least two issues:

1. English is vague. There are lots of homonyms, like "Capri". The only way to tell them apart is by context, and trying to automate that in any way is a really really hard problem I'm not very interested in solving.

My solution is to let individuals manage their taxonomies and use words in whatever context *they* choose. This doesn't mix well with other people's taxonomies though because the contexts are different, which leads to number...

2. Combining taxonomies automatically. I'm not sure people need a lot of recommendations or a lot of choice. Sometimes that can be annoying, you know?

I have my own way of organizing my stuff, why do I care how others do it unless it's going to save me time somehow? I'm not convinced there's a benefit to sharing a lot of taxonomy data between users. Each user has a different agenda and different subject domains they are interested in.


Bradley, you used an interesting word: "tagosphere". I don't see tags like that at all. I see them as highly personal, not selfish. Everyone has their own tagging system in their head -- and they use it to organize their stuff with tags online. A combined "tagosphere" would be a huge mix of subject domains and context -- a huge mess.

» Posted by: Ryan at September 28, 2007 09:06 AM

I'm not sure I understand the distinction being made here between "categorization" and "tagging." As far as I know, in the PIM research literature at least, tagging is considered to be a way of categorizing (also known as classifying, the two terms are used interchangeably).

Ryan said: "it's not my job to prevent them from hanging themselves."

First, I think that one of the great things about really well designed software is that it *does* prevent you from hanging yourself. Think of Perl and "there's more than one way to do it", vs. the Python philosophy. Precisely the reason a lot of people don't like Perl is because it's so easy to write completely unmaintainable code.

But even if you disagree with that philosophy, another thing to consider is that no feature comes for free. You're not just giving the users an extra bit of functionality, you're also making your application more complicated -- both for you to maintain, and for people to use.

» Posted by: Patrick D at September 28, 2007 11:39 AM

This was the last paragraph of my previous post, and for some reason it was denied for containing "questionable content" until I removed this paragraph:

Interesting discussion though. You could also look at what delicious did with hierarchies. I'm not sure exactly how it works, but I know I've seen some hierarchical-looking tags there.

» Posted by: Patrick D at September 28, 2007 11:41 AM

Patrick> good points. It is true that complicated features often lead to complicated software to maintain.

But "preventing the user from hanging themselves" in this case might result in more complicated code for me and more complicated usability for people. Too much hand holding in the usability and not enough flexibility. It's about balancing all of those things.

I see tagging as being as complicated as people want it to be. Done liberally, tagging can be really simple and useful. As a developer I can only give the people freedom to tag things -- I can't prevent people from going nuts with it.

» Posted by: Ryan at September 28, 2007 02:51 PM