Tag clouds – have we got them wrong?

Tag clouds are possibly the most hyped thing after AJAX and RoR, and so many sites use them. But have they done it the wrong way around, literally? What I’ve noticed, along with a few others, is that the most popular tags are the most generic, covering the widest range of topics.

This often means that the emphasised tags push the less popular, but more relevant, tags out of the cloud entirely.

Take this for example: perhaps I’m having trouble with JavaScript strings, and while searching for a solution, the first tag available is JavaScript, which also covers several hundred other topics, with strings buried somewhere among them.

There’s also a considerable problem with junk tags in the cloud. Every so often you’ll find that someone, instead of typing JavaScript, has accidentally typed LavaScript (the keys are close to each other). Sometimes they don’t even get that far and call it thing or stuff instead.

What can be done?

These issues could be prevented for the most part, if:

  • instead of only ever displaying the most popular tags, users could drill down through the tags, so first you would choose JavaScript followed by strings or one of the other tags related to JavaScript
  • instead of allowing just about anyone to create a new tag, you could require a user to reach a set level of posts, comments or even views
  • a function like PHP’s levenshtein() were used to catch similar tags, and amalgamate them when 99% similar, or just offer a list of possible corrections
  • the system kept a list of unwanted tags, perhaps generated automatically from the tags moderators have removed
  • characters other than letters, numbers and spaces were removed (and the rest lowercased), so instead of a tag like JavaScript, strings! you’d end up with javascript strings.
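
As a rough sketch of the last three ideas, here is how the clean-up might look in Python (rather than PHP). Everything named here is my own assumption for illustration: the BLOCKED_TAGS list, the normalise() and suggest() helpers, and the choice of a fixed edit distance of 1 rather than a percentage similarity. The levenshtein() function is a plain textbook implementation, not PHP’s.

```python
import re

# Hypothetical list of unwanted tags, e.g. ones moderators have removed before.
BLOCKED_TAGS = {"thing", "stuff"}


def normalise(tag: str) -> str:
    """Lowercase the tag and keep only ASCII letters, digits and single spaces."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", tag.lower())
    return " ".join(cleaned.split())


def levenshtein(a: str, b: str) -> int:
    """Edit distance: insertions, deletions and substitutions each cost 1."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # substitute
            ))
        previous = current
    return previous[-1]


def suggest(tag: str, known_tags: list[str], max_distance: int = 1) -> list[str]:
    """Return existing tags within max_distance edits of the normalised tag."""
    tag = normalise(tag)
    if not tag or tag in BLOCKED_TAGS:
        return []  # junk or empty tags get no suggestions
    return [known for known in known_tags
            if levenshtein(tag, known) <= max_distance]


print(normalise("JavaScript, strings!"))
print(suggest("LavaScript", ["javascript", "java", "strings"]))
```

With this sketch, LavaScript is one edit away from javascript, so the existing tag can be offered as a correction instead of creating a new one. Whether to merge automatically or only suggest is a judgement call, as a distance of 1 will also match short tags that are genuinely different.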

I’m not saying all, or any, of these solutions are right, but they might be useful in finding a real solution.
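
To make the drill-down idea from the list a little more concrete, here is a minimal Python sketch. It assumes each post is stored as a set of tags; the POSTS sample data and the related_tags() helper are made up for illustration, and a real site would pull this from its database.

```python
from collections import Counter

# Hypothetical sample data: each post is represented by its set of tags.
POSTS = [
    {"javascript", "strings"},
    {"javascript", "strings", "regex"},
    {"javascript", "ajax"},
    {"php", "strings"},
]


def related_tags(posts, selected):
    """Count the tags that appear alongside every tag in `selected`."""
    selected = set(selected)
    counts = Counter()
    for tags in posts:
        if selected <= tags:                # post carries all selected tags
            counts.update(tags - selected)  # count only the remaining tags
    return counts.most_common()             # most frequent first


# First level: the user picks "javascript" from the cloud.
print(related_tags(POSTS, ["javascript"]))
# Second level: they narrow it down with "strings".
print(related_tags(POSTS, ["javascript", "strings"]))
```

For the sample data, choosing javascript surfaces strings (on two posts) ahead of regex and ajax (one post each), so the more specific tags only appear once the generic one has been chosen.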

So what do you make of all this? Are there more ways we can improve the humble tag cloud? Have you seen any sites that implement these ideas?

Statistics

This journal entry was written on 1 July 2009, and entombed beneath Design and Interface.

  • three hundred and fifty words
  • two links
