A Picture Worth a Thousand Tags

Quick, describe Raphael’s St. Michael and the Dragon in 12 adjectives or fewer. Would they include #knight, #religion, #greatart and #chiaroscuro? Would we be missing some of the essence of the image in boiling it down to these tags – these simple, searchable snippets?

Art is more than the sum of its parts, but online it's defined by its tags.

The digital world is a tagged world, a world coded and snipped into little boxes. Content must be deconstructed into its essential elements and coded in this way so that the algorithms that curate content for us (on Google, Facebook, etc.) can put it into the appropriate boxes. It’s most obvious on channels like Instagram, where an image might have 10 or more hashtags coding it:

Tags are used to define where and how this image can be found by users.

It’s also apparent in many other contexts, such as meta tags on webpages to improve their search engine optimization (though Google and other search engines have moved to de-emphasize them in their ongoing algorithm updates in favor of content- and link-based analysis).

But tagging is not intelligently curated, and it doesn’t speak to quality. I can tag any image or page with anything, without that necessarily implying that it’s actually related or that it’s going to be relevant. Even if my tags are accurate, what something is isn’t always what it’s about. Content, whether visual or text-based, doesn’t make sense without its context: the unspoken relationships it has with other concepts matter deeply in understanding it. English, like most languages, is highly context-dependent. If I say “spring” to you, I could mean:

  • Spring (the season)
  • to spring (the verb)
  • “Spring!” (the verb as a command)
  • a spring (a water source or an elastic object)

Without additional terms, it’s nearly impossible to know which one is meant.

Content without curated context

We have unprecedented flexibility in the ways we sort, filter and understand the world online. Yet this poses a new challenge once we come out the other side and work to understand the content. In the physical world, content tends to be placed within a certain context by its curators. To get to the Raphael paintings, you walk through galleries of his predecessors’ art, and to find a book on robotics in the library, you go to the shelf where the Dewey Decimal system places it with other robotics books.

Online, search engines, social channels and other electronic middlemen let us tag things in dozens of different ways, such as by subject, color, author, data format, production date, or the organizations or people mentioned, and then search on any of those tags. The results then show up based on that search – putting each item in a context it wasn’t necessarily intended for. Just as with “spring,” if I search for #ocean, I could find the romantic image above, or an image of storm-tossed ships on the verge of destruction, or an article on marine biology – with nothing in common other than this single aspect. Each result must stand alone and be interpreted alone.
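To make the #ocean example concrete, here is a minimal sketch of how a tag search might behave. It is not how any particular search engine or social channel works; the item names, the tags and the helper functions (`build_tag_index`, `search`, `shared_tags`) are all invented for illustration. It simply shows that items retrieved by one tag may share nothing else.

```python
from collections import defaultdict

# Hypothetical items and tags, invented purely for illustration.
ITEMS = {
    "romantic beach photo": {"ocean", "sunset", "love"},
    "storm-tossed ships painting": {"ocean", "storm", "shipwreck"},
    "marine biology article": {"ocean", "science", "fish"},
    "robotics handbook": {"robots", "engineering"},
}


def build_tag_index(items):
    """Map each tag to the set of item names that carry it."""
    index = defaultdict(set)
    for name, tags in items.items():
        for tag in tags:
            index[tag].add(name)
    return index


def search(index, tag):
    """Return the names of items carrying the tag, sorted for stable output."""
    return sorted(index.get(tag, set()))


def shared_tags(items, names):
    """Return the tags common to every named item."""
    tag_sets = [items[name] for name in names]
    return set.intersection(*tag_sets) if tag_sets else set()


index = build_tag_index(ITEMS)
results = search(index, "ocean")
print(results)                      # the three ocean-tagged items
print(shared_tags(ITEMS, results))  # the only tag they all share
```

In this toy data set, the three results for "ocean" have exactly one tag in common: the one that was searched for. Everything else about them, including the context their creators intended, is lost at the moment of retrieval.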

We’ve asked algorithms to be our curators, helping us find what we need in whatever way we’re thinking about it. This is an immense opportunity to draw new connections and find new content. Yet the challenge is that to make this possible, we must squash content down into a few tags for search, then try to re-expand it on the other side into its full richness. The more we can emphasize that richness while still making the content possible to find, the more likely it is to resonate and earn results.

Read more on implementing content tagging and the implications of auto-tagging in Tagging Content for Users and Algorithms.