Tagging Content for Users and Algorithms

Algorithms and tools are groping around in the dark, with only the limited tool of tagging to help them figure out what a piece of content is. Let’s say I want to share a blog post on Facebook. I drop in my link, and the page handily populates with information on the post:

Facebook uses meta-tags to know which information to pull in.

All this draws from the page’s metadata, specifically the Open Graph meta tags that Facebook reads to decide what the headline, intro description and image should be. If you’re expecting others to share your content, setting up the metadata to feed them the right information will be key – so your copy is the right length and the right image gets pulled in and linked.
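
To make that concrete, here is a minimal sketch of how a sharing tool might read Open Graph tags out of a page. It uses only Python's standard library. The sample page, the example.com image URL and the names `OpenGraphParser` and `extract_open_graph` are made up for illustration; this is not how Facebook itself does it.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property") or ""
        content = attr_map.get("content")
        if prop.startswith("og:") and content is not None:
            # Keep the first value seen for each property.
            self.tags.setdefault(prop, content)


def extract_open_graph(html):
    """Return a dict of the Open Graph properties found in the HTML."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.tags


# Hypothetical page head, invented for this example.
SAMPLE_PAGE = """
<html><head>
<title>Tagging Content for Users and Algorithms</title>
<meta property="og:title" content="Tagging Content for Users and Algorithms" />
<meta property="og:description" content="How tags help people and tools find your content." />
<meta property="og:image" content="https://example.com/images/tagging.jpg" />
</head><body></body></html>
"""

preview = extract_open_graph(SAMPLE_PAGE)
for name, value in preview.items():
    print(name, "=", value)
```

If a page leaves out one of these tags, a tool like this has nothing to show for that field, which is why it pays to set the title, description and image on purpose.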

Creating Your Tags

When you’re thinking about creating tags, consider which types are most appropriate:

  • Descriptive – Terms like #ocean or #beach that say something about what’s in the image, or meta tags that describe the content on the page.
  • Image type (for images only) – Qualities of the picture itself: a close-up, a landscape, a soft-focus image.
  • Contextual – Terms tied to the conversation you’d like to join, so that your content becomes part of that discussion.
  • Conversational – When the tag becomes the conversation. This most commonly happens on Twitter, where hashtags such as the joking apology of #sorrynotsorry are more message than meta.

The Future of Tagging

Search engines like Google have moved away from keyword tagging and towards automatically analyzing the text and structure of a webpage in order to draw conclusions. Similarly, as image processing gets more advanced, algorithms are able to parse out some of the details of what’s in an image.

As an example, Shutterstock recently launched an auto-tagging tool for its mobile image uploading. The tool looks through the existing metadata and tags in the current library of images, maps them against the content of the image, and provides a range of suggestions. For a nature photo, these might include “flower,” “nature,” “beautiful,” “red,” “closeup” and more. As these tools become more prominent, we should expect:

  • Clustering – We’re likely to see more of what already exists. If people already know to search for #destinationwedding in order to find content related to weddings, we’ll see more and more uses of that tag by people who want to show up in that context, and the tools will continue to recommend it.
  • Tags substituting for descriptions – Descriptions are challenging to write, since they need to encompass all that a piece of content contains. Tags are easy: each one captures a single facet of that content, they’re automatically recommended, and they automatically feed search engines. Expect to see the continued growth of numerous tags over lengthy descriptions of content.
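
A rough sketch shows why recommendation tools tend to reinforce the tags that already exist. The real tools analyze the image itself; this example swaps in something much simpler, counting which tags appear together in an existing library. The `LIBRARY` data and the `suggest_tags` function are hypothetical.

```python
from collections import Counter

# Hypothetical library: each entry is the set of tags on one existing image.
LIBRARY = [
    {"flower", "nature", "red", "closeup"},
    {"flower", "nature", "beautiful"},
    {"flower", "garden", "nature"},
    {"beach", "ocean", "nature"},
    {"ocean", "storm", "ship"},
]


def suggest_tags(seed_tags, library, limit=3):
    """Suggest the tags that most often appear alongside the seed tags."""
    seeds = set(seed_tags)
    counts = Counter()
    for tags in library:
        if seeds & tags:
            # Count every other tag on images that share a seed tag.
            counts.update(tags - seeds)
    # Sort by count (descending), then alphabetically for stable output.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:limit]]


suggestions = suggest_tags({"flower"}, LIBRARY)
print(suggestions)
```

Because the suggestions come from what is already in the library, every accepted suggestion makes that tag a little more likely to be recommended next time. That feedback loop is the clustering effect described above.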

Where do you see the future of content tagging?

A Picture Worth a Thousand Tags

Quick, describe Raphael’s St. Michael and the Dragon in 12 adjectives or fewer. Would they include #knight, #religion, #greatart and #chiaroscuro? Would we be missing some of the essence of the image in boiling it down to these tags – these simple, searchable snippets?

Art is more than the sum of its parts, but online it's defined by its tags.

The digital world is a tagged world, a world coded and snipped into little boxes. Content must be deconstructed into its essential elements and coded in this way so that the algorithms that curate content for us (Google, Facebook, etc.) can put it into the appropriate boxes. It’s most obvious on channels like Instagram, where an image might have 10 or more hashtags coding it:

Tags are used to define where and how this image can be found by users.

It’s also apparent in many other contexts, such as meta tags on webpages to improve their search engine optimization (though Google and other search engines have moved to de-emphasize them in their ongoing algorithm updates in favor of content- and link-based analysis).

But it’s not intelligently curated, and it doesn’t speak to quality. I can tag any image or page anything, without that necessarily implying that it’s actually related or that it’s going to be relevant. Even if my tags are accurate, what something is isn’t always what it’s about. Content, whether visual or text-based, doesn’t make sense without its context; the unspoken relationships it has with other concepts matter deeply in understanding it. English, as with most languages, is very context-oriented. If I say “spring” to you, I could mean:

  • Spring (the season)
  • to spring (the verb)
  • “Spring!” (the verb as a command)
  • a spring (a water source or an elastic object)

Without additional terms, it’s nearly impossible to know which one I mean.

Content without curated context

We have unprecedented flexibility in the ways we sort, filter and understand the world online. Yet this poses a new challenge once we come out the other side and work to understand the content. In the physical world, content tends to be placed within a certain context by its curators. To get to the Raphael paintings, you walk through galleries of his predecessors’ art, and when you look for a book on robotics in the library, the Dewey Decimal system has already placed it with other robotics books.

Online search engines, social channels and other electronic middlemen let us tag things in dozens of different ways and then search based on them, such as subject, color, author, data format, production date, organizations or people mentioned, or more. The results then show up based on that search – putting each item in a context it wasn’t necessarily intended for. Just as with “spring,” if I search for #ocean, I could find the romantic image above, or an image of storm-tossed ships on the verge of destruction, or an article on marine biology – with nothing in common other than this single aspect. Each result must stand alone and be interpreted alone.
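
A small sketch of a tag index illustrates the point. The items, titles and tags here are invented, and `build_index` and `search` are hypothetical helpers rather than any platform's actual search.

```python
from collections import defaultdict

# Hypothetical items; titles and tags are made up for illustration.
ITEMS = [
    {"title": "Sunset walk on the shore", "tags": {"ocean", "beach", "romance"}},
    {"title": "Ships in a storm", "tags": {"ocean", "storm", "painting"}},
    {"title": "Intro to marine biology", "tags": {"ocean", "science", "article"}},
]


def build_index(items):
    """Map each tag to the positions of the items that carry it."""
    index = defaultdict(set)
    for position, item in enumerate(items):
        for tag in item["tags"]:
            index[tag].add(position)
    return index


def search(index, items, *tags):
    """Return the titles of items carrying every requested tag."""
    if not tags:
        return []
    matches = set.intersection(*(index.get(tag, set()) for tag in tags))
    return [items[i]["title"] for i in sorted(matches)]


index = build_index(ITEMS)
broad = search(index, ITEMS, "ocean")            # all three items match
narrow = search(index, ITEMS, "ocean", "storm")  # only the storm painting
print(broad)
print(narrow)
```

A single tag pulls together items that share nothing else. Adding a second tag narrows the results, but the index still knows nothing about what each item is actually about.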

We’ve asked algorithms to be our curators, helping us find what we need in whatever way we’re thinking about it. This is an immense opportunity to draw new connections and find new content. Yet the challenge is that to make this possible, we must squash down content into a few tags for search, then try to re-expand it on the other side into its full richness. The more we can emphasize that richness while still making it possible to find, the more likely our content is to resonate and earn results.

Read more on implementing content tagging and the implications of auto-tagging in Tagging Content for Users and Algorithms.

Product Previews: Managing URLs

Watch out for when convenience and search engine optimization (SEO) best practices collide with information control. When releasing a series of product previews, your audience will often be wise to the patterns you use in releasing preview content – and they’ll take advantage of that to get additional glimpses of your news ahead of when you want to release it.

For SEO purposes, we know that it’s highly recommended to use URLs that reflect your content. So if you’re previewing new rules for the Elves faction in your wargame, you might have a URL along the lines of “rootdomain.com/wargame/elves-preview.” Then you might have “/dwarves-preview,” followed by “/humans-preview” (let’s leave the other SEO considerations aside for the moment).

Your path to launch may seem clear, but keep an eye out for these product preview pitfalls.

We also know that it can often be efficient and convenient to create and stage a lot of content at once. For example, if you’re writing three preview articles to be released daily, you might write something at “rootdomain.com/insider/5-20-2016,” then “rootdomain.com/insider/5-21-2016,” etc. Or you might upload a lot of images, with image names like “articlename/preview1.jpg” then “/preview2.jpg” and so on – just to have them ready and convenient.

In both situations, once people see one or two items in a pattern, they’re pretty creative about figuring out the rest. For example, Privateer Press, maker of the miniatures game Warmachine, has been rolling out short fiction about its armies’ leaders throughout spring 2016, and asked people to sign up for an email sent right when each story comes out, so that they could be the first to read these product previews. Yet even before most emails were out, people had figured out the URL for the next story and accessed it. They knew the pattern and followed it:

Forum users easily found the link to Privateer Press's short fiction around its new product preview almost as soon as the company took it live.

Similarly, if you’ve got tomorrow’s newsletter already uploaded and just haven’t shared the link yet, people can probably extrapolate from yesterday’s URL.

So, lots of explanation to say simply that if you’re hosting it on your site in a predictably named way, fans can track it down.
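
It takes very little effort to do this. The sketch below extrapolates the date-stamped URLs from the example above, starting from a single known link. The `guess_next_urls` function is hypothetical, and it assumes the month-day-year pattern shown earlier.

```python
from datetime import date, timedelta


def guess_next_urls(known_url, days=2):
    """Extrapolate date-stamped URLs from one known example.

    Assumes the URL ends in a month-day-year stamp such as "5-20-2016".
    """
    base, _, stamp = known_url.rpartition("/")
    month, day, year = (int(part) for part in stamp.split("-"))
    start = date(year, month, day)
    guesses = []
    for offset in range(1, days + 1):
        upcoming = start + timedelta(days=offset)
        guesses.append(f"{base}/{upcoming.month}-{upcoming.day}-{upcoming.year}")
    return guesses


guesses = guess_next_urls("rootdomain.com/insider/5-20-2016")
print(guesses)
```

Anyone who has seen one link can produce the next few candidates and try them in a browser. The same goes for numbered image names.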

Simple solutions:

  • Don’t stop optimizing your product URLs for SEO – but wait to take pages or posts live until the actual time when you’re ready for that information to get out there.
  • Be thoughtful when bulk uploading images or other visual assets that you’re not ready to reveal – adjust their URLs to be less predictable.
  • If something is intended to be an email exclusive (at least initially), consider requiring users to enter a simple code included in the email so that people testing URLs can’t find it in advance.
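
As a rough sketch, the three ideas above might look like this in Python. The function names, the sample dates and the "ELVES-2016" code are all hypothetical, and a real site would wire these checks into whatever publishing system it already uses.

```python
import hmac
import secrets
from datetime import datetime, timezone


def unpredictable_asset_name(extension="jpg"):
    """Return a random file name instead of preview1.jpg, preview2.jpg, ..."""
    return f"{secrets.token_urlsafe(8)}.{extension}"


def is_live(publish_at, now=None):
    """A page should only be served once its scheduled time has arrived."""
    now = now or datetime.now(timezone.utc)
    return now >= publish_at


def code_matches(submitted, expected):
    """Check an emailed access code using a constant-time comparison."""
    return hmac.compare_digest(submitted.encode(), expected.encode())


# Hypothetical schedule: the preview goes live on May 21, 2016 at 09:00 UTC.
publish_at = datetime(2016, 5, 21, 9, 0, tzinfo=timezone.utc)
early = is_live(publish_at, now=datetime(2016, 5, 20, 12, 0, tzinfo=timezone.utc))

print(unpredictable_asset_name())               # random name, differs on every run
print(early)                                    # False: a day too soon
print(code_matches("ELVES-2016", "ELVES-2016"))
```

The page URL itself can stay readable for SEO. Only the timing, the asset names and the email gate need to be hard to guess.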

Image from USFWS, available at https://commons.wikimedia.org/wiki/File:Wildlife_Viewing_(9160100369).jpg under the CC-BY-2.0 license.