Metainformation I: Information about information

Later in January I will attend a conference with the fine name Global Leaders 2010 in Singapore. The topic is metainformation, and the organizer, Viktor Mayer-Schönberger, has already signaled that he believes this question will be one of the most interesting of the 2010s. I will lead a workshop and a few plenary sessions and so on, but above all I look forward to the discussions. As preparation I have collected some thoughts and notes on metainformation, which I am also publishing here. They are in English.

Metainformation is information about information. There is probably no single point where metainformation suddenly emerges, but we could imagine that information sets of a certain diversity and size automatically create a demand for metainformation. One example would be a library. With a few books we can find what we need quickly, but as the collection grows it soon becomes impossible to locate books without a systematic description of where the information is. In fact, we could argue, this is the point at which the book collection becomes a library. This point, the library point, will differ between kinds of information sets, but it will always be marked by the same qualities: the emergence of metainformation and attempts to structure the information in the set to reduce search costs.

Now, when we construct metainformation we can do it in different ways. Constructing metainformation sets is a difficult task, since it seems to require that we envision possible uses of the information sets, possible searches that we may want to perform on the material. Metainformation, in a very real sense, is what enables searches in vast information sets. Without an “index” and relevance indicators, search engines would not know how to handle a search query. Or would they? Let’s examine that idea more closely. What is the relationship between search and metainformation? One observation is that the space of possible searches is determined by the metainformation available. When you enter a library you will be able to perform searches according to the metainformation available to you. Imagine two searches: the first is “all relevant books on Picasso” and the second is “all books set in Garamond type”. The first is answerable by the index. The second is not. Metainformation thus delineates the knowable in a very real sense. Wittgenstein wrote that the boundaries of my language are the boundaries of my world (misquoted and misconstrued here, but to illustrate a point). Well, the boundaries of our searching capacity are set by the boundaries implied in metainformation.
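The library example can be made concrete with a small sketch. It is only an illustration, under the assumption that a catalogue is nothing more than a mapping from recorded metadata fields to books; the `Catalogue` class, its methods and the book titles are all hypothetical names invented here.

```python
from dataclasses import dataclass, field


@dataclass
class Catalogue:
    """Toy catalogue: maps a metadata field to {value: set of titles}."""
    index: dict = field(default_factory=dict)

    def add(self, title, **metadata):
        # Record each metadata field/value pair for this title.
        for key, value in metadata.items():
            self.index.setdefault(key, {}).setdefault(value, set()).add(title)

    def search(self, key, value):
        # A query is only askable if the field was ever catalogued.
        if key not in self.index:
            raise KeyError(f"no metainformation recorded for {key!r}")
        return self.index[key].get(value, set())


# Hypothetical holdings, catalogued by subject only.
catalogue = Catalogue()
catalogue.add("Book A", subject="Picasso")
catalogue.add("Book B", subject="Picasso")
catalogue.add("Book C", subject="Matisse")

# "All books on Picasso" is answerable by the index.
picasso_books = catalogue.search("subject", "Picasso")

# "All books set in Garamond" is not: the field was never recorded.
try:
    catalogue.search("typeface", "Garamond")
    garamond_answerable = True
except KeyError:
    garamond_answerable = False
```

In this toy model the books themselves have not changed between the two queries; only the recorded fields decide which question can be put. Recording a `typeface` field for the same books would make the second query answerable too.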

What, then, does this mean? I think one answer is that the ways of producing metainformation (in this narrow sense) are ways of producing the boundaries of search space. Innovating in the production of metainformation expands the set of possible questions we can ask. It also implies a power relationship. He who controls the metainformation controls the search.

It is worthwhile here to make a slight detour into thinking about different ways of creating power out of information. The first point I think we need to make is that we have moved away from an economy in which owning information was a sustainable way of creating value. It is still possible, where asymmetries of information are stable or can at least be monetized very quickly, to generate value by owning information and transferring ownership of it to other parties. But another form of value creation has become much more interesting, and that is the organization of information. It is well known that Google’s vision is NOT to “own the world’s information and make it universally accessible and useful”. It is to organize the world’s information. But exactly how is value produced when we organize information? I think that it may be here that we should begin to examine the notion that if value is not produced by owning information, it may be produced by owning metainformation. I am not sure this is true, as there is another possibility as well: that what we see is not value accruing to metainformation as such, so much as to the means of producing metainformation.

If this is the case it lends itself to a Marxian analysis. Marx argued that capitalism would collapse under its own victories. We would see capitalism make the means of production available to all, even to the workers, by constantly lowering the cost of the means of production. This is, essentially, what has happened to the content industry. In many cases, though not all (there is an argument pertaining to quality and investments that I will not address here, but which I think is problematic at best), this means that films, photos, music and other forms of content can easily be produced by users. This phenomenon – user-created culture – has shifted power away from the previous owners of the means of producing content or information to users. This is a well-known analysis presented by Yochai Benkler, Lawrence Lessig and others. But what has happened at the same time is that as the information avalanche grows, we see the value shifting to another set of means of production: the production of metainformation.

The argument that we are beginning to shape here is this: in an age where anyone can make a pop song or take photos, producing metainformation is still an ability whose costs prohibit the emergence of wide-scale user-created metainformation.

But is this true? Look at folksonomies and tagging as phenomena. Do they not seem to indicate the opposite? If we argue from search engines, then, yes, it seems trivial that not everyone can produce their own index and relevance algorithms on a global scale (here is an interesting question: what will happen when the prices of indexing and relevance structuring become so low as to allow users to create metainformation on a global level?). But that is only saying that not everyone with a digital film camera can produce Lord of the Rings, right?

Well, yes and no. I think that there are reasons to think about the means of producing metainformation as a new and unevenly distributed source of value. The anatomy of this new source of value may offer interesting fields of exploration, not least when we think about how metainformation is best produced. We seem to need two things: relevance-producing mechanisms (or algorithms) and vast data sets to test them on. Now, one crucial question here seems to be whether producing high-quality metainformation (in a sense, producing relevance) is positively correlated with the size of the information sets available to the party producing the metainformation. If this is the case – if those with larger data sets produce better metainformation – it would seem that there is still value in owning information. At least in owning vast information sets.

The larger the information sets I command, in that case, the better the metainformation I can produce. Think about the library. Which library would you predict has the better catalogue? A large library or a small library? Here is a thought: could it be that once we pass the library point where metainformation emerges, the quality of the metainformation grows with the set of information it is produced from? Is the quality of metainformation a function of the size of the set it is generated from? Some studies – specifically of translation technology – would imply that this is indeed the case.

Indeed, beyond the library point, where metainformation emerges, we may imagine another point, the general search point, where the marginal utility of adding information to the set for the value of the metainformation being produced explodes. In such a scenario we should expect the producers of metainformation to try to access all kinds of information and accelerate this process as their information sets grow. Of course, we need to qualify this scenario, by thinking about how different kinds of information add value. One thing we can see is that personal information seems to be rich with metainformation values. The value of work shifts to the value of attention, and the value of the traces of our attention in vast information sets, used to structure metainformation, may very well be the main source of competitive advantage in the metainformation markets.

We are all, in a very real sense, librarians engaged in structuring the world.

  3. I have had “Computers are useless. They can only give you answers. [Picasso]” as my tagline since the early 90s. It is deeper than I had thought before, I realize, when I read your reflections on metainformation. The one who gets the most questions is the one who understands how, and in what ways, it is most relevant to extend the indexing/searchability. Right? You do not write this explicitly. It is the (for the moment) unanswerable questions that are the most important, the ones that point to how searchability should, can and will be extended. peace / peter

  4. Peter; exactly: what you cannot ask for gives an insight into structures that are missing. Perhaps the important thing is not what can be expressed, but what can be wondered.

