The Type System In Freebase

The data in Freebase is a collection of information structured to make
access and collaboration easy and useful. A primary mechanism for
structuring this data is a system of topics, types, properties, and
domains. A description of these terms may be helpful in
understanding how the Freebase type system works.

  • Topic: This is an object in Freebase representing a discrete
    entity. A topic may be specific and concrete (e.g. “Bill
    Clinton” or “The City of Vancouver, Canada”) or an abstract
    concept (e.g. “The number PI”, “Zoroastrianism”). In Freebase,
    each topic is given exactly one globally unique identifier (GUID)
    which refers only to that topic. While a topic may have many names
    and may be used in many different contexts in Freebase, each topic
    should represent one and only one concept or entity in the world.
  • Type: A type is an object in Freebase that is used to semantically
    group topics. A topic associated with a type is considered to be
    an “instance” of that type. Examples of types include “Film
    Actor”, “Person”, “Airport”, and “Programming Language”. Topics
    may have multiple types, which may be added or removed over time
    (e.g. a person who is elected president in the USA may have the
    “US President” type added). Like a topic, a type represents exactly
    one meaning, even if it has multiple names and usages.
  • Property: A property can be thought of as an attribute of a topic
    in Freebase. The value of a property may be a literal (e.g. a
    name, a length, a SKU number) or a relationship to another topic
    (e.g. “is a parent of”, “is contained by”, “has sequel”). A topic
    using a property is considered to have an “instance” of that
    property.
  • Schema: Each type has a collection of zero or more properties, known
    as the schema of that type. If a topic is an instance of a type,
    this is an indication that the properties in the schema of that
    type are appropriate for application to that topic. For example,
    if the topic “Danny Hillis” is an instance of the type “Person”
    which has a schema with property “Date of Birth”, then it is
    expected that “Danny Hillis” may have an instance of the “Date of
    Birth” property. Each property is in exactly one schema, and thus
    associated with exactly one type.
  • Domain: A domain is a collection of types that are expected to be
    commonly used together or in similar ways. Examples include
    “Music” (which has the “Musical Artist” and “Album” types),
    “Medicine” (which has “Disease” and “Drug”), and “Business” (which
    has “Company” and “Retail Location”). Domains exist to make
    management of types easier. Every user is given a personal domain
    by default to help them start building their own types.
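
The relationships among these terms can be sketched as a small data
model. The following is only an illustration in Python, not Freebase's
actual storage format or API: the class names, the expected_properties
helper, and the use of a random UUID as a stand-in for a GUID are all
assumptions made for this example.

```python
from dataclasses import dataclass, field
import uuid


@dataclass(frozen=True)
class Property:
    name: str        # e.g. "Date of Birth"
    owner_type: str  # name of the one type whose schema holds this property


@dataclass
class Type:
    name: str
    schema: list = field(default_factory=list)  # zero or more Property objects

    def add_property(self, name: str) -> Property:
        # A property is created inside exactly one schema.
        prop = Property(name=name, owner_type=self.name)
        self.schema.append(prop)
        return prop


@dataclass
class Domain:
    name: str
    types: list = field(default_factory=list)  # types commonly used together


@dataclass
class Topic:
    name: str
    # Stand-in for a Freebase GUID (assumption: the real format differs).
    guid: str = field(default_factory=lambda: uuid.uuid4().hex)
    types: list = field(default_factory=list)

    def expected_properties(self) -> list:
        """Properties that the topic's types make appropriate for it."""
        return [prop for t in self.types for prop in t.schema]


person = Type("Person")
date_of_birth = person.add_property("Date of Birth")

music = Domain("Music", types=[Type("Musical Artist"), Type("Album")])

hillis = Topic("Danny Hillis", types=[person])
print(date_of_birth in hillis.expected_properties())  # True
```

Because “Danny Hillis” is an instance of “Person”, the “Date of Birth”
property shows up among the properties expected for that topic.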

An organizing principle of Freebase’s type system is to collect
around each type only those properties that are semantically specific
to that type. For example, the type “Film Actor” should not have the
property “Date of Birth”, even though all actors have a date of birth.
Rather, each actor should be typed both as “Person” and “Film Actor”.
The “Person” type has the property “Date of Birth” and “Film Actor”
has properties such as “Film performances”, which are specific to that
type. Types should not have properties that are semantic duplicates
of properties of other types.

A topic may have multiple types in Freebase, and the union of the
schemas of a topic’s types is a powerful description of how that
topic is expected to be used. For example, the topic “Arnold
Schwarzenegger” is typed “Person”, “Film Actor”, “Bodybuilder”, and
“US Politician”. Each of these types implies a set of properties
distinct from those of the others, making it easy to separate (using
MQL queries) the various aspects of Arnold’s life.
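
The idea of a union of schemas can be sketched in a few lines. This is
plain Python rather than MQL, and it is only an illustration: the
SCHEMAS table and both helper functions are hypothetical, and of the
property names only “Date of Birth” and “Film performances” come from
the text above.

```python
# Hypothetical schemas, one list of property names per type.
SCHEMAS = {
    "Person": ["Date of Birth"],
    "Film Actor": ["Film performances"],
    "Bodybuilder": ["Competitions entered"],  # hypothetical property
    "US Politician": ["Offices held"],        # hypothetical property
}


def union_of_schemas(topic_types):
    """Map every property available to a topic to the type that owns it."""
    owner = {}
    for type_name in topic_types:
        for prop in SCHEMAS[type_name]:
            # Each property belongs to exactly one schema.
            if prop in owner:
                raise ValueError(f"{prop!r} appears in more than one schema")
            owner[prop] = type_name
    return owner


def aspect(topic_types, type_name):
    """Properties describing one aspect (one type) of the topic."""
    owners = union_of_schemas(topic_types)
    return [prop for prop, t in owners.items() if t == type_name]


arnold_types = ["Person", "Film Actor", "Bodybuilder", "US Politician"]
print(sorted(union_of_schemas(arnold_types)))
print(aspect(arnold_types, "Film Actor"))  # ['Film performances']
```

Since no property is shared between schemas, selecting by type is
enough to isolate the acting career from the political one.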

There has been some comparison of Freebase’s type system to other
systems of semantic structuring, such as ontologies, folksonomies,
object-oriented systems, and tagging systems. While it shares some
characteristics with such systems, the Freebase type system does
not have:

  • Tags and Folksonomies: Tags are text labels on objects. They
    represent the idea that a specific object has a relatedness to a
    specific text string. (e.g. A photo on Flickr is tagged “Bush” so
    that searches on that text string will retrieve that photo.) The
    meaning of the relatedness (“is about”? “contains”?) and the
    definition of “Bush” (a plant? a US president?) are unspecified.
    Types in Freebase are not tags. Rather, a Freebase type on a
    topic implies a very specific semantic indication for that topic,
    independent of the name of the topic or type. The Freebase Topic
    “George W. Bush” has the type “US President”, meaning that this
    person is an instance of the group defined by the type. Because
    it has no tags, Freebase is not a folksonomy, as Web sites such
    as “Flickr” and “del.icio.us” are often described as being.
  • Classes and Hierarchies: Objects in OO systems usually have
    exactly one class. To gain the characteristics of multiple
    classes, an inheritance hierarchy is used, so that an object is
    in effect also an instance of the classes higher in the class
    tree. Types in Freebase are not OO classes and topics are not OO
    objects. There is no type hierarchy, inheritance of properties,
    or mutual exclusivity of types in Freebase. Topics may have
    multiple types directly, and users may “mix in” types as needed to
    build the collection of properties desired for a topic. For
    example, the type “Basketball Player” is not a sub-type of
    “Person”. The topic “Wilt Chamberlain” should have both of these
    types explicitly in order to use the properties of both types.
  • A Complete Ontology: In a strictly philosophical sense, Freebase
    does represent a mindset of entity organization that can be
    considered an ontology. Freebase does have entities (topics),
    types, and relationships of various flavors between topics and
    types. However, Freebase is more like a semantic workshop in
    which users work together to figure out how to use types and
    properties to build the best organization of knowledge. There is
    no intrinsic hierarchy of categories or canonical world view of
    all knowledge. Conflicting and contradictory types and properties
    may exist simultaneously in order to reflect users’ differing
    opinions and understanding. Real world data and its organization
    are almost always messy, and Freebase has been designed to
    represent that mess rather than hide or ignore it. We believe
    users will build practical organizational structures. Those that
    are useful will tend to survive, while those that are not will
    tend to be replaced.
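
The “mix in” behavior described under Classes and Hierarchies can be
sketched as follows. This is an illustrative Python model only, not
Freebase code: the SCHEMAS table, the usable_properties helper, and the
“Teams played for” property are assumptions made for the example.

```python
# Hypothetical schemas; there is no parent/child link between the types.
SCHEMAS = {
    "Person": {"Date of Birth"},
    "Basketball Player": {"Teams played for"},  # hypothetical property
}


def usable_properties(topic_types):
    """Only explicitly assigned types contribute properties; nothing is inherited."""
    props = set()
    for type_name in topic_types:
        props |= SCHEMAS[type_name]
    return props


wilt_types = {"Basketball Player"}
print("Date of Birth" in usable_properties(wilt_types))  # False

wilt_types.add("Person")  # mix in the second type explicitly
print("Date of Birth" in usable_properties(wilt_types))  # True
```

Typing “Wilt Chamberlain” as “Basketball Player” alone does not bring
along the “Person” properties; they become available only once the
“Person” type is added directly.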

4 Responses to “The Type System In Freebase”

  1. Vidar Masson Says:

    Is it possible to make the docs available for those without alpha access?

  2. Jeremy Olson Says:

    What are your thoughts on objects like “fruit”? In Freebase fruit has a type called “fruit” as well as a topic for “fruit”. They both refer to the same thing but they are different objects in the Freebase database. I am guessing fruit became a topic because there was a Wikipedia article on fruit. Is fruit an inappropriate topic? Should it only be a type?

  3. Robin Boast Says:

    This is a bit out of date, but I have only just joined Freebase, so I hope this is still of some use. Jeremy Olson’s comment is pertinent. While Freebase has developed an impressive system for typing topics, one that is more than welcome in its scope, flexibility and diversity, there remains a problem with the claims about the topics themselves. Kurt above says that a Topic “is an object in Freebase representing a discrete entity.” and gives an example of “bush”. However, a topic in Freebase is not a discrete entity, concrete or abstract. It is, in fact, an account of an entity. While George Bush, or a bush, has an identity, though they too have different identities in different contexts, a Topic in Freebase, or anywhere else for that matter, is an account of an entity and is, therefore, multiple. This seems to me to be a fundamental problem for Freebase, though not a fatal one.

  4. Robert Muetzelfeldt Says:

    In the section “Tags and Folksonomies”, Kurt says “Types in Freebase are not tags”, which is a pretty reasonable position to take (but: see below).

    However, there seems to be a much closer conceptual link between tags and Freebase Properties (or rather, the value(s) of Properties). So, an image on flickr of a painting of Edinburgh Castle might be tagged with “castle” and “Edinburgh”. A Freebase Topic on the same painting (of Type “Painting”) could well have the Property “Location” with value “Edinburgh”, and the Property “Subject” with value “castle”.

    We can argue about the details. For example, the fact that it is a topic of Type “Painting” in Freebase could well correspond to the use of a tag “painting” in flickr (which – weakly – goes against the point that Kurt made). But my point is that the closer correspondence is between tags and Freebase Properties.

    There is a rather separate point, namely that tags tend to be provided by individual users, whereas Properties are (deemed to be) universally agreed truths about the object. I’ll be addressing that in a Bugs/Features Request suggestion, that it should be possible for users to tag topics.

About

Freebase is a free database of the world's information. This is the official Freebase blog.