First Patterns: Data Modeling Idioms

In most design endeavors, standard ways of doing things eventually emerge. Once these standard approaches are reused and codified, they are called Design Patterns, and together they form a language, or shorthand, that can be used to express a design to others.

You will discover many such patterns when constructing data models in Freebase. One of them, the Compound Value Type (or CVT) pattern, is so pervasive that the UI provides special support for it.

A type that participates in the CVT pattern has properties that work together to express one “value.” For example, it is often important to identify when a certain measurement or count was performed. For this reason, a property such as “number of undergraduates” on the “College/University” type uses “Dated Integer” as its Expected Type. The “Dated Integer” type employs the CVT pattern, providing two properties: one for the date of the count and one for the actual tally of students. These two properties should be interpreted together when examining the “number of undergraduates” property.

Image 1: The Dated Integer CVT
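
As a rough sketch of the idea (not Freebase's actual storage format), the Dated Integer CVT can be pictured as a small record whose two fields only mean something together. The class and field names below (DatedInteger, University, number_of_undergraduates, as_of) are hypothetical Python stand-ins for the types and properties described above, and the sample figures are invented for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass(frozen=True)
class DatedInteger:
    """Hypothetical stand-in for the Dated Integer CVT: a tally plus the date it was taken."""
    number: int
    as_of: date


@dataclass
class University:
    """Hypothetical stand-in for the College/University type."""
    name: str
    # The Expected Type of this property is the CVT, not a bare integer.
    number_of_undergraduates: List[DatedInteger] = field(default_factory=list)


def latest_count(university: University) -> DatedInteger:
    """Return the most recently dated tally; the date and the number are read together."""
    return max(university.number_of_undergraduates, key=lambda c: c.as_of)


# Invented sample data, for illustration only.
example = University("Example University")
example.number_of_undergraduates.append(DatedInteger(5200, date(2005, 9, 1)))
example.number_of_undergraduates.append(DatedInteger(5450, date(2006, 9, 1)))
print(latest_count(example).number)
```

Note that the tally never travels without its date: asking for "the number of undergraduates" only makes sense once you know which count you mean.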

An instance of a CVT is just an object that binds multiple properties together. The object doesn’t really identify a “Topic” (something worthy of discussion); rather, it models the relationship between two or more properties that have a combined meaning when describing a property on another Topic. For this reason, instances of CVTs are not given display names (what would you name one?), nor are they typed as /common/topic. (Typing something as /common/topic bestows a certain level of importance on the object. In fact, the search system and auto-complete service ignore objects unless they are typed /common/topic.)

When the UI displays a property whose Expected Type is identified as a CVT, it expands the CVT, showing the values of the CVT object’s properties instead of the object’s name. (Remember, CVTs don’t have names, so the only identifier the UI could display would be the guid of the CVT object, which wouldn’t be very useful.) The UI determines which properties of the CVT to display by looking at the CVT’s definition: any properties on the CVT that have been marked as “disambiguators” appear whenever the UI encounters a property whose Expected Type is that CVT.
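
The display rule above might be sketched roughly as follows. This is an illustration of the described behavior, not the Freebase UI's code: the function name display_value, the dict layout, and the placeholder guid are all assumptions made for the example.

```python
from typing import Dict, List, Optional


def display_value(obj: Dict[str, object], disambiguators: Optional[List[str]] = None) -> str:
    """Render an object roughly the way the text describes the UI doing it.

    obj is a hypothetical dict with optional "name" and "guid" keys plus property values.
    disambiguators lists the CVT properties flagged for display; None means the
    object's type is not a CVT.
    """
    if disambiguators is None:
        # Ordinary topic: show its display name, falling back to the guid.
        return str(obj.get("name") or obj["guid"])
    # CVT: expand it into the values of its disambiguating properties.
    shown = [str(obj[p]) for p in disambiguators if p in obj]
    return ", ".join(shown) if shown else str(obj["guid"])


# Invented sample data; the guid is a placeholder, not a real Freebase id.
count = {"guid": "#0000placeholder", "number": 5450, "date": "2006-09"}
print(display_value(count, disambiguators=["number", "date"]))
```

The fallback to the guid in the last line of the function is the unhelpful case the paragraph warns about: a CVT with no disambiguators has nothing readable to show.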

Mediators as Compound Value Types

In the “number of undergraduates” example, the CVT contained both a date and an integer property. Dates and integers are “value” types, or primitive types, in the Freebase system. These types do not require any additional connections to other objects to interpret their contribution. But Compound Value Types need not model only terminal leaf nodes. It is also possible to use CVTs to annotate the relationship between two objects, which means that at least one of the properties on the CVT is itself a link to another object. That is, a CVT may contain properties whose Expected Types are other objects (non-value types).

Take, for instance, the Film Performance type (/film/performance), which links actors to the films they have performed in. The Film Performance type is also a CVT. Performances are typically not named relationships, but Film Performance objects allow you to annotate an actor’s performance with information about the role (or character) the actor played in the movie, along with other information about that actor’s job (for instance, you might include information about contract or salary).

Image 2: The Performance Mediator.
Note: links in Freebase are bi-directional and can be referenced through different names at each end.

In this sense, the Performance CVT acts as a “mediator” between two objects, and the properties on the CVT must be interpreted together to make sense of the properties that connect to it.

Again, the UI provides the ability to view properties of the CVT/Mediator from the object that references the CVT. Thus, because the “film” property on the Performance CVT is marked as a disambiguator, the “value” (in this case the name of the film connected to the film property) is visible when examining the Actor object.
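
To make the mediator idea concrete, here is a minimal sketch in which a Performance object sits between an actor and a film and is reachable from both ends. The class names (Topic, Performance), the helper functions, and the property name "character" are hypothetical choices for this example; they illustrate the pattern rather than reproduce the Freebase schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Topic:
    """Hypothetical stand-in for a named object typed /common/topic."""
    name: str
    # Excluded from repr and comparison to avoid recursing through the links.
    performances: List["Performance"] = field(
        default_factory=list, repr=False, compare=False
    )


@dataclass
class Performance:
    """Hypothetical stand-in for the /film/performance mediator; it has no name of its own."""
    actor: Topic
    film: Topic
    character: Optional[str] = None


def link(actor: Topic, film: Topic, character: Optional[str] = None) -> Performance:
    """Create the mediator and attach it to both ends, mimicking a bi-directional link."""
    performance = Performance(actor, film, character)
    actor.performances.append(performance)
    film.performances.append(performance)
    return performance


def films_of(actor: Topic) -> List[str]:
    """Viewed from the actor, each mediator is shown by its film's name."""
    return [p.film.name for p in actor.performances]


def cast_of(film: Topic) -> List[str]:
    """Viewed from the film, each mediator is shown by its actor's name."""
    return [p.actor.name for p in film.performances]


blade_runner = Topic("Blade Runner")
harrison_ford = Topic("Harrison Ford")
link(harrison_ford, blade_runner, character="Rick Deckard")
print(films_of(harrison_ford))
print(cast_of(blade_runner))
```

The two helper functions play the part of the disambiguators: from either side, the unnamed mediator is represented by the name of the object at its far end.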

You will discover other data modeling patterns as you browse through the Freebase schemas. As more of these patterns become standardized through practice, applications (including the Freebase UI) will make use of them to improve the end-user experience and the ability to interpret the data stored in the model.

Appendix: Examining Compound Value Types through the User Interface

To examine the definition of a Type, start with an instance of the Type, click the menu arrow to the left of the type listed in the “Types” property, and select “Edit Type.”

Image 1: Navigating to the Schema Editor from the Blade Runner instance
http://www.freebase.com/view?id=%239202a8c04000641f8000000000009e89

Image 2: The Film Schema indicates that the Starring Property has an Expected Type of Performance
http://www.freebase.com/view/schema?id=%2Ffilm%2Ffilm

Clicking the Performance label takes you to the definition of the Performance Type, which is marked as a Compound Value Type via the check box at the top of the page.

Image 3: The Performance Type is marked as a Compound Value Type
http://www.freebase.com/view/schema?id=%2Ffilm%2Fperformance

Opening the Actor Property (via the twist arrow on the left) reveals that this property is identified as a Disambiguator. This indicates to the UI that it should display the value (or in this case, the name of the connecting object) when displaying properties which use Performance as an Expected Type.

Image 4: Details of the Actor Property in the Performance Type.
http://www.freebase.com/view/schema?id=%2Ffilm%2Fperformance

The display for the Blade Runner film instance shows the values of properties on the underlying mediator object when displaying the Film’s Starring property.

Image 5: The Blade Runner instance displays the cast, by showing values on the mediator.
http://www.freebase.com/view?id=%239202a8c04000641f8000000000009e89

About

Freebase is a free database of the world's information. This is the official Freebase blog.