There’s an interesting data pattern emerging on Freebase that seemed worth a post here.
Shawn Simister wrote:
“As I understand it, the JobTitle type describes a person’s position within a certain occupation, like CEO of Dell or Chairman of the Board. The Freebase type /business/job_title seems consistent with this definition. However, when I look at the /people/profession type I see a lot of topics that I would consider to be JobTitles and several of them are actually typed as both a JobTitle and a Profession.”
Robert’s response:
“The blurred semantics here are largely inherited from Wikipedia, although Freebase users have continued the trend. WP includes very useful categories for professions that rarely pick the correct noun form (marketing as profession, marketer as job title.) Regarding job titles, contributors seem to be often confused whether they should be entering a job title that you might find on a CV (eg. “Sterling Professor of Physics”) vs. something more generic, but still a job title (“Professor”).
The point I’m trying to make is that it may be very difficult to normalize data in a collaborative environment when there isn’t general agreement among smart and educated people of what a profession or job title really mean. In other words, the output of such collaborative work is likely to be inconsistent and colloquial. That isn’t to say the data collected isn’t extremely useful, it’s just that the data may not yield to straightforward querying.”
This touches on one of the challenges of modeling interconnected schemas. Robert’s conclusion is that capturing the data still has utility and that the community can ultimately clean up any inconsistencies. For the complete response as well as the original thread, see the Data Modeling list archive.
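To make the overlap Shawn describes a little more concrete, here is a minimal Python sketch of how one might look for topics typed as both a job title and a profession. It runs against a small in-memory dictionary rather than Freebase itself. Only the two type ids come from the thread; the `topics` dictionary, its type assignments, and the `typed_as_both` helper are hypothetical and made up for illustration.

```python
# The two type ids are the ones mentioned in the thread.
JOB_TITLE = "/business/job_title"
PROFESSION = "/people/profession"

# Hypothetical sample data: these topic-to-type assignments are invented
# for illustration and are not taken from Freebase.
topics = {
    "Professor": {JOB_TITLE, PROFESSION},
    "Sterling Professor of Physics": {JOB_TITLE},
    "Marketing": {PROFESSION},
    "Marketer": {JOB_TITLE},
    "Chief Executive Officer": {JOB_TITLE, PROFESSION},
}


def typed_as_both(topics, type_a, type_b):
    """Return the sorted names of topics that carry both type ids."""
    return sorted(
        name
        for name, types in topics.items()
        if type_a in types and type_b in types
    )


overlap = typed_as_both(topics, JOB_TITLE, PROFESSION)
print(overlap)  # ['Chief Executive Officer', 'Professor']
```

A list like this only flags the candidates. Deciding whether “Professor” should remain a profession, a job title, or both is still the judgment call that Robert describes.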
