Here’s a video from our last user group meeting. Brian Karlak, a member of our data team, explains how Freebase mines Wikipedia for structured data.
The lighting in the video makes the slides hard to see, so here is his PowerPoint deck, uploaded to slideshare.net:
Don’t forget there’s a Freebase User Group meeting next Tuesday, June 17th, in San Francisco. If you’re in the Bay Area you should definitely try to make it! RSVP at upcoming.org.


June 11th, 2008 at 9:39 am
Great talk — thanks for posting this — however the slideshare link seems to be broken.
June 11th, 2008 at 9:42 am
Yousef: broken how? I’ve embedded the presentation in the blog post, and below it there’s a link labeled “View” that goes to the page on slideshare.net. Hope that helps.
June 15th, 2008 at 1:06 am
Did you open-source these automated tools?
June 16th, 2008 at 8:27 am
Max –
The core of the automated Wikipedia processing is WEX, developed by Colin Evans, Nick Thompson, and Alexander Marks here at Metaweb. The WEX data is freely available to the community under the terms of version 1.2 of the GNU Free Documentation License:
http://blog.freebase.com/2008/02/18/mining-knowledge-from-wikipedia-announcing-wex/
http://download.freebase.com/wex/
We invite you to check it out and let us know what you think.
Brian