More fun this week, Analysis for wikipedia data

Analysis for WEX:

http://blog.freebase.com/?p=108

"Growing at approximately 1,700 articles a day, Wikipedia is a significant repository of human knowledge. With its focus and depth, Wikipedia has emerged as a public good of information, fueling a small industry of computer science research. And though Wikipedia contains a wealth of collective knowledge, due to is idiosyncratic markup and semi-structured design, developers wishing to utilize this resource each incur significant start-up costs simply handling, parsing and decoding the raw corpus."

Comments

Popular posts from this blog

JVM Notebook: Basic Clojure, Java and JVM Language performance

On Unit Testing, Java TDD for developers to write

Is Java the new COBOL? Yes. What does that mean, exactly? (Part 1)