I was browsing del.icio.us and found an interesting project involving researchers from NITLE and Middlebury College: The Semantic Indexing Project.
The prototype is in two parts: the Semantic Engine, the utility that generates metadata from a corpus of text, and the Explorer, a node-centric visualizer.
I indexed some SEC filings and then exported the result into the Explorer. The output was muddled with extraneous terms (probably because these particular filings are so raw), so I loaded the HTML content of Google's IPO prospectus instead. This second attempt was more interesting, displaying a navigable graph of related terms. Unfortunately, I did not see any UI showing how the nodes (which represent corpus terms) are related.
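For anyone wondering how a "related terms" graph like this might come about, here is a rough sketch of latent semantic analysis (LSA) in Python with NumPy. To be clear, this is my own toy illustration, not the Semantic Engine's code: the four-document corpus, the function names, and the choice of k=2 dimensions are all made up for the example. It builds a term-document count matrix, reduces it with a truncated SVD, and ranks terms by cosine similarity in the reduced space.

```python
import numpy as np


def build_term_doc_matrix(docs):
    """Return (vocab, matrix) where matrix[i, j] counts term i in document j."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            matrix[index[w], j] += 1.0
    return vocab, matrix


def lsa_term_vectors(matrix, k):
    """Project terms into a k-dimensional latent space via truncated SVD."""
    u, s, _vt = np.linalg.svd(matrix, full_matrices=False)
    # Scale the left singular vectors by the singular values to get term vectors.
    return u[:, :k] * s[:k]


def related_terms(term, vocab, vectors, top_n=3):
    """Rank other terms by cosine similarity to `term` in the latent space."""
    i = vocab.index(term)
    norms = np.linalg.norm(vectors, axis=1)
    norms[norms == 0] = 1.0  # avoid dividing by zero for empty vectors
    sims = (vectors @ vectors[i]) / (norms * norms[i])
    order = np.argsort(-sims)
    return [(vocab[j], float(sims[j])) for j in order if j != i][:top_n]


# Hypothetical toy corpus: two "offering" documents and two "search" documents.
docs = [
    "shares offering underwriters prospectus",
    "shares offering price auction",
    "search advertising revenue users",
    "search users advertising growth",
]
vocab, matrix = build_term_doc_matrix(docs)
vectors = lsa_term_vectors(matrix, k=2)
neighbors = related_terms("shares", vocab, vectors)
print(neighbors)
```

In a graph view, each term would be a node and an edge would connect terms whose similarity passes some threshold. A real pipeline would presumably also weight the counts (for example with TF-IDF) and drop stop words, which might be what was missing when my raw SEC filings came out muddled.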
Still, it is worth a try if you're running OS X and are into natural language processing, latent semantic analysis (LSA), and the art of automatic classification. Also worth looking at are their research papers and presentations.