October 26, 2004
Wikinews

Wikimedia has put up a proposal and voting page for Wikinews, a service to neutrally summarize and report on current events.

While Wikinews aims to be a useful resource of its own, it will also provide an alternative to proprietary news agencies like the Associated Press or Reuters; that is, it will allow independent media outfits to get a high quality feed of news free of charge to complement their own reporting. Thanks to copyleft, anyone can create their own free news source - even a non-neutral one - on the basis of our work. Even if our articles will initially be few, they will be free, permanently available and not require registration before reading.

Source: Wikimedia.

Wikinews takes the collaborative content phenomenon even closer to real-time. I look forward to contributing to this!

Posted by Jamie Pitts at 6:00 PM

October 25, 2004
Looking forward to SPARQL

I have been looking over the W3C's Working Draft for the SPARQL Query Language for RDF. These proposed / discussed features caught my eye:

October 18, 2004
Google's Life Recorder

Danny has some interesting things to say about Google's life recorder (AKA Desktop Search) and other similar products. I was blown away last week, having only expected to install searching. The recorder was a wonderful surprise.

My interest in the semantic web began with investigating how to organize and annotate my business research and fiction writing. Meaningful search and the web recorder change how I do my research. Due to the rapid recall, local cross-referencing and commentary are much more feasible.

Desktop Search also says a lot about where computing is headed, with its simplicity and its unabashed browser interface to a local web server. To millions of common internet consumers, the web will soon be much less "remote" than it has been.

The PC which I purchased last year for web development and games can now serve as my primary information platform. For this purpose, my PC is on the verge of displacing my OS X system.

October 14, 2004
Role Playing

Paul Ford has posted the latest entry in his "Hacking Congress" series, correcting some mistakes made in the last round. He examines why he shouldn't have used a "USSenator" tag to describe the role of a person in government. Paul also talks about using Tag URIs to identify individuals.

I have also recently been dealing with roles in my processing of SEC filings. Downloading the 2004 filings to extract the people involved in companies took over two weeks. I have pulled upwards of 6,000 officer titles from over 100,000 filings. Many titles associated with an officer actually refer to more than one role, and in a myriad of different ways. I have been able to extract a lot of meaningful data from this raw text.

As I get nearer to actually publishing this data in various formats (including FOAFCorp), I have been looking into creating an ontology for company roles. The basic role types will include Chairman, CEO, CFO, VP, and so on, but there is a need to add additional information.

Looking at the huge amount of free-form text for officer titles, I found that a person's role at a company is very often nuanced by two additional concepts: a qualifier such as "Retired" or "Former", and a domain of responsibility such as "Marketing Division" or "Human Resources". I am now working on identifying and naming instances of each of these two concepts (as well as the core role types) in the raw officer title text.

October 5, 2004
Wikiproxy

Stefan Magdalinski has posted details about his BBC News Wikiproxy. His goal is to integrate the Beeb with the greater web, and wikiproxied stories include Wikipedia links for capitalized nouns. Technorati references relating to the story are also incorporated.

I am very jazzed about this because I am nearly done with a long SEC company and officer extraction run. A test application I built uses this data to identify names in Yahoo! / Reuters stories, and the BBC was next in line.

Stefan's wikiproxy should help me get started on connecting my company names and people to Wikipedia URIs, along the lines of James Tauber's interesting suggestion. I plan to use Wikipedia references in human-readable interfaces, along with equally-useful SEC Edgar and Yahoo! Finance representations of companies.

The PHP source code uses regexes in the identification of names; my bot will probably be doing some additional work with respect to the story context to ensure good matches.

By way of Cory Doctorow.
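To give a rough idea of what regex-based linking of capitalized nouns can look like, here is a minimal sketch in Python. It is not Stefan's PHP code. The pattern, the stop list, the Wikipedia URL shape and the function names are all my own assumptions for illustration.

```python
import re
from urllib.parse import quote

# Assumption: a candidate name is a run of one or more capitalized words,
# e.g. "General Electric". All-caps tokens such as "BBC" are not matched.
CAPITALIZED_RUN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")

# Hypothetical stop list: words that are capitalized only because they
# start a sentence. A real list would be much longer.
STOP_WORDS = {"The", "A", "An", "He", "She", "It", "His", "In", "On"}

# Assumption: article URLs are the base plus the phrase with underscores.
WIKIPEDIA_BASE = "https://en.wikipedia.org/wiki/"


def wikipedia_url(phrase):
    """Build a Wikipedia-style URL for a phrase (spaces become underscores)."""
    return WIKIPEDIA_BASE + quote(phrase.replace(" ", "_"))


def link_capitalized(text):
    """Wrap each capitalized run in an HTML link, skipping stop words."""
    def replace(match):
        phrase = match.group(0)
        if phrase in STOP_WORDS:
            return phrase
        return '<a href="%s">%s</a>' % (wikipedia_url(phrase), phrase)

    return CAPITALIZED_RUN.sub(replace, text)


if __name__ == "__main__":
    print(link_capitalized("shares of General Electric rose"))
```

A pattern like this over-matches: a run such as "The General Electric" would be linked as one phrase, since the stop list is only checked against the whole match. That is the kind of case where checking candidates against known company and officer names, or against the story context, would help.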
Copyright © Jamie Pitts