Semantic Wave Blog
News feeds and commentary by Jamie Pitts

October 26, 2004

Wikinews

Wikimedia has put up a proposal and voting page for Wikinews, a service to neutrally summarize and report on current events.

We seek to create a free source of news, where every human being is invited to contribute reports about events large and small, either from direct experience, or summarized from elsewhere.

While Wikinews aims to be a useful resource of its own, it will also provide an alternative to proprietary news agencies like the Associated Press or Reuters; that is, it will allow independent media outfits to get a high quality feed of news free of charge to complement their own reporting. Thanks to copyleft, anyone can create their own free news source - even a non-neutral one - on the basis of our work. Even if our articles will initially be few, they will be free, permanently available and not require registration before reading.

Source: Wikimedia.

Wikinews takes the collaborative content phenomenon even closer to real-time. I look forward to contributing to this!

Posted by Jamie Pitts at 6:00 PM | TrackBack

October 25, 2004

Looking forward to SPARQL

I have been looking over the W3C's Working Draft for the SPARQL Query Language for RDF. These proposed / discussed features caught my eye:

Posted by Jamie Pitts at 6:48 PM | TrackBack

October 18, 2004

Google's Life Recorder

Danny has some interesting things to say about Google's life recorder (AKA Desktop Search) and other similar products.

I was blown away last week, having only expected to install searching. The recorder was a wonderful surprise. My interest in the semantic web began with investigating how to organize and annotate my business research and fiction writing. Meaningful search and the web recorder change how I do my research. Thanks to the rapid recall, local cross-referencing and commentary are much more feasible.

Desktop Search also says a lot about where computing is headed, with its unabashed browser interface to a local web server and simplicity. To millions of common internet consumers, the web will soon be much less "remote" than it has been.

The PC which I purchased last year for web development and games can now serve as my primary information platform. For this purpose, my PC is on the verge of displacing my OS X system.

Posted by Jamie Pitts at 2:03 PM | TrackBack

October 14, 2004

Role Playing

Paul Ford has posted the latest entry in his "Hacking Congress" series, correcting some mistakes made in the last round. He examines why he shouldn't have used a "USSenator" tag to describe the role of a person in government. Paul also talks about using Tag URIs to identify individuals.

I have also recently been dealing with roles in my processing of SEC filings. The 2004 filings I needed to extract the people involved in companies took over two weeks to download. I have pulled upwards of 6,000 officer titles from over 100,000 filings. Many titles associated with an officer actually refer to more than one role, and in a myriad of different ways. I have been able to extract a lot of meaningful data from this raw text.

As I get nearer to actually publishing this data in various formats (including FOAFCorp), I have been looking into creating an ontology for company roles. The basic role types will include Chairman, CEO, CFO, VP, and so on, but there is a need to add additional information.

Looking at the huge amount of free-form text for officer titles, I found that a person's role at a company is very often nuanced by two additional concepts: a qualifier such as "Retired" or "Former", and a domain of responsibility such as "Marketing Division" or "Human Resources". I am now working on identifying and naming instances of each of these two concepts (as well as the core role types) in the raw officer title text.
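To make the three concepts concrete, here is a minimal sketch in Python of how a free-form title might be broken into core role, qualifier, and domain of responsibility. The vocabularies, the separator rule, and the function name `parse_title` are all assumptions for illustration, not the extraction code used on the filings.

```python
import re

# Hypothetical vocabularies; real lists would be derived from the filings.
QUALIFIERS = ("Retired", "Former", "Acting", "Interim")
ROLE_PATTERNS = {
    "Chairman": r"\bChairman\b",
    "CEO": r"\b(?:CEO|Chief Executive Officer)\b",
    "CFO": r"\b(?:CFO|Chief Financial Officer)\b",
    "VP": r"\b(?:VP|Vice President)\b",
}

# Assumed separators that join several roles inside one title string.
ROLE_SEPARATORS = re.compile(r"\s*(?:,|;|&|\band\b)\s*")

# Assumed convention: "of X" or "for X" introduces the domain.
DOMAIN = re.compile(r"\b(?:of|for)\s+(?:the\s+)?(.+)$", re.IGNORECASE)


def parse_title(title):
    """Split a raw officer title into a list of role dictionaries."""
    roles = []
    for part in ROLE_SEPARATORS.split(title):
        part = part.strip()
        if not part:
            continue
        role = next(
            (name for name, pattern in ROLE_PATTERNS.items()
             if re.search(pattern, part, re.IGNORECASE)),
            None,
        )
        if role is None:
            # No core role in this fragment: treat it as the domain of the
            # preceding role, as in "Vice President, Human Resources".
            if roles and roles[-1]["domain"] is None:
                roles[-1]["domain"] = part
            continue
        qualifier = next(
            (q for q in QUALIFIERS
             if re.search(rf"\b{q}\b", part, re.IGNORECASE)),
            None,
        )
        match = DOMAIN.search(part)
        roles.append({
            "role": role,
            "qualifier": qualifier,
            "domain": match.group(1) if match else None,
        })
    return roles


if __name__ == "__main__":
    for raw in ("Former Chairman and CEO",
                "Vice President, Human Resources",
                "Retired VP of Marketing Division"):
        print(raw, "->", parse_title(raw))
```

A sketch like this breaks down quickly on real data: a domain such as "Research and Development" would be split on "and", and a qualifier at the front of a multi-role title is only attached to the first role. Those are exactly the kinds of cases where hand-built lists of instances would have to take over from simple patterns.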

Posted by Jamie Pitts at 8:13 PM | Comments (2) | TrackBack

October 5, 2004

Wikiproxy

Stefan Magdalinski has posted details about his BBC News Wikiproxy. His goal is to integrate the Beeb with the greater web, and wikiproxied stories include Wikipedia links for capitalized nouns. Technorati references relating to the story are also incorporated.

I am very jazzed about this because I am nearly done with a long SEC company and officer extraction run. A test application I built uses this data to identify names in Yahoo! / Reuters stories, and the BBC was next in line.

Stefan's wikiproxy should help me get started on connecting my company names and people to Wikipedia URIs, along the lines of James Tauber's interesting suggestion. I plan to use Wikipedia references in human-readable interfaces, along with equally useful SEC Edgar and Yahoo! Finance representations of companies.

The PHP source code uses regexes in the identification of names; my bot will probably be doing some additional work with respect to the story context to ensure good matches.
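As a rough illustration of the regex approach, here is a minimal Python sketch that wraps runs of capitalized words in Wikipedia links. It is not Stefan's PHP code: the pattern, the stop list, the URL prefix, and the optional `known_names` filter (standing in for a check against extracted company and officer names) are all assumptions.

```python
import re
from urllib.parse import quote

# One or more capitalized words in a row, e.g. "Tony Blair" or "London".
CAPITALIZED_PHRASE = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")

# Hypothetical stop list for sentence-initial words that are not names.
STOP_WORDS = {"The", "A", "An", "In", "On", "It", "He", "She"}

# Assumed URL prefix for English Wikipedia articles.
WIKIPEDIA_PREFIX = "https://en.wikipedia.org/wiki/"


def wikipedia_url(phrase):
    """Build an article URL, using underscores in place of spaces."""
    return WIKIPEDIA_PREFIX + quote(phrase.replace(" ", "_"))


def link_names(text, known_names=None):
    """Wrap capitalized phrases in links.

    If known_names is given, only phrases in that set are linked; this is
    a crude stand-in for checking candidates against other data.
    """
    def replace(match):
        phrase = match.group(0)
        if phrase in STOP_WORDS:
            return phrase
        if known_names is not None and phrase not in known_names:
            return phrase
        return f'<a href="{wikipedia_url(phrase)}">{phrase}</a>'

    return CAPITALIZED_PHRASE.sub(replace, text)


if __name__ == "__main__":
    story = "The prime minister met Tony Blair in London."
    print(link_names(story))
    print(link_names(story, known_names={"Tony Blair"}))
```

The pattern alone over-matches: a sentence that opens with "The Microsoft" would be treated as one phrase and slip past the stop list. Passing a set of known names is one simple way to cut down on bad matches before looking at any wider story context.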

By way of Cory Doctorow.

Posted by Jamie Pitts at 2:33 PM | TrackBack


When I talk about the semantic web, I feel a lot like Linus. No, not Linus Torvalds. I meant the other one. - JP


whoami?

Projects:
  Winnow My Bloglines Down
  Memecat
  Listgasm


Currently Reading

The Art of Unix Programming
Eric Raymond

Semantic People
Danny Ayers
Dave Beckett
Tim Berners-Lee
Tim Bray
Dan Brickley
Marc Canter
Paul Ford
Seth Ladd
Seb Paquet
Clay Shirky
Roland Tanglao
Dave Winer

Creative Commons License
This weblog is licensed under a Creative Commons License.

Powered by Movable Type

Copyright © Jamie Pitts