October 14, 2004

Role Playing

Paul Ford has posted the latest entry in his "Hacking Congress" series, correcting some mistakes made in the last round. He examines why he shouldn't have used a "USSenator" tag to describe the role of a person in government. Paul also talks about using Tag URIs to identify individuals.

I have also recently been dealing with roles in my processing of SEC filings. Downloading the 2004 filings from which I extract the people involved in companies took over two weeks. I have pulled upwards of 6,000 officer titles from over 100,000 filings. Many officer titles actually refer to more than one role, and in a myriad of different ways. I have been able to extract a lot of meaningful data from this raw text.

As I get nearer to actually publishing this data in various formats (including FOAFCorp), I have been looking into creating an ontology for company roles. The basic role types will include Chairman, CEO, CFO, VP, and so on, but there is a need to add additional information.

Looking at the huge amount of free-form text for officer titles, I found that a person's role at a company is very often nuanced by two additional concepts: a qualifier such as "Retired" or "Former", and a domain of responsibility such as "Marketing Division" or "Human Resources". I am now working on identifying and naming instances of each of these two concepts (as well as the core role types) in the raw officer title text.
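To make the three concepts concrete, here is a minimal sketch of how a raw officer title might be broken into role types, qualifiers, and domains. It is not the code used for the SEC processing described above: the `Role` class, the `parse_title` function, the regular expressions, and the tiny vocabularies are all hypothetical, chosen only to illustrate the idea.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical vocabularies: a real list would be built from the extracted titles.
ROLE_PATTERNS = {
    "Chairman": r"\bchairman\b",
    "CEO": r"\b(?:ceo|chief executive officer)\b",
    "CFO": r"\b(?:cfo|chief financial officer)\b",
    "VP": r"\b(?:vp|vice president)\b",
}
QUALIFIERS = ("Retired", "Former")


@dataclass
class Role:
    role_type: str
    qualifier: Optional[str] = None
    domain: Optional[str] = None


def parse_title(title: str) -> List[Role]:
    """Split a free-form officer title into zero or more Role records."""
    roles = []
    # One title can name several roles, e.g. "Chairman and CEO".
    segments = re.split(r"\s+and\s+|\s*[&;/]\s*", title, flags=re.IGNORECASE)
    for segment in segments:
        # Use the first role pattern that matches; skip segments with no known role.
        match = None
        role_type = None
        for name, pattern in ROLE_PATTERNS.items():
            match = re.search(pattern, segment, re.IGNORECASE)
            if match:
                role_type = name
                break
        if match is None:
            continue
        # A qualifier applies only to the segment it appears in (an assumption).
        qualifier = None
        for q in QUALIFIERS:
            if re.search(rf"\b{q}\b", segment, re.IGNORECASE):
                qualifier = q
                break
        # Treat whatever follows the role as the domain, minus ", " or "of ".
        rest = segment[match.end():]
        domain = re.sub(r"^[\s,\-]*(?:of\s+)?", "", rest, flags=re.IGNORECASE).strip()
        roles.append(Role(role_type, qualifier, domain or None))
    return roles


if __name__ == "__main__":
    for raw in ("Former Chairman and CEO", "Vice President, Marketing Division"):
        print(raw, "->", parse_title(raw))
```

A sketch this simple misreads plenty of real titles. For example, a domain that itself contains "and" is cut in two, and a qualifier is not carried across to the second role in "Former Chairman and CEO". Cases like these are part of why naming the instances of each concept takes real work.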
Copyright © Jamie Pitts