Semantic Wave Blog
News feeds and commentary by Jamie Pitts

June 13, 2010

Why we should call them "postmodern databases"

After many years developing applications on MySQL, Oracle, and Postgres, I recently decided to cast aside my biases against high-performance, weakly-consistent data stores and delve into CouchDB and its illustrious ilk. I am excited by the prospect of flexibility and high availability.

This type of database has been begging for a name ever since BerkeleyDB reigned. The name should probably refer to what it is (and not what it isn't), so out the window we should throw the term "nosql". Following that should be "structured storage", which does not fully differentiate it from a relational database. On our march to O'Reilly conferences, blog posts, and constant twitterings we should guide the terminology for clarity and for posterity.

After all, we don't want another AJAX or Web 2.0 on our key-flattened fingertips.

So, what is a clear way to refer to a database that favors availability over integrity? I just read an account of a talk at Southeast Linux Fest in which Richard Hipp referred to them as "postmodern databases."

Which is a great way to put it. When I hear this in my mind, I laugh because I am reminded of Baudrillard's Simulacra and Simulation. According to Baudrillard's way of looking at things, we are so deep in simulation that we no longer even know what is original, or even what original is.

The postmodern database is the appropriate technology for our age of mutability and interconnectedness, a world in which our need to instantaneously connect to each other and to connect to our knowledge is more important than "properly" categorizing and extracting meaning.

We can always write processes later on to cull postmodern data and diligently pack it into something "tried and true".

Meanwhile, real people have seemingly figured everything out.

Posted by Jamie Pitts at 11:05 PM | Comments (0) | TrackBack (0)

January 10, 2007

Micro APIs

I wrote a rambling comment on Danny's Steampunk Semantics and I should explain more about what I am getting at. Or trying to get at :)

The way I see it, microformats offer an approach to serving metadata that is more accessible to developers - they weave the meta into the presentation of the data. The big trade-off is a lower level of specificity than required to reliably connect a local graph of data into a global fabric. In a world where there are linguists, lawyers, smart browsers, and automated data aggregators, this trade-off is very costly.

"It depends upon what the meaning of the word 'is' is."
- Bill Clinton

While microformats do not have to be trapped into the same ambiguity as JSON API output (or your typical XML generated from XSD or a DTD), the first seeds of the microformats crystal indicate to me that the condition will continue. While it may be obvious - or feel obvious - to a web developer what hCalendar's "summary" and "location" are referring to, I think that we should hold ourselves to a higher standard of succinctness. If we do so, some amazing new applications will be able to emerge.

So somewhere in the midst of this idealism I thought up the Micro APIs concept. It really is just a play on words - it probably should be called a Meta API.

Basically, a Micro API would help small-time web developers disambiguate their application data (either through a config file or through a GUI setup) and then serve the metadata in parallel with each content page. The metadata served by this lightweight app would take whatever format the browser or bot requested - JSON, plain XML, or different flavors of RDF. The Micro API could even pregenerate the metadata, much as blog apps currently pregenerate RSS.

In Ruby, the Micro API would simply be a plugin with generators for models and controllers. Each model that can be used in serving metadata would have a method containing mappings of the model and its various properties to URIs on the semantic web.

And, ideally, web app frameworks such as Ruby on Rails will incorporate OWL into the process of generating models.
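
The per-model mapping described above can be sketched outside of Rails. What follows is a minimal illustration in Python, not a plugin design: the field names, the example URIs, and the helper names `to_triples` and `serialize` are all assumptions. The Dublin Core terms are real vocabulary URIs, but pairing them with these particular fields is only an example.

```python
import json

# Hypothetical property-to-URI mappings for one model. The Dublin Core
# terms exist, but their pairing with these fields is illustrative only.
ARTICLE_MAPPINGS = {
    "title": "http://purl.org/dc/elements/1.1/title",
    "author": "http://purl.org/dc/elements/1.1/creator",
    "published": "http://purl.org/dc/elements/1.1/date",
}


def to_triples(subject_uri, record, mappings):
    """Turn the mapped fields of a record into (subject, predicate, literal) triples."""
    return [
        (subject_uri, mappings[field], str(value))
        for field, value in record.items()
        if field in mappings  # unmapped fields are simply not published
    ]


def escape_literal(text):
    """Escape backslashes, quotes, and line breaks for an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def serialize(triples, accept):
    """Choose an output format from the media type the client asked for."""
    if accept == "application/json":
        return json.dumps(
            [{"subject": s, "predicate": p, "object": o} for s, p, o in triples]
        )
    # Anything else falls back to N-Triples, one statement per line.
    return "\n".join(
        f'<{s}> <{p}> "{escape_literal(o)}" .' for s, p, o in triples
    )
```

A controller would call `to_triples` with the page's record and hand the result to `serialize` along with the request's Accept header. Keeping the mapping as plain data is what would let a config file or a GUI setup produce it.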

Posted by Jamie Pitts at 4:20 PM | Comments (2) | TrackBack (0)

November 16, 2006

Breaking Tags Out of Their Existential Crisis

Harry Chen posted some useful commentary about Tom Gruber's efforts to bring structure to tagging and to foster interop between the tags of different apps.

In order to extend what is possible in the social web, tags need to be recognized for what they are - plain literals associated with a subject through the slippery predication of "I tagged." I tagged, but what did I mean? Without too much work, community developers can get started now on making tags in their apps more meaningful.

First, apps can allow for the entry of tags for a particular subject through different predicates. If users can be trusted to type in some interesting tags, why not offer tagging through a context? For example: a photo-sharing app that encourages users to add keywords - separately - for interesting aspects of a photo such as "contains", "has colors", and "reflects cultures".

With subject-predicate tagging it will be possible to offer all sorts of interesting aggregations and structured search options from what users enter for each pair. This approach also puts the application much closer than "raw tags" would have to being able to publish consumable RDF for use by other sites.
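
As a rough sketch of what subject-predicate tagging makes possible, assume each tag is stored as a (subject, predicate, tag) assertion. The photo identifiers, the sample data, and the function names below are invented for illustration; only the predicate names come from the photo-sharing example.

```python
from collections import Counter, defaultdict

# Each assertion is (subject, predicate, tag). The photo IDs are made up;
# the predicates come from the photo-sharing example.
assertions = [
    ("photo:1", "contains", "bridge"),
    ("photo:1", "has colors", "red"),
    ("photo:2", "contains", "bridge"),
    ("photo:2", "has colors", "blue"),
    ("photo:3", "has colors", "red"),
]


def tag_counts_by_predicate(assertions):
    """Aggregate tag frequencies separately for each predicate."""
    counts = defaultdict(Counter)
    for _subject, predicate, tag in assertions:
        counts[predicate][tag] += 1
    return counts


def find_subjects(assertions, predicate, tag):
    """Structured search: subjects carrying a given tag under a given predicate."""
    return sorted({s for s, p, t in assertions if p == predicate and t == tag})
```

With raw tags, "red" could be a color, a mood, or a band name. Here `find_subjects(assertions, "has colors", "red")` asks only about colors, and the per-predicate counts could drive a separate tag cloud for each property.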

Second, in conjunction with offering tagging through predicates, a community culture can be fostered around the tagging activity. This does not have to limit what an individual wishes to enter - it is merely the encouragement of meaning that will make the aggregated aspects of the system more useful.

Rather than bother with tedious, mouse-centric UIs or complex and limiting text validation, the app can encourage norms for what to enter for each property through a tag cloud of commonly used tags - e.g. colorful adjectives to represent user ratings.

These two suggestions for web developers will add many new interesting dimensions to an online community without asking too much of the developers (or the users for that matter).

I am developing a Rails plugin along the lines of "Acts As Taggable" that will enable users to add free-form tags for any field of a Rails model. Probably I will call this plugin "Acts As Assertable."

Posted by Jamie Pitts at 4:15 PM | Comments (0) | TrackBack (0)

July 9, 2006

Diminishing Participation

Digg has become the avant-garde of American tech workers. However, there is emerging doubt that an application like Digg can withstand the destructive effects of... popularity.

A would-be Digg might consider granting diminishing voting power to new members along a semilogarithmic scale. This would generate an automated inner cadre of sorts, formalizing what naturally occurs in most organizations online and off. Granting fractional votes to the horde would not be fair, but it would preserve the basic reason why newcomers came to the party in the first place.
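
One way to read "semilogarithmic" here is a vote weight that falls in a straight line against the logarithm of a member's join order. The sketch below assumes exactly that; the slope, the floor, and the function names are arbitrary choices made for illustration, not a proposal for specific values.

```python
import math


def vote_weight(join_rank, slope=0.2, floor=0.05):
    """Weight of one vote for the member who joined in position join_rank.

    The weight drops by `slope` for every tenfold increase in join order
    and never falls below `floor`. Both defaults are illustrative.
    """
    if join_rank < 1:
        raise ValueError("join_rank starts at 1")
    return max(floor, 1.0 - slope * math.log10(join_rank))


def weighted_score(voter_ranks):
    """Total score of a story given the join ranks of everyone who voted for it."""
    return sum(vote_weight(rank) for rank in voter_ranks)
```

Under these assumed parameters the founding member votes at full weight, the 10th member at about 0.8, the 1,000th at about 0.4, and anyone past roughly the 56,000th at the floor.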

But what would the unintended consequences of this be? Would newcomers learn of this and not want to use the aggregate data generated by the community? Would that matter to the original participants?

Posted by Jamie Pitts at 2:43 PM | Comments (0) | TrackBack (0)

September 14, 2004

Reducing the Tragedy of the RDF Commons

I am working on a nightowl project to harvest, process, and provide useful financial RDF to the public, free of charge. How do I allow others to query this data and, more importantly, refer to nodes in this data in a way that is reliable... without saturating my server with requests?

Some resources I have been checking out:

Distributed querying on the semantic web
Detailed discussion on the www-rdf-interest list.

RDF Peers [PDF]
A continuation of ISI's MAAN [PDF] concept.

Edutella Project
Uses JXTA to distribute RDF data and queries.

RDF Query, distributed...
Search Mesh Topology
Dan Brickley's early discussions.

FeedMesh
For blog update notification, hopefully relieving us of the regular RSS pounding in the near future. Spotted today by Danny Ayers and others.

Posted by Jamie Pitts at 6:00 AM | TrackBack (0)

July 24, 2004

Digestible Information

Today, I spotted two approaches to serving a large helping of information:

Pot Roast

Responding to perceived failures on the part of mass media, Change This intends to distribute large, PDF "manifestos" through the blogosphere. They must have forgotten that the medium is the message.

Clay Shirky has a lot to say about this:

In the middle of announcing their plans to rescue intellectual discourse, they suddenly point to a specific document format; it's like listing the brand of knife the chef uses on a menu. What do PDFs have to do with Change This's larger goals?

And the answer, of course, is "Everything." PDF is the ultimate no-backtalk format. It is designed for the page, not the screen, can't be annotated, has no provision for comments and nor can it host any trackbacks - in short, it is almost useless as a site for subsequent reference to the very conversations Change This says they want to stir up. Source.

Cheese and Crackers

Vivisimo is hosting a "clustered" version of the 9/11 Commission's Final Report. Their approach is obvious and simple, and it works very well.

It is too bad that deconstructed documents such as this one are walled off from one another, and walled off from those who would annotate them. Statements in documents should be as easily referenced and retrieved as verses in religious texts.

Posted by Jamie Pitts at 5:45 AM | TrackBack (0)

May 19, 2004

PIKII

Marc Canter posted an excerpt from A Personal Information and Knowledge Infrastructure Integrator. This academic paper is chock full of interesting (but familiar) ideas. PIKII is an information management system which, essentially, is the social software / blogging / semweb scene of the near future.

Weblogging adapts hypertext features

Nodes:
Encapsulated units of content in any MIME type format identifiable by W3C-compliant protocols and data structures.

Transclusion:
XML RSS-based syndication distributes content across multiple venues.

Link types:
Static and dynamic URIs for tracking and addressing comments, posts and news with time/date stamps and associative properties identifiable using information retrieval methods.

Backlinks:
Trackback and HTTP-referrer linking provides bidirectional links.

Annotation:
A core attribute of many blog posts and the syndication format is a link, often directly connectable to a content-level XML node. Source.

Posted by Jamie Pitts at 3:12 PM | TrackBack (0)

May 4, 2004

Bible Study

I have always believed that Bible study groups are a great source of ideas for the semantic web. These small communities use techniques for referencing, discussing, and bringing meaning to text that were developed long before information became a science.

The Holy Bible as Placeless Content is not only an excellent exploration of the semantic web practices of Bible scholarship, but also a very decent set of practical usage scenarios.

But certain tools are built to make it easier for Bible-enthusiasts to live their Christian faith.
  • Normative Identifiers (aka Uris)
  • Cross References (aka Hyperlinks)
  • Blogging
  • Indices (aka Search Engines)
  • Chain Systems (aka Link Collections)
  • Excessive Quoting (aka blogrolling)
Source.

I look forward to spending more time delving into how Christians and other followers of "the Book" have implemented the semantic web.

Posted by Jamie Pitts at 6:40 PM | TrackBack (0)

April 22, 2004

TalkBack: Reply on My Blog

RE:
How to Make Blogging More SemWeb Friendly,
RSS Conversations,
Topics in Weblogs,
Threads of Conversation.

This particular fork got started here.

Blog-to-blog conversations are easily fragmented, with responses residing on blog discussion boards, the responder's site, and on various tracking services.

I have been thinking about how to implement a "reply on my blog" link which would enable a writer to reply using his own CMS.

Implemented with current web technology, this idea assumes that the API of the responder's blogging software can be known to the site he is responding to. The reply link would point to a script on the original site, which, after noting the action, would redirect the request to the content management system under the responder's control.

RDF for the original entry (and historical data about the conversation) would be passed to the responder's CMS in the URL query string. He would respond. As his CMS updates his blog, his response would be forwarded on to the cited blog (which would be expecting it), trackers, aggregators, and online communities.
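
The redirect step can be sketched as follows. This is only an illustration of packing entry metadata into a query string: the parameter names (`in_reply_to`, `title`, `rdf`), the shape of the `entry` dictionary, and the function name are hypothetical, and no existing blogging API is implied.

```python
from urllib.parse import urlencode, urlsplit


def build_reply_redirect(responder_cms_url, entry):
    """Return the URL the original site would redirect the responder to.

    `entry` is assumed to be a dict with "permalink", "title", and "rdf"
    keys; the query parameter names are hypothetical.
    """
    params = {
        "in_reply_to": entry["permalink"],
        "title": entry["title"],
        "rdf": entry["rdf"],
    }
    # Append to an existing query string if the CMS URL already has one.
    separator = "&" if urlsplit(responder_cms_url).query else "?"
    return responder_cms_url + separator + urlencode(params)
```

Since URLs have practical length limits, a real implementation might pass a link to the RDF rather than the RDF itself. The sketch keeps the payload inline only because that is what the post describes.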

Additional concepts:

  • the responder would be provided with a standard blog posting page, perhaps with: a listing of links in the original post, metadata about the discussion, information about the participants, a way to view the conversation tree, etc.
  • any metadata (such as dc:subject and the tracking service of choice) assigned to the entry by the original blogger could carry over to the responder, who could add additional metadata. But how much metadata should accumulate?
  • the effort to implement something like "reply on my site" would add to the culture of interoperability between sites (as TrackBack did)
  • threading, summarization, and other forms of conversational analysis could be performed by any interested party, provided that the RDF can be obtained from each participating site
  • organized conversation forking could be supported

It would be a big convenience if we could automatically generate a visual summary of interesting web conversations (such as this current one).

Posted by Jamie Pitts at 7:10 PM | TrackBack (0)

April 21, 2004

Organizing the Blogosphere

Seb Paquet has posted some interesting ideas about self-organizing blog directories. He cooks up a distributed solution in which GeoURLish "badges" would visibly (and semantically) designate a blogger as a member of a particular academic community.

I definitely like the badge idea. The badge should link to a local RDF file designating community affiliation, with additional data appertaining to that affiliation. Categorization schemes on the part of search engines have loosened the dependency on directories, and self-categorization is becoming the norm as web developers get accustomed to the process of generating RSS and other metadata.

Community centers, blog planets, and leading bloggers will be the primary driving force in popularizing these badges, and in organizing the category sets used in the population of blog directories. I think that developing tools for these sites to create badges and organize the harvesting of participant data is more important than tools for the blogs themselves. Early adopters will simply copy and manually modify the RDF of the leaders, as happened with FOAF.

Mr. Paquet also set up a wiki page and a TE channel to track this conversation.

Posted by Jamie Pitts at 3:30 PM | TrackBack (0)

April 15, 2004

RE: Hacking Movable Type

Seth Ladd's post about improving MT with DMOZ URIs really has me thinking about category sets again. The fact that the RDF representing the DMOZ tree is 51.8M (gzipped) says a lot about the need for communities to agree on standard categories for blog entries.

I can see an individual blogger subscribing to category sets published by several different communities. These categories would be used, as Seth envisions, in the MT entry interface, and represented in the RSS data by dc:subject. A community aggregator such as PlanetRDF could then be pinged only with stories appertaining to its published category set (or the reverse).
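
The selective ping could be as simple as a set intersection between an entry's dc:subject values and the aggregator's published category set. A minimal sketch, assuming categories are identified by URI; the entry structure and the function name are placeholders.

```python
def entries_to_ping(entries, aggregator_categories):
    """Select the entries whose dc:subject values overlap the aggregator's set.

    Each entry is assumed to be a dict with a "subjects" list holding the
    category URIs it would carry as dc:subject in the feed.
    """
    wanted = set(aggregator_categories)
    return [entry for entry in entries if wanted & set(entry["subjects"])]
```

The reverse case mentioned above, where the aggregator filters what it receives, would use the same test on the receiving side.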

Posted by Jamie Pitts at 2:27 PM | TrackBack (0)




Creative Commons License
This weblog is licensed under a Creative Commons License.

Powered by Movable Type

Copyright © Jamie Pitts