Semantic-Free Referencing

Description

Introduction

At a logical level, the Web requires a Reference Resolution Service (RRS) to map from references (also known as links) to actual network locations. In the current Web, references are URLs with a hostname/pathname structure, and DNS serves as the RRS by mapping the hostname to an IP address where the target is stored. Our premise is that the Web would be better served by a new RRS, one that does not impose the limits of the current DNS. As others, most notably those in the URN community, have pointed out [1], an RRS for the Web should have at least the following two characteristics (neither of which DNS-based URLs have):

Persistent object references:
Reference persistence implies (a) that references should not be tied to particular administrative domains or entities, as they currently are in DNS, and (b) that references should be at the level of objects rather than entire sites. If someone creates a Web page at one institution and then changes affiliation or network provider, maintaining persistence would require the first institution to continue serving that person's Web page (or to provide an HTTP redirect) for all time, which is an impractical expectation [2]. Also, without resorting to HTTP redirects, the current Web infrastructure has no way for individual objects to separate from their sites and migrate cleanly: today, if an object moves, everyone linking to the object must update their references.
Contention-free references:
DNS has become a branding mechanism and, as a result, fights over domain ownership are common. Social problems include name squatting, typo squatting, and lawsuits over trademark infringement. Although disputes over human-readable names are inevitable, we believe the reference resolution infrastructure is a poor place to resolve those disputes. In fact, we believe the infrastructure should force references to be inherently human-unfriendly. Of course, users must be able to associate meaning with references, but the binding between opaque references and human-friendly names should be done outside the referencing infrastructure. Such a separation would (a) free the RRS to focus only on technical concerns and (b) permit multiple, competing solutions to human-friendly naming.

Approach

We believe that the simplest way to achieve persistence and freedom from contention is to use a reference namespace devoid of explicit semantics, meaning that a reference should neither embed information about the organization, administrative domain, or network provider it originated in or in which it is currently located, nor be human-friendly. We call such references semantic-free.

Unlike DNS-based URLs, flat, semantic-free references have no explicit structure to give resolution hints. Until recently, there was no way to resolve references scalably in such a namespace, which is largely why the URN literature chose to use a partitioned set of context-specific resolvers. However, the recently developed Distributed Hash Table (DHT) technology is designed precisely to map from an unstructured key to a network location responsible for that key. SFR uses DHTs to map each object reference to a machine that contains object meta-data, such as the object's current IP address and the pathname of the object. Once an application, such as a Web browser, has this meta-data, it can retrieve the actual object.
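The mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SFR implementation: the class ToyDHT, the Metadata record, the helper new_reference, the node names, and the choice of random 160-bit references are all hypothetical, and the whole "DHT" lives in a single process. Keys are placed on the first node whose identifier follows the key on a ring, which is the usual consistent-hashing arrangement.

```python
import hashlib
import os
from bisect import bisect_left
from dataclasses import dataclass


@dataclass
class Metadata:
    """Hypothetical meta-data record: where the object currently lives."""
    ip: str
    path: str


def ring_id(name: str) -> int:
    """Hash a node name onto a 160-bit identifier ring."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


def new_reference() -> int:
    """Assumption: a semantic-free reference is just 160 random bits."""
    return int.from_bytes(os.urandom(20), "big")


class ToyDHT:
    """Single-process stand-in for a DHT: flat key -> responsible node."""

    def __init__(self, node_names):
        self.ring = sorted((ring_id(n), n) for n in node_names)
        self.ids = [node_id for node_id, _ in self.ring]
        self.store = {n: {} for n in node_names}

    def responsible_node(self, ref: int) -> str:
        # First node at or after the key on the ring, wrapping around.
        idx = bisect_left(self.ids, ref) % len(self.ring)
        return self.ring[idx][1]

    def put(self, ref: int, meta: Metadata) -> None:
        self.store[self.responsible_node(ref)][ref] = meta

    def get(self, ref: int) -> Metadata:
        return self.store[self.responsible_node(ref)][ref]


dht = ToyDHT(["node-a", "node-b", "node-c"])
ref = new_reference()
dht.put(ref, Metadata(ip="192.0.2.10", path="/papers/sfr.pdf"))
meta = dht.get(ref)  # a browser would now fetch meta.path from meta.ip
```

Note that the reference itself carries no hostname, owner, or readable label; everything needed to fetch the object comes from the record returned by the lookup.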

We imagine that SFR would be deployed on a managed infrastructure, not on the desktops of random cable modem users. (As mentioned above, we believe the namespace should be unmanaged; the infrastructure, however, is a different story.) To repeat: even though SFR uses DHTs, which are a so-called peer-to-peer technology, we are not relying on flaky personal machines connected via cable modems!

Benefits and Challenges

A version of the Web that used SFR instead of DNS would realize certain benefits, including:

resilient linking and seamless migration: avoid messages like "Please update the referrer that this Web page moved";
flexible replication: this includes allowing individuals to host content for each other without making clients select among mirror sites;
reliable pointer services: individuals can link to each other's links instead of having to link directly to objects;
extensibility and generality: because SFR provides a very general, application-independent interface, other applications besides the Web could get referencing functions.
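To make the first two benefits concrete, here is a minimal sketch of migration and replication behind an unchanged reference. It assumes an in-memory dictionary standing in for the DHT-backed resolution service; ToyResolver, its method names, the short reference string, and the addresses are hypothetical, and access control (who may update a record) is deliberately left out.

```python
from typing import Dict, List, Tuple

Location = Tuple[str, str]  # (IP address, pathname)


class ToyResolver:
    """In-memory stand-in for the resolution service (not the SFR API)."""

    def __init__(self) -> None:
        self._records: Dict[str, List[Location]] = {}

    def publish(self, ref: str, ip: str, path: str) -> None:
        self._records[ref] = [(ip, path)]

    def add_replica(self, ref: str, ip: str, path: str) -> None:
        self._records[ref].append((ip, path))

    def migrate(self, ref: str, old: Location, new: Location) -> None:
        # Only the meta-data changes; the reference held by linkers does not.
        locations = self._records[ref]
        locations[locations.index(old)] = new

    def resolve(self, ref: str) -> List[Location]:
        return list(self._records[ref])


resolver = ToyResolver()
ref = "5f2c9a"  # stands in for a long opaque reference
resolver.publish(ref, "192.0.2.10", "/~alice/paper.html")
link_in_some_page = ref  # what another page would embed

# The object moves and a friend starts hosting a copy.
resolver.migrate(ref, ("192.0.2.10", "/~alice/paper.html"),
                 ("198.51.100.7", "/docs/paper.html"))
resolver.add_replica(ref, "203.0.113.5", "/mirror/paper.html")

locations = resolver.resolve(link_in_some_page)  # old link still resolves
```

The page that embedded the reference never has to be edited, and the client receives all current locations from one lookup rather than choosing among mirror sites by hand.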

SFR's features naturally do not come without cost, since many of the desirable features of today's Web derive from DNS. For example, DNS's hierarchical structure enforces the uniqueness of URLs and provides fate sharing (a disconnected institution can still access local pages), while the human readability of DNS hostnames gives users some (perhaps misguided) confidence that they have reached their desired data. Some of the challenges SFR must address include:

fast lookups;
security and integrity;
fate sharing;
canonical names: users need easy ways to communicate references to each other;
confidence: users need to have confidence in the data they are viewing.

For more information, please see our papers or contact us.

Notes

[1] Besides the URN literature, similar observations have also been made by: (a) Michael O'Donnell, in his Proposal to separate Internet handles from names; (b) Bob Frankston in this essay; and (c) the Globe Project (see especially the paper Locating objects in wide-area systems, IEEE Communications Magazine, January 1998; here is ps or pdf).
[2] The current approach, in which individuals maintain Web pages with domains like www.personalname.org, does not allow an individual Web object (such as a Web page or a directory of photographs) to separate from its original site. Also, even today, this approach might be awkward or inappropriate in contexts where content should not be named after a particular individual.

Papers

The following papers give more information about SFR. The first is a position paper for a workshop; it outlines an earlier version of our philosophy. The second is a report for a student workshop and contains a later version of our philosophy. The third is a full-length conference paper and contains the most refined statement of our philosophy along with design and implementation details.

Semantic-Free Referencing in Linked Distributed Systems
Hari Balakrishnan, Scott Shenker, and Michael Walfish
2nd International Workshop on Peer-to-Peer Systems (IPTPS '03), Berkeley, CA, February 2003.
[PostScript (61KB)] [Gzipped PostScript (25KB)] [PDF (62KB)]
Using DHTs to Untangle the Web from DNS
Michael Walfish
First IRIS Student Workshop, Cambridge, MA, August 2003.
[PostScript (100 KB)] [Gzipped PostScript (34 KB)] [PDF (91 KB)]
Untangling the Web from DNS
Michael Walfish, Hari Balakrishnan, and Scott Shenker
1st Symposium on Networked Systems Design and Implementation (NSDI '04), San Francisco, CA, March 2004.
[PostScript (615KB)] [Gzipped PostScript (130KB)] [PDF (142KB)]

Talks

The Case for Semantic-Free References: IPTPS '03. February 21, 2003.
Untangling the Web from DNS: NSDI '04. March 30, 2004. ppt html

Software

We plan to release a prototype of SFR shortly; this prototype is layered on top of the Chord and DHash system.

People

Hari Balakrishnan Scott Shenker Michael Walfish

Funding

This project is being conducted as part of the IRIS project, supported by the National Science Foundation under Cooperative Agreement No. ANI-0225660.

Contact Us

We welcome comments, questions, and feedback. Please send e-mail to sfr-n at nms.lcs.mit.edu.

NMS @ MIT LCS

M. I. T. Laboratory for Computer Science · 200 Technology Square · Cambridge, MA 02139 · USA