CloudAve
Software in Business. The Business of Software.

Top Level Domain for data answers the wrong question

By Paul Miller on January 11, 2012

Image of Stephen Wolfram via Wikipedia

British-born computer scientist Stephen Wolfram sees ongoing efforts to extend the Internet’s top-level domains (TLDs) beyond the familiar .com, .org, .uk, etc. as an opportunity to raise the profile of machine-readable data. In a blog post published yesterday, he argues that a new .data domain would increase “exposure of data on the internet” and provide “added impetus for organizations to expose data in a way that can efficiently be found and accessed.” Whilst wholly in favour of Wolfram’s stated aim, I can’t help feeling that his suggested solution is at best unnecessary and at worst a worrying segregation of data from the ‘proper’ web that everyone else will continue to exploit.

Back in June of last year, ICANN, the body responsible for coordinating the global domain name system, approved a plan to permit new top-level domains (the letters after the final dot in an internet address: the .com in cloudofdata.com, the .uk in bbc.co.uk, the .edu in harvard.edu). Until recently, these top-level domains have been tightly controlled, with a small set of generic domains (.edu, .gov, .mil, .org, etc.), a larger set of country domains (.uk, .fi, .nz, etc.) and one or two others such as .eu. From tomorrow, anyone with $185,000 will be able to submit a proposal to create and manage a new top-level domain, and it’s possible that there could eventually be thousands of them. Wolfram is keen to ensure that data doesn’t miss out on the ‘opportunity.’
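
For anyone who wants the 'letters after the final dot' definition spelled out, here is a minimal Python sketch. The function name top_level_domain is my own invention for illustration, and the sketch only splits the hostname string; it does not check whether the result is a TLD that has actually been delegated.

```python
def top_level_domain(hostname: str) -> str:
    """Return the label after the final dot, e.g. 'com' for 'cloudofdata.com'."""
    # Drop an optional trailing dot (the fully qualified form) and normalise case.
    labels = hostname.rstrip(".").lower().split(".")
    return labels[-1]


if __name__ == "__main__":
    # The three hostnames used as examples in the text.
    for host in ("cloudofdata.com", "bbc.co.uk", "harvard.edu"):
        print(host, "->", "." + top_level_domain(host))
```

Note that bbc.co.uk yields .uk, not .co.uk: the .co part is a second-level label managed beneath the country domain.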

As Wolfram himself recognises, there is already an awful lot of machine-readable data on the web. Some of it sits embedded within the web pages that humans read, with specially formatted code waiting to be triggered by the calendars, the address books, or the browser plugins of site visitors. Some of it is packaged up in data files, offered for download. And some of it waits inside a database, ready to be delivered in response to an API call or a query typed into a web form.
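
To make the first of those cases concrete, here is a rough Python sketch of how a piece of software might pull an event out of an ordinary web page. The HTML fragment is a made-up sample marked up in the style of the hCalendar microformat, and the MicroformatExtractor class is a hypothetical helper of my own, not part of any library. It handles only flat markup, so nested elements inside a property would cut the captured text short.

```python
from html.parser import HTMLParser


class MicroformatExtractor(HTMLParser):
    """Collect values from elements whose class names match chosen properties."""

    def __init__(self, properties):
        super().__init__()
        self.properties = set(properties)
        self.found = {}
        self._current = None  # property whose text is currently being read

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        for name in classes:
            if name in self.properties:
                title = attr_map.get("title")
                if title:
                    # Machine-readable value carried in the title attribute.
                    self.found[name] = title
                else:
                    # Otherwise the value is the element's text content.
                    self._current = name
                    self.found[name] = ""
                break

    def handle_data(self, data):
        if self._current is not None:
            self.found[self._current] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the property being read.
        self._current = None


# Made-up sample page fragment, for illustration only.
PAGE = """
<div class="vevent">
  <span class="summary">New TLD application window opens</span>
  <abbr class="dtstart" title="2012-01-12">12 January 2012</abbr>
</div>
"""

if __name__ == "__main__":
    extractor = MicroformatExtractor(["summary", "dtstart"])
    extractor.feed(PAGE)
    print(extractor.found)
```

The human reader sees '12 January 2012', whilst the software picks up the unambiguous 2012-01-12 from the same element.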

There is a growing enthusiasm for exposing this data for reuse. Government transparency agendas have driven public sector data sites like data.gov.uk and data.gov. Similarly, efforts such as data.open.ac.uk and data.southampton.ac.uk see universities beginning to consciously collect data sets together and offer them up for reuse. Similar efforts in the commercial world are less easy to point to, but that reticence has nothing whatsoever to do with the lack of a ford.data, boeing.data, ge.data or astrazeneca.data domain!

In some ways, the convention for gathering significant chunks of data on a data.xxx.yyy site echoes Wolfram’s intention, but with a number of advantages. Data without context is far less valuable than data with context. Much of that context may be inferred from the domain in which the data lives, with data delivered from a .gov or .edu (or .gov.uk or .ac.uk) site perhaps interpreted differently to data hosted on .com, .biz, or .xxx. Southampton University, the Open University, and the US Federal Government are able to gather data up and make it available for download via their existing data.xxx.yyy sites if they choose. This offers human visitors to their sites a degree of convenience, whilst retaining the power and brand attributes of their existing domain. Gov.data, gov.uk.data, open.ac.uk.data, southampton.ac.uk.data, though? All are messy, in ways that Wolfram’s own wolfram.data would admittedly not be, and all are simply additional registrations that the institutions would have to pay for in order to stop someone else grabbing the domain.

At the end of the day, the machines don’t actually care. The existing data.open.ac.uk-type sites are human conveniences, not machine enablers. The computers, and the software they run, are quite capable of crawling the public web and finding accessible data wherever it lies on a site. There are plenty of reasons to continue embedding little snippets of data inside human readable web pages, regardless of whether you have a data.wolfram.com or a wolfram.data site. Content negotiation is becoming increasingly capable, such that there really is no need for what Wolfram calls a ‘parallel construct to the ordinary web’ at all. A human being arriving at a web site sees human readable content, whilst various software tools would automatically be presented with very different data or functions, optimised to their capabilities and requirements.
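
As a rough illustration of content negotiation, the Python sketch below picks a representation for a single URL based on the HTTP Accept header that the client sends. The function names and the list of available formats are assumptions of mine for the example, and the matching is deliberately simplified: it ranks by q-value only and ignores the specificity rules that the HTTP specification also defines.

```python
def parse_accept(header: str) -> list[tuple[str, float]]:
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    prefs = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        q = 1.0  # q defaults to 1 when the client does not state it
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as 'not acceptable'
        prefs.append((media_type, q))
    # sorted() is stable, so equal q-values keep the client's own ordering.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def negotiate(accept_header: str, available: list[str], default: str) -> str:
    """Choose which representation of a resource to send to this client."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*":
            return default
        if media_type.endswith("/*"):
            prefix = media_type[:-1]  # "text/*" becomes "text/"
            for candidate in available:
                if candidate.startswith(prefix):
                    return candidate
    return default


# Hypothetical formats that one URL can be served in.
AVAILABLE = ["text/html", "application/json", "text/turtle"]

if __name__ == "__main__":
    # A typical browser-style header, then two data-oriented clients.
    print(negotiate("text/html,application/xhtml+xml,*/*;q=0.8", AVAILABLE, "text/html"))
    print(negotiate("application/json, text/html;q=0.5", AVAILABLE, "text/html"))
    print(negotiate("application/rdf+xml, text/turtle;q=0.9", AVAILABLE, "text/html"))
```

The point is that the address stays the same throughout. The browser gets a page to read, the data client gets JSON or Turtle, and neither needs a separate domain to tell them apart.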

By all means, let us show the curious some of the existing techniques that work in making data more easily accessible. By all means, let us identify the gaps, the issues, the problems (none of which a new TLD even begins to address). Yes, let us definitely and unambiguously set about “highlighting the exposure of data on the internet—and providing added impetus for organizations to expose data in a way that can efficiently be found and accessed.”

But please, let us not be distracted by the false hope that adding yet another TLD to the babel that ICANN is about to unleash can do anything more than consign data to some online ghetto, wallowing unwanted, unloved and unused as companies and their customers lavish love, attention, and clicks upon the .com domain over on the ‘proper’ web.

Thanks to Raphaël Troncy, whose tweet first drew the story to my attention.

(Cross-posted @ The Cloud of Data)

Posted in Featured Posts, Infrastructure | Tagged big data, cloud computing, content negotiation, Cybersquatting, data, data publishing, data science, Data sharing, Data Web, domain name, Domain Name System, Enterprise Computing, ICANN, Linked Data, open data, Open University, semantic web, Southampton University, Stephen Wolfram, TLD, Top-level domain, web 3.0, Wolfram Research
