Rough notes follow.
What’s the Data supply business now? What’s a Data Marketplace? How do we get from here to there?
Today, data is a $100Bn global market. But what is data, and why should we care? There are high-value, top-down data sets like stock feeds, but also ‘open source upstarts like us,’ excited by open data, hacking, Government data, and mashups. “Lots of people think data should be free… like music and movies.”
There is a huge opportunity for change. There hasn’t been real innovation “since Bloomberg came to town.”
Data products today are tightly tied to a value chain, with defensible customers, rich UI, etc. Moving down the value chain, there are products that focus less on building a coherent offer; ‘messy’ data? The marketplace has tended to focus on the high end, but there are opportunities for innovation and disruption throughout the value chain.
Data distribution/delivery – do a Bloomberg, and sell a terminal? Do a Nielsen, and build a portal? Offer flat-file dumps by ftp? Offer a feed to large customers from your proprietary system? Or use one of the most highly advanced data distribution mechanisms of our time… and ship a disk in the mail.
Bloomberg… disrupted a business and created competitive advantage by acquiring, integrating and distributing data. But a lot of the data wasn’t proprietary at all… it was simply difficult to reach.
Challenges for the current market/opportunities for new entrants: current high-value projects are highly inflexible. Lock-in and control? There are few pricing options for incumbent systems, and they struggle to react as the economics of supplying data alter.
Consumer data is a big opportunity – but we need to crack privacy first. [Pete Soderling]
What do you need to consider when acquiring a data set? Freshness/currency. Accuracy. Integrity. Licensing. Format. Open Data today is mostly just lists of lists of data, but 10-15 companies are working to build data marketplaces that go further.
But what is a Data Marketplace? Numerous definitions.
Data catalogue – like Infochimps. “Pretty cool.” An online mail order catalogue. Microsoft Azure is offering this sort of solution too. “Catalogue shopping is probably not the future of consuming data.”
Real-time feeds – like Factual, or Gnip… which is ‘seriously rad.’
A huge amount of what we think of as open data (Sunlight, World Bank, data.gov, etc.) comes from a mandate to make data available… but these sources are really difficult to use.
‘Find it and graph the shit out of it’: Timetric, Iceland’s data marketplace, etc. Visualisation is often the point. But they’re not the sort of site you visit regularly in your data acquisition routine, says Forde.
FluidDB, Freebase et al – the solutions of ‘ambitious nerds.’ ‘Totally awesome,’ but ‘a bit too nerdy.’
Need to develop solutions that solve real problems, rather than developing things just because they’re cool.
So do we need a data marketplace at all? Lots of people think we don’t. Do people in the street need what we’ve got? Probably not. Is Open Data even valuable? “People add value to things that are technically free all the time.”
Maybe the brand is wrong – ‘Data Market’ is not the right concept with which to lead. Infrastructure for data is important, but maybe it doesn’t need a name and identity.
There’s a land grab going on as various entrants round up data and talent as quickly as possible. Maybe we should focus on rounding up some customers?
The lines between ‘open’ data and valuable market data are blurring. But we need to get better at explaining why anyone should care. “Data is completely worthless without context.” “There’s an absence of discourse around the data sets themselves.”
Data collaboration hubs; BuzzData and Talis’ Kasabi. Conversation about and around data, in a comfortable environment. Brings people without deep data analysis/ technical skills into the conversation. “Without conversations around data, it never becomes human. It remains cold and alien.”
If you build it, will anyone come? Technical considerations: Data as a Service (DaaS). What is DaaS? It should deliver fresh data. That could mean real-time, but it wouldn’t have to. It just needs to be timely. If you have taken data offline, do you know where it’s from? Do you know when updates become available? Easy integration needs to move us away from proprietary solutions. REST-based APIs are good, as is Microsoft’s OData spec. Use cases should become more flexible, supporting integration into a customer’s own chosen apps. But ‘the more flexible access you give to people, the more chance there is they won’t know what to do with it.’
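The provenance and freshness questions raised above (where did an offline copy come from, is it still timely, and has an update become available?) can be illustrated with a small Python sketch. This was not shown in the session; the `DatasetSnapshot` class, its fields, the `needs_refresh` helper, and the example URL are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class DatasetSnapshot:
    """A local (offline) copy of a data set, with its provenance recorded."""
    name: str
    source_url: str         # where the data came from
    retrieved_at: datetime  # when this copy was taken
    max_age: timedelta      # how old a copy may be and still count as timely

    def is_timely(self, now: Optional[datetime] = None) -> bool:
        """Timely does not have to mean real-time, only recent enough."""
        now = now or datetime.now(timezone.utc)
        return now - self.retrieved_at <= self.max_age


def needs_refresh(snapshot: DatasetSnapshot,
                  upstream_updated_at: datetime,
                  now: Optional[datetime] = None) -> bool:
    """Refresh if the copy is too old, or the source changed after we copied it."""
    too_old = not snapshot.is_timely(now)
    source_changed = upstream_updated_at > snapshot.retrieved_at
    return too_old or source_changed


# Hypothetical example values.
snapshot = DatasetSnapshot(
    name="city-populations",
    source_url="https://data.example.org/city-populations.csv",
    retrieved_at=datetime(2011, 3, 1, tzinfo=timezone.utc),
    max_age=timedelta(days=30),
)
check_time = datetime(2011, 3, 10, tzinfo=timezone.utc)

# Source last changed before our copy was taken: no refresh needed yet.
refresh_if_unchanged = needs_refresh(
    snapshot, datetime(2011, 2, 20, tzinfo=timezone.utc), now=check_time)

# Source changed after our copy was taken: an update is available.
refresh_if_updated = needs_refresh(
    snapshot, datetime(2011, 3, 5, tzinfo=timezone.utc), now=check_time)
```

Recording the source and retrieval time alongside the copy is the design choice here: without them, neither question can be answered once the data has left the service.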
DaaS delivery – how do we get data to people? APIs, downloads, vendor-backed data stores, and data marketplaces all remain options.
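As one illustration of the API option, a REST-style request can ask for just a slice of a data set rather than a full download. The sketch below builds such a URL using OData’s `$select`, `$filter` and `$top` query options. The endpoint, entity name, and field names are made up for the example, and `build_query_url` is a hypothetical helper, not part of any library.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

# Hypothetical endpoint; not a real service.
BASE_URL = "https://data.example.com/odata/Population"


def build_query_url(base_url, select=None, filter_expr=None, top=None):
    """Build an OData-style query URL that asks for part of a data set."""
    params = {}
    if select:
        params["$select"] = ",".join(select)  # which fields to return
    if filter_expr:
        params["$filter"] = filter_expr       # which rows to return
    if top is not None:
        params["$top"] = str(top)             # how many rows at most
    if not params:
        return base_url
    # Keep "$" readable in option names; percent-encode everything else.
    return base_url + "?" + urlencode(params, safe="$", quote_via=quote)


url = build_query_url(
    BASE_URL,
    select=["year", "value"],
    filter_expr="country eq 'IS'",
    top=10,
)

# Decoding the query string recovers the options that were requested.
decoded = parse_qs(urlsplit(url).query)
```

The same helper could sit in front of a download or a marketplace listing; the point is only that the customer names the fields and rows they want.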
Delivery metrics are key; know who your customers are, what data they use, and how. This creates opportunities for pricing based on usage. You don’t always need to license access to the whole database…
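A rough sketch of what delivery metrics and usage-based pricing might look like in code follows. It is an illustration under stated assumptions, not a scheme described in the session: the `UsageMeter` class, the `usage_charge` function, the per-1,000-rows rate, and the customer and data set names are all invented for the example.

```python
from collections import Counter


class UsageMeter:
    """Records who received which data set, and how many rows."""

    def __init__(self):
        self._rows = Counter()  # (customer, dataset) -> rows delivered

    def record(self, customer, dataset, rows):
        self._rows[(customer, dataset)] += rows

    def rows_by_dataset(self, customer):
        """What one customer actually used, per data set."""
        return {d: n for (c, d), n in self._rows.items() if c == customer}


def usage_charge(rows, price_per_1000_rows, monthly_minimum=0.0):
    """Charge for rows delivered, with an optional minimum fee (assumed model)."""
    return max(monthly_minimum, rows / 1000 * price_per_1000_rows)


# Hypothetical customer and data set names.
meter = UsageMeter()
meter.record("acme", "stock-feed", 2000)
meter.record("acme", "stock-feed", 500)
meter.record("acme", "census", 100)
meter.record("globex", "census", 9000)

usage = meter.rows_by_dataset("acme")

# Bill only for what was delivered, not for access to the whole database.
bill = sum(usage_charge(rows, price_per_1000_rows=4.0) for rows in usage.values())
```

Because the rate and the minimum are plain parameters, pricing can be changed per customer or per data set without touching the metering itself.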
Most importantly in a nascent market, pricing needs to remain flexible. “This world is changing fast.”
(Cross-posted @ Paul Miller – The Cloud of Data)