Every day, automated trading creates mountains of historical stock data that traders must store and manage for trade execution, compliance, and strategy modeling. You'd think it would all be on the cloud already, but it isn't. While most of the financial services sector ponders the future impact of the cloud, NASDAQ OMX is exploring ways the cloud can make life easier for stock traders today.
The Automated Trading Fire Hose
While many people have heard of automated stock trading and some of its more spectacular consequences, like the 2010 Flash Crash, what most people don't know is that a financial information technology arms race is underway that has produced radical changes in stock trading over the last ten years. With every advance in industry standards, increase in processor speed, and decrease in data storage costs, automated traders up the ante by developing faster, more computationally intensive algorithms to outwit the market and outsmart each other. The result is a feedback loop that demands ever-increasing data processing power and data management efficiency. One can debate whether this is a good thing or a bad thing, but what is not up for debate is that financial services firms that want to stay competitive must constantly look to new technologies to avoid being crushed under the relentless deluge of stock trade data.
To put the growth in automated trade data in perspective: [Figure: quote and trade volume in US options grew by multiple orders of magnitude over the last ten years. Source: CBOE presentation at the Financial Information Forum, 10/2010.]
The Big Historical Stock Data Dilemma
Historical stock data in particular is a real problem, because it just gets bigger and bigger as the trades pile up over time. Data management costs rise in lockstep with the amount of historical stock data stored, yet the value of any given historical trade is not apparent until the moment it is needed. For example, you can't unleash an untested trading algorithm on real money, so every automated trading algorithm requires extensive back-testing against historical stock data. That's a big data management problem when you don't know today what trading algorithm you will be testing tomorrow: you don't know what historical stock data you will need until you need it, so you must store it all just in case.
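To make the back-testing point concrete, here is a deliberately minimal sketch of the pattern: replay stored historical ticks through a strategy and mark the result to market at the end. Everything here (the tick format, the strategy callback, the absence of fees, latency, and market impact) is a simplifying assumption for illustration, not how any production back-tester works.

```python
def backtest(ticks, strategy, cash=100_000.0):
    """Replay historical (timestamp, price) ticks through a strategy.

    `strategy(ts, price, position)` returns a signed share quantity:
    positive to buy, negative to sell, zero to hold.
    """
    position = 0
    last_price = 0.0
    for ts, price in ticks:
        qty = strategy(ts, price, position)
        if qty > 0 and qty * price > cash:
            qty = 0  # skip buys the remaining cash cannot cover
        cash -= qty * price
        position += qty
        last_price = price
    # Mark any open position to the last observed price.
    return cash + position * last_price
```

Even this toy version makes the storage dilemma plain: whatever strategy you dream up tomorrow, the loop is only as good as the historical ticks you kept today.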
NASDAQ Data-On-Demand | Tick Data on the Cloud
Last week NASDAQ OMX released a new enterprise version of NASDAQ Data-On-Demand, a cloud-based data-as-a-service with three years of historical stock trade and quote data, including every tick for every symbol traded on NASDAQ, NYSE, and US OTC markets. That translates into mountains of data that customers of the service don't have to manage internally. Users can request custom data sets as needed by selecting historical stock data for specific ticker symbols, data fields, and time frames. Depending on the size of the output, the data can be accessed either through a Web service API that can be plugged directly into applications or through a batch request interface that provides various file formats for FTP download.
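NASDAQ hasn't published the request format here, so the endpoint, parameter names, and field names in the following sketch are all hypothetical; it only illustrates the select-by-symbol, fields, and time-frame pattern described above, in the shape of a typical HTTP query.

```python
import requests  # third-party HTTP client: pip install requests

# Hypothetical endpoint and parameter names -- the real Web service API
# defines its own URL, authentication scheme, and field vocabulary.
BASE_URL = "https://example-dataondemand.nasdaqomx.com/api/ticks"

params = {
    "symbol": "AAPL",                      # specific ticker symbol
    "fields": "time,price,size,exchange",  # requested data fields
    "start": "2010-05-06T14:40:00",        # time frame of interest...
    "end": "2010-05-06T15:00:00",          # ...down to the tick level
    "apikey": "YOUR_API_KEY",
}

resp = requests.get(BASE_URL, params=params, timeout=30)
resp.raise_for_status()
for tick in resp.json()["ticks"]:
    print(tick["time"], tick["price"], tick["size"])
```

Larger extracts would go through the batch interface instead: the same selection criteria, returned as files for FTP download rather than as an in-line response.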
In addition to straightforward slice-and-dice access to historical stock data, NASDAQ Data-On-Demand also provides a historical analytics service overlaid on the raw data to sweeten the pot. For example, users can calculate VWAP and TWAP (volume-weighted and time-weighted average price) for a given stock over a given period on the fly, picking the specific time range of trades down to the sub-second level, with the result typically returned in under one second.
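NASDAQ doesn't document how its analytics service computes these figures, but the definitions themselves are standard: VWAP is total traded value divided by total volume, and TWAP weights each price by how long it was in effect. A minimal sketch over a window of trades follows; the Trade record and the duration-weighted TWAP convention are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Trade:
    ts: float     # timestamp in seconds, with sub-second precision
    price: float
    volume: int

def vwap(trades):
    """Volume-weighted average price: sum(price * volume) / sum(volume)."""
    total_volume = sum(t.volume for t in trades)
    if total_volume == 0:
        raise ValueError("no volume in window")
    return sum(t.price * t.volume for t in trades) / total_volume

def twap(trades, window_end):
    """Time-weighted average price: each trade's price is weighted by
    how long it stood as the last traded price within the window."""
    if not trades:
        raise ValueError("no trades in window")
    weighted = sum(cur.price * (nxt.ts - cur.ts)
                   for cur, nxt in zip(trades, trades[1:]))
    weighted += trades[-1].price * (window_end - trades[-1].ts)
    span = window_end - trades[0].ts
    return weighted / span if span > 0 else trades[-1].price
```

The service's appeal is that this aggregation happens server-side, over ticks you never had to store: pick a sub-second time range, and the scan and the arithmetic are someone else's problem.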
Just the Beginning?
This is surely the first of many big data migrations to the cloud in the financial services sector. Financial market data in particular is a natural fit for cloud storage and data-as-a-service delivery: it's big, it's in demand, it's valuable, and, unlike corporate and consumer data, it isn't private. Moreover, automated trading is systematically making inroads into every global market and every asset class, from equities to forex. It's just too much data for most firms to manage in-house. That new giant sucking sound we'll soon be hearing will be all those petabytes of historical trade and reference data rising up to the cloud.

I find it amazing how many financial institutions are still afraid of making this kind of move toward cloud technology. Most of the traditional setups (single dedicated in-house data centers, single database servers, poorly vertically scaled relational databases, and more) are anything but highly available and offer very few of the advantages of cloud technologies. These are advances that financial providers need to take advantage of. I hope to see more moves like this over the next few years, and I'd like to see the financial sector really step up and take the lead here.