Many Data Sources
These days, traders can obtain a reasonably accurate historical
database from various vendors for about $1,000 and keep it current by
subscription for about $500 per year. Traders can also find many
sources of free historical data on the net.
Like many traders, I use various sources, including subscription data
services. For this study I am using data from CRB.
Scan The Data to Verify Consistency
Before I begin system testing, I like to verify that my data is OK. I
have a scan program that checks my data for basic problems such as
missing days, an open or close outside the high/low range, and an open
interest change exceeding volume.
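
The three checks listed above can be sketched in a few lines of Python.
This is a minimal illustration, not the scan program used for this
study. The bar layout (a dict with date, open, high, low, close, volume
and oi keys) and the function name scan_bars are assumptions made for
the example. The missing-day check assumes every weekday should have a
bar, so exchange holidays will show up as false alarms.

```python
from datetime import date, timedelta


def scan_bars(bars):
    """Return a list of (date, problem) tuples for daily bars sorted by date.

    Each bar is a dict with keys: date, open, high, low, close, volume, oi.
    This layout is an assumption for the sketch, not a vendor format.
    """
    problems = []
    prev = None
    for bar in bars:
        d = bar["date"]

        # Basic range sanity: the high must not be below the low.
        if bar["high"] < bar["low"]:
            problems.append((d, "high below low"))

        # Open and close must fall inside the high/low range.
        for field in ("open", "close"):
            if not bar["low"] <= bar[field] <= bar["high"]:
                problems.append((d, f"{field} outside high/low range"))

        if prev is not None:
            # Any weekday strictly between two consecutive bars is missing.
            # Holidays are not modeled, so they are flagged as well.
            day = prev["date"] + timedelta(days=1)
            while day < d:
                if day.weekday() < 5:
                    problems.append((day, "missing day"))
                day += timedelta(days=1)

            # Open interest cannot legitimately change by more than
            # the number of contracts traded that day.
            if abs(bar["oi"] - prev["oi"]) > bar["volume"]:
                problems.append((d, "open interest change exceeds volume"))

        prev = bar
    return problems
```

Calling scan_bars on a list of bars returns an empty list when no
problem is found, so the result can be printed or counted before any
system test begins.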
Most futures data has inconsistencies between volume and open interest,
particularly in the early days of a delivery, when trading is thin.
Sometimes firms under-report or over-report open interest. For example,
if a firm has 10 lots long for one client and 10 short for another, it
reports open interest of 10 lots. If a firm has 10 lots long and 10
short for the same client, it reports open interest as zero. Thus, for
the same position, the firm may report various values for open
interest.
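
The arithmetic in this example can be reproduced with a small sketch.
It assumes, purely for illustration, that a firm either nets long
against short within each client account or reports the gross totals.
The function name and the net_within_client flag are hypothetical and
do not describe any exchange's or firm's actual reporting rule.

```python
def reported_open_interest(positions, net_within_client=True):
    """Return the open interest a firm would report under one convention.

    positions maps a client name to a (long_lots, short_lots) tuple.
    With net_within_client=True, each client's longs and shorts offset
    each other before totals are taken (an assumed convention).
    """
    total_long = 0
    total_short = 0
    for long_lots, short_lots in positions.values():
        if net_within_client:
            # Offset long against short inside the same account.
            net = long_lots - short_lots
            long_lots, short_lots = max(net, 0), max(-net, 0)
        total_long += long_lots
        total_short += short_lots
    # Open interest counts one side of each open contract.
    return max(total_long, total_short)
```

With two clients, one long 10 and one short 10, both conventions give
10 lots. With a single client long 10 and short 10, netting gives zero
while gross reporting gives 10, which is how the same position can
produce different open interest figures.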
When firms mis-report, or correct a previous mis-report, open interest
can change without any corresponding volume. The CME reports volume and
open interest once per day and does not go back and correct prior
numbers.
Cross-Check With Other Data Sources
Different data vendors use different conventions for reporting Open,
High and Low. Some vendors report the all-session range while others
report the day session only. This can impact systems that signal on new
highs and lows.
Different vendors also have different conventions for computing the
open from the opening range and/or for creating an open if no trades
occur during the open. This can impact systems that compute volatility
as a delta from the opening price.
Gibbons Burke compares CRB data against MJK data. His study shows the
difference between all-session and day-session-only series. For similar
sessions, he gets a good match, right down to the same volume / open
interest inconsistencies.
Jake Carriker and David Meyer also provide comparison studies.
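
A cross-check of this kind can be sketched as a field-by-field
comparison of two vendors' series. The following is an illustration
only and is not the method used in the studies mentioned above. It
assumes each series is a dict keyed by an ISO date string, with each
bar a dict of open, high, low and close values. The function name
compare_series and the tolerance parameter are hypothetical.

```python
def compare_series(a, b, fields=("open", "high", "low", "close"), tol=1e-9):
    """Compare two price series keyed by date.

    Returns (diffs, only_a, only_b), where diffs is a list of
    (date, field, a_value, b_value) for fields that disagree by more
    than tol, and only_a / only_b list dates present in one series only.
    """
    diffs = []
    # Compare bars only on dates that both vendors report.
    for d in sorted(set(a) & set(b)):
        for field in fields:
            if abs(a[d][field] - b[d][field]) > tol:
                diffs.append((d, field, a[d][field], b[d][field]))
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    return diffs, only_a, only_b
```

A run of differences confined to the high and low fields would be
consistent with one vendor reporting the all-session range and the
other the day session only, while differences in the open point toward
differing opening-range conventions.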