|
Data Comparison
|
July 19, 2005
Comparison of Ed's Data with Other Sources
From: Gibbons Burke
Attached is the spreadsheet highlighting the differences between the
CRB and MJK data.
Instead of the pit-only series I had been using before, I grabbed the
data for the composite, which is alleged to contain the combined
session range. I am pleased to see only one point of difference in the
Close column, and likewise only one in the Volume column. The open
interest data matches yours exactly. I removed your Saturday and Sunday
dates, and changed [no data] values to zeros for matching purposes.
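The clean-up step described above (dropping weekend dates and turning [no data] cells into zeros) can be sketched roughly as follows. The row layout, the field names and the sample values are assumptions made up for illustration; they are not taken from the actual spreadsheet.

```python
from datetime import date

NO_DATA = "[no data]"  # placeholder text as quoted in the note above

def clean_rows(rows):
    """Drop Saturday/Sunday rows and replace NO_DATA cells with 0."""
    cleaned = []
    for day, fields in rows:
        if day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
            continue
        cleaned.append((day, {k: 0 if v == NO_DATA else v
                              for k, v in fields.items()}))
    return cleaned

# Hypothetical sample rows: (date, {field: value})
sample = [
    (date(2005, 7, 15), {"close": 5825, "volume": NO_DATA}),   # Friday
    (date(2005, 7, 16), {"close": NO_DATA, "volume": NO_DATA}),  # Saturday
    (date(2005, 7, 18), {"close": 5790, "volume": 1200}),      # Monday
]
cleaned = clean_rows(sample)
```

Replacing missing values with zeros is only a matching convenience here; it makes the two sheets line up cell for cell, but a zero produced this way is not a real price or volume.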
I added another sheet that takes the spread between corresponding cells
in the two underlying data sheets, to show the point differences between
the two data sets more directly. I double-checked the closing price
difference I found against another source, and that source agrees with
CRB. In the case of the volume difference, the other source agrees with
MJK.
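A minimal sketch of such a spread sheet, assuming each data set is held as a dictionary keyed by date with one value per field. The names `FIELDS`, `bar`, `spread_sheet` and `differing_cells`, and all sample numbers, are hypothetical; prices are kept as integers (for example, in ticks) so that equal values subtract to exactly zero.

```python
FIELDS = ("open", "high", "low", "close", "volume", "open_interest")

def bar(o, h, l, c, v, oi):
    """Build one day's record from positional values."""
    return dict(zip(FIELDS, (o, h, l, c, v, oi)))

def spread_sheet(a, b):
    """Cell-by-cell difference a - b for the dates present in both series."""
    return {d: {f: a[d][f] - b[d][f] for f in FIELDS}
            for d in sorted(a.keys() & b.keys())}

def differing_cells(spread):
    """List (date, field, difference) for every non-zero cell."""
    return [(d, f, v) for d, row in spread.items()
            for f, v in row.items() if v != 0]

# Made-up sample data standing in for the two vendor sheets
crb = {"2005-07-14": bar(5800, 5850, 5780, 5820, 900, 4000),
       "2005-07-15": bar(5820, 5830, 5810, 5825, 0, 4000)}
mjk = {"2005-07-14": bar(5800, 5850, 5780, 5820, 900, 4000),
       "2005-07-15": bar(5825, 5825, 5825, 5825, 0, 4000)}

diffs = differing_cells(spread_sheet(crb, mjk))
```

Any cell that is zero in the spread matches between the two sources, so only the entries in `diffs` need checking against a third source.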
My analysis of what we are seeing in the open-high-low fields is that it
reflects a difference in vendor reporting conventions. MJK is the
primary source for our data, so our series and yours flow from the
exchanges by different routes.
I believe MJK assigns the official settlement price for the day to all
fields for those days where there is no trading, whereas CRB records the
quotes in the bid-ask spread in its calculation of the high-low range
for those days. This belief is supported by the observation that in the
comparison report the MJK data has the same value for all the fields.
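One way to check this observation is to look at every zero-volume day and test whether the open, high, low and close are all equal. The sketch below assumes each series is a dictionary of daily records; the function names and the sample values are invented for illustration.

```python
def is_flat_bar(bar):
    """True when open, high, low and close all carry the same value."""
    return bar["open"] == bar["high"] == bar["low"] == bar["close"]

def zero_volume_report(bars):
    """Map each zero-volume date to whether its bar is flat."""
    return {d: is_flat_bar(b) for d, b in bars.items() if b["volume"] == 0}

# Hypothetical records for the same no-trade day from two vendors
settlement_style = {
    "2005-07-15": {"open": 5825, "high": 5825, "low": 5825,
                   "close": 5825, "volume": 0},
}
quote_style = {
    "2005-07-15": {"open": 5820, "high": 5830, "low": 5810,
                   "close": 5825, "volume": 0},
}

flat_a = zero_volume_report(settlement_style)
flat_b = zero_volume_report(quote_style)
```

If the belief above is right, a settlement-style series should report `True` for every zero-volume day, while a series that records bid-ask quotes will show some `False` entries.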
In my opinion the MJK convention is more "accurate" in the
sense that the day's range, which is zero, exactly reflects the
volatility of that contract on that day, given it saw no trading. The
"phantom" range reported by CRB may indicate quote interest
where there was no activity, and is subject to manipulation - anyone can
put in a bid or an offer and it doesn't have to be hit to be reflected
in the day's range.
The MJK policy about the Open price is also well-considered. On days
where there is trading (as opposed to the zero volume days, where the
open reflects the settlement price) they use the first number in the
opening range reported by the exchanges (for those exchanges that report
a range rather than a single number for the open) which good analysts I
know have found to be the better reflection of the first printed trade
of the day.
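The Open rule described above can be written as a small function. This is only a sketch of the convention as stated in this note, not vendor code; `reported_open` and its parameters are hypothetical names, and an exchange that reports a single opening number is represented here as a one-element range.

```python
def reported_open(volume, settlement, opening_range):
    """Open price under the convention described above.

    volume        -- contracts traded that day
    settlement    -- official settlement price for the day
    opening_range -- sequence of prices reported by the exchange for the open
    """
    if volume == 0:
        # No trading: every field, including the open, takes the settlement.
        return settlement
    # Trading day: use the first number in the reported opening range.
    return opening_range[0]

no_trade_open = reported_open(0, 5825, None)
trade_open = reported_open(900, 5820, (5800, 5805))
single_open = reported_open(900, 5820, (5800,))
```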
Regarding the volume / open-interest change anomalies you have found, it
is significant that your series matches our series so closely in the
volume and open interest fields. This fact suggests to me that the
likely source of those anomalies is the exchange itself. Is it
possible that the OI changes may be recorded after the volume is
published, and reflect out-trade settlements, which take place the
following morning? |
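The note does not say exactly what the anomalies look like. Assuming they are days on which open interest changes by more than the day's reported volume (something that ordinary trading alone cannot produce, since each contract traded moves open interest by at most one), a screening pass might look like this. The function name, the row layout and the sample figures are all hypothetical.

```python
def oi_volume_anomalies(rows):
    """Flag days whose open-interest change exceeds that day's volume.

    rows -- list of (date, volume, open_interest) in chronological order
    Returns a list of (date, oi_change, volume) for the flagged days.
    """
    flagged = []
    for prev, cur in zip(rows, rows[1:]):
        change = cur[2] - prev[2]
        if abs(change) > cur[1]:
            flagged.append((cur[0], change, cur[1]))
    return flagged

# Made-up sample series
rows = [
    ("2005-07-13", 800, 4000),
    ("2005-07-14", 900, 4300),   # +300 on volume 900: plausible
    ("2005-07-15", 0, 4150),     # -150 on zero volume: flagged
    ("2005-07-18", 1200, 4200),  # +50 on volume 1200: plausible
]
anomalies = oi_volume_anomalies(rows)
```

Running the same screen on both vendors' series and getting the same flagged dates would be consistent with the anomalies originating at the exchange rather than with either vendor.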
|
The Smoking Spreadsheet
During an email exchange with Gibbons Burke, I mention that I normally
keep contributors to FAQ anonymous and only publish private names with
permission. Gibbons points out that Microsoft products now include file
tracking technology that can identify the author of an Excel
spreadsheet.

This screenshot from the UNIX more command clearly identifies Gibbons
Burke as the creator of the smoking spreadsheet.
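Paging a binary file through more works for this purpose only because the author name sits in the file as readable text. A rough equivalent, assuming the name is stored as plain ASCII as the screenshot suggests, is to pull out the printable runs from the raw bytes. The function name and the sample bytes below are made up for illustration and do not reproduce a real spreadsheet.

```python
import re

def printable_strings(data, min_len=4):
    """Return runs of printable ASCII at least min_len characters long."""
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode() + rb",}")
    return [m.decode("ascii") for m in pattern.findall(data)]

# Hypothetical stand-in for the bytes of a binary spreadsheet file
blob = (b"\x00\x01\xd0\xcf\x11\xe0\x00\x1e\x00"
        b"Jane Example\x00\x00"
        b"Microsoft Excel\x00\x02ab\x00")

found = printable_strings(blob)
```

For a real file, the bytes would come from something like `open(path, "rb").read()`, with `path` pointing at the spreadsheet in question.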
|
|