Chapter 3 Create Series of Nearby Prices
One of the most common tasks that we do is examine a series of nearby futures prices. Historical data for futures prices is readily available, but series of nearby prices are often behind a paywall. Peicing together a nearby series on your own by hand is tedious and error prone. The script found at
https://github.com/mindymallory/RollFutures
does this programmatically. It fetches the appropriate contracts from Quandl.com, and then identifies the nearby contract. The nearby is rolled to the first deferred when volume in the first deferred overtakes volume in the nearby. Hu et al. (2017) showed that price discovery typically moves to the first deferred contract when it’s volume surpasses the nearby, so that is the method of rolling employed here.
3.1 Markets Covered
This script will pull corn, soybeans, KC wheat, Chicago wheat, and crude oil.
3.2 The Script
Before you begin running the script, make sure you have created a new folder for you project and create a new R project. This will make all the working directories correct. See this for details.
First, install and load the required packages. Note that you will need to go to Quandl.com and sign up for a free account and get an api key. Place it inside the quotes in the Quandl.api.key()
function.
# install.packages("Quandl")
# install.packages("plyr")
# install.packages("tidyr")
# install.packages("ggplot2")
library(Quandl)
library(plyr)
library(tidyr)
library(ggplot2)
#Quandl.api_key("YourAPIKeyHERE")
Next, initialize the parameters how you want. Choose the start and end date you need for your data. Then modify the c_code
line to choose the correct commodity. c_code=2
is shown so this will pull soybeans data. If you choose something else you need to comment the contracts <-
line for soybeans and uncomment the line for the commodity you want. The letters correspond to the listed contract months for each commodity. E.g, Z
is for December.
start <- 2015
end <- 2017
c_code <- 2
commodity_code <- c("C", "S", "W", "KW", "CL")
# #C
# contracts <- c('H', 'K', 'N', 'U', 'Z')
#S
contracts <- c( 'F', 'H', 'K', 'N', 'Q', 'U', 'X')
# #W
# contracts <- c( 'H', 'K', 'N', 'U', 'Z')
# #KW
# contracts <- c( 'H', 'K', 'N', 'U', 'Z')
# #CL
# contracts <- c( 'F', 'G', 'H', 'J', 'K', 'M', 'N', 'Q', 'U', 'V', 'X', 'Z')
The next portion of code defines a list and then uses the Quandl()
function to fetch data from Quandl.com. Notice we use the paste0()
function to buld the correct commodity code to give to the Quandl api.
years <- seq(start, end, by =1)
data <- list()
k <- 1
for (i in start:end){
for (j in 1:length(contracts)){
MyData = cbind(Quandl(paste0("CME/", commodity_code[c_code], contracts[j], years[i-start +1])), paste0(contracts[j], '-', years[i-start+1], '-Settle'), paste0(contracts[j], '-', years[i-start+1], '-Volume'))
data[[k]] <- MyData
k <- k+1
# To keep raw data files uncomment this line.
# write.csv(MyData, file = paste0("data-download/", contracts[j], years[i-start +1], ".csv"))
}
}
Then the list is converted to a dataframe with the ldply()
function. Here we also separate out the settle
and volume
and stack them vertically with rbind
so that the contract info can be stored as a single factor variable.
DATA <- ldply(data, rbind)
DATA <- DATA[, c(1,7,8,10, 11)]
DATA <- as.data.frame(DATA)
colnames(DATA) <- c('Date', 'Settle', 'Volume', 'Contract.Settle', 'Contract.Volume')
settle <- DATA[, c("Date", "Contract.Settle", "Settle")]
volume <- DATA[, c("Date", "Contract.Volume", "Volume")]
colnames(settle) <- c('Date', 'Contract', 'Value')
colnames(volume) <- c('Date', 'Contract', 'Value')
DATA <- rbind(settle, volume)
colnames(DATA) <- c('Date', 'Contract', 'Value')
In the next lines we use the spread
function to take the long data set to a wide dataset where each row contains a date with all the contracts trading in separate columns for settle price and volume. We print the ‘long’ dataset before applying spread
and the ‘wide’ dataset after to give a sense what the code is doing. The last line trims the dataset to your requested dates.
head(DATA)
## Date Contract Value
## 1 2015-01-14 F-2015-Settle 992.0
## 2 2015-01-13 F-2015-Settle 1000.0
## 3 2015-01-12 F-2015-Settle 1013.0
## 4 2015-01-09 F-2015-Settle 1051.0
## 5 2015-01-08 F-2015-Settle 1045.0
## 6 2015-01-07 F-2015-Settle 1052.5
DATA <- spread(DATA, Contract, Value)
DATA <- as.xts(DATA[, -1], order.by = DATA[, 1])
DATA <- DATA[paste0(start,'/',end)]
head(DATA)
## F-2015-Settle H-2015-Settle K-2015-Settle N-2015-Settle
## 2015-01-02 1002.0 1007.0 1014.5 1020.0
## 2015-01-05 1039.5 1045.5 1052.0 1057.0
## 2015-01-06 1051.0 1055.5 1061.0 1066.0
## 2015-01-07 1052.5 1056.5 1061.0 1065.5
## 2015-01-08 1045.0 1048.5 1053.5 1058.5
## 2015-01-09 1051.0 1052.5 1057.0 1062.0
## Q-2015-Settle U-2015-Settle X-2015-Settle F-2016-Settle
## 2015-01-02 1020.5 1004.0 993.5 998.0
## 2015-01-05 1055.5 1037.5 1025.0 1029.5
## 2015-01-06 1063.0 1043.0 1028.5 1033.5
## 2015-01-07 1062.5 1041.0 1025.0 1029.0
## 2015-01-08 1056.5 1034.5 1017.0 1022.5
## 2015-01-09 1059.0 1037.5 1019.0 1024.0
## H-2016-Settle K-2016-Settle N-2016-Settle Q-2016-Settle
## 2015-01-02 1002.0 1004.5 1008.5 1007.5
## 2015-01-05 1034.5 1036.5 1039.0 1037.5
## 2015-01-06 1038.5 1040.0 1043.0 1041.5
## 2015-01-07 1034.5 1036.0 1039.0 1037.5
## 2015-01-08 1027.0 1029.5 1032.0 1030.5
## 2015-01-09 1029.5 1031.0 1035.5 1033.0
## U-2016-Settle X-2016-Settle F-2017-Settle H-2017-Settle
## 2015-01-02 992.5 985.0 985.5 985.5
## 2015-01-05 1022.5 1008.0 1008.5 1008.5
## 2015-01-06 1026.5 1012.5 1013.0 1013.0
## 2015-01-07 1019.5 1001.0 1001.5 1001.5
## 2015-01-08 1012.5 992.0 995.0 995.5
## 2015-01-09 1015.0 995.5 997.5 998.0
## K-2017-Settle N-2017-Settle Q-2017-Settle U-2017-Settle
## 2015-01-02 991.5 1006.5 1006.5 1006.5
## 2015-01-05 1018.5 1029.0 1029.0 1029.0
## 2015-01-06 1018.0 1033.5 1033.5 1033.5
## 2015-01-07 1011.5 1022.0 1022.0 1022.0
## 2015-01-08 1001.5 1013.5 1013.5 1013.5
## 2015-01-09 1006.5 1017.0 1017.0 1017.0
## X-2017-Settle F-2015-Volume H-2015-Volume K-2015-Volume
## 2015-01-02 979.5 9191 63624 15351
## 2015-01-05 1001.5 7334 117101 20283
## 2015-01-06 1005.5 3997 100146 24102
## 2015-01-07 993.5 3865 78827 23459
## 2015-01-08 985.0 4033 72445 19887
## 2015-01-09 987.5 3320 64335 18130
## N-2015-Volume Q-2015-Volume U-2015-Volume X-2015-Volume
## 2015-01-02 13690 472 201 6484
## 2015-01-05 19915 1306 185 10079
## 2015-01-06 21067 413 215 9464
## 2015-01-07 23225 607 313 10202
## 2015-01-08 14560 614 239 8110
## 2015-01-09 14016 730 297 8428
## F-2016-Volume H-2016-Volume K-2016-Volume N-2016-Volume
## 2015-01-02 89 28 0 1
## 2015-01-05 121 34 10 1
## 2015-01-06 102 51 16 23
## 2015-01-07 126 69 34 21
## 2015-01-08 437 121 48 96
## 2015-01-09 323 65 4 1
## Q-2016-Volume U-2016-Volume X-2016-Volume F-2017-Volume
## 2015-01-02 0 0 38 0
## 2015-01-05 0 0 69 0
## 2015-01-06 0 0 35 0
## 2015-01-07 0 0 57 0
## 2015-01-08 0 0 42 0
## 2015-01-09 0 0 20 0
## H-2017-Volume K-2017-Volume N-2017-Volume Q-2017-Volume
## 2015-01-02 0 0 1 0
## 2015-01-05 0 0 0 0
## 2015-01-06 0 0 0 0
## 2015-01-07 0 0 0 0
## 2015-01-08 0 0 0 0
## 2015-01-09 0 0 0 0
## U-2017-Volume X-2017-Volume
## 2015-01-02 0 0
## 2015-01-05 0 1
## 2015-01-06 0 0
## 2015-01-07 0 0
## 2015-01-08 0 0
## 2015-01-09 0 0
Now, we use the apply
function to apply which.max
to every row. This delivers the index of which column takes the maximum value. Since we only identified the volume columns, this will identify the column with maximum volume. Once we have all these indices, we can build a series of nearby prices by using the index to pick out the contracts’ settle price that had the maximum volume for each date.
temp <- apply(DATA[, (length(contracts)*length(years)+1):dim(DATA)[2]], 1, which.max)
nearby <- vector(mode= "numeric", length = length(temp))
for (i in 1:length(temp)){
nearby[i] <- DATA[i, temp[i]]
}
DATA$Nearby <- nearby
head(DATA$Nearby)
## Nearby
## 2015-01-02 1007.0
## 2015-01-05 1045.5
## 2015-01-06 1055.5
## 2015-01-07 1056.5
## 2015-01-08 1048.5
## 2015-01-09 1052.5
g <- autoplot(DATA$Nearby)
g
Now the dataset DATA$Nearby
can be used in whatever analysis you want to do. If you prefer to write it to a .csv file in order to use it somewhere else do something like the following, but put in your own file path.
write.zoo(DATA$Nearby, file = "YourFilePath.csv", sep = ",")
Multi-Market Models VECM
Zhepeng Hu, Philip Garcia, Mindy Mallory. 2017. “Measuring Price Discovery Between Nearby and Deferred Contracts in Storable and Non-Storable Commodity Futures Markets.” Working Paper.