Principal Components Part Two

30 Apr, 2016 — 7 min

Everything is better with data, and a post on principal components is no exception. Principal components analysis is often used with interest rates and it is probably the most well-known example of where one can use the technique. But often the story ends with a justification of using shift/slope/curvature to describe yield curve movements and that’s it. Not today! Armed with R and some data, we’ll dive in deeper.

The code in this post can be downloaded from this gist, where it’s a little better organized.

Step one is to get rate data, in this case from our friends at the Saint Louis Fed who keep the FRED database. FRED is great, and even better that you can easily load data from it into R using Quantmod. (Does this make up for the ill-advised QE policy? Not quite.)

Below is some code for getting the data series and converting it to weekly data. When you download the data, it will be for every business day, but we’ll convert it to weekly. We want rate changes, so take the difference in the weekly rate series. Finally, each time series is centered by subtracting the mean.

# Get data for Treasury interest rates from FRED
t.symbol <- c('DGS6MO', 'DGS2', 'DGS5', 'DGS10', 'DGS20', 'DGS30')
out <- getSymbols(t.symbol,src="FRED", auto.assign = TRUE)

# Combine the data series into one xts object
rates.xts <- DGS6MO
for (i in 2:6){
  rates.xts <- merge(rates.xts, get(t.symbol[i]))
}
# Take only data since 2006-2-10 and convert to weekly data
rates.xts <- rates.xts['2006-02-10/']/100
# to.period is smart enough to eliminate NAs from holidays
rates.xts <- to.period(rates.xts, period='weeks', OHLC=FALSE)
# Take difference to get rate changes
rate.chng <- na.omit(diff(log(1+rates.xts)))
# Compute column means and create a centered time series
rate.mean <- colMeans(rate.chng)
rate.chng.zm <- matrix(0, nrow=dim(rate.chng)[1], ncol = dim(rate.chng)[2])
for(i in 1:ncol(rate.chng)){rate.chng.zm[,i] <- rate.chng[,i]-rate.mean[i]}

Here we’ll calculate the elements of PCA: E.mat is a matrix containing the eigenvectors, lambda is a vector of the eigenvalues, and F.mat is a matrix of the factor realizations.

lambda <- eigen(cov(rate.chng))$values # lambda = eigenvalues
E.mat <- eigen(cov(rate.chng))$vectors # E.mat = eigenvectors
F.mat <- rate.chng.zm%*%E.mat # Meucci 3.240
F.mat <- xts(F.mat, order.by=index(rate.chng))
colnames(F.mat) <- paste('Factor', 1:ncol(rate.chng))

A Couple Of Quick Equalities

Just to reinforce a few things from the previous post, I’ll demonstrate a few equalities using this data. The first is that you can recover the entire series of rate changes from your matrix of factor realizations and eigenvectors, adding back the column means of course. But remember that this only works if you use all the principal components. Anything less is an approximation.

# This is the reconstructed data series
x.tilde <- F.mat%*%t(E.mat)+ matrix(rate.mean, nrow=dim(rate.chng)[1], ncol=dim(rate.chng)[2], byrow=TRUE)
# These are equal to 15 digits, the number of attributes is different. see str(coredata(t.series))
all.equal(x.tilde, coredata(rate.chng), check.attributes = FALSE)

## [1] TRUE

Another thing to show is that \(R^T R\) is equal to the covariance of matrix of the centered data, once you divide by n-1.

# prove that t(r)%*% r equals cov(r) when r is centered
r <- coredata(rate.chng.zm)
rt.r <- t(r)%*%r
rates.cov <- cov(rate.chng)
all.equal(unname(rates.cov), rt.r/(dim(rate.chng)[1]-1))

## [1] TRUE

Finally let’s show that the singular value decomposition results are the same as from the eigendecomposition. The eigenvectors can turn out to have the opposite sign from the two methods, and the code below switches the sign based on data on the date of the post. On a different day, the signs may be flipped in a different way.

# Calculate the SVD on the demeaned rate data
UDV <- svd(coredata(rate.chng.zm))

# First the easy part, E.mat.
# You'll need to check for columns that have the sign reversed
E.temp <- E.mat
E.temp [, c(1:2, 5:6)]<- -1*E.mat[, c(1:2, 5:6)]
all.equal(UDV$v, E.temp, check.attributes=FALSE, tolerance=0.000000001)

## [1] TRUE

The end of the last post showed that if you want to get the factor realizations, you can use either side of the equality \(R V = U D\), and here F.mat equals \(R V\) using the set of eigenvectors from above (E.mat that we got from the function eigen). We have to change the sign of some of the columns to demonstrate this equality.

# Let's check F.mat == U times diag(d). Equal yet again.
F.temp <- coredata(F.mat)
F.temp[, c(1:2, 5:6)] <- -1*F.mat[, c(1:2, 5:6)]
all.equal(UDV$u%*%diag(UDV$d), F.temp, check.attributes=FALSE,
  tolerance=0.000000001, scale=1)

## [1] TRUE

Finally the SVD’s diagonal matrix is the square root of the eigenvalues we got above (after dividing by n-1). The eigenvalues are the variance of the matrix of factor realizations, so the standard deviation of the first factor is the square root of lambda.

# Show that the values of d^ equal the values of lambda
all.equal( ((UDV$d)^2)/(nrow(rate.chng.zm)-1), lambda)
# Show that the standard deviation of F.mat is the square root of lambda. Meucci 3.235.
all.equal(sd(F.mat[,1]), sqrt(lambda[1]))

Ok, that was boring. Now it gets a little better.

What Do The PCA Curve Shifts Look Like?

Following Meucci’s book the code below shows the 1-standard deviation shift in the first three principal components (shift, slope and curvature). This is done by taking the square root of the eigenvalues times the eigenvector. You can see, for example, that a 1-standard deviation move in PC1 is a pretty flat move down in rates, that PC2 is a flattener, and PC2 is a change in curvature.

# Construct a matrix of the 1 standard deviation PC shifts Meucci 3.236
PC.shift <- matrix(0, nrow=3, ncol=dim(rate.chng)[2])
for(i in 1:3){PC.shift[i,] <- sqrt(lambda[i])*E.mat[,i]}
colnames(PC.shift) <- colnames(rate.chng)
rownames(PC.shift) <- c('PC1', 'PC2', 'PC3')
round(PC.shift*100, 4)

##      DGS6MO    DGS2    DGS5   DGS10   DGS20   DGS30
## PC1 -0.0268 -0.0709 -0.1144 -0.1192 -0.1127 -0.1068
## PC2  0.0474  0.0554  0.0308 -0.0109 -0.0316 -0.0361
## PC3  0.0511 -0.0137 -0.0245 -0.0053  0.0098  0.0181

This is a pairs plot of the PCA factors. Not exactly a scattershot pattern.

plot of chunk showPCAscatter

While the results are uncorrelated, they do show concordance. So you can’t say they are unrelated. This shows the correlation, which is as expected zero, but checkout Kendall’s tau.

# Show that the Factor realizations are uncorrelated
round(cor(F.mat), 8)

##          Factor 1 Factor 2 Factor 3 Factor 4 Factor 5 Factor 6
## Factor 1        1        0        0        0        0        0
## Factor 2        0        1        0        0        0        0
## Factor 3        0        0        1        0        0        0
## Factor 4        0        0        0        1        0        0
## Factor 5        0        0        0        0        1        0
## Factor 6        0        0        0        0        0        1

# But there is concordance
round(cor(F.mat, method="kendall"), 6)

##          Factor 1  Factor 2  Factor 3  Factor 4  Factor 5  Factor 6
## Factor 1 1.000000  0.080055  0.083200  0.033263  0.049161  0.016900
## Factor 2 0.080055  1.000000 -0.220006  0.011878  0.014459 -0.000169
## Factor 3 0.083200 -0.220006  1.000000 -0.079914  0.105686 -0.003879
## Factor 4 0.033263  0.011878 -0.079914  1.000000 -0.065313  0.043152
## Factor 5 0.049161  0.014459  0.105686 -0.065313  1.000000  0.000818
## Factor 6 0.016900 -0.000169 -0.003879  0.043152  0.000818  1.000000

What About The Behavior Of The Factors?

PCA has a number of great attributes: it reduces greatly reduces the dimensionality of the problem it is attacking, it is intuitive and it has some cool math behind it. What would make it perfect? If the factor realizations were normally distributed white noise. Nerd friends, I’m sure you have many fantasies, and this would be another. Here is a histogram of the first three factors with a normal curve superimposed on top. The first is not too far from the normal distribution but the second and third are very fat tailed.

plot of chunk showPCAcurve

Finally, let’s look at how the factors vary over time. One of the stylized facts of financial data is volatility clustering that can be seen by looking at squared returns or absolute returns. Below are charts of the first two factors, showing change in rates, squared changes and absolute value of changes.

plot of chunk showPCAret

The second certainly has some noticable volatility clustering. I think these can be successfully modeled with a GARCH, but that’s a topic for another day.

So, that was a lot in one post but hopefully this gives you some insight into calculating PCA factors and using data using R. And, finally, a shout out to all who made it possible to publish directly from R to Wordpress.