3. QR Decomposition#
3.1. Overview#
This lecture describes the QR decomposition and how it relates to

- Orthogonal projection and least squares
- A Gram-Schmidt process
- Eigenvalues and eigenvectors
We’ll write some Python code to help consolidate our understandings.
3.2. Matrix Factorization#
The QR decomposition (also called the QR factorization) of a matrix is a decomposition of a matrix into the product of an orthogonal matrix and a triangular matrix.
A QR decomposition of a real matrix $A$ takes the form

$$
A = Q R
$$

where

- $Q$ is an orthogonal matrix (so that $Q^T Q = I$)
- $R$ is an upper triangular matrix
We’ll use a Gram-Schmidt process to compute a QR decomposition
Because doing so is so educational, we’ll write our own Python code to do the job
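As a quick preview of what these properties look like numerically, here is a minimal sketch that checks them with numpy's built-in np.linalg.qr (used here only for illustration; below we build our own routine).

import numpy as np

# a minimal sketch: check the defining properties of a QR decomposition
A = np.array([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
Q, R = np.linalg.qr(A)  # numpy's built-in QR, used only for this illustration

print(np.allclose(Q.T @ Q, np.eye(3)))  # Q is orthogonal
print(np.allclose(R, np.triu(R)))       # R is upper triangular
print(np.allclose(Q @ R, A))            # the product Q R reproduces A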
3.3. Gram-Schmidt process#
We’ll start with a square matrix $A$.

If a square matrix $A$ is nonsingular, then a QR factorization is unique (up to the sign conventions discussed below).

We’ll deal with a rectangular matrix $A$ later.

Actually, our algorithm will also work with a rectangular $A$ that is not square.
3.3.1. Gram-Schmidt process for square $A$#
Here we apply a Gram-Schmidt process to the columns of matrix $A$.

In particular, let

$$
A = \left[ \, a_1 \mid a_2 \mid \cdots \mid a_n \, \right]
$$

Let $\| \cdot \|$ denote the L2 norm.
The Gram-Schmidt algorithm repeatedly combines the following two steps in a particular order:

- normalize a vector to have unit norm
- orthogonalize the next vector
To begin, we set $u_1 = a_1$ and then normalize:

$$
u_1 = a_1, \qquad e_1 = \frac{u_1}{\|u_1\|}
$$

We orthogonalize first to compute $u_2$ and then normalize to create $e_2$:

$$
u_2 = a_2 - (a_2 \cdot e_1) e_1, \qquad e_2 = \frac{u_2}{\|u_2\|}
$$

We invite the reader to verify that $e_1$ is orthogonal to $e_2$ by checking that $e_1 \cdot e_2 = 0$.
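One way to see this: since $e_1 \cdot e_1 = 1$ by construction,

$$
e_1 \cdot u_2 = e_1 \cdot a_2 - (a_2 \cdot e_1)(e_1 \cdot e_1) = e_1 \cdot a_2 - a_2 \cdot e_1 = 0 ,
$$

so $e_2 = u_2 / \|u_2\|$ is orthogonal to $e_1$ as well.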
The Gram-Schmidt procedure continues iterating.
Thus, for $k = 2, \ldots, n-1$ we construct

$$
u_{k+1} = a_{k+1} - (a_{k+1} \cdot e_1) e_1 - \cdots - (a_{k+1} \cdot e_k) e_k,
\qquad e_{k+1} = \frac{u_{k+1}}{\|u_{k+1}\|}
$$
Here $(a_j \cdot e_i)$ can be interpreted as the linear least squares regression coefficient of $a_j$ on $e_i$:

- it is the inner product of $a_j$ and $e_i$, divided by the inner product of $e_i$ with itself, where $e_i \cdot e_i = 1$, as normalization has assured us
- this regression coefficient has an interpretation as being a covariance divided by a variance (a small numerical check follows this list)
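To make the regression interpretation concrete, here is a small sketch (the two columns below are hypothetical, chosen only for illustration) checking that $a_2 \cdot e_1$ coincides with the least squares coefficient from regressing $a_2$ on $e_1$.

import numpy as np

# hypothetical columns used only to illustrate the regression interpretation
a1 = np.array([1.0, 1.0, 0.0])
a2 = np.array([1.0, 0.0, 1.0])

e1 = a1 / np.linalg.norm(a1)  # normalized first column

coef_inner = a2 @ e1  # the inner product a2 . e1
coef_ols = np.linalg.lstsq(e1[:, None], a2, rcond=None)[0][0]  # OLS coefficient of a2 on e1

print(np.isclose(coef_inner, coef_ols))  # they coincide because e1 . e1 = 1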
It can be verified that

$$
A = \left[ \, a_1 \mid a_2 \mid \cdots \mid a_n \, \right]
= \left[ \, e_1 \mid e_2 \mid \cdots \mid e_n \, \right]
\begin{bmatrix}
a_1 \cdot e_1 & a_2 \cdot e_1 & \cdots & a_n \cdot e_1 \\
0             & a_2 \cdot e_2 & \cdots & a_n \cdot e_2 \\
\vdots        & \vdots        & \ddots & \vdots        \\
0             & 0             & \cdots & a_n \cdot e_n
\end{bmatrix}
$$

Thus, we have constructed the decomposition

$$
A = Q R
$$

where

$$
Q = \left[ \, e_1 \mid e_2 \mid \cdots \mid e_n \, \right]
$$

and

$$
R = \begin{bmatrix}
a_1 \cdot e_1 & a_2 \cdot e_1 & \cdots & a_n \cdot e_1 \\
0             & a_2 \cdot e_2 & \cdots & a_n \cdot e_2 \\
\vdots        & \vdots        & \ddots & \vdots        \\
0             & 0             & \cdots & a_n \cdot e_n
\end{bmatrix}
$$
3.3.2. $A$ not square#
Now suppose that $A$ is an $n \times m$ matrix where $m > n$.

Then a QR decomposition is

$$
A = \left[ \, a_1 \mid a_2 \mid \cdots \mid a_m \, \right]
= \left[ \, e_1 \mid e_2 \mid \cdots \mid e_n \, \right]
\begin{bmatrix}
a_1 \cdot e_1 & a_2 \cdot e_1 & \cdots & a_n \cdot e_1 & a_{n+1} \cdot e_1 & \cdots & a_m \cdot e_1 \\
0             & a_2 \cdot e_2 & \cdots & a_n \cdot e_2 & a_{n+1} \cdot e_2 & \cdots & a_m \cdot e_2 \\
\vdots        & \vdots        & \ddots & \vdots        & \vdots            & \ddots & \vdots        \\
0             & 0             & \cdots & a_n \cdot e_n & a_{n+1} \cdot e_n & \cdots & a_m \cdot e_n
\end{bmatrix}
$$

which implies that

$$
\begin{aligned}
a_1     & = (a_1 \cdot e_1) e_1 \\
a_2     & = (a_2 \cdot e_1) e_1 + (a_2 \cdot e_2) e_2 \\
        & \ \ \vdots \\
a_n     & = (a_n \cdot e_1) e_1 + (a_n \cdot e_2) e_2 + \cdots + (a_n \cdot e_n) e_n \\
a_{n+1} & = (a_{n+1} \cdot e_1) e_1 + (a_{n+1} \cdot e_2) e_2 + \cdots + (a_{n+1} \cdot e_n) e_n \\
        & \ \ \vdots \\
a_m     & = (a_m \cdot e_1) e_1 + (a_m \cdot e_2) e_2 + \cdots + (a_m \cdot e_n) e_n
\end{aligned}
$$
3.4. Some Code#
Now let’s write some homemade Python code to implement a QR decomposition by deploying the Gram-Schmidt process described above.
import numpy as np
from scipy.linalg import qr
def QR_Decomposition(A):
    n, m = A.shape # get the shape of A
    Q = np.empty((n, n)) # initialize matrix Q
    u = np.empty((n, n)) # initialize matrix u
    u[:, 0] = A[:, 0]
    Q[:, 0] = u[:, 0] / np.linalg.norm(u[:, 0])

    for i in range(1, n):

        u[:, i] = A[:, i]
        for j in range(i):
            u[:, i] -= (A[:, i] @ Q[:, j]) * Q[:, j] # get each u vector

        Q[:, i] = u[:, i] / np.linalg.norm(u[:, i]) # compute each e vector

    R = np.zeros((n, m))
    for i in range(n):
        for j in range(i, m):
            R[i, j] = A[:, j] @ Q[:, i]

    return Q, R
The preceding code is fine but can benefit from some further housekeeping.
We want to do this because later in this notebook we want to compare results from using our homemade code above with those produced by the QR routine that the Python scipy package delivers.
There can be sign differences between the $Q$ and $R$ matrices produced by different numerical algorithms.

All of these are valid QR decompositions because of how the sign differences cancel out when we compute $Q R$.

However, to make the results from our homemade function and the QR module in scipy comparable, let’s require that $Q$ have positive diagonal entries.

We do this by adjusting the signs of the columns in $Q$ and the rows in $R$ accordingly.
To accomplish this we’ll define a pair of functions.
def diag_sign(A):
    "Compute the signs of the diagonal of matrix A"

    D = np.diag(np.sign(np.diag(A)))

    return D

def adjust_sign(Q, R):
    """
    Adjust the signs of the columns in Q and rows in R to
    impose positive diagonal of Q
    """

    D = diag_sign(Q)

    Q[:, :] = Q @ D
    R[:, :] = D @ R

    return Q, R
3.5. Example#
Now let’s do an example.
A = np.array([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
# A = np.array([[1.0, 0.5, 0.2], [0.5, 0.5, 1.0], [0.0, 1.0, 1.0]])
# A = np.array([[1.0, 0.5, 0.2], [0.5, 0.5, 1.0]])
A
array([[1., 1., 0.],
       [1., 0., 1.],
       [0., 1., 1.]])
Q, R = adjust_sign(*QR_Decomposition(A))
Q
array([[ 0.70710678, -0.40824829, -0.57735027],
       [ 0.70710678,  0.40824829,  0.57735027],
       [ 0.        , -0.81649658,  0.57735027]])
R
array([[ 1.41421356,  0.70710678,  0.70710678],
       [ 0.        , -1.22474487, -0.40824829],
       [ 0.        ,  0.        ,  1.15470054]])
Let’s compare outcomes with what the scipy package produces.
Q_scipy, R_scipy = adjust_sign(*qr(A))
print('Our Q: \n', Q)
print('\n')
print('Scipy Q: \n', Q_scipy)
Our Q:
 [[ 0.70710678 -0.40824829 -0.57735027]
 [ 0.70710678  0.40824829  0.57735027]
 [ 0.         -0.81649658  0.57735027]]

Scipy Q:
 [[ 0.70710678 -0.40824829 -0.57735027]
 [ 0.70710678  0.40824829  0.57735027]
 [ 0.         -0.81649658  0.57735027]]
print('Our R: \n', R)
print('\n')
print('Scipy R: \n', R_scipy)
Our R:
 [[ 1.41421356  0.70710678  0.70710678]
 [ 0.         -1.22474487 -0.40824829]
 [ 0.          0.          1.15470054]]

Scipy R:
 [[ 1.41421356  0.70710678  0.70710678]
 [ 0.         -1.22474487 -0.40824829]
 [ 0.          0.          1.15470054]]
The above outcomes give us the good news that our homemade function agrees with what scipy produces.
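As one more sanity check, here is a minimal sketch (using the Q, R, and A already in memory from the cells above) confirming that the homemade factorization has the properties claimed earlier.

print(np.allclose(Q @ R, A))            # Q R reproduces A
print(np.allclose(Q.T @ Q, np.eye(3)))  # Q is orthogonal
print(np.allclose(R, np.triu(R)))       # R is upper triangular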
Now let’s do a QR decomposition for a rectangular matrix $A$ that is $n \times m$ with $m > n$.
A = np.array([[1, 3, 4], [2, 0, 9]])
Q, R = adjust_sign(*QR_Decomposition(A))
Q, R
(array([[ 0.4472136 , -0.89442719],
        [ 0.89442719,  0.4472136 ]]),
 array([[ 2.23606798,  1.34164079,  9.8386991 ],
        [ 0.        , -2.68328157,  0.4472136 ]]))
Q_scipy, R_scipy = adjust_sign(*qr(A))
Q_scipy, R_scipy
(array([[ 0.4472136 , -0.89442719],
        [ 0.89442719,  0.4472136 ]]),
 array([[ 2.23606798,  1.34164079,  9.8386991 ],
        [ 0.        , -2.68328157,  0.4472136 ]]))
3.6. Using QR Decomposition to Compute Eigenvalues#
Now for a useful fact about the QR algorithm.
The following iterations on the QR decomposition can be used to compute eigenvalues of a square matrix $A$.
Here is the algorithm:
1. Set $A_0 = A$ and form $A_0 = Q_0 R_0$.
2. Form $A_1 = R_0 Q_0$. Note that $A_1$ is similar to $A_0$ (easy to verify; a short verification appears just after the remark below) and so has the same eigenvalues.
3. Form $A_1 = Q_1 R_1$ (i.e., form the QR decomposition of $A_1$).
4. Form $A_2 = R_1 Q_1$ and then $A_2 = Q_2 R_2$.
5. Iterate to convergence.
6. Compute eigenvalues of $A$ and compare them to the diagonal values of the limiting $A_n$ found from this process.
Remark: this algorithm is close to one of the most efficient ways of computing eigenvalues!
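To spell out the “easy to verify” step: since $A_k = Q_k R_k$ with $Q_k$ orthogonal,

$$
A_{k+1} = R_k Q_k = Q_k^{-1} (Q_k R_k) Q_k = Q_k^{-1} A_k Q_k = Q_k' A_k Q_k ,
$$

so each $A_{k+1}$ is similar to $A_k$, and by induction to $A_0 = A$; similar matrices share the same eigenvalues.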
Let’s write some Python code to try out the algorithm
def QR_eigvals(A, tol=1e-12, maxiter=1000):
    "Find the eigenvalues of A using QR decomposition."

    A_old = np.copy(A)
    A_new = np.copy(A)

    diff = np.inf
    i = 0
    while (diff > tol) and (i < maxiter):

        A_old[:, :] = A_new
        Q, R = QR_Decomposition(A_old)

        A_new[:, :] = R @ Q

        diff = np.abs(A_new - A_old).max()
        i += 1

    eigvals = np.diag(A_new)

    return eigvals
Now let’s try the code and compare the results with what np.linalg.eigvals gives us.
Here goes
# experiment with one random A matrix
A = np.random.random((3, 3))
sorted(QR_eigvals(A))
[-0.6339352457654999, 0.2857791957067247, 1.3315117001699257]
Compare with the np.linalg.eigvals routine from the numpy package.
sorted(np.linalg.eigvals(A))
[-0.6339352457654991, 0.2857791957067246, 1.3315117001699248]
3.7. QR and PCA#
There are interesting connections between the QR decomposition and principal components analysis (PCA).
Here are some.
1. Let $X'$ be a $k \times n$ random matrix where the $j$th column is a random draw from $\mathcal{N}(\mu, \Sigma)$, where $\mu$ is a $k \times 1$ vector of means and $\Sigma$ is a $k \times k$ covariance matrix. We want $n \gg k$ – this is an “econometrics example”.
2. Form $X' = Q R$ where $Q$ is $k \times k$ and $R$ is $k \times n$.
3. Form the eigenvalues of $R R'$, i.e., we’ll compute $R R' = \tilde{P} \Lambda \tilde{P}'$.
4. Form $X' X = Q \tilde{P} \Lambda \tilde{P}' Q'$ and compare it with the eigen decomposition $X' X = P \hat{\Lambda} P'$.
5. It will turn out that $\Lambda = \hat{\Lambda}$ and that $P = Q \tilde{P}$ (a short derivation follows this list).
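Before running the code, here is a short sketch of why conjecture 5 should hold (up to the ordering of eigenvalues and the signs of eigenvector columns):

$$
X' X = (Q R)(Q R)' = Q R R' Q' = Q \tilde{P} \Lambda \tilde{P}' Q' = (Q \tilde{P}) \Lambda (Q \tilde{P})'
$$

Because $Q \tilde{P}$ is a product of orthogonal matrices, it is itself orthogonal, so the right-hand side is an eigen decomposition of $X' X$; matching it against $X' X = P \hat{\Lambda} P'$ gives $\hat{\Lambda} = \Lambda$ and $P = Q \tilde{P}$.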
Let’s verify conjecture 5 with some Python code.
Start by simulating a random $(n, k)$ matrix $X$.
k = 5
n = 1000
# generate some random moments
𝜇 = np.random.random(size=k)
C = np.random.random((k, k))
Σ = C.T @ C
# X is a random (n, k) matrix; each row is a draw from the multivariate normal dist.
X = np.random.multivariate_normal(𝜇, Σ, size=n)
X.shape
(1000, 5)
Let’s apply the QR decomposition to $X'$.
Q, R = adjust_sign(*QR_Decomposition(X.T))
Check the shapes of $Q$ and $R$.
Q.shape, R.shape
((5, 5), (5, 1000))
Now we can construct $R R' = \tilde{P} \Lambda \tilde{P}'$.
RR = R @ R.T
𝜆, P_tilde = np.linalg.eigh(RR)
Λ = np.diag(𝜆)
We can also apply the eigen decomposition to $X' X = P \hat{\Lambda} P'$.
XX = X.T @ X
𝜆_hat, P = np.linalg.eigh(XX)
Λ_hat = np.diag(𝜆_hat)
Compare the eigenvalues that are on the diagonals of $\Lambda$ and $\hat{\Lambda}$.
𝜆, 𝜆_hat
(array([   8.7138404 ,  139.48713424,  696.92519758, 1241.07419969,
        7679.48828897]),
 array([   8.7138404 ,  139.48713424,  696.92519758, 1241.07419969,
        7679.48828897]))
Let’s compare $P$ and $Q \tilde{P}$.

Again we need to be careful about sign differences between the columns of $P$ and $Q \tilde{P}$.
QP_tilde = Q @ P_tilde
np.abs(P @ diag_sign(P) - QP_tilde @ diag_sign(QP_tilde)).max()
2.9559688030644793e-15
Let’s verify that $X' X$ can be decomposed as $Q \tilde{P} \Lambda \tilde{P}' Q'$.
QPΛPQ = Q @ P_tilde @ Λ @ P_tilde.T @ Q.T
np.abs(QPΛPQ - XX).max()
7.73070496506989e-12
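For completeness, a parallel check (a small addition to the steps above, using the P, Λ_hat, and XX already in memory) confirms that the direct eigen decomposition reproduces $X' X$ as well.

# the direct eigen decomposition of X'X should reproduce X'X too
np.abs(P @ Λ_hat @ P.T - XX).max()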