{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "2f2344e0",
   "metadata": {},
   "source": [
    "# VARs and DMDs\n",
    "\n",
    "This lecture applies computational methods  that we learned about in this lecture\n",
    "[Singular Value Decomposition](https://python.quantecon.org/svd_intro.html) to\n",
    "\n",
    "- first-order vector autoregressions (VARs)  \n",
    "- dynamic mode decompositions (DMDs)  \n",
    "- connections between DMDs and first-order VARs  "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6247202f",
   "metadata": {},
   "source": [
    "## First-Order Vector Autoregressions\n",
    "\n",
    "We want to fit a **first-order vector autoregression**\n",
    "\n",
    "\n",
    "<a id='equation-eq-varfirstorder'></a>\n",
    "$$\n",
    "X_{t+1} = A X_t + C \\epsilon_{t+1}, \\quad \\epsilon_{t+1} \\perp X_t \\tag{6.1}\n",
    "$$\n",
    "\n",
    "where $ \\epsilon_{t+1} $ is the time $ t+1 $ component  of a sequence of  i.i.d. $ m \\times 1 $ random vectors with mean vector\n",
    "zero and identity  covariance matrix and where\n",
    "the $ m \\times 1 $ vector $ X_t $ is\n",
    "\n",
    "\n",
    "<a id='equation-eq-xvector'></a>\n",
    "$$\n",
    "X_t = \\begin{bmatrix}  X_{1,t} & X_{2,t} & \\cdots & X_{m,t}     \\end{bmatrix}^\\top \\tag{6.2}\n",
    "$$\n",
    "\n",
    "and where $ \\cdot ^\\top $ again denotes complex transposition and $ X_{i,t} $ is  variable $ i $ at time $ t $.\n",
    "\n",
    "We want to fit equation [(6.1)](#equation-eq-varfirstorder).\n",
    "\n",
    "Our data are organized in   an $ m \\times (n+1) $ matrix  $ \\tilde X $\n",
    "\n",
    "$$\n",
    "\\tilde X =  \\begin{bmatrix} X_1 \\mid X_2 \\mid \\cdots \\mid X_n \\mid X_{n+1} \\end{bmatrix}\n",
    "$$\n",
    "\n",
    "where for $ t = 1, \\ldots, n +1 $,  the $ m \\times 1 $ vector $ X_t $ is given by [(6.2)](#equation-eq-xvector).\n",
    "\n",
    "Thus, we want to estimate a  system  [(6.1)](#equation-eq-varfirstorder) that consists of $ m $ least squares regressions of **everything** on one lagged value of **everything**.\n",
    "\n",
    "The $ i $’th equation of [(6.1)](#equation-eq-varfirstorder) is a regression of $ X_{i,t+1} $ on the vector $ X_t $.\n",
    "\n",
    "We proceed as follows.\n",
    "\n",
    "From $ \\tilde X $,  we  form two $ m \\times n $ matrices\n",
    "\n",
    "$$\n",
    "X =  \\begin{bmatrix} X_1 \\mid X_2 \\mid \\cdots \\mid X_{n}\\end{bmatrix}\n",
    "$$\n",
    "\n",
    "and\n",
    "\n",
    "$$\n",
    "X' =  \\begin{bmatrix} X_2 \\mid X_3 \\mid \\cdots \\mid X_{n+1}\\end{bmatrix}\n",
    "$$\n",
    "\n",
    "Here $ ' $  is part of the name of the matrix $ X' $ and does not indicate matrix transposition.\n",
    "\n",
    "We  use  $ \\cdot^\\top $ to denote matrix transposition or its extension to complex matrices.\n",
    "\n",
    "In forming $ X $ and $ X' $, we have in each case  dropped a column from $ \\tilde X $,  the last column in the case of $ X $, and  the first column in the case of $ X' $.\n",
    "\n",
    "Evidently, $ X $ and $ X' $ are both $ m \\times  n $ matrices.\n",
    "\n",
    "We denote the rank of $ X $ as $ p \\leq \\min(m, n) $.\n",
    "\n",
    "Two  cases that interest us are\n",
    "\n",
    "- $ n \\gg m $, so that we have many more time series observations $ n $ than variables $ m $  \n",
    "- $ m \\gg n $, so that we have many more variables $ m $ than time series observations $ n $  \n",
    "\n",
    "\n",
    "At a general level that includes both of these special cases, a common formula describes the least squares estimator $ \\hat A $ of $ A $.\n",
    "\n",
    "But important  details differ.\n",
    "\n",
    "The common formula is\n",
    "\n",
    "\n",
    "<a id='equation-eq-commona'></a>\n",
    "$$\n",
    "\\hat A = X' X^+ \\tag{6.3}\n",
    "$$\n",
    "\n",
    "where $ X^+ $ is the pseudo-inverse of $ X $.\n",
    "\n",
    "To read about the **Moore-Penrose pseudo-inverse** please see [Moore-Penrose pseudo-inverse](https://en.wikipedia.org/wiki/Moore%E2%80%93Penrose_inverse)\n",
    "\n",
    "Applicable formulas for the pseudo-inverse differ for our two cases.\n",
    "\n",
    "**Short-Fat Case:**\n",
    "\n",
    "When $ n > > m $, so that we have many more time series  observations $ n $ than variables $ m $ and when\n",
    "$ X $ has linearly independent **rows**, $ X X^\\top $ has an inverse and the pseudo-inverse $ X^+ $ is\n",
    "\n",
    "$$\n",
    "X^+ = X^\\top  (X X^\\top )^{-1}\n",
    "$$\n",
    "\n",
    "Here $ X^+ $ is a **right-inverse** that verifies $ X X^+ = I_{m \\times m} $.\n",
    "\n",
    "In this case, our formula [(6.3)](#equation-eq-commona) for the least-squares estimator of the population matrix of regression coefficients  $ A $ becomes\n",
    "\n",
    "\n",
    "<a id='equation-eq-ahatform101'></a>\n",
    "$$\n",
    "\\hat A = X' X^\\top  (X X^\\top )^{-1} \\tag{6.4}\n",
    "$$\n",
    "\n",
    "This  formula for least-squares regression coefficients is widely used in econometrics.\n",
    "\n",
    "It is used  to estimate vector autorgressions.\n",
    "\n",
    "The right side of formula [(6.4)](#equation-eq-ahatform101) is proportional to the empirical cross second moment matrix of $ X_{t+1} $ and $ X_t $ times the inverse\n",
    "of the second moment matrix of $ X_t $.\n",
    "\n",
    "**Tall-Skinny Case:**\n",
    "\n",
    "When $ m > > n $, so that we have many more attributes $ m $ than time series observations $ n $ and when $ X $ has linearly independent **columns**,\n",
    "$ X^\\top  X $ has an inverse and the pseudo-inverse $ X^+ $ is\n",
    "\n",
    "$$\n",
    "X^+ = (X^\\top  X)^{-1} X^\\top\n",
    "$$\n",
    "\n",
    "Here  $ X^+ $ is a **left-inverse** that verifies $ X^+ X = I_{n \\times n} $.\n",
    "\n",
    "In this case, our formula  [(6.3)](#equation-eq-commona) for a least-squares estimator of $ A $ becomes\n",
    "\n",
    "\n",
    "<a id='equation-eq-hataversion0'></a>\n",
    "$$\n",
    "\\hat A = X' (X^\\top  X)^{-1} X^\\top \\tag{6.5}\n",
    "$$\n",
    "\n",
    "Please compare formulas [(6.4)](#equation-eq-ahatform101) and [(6.5)](#equation-eq-hataversion0) for $ \\hat A $.\n",
    "\n",
    "Here we are especially interested in formula [(6.5)](#equation-eq-hataversion0).\n",
    "\n",
    "The $ i $th  row of $ \\hat A $ is an $ m \\times 1 $ vector of regression coefficients of $ X_{i,t+1} $ on $ X_{j,t}, j = 1, \\ldots, m $.\n",
    "\n",
    "If we use formula [(6.5)](#equation-eq-hataversion0) to calculate $ \\hat A X $ we find that\n",
    "\n",
    "$$\n",
    "\\hat A X = X'\n",
    "$$\n",
    "\n",
    "so that the regression equation **fits perfectly**.\n",
    "\n",
    "This is a typical outcome in an **underdetermined least-squares** model.\n",
    "\n",
    "To reiterate, in the  **tall-skinny** case (described in [Singular Value Decomposition](https://python.quantecon.org/svd_intro.html))  in which we have a number $ n $ of observations   that is small relative to the number $ m $ of\n",
    "attributes that appear in the vector $ X_t $,  we want to fit equation [(6.1)](#equation-eq-varfirstorder).\n",
    "\n",
    "We  confront the facts that the least squares estimator is underdetermined and that the regression equation fits perfectly.\n",
    "\n",
    "To proceed, we’ll want efficiently to calculate the pseudo-inverse $ X^+ $.\n",
    "\n",
    "The pseudo-inverse $ X^+ $ will be a component of our estimator of $ A $.\n",
    "\n",
    "As our  estimator $ \\hat A $ of $ A $ we want to form an  $ m \\times m $ matrix that  solves the least-squares best-fit problem\n",
    "\n",
    "\n",
    "<a id='equation-eq-alseqn'></a>\n",
    "$$\n",
    "\\hat A = \\textrm{argmin}_{\\check A} || X' - \\check  A X ||_F \\tag{6.6}\n",
    "$$\n",
    "\n",
    "where $ || \\cdot ||_F $ denotes the Frobenius (or Euclidean) norm of a matrix.\n",
    "\n",
    "The Frobenius norm is defined as\n",
    "\n",
    "$$\n",
    "||A||_F = \\sqrt{ \\sum_{i=1}^m \\sum_{j=1}^m |A_{ij}|^2 }\n",
    "$$\n",
    "\n",
    "The minimizer of the right side of equation [(6.6)](#equation-eq-alseqn) is\n",
    "\n",
    "\n",
    "<a id='equation-eq-hataform'></a>\n",
    "$$\n",
    "\\hat A =  X'  X^{+} \\tag{6.7}\n",
    "$$\n",
    "\n",
    "where the (possibly huge) $ n \\times m $ matrix $ X^{+} = (X^\\top  X)^{-1} X^\\top $ is again a pseudo-inverse of $ X $.\n",
    "\n",
    "For some situations that we are interested in, $ X^\\top  X $ can be close to singular, a situation that  makes some numerical algorithms  be inaccurate.\n",
    "\n",
    "To acknowledge that possibility, we’ll use  efficient algorithms to  constructing\n",
    "a **reduced-rank approximation** of  $ \\hat A $ in formula [(6.5)](#equation-eq-hataversion0).\n",
    "\n",
    "Such an approximation to our vector autoregression will no longer fit perfectly.\n",
    "\n",
    "The $ i $th  row of $ \\hat A $ is an $ m \\times 1 $ vector of regression coefficients of $ X_{i,t+1} $ on $ X_{j,t}, j = 1, \\ldots, m $.\n",
    "\n",
    "An efficient way to compute the pseudo-inverse $ X^+ $ is to start with  a singular value decomposition\n",
    "\n",
    "\n",
    "<a id='equation-eq-svddmd'></a>\n",
    "$$\n",
    "X =  U \\Sigma  V^\\top \\tag{6.8}\n",
    "$$\n",
    "\n",
    "where we remind ourselves that for a **reduced** SVD, $ X $ is an $ m \\times n $ matrix of data, $ U $ is an $ m \\times p $ matrix, $ \\Sigma $  is a $ p \\times p $ matrix, and $ V $ is an $ n \\times p $ matrix.\n",
    "\n",
    "We can    efficiently  construct the pertinent pseudo-inverse $ X^+ $\n",
    "by recognizing the following string of equalities.\n",
    "\n",
    "\n",
    "<a id='equation-eq-efficientpseudoinverse'></a>\n",
    "$$\n",
    "\\begin{aligned}\n",
    "X^{+} & = (X^\\top  X)^{-1} X^\\top  \\\\\n",
    "  & = (V \\Sigma U^\\top  U \\Sigma V^\\top )^{-1} V \\Sigma U^\\top  \\\\\n",
    "  & = (V \\Sigma \\Sigma V^\\top )^{-1} V \\Sigma U^\\top  \\\\\n",
    "  & = V \\Sigma^{-1} \\Sigma^{-1} V^\\top  V \\Sigma U^\\top  \\\\\n",
    "  & = V \\Sigma^{-1} U^\\top  \n",
    "\\end{aligned} \\tag{6.9}\n",
    "$$\n",
    "\n",
    "(Since we are in the $ m > > n $ case in which $ V^\\top  V = I_{p \\times p} $ in a reduced SVD, we can use the preceding\n",
    "string of equalities for a reduced SVD as well as for a full SVD.)\n",
    "\n",
    "Thus, we shall  construct a pseudo-inverse $ X^+ $  of $ X $ by using\n",
    "a singular value decomposition of $ X $ in equation [(6.8)](#equation-eq-svddmd)  to compute\n",
    "\n",
    "\n",
    "<a id='equation-eq-xplusformula'></a>\n",
    "$$\n",
    "X^{+} =  V \\Sigma^{-1}  U^\\top \\tag{6.10}\n",
    "$$\n",
    "\n",
    "where the matrix $ \\Sigma^{-1} $ is constructed by replacing each non-zero element of $ \\Sigma $ with $ \\sigma_j^{-1} $.\n",
    "\n",
    "We can  use formula [(6.10)](#equation-eq-xplusformula)   together with formula [(6.7)](#equation-eq-hataform) to compute the matrix  $ \\hat A $ of regression coefficients.\n",
    "\n",
    "Thus, our  estimator $ \\hat A = X' X^+ $ of the $ m \\times m $ matrix of coefficients $ A $    is\n",
    "\n",
    "\n",
    "<a id='equation-eq-ahatsvdformula'></a>\n",
    "$$\n",
    "\\hat A = X' V \\Sigma^{-1}  U^\\top \\tag{6.11}\n",
    "$$"
   ]
  },
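  {
   "cell_type": "markdown",
   "id": "f1a2b3c4",
   "metadata": {},
   "source": [
    "Here is a minimal sketch, with simulated data of our own choosing, of formulas [(6.10)](#equation-eq-xplusformula) and [(6.11)](#equation-eq-ahatsvdformula) in the tall-skinny case; it also verifies the perfect fit $ \\hat A X = X' $ noted above:\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "np.random.seed(1)\n",
    "m, n = 50, 10                        # many variables, few observations\n",
    "X_tilde = np.random.randn(m, n + 1)  # placeholder data matrix\n",
    "X, Xprime = X_tilde[:, :-1], X_tilde[:, 1:]\n",
    "\n",
    "# reduced SVD of X: U is m x p, Sigma is p x p, V is n x p\n",
    "U, s, Vh = np.linalg.svd(X, full_matrices=False)\n",
    "X_pinv = Vh.T @ np.diag(1 / s) @ U.T           # formula (6.10)\n",
    "A_hat = Xprime @ X_pinv                        # formula (6.11)\n",
    "\n",
    "print(np.allclose(A_hat @ X, Xprime))          # perfect fit\n",
    "print(np.allclose(X_pinv, np.linalg.pinv(X)))  # agrees with numpy\n",
    "```"
   ]
  },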
  {
   "cell_type": "markdown",
   "id": "d9eecdba",
   "metadata": {},
   "source": [
    "## Dynamic Mode Decomposition (DMD)\n",
    "\n",
    "We turn to the $ m >>n $ **tall and skinny** case  associated with **Dynamic Mode Decomposition**.\n",
    "\n",
    "Here an $ m \\times n+1 $  data matrix $ \\tilde X $ contains many more attributes (or variables) $ m $ than time periods  $ n+1 $.\n",
    "\n",
    "Dynamic mode decomposition was introduced by [[Schmid, 2010](https://python.quantecon.org/zreferences.html#id20)],\n",
    "\n",
    "You can read  about Dynamic Mode Decomposition [[Kutz *et al.*, 2016](https://python.quantecon.org/zreferences.html#id43)] and [[Brunton and Kutz, 2019](https://python.quantecon.org/zreferences.html#id259)] (section 7.2).\n",
    "\n",
    "**Dynamic Mode Decomposition** (DMD) computes a rank $ r < p $ approximation to the least squares regression coefficients $ \\hat A $  described by formula [(6.11)](#equation-eq-ahatsvdformula).\n",
    "\n",
    "We’ll  build up gradually  to a formulation that is useful  in applications.\n",
    "\n",
    "We’ll do this by describing three  alternative representations of our first-order linear dynamic system, i.e., our vector autoregression.\n",
    "\n",
    "**Guide to three representations:** In practice, we’ll mainly be interested in Representation 3.\n",
    "\n",
    "We use the first two representations  to present some useful  intermediate steps that  help us to appreciate what is under the hood of Representation 3.\n",
    "\n",
    "In applications, we’ll use only a small  subset of **DMD modes** to approximate dynamics.\n",
    "\n",
    "We use  such a small subset of DMD modes to  construct a reduced-rank approximation to $ A $.\n",
    "\n",
    "To do that, we’ll want to use the  **reduced**  SVD’s affiliated with representation 3, not the **full** SVD’s affiliated with representations 1 and 2.\n",
    "\n",
    "**Guide to impatient reader:** In our applications, we’ll be using Representation 3.\n",
    "\n",
    "You might want to skip the stage-setting representations 1 and 2 on first reading."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d290ec0f",
   "metadata": {},
   "source": [
    "## Representation 1\n",
    "\n",
    "In this representation, we shall use a **full** SVD of $ X $.\n",
    "\n",
    "We use the $ m $  **columns** of $ U $, and thus the $ m $ **rows** of $ U^\\top $,  to define   a $ m \\times 1 $  vector $ \\tilde b_t $ as\n",
    "\n",
    "\n",
    "<a id='equation-eq-tildexdef2'></a>\n",
    "$$\n",
    "\\tilde b_t = U^\\top  X_t . \\tag{6.12}\n",
    "$$\n",
    "\n",
    "The original  data $ X_t $ can be represented as\n",
    "\n",
    "\n",
    "<a id='equation-eq-xdecoder'></a>\n",
    "$$\n",
    "X_t = U \\tilde b_t \\tag{6.13}\n",
    "$$\n",
    "\n",
    "(Here we use $ b $ to remind ourselves that we are creating a **basis** vector.)\n",
    "\n",
    "Since we are now using a **full** SVD, $ U U^\\top  = I_{m \\times m} $.\n",
    "\n",
    "So it follows from equation [(6.12)](#equation-eq-tildexdef2) that we can reconstruct  $ X_t $ from $ \\tilde b_t $.\n",
    "\n",
    "In particular,\n",
    "\n",
    "- Equation [(6.12)](#equation-eq-tildexdef2) serves as an **encoder** that  **rotates** the $ m \\times 1 $ vector $ X_t $ to become an $ m \\times 1 $ vector $ \\tilde b_t $  \n",
    "- Equation [(6.13)](#equation-eq-xdecoder) serves as a **decoder** that **reconstructs** the $ m \\times 1 $ vector $ X_t $ by rotating  the $ m \\times 1 $ vector $ \\tilde b_t $  \n",
    "\n",
    "\n",
    "Define a  transition matrix for an $ m \\times 1 $ basis vector  $ \\tilde b_t $ by\n",
    "\n",
    "\n",
    "<a id='equation-eq-atilde0'></a>\n",
    "$$\n",
    "\\tilde A = U^\\top  \\hat A U \\tag{6.14}\n",
    "$$\n",
    "\n",
    "We can  recover $ \\hat A $ from\n",
    "\n",
    "$$\n",
    "\\hat A = U \\tilde A U^\\top\n",
    "$$\n",
    "\n",
    "Dynamics of the  $ m \\times 1 $ basis vector $ \\tilde b_t $ are governed by\n",
    "\n",
    "$$\n",
    "\\tilde b_{t+1} = \\tilde A \\tilde b_t\n",
    "$$\n",
    "\n",
    "To construct forecasts $ \\overline X_t $ of  future values of $ X_t $ conditional on $ X_1 $, we can apply  decoders (i.e., rotators) to both sides of this equation and deduce\n",
    "\n",
    "$$\n",
    "\\overline X_{t+1} = U \\tilde A^t U^\\top  X_1\n",
    "$$\n",
    "\n",
    "where we use $ \\overline X_{t+1}, t \\geq 1 $ to denote a forecast."
   ]
  },
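  {
   "cell_type": "markdown",
   "id": "f1a2b3c5",
   "metadata": {},
   "source": [
    "Here is a minimal sketch, with simulated data of our own choosing, of the encoder [(6.12)](#equation-eq-tildexdef2), the decoder [(6.13)](#equation-eq-xdecoder), and the forecast formula above:\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "np.random.seed(2)\n",
    "m, n = 6, 40\n",
    "X_tilde = np.random.randn(m, n + 1)\n",
    "X, Xprime = X_tilde[:, :-1], X_tilde[:, 1:]\n",
    "A_hat = Xprime @ np.linalg.pinv(X)\n",
    "\n",
    "# full SVD: U is m x m, so U U^T = U^T U = I\n",
    "U, s, Vh = np.linalg.svd(X, full_matrices=True)\n",
    "\n",
    "b_tilde = U.T @ X                    # encoder (6.12), column by column\n",
    "print(np.allclose(U @ b_tilde, X))   # decoder (6.13) recovers X\n",
    "\n",
    "A_tilde = U.T @ A_hat @ U            # (6.14)\n",
    "print(np.allclose(U @ A_tilde @ U.T, A_hat))\n",
    "\n",
    "# forecast of X_{t+1} conditional on X_1\n",
    "t = 5\n",
    "X_bar = U @ np.linalg.matrix_power(A_tilde, t) @ U.T @ X[:, 0]\n",
    "```"
   ]
  },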
  {
   "cell_type": "markdown",
   "id": "57ced126",
   "metadata": {},
   "source": [
    "## Representation 2\n",
    "\n",
    "This representation is related to  one originally proposed by  [[Schmid, 2010](https://python.quantecon.org/zreferences.html#id20)].\n",
    "\n",
    "It can be regarded as an intermediate step on the way  to obtaining  a related   representation 3 to be presented later\n",
    "\n",
    "As with Representation 1, we continue to\n",
    "\n",
    "- use a **full** SVD and **not** a reduced SVD  \n",
    "\n",
    "\n",
    "As we observed and illustrated   in a lecture about the [Singular Value Decomposition](https://python.quantecon.org/svd_intro.html)\n",
    "\n",
    "- (a) for a full SVD of $ X $, $ U U^\\top  = I_{m \\times m} $ and $ U^\\top  U = I_{m \\times m} $ are both identity matrices  \n",
    "- (b) for a reduced SVD of $ X $, $ U^\\top  U = I_{p \\times p} $ is an identity matrix but $ U U^\\top $ is not an identity matrix.  \n",
    "\n",
    "\n",
    "As we shall see later, a full SVD is  too confining for what we ultimately want to do, namely,  cope with situations in which  $ U^\\top  U $ is **not** an identity matrix because we  use a reduced SVD of $ X $.\n",
    "\n",
    "But for now, let’s proceed under the assumption that we are using a full SVD so that  requirements (a) and (b) are both satisfied.\n",
    "\n",
    "Form an eigendecomposition of the $ m \\times m $ matrix $ \\tilde A = U^\\top  \\hat A U $ defined in equation [(6.14)](#equation-eq-atilde0):\n",
    "\n",
    "\n",
    "<a id='equation-eq-tildeaeigen'></a>\n",
    "$$\n",
    "\\tilde A = W \\Lambda W^{-1} \\tag{6.15}\n",
    "$$\n",
    "\n",
    "where $ \\Lambda $ is a diagonal matrix of eigenvalues and $ W $ is an $ m \\times m $\n",
    "matrix whose columns are eigenvectors  corresponding to rows (eigenvalues) in\n",
    "$ \\Lambda $.\n",
    "\n",
    "When $ U U^\\top  = I_{m \\times m} $, as is true with a full SVD of $ X $, it follows that\n",
    "\n",
    "\n",
    "<a id='equation-eq-eqeigahat'></a>\n",
    "$$\n",
    "\\hat A = U \\tilde A U^\\top  = U W \\Lambda W^{-1} U^\\top \\tag{6.16}\n",
    "$$\n",
    "\n",
    "According to equation [(6.16)](#equation-eq-eqeigahat), the diagonal matrix $ \\Lambda $ contains eigenvalues of $ \\hat A $ and corresponding eigenvectors of $ \\hat A $ are columns of the matrix $ UW $.\n",
    "\n",
    "It follows that the systematic (i.e., not random) parts of the $ X_t $ dynamics captured by our first-order vector autoregressions   are described by\n",
    "\n",
    "$$\n",
    "X_{t+1} = U W \\Lambda W^{-1} U^\\top   X_t\n",
    "$$\n",
    "\n",
    "Multiplying both sides of the above equation by $ W^{-1} U^\\top $ gives\n",
    "\n",
    "$$\n",
    "W^{-1} U^\\top  X_{t+1} = \\Lambda W^{-1} U^\\top  X_t\n",
    "$$\n",
    "\n",
    "or\n",
    "\n",
    "$$\n",
    "\\hat b_{t+1} = \\Lambda \\hat b_t\n",
    "$$\n",
    "\n",
    "where our **encoder**  is\n",
    "\n",
    "$$\n",
    "\\hat b_t = W^{-1} U^\\top  X_t\n",
    "$$\n",
    "\n",
    "and our **decoder** is\n",
    "\n",
    "$$\n",
    "X_t = U W \\hat b_t\n",
    "$$\n",
    "\n",
    "We can use this representation to construct a predictor $ \\overline X_{t+1} $ of $ X_{t+1} $ conditional on $ X_1 $  via:\n",
    "\n",
    "\n",
    "<a id='equation-eq-dssebookrepr'></a>\n",
    "$$\n",
    "\\overline X_{t+1} = U W \\Lambda^t W^{-1} U^\\top  X_1 \\tag{6.17}\n",
    "$$\n",
    "\n",
    "In effect,\n",
    "[[Schmid, 2010](https://python.quantecon.org/zreferences.html#id20)] defined an $ m \\times m $ matrix $ \\Phi_s $ as\n",
    "\n",
    "\n",
    "<a id='equation-eq-phisfull'></a>\n",
    "$$\n",
    "\\Phi_s = UW \\tag{6.18}\n",
    "$$\n",
    "\n",
    "and a generalized inverse\n",
    "\n",
    "\n",
    "<a id='equation-eq-phisfullinv'></a>\n",
    "$$\n",
    "\\Phi_s^+ = W^{-1}U^\\top \\tag{6.19}\n",
    "$$\n",
    "\n",
    "[[Schmid, 2010](https://python.quantecon.org/zreferences.html#id20)] then  represented equation [(6.17)](#equation-eq-dssebookrepr) as\n",
    "\n",
    "\n",
    "<a id='equation-eq-schmidrep'></a>\n",
    "$$\n",
    "\\overline X_{t+1} = \\Phi_s \\Lambda^t \\Phi_s^+ X_1 \\tag{6.20}\n",
    "$$\n",
    "\n",
    "Components of the  basis vector $ \\hat b_t = W^{-1} U^\\top  X_t \\equiv \\Phi_s^+ X_t $ are\\\\\n",
    "\n",
    "\n",
    "DMD **projected modes**.\n",
    "\n",
    "To understand why they are called **projected modes**, notice that\n",
    "\n",
    "$$\n",
    "\\Phi_s^+ = ( \\Phi_s^\\top  \\Phi_s)^{-1} \\Phi_s^\\top\n",
    "$$\n",
    "\n",
    "so that the $ m \\times p $ matrix\n",
    "\n",
    "$$\n",
    "\\hat b =  \\Phi_s^+ X\n",
    "$$\n",
    "\n",
    "is a matrix of regression coefficients of the $ m \\times n $ matrix $ X $ on the $ m \\times p $ matrix $ \\Phi_s $.\n",
    "\n",
    "We’ll say more about this interpretation in a related context when we discuss representation 3, which was suggested by  Tu et al. [[Tu *et al.*, 2014](https://python.quantecon.org/zreferences.html#id29)].\n",
    "\n",
    "It is more appropriate to use  representation 3  when, as is often the case  in practice, we want to use a reduced SVD."
   ]
  },
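  {
   "cell_type": "markdown",
   "id": "f1a2b3c6",
   "metadata": {},
   "source": [
    "Here is a minimal sketch of Representation 2, again with simulated data of our own choosing; it checks that columns of $ \\Phi_s = UW $ are eigenvectors of $ \\hat A $ and applies the forecast formula [(6.20)](#equation-eq-schmidrep):\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "np.random.seed(3)\n",
    "m, n = 6, 40\n",
    "X_tilde = np.random.randn(m, n + 1)\n",
    "X, Xprime = X_tilde[:, :-1], X_tilde[:, 1:]\n",
    "A_hat = Xprime @ np.linalg.pinv(X)\n",
    "\n",
    "U, s, Vh = np.linalg.svd(X, full_matrices=True)  # full SVD\n",
    "A_tilde = U.T @ A_hat @ U                        # (6.14)\n",
    "Lam, W = np.linalg.eig(A_tilde)                  # (6.15)\n",
    "\n",
    "Phi_s = U @ W                                    # (6.18)\n",
    "Phi_s_pinv = np.linalg.inv(W) @ U.T              # (6.19)\n",
    "\n",
    "# columns of Phi_s are eigenvectors of A_hat\n",
    "print(np.allclose(A_hat @ Phi_s, Phi_s * Lam))\n",
    "\n",
    "# forecast (6.20) of X_{t+1} conditional on X_1\n",
    "t = 5\n",
    "X_bar = Phi_s @ np.diag(Lam**t) @ Phi_s_pinv @ X[:, 0]\n",
    "```"
   ]
  },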
  {
   "cell_type": "markdown",
   "id": "7f078c57",
   "metadata": {},
   "source": [
    "## Representation 3\n",
    "\n",
    "Departing from the procedures used to construct  Representations 1 and 2, each of which deployed a **full** SVD, we now use a **reduced** SVD.\n",
    "\n",
    "Again, we let  $ p \\leq \\textrm{min}(m,n) $ be the rank of $ X $.\n",
    "\n",
    "Construct a **reduced** SVD\n",
    "\n",
    "$$\n",
    "X = \\tilde U \\tilde \\Sigma \\tilde V^\\top ,\n",
    "$$\n",
    "\n",
    "where now $ \\tilde U $ is $ m \\times p $, $ \\tilde \\Sigma $ is $ p \\times p $, and $ \\tilde V^\\top $ is $ p \\times n $.\n",
    "\n",
    "Our minimum-norm least-squares approximator of  $ A $ now has representation\n",
    "\n",
    "\n",
    "<a id='equation-eq-ahatwithtildes'></a>\n",
    "$$\n",
    "\\hat A = X' \\tilde V \\tilde \\Sigma^{-1} \\tilde U^\\top \\tag{6.21}\n",
    "$$\n",
    "\n",
    "**Computing Dominant Eigenvectors of $ \\hat A $**\n",
    "\n",
    "We begin by paralleling a step used to construct  Representation 1, define a  transition matrix for a rotated $ p \\times 1 $ state $ \\tilde b_t $ by\n",
    "\n",
    "\n",
    "<a id='equation-eq-atildered'></a>\n",
    "$$\n",
    "\\tilde A =\\tilde  U^\\top  \\hat A \\tilde U \\tag{6.22}\n",
    "$$\n",
    "\n",
    "**Interpretation as projection coefficients**\n",
    "\n",
    "[[Brunton and Kutz, 2022](https://python.quantecon.org/zreferences.html#id44)] remark that $ \\tilde A $  can be interpreted in terms of a projection of $ \\hat A $ onto the $ p $ modes in $ \\tilde U $.\n",
    "\n",
    "To verify this, first note that, because  $ \\tilde U^\\top  \\tilde U = I $, it follows that\n",
    "\n",
    "\n",
    "<a id='equation-eq-tildeaverify'></a>\n",
    "$$\n",
    "\\tilde A = \\tilde U^\\top  \\hat A \\tilde U = \\tilde U^\\top  X' \\tilde V \\tilde \\Sigma^{-1} \\tilde U^\\top  \\tilde U \n",
    "= \\tilde U^\\top  X' \\tilde V \\tilde \\Sigma^{-1} \\tilde U^\\top \\tag{6.23}\n",
    "$$\n",
    "\n",
    "Next, we’ll just  compute the regression coefficients in a projection of $ \\hat A $ on $ \\tilde U $ using a standard least-squares formula\n",
    "\n",
    "$$\n",
    "(\\tilde U^\\top  \\tilde U)^{-1} \\tilde U^\\top  \\hat A = (\\tilde U^\\top  \\tilde U)^{-1} \\tilde U^\\top  X' \\tilde V \\tilde \\Sigma^{-1} \\tilde U^\\top  = \n",
    "\\tilde U^\\top  X' \\tilde V \\tilde \\Sigma^{-1} \\tilde U^\\top   = \\tilde A .\n",
    "$$\n",
    "\n",
    "Thus, we have verified that $ \\tilde A $ is a least-squares projection of $ \\hat A $ onto $ \\tilde U $.\n",
    "\n",
    "**An Inverse Challenge**\n",
    "\n",
    "Because we are using  a reduced SVD,  $ \\tilde U \\tilde U^\\top  \\neq I $.\n",
    "\n",
    "Consequently,\n",
    "\n",
    "$$\n",
    "\\hat A \\neq \\tilde U \\tilde A \\tilde U^\\top ,\n",
    "$$\n",
    "\n",
    "so we can’t simply  recover $ \\hat A $ from  $ \\tilde A $ and $ \\tilde U $.\n",
    "\n",
    "**A Blind Alley**\n",
    "\n",
    "We can start by   hoping for the best and proceeding to construct an eigendecomposition of the $ p \\times p $ matrix $ \\tilde A $:\n",
    "\n",
    "\n",
    "<a id='equation-eq-tildeaeigenred'></a>\n",
    "$$\n",
    "\\tilde A =  \\tilde  W  \\Lambda \\tilde  W^{-1} \\tag{6.24}\n",
    "$$\n",
    "\n",
    "where $ \\Lambda $ is a diagonal matrix of $ p $ eigenvalues and the columns of $ \\tilde W $\n",
    "are corresponding eigenvectors.\n",
    "\n",
    "Mimicking our procedure in Representation 2, we cross our fingers and compute an $ m \\times p $ matrix\n",
    "\n",
    "\n",
    "<a id='equation-eq-phisred'></a>\n",
    "$$\n",
    "\\tilde \\Phi_s = \\tilde U \\tilde W \\tag{6.25}\n",
    "$$\n",
    "\n",
    "that  corresponds to [(6.18)](#equation-eq-phisfull) for a full SVD.\n",
    "\n",
    "At this point, where $ \\hat A $ is given by formula [(6.21)](#equation-eq-ahatwithtildes) it is interesting to compute $ \\hat A \\tilde  \\Phi_s $:\n",
    "\n",
    "$$\n",
    "\\begin{aligned}\n",
    "\\hat A \\tilde \\Phi_s & = (X' \\tilde V \\tilde \\Sigma^{-1} \\tilde U^\\top ) (\\tilde U \\tilde W) \\\\\n",
    "  & = X' \\tilde V \\tilde \\Sigma^{-1} \\tilde  W \\\\\n",
    "  & \\neq (\\tilde U \\tilde  W) \\Lambda \\\\\n",
    "  & = \\tilde \\Phi_s \\Lambda\n",
    "  \\end{aligned}\n",
    "$$\n",
    "\n",
    "That\n",
    "$ \\hat A \\tilde \\Phi_s \\neq \\tilde \\Phi_s \\Lambda $ means that, unlike the  corresponding situation in Representation 2, columns of $ \\tilde \\Phi_s = \\tilde U \\tilde  W $\n",
    "are **not** eigenvectors of $ \\hat A $ corresponding to eigenvalues  on the diagonal of matix $ \\Lambda $.\n",
    "\n",
    "**An Approach That Works**\n",
    "\n",
    "Continuing our quest for eigenvectors of $ \\hat A $ that we **can** compute with a reduced SVD,  let’s define  an $ m \\times p $ matrix\n",
    "$ \\Phi $ as\n",
    "\n",
    "\n",
    "<a id='equation-eq-phiformula'></a>\n",
    "$$\n",
    "\\Phi \\equiv \\hat A \\tilde \\Phi_s = X' \\tilde V \\tilde \\Sigma^{-1}  \\tilde  W \\tag{6.26}\n",
    "$$\n",
    "\n",
    "It turns out that columns of $ \\Phi $ **are** eigenvectors of $ \\hat A $.\n",
    "\n",
    "This is  a consequence of a  result established by Tu et al. [[Tu *et al.*, 2014](https://python.quantecon.org/zreferences.html#id29)] that we now present.\n",
    "\n",
    "**Proposition** The $ p $ columns of $ \\Phi $ are eigenvectors of $ \\hat A $.\n",
    "\n",
    "**Proof:** From formula [(6.26)](#equation-eq-phiformula) we have\n",
    "\n",
    "$$\n",
    "\\begin{aligned}\n",
    "  \\hat A \\Phi & =  (X' \\tilde  V \\tilde  \\Sigma^{-1} \\tilde  U^\\top ) (X' \\tilde  V \\Sigma^{-1} \\tilde  W) \\cr\n",
    "  & = X' \\tilde V \\tilde  \\Sigma^{-1} \\tilde A \\tilde  W \\cr\n",
    "  & = X' \\tilde  V \\tilde  \\Sigma^{-1}\\tilde  W \\Lambda \\cr\n",
    "  & = \\Phi \\Lambda \n",
    "  \\end{aligned}\n",
    "$$\n",
    "\n",
    "so that\n",
    "\n",
    "\n",
    "<a id='equation-eq-aphilambda'></a>\n",
    "$$\n",
    "\\hat A \\Phi = \\Phi \\Lambda . \\tag{6.27}\n",
    "$$\n",
    "\n",
    "Let $ \\phi_i $ be the $ i $th  column of $ \\Phi $ and $ \\lambda_i $ be the corresponding $ i $ eigenvalue of $ \\tilde A $ from decomposition [(6.24)](#equation-eq-tildeaeigenred).\n",
    "\n",
    "Equating the $ m \\times 1 $ vectors that appear on the two  sides of  equation [(6.27)](#equation-eq-aphilambda)  gives\n",
    "\n",
    "$$\n",
    "\\hat A \\phi_i = \\lambda_i \\phi_i .\n",
    "$$\n",
    "\n",
    "This equation confirms that  $ \\phi_i $ is an eigenvector of $ \\hat A $ that corresponds to eigenvalue  $ \\lambda_i $ of both  $ \\tilde A $ and $ \\hat A $.\n",
    "\n",
    "This concludes the proof.\n",
    "\n",
    "Also see [[Brunton and Kutz, 2022](https://python.quantecon.org/zreferences.html#id44)] (p. 238)"
   ]
  },
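  {
   "cell_type": "markdown",
   "id": "f1a2b3c7",
   "metadata": {},
   "source": [
    "Here is a minimal sketch of Representation 3 with simulated tall-skinny data of our own choosing; it verifies the proposition that columns of $ \\Phi $ in [(6.26)](#equation-eq-phiformula) are eigenvectors of $ \\hat A $:\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "np.random.seed(4)\n",
    "m, n = 50, 10                        # tall-skinny case\n",
    "X_tilde = np.random.randn(m, n + 1)\n",
    "X, Xprime = X_tilde[:, :-1], X_tilde[:, 1:]\n",
    "\n",
    "# reduced SVD\n",
    "U, s, Vh = np.linalg.svd(X, full_matrices=False)\n",
    "Sigma_inv = np.diag(1 / s)\n",
    "A_hat = Xprime @ Vh.T @ Sigma_inv @ U.T      # (6.21)\n",
    "\n",
    "A_tilde = U.T @ A_hat @ U                    # (6.22)\n",
    "Lam, W = np.linalg.eig(A_tilde)              # (6.24)\n",
    "\n",
    "Phi = Xprime @ Vh.T @ Sigma_inv @ W          # (6.26)\n",
    "print(np.allclose(A_hat @ Phi, Phi * Lam))   # (6.27) holds\n",
    "```"
   ]
  },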
  {
   "cell_type": "markdown",
   "id": "da56aa94",
   "metadata": {},
   "source": [
    "### Decoder of  $ \\check b $ as a linear projection\n",
    "\n",
    "From  eigendecomposition [(6.27)](#equation-eq-aphilambda) we can represent $ \\hat A $ as\n",
    "\n",
    "\n",
    "<a id='equation-eq-aform12'></a>\n",
    "$$\n",
    "\\hat A = \\Phi \\Lambda \\Phi^+ . \\tag{6.28}\n",
    "$$\n",
    "\n",
    "From formula [(6.28)](#equation-eq-aform12) we can deduce  dynamics of the $ p \\times 1 $ vector $ \\check b_t $:\n",
    "\n",
    "$$\n",
    "\\check b_{t+1} = \\Lambda \\check b_t\n",
    "$$\n",
    "\n",
    "where\n",
    "\n",
    "\n",
    "<a id='equation-eq-decoder102'></a>\n",
    "$$\n",
    "\\check b_t  = \\Phi^+ X_t \\tag{6.29}\n",
    "$$\n",
    "\n",
    "Since the $ m \\times p $ matrix $ \\Phi $ has $ p $ linearly independent columns, the generalized inverse of $ \\Phi $ is\n",
    "\n",
    "$$\n",
    "\\Phi^{+} = (\\Phi^\\top  \\Phi)^{-1} \\Phi^\\top\n",
    "$$\n",
    "\n",
    "and so\n",
    "\n",
    "\n",
    "<a id='equation-eq-checkbform'></a>\n",
    "$$\n",
    "\\check b = (\\Phi^\\top  \\Phi)^{-1} \\Phi^\\top  X \\tag{6.30}\n",
    "$$\n",
    "\n",
    "The $ p \\times n $  matrix $ \\check b $  is recognizable as a  matrix of least squares regression coefficients of the $ m \\times n $  matrix\n",
    "$ X $ on the $ m \\times p $ matrix $ \\Phi $ and consequently\n",
    "\n",
    "\n",
    "<a id='equation-eq-xcheck'></a>\n",
    "$$\n",
    "\\check X = \\Phi \\check b \\tag{6.31}\n",
    "$$\n",
    "\n",
    "is an $ m \\times n $ matrix of least squares projections of $ X $ on $ \\Phi $.\n",
    "\n",
    "**Variance Decomposition of $ X $**\n",
    "\n",
    "By virtue of the least-squares projection theory discussed in  this quantecon lecture  [https://python-advanced.quantecon.org/orth_proj.html](https://python-advanced.quantecon.org/orth_proj.html), we can represent $ X $ as the sum of the projection $ \\check X $ of $ X $ on $ \\Phi $  plus a matrix of errors.\n",
    "\n",
    "To verify this, note that the least squares projection $ \\check X $ is related to $ X $ by\n",
    "\n",
    "$$\n",
    "X = \\check X + \\epsilon\n",
    "$$\n",
    "\n",
    "or\n",
    "\n",
    "\n",
    "<a id='equation-eq-xbcheck'></a>\n",
    "$$\n",
    "X = \\Phi \\check b + \\epsilon \\tag{6.32}\n",
    "$$\n",
    "\n",
    "where $ \\epsilon $ is an $ m \\times n $ matrix of least squares errors satisfying the least squares orthogonality conditions $ \\epsilon^\\top  \\Phi =0 $ or\n",
    "\n",
    "\n",
    "<a id='equation-eq-orthls'></a>\n",
    "$$\n",
    "(X - \\Phi \\check b)^\\top  \\Phi = 0_{m \\times p} \\tag{6.33}\n",
    "$$\n",
    "\n",
    "Rearranging  the orthogonality conditions [(6.33)](#equation-eq-orthls) gives $ X^\\top  \\Phi = \\check b \\Phi^\\top  \\Phi $, which implies formula [(6.30)](#equation-eq-checkbform)."
   ]
  },
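  {
   "cell_type": "markdown",
   "id": "f1a2b3c8",
   "metadata": {},
   "source": [
    "Continuing the Representation 3 sketch above (reusing `Phi` and `X`), we can compute $ \\check b $ from [(6.30)](#equation-eq-checkbform) and confirm the orthogonality conditions [(6.33)](#equation-eq-orthls):\n",
    "\n",
    "```python\n",
    "# pinv uses the conjugate transpose, matching the complex case\n",
    "Phi_pinv = np.linalg.pinv(Phi)\n",
    "b_check = Phi_pinv @ X               # (6.30)\n",
    "X_check = Phi @ b_check              # (6.31): projection of X on Phi\n",
    "\n",
    "eps = X - X_check                    # residuals in (6.32)\n",
    "print(np.allclose(eps.conj().T @ Phi, 0, atol=1e-8))  # (6.33)\n",
    "```"
   ]
  },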
  {
   "cell_type": "markdown",
   "id": "7fdba9ad",
   "metadata": {},
   "source": [
    "### An Approximation\n",
    "\n",
    "We now describe a way to approximate  the $ p \\times 1 $ vector $ \\check b_t $ instead of using  formula [(6.29)](#equation-eq-decoder102).\n",
    "\n",
    "In particular, the following argument adapted from [[Brunton and Kutz, 2022](https://python.quantecon.org/zreferences.html#id44)] (page 240) provides a computationally efficient way to approximate $ \\check b_t $.\n",
    "\n",
    "For convenience, we’ll apply the method at  time $ t=1 $.\n",
    "\n",
    "For $ t=1 $, from equation [(6.32)](#equation-eq-xbcheck) we have\n",
    "\n",
    "\n",
    "<a id='equation-eq-x1proj'></a>\n",
    "$$\n",
    "\\check X_1 = \\Phi \\check b_1 \\tag{6.34}\n",
    "$$\n",
    "\n",
    "where $ \\check b_1 $ is a $ p \\times 1 $ vector.\n",
    "\n",
    "Recall from representation 1 above that  $ X_1 =  U \\tilde b_1 $, where $ \\tilde b_1 $ is a time $ 1 $  basis vector for representation 1 and $ U $ is from the full SVD  $ X = U \\Sigma V^\\top $.\n",
    "\n",
    "It  then follows from equation [(6.32)](#equation-eq-xbcheck) that\n",
    "\n",
    "$$\n",
    "U \\tilde b_1 = X' \\tilde V \\tilde \\Sigma^{-1} \\tilde  W \\check b_1 + \\epsilon_1\n",
    "$$\n",
    "\n",
    "where $ \\epsilon_1 $ is a least-squares error vector from equation [(6.32)](#equation-eq-xbcheck).\n",
    "\n",
    "It follows that\n",
    "\n",
    "$$\n",
    "\\tilde b_1 = U^\\top  X' V \\tilde \\Sigma^{-1} \\tilde W \\check b_1 + U^\\top  \\epsilon_1\n",
    "$$\n",
    "\n",
    "Replacing the error term $ U^\\top  \\epsilon_1 $ by zero, and replacing $ U $ from a **full** SVD of $ X $ with $ \\tilde U $ from a **reduced** SVD,  we obtain  an approximation $ \\hat b_1 $ to $ \\tilde b_1 $:\n",
    "\n",
    "$$\n",
    "\\hat b_1 = \\tilde U^\\top  X' \\tilde V \\tilde \\Sigma^{-1} \\tilde  W \\check b_1\n",
    "$$\n",
    "\n",
    "Recall that  from equation [(6.23)](#equation-eq-tildeaverify),  $ \\tilde A = \\tilde U^\\top  X' \\tilde V \\tilde \\Sigma^{-1} $.\n",
    "\n",
    "It then follows  that\n",
    "\n",
    "$$\n",
    "\\hat  b_1 = \\tilde   A \\tilde W \\check b_1\n",
    "$$\n",
    "\n",
    "and therefore, by the  eigendecomposition  [(6.24)](#equation-eq-tildeaeigenred) of $ \\tilde A $, we have\n",
    "\n",
    "$$\n",
    "\\hat b_1 = \\tilde W \\Lambda \\check b_1\n",
    "$$\n",
    "\n",
    "Consequently,\n",
    "\n",
    "$$\n",
    "\\hat b_1 = ( \\tilde W \\Lambda)^{-1} \\tilde b_1\n",
    "$$\n",
    "\n",
    "or\n",
    "\n",
    "\n",
    "<a id='equation-eq-beqnsmall'></a>\n",
    "$$\n",
    "\\hat b_1 = ( \\tilde W \\Lambda)^{-1} \\tilde U^\\top  X_1 , \\tag{6.35}\n",
    "$$\n",
    "\n",
    "which is a computationally efficient approximation to  the following instance of  equation [(6.29)](#equation-eq-decoder102) for  the initial vector $ \\check b_1 $:\n",
    "\n",
    "\n",
    "<a id='equation-eq-bphieqn'></a>\n",
    "$$\n",
    "\\check b_1= \\Phi^{+} X_1 \\tag{6.36}\n",
    "$$\n",
    "\n",
    "(To highlight that [(6.35)](#equation-eq-beqnsmall) is an approximation, users of  DMD sometimes call  components of   basis vector $ \\check b_t  = \\Phi^+ X_t $  the  **exact** DMD modes and components of $ \\hat b_t = ( \\tilde W \\Lambda)^{-1} \\tilde U^\\top  X_t $ the **approximate** modes.)\n",
    "\n",
    "Conditional on $ X_t $, we can compute a decoded $ \\check X_{t+j},   j = 1, 2, \\ldots $  from the exact modes via\n",
    "\n",
    "\n",
    "<a id='equation-eq-checkxevoln'></a>\n",
    "$$\n",
    "\\check X_{t+j} = \\Phi \\Lambda^j \\Phi^{+} X_t \\tag{6.37}\n",
    "$$\n",
    "\n",
    "or  use compute a decoded $ \\hat X_{t+j} $ from  approximate modes via\n",
    "\n",
    "\n",
    "<a id='equation-eq-checkxevoln2'></a>\n",
    "$$\n",
    "\\hat X_{t+j} = \\Phi \\Lambda^j (\\tilde W \\Lambda)^{-1}  \\tilde U^\\top  X_t . \\tag{6.38}\n",
    "$$\n",
    "\n",
    "We can then use  a decoded $ \\check X_{t+j} $ or $ \\hat X_{t+j} $ to forecast $ X_{t+j} $."
   ]
  },
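  {
   "cell_type": "markdown",
   "id": "f1a2b3c9",
   "metadata": {},
   "source": [
    "Continuing the same sketch, we can compare the exact amplitudes [(6.36)](#equation-eq-bphieqn) with the approximate amplitudes [(6.35)](#equation-eq-beqnsmall) and form the decoded values [(6.37)](#equation-eq-checkxevoln) and [(6.38)](#equation-eq-checkxevoln2):\n",
    "\n",
    "```python\n",
    "# assumes all eigenvalues in Lam are nonzero\n",
    "W_Lam_inv = np.linalg.inv(W @ np.diag(Lam))\n",
    "\n",
    "b_check_1 = Phi_pinv @ X[:, 0]          # exact modes, (6.36)\n",
    "b_hat_1 = W_Lam_inv @ U.T @ X[:, 0]     # approximate modes, (6.35)\n",
    "\n",
    "# one-step-ahead decoded values (6.37) and (6.38), conditional on X_1\n",
    "X_check_2 = Phi @ np.diag(Lam) @ b_check_1\n",
    "X_hat_2 = Phi @ np.diag(Lam) @ b_hat_1\n",
    "```"
   ]
  },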
  {
   "cell_type": "markdown",
   "id": "fb90a0dc",
   "metadata": {},
   "source": [
    "### Using Fewer Modes\n",
    "\n",
    "In applications, we’ll actually  use only  a few modes, often  three or less.\n",
    "\n",
    "Some of the preceding formulas assume that we have retained all $ p $ modes associated with  singular values of $ X $.\n",
    "\n",
    "We can  adjust our  formulas to describe a situation in which we instead retain only\n",
    "the $ r < p $ largest singular values.\n",
    "\n",
    "In that case, we simply replace $ \\tilde \\Sigma $ with the appropriate $ r\\times r $ matrix of singular values, $ \\tilde U $ with the $ m \\times r $ matrix  whose columns correspond to the $ r $ largest singular values, and $ \\tilde V $ with the $ n \\times r $ matrix whose columns correspond to the $ r $ largest  singular values.\n",
    "\n",
    "Counterparts of all of the salient formulas above then apply."
   ]
  },
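  {
   "cell_type": "markdown",
   "id": "f1a2b3ca",
   "metadata": {},
   "source": [
    "Here is a minimal sketch of that truncation, wrapped in a helper function of our own design (the name `dmd_rank_r` is ours):\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "def dmd_rank_r(X, Xprime, r):\n",
    "    # reduced SVD truncated to the r largest singular values\n",
    "    U, s, Vh = np.linalg.svd(X, full_matrices=False)\n",
    "    Ur, sr, Vr = U[:, :r], s[:r], Vh[:r, :].T\n",
    "    A_tilde = Ur.T @ Xprime @ Vr @ np.diag(1 / sr)  # r x r projected map\n",
    "    Lam, W = np.linalg.eig(A_tilde)\n",
    "    Phi = Xprime @ Vr @ np.diag(1 / sr) @ W         # r DMD modes, as in (6.26)\n",
    "    return Lam, Phi\n",
    "\n",
    "# example: keep r = 3 modes of simulated tall-skinny data\n",
    "np.random.seed(5)\n",
    "X_tilde = np.random.randn(50, 11)\n",
    "Lam, Phi = dmd_rank_r(X_tilde[:, :-1], X_tilde[:, 1:], r=3)\n",
    "```"
   ]
  },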
  {
   "cell_type": "markdown",
   "id": "0274e7f3",
   "metadata": {},
   "source": [
    "## Source for Some Python Code\n",
    "\n",
    "You can find a Python implementation of DMD here:\n",
    "\n",
    "[https://mathlab.sissa.it/pydmd](https://mathlab.sissa.it/pydmd)"
   ]
  }
 ],
 "metadata": {
  "date": 1751441160.0865927,
  "filename": "var_dmd.md",
  "kernelspec": {
   "display_name": "Python",
   "language": "python3",
   "name": "python3"
  },
  "title": "VARs and DMDs"
 },
 "nbformat": 4,
 "nbformat_minor": 5
}