Advanced Algebra part of Homework
\(\large\textcolor{blue}{\mbox{Advanced Algebra } \small \mathbb{HW}\mathrm{2}}\ \ \ \ \ \ _\textcolor{blue}{2022.3.16}\)
1.2.1\(\ \small \mbox{four subspaces}\)
Prove or find counterexamples.
- For four subspaces, if any three of them are linearly independent, then the four subspaces are linearly independent.
- If subspaces \(V_{1}, V_{2}\) are linearly independent, and \(V_{1}, V_{3}, V_{4}\) are linearly independent, and \(V_{2}, V_{3}, V_{4}\) are linearly independent, then all four subspaces are linearly independent.
- If \(V_{1}, V_{2}\) are linearly independent, and \(V_{3}, V_{4}\) are linearly independent, and \(V_{1}+V_{2}, V_{3}+V_{4}\) are linearly independent, then all four subspaces are linearly independent.
\((1)\) Construct the four subspaces below. Any three of them are linearly independent, but the four together are linearly dependent (since \(\dim \mathbb {R}^{3}=3\)). \[ V_1=\{\begin{bmatrix}k\\0\\0\end{bmatrix}\mid k\in \mathbb{R}\},V_2=\{\begin{bmatrix}0\\k\\0\end{bmatrix}\mid k\in \mathbb{R}\},V_3=\{\begin{bmatrix}0\\0\\k\end{bmatrix}\mid k\in \mathbb{R}\},V_4=\{\begin{bmatrix}k\\k\\k\end{bmatrix}\mid k\in \mathbb{R}\} \] \((2)\) Construct the four subspaces below. One can check that each of the families \(V_1,V_2\) and \(V_1,V_3,V_4\) and \(V_2,V_3,V_4\) is linearly independent. \[ V_1=\{\begin{bmatrix}k\\0\\0\end{bmatrix}\mid k\in \mathbb{R}\},V_2=\{\begin{bmatrix}0\\k\\0\end{bmatrix}\mid k\in \mathbb{R}\},V_3=\{\begin{bmatrix}k\\0\\k\end{bmatrix}\mid k\in \mathbb{R}\},V_4=\{\begin{bmatrix}0\\k\\k\end{bmatrix}\mid k\in \mathbb{R}\} \] However, picking one vector from each subspace gives a non-trivial linear combination equal to zero, \[ 1\cdot \begin{bmatrix}1\\0\\0\end{bmatrix}+(-1)\cdot \begin{bmatrix}0\\1\\0\end{bmatrix}+(-1)\begin{bmatrix}1\\0\\1\end{bmatrix}+1\cdot \begin{bmatrix}0\\1\\1\end{bmatrix}=\vec{0} \] so the four subspaces are linearly dependent. \((3)\) Argue by contradiction: assume there are vectors \(\vec v_i\in V_i\) and real numbers \(a_1,\dots ,a_4\) with the \(a_i\vec v_i\) not all zero, such that \[ a_1\vec{v}_1+a_2\vec{v}_2+a_3\vec{v}_3+a_4\vec{v}_4=0 \] We claim \(a_1\vec{v}_1+a_2\vec{v}_2\neq 0\): otherwise, by the independence of \(V_{1},V_{2}\) we get \(a_1\vec v_1=a_2\vec v_2=0\); then \(a_3\vec{v}_3+a_4\vec{v}_4=0\), and by the independence of \(V_3,V_4\) also \(a_3\vec v_3=a_4\vec v_4=0\), contradicting the assumption. The same argument gives \(a_3\vec{v}_3+a_4\vec{v}_4\neq 0\).
But \(a_1\vec{v}_1+a_2\vec{v}_2\in V_1+V_2\) and \(a_3\vec{v}_3+a_4\vec{v}_4\in V_3+V_4\), and these two non-zero vectors add up to \(0\), which contradicts the independence of \(V_1+V_2,V_3+V_4\).
All in all, the four subspaces must be linearly independent.
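A quick numerical sanity check of counterexample \(\small(2)\) (a minimal NumPy sketch; since every subspace here is one-dimensional, independence of a family is just linear independence of the chosen generators, and the helper `independent` below is only for illustration):

```python
import numpy as np

# Generators of the one-dimensional subspaces in counterexample (2)
v1 = np.array([1, 0, 0])
v2 = np.array([0, 1, 0])
v3 = np.array([1, 0, 1])
v4 = np.array([0, 1, 1])

def independent(*vs):
    # one-dimensional subspaces are independent iff their generators are
    return np.linalg.matrix_rank(np.column_stack(vs)) == len(vs)

print(independent(v1, v2))          # True
print(independent(v1, v3, v4))      # True
print(independent(v2, v3, v4))      # True
print(independent(v1, v2, v3, v4))  # False: the four together are dependent
```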
1.2.2\(\ \small \mbox{decomposition of transpose}\)
Let \(V\) be the space of \(n \times n\) real matrices. Let \(T: V \rightarrow V\) be the transpose operation, i.e., \(T\) sends \(A\) to \(A^{\mathrm{T}}\) for each \(A \in V\). Find a non-trivial \(T\)-invariant decomposition of \(V\), and find the corresponding block form of \(T\). (Here we use real matrices for your convenience, but the statement is totally fine for complex matrices and conjugate transpose.)
Set \(S=\{A\mid A=A^{T},A\in M_{n\times n}\}\), the subspace of symmetric matrices. It is \(T\)-invariant, because the transpose of any symmetric matrix is the matrix itself.
Set \(S'=\{A\mid A=-A^{T},A\in M_{n\times n}\}\). From \(A=-A^{T}\) we get \(A^{T}=-A=-(A^{T})^{T}\), so the transpose of any antisymmetric matrix is again antisymmetric, and \(S'\) is also \(T\)-invariant.
Moreover \(\dim S=\dfrac{n(n+1)}{2},\dim S'=\dfrac{n(n-1)}{2}\), and since any \(n\times n\) matrix satisfies \(B=\dfrac{B+B^{T}}{2}+\dfrac{B-B^{T}}{2}\) with the first summand in \(S\) and the second in \(S'\), we have \(V=S\oplus S'\), a non-trivial \(T\)-invariant decomposition.
On \(S\) the map \(T\) acts as the identity and on \(S'\) it acts as \(-1\), so under a basis adapted to \(V=S\oplus S'\) the block form of \(T\) is \(\begin{pmatrix}I&0\\0&-I\end{pmatrix}\), with blocks of sizes \(\dfrac{n(n+1)}{2}\) and \(\dfrac{n(n-1)}{2}\).
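A small NumPy sketch of this decomposition (nothing beyond the formulas above is assumed):

```python
import numpy as np

np.random.seed(0)
B = np.random.rand(4, 4)

sym  = (B + B.T) / 2   # symmetric part, lies in S
skew = (B - B.T) / 2   # antisymmetric part, lies in S'

print(np.allclose(B, sym + skew))   # True: B = sym + skew, so V = S + S'
print(np.allclose(sym.T,  sym))     # True: T acts as +1 on S
print(np.allclose(skew.T, -skew))   # True: T acts as -1 on S'
```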
1.2.3\(\ \small \mbox{ultimate subspaces }\)
Let \(p(x)\) be any polynomial, and define \(p(A)\) in the obvious manner. E.g., if \(p(x)=\) \(x^{2}+2 x+3\), then \(p(A)=A^{2}+2 A+3 I\). We fix some \(n \times n\) matrix \(A\).
- If \(A B=B A\), show that \(\operatorname{Ker}(B), \operatorname{Ran}(B)\) are both \(A\)-invariant subspaces.
- Prove that \(A p(A)=p(A) A\).
- Conclude that \(N_{\infty}(A-\lambda I), R_{\infty}(A-\lambda I)\) are both \(A\)-invariant for any \(\lambda \in \mathbb{C}\).
\((1)\) For any vector \(\vec{v}\in \mbox{Ker}(B)\), \(B\vec{v}=\vec{0}\), so \(B(A\vec{v})=BA\vec{v}=AB\vec{v}=\vec 0\), hence \(A\vec{v}\in \mbox{Ker}(B)\).
And for any vector \(\vec{v}\in \mbox{Ran}(B)\) there is some \(\vec x\) with \(B\vec{x}=\vec{v}\), so \(A\vec{v}=A(B\vec{x})=AB\vec{x}=BA\vec{x}=B(A\vec{x})\), hence
\(A\vec{v}\in \mbox{Ran}(B)\). So \(\mbox{Ker}(B)\) and \(\mbox{Ran}(B)\) are both \(A\)-invariant subspaces.
\((2)\) Write \(p(A)=\displaystyle\sum_{i=0}^{n}a_{i}A^{i}\), and calculate \(Ap(A)\)
\[
Ap(A)=A\sum_{i=0}^{n}a_iA^{i}=\sum_{i=0}^{n}a_{i}A^{i+1}=\Big(\sum_{i=0}^{n}a_{i}A^{i}\Big)A=p(A)A
\] \((3)\) Since \(A\) acts on a space of finite dimension \(n\), the chains \(\mbox{Ker}\big((A-\lambda I)^{k}\big)\) and \(\mbox{Ran}\big((A-\lambda I)^{k}\big)\) stabilize, so \(N_{\infty}(A-\lambda I)=\mbox{Ker}\big((A-\lambda I)^{k}\big)\) and \(R_{\infty}(A-\lambda I)=\mbox{Ran}\big((A-\lambda I)^{k}\big)\) for every large enough \(k\). According to the conclusions from \(\small (1)\) and \(\small (2)\) (the latter applied to the polynomial \(p(x)=(x-\lambda)^{k}\)),
\(A(A-\lambda I)=(A-\lambda I)A,A(A-\lambda I)^2=(A-\lambda I)^2A,\cdots ,A(A-\lambda I)^k=(A-\lambda I)^kA\)
so for every positive integer \(k\), \(\mbox{Ker}\big((A-\lambda I)^{k}\big)\) and \(\mbox{Ran}\big((A-\lambda I)^{k}\big)\) are all \(A\)-invariant; in particular this holds for the stabilized spaces.
So, \(N_{\infty}(A-\lambda I), R_{\infty}(A-\lambda I)\) are both \(A\)-invariant for any \(\lambda \in \mathbb{C}\).
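A quick numerical spot-check of \(\small(1)\) and \(\small(2)\) (a sketch; the matrix and the polynomial \(p(x)=x^2+2x+3\) are arbitrary choices):

```python
import numpy as np

np.random.seed(1)
A = np.random.rand(5, 5)

# p(x) = x^2 + 2x + 3  gives  p(A) = A^2 + 2A + 3I
pA = A @ A + 2 * A + 3 * np.eye(5)
print(np.allclose(A @ pA, pA @ A))       # True: A p(A) = p(A) A

x = np.random.rand(5)
w = pA @ x                               # w lies in Ran(p(A))
print(np.allclose(A @ w, pA @ (A @ x)))  # True: A w stays in Ran(p(A))
```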
1.2.4\(\ \small \mbox{interchangeability and common eigenvector}\)
Note that any linear map must have at least one eigenvector. (You may try to prove this yourself, but it is not part of this homework.) You may use this fact freely in this problem. Fix any two \(n \times n\) square matrices \(A, B\). Suppose \(A B=B A\).
- If \(W\) is an A-invariant subspace, show that \(A\) has an eigenvector in \(W\).
- Show that \(\operatorname{Ker}(A-\lambda I)\) is always \(B\)-invariant for all \(\lambda \in \mathbb{C}\). (Hint: Last problem.)
- Show that \(A, B\) has a common eigenvector. (Hint: Last two sub-problem.)
\((1)\) Since \(W\) is \(A\)-invariant, restricting \(A\) gives a well-defined linear map \[ W\longrightarrow W: \vec{w}\longmapsto A\vec{w} \] According to the fact that any linear map must have at least one eigenvector, this restriction has an eigenvector in \(W\), and such a vector is also an eigenvector of \(A\) lying in \(W\).
\((2)\) For any vector \(\vec{v}\) from \(\mbox{Ker}(A-\lambda I)\), \((A-\lambda I)\vec{v}=0,A\vec{v}=\lambda\vec{v}\), use \(AB=BA\)
Because \((A-\lambda I)(B\vec{v})=(AB-\lambda B)\vec{v}=(BA-\lambda B)\vec{v}=B(A-\lambda I)\vec{v}=0\)
So \(\mbox{Ker}(A-\lambda I)\) is \(B-\)invariant for all \(\lambda \in \mathbb{C}\)
\((3)\) Let \(\lambda\) be any eigenvalue of \(A\), so \(\mbox{Ker}(A-\lambda I)\neq\{0\}\). By \(\small (2)\) it is \(B\)-invariant, so by \(\small (1)\) (applied to \(B\)) there exists at least one eigenvector \(\vec{v}\) in \(\mbox{Ker}(A-\lambda I)\) with
\(B\vec{v}=\lambda_{B}\vec{v}\). The vector \(\vec{v}\) also satisfies \(A\vec{v}=\lambda\vec{v}\), since it lies in \(\mbox{Ker}(A-\lambda I)\).
So if \(A\) and \(B\) commute, they must have a common eigenvector.
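As an illustration (a hedged sketch, not part of the proof): taking \(B\) to be a polynomial in a generic \(A\) forces \(AB=BA\), and then an eigenvector of \(A\) is a common eigenvector.

```python
import numpy as np

np.random.seed(2)
A = np.random.rand(4, 4)
B = 2 * A @ A + A + np.eye(4)        # a polynomial in A, hence AB = BA
print(np.allclose(A @ B, B @ A))     # True: A and B commute

eigvals, eigvecs = np.linalg.eig(A)
v = eigvecs[:, 0]                    # an eigenvector of A
lam_B = (v.conj() @ (B @ v)) / (v.conj() @ v)   # Rayleigh quotient of B at v
print(np.allclose(B @ v, lam_B * v)) # True: v is also an eigenvector of B
```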
\(\large\textcolor{blue}{\mbox{Advanced Algebra } \small \mathbb{HW}\mathrm{3}}\ \ \ \ \ \ _\textcolor{blue}{2022.3.24}\)
1.3.1\(\ \small \mbox{jordan normal form}\)
Find a basis in the following vector space so that the linear map involved will be in Jordan normal form. Also find the Jordan normal form.
- \(V=\mathbb{C}^{2}\) is a real vector space, and \(A: V \rightarrow V\) that sends \(\left[\begin{array}{l}x \\ y\end{array}\right]\) to \(\left[\begin{array}{c}\bar{x}-\Re(y) \\ (1+i) \Im(x)-y\end{array}\right]\) is a real linear map. (Here \(\bar{x}\) means the complex conjugate of a complex number \(x\), and \(\Re(x), \Im(x)\) means the real part and the imaginary part of a complex number \(x .)\)
- \(V=P_{4}\), the real vector space of all real polynomials of degree at most 4. And \(A: V \rightarrow V\) is a linear map such that \(A(p(x))=p^{\prime}(x)+p(0)+p^{\prime}(0) x^{2}\) for each polynomial \(p \in P_{4}\).
- \(A=\left[\begin{array}{llll} & & & a_{1} \\ & & a_{2} & \\ & a_{3} & & \\ a_{4} & & & \end{array}\right]\). Be careful here. Maybe we have many possibilities for its Jordan normal form depending on the values of \(a_{1}, a_{2}, a_{3}, a_{4}\).
\((1)\) Under the real basis \(\begin{bmatrix}1\\0\end{bmatrix},\begin{bmatrix}i\\0\end{bmatrix},\begin{bmatrix}0\\1\end{bmatrix},\begin{bmatrix}0\\i\end{bmatrix}\) of \(V=\mathbb{C}^{2}\), the matrix of \(A\) is \(A_1=\begin{pmatrix}1&0&-1&0\\0&-1&0&0\\0&1&-1&0\\0&1&0&-1\end{pmatrix}=\begin{pmatrix}1&2&1&0\\0&0&4&0\\0&4&0&0\\0&4&0&1\end{pmatrix}\begin{pmatrix}1&0&0&0\\0&-1&1&0\\0&0&-1&0\\0&0&0&-1\end{pmatrix}\begin{pmatrix}1&2&1&0\\0&0&4&0\\0&4&0&0\\0&4&0&1\end{pmatrix}^{-1}\), so the Jordan normal form is the middle factor \(\operatorname{diag}\big(1,J_2(-1),-1\big)\), and the columns of the first factor (expressed in the real basis above) give the required basis.
\((2)\) \(A:a_0+a_1x+a_2x^2+a_3x^3+a_4x^4\longmapsto(a_0+a_1)+2a_2x+(a_1+3a_3)x^2+4a_4x^3\), so under the basis \(1,x,x^2,x^3,x^4\) \[ \begin{gathered} \begin{pmatrix}1&1&0&0&0\\0&0&2&0&0\\0&1&0&3&0\\0&0&0&0&4\\0&0&0&0&0\end{pmatrix}=P\begin{pmatrix}1&0&0&0&0\\0&\sqrt{2}&0&0&0\\0&0&-\sqrt{2}&0&0\\0&0&0&0&1\\0&0&0&0&0\end{pmatrix}P^{-1}\\ P=\begin{pmatrix}1&\sqrt{2}+1&1-\sqrt{2}&12&12\\0&1&1&-12&0\\0&\dfrac{\sqrt{2}}{2}&-\dfrac{\sqrt{2}}{2}&0&-6\\0&0&0&4&0\\0&0&0&0&1\end{pmatrix} \end{gathered} \] The middle factor is the Jordan normal form \(\operatorname{diag}\big(1,\sqrt2,-\sqrt2,J_2(0)\big)\), and the columns of \(P\), read as polynomials, give the required basis. \((3)\) The anti-diagonal matrix splits along the invariant planes \(\operatorname{span}(e_1,e_4)\) and \(\operatorname{span}(e_2,e_3)\), on which it acts by the \(2\times 2\) blocks \(\begin{pmatrix}0&a_1\\a_4&0\end{pmatrix}\) and \(\begin{pmatrix}0&a_2\\a_3&0\end{pmatrix}\), so \[ J=\begin{pmatrix}J_{1,4}&O\\O&J_{2,3}\end{pmatrix},\mbox{where }J_{i,j}=\begin{cases}\begin{pmatrix}\sqrt{a_ia_j}&0\\0&-\sqrt{a_ia_j}\end{pmatrix}&a_i\neq 0,a_j\neq 0\\\begin{pmatrix}0&1\\0&0\end{pmatrix}&a_i= 0,a_j\neq 0\mbox{ or }a_i\neq 0,a_j=0 \\\begin{pmatrix}0&0\\0&0\end{pmatrix}&a_i=a_j=0\\\end{cases} \] since each block has two distinct eigenvalues \(\pm\sqrt{a_ia_j}\) when \(a_ia_j\neq 0\), is a non-zero nilpotent block when exactly one of \(a_i,a_j\) is zero, and is the zero block when both vanish.
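These Jordan forms can be double-checked symbolically, e.g. with SymPy's `jordan_form`, which returns a basis matrix \(P\) and the Jordan form \(J\) with \(A=PJP^{-1}\) (a sketch for the matrix of part \(\small(1)\)):

```python
import sympy as sp

# matrix of A from part (1) in the real basis (1,0), (i,0), (0,1), (0,i)
A1 = sp.Matrix([[1, 0, -1, 0],
                [0, -1, 0, 0],
                [0, 1, -1, 0],
                [0, 1, 0, -1]])
P, J = A1.jordan_form()
sp.pprint(J)                      # diag(1, J_2(-1), -1), up to block order
print(P * J * P.inv() == A1)      # True
```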
1.3.2\(\ \small \mbox{partitions of integer}\)
A partition of integer \(n\) is a way to write \(n\) as a sum of other positive integers, say \(5=2+2+1\). If you always order the summands from large to small, you end up with a dot diagram, where each column represent an integer: \(\left[\begin{array}{ll}\cdot & \cdot \\ \cdot & \cdot \\ \cdot\end{array}\right]\). Similarly, \(7=4+2+1\) should be represented as \(\left[\begin{array}{lll} \cdot & \cdot & \cdot \\ \cdot & \cdot & \\ \cdot & & \\ \cdot & & \end{array}\right]\)
If the Jordan normal form of an \(n \times n\) nilpotent matrix \(A\) is diag \(\left(J_{a_{1}}, J_{a_{2}}, \ldots, J_{a_{k}}\right)\), then we have a partition of integer \(n=a_{1}+\ldots+a_{k}\). However, we also have a partition of integer \(n=\small [\operatorname{dim} \operatorname{Ker}(A)]+\left[\operatorname{dim} \operatorname{Ker}\left(A^{2}\right)-\operatorname{dim} \operatorname{Ker}(A)\right]+\left[\operatorname{dim} \operatorname{Ker}\left(A^{3}\right)-\operatorname{dim} \operatorname{Ker}\left(A^{2}\right)\right]+\ldots\) where we treat the content of each bracket as a positive integer. Can you find a relation between the two dot diagrams?
A partition of integer \(n=a_{1}+\ldots+a_{k}\) is called self-conjugate if, for the matrix \(A=\operatorname{diag}\left(J_{a_{1}}, J_{a_{2}}, \ldots, J_{a_{k}}\right)\), the two dot diagrams you obtained above are the same. Show that, for a fixed integer n, the number of self-conjugate partition of \(n\) is equal to the number of partition of \(n\) into distinct odd positive integers. (Hint: For a self-conjugate dot diagram, count the total number of dots that are either in the first column or in the first row or in both. Is this always odd?)
Suppose a 4 by 4 matrix \(A\) is nilpotent and upper triangular, and all \((i, j)\) entries for \(i<j\) are chosen randomly and uniformly in the interval \([-1,1]\). What are the probabilities that its Jordan canonical form corresponds to the partitions \(4=4,4=3+1,4=2+2,4=2+1+1,4=1+1+1+1\) ?
\((1)\) The numbers in the brackets are non-increasing, since \(\operatorname{dim} \operatorname{Ker}\left(A^{j}\right)-\operatorname{dim} \operatorname{Ker}\left(A^{j-1}\right)\) counts the Jordan blocks of size at least \(j\); so \(\small [\operatorname{dim} \operatorname{Ker}(A)],\left[\operatorname{dim} \operatorname{Ker}\left(A^{2}\right)-\operatorname{dim} \operatorname{Ker}(A)\right],\left[\operatorname{dim} \operatorname{Ker}\left(A^{3}\right)-\operatorname{dim} \operatorname{Ker}\left(A^{2}\right)\right],\ldots\) also form a dot diagram. Following the 'killing chain' picture, the rows of this second diagram contain exactly \(a_1,a_2,a_3,\ldots\) dots, i.e. the second dot diagram is the transpose (conjugate) of the first one: rows and columns are interchanged.
\((2)\) A self-conjugate dot diagram is symmetric about its main diagonal, like \(\left[\begin{array}{llllll} \cdot & \cdot & \cdot & \cdot&\cdot &\cdot \\ \cdot & \cdot & \cdot & \cdot&\cdot \\ \cdot &\cdot &\cdot & \\\cdot &\cdot &\\ \cdot &\cdot &\\\cdot \end{array}\right]\)
If we regard the diagram as (the support of) a matrix \(A\), the self-conjugacy condition says exactly \(A=A^{T}\).
Consider the outermost hook, i.e.\ all dots lying in the first row or the first column (or both). By symmetry the first row and the first column have the same length \(m\), so the hook contains \(m+m-1=2m-1\) dots, an odd number.
Remove this hook; what remains is again a self-conjugate diagram, so its outermost hook also contains an odd number of dots, and the hook sizes strictly decrease.
So every self-conjugate partition corresponds to a partition of \(n\) into distinct odd parts, the hook sizes. Conversely, for each partition into distinct odd parts we can rebuild the diagram hook by hook: fold each part \(2m-1\) into a symmetric hook whose first row and first column have length \(m\), and nest the hooks corner by corner.
These two constructions are inverse to each other, so the two numbers of partitions are the same.
\((3)\) \(A\) is nilpotent and its dimension is \(4\), so \(A^{4}=O\) and all eigenvalues are \(0\). Since \(A\) is upper triangular, \((A^{4})_{ii}=(A_{ii})^{4}=0\), so the diagonal entries vanish and \(A\) looks like \(A=\begin{pmatrix}0&\mbox{ran}&\mbox{ran}&\mbox{ran}\\0&0&\mbox{ran}&\mbox{ran}\\0&0&0&\mbox{ran}\\0&0&0&0\end{pmatrix}\), where each \(\mbox{ran}\) is randomly chosen from \([-1,1]\).
With probability \(1\) the superdiagonal entries are all non-zero, and then columns \(2,3,4\) of \(A\) are linearly independent, so \(\operatorname{rank}(A)=3\) and \(\dim\mbox{Ker}(A)=1\); similarly \(\dim \mbox{Ker}(A^2)=2,\ \dim\mbox{Ker}(A^3)=3,\ \dim\mbox{Ker}(A^4)=4\) almost surely.
By \(\small(1)\), \(\dim\mbox{Ker}(A)=1\) means there is exactly one Jordan block, so the Jordan canonical form is almost surely the single block \(J_{4}\). The probability of the partition \(4=4\)
is \(1\), and the probabilities of \(4=3+1,\ 4=2+2,\ 4=2+1+1,\ 4=1+1+1+1\) are all \(0\).
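A Monte Carlo sanity check of this claim (a NumPy sketch; the block sizes are read off from the kernel dimensions as in \(\small(1)\)):

```python
import numpy as np

np.random.seed(3)
trials, counts = 2000, {}

for _ in range(trials):
    A = np.triu(np.random.uniform(-1, 1, (4, 4)), k=1)   # strictly upper triangular
    # dim Ker(A^j) - dim Ker(A^{j-1}) = number of Jordan blocks of size >= j
    kdims = [4 - np.linalg.matrix_rank(np.linalg.matrix_power(A, j)) for j in range(1, 5)]
    diffs = [kdims[0]] + [kdims[j] - kdims[j - 1] for j in range(1, 4)]
    # conjugate the diff-partition to recover the Jordan block sizes
    blocks = tuple(b for b in sorted(
        (sum(1 for d in diffs if d >= s) for s in range(1, 5)), reverse=True) if b > 0)
    counts[blocks] = counts.get(blocks, 0) + 1

print(counts)   # essentially always {(4,): trials}, i.e. the partition 4 = 4
```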
\(\large\textcolor{blue}{\mbox{Advanced Algebra } \small \mathbb{HW}\mathrm{6}}\ \ \ \ \ \ _\textcolor{blue}{2022.4.29}\)
1.6.1\(\ \small \mbox{Vandermonde matrix}\)
Let \(V\) be the space of real polynomials of degree less than \(n\). So \(\operatorname{dim} V=n\). Then for each \(a \in \mathbb{R}\), the evaluation \(\mathrm{ev}_{a}\) is a dual vector.
For any real numbers \(a_{1}, \ldots, a_{n} \in \mathbb{R}\), consider the map \(L: V \rightarrow \mathbb{R}^{n}\) such that \(L(p)=\left[\begin{array}{c}p\left(a_{1}\right) \\ \vdots \\ p\left(a_{n}\right)\end{array}\right]\).
- Write out the matrix for \(L\) under the basis \(1, x, \ldots, x^{n-1}\) for \(V\) and the standard basis for \(\mathbb{R}^{n}\). (Do you know the name for this matrix?)
- Prove that \(L\) is invertible if and only if \(a_{1}, \ldots, a_{n}\) are distinct. (If you can name the matrix \(L\), then you may use its determinant formula without proof.)
- Show that \(\mathrm{ev}_{a_{1}}, \ldots, \mathrm{ev}_{a_{n}}\) form a basis for \(V^{*}\) if and only if all \(a_{1}, \ldots, a_{n}\) are distinct.
- Set \(n=3\). Find polynomials \(p_{1}, p_{2}, p_{3}\) such that \(p_{i}(j)=\delta_{i j}\) for \(i, j \in\{-1,0,1\}\).
- Set \(n=4\), and consider \(\mathrm{ev}_{-2}, \mathrm{ev}_{-1}, \mathrm{ev}_{0}, \mathrm{ev}_{1}, \mathrm{ev}_{2} \in V^{*}\). Since \(\operatorname{dim} V^{*}=4\), these must be linearly dependent. Find a non-trivial linear combination of these which is zero.
\((1)\) \(L=\left[\begin{array}{cccc} 1 & a_{1} & \cdots & a_{1}^{n-1} \\ 1 & a_{2} & \cdots & a_{2}^{n-1} \\ \vdots & & & \vdots \\ 1 & a_{n} & \cdots & a_{n}^{n-1} \end{array}\right]\), the Vandermonde matrix.
Its inverse is described by Lagrange interpolation: writing \(x_0,\dots,x_{n-1}\) for the coefficients of \(p\) and \(y_1,\dots,y_n\) for its values \(p(a_1),\dots,p(a_n)\), \[ \left(\begin{array}{cccc} 1 & a_{1} & \cdots & a_{1}^{n-1} \\ 1 & a_{2} & \cdots & a_{2}^{n-1} \\ \vdots & & & \vdots \\ 1 & a_{n} & \cdots & a_{n}^{n-1} \end{array}\right)\left(\begin{array}{c}x_{0} \\ x_{1} \\ \vdots \\ x_{n-1}\end{array}\right)=\left(\begin{array}{c}y_{1} \\ y_{2} \\ \vdots \\ y_{n}\end{array}\right),\qquad \left(\begin{array}{c}x_{0} \\ x_{1} \\ \vdots \\ x_{n-1}\end{array}\right)=L^{-1}\left(\begin{array}{c}y_{1} \\ y_{2} \\ \vdots \\ y_{n}\end{array}\right) \] The interpolating polynomial is \(f(a)=\displaystyle \sum_{i}y_i \prod_{k \neq i} \frac{a-a_{k}}{a_{i}-a_{k}}=\sum_i y_i\,\ell_i(a)\), where the \(\ell_i\) are the Lagrange basis polynomials, so \(f(a_i)=y_i\).
Comparing coefficients of \(a^{j-1}\), \(\displaystyle\left(L^{-1}\right)_{j i}=\left[a^{j-1}\right]\ell_i(a)=\left[a^{j-1}\right] \prod_{k \neq i} \frac{a-a_{k}}{a_{i}-a_{k}}\), explicitly \[ \left(L^{-1}\right)_{j i}=(-1)^{j+1} \frac{\displaystyle \sum_{\substack{p_{1}<\cdots<p_{n-j} \\ p_{1}, p_{2}, \cdots, p_{n-j} \neq i}} a_{p_{1}} a_{p_{2}} \cdots a_{p_{n-j}}}{\displaystyle \prod_{ k \neq i}\left(a_{k}-a_{i}\right)} \] so the \(i\)-th column of \(L^{-1}\) consists of the coefficients of the Lagrange basis polynomial \(\ell_i\).
\((2)\) \(\det L=\displaystyle \prod_{1\leq i< j\leq n }(a_j-a_i)\), so \(\det L\neq 0\Longleftrightarrow a_i\neq a_j\) for all \(i\neq j\), i.e.\ \(L\) is invertible if and only if \(a_{1},\ldots,a_{n}\) are distinct.
\((3)\) \(\mathrm{ev}_{a_{1}}, \cdots, \mathrm{ev}_{a_n}\) form a basis for \(V^{*} \Longleftrightarrow \mathrm{ev}_{a_{1}}, \mathrm{ev}_{a_{2}}, \cdots, \mathrm{ev}_{a_n}\) are linearly independent, since there are exactly \(n=\dim V^{*}\) of them.
Under the basis dual to \(1,x,\ldots,x^{n-1}\), the coordinate rows of these dual vectors stack into the matrix \(\left(\begin{array}{c}\mathrm{ev}_{a_{1}} \\ \mathrm{ev}_{a_{2}} \\ \vdots \\ \mathrm{ev}_{a_{n}}\end{array}\right)=L\), so they are linearly independent exactly when \(L\) is invertible. According to \(\small(2)\), this happens if and only if the \(a_i\) are distinct.
\((4)\) Under the basis \(\{1,x,x^2\}\), the coordinate rows of \(\mathrm{ev}_{-1},\mathrm{ev}_{0},\mathrm{ev}_{1}\) are \(\alpha_{1}=(1,-1,1) \quad \alpha_{2}=(1,0,0) \quad \alpha_{3}=(1,1,1)\) \[ A=\left(\begin{array}{l}\alpha_{1} \\ \alpha_{2} \\ \alpha_{3}\end{array}\right)=\left(\begin{array}{ccc}1 & -1 & 1 \\ 1 & 0 & 0 \\ 1 & 1 & 1\end{array}\right) \quad A^{-1}=\left(\begin{array}{ccc}0 & 1 & 0 \\ -\frac{1}{2} & 0 & \frac{1}{2} \\ \frac{1}{2} & -1 & \frac{1}{2}\end{array}\right) \] The columns of \(A^{-1}\) give the coefficients of the desired polynomials: \(p_{1}=-\dfrac{1}{2} x+\dfrac{1}{2} x^{2} \quad p_{2}=1-x^{2} \quad p_{3}=\dfrac{1}{2} x+\dfrac{1}{2} x^{2}\)
\((5)\) Set \(p(x)=ax^3+bx^2+cx+d\) and look for \[m\, \mathrm{ev}_{-2}+n\, \mathrm{ev}_{-1}+p\, \mathrm{ev}_{0}+q\, \mathrm{ev}_{1}+r\, \mathrm{ev}_{2}=0\] Evaluating on \(p\), \[ \begin{gathered} \mathrm{ev}_{-2}(p)=-8 a+4 b-2 c+d \quad \mathrm{ev}_{-1}(p)=-a+b-c+d \quad \mathrm{ev}_{0}(p)=d\\ \quad \mathrm{ev}_{1}(p)=a+b+c+d \quad \mathrm{ev}_{2}(p)=8 a+4 b+2 c+d\\ \left(\begin{array}{ccccc}-8 & -1 & 0 & 1 & 8 \\ 4 & 1 & 0 & 1 & 4 \\ -2 & -1 & 0 & 1 & 2 \\ 1 & 1 & 1 & 1 &1\end{array}\right)\left(\begin{array}{l}m \\ n \\ p \\ q \\ r\end{array}\right)=\boldsymbol 0 \end{gathered} \] Solving this homogeneous system gives \(\mathrm{ev}_{-2}-4 \mathrm{ev}_{-1}+6 \mathrm{ev}_{0}-4 \mathrm{ev}_{1}+\mathrm{ev}_{2}=0\)
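A numerical check of \(\small(5)\) (a sketch; the rows below are the coordinates of the \(\mathrm{ev}_{a}\) in the basis dual to \(1,x,x^2,x^3\)):

```python
import numpy as np

nodes = [-2, -1, 0, 1, 2]
# coordinate row of ev_a in the basis dual to 1, x, x^2, x^3: ev_a(x^k) = a^k
rows = np.array([[a**k for k in range(4)] for a in nodes], dtype=float)

coeffs = np.array([1, -4, 6, -4, 1], dtype=float)
print(np.allclose(coeffs @ rows, 0))   # True: ev_{-2} - 4ev_{-1} + 6ev_0 - 4ev_1 + ev_2 = 0
```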
1.6.2\(\ \small \mbox{dual vector in polynomials}\)
Let \(V\) be the space of real polynomials of degree less than \(3\). Which of the following is a dual vector? Prove it or show why not.
- \(p \mapsto \operatorname{ev}_{5}((x+1) p(x))\).
- \(p \mapsto \lim _{x \rightarrow \infty} \dfrac{p(x)}{x}\).
- \(p \mapsto \lim _{x \rightarrow \infty} \dfrac{p(x)}{x^{2}}\).
- \(p \mapsto p(3) p^{\prime}(4)\).
- \(p \mapsto \operatorname{deg}(p)\), the degree of the polynomial \(p\).
\((1)\) Yes. \(①\ \mathrm{ev}_{5}((x+1) p(x))=6 p(5)\), which is a map from \(V\) to \(\mathbb{R}\). \(②\) For \(\forall\ p, q \in V\) and
\(\forall m, n \in \mathbb{R}\), \(L(m p+n q)=6 m p(5)+6 n q(5)=m L(p)+n L(q)\), so it is linear, hence a dual vector.
\((2)\) No. The limit doesn't exist when \(\mbox{deg}(p)= 2\), so this is not even a well-defined map from \(V\) to \(\mathbb{R}\).
\((3)\) Yes. This map just takes the coefficient of \(x^2\) in \(p\), which is a linear function of \(p\), hence a dual vector.
\((4)\) No. For instance, \(p(x)=x^{2}+2, q(x)=-2 x, \quad L(p)=11 \times 8=88 \quad L(q)=12\)
\(L(p+q)=5 \times 6=30 . \quad L(p+q) \neq L(p)+L(q)\). So it's not a dual vector.
\((5)\) No. For instance, \(p(x)=x^{2}+x, q(x)=-x^{2}+1, L(p)=\operatorname{deg}(p)=2, L(q)=\operatorname{deg}(q)=2\)
\(L(p+q)=\operatorname{deg}(x+1)=1,\quad L(p)+L(q)=4 \neq L(p+q)\). So it's not a dual vector.
1.6.3\(\ \small \mbox{directional derivative}\)
Fix a differentiable function \(f: \mathbb{R}^{2} \rightarrow \mathbb{R}\), and fix a point \(\boldsymbol{p} \in \mathbb{R}^{2}\). For any vector \(\boldsymbol{v} \in \mathbb{R}^{2}\), the directional derivative of \(f\) at \(\boldsymbol{p}\) in the direction of \(\boldsymbol{v}\) is defined as \(\nabla_{\boldsymbol{v}} f:=\lim _{t \rightarrow 0} \dfrac{f(\boldsymbol{p}+t \boldsymbol{v})-f(\boldsymbol{p})}{t}\). Show that the map \(\nabla f: \boldsymbol{v} \mapsto \nabla_{\boldsymbol{v}}(f)\) is a dual vector in \(\left(\mathbb{R}^{2}\right)^{*}\), i.e., a row vector. Also, what are its "coordinates" under the standard dual basis? (Remark: In calculus, we write \(\nabla f\) as a column vector for historical reasons. By all means, from a mathematical perspective, the correct way to write \(\nabla f\) is to write it as a row vector, as illustrated in this problem. But don't annoy your calculus teachers though.... In your calculus class, you use whatever notation your calculus teacher told you.) (Extra Remark: If we use row vectors, then the evaluation of \(\nabla f\) at \(\boldsymbol{v}\) is purely linear, and no inner product structure is needed, which is sweet. But if we HAVE TO write \(\nabla f\) as a column vector (for historical reasons), then we would have to do a dot product between \(\nabla f\) and \(\boldsymbol{v}\), which now requires an inner product structure. That is an unnecessary dependence on an extra structure that actually should have no influence.)
Set \(\vec{v}=\begin{pmatrix}a\\b\end{pmatrix}\). Since the function \(f\) is differentiable at \(\boldsymbol p\), \(f(\boldsymbol p+t\vec v)=f(\boldsymbol p)+t\Big(\dfrac{\partial f}{\partial x}(\boldsymbol p)\, a+\dfrac{\partial f}{\partial y}(\boldsymbol p)\, b\Big)+o(t)\), so \(\nabla_{\vec{v}} f=\dfrac{\partial f}{\partial x}(\boldsymbol p)\, a+\dfrac{\partial f}{\partial y}(\boldsymbol p)\, b\).
So the map \(\nabla f\) sends each vector \(\vec v\in \mathbb{R}^{2}\) to a real number. And for \(\forall\ \vec{v}=\begin{pmatrix}a_1\\b_1\end{pmatrix},\vec{w}=\begin{pmatrix}a_2\\b_2\end{pmatrix}\in\mathbb{R}^{2}\) and \(\forall m, n \in \mathbb{R}\), we have: \[ \begin{aligned} \nabla_{m\vec v+n\vec w} f &=\frac{\partial f}{\partial x}(\boldsymbol p)\,(ma_1+na_2)+\frac{\partial f}{\partial y}(\boldsymbol p)\,(mb_1+nb_2) \\ &=m \nabla_{\vec{v}} f+n \nabla_{\vec{w}} f \end{aligned} \] So the map \(\nabla f\) is linear in \(\vec v\), i.e., a dual vector in \(\left(\mathbb{R}^{2}\right)^{*}\).
The standard dual basis of \(\left(\mathbb{R}^{2}\right)^{*}\) consists of the coordinate functionals \(\boldsymbol e_1^{\mathrm T},\boldsymbol e_2^{\mathrm T}\), and under it the "coordinates" of \(\nabla f\) are \(\dfrac{\partial f}{\partial x}(\boldsymbol{p})\) and \(\dfrac{\partial f}{\partial y}(\boldsymbol{p})\), i.e., \(\nabla f=\Big[\dfrac{\partial f}{\partial x}(\boldsymbol{p})\ \ \dfrac{\partial f}{\partial y}(\boldsymbol{p})\Big]\) as a row vector.
1.6.4\(\ \small \mbox{kernel and range of the dual map}\)
Consider a linear map \(L: V \rightarrow W\) and its dual map \(L^{*}: W^{*} \rightarrow V^{*}\). Prove the following.
- \(\operatorname{Ker}\left(L^{*}\right)\) is exactly the collection of dual vectors in \(W^{*}\) that kills \(\operatorname{Ran}(L)\).
- \(\operatorname{Ran}\left(L^{*}\right)\) is exactly the collection of dual vectors in \(V^{*}\) that kills \(\operatorname{Ker}(L)\).
\((1)\) First, \(\operatorname{Ker}(L^{*})\subseteq W^{*}\). For any \(\xi\in \ker(L^*)\) and any \(\vec{\omega}=L\vec x\in \mbox{Ran}(L)\),
\(\xi(\vec\omega)=\xi(L\vec x)=(L^{*}\xi)(\vec x)=0\), so every element of \(\operatorname{Ker}(L^{*})\) kills \(\operatorname{Ran}(L)\). Conversely, if \(\xi\in W^{*}\) kills \(\operatorname{Ran}(L)\), then \((L^{*}\xi)(\vec x)=\xi(L\vec x)=0\) for every \(\vec x\in V\), so \(L^{*}\xi=0\) and \(\xi\in\operatorname{Ker}(L^{*})\).
\((2)\) First, \(\operatorname{Ran}(L^{*})\subseteq V^{*}\). For any \(\varphi=L^{*}\xi\in \mbox{Ran}(L^*)\) and any \(\vec{\omega}\in \mbox{Ker}(L)\),
\(\varphi(\vec\omega)=(L^{*}\xi)(\vec\omega)=\xi(L\vec\omega)=\xi(\vec 0)=0\), so every element of \(\operatorname{Ran}(L^{*})\) kills \(\operatorname{Ker}(L)\). Conversely, the dual vectors killing \(\operatorname{Ker}(L)\) form a subspace of dimension \(\dim V-\dim\operatorname{Ker}(L)=\operatorname{rank}(L)=\operatorname{rank}(L^{*})=\dim\operatorname{Ran}(L^{*})\); since this subspace contains \(\operatorname{Ran}(L^{*})\), the two spaces coincide.
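A small numerical illustration of \(\small(1)\) in the matrix picture (a sketch; here \(L^{*}\) acts on row vectors by right multiplication with \(L\), so \(\xi\in\operatorname{Ker}(L^{*})\) means \(\xi L=0\)):

```python
import numpy as np

np.random.seed(4)
L = np.random.rand(5, 3) @ np.random.rand(3, 4)   # a 5x4 matrix, rank 3

U, s, Vt = np.linalg.svd(L)
rank = int(np.sum(s > 1e-10))
ker_L_star = U[:, rank:].T              # rows spanning Ker(L^T), i.e. Ker(L*)
print(np.allclose(ker_L_star @ L, 0))   # True: these dual vectors kill Ran(L)
```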
\(\large\textcolor{blue}{\mbox{Advanced Algebra } \small \mathbb{HW}\mathrm{7}}\ \ \ \ \ \ _\textcolor{blue}{2022.5.3}\)
1.7.1\(\ \small \mbox{bra map and Riesz map}\)
On the space \(\mathbb{R}^{n}\), we fix a symmetric positive-definite matrix \(A\), and define \((\boldsymbol{v}, \boldsymbol{w})=\boldsymbol{v}^{\mathrm{T}}A\boldsymbol{w}\)
- Show that this is an inner product.
- The Riesz map (inverse of the bra map) from \(V^{*}\) to \(V\) would send a row vector \(\boldsymbol{v}^{\mathrm{T}}\) to what?
- The bra map from \(V\) to \(V^{*}\) would send a vector \(\boldsymbol{v}\) to what?
- The dual of the Riesz map from \(V^{*}\) to \(V\) would send a row vector \(\boldsymbol{v}^{\mathrm{T}}\) to what?
\((1)\) The form \((\boldsymbol{v}, \boldsymbol{w})=\boldsymbol{v}^{\mathrm{T}}A\boldsymbol{w}\) is clearly bilinear; it is symmetric because \(A=A^{\mathrm T}\) gives \((\boldsymbol v,\boldsymbol w)=\boldsymbol v^{\mathrm T}A\boldsymbol w=(\boldsymbol v^{\mathrm T}A\boldsymbol w)^{\mathrm T}=\boldsymbol w^{\mathrm T}A\boldsymbol v=(\boldsymbol w,\boldsymbol v)\); and it is positive-definite because \(A\) is: \((\boldsymbol v,\boldsymbol v)=\boldsymbol v^{\mathrm T}A\boldsymbol v>0\) for \(\boldsymbol v\neq \boldsymbol 0\). So it is an inner product.
\((2)\) The bra map sends \(\boldsymbol v\) to the dual vector \(\langle \boldsymbol{v}|:\boldsymbol{\omega}\longmapsto\langle\boldsymbol{v},\boldsymbol{\omega}\rangle=\boldsymbol{v}^{\mathrm{T}}A\boldsymbol{\omega}\), i.e.\ to the row vector \(\boldsymbol{v}^{\mathrm{T}}A\). The Riesz map is its inverse, from \(V^{*}\) back
to \(V\). Set \(\boldsymbol{u}^{\mathrm T}=\boldsymbol {v}^{T}A\); then \(\boldsymbol{v}^{T}=\boldsymbol{u}^{\mathrm T}A^{-1}\), and transposing gives \(\boldsymbol{v}=(A^{-1})^{T}\boldsymbol{u}=A^{-1}\boldsymbol{u}\) since \(A\) is symmetric.
So if the input is the row vector \(\boldsymbol{v}^{T}\in V^{*}\), the output of the Riesz map is \(A^{-1}(\boldsymbol v^{T})^{T}=A^{-1}\boldsymbol{v}\in V\).
\((3)\) Obviously, it sends \(\boldsymbol{v}\) to \(\boldsymbol{v}^{T}A\)
\((4)\) The dual of the Riesz map goes from \(V^{*}\) to \(V^{**}\cong V\). For \(\boldsymbol v^{\mathrm T}\in V^{*}\), its image acts on a row vector \(\boldsymbol \xi^{\mathrm T}\in V^{*}\) by \(\boldsymbol \xi^{\mathrm T}\mapsto \boldsymbol v^{\mathrm T}\big(A^{-1}\boldsymbol\xi\big)=\boldsymbol\xi^{\mathrm T}\big(A^{-1}\boldsymbol v\big)\), using that \(A^{-1}\) is symmetric. Under the identification \(V^{**}\cong V\) this is the vector \(A^{-1}\boldsymbol v\), the same as in \(\small(2)\): the Riesz map is its own dual.
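A numerical sketch of \(\small(2)\) and \(\small(3)\), assuming a concrete SPD matrix \(A\):

```python
import numpy as np

np.random.seed(5)
M = np.random.rand(3, 3)
A = M.T @ M + 3 * np.eye(3)        # a symmetric positive-definite matrix

v = np.random.rand(3)
bra_v = v @ A                      # bra map: v |-> row vector v^T A
riesz = np.linalg.solve(A, bra_v)  # Riesz map: row vector xi^T |-> A^{-1} xi
print(np.allclose(riesz, v))       # True: the Riesz map inverts the bra map
```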
1.7.2\(\ \small \mbox{What is a derivative}\)
The discussion in this problem holds for all manifolds \(M\). But for simplicity's sake, suppose \(M=\mathbb{R}^{3}\) for this problem.
Let \(V\) be the space of all analytic functions from \(M\) to \(\mathbb{R}\). Here analytic means \(f(x, y, z)\) is a infinite polynomial series (its Taylor expansion) with variables \(x, y, z\). Approximately \(f(x, y, z)=a_{0}+a_{1} x+a_{2} y+\) \(a_{3} z+a_{4} x^{2}+a_{5} x y+a_{6} x z+a_{7} y^{2}+\ldots\), and things should converge always.
Then a dual vector \(v \in V^{*}\) is said to be a "derivation at \(\boldsymbol{p} \in M\)" if it satisfies the following Leibniz rule (or product rule): \[ v(f g)=f(\boldsymbol{p}) v(g)+g(\boldsymbol{p}) v(f) . \] (Note the similarity with your traditional product rule \((f g)^{\prime}(x)=f(x) g^{\prime}(x)+g(x) f^{\prime}(x)\).) Prove the following:
- Constant functions in \(V\) must be sent to zero by all derivations at any point.
- Let \(x, y, z \in V\) be the coordinate function. Suppose \(\boldsymbol{p}=\left[\begin{array}{l}p_{1} \\ p_{2} \\ p_{3}\end{array}\right]\), then for any derivation \(v\) at \(\boldsymbol{p}\), then we have \(v\left(\left(x-p_{1}\right) f\right)=f(\boldsymbol{p}) v(x), v\left(\left(y-p_{2}\right) f\right)=f(\boldsymbol{p}) v(y)\) and \(v\left(\left(z-p_{3}\right) f\right)=f(\boldsymbol{p}) v(z)\).
- Let \(x, y, z \in V\) be the coordinate function. Suppose \(\boldsymbol{p}=\left[\begin{array}{l}p_{1} \\ p_{2} \\ p_{3}\end{array}\right]\), then for any derivation \(v\) at \(\boldsymbol{p}\), then we have \(v\left(\left(x-p_{1}\right)^{a}\left(y-p_{2}\right)^{b}\left(z-p_{3}\right)^{c}\right)=0\) for any non-negative integers \(a, b, c\) such that \(a+b+c>1\).
- Let \(x, y, z \in V\) be the coordinate function. Suppose \(\boldsymbol{p}=\left[\begin{array}{l}p_{1} \\ p_{2} \\ p_{3}\end{array}\right]\), then for any derivation \(v\) at \(\boldsymbol{p}, v(f)=\) \(\dfrac{\partial f}{\partial x}(\boldsymbol{p}) v(x)+\dfrac{\partial f}{\partial y}(\boldsymbol{p}) v(y)+\dfrac{\partial f}{\partial z}(\boldsymbol{p}) v(z)\). (Hint: use the Taylor expansion of \(f\) at \(\left.\boldsymbol{p} .\right)\)
- Any derivation \(v\) at \(\boldsymbol{p}\) must be exactly the directional derivative operator \(\nabla_{\boldsymbol{v}}\) where \(\boldsymbol{v}=\left[\begin{array}{l}v(x) \\ v(y) \\ v(z)\end{array}\right]\). (Remark: So, algebraically speaking, tangent vectors are exactly derivations, i.e., things that satisfy the Leibniz rule.)
\((1)\) First consider the constant function \(1\): \(v(1)=v(1\cdot 1)=1\cdot v(1)+1\cdot v(1)=2v(1)\), so \(v(1)=0\).
For a general constant function \(f\equiv a\), linearity of \(v\in V^{*}\) gives \(v(f)=v(a\cdot 1)=a\,v(1)=0\).
So constant functions in \(V\) are all sent to \(0\) by every derivation at any point.
\((2)\) By the Leibniz rule, \(v((x-p_1)f)=f(\boldsymbol {p})\,v(x-p_1)+(x-p_1)(\boldsymbol{p})\,v(f)\). Since \(x(\boldsymbol{p})=p_1\), the function \(x-p_1\) vanishes at \(\boldsymbol p\), so the second term is \(0\).
For the constant \(p_1\), \(v(p_1)=0\) by \(\small(1)\), so \(v((x-p_1)f)=f(\boldsymbol{p})(v(x)-v(p_1))=f(\boldsymbol{p})v(x)\). The identities for \(y-p_2\) and \(z-p_3\) follow in the same way.
\((3)\) Since \(a,b,c\) are non-negative integers with \(a+b+c>1\), at least one of them is \(\geq 1\); without loss of generality, assume \(a\geq 1\).
Write the product as \((x-p_1)\cdot g\) with \(g=(x-p_1)^{a-1}(y-p_2)^b(z-p_3)^c\). Then the Leibniz rule gives
\(v\left(\left(x-p_{1}\right)^{a}\left(y-p_{2}\right)^{b}\left(z-p_{3}\right)^{c}\right)=(x-p_1)(\boldsymbol p)\,v(g)+g(\boldsymbol{p})\,v(x-p_1)=g(\boldsymbol{p})\,v(x-p_1)\)
If \(a>1\), then \(g\) contains the factor \(x-p_1\), so \(g(\boldsymbol p)=0\). If \(a=1\), then \(b+c>0\), so
at least one of \(b,c\) is \(\geq 1\); without loss of generality assume \(b\geq 1\), and then \(g\) contains the factor \(y-p_2\), so again
\(g(\boldsymbol{p})=0\). In both cases \(v\left(\left(x-p_{1}\right)^{a}\left(y-p_{2}\right)^{b}\left(z-p_{3}\right)^{c}\right)=0\)
\((4)\) Use the Taylor expansion of \(f\) at \(\boldsymbol p\): \[ f(x, y, z)=f\left(p_{1}, p_{2}, p_{3}\right)+\frac{\partial f}{\partial x}(\boldsymbol{p})\left(x-p_{1}\right)+\frac{\partial f}{\partial y}(\boldsymbol{p})\left(y-p_{2}\right)+\frac{\partial f}{\partial z}(\boldsymbol{p})\left(z-p_{3}\right)+R \] where the remainder \(R\) is a sum of terms of the form \(c_{abc}\left(x-p_{1}\right)^{a}\left(y-p_{2}\right)^{b}\left(z-p_{3}\right)^{c}\) with \(a+b+c>1\), each of which \(v\) sends to \(0\) by \(\small(3)\).
The constant term \(f(\boldsymbol p)\) is sent to \(0\) by \(\small(1)\). Applying \(v\) to the Taylor expansion therefore gives \[ v(f)=\dfrac{\partial f}{\partial x}(\boldsymbol{p}) v(x)+\dfrac{\partial f}{\partial y}(\boldsymbol{p}) v(y)+\dfrac{\partial f}{\partial z}(\boldsymbol{p}) v(z) \] which is exactly the total differential of \(f\) at \(\boldsymbol{p}\) evaluated on \((v(x),v(y),v(z))\).
\((5)\) Just calculate the directional derivative \(\nabla_{\boldsymbol{v}}\) where \(\boldsymbol{v}=\left[\begin{array}{l}v(x) \\ v(y) \\ v(z)\end{array}\right]\) \[ \begin{aligned} \nabla_{\boldsymbol v} f &=\lim _{t \rightarrow 0} \frac{f(\boldsymbol{p}+t \boldsymbol{v})-f(\boldsymbol{p})}{t} \\ &=\lim _{t \rightarrow 0} \dfrac{f(\boldsymbol{p})+\dfrac{\partial f}{\partial x}(\boldsymbol{p}) v(x) t+\dfrac{\partial f}{\partial y}(\boldsymbol{p}) v(y) t+\dfrac{\partial f}{\partial z}(\boldsymbol{p}) v(z) t+o(\|\boldsymbol{v}\| t)-f(\boldsymbol{p})}{t} \\ &=\frac{\partial f}{\partial x}(\boldsymbol{p}) v(x)+\frac{\partial f}{\partial y}(\boldsymbol{p}) v(y)+\frac{\partial f}{\partial z}(\boldsymbol{p}) v(z) \\ &=v(f) \end{aligned} \] by \(\small(4)\). So a derivation \(v\) at \(\boldsymbol p\) is exactly the directional derivative operator \(\nabla_{\boldsymbol v}\): algebraically, a tangent vector is just a dual vector on the function space that satisfies the Leibniz rule.
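A symbolic spot-check with SymPy (a sketch; the point, the vector and the test functions are arbitrary choices, and `deriv_at_p` is just an illustrative helper): the directional-derivative functional at \(\boldsymbol p\) indeed satisfies the Leibniz rule.

```python
import sympy as sp

x, y, z, t = sp.symbols('x y z t')
p = {x: 1, y: 2, z: -1}            # the point p
vv = (3, -2, 5)                    # v = (v(x), v(y), v(z))

def deriv_at_p(h):
    # directional derivative: d/dt of h(p + t v) at t = 0
    ht = h.subs({x: p[x] + t*vv[0], y: p[y] + t*vv[1], z: p[z] + t*vv[2]})
    return sp.diff(ht, t).subs(t, 0)

f = x**2*y + z**3                  # two analytic test functions
g = x*z + y**3

lhs = deriv_at_p(f*g)
rhs = f.subs(p)*deriv_at_p(g) + g.subs(p)*deriv_at_p(f)
print(sp.simplify(lhs - rhs) == 0) # True: the Leibniz rule holds
```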
1.7.3\(\ \small \mbox{What is a vector field}\)
The discussion in this problem holds for all manifolds \(M\). But for simplicity's sake, suppose \(M=\mathbb{R}^{3}\) for this problem. Let \(V\) be the space of all analytic functions from \(M\) to \(\mathbb{R}\) as usual. We say \(X: V \rightarrow V\) is a vector field on \(M\) if \(X(f g)=f X(g)+g X(f)\), i.e., the Leibniz rule again! Prove the following:
Show that \(X_{\boldsymbol{p}}: V \rightarrow \mathbb{R}\) such that \(X_{\boldsymbol{p}}(f)=(X(f))(\boldsymbol{p})\) is a derivation at \(\boldsymbol{p}\). (Hence \(X\) is indeed a vector field, since it is the same as picking a tangent vector at each point.)
Note that each \(f\) on \(M\) induces a covector field \(\mathrm{d} f\). Then at each point \(\boldsymbol p\), the cotangent vector \(\mathrm{d} f\) and the tangent vector \(X\) would evaluate to some number. So \(\mathrm{d} f(X)\) is a function \(M \rightarrow \mathbb{R}\). Show that \(\mathrm{d} f(X)=X(f)\), i.e., the two are the same. (Hint: just use definitions and calculate directly.)
If \(X, Y: V \rightarrow V\) are vector fields, then note that \(X \circ Y: V \rightarrow V\) might not be a vector field. (Leibniz rule might fail.) However, show that \(X \circ Y-Y \circ X\) is always a vector field.
On a related note, show that if \(A, B\) are skew-symmetric matrices, then \(A B-B A\) is still skew-symmetric. (Skew-symmetric matrices actually correspond to certain vector fields on the manifold of orthogonal matrices. So this is no coincidence.)
\((1)\) \(X_{\boldsymbol{p}}(fg)=(X(fg))(\boldsymbol{p})=(fX(g)+gX(f))(\boldsymbol{p})=f(\boldsymbol{p})X(g)(\boldsymbol{p})+g(\boldsymbol{p})X(f)(\boldsymbol{p})\)
\(=f(\boldsymbol{p})X_{\boldsymbol{p}}(g)+g(\boldsymbol{p})X_{\boldsymbol{p}}(f)\). So \(X_{\boldsymbol p}\) satisfies the definition of a derivation at \(\boldsymbol{p}\).
\((2)\) \(X_{\boldsymbol{p}}:V\rightarrow \mathbb R\), so \(X_{\boldsymbol{p}}\in V^{*}\) is a derivation at \(\boldsymbol p\) by \(\small(1)\), with coordinate row \(\left[\begin{array}{ccc} X_{\boldsymbol{p}}(x) & X_{\boldsymbol p}(y) & X_{\boldsymbol p}(z) \end{array}\right]\),
and at the same point the covector \(\mathrm df_{\boldsymbol p}\) has coordinate row \(\left[\begin{array}{ccc} \dfrac{\partial f}{\partial x}(\boldsymbol{p})& \dfrac{\partial f}{\partial y}(\boldsymbol{p})& \dfrac{\partial f}{\partial z}(\boldsymbol{p}) \end{array}\right]\). Combining them, by 1.7.2\(\small (4)\),
\((X(f))(\boldsymbol{p})=X_{\boldsymbol p}(f)=\dfrac{\partial f}{\partial x}(\boldsymbol{p}) X_{\boldsymbol{p}}(x)+\dfrac{\partial f}{\partial y}(\boldsymbol{p}) X_{\boldsymbol{p}}(y)+\dfrac{\partial f}{\partial z}(\boldsymbol{p}) X_{\boldsymbol{p}}(z)\)
\(=\left[\begin{array}{ccc} \dfrac{\partial f}{\partial x}(\boldsymbol{p})& \dfrac{\partial f}{\partial y}(\boldsymbol{p})& \dfrac{\partial f}{\partial z}(\boldsymbol{p}) \end{array}\right]\left[\begin{array}{c} X_{\boldsymbol{p}}(x) \\ X_{\boldsymbol p}(y) \\ X_{\boldsymbol p}(z) \end{array}\right]=\mathrm df_{\boldsymbol p}(X_{\boldsymbol p})=\big(\mathrm df(X)\big)(\boldsymbol p)\). Since this holds at every point \(\boldsymbol p\), \(\mathrm d f(X)=X(f)\).
\((3)\) Just expand the formula using the Leibniz rule for \(X\) and \(Y\) \[ \begin{gathered} (X\circ Y-Y\circ X)(fg)=X\big(Y(fg)\big)-Y\big(X(fg)\big)=X\big(fY(g)+gY(f)\big)-Y\big(fX(g)+gX(f)\big)\\ =\small f\, X(Y(g))+X(f)Y(g)+g\, X(Y(f))+X(g)Y(f)-\big(f\, Y(X(g))+Y(f)X(g)+g\, Y(X(f))+Y(g)X(f)\big)\\ =f\,(X\circ Y-Y\circ X)(g)+g\,(X\circ Y-Y\circ X)(f) \end{gathered} \] So \(X\circ Y-Y \circ X\) satisfies the Leibniz rule, which makes it a vector field.
\((4)\) Calculating directly, we can prove \((AB-BA)^{T}=-(AB-BA)\) \[ \begin{aligned} (A B-B A)^{T} =(A B)^{T}-(B A)^{T} =B^{T} A^{T}-A^{T} B^{T} \\ =(-B)(-A)-(-A)(-B) =BA-AB=-(A B-B A) \end{aligned} \] Remark: on the manifold of orthogonal matrices, a curve \(Q(t)\) with \(Q(0)=I\) and \(Q(t)^{T}Q(t)=I\) satisfies, to first order in \(t\),
\((I+tA)^{T}(I+tA)=I\Longrightarrow A+A^{T}=0\)
so the tangent vectors at the identity are exactly the skew-symmetric matrices, and the computation above shows that this space is closed under the commutator \(AB-BA\), in line with \(\small (3)\).
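A quick NumPy check of \(\small(4)\) with randomly generated skew-symmetric matrices (a sketch):

```python
import numpy as np

np.random.seed(6)
M1, M2 = np.random.rand(4, 4), np.random.rand(4, 4)
A, B = M1 - M1.T, M2 - M2.T        # two skew-symmetric matrices

C = A @ B - B @ A                  # their commutator
print(np.allclose(C.T, -C))        # True: AB - BA is again skew-symmetric
```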
\(\Large \mathbf{If\ your\ life\ is\ tense,\ it\ could\ be\ a\ tensor. }\)
\(\large\textcolor{blue}{\mbox{Advanced Algebra } \small \mathbb{HW}\mathrm{8}}\ \ \ \ \ \ _\textcolor{blue}{2022.5.9}\)
1.8.1\(\ \small \mbox{Elementary layer operations for tensors}\)
Note that, for "2D" matrices we have row and column operations, and the two kinds of operations corresponds to the two dimensions of the array.
For simplicity, let \(M\) be a \(2 \times 2 \times 2\) "3D matrix". Then we have "row layer operations", "column layer operations", "horizontal layer operations". The three kinds corresponds to the three dimensions of the array. We interpret this as a multilinear map \(M: \mathbb{R}^{2} \times \mathbb{R}^{2} \times \mathbb{R}^{2} \rightarrow \mathbb{R}\). Let \(\left(\left(\mathbb{R}^{2}\right)^{*}\right)^{\otimes 3}\) be the space of all multilinear maps from \(\mathbb{R}^{2} \times \mathbb{R}^{2} \times \mathbb{R}^{2}\) to \(\mathbb{R}\).
- Given \(\alpha, \beta, \gamma \in\left(\mathbb{R}^{2}\right)^{*}\), what is the \((i, j, k)\)-entry of the "3D matrix" \(\alpha \otimes \beta \otimes \gamma\) in terms of the coordinates of \(\alpha, \beta, \gamma\) ? Here \(\alpha \otimes \beta \otimes \gamma\) is the multilinear map sending \((\boldsymbol{u}, \boldsymbol{v}, \boldsymbol{w})\) to the real number \(\alpha(\boldsymbol{u}) \beta(\boldsymbol{v}) \gamma(\boldsymbol{w})\).
- Let \(E\) be an elementary matrix. Then we can send \(\alpha \otimes \beta \otimes \gamma\) to \((\alpha E) \otimes \beta \otimes \gamma\). Why can this be extended to a linear map \(M_{E}:\left(\left(\mathbb{R}^{2}\right)^{*}\right)^{\otimes 3} \rightarrow\left(\left(\mathbb{R}^{2}\right)^{*}\right)^{\otimes 3}\) ? (This gives a formula for the "elementary layer operations" on "3D matrices", where the three kinds of layer operations corresponds to applying \(E\) to the three arguments respectively.)
- Show that elementary layer operations preserve rank. Here we say \(M\) has rank \(r\) if \(r\) is the smallest possible integer such that \(M\) can be written as the linear combination of \(r\) "rank one" maps, i.e., maps of the kind \(\alpha \otimes \beta \otimes \gamma\) for some \(\alpha, \beta, \gamma \in\left(\mathbb{R}^{2}\right)^{*}\).
- Show that, if some "2D" layer matrix of a "3D matrix" has rank r, then the \(3 D\) matrix has rank at least \(r\).
- Let \(M\) be made of two layers, \(\left[\begin{array}{ll}1 & 0 \\ 0 & 1\end{array}\right]\) and \(\left[\begin{array}{ll}0 & 1 \\ 1 & 0\end{array}\right]\). Find its rank.
- (Read only) Despite some practical interests, finding the tensor rank in general is NOT easy. In fact, it is NP-complete just for 3-tensors over finite field. Furthermore, a tensor with all real entries might have different real rank and complex rank.
\((1)\) Write \(\alpha,\beta,\gamma\) as coordinate vectors, so that \(\alpha(\boldsymbol u)=\alpha^{T}\boldsymbol u=\boldsymbol u^{T}\alpha\). Then
\((\alpha\otimes\beta\otimes\gamma)(\boldsymbol u,\boldsymbol v,\boldsymbol \omega)=(\alpha^{T} \boldsymbol u)( \beta^{T} \boldsymbol v)( \gamma^{T} \boldsymbol \omega)=\boldsymbol u^{T}\left[\gamma_{1}\,\alpha \beta^{T}\boldsymbol v \quad \gamma_{2}\,\alpha \beta^{T}\boldsymbol v\right] \boldsymbol\omega\). Comparing with \([\boldsymbol u^TA_1\boldsymbol v\quad \boldsymbol u^TA_2\boldsymbol v]\boldsymbol\omega\), the layers are
\(A_{1}=\gamma_{1}\left(\begin{array}{ll}\alpha_{1} \beta_{1} & \alpha_{1} \beta_{2} \\ \alpha_{2} \beta_{1} & \alpha_{2} \beta_{2}\end{array}\right)=\gamma_1\alpha\beta^T,A_{2}=\gamma_{2}\left(\begin{array}{ll}\alpha_{1} \beta_{1} & \alpha_{1} \beta_{2} \\ \alpha_{2} \beta_{1} & \alpha_{2} \beta_{2}\end{array}\right)=\gamma_2\alpha\beta^T\). So the \((i,j,k)\)-entry, with \(i\) the row index, \(j\) the column index and \(k\) the layer index, is \(\alpha_i\beta_j\gamma_k\).
\((2)\) Define \(M_E\) on all of \(((\mathbb{R}^2)^*)^{\otimes3}\) by pre-composing the first argument with \(E\): \[ (M_E\,T)(\boldsymbol u,\boldsymbol v,\boldsymbol w)=T(E\boldsymbol u,\boldsymbol v,\boldsymbol w),\qquad T\in((\mathbb{R}^2)^*)^{\otimes3} \] This is linear in \(T\) (evaluation at a fixed triple of vectors is linear), so \(M_E:((\mathbb{R}^2)^*)^{\otimes3} \rightarrow((\mathbb{R}^2)^*)^{\otimes3}\) is a linear map. On rank-one maps it does exactly what we want: \[ (\alpha\otimes\beta\otimes\gamma)(E\boldsymbol u,\boldsymbol v,\boldsymbol w)=\alpha(E\boldsymbol u)\,\beta(\boldsymbol v)\,\gamma(\boldsymbol w)=\big((\alpha E)\otimes\beta\otimes\gamma\big)(\boldsymbol u,\boldsymbol v,\boldsymbol w) \] so \(M_E\) extends \(\alpha \otimes \beta \otimes \gamma\mapsto(\alpha E) \otimes \beta \otimes \gamma\) to the whole space. In layer form it sends \(M=[[\gamma_1\alpha \beta^T,\gamma_2\alpha \beta^T]]\) to \([[\gamma_1(E^{T}\alpha) \beta^T,\gamma_2(E^{T}\alpha)\beta^T]]\), i.e.\ it applies the same row operation \(E^{T}\) to every layer simultaneously. The same construction with \(E\) in the second or third argument gives the other two kinds of elementary layer operations.
\((3)\) Suppose \(M=\displaystyle \sum_{i=1}^r\alpha_i\otimes\beta_i\otimes\gamma_i\) with \(r=\operatorname{rank}(M)\). Applying an elementary layer operation \(M_E\) to \(M\) turns each summand
into another rank-one map, so \(r'=\operatorname{rank}(M_EM)\leq r\). Since every elementary matrix \(E\) is invertible, applying the inverse layer operation \(M_{E^{-1}}\) to \(M_EM\)
recovers \(M\), so by the same argument \(r\leq r'\). Hence \(r'=r\): elementary layer operations preserve rank.
\((4)\) For a single rank-one map \(\alpha \otimes \beta \otimes \gamma\), every \(2\)D layer matrix is a scalar multiple of a rank-\(\leq1\) matrix (for instance, the layers in the third direction are \(\gamma_k\,\alpha\beta^{T}\)).
So if a \(3\)D matrix \(M\) is a sum of \(s\) rank-one maps, each of its layer matrices is a sum of \(s\) matrices of rank \(\leq 1\) and hence has rank \(\leq s\), since the rank of a matrix is the minimum number of rank-one matrices needed to build it (e.g.\ by \(\mbox{SVD}\)).
Therefore, if some \(2\)D layer matrix has rank \(r\), then \(s\geq r\), i.e.\ the \(3\)D matrix has rank at least \(r\).
\((5)\) For \(A_1=\left[\begin{array}{ll}1 & 0 \\ 0 & 1\end{array}\right]\) its rank is \(2\), so by \(\small(4)\) \(r(M)\geq 2\). Besides, construct two rank-one maps \[ M_1=[[\dfrac{1}{2}\begin{bmatrix}1 & 1 \\ 1 & 1\end{bmatrix},\dfrac{1}{2}\begin{bmatrix}1 & 1 \\ 1 & 1\end{bmatrix}]], M_2=[[\dfrac{1}{2}\begin{bmatrix}1 & -1 \\ -1 & 1\end{bmatrix},-\dfrac{1}{2}\begin{bmatrix}1 & -1 \\ -1 & 1\end{bmatrix}]] \] (both layers of \(M_1\) are multiples of the rank-one matrix \(\begin{bmatrix}1 & 1 \\ 1 & 1\end{bmatrix}\), and both layers of \(M_2\) are multiples of \(\begin{bmatrix}1 & -1 \\ -1 & 1\end{bmatrix}\), so each is of the form \(\alpha\otimes\beta\otimes\gamma\)). Since \(M=M_1+M_2\), its rank is exactly \(2\).
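A numerical check of the decomposition in \(\small(5)\) (a sketch; the tensor is stored with the layer index first, matching the \([[A_1,A_2]]\) notation, and `rank_one` is an illustrative helper):

```python
import numpy as np

M = np.array([[[1., 0.], [0., 1.]],
              [[0., 1.], [1., 0.]]])          # layers I and [[0,1],[1,0]]

def rank_one(alpha, beta, gamma):
    # the 3D matrix of alpha⊗beta⊗gamma: entry (i,j,k) = alpha_i beta_j gamma_k,
    # stored here with the layer index k first, matching the [[A_1,A_2]] notation
    return np.einsum('k,i,j->kij', gamma, alpha, beta)

M1 = rank_one(np.array([1., 1.]),  np.array([1., 1.]),  np.array([0.5, 0.5]))
M2 = rank_one(np.array([1., -1.]), np.array([1., -1.]), np.array([0.5, -0.5]))
print(np.allclose(M, M1 + M2))                # True: M is a sum of two rank-one maps
```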
1.8.2\(\ \small \mbox{i+j+k rank-3 tensor}\)
Let \(M\) be a \(3 \times 3 \times 3\) "3D matrix" whose \((i, j, k)\)-entry is \(i+j+k\). We interpret this as a multilinear map \(M: \mathbb{R}^{3} \times \mathbb{R}^{3} \times \mathbb{R}^{3} \rightarrow \mathbb{R}\).
- Let \(\boldsymbol{v}=\left[\begin{array}{l}x \\ y \\ z\end{array}\right]\), then \(M(\boldsymbol{v}, \boldsymbol{v}, \boldsymbol{v})\) is a polynomial in \(x, y, z\). What is this polynomial?
- Let \(\sigma:\{1,2,3\} \rightarrow\{1,2,3\}\) be any bijection. Show that \(M\left(\boldsymbol{v}_{1}, \boldsymbol{v}_{2}, \boldsymbol{v}_{3}\right)=M\left(\boldsymbol{v}_{\sigma(1)}, \boldsymbol{v}_{\sigma(2)}, \boldsymbol{v}_{\sigma(3)}\right)\). (Hint: brute force works. But alternatively, try find the \((i, j, k)\) entry of the multilinear map \(M^{\sigma}\), a map that sends \(\left(\boldsymbol{v}_{1}, \boldsymbol{v}_{2}, \boldsymbol{v}_{3}\right)\) to \(M\left(\boldsymbol{v}_{\sigma(1)}, \boldsymbol{v}_{\sigma(2)}, \boldsymbol{v}_{\sigma(3)}\right)\).)
- Show that the rank \(r\) of \(M\) is at least 2 and at most 3. (It is actually exactly three.)
- (Read only) Any study of polynomial of degree \(d\) on \(n\) variables is equivalent to the study of some symmetric \(d\) tensor on \(\mathbb{R}^{n}\).
\((1)\) \(M=[[\begin{bmatrix}3 & 4 & 5 \\ 4 & 5 & 6\\5& 6&7\end{bmatrix},\begin{bmatrix}4 & 5 & 6 \\ 5 & 6 & 7\\6& 7&8\end{bmatrix},\begin{bmatrix}5 & 6 & 7 \\ 6 & 7 & 8\\7& 8&9\end{bmatrix}]]=[[A_1,A_2,A_3]]\) And \(M(\boldsymbol v,\boldsymbol v,\boldsymbol v)\) where \(\boldsymbol{v}=\begin{pmatrix}x\\y\\z\end{pmatrix}\)
\(=[\boldsymbol v^TA_1\boldsymbol v\quad \boldsymbol v^TA_2\boldsymbol v\quad \boldsymbol v^TA_3\boldsymbol v]\boldsymbol v=[\boldsymbol v^T\begin{bmatrix}3 & 4 & 5 \\ 4 & 5 & 6\\5& 6&7\end{bmatrix}\boldsymbol v\quad \boldsymbol v^T\begin{bmatrix}4 & 5 & 6 \\ 5 & 6 & 7\\6& 7&8\end{bmatrix}\boldsymbol v\quad \boldsymbol v^T\begin{bmatrix}5 & 6 & 7 \\ 6 & 7 & 8\\7& 8&9\end{bmatrix}\boldsymbol v]\boldsymbol v\)
Expanding (for instance with \(\mbox{Mathematica}\)), or noting that \(\displaystyle\sum_{i,j,k}(i+j+k)v_iv_jv_k=3\Big(\sum_i i\,v_i\Big)\Big(\sum_j v_j\Big)\Big(\sum_k v_k\Big)\), the result is \(p(x,y,z)=3 (x+y+z)^2 (x+2 y+3 z)\)
\((2)\) Let \(M^{\sigma}\) be the map from \(\mathbb{R}^{3} \times \mathbb{R}^{3} \times \mathbb{R}^{3}\) to \(\mathbb{R}\) which sends \(\left(\boldsymbol{v}_{1}, \boldsymbol{v}_{2}, \boldsymbol{v}_{3}\right)\) to \(M\left(\boldsymbol{v}_{\sigma(1)}, \boldsymbol{v}_{\sigma(2)}, \boldsymbol{v}_{\sigma(3)}\right)\)
Obviously, it's multilinear, since for each \(\boldsymbol{v}_i\) the evaluation is linear no matter which slot of \(M\) the vector \(\boldsymbol{v}_i\) lands in.
So \(M^{\sigma}\) is again a "3D matrix" \(M'=[[A_1',A_2',A_3']]\), and we can find its entries by specialising the \(\boldsymbol{v_i}\) to standard basis vectors.
Set \(\boldsymbol{b}_1=\begin{pmatrix}1\\0\\0\end{pmatrix},\boldsymbol{b}_2=\begin{pmatrix}0\\1\\0\end{pmatrix},\boldsymbol{b}_3=\begin{pmatrix}0\\0\\1\end{pmatrix}\) and let \((\boldsymbol{v_1},\boldsymbol{v_2},\boldsymbol{v_3})=(\boldsymbol{b_i},\boldsymbol{b_j},\boldsymbol{b_k})\) where \(1\leq i,j,k\leq 3\)
(repetitions allowed). Writing \((x_1,x_2,x_3)=(i,j,k)\), the \((i,j,k)\)-entry of \(M^{\sigma}\) is \[ M^{\sigma}(\boldsymbol b_i,\boldsymbol b_j,\boldsymbol b_k)=M\big(\boldsymbol b_{x_{\sigma(1)}},\boldsymbol b_{x_{\sigma(2)}},\boldsymbol b_{x_{\sigma(3)}}\big)=x_{\sigma(1)}+x_{\sigma(2)}+x_{\sigma(3)}=i+j+k \] because \(\sigma\) only permutes the three summands \(i,j,k\). So \(M^{\sigma}\) and \(M\) have the same entries, i.e.\ \(M^{\sigma}=M\), which proves the required equation.
A brute-force attempt (not completed):
Any permutation \(\sigma\) is a product of at most two swaps of two of the \(\boldsymbol{v}_{i}\), so it suffices to check single swaps.
If we swap \(\boldsymbol{v}_{1},\boldsymbol{v}_{2}\): the value \(\boldsymbol v_1^TA_i\boldsymbol v_2=\boldsymbol v_2^TA_i\boldsymbol v_1,i=1,2,3\), since \(A_i=A_i^T\), so the result stays the same.
And if we swap \(\boldsymbol{v}_{2},\boldsymbol{v}_{3}\), one has to compare \([\boldsymbol v_1^TA_1\boldsymbol v_2\quad \boldsymbol v_1^TA_2\boldsymbol v_2\quad \boldsymbol v_1^TA_3\boldsymbol v_2]\boldsymbol v_3=\boldsymbol v_1^TA_1\boldsymbol v_2v_{3x}+\boldsymbol v_1^TA_2\boldsymbol v_2v_{3y}+\boldsymbol v_1^TA_3\boldsymbol v_2v_{3z}\) with the swapped expression, and the bookkeeping gets heavy, which is why the entry-wise argument above is preferable.
\((3)\) According to \(1.8.1\small(4)\), since \(\mbox{rank}(A_i)=2\), we get \(r\geq 2\). For the upper bound, split the \((i,j,k)\)-entry as \[ i+j+k=i\cdot 1\cdot 1+1\cdot j\cdot 1+1\cdot 1\cdot k \] so, writing the dual vectors by their coordinate rows \(a=(1,2,3)\) and \(e=(1,1,1)\), \[ M=a\otimes e\otimes e+e\otimes a\otimes e+e\otimes e\otimes a \] a sum of three rank-one maps. Hence \(r\leq 3\), and altogether \(2\leq r\leq 3\).
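A numerical check of both \(\small(1)\) and \(\small(3)\) (a NumPy sketch, storing \(M_{ijk}=i+j+k\) with 1-based values):

```python
import numpy as np

idx = np.arange(1, 4, dtype=float)
M = idx[:, None, None] + idx[None, :, None] + idx[None, None, :]   # M_ijk = i + j + k

# part (3): M = a⊗e⊗e + e⊗a⊗e + e⊗e⊗a with a = (1,2,3), e = (1,1,1)
a, e = idx, np.ones(3)
decomp = (np.einsum('i,j,k->ijk', a, e, e)
          + np.einsum('i,j,k->ijk', e, a, e)
          + np.einsum('i,j,k->ijk', e, e, a))
print(np.allclose(M, decomp))                 # True, so rank(M) <= 3

# part (1): M(v, v, v) should equal 3 (x+y+z)^2 (x+2y+3z)
v = np.array([0.3, -1.2, 0.5])
x, y, z = v
print(np.isclose(np.einsum('ijk,i,j,k->', M, v, v, v),
                 3*(x + y + z)**2*(x + 2*y + 3*z)))   # True
```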