Advanced Algebra part of Homework


Advanced Algebra HW2 (2022.3.16)

1.2.1 four subspaces

Prove or find counterexamples.

  1. For four subspaces, if any three of them are linearly independent, then the four subspaces are linearly independent.
  2. If the subspaces $V_1, V_2$ are linearly independent, $V_1, V_3, V_4$ are linearly independent, and $V_2, V_3, V_4$ are linearly independent, then all four subspaces are linearly independent.
  3. If $V_1, V_2$ are linearly independent, $V_3, V_4$ are linearly independent, and $V_1+V_2, V_3+V_4$ are linearly independent, then all four subspaces are linearly independent.

(1) False. Construct the four subspaces below in $\mathbb{R}^3$. Any three of them are linearly independent, but the four together are linearly dependent, since $\dim\mathbb{R}^3=3$:

$$V_1=\left\{\begin{bmatrix}k\\0\\0\end{bmatrix}\,\middle|\,k\in\mathbb{R}\right\},\quad V_2=\left\{\begin{bmatrix}0\\k\\0\end{bmatrix}\,\middle|\,k\in\mathbb{R}\right\},\quad V_3=\left\{\begin{bmatrix}0\\0\\k\end{bmatrix}\,\middle|\,k\in\mathbb{R}\right\},\quad V_4=\left\{\begin{bmatrix}k\\k\\k\end{bmatrix}\,\middle|\,k\in\mathbb{R}\right\}$$

(2) False. Construct the four subspaces below. One checks that each of the families $V_1,V_2$ and $V_1,V_3,V_4$ and $V_2,V_3,V_4$ is linearly independent:

$$V_1=\left\{\begin{bmatrix}k\\0\\0\end{bmatrix}\,\middle|\,k\in\mathbb{R}\right\},\quad V_2=\left\{\begin{bmatrix}0\\k\\0\end{bmatrix}\,\middle|\,k\in\mathbb{R}\right\},\quad V_3=\left\{\begin{bmatrix}k\\0\\k\end{bmatrix}\,\middle|\,k\in\mathbb{R}\right\},\quad V_4=\left\{\begin{bmatrix}0\\k\\k\end{bmatrix}\,\middle|\,k\in\mathbb{R}\right\}$$

However, picking one vector from each subspace gives a non-trivial linear combination equal to zero:

$$1\begin{bmatrix}1\\0\\0\end{bmatrix}+(-1)\begin{bmatrix}0\\1\\0\end{bmatrix}+(-1)\begin{bmatrix}1\\0\\1\end{bmatrix}+1\begin{bmatrix}0\\1\\1\end{bmatrix}=\boldsymbol{0}$$

(3) True. We argue by contradiction: suppose $v_1+v_2+v_3+v_4=0$ with $v_i\in V_i$ not all zero. We first claim $v_1+v_2\neq 0$: otherwise $v_3+v_4=0$ as well, and the independence of $V_1,V_2$ and of $V_3,V_4$ forces $v_1=v_2=0$ and $v_3=v_4=0$, a contradiction. So $v_1+v_2\neq 0$ and, symmetrically, $v_3+v_4\neq 0$.

But $v_1+v_2\in V_1+V_2$ and $v_3+v_4\in V_3+V_4$, and these two non-zero vectors add up to $0$, which is contrary to the independence of $V_1+V_2$ and $V_3+V_4$.

All in all, the four subspaces must be linearly independent.
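A quick numerical sanity check of (1), a sketch of my own (not part of the original solution): since each $V_i$ is a line, independence of the subspaces reduces to independence of their spanning vectors.

```python
import numpy as np
from itertools import combinations

# spanning vectors of the four lines in R^3 from counterexample (1)
spans = [np.array([1, 0, 0]), np.array([0, 1, 0]),
         np.array([0, 0, 1]), np.array([1, 1, 1])]

for trio in combinations(range(4), 3):
    M = np.column_stack([spans[i] for i in trio])
    assert np.linalg.matrix_rank(M) == 3     # any three lines are independent

print(np.linalg.matrix_rank(np.column_stack(spans)))  # 3 < 4: all four dependent
```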

1.2.2 decomposition of transpose

Let $V$ be the space of $n\times n$ real matrices. Let $T: V\to V$ be the transpose operation, i.e., $T$ sends $A$ to $A^T$ for each $A\in V$. Find a non-trivial $T$-invariant decomposition of $V$, and find the corresponding block form of $T$. (Here we use real matrices for your convenience, but the statement is totally fine for complex matrices and conjugate transpose.)


Set $S=\{A\mid A=A^T,\ A\in M_{n\times n}\}$, the symmetric matrices. This subspace is $T$-invariant, because after transposing any symmetric matrix, the matrix remains itself.

Set $S'=\{A\mid A=-A^T,\ A\in M_{n\times n}\}$, the antisymmetric matrices. If $A=-A^T$ then $A^T=-A=-(A^T)^T$, so any antisymmetric matrix's transpose is antisymmetric, and $S'$ is $T$-invariant too.

So decompose $V$ into $S$ and $S'$, with $\dim S=\dfrac{n(n+1)}{2}$ and $\dim S'=\dfrac{n(n-1)}{2}$.

Because any $n\times n$ matrix satisfies $B=\dfrac{B+B^T}{2}+\dfrac{B-B^T}{2}$, we have $V=S\oplus S'$, and the transpose $T$ decomposes into two blocks acting on $S$ and $S'$:

$$A\mapsto\begin{pmatrix}\dfrac{A+A^T}{2}&\\&\dfrac{A-A^T}{2}\end{pmatrix}$$

so the corresponding block form of $T$ is $\begin{pmatrix}I&0\\0&-I\end{pmatrix}$.
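A small sketch of my own illustrating this decomposition numerically:

```python
import numpy as np

n = 4
B = np.random.randn(n, n)
sym = (B + B.T) / 2          # component in S
skew = (B - B.T) / 2         # component in S'

assert np.allclose(sym + skew, B)     # V = S (+) S'
assert np.allclose(sym.T, sym)        # T acts as +I on S
assert np.allclose(skew.T, -skew)     # T acts as -I on S'
```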

1.2.3 ultimate subspaces

Let $p(x)$ be any polynomial, and define $p(A)$ in the obvious manner. E.g., if $p(x)=x^2+2x+3$, then $p(A)=A^2+2A+3I$. We fix some $n\times n$ matrix $A$.

  1. If $AB=BA$, show that $\operatorname{Ker}(B), \operatorname{Ran}(B)$ are both $A$-invariant subspaces.
  2. Prove that $Ap(A)=p(A)A$.
  3. Conclude that $N(A-\lambda I), R(A-\lambda I)$ are both $A$-invariant for any $\lambda\in\mathbb{C}$.

(1) For any vector $v\in\operatorname{Ker}(B)$ we have $Bv=0$, so $B(Av)=(BA)v=(AB)v=A(Bv)=0$, i.e., $Av\in\operatorname{Ker}(B)$.

And any vector $v\in\operatorname{Ran}(B)$ can be written as $v=Bx$, so $Av=A(Bx)=(AB)x=(BA)x=B(Ax)\in\operatorname{Ran}(B)$.

So $\operatorname{Ker}(B)$ and $\operatorname{Ran}(B)$ are both $A$-invariant subspaces.

(2) Write out the polynomial as $p(A)=\sum_{i=0}^{n}a_iA^i$ and calculate $Ap(A)$:

$$Ap(A)=A\sum_{i=0}^{n}a_iA^i=\sum_{i=0}^{n}a_iA^{i+1}=\left(\sum_{i=0}^{n}a_iA^i\right)A=p(A)A$$

(3) Since $A$ acts on a space of finite dimension $n$, the chains $\operatorname{Ker}(A-\lambda I)\subseteq\operatorname{Ker}(A-\lambda I)^2\subseteq\cdots$ and $\operatorname{Ran}(A-\lambda I)\supseteq\operatorname{Ran}(A-\lambda I)^2\supseteq\cdots$ stabilize, so $N(A-\lambda I)$ and $R(A-\lambda I)$ equal $\operatorname{Ker}(A-\lambda I)^k$ and $\operatorname{Ran}(A-\lambda I)^k$ for $k$ large enough. According to the conclusions of (1) and (2), since each $(A-\lambda I)^k$ is a polynomial in $A$,

$$A(A-\lambda I)=(A-\lambda I)A,\quad A(A-\lambda I)^2=(A-\lambda I)^2A,\quad\ldots,\quad A(A-\lambda I)^k=(A-\lambda I)^kA$$

so for every $k\in\mathbb{Z}_{>0}$, $\operatorname{Ker}(A-\lambda I)^k$ and $\operatorname{Ran}(A-\lambda I)^k$ are all $A$-invariant.

So $N(A-\lambda I), R(A-\lambda I)$ are both $A$-invariant for any $\lambda\in\mathbb{C}$.
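A one-off numerical check of (2), my own sketch, using the polynomial $p(x)=x^2+2x+3$ from the problem statement:

```python
import numpy as np

A = np.random.randn(4, 4)
pA = A @ A + 2 * A + 3 * np.eye(4)    # p(A) = A^2 + 2A + 3I
assert np.allclose(A @ pA, pA @ A)    # A commutes with any polynomial in A
```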

1.2.4 interchangeability and common eigenvector

Note that any linear map must have at least one eigenvector. (You may try to prove this yourself, but it is not part of this homework.) You may use this fact freely in this problem. Fix any two $n\times n$ square matrices $A, B$. Suppose $AB=BA$.

  1. If $W$ is an $A$-invariant subspace, show that $A$ has an eigenvector in $W$.
  2. Show that $\operatorname{Ker}(A-\lambda I)$ is always $B$-invariant for all $\lambda\in\mathbb{C}$. (Hint: Last problem.)
  3. Show that $A, B$ have a common eigenvector. (Hint: Last two sub-problems.)

(1) Restrict $A$ to a linear map from $W$ to $W$: $A|_W: W\to W,\ w\mapsto Aw$ (well-defined since $W$ is $A$-invariant). According to the fact that any linear map must have at least one eigenvector, $A|_W$ has an eigenvector in $W$, which is also an eigenvector of $A$.

(2) For any vector $v\in\operatorname{Ker}(A-\lambda I)$ we have $(A-\lambda I)v=0$, i.e., $Av=\lambda v$. Using $AB=BA$,

$$(A-\lambda I)(Bv)=(AB-\lambda B)v=(BA-\lambda B)v=B(A-\lambda I)v=0$$

So $\operatorname{Ker}(A-\lambda I)$ is $B$-invariant for all $\lambda\in\mathbb{C}$.

(3) Pick an eigenvalue $\lambda$ of $A$, so $W=\operatorname{Ker}(A-\lambda I)\neq\{0\}$. By (2), $W$ is $B$-invariant, so according to (1) applied to $B$, there exists an eigenvector $v\in W$ of $B$:

$$Bv=\mu v\quad\text{for some }\mu\in\mathbb{C}$$

And the vector $v$ also satisfies $Av=\lambda v$, since $v\in\operatorname{Ker}(A-\lambda I)$.

So if $A$ and $B$ commute, they must have a common eigenvector.
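To make (3) concrete, here is a small sketch on a hand-picked pair of commuting matrices (the example matrices are mine, not from the homework): restrict $B$ to an eigenspace of $A$ and lift an eigenvector of the restriction.

```python
import numpy as np

A = np.diag([1.0, 1.0, 2.0])
B = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.0, 0.0, 5.0]])
assert np.allclose(A @ B, B @ A)          # A and B commute

# Ker(A - 1*I) is spanned by e1, e2; B maps it to itself (part (2)).
N = np.eye(3)[:, :2]                      # columns: basis of the eigenspace
B_restr = N.T @ B @ N                     # B restricted to the eigenspace
w, U = np.linalg.eig(B_restr)
v = N @ U[:, 0]                           # lift an eigenvector of the restriction
print(np.allclose(A @ v, 1.0 * v), np.allclose(B @ v, w[0] * v))  # True True
```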

Advanced Algebra HW3 (2022.3.24)

1.3.1 Jordan normal form

Find a basis in the following vector space so that the linear map involved will be in Jordan normal form. Also find the Jordan normal form.

  1. $V=\mathbb{C}^2$ is a real vector space, and $A: V\to V$ that sends $\begin{bmatrix}x\\y\end{bmatrix}$ to $\begin{bmatrix}\bar{x}+\Re(y)\\(1+i)\Im(x)+y\end{bmatrix}$ is a real linear map. (Here $\bar{x}$ means the complex conjugate of a complex number $x$, and $\Re(x), \Im(x)$ mean the real part and the imaginary part of a complex number $x$.)
  2. $V=P_4$, the real vector space of all real polynomials of degree at most 4. And $A: V\to V$ is a linear map such that $A(p(x))=p'(x)+p(0)+p'(0)x^2$ for each polynomial $p\in P_4$.
  3. $A=\begin{pmatrix}&&&a_1\\&&a_2&\\&a_3&&\\a_4&&&\end{pmatrix}$. Be careful here. Maybe we have many possibilities for its Jordan normal form depending on the values of $a_1,a_2,a_3,a_4$.

(1) Under the basis $(\Re x,\Im x,\Re y,\Im y)$ the map has the matrix

$$A_1=\begin{pmatrix}1&0&1&0\\0&-1&0&0\\0&1&1&0\\0&1&0&1\end{pmatrix}=PJP^{-1},\qquad J=\begin{pmatrix}1&0&0&0\\0&1&1&0\\0&0&1&0\\0&0&0&-1\end{pmatrix},\quad P=\begin{pmatrix}0&1&0&-1\\0&0&0&-4\\0&0&1&2\\1&0&0&2\end{pmatrix}$$

The columns of $P$ are the desired basis, and $J=\operatorname{diag}(1, J_2(1), -1)$ is the Jordan normal form.

(2) On the basis $1,x,x^2,x^3,x^4$, the map acts as $A: a_0+a_1x+a_2x^2+a_3x^3+a_4x^4\mapsto(a_0+a_1)+2a_2x+(a_1+3a_3)x^2+4a_4x^3$, with matrix

$$M=\begin{pmatrix}1&1&0&0&0\\0&0&2&0&0\\0&1&0&3&0\\0&0&0&0&4\\0&0&0&0&0\end{pmatrix}=PJP^{-1},\qquad J=\begin{pmatrix}1&&&&\\&\sqrt2&&&\\&&-\sqrt2&&\\&&&0&1\\&&&&0\end{pmatrix}$$

The characteristic polynomial is $(1-\lambda)\lambda^2(\lambda^2-2)$, and $\lambda=0$ has geometric multiplicity $1$, which gives $J=\operatorname{diag}(1,\sqrt2,-\sqrt2,J_2(0))$. One choice of $P$ (columns: eigenvectors for $1,\sqrt2,-\sqrt2$, then a Jordan chain for $0$) is

$$P=\begin{pmatrix}1&\sqrt2+1&1-\sqrt2&3&3\\0&1&1&-3&0\\0&\frac{\sqrt2}{2}&-\frac{\sqrt2}{2}&0&-\frac32\\0&0&0&1&0\\0&0&0&0&\frac14\end{pmatrix}$$

(3) The anti-diagonal matrix pairs the coordinates $(e_1,e_4)$ and $(e_2,e_3)$: $e_1\mapsto a_4e_4$, $e_4\mapsto a_1e_1$, and similarly for the other pair. So $J=\begin{pmatrix}J_{1,4}&O\\O&J_{2,3}\end{pmatrix}$, where

$$J_{i,j}=\begin{cases}\begin{pmatrix}\sqrt{a_ia_j}&0\\0&-\sqrt{a_ia_j}\end{pmatrix}&a_i\neq0,\ a_j\neq0\\[2ex]\begin{pmatrix}0&1\\0&0\end{pmatrix}&\text{exactly one of }a_i, a_j\text{ is }0\\[2ex]\begin{pmatrix}0&0\\0&0\end{pmatrix}&a_i=a_j=0\end{cases}$$
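If you want to double-check the Jordan forms claimed in (1) and (2), sympy can compute them directly; a sketch of mine, assuming the matrices as written above:

```python
import sympy as sp

A1 = sp.Matrix([[1, 0, 1, 0],
                [0, -1, 0, 0],
                [0, 1, 1, 0],
                [0, 1, 0, 1]])
P1, J1 = A1.jordan_form()      # returns (P, J) with A1 = P*J*P**-1
print(J1)                      # diag(-1, 1, J_2(1)) up to block order

A2 = sp.Matrix([[1, 1, 0, 0, 0],
                [0, 0, 2, 0, 0],
                [0, 1, 0, 3, 0],
                [0, 0, 0, 0, 4],
                [0, 0, 0, 0, 0]])
P2, J2 = A2.jordan_form()
print(J2)                      # diag(J_2(0), 1, sqrt(2), -sqrt(2)) up to order
```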

1.3.2 partitions of integers

A partition of integer $n$ is a way to write $n$ as a sum of other positive integers, say $5=2+2+1$. If you always order the summands from large to small, you end up with a dot diagram, where each column represents a summand: for $5=2+2+1$ the columns have $2,2,1$ dots. Similarly, $7=4+2+1$ is represented by a dot diagram with columns of $4,2,1$ dots. (Dot diagrams omitted here.)

  1. If the Jordan normal form of an $n\times n$ nilpotent matrix $A$ is $\operatorname{diag}(J_{a_1},J_{a_2},\ldots,J_{a_k})$, then we have a partition of integer $n=a_1+\cdots+a_k$. However, we also have a partition of integer $n=[\dim\operatorname{Ker}(A)]+[\dim\operatorname{Ker}(A^2)-\dim\operatorname{Ker}(A)]+[\dim\operatorname{Ker}(A^3)-\dim\operatorname{Ker}(A^2)]+\cdots$ where we treat the content of each bracket as a positive integer. Can you find a relation between the two dot diagrams?

  2. A partition of integer $n=a_1+\cdots+a_k$ is called self-conjugate if, for the matrix $A=\operatorname{diag}(J_{a_1},J_{a_2},\ldots,J_{a_k})$, the two dot diagrams you obtained above are the same. Show that, for a fixed integer $n$, the number of self-conjugate partitions of $n$ is equal to the number of partitions of $n$ into distinct odd positive integers. (Hint: For a self-conjugate dot diagram, count the total number of dots that are either in the first column or in the first row or in both. Is this always odd?)

  3. Suppose a 4 by 4 matrix $A$ is nilpotent and upper triangular, and all $(i,j)$ entries for $i<j$ are chosen randomly and uniformly in the interval $[-1,1]$. What are the probabilities that its Jordan canonical form corresponds to the partitions $4=4$, $4=3+1$, $4=2+2$, $4=2+1+1$, $4=1+1+1+1$?


(1) The numbers in the brackets form a non-increasing sequence, so they also give a dot diagram: put $[\dim\operatorname{Ker}(A)]$ dots in the first column, $[\dim\operatorname{Ker}(A^2)-\dim\operatorname{Ker}(A)]$ dots in the second column, and so on. According to the 'killing chain' structure of the Jordan blocks, the number of dots in each row of this diagram is exactly $a_1,a_2,a_3,\ldots$ In other words, the two dot diagrams are conjugate: each is the transpose (rows exchanged with columns) of the other.

(2) A self-conjugate dot diagram is symmetric about its diagonal, like a set of nested 'flying wings' (hooks). If we loosely regard the diagram as a matrix $A$, the self-conjugacy condition says $A=A^T$.

Consider the outermost hook, i.e., the dots in the first row or the first column or both. If the first row has $m$ dots, then by symmetry so does the first column, and they share one corner dot, so the hook has $m+m-1=2m-1$ dots, an odd number.

Peel off this hook; what remains is again a self-conjugate diagram, so the next outermost hook also contains an odd number of dots, strictly smaller than the previous one.

So every self-conjugate partition corresponds to a partition into distinct odd positive integers. And for each partition into distinct odd parts, we can fold each part into a hook and nest them, corner by corner, reconstructing a self-conjugate diagram; the two constructions are inverse to each other.

In conclusion, the two numbers are the same.
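A brute-force sketch of mine that counts both kinds of partitions for small $n$ and confirms they agree:

```python
def partitions(n, max_part=None):
    """Yield partitions of n as non-increasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for k in range(min(n, max_part), 0, -1):
        for rest in partitions(n - k, k):
            yield (k,) + rest

def conjugate(p):
    # column counts of the dot diagram = transposed partition
    return tuple(sum(1 for a in p if a > i) for i in range(p[0]))

for n in range(1, 13):
    sc = sum(1 for p in partitions(n) if p == conjugate(p))
    odd = sum(1 for p in partitions(n)
              if len(set(p)) == len(p) and all(a % 2 == 1 for a in p))
    assert sc == odd
print("verified for n = 1..12")
```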

(3) $A$ is nilpotent and its dimension is $4$, so $A^4=O$. Since $A$ is upper triangular, the diagonal entries satisfy $(A^4)_{(i,i)}=(A_{(i,i)})^4=0$, so

$$A=\begin{pmatrix}0&*&*&*\\0&0&*&*\\0&0&0&*\\0&0&0&0\end{pmatrix}$$

where each $*$ is chosen uniformly from $[-1,1]$. It is obvious that all the eigenvalues are $0$. Almost surely the entries just above the diagonal are non-zero, and then $\operatorname{rank}(A)=3$, $\operatorname{rank}(A^2)=2$, $\operatorname{rank}(A^3)=1$ (e.g. $(A^3)_{1,4}=A_{12}A_{23}A_{34}\neq0$ almost surely), so $\dim\operatorname{Ker}(A^k)=k$ for $k=1,2,3,4$. The bracket partition from (1) is therefore $1+1+1+1$, whose conjugate is $4$: the Jordan canonical form is a single block $J_4(0)$.

So the probability of the partition $4=4$ is $1$, and the other partitions have probability $0$.
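A Monte Carlo sketch of mine supporting this: sample random strictly upper triangular matrices and read off the kernel dimensions via ranks.

```python
import numpy as np

rng = np.random.default_rng(0)
for _ in range(1000):
    A = np.triu(rng.uniform(-1, 1, (4, 4)), k=1)   # zeros on and below diagonal
    ranks = [np.linalg.matrix_rank(np.linalg.matrix_power(A, k))
             for k in (1, 2, 3)]
    assert ranks == [3, 2, 1]   # dim Ker(A^k) = 1,2,3 => single block J_4(0)
print("all 1000 samples gave the partition 4 = 4")
```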

Advanced Algebra HW6 (2022.4.29)

1.6.1 Vandermonde matrix

Let $V$ be the space of real polynomials of degree less than $n$. So $\dim V=n$. Then for each $a\in\mathbb{R}$, the evaluation $\operatorname{ev}_a: p\mapsto p(a)$ is a dual vector.

For any real numbers $a_1,\ldots,a_n\in\mathbb{R}$, consider the map $L: V\to\mathbb{R}^n$ such that $L(p)=\begin{bmatrix}p(a_1)\\\vdots\\p(a_n)\end{bmatrix}$.

  1. Write out the matrix for $L$ under the basis $1,x,\ldots,x^{n-1}$ for $V$ and the standard basis for $\mathbb{R}^n$. (Do you know the name for this matrix?)
  2. Prove that $L$ is invertible if and only if $a_1,\ldots,a_n$ are distinct. (If you can name the matrix $L$, then you may use its determinant formula without proof.)
  3. Show that $\operatorname{ev}_{a_1},\ldots,\operatorname{ev}_{a_n}$ form a basis for $V^*$ if and only if all $a_1,\ldots,a_n$ are distinct.
  4. Set $n=3$. Find polynomials $p_1,p_2,p_3$ such that $p_i(j)=\delta_{ij}$ for $i,j\in\{-1,0,1\}$.
  5. Set $n=4$, and consider $\operatorname{ev}_{-2},\operatorname{ev}_{-1},\operatorname{ev}_0,\operatorname{ev}_1,\operatorname{ev}_2\in V^*$. Since $\dim V^*=4$, these must be linearly dependent. Find a non-trivial linear combination of these which is zero.

(1) $L=\begin{pmatrix}1&a_1&\cdots&a_1^{n-1}\\1&a_2&\cdots&a_2^{n-1}\\\vdots&\vdots&&\vdots\\1&a_n&\cdots&a_n^{n-1}\end{pmatrix}$, the Vandermonde matrix.

We can calculate its inverse using Lagrange interpolation: $L$ sends the coefficient vector $(x_0,\ldots,x_{n-1})^T$ of a polynomial to its values $(y_1,\ldots,y_n)^T$ at $a_1,\ldots,a_n$, so $L^{-1}$ sends values back to coefficients. The interpolating polynomial

$$f(a)=\sum_i y_i\prod_{j\neq i}\frac{a-a_j}{a_i-a_j}$$

satisfies $f(a_i)=y_i$. So $(L^{-1})_{ij}$ is the coefficient of $a^{i-1}$ in the $j$-th Lagrange basis polynomial $\ell_j(a)=\prod_{k\neq j}\dfrac{a-a_k}{a_j-a_k}$, explicitly

$$(L^{-1})_{ij}=(-1)^{n-i}\,\frac{e_{n-i}(a_1,\ldots,\widehat{a_j},\ldots,a_n)}{\prod_{k\neq j}(a_j-a_k)}$$

where $e_m$ is the elementary symmetric polynomial of degree $m$ and $\widehat{a_j}$ means $a_j$ is omitted. The columns of $L^{-1}$ are the coefficient vectors of $\ell_1,\ldots,\ell_n$, the basis of $V$ dual to $\operatorname{ev}_{a_1},\ldots,\operatorname{ev}_{a_n}$.
(2) $\det L=\prod_{1\le i<j\le n}(a_j-a_i)$, which is non-zero if and only if all the $a_i$ are distinct.

(3) $\operatorname{ev}_{a_1},\ldots,\operatorname{ev}_{a_n}$ form a basis for $V^*$ $\Longleftrightarrow$ $\operatorname{ev}_{a_1},\ldots,\operatorname{ev}_{a_n}$ are linearly independent (there are $n=\dim V^*$ of them) $\Longleftrightarrow$ the matrix $\begin{pmatrix}\operatorname{ev}_{a_1}\\\operatorname{ev}_{a_2}\\\vdots\\\operatorname{ev}_{a_n}\end{pmatrix}$, whose rows are the coordinate rows of the $\operatorname{ev}_{a_i}$, is invertible. But this matrix is exactly $L$, so according to (2), this holds if and only if the $a_i$ are distinct.

(4) Pick the original basis $\{1,x,x^2\}$, so the evaluations at $-1,0,1$ have coordinate rows $\alpha_1=(1,-1,1)$, $\alpha_2=(1,0,0)$, $\alpha_3=(1,1,1)$, and

$$A=\begin{pmatrix}\alpha_1\\\alpha_2\\\alpha_3\end{pmatrix}=\begin{pmatrix}1&-1&1\\1&0&0\\1&1&1\end{pmatrix},\qquad A^{-1}=\begin{pmatrix}0&1&0\\-\frac12&0&\frac12\\\frac12&-1&\frac12\end{pmatrix}$$

Reading off the columns of $A^{-1}$: $p_1=-\dfrac12x+\dfrac12x^2$, $p_2=1-x^2$, $p_3=\dfrac12x+\dfrac12x^2$.

(5) Set $p(x)=ax^3+bx^2+cx+d$ and suppose $m\operatorname{ev}_{-2}+n\operatorname{ev}_{-1}+p\operatorname{ev}_0+q\operatorname{ev}_1+r\operatorname{ev}_2=0$. On $p(x)$ the evaluations give

$$\operatorname{ev}_{-2}(p)=-8a+4b-2c+d,\quad\operatorname{ev}_{-1}(p)=-a+b-c+d,\quad\operatorname{ev}_0(p)=d,\quad\operatorname{ev}_1(p)=a+b+c+d,\quad\operatorname{ev}_2(p)=8a+4b+2c+d$$

so, collecting the coefficients of $a,b,c,d$,

$$\begin{pmatrix}-8&-1&0&1&8\\4&1&0&1&4\\-2&-1&0&1&2\\1&1&1&1&1\end{pmatrix}\begin{pmatrix}m\\n\\p\\q\\r\end{pmatrix}=\boldsymbol{0}$$

Solving, $\operatorname{ev}_{-2}-4\operatorname{ev}_{-1}+6\operatorname{ev}_0-4\operatorname{ev}_1+\operatorname{ev}_2=0$.
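A quick check of mine that these coefficients (the 4th-order finite-difference weights) kill every cubic:

```python
import numpy as np

rng = np.random.default_rng(1)
p = np.poly1d(rng.standard_normal(4))        # a random cubic
w = np.array([1, -4, 6, -4, 1])              # ev_{-2} - 4ev_{-1} + 6ev_0 - 4ev_1 + ev_2
pts = np.array([-2, -1, 0, 1, 2])
print(np.isclose(w @ p(pts), 0.0))           # True for every cubic
```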

1.6.2 dual vector in polynomials

Let $V$ be the space of real polynomials of degree less than $3$. Which of the following is a dual vector? Prove it or show why not.

  1. $p\mapsto\operatorname{ev}_5((x+1)p(x))$.
  2. $p\mapsto\lim\limits_{x\to\infty}\dfrac{p(x)}{x}$.
  3. $p\mapsto\lim\limits_{x\to\infty}\dfrac{p(x)}{x^2}$.
  4. $p\mapsto p(3)p'(4)$.
  5. $p\mapsto\deg(p)$, the degree of the polynomial $p$.

(1) Yes. $L(p)=\operatorname{ev}_5((x+1)p(x))=6p(5)$, which is a map from $V$ to $\mathbb{R}$. For all $p,q\in V$ and all $m,n\in\mathbb{R}$, $L(mp+nq)=6mp(5)+6nq(5)=mL(p)+nL(q)$, so it is linear.

(2) No. The limit doesn't exist when $\deg(p)\geq2$.

(3) Yes. This dual vector just takes the coefficient of $x^2$, which is a linear operation.

(4) No. For instance, take $p(x)=x^2+2$ and $q(x)=-2x$. Then $L(p)=11\times8=88$ and $L(q)=12$, but $L(p+q)=5\times6=30\neq L(p)+L(q)$. So it's not a dual vector.

(5) No. For instance, take $p(x)=x^2+x$ and $q(x)=-x^2+1$. Then $L(p)=\deg(p)=2$ and $L(q)=\deg(q)=2$, but $L(p+q)=\deg(x+1)=1\neq L(p)+L(q)$. So it's not a dual vector.

1.6.3 directional derivative

Fix a differentiable function $f:\mathbb{R}^2\to\mathbb{R}$, and fix a point $\boldsymbol{p}\in\mathbb{R}^2$. For any vector $\boldsymbol{v}\in\mathbb{R}^2$, the directional derivative of $f$ at $\boldsymbol{p}$ in the direction of $\boldsymbol{v}$ is defined as $\nabla_{\boldsymbol{v}}f:=\lim\limits_{t\to0}\dfrac{f(\boldsymbol{p}+t\boldsymbol{v})-f(\boldsymbol{p})}{t}$. Show that the map $\nabla f:\boldsymbol{v}\mapsto\nabla_{\boldsymbol{v}}f$ is a dual vector in $(\mathbb{R}^2)^*$, i.e., a row vector. Also, what are its "coordinates" under the standard dual basis? (Remark: In calculus, we write $\nabla f$ as a column vector for historical reasons. By all means, from a mathematical perspective, the correct way to write $\nabla f$ is to write it as a row vector, as illustrated in this problem. But don't annoy your calculus teachers though.... In your calculus class, you use whatever notation your calculus teacher told you.) (Extra Remark: If we use a row vector, then the evaluation of $\nabla f$ at $\boldsymbol{v}$ is purely linear, and no inner product structure is needed, which is sweet. But if we HAVE TO write $\nabla f$ as a column vector (for historical reasons), then we would have to do a dot product between $\nabla f$ and $\boldsymbol{v}$, which now requires an inner product structure. That is an unnecessary dependence on an extra structure that actually should have no influence.)


Set $\boldsymbol{v}=\begin{pmatrix}a\\b\end{pmatrix}$. Since the function $f$ is differentiable at $\boldsymbol{p}$,

$$\nabla_{\boldsymbol{v}}f=\lim_{t\to0}\frac{f(\boldsymbol{p}+t\boldsymbol{v})-f(\boldsymbol{p})}{t}=\frac{\partial f}{\partial x}(\boldsymbol{p})\,a+\frac{\partial f}{\partial y}(\boldsymbol{p})\,b$$

So $\nabla f$ is a map from $\mathbb{R}^2$ to $\mathbb{R}$, and for all $\boldsymbol{u},\boldsymbol{v}\in\mathbb{R}^2$ and $m,n\in\mathbb{R}$ we have

$$\nabla_{m\boldsymbol{u}+n\boldsymbol{v}}f=\frac{\partial f}{\partial x}(\boldsymbol{p})(mu_1+nv_1)+\frac{\partial f}{\partial y}(\boldsymbol{p})(mu_2+nv_2)=m\nabla_{\boldsymbol{u}}f+n\nabla_{\boldsymbol{v}}f$$

So the map $\nabla f$ is linear in $\boldsymbol{v}$, i.e., it is a dual vector in $(\mathbb{R}^2)^*$.

Evaluating on the standard basis gives $\nabla_{\boldsymbol{e}_1}f=\dfrac{\partial f}{\partial x}(\boldsymbol{p})$ and $\nabla_{\boldsymbol{e}_2}f=\dfrac{\partial f}{\partial y}(\boldsymbol{p})$, so the coordinates under the standard dual basis are $\begin{bmatrix}\dfrac{\partial f}{\partial x}(\boldsymbol{p})&\dfrac{\partial f}{\partial y}(\boldsymbol{p})\end{bmatrix}$.
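A finite-difference sketch of mine ($f$ and $\boldsymbol{p}$ are arbitrary sample choices) showing that $\boldsymbol{v}\mapsto\nabla_{\boldsymbol{v}}f$ acts as a single row vector:

```python
import numpy as np

f = lambda x, y: np.sin(x) * y + x**2
p = np.array([0.3, -1.2])

def directional(v, t=1e-6):
    # the definition: lim (f(p + t v) - f(p)) / t
    return (f(*(p + t * v)) - f(*p)) / t

row = np.array([np.cos(p[0]) * p[1] + 2 * p[0],   # df/dx at p
                np.sin(p[0])])                    # df/dy at p
for v in (np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([2.0, -3.0])):
    assert np.isclose(directional(v), row @ v, atol=1e-4)  # the row vector acts on v
print("grad f acts as the row vector", row)
```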

1.6.4 

Consider a linear map $L: V\to W$ and its dual map $L^*: W^*\to V^*$. Prove the following.

  1. $\operatorname{Ker}(L^*)$ is exactly the collection of dual vectors in $W^*$ that kill $\operatorname{Ran}(L)$.
  2. $\operatorname{Ran}(L^*)$ is exactly the collection of dual vectors in $V^*$ that kill $\operatorname{Ker}(L)$.

(1) For $\varphi\in W^*$ we have a chain of equivalences: $\varphi\in\operatorname{Ker}(L^*)$ $\Longleftrightarrow$ $L^*\varphi=\varphi\circ L=0$ $\Longleftrightarrow$ $\varphi(L\vec{x})=0$ for all $\vec{x}\in V$ $\Longleftrightarrow$ $\varphi(\vec{\omega})=0$ for all $\vec{\omega}\in\operatorname{Ran}(L)$. So $\operatorname{Ker}(L^*)$ is exactly the set of dual vectors in $W^*$ that kill $\operatorname{Ran}(L)$.

(2) First, every element of $\operatorname{Ran}(L^*)$ kills $\operatorname{Ker}(L)$: if $\psi=L^*\varphi=\varphi\circ L$ and $\vec{\omega}\in\operatorname{Ker}(L)$, then $\psi(\vec{\omega})=\varphi(L\vec{\omega})=\varphi(0)=0$.

Conversely, the two spaces have the same dimension: $\dim\operatorname{Ran}(L^*)=\operatorname{rank}(L^*)=\operatorname{rank}(L)=\dim V-\dim\operatorname{Ker}(L)$, which is exactly the dimension of the annihilator of $\operatorname{Ker}(L)$ in $V^*$. So the inclusion is an equality, and $\operatorname{Ran}(L^*)$ is exactly the set of dual vectors in $V^*$ that kill $\operatorname{Ker}(L)$.
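Concretely, fixing bases so that $L$ is a matrix and $L^*$ its transpose, here is a random-matrix sketch of mine of both statements (using scipy for null spaces):

```python
import numpy as np
from scipy.linalg import null_space

L = np.random.randn(4, 2) @ np.random.randn(2, 5)   # a rank-2 map V=R^5 -> W=R^4

phi = null_space(L.T)                    # columns span Ker(L*), dual vectors on W
print(np.allclose(phi.T @ L, 0))         # True: they kill Ran(L) = col(L)

ker = null_space(L)                      # columns span Ker(L)
psi = (L.T @ np.random.randn(4, 3)).T    # random elements of Ran(L*), as rows
print(np.allclose(psi @ ker, 0))         # True: Ran(L*) kills Ker(L)
```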

Advanced Algebra HW7 (2022.5.3)

1.7.1 bra map and Riesz map

On the space $\mathbb{R}^n$, we fix a symmetric positive-definite matrix $A$, and define $(\boldsymbol{v},\boldsymbol{w})=\boldsymbol{v}^TA\boldsymbol{w}$.

  1. Show that this is an inner product.
  2. The Riesz map (inverse of the bra map) from $V^*$ to $V$ would send a row vector $\boldsymbol{v}^T$ to what?
  3. The bra map from $V$ to $V^*$ would send a vector $\boldsymbol{v}$ to what?
  4. The dual of the Riesz map from $V^*$ to $V$ would send a row vector $\boldsymbol{v}^T$ to what?

(1) Easy to prove: bilinearity is clear, $(\boldsymbol{v},\boldsymbol{w})=(\boldsymbol{w},\boldsymbol{v})$ since $A=A^T$, and $(\boldsymbol{v},\boldsymbol{v})>0$ for $\boldsymbol{v}\neq\boldsymbol{0}$ since $A$ is positive-definite.

(2) The bra map sends $\boldsymbol{v}$ to $\langle\boldsymbol{v}|:\boldsymbol{\omega}\mapsto(\boldsymbol{v},\boldsymbol{\omega})=\boldsymbol{v}^TA\boldsymbol{\omega}$, i.e., to the row vector $\boldsymbol{v}^TA$. The Riesz map is its inverse, from $V^*$ to $V$: given a row vector $\boldsymbol{u}^T\in V^*$, we need $\boldsymbol{w}$ with $\boldsymbol{w}^TA=\boldsymbol{u}^T$; transposing, $A\boldsymbol{w}=\boldsymbol{u}$ (as $A$ is symmetric), so $\boldsymbol{w}=A^{-1}\boldsymbol{u}$.

So the Riesz map sends the row vector $\boldsymbol{v}^T\in V^*$ to $A^{-1}\boldsymbol{v}\in V$.

(3) Obviously, it sends $\boldsymbol{v}$ to the row vector $\boldsymbol{v}^TA$.

(4) The result is the same as in (2): it sends $\boldsymbol{v}^T$ to $A^{-1}\boldsymbol{v}$. Indeed, identifying $V^{**}\cong V$, the dual of the Riesz map is again a map from $V^*$ to $V$, and since $A^{-1}$ is symmetric, the Riesz map is self-dual.
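A numerical sketch of mine for (2) and (3): build a random symmetric positive-definite $A$, apply the bra map, then undo it with the Riesz map.

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((3, 3))
A = M @ M.T + 3 * np.eye(3)            # a symmetric positive-definite A

v = rng.standard_normal(3)
bra_v = v @ A                          # row vector <v| = v^T A
riesz = np.linalg.inv(A) @ bra_v       # Riesz map applied to that row vector
print(np.allclose(riesz, v))           # True: the Riesz map inverts the bra map
```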

1.7.2 What is a derivative

The discussions in this problem hold for all manifolds $M$. But for simplicity's sake, suppose $M=\mathbb{R}^3$ for this problem.

Let $V$ be the space of all analytic functions from $M$ to $\mathbb{R}$. Here analytic means $f(x,y,z)$ is an infinite polynomial series (its Taylor expansion) with variables $x,y,z$. Approximately $f(x,y,z)=a_0+a_1x+a_2y+a_3z+a_4x^2+a_5xy+a_6xz+a_7y^2+\ldots$, and things should converge always.

Then a dual vector $v\in V^*$ is said to be a "derivation at $\boldsymbol{p}\in M$" if it satisfies the following Leibniz rule (or product rule): $v(fg)=f(\boldsymbol{p})v(g)+g(\boldsymbol{p})v(f)$. (Note the similarity with your traditional product rule $(fg)'(x)=f(x)g'(x)+g(x)f'(x)$.) Prove the following:

  1. Constant functions in $V$ must be sent to zero by all derivations at any point.
  2. Let $x,y,z\in V$ be the coordinate functions. Suppose $\boldsymbol{p}=\begin{bmatrix}p_1\\p_2\\p_3\end{bmatrix}$; then for any derivation $v$ at $\boldsymbol{p}$, we have $v((x-p_1)f)=f(\boldsymbol{p})v(x)$, $v((y-p_2)f)=f(\boldsymbol{p})v(y)$ and $v((z-p_3)f)=f(\boldsymbol{p})v(z)$.
  3. With the same notation, for any derivation $v$ at $\boldsymbol{p}$, we have $v((x-p_1)^a(y-p_2)^b(z-p_3)^c)=0$ for any non-negative integers $a,b,c$ such that $a+b+c>1$.
  4. With the same notation, for any derivation $v$ at $\boldsymbol{p}$, $v(f)=\dfrac{\partial f}{\partial x}(\boldsymbol{p})v(x)+\dfrac{\partial f}{\partial y}(\boldsymbol{p})v(y)+\dfrac{\partial f}{\partial z}(\boldsymbol{p})v(z)$. (Hint: use the Taylor expansion of $f$ at $\boldsymbol{p}$.)
  5. Any derivation $v$ at $\boldsymbol{p}$ must be exactly the directional derivative operator $\nabla_{\boldsymbol{v}}$ where $\boldsymbol{v}=\begin{bmatrix}v(x)\\v(y)\\v(z)\end{bmatrix}$. (Remark: So, algebraically speaking, tangent vectors are exactly derivations, i.e., things that satisfy the Leibniz rule.)

(1) Let $f\equiv a$ be a constant function. Taking $g=f$ in the Leibniz rule, $v(f^2)=2a\,v(f)$; but $f^2\equiv af$ as functions, so by linearity $v(f^2)=a\,v(f)$. Hence $a\,v(f)=0$. If $a\neq0$, this gives $v(f)=0$; if $a=0$, then $f\equiv0$ and $v(f)=0$ by linearity. So constant functions in $V$ are all sent to $0$.

(2) Calculate $v((x-p_1)f)=(x-p_1)(\boldsymbol{p})\,v(f)+f(\boldsymbol{p})\,v(x-p_1)$. Now $(x-p_1)(\boldsymbol{p})=x(\boldsymbol{p})-p_1=p_1-p_1=0$, and $v(p_1)=0$ by (1) since $p_1$ is a constant. So $v((x-p_1)f)=f(\boldsymbol{p})(v(x)-v(p_1))=f(\boldsymbol{p})v(x)$. The other two identities are proved the same way.

(3) As $a,b,c$ are non-negative integers with $a+b+c>1$, at least one of them is $\geq1$; without loss of generality, assume $a\geq1$. Apply the Leibniz rule to the factors $x-p_1$ and $h=(x-p_1)^{a-1}(y-p_2)^b(z-p_3)^c$:

$$v\left((x-p_1)^a(y-p_2)^b(z-p_3)^c\right)=(x-p_1)(\boldsymbol{p})\,v(h)+h(\boldsymbol{p})\,v(x-p_1)=h(\boldsymbol{p})\,v(x-p_1)$$

If $a>1$, then $h(\boldsymbol{p})=0$ because of the factor $(x-p_1)^{a-1}$. If $a=1$, then $b+c>0$, so at least one of $b$ and $c$ is $\geq1$; without loss of generality $b\geq1$, and then $h(\boldsymbol{p})=0$ because of the factor $(y-p_2)^b$. In both cases,

$$v\left((x-p_1)^a(y-p_2)^b(z-p_3)^c\right)=0$$

(4) Use the Taylor expansion of $f$ at $\boldsymbol{p}$:

$$f(x,y,z)=f(p_1,p_2,p_3)+\frac{\partial f}{\partial x}(\boldsymbol{p})(x-p_1)+\frac{\partial f}{\partial y}(\boldsymbol{p})(y-p_2)+\frac{\partial f}{\partial z}(\boldsymbol{p})(z-p_3)+R$$

where the remainder $R$ is a series of terms $(x-p_1)^a(y-p_2)^b(z-p_3)^c$ with $a+b+c>1$. By (3), $v(R)=0$; by (1), $v$ sends the constant term to $0$; and by linearity and (1), $v$ sends each linear term to its coefficient times $v(x)$, $v(y)$ or $v(z)$. Applying $v$ to the expansion,

$$v(f)=\frac{\partial f}{\partial x}(\boldsymbol{p})v(x)+\frac{\partial f}{\partial y}(\boldsymbol{p})v(y)+\frac{\partial f}{\partial z}(\boldsymbol{p})v(z)$$

which is exactly the total differential at $\boldsymbol{p}$.

(5) Just calculate the directional derivative $\nabla_{\boldsymbol{v}}$ where $\boldsymbol{v}=\begin{bmatrix}v(x)\\v(y)\\v(z)\end{bmatrix}$:

$$\begin{aligned}\nabla_{\boldsymbol{v}}f&=\lim_{t\to0}\frac{f(\boldsymbol{p}+t\boldsymbol{v})-f(\boldsymbol{p})}{t}\\&=\lim_{t\to0}\frac{f(\boldsymbol{p})+\dfrac{\partial f}{\partial x}(\boldsymbol{p})v(x)t+\dfrac{\partial f}{\partial y}(\boldsymbol{p})v(y)t+\dfrac{\partial f}{\partial z}(\boldsymbol{p})v(z)t+o(\|\boldsymbol{v}\|t)-f(\boldsymbol{p})}{t}\\&=\frac{\partial f}{\partial x}(\boldsymbol{p})v(x)+\frac{\partial f}{\partial y}(\boldsymbol{p})v(y)+\frac{\partial f}{\partial z}(\boldsymbol{p})v(z)\\&=v(f)\end{aligned}$$

where the last equality is (4). So a derivation at $\boldsymbol{p}$ is exactly a directional derivative operator: tangent vectors are the dual vectors satisfying the Leibniz rule.
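A symbolic sketch of mine (the sample $f,g$ and point are arbitrary) that the formula in (4) really obeys the Leibniz rule:

```python
import sympy as sp

x, y, z, a, b, c = sp.symbols('x y z a b c')
p = {x: 1, y: 2, z: -1}                       # a sample point p
# v(h) := a h_x(p) + b h_y(p) + c h_z(p), with (a, b, c) = (v(x), v(y), v(z))
v = lambda h: (a * sp.diff(h, x) + b * sp.diff(h, y)
               + c * sp.diff(h, z)).subs(p)

f = x**2 * y + sp.exp(z)
g = sp.sin(x) + y * z
lhs = v(f * g)
rhs = f.subs(p) * v(g) + g.subs(p) * v(f)     # Leibniz rule at p
print(sp.simplify(lhs - rhs) == 0)            # True
```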

1.7.3 What is a vector field

The discussions in this problem hold for all manifolds $M$. But for simplicity's sake, suppose $M=\mathbb{R}^3$ for this problem. Let $V$ be the space of all analytic functions from $M$ to $\mathbb{R}$ as usual. We say $X: V\to V$ is a vector field on $M$ if $X(fg)=fX(g)+gX(f)$, i.e., the Leibniz rule again! Prove the following:

  1. Show that $X_{\boldsymbol{p}}: V\to\mathbb{R}$ such that $X_{\boldsymbol{p}}(f)=(X(f))(\boldsymbol{p})$ is a derivation at $\boldsymbol{p}$. (Hence $X$ is indeed a vector field, since it is the same as picking a tangent vector at each point.)

  2. Note that each $f$ on $M$ induces a covector field $\mathrm{d}f$. Then at each point $\boldsymbol{p}$, the cotangent vector $\mathrm{d}f$ and the tangent vector $X$ would evaluate to some number. So $\mathrm{d}f(X)$ is a function $M\to\mathbb{R}$. Show that $\mathrm{d}f(X)=X(f)$, i.e., the two are the same. (Hint: just use definitions and calculate directly.)

  3. If $X,Y: V\to V$ are vector fields, then note that $X\circ Y: V\to V$ might not be a vector field. (Leibniz rule might fail.) However, show that $X\circ Y-Y\circ X$ is always a vector field.

  4. On a related note, show that if $A,B$ are skew-symmetric matrices, then $AB-BA$ is still skew-symmetric. (Skew-symmetric matrices actually correspond to certain vector fields on the manifold of orthogonal matrices. So this is no coincidence.)


(1) $X_{\boldsymbol{p}}(fg)=(X(fg))(\boldsymbol{p})=(fX(g)+gX(f))(\boldsymbol{p})=f(\boldsymbol{p})X(g)(\boldsymbol{p})+g(\boldsymbol{p})X(f)(\boldsymbol{p})=f(\boldsymbol{p})X_{\boldsymbol{p}}(g)+g(\boldsymbol{p})X_{\boldsymbol{p}}(f)$

So it satisfies the definition of a derivation at $\boldsymbol{p}$.

(2) $X_{\boldsymbol{p}}: V\to\mathbb{R}$, so $X_{\boldsymbol{p}}\in V^*$; as a tangent vector it has the coordinate row $\begin{bmatrix}X_{\boldsymbol{p}}(x)&X_{\boldsymbol{p}}(y)&X_{\boldsymbol{p}}(z)\end{bmatrix}$, while $\mathrm{d}f$ at $\boldsymbol{p}$ has the coordinates $\begin{bmatrix}\dfrac{\partial f}{\partial x}(\boldsymbol{p})\\[1ex]\dfrac{\partial f}{\partial y}(\boldsymbol{p})\\[1ex]\dfrac{\partial f}{\partial z}(\boldsymbol{p})\end{bmatrix}$, and by definition $\mathrm{d}f(X)(\boldsymbol{p})=\mathrm{d}f_{\boldsymbol{p}}(X_{\boldsymbol{p}})$. Combining them, by 1.7.2(4),

$$(X(f))(\boldsymbol{p})=X_{\boldsymbol{p}}(f)=\frac{\partial f}{\partial x}(\boldsymbol{p})X_{\boldsymbol{p}}(x)+\frac{\partial f}{\partial y}(\boldsymbol{p})X_{\boldsymbol{p}}(y)+\frac{\partial f}{\partial z}(\boldsymbol{p})X_{\boldsymbol{p}}(z)=\mathrm{d}f_{\boldsymbol{p}}(X_{\boldsymbol{p}})=\mathrm{d}f(X)(\boldsymbol{p})$$

for every $\boldsymbol{p}$, so $\mathrm{d}f(X)=X(f)$.

(3) Just unfold the formula:

$$\begin{gathered}(X\circ Y-Y\circ X)(fg)=X(Y(fg))-Y(X(fg))=X(fY(g)+gY(f))-Y(fX(g)+gX(f))\\=f(X\circ Y)(g)+X(f)Y(g)+g(X\circ Y)(f)+X(g)Y(f)-\big(f(Y\circ X)(g)+Y(f)X(g)+g(Y\circ X)(f)+Y(g)X(f)\big)\\=f(X\circ Y-Y\circ X)(g)+g(X\circ Y-Y\circ X)(f)\end{gathered}$$

The four cross terms like $X(f)Y(g)$ (exactly the terms that break the Leibniz rule for $X\circ Y$ alone) cancel in pairs. So $X\circ Y-Y\circ X$ satisfies the Leibniz rule, which makes it a vector field.

(4) Calculating directly, we can prove $(AB-BA)^T=-(AB-BA)$:

$$(AB-BA)^T=(AB)^T-(BA)^T=B^TA^T-A^TB^T=(-B)(-A)-(-A)(-B)=BA-AB=-(AB-BA)$$

So $AB-BA$ is still skew-symmetric. For the connection with vector fields: on the manifold of orthogonal matrices with $\det Q=1$, a tangent direction $A$ at the identity satisfies $(I+A)^T(I+A)=I$ with $\|A\|\to0$, so to first order $A=-A^T$. Tangent vectors at $I$ are exactly the skew-symmetric matrices, and by (3) the bracket $AB-BA$ of two such tangent fields is again one: no coincidence.
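A quick random check of mine for (4):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((4, 4))
Y = rng.standard_normal((4, 4))
A, B = X - X.T, Y - Y.T                  # two skew-symmetric matrices
C = A @ B - B @ A
print(np.allclose(C.T, -C))              # True: the commutator is skew-symmetric
```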

If your life is tense, it could be a tensor.

Advanced Algebra HW8 (2022.5.9)

1.8.1 Elementary layer operations for tensors

Note that, for "2D" matrices we have row and column operations, and the two kinds of operations correspond to the two dimensions of the array.

For simplicity, let $M$ be a $2\times2\times2$ "3D matrix". Then we have "row layer operations", "column layer operations", "horizontal layer operations". The three kinds correspond to the three dimensions of the array. We interpret this as a multilinear map $M:\mathbb{R}^2\times\mathbb{R}^2\times\mathbb{R}^2\to\mathbb{R}$. Let $((\mathbb{R}^2)^*)^{\otimes3}$ be the space of all multilinear maps from $\mathbb{R}^2\times\mathbb{R}^2\times\mathbb{R}^2$ to $\mathbb{R}$.

  1. Given $\alpha,\beta,\gamma\in(\mathbb{R}^2)^*$, what is the $(i,j,k)$-entry of the "3D matrix" $\alpha\otimes\beta\otimes\gamma$ in terms of the coordinates of $\alpha,\beta,\gamma$? Here $\alpha\otimes\beta\otimes\gamma$ is the multilinear map sending $(\boldsymbol{u},\boldsymbol{v},\boldsymbol{w})$ to the real number $\alpha(\boldsymbol{u})\beta(\boldsymbol{v})\gamma(\boldsymbol{w})$.
  2. Let $E$ be an elementary matrix. Then we can send $\alpha\otimes\beta\otimes\gamma$ to $(\alpha E)\otimes\beta\otimes\gamma$. Why can this be extended to a linear map $M_E:((\mathbb{R}^2)^*)^{\otimes3}\to((\mathbb{R}^2)^*)^{\otimes3}$? (This gives a formula for the "elementary layer operations" on "3D matrices", where the three kinds of layer operations correspond to applying $E$ to the three arguments respectively.)
  3. Show that elementary layer operations preserve rank. Here we say $M$ has rank $r$ if $r$ is the smallest possible integer such that $M$ can be written as the linear combination of $r$ "rank one" maps, i.e., maps of the kind $\alpha\otimes\beta\otimes\gamma$ for some $\alpha,\beta,\gamma\in(\mathbb{R}^2)^*$.
  4. Show that, if some "2D" layer matrix of a "3D matrix" has rank $r$, then the 3D matrix has rank at least $r$.
  5. Let $M$ be made of two layers, $\begin{bmatrix}1&0\\0&1\end{bmatrix}$ and $\begin{bmatrix}0&1\\1&0\end{bmatrix}$. Find its rank.
  6. (Read only) Despite some practical interests, finding the tensor rank in general is NOT easy. In fact, it is NP-complete just for 3-tensors over finite fields. Furthermore, a tensor with all real entries might have different real rank and complex rank.

(1) Identify $\alpha,\beta,\gamma$ with their coordinate (column) vectors. Since $\alpha^T\boldsymbol{u}$ is a scalar, $\alpha^T\boldsymbol{u}=\boldsymbol{u}^T\alpha$, so

$$(\alpha\otimes\beta\otimes\gamma)(\boldsymbol{u},\boldsymbol{v},\boldsymbol{w})=\alpha^T\boldsymbol{u}\,\beta^T\boldsymbol{v}\,\gamma^T\boldsymbol{w}=\boldsymbol{u}^T\begin{bmatrix}\gamma_1\alpha\beta^T\boldsymbol{v}&\gamma_2\alpha\beta^T\boldsymbol{v}\end{bmatrix}\boldsymbol{w}$$

Comparing with $\begin{bmatrix}\boldsymbol{u}^TA_1\boldsymbol{v}&\boldsymbol{u}^TA_2\boldsymbol{v}\end{bmatrix}\boldsymbol{w}$, the layers are

$$A_1=\gamma_1\begin{pmatrix}\alpha_1\beta_1&\alpha_1\beta_2\\\alpha_2\beta_1&\alpha_2\beta_2\end{pmatrix}=\gamma_1\alpha\beta^T,\qquad A_2=\gamma_2\alpha\beta^T$$

So the $(i,j,k)$-entry (layer $i$, row $j$, column $k$) is $A_{ijk}=\gamma_i\alpha_j\beta_k$.

(2) Because $(\alpha E)(\boldsymbol{u})=\alpha(E\boldsymbol{u})$, on rank-one maps the assignment agrees with the formula

$$(M_EM)(\boldsymbol{u},\boldsymbol{v},\boldsymbol{w}):=M(E\boldsymbol{u},\boldsymbol{v},\boldsymbol{w})$$

which makes sense for every $M\in((\mathbb{R}^2)^*)^{\otimes3}$, not just the rank-one ones, and $M_EM$ is again multilinear. Moreover $M_E$ is linear in $M$:

$$M_E(kM+\mu N)(\boldsymbol{u},\boldsymbol{v},\boldsymbol{w})=(kM+\mu N)(E\boldsymbol{u},\boldsymbol{v},\boldsymbol{w})=k(M_EM)(\boldsymbol{u},\boldsymbol{v},\boldsymbol{w})+\mu(M_EN)(\boldsymbol{u},\boldsymbol{v},\boldsymbol{w})$$

So $M_E$ extends to a linear map $((\mathbb{R}^2)^*)^{\otimes3}\to((\mathbb{R}^2)^*)^{\otimes3}$. The same works with $E$ applied in the $\beta$ or $\gamma$ slot, giving the three kinds of layer operations.

(3) Suppose $M=\displaystyle\sum_{i=1}^{r}\alpha_i\otimes\beta_i\otimes\gamma_i$ is a shortest decomposition, so $M$ has rank $r$. Applying an elementary layer operation,

$$M_EM=\sum_{i=1}^{r}(\alpha_iE)\otimes\beta_i\otimes\gamma_i$$

is again a sum of $r$ rank-one maps, so $r'=\operatorname{rank}(M_EM)\leq r$. And if $r'<r$, applying the inverse elementary operation $M_{E^{-1}}$ to a shortest decomposition of $M_EM$ would write $M$ as a sum of $r'<r$ rank-one maps, a contradiction. So $r'=r$.

(4) By (1), every layer of a rank-one tensor $\alpha\otimes\beta\otimes\gamma$ is a scalar multiple of the single rank-one matrix $\alpha\beta^T$ (layer $i$ is $\gamma_i\alpha\beta^T$). So if the 3D matrix is a sum of $r'$ rank-one tensors, each of its layers is a linear combination of $r'$ rank-one matrices, hence has matrix rank at most $r'$. If some layer has rank $r$, then $r\leq r'$: the 3D matrix has rank at least $r$.

(5) The layer $A_1=\begin{bmatrix}1&0\\0&1\end{bmatrix}$ has matrix rank $2$, so $r(M)\geq2$ by (4). Besides, construct two rank-one tensors

$$M_1=\left[\left[\tfrac12\begin{bmatrix}1&1\\1&1\end{bmatrix},\ \tfrac12\begin{bmatrix}1&1\\1&1\end{bmatrix}\right]\right],\qquad M_2=\left[\left[\tfrac12\begin{bmatrix}1&-1\\-1&1\end{bmatrix},\ -\tfrac12\begin{bmatrix}1&-1\\-1&1\end{bmatrix}\right]\right]$$

(in the notation of (1), $M_1$ uses $\alpha=\beta=\gamma=(1,1)^T$ and $M_2$ uses $\alpha=\beta=\gamma=(1,-1)^T$, each scaled by $\tfrac12$). Then $M=M_1+M_2$, so its rank is $2$.
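The decomposition can be checked mechanically; a sketch of mine where np.einsum builds the rank-one tensor $\gamma_i\alpha_j\beta_k$:

```python
import numpy as np

M = np.array([[[1.0, 0.0], [0.0, 1.0]],      # layer 1: identity
              [[0.0, 1.0], [1.0, 0.0]]])     # layer 2: the swap matrix

ones, alt = np.array([1.0, 1.0]), np.array([1.0, -1.0])
M1 = 0.5 * np.einsum('i,j,k->ijk', ones, ones, ones)
M2 = 0.5 * np.einsum('i,j,k->ijk', alt, alt, alt)
print(np.allclose(M1 + M2, M))               # True: tensor rank is 2
```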

1.8.2 i+j+k rank-3 tensor

Let $M$ be a $3\times3\times3$ "3D matrix" whose $(i,j,k)$-entry is $i+j+k$. We interpret this as a multilinear map $M:\mathbb{R}^3\times\mathbb{R}^3\times\mathbb{R}^3\to\mathbb{R}$.

  1. Let $\boldsymbol{v}=\begin{bmatrix}x\\y\\z\end{bmatrix}$; then $M(\boldsymbol{v},\boldsymbol{v},\boldsymbol{v})$ is a polynomial in $x,y,z$. What is this polynomial?
  2. Let $\sigma:\{1,2,3\}\to\{1,2,3\}$ be any bijection. Show that $M(\boldsymbol{v}_1,\boldsymbol{v}_2,\boldsymbol{v}_3)=M(\boldsymbol{v}_{\sigma(1)},\boldsymbol{v}_{\sigma(2)},\boldsymbol{v}_{\sigma(3)})$. (Hint: brute force works. But alternatively, try to find the $(i,j,k)$-entry of the multilinear map $M^{\sigma}$, the map that sends $(\boldsymbol{v}_1,\boldsymbol{v}_2,\boldsymbol{v}_3)$ to $M(\boldsymbol{v}_{\sigma(1)},\boldsymbol{v}_{\sigma(2)},\boldsymbol{v}_{\sigma(3)})$.)
  3. Show that the rank $r$ of $M$ is at least 2 and at most 3. (It is actually exactly three.)
  4. (Read only) Any study of polynomials of degree $d$ on $n$ variables is equivalent to the study of some symmetric $d$-tensor on $\mathbb{R}^n$.

(1) $M=[[A_1,A_2,A_3]]$ where

$$A_1=\begin{bmatrix}3&4&5\\4&5&6\\5&6&7\end{bmatrix},\quad A_2=\begin{bmatrix}4&5&6\\5&6&7\\6&7&8\end{bmatrix},\quad A_3=\begin{bmatrix}5&6&7\\6&7&8\\7&8&9\end{bmatrix}$$

and for $\boldsymbol{v}=\begin{pmatrix}x\\y\\z\end{pmatrix}$,

$$M(\boldsymbol{v},\boldsymbol{v},\boldsymbol{v})=\begin{bmatrix}\boldsymbol{v}^TA_1\boldsymbol{v}&\boldsymbol{v}^TA_2\boldsymbol{v}&\boldsymbol{v}^TA_3\boldsymbol{v}\end{bmatrix}\boldsymbol{v}$$

Calculating it by Mathematica, the result is $p(x,y,z)=3(x+y+z)^2(x+2y+3z)$.
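The same computation in sympy instead of Mathematica, as a sketch of mine:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
v = sp.Matrix([x, y, z])
# layer k (0-based) has entries (r+1) + (c+1) + (k+1) = r + c + k + 3
layers = [sp.Matrix(3, 3, lambda r, c: r + c + k + 3) for k in range(3)]
poly = sum((v.T * layers[k] * v)[0] * v[k] for k in range(3))
print(sp.factor(sp.expand(poly)))     # 3*(x + y + z)**2*(x + 2*y + 3*z)
```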

(2) Let $M^{\sigma}$ be the map from $\mathbb{R}^3\times\mathbb{R}^3\times\mathbb{R}^3$ to $\mathbb{R}$ sending $(\boldsymbol{v}_1,\boldsymbol{v}_2,\boldsymbol{v}_3)$ to $M(\boldsymbol{v}_{\sigma(1)},\boldsymbol{v}_{\sigma(2)},\boldsymbol{v}_{\sigma(3)})$.

Obviously it is multilinear, since the evaluation is linear in each $\boldsymbol{v}_i$ no matter which position $\boldsymbol{v}_i$ lands in. So this map is given by a tensor $M'=[[A_1',A_2',A_3']]$, whose entries we can read off by specializing the $\boldsymbol{v}_i$: set $\boldsymbol{b}_1=\begin{pmatrix}1\\0\\0\end{pmatrix},\boldsymbol{b}_2=\begin{pmatrix}0\\1\\0\end{pmatrix},\boldsymbol{b}_3=\begin{pmatrix}0\\0\\1\end{pmatrix}$ and let $(\boldsymbol{v}_1,\boldsymbol{v}_2,\boldsymbol{v}_3)=(\boldsymbol{b}_i,\boldsymbol{b}_j,\boldsymbol{b}_k)$, where $1\leq i,j,k\leq3$ and they can be the same. Writing $(m_1,m_2,m_3)=(i,j,k)$, the $(i,j,k)$-entry of $M'$ is the $(m_{\sigma(1)},m_{\sigma(2)},m_{\sigma(3)})$-entry of $M$, which equals

$$m_{\sigma(1)}+m_{\sigma(2)}+m_{\sigma(3)}=i+j+k$$

because $\sigma$ only permutes the three summands. So $M'=M$, and the equation is proved.

A brute-force try (incomplete):

Swapping pairs $\boldsymbol{v}_i,\boldsymbol{v}_j$ at most twice turns $\boldsymbol{v}_1,\boldsymbol{v}_2,\boldsymbol{v}_3$ into $\boldsymbol{v}_{\sigma(1)},\boldsymbol{v}_{\sigma(2)},\boldsymbol{v}_{\sigma(3)}$, so it suffices to check transpositions.

If we swap $\boldsymbol{v}_1,\boldsymbol{v}_2$: the values $\boldsymbol{v}_1^TA_i\boldsymbol{v}_2=\boldsymbol{v}_2^TA_i\boldsymbol{v}_1$ for $i=1,2,3$, since $A_i=A_i^T$, so the result stays the same.

And if we swap $\boldsymbol{v}_2,\boldsymbol{v}_3$: $\begin{bmatrix}\boldsymbol{v}_1^TA_1\boldsymbol{v}_2&\boldsymbol{v}_1^TA_2\boldsymbol{v}_2&\boldsymbol{v}_1^TA_3\boldsymbol{v}_2\end{bmatrix}\boldsymbol{v}_3=\boldsymbol{v}_1^TA_1\boldsymbol{v}_2\,v_{3x}+\boldsymbol{v}_1^TA_2\boldsymbol{v}_2\,v_{3y}+\boldsymbol{v}_1^TA_3\boldsymbol{v}_2\,v_{3z}$, and here the direct check stalls (hence the tensor argument above).

(3) According to 1.8.1(4), since $\operatorname{rank}(A_i)=2$, we get $r\geq2$. For the upper bound, just split the entry $i+j+k$ into its three summands: with $\boldsymbol{e}=(1,1,1)^T$ and $\boldsymbol{a}=(1,2,3)^T$,

$$M=\boldsymbol{a}\otimes\boldsymbol{e}\otimes\boldsymbol{e}+\boldsymbol{e}\otimes\boldsymbol{a}\otimes\boldsymbol{e}+\boldsymbol{e}\otimes\boldsymbol{e}\otimes\boldsymbol{a}$$

since the $(i,j,k)$-entries of the three rank-one tensors are $i\cdot1\cdot1=i$, $1\cdot j\cdot1=j$ and $1\cdot1\cdot k=k$ respectively. So $M$ can be decomposed into three rank-one tensors, and $r\leq3$.
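An einsum check of mine of this three-term decomposition:

```python
import numpy as np

i = np.arange(1, 4)
e = np.ones(3)
M = i[:, None, None] + i[None, :, None] + i[None, None, :]   # entries i+j+k
T = (np.einsum('i,j,k->ijk', i, e, e)
     + np.einsum('i,j,k->ijk', e, i, e)
     + np.einsum('i,j,k->ijk', e, e, i))
print(np.allclose(M, T))     # True: three rank-one terms, so r <= 3
```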

