Type of matrix in probability theory and statistics
In probability theory and statistics, a cross-covariance matrix is a matrix whose element in the i, j position is the covariance between the i-th element of a random vector and the j-th element of another random vector. A random vector is a random variable with multiple dimensions. Each element of the vector is a scalar random variable. Each element has either a finite number of observed empirical values or a finite or infinite number of potential values. The potential values are specified by a theoretical joint probability distribution. Intuitively, the cross-covariance matrix generalizes the notion of covariance to multiple dimensions.
The cross-covariance matrix of two random vectors $\mathbf{X}$ and $\mathbf{Y}$ is typically denoted by $\operatorname{K}_{\mathbf{X}\mathbf{Y}}$ or $\Sigma_{\mathbf{X}\mathbf{Y}}$.
For random vectors $\mathbf{X}$ and $\mathbf{Y}$, each containing random elements whose expected value and variance exist, the cross-covariance matrix of $\mathbf{X}$ and $\mathbf{Y}$ is defined by[1]: 336

$$\operatorname{K}_{\mathbf{X}\mathbf{Y}} = \operatorname{cov}(\mathbf{X},\mathbf{Y}) \stackrel{\mathrm{def}}{=} \operatorname{E}[(\mathbf{X}-\mathbf{\mu_X})(\mathbf{Y}-\mathbf{\mu_Y})^{\rm T}] \qquad \text{(Eq.1)}$$

where $\mathbf{\mu_X} = \operatorname{E}[\mathbf{X}]$ and $\mathbf{\mu_Y} = \operatorname{E}[\mathbf{Y}]$ are vectors containing the expected values of $\mathbf{X}$ and $\mathbf{Y}$. The vectors $\mathbf{X}$ and $\mathbf{Y}$ need not have the same dimension, and either might be a scalar value.
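As a rough illustration of Eq.1, the sketch below evaluates the definition for a finite set of paired observations, treating the sample as the whole distribution (so the average divides by N rather than N − 1). The function names and the data are hypothetical, chosen only for this example; X is two-dimensional and Y is a scalar, to show that the dimensions need not match.

```python
from fractions import Fraction

def mean_vector(samples):
    """Component-wise mean of a list of equal-length observation tuples."""
    n = len(samples)
    return [sum(Fraction(s[k]) for s in samples) / n for k in range(len(samples[0]))]

def cross_covariance(xs, ys):
    """Eq.1 under the empirical distribution: average of (x - mu_X)(y - mu_Y)^T.

    xs[k] and ys[k] form the k-th joint observation of X and Y.
    Divides by N (not N - 1): the sample is treated as the full distribution.
    """
    n = len(xs)
    mu_x, mu_y = mean_vector(xs), mean_vector(ys)
    return [
        [sum((x[i] - mu_x[i]) * (y[j] - mu_y[j]) for x, y in zip(xs, ys)) / n
         for j in range(len(mu_y))]
        for i in range(len(mu_x))
    ]

# Hypothetical data: X is 2-dimensional, Y is 1-dimensional (a scalar).
xs = [(1, 0), (2, 1), (3, 0), (4, 1)]
ys = [(2,), (4,), (6,), (8,)]

K = cross_covariance(xs, ys)
print(len(K), "x", len(K[0]))          # rows follow X, columns follow Y
print([[str(v) for v in row] for row in K])
```

Exact fractions are used here only so that the result can be compared without rounding concerns; floating-point arithmetic would work the same way.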
The cross-covariance matrix is the matrix whose $(i,j)$ entry is the covariance

$$\operatorname{K}_{X_i Y_j} = \operatorname{cov}[X_i, Y_j] = \operatorname{E}[(X_i - \operatorname{E}[X_i])(Y_j - \operatorname{E}[Y_j])]$$

between the i-th element of $\mathbf{X}$ and the j-th element of $\mathbf{Y}$. This gives the following component-wise definition of the cross-covariance matrix.
$$\operatorname{K}_{\mathbf{X}\mathbf{Y}} = \begin{bmatrix}
\operatorname{E}[(X_1 - \operatorname{E}[X_1])(Y_1 - \operatorname{E}[Y_1])] & \operatorname{E}[(X_1 - \operatorname{E}[X_1])(Y_2 - \operatorname{E}[Y_2])] & \cdots & \operatorname{E}[(X_1 - \operatorname{E}[X_1])(Y_n - \operatorname{E}[Y_n])] \\
\operatorname{E}[(X_2 - \operatorname{E}[X_2])(Y_1 - \operatorname{E}[Y_1])] & \operatorname{E}[(X_2 - \operatorname{E}[X_2])(Y_2 - \operatorname{E}[Y_2])] & \cdots & \operatorname{E}[(X_2 - \operatorname{E}[X_2])(Y_n - \operatorname{E}[Y_n])] \\
\vdots & \vdots & \ddots & \vdots \\
\operatorname{E}[(X_m - \operatorname{E}[X_m])(Y_1 - \operatorname{E}[Y_1])] & \operatorname{E}[(X_m - \operatorname{E}[X_m])(Y_2 - \operatorname{E}[Y_2])] & \cdots & \operatorname{E}[(X_m - \operatorname{E}[X_m])(Y_n - \operatorname{E}[Y_n])]
\end{bmatrix}$$
For example, if $\mathbf{X} = (X_1, X_2, X_3)^{\rm T}$ and $\mathbf{Y} = (Y_1, Y_2)^{\rm T}$ are random vectors, then $\operatorname{cov}(\mathbf{X},\mathbf{Y})$ is a $3 \times 2$ matrix whose $(i,j)$-th entry is $\operatorname{cov}(X_i, Y_j)$.
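The 3 × 2 case can be made concrete with a small finite joint distribution. The following sketch builds the matrix entry by entry from scalar covariances, as in the component-wise definition. The three-outcome distribution and the helper names are assumptions made up for illustration.

```python
from fractions import Fraction

def expectation(pmf, f):
    """E[f(x, y)] for a finite joint pmf given as (probability, x, y) triples."""
    return sum(p * f(x, y) for p, x, y in pmf)

def scalar_cov(pmf, i, j):
    """cov(X_i, Y_j) = E[(X_i - E[X_i]) (Y_j - E[Y_j])]."""
    ex = expectation(pmf, lambda x, y: x[i])
    ey = expectation(pmf, lambda x, y: y[j])
    return expectation(pmf, lambda x, y: (x[i] - ex) * (y[j] - ey))

# Hypothetical joint distribution: three equally likely outcomes of (X, Y),
# with X taking values in R^3 and Y in R^2.
third = Fraction(1, 3)
pmf = [
    (third, (0, 1, 2), (0, 1)),
    (third, (1, 1, 0), (3, 1)),
    (third, (2, 4, 1), (0, 4)),
]

# Row i pairs X_i with every Y_j, so the result has 3 rows and 2 columns.
K = [[scalar_cov(pmf, i, j) for j in range(2)] for i in range(3)]
for row in K:
    print([str(entry) for entry in row])
```

Indices in the code are zero-based, so `K[0][1]` corresponds to the (1, 2) entry cov(X_1, Y_2).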
For the cross-covariance matrix, the following basic properties apply:[2]
1. $\operatorname{cov}(\mathbf{X},\mathbf{Y}) = \operatorname{E}[\mathbf{X}\mathbf{Y}^{\rm T}] - \mathbf{\mu_X}\mathbf{\mu_Y}^{\rm T}$
2. $\operatorname{cov}(\mathbf{X},\mathbf{Y}) = \operatorname{cov}(\mathbf{Y},\mathbf{X})^{\rm T}$
3. $\operatorname{cov}(\mathbf{X_1} + \mathbf{X_2}, \mathbf{Y}) = \operatorname{cov}(\mathbf{X_1},\mathbf{Y}) + \operatorname{cov}(\mathbf{X_2},\mathbf{Y})$
4. $\operatorname{cov}(A\mathbf{X} + \mathbf{a}, B^{\rm T}\mathbf{Y} + \mathbf{b}) = A\,\operatorname{cov}(\mathbf{X},\mathbf{Y})\,B$
5. If $\mathbf{X}$ and $\mathbf{Y}$ are independent (or, less restrictively, if every random variable in $\mathbf{X}$ is uncorrelated with every random variable in $\mathbf{Y}$), then $\operatorname{cov}(\mathbf{X},\mathbf{Y}) = 0_{p\times q}$
where $\mathbf{X}$, $\mathbf{X_1}$ and $\mathbf{X_2}$ are random $p \times 1$ vectors, $\mathbf{Y}$ is a random $q \times 1$ vector, $\mathbf{a}$ is a $q \times 1$ vector, $\mathbf{b}$ is a $p \times 1$ vector, $A$ and $B$ are $q \times p$ matrices of constants, and $0_{p\times q}$ is a $p \times q$ matrix of zeroes.
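The first four properties are algebraic identities, so they can be checked exactly on any finite data set. The sketch below does this for hypothetical observations with p = 2 and q = 3, again treating the sample as the full distribution (dividing by N). All names, data and constants are illustrative assumptions.

```python
from fractions import Fraction

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def mean(vectors):
    n = len(vectors)
    return [sum(Fraction(v[k]) for v in vectors) / n for k in range(len(vectors[0]))]

def cov(xs, ys):
    """Cross-covariance under the empirical distribution (divide by N)."""
    n, mx, my = len(xs), mean(xs), mean(ys)
    return [[sum((x[i] - mx[i]) * (y[j] - my[j]) for x, y in zip(xs, ys)) / n
             for j in range(len(my))] for i in range(len(mx))]

def affine(m, vectors, shift):
    """Apply v -> m v + shift to every observation."""
    return [[sum(m[i][k] * v[k] for k in range(len(v))) + shift[i]
             for i in range(len(m))] for v in vectors]

# Hypothetical paired observations with p = 2 and q = 3.
p, q = 2, 3
xs1 = [(1, 0), (2, 1), (0, 3), (5, 2)]
xs2 = [(0, 1), (1, 1), (4, 0), (2, 2)]
ys = [(1, 2, 0), (0, 1, 1), (3, 0, 2), (2, 2, 5)]
n = len(xs1)
k1 = cov(xs1, ys)

# Property 1: cov(X, Y) = E[X Y^T] - mu_X mu_Y^T
mx, my = mean(xs1), mean(ys)
e_xyt = [[sum(Fraction(x[i] * y[j]) for x, y in zip(xs1, ys)) / n for j in range(q)]
         for i in range(p)]
prop1 = k1 == [[e_xyt[i][j] - mx[i] * my[j] for j in range(q)] for i in range(p)]

# Property 2: cov(X, Y) = cov(Y, X)^T
prop2 = k1 == transpose(cov(ys, xs1))

# Property 3: cov(X1 + X2, Y) = cov(X1, Y) + cov(X2, Y)
sums = [tuple(u + v for u, v in zip(x1, x2)) for x1, x2 in zip(xs1, xs2)]
k2 = cov(xs2, ys)
prop3 = cov(sums, ys) == [[k1[i][j] + k2[i][j] for j in range(q)] for i in range(p)]

# Property 4: cov(A X + a, B^T Y + b) = A cov(X, Y) B, with A and B both q x p
A = [[1, 2], [0, 1], [3, -1]]
B = [[2, 0], [1, 1], [0, 4]]
a = [5, -1, 0]   # q x 1
b = [7, 2]       # p x 1
lhs = cov(affine(A, xs1, a), affine(transpose(B), ys, b))
prop4 = lhs == matmul(matmul(A, k1), B)

print(prop1, prop2, prop3, prop4)
```

Note that the result in property 4 is a q × p matrix: A maps X into q dimensions, while B^T maps Y into p dimensions.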
Definition for complex random vectors
If $\mathbf{Z}$ and $\mathbf{W}$ are complex random vectors, the definition of the cross-covariance matrix is slightly changed. Transposition is replaced by Hermitian transposition:

$$\operatorname{K}_{\mathbf{Z}\mathbf{W}} = \operatorname{cov}(\mathbf{Z},\mathbf{W}) \stackrel{\mathrm{def}}{=} \operatorname{E}[(\mathbf{Z}-\mathbf{\mu_Z})(\mathbf{W}-\mathbf{\mu_W})^{\rm H}]$$
For complex random vectors, another matrix called the pseudo-cross-covariance matrix is defined as follows:

$$\operatorname{J}_{\mathbf{Z}\mathbf{W}} = \operatorname{cov}(\mathbf{Z},\overline{\mathbf{W}}) \stackrel{\mathrm{def}}{=} \operatorname{E}[(\mathbf{Z}-\mathbf{\mu_Z})(\mathbf{W}-\mathbf{\mu_W})^{\rm T}]$$
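The only difference between the two complex definitions is whether the second factor is conjugated. The sketch below computes both matrices for hypothetical complex observations (sample treated as the full distribution) and checks the relation J_ZW = cov(Z, conj(W)) stated in the definition. Function names and data are assumptions for illustration.

```python
def mean(vectors):
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(len(vectors[0]))]

def cross_cov(zs, ws):
    """K_ZW under the empirical distribution: average of (z - mu_Z)(w - mu_W)^H."""
    n, mz, mw = len(zs), mean(zs), mean(ws)
    return [[sum((z[i] - mz[i]) * (w[j] - mw[j]).conjugate() for z, w in zip(zs, ws)) / n
             for j in range(len(mw))] for i in range(len(mz))]

def pseudo_cross_cov(zs, ws):
    """J_ZW: the same average with a plain transpose, i.e. no conjugation."""
    n, mz, mw = len(zs), mean(zs), mean(ws)
    return [[sum((z[i] - mz[i]) * (w[j] - mw[j]) for z, w in zip(zs, ws)) / n
             for j in range(len(mw))] for i in range(len(mz))]

# Hypothetical observations: Z is 2-dimensional, W is 1-dimensional.
zs = [(1 + 1j, 2 + 0j), (3 - 1j, 0 + 2j), (-1 + 0j, 1 + 1j), (1 + 4j, 1 - 3j)]
ws = [(2 - 1j,), (0 + 1j,), (1 + 1j,), (1 + 3j,)]

K = cross_cov(zs, ws)
J = pseudo_cross_cov(zs, ws)

# J_ZW should coincide with the cross-covariance of Z and the conjugate of W.
conj_ws = [tuple(w.conjugate() for w in obs) for obs in ws]
print(K)
print(J)
print(J == cross_cov(zs, conj_ws))
```

For real-valued data the conjugate has no effect, so the two matrices coincide and both reduce to Eq.1.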
Two random vectors $\mathbf{X}$ and $\mathbf{Y}$ are called uncorrelated if their cross-covariance matrix $\operatorname{K}_{\mathbf{X}\mathbf{Y}}$ is a zero matrix.[1]: 337
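Being uncorrelated is a weaker condition than independence. A standard textbook illustration, sketched below with scalar variables (a 1 × 1 cross-covariance matrix), takes X uniform on {-1, 0, 1} and Y = X². This distribution is an assumption chosen for the example: the covariance vanishes even though Y is completely determined by X.

```python
from fractions import Fraction

# Hypothetical distribution: X is uniform on {-1, 0, 1} and Y = X**2.
outcomes = [(-1, 1), (0, 0), (1, 1)]
p = Fraction(1, 3)

ex = sum(p * x for x, _ in outcomes)
ey = sum(p * y for _, y in outcomes)
cov_xy = sum(p * (x - ex) * (y - ey) for x, y in outcomes)

# Independence would require P(X=0, Y=0) == P(X=0) * P(Y=0).
p_joint = sum(p for x, y in outcomes if x == 0 and y == 0)
p_x0 = sum(p for x, _ in outcomes if x == 0)
p_y0 = sum(p for _, y in outcomes if y == 0)

print("uncorrelated:", cov_xy == 0)
print("independent: ", p_joint == p_x0 * p_y0)
```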
Complex random vectors $\mathbf{Z}$ and $\mathbf{W}$ are called uncorrelated if their cross-covariance matrix and pseudo-cross-covariance matrix are both zero, i.e. if $\operatorname{K}_{\mathbf{Z}\mathbf{W}} = \operatorname{J}_{\mathbf{Z}\mathbf{W}} = 0$.
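To see why both matrices appear in the complex condition, consider a scalar case where the cross-covariance vanishes but the pseudo-cross-covariance does not. The sketch below assumes Z is uniform on {1, i, -1, -i} and W is the complex conjugate of Z; this example is hypothetical and not taken from the cited sources.

```python
# Hypothetical example: Z is uniform on {1, i, -1, -i} and W = conj(Z).
zs = [1 + 0j, 1j, -1 + 0j, -1j]
ws = [z.conjugate() for z in zs]
n = len(zs)

mu_z = sum(zs) / n
mu_w = sum(ws) / n

# 1x1 cross-covariance: E[(Z - mu_Z) * conj(W - mu_W)]
K = sum((z - mu_z) * (w - mu_w).conjugate() for z, w in zip(zs, ws)) / n

# 1x1 pseudo-cross-covariance: E[(Z - mu_Z) * (W - mu_W)]
J = sum((z - mu_z) * (w - mu_w) for z, w in zip(zs, ws)) / n

uncorrelated = (K == 0) and (J == 0)
print("K == 0:", K == 0)
print("J == 0:", J == 0)
print("uncorrelated:", uncorrelated)
```

Here K equals E[Z²] = 0 while J equals E[|Z|²] = 1, so Z and W are not uncorrelated under the definition above, which matches the fact that W is a deterministic function of Z.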