Centre de Recherche Mathematique, Montreal
July 18-19, 2011
Ruedi Seiler
based on the joint work with
Igor Bjelakovic, TU Berlin
Jean-Dominique Deuschel, TU Berlin
Tyll Krüger, U Bielefeld
Rainer Siegmund-Schultze, integral-learning and TU Berlin
Arleta Szkola, MPI Leipzig
1 Introduction
Classical spin chain: typical subsets, Shannon McMillan
1. Question: (interesting for hypothesis testing)
Given two states and .
Is there a sequence of -typical sets
such that
? Answer yes. But the convergence to zero can not be too fast.
The optimal rate of convergence is given by an exponential law, where the exponent
is the relative entropy density of the two states:
This is the content of Stein's lemma.
The problem in hypothesis testing:
Which one of the 2 states and is the better description of the observed data?
To answer this question, the set of n spin configurations is
partitioned into the test set and its complement, such that
. The closer
to , the better the test.
2. Question: (interesting for compression of data)
Is there a universal sequence of typical small sets for any state?
Answer is yes if rephrased in a more intelligent way: Given , is there
a sequence of subsets which are typical for all states with entropy ?. Proof for iid states e.g. by Lempel-Ziv compression algorithm.
Quantum Analog: Kaltchenko-Yang (see 8).
3. Question: (hypothesis testing)
Note: This question is a combination of question 1 and 2. Given
1. | set of states and
| 2. | single state .
|
Is there a sequence of
sets such that
1. | universally typical for all states in
and
| 2. | ?
|
Answer yes:
Provided: Convergence to not too fast
Optimal convergence: exponential law
Best exponent: (rel.entropy).
Remark on Classical - Quantum
Classical algebra of phase space functions (diagonal matrices) is abelian
subalgebra of quantum algebra (full matrix algebra).
Remark on proof:
1. | One state: Basic starting point (Shannon, McMillon)
| 2. | Two states: Stein's lemma
| 3. | Many + 1 states: Sanov's theorem
|
2 to 3: Slice the set along level sets.
2 Setup and Basic Concepts
Def: Discrete Information Source
| Quantum 1-Bit Algebra
| | Classical 1-Bit Algebra
(functions on )
| | for
| |
| | |
| | local quantum spin chain algebra
| | local classical spin chain algebra
| |
| | state on :
| |
| | -automorphism on :
extends to
|
Def: Stationary, Ergodic, IID States
Def: entropy, relative entropy
| The entropy of is defined by
|
For stationary states entropy is subadditiv. Hence its rate
converges:
| | Relative entropy of is defined by
|
and its rate by
|
provided the limit exists.
|
Def: The Hiai-Petz condition | states on
| | Sequence of optimally seperating exponent:
|
| |
| | is HP:
Hiai and Petz: completely ergodic, -mixing
| |
|
sufficient condition for HP (discussed later)
ergodic and -mixing is HP.
Def: -mixing
A stationary state on
is called -mixing if for each there
exists an such that for each
|
3 Sanov's theorem, the ergodic case
Theorem (Sanov, ergodic, for classical case see (8): Let be a state on and
. Statements and are equivalent:
1. | For each the pair is HP.
| 2. |
|
Remarks:
Note about Sanov's theorem and hypothesis testing:
The test sequence of projectors defines a sequence of quantum-observables
with outcome and . It is interpreted as the quantum analog of
test sets in hypothesis testing. For that reason we use the terminology
test sequence of projectors. For a discussion of quantum hypothesis testing and in its relation to the Q-Sanov
theorem see the articles by K. M. R. Audenaert , M. Nussbaum , A. Szkoła , F. Verstraete 8.
and M.Nussbaum and A.Szkola .
4 Sanov's theorem, the stationary caseCan be reduced to the ergodic case. Recall:
1. | ( is Choquet simplex,
| 2. | is the set of extremal points.
| 3. | Any stationary
has ergodic decomposition
|
Notation
Theorem (Sanov, stationary): Let
Then
Remarks:
5 Sanov's theorem: A cradle of fundamental results in inf-theory5.1 Shannon-McMillan, the ergodic case
Theorem (Shannon-McMillan, the ergodic case)
Consider an ergodic state
. Then
1. | exists a test sequence of projectores
,
such that with
| 2. | For any sequence with the lower bound holds:
|
|
Proof of the statement:
Sanov, ergodic case Shannon McMillan, ergodic case: 1. Choice
equipartition (tracial state):
Show less
|
Show more
|
Hide all
|
Show all 2. is HP:
is -mixing by construction (product state)
is ergodic by assumtion.
Due to a result by Hiai and Petz (1991) the statement holds. For details see section
6.3
Show less
|
Show more
|
Hide all
|
Show all
3. :
|
Entropy exists and is finit (follows from subadditivity).
Show less
|
Show more
|
Hide all
|
Show all
4. for large:
Applying Sanov's theorem (3 one gets:
Show less
|
Show more
|
Hide all
|
Show all 5. for
large: Due to a Sanov's theorem (3)
|
A computation analogous to the one in the last step proofs the claim.
Show less
|
Show more
|
Hide all
|
Show all 5.2 Shannon-McMillan for stationary states
Theorem (Shannon-McMillan, for stationary states)
Consider a stationary state
Then
1. | there exists a sequence of projectores
in ,
such that
| 2. | For any sequence with we have
|
|
Remarks:
1. in the stationary case equipartition does not hold!
2. The relevant entropy in Sanov's theorem for
stationary states is not the mean entropy but the essential infimum of the ergodic
constituents of all states in .
Proof of the statemtent:
Sanov, stationary case
Shannon McMillan, the stationary case:
The proof follows very closely the one just given for the ergodic case. Again one
chooses and as the equidistribution. is replaced by
respectively .
Show less
|
Show more
|
Hide all
|
Show all
5.3 Kaltchenko-Yangs universality theorem for stationary states
Theorem (Kaltchenko-Yang, for stationary states) Let
|
Then
1. | there exists a sequence of projectors , such that
| 2. |
For each sequence of projectors such that for all
is the optimal lower bound:
|
|
Proof of the statemtent:
Sanov, stationary case Kaltchenko-Yang theorem:
Set
Now, the statement is an immediate consequence of Sanov's theorem.
Show less
|
Show more
|
Hide all
|
Show all
5.4 Stein's lemma for stationary states
Theorem (Stein's lemma for stationary null hypothesis) Let and
such that for
-almost all the pair is HP.
Then
1. | there exists a sequence of projections with
| 2. | For any sequence with we have
|
|
Proof of the statement:
Sanov, stationary case Stein's lemma, stationary case:
Choose and apply Sanov's theorem.
Show less
|
Show more
|
Hide all
|
Show all
6 Proof of Sanov's theorem, the ergodic case
6.1 The case of just 2 states and (Stein's lemma)
Proposition 1:
-typical test sequence of projectors in terms of
spectral data
(the relative equipartion theorem)
Assume ergodic and arbitrary states on ,
HP and .
Consider the family of -spectral projectors
|
denotes the eigen-projection of
for eigenvalue . Then
|
Outline of proof:
1.
is the expectation value of the random variable with probability distribution
:
|
Show less
|
Show more
|
Hide all
|
Show all 2. :
satisfy the HP condition by assumption. Hence exists.
(it might be ).
Subadditivity of entropy implies the existence of .
Show less
|
Show more
|
Hide all
|
Show all 3.
:
Assume the statement is wrong. One concludes that
is not HP. This contradicts the assumptions.
Show less
|
Show more
|
Hide all
|
Show all 4.
:
is the convex combination of
with weights
.
Yet for each the interval is not n
the support of for sufficiently large. Hence the same is true for the
interval if .
Show less
|
Show more
|
Hide all
|
Show all 5. Step 3 and 4 imply the statement of propopsition 1:
For one gets due to step 3 and 4:
|
Choose and a corresponding sequence
such that the above inquality
holds with replaced with . Hence
|
Show less
|
Show more
|
Hide all
|
Show all
Proposition 2:
maximally separating sequence of projections
Assumption as in proposition 1 (6.1). Furthermore assume the test sequence of
projectors is optimally -typical in the sense:
(29) | | (29) |
Then there is a sequence such that the spectral
projectors
satisfy
|
Hence is an optimally
separating sequence of projections.
Outline of proof:
1. denotes a state, two projectors.
Then :
Straightforward but not particularly enlightening computation.
Show less
|
Show more
|
Hide all
|
Show all 2. one gets
|
hence
Show less
|
Show more
|
Hide all
|
Show all 3.
:
This is an immediate consequence of the definition of .
Show less
|
Show more
|
Hide all
|
Show all 4.
:
Combine statement 4 and 3.
Show less
|
Show more
|
Hide all
|
Show all 5.
:
Choose an appropriate subsequence.
Show less
|
Show more
|
Hide all
|
Show all 6.
:
Preparatory note:
|
We introduce the abreviation:
|
Show less
|
Show more
|
Hide all
|
Show all
Proposition 3:
lower bound for any separating sequence of projections of large -measure
Assumption as in proposition 1 (6.1),. Let be a
test sequence of projectors such that for large. Then
|
Proposition 4:
Conclusion from (6.1 and 6.16.1), Stein's lemma
Considere ergodic and arbitrary states on and assume
is HP (2)
Then
|
Combine proposition 2 and 3.
Show less
|
Show more
|
Hide all
|
Show all 6.2 from Stein's lemma to Sanov
Construction of -universal sequencies of typical projections:
The main objective is the proof of formula (3). It asserts
the existence of -typical test sequence of projectors for any
which is maximally non typical with respect to , in other words maximally
separating. Here is a sketch of the strategy of proof:
(For details see
(BDKSSS 20078).
Show less
|
Show more
|
Hide all
|
Show all
6.3 *-mixing implies the HP-condition-mixing implies the HP-condition. Hence Sanov's theorem holds under the
condition of -mixing. This is not astonishing since -mixing states are
by definition quite close to block-iid states. The proof of the fact that -mixing implies the HP-condition
is an elaboration of some results by Hiai and Petz
(Hiai-Petz 1991, 8). For details see reference (Bjelakovic et al. 8). 7 proof of Sanov's theorem, the stationary caseThe proof of Sanov's theorem for stationary states proceeds essentially in 2
steps. First each is decomposed into ergodic components :
Second, Sanov's theorem for ergodic states is appled to each ergodic component.
Outline of proof:
1. is HP and :
Let is HP and . is weak-
-measurable since it can be represented by a countable application of unions
and intersections to local sets, defined via the measurable functions and .
Show less
|
Show more
|
Hide all
|
Show all 2. Decomposition of stationary states into ergodic states:
|
Such a decomposition exists due to a well known theorem (see e.g. Ruelle 1969 8)
Show less
|
Show more
|
Hide all
|
Show all 2. Existence of optimally seperating test sequence of projectors:
Choose as in Theorem 3, with replaced
by . Then inequality (, 3) is
satisfied by construction. For we obtain by assumption
|
Now for each the expression tends to by the choice of the projections
. Applying Lebesgue's theorem on dominated convergence we conclude
Show less
|
Show more
|
Hide all
|
Show all 3. Uniform lower bound on -typical sequences of projectors:
For each sequence of projections
fulfilling inequality (3) the ergodic components of
tends to in -probability as . By the definition of
to any we may choose in such a way that . Hence
|
which implies (3) since can be chosen arbitrarily small.
Show less
|
Show more
|
Hide all
|
Show all
8 referencesK.M.R.Audenaert, M.Nussbaum, A. Szkola, F.Verstraete: Asymptotic Error Rates in
Quantum Hypothesis Testing, Commun.Math.Phys. 279, 251-283 (2008) I. Bjelakovic, T. Krüger, Ra. Siegmund-Schultze,
A. Szkola, The Shannon-McMillan theorem for ergodic quantum lattice
systems, Invent. Math. 155 (1), 203-222 (2004) I. Bjelakovic, J.-D. Deuschel, T. Krüger, R. Seiler, Ra. Siegmund-Schultze, A.
Szkola: Typical support and Sanov large deviations fo correlated states.
Commun. Math. Phys. 279, 559-584 (2008) F. Hiai, D. Petz, The Proper Formula for Relative
Entropy and its Asymptotics in Quantum Probability,
Commun. Math. Phys. 143, 99-114 (1991) A. Kaltchenko, E. H. Yang, Universal Compression of Ergodic Quantum Sources,
Quant. Inf. and Comput. 3, 359-375 (2003) M.Nussbaum, A. Szkola; Exponential error rates in multiple state discrimination on a
quantum spin chain, J.Math.Phys. 51, 072203 (2010) D. Ruelle: Statistical Mechanics, Benjamin Inc., 1969 I.N. Sanov, On the probability of large deviations of random variables, Mat. Sbornik
42, 11-44, 1957 |