Typicality in Classical
and
Quantum Information Theory
Centre de Recherche Mathematique, Montreal
July 18-19, 2011
Ruedi Seiler
based on the joint work with
Igor Bjelakovic, TU Berlin
Jean-Dominique Deuschel, TU Berlin
Tyll Krüger, U Bielefeld
Rainer Siegmund-Schultze, integral-learning and TU Berlin
Arleta Szkola, MPI Leipzig


1 Introduction

Classical spin chain: typical subsets, Shannon McMillan

spin configuration {0,1}Z

state (measure) ψ on {0,1},ψ(0)=p

state (measure) Ψ(n) on {0,1}[1,n] product state: ψψψn times

{𝒯n{0,1}[1,n]}nNtypical sequence of subsets:Ψ(n)(𝒯n)1

Theorem by Shannon McMillan characterizes sequences of typical sets in terms of entropy s=-plogp-(1-p)log(1-p):

τ(n)𝒯nΨ(n)(τ(n))2-ns|𝒯n|2ns2n,ifp12

the size of the typical sets are exponentially small in n, but they can not be smaler than what is given by the theorem by Shannon McMillan

1. Question: (interesting for hypothesis testing)
Given two states Ψ and Φ.
Is there a sequence of Ψ-typical sets {𝒯n}nN such that Φ(n)(𝒯n)0?

Answer yes. But the convergence to zero can not be too fast. The optimal rate of convergence is given by an exponential law, where the exponent is the relative entropy density of the two states:

Φ(n)(𝒯n)2-ns(Ψ,Φ)

This is the content of Stein's lemma.

The problem in hypothesis testing:
Which one of the 2 states Ψ and Φ is the better description of the observed data?

To answer this question, the set of n spin configurations {0,1}[1,n] is partitioned into the test set {𝒯n} and its complement, such that Ψ(𝒯n)1,Φ(𝒯n)0. The closer {Ψ(𝒯n),Φ(𝒯n)} to {1,0}, the better the test.

2. Question: (interesting for compression of data)
Is there a universal sequence of typical small sets for any state?

Answer is yes if rephrased in a more intelligent way: Given h>0, is there a sequence of subsets which are typical for all states with entropy <h?.

Proof for iid states e.g. by Lempel-Ziv compression algorithm.
Quantum Analog: Kaltchenko-Yang (see 8).

3. Question: (hypothesis testing)
Note: This question is a combination of question 1 and 2.

Given

1.

set of states Θ and

2.

single state Φ.

Is there a sequence of sets {𝒯n}nN such that

1.

universally typical for all states in Θ and

2.

Φ(n)(𝒯n)0?

Answer yes:
Provided: Convergence to 0 not too fast
Optimal convergence: exponential law
Best exponent: s(Θ,Φ) (rel.entropy).

Remark on Classical - Quantum

Classical 1-bit algebra: Diagonal 2x2 Matrices, functions =C{0,1}

Classical state ψ(d)=d(0)p+d(1)(1-p)

Quantum 1-q-bit algebra: 2x2 Matrices

Quantum state ψ(A)=trace(ψA)

Classical algebra of phase space functions (diagonal matrices) is abelian subalgebra of quantum algebra (full matrix algebra).

Remark on proof:

1.

One state: Basic starting point (Shannon, McMillon)

2.

Two states: Stein's lemma

3.

Many + 1 states: Sanov's theorem

2 to 3: Slice the set Θ along level sets.















2 Setup and Basic Concepts

Def: Discrete Information Source (𝒜,Ψ,T)

𝒜=M[2,2] Quantum 1-Bit Algebra

𝒟=D[2,2]=C{0,1} Classical 1-Bit Algebra (functions on {0,1})

𝒜[m,n]:=i=mn𝒜i for m,nZ,𝒜i𝒜,iZ

𝒜(n):=𝒜[1,n]

r<sN:𝒜[-r,r]𝒜[-s,s]A1A1

𝒜:=nN𝒜[-n,n]||·|| local quantum spin chain algebra

𝒟:=nN𝒟[-n,n]||·|| local classical spin chain algebra

m<nN 𝒜[m,n]𝒜

Ψ state on 𝒜: Ψ:𝒜Clinear, positiv andΨ(1)=1

nN:Ψ(n)(A(n)):=Ψ(A(n)),A(n)𝒜(n)

Tright shift,*-automorphism on 𝒜:
T(a)=1𝒜ma𝒜[m,n+1],a𝒜[m,n], extends to 𝒜

Def: Stationary, Ergodic, IID States

Ψis stationary:Ψ(Ta)=Ψ(a),a𝒜

Ψ is ergodic:Ψextremal point in the set of stationary states(Choquet simplex)

Ψ is IID :a stateψon𝒜such thatΨis the product state ofψ

Def: entropy, relative entropy

The entropy of Ψ(n) is defined by

S(Ψ(n)):=-trace(Ψ(n)log(Ψ(n))

For stationary states entropy is subadditiv. Hence its rate converges:

s(Ψ):=limn1nS(Ψ(n))

Relative entropy of Ψ(n),Φ(n) is defined by

S(Ψ(n),Φ(n)):=trace(Ψ(n)(log(Ψ(n))-log(Φ(n))),if trace exists,otherwise

and its rate by

s(Ψ,Φ):=limn1nS(Ψ(n),Φ(n))

provided the limit exists.

Def: The Hiai-Petz condition

Ψ,Φ states on 𝒜

Sequence of optimally seperating exponent:

βε,n(Ψ,Φ):=min{logΦ(q)|q𝒜(n) proj., Ψ(q)1-ε},ε(0,1)

β¯ε(Ψ,Φ):=limsupn1nβε,n(Ψ,Φ)-[0,]

(Ψ,Φ) is HP: β¯ε(Ψ,Φ)-s(Ψ,Φ),ε(0,1)
Hiai and Petz: Ψ completely ergodic, Φ-mixing

ΦHP-state:Ψ𝒮erg(𝒜)the pair(Ψ,Φ)is HP

sufficient condition for HP (discussed later)
Ψ ergodic and Φ-mixing (Ψ,Φ) is HP.

Def: -mixing
A stationary state Φ on 𝒜) is called -mixing if for each 0<α<1 there exists an lN such that for each kN

αΦ({-k,-k+1...,0})Φ({l,l+1,...,l+k})Φ({-k,-k+1...,0}{l,l+1,...,l+k})α-1Φ({-k,-k+1...,0})Φ({l,l+1,...,l+k})


3 Sanov's theorem, the ergodic case

Theorem (Sanov, ergodic, for classical case see (8):

Let Φ be a state on 𝒜 and Θ𝒮erg(𝒜).

Statements 1 and 2 are equivalent:

1.

For each ΨΘ the pair (Ψ,Φ) is HP.

2.

ΨΘs(Ψ,Φ)+ exists.

for each ΩΘ and η>0
there exists a test sequence of projectors {pn}nN,pn𝒜(n)
such that

limnΨ(n)(pn)=1,ΨΩ()limsupn1nlogΦ(n)(pn)-s(Ω,Φ)+η()

for each sequence of projections {p˜n} with ():

liminfn1nlogΦ(n)(p˜n)-s(Ω,Φ).

Remarks:

if s(Ω,Φ)= inequality () is replaced by

limsupn1nlogΦ(n)(pn)-1η.

-s(Ω,Φ) is liminf of all sequences of separation exponents.

Note about Sanov's theorem and hypothesis testing:
The test sequence of projectors {pn}nN defines a sequence of quantum-observables with outcome 0 and 1. It is interpreted as the quantum analog of test sets 𝒯nN in hypothesis testing. For that reason we use the terminology test sequence of projectors.

For a discussion of quantum hypothesis testing and in its relation to the Q-Sanov theorem see the articles by K. M. R. Audenaert , M. Nussbaum , A. Szkoła , F. Verstraete 8. and M.Nussbaum and A.Szkola .



4 Sanov's theorem, the stationary case

Can be reduced to the ergodic case.

Recall:

1.

(𝒮stat(𝒜) is Choquet simplex,

2.

𝒮erg(𝒜) is the set of extremal points.

3.

Any stationary Ψ𝒮stat(𝒜) has ergodic decomposition

Ψ=𝒮erg(𝒜)ΞγΨ(dΞ)

γΨ: prob. meas. on[𝒮erg(𝒜),(Υ𝒜)]

Υ𝒜 weak--topology

(Υ𝒜) Borel σ-field.

Notation

Let Φ𝒮(𝒜),Θ𝒮stat(𝒜)

s¯(Ψ,Φ):=essinfγΨ(dΞ)s(Ξ,Φ),

Let ΩΘ

s¯(Ω,Φ):=infΨΩs¯(Ψ,Φ).

Theorem (Sanov, stationary):

Let

Φ state on 𝒜

Θ𝒮stat(𝒜)

ΨΘ and γΨ-almost all Ξ the pair (Ξ,Φ) is HP.

Then

s¯(Ψ,Φ)+ exists for each ΨΘ

ΩΘ,η>0 there exists a test sequence of projectors {pn}nN,pn𝒜(n) with

limnΨ(n)(pn)=1,ΨΩ()limsupn1nlogΦ(n)(pn)-s(Ω,Φ)+η()

for each sequence of projections {p˜n} with ():

liminfn1nlogΦ(n)(p˜n)-s(Ω,Φ).

Remarks:

if s(Ω,Φ)= inequality () is replaced by

limsupn1nlogΦ(n)(pn)-1η.

-s(Ω,Φ) is liminf of all sequences of separation exponents.



5 Sanov's theorem: A cradle of fundamental results in inf-theory

5.1 Shannon-McMillan, the ergodic case

Theorem (Shannon-McMillan, the ergodic case)

Consider an ergodic state Ψ.

Then

1.

exists a test sequence of projectores {pn}𝒜(n), such that ε>0n0 with

Ψ(n)(pn)1-ε, for nn0,(typicality)

trace(pn)2n(s(Ψ)+ε), (achievability)

2.

For any sequence p˜n with limΨ(n)(p˜n)=1 the lower bound holds:

tracep˜n2n(s(Ψ)-ε)nsufficiently large,(optimality)

Proof of the statement:
Sanov, ergodic case Shannon McMillan, ergodic case:

1. Choice Ω:={Ψ},Φ equipartition (tracial state):

Show less | Show more | Hide all | Show all

2. (Ψ,Φ) is HP:

Show less | Show more | Hide all | Show all

3. s(Ψ,Φ)=-s(Ψ)+1<:

Show less | Show more | Hide all | Show all

4. {pn},s.t.Ψ(pn)1-ε,tr(pn)2n(s(Ψ)+ε), for n large:

Show less | Show more | Hide all | Show all

5. limnΨ(n)(p˜n)=1tracep˜n2n(s(Ψ)-ε), for n large:

Show less | Show more | Hide all | Show all

5.2 Shannon-McMillan for stationary states

Theorem (Shannon-McMillan, for stationary states)

Consider a stationary state

Ψ=𝒮erg(𝒜)ΞγΨ(dΞ)

Then

1.

there exists a sequence of projectores {pn} in 𝒜(n), such that

limnΨ(n)(pn)=1 (typicality)

limn1nlog(tracepn)=s¯(Ψ):=ess supγΨ(dΞ)s(Ξ)

2.

For any sequence p˜n with limΨ(n)(p˜n)=1 we have

liminfn1nlog(tracep˜n)s¯(Ψ) (optimality).

Remarks:
1. in the stationary case equipartition does not hold!
2. The relevant entropy in Sanov's theorem for stationary states is not the mean entropy but the essential infimum of the ergodic constituents of all states in Θ.

Proof of the statemtent:
Sanov, stationary case Shannon McMillan, the stationary case:

Show less | Show more | Hide all | Show all

5.3 Kaltchenko-Yangs universality theorem for stationary states

Theorem (Kaltchenko-Yang, for stationary states)

Let

Ωs:={Ψ𝒮stat(𝒜):s¯(Ψ)<s}

Then

1.

there exists a sequence of projectors {pn}nN,pn𝒜(n), such that

limnΨ(n)(pn)=1,ΨΩs (typicality)

limn1nlog(tracepn)=s (max. ergodic entropy rate)

2.

For each sequence of projectors p˜n such that for all ΨΩs

limnΨ(n)(p˜n)=1,

s is the optimal lower bound:

liminfn1nlog(tracep˜n)s

Proof of the statemtent:
Sanov, stationary case Kaltchenko-Yang theorem:

Show less | Show more | Hide all | Show all

5.4 Stein's lemma for stationary states

Theorem (Stein's lemma for stationary null hypothesis)

Let Φ𝒮(𝒜) and Ψ𝒮stat(𝒜) such that for γΨ -almost all Ξ the pair (Ξ,Φ) is HP.

Then

1.

there exists a sequence {pn} of projections with

limnΨ(n)(pn)=1 (typicality)

limn1nlogΦ(n)(pn)=-s¯(Ψ,Φ) (achievability of the separation exponent -s¯(Ψ,Φ)).

2.

For any sequence p˜n with limnΨ(n)(p˜n)=1 we have

liminfn1nlogΦ(n)(p˜n)-s¯(Ψ,Φ) (optimality) .

Proof of the statement:
Sanov, stationary case Stein's lemma, stationary case:

Show less | Show more | Hide all | Show all


6 Proof of Sanov's theorem, the ergodic case

6.1 The case of just 2 states Ψ and Φ (Stein's lemma)

Proposition 1:
Ψ-typical test sequence of projectors in terms of Φ spectral data
(the relative equipartion theorem)

Assume Ψ ergodic and Φ arbitrary states on 𝒜, (Ψ,Φ) HP and ε>0.
Consider the family of Φ-spectral projectors

uΦ(n)ε(s):=s-ε<s<s+εspec2-ns(Φ(n)),sR.

specλ(·) denotes the eigen-projection of Φ(n) for eigenvalue λ.

Then ε>0

limnΨ(n)(uΦ(n)ε(s(Ψ)+s(Ψ,Φ)))=1

Outline of proof:

1. 1n(S(Ψ(n))+S(Ψ(n),Φ(n)))=-1n(Ψ(log(Φn))) is the expectation value of the random variable s with probability distribution μ(n)(s<s):=0s<sΨ(spec2-ns(Φ(n)):

Show less | Show more | Hide all | Show all

2. 1n(S(Ψ(n))+S(Ψ(n),Φ(n)))=<s>μ(n)ns(Ψ)+s(Ψ,Φ):

Show less | Show more | Hide all | Show all

3. ε>0,limnμ(n)(s<s(Ψ)+s(Ψ,Φ)-ε)=0:

Show less | Show more | Hide all | Show all

4. ε>0,limnμ(n)(s>s(Ψ)+s(Ψ,Φ)+ε)=0:

Show less | Show more | Hide all | Show all

5. Step 3 and 4 imply the statement of propopsition 1:

Show less | Show more | Hide all | Show all

Proposition 2:
maximally separating sequence of projections

Assumption as in proposition 1 (6.1). Furthermore assume the test sequence of projectors {pn} is optimally Ψ-typical in the sense:

(29)Ψ(n)(pn)n1 and 1nlogTr(pn)ns(Ψ)(29)

Then there is a sequence εn0 such that the spectral projectors un:=uΦ(n)εn(s(Ψ)+s(Ψ,Φ)) satisfy

Ψ(n)(supp(unpnun))n11nlogΦ(n)(supp(unpnun))n-s(Ψ,Φ)

Hence {supp(unpnun)} is an optimally separating sequence of projections.

Outline of proof:

1. τ denotes a state, u,p two projectors. Then τ(upu)τ(p)-(τ(1-u):

Show less | Show more | Hide all | Show all

2. ε>0,andunε:=unε(s(Ψ)+s(Ψ,Φ)) one gets Ψ(unεpnunε)n1

Show less | Show more | Hide all | Show all

3. unεpnunεsupp(unεpnunε) :

Show less | Show more | Hide all | Show all

4. Ψ(supp(unεpnunε))n1:

Show less | Show more | Hide all | Show all

5. Ψ(supp(unpnun))n1:

Show less | Show more | Hide all | Show all

6. limn1nlogΦ(supp(unpnun))-s(Ψ,Φ):

Show less | Show more | Hide all | Show all

Proposition 3:
lower bound for any separating sequence of projections of large Psi-measure

Assumption as in proposition 1 (6.1),α(0,1). Let qn be a test sequence of projectors such that Ψ(qn)1-α for n large.

Then

{limninf1nlog(Φ(qn))-s(Ψ,Φ)

Proposition 4:
Conclusion from (6.1 and 6.16.1), Stein's lemma

Considere Ψ ergodic and Φ arbitrary states on 𝒜 and assume (Ψ,Φ) is HP (2)

Then

limninf1nlog(Φ(qn))=-s(Ψ,Φ)
Show less | Show more | Hide all | Show all

6.2 from Stein's lemma to Sanov

Construction of Θ-universal sequencies of typical projections:
The main objective is the proof of formula (3). It asserts the existence of Ψ-typical test sequence of projectors for any ΨΩ which is maximally non typical with respect to Φ, in other words maximally separating.

Here is a sketch of the strategy of proof:

partition Ω by level sets of the function s(Ψ,Φ),ΨΩ (see 1).

use the Kaltchenko-Yang theorem asserting a uniform test sequence of projectores for all states with entropy below a given value s.

apply Stein's lemma to each strip. Finally one makes partioning by level sets finer and finer.

(For details see (BDKSSS 20078).

Show less | Show more | Hide all | Show all

6.3 *-mixing implies the HP-condition

-mixing implies the HP-condition. Hence Sanov's theorem holds under the condition of -mixing. This is not astonishing since -mixing states are by definition quite close to block-iid states.

The proof of the fact that -mixing implies the HP-condition is an elaboration of some results by Hiai and Petz (Hiai-Petz 1991, 8). For details see reference (Bjelakovic et al. 8).

7 proof of Sanov's theorem, the stationary case

The proof of Sanov's theorem for stationary states proceeds essentially in 2 steps. First each ΨΩ is decomposed into ergodic components Ξ:

Ψ(n)=𝒮erg(𝒜)Ξ(n)γΨ(dΞ)

Second, Sanov's theorem for ergodic states is appled to each ergodic component.

Outline of proof:
1. Ω˜:={Ξ𝒮erg(𝒜):(Ξ,Φ) is HP and s(Ξ,Φ)s¯(Ω,Φ)}:

Show less | Show more | Hide all | Show all

2. Decomposition of stationary states into ergodic states:

Ψ(n)(pn)=𝒮erg(𝒜)Ξ(n)(pn)γΨ(dΞ)
Show less | Show more | Hide all | Show all

2. Existence of optimally seperating test sequence of projectors:

Show less | Show more | Hide all | Show all

3. Uniform lower bound on Ω-typical sequences of projectors:

Show less | Show more | Hide all | Show all

8 references

K.M.R.Audenaert, M.Nussbaum, A. Szkola, F.Verstraete: Asymptotic Error Rates in Quantum Hypothesis Testing, Commun.Math.Phys. 279, 251-283 (2008)

I. Bjelakovic, T. Krüger, Ra. Siegmund-Schultze, A. Szkola, The Shannon-McMillan theorem for ergodic quantum lattice systems, Invent. Math. 155 (1), 203-222 (2004)

I. Bjelakovic, J.-D. Deuschel, T. Krüger, R. Seiler, Ra. Siegmund-Schultze, A. Szkola: Typical support and Sanov large deviations fo correlated states. Commun. Math. Phys. 279, 559-584 (2008)

F. Hiai, D. Petz, The Proper Formula for Relative Entropy and its Asymptotics in Quantum Probability, Commun. Math. Phys. 143, 99-114 (1991)

A. Kaltchenko, E. H. Yang, Universal Compression of Ergodic Quantum Sources, Quant. Inf. and Comput. 3, 359-375 (2003)

M.Nussbaum, A. Szkola; Exponential error rates in multiple state discrimination on a quantum spin chain, J.Math.Phys. 51, 072203 (2010)

D. Ruelle: Statistical Mechanics, Benjamin Inc., 1969

I.N. Sanov, On the probability of large deviations of random variables, Mat. Sbornik 42, 11-44, 1957