# Lewko's blog

## The dense model theorem

Posted in expository, math.IT, math.NT by Mark Lewko on December 14, 2009

A key component in the work of Green, Tao, and Ziegler on arithmetic and polynomial progressions in the primes is the dense model theorem. Roughly speaking, this theorem allows one to model a dense subset of a sparse pseudorandom set by a dense subset of the ambient space. In the work of Green, Tao, and Ziegler this enabled them to model (essentially) the characteristic function of the set of primes with (essentially) the characteristic function of a set of integers with greater density. They were then able to obtain the existence of certain structures in the model set via Szemerédi’s theorem and its generalizations.

More recently, simplified proofs of the dense model theorem have been obtained independently by Gowers and by Reingold, Trevisan, Tulsiani, and Vadhan. In addition, the latter group has found applications of these ideas in theoretical computer science. In this post we give an expository proof of the dense model theorem, substantially following the paper of Reingold, Trevisan, Tulsiani, and Vadhan.

With the exception of the min-max theorem from game theory (which can be replaced by (or proved by) the Hahn-Banach theorem, as in Gowers’ approach) the presentation is self-contained.

(We note that the theorem, as presented below, isn’t explicitly stated in the Green-Tao paper. Roughly speaking, these ideas can be used to simplify/replace sections 7 and 8 of that paper.)

— The dense model theorem —

Let ${[N]:=\{1,2,\ldots,N\}}$. We’ll primarily be interested in certain real-valued functions on the set ${[N]}$. We define the expectation of a function ${f}$ on ${S\subseteq [N]}$ to be ${\mathop{\mathbb E}_{s\in S}f = \frac{1}{|S|} \sum_{s\in S}f(s)}$ and the inner product of ${f,g:[N]\rightarrow {\mathbb R}}$ to be ${\left< f,g \right> = \mathop{\mathbb E}_{n \in [N]} f(n)g(n)}$. We will call a non-negative real-valued function ${\mu}$ on ${[N]}$ a measure if ${\mathop{\mathbb E}_{[N]} \mu \leq 1}$. (This isn’t a measure in the analytic sense, but the terminology has become standard in the literature on the subject.)
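These definitions are mechanical to state in code; the following is a minimal sketch for small ${N}$ (the function names are ours, chosen for illustration):

```python
def expectation(f, S):
    """E_{s in S} f = (1/|S|) * (sum of f over S)."""
    return sum(f(s) for s in S) / len(S)

def inner(f, g, N):
    """<f, g> = E_{n in [N]} f(n) g(n), with [N] = {1, ..., N}."""
    return sum(f(n) * g(n) for n in range(1, N + 1)) / N

def is_measure(mu, N):
    """A non-negative function mu on [N] is a measure if E_{[N]} mu <= 1."""
    values = [mu(n) for n in range(1, N + 1)]
    return all(v >= 0 for v in values) and sum(values) / N <= 1
```

For instance, the constant function ${1}$ is a measure, while the constant function ${2}$ is not.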

We will call a function on ${[N]}$, say ${f}$, bounded if ${f:[N]\rightarrow [-1,1]}$. Analogously, a measure, say ${\mu}$, will be called bounded if ${\mu:[N]\rightarrow[0,1]}$. For a fixed finite collection of bounded functions, ${\mathcal{F}}$, on ${[N]}$, we say that two measures, say ${\mu}$ and ${\nu}$, are ${\epsilon}$-indistinguishable with respect to ${\mathcal{F}}$ if ${|\left< \mu-\nu,f \right>|\leq \epsilon}$ for every ${f \in \mathcal{F}}$. Furthermore, a measure ${\mu}$ on ${[N]}$ is said to be ${\epsilon}$-pseudorandom with respect to ${\mathcal{F}}$ if the measures ${\mu}$ and ${1_{[N]}}$ are ${\epsilon}$-indistinguishable. Here ${1_{[N]}}$ denotes the characteristic function of the set ${[N]}$.
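Both conditions reduce to computing a maximum advantage over the test class; a minimal sketch (names ours), practical only for small ${N}$ and finite ${\mathcal{F}}$:

```python
def max_advantage(mu, nu, F, N):
    """max over f in F of |<mu - nu, f>|, where <f, g> = E_{n in [N]} f(n)g(n).
    mu and nu are eps-indistinguishable w.r.t. F iff this value is <= eps."""
    def inner(g, f):
        return sum(g(n) * f(n) for n in range(1, N + 1)) / N
    return max(abs(inner(lambda n: mu(n) - nu(n), f)) for f in F)

def is_pseudorandom(mu, F, N, eps):
    """mu is eps-pseudorandom w.r.t. F iff mu and 1_[N] are eps-indistinguishable."""
    return max_advantage(mu, lambda n: 1.0, F, N) <= eps
```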

In addition to the set ${\mathcal{F}}$ we will also need to consider the larger class of functions ${\mathcal{F}^{k} := \{ \prod_{i=1}^{l} f_{i} : f_{i} \in \mathcal{F}, 0 \leq l \leq k \} }$. We can now state the dense model theorem.

Theorem 1 Fix ${\epsilon>0}$ and a finite collection ${\mathcal{F}}$ of bounded functions on ${[N]}$. There exist ${k(\epsilon)}$ and ${\epsilon'(\epsilon)}$, depending only on ${\epsilon}$, such that the following holds: if ${\mu}$ is an ${\epsilon'}$-pseudorandom measure with respect to the set ${\mathcal{F}^{k}}$ and ${\nu}$ is a measure majorized by ${\mu}$, then there exists a bounded measure ${g}$ such that ${\mathop{\mathbb E}_{n \in [N]} g(n) = \mathop{\mathbb E}_{n \in [N]} \nu(n)}$ and ${g}$ and ${\nu}$ are ${\epsilon}$-indistinguishable with respect to ${\mathcal{F}}$. (For the sake of simplicity we will not work out the dependence of ${k}$ and ${\epsilon'}$ on ${\epsilon}$; we very briefly discuss it in the remark at the end of this section.)

The thrust of the theorem is that ${\epsilon'}$ and ${k}$ depend only on ${\epsilon}$ and not ${N}$. At first the fact that ${\epsilon'}$ is used in the hypothesis of the theorem and ${\epsilon}$ in the conclusion may seem strange. In applications, however, one often wishes to find a dense (${\epsilon}$-indistinguishable) model for a measure ${\nu}$ for a prescribed ${\epsilon}$. One proceeds by locating ${\mu}$, a majorant of ${\nu}$, that is ${\epsilon'}$-indistinguishable from the measure ${1_{[N]}}$. With applications of this form in mind, the statement of the theorem may seem more natural.

We will split the proof of the theorem into several parts/lemmas. Throughout, ${G}$ will denote the set of bounded measures of expectation ${\delta=\mathop{\mathbb E}_{n \in [N]}\nu(n)}$. We’ll typically denote an element of ${G}$ by ${\phi}$.

Lemma 2 Let ${\mathcal{F}_{0}= -\mathcal{F} \cup \mathcal{F}}$, let ${\mathcal{F}_{0}^{k}= -\mathcal{F}^{k} \cup \mathcal{F}^{k}}$, and let ${\mathcal{F}_{*}}$ denote the convex hull of ${\mathcal{F}_{0}}$. Furthermore, let ${p_{\epsilon}}$ be a real-valued polynomial (depending only on ${\epsilon}$) that maps ${[-1,1]}$ to ${[0,1]}$. If there is a function of the form ${p_{\epsilon}\circ f}$ with ${f \in \mathcal{F}_{*}}$ that ${\epsilon'}$-distinguishes ${1_{[N]}}$ from ${\mu}$, then there exists a function ${f' \in \mathcal{F}_{0}^{k}}$ that distinguishes ${1_{[N]}}$ from ${\mu}$ with advantage depending only on ${\epsilon}$ and ${\epsilon'}$.

Proof: We note that ${\mu}$ and ${1_{[N]}}$ are ${\epsilon'}$-distinguishable with respect to ${\mathcal{F}^{k}}$ if and only if they are ${\epsilon'}$-distinguishable with respect to ${\mathcal{F}_{0}^{k} = -\mathcal{F}^{k} \cup \mathcal{F}^{k}}$ (this allows us to remove the absolute value from the definition of ${\epsilon}$-distinguishability given above).

Next we note that it suffices to show that ${\mu}$ and ${1_{[N]}}$ are ${\epsilon'}$-distinguishable with respect to ${\mathcal{F}_{*}}$, the convex hull of ${\mathcal{F}_{0}}$. To see this, assume that ${D=\sum w_{i}f_{i}}$ ${\epsilon'}$-distinguishes ${\mu}$ and ${1_{[N]}}$ with ${\sum w_{i}=1}$, ${w_{i}>0}$, and ${f_{i}\in \mathcal{F}_{0}}$. We then have that ${\sum w_{i} \left<\mu - 1_{[N]},f_{i} \right> > \epsilon'}$, which easily implies that ${\left<\mu - 1_{[N]},f_{i} \right> > \epsilon'}$ for some ${f_{i}}$.
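The averaging step in the last sentence is a convexity pigeonhole: a weighted average never exceeds the maximum term. A minimal sketch (the function name is ours):

```python
def convex_pigeonhole(weights, advantages, eps):
    """Given weights summing to 1 with a weighted average of the advantages
    exceeding eps, return an index whose single advantage exceeds eps.
    This works because the maximum is at least the weighted average."""
    average = sum(w * a for w, a in zip(weights, advantages))
    assert average > eps, "hypothesis of the averaging step not satisfied"
    i = max(range(len(advantages)), key=lambda j: advantages[j])
    assert advantages[i] >= average > eps
    return i
```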

Furthermore, let ${p_{\epsilon}}$ be a real-valued polynomial that depends only on ${\epsilon}$. We claim that it then suffices to show that there exists a function ${f \in \mathcal{F}_{*}}$ such that ${p_{\epsilon}\circ f}$ ${\epsilon'}$-distinguishes ${\mu}$ and ${1_{[N]}}$. To see this, set ${c(\epsilon)}$ equal to the magnitude of the largest coefficient of ${p_{\epsilon}}$ and ${k(\epsilon)}$ equal to the degree of ${p_{\epsilon}}$. Writing ${p_{\epsilon}\circ f= \sum_{i=0}^{k(\epsilon)} c_{i}f^{i}}$, we have that ${\left<\mu - 1_{[N]}, \sum c_{i}f^{i} \right> > \epsilon' }$, hence

$\displaystyle \sum c_{i} \left< \mu - 1_{[N]}, f^{i} \right> > \epsilon'$

and thus, for some ${0 \leq i \leq k(\epsilon)}$,

$\displaystyle \left<\mu - 1_{[N]},f^{i} \right> > \epsilon' / \left(c(\epsilon)( k(\epsilon)+1)\right).$

Since ${f \in \mathcal{F}_{*}}$, the power ${f^{i}}$ expands into a convex combination of elements of ${\mathcal{F}_{0}^{k}}$, so (averaging again) some single element of ${\mathcal{F}_{0}^{k}}$ achieves at least the same advantage. Since this advantage depends only on ${\epsilon}$ and ${\epsilon'}$, the proof is complete. $\Box$

It now suffices to assume that the conclusion of the theorem is false and find a function of the form ${p_{\epsilon}\circ f}$ with ${f \in \mathcal{F}_{*}}$ that ${\epsilon'}$-distinguishes ${\mu}$ from ${1_{[N]}}$; by Lemma 2 this contradicts the pseudorandomness hypothesis of the theorem.

Lemma 3 Assume that for every ${\phi \in G}$ there exists an ${f \in \mathcal{F}_{0}}$ such that ${ \left<\nu-\phi,f\right> > \epsilon}$ (in other words, assume the conclusion of the theorem is false). Then there exists a function ${F \in \mathcal{F}_{*}}$ that ${\epsilon}$-distinguishes every function ${\phi \in G}$ from ${\nu}$. This is to say

$\displaystyle \left<\nu-\phi,F \right> > \epsilon$

for every ${\phi \in G}$.

Proof: Let ${G_{0}}$ denote a finite set of bounded measures on ${[N]}$ whose convex hull is ${G}$. (To see that such a set exists, consider ${\{\delta 1_{\{n\}}(x) : n \in [N]\}}$.) Consider the ${|\mathcal{F}_{0}|\times |G_{0}|}$ matrix ${A}$ with entries ${a_{i,j}=\left<\nu-\phi_{j}, f_{i} \right>}$, where ${f_{i}}$ ranges over ${\mathcal{F}_{0}}$ and ${\phi_{j}}$ over ${G_{0}}$. By the min-max theorem, applied to the zero-sum game defined by ${A}$ with value ${\alpha}$, there exist ${\phi_{opt} \in G}$ and ${f_{opt} \in \mathcal{F}_{*}}$ such that ${\left<\nu-\phi, f_{opt} \right> \geq \alpha}$ for all ${\phi \in G}$ and ${\left<\nu - \phi_{opt}, f \right> \leq \alpha}$ for every ${f \in \mathcal{F}_{*}}$.

By the hypothesis of the lemma, for every ${\phi \in G}$ there exists ${f \in \mathcal{F}_{0}}$ such that ${ \left<\nu-\phi,f\right> > \epsilon}$. In particular, there exists an ${f \in \mathcal{F}_{0}}$ such that ${\left<\nu - \phi_{opt}, f \right> > \epsilon}$. Hence ${\alpha>\epsilon}$, and taking ${F=f_{opt}}$ completes the proof. $\Box$

We now let ${S}$ be a set of ${\lfloor \delta N \rfloor}$ elements of ${[N]}$ that maximizes the quantity ${\sum_{n\in S}F(n)}$. Additionally, let ${a}$ denote an element of the set ${[N] \setminus S}$ that maximizes ${F(a)}$, and define ${\phi_{0}(x)= 1_{S}(x) + \{\delta N\}1_{\{a\}}(x)}$, where ${\{\delta N\}}$ denotes the fractional part of ${\delta N}$. By construction ${\phi_{0} \in G}$, thus ${\left<\nu-\phi_{0},F \right> > \epsilon}$; that is, ${F}$ ${\epsilon}$-distinguishes ${\nu}$ from ${\phi_{0}}$.
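The greedy construction of ${\phi_{0}}$ is easy to carry out explicitly; a sketch with zero-based indexing (the function name is ours):

```python
import math

def phi0(F_values, delta):
    """Build phi_0 = 1_S + {delta N} 1_{a}: S holds the floor(delta*N) indices
    with the largest F-values, and a, the best remaining index, receives the
    fractional mass {delta N}.  Returns phi_0 as a list indexed by 0..N-1."""
    N = len(F_values)
    order = sorted(range(N), key=lambda n: F_values[n], reverse=True)
    size = math.floor(delta * N)
    phi = [0.0] * N
    for n in order[:size]:
        phi[n] = 1.0                           # n is in S
    if size < N:
        phi[order[size]] = delta * N - size    # the fractional part {delta N} on a
    return phi
```

By construction the expectation of the returned function is exactly ${\delta}$, so it lies in ${G}$.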

Lemma 4 Let ${F_{t}=1_{U_{t}}(x)}$ where ${U_{t} = \{n \in [N] : F(n)\geq t \}}$. Then there exists a threshold ${t \in [-1+ \epsilon/3,1]}$ such that ${\left<\nu, F_{t} \right> \geq \left<\phi_{0}, F_{t-\epsilon/3} \right> + \frac{\epsilon}{3}}$.

Proof: We have previously observed that ${\left<\nu, F\right> \geq \left<\phi_{0}, F\right> + \epsilon}$. Using the fact that ${\left<\nu, 1_{[N]} \right> = \left<\phi_{0}, 1_{[N]} \right>=\delta}$ we have that ${\left<\nu, F+1_{[N]} \right> \geq \left<\phi_{0}, F+1_{[N]}\right> + \epsilon}$. Combining this with the observation that ${F(x)+1 = \int_{-1}^{1}F_{t}(x) dt}$ we have that

$\displaystyle \int_{-1}^{1}\left<\nu, F_{t} \right>dt \geq \int_{-1}^{1}\left<\phi_{0}, F_{t}\right>dt + \epsilon.$
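The layer-cake identity ${F(x)+1 = \int_{-1}^{1}F_{t}(x)\, dt}$ used here can be checked numerically; a sketch via a midpoint Riemann sum (our discretization, purely illustrative):

```python
def layer_cake(v, steps=20000):
    """Midpoint Riemann sum for int_{-1}^{1} 1_{v >= t} dt.  For v in [-1, 1]
    the exact value is v + 1; with v = F(x) this is the identity
    F(x) + 1 = int_{-1}^{1} F_t(x) dt."""
    h = 2.0 / steps
    return sum(h for i in range(steps) if v >= -1.0 + (i + 0.5) * h)
```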

Assume that the conclusion of the lemma is false, that is, ${\left<\nu , F_{t} \right> < \left<\phi_{0}, F_{t-\epsilon/3} \right> + \frac{\epsilon}{3}}$ for every ${t \in [-1+\epsilon/3, 1]}$. We can then conclude

$\displaystyle \int_{-1}^{1}\left<\nu, F_{t} \right>dt = \int_{-1}^{-1+\epsilon/3}\left<\nu, F_{t} \right>dt + \int_{-1+\epsilon/3}^{1}\left<\nu, F_{t}\right>dt$$\displaystyle < \int_{-1}^{-1+\epsilon/3}\left<\nu, 1_{[N]} \right>dt + \int_{-1+\epsilon/3}^{1}\left<\phi_{0}, F_{t- \epsilon/3}\right>dt + \left(2-\epsilon/3 \right)\frac{\epsilon}{3}$

$\displaystyle = \delta \epsilon/3 + \left(2-\epsilon/3 \right)\frac{\epsilon}{3}+ \int_{-1}^{1-\epsilon/3}\left<\phi_{0}, F_{t}\right>dt \leq \epsilon + \int_{-1}^{1}\left<\phi_{0}, F_{t} \right> dt,$

where we used ${F_{t}\leq 1_{[N]}}$ on the first interval, the assumed inequality on the second, and then ${\delta \leq 1}$ together with the non-negativity of ${\left<\phi_{0}, F_{t}\right>}$.

This, however, contradicts the inequality

$\displaystyle \int_{-1}^{1}\left<\nu, F_{t} \right>dt \geq \int_{-1}^{1}\left<\phi_{0}, F_{t}\right>dt + \epsilon$

derived in the first paragraph of this proof. Hence the proof is complete. $\Box$

Lemma 5 We have that ${\phi_{0}(n)=1}$ for every ${n \in \text{supp}(F_{t-\epsilon/3})}$.

Proof: Recall that ${\phi_{0}(x)= 1_{S}(x) + \{\delta N\}1_{\{a\}}(x)}$. Suppose ${\phi_{0}(n_{0})<1}$ for some ${n_{0} \in \text{supp}(F_{t-\epsilon/3})}$; in particular ${n_{0} \in [N]\setminus S}$. Since ${S}$ collects ${\lfloor \delta N \rfloor}$ elements with the largest values of ${F}$ and ${a}$ maximizes ${F}$ over ${[N]\setminus S}$, every point in the support of ${\phi_{0}}$ has ${F}$-value at least ${F(n_{0}) \geq t-\epsilon/3}$; that is, ${\phi_{0}}$ vanishes identically on ${[N]\setminus \text{supp}(F_{t-\epsilon/3})}$. However, this implies that

$\displaystyle \left<\phi_{0}, F_{t-\epsilon/3} \right> = \left<\phi_{0}, 1_{[N]} \right> - O(1/N) = \delta - O(1/N).$

For large ${N}$ the ${O(1/N)}$ term is negligible, and thus

$\displaystyle \left<\phi_{0}, F_{t-\epsilon/3} \right> \approx \delta = \left<\nu, 1_{[N]} \right> \geq \left<\nu, F_{t} \right>.$

But this would contradict the conclusion of Lemma 4. $\Box$

Let us briefly summarize the strategy for completing the proof of the theorem. From the previous lemma we have ${\left<\phi_{0},F_{t-\epsilon/3}\right> = \left<1_{[N]}, F_{t-\epsilon/3}\right>}$. However, since ${F_{t-\epsilon/3}}$ distinguishes ${\phi_{0}}$ and ${\nu}$, it must also distinguish ${\nu}$ and ${1_{[N]}}$. This would contradict the hypothesis of the theorem if ${F_{t-\epsilon/3}}$ were an element of ${\mathcal{F}^{k}}$. In light of Lemma 2, it then suffices to show that ${F_{t-\epsilon/3}}$ can be approximated by a function of the form ${p_{\epsilon}\circ f}$ where ${f \in \mathcal{F}_{*}}$. For this purpose, let ${p_{\epsilon}}$ be a polynomial mapping ${[-1,1]}$ into ${[0,1]}$ and satisfying ${p_{\epsilon}(x) \in [0,\epsilon/12]}$ for ${x\in [-1,t-\epsilon/3]}$ and ${p_{\epsilon}(x) \in [1-\epsilon/12,1]}$ for ${x\in [t,1]}$. (The existence of such a polynomial follows from standard variants of Weierstrass’ approximation theorem.)
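To make the existence of such a polynomial concrete, here is one possible construction (our illustrative choice, not necessarily the one used in the literature): evaluate the Bernstein polynomial of a steep ramp placed in the middle of the transition window ${[t-\epsilon/3, t]}$. Bernstein operators send ${[0,1]}$-valued functions to ${[0,1]}$-valued polynomials, and binomial concentration makes the polynomial flat away from the ramp. The function name and parameters below are ours:

```python
import math

def step_polynomial(eps, t, n=1000):
    """Pointwise evaluator for a degree-n polynomial p on [-1, 1] with values
    in [0, 1], small on [-1, t - eps/3] and close to 1 on [t, 1].
    Construction: Bernstein polynomial of a ramp supported in the middle
    third of the transition window (in u = (x+1)/2 coordinates)."""
    lo_u = (t - eps / 3.0 + 1.0) / 2.0   # left end of transition window
    hi_u = (t + 1.0) / 2.0               # right end of transition window
    a = lo_u + (hi_u - lo_u) / 3.0       # ramp starts here
    b = hi_u - (hi_u - lo_u) / 3.0       # ramp ends here

    def ramp(u):
        return min(1.0, max(0.0, (u - a) / (b - a)))

    def p(x):
        u = (x + 1.0) / 2.0
        if u <= 0.0 or u >= 1.0:
            return ramp(u)
        # Bernstein evaluation: p(x) = E[ramp(S/n)] with S ~ Binomial(n, u)
        return sum(
            math.exp(math.lgamma(n + 1) - math.lgamma(k + 1)
                     - math.lgamma(n - k + 1)
                     + k * math.log(u) + (n - k) * math.log(1.0 - u))
            * ramp(k / n)
            for k in range(n + 1))
    return p
```

For a fixed ${\epsilon}$, a sufficiently large degree ${n}$ pushes the polynomial below ${\epsilon/12}$ on ${[-1, t-\epsilon/3]}$ and above ${1-\epsilon/12}$ on ${[t,1]}$, by Hoeffding's inequality for the binomial tail.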

Lemma 6 Let ${p_{\epsilon}}$ be as defined above. Then ${p_{\epsilon}\circ F}$ ${(\epsilon/6)}$-distinguishes ${1_{[N]}}$ from ${\mu}$.

Proof: From the definition of ${p_{\epsilon}}$ we have that

$\displaystyle F_{t}(x) - \frac{\epsilon}{12}\leq p_{\epsilon}\circ F(x) \leq F_{t-\epsilon/3}(x) + \frac{\epsilon}{12}. \ \ \ \ \ (1)$

Using the lower bound in (1), the pointwise inequality ${\nu(n) \leq \mu(n)}$, Lemma 4 and Lemma 5 (in this order) we have that

$\displaystyle \left<\mu, p_{\epsilon}\circ F(x) \right> \geq \left<\mu, F_{t} \right> - \frac{\epsilon}{12} \geq \left<\nu, F_{t} \right> - \frac{\epsilon}{12} \geq \left<\phi_{0}, F_{t-\epsilon/3} \right> + \frac{\epsilon}{4} \geq \left<1_{[N]}, F_{t-\epsilon/3}\right>+\frac{\epsilon}{4}.$

Using the upper bound from (1) we further have that

$\displaystyle \left<1_{[N]},p_{\epsilon}\circ F \right> - \frac{\epsilon}{12}\leq \left<1_{[N]}, F_{t-\epsilon/3} \right> .$

Putting these two calculations together we conclude that

$\displaystyle \left<\mu - 1_{[N]}, p_{\epsilon}\circ F \right> \geq \frac{\epsilon}{6}$

which completes the proof. $\Box$

Recalling that ${F \in \mathcal{F}_{*}}$ we may apply Lemma 2 to complete the proof of the dense model theorem.

Remark 1 We note that the only step of the proof where we haven’t explicitly recorded the relationship between ${\epsilon}$, ${\epsilon'}$ and ${k}$ is the dependence of the polynomial ${p_{\epsilon}}$ on ${\epsilon}$. Using a quantitative form of Weierstrass’ approximation theorem one can obtain ${k=O\left((1/\epsilon)^{O(1)}\right)}$ and ${\epsilon'= O(e^{-\epsilon^{-O(1)}})}$. We refer the reader to the paper of Reingold, Trevisan, Tulsiani, and Vadhan for details.

Updated 12/17/2009: Fixed formatting in proof of Lemma 6.