Monday, November 9, 2020

A small theorem of Boolean approximation

I would like to record a theorem I learned in this year's Alibaba competition. In fact, this problem has also appeared in the Peking University mathematics competition for high-school students. Let $(a_i)_{1 \leq i \leq n}$ be a positive sequence such that $\sum_{i=1}^n a_i = a$ and $\sum_{i=1}^n (a_i)^2 = 1$. Then we can always find a configuration of signs $\epsilon_i \in \{\pm 1\}$ such that $$\left\vert \sum_{i=1}^n \epsilon_i a_i \right\vert \leq \frac{1}{a}.$$
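As a quick numerical sanity check (not part of any proof, and the helper name is my own), the following Python sketch normalizes random positive sequences so that $\sum_{i=1}^n (a_i)^2 = 1$, brute-forces all sign configurations, and verifies the bound $\frac{1}{a}$:

```python
import itertools
import math
import random

def best_signed_sum(a):
    """Smallest |sum of eps_i * a_i| over all 2^n sign configurations."""
    return min(abs(sum(e * x for e, x in zip(eps, a)))
               for eps in itertools.product((1, -1), repeat=len(a)))

random.seed(0)
for _ in range(100):
    n = random.randint(1, 12)
    a = [random.random() + 1e-9 for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in a))
    a = [x / norm for x in a]   # now the sum of a_i^2 equals 1
    s = sum(a)                  # this is the quantity "a" of the theorem
    assert best_signed_sum(a) <= 1 / s + 1e-12
print("bound verified on random examples")
```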

Let us see what this theorem tells us. By the Cauchy-Schwarz inequality, $a \leq \sqrt{n}$, with equality exactly when every $a_i = \frac{1}{\sqrt{n}}$, so that $a = \sqrt{n}$. In that extremal case, by choosing the best Boolean approximation $\sum_{i=1}^n \epsilon_i a_i$, one can get a number very close to $0$, with an error at most $\frac{1}{\sqrt{n}}$ --- that is, the size of a single term.
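For completeness, the bound $a \leq \sqrt{n}$ is a one-line Cauchy-Schwarz computation: $$a = \sum_{i=1}^n a_i \cdot 1 \leq \left(\sum_{i=1}^n (a_i)^2\right)^{1/2} \left(\sum_{i=1}^n 1\right)^{1/2} = \sqrt{n},$$ with equality precisely when all the $a_i$ are equal.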

This makes one think of the concentration inequalities of probability --- Markov's inequality, Hoeffding's inequality, etc. That was my reaction during the Alibaba competition, but I have to say this is a very dangerous trap in this question. In fact, even after the competition I kept trying this idea, but there seems to be no easy way to make it work. Let us recall what a concentration inequality actually tells us: yes, if we take the $\epsilon_i$ to be independent centered signs, the sum concentrates, and Hoeffding's inequality gives $$\mathbb{P}\left[\left\vert \sum_{i=1}^n \epsilon_i a_i \right\vert \geq \frac{1}{a} \right] \leq 2\exp\left(-\frac{1}{2a^2}\right).$$
As $a$ grows this right-hand side tends to $2$, so the bound gives no useful information. Indeed, a concentration inequality only tells us that the measure lives in a window of size $\sigma$; it says nothing about how finely the measure is concentrated inside that window. In this question a random choice of signs is clearly not good enough, because the standard deviation of $\sum_{i=1}^n \epsilon_i a_i$ is $1$, while we need an error of order $\frac{1}{a}$.
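To see this numerically (a small sketch of my own, using the equal-weight example $a_i = \frac{1}{\sqrt{n}}$, for which the target error is $\frac{1}{a} = \frac{1}{\sqrt{n}}$):

```python
import math
import random
import statistics

n = 10_000
a = [1 / math.sqrt(n)] * n      # equal weights, sum of squares = 1
target = 1 / sum(a)             # = 1 / sqrt(n), the error the theorem guarantees

random.seed(0)
samples = [abs(sum(random.choice((1, -1)) * x for x in a)) for _ in range(200)]

print(f"target error 1/a           : {target:.4f}")
print(f"typical |sum|, random signs: {statistics.mean(samples):.4f}")
# A uniformly random sign choice stays at order 1 (the standard deviation),
# far above the 1/a = 0.01 guaranteed by a good choice of signs.
```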

One has to keep in mind that the probabilistic method is good and cool, but it is not the only way, and sometimes not the best one.

A simple solution goes by induction and one-step exploration, or what some would call a greedy method. Since $\sum_{i=1}^n (a_i)^2 = 1$, the problem is equivalent to proving $$\left(\min_{\epsilon_i \in \{\pm 1\}} \left\vert \sum_{i=1}^n \epsilon_i a_i \right\vert \right) \left(\sum_{i=1}^n a_i\right) \leq \sum_{i=1}^n (a_i)^2.$$ We first do the optimization over $n$ variables and then choose the sign of the last one. We can also manipulate which variable is treated last: for example, one can always let $a_{n+1}$ be the smallest term, so that its influence is never larger than $\frac{1}{a}$ (indeed $a_{n+1} \sum_{i} a_i \leq \sum_{i} (a_i)^2 = 1$, because $a_{n+1} \leq a_i$ for every $i$). Once we find the correct direction, the theorem is not difficult.
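Unrolling that induction gives a concrete greedy procedure, which one can also test numerically. Here is a minimal sketch in Python (the function name and the random check are mine): sort the terms in decreasing order and choose each sign against the current partial sum; the resulting sum satisfies $\vert \sum_i \epsilon_i a_i \vert \cdot \sum_i a_i \leq \sum_i (a_i)^2$.

```python
import math
import random

def greedy_signs(a):
    """Process the terms in decreasing order, choosing each sign to pull
    the running partial sum back toward 0.  Unrolled from the induction:
    |final sum| * sum(a_i) <= sum(a_i^2)."""
    order = sorted(range(len(a)), key=lambda i: a[i], reverse=True)
    eps = [0] * len(a)
    s = 0.0
    for i in order:
        eps[i] = -1 if s > 0 else 1   # oppose the current partial sum
        s += eps[i] * a[i]
    return eps, s

random.seed(1)
for _ in range(1000):
    n = random.randint(1, 50)
    a = [random.random() + 1e-9 for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in a))
    a = [x / norm for x in a]         # normalize so that the sum of a_i^2 is 1
    _, s = greedy_signs(a)
    assert abs(s) <= 1 / sum(a) + 1e-12
print("greedy choice stays within 1/a on random examples")
```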
