
Tuesday, January 3, 2023

Dini's Lemma

In our last project, we used Dini's lemma in one step of a proof. This result, although it should be part of undergraduate analysis, really feels like magic. It says: "the monotone convergence of continuous functions $\{f_n\}_{n \in \mathbb{N}}$ to a continuous limit $f$ gives locally uniform convergence."

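The monotone convergence is the key. For a decreasing sequence, we fix $N$; then for all $n > N$ we have $0 \le f_n(y) - f(y) \le f_N(y) - f(y)$. We then apply the classical triangle inequality in a $\delta$-neighborhood of $x$, for both $f$ and $f_N$:
$$f_n(y) - f(y) \le f_N(y) - f(y) \le |f_N(y) - f_N(x)| + |f_N(x) - f(x)| + |f(x) - f(y)|.$$
The middle term is small once $N$ is large, by pointwise convergence at $x$, and the two outer terms are small for $|y - x| < \delta$, by continuity of $f_N$ and $f$. Thus the information on $f_N$ and $f$ near $x$ alone controls the convergence of the whole sequence for $n \ge N$.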
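As a small numerical illustration (an example chosen here, not taken from the project): the functions $f_n(x) = x^n$ decrease to the continuous limit $0$ on $[0,1)$. The gap over a compact piece $[0,b]$ with $b<1$ tends to $0$, while over an interval reaching almost to $1$ it stays large, which is why the statement is only locally uniform there. The name `sup_gap` and the grid size are illustrative assumptions.

```python
def sup_gap(n, b, grid=1000):
    """Approximate the sup over [0, b] of f_n(x) - f(x), for f_n(x) = x**n and f = 0."""
    points = [b * k / grid for k in range(grid + 1)]
    return max(x ** n for x in points)


for n in (1, 10, 50):
    # Gap on the compact set [0, 0.9] versus on [0, 0.999], which is close to all of [0, 1)
    print(n, sup_gap(n, 0.9), sup_gap(n, 0.999))
```

On a compact set such as $[0, 0.9]$ the gap is controlled by a single $f_N$, exactly as in the triangle inequality argument.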

Sunday, August 22, 2021

Connection probability on oriented percolation does not depend on root

It has been a long time since I last updated my blog. This time, I would like to talk once again about a question from the Alibaba competition (2021), as I did not solve it that day.

Question: On a general connected graph $G=(V,E)$, every bond contains two oriented edges, and we put an oriented percolation model on it. We choose a vertex as the root. Prove that the probability that there is an oriented path from every vertex to the root is independent of the choice of the root.

It is natural to think of a bijection, but it is not easy to find one. During the exam, I only treated the very simple case where $G$ is a tree. However, this case is still useful, because we will see an argument that combines the two ideas.

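We denote by $h(a) = \mathbb{P}_p[G \to a]$ the connection probability, that is, the probability that every vertex of $G$ has an oriented path to $a$. Since $G$ is connected, it suffices to prove that $h(a) = h(b)$ for every pair of neighbors $a \sim b$. We prove it by induction on the number of bonds $|E|$. The base case is the tree, which we have already proved, so we only need to carry out the induction step.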

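For every configuration $\omega$, we define
$$\mathrm{Pivot}(\omega, a, \overrightarrow{ba}) := \left\{ v \in V : v \to a \text{ when } \omega(\overrightarrow{ba}) = 1, \ v \not\to a \text{ otherwise} \right\}.$$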
Then we treat two cases:

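Case 1: $\omega \in \{G \to a\}$, $\mathrm{Pivot}(\omega, a, \overrightarrow{ba}) = \emptyset$. In this case, $\omega \setminus \overrightarrow{ba}$ can be thought of as a connection configuration in $(V, E \setminus \{a,b\})$. Then by induction, the probability in this case does not depend on the choice between $a$ and $b$: we add one oriented edge, and we can flip it.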

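Case 2: $\omega \in \{G \to a\}$, $\mathrm{Pivot}(\omega, a, \overrightarrow{ba}) \neq \emptyset$. This implies that $\omega(\overrightarrow{ba}) = 1$ and $b \in \mathrm{Pivot}(\omega, a, \overrightarrow{ba})$. We should add one more restriction, $a \not\to b$; otherwise the configuration belongs to the part common to both events. Then we flip the orientation of $\overrightarrow{ba}$, and one can check that
$$\mathbb{P}_p\left[\omega \in \{G \to a\} \cap \{a \not\to b\}, \ b \in \mathrm{Pivot}(\omega, a, \overrightarrow{ba})\right] = \mathbb{P}_p\left[\omega \in \{G \to b\} \cap \{b \not\to a\}, \ a \in \mathrm{Pivot}(\omega, b, \overrightarrow{ab})\right].$$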

With these arguments, we finish the induction step and establish the result for a general graph $G$.
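The statement can be checked by brute force on a small graph. The sketch below assumes that each oriented edge is open independently with probability $p$ (my reading of the model), enumerates all configurations, and sums the weights of those in which every vertex reaches the root. The function name and the example graph (a 4-cycle with one chord, whose vertices do not all play the same role) are illustrative choices.

```python
from itertools import product


def root_connection_probability(vertices, bonds, root, p):
    """P[every vertex has an open oriented path to root], by full enumeration.

    Assumption: each bond {u, v} carries the two oriented edges (u, v) and (v, u),
    each open independently with probability p.
    """
    oriented = [(u, v) for u, v in bonds] + [(v, u) for u, v in bonds]
    total = 0.0
    for state in product((0, 1), repeat=len(oriented)):
        open_edges = [e for e, s in zip(oriented, state) if s]
        # Backward search from the root: collect every u with an open path u -> root
        reached = {root}
        stack = [root]
        while stack:
            v = stack.pop()
            for u, w in open_edges:
                if w == v and u not in reached:
                    reached.add(u)
                    stack.append(u)
        if len(reached) == len(vertices):
            k = sum(state)
            total += p ** k * (1 - p) ** (len(oriented) - k)
    return total


vertices = [0, 1, 2, 3]
bonds = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]  # 4-cycle plus the chord {0, 2}
values = [root_connection_probability(vertices, bonds, r, 0.3) for r in vertices]
print(values)
```

The enumeration costs $2^{2|E|}$ configurations, so it is only a sanity check for small graphs, not a substitute for the induction.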

Monday, November 9, 2020

A small theorem of Boolean approximation

I would like to record a theorem I learned in the Alibaba competition this year. In fact, this problem has also appeared in the maths competition at Peking University for high school students. Let $(a_i)_{1 \le i \le n}$ be a positive sequence such that $\sum_{i=1}^n a_i = a$ and $\sum_{i=1}^n (a_i)^2 = 1$. Then we can always find a configuration $\epsilon_i \in \{\pm 1\}$ such that $\left|\sum_{i=1}^n \epsilon_i a_i\right| \le \frac{1}{a}$.

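Let us see what this theorem tells us. If all the numbers are equal, then $a = \sqrt{n}$ and every $a_i = \frac{1}{\sqrt{n}}$ (by the inequality between the arithmetic and quadratic means, this is the largest possible value of $a$). Then by choosing the best Boolean approximation $\sum_{i=1}^n \epsilon_i a_i$, one gets a number very close to $0$, with an error of at most $\frac{1}{\sqrt{n}}$, which is the size of one term.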

This makes us think of concentration inequalities in probability, like the Markov inequality, the Hoeffding inequality, etc. That is what I thought of during the Alibaba competition, but I have to say this is a very dangerous trap in this question. Even after the competition, I kept trying this idea many times, but there seems to be no easy way to make it work. Let us recall what a concentration inequality teaches us: if we take the $\epsilon_i$ to be independent centered $\pm 1$ variables, the measure should be concentrated, and Hoeffding's inequality gives
$$\mathbb{P}\left[\left|\sum_{i=1}^n \epsilon_i a_i\right| \ge \frac{1}{a}\right] \le 2\exp\left(-\frac{1}{2a^2}\right).$$
Good, but since $a \ge 1$, the right-hand side exceeds $1$: the bound blows up and gives no useful information. Indeed, a concentration inequality always tells us that the measure lives in the $\sigma$ region, but it says nothing about how well the measure is concentrated inside the $\sigma$ region. In this question, a random choice is clearly not good, because the $\sigma$ of $\sum_{i=1}^n \epsilon_i a_i$ is $1$.

One has to keep in mind that the probabilistic method is good and cool, but not the only way and sometimes not the best way.

A simple solution is just by induction and one-step exploration, which some people call the greedy method. The problem is equivalent to proving
$$\left(\min_{\epsilon_i \in \{\pm 1\}} \left|\sum_{i=1}^n \epsilon_i a_i\right|\right)\left(\sum_{i=1}^n a_i\right) \le \sum_{i=1}^n (a_i)^2.$$
We first do the optimization for $n$ variables and then choose the sign of the last one. We can also manipulate the choice of the last variable: for example, one can let $a_{n+1}$ always be the smallest one, so that its influence is always smaller than $\frac{1}{a}$. Once we get the correct direction, the theorem is not difficult.
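One possible reading of the greedy method can be sketched as follows: process the terms from largest to smallest, so that the newly added term is always the smallest so far, and choose each sign to bring the running sum closer to $0$. The check uses the homogeneous form of the inequality, so the inputs do not need to be normalized. The function names are illustrative, and this is a numerical sanity check rather than a proof.

```python
import random


def greedy_signs(a):
    """Process terms from largest to smallest, always choosing the sign
    that brings the running sum closer to 0."""
    order = sorted(range(len(a)), key=lambda i: -a[i])
    signs = [0] * len(a)
    running = 0.0
    for i in order:
        signs[i] = -1 if running > 0 else 1
        running += signs[i] * a[i]
    return signs


def satisfies_bound(a, tol=1e-9):
    """Check |sum eps_i a_i| * sum a_i <= sum a_i^2 for the greedy signs."""
    signs = greedy_signs(a)
    s = abs(sum(e * x for e, x in zip(signs, a)))
    return s * sum(a) <= sum(x * x for x in a) + tol


random.seed(0)
samples = [[random.uniform(0.01, 1.0) for _ in range(random.randint(1, 20))]
           for _ in range(200)]
print(all(satisfies_bound(a) for a in samples))
```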

Thursday, July 30, 2020

Variational arguments are everywhere

Today I was asked a high school level question: given $1 \ge p_1 \ge p_2 \ge \cdots \ge p_n \ge 0$, find a subset $S = \{p_{\alpha_1}, \dots, p_{\alpha_{|S|}}\}$ of these numbers that maximizes the quantity
$$F(S) = \sum_{i=1}^{|S|} \frac{p_{\alpha_i}}{1 - p_{\alpha_i}} \left(\prod_{j=1}^{|S|} (1 - p_{\alpha_j})\right).$$
If one uses a naive search, the complexity is of course exponential, since there are $2^n$ subsets. I want to say that a variational argument, although simple, is useful. By considering adding one more element, deleting one element, or replacing one element, one quickly sees that the optimal subset is described by a threshold $\tau$:
$$S := \{p_i\}_{1 \le i \le \tau}, \qquad \tau := \min\left\{m \in \mathbb{N} : \sum_{i=1}^{m} \frac{p_i}{1 - p_i} \ge 1\right\}.$$
Thus the complexity is reduced to $O(n)$. So let us always think about the variational argument.
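The threshold rule can be compared with an exhaustive search on small inputs. In the sketch below, `F` evaluates the quantity in its expanded form $\sum_i p_i \prod_{j \neq i} (1 - p_j)$, which avoids dividing by $1 - p_i$; when the sum of $p_i / (1 - p_i)$ never reaches $1$, all the elements are taken. The function names and the sample values are illustrative assumptions.

```python
from itertools import combinations
from math import prod


def F(subset):
    """Sum over i of p_i * prod_{j != i} (1 - p_j), the expanded form of F(S)."""
    return sum(p * prod(1 - q for j, q in enumerate(subset) if j != i)
               for i, p in enumerate(subset))


def threshold_subset(p):
    """Take the largest values until the sum of p_i / (1 - p_i) reaches 1."""
    chosen, odds = [], 0.0
    for x in sorted(p, reverse=True):
        if odds >= 1:
            break
        chosen.append(x)
        if x >= 1:  # p_i = 1 has infinite odds: stop immediately
            break
        odds += x / (1 - x)
    return chosen


def brute_force_max(p):
    """Maximum of F over all non-empty subsets (exponential cost, small n only)."""
    return max(F(c) for k in range(1, len(p) + 1) for c in combinations(p, k))


for p in ([0.8, 0.5, 0.3, 0.1], [0.3, 0.3, 0.2, 0.1], [0.2, 0.1, 0.05]):
    print(threshold_subset(p), F(threshold_subset(p)), brute_force_max(p))
```

The scan itself is linear once the numbers are sorted, as they are in the statement of the question.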

Sunday, July 5, 2020

Optimization from random constraints

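This week I heard a very nice talk by my classmate Huajie Qian. The question can be stated as solving $\min_x c \cdot x$ under the constraint $h(x, \zeta) \le 0$, where $\zeta$ is a random variable. A very naive method is the following: we have data $\{\zeta_i\}_{1 \le i \le n}$, and we solve the problem so that these $n$ sampled constraints are satisfied. In the end, with $x$ the solution computed from the samples and $d$ the dimension of the space, it gives the estimate
$$\mathbb{P}_{\{\zeta_i\}_{1 \le i \le n}}\left[\mathbb{P}_{\zeta}\big(h(x, \zeta) \le 0\big) \le 1 - \alpha\right] \le \binom{n}{d} (1 - \alpha)^{n - d},$$
which is a very nice result.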
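To get a feeling for the numbers, here is a minimal sketch that evaluates the right-hand side $\binom{n}{d}(1-\alpha)^{n-d}$ of the estimate. The function name `violation_bound` and the sample values of $n$, $d$, $\alpha$ are illustrative assumptions.

```python
from math import comb


def violation_bound(n, d, alpha):
    """Right-hand side of the estimate: binom(n, d) * (1 - alpha)**(n - d)."""
    return comb(n, d) * (1 - alpha) ** (n - d)


for n in (50, 100, 200, 400):
    # The bound only becomes informative (below 1) once n is large compared with d / alpha
    print(n, violation_bound(n, 3, 0.1))
```

The polynomial factor $\binom{n}{d}$ is eventually beaten by the geometric factor $(1-\alpha)^{n-d}$, so the bound decays as the number of samples grows.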

The proof of the result above relies on a very nice lemma: one can find $d$ constraints among the $n$ that already determine the solution. That is to say, only $d$ constraints matter.

To prove it, we need two theorems: Radon's theorem and Helly's theorem. The first one says that in the space $\mathbb{R}^d$, any $(d+2)$ points can be partitioned into two sets $I_1, I_2$ whose convex hulls have a non-empty intersection. The second one says that for convex sets $\{C_i\}_{1 \le i \le m}$ in $\mathbb{R}^d$, if any $(d+1)$ of them have a non-empty intersection, then $\bigcap_{i \le m} C_i$ is non-empty.

Using the two theorems, we suppose for contradiction that any choice of $d$ constraints always gives a strictly better minimiser. We then project onto the hyperplane given by the direction $c$ and apply Helly's theorem there to reach a contradiction.

Saturday, July 4, 2020

Maximal correlation

This is a question about correlation. Let $X, Y$ be two Gaussian random variables with law $\mathcal{N}(0,1)$ and correlation $\beta$. Prove that the best constant in the Cauchy inequality is $|\beta|$, that is,
$$\mathrm{Cov}(f(X), g(Y)) \le |\beta| \sqrt{\mathrm{Var}(f(X)) \, \mathrm{Var}(g(Y))}.$$

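In fact, one can define the maximal correlation of two random variables as the best constant above, and of course it is at least $|\beta|$ (take $f$ and $g$ linear). Let us remark how to prove the inequality above quickly. We can use the expansion in Hermite polynomials $(H_n)_{n \ge 0}$, normalized so that $\mathbb{E}[H_n(X)^2] = n!$, for which we have
$$\mathbb{E}[H_n(X) H_m(Y)] = \delta_{n,m} \, n! \, (\mathbb{E}[XY])^n = \delta_{n,m} \, n! \, \beta^n.$$
A centered $L^2$ function has zero projection on $H_0$. Writing $\langle f, H_n \rangle := \mathbb{E}[f(X) H_n(X)]$, for centered $f(X)$ and $g(Y)$ we then have, using $|\beta|^n \le |\beta|$ for $n \ge 1$ and the Cauchy-Schwarz inequality,
$$\mathbb{E}[f(X) g(Y)] = \sum_{n=1}^{\infty} \langle f, H_n \rangle \langle g, H_n \rangle \frac{1}{n!} \beta^n \le |\beta| \sqrt{\sum_{n=1}^{\infty} \langle f, H_n \rangle^2 \frac{1}{n!}} \sqrt{\sum_{n=1}^{\infty} \langle g, H_n \rangle^2 \frac{1}{n!}} = |\beta| \sqrt{\mathrm{Var}(f(X)) \, \mathrm{Var}(g(Y))}.$$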
This concludes the proof.

Friday, June 26, 2020

Strong Markov property for Brownian Bridge

The strong Markov property for Brownian motion is well known, and it also holds naturally for the Brownian bridge. In fact, since the Brownian bridge ending at $(T, 0)$ can be thought of as a Brownian motion conditioned on its endpoint, or as a perturbation of a linear interpolation, the property is natural when we restart from an intermediate point.

To prove it rigorously, for example the Markov property for the Brownian bridge, we have to do some calculation. Let $(W_t)_{t \ge 0}$ be the standard Brownian motion started from $0$, and let $(B_t)_{s \le t \le T}$ be defined as
$$B_t = x + W_t - W_s - \frac{t - s}{T - s}\left(x + W_T - W_s\right),$$
a Brownian bridge on $[s, T]$ whose value is $x$ at time $s$ and $0$ at time $T$. One way to see this formula is that the term $x + W_t - W_s$ is the Brownian motion, from which we have to subtract a correction so that the endpoint at $T$ is $0$. A simple calculation shows that it is equal to
$$B_t = \frac{T - t}{T - s}\left(x + W_t - W_s\right) - \frac{t - s}{T - s}\left(W_T - W_t\right).$$

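The Markov property is very simple but requires a calculation: we would like to show that for $s < r < t < T$ we have
$$B_t = B_r + W_t - W_r - \frac{t - r}{T - r}\left(B_r + W_T - W_r\right). \qquad (*)$$
Now we prove it. An intermediate step tells us that
$$\frac{t - r}{T - r}\left(B_r + W_T - W_r\right) = \frac{t - r}{T - s}\left(x + W_T - W_s\right),$$
and we put it into the formula to get
$$\begin{aligned}
\mathrm{RHS}(*) &= x + W_r - W_s - \frac{r - s}{T - s}\left(x + W_T - W_s\right) + W_t - W_r - \frac{t - r}{T - s}\left(x + W_T - W_s\right) \\
&= x + W_t - W_s - \frac{r - s}{T - s}\left(x + W_T - W_s\right) - \frac{t - r}{T - s}\left(x + W_T - W_s\right) \\
&= x + W_t - W_s - \frac{t - s}{T - s}\left(x + W_T - W_s\right).
\end{aligned}$$
This proves the Markov property. The strong Markov property then follows by approximation and the regularity of the trajectory.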
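The restart identity for $s < r < t < T$ is purely algebraic in the values of $W$ at the four times, so it can be sanity-checked numerically with arbitrary values. In the sketch below, the helper `bridge`, the chosen times, and the starting value $x$ are illustrative assumptions.

```python
import random


def bridge(t, s, T, x, W):
    """B_t = x + W_t - W_s - (t - s)/(T - s) * (x + W_T - W_s); W maps a time to a value."""
    return x + W[t] - W[s] - (t - s) / (T - s) * (x + W[T] - W[s])


random.seed(0)
s, r, t, T, x = 0.5, 1.0, 1.7, 3.0, 2.0
# Any values of W at the four times will do: the identity is algebraic
W = {u: random.gauss(0.0, 1.0) for u in (s, r, t, T)}

B_r = bridge(r, s, T, x, W)
lhs = bridge(t, s, T, x, W)  # bridge started from (s, x), evaluated at time t
rhs = B_r + W[t] - W[r] - (t - r) / (T - r) * (B_r + W[T] - W[r])  # restarted from (r, B_r)
print(abs(lhs - rhs))
```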