xiaolongy

Attempts and failures are information

Note to self on learning and taking action. I will rewrite it once I understand it more deeply, one day.

Attempt process.

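$$X_t \in \{0,1\}, \quad t = 1, 2, \dots, \qquad \mathbb{P}(X_t = 1 \mid p) = p, \quad \mathbb{P}(X_t = 0 \mid p) = 1 - p, \qquad X_1, X_2, \dots \ \text{i.i.d.}$$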
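A minimal sketch of the attempt process, assuming each attempt is an independent coin flip with a fixed success probability `p`. The function name `simulate_attempts` and the `seed` parameter are mine, for illustration only.

```python
import random
from typing import List, Optional


def simulate_attempts(p: float, n: int, seed: Optional[int] = None) -> List[int]:
    """Draw n i.i.d. Bernoulli(p) attempts: 1 = success, 0 = failure."""
    rng = random.Random(seed)  # local generator, so a seed gives a repeatable run
    # rng.random() is uniform on [0, 1), so the comparison is true with probability p
    return [1 if rng.random() < p else 0 for _ in range(n)]


if __name__ == "__main__":
    print(simulate_attempts(0.3, 10, seed=0))
```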

Aggregates after n attempts.

$$S_n = \sum_{t=1}^{n} X_t, \qquad F_n = n - S_n, \qquad \hat{p}_n = \frac{S_n}{n}.$$
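The aggregates are just counts over the attempt log. A small sketch, where `aggregates` is a hypothetical helper name and returning `None` for the empirical rate when there are no attempts yet is my own convention:

```python
from typing import List, Optional, Tuple


def aggregates(xs: List[int]) -> Tuple[int, int, Optional[float]]:
    """Return (successes S_n, failures F_n, empirical rate S_n / n)."""
    n = len(xs)
    s = sum(xs)        # S_n: number of successes
    f = n - s          # F_n: number of failures
    p_hat = s / n if n > 0 else None  # undefined before the first attempt
    return s, f, p_hat


if __name__ == "__main__":
    print(aggregates([0, 0, 1, 0, 1]))  # (2, 3, 0.4)
```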

Probability of at least one success in n attempts.

$$P_n(p) = \mathbb{P}(S_n \ge 1 \mid p) = 1 - (1 - p)^n.$$
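To get a feel for how fast this grows with the number of attempts, here is a short sketch (the function name `prob_at_least_one` is mine). All `n` attempts fail with probability `(1 - p)^n`, and the formula is the complement of that.

```python
def prob_at_least_one(p: float, n: int) -> float:
    """Probability of at least one success in n i.i.d. attempts with rate p."""
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    # Even a 10% per-attempt success rate adds up over repeated attempts.
    for n in (1, 5, 10, 20, 50):
        print(n, round(prob_at_least_one(0.1, n), 4))
```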

Bayesian update.

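$$p \sim \mathrm{Beta}(\alpha_0, \beta_0), \qquad p \mid X_{1:n} \sim \mathrm{Beta}(\alpha_0 + S_n,\ \beta_0 + F_n), \qquad \tilde{p}_n := \mathbb{E}[p \mid X_{1:n}] = \frac{\alpha_0 + S_n}{\alpha_0 + \beta_0 + n}.$$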
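The conjugate update only needs the two counts, so it fits in a few lines. A sketch under the Beta prior above; `beta_update` and `posterior_mean` are illustrative names, and the uniform prior Beta(1, 1) in the demo is an assumption of mine, not something the note fixes.

```python
from typing import List, Tuple


def beta_update(alpha0: float, beta0: float, xs: List[int]) -> Tuple[float, float]:
    """Posterior Beta parameters after observing the 0/1 attempts in xs."""
    s = sum(xs)          # successes are added to alpha
    f = len(xs) - s      # failures are added to beta
    return alpha0 + s, beta0 + f


def posterior_mean(alpha0: float, beta0: float, xs: List[int]) -> float:
    """Mean of the posterior Beta: (alpha0 + S_n) / (alpha0 + beta0 + n)."""
    a, b = beta_update(alpha0, beta0, xs)
    return a / (a + b)


if __name__ == "__main__":
    xs = [0, 0, 1, 0, 1]
    print(beta_update(1, 1, xs))      # (3, 4)
    print(posterior_mean(1, 1, xs))   # 3/7
```

Unlike the raw success fraction, the posterior mean stays strictly between 0 and 1 whenever both prior parameters are positive, so a run of pure failures never produces an estimate of exactly zero.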

Target planning.

$$q \in (0,1), \qquad n^*(p, q) := \min\{n \in \mathbb{N} : 1 - (1 - p)^n \ge q\} = \left\lceil \frac{\log(1 - q)}{\log(1 - p)} \right\rceil.$$
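A sketch of the planning formula, assuming `p` and `q` both lie strictly inside (0, 1). The name `n_star` mirrors the notation above. The two correction loops are my own guard against floating-point rounding when the log ratio lands on or near an integer.

```python
import math


def n_star(p: float, q: float) -> int:
    """Smallest n with 1 - (1 - p)**n >= q, for 0 < p < 1 and 0 < q < 1."""
    if not (0.0 < p < 1.0 and 0.0 < q < 1.0):
        raise ValueError("p and q must lie strictly between 0 and 1")
    # Closed form: ceil(log(1 - q) / log(1 - p)); log1p(-x) computes log(1 - x).
    n = max(1, math.ceil(math.log1p(-q) / math.log1p(-p)))
    # Rounding guards: step up if the target is not yet met, down if n is not minimal.
    while 1.0 - (1.0 - p) ** n < q:
        n += 1
    while n > 1 and 1.0 - (1.0 - p) ** (n - 1) >= q:
        n -= 1
    return n


if __name__ == "__main__":
    print(n_star(0.1, 0.9))    # 22
    print(n_star(0.5, 0.875))  # 3
```

For example, with a 10% success rate per attempt, 21 attempts give about an 89.1% chance of at least one success and 22 give about 90.2%, so 22 is the smallest block that reaches a 90% target.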

Plug-in rule for next block size.

$$\tilde{p}_n \in \{\hat{p}_n,\ \mathbb{E}[p \mid X_{1:n}]\}, \qquad n_{n+1}^{\mathrm{plan}} = n^*(\tilde{p}_n, q).$$
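Putting the pieces together, the plug-in rule can be sketched as below. Everything here is illustrative: `next_block_size` and the `use_posterior` flag are my names, the default Beta(1, 1) prior is an assumption, and returning `math.inf` when the plugged-in rate is zero is just one way to handle the case where the empirical rate says success never happens.

```python
import math
from typing import List, Union


def n_star(p: float, q: float) -> int:
    """Smallest n with 1 - (1 - p)**n >= q, for 0 < p < 1 and 0 < q < 1."""
    return max(1, math.ceil(math.log1p(-q) / math.log1p(-p)))


def next_block_size(xs: List[int], q: float, alpha0: float = 1.0,
                    beta0: float = 1.0, use_posterior: bool = True) -> Union[int, float]:
    """Plan the next block of attempts from the log xs and a target probability q."""
    n, s = len(xs), sum(xs)
    if use_posterior:
        p_tilde = (alpha0 + s) / (alpha0 + beta0 + n)  # posterior mean
    else:
        if n == 0:
            raise ValueError("the empirical rate needs at least one attempt")
        p_tilde = s / n                                # empirical rate
    if p_tilde <= 0.0:
        return math.inf  # estimated rate is zero: no finite block reaches q
    if p_tilde >= 1.0:
        return 1         # estimated rate is one: a single attempt suffices
    return n_star(p_tilde, q)


if __name__ == "__main__":
    log = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0]                # 2 successes in 10 attempts
    print(next_block_size(log, 0.9))                      # posterior mean 0.25 -> 9
    print(next_block_size(log, 0.9, use_posterior=False)) # empirical rate 0.2 -> 11
```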

Change-of-strategy indicator.

$$p_{\min} \in (0,1), \qquad I_n := \mathbf{1}\{\tilde{p}_n < p_{\min}\}.$$
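The indicator is a one-line threshold check. In this sketch the function name `change_strategy_indicator` is mine, and the threshold of 0.1 and the Beta(1, 1) prior in the demo are arbitrary assumptions chosen only to show both outcomes.

```python
def change_strategy_indicator(p_tilde: float, p_min: float) -> int:
    """Return 1 if the estimated success rate is below the floor p_min, else 0."""
    if not 0.0 < p_min < 1.0:
        raise ValueError("p_min must lie strictly between 0 and 1")
    return 1 if p_tilde < p_min else 0


if __name__ == "__main__":
    # Posterior means under a Beta(1, 1) prior after 10 attempts.
    ten_failures = (1 + 0) / (1 + 1 + 10)   # 1/12, about 0.083
    two_successes = (1 + 2) / (1 + 1 + 10)  # 3/12 = 0.25
    print(change_strategy_indicator(ten_failures, 0.1))   # 1: switch strategy
    print(change_strategy_indicator(two_successes, 0.1))  # 0: keep going
```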