Skip to Content
TechnicalProbability notes from set theory to more advanced

Probability notes from set theory to more advanced

Set theory

  • Universal set (sample space) Ω\Omega: all possible outcomes.
  • Event: any subset AΩA \subseteq \Omega.
  • Empty event \emptyset: never occurs.
    Certain event Ω\Omega: always occurs.
  • Set operations:
    • Union: ABA \cup B — “AA or BB happens.”
    • Intersection: ABA \cap B — “AA and BB happen.”
    • Complement: Ac=ΩAA^c = \Omega \setminus A — “AA does not happen.”
    • Difference: AB=ABcA \setminus B = A \cap B^c.
  • Disjoint (mutually exclusive): AB=A \cap B = \emptyset.

These operations satisfy Boolean algebra rules:

A(BC)=(AB)(AC),(Ac)c=A.A \cup (B \cap C) = (A \cup B) \cap (A \cup C), \quad (A^c)^c = A.

Sigma-algebra

collection of measurable events To define probabilities consistently, we use a σ-algebra F\mathcal{F} over Ω\Omega:

  1. ΩF\Omega \in \mathcal{F}.
  2. If AFA \in \mathcal{F}, then AcFA^c \in \mathcal{F}.
  3. If A1,A2,FA_1, A_2, \ldots \in \mathcal{F}, then iAiF\bigcup_i A_i \in \mathcal{F}.

This ensures we can take complements and countable unions safely.

Basic probability spaces

  • Sample space: Ω\Omega (all outcomes). An event AΩA\subseteq\Omega.
  • Probability measure Pr[]\Pr[\cdot] satisfies 0Pr[A]10\le\Pr[A]\le1, Pr[Ω]=1\Pr[\Omega]=1, countable additivity.
  • Example: fair coin Ω={H,T}\Omega=\{H,T\}, Pr[H]=Pr[T]=1/2\Pr[H]=\Pr[T]=1/2.

Conditional probability & product rule

  • Conditional probability of AA given BB (with Pr[B]>0\Pr[B]>0): Pr[AB]=Pr[AB]Pr[B].\Pr[A\mid B]=\frac{\Pr[A\cap B]}{\Pr[B]}.
  • Product rule (equivalent): Pr[AB]=Pr[AB]Pr[B]=Pr[BA]Pr[A].\Pr[A\cap B]=\Pr[A\mid B]\Pr[B]=\Pr[B\mid A]\Pr[A].

Independence

  • AA and BB are independent iff Pr[AB]=Pr[A]Pr[B],\Pr[A\cap B]=\Pr[A]\Pr[B], equivalently Pr[AB]=Pr[A]\Pr[A\mid B]=\Pr[A] (when Pr[B]>0\Pr[B]>0).
  • For random variables X,YX,Y: independence means events {XI}\{X\in I\} and {YJ}\{Y\in J\} independent for all measurable I,JI,J.

Law of total probability

  • If B1,,BnB_1,\dots,B_n partition Ω\Omega (mutually disjoint, iBi=Ω\bigcup_i B_i=\Omega) then for any event AA: Pr[A]=i=1nPr[ABi]Pr[Bi].\Pr[A]=\sum_{i=1}^n \Pr[A\mid B_i]\Pr[B_i].
  • Useful to expand Pr[A]\Pr[A] by conditioning on cases.

Expectation and indicator variables

  • Indicator of event AA: 1A(ω)=11_A(\omega)=1 if ωA\omega\in A, else 00.
  • Expectation of indicator: E[1A]=Pr[A]\mathbb{E}[1_A]=\Pr[A].
  • Linearity: E[iXi]=iE[Xi]\mathbb{E}[\sum_i X_i]=\sum_i\mathbb{E}[X_i] (no independence required).
  • For a discrete r.v. XX: E[X]=xxPr[X=x]\mathbb{E}[X]=\sum_x x\Pr[X=x].

Useful identities

  • Pr[A]=E[1A]\Pr[A]=\mathbb{E}[1_A].
  • Pr[AB]=Pr[A]+Pr[B]Pr[AB]\Pr[A\cup B]=\Pr[A]+\Pr[B]-\Pr[A\cap B].
  • If BB is a Bernoulli(pp) bit independent of other randomness: Pr[A]=Pr[AB=0]Pr[B=0]+Pr[AB=1]Pr[B=1]\Pr[A]=\Pr[A\mid B=0]\Pr[B=0]+\Pr[A\mid B=1]\Pr[B=1] (special case of total law).

Short worked example from our paper (the paper’s setup)

Arthur chooses b{0,1}b\in\{0,1\} uniformly. If b=0b=0 he samples x0ukx_0\sim u_k; if b=1b=1 he samples x1px_1\sim p. He sends xbx_b to Merlin, who outputs b^=1Sp(xb)\hat b=1_{S_p}(x_b) where Sp={x:p(x)1/k}S_p=\{x:p(x)\ge 1/k\}. Arthur accepts iff b^=b\hat b=b.

Compute Pr[b^=b]\Pr[\hat b=b]:

  1. Condition on bb (law of total probability): Pr[b^=b]=Pr[b^=bb=0]Pr[b=0]+Pr[b^=bb=1]Pr[b=1].\Pr[\hat b=b]=\Pr[\hat b=b\mid b=0]\Pr[b=0]+\Pr[\hat b=b\mid b=1]\Pr[b=1].
  2. Evaluate each term:
    • If b=0b=0: Merlin is correct iff x0Spx_0\notin S_p, so Pr[b^=bb=0]=uk(Spc)=1uk(Sp)\Pr[\hat b=b\mid b=0]=u_k(S_p^c)=1-u_k(S_p).
    • If b=1b=1: Merlin is correct iff x1Spx_1\in S_p, so Pr[b^=bb=1]=p(Sp)\Pr[\hat b=b\mid b=1]=p(S_p).
  3. Since Pr[b=0]=Pr[b=1]=1/2\Pr[b=0]=\Pr[b=1]=1/2, Pr[b^=b]=12(1uk(Sp)+p(Sp))=12(1+(p(Sp)uk(Sp))).\Pr[\hat b=b]=\tfrac12\big(1-u_k(S_p)+p(S_p)\big)=\tfrac12\big(1 + (p(S_p)-u_k(S_p))\big).
  4. Using p(Sp)uk(Sp)=dTV(p,uk)p(S_p)-u_k(S_p)=d_{TV}(p,u_k) (Scheffé set), if dTV(p,uk)ε2d_{TV}(p,u_k)\ge\varepsilon_2 then Pr[b^=b]12(1+ε2).\Pr[\hat b=b]\ge \tfrac12(1+\varepsilon_2).
Last updated on