Entropy: H(X) = \mathbb{E}[-\log p(X)]
Uniform distribution (highest entropy): each outcome needs the most bits to encode, H(X)=\log_2 N for N equally likely outcomes
Low entropy: p(0)=0.9,\quad p(1)=0.1 \implies H(X)= -0.9\log_2(0.9) - 0.1\log_2(0.1) \approx 0.469 \text{ bits}
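Quick check of the two entropy examples above. This is a minimal Python sketch; `entropy_bits` is a hypothetical helper name, not from any library, and it assumes the probabilities are given as a list that sums to 1.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits: H(X) = E[-log2 p(X)]."""
    # Outcomes with p = 0 are skipped: p * log p tends to 0 as p -> 0.
    return sum(-p * math.log2(p) for p in probs if p > 0)

uniform = [1 / 8] * 8   # N = 8 equally likely outcomes
skewed = [0.9, 0.1]     # the low-entropy example

print(entropy_bits(uniform))           # 3.0, i.e. log2(8)
print(round(entropy_bits(skewed), 3))  # 0.469
```

The skewed distribution needs fewer than half a bit per outcome on average, versus 3 bits for the uniform one over 8 outcomes.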
Objective: anonymizing a dataset is not enough, because people can still infer sensitive information from it, so we need a proof of security.
Safety: e.g. memory leak
Isolation: e.g. sandbox
Information Flow: e.g. prove no info leak
Privacy: e.g. release a portion of data but don't reveal too much
Authorization / Trust: e.g. cookie, password, ssh
Static Analysis: proof written down before the program runs
Dynamic Analysis: guarantee enforced at runtime
Theorem Proving: e.g. Z3
Tradeoff: more security sometimes means less utility (e.g. Netflix dataset)
Table of Contents