Bite-Sized Mathematics for Computer Science

Notes on MIT's 6.042J

Eda Eren
Table of Contents

Introduction

Prerequisites & Resources

Unit 1: Proofs

00. Introduction to Proofs

01. (Two) Proof Methods

02. The Well Ordering Principle

03. Logic & Propositions

04. Quantifiers and Predicate Logic

05. Sets

06. Binary Relations

07. Induction

08. State Machines — Invariants

09. Recursive Definitions

10. Infinite Sets

Unit 2: Structures

11. GCDs

12. Congruences

13. Euler's Theorem

14. RSA Encryption

15. Digraphs: Walks and Paths

16. Directed Acyclic Graphs

17. Partial Orders and Equivalence

18. Degrees & Isomorphism

19. Coloring & Connectivity

20. Trees

21. Stable Matching

Unit 3: Counting

22. Sums & Products

23. Asymptotics

24. Counting with Bijections

25. Repetitions & Binomial Theorem

26. Pigeonhole Principle, Inclusion-Exclusion

Unit 4: Probability

27. Intro to Discrete Probability

28. Conditional Probability

29. Independence & Causality

30. Random Variables, Density Functions

31. Expectation

32. Deviation: Markov & Chebyshev Bounds

33. Sampling & Confidence

34. Random Walks & PageRank

Independence & Causality

Independent events have nothing to do with each other: whether one occurs tells you nothing about whether the other does.

One common misconception comes up when flipping a coin: if you got tails the first five times, are you more likely to get tails the sixth time?
Well, the answer is: of course not. Each flip is an independent event.

If it's hard to see why, imagine that after you flipped the coin the fifth time, you put it in a box and don't open that box for years. After a long time, you finally open the box and flip the coin. Now, does it make sense that tails is more likely? Hopefully not. The coin doesn't remember anything; you can try asking it, although that would be weird.
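If you'd rather check this empirically, here is a minimal simulation sketch in Python (the sample count of 10,000 is an arbitrary choice): it keeps only the runs that start with five tails and tallies how the sixth flip turns out.

```python
import random

random.seed(42)  # fixed seed so the run is reproducible

# Collect the sixth flip of every run that starts with five tails.
sixth_flips = []
while len(sixth_flips) < 10_000:
    flips = [random.choice("HT") for _ in range(6)]
    if flips[:5] == ["T"] * 5:
        sixth_flips.append(flips[5])

print(sixth_flips.count("T") / len(sixth_flips))  # hovers around 0.5
```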

Let's define it more precisely:

$$\text{Pr}[A] = \text{Pr}[A \ | \ B]$$

So, the fact that $B$ happened doesn't affect the probability of $A$.

Another definition is:

$$\text{Pr}[A] \cdot \text{Pr}[B] = \text{Pr}[A \cap B]$$

If this holds, $A$ and $B$ are independent events. This is called the product rule: two events are independent if and only if the product of their probabilities equals the probability that they both happen (which is their intersection).
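To make this concrete, here is a small example (my own, not from the lecture): roll a fair die, and let $A$ be the event that the roll is even and $B$ the event that the roll is at most 4. Then $\text{Pr}[A] = \frac{1}{2}$, $\text{Pr}[B] = \frac{2}{3}$, and $A \cap B = \{2, 4\}$, so:

$$\text{Pr}[A] \cdot \text{Pr}[B] = \frac{1}{2} \cdot \frac{2}{3} = \frac{1}{3} = \text{Pr}[A \cap B]$$

The product rule holds, so these two events are independent.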

We stated that $\text{Pr}[A] = \text{Pr}[A \ | \ B]$, but let's prove that it follows from the product rule.

By the product rule (and assuming $\text{Pr}[B] > 0$, so we can divide by it), we can write $\text{Pr}[A]$ as $\frac{\text{Pr}[A \cap B]}{\text{Pr}[B]}$.

Well, we know that this is just the definition of conditional probability: $\text{Pr}[A \ | \ B] = \frac{\text{Pr}[A \cap B]}{\text{Pr}[B]}$.

Therefore, $\text{Pr}[A] = \text{Pr}[A \ | \ B]$. $\ \blacksquare$
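We can also confirm the identity by brute-force enumeration. A sketch in Python, reusing the die events from the example above (`Fraction` keeps the arithmetic exact):

```python
from fractions import Fraction

omega = set(range(1, 7))              # the six faces of a fair die
A = {w for w in omega if w % 2 == 0}  # the roll is even
B = {w for w in omega if w <= 4}      # the roll is at most 4

def pr(event):
    return Fraction(len(event), len(omega))

pr_A_given_B = pr(A & B) / pr(B)  # conditional probability Pr[A | B]
print(pr(A), pr_A_given_B)        # 1/2 1/2, so Pr[A] = Pr[A | B]
```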

Another way to see it is by just plugging in the definitions.
For example, remember that the difference rule says that $\text{Pr}[A - B] = \text{Pr}[A] - \text{Pr}[A \cap B]$.
So, we can write $\text{Pr}[A]$ as $\text{Pr}[A \cap B] + \text{Pr}[A - B]$.

Also, by the difference rule, $\text{Pr}[A \cap B]$ can be written as $\text{Pr}[A] - \text{Pr}[A - B]$.

We want to prove that $\text{Pr}[A] = \text{Pr}[A \ | \ B]$.

Substituting $\text{Pr}[A]$ with $\text{Pr}[A \cap B] + \text{Pr}[A - B]$, we have:

$$\text{Pr}[A \cap B] + \text{Pr}[A - B] = \text{Pr}[A \ | \ B]$$

Substituting $\text{Pr}[A \cap B]$ with $\text{Pr}[A] - \text{Pr}[A - B]$, we have:

$$\text{Pr}[A] - \text{Pr}[A - B] + \text{Pr}[A - B] = \text{Pr}[A \ | \ B]$$

$$\text{Pr}[A] = \text{Pr}[A \ | \ B] \ \ \blacksquare$$

It's just a long-winded way of saying the same thing; the point is that the math, of course, works.
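With the same die events as before, the difference rule is easy to verify numerically: $A - B = \{6\}$, and indeed

$$\text{Pr}[A - B] = \text{Pr}[A] - \text{Pr}[A \cap B] = \frac{1}{2} - \frac{1}{3} = \frac{1}{6}$$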


Independence is not just a property of two events. When more than two events are independent, it is called mutual independence.

So, if we have events $A_1, \ A_2, \ ..., \ A_n$, they are mutually independent when the probability that any one of them ($A_i$) occurs is unchanged by which of the others occur.

Formally:

$$\text{Pr}[A_i] = \text{Pr}[A_i \ | \ A_j \cap A_k \cap \ ... \ \cap A_m] \ \ (i \neq j, \ k, \ ..., \ m)$$

The second definition we used previously for two independent events looks like this for mutual independence:

$$\text{Pr}[A_i \cap A_j \cap \ ... \ \cap A_m] = \text{Pr}[A_i] \cdot \text{Pr}[A_j] \cdot \ ... \ \cdot \text{Pr}[A_m]$$

(In fact, for mutual independence, the product rule has to hold for every subset of the events, not just for all of them at once.)

The simplest example of mutual independence is the coin-flipping example we mentioned: you can flip a coin a million times, and none of the flips depends on the others.
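A brute-force check makes this concrete. A sketch in Python that enumerates all $2^3$ outcomes of three fair flips and asserts the product rule for every subset of two or more of the events:

```python
from fractions import Fraction
from itertools import product, combinations

outcomes = list(product("HT", repeat=3))  # all 8 equally likely outcomes

def pr(event):
    return Fraction(len(event), len(outcomes))

# A[i] is the event that the i-th flip comes up tails.
A = [{o for o in outcomes if o[i] == "T"} for i in range(3)]

# Mutual independence: the product rule holds for every subset of events.
for k in (2, 3):
    for subset in combinations(range(3), k):
        inter = set(outcomes)
        prob_product = Fraction(1)
        for i in subset:
            inter &= A[i]
            prob_product *= pr(A[i])
        assert pr(inter) == prob_product

print("every subset satisfies the product rule")
```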


A set $A_1, \ A_2, \ ...$ of events is $k$-way independent iff every set of $k$ of those events is mutually independent. The set is pairwise independent iff it is 2-way independent.

So, when we have a set of events in which any two of them are independent, we call it pairwise independent.

Let's look at an example.
Imagine we have a box that has four tickets with labels: 112, 121, 211, and 222.

We choose one ticket at random.

Let $A_1$ be the event that 1 occurs in the first place.
Let $A_2$ be the event that 1 occurs in the second place.
Let $A_3$ be the event that 1 occurs in the third place.

Two of the four tickets have a 1 in any given place, so the probability of each of these events is $\frac{1}{2}$:

$$\text{Pr}[A_1] = \frac{1}{2}, \quad \text{Pr}[A_2] = \frac{1}{2}, \quad \text{Pr}[A_3] = \frac{1}{2}$$

Remember that we defined independence as $\text{Pr}[A] \cdot \text{Pr}[B] = \text{Pr}[A \cap B]$.

The intersection of $A_1$ and $A_2$ ($A_1 \cap A_2$) is the single ticket 112.
The probability of choosing 112 ($\text{Pr}[A_1 \cap A_2]$) is $\frac{1}{4}$, which is $\frac{1}{2} \cdot \frac{1}{2}$, which is $\text{Pr}[A_1] \cdot \text{Pr}[A_2]$.

It is the same for the other pairings as well:
$A_1 \cap A_3 = \{121\}$, so $\text{Pr}[A_1 \cap A_3] = \frac{1}{4}$.
$A_2 \cap A_3 = \{211\}$, so $\text{Pr}[A_2 \cap A_3] = \frac{1}{4}$.

So, $A_1$, $A_2$, and $A_3$ are pairwise independent. They are not mutually independent, though: no ticket has 1 in all three places, so $\text{Pr}[A_1 \cap A_2 \cap A_3] = 0$, not $\frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{8}$.
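Enumerating the four tickets verifies both claims. A sketch in Python, using exact fractions:

```python
from fractions import Fraction
from itertools import combinations

tickets = ["112", "121", "211", "222"]

def pr(event):
    return Fraction(len(event), len(tickets))

# A[i] is the event that 1 occurs in place i.
A = [{t for t in tickets if t[i] == "1"} for i in range(3)]

# Every pair satisfies the product rule: 1/4 == 1/2 * 1/2 ...
for i, j in combinations(range(3), 2):
    assert pr(A[i] & A[j]) == pr(A[i]) * pr(A[j])

# ... but the triple does not: Pr[A1 ∩ A2 ∩ A3] is 0, not 1/8.
print(pr(A[0] & A[1] & A[2]))  # 0
```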