Bite-Sized Mathematics for Computer Science

Notes on MIT's 6.042J

Eda Eren
Table of Contents

Introduction

Prerequisites & Resources

Unit 1: Proofs

00. Introduction to Proofs

01. (Two) Proof Methods

02. The Well Ordering Principle

03. Logic & Propositions

04. Quantifiers and Predicate Logic

05. Sets

06. Binary Relations

07. Induction

08. State Machines — Invariants

09. Recursive Definitions

10. Infinite Sets

Unit 2: Structures

11. GCDs

12. Congruences

13. Euler's Theorem

14. RSA Encryption

15. Digraphs: Walks and Paths

16. Directed Acyclic Graphs

17. Partial Orders and Equivalence

18. Degrees & Isomorphism

19. Coloring & Connectivity

20. Trees

21. Stable Matching

Unit 3: Counting

22. Sums & Products

23. Asymptotics

24. Counting with Bijections

25. Repetitions & Binomial Theorem

26. Pigeonhole Principle, Inclusion-Exclusion

Unit 4: Probability

27. Intro to Discrete Probability

28. Conditional Probability

29. Independence & Causality

30. Random Variables, Density Functions

31. Expectation

32. Deviation: Markov & Chebyshev Bounds

33. Sampling & Confidence

Random Walks & PageRank

A random walk describes a situation in which an object moves in a sequence of steps in randomly chosen directions.

This page has a short, delightful explanation of an introduction to random walks.


Think of the World Wide Web as a digraph; the pages are the nodes, and the edges are the hyperlinks that are pointing to them.

If we were to rank the nodes, which one should be ranked first?
It makes sense that the best-ranked one has to be the one with the maximum number of indegree (the number of edges coming in to that node), so its page rank should be its indegree—the more links to that page, the better ranked it is. But this is a naive idea.

Instead, the probability of going to each page will be considered after taking a long random walk in the graph.

All the edges that go out from a vertex will have the same probability of being chosen.
So, if we are at page xx, and there are 3 links out of it, each one has a probability of 13\frac{1}{3} of being clicked.

So, if outdeg(V)\text{outdeg}(V) denotes the number of outgoing edges from VV, the probability of each edge to WW that goes out from VV will be:

Pr[(V, W)]=1outdeg(V)\text{Pr}[(V, \ W)] = \frac{1}{\text{outdeg}(V)}

What about the pages that have no links going out? Then comes the supernode.

There will be an edge from the supernode to all the other nodes in the graph with equal probability. Also, every page will have an edge to the supernode as well.
In the case that a page has no outgoing links, its edge to the supernode will have a probability of 11.


Stationary distribution is an assignment of probabilities to vertices in a digraph, if for all vertices xx:

Pr[at x]=Pr[go to x at next step]\text{Pr}[\text{at }x] = \text{Pr}[\text{go to }x \text{ at next step}]

So, it is when the next-step distribution is the same as the current distribution.
That is, the updated probabilities will be the same as the ones we were starting with.