Random Walks & PageRank
A random walk describes a situation in which an object moves in a sequence of steps in randomly chosen directions.
This page has a short, delightful explanation of an introduction to random walks.
Think of the World Wide Web as a digraph; the pages are the nodes, and the edges are the hyperlinks that are pointing to them.
If we were to rank the nodes, which one should be ranked first?
It makes sense that the best-ranked one has to be the one with the maximum number of indegree (the number of edges coming in to that node), so its page rank should be its indegree—the more links to that page, the better ranked it is. But this is a naive idea.
Instead, the probability of going to each page will be considered after taking a long random walk in the graph.
All the edges that go out from a vertex will have the same probability of being chosen.
So, if we are at page
So, if
What about the pages that have no links going out? Then comes the supernode.
There will be an edge from the supernode to all the other nodes in the graph with equal probability. Also, every page will have an edge to the supernode as well.
In the case that a page has no outgoing links, its edge to the supernode will have a probability of
Stationary distribution is an assignment of probabilities to vertices in a digraph, if for all vertices
So, it is when the next-step distribution is the same as the current distribution.
That is, the updated probabilities will be the same as the ones we were starting with.