Google PageRank Explained

Because of the dominant role Google plays in today’s search
engine landscape, and because
of the incredible amount of insight it allows into its inner workings,
the PageRank algorithm that this search engine uses has taken on an almost mythical
status among SEO practitioners.
Many very detailed explanations of PageRank are available on the Web.
The original paper, co-authored by Google founders Larry Page (the
“Page” in PageRank) and Sergey Brin, is called “The PageRank Citation Ranking: Bringing Order to the
Web[3],” and is available in
multiple formats online.
In addition to the many papers published by Stanford and Google
researchers, numerous
competing (and occasionally conflicting) accounts have been prepared by
SEO consultants.
Let’s briefly discuss how PageRank works without
getting bogged down in mathematical detail.
The concept of PageRank is very similar to the “wandering
drunk” (random walk) technique employed
in many areas of computer science, and to the proverbial thousands of
monkeys that, given enough time at their typewriters,
eventually reproduce Shakespeare’s Hamlet.
To understand this concept, let’s consider a random Web
surfer. We’ll make him male,
and call him Bob. To get Bob started, we’ll sit him down with
the browser open at a
Web page that’s selected at random from our index. If there
are a million pages in our
index, there’s a one-in-a-million chance that Bob starts at
any particular one of them.
Bob’s job is to pick a random link on every page he visits,
and continue on to wherever
that link sends him. On each page Bob visits, including the first,
there’s a chance that
he’ll get bored and ask for a different random Web page. So,
when he gets bored, we
select another page completely at random, and the process starts again.
In Page and
Brin’s paper, there is a 15% chance that Bob will get bored
on any page.
If we let Bob surf the pages in our index in this way for a decade or
so, he will eventually
visit every page. Once Bob has viewed every page at least once, we can
count the number
of times he’s visited each page in the index. The pages that
Bob has visited the most
times will be allotted the highest PageRank. To put this another way,
the PageRank
score of a given Web page is an estimate of the probability that a
random surfer would
find that page if they followed the process that Bob followed.
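Bob's process can be sketched in a few lines of Python. This is only a toy simulation under stated assumptions, not Google's implementation: the four-page index, its page names, the step count, and the function name simulate_random_surfer are all made up for illustration. The 15% boredom chance comes from the description above, and a page with no outgoing links is treated the same as Bob getting bored.

```python
import random


def simulate_random_surfer(links, steps=200_000, boredom=0.15, seed=0):
    """Estimate each page's score as the share of Bob's visits it receives.

    links maps each page to the list of pages it links to (hypothetical data).
    """
    rng = random.Random(seed)        # fixed seed so runs are repeatable
    pages = list(links)
    visits = {page: 0 for page in pages}

    current = rng.choice(pages)      # Bob starts on a page chosen at random
    for _ in range(steps):
        visits[current] += 1
        outgoing = links[current]
        # Bob gets bored (or hits a page with no links): jump to a random page.
        if rng.random() < boredom or not outgoing:
            current = rng.choice(pages)
        else:
            # Otherwise he follows one of the page's links, chosen at random.
            current = rng.choice(outgoing)

    # Convert raw visit counts into a probability estimate per page.
    return {page: count / steps for page, count in visits.items()}


# Hypothetical four-page index: every other page links back to "home".
toy_index = {
    "home": ["about", "products", "blog"],
    "about": ["home"],
    "products": ["home", "blog"],
    "blog": ["home"],
}

scores = simulate_random_surfer(toy_index)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{page:10s} {score:.3f}")
```

In this toy index, the page that every other page links to collects the largest share of Bob's visits, which is the behavior the random-surfer description predicts.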
Now, if Bob starts surfing from a page with a very high PageRank, we
can assume that
any page it links to will have a high probability of being found by
Bob. As such, you
might think that links from a page with a high PageRank would be the
most valuable.
However, this is not the case. The more links there are on a page, the
less likely it is
that any particular link will be chosen by a random surfer. This leads
us to PageRank
Truth #1:
The value of any link from a Web page is decreased proportionally for every additional link on that page.
This is why links from a pure directory like the Open Directory may actually be more
valuable than links from a Web portal that happens to include a directory. In a pure
directory, nearly all the PageRank attributed to the homepage flows through to the
category listings.
By comparison, of the vast number of links that appear on each page of the Yahoo! site,
only a certain percentage link to directory pages. Over 200 links appear on the Yahoo!
homepage, most of which lead away from the directory. Even the directory pages
themselves display many listings and other links.
Likewise, links from a highly selective directory are likely to be worth more than links from a less
selective directory of equal size, because there will be fewer links (or listings) on each
of the category pages.
The same logic applies to all links. If you are interested in maximizing the PageRank of
the pages on your site, simply looking for high-PageRank pages may not be the best
approach. Link placement (i.e., which page carries the link) matters much more than
the average site owner realizes.
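A little arithmetic makes the point concrete. The sketch below assumes the random-surfer model described earlier: a page's score is shared equally among its links, scaled by the 85% chance that the surfer follows a link rather than getting bored. The function name value_per_link and the scores and link counts for the two pages are hypothetical, chosen only to illustrate the effect.

```python
def value_per_link(page_rank, link_count, follow_probability=0.85):
    """Share of a page's score passed along by each of its links.

    Assumes the score is split equally among all links on the page.
    """
    if link_count <= 0:
        raise ValueError("link_count must be a positive integer")
    return follow_probability * page_rank / link_count


# Hypothetical figures: a busy portal page versus a small directory category.
portal = value_per_link(page_rank=0.010, link_count=200)
directory = value_per_link(page_rank=0.002, link_count=10)

print(f"portal link:    {portal:.6f}")
print(f"directory link: {directory:.6f}")
```

With these made-up numbers, the directory page has one fifth of the portal's score, yet each of its links passes on more value, because that score is divided among far fewer links.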
Important, high-quality sites receive a higher PageRank, which Google
remembers each time it conducts a search. Of course, important pages
mean nothing to you if they don’t match your query. So, Google combines
PageRank with sophisticated text-matching techniques to find pages that are both important
and relevant to your search. Google goes far beyond the number of times a term appears
on a page and examines all aspects of the page content (and the content of the pages linking to it)
to determine if it is a good match for your query.