The $25,000,000,000 Eigenvector: The Linear Algebra Behind Google

Download as pdf or txt
Download as pdf or txt
You are on page 1of 13

SIAM REVIEW                                    © 2006 Society for Industrial and Applied Mathematics
Vol. 48, No. 3, pp. 569-581

The $25,000,000,000 Eigenvector:


The Linear Algebra behind Google*
Kurt Bryan†
Tanya Leise‡

Abstract. Google's success derives in large part from its PageRank algorithm, which ranks the importance
of web pages according to an eigenvector of a weighted link matrix. Analysis
of the PageRank formula provides a wonderful applied topic for a linear algebra course.
Instructors may assign this article as a project to more advanced students or spend one
or two lectures presenting the material with assigned homework from the exercises. This
material also complements the discussion of Markov chains in matrix algebra. Maple and
Mathematica files supporting this material can be found at www.rose-hulman.edu/~bryan.

Keywords. linear algebra, PageRank, eigenvector, stochastic matrix

AMS subject classifications. 15-01, 15A18, 15A51

DOI. 10.1137/050623280

1. Introduction. When Google went online in the late 1990s, one thing that set
it apart from other search engines was that its search result listings always seemed to
deliver the "good stuff" up front. With other search engines you often had to wade
through screen after screen of links to irrelevant web pages that just happened to
match the search text. Part of the magic behind Google is its PageRank algorithm,
which quantitatively rates the importance of each page on the web, allowing Google
to rank the pages and thereby present to the user the more important (and typically
most relevant and helpful) pages first.
Understanding how to calculate PageRank is essential for anyone designing a web
page that they want people to access frequently, since getting listed first in a Google
search leads to many people looking at your page. Indeed, due to Google's prominence
as a search engine, its ranking system has had a deep influence on the development and
structure of the Internet and on what kinds of information and services get accessed
most frequently. Our goal in this paper is to explain one of the core ideas behind how
Google calculates web page rankings. This turns out to be a delightful application of
standard linear algebra.
Search engines such as Google have to do three basic things:
1. Crawl the web and locate all web pages with public access.
2. Index the data from step 1, so that they can be searched efficiently for relevant
key words or phrases.

*Received by the editors January 25, 2005; accepted for publication (in revised form) January 5,
2006; published electronically August 1, 2006. $25,000,000,000 was the approximate market value of
Google when the company went public in 2004.
http://www.siam.org/journals/sirev/48-3/62328.html
†Department of Mathematics, Rose-Hulman Institute of Technology, Terre Haute, IN 47803
(kurt.bryan@rose-hulman.edu).
‡Mathematics and Computer Science Department, Amherst College, Amherst, MA 01002
(tleise@amherst.edu).


Fig. 1  An example of a web with only four pages. An arrow from page A to page B indicates a link
from page A to page B.

3. Rate the importance of each page in the database, so that when a user does
a search and the subset of pages in the database with the desired information
has been found, the more important pages can be presented first.
This paper will focus on step 3. In an interconnected web of pages, how can one
meaningfully define and quantify the "importance" of any given page?
The rated importance of web pages is not the only factor in how links are pre
sented, but it is a significant one. There are successful ranking algorithms other than
PageRank. The interested reader will find a wealth of information about ranking
algorithms and search engines, and we list just a few references for getting started
(see the extensive bibliography in [9], for example, for a more complete list). For a
brief overview of how Google handles the entire process, see [6], and for an in-depth
treatment of PageRank, see [3] and a companion article [9]. Another article with
good concrete examples is [5]. For more background on PageRank and explanations
of essential principles of web design to maximize a website's PageRank, go to the web
sites [4, 11, 14]. To find out more about search engine principles in general and other
ranking algorithms, see [2] and [8]. Finally, for an account of some newer approaches
to searching the web, see [12] and [13].

2. Developing a Formula to Rank Pages.

2.1. The Basic Idea. In what follows we will use the phrase "importance score" or
just "score" for any quantitative rating of a web page's importance. The importance
score for any web page will always be a nonnegative real number. A core idea in
assigning a score to any given web page is that the page's score is derived from the
links made to that page from other web pages. The links to a given page are called
the backlinks for that page. The web thus becomes a democracy where pages vote for
the importance of other pages by linking to them.
Suppose the web of interest contains n pages, each page indexed by an integer k,
1 ≤ k ≤ n. A typical example is illustrated in Figure 1, in which an arrow from page
A to page B indicates a link from page A to page B. Such a web is an example of a
directed graph.¹ We'll use x_k to denote the importance score of page k in the web. The
score x_k is nonnegative, and x_j > x_k indicates that page j is more important than page k (so
x_j = 0 indicates that page j has the least possible importance score).

¹A graph consists of a set of vertices (in this context, the web pages) and a set of edges. Each
edge joins a pair of vertices. The graph is undirected if the edges have no direction. The graph is
directed if each edge (in the web context, the links) has a direction, that is, a starting and an ending
vertex.


A very simple approach is to take x_k as the number of backlinks for page k. In
the example in Figure 1, we have x_1 = 2, x_2 = 1, x_3 = 3, and x_4 = 2, so that page 3
is the most important, pages 1 and 4 tie for second, and page 2 is least important. A
link to page k becomes a vote for page k's importance.
This approach ignores an important feature one would expect a ranking algorithm
to have, namely, that a link to page k from an important page should boost page k's
importance score more than a link from an unimportant page. For example, a link
to your homepage directly fromYahoo! ought to boost your page's score much more
than a link from, say, www.kurtbryan.com (no relation to the author). In the web of
Figure 1, pages 1 and 4 both have two backlinks: each links to the other, but page
1's second backlink is from the seemingly important page 3, while page 4's second
backlink is from the relatively unimportant page 1. As such, perhaps we should rate
page 1's importance higher than that of page 4.
As a first attempt at incorporating this idea, let's compute the score of page j as
the sum of the scores of all pages linking to page j. For example, consider the web
of Figure 1. The score of page 1 would be determined by the relation x_1 = x_3 + x_4.
Since x_3 and x_4 will depend on x_1, this scheme seems strangely self-referential, but
it is the approach we will use, with one more modification. Just as in elections, we
don't want a single individual to gain influence merely by casting multiple votes. In
the same vein, we seek a scheme in which a web page doesn't gain extra influence
simply by linking to lots of other pages. If page j contains n_j links, one of which
links to page k, then we will boost page k's score by x_j/n_j, rather than by x_j. In
this scheme, each web page gets a total of one vote, weighted by that web page's score,
that is evenly divided up among all of its outgoing links. To quantify this for a web
of n pages, let L_k ⊂ {1, 2, ..., n} denote the set of pages with a link to page k, that
is, L_k is the set of page k's backlinks. For each k we require

(1)    x_k = \sum_{j \in L_k} \frac{x_j}{n_j},

where n_j is the number of outgoing links from page j (which must be positive since if
j ∈ L_k, then page j links to at least page k). We will assume that a link from a page
to itself will not be counted. In this "democracy of the web," you don't get to vote
for yourself!
Let's apply this approach to the four-page web of Figure 1. For page 1, we have
x_1 = x_3/1 + x_4/2, since pages 3 and 4 are backlinks for page 1 and page 3 contains
only one link, while page 4 contains two links (splitting its vote in half). Similarly,
x_2 = x_1/3, x_3 = x_1/3 + x_2/2 + x_4/2, and x_4 = x_1/3 + x_2/2. These linear equations
can be written Ax = x, where x = [x_1 x_2 x_3 x_4]^T and

(2)    A = \begin{bmatrix} 0 & 0 & 1 & 1/2 \\ 1/3 & 0 & 0 & 0 \\ 1/3 & 1/2 & 0 & 1/2 \\ 1/3 & 1/2 & 0 & 0 \end{bmatrix}.

This transforms the web ranking problem into the "standard" problem of finding an
eigenvector for a square matrix! (Recall that the eigenvalues λ and eigenvectors x of
a matrix A satisfy the equation Ax = λx, x ≠ 0, by definition.) We thus seek an
eigenvector x with eigenvalue 1 for the matrix A. We will refer to A as the "link
matrix" for the given web.


It turns out that the link matrix A in (2) does indeed have eigenvectors with
eigenvalue 1, namely, all multiples of the vector [12 4 9 6]^T (recall that any nonzero
multiple of an eigenvector is again an eigenvector). Let's agree to scale these "importance
score eigenvectors" so that the components sum to 1. In this case we obtain
x_1 = 12/31 ≈ 0.387, x_2 = 4/31 ≈ 0.129, x_3 = 9/31 ≈ 0.290, and x_4 = 6/31 ≈ 0.194. Note that
this ranking differs from that generated by simply counting backlinks. It might seem
surprising that page 3, linked to by all other pages, is not the most important. To
understand this, note that page 3 links only to page 1 and so casts its entire vote for
page 1. This, with the vote of page 2, results in page 1 getting the highest importance
score.
More generally, the matrix A for any web must have 1 as an eigenvalue if the
web in question has no dangling nodes (pages with no outgoing links). To see this,
first note that for a general web of n pages, formula (1) gives rise to a matrix A with
A_{ij} = 1/n_j if page j links to page i, and A_{ij} = 0 otherwise. The jth column of A
then contains n_j nonzero entries, each equal to 1/n_j, and the column thus sums to 1.
This motivates the following definition, used in the study of Markov chains.
DEFINITION 1. A square matrix is called a column-stochastic matrix if all of its
entries are nonnegative and the entries in each column sum to 1.
The matrix A for a web with no dangling nodes is column-stochastic. We now
prove the following.
PROPOSITION 1. Every column-stochastic matrix has 1 as an eigenvalue.
Proof. Let A be an n × n column-stochastic matrix and let e denote an n-dimensional
column vector with all entries equal to 1. Recall that A and its transpose
A^T have the same eigenvalues. Since A is column-stochastic, it is easy to see that
A^T e = e, so that 1 is an eigenvalue for A^T and hence for A. ∎
In what follows, we use V_1(A) to denote the eigenspace for the eigenvalue 1 of a
column-stochastic matrix A.
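The argument in Proposition 1 is easy to check numerically. The following sketch is our own illustration (the helper name is_column_stochastic is hypothetical): it verifies Definition 1 for the link matrix (2) and confirms that A^T e = e.

```python
import numpy as np

def is_column_stochastic(A, tol=1e-12):
    """Definition 1: all entries nonnegative and every column sums to 1."""
    return bool(np.all(A >= 0) and np.allclose(A.sum(axis=0), 1.0, atol=tol))

# Link matrix (2) for the web of Figure 1.
A = np.array([[0, 0, 1, 1/2],
              [1/3, 0, 0, 0],
              [1/3, 1/2, 0, 1/2],
              [1/3, 1/2, 0, 0]])
e = np.ones(4)

print(is_column_stochastic(A))   # True
print(np.allclose(A.T @ e, e))   # True: A^T e = e, so 1 is an eigenvalue of A^T, hence of A
```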

2.2. Shortcomings. Several difficulties arise with using formula (1) to rank web
sites. In this section we discuss two issues: webs with nonunique rankings and webs
with dangling nodes.

2.2.1. Nonunique Rankings. For our rankings, it is desirable that the dimension
of V_1(A) equal 1, so that there is a unique eigenvector x with Σ_i x_i = 1 that we can
use for importance scores. This is true in the web of Figure 1 and more generally is
always true for the special case of a strongly connected web (that is, you can get from
any page to any other page in a finite number of steps); see Exercise 10 below.
Unfortunately, it is not always true that the link matrix A will yield a unique
ranking for all webs. Consider the web in Figure 2, forwhich the linkmatrix is

A = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1/2 \\ 0 & 0 & 1 & 0 & 1/2 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.

We find here that V_1(A) is two-dimensional; one possible pair of basis vectors is x =
[1/2, 1/2, 0, 0, 0]^T and y = [0, 0, 1/2, 1/2, 0]^T. But note that any linear combination of
these two vectors yields another vector in V_1(A), e.g., (3/4)x + (1/4)y = [3/8, 3/8, 1/8, 1/8, 0]^T.
It is not clear which, if any, of these eigenvectors we should use for the rankings!
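One way to see this computationally is to compute a basis for V_1(A), the null space of A − I, for the Figure 2 link matrix. The sketch below is ours and assumes SciPy's null_space is available.

```python
import numpy as np
from scipy.linalg import null_space

# Link matrix for the five-page web of Figure 2 (two disconnected subwebs).
A = np.array([[0, 1, 0, 0, 0],
              [1, 0, 0, 0, 0],
              [0, 0, 0, 1, 0.5],
              [0, 0, 1, 0, 0.5],
              [0, 0, 0, 0, 0]], dtype=float)

# V_1(A) is the null space of A - I; its dimension decides whether the ranking is unique.
basis = null_space(A - np.eye(5))
print(basis.shape[1])   # 2, so there is no unique importance score vector
```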


Fig. 2  A web of five pages, consisting of two disconnected "subwebs" W_1 (pages 1 and 2) and W_2
(pages 3, 4, 5).

It is no coincidence that for the web of Figure 2 we find that dim(V_1(A)) > 1. It is
a consequence of the fact that if a web W, considered as an undirected graph (ignoring
which direction each arrow points), consists of r disconnected subwebs W_1, ..., W_r,
then dim(V_1(A)) ≥ r, and hence there is no unique importance score vector x ∈ V_1(A)
with Σ_i x_i = 1. This makes intuitive sense: if a web W consists of r disconnected
subwebs W_1, ..., W_r, then one would expect difficulty in finding a common reference
frame for comparing the scores of pages in one subweb with those in another subweb.
Indeed, it is not hard to see why a web W consisting of r disconnected subwebs
forces dim(V_1(A)) ≥ r. Suppose a web W has n pages and r component subwebs
W_1, ..., W_r. Let n_i denote the number of pages in W_i. Index the pages in W_1 with
indices 1 through n_1, the pages in W_2 with indices n_1 + 1 through n_1 + n_2, the pages
in W_3 with n_1 + n_2 + 1 through n_1 + n_2 + n_3, etc. In general, let N_i = \sum_{j=1}^{i} n_j for
i ≥ 1, with N_0 = 0, so W_i contains pages N_{i−1} + 1 through N_i. For example, in the
web of Figure 2, we can take N_1 = 2 and N_2 = 5, so W_1 contains pages 1 and 2 and
W_2 contains pages 3, 4, and 5. The web in Figure 2 is a particular example of the
general case, in which the matrix A assumes a block diagonal structure

A = \begin{bmatrix} A_1 & 0 & \cdots & 0 \\ 0 & A_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & A_r \end{bmatrix},

where A_i denotes the link matrix for W_i. In fact, W_i can be considered as a web
in its own right. Each n_i × n_i matrix A_i is column-stochastic and hence possesses
some eigenvector v^i ∈ R^{n_i} with eigenvalue 1. For each i between 1 and r construct
a vector w^i ∈ R^n which has zero components for all elements corresponding to blocks
other than block i. For example,

w^1 = \begin{bmatrix} v^1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \qquad w^2 = \begin{bmatrix} 0 \\ v^2 \\ \vdots \\ 0 \end{bmatrix}.

Then it is easy to see that the vectors w^i, 1 ≤ i ≤ r, are linearly independent
eigenvectors for A with eigenvalue 1, because

A w^i = A \begin{bmatrix} 0 \\ \vdots \\ v^i \\ \vdots \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ \vdots \\ A_i v^i \\ \vdots \\ 0 \end{bmatrix} = w^i.

Thus V_1(A) has dimension at least r.

2.2.2. Dangling Nodes. Another difficulty may arise when using the matrix A to
generate rankings. A web with dangling nodes produces a matrix A which contains
one or more columns of all zeros. In this case A is column-substochastic, that is,
the column sums of A are all less than or equal to 1. Such a matrix must have
all eigenvalues less than or equal to 1 in magnitude, but 1 need not actually be an
eigenvalue for A. Nevertheless, the pages in a web with dangling nodes can still
be ranked using a similar technique. The corresponding substochastic matrix must
have a positive eigenvalue λ ≤ 1 and a corresponding eigenvector x with nonnegative
entries (called the Perron eigenvector) that can be used to rank the web pages. See
Exercise 4 below. We will not further consider the problem of dangling nodes here,
however.
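As a rough illustration of this Perron-eigenvector idea (our sketch; the small three-page web below is hypothetical and is not taken from the article), one could proceed as follows.

```python
import numpy as np

# Hypothetical 3-page web (not from the article): page 1 links to page 2, page 2 links
# to pages 1 and 3, and page 3 has no out-links, so column 3 of A is all zeros.
A = np.array([[0, 0.5, 0],
              [1, 0,   0],
              [0, 0.5, 0]])

eigvals, eigvecs = np.linalg.eig(A)
k = int(np.argmax(eigvals.real))          # largest positive (Perron) eigenvalue, about 0.707 here
x = np.abs(np.real(eigvecs[:, k]))
x = x / x.sum()                           # scale so the scores sum to 1
print(round(float(eigvals[k].real), 3), np.round(x, 3))   # 0.707 and roughly [0.293, 0.414, 0.293]
```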
EXERCISE 1. Suppose the people who own page 3 in the web of Figure 1 are
infuriated by the fact that its importance score, computed using formula (1), is lower
than the score of page 1. In an attempt to boost page 3's score, they create a page 5
that links to page 3; page 3 also links to page 5. Does this boost page 3's score above
that of page 1?
EXERCISE 2. Construct a web consisting of three or more subwebs and verify that
dim(V_1(A)) equals (or exceeds) the number of components in the web.
EXERCISE 3. Add a link from page 5 to page 1 in the web of Figure 2. The
resulting web, considered as an undirected graph, is connected. What is the dimension
of V_1(A)?
EXERCISE 4. In the web of Figure 1, remove the link from page 3 to page 1. In the
resulting web, page 3 is now a dangling node. Set up the corresponding substochastic
matrix and find its largest positive (Perron) eigenvalue. Find a nonnegative Perron
eigenvector for this eigenvalue, and scale the vector so that its components sum to 1.
Does the resulting ranking seem reasonable?
EXERCISE 5. Prove that in any web the importance score of a page with no
backlinks is 0.
EXERCISE 6. Implicit in our analysis up to this point is the assertion that the
manner in which the pages of a web W are indexed has no effect on the importance
score assigned to any given page. Prove this as follows: Let W contain n pages, each
page assigned an index 1 through n, and let A be the resulting link matrix. Suppose
we then transpose the indices of pages i and j (so page i is now page j and vice versa).
Let Ã be the link matrix for the relabeled web.
* Argue that Ã = PAP, where P is the elementary matrix obtained by transposing
rows i and j of the n × n identity matrix. Note that the operation
A → PA has the effect of swapping rows i and j of A, while A → AP swaps
columns i and j. Also, P² = I, the identity matrix.
* Suppose that x is an eigenvector for A, so Ax = λx for some λ. Show that
y = Px is an eigenvector for Ã with eigenvalue λ.
* Explain why this shows that transposing the indices of any two pages leaves
the importance scores unchanged, and use this result to argue that any permutation
of the page indices leaves the importance scores unchanged.

3. A Remedy for dim(V_1(A)) > 1. An enormous amount of computing resources
is needed to determine an eigenvector for the link matrix corresponding to a web
containing billions of pages. It is thus important to know that our algorithm will yield
a unique set of sensible web rankings. The analysis above shows that our first attempt
to rank web pages leads to difficulties if the web isn't connected. And the World Wide
Web, treated as an undirected graph, contains many disjoint components; see [9] for
some interesting statistics concerning the structure of the web.
Below we present and analyze a modification of the above method that is guaranteed
to overcome this shortcoming. The analysis that follows is basically a special
case of the Perron-Frobenius theorem, and we only prove what we need for this
application. For a full statement and proof of the Perron-Frobenius theorem, see
Chapter 8 in [10].

3.1. A Modification to the Link Matrix A. For an n-page web with no dangling
nodes, we can generate unambiguous importance scores as follows, including the case
of a web with multiple subwebs.
Let S denote an n × n matrix with all entries 1/n. The matrix S is column-stochastic,
and it is easy to check that V_1(S) is one-dimensional. We will replace the
matrix A with the matrix

(3) M = (1-m)A + mS,

where 0 ≤ m ≤ 1. M is a weighted average of A and S. The value of m originally
used by Google is reportedly 0.15 [9, 11]. For any m ∈ [0, 1], the matrix M is column-stochastic,
and we show below that V_1(M) is always one-dimensional if m ∈ (0, 1].
Thus M can be used to compute unambiguous importance scores. In the case when
m = 0 we have the original problem, for then M = A. At the other extreme is
m = 1, yielding M = S. This is the ultimately egalitarian case: the only normalized
eigenvector x with eigenvalue 1 has x_i = 1/n for all i, and all web pages are rated
equally important.
Using M in place of A gives a web page with no backlinks the importance score
of m/n (Exercise 9), and the matrix M is substochastic for any m < 1 since the matrix A
is substochastic. Therefore the modified formula yields nonzero importance scores even
for pages with no backlinks (if m > 0) but does not resolve the issue
of dangling nodes. In the remainder of this article, we only consider webs with no
dangling nodes.
The equation x = Mx can also be cast as

(4)    x = (1 − m)Ax + ms,

where s is a column vector with all entries 1/n. Note that Sx = s if Σ_i x_i = 1.
We will prove below that V_1(M) is always one-dimensional, but first let's look at
a couple of examples.


Example 1. For the web of four pages in Figure 1 with matrix A given by (2),
the new formula gives (with m = 0.15)

M = \begin{bmatrix} 0.0375 & 0.0375 & 0.8875 & 0.4625 \\ 0.32083 & 0.0375 & 0.0375 & 0.0375 \\ 0.32083 & 0.4625 & 0.0375 & 0.4625 \\ 0.32083 & 0.4625 & 0.0375 & 0.0375 \end{bmatrix}

and yields importance scores x_1 ≈ 0.368, x_2 ≈ 0.142, x_3 ≈ 0.288, and x_4 ≈ 0.202.
This yields the same ranking of pages as the earlier computation, but the scores are
slightly different.
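A short NumPy sketch (ours, not the article's) that forms M by formula (3) for the Figure 1 web and reproduces the scores of Example 1:

```python
import numpy as np

m = 0.15
A = np.array([[0, 0, 1, 1/2],
              [1/3, 0, 0, 0],
              [1/3, 1/2, 0, 1/2],
              [1/3, 1/2, 0, 0]])
n = A.shape[0]

# Formula (3): M is a weighted average of A and the matrix S with all entries 1/n.
M = (1 - m) * A + m * np.full((n, n), 1.0 / n)

eigvals, eigvecs = np.linalg.eig(M)
k = int(np.argmin(np.abs(eigvals - 1.0)))
x = np.real(eigvecs[:, k])
print(np.round(x / x.sum(), 3))   # approximately [0.368, 0.142, 0.288, 0.202]
```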
Example 2 shows more explicitly the advantages of using M in place of A.
Example 2. As a second example, for the web of Figure 2 with m = 0.15, we
obtain the matrix

(5)    M = \begin{bmatrix} 0.03 & 0.88 & 0.03 & 0.03 & 0.03 \\ 0.88 & 0.03 & 0.03 & 0.03 & 0.03 \\ 0.03 & 0.03 & 0.03 & 0.88 & 0.455 \\ 0.03 & 0.03 & 0.88 & 0.03 & 0.455 \\ 0.03 & 0.03 & 0.03 & 0.03 & 0.03 \end{bmatrix}.

The space V_1(M) is indeed one-dimensional, with normalized eigenvector components
of x_1 = 0.2, x_2 = 0.2, x_3 = 0.285, x_4 = 0.285, and x_5 = 0.03. The modification, using
M instead of A, allows us to compare pages in different subwebs.
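To confirm numerically that the modification removes the ambiguity, one can check that the null space of M − I for the matrix (5) is one-dimensional. The sketch below is ours and assumes SciPy is available.

```python
import numpy as np
from scipy.linalg import null_space

m, n = 0.15, 5
A = np.array([[0, 1, 0, 0, 0],
              [1, 0, 0, 0, 0],
              [0, 0, 0, 1, 0.5],
              [0, 0, 1, 0, 0.5],
              [0, 0, 0, 0, 0]], dtype=float)
M = (1 - m) * A + m / n          # adding the scalar m/n entrywise is the same as adding mS

V1 = null_space(M - np.eye(n))   # a basis for V_1(M)
print(V1.shape[1])               # 1: the modified matrix yields a unique ranking
q = V1[:, 0] / V1[:, 0].sum()
print(np.round(q, 3))            # approximately [0.2, 0.2, 0.285, 0.285, 0.03]
```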
Each entry M_{ij} of M defined by (3) is strictly positive, which motivates the
following definition.
DEFINITION 2. A matrix M is positive if M_{ij} > 0 for all i and j.
This is the key property that guarantees dim(V_1(M)) = 1, which we prove in the
next section.
3.2. Analysis of the Matrix M. Note that Proposition 1 shows that V_1(M) is
nonempty since M is stochastic. The goal of this section is to show that V_1(M) is in
fact one-dimensional. This is a consequence of the following two propositions.
PROPOSITION 2. If M is positive and column-stochastic, then any eigenvector in
V_1(M) has all positive or all negative components.
Proof. We use proof by contradiction. First note that in the standard triangle
inequality |Σ_i y_i| ≤ Σ_i |y_i| (with all y_i real), the inequality is strict when the y_i are
of mixed sign. Suppose x ∈ V_1(M) contains elements of mixed sign. From x = Mx we
have x_i = Σ_{j=1}^{n} M_{ij} x_j, and the summands M_{ij} x_j are of mixed sign (since M_{ij} > 0).
As a result we have a strict inequality

(6)    |x_i| = \left| \sum_{j=1}^{n} M_{ij} x_j \right| < \sum_{j=1}^{n} M_{ij} |x_j|.

Sum both sides of inequality (6) from i = 1 to i = n, and swap the i and j summations.
Then use the fact that M is column-stochastic (Σ_i M_{ij} = 1 for all j) to find

\sum_{i=1}^{n} |x_i| < \sum_{i=1}^{n} \sum_{j=1}^{n} M_{ij} |x_j| = \sum_{j=1}^{n} \left( \sum_{i=1}^{n} M_{ij} \right) |x_j| = \sum_{j=1}^{n} |x_j|,

a contradiction. Hence x cannot contain both positive and negative elements. If x_i ≥ 0
for all i (and not all x_i are 0), then x_i > 0 follows immediately from x_i = \sum_{j=1}^{n} M_{ij} x_j
and M_{ij} > 0. Similarly, x_i ≤ 0 for all i implies that each x_i < 0. ∎


The following proposition will also be useful for analyzing dim(V_1(M)).
PROPOSITION 3. Let v and w be linearly independent vectors in R^m, m ≥ 2.
Then, for some values of s and t, the vector x = sv + tw has both positive and
negative components.
Proof. Linear independence implies that neither v nor w is 0. Let d = Σ_i v_i. If
d = 0, then v must contain components of mixed sign, and taking s = 1 and t = 0
yields the conclusion. If d ≠ 0, set s = −(Σ_i w_i)/d, t = 1, and x = sv + tw. Since v
and w are independent, x ≠ 0. However, Σ_i x_i = 0. We conclude that x has both
positive and negative components. ∎
We can now prove that using M in place of A yields an unambiguous ranking for
any web with no dangling nodes.
LEMMA 3. If M is positive and column-stochastic, then V_1(M) has dimension 1.
Proof. We again use proof by contradiction. Suppose there are two linearly
independent eigenvectors v and w in the subspace V_1(M). For any real numbers s and
t that are not both zero, the vector x = sv + tw must be in V_1(M) and so must have
components that are all negative or all positive. But by Proposition 3, for some choice
of s and t the vector x must contain components of mixed sign, a contradiction. We
conclude that V_1(M) cannot contain two linearly independent vectors, and so it has
dimension 1. ∎
Lemma 3 provides the "punchline" for our analysis of the ranking algorithm using
the matrix M (for 0 < m < 1). The space V_1(M) is one-dimensional and, moreover,
the relevant eigenvectors have entirely positive or negative components. We are thus
guaranteed the existence of a unique eigenvector x ∈ V_1(M) with positive components
such that Σ_i x_i = 1.
EXERCISE 7. Prove that if A is an n × n column-stochastic matrix and 0 ≤ m ≤ 1,
then M = (1 − m)A + mS is also a column-stochastic matrix.
EXERCISE 8. Show that the product of two column-stochastic matrices is also
column-stochastic.
EXERCISE 9. Show that a page with no backlinks is given importance score m/n by
formula (4).
EXERCISE 10. Suppose that A is the link matrix for a strongly connected web of
n pages (any page can be reached from any other page by following a finite number
of links). Show that dim(V_1(A)) = 1 as follows. Let (A^k)_{ij} denote the (i, j)-entry of A^k.
* Note that page i can be reached from page j in one step if and only if A_{ij} > 0
(since A_{ij} > 0 means there is a link from j to i). Show that (A^2)_{ij} > 0 if
and only if page i can be reached from page j in exactly two steps. Hint:
(A^2)_{ij} = Σ_k A_{ik} A_{kj}; all A_{ij} are nonnegative, so (A^2)_{ij} > 0 implies that for
some k both A_{ik} and A_{kj} are positive.
* Show more generally that (A^p)_{ij} > 0 if and only if page i can be reached from
page j in exactly p steps.
* Argue that (I + A + A^2 + ... + A^p)_{ij} > 0 if and only if page i can be reached
from page j in p or fewer steps (note that p = 0 is a legitimate choice; any
page can be reached from itself in zero steps!).
* Explain why I + A + A^2 + ... + A^{n−1} is a positive matrix if the web is strongly
connected.
* Use the last part (and Exercise 8) to show that B = (I + A + A^2 + ... + A^{n−1})/n
is positive and column-stochastic (and hence, by Lemma 3, dim(V_1(B)) = 1).
* Show that if x ∈ V_1(A), then x ∈ V_1(B). Why does this imply that dim(V_1(A)) = 1?


EXERCISE 11. Consider again the web in Figure 1, with the addition of a page 5
that links to page 3, where page 3 also links to page 5. Calculate the new ranking by
finding the eigenvector of M (corresponding to λ = 1) that has positive components
summing to 1. Use m = 0.15.
EXERCISE 12. Add a sixth page that links to every page of the web in the previous
exercise, but to which no other page links. Rank the pages using A, then using M
with m = 0.15, and compare the results.
EXERCISE 13. Construct a web consisting of two or more subwebs and determine
the ranking given by formula (3).
As of January 2005, the web contains at least eight billion pages; how does one
compute an eigenvector for an eight billion by eight billion matrix? One reasonable
approach is an iterative procedure called the power method (along with modifications),
which we will now examine for the special case at hand. It is worth noting that there
is much additional analysis one can do, and there are many improved methods for the
computation of PageRank. The reference [7] provides a typical example and additional
references.
4. Computing the Importance Score Eigenvector. The rough idea behind the
power method² for computing an eigenvector of a matrix M is this: One starts with
a "typical" vector x_0, then generates the sequence x_k = Mx_{k−1} (so x_k = M^k x_0) and
lets k approach infinity. The vector x_k is, to good approximation, an eigenvector
for the dominant (largest-magnitude) eigenvalue of M. However, depending on the
magnitude of this eigenvalue, the vector x_k may also grow without bound or decay
to the zero vector. One thus typically rescales at each iteration, say, by computing
x_k = Mx_{k−1}/‖Mx_{k−1}‖, where ‖ · ‖ can be any vector norm.
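A minimal sketch of this normalized power iteration (ours, in NumPy, applied to the matrix M of Example 1 with an arbitrarily chosen iteration count):

```python
import numpy as np

def power_method(M, num_iters=100):
    """Normalized power iteration x_k = M x_{k-1} / ||M x_{k-1}||, using the 1-norm."""
    x = np.full(M.shape[0], 1.0 / M.shape[0])   # a "typical" starting vector
    for _ in range(num_iters):
        x = M @ x
        x = x / np.abs(x).sum()                 # rescale so the iterates neither grow nor decay
    return x

# Applied to the matrix M of Example 1 (the Figure 1 web with m = 0.15).
m = 0.15
A = np.array([[0, 0, 1, 1/2],
              [1/3, 0, 0, 0],
              [1/3, 1/2, 0, 1/2],
              [1/3, 1/2, 0, 0]])
M = (1 - m) * A + m / A.shape[0]
print(np.round(power_method(M), 3))             # approximately [0.368, 0.142, 0.288, 0.202]
```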

However, to guarantee that the power method converges at a reasonable rate, one
typically requires that the dominant eigenvalue λ = 1 for M be simple. That is, the
characteristic polynomial for M should be of the form p(λ) = (λ − 1)q(λ) for some
polynomial q(λ) of degree n − 1, where (λ − 1) does not divide q(λ). Alternatively,
M should possess no "generalized eigenvectors" for the eigenvalue λ = 1 other than
the eigenvectors in V_1(M) (see [1] for more on generalized eigenvectors).
Actually, the following proposition provides what we need to prove that the power
method converges in this case, but with no explicit reference to the algebraic nature
of the dominant eigenvalue λ = 1. It can also be used to show that this eigenvalue is
simple.
DEFINITION 4. The 1-norm of a vector v is ‖v‖_1 = Σ_j |v_j|.
PROPOSITION 4. Let M be a positive column-stochastic n × n matrix and let V
denote the subspace of R^n consisting of vectors v such that Σ_j v_j = 0. Then Mv ∈ V
for any v ∈ V, and

‖Mv‖_1 ≤ c‖v‖_1

for any v ∈ V, where c = max_{1≤j≤n} |1 − 2 min_{1≤i≤n} M_{ij}| < 1.
Proof. To see that Mv ∈ V is straightforward: Let w = Mv, so that w_i =
\sum_{j=1}^{n} M_{ij} v_j and

\sum_{i=1}^{n} w_i = \sum_{i=1}^{n} \sum_{j=1}^{n} M_{ij} v_j = \sum_{j=1}^{n} v_j \left( \sum_{i=1}^{n} M_{ij} \right) = \sum_{j=1}^{n} v_j = 0.

²See [15] for a general introduction to the power method and the use of spectral decomposition
to find the rate of convergence of the vectors x_k = M^k x_0.


Hence w = Mv ∈ V. To prove the bound in the proposition, note that

‖w‖_1 = \sum_{i=1}^{n} e_i w_i = \sum_{i=1}^{n} e_i \left( \sum_{j=1}^{n} M_{ij} v_j \right),

where e_i = sgn(w_i). Note that the e_i are not all of one sign, since Σ_i w_i = 0 (unless
w ≡ 0, in which case the bound clearly holds). Reverse the double sum to obtain

(7)    ‖w‖_1 = \sum_{j=1}^{n} v_j \left( \sum_{i=1}^{n} e_i M_{ij} \right) = \sum_{j=1}^{n} a_j v_j,

where a_j = \sum_{i=1}^{n} e_i M_{ij}. Since the e_i are of mixed sign and \sum_{i} M_{ij} = 1 with 0 <
M_{ij} < 1, it is easy to see that

−1 < −1 + 2 \min_{1≤i≤n} M_{ij} ≤ a_j ≤ 1 − 2 \min_{1≤i≤n} M_{ij} < 1.

We can thus bound

|a_j| ≤ \left| 1 − 2 \min_{1≤i≤n} M_{ij} \right| < 1.

Let c = max_{1≤j≤n} |1 − 2 min_{1≤i≤n} M_{ij}|. Observe that c < 1 and |a_j| ≤ c for all j.
From (7) we have

‖w‖_1 = \sum_{j=1}^{n} a_j v_j ≤ \sum_{j=1}^{n} |a_j| |v_j| ≤ c \sum_{j=1}^{n} |v_j| = c‖v‖_1,

which proves the proposition. ∎


Proposition4 sets the stage forthe following
proposition.
PROPOSITION5. Everypositivecolumn-stochastic matrixM has a unique vector
q with positive components such thatMq = q with Ilqlil = 1. The vector q can be
computed as q = limk,oo Mkxo for any initial guess xo with positive components such
= 1.
that lixolli
Proof. From Proposition 1 the matrix M has 1 as an eigenvalue, and by Lemma
3 the subspace V_1(M) is one-dimensional. Also, all nonzero vectors in V_1(M) have
entirely positive or negative components. It is clear that there is a unique vector
q ∈ V_1(M) with positive components such that Σ_i q_i = 1.
Let x_0 be any vector in R^n with positive components such that ‖x_0‖_1 = 1.
We can write x_0 = q + v, where v ∈ V (V as in Proposition 4). We find that
M^k x_0 = M^k q + M^k v = q + M^k v. As a result,

(8)    M^k x_0 − q = M^k v.

A straightforward induction and Proposition 4 show that ‖M^k v‖_1 ≤ c^k ‖v‖_1 for 0 ≤
c < 1 (c as in Proposition 4), and so lim_{k→∞} ‖M^k v‖_1 = 0. From (8) we conclude
that lim_{k→∞} M^k x_0 = q. ∎
Example 3. Let M be the matrix defined by (5) for the web of Figure 2. We
take x_0 = [0.24, 0.31, 0.08, 0.18, 0.19]^T as an initial guess; recall that we had q =
[0.2, 0.2, 0.285, 0.285, 0.03]^T. The table below shows the value of ‖M^k x_0 − q‖_1 for
several values of k, as well as the ratio ‖M^k x_0 − q‖_1 / ‖M^{k−1} x_0 − q‖_1. Compare this
ratio to c from Proposition 4, which in this case is 0.94.


k        ‖M^k x_0 − q‖_1        ‖M^k x_0 − q‖_1 / ‖M^{k−1} x_0 − q‖_1
0        0.62
1        0.255                  0.411
5        0.133                  0.85
10       0.0591                 0.85
50       8.87 × 10⁻⁵            0.85

It is clear that the bound ‖M^k x_0 − q‖_1 ≤ c^k ‖x_0 − q‖_1 is rather pessimistic (note that
0.85 is the value 1 − m, and 0.85 turns out to be the second-largest eigenvalue for
M). One can show that in general the power method will converge asymptotically
according to ‖Mx_k − q‖_1 ≈ |λ_2| ‖x_k − q‖_1, where λ_2 is the second-largest eigenvalue
of M. Moreover, for M of the form M = (1 − m)A + mS with A column-stochastic
and all S_{ij} = 1/n, it can be shown that |λ_2| ≤ 1 − m (see, e.g., [1, Theorem 5.10]).
As a result, the power method will converge much more rapidly than indicated by
c^k ‖x_0 − q‖_1. Nonetheless, the value of c in Proposition 4 provides a very simple bound
on the convergence of the power method here. It is easy to see that since all entries
of M are at least m/n, we will always have c ≤ 1 − 2m/n in Proposition 4.
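The convergence experiment of Example 3 is easy to repeat. The sketch below is ours; the choice of which iterates to print simply mirrors the table above.

```python
import numpy as np

m, n = 0.15, 5
A = np.array([[0, 1, 0, 0, 0],
              [1, 0, 0, 0, 0],
              [0, 0, 0, 1, 0.5],
              [0, 0, 1, 0, 0.5],
              [0, 0, 0, 0, 0]], dtype=float)
M = (1 - m) * A + m / n                    # the matrix (5)

x = np.array([0.24, 0.31, 0.08, 0.18, 0.19])
q = np.array([0.2, 0.2, 0.285, 0.285, 0.03])

err_prev = np.abs(x - q).sum()             # ||x_0 - q||_1 = 0.62
for k in range(1, 51):
    x = M @ x                              # no rescaling needed: column-stochasticity keeps sum(x) = 1
    err = np.abs(x - q).sum()
    if k in (1, 5, 10, 50):
        print(k, err, round(err / err_prev, 3))   # the ratio settles near 1 - m = 0.85
    err_prev = err
```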
As a practical matter, note that the n × n positive matrix M has no zero elements,
so the multiplication Mv for v ∈ R^n will typically take O(n²) multiplications
and additions, a formidable computation if n = 8,000,000,000. But (4) shows that if x
is positive with ‖x‖_1 = 1, then the multiplication Mx is equivalent to (1 − m)Ax + ms.
This is a far more efficient computation, since A can be expected to contain mostly
zeros (most web pages link to only a few other pages). We've now proved our main
theorem.
THEOREM 5. The matrix M defined by (3) for a web with no dangling nodes will
always be a positive column-stochastic matrix and so have a unique q with positive
components such that Mq = q and Σ_i q_i = 1. The vector q may be computed as
the limit of the iterations x_k = (1 − m)Ax_{k−1} + ms, where x_0 is any initial vector with
positive components and ‖x_0‖_1 = 1.
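Here is a sketch of this sparse iteration (ours, using scipy.sparse); the function name pagerank and the zero-based page indices are our own conventions, not the article's.

```python
import numpy as np
from scipy.sparse import csc_matrix

def pagerank(outlinks, m=0.15, num_iters=100):
    """Sketch of the iteration x_k = (1 - m) A x_{k-1} + m s, with A stored as a sparse matrix."""
    n = len(outlinks)
    rows, cols, vals = [], [], []
    for j, targets in outlinks.items():
        for i in targets:
            rows.append(i)
            cols.append(j)
            vals.append(1.0 / len(targets))    # A[i, j] = 1/n_j
    A = csc_matrix((vals, (rows, cols)), shape=(n, n))

    s = np.full(n, 1.0 / n)
    x = s.copy()                               # any positive start with ||x||_1 = 1
    for _ in range(num_iters):
        x = (1 - m) * (A @ x) + m * s          # only the nonzero entries of A are touched
    return x

# The four-page web of Figure 1, with pages indexed 0..3 here.
web = {0: [1, 2, 3], 1: [2, 3], 2: [0], 3: [0, 2]}
print(np.round(pagerank(web), 3))              # approximately [0.368, 0.142, 0.288, 0.202]
```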
The eigenvector x defined by (4) also has a probabilistic interpretation. Consider
a web-surfer on a web of n pages with no dangling nodes. The surfer begins at
some web page (it doesn't matter where) and randomly moves from web page to web
page according to the following procedure: If the surfer is currently at a page with
r outgoing links, he either randomly chooses any one of these links with uniform
probability (1 − m)/r or he jumps to any randomly selected page on the web, each with
probability m/n (note that r · (1 − m)/r + n · (m/n) = 1, so this accounts for everything he can do).
The surfer repeats this page-hopping procedure ad infinitum. The component x_j of
the normalized vector x in (4) is the fraction of time that the surfer spends, in the
long run, on page j of the web. More important pages tend to be linked to by many
other pages and so the surfer hits those most often.
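This interpretation can be checked by a direct simulation. The sketch below is ours; the number of steps and the random seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 0.15
web = {0: [1, 2, 3], 1: [2, 3], 2: [0], 3: [0, 2]}   # the Figure 1 web, pages indexed 0..3
n = len(web)

visits = np.zeros(n)
page = 0                                   # the starting page does not matter in the long run
steps = 200_000
for _ in range(steps):
    if rng.random() < 1 - m:
        page = int(rng.choice(web[page]))  # follow one of the page's outgoing links uniformly
    else:
        page = int(rng.integers(n))        # jump to a page chosen uniformly from the whole web
    visits[page] += 1

print(np.round(visits / steps, 3))         # long-run fractions, close to [0.368, 0.142, 0.288, 0.202]
```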
EXERCISE 14. For the web in Exercise 11, compute the values of ‖M^k x_0 − q‖_1 and
‖M^k x_0 − q‖_1/‖M^{k−1} x_0 − q‖_1 for k = 1, 5, 10, 50, using an initial guess x_0 not too close to the actual
eigenvector q (so that you can watch the convergence). Determine c = max_{1≤j≤n} |1 −
2 min_{1≤i≤n} M_{ij}| and the absolute value of the second-largest eigenvalue of M.
EXERCISE 15. To see why the second-largest eigenvalue plays a role in bounding
‖M^k x_0 − q‖_1/‖M^{k−1} x_0 − q‖_1, consider an n × n positive column-stochastic matrix M that is
diagonalizable. Let x_0 be any vector with nonnegative components that sum to 1. Since M
is diagonalizable, we can create a basis of eigenvectors {q, v_1, ..., v_{n−1}}, where q is
the steady state vector, and then write x_0 = aq + Σ_{j=1}^{n−1} b_j v_j. Determine M^k x_0, and
then show that a = 1 and the sum of the components of each v_j must equal 0. Next
apply Proposition 4 to prove that, except for the nonrepeated eigenvalue λ = 1, the
other eigenvalues are all strictly less than 1 in absolute value. Use this to evaluate
lim_{k→∞} ‖M^k x_0 − q‖_1/‖M^{k−1} x_0 − q‖_1.
EXERCISE 16. Consider the link matrix

?2 2
A=[O O

Show that M = (1 − m)A + mS (all S_{ij} = 1/3) is not diagonalizable for 0 < m < 1.
EXERCISE 17. How should the value of m be chosen? How does this choice affect
the rankings and the computation time?

REFERENCES

[1] A. Berman and R. Plemmons, Nonnegative Matrices in the Mathematical Sciences, Academic Press, New York, 1979.
[2] M. W. Berry and M. Browne, Understanding Search Engines: Mathematical Modeling and Text Retrieval, 2nd ed., SIAM, Philadelphia, 2005.
[3] M. Bianchini, M. Gori, and F. Scarselli, Inside PageRank, ACM Trans. Internet Tech., 5 (2005), pp. 92-128.
[4] S. Brin and L. Page, The Anatomy of a Large-Scale Hypertextual Web Search Engine, http://www-db.stanford.edu/~backrub/google.html (August 1, 2005).
[5] A. Farahat, T. Lofaro, J. C. Miller, G. Rae, and L. A. Ward, Authority rankings from HITS, PageRank, and SALSA: Existence, uniqueness, and effect of initialization, SIAM J. Sci. Comput., 27 (2006), pp. 1181-1201.
[6] A. Hill, Google Inside Out, Maximum PC, 9 (4) (2004), pp. 44-48.
[7] S. Kamvar, T. Haveliwala, and G. Golub, Adaptive methods for the computation of PageRank, Linear Algebra Appl., 386 (2004), pp. 51-65.
[8] A. N. Langville and C. D. Meyer, A survey of eigenvector methods of Web information retrieval, SIAM Rev., 47 (2005), pp. 135-161.
[9] A. N. Langville and C. D. Meyer, Deeper inside PageRank, Internet Math., 1 (2005), pp. 335-380.
[10] C. D. Meyer, Matrix Analysis and Applied Linear Algebra, SIAM, Philadelphia, 2000.
[11] C. Moler, The World's Largest Matrix Computation, http://www.mathworks.com/company/newsletters/news_notes/clevescorner/oct02_cleve.html (August 1, 2005).
[12] J. Mostafa, Seeking better web searches, Sci. Amer., 292 (2005), pp. 66-73.
[13] S. Robinson, The ongoing search for efficient web search algorithms, SIAM News, 37 (9) (November 2004), p. 4.
[14] I. Rogers, The Google Pagerank Algorithm and How It Works, http://www.iprcom.com/papers/pagerank/ (August 1, 2005).
[15] W. J. Stewart, An Introduction to the Numerical Solution of Markov Chains, Princeton University Press, Princeton, NJ, 1994.

