Page Rank With 13 Cases
Page Rank With 13 Cases
Page Rank With 13 Cases
GROUP 3
ANSHURAJ KAWLECHA
205123021
ARADHYA GUPTA
205123023
ASHISH
205123025
BHARAT BISHNOI
205123027
TOPICS
• INTRODUCTION OF PAGERANK
• HISTORY
• MATHEMATICAL REPRESENTATION
• STRATEGIES FOR PAGERANK
• TOOLS FOR PAGERANK
• ADVANTAGES AND DISADVANTAGES
• DIFFERENT CASE STUDIES
Introduction
PageRank was developed in the late 1990s during the rapid rise of
the internet ,as the number of web pages increased significantly.
Iterative Algorithm:
for iteration in range(max_iterations):
new_rank = {}
for node in G.nodes:
new_rank[node] = (1 - d) / n # Initialize with (1 - d)/n, where n is the total number of nodes
for node in G.nodes:
for neighbor in G.neighbors(node):
num_neighbors = len(G.neighbors(neighbor))
new_rank[neighbor] += d * (current_rank[node] / num_neighbors) # Update the rank
of the neighbor
# Check for convergence
max_diff = max(abs(new_rank[node] - current_rank[node])
if max_diff < tol:
break
current_rank = new_rank # Update the current rank with the newly calculated ranks
Some Example Based on
damping factor
CONTINUE…
CONTINUE…
STEPS FOR CALCULATION
1.Rank Distribution:
As the algorithm progresses, the PageRank of the page is
distributed among the outgoing links. For example, if a page has
a high PageRank and many outbound links, each link will
contribute to the overall impact of the page. This division ensures
that the importance of linked pages is shared.
CONTINUE…
2.Convergence:
The iterative process continues until the PageRank score
stabilizes or converges. Convergence occurs when the difference
in PageRank scores between successive iterations falls below a
certain threshold. At this point, the algorithm has reached a
stable ranking, and the PageRank scores indicate the relative
importance of each web page.
CONTINUE…
•Example:
TrustRank Reducing spam, trust- Boosts trusted sources, Relies on trusted seed
based reduces spam page selection
NetworkX (Python) Small to medium Easy to use, good for Slow for large datasets
datasets quick prototyping
Computational Complexity
Calculating PageRank for large-scale social networks (e.g.,
millions or billions of users) is computationally expensive
and time-consuming, requiring iterative processing for
convergence.
Choosing the Right Damping Factor
The choice of the damping factor (usually set to 0.85) can
greatly impact the results, but determining the optimal
value for different networks is not straightforward. It can
require experimentation and tuning.
Handling Weighted and Multilayer Networks
Social networks often involve different types of relationships
(e.g., professional, personal, casual), and PageRank does not
natively support multilayer networks. Adapting the algorithm
to weight connections based on their strength or importance
can be complex.
CASE 1:
• Page D is not having any incoming link , But still it is having
some PR this is because of Concept of Random Surfer.
• We are having Damping factor = 0.85. Which means a surfer
can visit a page via a link with probability of 0.85 and can
teleport to a random page with a probability of 1 - 0.85 =
0.15
• Formula used for page rank = (1-d) + d*(pr(t1)/c(t1) +
pr(t2)/c(t2) + . . . . .)
• Pr(D) = (1-0.85) + 0.85 * 0 = 0.15
CASE 2: A simple hierarchy with some outgoing links
• Since each page in the loop links to the same number of other
pages, the PageRank is distributed equally among all the
pages.
• No page receives more or less PageRank than the others, resulting
in an equilibrium where every page has the same rank (assuming
there are no external backlinks from other sites or pages).
CASE 7: Extensive linking or Fully Meshed linking
• Since every page links to every other page, the PageRank gets evenly
distributed among all the pages in the network.
• Each page effectively passes a portion of its PageRank to all other
pages, resulting in a situation where, after multiple iterations, each
page has nearly the same amount of PageRank (assuming there
are no external factors or links from outside this group).
CASE 8: Hierarchical - but with a link in and
one out.
The external site (A) with a high PR (1.0) linking directly to the
home page has a strong influence on the overall site. This single
link increases the PR of not just the home page but also indirectly
boosts the PR of linked internal pages ("About," "Product," and
"More") compared to previous (Case - 05) through a feedback
effect.
The way the pages within the site are linked together helps boost
the PR from the external link. This increase in PR gets shared
among other pages on the site, raising their PR as well.
Fully meshed, but it has one Vote in and one Vote out.
(here Vote is Name given to the Outgoing edge from a Node).
It is similar to the Case 7 but the only difference is these extra two votes. In a
fully meshed network with one vote in and one vote out, every page ends up
with an equal PageRank due to the circular nature of the links. This illustrates
how the PageRank algorithm evaluates link structure and the influence of
pages, leading to equal ranking in this specific setup
Observation:
2. When all the node are having the same PageRank then this is
used by the Firms to make sure that the ranking of the page
is maintained.
1. In this Case we can see that there’s a possibility that there can exit a
Page which is having the PageRank greater than the Home Page. The
reason here is that the Page B is getting all the Vote from Page A but,
Page A is getting fraction of Pages from B,C and D.
1. It does not matter how many pages do we use to make our site but the
average of the PageRank of all the combine will always be 1.000
• Wikipedia
• GeeksForGeeks
Thank
You !