Lecture 20


Data Science

CSE-4075
(Social Network Analysis)
Social network analysis
• Social network analysis is the study of social entities (people
in an organization, called actors) and their interactions and
relationships.
• The interactions and relationships can be represented
with a network or graph,
– each vertex (or node) represents an actor and
– each link represents a relationship.
• From the network, we can study the properties of its
structure, and the role, position and prestige of each
social actor.
• We can also find various kinds of sub-graphs, e.g.,
communities formed by groups of actors.
Social network and the Web
• Social network analysis is useful for the Web because the
Web is essentially a virtual society, and thus a virtual
social network,
– Each page: a social actor and
– each hyperlink: a relationship.
• Many results from social network analysis can be adapted and
extended for use in the Web context.
• We study two types of social network analysis, centrality
and prestige, which are closely related to hyperlink
analysis and search on the Web.
Centrality
• Important or prominent actors are those that
are linked or involved with other actors
extensively.
• A person with extensive contacts (links) or
communications with many other people in
the organization is considered more important
than a person with relatively fewer contacts.
• The links can also be called ties. A central
actor is one involved in many ties.
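A minimal sketch of counting ties per actor, assuming an undirected network stored as an adjacency dictionary. The actor names and the division by n - 1 (the largest possible number of ties) are illustrative assumptions, not taken from the slides.

```python
def degree_centrality(adj):
    """Return each actor's number of ties divided by (n - 1)."""
    n = len(adj)
    return {actor: len(ties) / (n - 1) for actor, ties in adj.items()}

# Hypothetical organization: each key lists the actor's direct contacts.
contacts = {
    "ann": ["bob", "cat", "dan"],
    "bob": ["ann"],
    "cat": ["ann", "dan"],
    "dan": ["ann", "cat"],
}

centrality = degree_centrality(contacts)
for actor, score in sorted(centrality.items()):
    print(actor, round(score, 3))
```

Here "ann" is tied to every other actor, so she receives the highest score.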
Centrality: who’s important based on their network position
In each of the following networks, X has higher centrality than Y according to
a particular measure.

[Figure: four example networks, one per measure: indegree, outdegree, betweenness, closeness]


Degree Centrality
Closeness Centrality
Closeness Centrality

Example network: A - B - C - D - E (a chain of five nodes)

$$C_c'(A) = \left[\frac{\sum_{j=1}^{N} d(A,j)}{N-1}\right]^{-1} = \left[\frac{1+2+3+4}{4}\right]^{-1} = \left[\frac{10}{4}\right]^{-1} = 0.4$$
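The closeness value of A in the five-node chain can be checked with a short breadth-first search. This is a sketch that assumes an undirected, connected network stored as an adjacency dictionary; the function names are illustrative.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path distance (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closeness(adj, node):
    """Inverse of the average distance from node to the other N - 1 nodes."""
    dist = bfs_distances(adj, node)
    total = sum(d for other, d in dist.items() if other != node)
    return (len(adj) - 1) / total

# The chain A - B - C - D - E from the example.
chain = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}

closeness_a = closeness(chain, "A")
closeness_c = closeness(chain, "C")
print(closeness_a, closeness_c)
```

The middle node C is closer to everyone else than the end node A, so its closeness is higher.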
Betweenness Centrality
• If two non-adjacent actors j and k want to
interact and actor i is on the path between j
and k, then i may have some control over the
interactions between j and k.
• Betweenness measures this control of i over
other pairs of actors. Thus,
– if i is on the paths of many such interactions, then
i is an important actor.
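Betweenness Centrality

$$C_B(i) = \sum_{j<k} \frac{p_{jk}(i)}{p_{jk}}$$

where pjk is the number of shortest paths between j and k, and pjk(i) is
the number of those paths that pass through i.

[Figure: example network with nodes a, b, c, d, e, f]

Calculation for node c

(j,k)    pjk    pjk(i)    pjk(i)/pjk
(a,b)    1      0         0
(a,d)    1      1         1
(a,e)    1      1         1
(a,f)    1      1         1
(b,d)    1      1         1
(b,e)    1      1         1
(b,f)    1      1         1
(d,e)    1      0         0
(d,f)    1      0         0
(e,f)    1      0         0

Betweenness centrality of node c = 6
Betweenness centrality of node a = 0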
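The calculation for node c can be reproduced with a short script. This is a sketch under assumptions: the edge list below is a hypothetical network chosen to be consistent with the listed pair counts (the original figure's edges are not given in the text), and the function names are illustrative.

```python
from collections import deque
from itertools import combinations

def bfs_counts(adj, source):
    """Distances from source and the number of shortest paths to each node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma

def betweenness(adj, i):
    """Sum of p_jk(i) / p_jk over all pairs j < k that exclude i."""
    info = {s: bfs_counts(adj, s) for s in adj}
    dist_i, sigma_i = info[i]
    total = 0.0
    others = sorted(n for n in adj if n != i)
    for j, k in combinations(others, 2):
        dist_j, sigma_j = info[j]
        if k not in dist_j or i not in dist_j or k not in dist_i:
            continue  # no path between j and k, or none via i
        if dist_j[i] + dist_i[k] == dist_j[k]:
            # i lies on a shortest path: paths j..i times paths i..k
            total += sigma_j[i] * sigma_i[k] / sigma_j[k]
    return total

# Assumed edges: a-b, a-c, b-c, c-d, c-e, c-f, d-e, d-f, e-f.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d", "e", "f"],
    "d": ["c", "e", "f"],
    "e": ["c", "d", "f"],
    "f": ["c", "d", "e"],
}

bc_c = betweenness(graph, "c")
bc_a = betweenness(graph, "a")
print(bc_c, bc_a)
```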
Prestige
• Prestige is a more refined measure of prominence of an
actor than centrality.
– It distinguishes ties sent (out-links) from ties received (in-links).
• A prestigious actor is one who is the object of extensive ties as
a recipient.
– To compute prestige, we use only in-links.
• Difference between centrality and prestige:
– centrality focuses on out-links
– prestige focuses on in-links.
• We study three prestige measures.
Degree prestige
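A minimal sketch of degree prestige, assuming it is computed as an actor's in-degree divided by n - 1 (the largest possible number of in-links). The normalization and the example network are assumptions, since no formula is given in the text.

```python
def degree_prestige(out_links):
    """In-degree of each actor divided by (n - 1)."""
    n = len(out_links)
    in_degree = {actor: 0 for actor in out_links}
    for source, targets in out_links.items():
        for target in targets:
            in_degree[target] += 1
    return {actor: count / (n - 1) for actor, count in in_degree.items()}

# Hypothetical directed network: each key lists the actors it points to.
votes = {
    "ann": ["cat"],
    "bob": ["cat"],
    "cat": ["dan"],
    "dan": [],
}

prestige = degree_prestige(votes)
for actor, score in sorted(prestige.items()):
    print(actor, round(score, 3))
```

Only in-links count here: "ann" and "bob" each send a tie but receive none, so their prestige is 0.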
Proximity prestige
• The degree index of prestige of an actor i only considers
the actors that are adjacent to i.
• The proximity prestige generalizes it by considering both
the actors directly and indirectly linked to actor i.
– We consider every actor j that can reach i.
• Let Ii be the set of actors that can reach actor i.
• The proximity is defined as closeness or distance of
other actors to i.
• Let d(j, i) denote the distance from actor j to actor i.
Proximity prestige
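A minimal sketch, assuming proximity prestige is the fraction of actors that can reach i, |Ii| / (n - 1), divided by their average distance to i. This form and the example network are assumptions; the slide text only defines Ii and d(j, i).

```python
from collections import deque

def proximity_prestige(out_links, i):
    """(|I_i| / (n - 1)) divided by the average distance d(j, i) over I_i."""
    # Reverse the links so a search from i walks backwards along in-links.
    in_links = {actor: [] for actor in out_links}
    for source, targets in out_links.items():
        for target in targets:
            in_links[target].append(source)

    dist = {i: 0}
    queue = deque([i])
    while queue:
        u = queue.popleft()
        for v in in_links[u]:
            if v not in dist:
                dist[v] = dist[u] + 1  # d(v, i) in the original direction
                queue.append(v)

    reachers = [j for j in dist if j != i]  # the set I_i
    if not reachers:
        return 0.0
    fraction = len(reachers) / (len(out_links) - 1)
    average_distance = sum(dist[j] for j in reachers) / len(reachers)
    return fraction / average_distance

# Hypothetical directed network: d -> a -> c and b -> c.
links = {"a": ["c"], "b": ["c"], "c": [], "d": ["a"]}

pp_c = proximity_prestige(links, "c")
pp_a = proximity_prestige(links, "a")
pp_d = proximity_prestige(links, "d")
print(pp_c, pp_a, pp_d)
```

Actor c is reached by all three other actors (distances 1, 1 and 2), while d is reached by nobody.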
Rank prestige
• In the previous two prestige measures, an important
factor is not considered:
– the prominence of the individual actors who do the “voting”
• In the real world, a person i chosen by an important
person is more prestigious than one chosen by a less
important person.
– For example, a vote from a company CEO counts for much more
than a vote from a worker.
• If one’s circle of influence is full of prestigious actors,
then one’s own prestige is also high.
– Thus one’s prestige is affected by the ranks or statuses of the
involved actors.
Rank prestige (cont …)
• Based on this intuition, the rank prestige PR(i) is defined as
a linear combination of links that point to i:
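A minimal sketch, assuming the linear combination takes the form PR(i) = A_1i PR(1) + A_2i PR(2) + ... + A_ni PR(n), where A_ji = 1 if actor j points to actor i and 0 otherwise. In matrix form this is P = A^T P, which can be approximated by repeated multiplication. The example matrix, the iteration count, and the rescaling at each step (added so the values stay bounded) are illustrative assumptions.

```python
def rank_prestige(A, iterations=100):
    """Approximate P = A^T P by repeated multiplication with rescaling."""
    n = len(A)
    p = [1.0 / n] * n
    for _ in range(iterations):
        # Each actor collects the current prestige of everyone pointing to it.
        new_p = [sum(A[j][i] * p[j] for j in range(n)) for i in range(n)]
        total = sum(new_p)
        p = [x / total for x in new_p]  # rescale so the scores sum to 1
    return p

# Hypothetical network: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
A = [
    [0, 1, 1],
    [0, 0, 1],
    [1, 0, 0],
]

ranks = rank_prestige(A)
print([round(r, 3) for r in ranks])
```

Actor 2 receives two in-links and ends up with the highest score. Actor 0 has a single in-link, but it comes from the prestigious actor 2, so it outranks actor 1, whose single in-link comes from actor 0.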
Co-citation and Bibliographic Coupling
• Another area of research concerned with links is citation
analysis of scholarly publications.
– A scholarly publication cites related prior work to acknowledge
the origins of some ideas and to compare the new proposal with
existing work.
• When a paper cites another paper, a relationship is
established between the publications.
– Citation analysis uses these relationships (links) to perform
various types of analysis.
• We discuss two types of citation analysis, co-citation and
bibliographic coupling.
Co-citation
• If papers i and j are both cited by paper k, then they may
be related in some sense to one another.
• The more papers cite both of them, the stronger their
relationship is.
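A minimal sketch of a co-citation count, assuming citations are stored as a dictionary from each paper to the set of papers it cites. The paper names are hypothetical.

```python
def cocitation(cites, i, j):
    """Number of papers k that cite both paper i and paper j."""
    return sum(1 for cited in cites.values() if i in cited and j in cited)

# Hypothetical citation data: paper -> set of papers it cites.
citations = {
    "k1": {"p1", "p2"},
    "k2": {"p1", "p2", "p3"},
    "k3": {"p3"},
    "p1": set(),
    "p2": set(),
    "p3": set(),
}

cc_p1_p2 = cocitation(citations, "p1", "p2")
cc_p1_p3 = cocitation(citations, "p1", "p3")
print(cc_p1_p2, cc_p1_p3)
```

Papers p1 and p2 are cited together by both k1 and k2, so they are more strongly related than p1 and p3, which share only k2.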
Bibliographic coupling
• Bibliographic coupling operates on a similar principle.
• Bibliographic coupling links papers that cite the same
articles
– if papers i and j both cite paper k, they may be related.
• The more papers they both cite, the stronger their
similarity is.
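A minimal sketch of a bibliographic coupling count, assuming the same kind of dictionary from each paper to the set of papers it cites. The paper names are hypothetical.

```python
def bibliographic_coupling(cites, i, j):
    """Number of papers k that are cited by both paper i and paper j."""
    return len(cites[i] & cites[j])

# Hypothetical citation data: paper -> set of papers it cites.
references = {
    "p1": {"k1", "k2", "k3"},
    "p2": {"k2", "k3"},
    "p3": {"k4"},
}

bc_p1_p2 = bibliographic_coupling(references, "p1", "p2")
bc_p1_p3 = bibliographic_coupling(references, "p1", "p3")
print(bc_p1_p2, bc_p1_p3)
```

Co-citation looks at who cites a pair of papers, while bibliographic coupling looks at what the pair cites; here p1 and p2 share two references and p1 and p3 share none.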
