10 - Social Network Analysis


SOCIAL NETWORK ANALYSIS
WEEK 10 BIG DATA AND DATA ANALYTICS
ANDRY ALAMSYAH
@ANDRYBREW
OUTLINE
o Social Network Introduction & Background
o Background Network Analysis, Social Science and Other Domains
o SNA Practical Applications
o SNA Basic Concept (Network and Tie Strength)
BACKGROUND
• The vast majority of internet and social media data is unstructured
• Unstructured data is both a Big Data challenge and an opportunity
• Two approaches to analyzing unstructured data:
  • Content aspect (text mining: sentiment analysis, text summarization, etc.)
  • Structure aspect (network data: SNA)
• The content and structure aspects answer different questions, so they
lead to different models and approaches
• SNA is the fastest way to process Big Data
• SNA can handle large-scale data in real time (Stream data)
SOCIAL NETWORK INTRODUCTION
Definition:
A social network is a social structure, community, or society made of nodes, which
generally represent actors (individuals or organizations). It indicates the ways in which they
are connected via edges, which represent various social familiarities, affiliations, and/or
relationships ranging from casual acquaintance to close familial bonds (Wikipedia)

Social network methodology is one approach to solving the unstructured data problem


Network Analysis
SNA's origins lie in social science and network analysis (graph theory)

Network analysis is concerned with the formulation and solution of problems that have a network structure; such structure is usually captured in a graph

Graph theory provides a set of abstract concepts and methods for the analysis of graphs. These, in combination with other analytical tools and with methods for the visualization and analysis of social networks, form the basis of what we call SNA methods.

SNA is not just a methodology; it is a unique perspective on how society functions. Instead of focusing on individuals and their attributes, it centers on relations between individuals, groups, or social institutions

[Figure: Newman et al, 2006] A very early example of network analysis comes from the city of Königsberg (now Kaliningrad). The famous mathematician Leonhard Euler used a graph to prove that there is no path that crosses each of the city’s bridges exactly once (Newman et al, 2006).
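Euler's Königsberg argument can be checked by counting degrees: a connected graph has a walk that uses every edge exactly once only if zero or two of its vertices have an odd number of edges. The sketch below assumes the usual modeling of the city as four land masses joined by seven bridges; the labels A to D are arbitrary names chosen here for illustration.

```python
from collections import Counter

# Seven bridges between four land masses (labels A-D are illustrative).
# A is the central island; repeated pairs are parallel bridges.
bridges = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]

# Degree of a land mass = number of bridge ends that touch it.
degree = Counter()
for a, b in bridges:
    degree[a] += 1
    degree[b] += 1

odd = sorted(v for v, d in degree.items() if d % 2 == 1)

# Euler's condition for a walk crossing every bridge exactly once.
has_euler_walk = len(odd) in (0, 2)

print(dict(degree))
print("odd-degree land masses:", odd)
print("walk over every bridge exactly once possible:", has_euler_walk)
```

All four land masses have odd degree, so the condition fails and no such walk exists.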
Social Science
Studying society from a network perspective is to study
individuals as embedded in a network of relations and seek
explanations for social behavior in the structure of these
networks rather than in the individuals alone. This ‘network
perspective’ becomes increasingly relevant in a society that
Manuel Castells has dubbed the network society.

SNA has a long history in social science, although much of the work in advancing its methods has also come from mathematicians, physicists, biologists and computer scientists (because they too study networks of different types)

The idea that networks of relations are important in social science is not new, but widespread availability of data and advances in computing and methodology have made it much easier now to apply SNA to a range of problems

[Figure: Wellman, 1998] This is an early depiction of what we call an ‘ego’ network, i.e. a personal network. The graphic depicts varying tie strengths via concentric circles (Wellman, 1998)
Other Domains
(Social) Network Analysis has found applications in many
domains beyond social science, although the greatest
advances have generally been in relation to the study of
structures generated by humans

Computer scientists, for example, have used (and even developed new) network analysis methods to study webpages, Internet traffic, information dissemination, etc.

One example in life sciences is the use of network analysis to study food chains in different ecosystems

Mathematicians and (theoretical) physicists usually focus on producing new and complex methods for the analysis of networks that can be used by anyone, in any domain where networks are relevant

[Figure: Broder et al, 2000] In this example researchers collected a very large amount of data on the links between web pages and found out that the Web consists of a core of densely inter-linked pages, while most other web pages either link to or are linked to from that core. It was one of the first such insights into very large scale human-generated structures (Broder et al, 2000).
Practical applications
Businesses use SNA to analyze and improve communication flow in their organization, or with their networks of partners and customers

Law enforcement agencies (and the army) use SNA to identify criminal and terrorist networks from traces of communication that they collect, and then identify key players in these networks

Social Network Sites like Facebook use basic elements of SNA to identify and recommend potential friends based on friends-of-friends

Civil society organizations use SNA to uncover conflicts of interest in hidden connections between government bodies, lobbies and businesses

Network operators (telephony, cable, mobile) use SNA-like methods to optimize the structure and capacity of their networks
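The friends-of-friends idea mentioned above can be sketched in a few lines. This is only an illustration of the principle, not how any particular site implements it; the names and friendship data are hypothetical.

```python
from collections import Counter

# Hypothetical friendship data: each person maps to the set of their friends.
friends = {
    "ana": {"ben", "cai"},
    "ben": {"ana", "dee"},
    "cai": {"ana", "dee", "eli"},
    "dee": {"ben", "cai"},
    "eli": {"cai"},
}

def recommend(person, friends):
    """Rank non-friends by how many friends they share with `person`."""
    mutual_counts = Counter()
    for friend in sorted(friends[person]):
        for candidate in sorted(friends[friend]):
            # Skip the person themselves and people they already know.
            if candidate != person and candidate not in friends[person]:
                mutual_counts[candidate] += 1
    return mutual_counts.most_common()

suggestions = recommend("ana", friends)
print(suggestions)
```

Here "dee" is suggested first because two of ana's friends already know her, while "eli" is reachable through only one friend.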
Why and when to use SNA
• Whenever you are studying a social network, either offline or online, or when you wish to understand
how to improve the effectiveness of the network
• When you want to visualize your data so as to uncover patterns in relationships or interactions
• When you want to follow the paths that information (or basically anything) follows in social networks
• When you do quantitative research, although for qualitative research a network perspective is also
valuable
(a) The range of actions and opportunities afforded to individuals are often a function of their positions
in social networks; uncovering these positions (instead of relying on common assumptions based on
their roles and functions, say as fathers, mothers, teachers, workers) can yield more interesting and
sometimes surprising results
(b) A quantitative analysis of a social network can help you identify different types of actors in the
network or key players, whom you can focus on for your qualitative research
• SNA is clearly also useful in analyzing social network sites (SNSs), online communities (OCs), and social media in general, to test hypotheses on
online behavior and computer-mediated communication (CMC), to identify the causes of dysfunctional communities or networks, and to
promote social cohesion and growth in an online community
Basic Concepts

• Networks: how to represent various social networks
• Tie Strength: how to identify strong/weak ties in the network
• Key Players: how to identify key/central nodes in the network
• Cohesion: measures of overall network structure

Representing relations as networks

[Figure: four people: Anne, Jim, John, and Mary]
Can we study their interactions as a network?

Communication
Anne: Jim, tell the Murrays they’re invited
Jim: Mary, you and your dad should come for dinner!
Jim: Mr. Murray, you should both come for dinner
Anne: Mary, did Jim tell you about the dinner? You must come.
John: Mary, are you hungry?

Graph
[Figure: the same interactions drawn as a graph of four vertices (nodes), numbered 1 to 4, joined by edges (links)]
Directed Graph: Different Formats

Edge list (first vertex is the source, second is the target)
Vertex  Vertex
1       2
1       3
2       3
2       4
4       3

Adjacency matrix (rows are sources, columns are targets)
Vertex  1  2  3  4
1       -  1  1  0
2       0  -  1  1
3       0  0  -  0
4       0  0  1  -
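The two formats carry the same information, so one can be derived from the other. A minimal sketch, using the directed edges encoded in the adjacency matrix above (1→2, 1→3, 2→3, 2→4, 4→3); the function name is chosen here for illustration, and the diagonal is stored as 0 rather than "-".

```python
# Directed edges as (source, target), read off the adjacency matrix above.
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (4, 3)]
vertices = [1, 2, 3, 4]

def adjacency_matrix(vertices, edges):
    """Build a 0/1 adjacency matrix (rows = sources, columns = targets)."""
    index = {v: i for i, v in enumerate(vertices)}
    matrix = [[0] * len(vertices) for _ in vertices]
    for source, target in edges:
        matrix[index[source]][index[target]] = 1
    return matrix

directed_matrix = adjacency_matrix(vertices, edges)
for vertex, row in zip(vertices, directed_matrix):
    print(vertex, row)

# Row sums give out-degree (ties sent); column sums give in-degree (ties received).
out_degree = {v: sum(directed_matrix[i]) for i, v in enumerate(vertices)}
in_degree = {v: sum(row[i] for row in directed_matrix) for i, v in enumerate(vertices)}
print(out_degree, in_degree)
```

An edge list is usually the more compact format for sparse networks, while the matrix makes row and column sums (who sends, who receives) easy to read.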
Representing an Undirected Graph

Directed (who contacts whom) vs. undirected (who knows whom)

Edge list remains the same, but the interpretation is different now
Vertex  Vertex
1       2
1       3
2       3
2       4
3       4

Adjacency matrix becomes symmetric
Vertex  1  2  3  4
1       -  1  1  0
2       1  -  1  1
3       1  1  -  1
4       0  1  1  -
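For the undirected case, each edge is entered in both directions, which is what makes the matrix symmetric. A small sketch under that convention (variable names are illustrative; the diagonal is stored as 0):

```python
# Undirected edges from the edge list above; order within a pair does not matter.
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
vertices = [1, 2, 3, 4]
index = {v: i for i, v in enumerate(vertices)}
n = len(vertices)

undirected_matrix = [[0] * n for _ in range(n)]
for a, b in edges:
    # Mark the tie in both directions.
    undirected_matrix[index[a]][index[b]] = 1
    undirected_matrix[index[b]][index[a]] = 1

is_symmetric = all(
    undirected_matrix[i][j] == undirected_matrix[j][i]
    for i in range(n)
    for j in range(n)
)

# In an undirected graph, a row sum is simply the number of ties of that vertex.
degree = {v: sum(undirected_matrix[index[v]]) for v in vertices}

for vertex, row in zip(vertices, undirected_matrix):
    print(vertex, row)
print("symmetric:", is_symmetric, "degree:", degree)
```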
Ego Networks and ‘Whole’ Networks
[Figure: an ego network (an ego and its alters) and a ‘whole’ network* that includes an isolate]
* no studied network is ‘whole’ in practice; it’s usually a partial picture of one’s real life networks (boundary specification problem)
** ego not needed for analysis as all alters are by definition connected to ego
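Extracting an ego network from a larger edge list can be sketched as follows. The data is hypothetical: it reuses a four-vertex undirected graph and adds a vertex 5 with no ties to illustrate an isolate. The function name and the `include_ego` flag are choices made for this example.

```python
# Hypothetical undirected network; vertex 5 has no ties (an isolate).
vertices = [1, 2, 3, 4, 5]
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]

def ego_network(edges, ego, include_ego=True):
    """Return the ego's alters and the ties among the selected members."""
    alters = set()
    for a, b in edges:
        if a == ego and b != ego:
            alters.add(b)
        elif b == ego and a != ego:
            alters.add(a)
    members = alters | {ego} if include_ego else alters
    ties = [(a, b) for a, b in edges if a in members and b in members]
    return alters, ties

alters, ties_with_ego = ego_network(edges, ego=1)
_, ties_without_ego = ego_network(edges, ego=1, include_ego=False)

# An isolate is a vertex that appears in no edge at all.
isolates = [v for v in vertices if all(v not in edge for edge in edges)]

print("alters of 1:", sorted(alters))
print("ties including ego:", ties_with_ego)
print("ties among alters only:", ties_without_ego)
print("isolates:", isolates)
```

Dropping the ego keeps only the ties among alters, which is often the informative part, since every alter is tied to the ego by definition.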
Basic Concepts: Tie Strength
How to identify strong/weak ties in the network

Adding Weights to Edges (directed or undirected)

Edge list: add column of weights
Vertex  Vertex  Weight
1       2       30
1       3       5
2       3       22
2       4       2
3       4       37

Adjacency matrix: add weights instead of 1
Vertex  1   2   3   4
1       -   30  5   0
2       30  -   22  2
3       5   22  -   37
4       0   2   37  -

Weights could be:
• Frequency of interaction in period of observation
• Number of items exchanged in period
• Individual perceptions of strength of relationship
• Costs in communication or exchange, e.g. distance
• Combinations of these
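The weighted edge list above converts to a weighted matrix in the same way as the unweighted one, with the weight stored in place of 1. A short sketch, treating the graph as undirected; the per-vertex sum of weights (called `strength` here) is an extra illustration and is not defined in the slides.

```python
# Weighted undirected edges from the table above: (vertex, vertex, weight).
weighted_edges = [(1, 2, 30), (1, 3, 5), (2, 3, 22), (2, 4, 2), (3, 4, 37)]
vertices = [1, 2, 3, 4]
index = {v: i for i, v in enumerate(vertices)}
n = len(vertices)

weight_matrix = [[0] * n for _ in range(n)]
for a, b, weight in weighted_edges:
    # Undirected: store the weight in both directions.
    weight_matrix[index[a]][index[b]] = weight
    weight_matrix[index[b]][index[a]] = weight

# Total weight attached to each vertex (row sum).
strength = {v: sum(weight_matrix[index[v]]) for v in vertices}

# The heaviest edge is a candidate for the strongest tie.
strongest = max(weighted_edges, key=lambda edge: edge[2])

for vertex, row in zip(vertices, weight_matrix):
    print(vertex, row)
print("strength:", strength)
print("strongest tie:", strongest)
```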
Edge Weights as Relationship Strength
• Edges can represent interactions, flows of
information or goods, similarities/affiliations, or
social relations
• Specifically for social relations, a ‘proxy’ for the
strength of a tie can be:
(a) the frequency of interaction (communication) or
the amount of flow (exchange)
(b) reciprocity in interaction or flow
(c) the type of interaction or flow between the two
parties (e.g., intimate or not)
(d) other attributes of the nodes or ties (e.g., kin
relationships)
(e) the structure of the nodes’ neighborhood (e.g.
many mutual ‘friends’)
• Surveys and interviews allow us to establish the
existence of mutual or one-sided strength/affection
with greater certainty, but the proxies above are also
useful
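Three of the proxies above, frequency (a), reciprocity (b), and mutual contacts (e), can be computed directly from a log of who contacted whom. The sketch below uses the five dinner messages from the earlier example, under the assumption that "Mr. Murray" is John; the helper names are illustrative.

```python
from collections import Counter

# (sender, receiver) for each message; assumes "Mr. Murray" is John.
messages = [
    ("Anne", "Jim"),
    ("Jim", "Mary"),
    ("Jim", "John"),
    ("Anne", "Mary"),
    ("John", "Mary"),
]

# Proxy (a): how often each pair interacted, regardless of direction.
frequency = Counter(frozenset(pair) for pair in messages)

# Proxy (b): a tie is reciprocated if messages went both ways.
sent = set(messages)

def is_reciprocated(a, b):
    return (a, b) in sent and (b, a) in sent

# Proxy (e): contacts that two people have in common.
contacts = {}
for a, b in messages:
    contacts.setdefault(a, set()).add(b)
    contacts.setdefault(b, set()).add(a)

def mutual_contacts(a, b):
    return (contacts[a] & contacts[b]) - {a, b}

print(frequency[frozenset(("Anne", "Jim"))])
print(is_reciprocated("Anne", "Jim"))
print(sorted(mutual_contacts("Jim", "Mary")))
```

With only five one-way messages, no tie is reciprocated yet; a longer observation period would be needed before these proxies say much about strength.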
Homophily, Transitivity, and Bridging
• Homophily is the tendency to relate to people with similar characteristics (status, beliefs, etc.)
  • It leads to the formation of homogeneous groups (clusters) where forming relations is easier
  • Extreme homogenization can act counter to innovation and idea generation (heterophily is thus desirable in some contexts)
  • Homophilous ties can be strong or weak
• Transitivity in SNA is a property of ties: if there is a tie between A and B and one between B and C, then in a transitive network A and C will also be connected
  • Strong ties are more often transitive than weak ties; transitivity is therefore evidence for the existence of strong ties (but not a necessary or sufficient condition)
  • Transitivity and homophily together lead to the formation of cliques (fully connected clusters)
• Bridges are nodes and edges that connect across groups
  • They facilitate inter-group communication, increase social cohesion, and help spur innovation
  • They are usually weak ties, but not every weak tie is a bridge

[Diagram labels: Homophily, Heterophily; Strong ties, Weak ties; Transitivity, Bridging; Clustering; Cliques; Interlinked groups; Social network]
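Transitivity and bridging can both be measured on an edge list. The sketch below uses a hypothetical network of two triangles joined by a single tie. It computes the share of connected triples (A-B and B-C) that are closed by an A-C tie, and it finds bridge edges in the strict graph-theoretic sense (edges whose removal disconnects their endpoints), which is narrower than the slide's notion of ties that connect across groups.

```python
from itertools import combinations

# Hypothetical network: triangles {1,2,3} and {4,5,6} joined by the tie 3-4.
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (5, 6), (4, 6)]

neighbors = {}
for a, b in edges:
    neighbors.setdefault(a, set()).add(b)
    neighbors.setdefault(b, set()).add(a)

def transitivity(neighbors):
    """Fraction of connected triples whose two outer nodes are also tied."""
    triples = closed = 0
    for center, adjacent in neighbors.items():
        for a, b in combinations(sorted(adjacent), 2):
            triples += 1
            if b in neighbors[a]:
                closed += 1
    return closed / triples if triples else 0.0

def still_connected(neighbors, start, goal, removed):
    """Depth-first search from start to goal that ignores the removed edge."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        for nxt in neighbors[node]:
            if {node, nxt} == set(removed) or nxt in seen:
                continue
            seen.add(nxt)
            stack.append(nxt)
    return False

score = transitivity(neighbors)
bridges = [e for e in edges if not still_connected(neighbors, e[0], e[1], e)]

print("transitivity:", score)
print("bridges:", bridges)
```

The tie 3-4 is the only bridge: it is the one link between two otherwise closed, fully transitive groups.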
ASSIGNMENT
• Use SNA software tools to manage network data (edge list) and
visualize it.
• Use several tools: R, Gephi, and others
• Consult the lab assistant
