Case Study
Introduction
The paper discusses the rapid rise of Instagram as a prominent
platform for sharing photos and videos, highlighting its
significant user base and daily photo upload statistics. Despite
its popularity, there has been a lack of comprehensive research
into the platform. The paper argues that Instagram warrants
attention similar to other social media platforms like Twitter. To
address this gap, the study aims to understand the types of
content shared on Instagram and the differences between
users in terms of their posted content. Using computer vision
techniques and human coding, the researchers categorize
Instagram photos into eight types and identify five distinct user
profiles based on their posted content. Surprisingly, the study
finds no strong correlations between user characteristics, such
as follower count, and the types of content they share,
suggesting that audience size is independent of the content
posted on Instagram.
Utilizing Instagram's API, the researchers collect a substantial
dataset comprising user profiles and photos. Through a
combination of computer vision techniques and manual
analysis, they categorize Instagram posts into eight distinct
types based on their content, ranging from self-portraits to pet
photos. Additionally, the study identifies five distinct user
archetypes based on their posting habits.
Of particular interest
is the finding that user characteristics, such as follower count,
do not significantly correlate with the types of content shared.
This implies that the size of a user's audience is not necessarily
influenced by the content they post, challenging conventional
assumptions about engagement on social media platforms.
Overall, the research provides valuable insights into the
dynamics of Instagram as a medium for visual communication
and highlights the need for further exploration in understanding
its societal and cultural implications.
To the best of our knowledge, this is the first paper to conduct
an in-depth analysis of photo content, user activities, and user
types on Instagram. In summary, the main contributions of this
paper are:
• A characterization of the content of photos shared on
Instagram.
• An examination of how the content of photos is related to user
types and characteristics.
Background
Instagram is a widely used mobile app for capturing and
sharing photos and videos, with over 150 million registered
users since its launch in October 2010. Users
leverage its intuitive interface to apply various filters and editing
tools to enhance their images before sharing them instantly on
multiple platforms, including Twitter. The app also enables
users to add captions, hashtags, and tag other users, fostering
social interaction and connectivity. Similar to Twitter, users can
follow others and gain followers, creating an asymmetric social
network where mutual following is not required. Privacy settings
allow users to control who can view their posts, with options for
public or private sharing. Users engage with content by
scrolling through a stream of posts from friends, where they can
like, comment, or favorite content, with interactions displayed in
a dedicated updates page for each user. Given these functions,
we regard Instagram as a kind of social awareness stream
(Naaman, Boase, and Lai 2010) like other social media
platforms such as Facebook and Twitter.
Approach
The analysis of Instagram is based on three components: data
collected using the Instagram API, a qualitative categorization
of Instagram photos, and a quantitative examination of users'
characteristics with respect to their photos. The data includes
profile information, photos, the captions and tags associated
with those photos, and each user's social network of friends
and followers. Below, we first provide details about the dataset
we used, and then discuss how we developed the coding
scheme for categorizing the photos and carried out the coding
process.
DATA COLLECTION
To obtain a random sample of Instagram users and access their
public photos, the researchers initially identified users whose
media appeared on Instagram's public timeline, which
showcases popular content. This process yielded 37 unique
users, predominantly celebrities due to their high popularity.
Subsequently, the researchers crawled the IDs of these users'
followers and friends, amalgamating them into a unified list
comprising 95,343 unique seed users.
Next, the researchers
defined regular active users as those not affiliated with
organizations, brands, or spammers, who had at least 30
friends, 30 followers, and had posted a minimum of 60 photos.
From the seed user list, 13,951 individuals (14.6% of the total)
met these criteria. Out of this subset, the researchers randomly
selected 50 users and obtained their profiles, 20 recent photos
(as random photo downloads were constrained by Instagram's
API limitations), and details of their social network
connections.
Limiting the sample to 50 users was necessary
due to the manual coding required for photo analysis, which is
impractical for a larger dataset. Despite the small sample
size, the researchers assert that the dataset remains
representative, supporting estimates at a 95% confidence
level with a 13% confidence interval (margin of error) for
typical users, which is adequate for the analysis conducted in
the study.
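The selection criteria and the sampling precision described above
can be sketched in a few lines of Python. This is a minimal
illustration, not the researchers' actual pipeline: the user
records, field names, and the helper functions is_regular_active
and margin_of_error are hypothetical, the seed list is synthetic,
and the margin of error uses the standard worst-case formula
(p = 0.5) with a finite population correction, which the source
does not state explicitly.

```python
import math
import random


def is_regular_active(user):
    """Thresholds quoted in the text: at least 30 friends,
    30 followers, and 60 photos, excluding organizations,
    brands, and spammers (field names are assumptions)."""
    return (
        not user["is_org_brand_or_spam"]
        and user["friends"] >= 30
        and user["followers"] >= 30
        and user["photos"] >= 60
    )


def margin_of_error(sample_size, population_size, z=1.96, p=0.5):
    """Worst-case margin of error at the confidence level implied
    by z (1.96 for 95%), with finite population correction."""
    standard_error = math.sqrt(p * (1 - p) / sample_size)
    fpc = math.sqrt(
        (population_size - sample_size) / (population_size - 1)
    )
    return z * standard_error * fpc


# Synthetic stand-in for the crawled seed users (not real data).
rng = random.Random(0)
seed_users = [
    {
        "id": i,
        "friends": rng.randint(0, 200),
        "followers": rng.randint(0, 200),
        "photos": rng.randint(0, 300),
        "is_org_brand_or_spam": rng.random() < 0.1,
    }
    for i in range(5000)
]

# Keep regular active users, then draw 50 without replacement.
eligible = [u for u in seed_users if is_regular_active(u)]
sample = rng.sample(eligible, 50)

# Sample of 50 from the 13,951 eligible users reported in the text.
moe = margin_of_error(sample_size=50, population_size=13951)
print(f"eligible: {len(eligible)}, sampled: {len(sample)}")
print(f"margin of error: {moe:.3f}")
```

With a sample of 50 drawn from 13,951 eligible users, the function
returns a value a little under 0.14, which is consistent with the
roughly 13% figure quoted above.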
CLASSIFICATION OF CONTENT AND CODING METHODOLOGY
In order to characterize the types of photos shared on
Instagram, a systematic approach was employed, involving
both computer vision techniques and human coders. Initially, a
sample of 200 photos was extracted from a larger dataset of
1,000 photos obtained from 50 users (20 photos per user).
Computer vision techniques, specifically the Scale Invariant
Feature Transform (SIFT) algorithm, were utilized to identify
and extract local features from the photos, generating
codebook vectors for each image.
These vectors were then subjected to k-means clustering to
create 15 clusters representing initial coding categories.
To
refine this automated categorization, two human coders, both
regular Instagram users, independently reviewed the photos
within each category. They assessed thematic affinities within
and across categories, making manual adjustments as needed
to ensure accuracy. Through collaborative discussion and
conflict resolution, the coders finalized an 8-category coding
scheme with unanimous agreement (Fleiss’ kappa κ = 1).
Using
this scheme, the remaining 800 photos were categorized by the
two coders based on their main themes and accompanying
descriptions or hashtags. Each photo was assigned a single
category to avoid ambiguity. Initial agreement between coders
was substantial (Fleiss’ kappa κ = 0.75), with discrepancies
resolved by a third-party judge tasked with assigning
unresolved photos to appropriate categories.
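The agreement statistic reported above can be reproduced with a
short Python function. The sketch below implements the standard
Fleiss' kappa formula; the function name fleiss_kappa, the
category labels, and the toy ratings are illustrative assumptions
and are not taken from the study's data.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for items that were each labeled by the same
    number of raters. `ratings` holds one list of labels per item."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    # counts[i][c] = number of raters who put item i in category c
    counts = [Counter(item) for item in ratings]

    # Observed agreement, averaged over items.
    p_items = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_items) / n_items

    # Agreement expected by chance from overall category shares.
    p_cats = [
        sum(c[cat] for c in counts) / (n_items * n_raters)
        for cat in categories
    ]
    p_e = sum(p * p for p in p_cats)

    if p_e == 1.0:
        # Degenerate case: every rating falls in a single category.
        return 1.0
    return (p_bar - p_e) / (1.0 - p_e)


# Hypothetical labels from two coders (placeholders, not study data).
categories = ["selfie", "pet", "food"]
full_agreement = [
    ["selfie", "selfie"], ["pet", "pet"], ["food", "food"],
]
partial_agreement = [
    ["selfie", "selfie"], ["selfie", "pet"],
    ["pet", "pet"], ["pet", "pet"],
]

print(fleiss_kappa(full_agreement, categories))
print(fleiss_kappa(partial_agreement, categories))
```

When the two coders agree on every item, kappa equals 1, matching
the unanimous agreement reported for the final coding scheme; any
disagreement pulls the value below 1.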
For example, a photo I of a dog can have 125 SIFT features
corresponding to the dog’s eyes, legs, ears and so on. These
are expressed in terms of the codebook vector (of size n) as
$I = \langle C_1 : f_1, C_2 : f_2, C_3 : f_3, \ldots, C_n : f_n \rangle$,
where $\sum_{i=1}^{n} f_i = 125$ and $C_i$ is the cluster of all
the features about a specific characteristic of an object in the
image.
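The codebook vector in this example can be illustrated with a
small Python sketch. It assumes that each local feature has
already been assigned to a codebook cluster (the SIFT extraction
and cluster assignment steps are not shown), and the function name
codebook_vector, the cluster count, and the assignments are
hypothetical.

```python
from collections import Counter


def codebook_vector(feature_cluster_ids, n_clusters):
    """Count how many of one image's local features fall into each
    codebook cluster. Clusters are indexed from 0 here, whereas the
    text numbers them C1..Cn."""
    counts = Counter(feature_cluster_ids)
    return [counts.get(cluster_id, 0) for cluster_id in range(n_clusters)]


# Hypothetical photo with 125 features spread over n = 5 clusters.
assignments = [0] * 40 + [1] * 35 + [2] * 30 + [4] * 20
vector = codebook_vector(assignments, n_clusters=5)

print(vector)       # [40, 35, 30, 0, 20]
print(sum(vector))  # 125
```

Each entry is the frequency f_i for one cluster, and the entries
sum to the number of features extracted from the photo. Vectors of
this form are what the k-means step groups into the initial coding
categories.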
Analysis
This section presents analysis of photo content and user types
on Instagram. Our main objective here is to develop a deeper
understanding of the types of photos and active users on
Instagram. Specifically, we aim to address the following
research questions:
• RQ1: What kind of photos do people usually post on
Instagram?
• RQ2: How do the users differ based on the type of images
they post?
• RQ3: How are these differences between users’ photo content
related to the users’ number of followers?
Conclusion
In this paper, the researchers conducted a comprehensive
analysis of photos and users on Instagram, a rapidly growing
social media platform. This study represents the first attempt to
systematically analyze Instagram data to address fundamental
research questions. The analysis revealed eight distinct
categories of photo content on Instagram, and from this, five
different types of users or user clusters were identified based
on their posting behavior. Additionally, statistical significance
tests showed no direct relationship between the number of
followers a user has and the type of user, as determined by
their shared photos.
Future Scope
Looking ahead, the researchers plan to expand their work by
incorporating additional features from Instagram, such as user
bios, hashtags, comments, and social networks. They also
intend to delve into sentiment analysis and explore events
associated with photos and their accompanying text. This future
research aims to provide deeper insights into user behavior and
the dynamics of content sharing on Instagram.
Source: Yuheng Hu, Lydia Manikonda, Subbarao Kambhampati, Department of Computer
Science, Arizona State University, Tempe AZ 85281 {yuhenghu, lmanikon, rao}@asu.edu