Case Study


CASE STUDY: What We Instagram: A First Analysis of Instagram
Photo Content and User Types

Introduction
The paper discusses the rapid rise of Instagram as a prominent
platform for sharing photos and videos, highlighting its
significant user base and daily photo upload statistics. Despite
its popularity, there has been a lack of comprehensive research
into the platform. The paper argues that Instagram warrants
attention similar to other social media platforms like Twitter. To
address this gap, the study aims to understand the types of
content shared on Instagram and the differences between
users in terms of their posted content. Using computer vision
techniques and human coding, the researchers categorize
Instagram photos into eight types and identify five distinct user
profiles based on their posted content. Surprisingly, the study
finds no strong correlations between user characteristics, such
as follower count, and the types of content they share,
suggesting that audience size is independent of the content
posted on Instagram.
Utilizing Instagram's API, the researchers collect a substantial
dataset comprising user profiles and photos. Through a
combination of computer vision techniques and manual
analysis, they categorize Instagram posts into eight distinct
types based on their content, ranging from self-portraits to pet
photos. Additionally, the study identifies five distinct user
archetypes based on their posting habits. Of particular interest
is the finding that user characteristics, such as follower count,
do not significantly correlate with the types of content shared.
This implies that the size of a user's audience is not necessarily
influenced by the content they post, challenging conventional
assumptions about engagement on social media platforms.
Overall, the research provides valuable insights into the
dynamics of Instagram as a medium for visual communication
and highlights the need for further exploration in understanding
its societal and cultural implications.
To the best of our knowledge, this is the first paper to conduct
a deep analysis of photo content and user activities and types on
Instagram. In summary, the main contributions of this paper
are:
• A characterization of the content of photos shared on
Instagram.
• An examination of how the content of photos is related to user
types and characteristics.

Background
In essence, Instagram is a widely-used mobile app for
capturing and sharing photos and videos, boasting over 150
million registered users since its launch in October 2010. Users
leverage its intuitive interface to apply various filters and editing
tools to enhance their images before sharing them instantly on
multiple platforms, including Twitter. The app also enables
users to add captions, hashtags, and tag other users, fostering
social interaction and connectivity. Similar to Twitter, users can
follow others and gain followers, creating an asymmetric social
network where mutual following is not required. Privacy settings
allow users to control who can view their posts, with options for
public or private sharing. Users engage with content by
scrolling through a stream of posts from friends, where they can
like, comment, or favorite content, with interactions displayed in
a dedicated updates page for each user. Given these functions,
we regard Instagram as a kind of social awareness stream
(Naaman, Boase, and Lai 2010) like other social media
platforms such as Facebook and Twitter.

Approach
The analysis of Instagram is based on data collected using the
Instagram API, a qualitative categorization of Instagram
photos, and a quantitative examination of users’ characteristics
with respect to their photos. The data includes profile
information, photos, captions and tags associated with photos,
and each user’s social network of friends and followers.
Below, we first provide details about the dataset we used, and
later discuss how we develop a coding scheme for categorizing
the photos and the coding process.
DATA COLLECTION
To obtain a random sample of Instagram users and access their
public photos, the researchers initially identified users whose
media appeared on Instagram's public timeline, which
showcases popular content. This process yielded 37 unique
users, predominantly celebrities due to their high popularity.
Subsequently, the researchers crawled the IDs of these users'
followers and friends, amalgamating them into a unified list
comprising 95,343 unique seed users.
Next, the researchers
defined regular active users as those not affiliated with
organizations, brands, or spammers, who had at least 30
friends, 30 followers, and had posted a minimum of 60 photos.
From the seed user list, 13,951 individuals (14.6% of the total)
met these criteria. Out of this subset, the researchers randomly
selected 50 users and obtained their profiles, 20 recent photos
(as random photo downloads were constrained by Instagram's
API limitations), and details of their social network
connections.
Limiting the sample to 50 users was necessary
due to the manual coding required for photo analysis, which is
impractical for a larger dataset. Despite the smaller sample
size, the researchers assert that the dataset maintains
representativeness, enabling predictions with a 95% confidence
level and a 13% margin of error for typical users, suitable
for the analysis conducted in the study.
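The sampling figures above can be sanity-checked with a short calculation. The sketch below is not taken from the paper: it assumes the standard margin-of-error formula for a sampled proportion at the worst case p = 0.5, with a finite-population correction, and the function name `margin_of_error` is purely illustrative.

```python
import math

def margin_of_error(sample_size, population_size, z=1.96, p=0.5):
    """Margin of error for a sampled proportion (assumed formula, not from the paper).

    z=1.96 corresponds to a 95% confidence level; p=0.5 is the worst case.
    """
    standard_error = math.sqrt(p * (1 - p) / sample_size)
    # Finite-population correction, since users are sampled without replacement.
    fpc = math.sqrt((population_size - sample_size) / (population_size - 1))
    return z * standard_error * fpc

seed_users = 95_343            # unique seed users
regular_active_users = 13_951  # users meeting the activity criteria
sampled_users = 50             # users selected for manual coding

share = regular_active_users / seed_users
moe = margin_of_error(sampled_users, regular_active_users)

print(f"Regular active share of seed users: {share:.1%}")
print(f"Margin of error at 95% confidence: {moe:.1%}")
```

Under these assumptions the share matches the 14.6% quoted above, and the margin of error comes out a little under 14%, in the same range as the roughly 13% the researchers report.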
CLASSIFICATION OF CONTENT AND CODING
METHODOLOGY
In order to characterize the types of photos shared on
Instagram, a systematic approach was employed, involving
both computer vision techniques and human coders. Initially, a
sample of 200 photos was extracted from a larger dataset of
1,000 photos obtained from 50 users (20 photos per user).
Computer vision techniques, specifically the Scale Invariant
Feature Transform (SIFT) algorithm, were utilized to identify
and extract local features from the photos, generating
codebook vectors for each image.
These vectors were then subjected to k-means clustering to
create 15 clusters representing initial coding categories.
To
refine this automated categorization, two human coders, both
regular Instagram users, independently reviewed the photos
within each category. They assessed thematic affinities within
and across categories, making manual adjustments as needed
to ensure accuracy. Through collaborative discussion and
conflict resolution, the coders finalized an 8-category coding
scheme with unanimous agreement (Fleiss’ kappa κ = 1).
Using
this scheme, the remaining 800 photos were categorized by the
two coders based on their main themes and accompanying
descriptions or hashtags. Each photo was assigned a single
category to avoid ambiguity. Initial agreement between coders
was substantial (Fleiss’ kappa κ = 0.75), with discrepancies
resolved by a third-party judge tasked with assigning
unresolved photos to appropriate categories.
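To make the agreement statistic concrete, here is a minimal sketch of Fleiss' kappa for category labels assigned by several coders. It follows the textbook definition and is not the authors' code; the function name and the toy labels are illustrative assumptions.

```python
from collections import Counter

def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each a list of labels (one per coder)."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number of ratings")

    category_totals = Counter()
    per_item_agreement = []
    for item in ratings:
        counts = Counter(item)
        category_totals.update(counts)
        # Proportion of coder pairs that agree on this item.
        agreeing = sum(c * c for c in counts.values()) - n_raters
        per_item_agreement.append(agreeing / (n_raters * (n_raters - 1)))

    observed = sum(per_item_agreement) / n_items
    total = n_items * n_raters
    # Agreement expected by chance from the overall category proportions.
    expected = sum((c / total) ** 2 for c in category_totals.values())
    if expected == 1.0:
        return 1.0  # all ratings in one category; treated here as full agreement
    return (observed - expected) / (1 - expected)

# Toy labels, not data from the study.
coder_a = ["selfie", "friends", "food", "pet"]
coder_b = ["selfie", "friends", "food", "pet"]
full_agreement = fleiss_kappa([list(pair) for pair in zip(coder_a, coder_b)])

coder_c = ["selfie", "selfie", "food", "food"]
coder_d = ["selfie", "food", "food", "selfie"]
chance_level = fleiss_kappa([list(pair) for pair in zip(coder_c, coder_d)])

print(full_agreement, chance_level)
```

Identical labels give κ = 1, as in the unanimous coding scheme, while agreement no better than chance gives κ = 0; the reported 0.75 sits between the two.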
For example, a photo $I$ of a dog can have 125 SIFT features corresponding to the dog’s eyes, legs,
ears, and so on, which are expressed in terms of the codebook vector (of size $n$) as
$I = \langle C_1 : f_1, C_2 : f_2, C_3 : f_3, \ldots, C_n : f_n \rangle$, where $\sum_{1 \le i \le n} f_i = 125$ and $C_i$ is the cluster
of all the features about a specific characteristic of an object in the image.
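The codebook representation can be sketched in a few lines. The example below assumes the local descriptors have already been extracted and the cluster centroids already learned; it uses made-up 2-D points in place of real SIFT descriptors (which are 128-dimensional), and the helper names are illustrative.

```python
import math
import random

def nearest_cluster(descriptor, centroids):
    """Index of the centroid closest to the descriptor (Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(descriptor, centroids[i]))

def codebook_vector(descriptors, centroids):
    """Count how many local features of one image fall into each cluster C_i."""
    frequencies = [0] * len(centroids)
    for descriptor in descriptors:
        frequencies[nearest_cluster(descriptor, centroids)] += 1
    return frequencies

# Toy stand-ins: 3 centroids and 125 random 2-D "descriptors" for one photo.
rng = random.Random(0)
centroids = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
descriptors = [(rng.random(), rng.random()) for _ in range(125)]

vector = codebook_vector(descriptors, centroids)
print(vector, sum(vector))
```

Each entry of `vector` plays the role of one $f_i$, and the entries always sum to the number of features in the photo (125 here).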
Analysis
This section presents analysis of photo content and user types
on Instagram. Our main objective here is to develop a deeper
understanding of the types of photos and active users on
Instagram. Specifically, we aim to address the following
research questions:
• RQ1: What kind of photos do people usually post on
Instagram?
• RQ2: How do the users differ based on the type of images
they post?
• RQ3: How are these differences between users’ photo content
related to the users’ number of followers?

In addressing Research Question 1 (RQ1), the study examined
the distribution of different photo categories on Instagram.
Figure 2 illustrates the proportions of each category within the
dataset. Notably, self-portraits and photos featuring friends
collectively comprise nearly half (46.6%) of the dataset, with
self-portraits slightly outnumbering friend-related images
(24.2% vs. 22.4%). Conversely, categories such as pets and
fashion represent the least popular content, each accounting
for less than 5% of the total images.
These findings align with recent observations reported in
mainstream media.
To further elucidate these trends, the
analysis was refined to explore user engagement within specific
categories. Figure 3 depicts the distribution of users across
individual categories relative to their level of engagement,
measured by the number of photos posted. For instance,
approximately 22% of users contributed 6-8 photos categorized
as friend-related, while 26% of users shared 3-5 photos related
to food. Notably, categories such as pets and fashion exhibit
high standard deviations (SD = 0.5), indicating greater
variability in user engagement. In contrast, categories like
selfies and friends demonstrate lower standard deviations (SD
= 0.11 and SD = 0.124, respectively), suggesting more
consistent user proportions across different levels of
engagement.
In RQ2, the study aimed to identify distinct types of users on
Instagram based on the content they post. Initially, an
8-dimensional vector was created for each user, representing the
proportion of their photos in each of the 8 content categories.
Subsequently, k-means clustering was employed to group
users into clusters based on their posting behavior, with the
optimal number of clusters determined by minimizing the root
mean square error.
Figure 4 illustrates the clustering results,
revealing 5 distinct types of users on Instagram. Each cluster is
characterized by histograms indicating the proportion of photos
in each content category. The analysis highlights the diverse
posting habits of Instagram users, with distinct user profiles
emerging. For instance, one cluster comprises "selfies-lovers"
who predominantly share self-portraits, while another group
focuses on captioned photos featuring quotes or popular
hashtags. Additionally, there are common users who exhibit a
more varied posting behavior across categories.
Notably, one cluster consists of users who prioritize sharing
photos of both themselves and their friends equally, indicating a
strong emphasis on social connections. These findings
underscore the diversity of user behavior on Instagram and the
varying priorities individuals place on different types of content.
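The user-typing step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a plain Lloyd's k-means written from scratch, toy users, and a category list in which only six names come from this text (the last two are placeholders, since all eight categories are not listed here).

```python
import math
import random
from collections import Counter

# Six names appear in the text; "category_7" and "category_8" are placeholders.
CATEGORIES = ["selfie", "friends", "food", "pet", "fashion", "captioned",
              "category_7", "category_8"]

def user_vector(photo_labels):
    """8-dimensional vector: proportion of a user's photos in each category."""
    counts = Counter(photo_labels)
    return [counts[c] / len(photo_labels) for c in CATEGORIES]

def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's k-means; returns (centroids, cluster index per point)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        assignment = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignment

def rmse(points, centroids, assignment):
    """Root mean square distance of points to their assigned centroid."""
    squared = [math.dist(p, centroids[a]) ** 2 for p, a in zip(points, assignment)]
    return math.sqrt(sum(squared) / len(points))

# Toy users with 20 labelled photos each (not data from the study).
users = [
    user_vector(["selfie"] * 18 + ["food"] * 2),
    user_vector(["selfie"] * 17 + ["pet"] * 3),
    user_vector(["captioned"] * 16 + ["friends"] * 4),
    user_vector(["captioned"] * 19 + ["fashion"] * 1),
]
centroids, assignment = kmeans(users, k=2)
print(assignment, round(rmse(users, centroids, assignment), 3))
```

Comparing `rmse` across several values of `k` is one way to choose the number of clusters; how the study traded off error against the number of clusters is not described here.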
In RQ3, the study aimed to determine whether the type of users
on Instagram correlates with the number of followers they
attract. Specifically, the researchers sought to ascertain if users
categorized as "selfies-lovers" (C4) tend to garner significantly
more followers compared to common users in cluster C1.
To
investigate this relationship, a two-tailed t-test was performed
on the follower distributions from different user clusters. The
analysis revealed that all types of users, except for
"selfies-lovers" (C4), exhibited follower distributions that did not
significantly differ (two-tailed t-test; p-value = 0.171). This
indicates that followers are independent of the user clusters
defined by their posting behavior.
Consequently, the study concluded that the size of a user's
audience (followers) is not significantly influenced by the type of
user, as characterized by their shared photos on Instagram.
This finding suggests that factors other than posting habits,
such as networking, engagement strategies, or external factors,
may play a more significant role in determining user follower
counts.
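The comparison of follower counts between two user clusters can be illustrated with a small two-sample test. The text specifies only a two-tailed t-test, so the choice of Welch's unequal-variance form below is an assumption, and the follower counts are made up for the example.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Welch's t statistic and degrees of freedom for two independent samples."""
    var_a = statistics.variance(sample_a) / len(sample_a)
    var_b = statistics.variance(sample_b) / len(sample_b)
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(sample_a) - 1) + var_b ** 2 / (len(sample_b) - 1)
    )
    return t, df

# Made-up follower counts for two user clusters (not data from the study).
followers_c1 = [100, 200, 300, 400, 500]
followers_c4 = [200, 300, 400, 500, 600]

t, df = welch_t(followers_c1, followers_c4)
print(f"t = {t:.2f}, df = {df:.1f}")
```

A two-tailed p-value is then read from the t distribution with `df` degrees of freedom, for instance with `scipy.stats.ttest_ind(followers_c1, followers_c4, equal_var=False)` if SciPy is available; a p-value above the chosen significance level, such as the 0.171 reported above, means the difference in followers is not statistically significant.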

Conclusion
In this paper, the researchers conducted a comprehensive
analysis of photos and users on Instagram, a rapidly growing
social media platform. This study represents the first attempt to
systematically analyze Instagram data to address fundamental
research questions. The analysis revealed eight distinct
categories of photo content on Instagram, and from this, five
different types of users or user clusters were identified based
on their posting behavior. Additionally, the study found no direct
relationship between the number of followers a user has and
the type of user, as determined by their shared photos, as
evidenced by statistical significance tests.

Future Scope
Looking ahead, the researchers plan to expand their work by
incorporating additional features from Instagram, such as user
bios, hashtags, comments, and social networks. They also
intend to delve into sentiment analysis and explore events
associated with photos and their accompanying text. This future
research aims to provide deeper insights into user behavior and
the dynamics of content sharing on Instagram.
Source: Yuheng Hu, Lydia Manikonda, and Subbarao Kambhampati, Department of Computer
Science, Arizona State University, Tempe AZ 85281 {yuhenghu, lmanikon, rao}@asu.edu
