Fan 2015
PII: S2214-5796(15)00015-5
DOI: http://dx.doi.org/10.1016/j.bdr.2015.02.006
Reference: BDR 21
Please cite this article in press as: S. Fan et al., Demystifying big data analytics for business intelligence
through the lens of marketing mix, Big Data Research (2015), http://dx.doi.org/10.1016/j.bdr.2015.02.006
Demystifying Big Data Analytics for Business
Intelligence through the Lens of Marketing Mix
Big data analytics have been embraced as a disruptive technology that will
literature, we investigate the landscape of big data analytics through the lens
namely people, product, place, price, and promotion, that lay the foundation
issues and future directions of research in big data analytics and marketing
1. Introduction
much faster than ever before (McAfee et al. 2012). The notion of big data and its
because of its great potential in generating business impacts (Chen et al. 2012). “Big
Data” is defined as “the amount of data just beyond technology’s capability to store,
manage and process efficiently” (Kaisler et al. 2013). Big data can be characterized
along three important dimensions, namely volume, velocity, and variety (Zikopoulos
et al. 2011).
insights that support decision-making (Hedin et al. 2014). Marketing intelligence has
product design. For example, companies use consumer satisfaction surveys to study
customer attitudes. With big data analytic technologies, key factors for strategic
company, can be automatically monitored by mining social media data (Tan et al.
2013).
management, and processing (Kaisler et al. 2013). For typical marketing intelligence
tasks such as customer opinion mining, companies nowadays have many different
ways (social media data, transactional data, survey data, sensor network data, etc.) to
Analysis models developed based on a single data source may only provide limited
integrating big data from multiple sources to generate marketing intelligence is not a
trivial task. This prompts exploration of new methods, applications, and frameworks
framework to manage big data in this context. We first identify popular data sources
for marketing intelligence perspectives. Then, we summarize the methods that are
suitable for different data sources and marketing perspectives. Finally, we give
guidelines for companies to select appropriate data sources and methods for managing
The marketing mix framework is a well-known framework that identifies the principal
research, and practice (Brassington et al. 2005). Borden (1964) has been recognized
as the first to use the term “marketing mix” and he proposed a set of 12 elements.
product, price, promotion, and place. The 4P model has been considered to be most
relevant for consumer marketing. However, it has been criticized as being a
(people) (Goi 2009). We adopt the 5P model of the marketing mix framework in this
People: Demographics; Social Networks; Customer Review Data; Click Stream; Survey Data
Product: Product Characteristics; Product Category; Customer Review; Survey Data
Promotion: Promotional Data; Survey Data
Price: Transactional Data; Survey Data
Place: Location-based social networks; Survey Data
In this paper, we propose a marketing mix framework to manage big data for marketing
intelligence. This model classifies the research in marketing intelligence into five perspectives
according to the marketing mix framework. Further, we identify common data, methods, and
applications in each perspective and highlight the dominating big data characteristic with respect
to each perspective. This framework provides guidelines for marketing decision-making based
on big data analytics. Figure 1 is an overview of the proposed big data management framework
for marketing intelligence. First, data from various sources are retrieved and utilized to generate
vital marketing intelligence. Second, a variety of analytics methods are applied to convert raw
big data to actionable marketing knowledge (intelligence). Finally, both data and methods are
combined to support marketing applications with respect to each perspective of the marketing
mix model.
Data
Researchers use various methods to collect data, such as surveys, interviews, focus groups,
observations, and archives (Axinn et al. 2006). Note that data collection methods are different
from research methods. For example, experiments are a widely used research method in
data (Luo et al. 2013). Surveys and logs are the two most common methods to acquire data for
organized and methodical manner about characteristics of interest from some or all units of a
population using well-defined concepts, methods and procedures, and compiles such information
into a useful summary form” (Canada 2010). Firms use surveys to collect data for various
purposes, such as understanding customers’ preferences and behaviors. For example, Apple has
sent surveys to customers who recently purchased an iPhone to gain feedback about their
purchase and their experience with the product (Etherington 2014). Log data is generated by
information systems that capture transactional records and user behavior (Jacobs 2009). For
example, Walmart has started to explore analyzing social media data to gain customer opinions
about the company or a particular product (Brown 2012). Log data and survey data can be
different in terms of size, quality, frequency, objectives, contents, and processing techniques
(Zhao et al. 2014). The two data collection methods complement each other in various business
contexts. Surveys can be useful when we want to collect data on phenomena that cannot be
directly observed. Log data are preferred when real-time conclusions about users’ actual
behavior are required. The two methods can be combined when we want to study the relationship
between user intention and user behavior. There are advantages and disadvantages to both
methods, and we believe big data management should take both methods into consideration.
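As an illustration of combining the two collection methods, the following minimal sketch joins survey responses (stated purchase intention) with log records (observed purchases) on a customer identifier and compares purchase rates across intention levels. The record layout, field names, and values are hypothetical and chosen only for illustration; they are not taken from any study cited here.

```python
from collections import defaultdict

# Hypothetical survey data: customer id -> stated purchase intention.
survey = {"c1": "high", "c2": "high", "c3": "low", "c4": "low", "c5": "high"}

# Hypothetical log data: one (customer id, event) pair per logged action.
log = [
    ("c1", "view"), ("c1", "purchase"),
    ("c2", "view"),
    ("c3", "view"), ("c3", "purchase"),
    ("c4", "view"),
    ("c6", "purchase"),  # no survey response, so not part of the join
]


def purchase_rate_by_intention(survey, log):
    """Join survey and log data on customer id and return, for each stated
    intention level, the share of survey respondents who purchased."""
    buyers = {cid for cid, event in log if event == "purchase"}
    respondents = defaultdict(int)
    purchasers = defaultdict(int)
    for cid, intention in survey.items():
        respondents[intention] += 1
        if cid in buyers:
            purchasers[intention] += 1
    return {level: purchasers[level] / respondents[level] for level in respondents}


rates = purchase_rate_by_intention(survey, log)
```

Customers who appear in only one source (here the hypothetical `c5` and `c6`) show why the two sources complement rather than replace each other: the survey alone cannot confirm behavior, and the log alone cannot reveal intention.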
Methods
Marketing intelligence refers to developing insights from data for marketing decision-making.
Data mining techniques can help to accomplish such a goal by extracting or detecting patterns or
forecasting customer behavior from large databases. According to the data mining literature,
common data mining methods include association mining, classification, clustering, and
regression (Ngai et al. 2009). We need to select appropriate data mining methods based on the
Applications
identify a specific group of customers who share similar preferences and respond to a specific
marketing signal. Customer segmentation applications can help identify different communities
(segments) of customers who may share similar interests. Kim et al. (2006) proposed
clustering customer groups with respect to lifecycle characteristics. Usually, various clustering
and classification techniques are applied to customer segmentation and user profiling. However,
customer segmentation is becoming increasingly challenging under a big data environment. For
necessary to analyze their call data apart from their demographics (Abbasoğlu et al. 2013). The
volume of call data is huge (e.g., the communication time between each pair of customers on
each day), and a variety of data should be taken into account (e.g., both qualitative demographic
data and quantitative call records). In fact, for the most fine-grained targeted marketing (e.g.,
one-to-one marketing), we are not talking about identifying groups of similar customers, but the
“profiling” of each individual customer such that the most suitable products/services are
marketed to the most appropriate individual given a stream of customer service consumption data
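The clustering-based segmentation discussed above can be sketched with a plain k-means (Lloyd's algorithm) routine. This is only a minimal illustration under stated assumptions: the two features (monthly call minutes and monthly spend), the sample values, and the initial centroids are hypothetical, and the routine is not the method used in any of the cited studies.

```python
def nearest_index(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(point, centroids[k])),
    )


def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's k-means with caller-supplied initial centroids."""
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest_index(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(dim) / len(members) for dim in zip(*members)) if members else old
            for members, old in zip(clusters, centroids)
        ]
    labels = [nearest_index(p, centroids) for p in points]
    return centroids, labels


# Hypothetical customers: (monthly call minutes, monthly spend).
customers = [(10, 5), (12, 6), (11, 4), (200, 80), (210, 90), (190, 85)]
centroids, labels = kmeans(customers, centroids=[(0.0, 0.0), (100.0, 50.0)])
```

Each pass over the data touches every customer once per centroid, so a single-machine loop of this kind becomes a bottleneck for the data volumes described above; features on different scales would also normally be standardized before clustering.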
retrieving limited product reputation via survey data, Morinaga et al. (2002) developed an
automatic framework to monitor the reputation of a variety of products by mining Web contents.
Clustering and association mining techniques are among the most common methods employed to
reputation management method which not only mines text-based reputation data from the Web
but also considers the graphical images of products posted to the Web. Nevertheless, at the time
of this writing, twenty billion images have been uploaded to Instagram (http://instagram.com/press/#). Given such an
extraordinary number of images archived online, it is extremely challenging to analyze the sheer
volume of images for product reputation management, not to mention the variety of formats of
source data (e.g., text versus images). To carry out an automatic analysis of the textual comments
posted to the Web for product reputation management, it is essential to develop a rich
Recently, an automated product ontology mining method that is underpinned by latent topic
modeling has been explored to build product ontologies based on textual descriptions of products
extracted from online social media (Lau et al. 2014). The automatically constructed product
ontologies can be used as the basis to support product reputation management applications and
involved in automated product ontology extraction from online social media, new computational
methods must be developed to cope with the volume, velocity, and variety issues of big social
media data.
business environment, billions of dollars are spent on promotions each year (Srinivasan et al.
2004). Thus, promotional marketing analysis has attracted a lot of attention from practitioners
and researchers. Effective promotional strategies are one of the key success factors for
companies to increase their sales and revenue (Bell et al. 1999). Promotional data usually
includes information about promotion types (price cut or coupons), promotion time, and
purchase records during the promotional period. Early work related to promotional marketing
analysis mostly focused on analyzing how different types of customers respond to different
promotional strategies (Pauwels et al. 2002). Most existing work uses regression methods to
In the big data environment, more log data becomes available for promotion analysis. A recent
study examined word-of-mouth (WOM) derived from both customer reviews and promotions (Lu et al. 2013). The
authors found a substitute relationship between the WOM volume and coupon offerings, but a
marketing analysis can also include factors from other perspectives, such as price and place. For
example, enabled by mobile technologies and location-based services, companies can use
customers’ location information to improve their promotion strategy and select targeted
customers.
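A minimal sketch of location-based target selection follows, assuming each customer's last known position is available as a latitude/longitude pair. The store coordinates, customer positions, and the 1 km radius are hypothetical; the haversine formula is a standard great-circle approximation and is used here only for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def customers_near(store, customers, radius_km):
    """Ids of customers whose last known location lies within radius_km of the store."""
    return [
        cid for cid, (lat, lon) in customers.items()
        if haversine_km(store[0], store[1], lat, lon) <= radius_km
    ]


# Hypothetical store location and last known customer positions.
store = (22.3193, 114.1694)
customers = {
    "c1": (22.3200, 114.1700),  # roughly 100 m away
    "c2": (22.4000, 114.1694),  # roughly 9 km north
    "c3": (22.3193, 114.1694),  # at the store
}
targets = customers_near(store, customers, radius_km=1.0)
```

A linear scan like this is adequate for a small customer list; with a continuous stream of position updates, a spatial index would normally replace the scan.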
systems have been widely used in the e-commerce context (Dias et al. 2008). User rating-based
applied to develop recommender systems. However, existing methods may not scale up to big
data. For instance, given N user ratings, the general computational complexity of a collaborative
filtering method is O(N^2) (Cai et al. 2014). Therefore, it is quite challenging to scale up existing
recommender systems to cope with big data (e.g., N = tens of millions) and generate appropriate
the reason why “velocity” is one of the most challenging issues for the “promotion” perspective
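The quadratic growth noted above can be made concrete with a small neighborhood-based sketch that compares every pair of users. This is a hedged illustration only: the ratings are hypothetical, cosine similarity is one common choice among several, and the code is not the method of Cai et al. (2014).

```python
import math
from itertools import combinations

# Hypothetical ratings: user -> {item: rating}.
ratings = {
    "u1": {"a": 5, "b": 3},
    "u2": {"a": 5, "b": 3},
    "u3": {"a": 1, "c": 4},
    "u4": {"c": 5},
}


def cosine(r1, r2):
    """Cosine similarity between two sparse rating vectors."""
    dot = sum(r1[i] * r2[i] for i in r1.keys() & r2.keys())
    norm1 = math.sqrt(sum(v * v for v in r1.values()))
    norm2 = math.sqrt(sum(v * v for v in r2.values()))
    return dot / (norm1 * norm2) if norm1 and norm2 else 0.0


def all_pair_similarities(ratings):
    """Compare every pair of users: n * (n - 1) / 2 similarity computations."""
    return {
        (u, v): cosine(ratings[u], ratings[v])
        for u, v in combinations(sorted(ratings), 2)
    }


sims = all_pair_similarities(ratings)
```

With four users the sketch performs six comparisons; with tens of millions of users the same all-pairs loop would need on the order of 10^14 comparisons, which is why approximate or distributed neighbor search is usually required at that scale.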
Pricing strategy and competitor analysis: There has been much research on what pricing
strategies managers should follow under various situations. Traditionally, empirical research on
pricing strategies uses survey data and regression methods. For example, researchers used a
national mail survey to study the determinants of pricing strategies (Noble et al. 1999). They
found different pricing strategies are preferred under different marketing situations. The growth
of e-commerce has made price information available on websites and researchers started using
log data to study pricing strategy in e-commerce websites. For example, a recent study uses a
method to estimate demand levels from sales rank and derive demand elasticity, variable costs,
and the optimality of pricing choices directly from publicly available e-commerce data (Ghose et
al. 2006). Based on the data derived from various log data sources, they can study the
optimality of price discrimination. While regression methods are widely used for price prediction
automated competitor analysis application does not simply identify the potential competitors of a
company; it also effectively discovers the potentially competitive products and the product
contexts (Bao et al. 2008). This type of application has proven useful to facilitate the “price”
aspect of the marketing mix model. However, the sheer volume of product pricing information
on the Web has also posed new challenges to scale up existing applications with big data.
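As a simple illustration of regression-based price analysis, the sketch below estimates a price elasticity of demand as the least-squares slope of log quantity on log price. It assumes a constant-elasticity demand curve and uses synthetic, hypothetical data; it is not the sales-rank-based procedure of Ghose et al. (2006), which is not reproduced here.

```python
import math


def estimate_elasticity(prices, quantities):
    """Ordinary least-squares slope of log(quantity) on log(price).

    Under the assumed constant-elasticity model q = c * p**e,
    the slope is the price elasticity e.
    """
    xs = [math.log(p) for p in prices]
    ys = [math.log(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic, hypothetical observations generated from q = 1000 * p**-1.5.
prices = [10.0, 12.0, 15.0, 20.0]
quantities = [1000.0 * p ** -1.5 for p in prices]
elasticity = estimate_elasticity(prices, quantities)
```

Because the synthetic data follow the assumed model exactly, the estimate recovers the generating exponent; real e-commerce data would be noisy and would call for additional controls.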
places on marketing strategies. For example, researchers used a survey to collect customer data
and study different levels of place-based marketing in the form of region of origin strategies used
With the widespread use of mobile technology, location-based services (LBS) can provide users
been proposed as an efficient marketing strategy (Dhar et al. 2011; Luo et al. 2013). Location is
one of the most important solutions to meet consumers’ needs and it is a valuable source for
customers and enhance brand value. One challenge for location-based advertising is how to
accurately predict customers’ locations. Both spatial and temporal data should be taken into
consideration (temporal moving pattern mining for location-based service). We need to process a
large volume of spatial and temporal data within a short time period before customers move to
new locations. Thus, the “velocity” issue of big data is also one of the most challenging aspects
Researchers explored the log data in location-based social networks to uncover user profiles;
these automatically discovered user profiles have the potential to be subsequently applied to
methods are often utilized for location-based marketing applications. In another study, Castro et
al. (2013) leverage the GPS traces of individuals to uncover the location-based dynamics of
predict their changing product/service preferences. As a result, effective marketing strategies can
be developed with respect to both the place and time dynamics of a group of customers.
Nevertheless, this type of application needs to deal with both the “variety” and “velocity” issues
of big data. For instance, both the relational data among users in location-based social networks
and GPS signals need to be analyzed to uncover the location-based dynamics of a local
community. In addition, since individuals may constantly move around different places,
location-based marketing applications must be able to respond quickly in order to maintain the
We propose to use a marketing mix framework for guiding research in big data management for
marketing intelligence. We identify the data sources, methods, and applications in different
marketing perspectives. We further discuss the challenging issues related to big data
1. How to select appropriate data sources for particular goals. The amount of available data
is increasing. Current techniques do not allow us to process all data available in a timely
manner. Thus, data selection is a critical decision for managing marketing intelligence.
How to select data that can provide the most value to business decision-making requires
future research on the alignment between data and marketing intelligence goals.
2. How to select appropriate data analysis methods. There are many types of methods that
can be used to process data. Given a particular data set, many methods may be applicable.
Regression and classification are usually used for prediction, while clustering and
association rule mining are used for description. Further, big data brings issues such as
imbalanced data distribution and a large number of variables, which cannot be efficiently
3. How to integrate different data sources to study complicated marketing problems. Most
existing studies use data from one single data source. However, some complicated
business problems require combining data from different sources. For example, in order
to study the impact of social media behavior on purchase behavior, we may need to
4. When using different data sources to study the same marketing problem, how to deal with
the heterogeneity among data sources. For example, both customer reviews and social
media data can be used to study customer opinions toward a company or a product.
However, data collection and analysis methods may be different due to different
structure, quality, granularity, and objectivity. Further, survey data and log data can also
be used to study the same marketing problem. How to conduct surveys in social media
and confirm the survey result with log data in social media will become an important
marketing intelligence will become a competitive source for consumer behavior and
product planning; therefore all companies must invest in big data infrastructure including
6. Data of a variety of formats and qualities will continue to grow and be digitized. Even
though peta-scale data (e.g., petabytes of customer records) may be considered big data
now, the same volume of data may not be considered big in a few years. It is important to
continuously refine the framework, methods, and techniques that we discuss in this paper
in order to meet the challenges for more advanced business intelligence in the next
References
Abbasoğlu, M. A., Gedik, B., and Ferhatosmanoğlu, H. "Aggregate profile clustering for telco analytics,"
Axinn, W. G., and Pearce, L. D. Mixed method data collection strategies, Cambridge University Press,
2006.
Bao, S., Li, R., Yu, Y., and Cao, Y. "Competitor mining with the web," Knowledge and Data Engineering,
Bell, D. R., Chiang, J., and Padmanabhan, V. "The decomposition of promotional response: An empirical
Borden, N. H. "The concept of the marketing mix," Journal of advertising research (4:2) 1964, pp 2-7.
Brown, E. "Mining the social data stream for deeper customer insight," http://www.zdnet.com/, 2012.
Bruwer, J., and Johnson, R. "Place-based marketing and regional branding strategy perspectives in the
Cai, Y., Lau, R. Y., Liao, S. S., Li, C., Leung, H.-F., and Ma, L. C. "Object typicality for effective Web of
Castro, P. S., Zhang, D., Chen, C., Li, S., and Pan, G. "From taxi GPS traces to social and community
Chen, H., Chiang, R. H., and Storey, V. C. "Business Intelligence and Analytics: From Big Data to Big
Dhar, S., and Varshney, U. "Challenges and business models for mobile location-based services and
Di, W., Sundaresan, N., Piramuthu, R., and Bhardwaj, A. "Is a picture really worth a thousand words?:-on
the role of images in e-commerce," Proceedings of the 7th ACM international conference on Web
Dias, M. B., Locher, D., Li, M., El-Deredy, W., and Lisboa, P. J. G. "The value of personalised
recommender systems to e-business: a case study," in: Proceedings of the 2008 ACM conference on
Etherington, D. "Apple Sends Out iPhone Survey, Seeks Feedback On Android, Touch ID And More,"
http://techcrunch.com/, 2014.
Ghose, A., and Sundararajan, A. "Evaluating pricing strategy using e-commerce data: Evidence and
Goi, C. L. "A review of marketing mix: 4Ps or more?," International Journal of Marketing Studies (1:1)
2009, p P2.
Hedin, H., Hirvensalo, I., and Vaarnas, M. The Handbook of Market Intelligence: Understand, Compete
Jacobs, A. "The pathologies of big data," Communications of the ACM (52:8) 2009, pp 36-44.
Kaisler, S., Armour, F., Espinosa, J. A., and Money, W. "Big data: Issues and challenges moving forward,"
System Sciences (HICSS), 2013 46th Hawaii International Conference on, IEEE, 2013, pp.
995-1004.
Kaptein, M., Parvinen, P., and Poyry, E. "Theory vs. Data-Driven Learning in Future E-Commerce,"
System Sciences (HICSS), 2013 46th Hawaii International Conference on, IEEE, 2013, pp.
2763-2772.
Kim, S.-Y., Jung, T.-S., Suh, E.-H., and Hwang, H.-S. "Customer segmentation and strategy development
based on customer lifetime value: A case study," Expert systems with applications (31:1) 2006, pp
101-107.
Lau, R. Y., Li, C., and Liao, S. S. "Social analytics: Learning fuzzy product ontologies for aspect-oriented
Liao, S.-H., Chu, P.-H., and Hsiao, P.-Y. "Data mining techniques and applications–A decade review from
Lu, X., Ba, S., Huang, L., and Feng, Y. "Promotional Marketing or Word-of-Mouth? Evidence from Online
Luo, X., Andrews, M., Fang, Z., and Phang, C. W. "Mobile targeting," Management Science, 2013.
McAfee, A., Brynjolfsson, E., Davenport, T. H., Patil, D., and Barton, D. "Big Data," The management
Morinaga, S., Yamanishi, K., Tateishi, K., and Fukushima, T. "Mining product reputations on the web,"
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and
Ngai, E. W., Xiu, L., and Chau, D. C. "Application of data mining techniques in customer relationship
management: A literature review and classification," Expert systems with applications (36:2) 2009,
pp 2592-2602.
Noble, P. M., and Gruca, T. S. "Industrial pricing: Theory and managerial practice," Marketing Science
Pauwels, K., Hanssens, D. M., and Siddarth, S. "The long-term effects of price promotions on category
incidence, brand choice, and purchase quantity," Journal of marketing research (39:4) 2002, pp
421-439.
Srinivasan, S., Pauwels, K., Hanssens, D. M., and Dekimpe, M. G. "Do Promotions Benefit Manufacturers,
Tan, W., Blake, M. B., Saleh, I., and Dustdar, S. "Social-network-sourced big data analytics," Internet
Vasconcelos, M. A., Ricci, S., Almeida, J., Benevenuto, F., and Almeida, V. "Tips, dones and todos:
uncovering user profiles in foursquare," Proceedings of the fifth ACM international conference on
Zhao, J. L., Fan, S., and Hu, D. "Business Challenges and Research Directions of Management Analytics in
the Big Data Era," Journal of Management Analytics (1:3), 2014, pp. 169-174.
Zikopoulos, P., and Eaton, C. Understanding big data: Analytics for enterprise class hadoop and streaming