Big Basket - Solution PDF
By
Akanksha Utreja (11910056)
Swapnil Vermani (11910052)
2018-2019
Customer Analytics at Bigbasket - Product Recommendations
Question 1. What are the problems Bigbasket is trying to address? Please be as specific as possible.
Bigbasket customers typically order a large number of items in one transaction (up to 80 items have
been recorded in a single transaction), so there is a high chance that a customer forgets one or more
items.
Customers also tend to order the same items every week, but because each category lists so many
products, scrolling to find even the usual items is tedious. They want a cart that is filled
automatically based on their preferences.
Question 2. What is the difference in the recommender systems requirements between Bigbasket and
other ecommerce companies such as Amazon or Flipkart?
Bigbasket and Amazon/Flipkart have different requirements because their objectives are different.
Bigbasket is an online grocery store, while Amazon and Flipkart sell electronics and general merchandise.
Unlike customers of those ecommerce companies, Bigbasket customers order in bulk and tend to forget
some items because of the large order size.
Another difference is that Bigbasket customers tend to order the same groceries on a
daily, weekly or monthly basis, whereas Amazon/Flipkart customers order different products each time,
though these may be similar in brand, price, quality or features.
With this in mind, the recommender systems need to be designed differently for the two.
Bigbasket needs a recommender system that uses the purchase behavior to predict what a
customer is likely to buy in future. This feature is the “Smart Basket” feature.
The other recommender system they need is the “Did you forget” feature, which offers a
solution when a user forgets to buy an item that is part of his/her usual pattern.
In the “Did you forget” feature, the input to the recommendation system is the items currently
in the user's basket and the items purchased in the past. It does not look for broader patterns; it
works directly with the products themselves.
Bigbasket has user information and past purchase history, which can serve as the input to the
“Smart Basket” and “Did you forget” recommender systems. Ecommerce companies like
Amazon or Flipkart, on the other hand, use a variety of recommender algorithms such as
collaborative filtering and content-based filtering to provide recommendations to users.
From the above it is clear that the problem faced by Bigbasket is significantly different
from that of other ecommerce companies.
Question 3. What sort of recommendation technique(s) is/are more appropriate for Big basket
and why?
For this problem we can create association rules from the transactional data of the online store. For a
new customer or a guest user we can mine generic rules from all transaction data. But to address the
problem of customers like Sangeetha specifically, a personalized recommendation generated dynamically
at run time from that user's own past transactions would be a much better approach. It captures which
items Sangeetha has bought together in the past.
For Smart Basket, user-based collaborative filtering would be the best approach. By considering what a
user has bought in the past and what similar users have bought, we can answer the question of what the
user is likely to buy. A smart basket should capture the personalized likelihood that a person will
purchase a given product. We could have used association rules at a personal level too, but that would
only give us products the user has generally bought together in the past; it would not provide the
level of personalization offered by collaborative filtering.
Question 4. Bigbasket is interested in introducing a “Did you forget?” feature that will identify items a
customer may have forgotten. Discuss how this feature can be created?
The “Did you forget” feature addresses the problem of users forgetting to add certain items to
their cart because of the large basket size.
This feature can be created in two ways:
1. Association Rules based on all transactions
2. Association Rules based on the transactions made by the particular user
(Personalization)
Association Rules based on all transactions
We can build a recommendation system from all the transactions using association rules, identify
which products the user might have missed, and show them to the user at checkout under a
“Did you forget” tab.
Association Rules based on the transactions made by the particular user (Personalization)
A more personalized version uses the transaction data of the particular user. Using this data we can
build the recommender system with association rules and tell the user which products they might have
missed. This increases user convenience and also saves Bigbasket the cost of a second delivery trip
to the same user.
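The checkout lookup described above can be sketched as follows. This is a minimal Python illustration, not the authors' implementation: the `Rule` tuple, the function name `did_you_forget`, and the example rules with their confidence and lift values are all hypothetical. In practice the rules would come from an association-rule miner run on either all transactions or one member's transactions.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    lhs: frozenset      # items that must already be in the cart
    rhs: str            # item to suggest
    confidence: float
    lift: float


def did_you_forget(cart, rules, top_n=3):
    """Suggest items implied by the cart contents but not yet in the cart."""
    cart = set(cart)
    best = {}
    for rule in rules:
        # A rule fires only if its whole left-hand side is in the cart
        # and the suggested item is still missing.
        if rule.lhs <= cart and rule.rhs not in cart:
            current = best.get(rule.rhs)
            if current is None or rule.lift > current.lift:
                best[rule.rhs] = rule
    ranked = sorted(best.values(), key=lambda r: (r.lift, r.confidence), reverse=True)
    return [r.rhs for r in ranked[:top_n]]


# Hypothetical rules for illustration only (not mined from the case data).
example_rules = [
    Rule(frozenset({"Urad Dal", "Other Dals"}), "Sugar", 0.70, 1.9),
    Rule(frozenset({"Urad Dal"}), "Rice", 0.55, 1.3),
    Rule(frozenset({"Beans"}), "Root Vegetables", 0.60, 1.5),
]

suggestions = did_you_forget(["Urad Dal", "Other Dals", "Rice"], example_rules)
print(suggestions)
```

The same function serves both approaches: passing rules mined from all transactions gives the generic version, and passing rules mined from one member's history gives the personalized version.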
Question 5. Using the Association Rules technique, do the following: Pick a customer of your choice and
suggest one “Did you forget?” item for that customer. Present and explain the output generated from
Association Rules techniques in support of your answer. Please explain the steps before presenting the
answer.
To create a “Did you forget” system for a particular user, we can study that user's transactions
and create association rules from them. The system then recommends products based on that user's own
transactions.
Steps:
Extract the data for a particular user by member id. We have taken member id M09736, which has
around 632 item records in the data, grouped by transaction id.
Load this data as transactions in the single format.
Apply the apriori algorithm to these transactions, supplying the transaction id and item
columns along with the threshold levels for support and confidence.
The output of the algorithm gives us personalized association rules based on the data for
member M09736, shown as follows:
Here we can see that staples are generally grouped together, which reflects the common scenario that
when we order groceries we tend to order all the staples at once. For example, if Other Dals and Urad
Dal are in the basket, Sugar is the recommendation, with a lift of around 1.98. It also captures
the fact that this particular user buys staples from Bigbasket more than any other grocery category.
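The steps above can be sketched in Python as a rough stand-in for the R workflow. Two caveats: the code enumerates itemsets by brute force rather than using Apriori's candidate pruning (it yields the same rules on a small, single-member history), and `toy_orders` is invented for illustration and is not member M09736's data. The function name `mine_rules` and the threshold values are likewise assumptions.

```python
from collections import Counter
from itertools import combinations


def mine_rules(transactions, min_support=0.3, min_confidence=0.6, max_len=3):
    """Brute-force association rules with a single item on the right-hand side."""
    n = len(transactions)
    baskets = [frozenset(t) for t in transactions]

    # Count every itemset of up to max_len items (fine for one member's history).
    counts = Counter()
    for basket in baskets:
        for size in range(1, max_len + 1):
            for itemset in combinations(sorted(basket), size):
                counts[frozenset(itemset)] += 1

    support = {s: c / n for s, c in counts.items() if c / n >= min_support}

    rules = []
    for itemset, supp in support.items():
        if len(itemset) < 2:
            continue
        for rhs in itemset:
            lhs = itemset - {rhs}
            # Every subset of a frequent itemset is frequent, so both lookups succeed.
            confidence = supp / support[lhs]
            lift = confidence / support[frozenset({rhs})]
            if confidence >= min_confidence:
                rules.append((sorted(lhs), rhs, supp, confidence, lift))
    return sorted(rules, key=lambda r: r[4], reverse=True)


# Hypothetical orders for one member; each inner list is one transaction id.
toy_orders = [
    ["Urad Dal", "Other Dals", "Sugar"],
    ["Urad Dal", "Other Dals", "Sugar", "Rice"],
    ["Urad Dal", "Rice"],
    ["Other Dals", "Sugar"],
    ["Rice", "Beans"],
]

rules = mine_rules(toy_orders)
for lhs, rhs, supp, conf, lift in rules[:3]:
    print(f"{lhs} => {rhs}  support={supp:.2f} confidence={conf:.2f} lift={lift:.2f}")
```

A “Did you forget” suggestion is then any rule whose left-hand side is already in the cart and whose right-hand side is not.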
Question 6. Now, unlike in the question above, generate two consumer-agnostic association rules? How
they are different from the rules obtained in Question 5 above? Present and explain the output
generated from Association Rules techniques in support of your answer. Offer some actionable
recommendations for these rules. Please explain the steps before presenting the answer
For the customer-agnostic recommendation system, we took the transactional data of every customer in
the data set and mined general rules from it.
Steps:
This technique provides general recommendations for any shopper. These can be used for users
with limited or no transactional data.
It differs from the previous technique in that the previous one is a personalized system while this
one is generic.
    lhs                                    rhs                support   confidence lift
[1] {Gourd & Cucumber,Other Vegetables} => {Beans}            0.1316166 0.6744044  1.690142
[2] {Gourd & Cucumber,Root Vegetables}  => {Beans}            0.1130186 0.6713881  1.682582
[3] {Beans,Gourd & Cucumber}            => {Other Vegetables} 0.1316166 0.7154893  1.672666
[4] {Gourd & Cucumber,Root Vegetables}  => {Other Vegetables} 0.1201717 0.7138810  1.668906
[5] {Brinjals,Gourd & Cucumber}         => {Other Vegetables} 0.1002623 0.7091062  1.657743
[6] {Brinjals,Root Vegetables}          => {Beans}            0.1032427 0.6516178  1.633035
The output gives us generic rules, for example that green vegetables are bought together, with lift as
high as 1.69. These rules are commercially useful when there is no transactional data to support
personalised recommendations.
It also suggests that the vegetable section is the section Bigbasket customers order from most
often.
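As a quick sanity check on the rules listed above, the short Python sketch below uses the standard identity lift = confidence / support(rhs) to back out how often each right-hand-side item appears across all transactions. The numbers are copied from the rule output; the variable names are illustrative.

```python
# (lhs, rhs, support, confidence, lift) copied from the six rules listed above.
rules = [
    ("Gourd & Cucumber, Other Vegetables", "Beans", 0.1316166, 0.6744044, 1.690142),
    ("Gourd & Cucumber, Root Vegetables", "Beans", 0.1130186, 0.6713881, 1.682582),
    ("Beans, Gourd & Cucumber", "Other Vegetables", 0.1316166, 0.7154893, 1.672666),
    ("Gourd & Cucumber, Root Vegetables", "Other Vegetables", 0.1201717, 0.7138810, 1.668906),
    ("Brinjals, Gourd & Cucumber", "Other Vegetables", 0.1002623, 0.7091062, 1.657743),
    ("Brinjals, Root Vegetables", "Beans", 0.1032427, 0.6516178, 1.633035),
]

implied = {}
for lhs, rhs, support, confidence, lift in rules:
    # lift = confidence / support(rhs)  =>  support(rhs) = confidence / lift
    implied.setdefault(rhs, []).append(confidence / lift)

# Average the estimates per item; they should agree closely across rules.
rhs_support = {rhs: sum(values) / len(values) for rhs, values in implied.items()}
for rhs, value in rhs_support.items():
    print(f"{rhs}: in about {value:.1%} of all transactions")
```

The estimates agree across rules: Beans appear in roughly 40% of transactions and Other Vegetables in roughly 43%. Rule [1] therefore says that a basket already holding Gourd & Cucumber and Other Vegetables contains Beans about 67% of the time, against a baseline of about 40%, which is what a lift of 1.69 expresses.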
Question 7. Bigbasket is interested in introducing a “Smart Basket” feature that will identify a list of
items a customer is more likely to buy. Discuss how this feature can be created?
The “Smart Basket” feature can suggest products for users to buy based on their preferences.
For this we can use collaborative filtering, in which the recommendation system is designed by
observing patterns in the purchase history.
Recommendations are given to a user based on what similar users have bought, which answers the
question of what the user is likely to buy.
We can use association rules for personalization as well, but we expect collaborative filtering to
give much better results.
So the input is what the user has bought in the past, which is compared with what similar users
have bought to generate recommendations.
This is the approach by which the Smart Basket feature can be created.
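A minimal Python sketch of user-based collaborative filtering as described above, under stated assumptions: the member ids, items and ratings are invented, cosine similarity is one common choice of similarity measure, and the sketch scores only items the target member has not bought yet. The function names `cosine` and `smart_basket` are hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse rating dicts {item: rating}."""
    shared = set(u) & set(v)
    dot = sum(u[i] * v[i] for i in shared)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def smart_basket(target, ratings, k=2, top_n=3):
    """Score items the target has not bought using the k most similar members."""
    neighbours = sorted(
        ((cosine(ratings[target], r), m) for m, r in ratings.items() if m != target),
        reverse=True,
    )[:k]
    scores = {}
    for sim, member in neighbours:
        for item, rating in ratings[member].items():
            if item not in ratings[target]:
                # Weight each neighbour's rating by how similar that neighbour is.
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical purchase-frequency ratings; member ids A, B, C are placeholders.
ratings = {
    "A": {"Rice": 5, "Urad Dal": 3, "Sugar": 4},
    "B": {"Rice": 4, "Urad Dal": 3, "Sugar": 5, "Beans": 4},
    "C": {"Beans": 5, "Brinjals": 4, "Rice": 1},
}

print(smart_basket("A", ratings))
```

Member B's basket overlaps heavily with A's, so B's extra item (Beans) is ranked ahead of anything that only the dissimilar member C buys.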
Question 8. Pick a customer of your choice and create a “smart basket” for that customer. Please explain
the steps before presenting the answer.
1. Load the data into a data frame and convert the columns to appropriate data types.
2. Create a ratings column that records how frequently the user buys each item.
3. Create a ratings matrix and use it to build the recommendation system using collaborative
filtering.
4. Create models using different types of collaborative filtering, such as user-based and item-based.
5. Specify a particular member and view the recommendations for that user.
6. Using the above steps, we get the following results. The R script for the same is attached.
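The steps above can be sketched in Python, as an illustration only and not a port of the attached R script. The records are invented and use member id, transaction id and item fields. The rating is taken here to be the share of a member's orders that contain the item, which is one possible reading of "how frequently the user buys the items". The item-based variant from step 4 is shown.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical (member, order, item) records for illustration.
records = [
    ("A", "o1", "Rice"), ("A", "o1", "Sugar"),
    ("A", "o2", "Rice"),
    ("B", "o3", "Rice"), ("B", "o3", "Sugar"), ("B", "o3", "Beans"),
    ("B", "o4", "Rice"), ("B", "o4", "Beans"),
    ("C", "o5", "Beans"), ("C", "o5", "Brinjals"),
]


def frequency_ratings(records):
    """Steps 1-3: rating = share of a member's orders that contain the item."""
    orders = defaultdict(set)        # member -> order ids
    item_orders = defaultdict(set)   # (member, item) -> order ids containing the item
    for member, order, item in records:
        orders[member].add(order)
        item_orders[(member, item)].add(order)
    ratings = defaultdict(dict)      # sparse ratings matrix: member -> {item: rating}
    for (member, item), ids in item_orders.items():
        ratings[member][item] = len(ids) / len(orders[member])
    return dict(ratings)


def item_similarity(ratings, a, b):
    """Cosine similarity between two item columns of the ratings matrix."""
    dot = sum(r.get(a, 0) * r.get(b, 0) for r in ratings.values())
    norm_a = sqrt(sum(r.get(a, 0) ** 2 for r in ratings.values()))
    norm_b = sqrt(sum(r.get(b, 0) ** 2 for r in ratings.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(member, ratings, top_n=3):
    """Steps 4-5: item-based scores for items the member has not bought."""
    owned = ratings[member]
    catalogue = {item for r in ratings.values() for item in r}
    scores = {
        cand: sum(item_similarity(ratings, cand, item) * rating
                  for item, rating in owned.items())
        for cand in catalogue - set(owned)
    }
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    return [c for c in ranked if scores[c] > 0][:top_n]


ratings = frequency_ratings(records)
print(recommend("A", ratings))
```

Items with a zero score are dropped, so a member is never shown products that no similar purchase pattern supports.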
Question 9. What sort of data challenges do you anticipate while building a recommendation
engine for Bigbasket?
Challenges Faced:
Question 10. What are the implementation and deployment challenges of a recommendation engine for
Bigbasket?
Implementation issues:
1. We have designed a personalized system for a user based on what he/she orders, so if the user
wants to order something entirely different tomorrow, or the user's preferences change, the
recommendations may no longer fit.
2. Loading huge amounts of data can take a lot of time.
Deployment issues:
3. The memory required for computation is large because the datasets are huge, so R
can slow down.
4. Dynamic recommendation engines that generate results at run time may respond
slowly.
5. The recommendations have to be linked to the UI that is displayed to the user.