Razor Group_Python Task
2021
THE TASK
Analyze the dataset provided on the next slide using Python and present your insights. Please
do not do any manual data cleaning (only formula-based sanitization), and do not use Excel for
this exercise.
Please present your insights in the form of a Jupyter Notebook. We will assess you on your data
parsing process and code quality, as well as the insights and selection criteria you develop to
analyze the dataset and present the best sellers. We will ask you to present your results in a
joint call via screenshare.
THE DATASET
DATA SANITY
• sellerproductcount - This column gives you the count of products in the form '1-16 of over 100,000 results',
from which you can parse out the product count (100,000).
• sellerratings - This column gives you the percentage and count of positive ratings (e.g. '88% positive in the last
12 months (118 ratings)') if parsed correctly.
• sellerdetails - You can use this text to parse out phone numbers, and email IDs of merchants, where available, so
our team can reach out to them.
• businessaddress - This will give you the business locations of the sellers. You can parse them to identify whether a
seller is registered in the US, Germany (DE), or China (CN). Note that Razor does not acquire Chinese sellers at this
point, so you can use this data to exclude sellers in China from your analysis.
• Hero Product 1 #ratings and Hero Product 2 #ratings - these 2 columns give you the number of ratings of the 2
'hero products' or bestselling products of this seller.
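
The parsing hints above can be sketched with a few regular expressions. This is only a minimal illustration, not a reference solution: the function names are hypothetical, the phone and email patterns are deliberately loose, and the country parser assumes the address ends in a two-letter country code (US, DE, or CN), which should be checked against the actual data.

```python
import re

def parse_product_count(text):
    """Return the total from e.g. '1-16 of over 100,000 results', or None."""
    m = re.search(r"of\s+(?:over\s+)?([\d,]+)\s+results?", str(text))
    return int(m.group(1).replace(",", "")) if m else None

def parse_ratings(text):
    """Return (percent_positive, rating_count), or (None, None) if no match."""
    m = re.search(r"(\d+)%\s+positive.*?\(([\d,]+)\s+ratings?\)", str(text))
    if not m:
        return None, None
    return int(m.group(1)), int(m.group(2).replace(",", ""))

# Loose patterns (assumptions): good enough for a first pass, not for validation.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")

def parse_contacts(text):
    """Return (emails, phones) found in the free-text seller details."""
    text = str(text)
    return EMAIL_RE.findall(text), [p.strip() for p in PHONE_RE.findall(text)]

def parse_country(address):
    """Return 'US', 'DE' or 'CN' if the address ends with that code, else None.

    Assumption: the country code is the last token of the address string.
    """
    m = re.search(r"\b(US|DE|CN)\s*$", str(address).strip())
    return m.group(1) if m else None
```

In a notebook, each function could be applied column-wise, for example `df["sellerproductcount"].map(parse_product_count)`, where `df` is a hypothetical pandas DataFrame holding the dataset. Sellers for which `parse_country` returns 'CN' can then be filtered out before ranking.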