Micron Interview Questions Summary
# Question 1
Parsing The HTML Webpages
# Question 2
a. Check if there are new lines in ‘NewData.csv’ and append them to the
existing ‘MasterDB.csv’, as long as the ‘Status’ in the row is ‘Available’ and
the ‘Price’ and ‘COE’ columns are not ‘N.A’ (i.e. they contain a value).
First, read the data and apply the given condition to select the rows to append:
# keep rows that are Available and have real 'Price' and 'COE' values
rows_to_be_updated = new_data[(new_data['COE'] != "N.A.") &
    (new_data['Price'] != "N.A") & (new_data['Status'] == 'Available')]
After fetching these records, any missing values in them have to be treated.
To compare the fetched rows with the rows in master, pandas offers the
‘compare’ method, but I avoided it because it is resource intensive and is not
available in older pandas versions (it was added in pandas 1.1), which could
become a bottleneck. Instead, I took the last index of the master data and
appended the fetched rows after it.
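The append step described above can be sketched as follows. This is a minimal illustration, not the original script: the sample rows are made up, the helper names `select_rows_to_append` and `append_after_last_index` are hypothetical, and it assumes that the master data has a default integer index, that both "N.A" and "N.A." may appear as placeholders, and that every row in the new data is new.

```python
import pandas as pd

# Hypothetical sample data; the real frames would come from pd.read_csv.
master = pd.DataFrame({
    "Car Name": ["Car A", "Car B"],
    "Price": ["50000", "62000"],
    "COE": ["30000", "41000"],
    "Status": ["Available", "Available"],
})
new_data = pd.DataFrame({
    "Car Name": ["Car C", "Car D", "Car E", "Car F"],
    "Price": ["70000", "N.A", "55000", "48000"],
    "COE": ["45000", "39000", "N.A", "36000"],
    "Status": ["Available", "Available", "Available", "Sold"],
})

# Assumption: either spelling of the placeholder may occur in the files.
NA_MARKERS = ["N.A", "N.A."]

def select_rows_to_append(new_df):
    """Keep rows that are Available and have real Price and COE values."""
    keep = (
        (new_df["Status"] == "Available")
        & ~new_df["Price"].isin(NA_MARKERS)
        & ~new_df["COE"].isin(NA_MARKERS)
    )
    return new_df[keep]

def append_after_last_index(master_df, rows):
    """Renumber the rows to continue after master's last index, then append."""
    start = master_df.index.max() + 1
    rows = rows.copy()
    rows.index = range(start, start + len(rows))
    return pd.concat([master_df, rows])

rows_to_append = select_rows_to_append(new_data)
updated_master = append_after_last_index(master, rows_to_append)
```

Renumbering the index before concatenating keeps the master index unique, which avoids the need for a row-by-row comparison in this step.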
b. For the existing lines, check whether NewData.csv contains any changes. If
yes, update the changes in ‘MasterDB.csv’.
I used a left outer join between the master data and the new data to compare
the remaining rows, and then removed the unwanted rows. Several other methods
could have been used instead.
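One way to apply a left outer join for this update is sketched below. The source does not say which column identifies a row, so the key column "Listing ID", the sample values, and the helper `update_existing` are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical data; "Listing ID" is an assumed key column, not from the source.
master = pd.DataFrame({
    "Listing ID": [1, 2, 3],
    "Price": [50000, 62000, 70000],
    "Status": ["Available", "Available", "Available"],
})
new_data = pd.DataFrame({
    "Listing ID": [2, 3, 4],
    "Price": [60000, 70000, 45000],
    "Status": ["Available", "Reserved", "Available"],
})

def update_existing(master_df, new_df, key):
    """Overwrite master values with new values for rows that share a key."""
    # The left join keeps every master row; rows without a match in new_df
    # get missing values in the "_new" columns.
    merged = master_df.merge(new_df, on=key, how="left", suffixes=("", "_new"))
    for col in master_df.columns:
        new_col = col + "_new"
        if col == key or new_col not in merged.columns:
            continue
        # Take the new value where one exists, otherwise keep the master value.
        combined = merged[new_col].where(merged[new_col].notna(), merged[col])
        merged[col] = combined.astype(master_df[col].dtype)
    # Drop the helper columns that the join added.
    return merged[list(master_df.columns)]

updated = update_existing(master, new_data, "Listing ID")
```

Rows that exist only in the new data (here the one with key 4) are left out on purpose, since adding new rows is handled separately in part a.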
c. If the column ‘Status’ in NewData.csv is ‘Sold’, then remove those
lines from ‘MasterDB.csv’.
I checked that ‘Status’ is not equal to ‘Sold’ and kept only the rows in master
that satisfy this condition.
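The filter can be sketched in a few lines. The example assumes that the ‘Status’ column in master already reflects the values from the new data (for instance after the update in part b); the sample rows and the helper name `drop_sold` are hypothetical.

```python
import pandas as pd

# Hypothetical master data whose Status column is already up to date.
master = pd.DataFrame({
    "Car Name": ["Car A", "Car B", "Car C"],
    "Status": ["Available", "Sold", "Available"],
})

def drop_sold(master_df):
    """Keep only the rows whose Status is not 'Sold' and renumber the index."""
    return master_df[master_df["Status"] != "Sold"].reset_index(drop=True)

remaining = drop_sold(master)
```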
# Question 3
a. Develop a script that can split Column ‘Car Name’ to get the following
attributes
i. Car Make
I used lambda functions to split the column on spaces. For the end date, I
took the last elements of the split and extracted the date from them. Lambda
functions are available in plain Python as well as in Spark, which can offer
better performance on larger datasets.
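A minimal sketch of the split is shown below. The layout of ‘Car Name’ is not given in the source, so the sample strings, the "MM/YYYY" date pattern, the output column "COE End", and the helper `extract_end_date` are assumptions chosen only to illustrate the technique.

```python
import re
import pandas as pd

# Hypothetical 'Car Name' values; the real format may differ.
cars = pd.DataFrame({
    "Car Name": [
        "Honda Civic 1.6A (COE till 06/2027)",
        "Toyota Corolla Altis 1.6A (COE till 11/2029)",
    ]
})

# Split each name on whitespace once, then pick pieces with lambdas.
tokens = cars["Car Name"].apply(lambda name: name.split())
cars["Car Make"] = tokens.apply(lambda parts: parts[0])

def extract_end_date(parts):
    """Pull an assumed MM/YYYY date out of the last token as 'YYYY-MM'."""
    match = re.search(r"(\d{2})/(\d{4})", parts[-1])
    return f"{match.group(2)}-{match.group(1)}" if match else None

cars["COE End"] = tokens.apply(extract_end_date)
```

Taking the first token as the make only works for single-word makes; a multi-word make would need a lookup list or a different rule.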
In order to extract all the above statistics, I first converted the data to
specific dtypes. Then I used pandas groupby together with aggregation for all
the values.
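The two steps, dtype conversion followed by grouped aggregation, can be sketched like this. The grouping column, the sample values, and the chosen statistics (count, mean, min, max) are assumptions, since the requested statistics are not listed here.

```python
import pandas as pd

# Hypothetical data with prices stored as text, as they would be after a CSV read
# that contains placeholder strings.
cars = pd.DataFrame({
    "Car Make": ["Honda", "Honda", "Toyota"],
    "Price": ["50000", "70000", "62000"],
})

# Convert the text column to a numeric dtype; unparseable values become NaN.
cars["Price"] = pd.to_numeric(cars["Price"], errors="coerce")

# Group by make and compute several aggregates in one pass.
stats = cars.groupby("Car Make")["Price"].agg(["count", "mean", "min", "max"])
```

Converting before grouping matters because aggregates such as the mean are not meaningful on string columns.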