80 Python Interview Questions


Python — 34 questions

1. How do we create numerical variables in Python?


pi = 3.14159
diameter = 3

2. How do we perform calculations in Python?


radius = diameter / 2
area = pi * radius * radius

3. Give an example of BODMAS in Python?


(8 - 3) * (2 - (1 + 1))
The output is 0

4. Give examples of list?


a = [1, 2, 3] → length of a : 3
b = [1, [2, 3]] → length of b : 2
c = [] → length of c : 0
d = [1, 2, 3][1:] → length of d : 2

5. How do we interchange the values of two lists?


a = [1, 2, 3]
b = [3, 2, 1]
b,a = a,b

6. How do we extract values from list?


r = ["Mario", "Bowser", "Luigi"]
r[0] → Mario
r[-1] → Luigi

7. How do we create loops in Python using a list?


The following code returns the numbers from a list that are greater than the threshold
def elementwise_greater_than(L, thresh):
    res = []
    for ele in L:
        if ele > thresh:
            res.append(ele)
    return res
elementwise_greater_than([1, 2, 3, 4], 2)
The output is [3, 4]

8. Give examples of String?


a = "" → length of a : 0
b = "it's ok" → length of b : 7
c = 'it\'s ok' → length of c : 7
d = """hey""" → length of d : 3
e = '\n' → length of e : 1

9. Give an example of Boolean?
A Boolean takes only 2 values: True and False
0 < 1: True
0 > 1: False

10. How do we perform operations on Boolean?
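Booleans are combined with the operators and, or, and not, and comparisons such as < or == produce them. A minimal sketch, with illustrative variable names that are not from the original:

```python
# Hypothetical values, chosen only for illustration.
a = True
b = False

both = a and b        # True only when both operands are True
either = a or b       # True when at least one operand is True
negated = not a       # inverts the value
in_range = 0 < 5 < 10 # chained comparisons also yield a Boolean

print(both, either, negated, in_range)  # False True False True
```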

11. What are functions in Python?


A function is a block of organized, reusable code that is used to perform a single, related
action.
def round_to_two_places(num):
    return round(num, 2)
pi = round_to_two_places(3.14159)
The output is 3.14

12. How do we calculate the remainder in Python?


91 % 3
The output is 1

13. Who created Python?


Python is an interpreted, high-level, general-purpose programming language.
Python was created by Guido van Rossum

14. When was Python created?


Python was conceived in the late 1980s as a successor to the ABC language
The first version was released in 1991
Python 2.0 was released in 2000
Python 3.0 was released in 2008

15. What built-in types does Python provide?


16. What is lambda in Python?
It is a single expression anonymous function used as an inline function.
x = lambda a : a + 10
x(5)
The output is 15

17. What is pass in Python?


pass is a no-operation statement in Python.
It is a placeholder in compound statements where a statement is syntactically required but nothing needs to be executed.
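A short sketch of where pass typically appears. The function and class names are hypothetical placeholders, not part of the original answer:

```python
def not_implemented_yet():
    # 'pass' keeps the function body syntactically valid while doing nothing.
    pass

class EmptyPlaceholder:
    # A class body also needs at least one statement.
    pass

for _ in range(3):
    pass  # loop body intentionally left empty

result = not_implemented_yet()  # a function with no return statement returns None
print(result)  # None
```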

18. What is slicing?


Slicing is a mechanism for selecting a range of items from sequence types such as lists,
tuples, and strings.
x = [1, 2, 3, 4, 5]
x[0:2] → [1,2]
x[2:] → [3,4,5]
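Slicing works the same way on every sequence type and accepts an optional step. A small sketch with made-up sample data:

```python
letters = ["a", "b", "c", "d", "e"]

first_two = letters[:2]      # start defaults to 0
from_third = letters[2:]     # stop defaults to the end
every_other = letters[::2]   # third value is the step
word_slice = "python"[1:4]   # strings slice to strings
pair = (10, 20, 30)[1:]      # tuples slice to tuples

print(first_two, from_third, every_other, word_slice, pair)
# ['a', 'b'] ['c', 'd', 'e'] ['a', 'c', 'e'] yth (20, 30)
```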

19. What is a negative index in Python?


Python sequences can be indexed in positive and negative numbers.
For positive index, 0 is the first index, 1 is the second index and so forth.
For negative index, (-1) is the last index and (-2) is the second last index and so forth.
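The correspondence between positive and negative indices can be checked directly. A minimal sketch with an illustrative list:

```python
seq = [10, 20, 30, 40]

last = seq[-1]          # same element as seq[len(seq) - 1]
second_last = seq[-2]
last_two = seq[-2:]     # negative indices also work in slices

print(last, second_last, last_two)  # 40 30 [30, 40]
```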

20. How can you convert a number to a string?


To convert a number into a string, use the built-in function str().
If you want an octal or hexadecimal representation, use the built-in function oct() or hex().
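A quick sketch of these conversions. The value 255 is arbitrary, and the format-string line is an extra illustration beyond what the answer mentions:

```python
n = 255

as_text = str(n)               # '255'
as_octal = oct(n)              # '0o377'
as_hex = hex(n)                # '0xff'
back = int(as_hex, 16)         # parse the hex string back into an int
two_places = f"{3.14159:.2f}"  # format strings control decimal places

print(as_text, as_octal, as_hex, back, two_places)
# 255 0o377 0xff 255 3.14
```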

21. What is range function?


The range() function returns a sequence of numbers that starts from 0 by default,
increments by 1 by default, and stops before a specified number.
x = range(6)
for n in x:
    print(n)
The output is 0, 1, 2, 3, 4, 5
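The start and step defaults can be overridden. A brief sketch of the three call forms:

```python
up_to_six = list(range(6))        # stop only
from_two = list(range(2, 6))      # start, stop
by_threes = list(range(0, 10, 3)) # start, stop, step
countdown = list(range(5, 0, -1)) # a negative step counts down

print(up_to_six, from_two, by_threes, countdown)
# [0, 1, 2, 3, 4, 5] [2, 3, 4, 5] [0, 3, 6, 9] [5, 4, 3, 2, 1]
```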

22. How do you generate random numbers in Python?


Library: import random
Syntax: random.random()
Output: Returns a random floating point number in the range [0,1)
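Beyond random.random(), the same module offers a few other commonly used helpers. A minimal sketch; the seed value is arbitrary and only there to make runs repeatable:

```python
import random

random.seed(42)  # arbitrary seed so repeated runs give the same numbers

x = random.random()                            # float in [0, 1)
die = random.randint(1, 6)                     # integer from 1 to 6, both ends included
pick = random.choice(["red", "green", "blue"]) # one element of a sequence
u = random.uniform(2.5, 5.0)                   # float between 2.5 and 5.0

print(x, die, pick, u)
```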

23. What is the difference between / and // operator in Python?


/ is the true division operator; it always returns a float.
// is the floor division operator; it divides two operands and rounds the quotient down to
the nearest whole number.
10 / 3 = 3.33333
10 // 3 = 3
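The difference matters most with negative numbers, where flooring is not the same as dropping the decimals. A short sketch:

```python
true_div = 10 / 3        # float result
floor_div = 10 // 3      # 3
neg_floor = -10 // 3     # -4: rounds down, not toward zero
float_floor = 10.0 // 3  # 3.0: the result is a float if an operand is a float
q, r = divmod(10, 3)     # quotient and remainder together

print(true_div, floor_div, neg_floor, float_floor, q, r)
```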
24. What is the use of the split function in Python?
The split function breaks a string into shorter strings using the defined separator.
It returns a list of the pieces. With no separator given, it splits on whitespace, which
yields the words in the string.
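A small sketch of split with and without an explicit separator. The sample strings are made up:

```python
sentence = "Python is easy to learn"
words = sentence.split()           # no argument: split on whitespace

csv_line = "a,b,,c"
fields = csv_line.split(",")       # empty fields are kept
limited = csv_line.split(",", 1)   # second argument caps the number of splits
rejoined = " ".join(words)         # join is the inverse operation

print(words, fields, limited)
# ['Python', 'is', 'easy', 'to', 'learn'] ['a', 'b', '', 'c'] ['a', 'b,,c']
```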

25. What is the difference between a list and a tuple?

26. What is the difference between an array and a list?

27. How would you convert a list to an array?


This is done using numpy.array().
This function of the numpy library takes a list as an argument and returns an array that
contains all the elements of the list.

28. What are the advantages of NumPy arrays over Python lists?
NumPy arrays are more convenient than lists for numerical work.
You get many vector and matrix operations that apply to a whole array at once, which often
avoids writing explicit loops.
You get a lot of built-in functions with NumPy for fast searching, basic statistics, linear
algebra, histograms, etc.
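The sketch below contrasts a list with the array built from it, assuming NumPy is installed. The prices list is made-up sample data:

```python
import numpy as np

prices = [1.5, 2.0, 3.5]          # plain Python list
arr = np.array(prices)            # list converted to a NumPy array

doubled = arr * 2                 # element-wise arithmetic on the array
list_doubled = prices * 2         # on a list, * repeats the elements instead
mean = arr.mean()                 # built-in statistics

print(doubled, list_doubled, mean)
```

The same `*` operator therefore means arithmetic for arrays and repetition for lists.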

29. What are global and local variables in Python?


30. Explain the differences between Python 2 and Python 3?

31. What is dictionary comprehension in Python?


Dictionary comprehension is one way to create a dictionary in Python.
In the example below, it builds a dictionary by pairing up two sequences (lists or arrays)
of equal length.
rollNumbers = [122, 233, 353, 456]
names = ['alex', 'bob', 'can', 'don']
NewDictionary = {i: j for (i, j) in zip(rollNumbers, names)}
The output is {122: 'alex', 233: 'bob', 353: 'can', 456: 'don'}
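A comprehension can also compute values and filter entries. A minimal sketch with illustrative names:

```python
# Map each number to its square.
squares = {n: n * n for n in range(1, 5)}

# Keep only the entries whose value is even.
even_squares = {n: sq for n, sq in squares.items() if sq % 2 == 0}

print(squares)       # {1: 1, 2: 4, 3: 9, 4: 16}
print(even_squares)  # {2: 4, 4: 16}
```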

32. How would you sort a dictionary in Python?


Dictionary.keys(): Returns a view of the keys (in insertion order since Python 3.7).
Dictionary.values(): Returns a view of the values.
Dictionary.items(): Returns a view of all of the data as key-value pairs.
sorted(): This built-in function takes one mandatory argument (the iterable) and two optional arguments (key and reverse).
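Putting these pieces together, one common way to sort a dictionary is to sort its items and rebuild a dict from the result. A sketch with made-up scores:

```python
scores = {"carol": 95, "alice": 72, "bob": 85}

# Sort by key: items are (key, value) tuples, which sort by key first.
by_key = dict(sorted(scores.items()))

# Sort by value, highest first, using the optional key and reverse arguments.
by_value_desc = dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))

sorted_keys = sorted(scores)  # iterating a dict yields its keys

print(by_key)         # {'alice': 72, 'bob': 85, 'carol': 95}
print(by_value_desc)  # {'carol': 95, 'bob': 85, 'alice': 72}
```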

33. How do you reverse a string in Python?


Stringname = 'python'
Stringname[::-1]
The output is 'nohtyp'

34. How do you check if a Python string contains another string?


"Programming" in "Python Programming"
The output is True
"Language" in "Python Programming"
The output is False
Pandas — 18 questions
35. How to create dataframe from list?
import pandas as pd
fruit_sales = pd.DataFrame([[35, 21], [41, 34]], columns=['Apples', 'Bananas'], index=['2017 Sales', '2018 Sales'])

36. How to create a dataframe from a dictionary?


animals = pd.DataFrame({'Cows': [12, 20], 'Goats': [22, 19]}, index=['Year 1', 'Year 2'])

37. How to import csv?


import pandas as pd
cr_data = pd.read_csv("credit_risk_dataset.csv")

38. How to export csv?


import pandas as pd
animals.to_csv("cows_and_goats.csv")

39. How do you select columns from dataframe?


Selecting the ‘description’ column from ‘reviews’ dataframe
reviews['description']

40. How do you select rows from dataframe?


Selecting the first row from ‘reviews’ dataframe
reviews.iloc[0]

41. How do you select both rows and columns from dataframe?
Selecting the first row of ‘description’ column from ‘reviews’ dataframe
reviews['description'].iloc[0]

42. How do you select rows based on indices?
Selecting rows 1, 2, 3, 5 and 8 from ‘reviews’ dataframe
indices = [1, 2, 3, 5, 8]
sample_reviews = reviews.loc[indices]

43. How do you find the median value?


Finding the median of ‘points’ column from ‘reviews’ dataframe
reviews['points'].median()

44. How do you find the unique values?


Finding all the unique countries in ‘country’ column from ‘reviews’ dataframe
reviews['country'].unique()

45. How do you find the count of unique values?


Finding how many times each unique country appears in ‘country’ column from ‘reviews’ dataframe
reviews['country'].value_counts()

46. How do you group on a particular variable?


Find the number of rows for each value of ‘taster_twitter_handle’ column from ‘reviews’ dataframe
reviews.groupby('taster_twitter_handle').size()

47. How do you apply functions after grouping on a particular variable?


Find the min and max of ‘price’ for each ‘variety’ in the ‘reviews’ dataframe
reviews.groupby('variety')['price'].agg(['min', 'max'])

48. How to get the data type of a particular variable?
Get the data type of ‘points’ column from ‘reviews’ dataframe
reviews['points'].dtype

49. How do you drop columns?


Dropping columns ‘points’ and ‘country’ from ‘reviews’ dataframe
reviews.drop(['points', 'country'], axis=1, inplace=True)

50. How do you keep columns?


Keeping columns ‘points’ and ‘country’ from ‘reviews’ dataframe
reviews = reviews[['points', 'country']]

51. How do you rename a column?


Rename ‘region_1’ as ‘region’ and ‘region_2’ as ‘locale’
reviews = reviews.rename(columns=dict(region_1='region', region_2='locale'))

52. How do you sort a dataframe based on a variable?


Sorting the dataframe by ‘region_1’ in descending order
reviews.sort_values(by='region_1', ascending=False)
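The pandas answers above all refer to a ‘reviews’ dataframe that is not shown. The sketch below builds a tiny hypothetical stand-in (the rows are made up, and it assumes pandas is installed) so that several of the operations can be run end to end:

```python
import pandas as pd

# Hypothetical miniature 'reviews' table; the values are invented for illustration.
reviews = pd.DataFrame({
    "country": ["Italy", "France", "Italy", "US"],
    "variety": ["Red", "White", "Red", "White"],
    "points": [87, 90, 85, 92],
    "price": [15.0, 30.0, 20.0, 25.0],
})

first_row = reviews.iloc[0]                          # first row by position
median_points = reviews["points"].median()           # median of one column
country_counts = reviews["country"].value_counts()   # rows per country
price_range = reviews.groupby("variety")["price"].agg(["min", "max"])
by_points = reviews.sort_values(by="points", ascending=False)

print(median_points)
print(country_counts)
print(price_range)
print(by_points)
```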
Visualization — 8 questions
53. How do you plot a line chart?
import seaborn as sns
sns.lineplot(data=loan_amnt)

54. How do you plot a bar chart?


import seaborn as sns
sns.barplot(x=cr_data['cb_person_default_on_file'], y=cr_data['loan_int_rate'])

55. How do you plot a heat map?
import seaborn as sns
sns.heatmap(num_data.corr(), annot=True)

56. How do you plot scatter plots?


import seaborn as sns
sns.scatterplot(x=cr_data['loan_amnt'], y=cr_data['person_income'])

57. How do you plot a distribution chart?
import seaborn as sns
sns.histplot(x=cr_data['person_income'], label="person_income", kde=False)

58. How do you add x-label and y-label to the chart?


import matplotlib.pyplot as plt
plt.xlabel("cred_hist_length")
plt.ylabel("loan_amnt")

59. How do you add a title to the chart?


import matplotlib.pyplot as plt
plt.title("Average int_rate")

60. How do you add legend to the chart?


import matplotlib.pyplot as plt
plt.legend()
Data Cleaning — 5 questions
61. How do you identify missing values?
Missing values are identified with .isnull()
The code below gives the number of missing data points in each column of the data frame
missing_values_count = sf_permits.isnull().sum()

62. How do you impute missing values?


Replace missing values with zero or with the mean (choose one)
df['income'] = df['income'].fillna(0)
df['income'] = df['income'].fillna(df['income'].mean())

63. What is scaling of data?


Scaling converts the data using the formula: (value - min value) / (max value - min value)
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
original_data = pd.DataFrame(kickstarters_2017['usd_goal_real'])
scaled_data = pd.DataFrame(scaler.fit_transform(original_data))
Original data
Minimum value: 0.01
Maximum value: 166361390.71
Scaled data
Minimum value: 0.0
Maximum value: 1.0
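To make the formula concrete, here is a pure-Python sketch of min-max scaling on a short made-up list. The helper name min_max_scale is hypothetical, and this is not how MinMaxScaler is implemented internally:

```python
def min_max_scale(values):
    """Rescale values to the range [0, 1] with (value - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption for this sketch: a constant column maps to all zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

scaled = min_max_scale([10, 20, 40])
print(scaled)  # the smallest value becomes 0.0 and the largest becomes 1.0
```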

64. What is normalizing data?


Normalizing (standardizing) converts the data using the formula: (value - mean) / standard deviation
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
original_data = pd.DataFrame(kickstarters_2017['usd_goal_real'])
scaled_data = pd.DataFrame(scaler.fit_transform(original_data))
Original data
Minimum value: 0.01
Maximum value: 166361390.71
Scaled data
Minimum value: -0.10
Maximum value: 212.57
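The same formula can be worked through by hand with the standard library. The helper name standardize and the sample list are hypothetical; the sketch uses the population standard deviation, which is what StandardScaler uses:

```python
import statistics

def standardize(values):
    """Apply (value - mean) / standard deviation to every value."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    return [(v - mean) / std for v in values]

# Sample data with mean 5 and population standard deviation 2.
z_scores = standardize([2, 4, 4, 4, 5, 5, 7, 9])
print(z_scores)  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

After the transformation the values have mean 0 and standard deviation 1, but unlike min-max scaling they are not confined to a fixed range.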

65. How do you treat dates in python?


To convert dates from String to Date
import datetime
import pandas as pd
df['Date_parsed'] = pd.to_datetime(df['Date'], format="%m/%d/%Y")
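The same format codes work for a single string with the standard library alone. A small sketch with an arbitrary sample date:

```python
from datetime import datetime

# %m = month, %d = day, %Y = four-digit year, matching the format string above.
parsed = datetime.strptime("03/15/2021", "%m/%d/%Y")

iso_text = parsed.strftime("%Y-%m-%d")  # format the date back into text

print(parsed.year, parsed.month, parsed.day)  # 2021 3 15
print(iso_text)                               # 2021-03-15
```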
Machine Learning — 15 questions
66. What is logistic regression?
Logistic regression is a machine learning algorithm for classification. In this algorithm, the
probabilities describing the possible outcomes of a single trial are modelled using a logistic
function.
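The logistic function mentioned above maps any real number to a probability between 0 and 1. A minimal sketch for a single feature; the weight and bias values are made up for illustration, not fitted to any data:

```python
import math

def sigmoid(z):
    """Logistic function: squashes z into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(x, weight, bias):
    """Probability of the positive class for one feature value x."""
    return sigmoid(weight * x + bias)

p_mid = sigmoid(0)                                        # exactly 0.5
p_high = predict_probability(4.0, weight=1.5, bias=-2.0)  # sigmoid(4.0)

print(p_mid, round(p_high, 3))  # 0.5 0.982
```

Fitting a logistic regression amounts to choosing the weight and bias that make these probabilities match the training labels as closely as possible.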

67. What is the syntax for logistic regression?


Library: sklearn.linear_model.LogisticRegression
Define model: lr = LogisticRegression()
Fit model: model = lr.fit(x, y)
Predictions: pred = model.predict_proba(test)

68. How do you split the data in train / test?


Library: sklearn.model_selection.train_test_split
Syntax: X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33,
random_state=42)
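Conceptually, the split shuffles the rows and cuts them into two parts. The pure-Python sketch below illustrates the idea; simple_train_test_split is a hypothetical helper, not scikit-learn's implementation, and its rounding may differ from the library's:

```python
import random

def simple_train_test_split(rows, test_size=0.33, seed=42):
    """Shuffle a copy of rows and cut it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = round(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]

data = list(range(10))
train, test = simple_train_test_split(data)

print(len(train), len(test))  # 7 3
```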

69. What is decision tree?


Given a dataset of attributes together with their classes, a decision tree produces a sequence of
rules that can be used to classify the data.

70. What is the syntax for decision tree classifier?


Library: sklearn.tree.DecisionTreeClassifier
Define model: dtc = DecisionTreeClassifier()
Fit model: model = dtc.fit(x, y)
Predictions: pred = model.predict_proba(test)

71. What is random forest?


Random forest classifier is a meta-estimator that fits a number of decision trees on various
sub-samples of the dataset and uses averaging to improve the predictive accuracy of the model
and to control over-fitting. By default, each sub-sample is the same size as the original input
sample, but the samples are drawn with replacement.

72. What is the syntax for random forest classifier?


Library: sklearn.ensemble.RandomForestClassifier
Define model: rfc = RandomForestClassifier()
Fit model: model = rfc.fit(x, y)
Predictions: pred = model.predict_proba(test)

73. What is gradient boosting?


Gradient boosting is a machine learning technique for regression and classification
problems, which produces a prediction model in the form of an ensemble of weak prediction
models, typically decision trees. It builds the model in a stage-wise fashion like other
boosting methods do, and it generalizes them by allowing optimization of an arbitrary
differentiable loss function.

74. What is the syntax for gradient boosting classifier?
Library: sklearn.ensemble.GradientBoostingClassifier
Define model: gbc = GradientBoostingClassifier()
Fit model: model = gbc.fit(x, y)
Predictions: pred = model.predict_proba(test)

75. What is SVM?


Support vector machine is a representation of the training data as points in space separated
into categories by a clear gap that is as wide as possible. New examples are then mapped
into that same space and predicted to belong to a category based on which side of the gap
they fall.

76. What is the difference between KNN and KMeans?


KNN:
Supervised classification algorithm
Classifies a new data point according to the k closest data points
KMeans:
Unsupervised clustering algorithm
Groups data into k number of clusters.
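The KNN half of the comparison is simple enough to sketch in pure Python: find the k labelled points nearest to the query and take a majority vote. The function name and the toy points are hypothetical:

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Toy labelled data: two well-separated groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (2, 2)))  # a
print(knn_predict(points, labels, (8, 7)))  # b
```

KMeans, by contrast, would receive only the points, without the labels, and would have to discover the two groups itself.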

77. How do you treat categorical variables?


Replace categorical variables with the average of target for each category

One hot encoding
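Both approaches can be sketched in pure Python on a small made-up column. The colors and target values are illustrative only:

```python
colors = ["red", "blue", "red", "green"]   # categorical feature
target = [1, 0, 0, 1]                      # target value for each row

# One hot encoding: one 0/1 column per category.
categories = sorted(set(colors))           # ['blue', 'green', 'red']
one_hot = [[1 if c == cat else 0 for cat in categories] for c in colors]

# Target (mean) encoding: replace each category with the average target.
totals, counts = {}, {}
for c, t in zip(colors, target):
    totals[c] = totals.get(c, 0) + t
    counts[c] = counts.get(c, 0) + 1
mean_by_category = {c: totals[c] / counts[c] for c in totals}
target_encoded = [mean_by_category[c] for c in colors]

print(one_hot)         # [[0, 0, 1], [1, 0, 0], [0, 0, 1], [0, 1, 0]]
print(target_encoded)  # [0.5, 0.0, 0.5, 1.0]
```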


78. How do you treat missing values?
Drop rows having missing values
DataFrame.dropna(axis=0, how='any', inplace=True)
Drop columns having missing values
DataFrame.dropna(axis=1, how='any', inplace=True)
Replace missing values with zero or with the mean (choose one)
df['income'] = df['income'].fillna(0)
df['income'] = df['income'].fillna(df['income'].mean())

79. How do you treat outliers?


The interquartile range (IQR) is used to identify outliers: values more than 1.5 * IQR below Q1 or above Q3 are filtered out.
Q1 = df['income'].quantile(0.25)
Q3 = df['income'].quantile(0.75)
IQR = Q3 - Q1
df = df[(df['income'] >= (Q1 - 1.5 * IQR)) & (df['income'] <= (Q3 + 1.5 * IQR))]
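The same rule can be traced on a small list with the standard library. The incomes are made up; method="inclusive" is chosen here because it interpolates the same way as the default pandas quantile:

```python
import statistics

incomes = [10, 12, 13, 14, 15, 16, 18, 100]  # 100 is the suspicious value

# Quartiles with linear interpolation between data points.
q1, _, q3 = statistics.quantiles(incomes, n=4, method="inclusive")
iqr = q3 - q1

lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
kept = [v for v in incomes if lower <= v <= upper]

print(q1, q3, iqr)   # 12.75 16.5 3.75
print(lower, upper)  # 7.125 22.125
print(kept)          # [10, 12, 13, 14, 15, 16, 18]
```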

80. What is bias / variance trade off?


Definition
The Bias-Variance Trade off is relevant for supervised machine learning, specifically for
predictive modelling. It’s a way to diagnose the performance of an algorithm by breaking
down its prediction error.
Error from Bias
Bias is the difference between your model’s expected predictions and the true values.
High bias is associated with under-fitting.
Does not improve by collecting more data points.
Error from Variance
Variance refers to your algorithm’s sensitivity to specific sets of training data.
High variance is associated with over-fitting.
Improves by collecting more data points.
