GitHub Repository: ycchen00/Introduction-to-Data-Science-in-Python
Path: blob/main/quiz/FinalQuiz.ipynb
Kernel: Python 3

Final Quiz

Q1

Consider the given NumPy arrays a and b. What will be the value of c after the following code is executed?

import numpy as np
a = np.arange(8)
b = a[4:6]
b[:] = 40
c = a[4] + a[6]
c
46
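
The result is 46 rather than 10 because a basic NumPy slice is a view onto the original array, so assigning through b also changes a. The sketch below contrasts the view with an explicit copy; the names a2, b2, c_view and c_copy are introduced here for illustration only.

```python
import numpy as np

# Case 1: b is a view, so writing to b also writes to a.
a = np.arange(8)
b = a[4:6]
b[:] = 40
c_view = a[4] + a[6]      # a[4] is now 40, a[6] is still 6

# Case 2: an explicit copy leaves the original array untouched.
a2 = np.arange(8)
b2 = a2[4:6].copy()
b2[:] = 40
c_copy = a2[4] + a2[6]    # a2[4] is still 4, a2[6] is still 6

print(c_view, c_copy, np.shares_memory(a, b), np.shares_memory(a2, b2))
```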

Q2

Given the string s as shown below, which of the following expressions will be True?

import re
s = 'ABCAC'
print(re.split('A', s))
len(re.split('A', s)) == 2
['', 'BC', 'C']
False
bool(re.match('A', s)) == True
True
print(re.match('A', s))
re.match('A', s) == True
<_sre.SRE_Match object; span=(0, 1), match='A'>
False
# len(re.search('A', s)) == 2
print(re.search('A', s))
print("TypeError: object of type '_sre.SRE_Match' has no len()")
<_sre.SRE_Match object; span=(0, 1), match='A'>
TypeError: object of type '_sre.SRE_Match' has no len()
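
The four candidate expressions can be checked in one place. This is a small sketch, not part of the original notebook; the variable names are illustrative, and the TypeError is caught explicitly because match objects do not define a length.

```python
import re

s = 'ABCAC'

parts = re.split('A', s)          # splitting at each 'A' leaves a leading ''
m = re.match('A', s)              # a match object, which is truthy

split_len_is_2 = len(parts) == 2
bool_match_is_true = bool(m) == True
match_equals_true = (m == True)   # a match object is not equal to True

try:
    len(re.search('A', s))
    search_has_len = True
except TypeError:
    search_has_len = False

print(parts, split_len_is_2, bool_match_is_true, match_equals_true, search_has_len)
```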

Q3

Consider a string s. We want to find all characters (other than A) that are followed by triple A, i.e., have AAA immediately to their right. The triple A itself should not appear in the output; we want only the character immediately preceding AAA. Complete the code given below so that it outputs the required result.

def result():
    s = 'ACAABAACAAABACDBADDDFSDDDFFSSSASDAFAAACBAAAFASD'
    result = []
    # complete the pattern below
    pattern = r"(\w)(?=[A]{3})"
    for item in re.finditer(pattern, s):
        # identify the group number below.
        result.append(item.group())
    return result
result()
['C', 'F', 'B']
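
The lookahead (?=[A]{3}) checks for AAA without consuming it, so only the preceding character is returned. Note that \w would also match an A that is itself followed by AAA; the quiz string contains no run of four A's, so this does not matter here. The sketch below compares the quiz pattern with a stricter variant that excludes A explicitly. The helper name preceding_chars and the test string 'XAAAA' are assumptions made for illustration.

```python
import re

def preceding_chars(pattern, text):
    """Return every single-character match of `pattern` in `text`."""
    return [m.group() for m in re.finditer(pattern, text)]

quiz_s = 'ACAABAACAAABACDBADDDFSDDDFFSSSASDAFAAACBAAAFASD'
loose = r"(\w)(?=A{3})"      # same behaviour as the quiz pattern
strict = r"([^A])(?=A{3})"   # additionally rejects 'A' as the preceding character

print(preceding_chars(loose, quiz_s))
print(preceding_chars(strict, quiz_s))
print(preceding_chars(loose, 'XAAAA'))
print(preceding_chars(strict, 'XAAAA'))
```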

Q4

Consider the following four expressions regarding the pandas Series df defined below. All of them have the same value except one. Can you identify which one it is?

import pandas as pd
df=pd.Series({'d':4,'b':7,'a':-5,'c':3})
df
d    4
b    7
a   -5
c    3
dtype: int64
df.iloc[0]
4
df['d']
4
df.index[0]
'd'
df[0]
4
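
The odd one out is df.index[0], which returns a label rather than a value. The sketch below separates the three kinds of access using the explicit .iloc and .loc accessors; it avoids df[0] because integer keys on a label-indexed Series are treated as positions only in older pandas versions. The variable names are illustrative.

```python
import pandas as pd

df = pd.Series({'d': 4, 'b': 7, 'a': -5, 'c': 3})

by_position = df.iloc[0]   # value stored at position 0
by_label = df.loc['d']     # value stored under label 'd'
first_label = df.index[0]  # the label itself, not a value

print(by_position, by_label, first_label)
```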

Q5

Consider the two pandas Series objects shown below, representing the number of items of different yogurt flavors that were sold in a day at two different stores, s1 and s2. Which of the following statements is True regarding the Series s3 defined below?

s1=pd.Series({
    'Mango':20,
    'Strawberry':15,
    'Blueberry':18,
    'Vanilla':31
})
s2=pd.Series({
    'Mango':20,
    'Strawberry':20,
    'Vanilla':30,
    'Banana':15,
    'Plain':20
})
s1
Mango         20
Strawberry    15
Blueberry     18
Vanilla       31
dtype: int64
s2
Mango         20
Strawberry    20
Vanilla       30
Banana        15
Plain         20
dtype: int64
s3=s1.add(s2)
s3
Banana         NaN
Blueberry      NaN
Mango         40.0
Plain          NaN
Strawberry    35.0
Vanilla       61.0
dtype: float64
s3['Blueberry']==s1['Blueberry']
False
s3['Mango'] >= s1.add(s2,fill_value=0)['Mango']
True
s3['Blueberry'] >= s1.add(s2,fill_value=0)['Blueberry']
False
s3['Plain']>=s3['Mango']
False
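
Series.add aligns the two Series on their labels, so a flavor present in only one store produces NaN, and any comparison against NaN evaluates to False. The sketch below shows how fill_value=0 changes the outcome; the name filled is introduced for illustration.

```python
import pandas as pd

s1 = pd.Series({'Mango': 20, 'Strawberry': 15, 'Blueberry': 18, 'Vanilla': 31})
s2 = pd.Series({'Mango': 20, 'Strawberry': 20, 'Vanilla': 30, 'Banana': 15, 'Plain': 20})

s3 = s1.add(s2)                    # labels missing from either side give NaN
filled = s1.add(s2, fill_value=0)  # a missing side is treated as 0 instead

print(s3)
print(filled)
```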

Q6

In the following list of statements regarding a DataFrame df, one or more statements are correct. Can you identify all the correct statements?

data = pd.DataFrame(data=[['bar','one','z','1'],
                          ['bar','two','v','2'],
                          ['foo','one','x','3'],
                          ['foo','two','w','4']],
                    columns=['a','b','c','d'])
data
indexed1 = data.set_index('c')
indexed1
indexed2 = indexed1.set_index('a')
indexed2
reindexed1 = data.set_index('c')
reindexed1
reindexed2 = reindexed1.reset_index()
reindexed2
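
Since the cell outputs are not preserved here, the sketch below inspects what each step leaves behind. It relies on two default behaviours of pandas: set_index discards the existing index when a new one is set, and reset_index moves the index back into the frame as the first column.

```python
import pandas as pd

data = pd.DataFrame(data=[['bar', 'one', 'z', '1'],
                          ['bar', 'two', 'v', '2'],
                          ['foo', 'one', 'x', '3'],
                          ['foo', 'two', 'w', '4']],
                    columns=['a', 'b', 'c', 'd'])

indexed1 = data.set_index('c')       # 'c' becomes the index
indexed2 = indexed1.set_index('a')   # 'a' replaces 'c'; the 'c' values are dropped

reindexed1 = data.set_index('c')
reindexed2 = reindexed1.reset_index()  # 'c' returns as the first column

print(list(indexed2.columns), indexed2.index.name)
print(list(reindexed2.columns))
```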

Q7

Consider the Series object S defined below. Which of the following is an incorrect way to slice S such that we obtain all data points corresponding to the indices 'b', 'c', and 'd'?

S = pd.Series(np.arange(5), index=['a', 'b', 'c', 'd', 'e'])
S
a    0
b    1
c    2
d    3
e    4
dtype: int32
S['b':'e']
b    1
c    2
d    3
e    4
dtype: int32
S[['b','c','d']]
b    1
c    2
d    3
dtype: int32
S[S<=3][S>0]
b    1
c    2
d    3
dtype: int32
S[1:4]
b    1
c    2
d    3
dtype: int32
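
S['b':'e'] is the incorrect slice because label-based slices include the end label, while positional slices exclude the end position. The sketch below makes the difference explicit with .loc and .iloc; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

S = pd.Series(np.arange(5), index=['a', 'b', 'c', 'd', 'e'])

label_to_e = S.loc['b':'e']          # end label 'e' is included
label_to_d = S.loc['b':'d']          # stops at 'd', inclusive
positional = S.iloc[1:4]             # positions 1, 2, 3; position 4 is excluded
masked = S[(S > 0) & (S <= 3)]       # a single combined boolean mask

print(list(label_to_e.index), list(label_to_d.index))
print(list(positional.index), list(masked.index))
```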

Q8

Consider the DataFrame df defined below, with index labels 'R1', 'R2', 'R3', and 'R4'. In the following code, a new object df_new is created from df. What will be the value of df_new[1] after the code below is executed?

df = pd.DataFrame([
    {'a':5,'b':6,'c':20},
    {'a':5,'b':82,'c':28},
    {'a':71,'b':31,'c':92},
    {'a':67,'b':37,'c':49}],
    index=['R1', 'R2', 'R3','R4'])
df
f = lambda x: x.max() + x.min()
df_new = df.apply(f)
df_new[1]
88
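
By default, DataFrame.apply passes each column to the function, so df_new is a Series indexed by 'a', 'b', 'c', and its second entry is the max plus min of column 'b' (82 + 6). The sketch below contrasts this with axis=1, which applies the function per row. It uses .iloc[1] instead of df_new[1] because integer keys on a label-indexed Series are treated as positions only in older pandas versions.

```python
import pandas as pd

df = pd.DataFrame([
    {'a': 5, 'b': 6, 'c': 20},
    {'a': 5, 'b': 82, 'c': 28},
    {'a': 71, 'b': 31, 'c': 92},
    {'a': 67, 'b': 37, 'c': 49}],
    index=['R1', 'R2', 'R3', 'R4'])

f = lambda x: x.max() + x.min()

col_wise = df.apply(f)          # one result per column: a, b, c
row_wise = df.apply(f, axis=1)  # one result per row: R1 .. R4

print(col_wise)
print(row_wise)
```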

Q9

Consider the DataFrame named new_df created below. Which of the following expressions will output the expected result (showing the head of a DataFrame)?

import pandas as pd
import numpy as np
df = pd.read_csv('../resources/week-3/datasets/cwurData.csv')
df.head()
def create_category(ranking):
    if (ranking >= 1) & (ranking <= 100):
        return "First Tier Top Unversity"
    elif (ranking >= 101) & (ranking <= 200):
        return "Second Tier Top Unversity"
    elif (ranking >= 201) & (ranking <= 300):
        return "Third Tier Top Unversity"
    return "Other Top Unversity"

df['Rank_Level'] = df['world_rank'].apply(lambda x: create_category(x))
new_df=df.pivot_table(values='score', index='country', columns='Rank_Level', aggfunc=[np.mean, np.max], margins=True)
new_df.head()
new_df.unstack()
      Rank_Level                country
mean  First Tier Top Unversity  Argentina                   NaN
                                Australia               47.9425
                                Austria                     NaN
                                Belgium                 51.8750
                                Brazil                      NaN
                                                         ...
amax  All                       Uganda                  44.4000
                                United Arab Emirates    44.3600
                                United Kingdom          97.6400
                                Uruguay                 44.3500
                                All                    100.0000
Length: 600, dtype: float64
new_df.stack()
new_df.stack().stack()
country    Rank_Level
Argentina  Other Top Unversity        mean     44.672857
                                      amax     45.660000
           All                        mean     44.672857
                                      amax     45.660000
Australia  First Tier Top Unversity   mean     47.942500
                                                 ...
All        Second Tier Top Unversity  amax     51.290000
           Third Tier Top Unversity   mean     46.843450
                                      amax     47.930000
           All                        mean     47.798395
                                      amax    100.000000
Length: 386, dtype: float64
new_df.unstack().unstack()
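
The difference between stack and unstack is easier to see on a tiny table. The sketch below uses hypothetical toy data (the scores frame and its values are invented for illustration and are not taken from cwurData.csv): stack moves the column level into the index, while unstack on a frame with a single-level index moves the index under the columns, so both return a Series but with the levels in opposite order.

```python
import pandas as pd

# Hypothetical toy data, standing in for the quiz's ranking table.
scores = pd.DataFrame({
    "country": ["X", "X", "Y", "Y"],
    "tier": ["First", "Second", "First", "Second"],
    "score": [90.0, 70.0, 80.0, 60.0],
})

wide = scores.pivot_table(values="score", index="country",
                          columns="tier", aggfunc="mean")

stacked = wide.stack()      # Series indexed by (country, tier)
unstacked = wide.unstack()  # Series indexed by (tier, country)

print(stacked)
print(unstacked)
```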

Q10

Consider the DataFrame df defined below. What will be the output (rounded to the nearest integer) when the following code related to df is executed?

df = pd.DataFrame([
    {'Item':'item_1','Store':'A','Quantity sold':10},
    {'Item':'item_1','Store':'B','Quantity sold':20},
    {'Item':'item_1','Store':'C','Quantity sold':None},
    {'Item':'item_2','Store':'A','Quantity sold':5},
    {'Item':'item_2','Store':'B','Quantity sold':10},
    {'Item':'item_2','Store':'C','Quantity sold':15}])
df
df.groupby('Item').sum().iloc[0]['Quantity sold']
30.0
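
The missing quantity for item_1 at store C is stored as NaN, and the grouped sum skips NaN by default, which gives 10 + 20 = 30.0. The sketch below also shows the min_count argument, which returns NaN for a group with fewer valid values than required; the names qty, default_sum and strict_sum are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Item': ['item_1'] * 3 + ['item_2'] * 3,
    'Store': list('ABCABC'),
    'Quantity sold': [10, 20, None, 5, 10, 15],
})

qty = df.groupby('Item')['Quantity sold']

default_sum = qty.sum()            # NaN values are skipped
strict_sum = qty.sum(min_count=3)  # require 3 valid values per group

print(default_sum)
print(strict_sum)
```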