AWS Machine Learning Certification - Quiz #2

admin -

May 25, 2020

Know your built-in algorithms from your AI services? Sure about your hyper-parameters & ETL?

Give this a go to check your knowledge ahead of taking the AWS Machine Learning-Specialty exam.

1 / 10

QUESTION

An office supplies company uses Amazon EMR with Apache Spark for its data transformation workloads. Due to supplier systems issues currently outside their control, duplicates are being seen in the data feeds. What would be the most efficient and simplest method to remove those duplicates?

(Select One)

Implement a Lambda function to pre-process problem data sources

Use Glue and write a custom script to pre-process before delivery to EMR

Use Glue’s built-in machine learning capabilities to implement a transformation for deduplication

Load the data into Amazon Redshift to de-duplicate, then load into EMR data frames

2 / 10

QUESTION

A data scientist is examining a subset of data on an AWS notebook instance. The data has been loaded into a Pandas DataFrame and a correlation command df.corr() has been run.

Which of the following can be determined from the resulting correlation table below?

	target	f1	f2	f3	f4
target	1.000	0.212	-0.001	-0.009	0.003
f1	0.212	1.000	-0.007	0.024	-0.016
f2	-0.001	-0.007	1.000	0.007	0.000
f3	-0.009	0.024	0.007	1.000	-0.005
f4	0.003	-0.016	0.000	-0.005	1.000

(Choose One)

There is no correlation among features f1-f4, or between the features and the target

There is weak correlation between feature f3 and feature f2

There is no correlation between feature f2 and the target

There is weak correlation between f1 and the target. There may be non-linear correlations between features, but this cannot be determined here
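As a side note on what a correlation table can and cannot show: pandas' `df.corr()` computes the Pearson coefficient by default, which only measures linear association. The sketch below re-implements that coefficient in plain Python on made-up data (the `x`, `linear` and `quadratic` lists are illustrative assumptions, not quiz data) to show that a perfectly deterministic but non-linear relationship can still score close to zero.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient (the default method of pandas' df.corr())."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical feature values, symmetric around zero
x = [i / 10 for i in range(-50, 51)]
linear = [2 * v + 1 for v in x]     # y depends linearly on x
quadratic = [v * v for v in x]      # y depends on x, but not linearly

r_linear = pearson(x, linear)
r_quadratic = pearson(x, quadratic)

print(f"linear:    {r_linear:.3f}")
print(f"quadratic: {r_quadratic:.3f}")
```

The quadratic case is fully determined by `x`, yet its coefficient is essentially zero, so a near-zero entry in a Pearson table rules out a linear relationship only.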

You’ll likely need to be sure of precision, accuracy, recall and perhaps F1 score for the exam. The confusion matrix and its related calculations should be understood for both binary and multi-class classification.

Watch out for the matrix being drawn with either predictions or actuals on the left/top, as this can be confusing if not spotted.

https://towardsdatascience.com/reading-a-confusion-matrix-60c4dd232dd4
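To illustrate why the orientation matters, here is a minimal sketch that reads the same 2x2 matrix both ways. The matrix values, the `[positive, negative]` label order and the helper names `binary_counts` and `scores` are assumptions made for this example only.

```python
def binary_counts(matrix, rows_are_actual=True):
    """Return (TP, FN, FP, TN) from a 2x2 matrix with label order [positive, negative]."""
    (a, b), (c, d) = matrix
    if rows_are_actual:
        # Rows are actual classes, columns are predicted classes
        tp, fn, fp, tn = a, b, c, d
    else:
        # Rows are predicted classes, columns are actual classes:
        # the two off-diagonal cells swap meaning
        tp, fp, fn, tn = a, b, c, d
    return tp, fn, fp, tn

def scores(tp, fn, fp, tn):
    """Standard binary classification metrics from the four counts."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "recall": recall, "precision": precision, "f1": f1}

# Hypothetical matrix, not taken from the quiz
matrix = [[40, 10],
          [5, 45]]

as_actual_rows = scores(*binary_counts(matrix, rows_are_actual=True))
as_predicted_rows = scores(*binary_counts(matrix, rows_are_actual=False))

print(as_actual_rows)
print(as_predicted_rows)
```

Accuracy and F1 are the same under both readings, but recall and precision trade places, which is exactly the mistake a misread layout produces.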

3 / 10

QUESTION

A data scientist is creating a virus detection model utilising global pandemic data. She is evaluating the latest binary classification results. Given the following product requirements, which of these models would fulfil the criteria at the lowest cost, based on the confusion matrices given?

a) The test must support claims of “at least 90% accuracy”.
b) At least 90% of virus positives must be found.
c) The cost of a false negative is to be considered 4 times that of a false positive.

(Choose One)

TP=18, FN=2, FP=4, TN=76

TP=19, FN=1, FP=10, TN=70

TP=17, FN=3, FP=1, TN=79

TP=17, FN=3, FP=0, TN=80

4 / 10

QUESTION

Which of the following must be set by the user (required hyperparameters) for SageMaker’s built-in XGBoost algorithm? (Assume classification.)

(Select Two)

n_estimators

max_depth

num_round

num_class

5 / 10

QUESTION

Which of the following built-in SageMaker Algorithms can be used for dimensionality reduction?

(Select Two)

Random Cut Forest (RCF)

Principal Component Analysis (PCA)

t-Distributed Stochastic Neighbor Embedding (t-SNE)

Object2Vec

6 / 10

QUESTION

The Product Marketing team at a high street footwear brand want to add a new feature to their app that allows users to upload an image and have their shoes replaced in that image with the latest offering. Based on user permissions, the uploaded photo may later be shared on social media #upgrademytrainers

Which of the following services would help build this use case?

(Choose One)

SageMaker Image Classification

SageMaker Object Detection & Amazon Rekognition

SageMaker Semantic Segmentation & Amazon Rekognition

SageMaker Instance Segmentation with Pyramid Scene Parsing

7 / 10

QUESTION

A product owner is working with the development team to rapidly prototype a new image application. They’ll use SageMaker built-in algorithms to test feasibility. Which of the following would be a valid option for an image application?

(Choose One)

Image Classification using Parquet file type on a GPU instance class

Object2Vec utilising .tiff file format on a GPU instance class

Semantic Segmentation utilising image files, Pipe input mode and GPU instance class

Neural Topic Model utilising recordIO file type on a GPU instance class

8 / 10

QUESTION

Which of these SageMaker built-in algorithms support the SGD, Adam and RMSProp optimisers?

(Select Three)

Image Classifier & Object Detection

Seq2Seq

DeepAR

Object2Vec

9 / 10

QUESTION

A data scientist wishes to use SageMaker notebook instances to orchestrate AWS services whilst developing and deploying new models. In particular, she wishes to control an Amazon EMR Spark cluster.

What actions are needed?

(Select Two)

A notebook lifecycle configuration should be used to set the EMR master IP

The notebook kernel should be set to Sparkmagic (PySpark)

You can’t control a Spark EMR from SageMaker. Use EMR Notebooks or Zeppelin on the EMR directly, instead

An AWS Data Pipeline must be used to sync instance metadata between EMR and SageMaker

10 / 10

QUESTION

Match each real-world scenario to its corresponding statistical distribution.

A) The probability of exactly x pet owners, selected at random, being men
B) The probability of catching x fish in h hours, given an average hourly catch of y
C) The distribution of adult male heights in Germany

A) Poisson B) Normal C) Binomial

A) Binomial B) Normal C) Poisson

A) Normal B) Binomial C) Poisson

A) Binomial B) Poisson C) Normal
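For reference, the three distribution families named in the options can be sketched with the standard library alone. Binomial counts successes in a fixed number of independent yes/no trials, Poisson counts events in a fixed interval at a known average rate, and Normal describes continuous values clustered symmetrically around a mean. The function names and all numeric parameters below are illustrative assumptions.

```python
import math

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials, each with success probability p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, rate):
    """P(exactly k events in an interval whose expected event count is `rate`)."""
    return math.exp(-rate) * rate**k / math.factorial(k)

def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))

# Hypothetical parameters chosen only to exercise the functions
p_binomial = binomial_pmf(k=3, n=10, p=0.5)        # 3 successes in 10 trials
p_poisson = poisson_pmf(k=4, rate=2.0 * 3)         # 4 events in 3 hours at 2 per hour
d_normal = normal_pdf(x=180.0, mean=180.0, std=7.0)

print(p_binomial, p_poisson, d_normal)
```

Note how the Poisson rate is the hourly average multiplied by the number of hours, which is the only parameter that distribution needs.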

Metrics for the four options in question 3 (Model 1 to Model 4, in the order listed):

Metric	Calculation	Model 1	Model 2	Model 3	Model 4
Accuracy	(TP+TN)/(TP+FN+FP+TN)	0.94	0.89	0.96	0.97
Recall	TP/(TP+FN)	0.90	0.95	0.85	0.85
Precision	TP/(TP+FP)	0.82	0.66	0.94	1.00
Cost	FP+(4*FN)	12	14	13	12
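The figures in this table can be reproduced from the four confusion matrices in the virus detection question. The sketch below assumes that Model 1 to Model 4 correspond to the answer options in the order listed (the accuracy values are consistent with that), and that cost is measured in units of one false positive, so each false negative counts 4. The names `models`, `evaluate` and `qualifying` are illustrative.

```python
FN_WEIGHT = 4  # requirement c): a false negative costs 4 times a false positive

# Confusion matrix counts for the four answer options, in the order listed
models = {
    "Model 1": dict(tp=18, fn=2, fp=4, tn=76),
    "Model 2": dict(tp=19, fn=1, fp=10, tn=70),
    "Model 3": dict(tp=17, fn=3, fp=1, tn=79),
    "Model 4": dict(tp=17, fn=3, fp=0, tn=80),
}

def evaluate(tp, fn, fp, tn):
    """Compute the table's metrics from raw confusion matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "recall": tp / (tp + fn),
        "precision": tp / (tp + fp),
        "cost": fp + FN_WEIGHT * fn,
    }

results = {name: evaluate(**counts) for name, counts in models.items()}

# Requirements a) and b): at least 90% accuracy and at least 90% recall
qualifying = [
    name for name, m in results.items()
    if m["accuracy"] >= 0.90 and m["recall"] >= 0.90
]

for name, m in results.items():
    print(name, {k: round(v, 2) for k, v in m.items()}, "meets a) and b):", name in qualifying)
```

Filtering on the two hard requirements first and only then comparing cost keeps a cheap but non-compliant model from being picked by mistake.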
