Know your Zeppelin from your Jupyter? How about Bring Your Own Algorithm vs Built-In? Give these questions a go and see whether you’re ready for certification.
AWS Machine Learning – Quiz 4
1 / 8
The development team want to make use of SageMaker auto-scaling. Where should this be configured?
2 / 8
A data & development team have built a deep learning model on a SageMaker TensorFlow Framework container. However, the model is performing poorly and training is frequently failing.
Which of the following approaches would help resolve the issue?
Although ‘Relu’ may help if there’s a vanishing gradient problem, this question is about obtaining the visibility needed to definitively isolate and resolve the issue – “Amazon SageMaker Debugger provides full visibility of model training by monitoring, recording, analyzing, and visualizing tensors of the training process. A tensor is defined as a high dimensional array of the machine learning and deep learning metrics such as weights, gradients, and losses; in other words, it is a collection of metrics continuously updated during the backpropagation and optimization process of training deep learning models.”
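To make the Debugger configuration concrete, the relevant pieces of a training job can be sketched as the raw fields the CreateTrainingJob API accepts (an illustrative minimal sketch only; the bucket path and rule evaluator image URI are hypothetical placeholders):

```python
# Illustrative sketch: the Debugger-related fields of a CreateTrainingJob
# request, expressed as a plain dict. Placeholders, not real values.
debugger_fields = {
    "DebugHookConfig": {
        # Where the captured tensors (weights, gradients, losses) are stored
        "S3OutputPath": "s3://my-bucket/debug-output",
        "CollectionConfigurations": [
            {"CollectionName": "gradients"},
            {"CollectionName": "losses"},
        ],
    },
    "DebugRuleConfigurations": [
        {
            # Built-in rule that flags the vanishing-gradient problem
            "RuleConfigurationName": "VanishingGradient",
            "RuleEvaluatorImage": "<aws-debugger-rules-image-uri>",
            "RuleParameters": {"rule_to_invoke": "VanishingGradient"},
        }
    ],
}

print(sorted(debugger_fields))
```

With this in place, Debugger evaluates the captured tensors against the rule as training runs, so a vanishing gradient is reported rather than guessed at.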
3 / 8
A developer is choosing between two different approaches for working with SageMaker – which of the following are true?
The AWS SDK for Python is Boto3; it does not support other languages (AWS publishes separate SDKs for those). Compared with the SageMaker Python SDK, Boto3 offers finer-grained control and access to the full range of AWS services, at the cost of more verbose code.
Sagemaker Python SDK https://sagemaker.readthedocs.io/en/stable/
AWS SDK for Python Boto 3 https://docs.aws.amazon.com/sagemaker/latest/APIReference/
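The verbosity difference can be sketched by looking at what a low-level CreateTrainingJob request requires. This is an illustrative fragment only; the job name, role ARN, bucket and image URI are hypothetical placeholders:

```python
# With Boto3 you assemble the full CreateTrainingJob request yourself:
low_level_request = {
    "TrainingJobName": "demo-job",
    "RoleArn": "arn:aws:iam::123456789012:role/SageMakerRole",
    "AlgorithmSpecification": {
        "TrainingImage": "<algorithm-or-framework-image-uri>",
        "TrainingInputMode": "File",  # or "Pipe"
    },
    "OutputDataConfig": {"S3OutputPath": "s3://my-bucket/output"},
    "ResourceConfig": {
        "InstanceType": "ml.m5.xlarge",
        "InstanceCount": 1,
        "VolumeSizeInGB": 30,
    },
    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
}

# The Boto3 call itself (not executed here) would be:
#   boto3.client("sagemaker").create_training_job(**low_level_request)
#
# whereas the SageMaker Python SDK collapses the same request into an
# estimator, roughly:
#   sagemaker.estimator.Estimator(image_uri, role, instance_count=1,
#                                 instance_type="ml.m5.xlarge").fit(inputs)

print(len(low_level_request))
```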
4 / 8
A product owner has made the development team aware of a number of built-in machine learning capabilities associated with various AWS Services. Match the machine learning capability with the AWS service.
a) Anomaly detection (RCF) + Hotspot detection
b) Duplicate detection (FindMatches ML)
c) Anomaly detection (RCF), Forecasting, Auto-Narratives
d) Active label learning
a) Anomaly detection (RCF) + Hotspot detection. These capabilities are built in to Kinesis Data Analytics, so presumably work in real time. https://docs.aws.amazon.com/kinesisanalytics/latest/dev/examples-machine.html
b) Duplicate detection (FindMatches ML). https://aws.amazon.com/about-aws/whats-new/2019/08/aws-glue-provides-findmatches-ml-transform-to-deduplicate/ “You can teach the FindMatches ML Transform your definition of a “duplicate” through examples, and it will use machine learning to identify other potential duplicates in your dataset”
c) Anomaly detection (RCF), Forecasting, Auto-Narratives. https://docs.aws.amazon.com/quicksight/latest/user/making-data-driven-decisions-with-ml-in-quicksight.html
d) Active label learning. https://aws.amazon.com/sagemaker/groundtruth/
“The model is able to get progressively better over time by continuously learning from labels created by human labellers. “
5 / 8
A data scientist is utilising SageMaker notebooks to explore various machine learning models with a subset of the data. Which of the following are true?
Pipe Mode is not relevant to training on local notebook instances.
6 / 8
A data scientist has been experimenting with a deep neural network, utilising Kaggle’s free Jupyter notebook environment, Keras and Tensorflow.
Which of the options corresponds to the minimal actions needed to get this running on SageMaker?
a. Load train & test data into S3
b. Modify your Kaggle script to accept model directory, train, test and host arguments, and to save the model
c. Add your Kaggle script (myscript.py) into your notebook instance
d. Register your container on ECR
e. Specify the container location in the SageMaker estimator
f. Create a SageMaker notebook and use myscript.py as the entry point in the TensorFlow estimator
g. Use SageMaker Kaggle Sync to transfer the script file and model artifacts to S3
h. Create a serve script
When you use SageMaker built-in algorithms, you’re using a fully managed container and code, and you reference it in the estimator by setting ‘container’. SageMaker also supports a range of frameworks, including TensorFlow, and ‘Script Mode’ provides a really straightforward way of taking a TensorFlow/Keras script you’ve written somewhere else and benefitting from SageMaker’s managed service, e.g. you can quickly deploy behind a managed endpoint. In short, with a few script modifications, just adding a .py script and a .ipynb notebook is sufficient to transfer your model to SageMaker. You can go further (for example if you wanted to use R, which isn’t supported directly) and ‘bring your own container’, in which case you would need to manage it in ECR. Note that sagemaker.tensorflow takes care of locating the Script Mode container, uploading your script to an S3 location and creating a SageMaker training job, so you don’t need to specify a container in the estimator (just your script as entry point). Finally, because you’re using an AWS-managed container and framework, you don’t need to specify a serve script (as you would for bring your own container).
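The script modifications in option b can be sketched like this. The SM_* environment variables are the ones SageMaker sets inside the training container; the example values passed to the parser here are hypothetical:

```python
import argparse
import os

def parse_args(argv=None):
    """Sketch of the argument handling a Script Mode training script needs.

    SageMaker launches the script with these values available both as CLI
    arguments and as SM_* environment variables inside the container.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_dir", type=str,
                        default=os.environ.get("SM_MODEL_DIR", "/opt/ml/model"))
    parser.add_argument("--train", type=str,
                        default=os.environ.get("SM_CHANNEL_TRAIN",
                                               "/opt/ml/input/data/train"))
    parser.add_argument("--test", type=str,
                        default=os.environ.get("SM_CHANNEL_TEST",
                                               "/opt/ml/input/data/test"))
    parser.add_argument("--hosts", type=str,
                        default=os.environ.get("SM_HOSTS", '["localhost"]'))
    return parser.parse_args(argv)

# Simulate the invocation SageMaker would perform:
args = parse_args(["--model_dir", "s3://my-bucket/model"])
print(args.model_dir)   # s3://my-bucket/model
print(args.train)       # /opt/ml/input/data/train
# ...the script would then train and save the model under args.model_dir
```

After these changes the same script runs unchanged both on Kaggle (with explicit arguments) and inside SageMaker’s managed container.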
7 / 8
Which of the following must be set by the user (required hyperparameters), for SageMaker’s built-in algorithm, Linear Learner? (assume classification)
The user must set predictor_type to binary_classifier, multiclass_classifier or regressor. num_classes is also required when predictor_type is multiclass_classifier (labels run from 0 to num_classes − 1).
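That rule can be sketched as a small check (a hypothetical helper for illustration, not part of the SageMaker SDK; the real service validates these hyperparameters server-side):

```python
def validate_linear_learner(hyperparameters):
    """Check Linear Learner's required hyperparameters (illustrative only)."""
    allowed = {"binary_classifier", "multiclass_classifier", "regressor"}
    predictor_type = hyperparameters.get("predictor_type")
    if predictor_type not in allowed:
        raise ValueError(f"predictor_type is required: one of {allowed}")
    # num_classes is only required for the multiclass case
    if predictor_type == "multiclass_classifier" and "num_classes" not in hyperparameters:
        raise ValueError("num_classes is required for multiclass_classifier")
    return True

print(validate_linear_learner({"predictor_type": "binary_classifier"}))  # True
print(validate_linear_learner({"predictor_type": "multiclass_classifier",
                               "num_classes": 3}))                       # True
```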
8 / 8
A product owner is working with a data science team who use Spark on AWS EMR for big data projects and have significant expertise with Spark. They have indicated a preference to work with Spark + MLLib to address the real-time predictions use case you’ve proposed.
Which of the following would you recommend as the most scalable and efficient way to proceed?
The team have significant expertise with Spark, which, together with the depth of Spark’s support for feature transformations and its high performance via in-memory caching, suggests Spark is a better candidate than Glue for the ETL stage. However, although MLlib provides some useful algorithms, there are a number of reasons not to use Spark itself for the machine learning stage.
Hence: “Use Spark for ETL, SageMaker to train & deploy models”.
This AWS Online Tech Talk really helps clarify the relationship between EMR Spark usage and SageMaker.