An overview of the new AWS machine learning certification
Previously, we looked at three specialty exams/certifications available for Amazon Web Services (AWS). A new specialty credential, AWS Certified Machine Learning – Specialty, has become available in addition to those existing three (Advanced Networking, Big Data, and Security), and it is worth taking a deeper look at.
Intended for those who want to validate their skills with machine learning models using the AWS cloud, exam MLS-C01 is aimed at candidates with one to two years of hands-on experience with ML in the AWS cloud. The exam currently costs $300, and the certification is good for two years (recertification exams are priced at $75).
Candidates have 170 minutes to answer the questions, which come in two formats: multiple choice (choose one of four given options) and multiple response (choose two or more of five given options). Scores range from 100 to 1000, with 750 being the minimum passing score.
While there is no limit on the number of times you can take the exam until you pass, there is a required waiting period of 14 days between each attempt (you don’t have the option, in other words, to retake the test while everything you missed is fresh in your mind). More information on exam policies, or the exams themselves, is available online.
A practice exam is also available from Amazon for $40. Taking it is highly recommended as a gauge of whether you are ready to sit for the live exam.
Bear in mind that it is not enough to just know machine learning: You also need a deep understanding of the AWS cloud. For that reason, Amazon recommends successful candidates have:
● The ability to express the intuition behind basic ML algorithms
● Experience performing basic hyperparameter optimization
● Experience with ML and deep learning frameworks
● The ability to follow model-training best practices
● The ability to follow deployment and operational best practices
The exam has four domains, and the following table shows those categories as well as the weighting and the topic areas beneath each:
Selected Amazon Machine Learning services
While a successful candidate will need to have a good knowledge of every topic listed in the table above, the following lists, in alphabetic order, some of the ML services offered with AWS that may not be readily apparent from looking at the domains:
Amazon Comprehend is an NLP (natural language processing) service that uses machine learning to find insights and relationships in text. It identifies what language the text is in, and then extracts key phrases, as well as places, people, brands, or events. It looks at how positive or negative the text is and can then automatically organize a set of text files by topic. Complementing this is Amazon Comprehend Medical, which “makes it easy to use machine learning to extract relevant medical information from unstructured text”.
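To make the Comprehend description more concrete, here is a minimal Python sketch of the language, sentiment, and key-phrase steps. It assumes the boto3 SDK and valid AWS credentials; the function names `summarize_text` and `make_comprehend_client`, and the default region, are illustrative choices and not part of any exam material.

```python
def summarize_text(client, text):
    """Return language, sentiment, and key phrases for `text`.

    `client` is assumed to behave like boto3.client("comprehend").
    """
    # Step 1: identify the dominant language (pick the highest-scoring guess).
    languages = client.detect_dominant_language(Text=text)["Languages"]
    language_code = max(languages, key=lambda lang: lang["Score"])["LanguageCode"]

    # Step 2: rate how positive or negative the text is.
    sentiment = client.detect_sentiment(
        Text=text, LanguageCode=language_code
    )["Sentiment"]

    # Step 3: pull out the key phrases.
    phrases = client.detect_key_phrases(
        Text=text, LanguageCode=language_code
    )["KeyPhrases"]

    return {
        "language": language_code,
        "sentiment": sentiment,
        "key_phrases": [phrase["Text"] for phrase in phrases],
    }


def make_comprehend_client(region_name="us-east-1"):
    # Hypothetical helper; boto3 is imported lazily so summarize_text
    # can be tried with a stand-in client when the SDK is not installed.
    import boto3
    return boto3.client("comprehend", region_name=region_name)
```

With credentials in place, something like `summarize_text(make_comprehend_client(), "The new exam was tough but fair.")` should return a small dictionary; the exact phrases and sentiment depend on the service.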
Amazon EMR offers a managed Hadoop framework that “makes it easy, fast, and cost-effective to process vast amounts of data across dynamically scalable Amazon EC2 instances.” With Amazon EMR, you can also work with other popular frameworks including Apache Spark, HBase, Presto, and Flink. It makes it simple to interact with data in AWS data stores including Amazon S3 and Amazon DynamoDB.
Amazon Forecast is a “fully managed service that uses machine learning to deliver highly accurate forecasts.” Being fully managed, there are no servers that you need to provision, and no machine learning models that you need to work with. It examines the historical data that you provide, identifies what it deems meaningful, and produces a forecasting model “capable of making predictions that are up to 50 percent more accurate than looking at time series data alone.”
Amazon Lex is a service for building conversational interfaces into any application using voice and text (think Alexa). It provides the advanced deep learning functionalities of ASR (automatic speech recognition) needed for converting speech to text, and NLU (natural language understanding).
Amazon Personalize is intended to make it simpler for “developers to create individualized product and content recommendations for customers using their applications.” You feed it an activity stream (such things as page views, signups, and purchases) along with items you want to recommend (articles, products, videos, etc.).
Optionally, you can arm Amazon Personalize with more demographic data on users (age, location, and so forth) and Amazon Personalize will identify what is meaningful, “select the right algorithms”, and personalize a custom model for your data.
Amazon Polly (as in, “Wanna cracker”; get it?) translates “text into lifelike speech, allowing you to create applications that talk, and build entirely new categories of speech-enabled products”. The text-to-speech service uses deep learning technologies to synthesize speech into what sounds like a human voice.
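As a rough illustration of the text-to-speech call, the sketch below asks Polly for an MP3 and writes it to disk. It assumes a boto3 Polly client is passed in; the function name `text_to_mp3` and the choice of the "Joanna" voice are illustrative assumptions.

```python
def text_to_mp3(client, text, out_path, voice_id="Joanna"):
    """Synthesize `text` to an MP3 file and return the number of bytes written.

    `client` is assumed to behave like boto3.client("polly").
    """
    response = client.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    # AudioStream is a file-like object holding the synthesized audio.
    audio = response["AudioStream"].read()
    with open(out_path, "wb") as audio_file:
        audio_file.write(audio)
    return len(audio)


# Typical use (requires boto3 and AWS credentials):
#   import boto3
#   text_to_mp3(boto3.client("polly"), "Hello from the cloud", "hello.mp3")
```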
Amazon Rekognition is an API that can “identify the objects, people, text, scenes, and activities, as well as detect any inappropriate content”. Its goal is to make it easy to add image and video analysis to applications. It can also be used for facial analysis and facial recognition as well as to detect, analyze, and compare faces for user verification, cataloging, people counting, and public safety usages.
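For a sense of what the image-analysis side looks like in practice, here is a small sketch that asks Rekognition to label an image already stored in Amazon S3. The function name `labels_for_s3_image`, the bucket and key arguments, and the default thresholds are assumptions made for illustration.

```python
def labels_for_s3_image(client, bucket, key, min_confidence=80.0, max_labels=10):
    """Return (label name, confidence) pairs for an image stored in S3.

    `client` is assumed to behave like boto3.client("rekognition").
    """
    response = client.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=max_labels,
        MinConfidence=min_confidence,
    )
    # Each label carries a name and a confidence percentage.
    return [
        (label["Name"], round(label["Confidence"], 1))
        for label in response["Labels"]
    ]


# Typical use (requires boto3 and AWS credentials; names are placeholders):
#   import boto3
#   labels_for_s3_image(boto3.client("rekognition"), "my-bucket", "photos/dog.jpg")
```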
Amazon SageMaker is a “fully-managed service that covers the entire machine learning workflow to label and prepare your data, choose an algorithm, train the algorithm, tune and optimize it for deployment, make predictions, and take action. Your models get to production faster with much less effort and lower cost”. In other words, of all the services listed here, this is the one you absolutely must master. It includes a number of components, a few of which are:
● Amazon SageMaker Ground Truth, which is used for building and managing training datasets quickly.
● Amazon SageMaker Neo, which automatically optimizes a trained model built with a popular framework for the hardware platform you specify.
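The "make predictions" end of the SageMaker workflow can be sketched as a call to an endpoint that has already been deployed. The example below is only an illustration: it assumes a boto3 `sagemaker-runtime` client, an endpoint that accepts CSV input, and a hypothetical function name `predict_csv`. The shape of the response depends entirely on the model behind the endpoint.

```python
def predict_csv(runtime_client, endpoint_name, rows):
    """Send rows of feature values to a deployed endpoint as CSV.

    `runtime_client` is assumed to behave like
    boto3.client("sagemaker-runtime"). Returns the raw response text.
    """
    # Serialize each row as one comma-separated line.
    body = "\n".join(",".join(str(value) for value in row) for row in rows)
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=body.encode("utf-8"),
    )
    # The response body is a stream; its format is defined by the model.
    return response["Body"].read().decode("utf-8")


# Typical use (requires boto3, credentials, and a live endpoint):
#   import boto3
#   predict_csv(boto3.client("sagemaker-runtime"), "my-endpoint", [[5.1, 3.5]])
```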
Amazon Textract is a service “that automatically extracts text and data from scanned documents. Amazon Textract goes beyond simple optical character recognition (OCR) to also identify the contents of fields in forms and information stored in tables.” It uses machine learning to “read” documents and extract the text and data without the need for manual interaction or custom coding.
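A short sketch of the plain text-extraction case follows. It assumes a boto3 Textract client and a scanned image stored in S3; the function name `extract_lines` is illustrative. Forms and tables would go through a different call (`analyze_document`), which is not shown here.

```python
def extract_lines(client, bucket, key):
    """Return the lines of text Textract detects in a document stored in S3.

    `client` is assumed to behave like boto3.client("textract").
    """
    response = client.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    # The response is a flat list of blocks (PAGE, LINE, WORD);
    # keep only the LINE blocks, which carry whole lines of text.
    return [
        block["Text"]
        for block in response["Blocks"]
        if block["BlockType"] == "LINE"
    ]


# Typical use (requires boto3 and AWS credentials; names are placeholders):
#   import boto3
#   extract_lines(boto3.client("textract"), "my-bucket", "scans/invoice.png")
```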
Amazon Transcribe is an ASR (automatic speech recognition) service used to add speech-to-text capability to applications. With Transcribe, a user can “analyze audio files stored in Amazon S3 and have the service return a text file of the transcribed speech. You can also send a live audio stream to Amazon Transcribe and receive a stream of transcripts in real time.”
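The batch case described above (an audio file in S3, a transcript returned) can be sketched as starting a job and polling until it finishes. This assumes a boto3 Transcribe client; the function name `transcribe_s3_audio`, the polling interval, and the retry limit are illustrative assumptions, and the streaming case is not covered.

```python
import time


def transcribe_s3_audio(client, job_name, media_uri, media_format="mp3",
                        language_code="en-US", poll_seconds=5, max_polls=120):
    """Start a transcription job and return the URI of the finished transcript.

    `client` is assumed to behave like boto3.client("transcribe").
    """
    client.start_transcription_job(
        TranscriptionJobName=job_name,
        Media={"MediaFileUri": media_uri},
        MediaFormat=media_format,
        LanguageCode=language_code,
    )
    # Jobs run asynchronously, so poll until the status settles.
    for _ in range(max_polls):
        job = client.get_transcription_job(
            TranscriptionJobName=job_name
        )["TranscriptionJob"]
        status = job["TranscriptionJobStatus"]
        if status == "COMPLETED":
            return job["Transcript"]["TranscriptFileUri"]
        if status == "FAILED":
            raise RuntimeError(job.get("FailureReason", "transcription failed"))
        time.sleep(poll_seconds)
    raise TimeoutError("transcription job did not finish in time")
```

The returned URI points at a JSON file containing the transcript, which still has to be downloaded and parsed.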
Amazon Translate is a neural machine translation service that “delivers fast, high-quality, and affordable language translation.” It uses deep learning models to produce more natural sounding translation than the typical statistical translation algorithms. With it, you can localize content for international users, and efficiently translate large volumes of text.
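As a small illustration of the localization use case, the sketch below translates a dictionary of user-interface strings one at a time. It assumes a boto3 Translate client; the function name `localize_strings` and the sample keys are hypothetical.

```python
def localize_strings(client, strings, target_language, source_language="en"):
    """Translate each value in `strings` and return a new dictionary.

    `client` is assumed to behave like boto3.client("translate").
    """
    localized = {}
    for key, text in strings.items():
        response = client.translate_text(
            Text=text,
            SourceLanguageCode=source_language,
            TargetLanguageCode=target_language,
        )
        localized[key] = response["TranslatedText"]
    return localized


# Typical use (requires boto3 and AWS credentials):
#   import boto3
#   localize_strings(boto3.client("translate"),
#                    {"greeting": "Welcome back"}, "es")
```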
Also worth knowing
In addition to the core services/APIs listed above, a few other AWS-related products to be aware of are:
Amazon DeepLens — “(A) fully programmable video camera, tutorials, code, and pre-trained models designed to expand deep learning skills.”
Amazon DeepRacer — “(A) fully autonomous 1/18th scale race car driven by reinforcement learning, 3D racing simulator, and global racing league.”
Amazon Elastic Inference — “(A)llows you to attach low-cost GPU-powered acceleration to Amazon EC2 and Amazon SageMaker instances to reduce the cost of running deep learning inference by up to 75 percent. Amazon Elastic Inference supports TensorFlow, Apache MXNet, and ONNX models.”
AWS Inferentia — “(A) machine learning inference chip designed to deliver high performance at low cost. AWS Inferentia will support the TensorFlow, Apache MXNet, and PyTorch deep learning frameworks, as well as models that use the ONNX format.”
Apache MXNet on AWS — “(A) fast and scalable training and inference framework with an easy-to-use, concise API for machine learning.” MXNet includes the Gluon interface to help developers get started with deep learning on the cloud, on edge devices, and on mobile apps.
This month, we looked at the newest specialty exam in the Amazon Web Services (AWS) family. In addition to the overview of the machine learning exam, we also examined some of the services offered with AWS that a candidate needs to know to prep for the exam.