Run an EMR Cluster on Spot Instances in 5 Steps

Introduction

In this tutorial, you will learn how to clone your Elastic MapReduce (EMR) clusters into an Elastigroup. AWS EMR provides a managed Big Data framework that lets you add or remove cluster capacity to match your application's workload. EMR supports Hadoop, Apache Spark, and other popular distributed frameworks. Running your EMR clusters on Elastigroup provides you with the significant discounts that Spot instances offer while maintaining 100% availability.

This tutorial focuses on cloning an existing EMR cluster into an Elastigroup. Elastigroup also enables you to wrap your existing cluster with Spot instance Task nodes. Head to our tutorial on Wrapping EMR Clusters to learn more.

Prerequisites:

  1. A verified Spot by NetApp account.
  2. A running EMR cluster.
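
Before opening the wizard, it can help to confirm that the cluster you intend to clone is actually up and to note its cluster ID. The following is a minimal sketch, not part of the Elastigroup workflow itself: it assumes you have the boto3 library and AWS credentials available, and `find_active_clusters` is a hypothetical helper name. It treats both the RUNNING and WAITING states as "running".

```python
def find_active_clusters(emr_client):
    """Return (cluster_id, name, state) for every RUNNING or WAITING cluster.

    `emr_client` is expected to behave like boto3.client("emr").
    """
    clusters = []
    kwargs = {"ClusterStates": ["RUNNING", "WAITING"]}
    while True:
        page = emr_client.list_clusters(**kwargs)
        for cluster in page.get("Clusters", []):
            clusters.append(
                (cluster["Id"], cluster["Name"], cluster["Status"]["State"])
            )
        # ListClusters is paginated; a "Marker" means more results remain.
        marker = page.get("Marker")
        if not marker:
            return clusters
        kwargs["Marker"] = marker


# Example usage (assumes boto3 is installed; the region is a placeholder):
#   import boto3
#   emr = boto3.client("emr", region_name="us-east-1")
#   for cluster_id, name, state in find_active_clusters(emr):
#       print(cluster_id, name, state)
```

The cluster ID printed here is the value you would look for when choosing the "Origin Cluster" in Step 3.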

Step 1: Open The EMR Creation Wizard

Log in to the Elastigroup Console (console.spotinst.com) and navigate to the Creation Wizard by clicking the Create button in the Elastigroups tab.

In the Creation Wizard, select EMR.

Step 2: Add Elastigroup Description

Set the name and region of the Elastigroup. Click Next.

Step 3: Configure Strategy & Compute

  1. Under Strategy, select Clone and provide an “Origin Cluster” for Elastigroup to clone.
  2. For the Master, Core and Task nodes, select the Instance Types, Lifecycle (Spot/On-Demand), Target and Minimum/Maximum number of instances. To ensure Spot availability, select multiple Instance Types.
  3. To ensure widespread deployment, select as many Availability Zones (AZ) as possible and select Subnets within each AZ.
  4. (Optional) Assign tags to the Elastigroup.
  5. (Optional) Advanced Settings:
    • Set a Root Volume Size (GB)

      {Warning: Decreasing the root volume size is not recommended and might prevent the instance group or the cluster from launching properly}

    • Include EMR Steps

      {Caution: This adds any steps configured in the original cluster to the clone}
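
The guidance in this step (several instance types for Spot groups, and a Target that sits between Minimum and Maximum) can be checked before you enter the values in the wizard. The sketch below is illustrative only: `NodeGroup`, `check_node_group`, and the `"SPOT"`/`"ON_DEMAND"` labels are hypothetical names chosen for this example, not Elastigroup identifiers.

```python
from dataclasses import dataclass
from typing import List

# Assumed labels for the two Lifecycle options in the wizard.
VALID_LIFECYCLES = ("SPOT", "ON_DEMAND")


@dataclass
class NodeGroup:
    role: str                  # "master", "core", or "task"
    instance_types: List[str]  # e.g. ["m5.xlarge", "m4.xlarge"]
    lifecycle: str             # one of VALID_LIFECYCLES
    target: int
    minimum: int
    maximum: int


def check_node_group(group: NodeGroup) -> List[str]:
    """Return a list of human-readable problems; empty means the group looks sane."""
    problems = []
    if group.lifecycle not in VALID_LIFECYCLES:
        problems.append(f"{group.role}: unknown lifecycle {group.lifecycle!r}")
    if not group.minimum <= group.target <= group.maximum:
        problems.append(
            f"{group.role}: target {group.target} is outside "
            f"[{group.minimum}, {group.maximum}]"
        )
    # The tutorial recommends multiple instance types to ensure Spot availability.
    if group.lifecycle == "SPOT" and len(set(group.instance_types)) < 2:
        problems.append(f"{group.role}: select multiple instance types for Spot")
    return problems
```

For example, a Task group with a single instance type and a Target above its Maximum would come back with two problems, one for each rule.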

Step 4: Scaling Policies (Optional)

  1. Elastigroup offers a wide variety of scaling options for EMR, both for Core and Task nodes. Assign the ones relevant to your environment.
  2. Click Next.

Step 5: Review and Create

The Creation Wizard prepares a JSON template to launch an Elastigroup with the EMR configuration. All that’s left to do is click Create!
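
To give a feel for how the choices from Steps 2 and 3 could be gathered into a single JSON document, here is a rough sketch. The key names (`strategy`, `clone`, `originClusterId`, `compute`, `groups`) and the `build_template` helper are placeholders invented for this illustration; they are not taken from the template the wizard produces, so treat the JSON shown in the wizard or the API Docs as the authority.

```python
import json


def build_template(name, region, origin_cluster_id, groups):
    """Collect the wizard inputs into one JSON string.

    All key names below are illustrative placeholders, not the real schema.
    """
    template = {
        "name": name,
        "region": region,
        "strategy": {"clone": {"originClusterId": origin_cluster_id}},
        "compute": {"groups": groups},
    }
    return json.dumps(template, indent=2, sort_keys=True)


# Hypothetical values; the cluster ID is a placeholder, not a real cluster.
example = build_template(
    name="emr-on-spot",
    region="us-east-1",
    origin_cluster_id="j-XXXXXXXXXXXXX",
    groups={
        "task": {
            "instanceTypes": ["m5.xlarge", "m4.xlarge"],
            "lifecycle": "SPOT",
            "target": 4,
            "minimum": 0,
            "maximum": 10,
        }
    },
)
```

Keeping the inputs in one structure like this makes it easy to compare what you intended with what the Review screen displays.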

You’ve now created an EMR cluster on Elastigroup and are in the Elastigroup Manager view, where you can review, manage and monitor your running Elastigroup.

Congratulations!

You have now learned how to create an EMR cluster on Spot instances with Spot by NetApp, letting you:

  • Cut your costs by up to 80% while maintaining high availability.
  • Run on Spot instances with zero overhead and no servers to manage: the Spot Elastigroup platform manages your infrastructure for you.

Next Steps

  • Create a Wrapped EMR Cluster on Elastigroup to run Task nodes for your existing EMR cluster on Spot instances.
  • Configure Elastigroup’s Scaling Policies for EMR Core and Task nodes.
  • Check out our API Docs to learn how to clone your EMR cluster into an Elastigroup using RESTful APIs.