Skip to content
This repository is currently being migrated. It's locked while the migration is in progress.

Latest commit

 

History

History
135 lines (93 loc) · 5.96 KB

File metadata and controls

135 lines (93 loc) · 5.96 KB

Sync Computing Client Tools

This repo contains Sync Computing tools used to feed the required information into the Sync Autotuner for Apache Spark.

EMR Instruction

Step 1: Retrieve and paste your cluster info (get_cluster_config.sh)

get_cluster_config.sh parses results of AWS CLI 'describe-cluster' and 'list-instances' command into a format accepted by our Prediction UI.

Pre-requisites

The script relies on AWS CLI to retreive the data.

AWS CLI installation instructions

Usage

./get_cluster_config.sh -i <cluster-id> -r <region>

Example Command

./get_cluster_config.sh -i j-3WBO3J7R6PVS -r us-east-1

Example Output (copy and paste into Autotuner step #1)

{"cluster": { "ID": "j-3M0WNYKET5TIV", "Timeline": { "CreationDateTime": 1639990568.943, "ReadyDateTime": 1639991128.293, "EndDateTime": 1640004773.038 }, "AvailabilityZone": "us-east-1d", "InstanceCollectionType": "INSTANCE_GROUP", "InstanceGroups": [ { "InstanceGroupType": "MASTER", "Market": "ON_DEMAND", "InstanceType": "m5.xlarge", "RequestedInstanceCount": 1, "VolumeSpecification": [ { "SizeInGB": 32 }, { "SizeInGB": 32 } ] }, { "InstanceGroupType": "CORE", "Market": "SPOT", "InstanceType": "m5.4xlarge", "RequestedInstanceCount": 4, "VolumeSpecification": [ { "SizeInGB": 20 } ] } ] }, "region": "us-east-1"}

<cluster id> is the cluster id that you are interested in parsing. The cluster id is prefixed with 'j-'. <region> represents the region the cluster ran in.

New EMR console Old EMR Console

Step 2: Retrieve EMR Spark logs and upload into Autotuner

  1. Go to the EMR console in AWS, and find the cluster that ran the job you are interested in optimizing. Click on the cluster name to view details of the cluster.

  2. Verify that you have spark.eventLog.enabled set to true for any jobs you are interested in optimizing. The Sync Autotuner needs a Spark event log from a job run in order to provide optimized cluster configurations for the job.

New EMR console Old EMR Console
  1. If spark.eventLog.dir is set and specifies an S3 location then download the Spark event log from the specified S3 location. Skip to Step 7.

  2. If spark.eventLog.dir is not set, follow the steps below to download the Spark event log from the Spark history server.

  3. Once you are in the cluster information page, click on the “Application user interfaces” tab, and click on “Spark history server” (in red below) under “Persistent application user interfaces.”

New EMR console Old EMR Console
  1. A new tab should open up with the Spark history server. It may take a minute to load. Click the download button under the event log column to download the Spark event log.

  1. Upload the Spark event log into the Autotuner.

Databricks Tools

These tools are not currently necessary to run the Autotuner for Databricks eventlogs.

Retrieving your Databricks cluster eventlog

get_databrick_cluster_eventlog.sh makes the appropriate Databricks CLI calls and combines the data into a single output file.

Pre-requisites

The script relies on Databricks CLI to retrieve the cluster events

Databricks CLI Installation instructions

Usage

./get_databricks_cluster_eventlog.sh -i <cluster-id> [-r <results directory>]

Example

./get_databricks_cluster_eventlog.sh -i 2631-121255-j612dkia -r /path/to/results

If the -r flag is excluded, then a new directory databricks_cluster_eventlogs will be created in the same directory as this script and results will be saved there.

Instructions for finding a cluster-id through the Databricks console can be found here. Alternatively, the cluster-id associated with a given Databricks Spark eventlog can be found opening the eventlog in a text editor and searching for the string spark.databricks.clusterUsageTags.clusterId.

Example Output

{
  "events": [
    {
      "cluster_id": "2631-121255-j612dkia",
      "timestamp": 1659615195995,
      "type": "CREATING",
      "details": {
        "cluster_size": {
          "autoscale": {
            "min_workers": 2,
            "max_workers": 8
          }
        },
        "user": "user@company.com",
        "job_run_name": "job-1234567-run-1000"
      }
    },
    ...,
    {
      "cluster_id": "2631-121255-j612dkia",
      "timestamp": 1659616529106,
      "type": "TERMINATING",
      "details": {
        "reason": {
          "code": "JOB_FINISHED",
          "type": "SUCCESS"
        }
      }
    }
  ],
  "total_count": 22
}