BERT Large inference

This document has instructions for running BERT Large inference using Intel-optimized TensorFlow with Intel® Data Center GPU Max Series.

Software Requirements:

  • Intel® Data Center GPU Max Series

  • Follow the instructions to install the latest Intel® Extension for TensorFlow* (ITEX) version and other prerequisites.

  • Intel® oneAPI Base Toolkit: install the following components of the Intel® oneAPI Base Toolkit:

    • Intel® oneAPI DPC++ Compiler

    • Intel® oneAPI Threading Building Blocks (oneTBB)

    • Intel® oneAPI Math Kernel Library (oneMKL)

    • Follow instructions to download and install the latest oneAPI Base Toolkit.

    • Set environment variables for the Intel® oneAPI Base Toolkit. The default installation location {ONEAPI_ROOT} is /opt/intel/oneapi for the root account and ${HOME}/intel/oneapi for other accounts:

      source {ONEAPI_ROOT}/compiler/latest/env/vars.sh
      source {ONEAPI_ROOT}/mkl/latest/env/vars.sh
      source {ONEAPI_ROOT}/tbb/latest/env/vars.sh
      source {ONEAPI_ROOT}/mpi/latest/env/vars.sh
      source {ONEAPI_ROOT}/ccl/latest/env/vars.sh
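The per-component sourcing above can be wrapped in a small loop. This is a sketch, not an official setup script: the component list and default install locations come from the steps above, while the existence checks and messages are my own additions.

```shell
# Pick the default oneAPI root for the current user (per the note above),
# unless ONEAPI_ROOT is already set in the environment.
if [ "$(id -u)" -eq 0 ]; then
  ONEAPI_ROOT="${ONEAPI_ROOT:-/opt/intel/oneapi}"
else
  ONEAPI_ROOT="${ONEAPI_ROOT:-${HOME}/intel/oneapi}"
fi

# Source each component's vars.sh if it exists; warn otherwise.
for component in compiler mkl tbb mpi ccl; do
  vars="${ONEAPI_ROOT}/${component}/latest/env/vars.sh"
  if [ -f "${vars}" ]; then
    # shellcheck disable=SC1090
    . "${vars}"
    echo "sourced ${vars}"
  else
    echo "warning: ${vars} not found (is the oneAPI Base Toolkit installed?)" >&2
  fi
done
```

Alternatively, sourcing {ONEAPI_ROOT}/setvars.sh configures all installed components in one step.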

Datasets

BERT Large Data

Download and unzip the BERT Large uncased (whole word masking) model from the Google BERT repo. Then, download the Stanford Question Answering Dataset (SQuAD) file dev-v1.1.json into the wwm_uncased_L-24_H-1024_A-16 directory that was just unzipped.

wget https://storage.googleapis.com/bert_models/2019_05_30/wwm_uncased_L-24_H-1024_A-16.zip
unzip wwm_uncased_L-24_H-1024_A-16.zip

wget https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json -P wwm_uncased_L-24_H-1024_A-16

Set the DATASET_DIR environment variable to point to that directory when running BERT Large inference using the SQuAD data.
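Before launching a run, it can help to sanity-check the dataset directory. This is a sketch under assumptions: vocab.txt and bert_config.json are expected to ship inside the wwm_uncased_L-24_H-1024_A-16 zip, and dev-v1.1.json was downloaded into the same directory in the step above.

```shell
# Sketch: verify DATASET_DIR contains the files the steps above produce.
# File names other than dev-v1.1.json are assumptions about the zip contents.
DATASET_DIR="${DATASET_DIR:-$(pwd)/wwm_uncased_L-24_H-1024_A-16}"
missing=0
for f in vocab.txt bert_config.json dev-v1.1.json; do
  if [ ! -e "${DATASET_DIR}/${f}" ]; then
    echo "missing: ${DATASET_DIR}/${f}" >&2
    missing=1
  fi
done
if [ "${missing}" -eq 0 ]; then
  echo "DATASET_DIR looks complete: ${DATASET_DIR}"
fi
```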

Quick Start Scripts

Script name    Description
benchmark.sh   Runs BERT Large FP16 and FP32 inference.
accuracy.sh    Runs BERT Large FP16 and FP32 inference in accuracy mode.

Run the model

Install the following prerequisites:

  • Create and activate a Python virtual environment.

    virtualenv -p python <virtualenv_name>
    source <virtualenv_name>/bin/activate
  • Download the frozen graph model file, and set the FROZEN_GRAPH environment variable to point to where it was saved:

    wget https://storage.googleapis.com/intel-optimized-tensorflow/models/v2_7_0/fp32_bert_squad.pb
    export FROZEN_GRAPH=$(pwd)/fp32_bert_squad.pb
  • Download and unzip the pretrained model directory (skip the download if you already unzipped it in the Datasets step), and set the PRETRAINED_DIR environment variable to point to where it was saved:

    wget https://storage.googleapis.com/bert_models/2019_05_30/wwm_uncased_L-24_H-1024_A-16.zip
    unzip wwm_uncased_L-24_H-1024_A-16.zip
    export PRETRAINED_DIR=$(pwd)/wwm_uncased_L-24_H-1024_A-16
  • Download the SQuAD files into a directory, and set the SQUAD_DIR environment variable to point to it:

    mkdir -p squad
    wget -P squad https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v1.1.json
    wget -P squad https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json
    wget -P squad https://raw.githubusercontent.com/allenai/bi-att-flow/master/squad/evaluate-v1.1.py
    export SQUAD_DIR=$(pwd)/squad
  • Clone the Model Zoo repository:

    git clone https://github.com/IntelAI/models.git

Run the model on Baremetal

Navigate to the BERT Large inference directory, and set environment variables:

cd models
export OUTPUT_DIR=<path where output log files will be written>
export PRECISION=<precision: fp16 or fp32>
export FROZEN_GRAPH=<path to pretrained model file (*.pb)>
export PRETRAINED_DIR=<path to pretrained directory>
export SQUAD_DIR=<path to squad directory>

# Set the `Tile` env variable only when running the `benchmark.sh` script:
export Tile=2

# Run quickstart script:
./quickstart/language_modeling/tensorflow/bert_large/inference/gpu/<script name>.sh
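Since the quickstart scripts only accept fp16 or fp32, a small guard before launching can catch a bad PRECISION value early. A minimal sketch; the valid values come from the PRECISION variable described above, and the messages are my own:

```shell
# Sketch: validate PRECISION before running the quickstart script.
# Valid values per the instructions above: fp16 or fp32.
PRECISION="${PRECISION:-fp16}"
case "${PRECISION}" in
  fp16|fp32)
    echo "running with PRECISION=${PRECISION}"
    ;;
  *)
    echo "error: unsupported PRECISION '${PRECISION}' (use fp16 or fp32)" >&2
    exit 1
    ;;
esac
```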

License

LICENSE