3 Getting Started With BigDL
Learning Objectives
You will be able to:
BigDL on Apache Spark*
- BigDL runs as standard Apache Spark* jobs
  - No changes to Apache Spark* required
- Each iteration of training runs as an Apache Spark* job
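To make the "one iteration, one job" idea concrete, here is a minimal sketch in plain Python. It is not BigDL or Spark code: ordinary lists stand in for Spark partitions, and the names `partition_gradient` and `train` are hypothetical. It assumes a synchronous, data-parallel scheme in which every partition computes a local gradient and the results are combined before a single weight update.

```python
# Illustrative only: plain Python lists stand in for Spark partitions.
def partition_gradient(w, partition):
    """Sum of squared-error gradients for the model y = w * x on one partition."""
    return sum(2.0 * (w * x - y) * x for x, y in partition)


def train(partitions, w=0.0, lr=0.01, iterations=200):
    n = sum(len(p) for p in partitions)
    for _ in range(iterations):
        # One iteration plays the role of one job:
        # every partition computes its local gradient...
        local_grads = [partition_gradient(w, p) for p in partitions]
        # ...and the gradients are aggregated into a single weight update.
        w -= lr * sum(local_grads) / n
    return w


# Toy data generated from y = 3x, split across two "partitions".
data = [(float(x), 3.0 * x) for x in range(1, 9)]
partitions = [data[:4], data[4:]]
w = train(partitions)
print(w)
```

On this toy data the learned weight should approach 3, the slope used to generate it.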
*Other names and brands may be claimed as the property of others.
Running BigDL
Running BigDL: Natively
Requirements
- JDK 8
- Apache Spark* v1.6 or v2.x (2.2, or 2.3 and above, recommended)
- BigDL
Apache Spark* Run Modes
Apache Spark* Run Mode: Local
Apache Spark* Run Mode: Distributed
Running BigDL Natively on Apache Spark*
# install bigDL
export BIGDL_HOME="/path/to/bigDL"
# install Spark
export SPARK_HOME="/path/to/spark"
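Before launching anything, it can help to confirm that both variables are actually set. The check below is a small sketch using only the Python standard library; the helper name `missing_env_vars` is hypothetical and not part of BigDL or Spark.

```python
import os


def missing_env_vars(names=("BIGDL_HOME", "SPARK_HOME"), env=None):
    """Return the variables from `names` that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Set these before launching BigDL:", ", ".join(missing))
```

Passing a dictionary as `env` lets the same helper be exercised without touching the real environment.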
BigDL and Jupyter*
Apache Spark*
Jupyter* Kernel + BigDL Libraries
Running Jupyter* With BigDL
# install bigDL
export BIGDL_HOME="/path/to/bigDL"
# install Spark
export SPARK_HOME="/path/to/spark"
Running BigDL on Docker*
# 3.3 - running BigDL on Docker
Running BigDL in the Cloud
BigDL: 'Hello World' – Python*
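from bigdl.nn.layer import Sequential, Linear, ReLU, LogSoftMax

model = Sequential()
# Hidden layer: 4 inputs -> 4 units, with ReLU activation
model.add(Linear(4, 4))
model.add(ReLU())
# Output layer: 4 units -> 3 classes, as log-probabilities
model.add(Linear(4, 3))
model.add(LogSoftMax())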
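To see what this stack computes, the forward pass can be sketched in NumPy alone. This is an illustration under stated assumptions rather than BigDL's implementation: the weights are random, the biases are zero, and the helper names `linear`, `relu`, and `log_softmax` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def linear(x, w, b):
    """Affine map: one row of x per sample, one row of w per output unit."""
    return x @ w.T + b


def relu(x):
    return np.maximum(x, 0.0)


def log_softmax(x):
    # Subtract the row maximum first for numerical stability.
    shifted = x - x.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


# Shapes mirror Linear(4, 4) followed by Linear(4, 3).
w1, b1 = rng.normal(size=(4, 4)), np.zeros(4)
w2, b2 = rng.normal(size=(3, 4)), np.zeros(3)

batch = rng.random((5, 4))  # 5 samples with 4 features each
log_probs = log_softmax(linear(relu(linear(batch, w1, b1)), w2, b2))
print(log_probs.shape)
```

Each output row holds three log-probabilities, so exponentiating a row and summing it gives 1.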
Layers: Linear
Hidden Layers
creating: createInput
Reverse Layer
import numpy as np
from bigdl.nn.layer import Reverse

module = Reverse(dimension=1)
input = np.random.rand(3, 2)
print("input:\n", input)
output = module.element().forward(input)
print("output:\n", output)

input:
[[0.34014864 0.77003297]
 [0.4424559  0.63356693]
 [0.90753477 0.969264  ]]
output:
[[0.9075348  0.96926403]
 [0.4424559  0.6335669 ]
 [0.34014863 0.77003294]]
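The same effect can be reproduced in NumPy, which may help when checking results by hand. This sketch assumes that `dimension=1` refers to the first (row) dimension, as the printed output above suggests, so it corresponds to `axis=0` in NumPy's zero-based indexing.

```python
import numpy as np

x = np.array([[0.34014864, 0.77003297],
              [0.4424559, 0.63356693],
              [0.90753477, 0.969264]])

# Reverse the order of the rows; the columns within each row are untouched.
reversed_rows = np.flip(x, axis=0)
print(reversed_rows)
```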
Reshape Layer
- Reshapes the input according to the new dimensions
- Often used for 2-D to 1-D transformation in image recognition
  - Typically placed between convolutional layers and fully connected layers
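The 2-D to 1-D case can be illustrated with NumPy's `reshape`. This is a sketch of the idea only, not a call into BigDL; the 28x28 image size is an assumption chosen for the example.

```python
import numpy as np

# A hypothetical 28x28 single-channel image filled with increasing values.
image = np.arange(28 * 28, dtype=np.float32).reshape(28, 28)

# Flatten to one dimension so a fully connected layer can consume it.
flat = image.reshape(-1)
print(image.shape, "->", flat.shape)
```

Reshaping keeps every element and its row-major order; only the shape changes.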
Digit (outcome)   0     1   2   3   4   5   6   7   8     9
Probability       0.8   0   0   0   0   0   0   0   0.1   0.1
Activation: SoftMax*
# pretty print
import pandas as pd
print(pd.DataFrame({'input': input, 'softmax': output}).to_string(index=False))
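For reference, softmax itself is short enough to write out. The sketch below uses NumPy rather than BigDL, and the raw scores are made-up values for illustration.

```python
import numpy as np


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    # Subtracting the maximum avoids overflow and does not change the result.
    exps = np.exp(scores - np.max(scores))
    return exps / exps.sum()


scores = np.array([2.0, 1.0, 0.1])  # hypothetical raw scores
probs = softmax(scores)
print(probs)
```

The largest score receives the largest probability, and the ordering of the inputs is preserved.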
IRIS* Dataset
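from bigdl.nn.layer import Linear, Sigmoid, SoftMax, Model

# Functional API: calling a layer on upstream nodes wires it into the graph
linear = Linear(10, 15)()
sigmoid = Sigmoid()(linear)
softmax = SoftMax()(sigmoid)
model = Model([linear], [softmax])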
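The `layer()(previous)` calling pattern may look unusual at first. The toy sketch below imitates it in plain Python to show how each call can link a node to its upstream nodes. The `Node` and `Layer` classes here are hypothetical stand-ins written for this example, not BigDL's classes, and a doubling function takes the place of a real linear layer.

```python
import math


class Node:
    """A graph node: a function plus the upstream nodes that feed it."""

    def __init__(self, fn, parents):
        self.fn = fn
        self.parents = parents

    def evaluate(self, feed):
        # A node without parents is an input: it reads its value from `feed`.
        if not self.parents:
            return self.fn(feed[self])
        return self.fn(*(p.evaluate(feed) for p in self.parents))


class Layer:
    """Calling a layer on upstream nodes returns a new node wired to them."""

    def __init__(self, fn):
        self.fn = fn

    def __call__(self, *parents):
        return Node(self.fn, parents)


def sigmoid(xs):
    return [1.0 / (1.0 + math.exp(-x)) for x in xs]


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


# Same wiring shape as the slide: first layer -> sigmoid -> softmax.
first = Layer(lambda xs: [2.0 * x for x in xs])()  # called with no parents
hidden = Layer(sigmoid)(first)
out = Layer(softmax)(hidden)

result = out.evaluate({first: [0.0, 1.0, -1.0]})
print(result)
```

Because the graph is built from explicit links rather than an ordered list, a node could take several parents, which is the main reason to prefer this style over a sequential container.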
Lab 3.1: Getting Started With BigDL
Overview:
- Getting started with BigDL environment
Run time:
- 15 mins
Instructions
- Follow lab instructions
Lab 3.2: Testing BigDL Environment
Overview:
- Testing BigDL environment
Run time:
- 10 mins
Instructions
- Follow lab instructions
Lab 3.3: Data Loading and Exploration Using Apache Spark*
Overview:
- Use Apache Spark* to load data, clean it up, and perform preliminary analysis
Run time:
- 30 mins
Instructions
- Follow lab instructions
We learned about:
- How BigDL and Spark* work together
- How to run BigDL in Docker*
- Layers and Containers in BigDL
- How to get started with BigDL
- https://bigdl-project.github.io/0.7.0/
- https://github.com/intel-analytics/BigDL