Commit bbdf343

Marton Sereg committed
Merge pull request sequenceiq#51 from wjur/submit_from_outside
Describe submitting from the outside of the container
2 parents 59a94c1 + e01199e commit bbdf343

1 file changed (+14, -1)
README.md

Lines changed: 14 additions & 1 deletion
@@ -33,7 +33,7 @@ docker run -d -h sandbox sequenceiq/spark:1.6.0 -d
 
 ## Versions
 ```
-Hadoop 2.6.0 and Apache Spark v1.6.0 on Centos
+Hadoop 2.6.0 and Apache Spark v1.6.0 on Centos
 ```
 
 ## Testing
@@ -86,3 +86,16 @@ spark-submit \
 --executor-cores 1 \
 $SPARK_HOME/lib/spark-examples-1.6.0-hadoop2.6.0.jar
 ```
+
+### Submitting from the outside of the container
+To use Spark from outside of the container it is necessary to set the YARN_CONF_DIR environment variable to directory with a configuration appropriate for the docker. The repository contains such configuration in the yarn-remote-client directory.
+
+```
+export YARN_CONF_DIR="`pwd`/yarn-remote-client"
+```
+
+Docker's HDFS can be accessed only by root. When submitting Spark applications from outside of the cluster, and from a user different than root, it is necessary to configure the HADOOP_USER_NAME variable so that root user is used.
+
+```
+export HADOOP_USER_NAME=root
+```
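
The two exports added by this commit can be combined into one setup step that runs on the host before `spark-submit`. The following is a minimal sketch, not part of the commit: it assumes the current directory is a checkout of the repository (so that `yarn-remote-client` is found), and the directory check with its warning is an illustrative addition.

```shell
# Sketch only: prepare the host environment for submitting to the container.
# Assumption: this is run from the root of the repository checkout.

# Point the YARN client at the configuration shipped in the repository.
YARN_CONF_DIR="$(pwd)/yarn-remote-client"
export YARN_CONF_DIR

# HDFS inside the container is accessible only to root, so act as root.
export HADOOP_USER_NAME=root

# Illustrative addition: warn early if the configuration directory is missing,
# for example when the commands are run from a different directory.
if [ ! -d "$YARN_CONF_DIR" ]; then
    echo "warning: $YARN_CONF_DIR not found; run from the repository root" >&2
fi
```

With both variables exported in the same shell session, the `spark-submit` command from the Testing section of the README can then be run from the host.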