Commit 52ca412

Jacky Li committed: add demo for sparksql on hbase
2 parents c23d347 + fe7f849

README.md: 26 additions & 24 deletions

Version 1.0.0 requires Spark 1.4.0.

Spark HBase is built using [Apache Maven](http://maven.apache.org/).

I. Clone and build Huawei-Spark/Spark-SQL-on-HBase
```
$ git clone https://github.com/Huawei-Spark/Spark-SQL-on-HBase spark-hbase
```

II. Go to the root of the source tree
```
$ cd spark-hbase
```

III. Build without testing
```
$ mvn -Phbase,hadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package install
```

IV. Build and run test suites against an HBase minicluster from Maven
```
$ mvn clean install
```

## Interactive Scala Shell

The easiest way to start using Spark HBase is through the Scala shell:
```
./bin/hbase-sql
```
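
Once the shell is up, you can issue SQL statements at the prompt. A minimal smoke test might look like the following; `mytable` is a hypothetical table name, assuming a table has already been created and mapped to HBase:
```
SELECT * FROM mytable LIMIT 10;
```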

## Python Shell

First, add the spark-hbase jar to the SPARK_CLASSPATH in the $SPARK_HOME/conf directory, as follows:
```
SPARK_CLASSPATH=$SPARK_CLASSPATH:/spark-hbase-root-dir/target/spark-sql-on-hbase-1.0.0.jar
```

Then go to the spark-hbase installation directory and issue
```
./bin/pyspark-hbase
```

A successful message is as follows:

    You are using Spark SQL on HBase!!!
    HBaseSQLContext available as hsqlContext.
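
From there you can run queries through `hsqlContext`. A minimal sketch, assuming HBaseSQLContext exposes the standard SQLContext API and that a hypothetical table `mytable` has already been created and mapped to HBase:
```
# In the pyspark-hbase shell; 'hsqlContext' is pre-created on startup.
# 'mytable' is a hypothetical table name used only for illustration.
df = hsqlContext.sql("SELECT * FROM mytable LIMIT 10")
df.show()
```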

6465
To run a python script, the PYTHONPATH environment should be set to the "python" directory of the Spark-HBase installation. For example,
65-
66-
export PYTHONPATH=/root-of-Spark-HBase/python
66+
```
67+
export PYTHONPATH=/root-of-Spark-HBase/python
68+
```

Note that the shell commands are not included in the Zip file of the Spark release; in version 1.0.0 they are for developers' use only. Instead, users can use "$SPARK_HOME/bin/spark-shell --packages Huawei-Spark/Spark-SQL-on-HBase:1.0.0" for a SQL shell or "$SPARK_HOME/bin/pyspark --packages Huawei-Spark/Spark-SQL-on-HBase:1.0.0" for a Python shell.
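
For reference, those two commands, taken verbatim from the note above:
```
$SPARK_HOME/bin/spark-shell --packages Huawei-Spark/Spark-SQL-on-HBase:1.0.0
$SPARK_HOME/bin/pyspark --packages Huawei-Spark/Spark-SQL-on-HBase:1.0.0
```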

## Running Tests

Testing first requires [building Spark HBase](#building-spark). Once Spark HBase is built ...

Run all test suites from Maven:
```
mvn -Phbase,hadoop-2.4 test
```

Run a single test suite from Maven, for example:
```
mvn -Phbase,hadoop-2.4 test -DwildcardSuites=org.apache.spark.sql.hbase.BasicQueriesSuite
```

## IDE Setup

We use IntelliJ IDEA for Spark HBase development. You can get the community edition for free and install the JetBrains Scala plugin from Preferences > Plugins.

To import the current Spark HBase project for IntelliJ:

[...]

6. When you run the Scala tests, you may sometimes get an out-of-memory exception. You can increase the VM memory with the following setting, for example:

```
-XX:MaxPermSize=512m -Xmx3072m
```

You can also make this setting the default under "Defaults -> ScalaTest".
