
Commit f0123f9 (initial commit, 0 parents)

26 files changed, +1981 additions, 0 deletions

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
**/target
**/dependency-reduced-pom.xml
**/.idea

README.md

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
# Apache Flink® SQL Demo

**This repository provides a demo for Flink SQL.**

The demo shows how to:

* Set up Flink SQL with a Hive catalog.
* Use Flink SQL to prototype a query on a small CSV sample data set.
* Run the same query on a larger ORC data set.
* Run the same query as a continuous query on a Kafka topic.
* Run different streaming SQL queries, including pattern matching with `MATCH_RECOGNIZE`.
* Maintain a materialized view in MySQL.
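The `MATCH_RECOGNIZE` step above could look like the following sketch. The table and column names (`Fares`, `rideId`, `fare`, `rowTime`) are illustrative assumptions, not the demo's actual schema; the pattern finds rides whose fare increased twice in a row:

```sql
SELECT *
FROM Fares
MATCH_RECOGNIZE (
  PARTITION BY rideId
  ORDER BY rowTime          -- must be a time attribute
  MEASURES
    A.fare AS firstFare,
    C.fare AS thirdFare
  AFTER MATCH SKIP PAST LAST ROW
  PATTERN (A B C)           -- three consecutive fare events
  DEFINE
    B AS B.fare > A.fare,
    C AS C.fare > B.fare
)
```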
### Requirements

The demo is based on Flink's SQL CLI client and uses Docker Compose to set up the training environment.

You **only need [Docker](https://www.docker.com/)** to run this training.
## What is Apache Flink?

[Apache Flink](https://flink.apache.org) is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. Flink has been designed to run in all common cluster environments and to perform computations at in-memory speed and at any scale.

## What is SQL on Apache Flink?

Flink features multiple APIs with different levels of abstraction. SQL is supported by Flink as a unified API for batch and stream processing: queries are executed with the same semantics on unbounded, real-time streams and on bounded, recorded streams, and they produce the same results. SQL on Flink is commonly used to ease the definition of data analytics, data pipelining, and ETL applications.

The following example shows a SQL query that computes the number of departing taxi rides per hour.
```sql
SELECT
  TUMBLE_START(rowTime, INTERVAL '1' HOUR) AS t,
  COUNT(*) AS cnt
FROM Rides
WHERE
  isStart
GROUP BY
  TUMBLE(rowTime, INTERVAL '1' HOUR)
```
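The "materialized view in MySQL" item in the feature list could build directly on this query: a continuous `INSERT INTO` writes its result to a JDBC sink table, which Flink keeps up to date. The sink table name `RideCounts` is a hypothetical example, assuming a matching MySQL-backed table has been registered:

```sql
-- Illustrative: continuously maintain the hourly ride counts in MySQL.
-- Assumes a JDBC sink table RideCounts(t TIMESTAMP(3), cnt BIGINT) exists.
INSERT INTO RideCounts
SELECT
  TUMBLE_START(rowTime, INTERVAL '1' HOUR) AS t,
  COUNT(*) AS cnt
FROM Rides
WHERE isStart
GROUP BY TUMBLE(rowTime, INTERVAL '1' HOUR)
```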
----

*Apache Flink, Flink®, Apache®, the squirrel logo, and the Apache feather logo are either registered trademarks or trademarks of The Apache Software Foundation.*

build-image/Dockerfile

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
###############################################################################
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
###############################################################################

###############################################################################
# Build UDF and Data Producer JARs
###############################################################################

FROM maven:3.6-jdk-8-slim AS builder

# Get UDF code and compile it
COPY ./java/sql-training-udfs /opt/sql-udfs
RUN cd /opt/sql-udfs; \
    mvn clean install

# Get data producer code and compile it
COPY ./java/sql-training-data-producer /opt/data-producer
RUN cd /opt/data-producer; \
    mvn clean install

###############################################################################
# Build SQL Playground Image
###############################################################################

FROM flink:1.10.0-scala_2.11

# FLINK_VERSION is referenced in the download step below but is not defined
# by the base image, so set it explicitly to match the base image's release.
ENV FLINK_VERSION 1.10.0

ADD VERSION .

# Copy sql-client configuration
COPY sql-client/ /opt/sql-client

# Copy playground UDFs
COPY --from=builder /opt/sql-udfs/target/sql-training-udfs-*.jar /opt/sql-client/lib/

# Copy data producer
COPY --from=builder /opt/data-producer/target/sql-training-data-producer-*.jar /opt/data/data-producer.jar

# Download connector libraries
RUN wget -P /opt/sql-client/lib/ https://repo.maven.apache.org/maven2/org/apache/flink/flink-json/${FLINK_VERSION}/flink-json-${FLINK_VERSION}.jar; \
    wget -P /opt/sql-client/lib/ https://repo.maven.apache.org/maven2/org/apache/flink/flink-sql-connector-kafka_2.11/${FLINK_VERSION}/flink-sql-connector-kafka_2.11-${FLINK_VERSION}.jar; \
    wget -P /opt/sql-client/lib/ https://repo.maven.apache.org/maven2/org/apache/flink/flink-jdbc_2.11/1.10.0/flink-jdbc_2.11-1.10.0.jar; \
    wget -P /opt/sql-client/lib/ https://repo.maven.apache.org/maven2/mysql/mysql-connector-java/8.0.19/mysql-connector-java-8.0.19.jar; \
    # Create data folders
    mkdir -p /opt/data; \
    mkdir -p /opt/data/stream; \
    # Download data files
    wget -O /opt/data/driverChanges.txt.gz 'https://drive.google.com/uc?export=download&id=1pf4tfv-YpoVQ9_O0948M8oXeCfVH-0MH'; \
    wget -O /opt/data/fares.txt.gz 'https://drive.google.com/uc?export=download&id=1SriiwcIdMvY7uJsWSY4Hhh32iO3F4ND2'; \
    wget -O /opt/data/rides.txt.gz 'https://drive.google.com/uc?export=download&id=1gY8W07OFvB7_4lHlAyingM4WQzs0_8lT';

# Copy configuration
COPY conf/* /opt/flink/conf/

WORKDIR /opt/sql-client
ENV SQL_CLIENT_HOME /opt/sql-client
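Since the Dockerfile copies `java/`, `sql-client/`, `conf/`, and `VERSION` into the image, it presumably must be built with the repository root as the build context. A minimal sketch of building and entering the image; the tag `flink-sql-demo` is a hypothetical name, not one the repository defines:

```shell
# Build the playground image from the repository root (tag is illustrative).
docker build -f build-image/Dockerfile -t flink-sql-demo .

# Start a container; WORKDIR drops the shell into /opt/sql-client.
docker run -it flink-sql-demo /bin/bash
```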
