0% found this document useful (0 votes)

41 views15 pages

247 Sqlnet Generating Structured Q

This paper proposes a new approach called SQLNet for generating SQL queries from natural language. SQLNet avoids using a sequence-to-sequence model, which can be sensitive to the ordering of elements in the SQL query. Instead, SQLNet uses a sketch-based approach with a dependency graph to determine dependencies between predictions. It also introduces a sequence-to-set model and column attention mechanism. When evaluated on the WikiSQL dataset, SQLNet achieves state-of-the-art results, outperforming the previous best approach by 9-13 percentage points in exact query matching and query result matching.

Uploaded by

vijaysagarb17

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

41 views15 pages

247 Sqlnet Generating Structured Q

Uploaded by

vijaysagarb17

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 15

SQLNet: G ENERATING S TRUCTURED Q UERIES F ROM

NATURAL L ANGUAGE W ITHOUT R EINFORCEMENT

L EARNING
Anonymous authors
Paper under double-blind review

A BSTRACT

Synthesizing SQL queries from natural language is a long-standing open problem

and has been attracting considerable interest recently. Toward solving the prob-
lem, the de facto approach is to employ a sequence-to-sequence-style model. Such
an approach will necessarily require the SQL queries to be serialized. Since the
same SQL query may have multiple equivalent serializations, training a sequence-
to-sequence-style model is sensitive to the choice from one of them. This phe-
nomenon is documented as the “order-matters” problem. Existing state-of-the-art
approaches rely on reinforcement learning to reward the decoder when it generates
any of the equivalent serializations. However, we observe that the improvement
from reinforcement learning is limited.
In this paper, we propose a novel approach, i.e., SQLNet, to fundamentally solve
this problem by avoiding the sequence-to-sequence structure when the order does
not matter. In particular, we employ a sketch-based approach where the sketch
contains a dependency graph so that one prediction can be done by taking into
consideration only the previous predictions that it depends on. In addition, we
propose a sequence-to-set model as well as the column attention mechanism to
synthesize the query based on the sketch. By combining all these novel tech-
niques, we show that SQLNet can outperform the prior art by 9% to 13% on the
WikiSQL task.

1 I NTRODUCTION
Semantic parsing is a long-standing open question and has many applications. In particular,
parsing natural language descriptions into SQL queries recently attracts much interest from both
academia (Yaghmazadeh et al., 2017) and industry (Zhong et al., 2017). We refer to this problem
as the natural-language-to-SQL problem (NL2SQL). The de facto standard approach to solve this
problem is to treat both the natural language description and SQL query as sequences and train a
sequence-to-sequence model (Vinyals et al., 2015b) or its variants (Dong & Lapata, 2016) which
can be used as the parser. One issue of such an approach is that different SQL queries may be equiv-
alent to each other due to commutativity and associativity. For example, consider the following two
queries:
SELECT result SELECT result
WHERE score=‘1-0’ AND goal=16 WHERE goal=16 AND score=‘1-0’
The order of the two constraints in the WHERE clause does not affect the execution results of the
query, but syntactically, these two are considered as different queries. It is well-known that the order
of these constraints affects the performance of a sequence-to-sequence-style model (Vinyals et al.,
2016), and it is typically hard to find the best ordering. To mitigate this ordering issue, a typical
approach that has been applied in many scenarios is to employ reinforcement learning (Zhong et al.,
2017; Hu et al., 2017). The basic idea is that, after a standard supervised training procedure, the
model is further trained using a policy gradient algorithm. In particular, given an input sequence,
the decoder of a sequence-to-sequence model samples an output sequence following the output dis-
tribution and computes the reward based on whether the output is a well-formed query and whether
the query will compute the correct results. This reward can be used by the policy gradient algorithm

1
to fine-tune the model. However, the improvement that can be achieved through reinforcement
learning is often limited. For example, on a NL2SQL task called WikiSQL (Zhong et al., 2017),
the state-of-the-art work (Zhong et al., 2017) reports an improvement of only 2% by employing
reinforcement learning.
In this work, we propose SQLNet to fundamentally solve this issue by avoiding the sequence-to-
sequence structure when the order does not matter. In particular, we employ a sketch-based approach
to generate a SQL query from a sketch. The sketch aligns naturally to the syntactical structure of
a SQL query. A neural network, called SQLNet, is then used to predict the content for each slot
in the sketch. Our approach can be viewed as a neural network alternative to the traditional sketch-
based program synthesis approaches (Alur et al., 2013; Solar-Lezama et al., 2006; Bornholt et al.,
2016; Rabinovich et al., 2017; Parisotto et al., 2017). Note that the-state-of-the-art neural network
SQL synthesis approach (Zhong et al., 2017) also employs a sketch-based approach, although their
sketch is more coarse-grained and they employ a sequence-to-sequence structure to fill in the most
challenging slot in the sketch.
As discussed above, the most challenging part is to generate the WHERE clause. Essentially, the
issue with a sequence-to-sequence decoder is that the prediction of the next token depends on all
previously generated tokens. However, different constraints may not have a dependency on each
other. In our approach, SQLNet employs the sketch to provide the dependency relationship of
different slots so that the prediction for each slot is only based on the predictions of other slots that
it depends on. To implement this idea, the design of SQLNet introduces two novel constructions:
sequence-to-set and column attention. The first is designed to predict an unordered set of constraints
instead of an ordered sequence, and the second is designed to capture the dependency relationship
defined in the sketch when predicting.
We evaluate our approach on the WikiSQL dataset (Zhong et al., 2017), which is, to the best of our
knowledge, the only large scale NL2SQL dataset, and compare with the state-of-the-art approach,
Seq2SQL (Zhong et al., 2017). Our approach results in the exact query-match accuracy of 61.5% and
the result-match accuracy of 68.3% on the WikiSQL testset. In other words, SQLNet can achieve
exact query-match and query-result-match accuracy of 7.5 points and 8.9 points higher than the
corresponding metrics of Seq2SQL respectively, yielding the new state-of-the-art on the WikiSQL
dataset.
The WikiSQL dataset was originally proposed to ensure that the training set and test set have a
disjoint set of tables. In the practical setting, it is more likely that such an NL2SQL solution is
deployed where there exists at least one query observed in the training set for the majority of tables.
We re-organize the WikiSQL dataset to simulate this case and evaluate our approach and the base-
line approach, Seq2SQL. We observe that in such a case the advantage of SQLNet over Seq2SQL
enlarges by 2 points, and the SQLNet model can achieve an execution accuracy of 70.1%.
To summarize, our main contributions in this work are three-fold. First, we propose a novel
principled approach to handle the sequence-to-set generation problem. Our approach avoids the
“order-matters” problems in a sequence-to-sequence model and thus avoids the necessity to employ
a reinforcement learning algorithm, and achieves a better performance than existing sequence-to-
sequence-based approach. Second, we propose a novel attention structure called column attention,
and show that this helps to further boost the performance over a raw sequence-to-set model. Last,
we design SQLNet which bypasses the previous state-of-the-art approach by 9 to 13 points on the
WikiSQL dataset, and yield the new state-of-the-art on an NL2SQL task.

2 SQL Q UERY S YNTHESIS FROM NATURAL L ANGUAGE Q UESTIONS AND

TABLE S CHEMA

In this work, we consider the WikiSQL task proposed in (Zhong et al., 2017). In particular, the input
contains two parts: a natural language question stating the query for a table, and the schema of the
table being queried. The schema of a table contains both the name and the type (i.e., real numbers
or strings) of each column. The output is a SQL query which reflects the natural language question
with respect to the queried table.

2
Table Question:
Years for School/Club
Player No. Nationality Position Which country is Jim Les from?
Jazz Team
Stu Lantz 22 United States Guard 1974-75 Nebraska
Rusty LaRue 5 United States Guard 2001-02 Wake Forest
SQL: Result:
Eric Leckner 45 United States Forward-Center 1988-90 Wyoming
Ron Lee 18 United States Guard 1979-80 Oregon
SELECT Nationality United States
Jim Les 25 United States Guard 1988-89 Bradley
WHERE Player = Jim Les

Figure 1: An example of the WikiSQL task.

Note that the WikiSQL task considers synthesizing a SQL query with respect to only one table.
Thus, in an output SQL query, only the SELECT clause and the WHERE clause need to be predicted,
and the FROM clause can be omitted. We present an example in Figure 1.
The WikiSQL task makes further assumptions to make it tractable. First, it assumes that each col-
umn name is a meaningful natural language description so that the synthesis task is tractable from
only the natural language question and column names. Second, any token in the output SQL query
is either a SQL keyword or a sub-string of the natural language question. For example, when gen-
erating a constraint in the WHERE clause, e.g., name=‘Bob’, the token ‘Bob’ must appear in
the natural language question as a sub-string. This assumption is necessary when the content in a
database table is not given as an input. Third, each constraint in the WHERE clause has the form
of COLUMN OP VALUE, where COLUMN is a column name, OP is one of “<, =, >, ≥, ≤”, and
VALUE is a substring of the natural language question as explained above.
Although these constraints seem to overly simplify the problem, we argue that such a subset of SQL
queries still has a significant impact in practice. In fact, Johnson et al. (2017) studied 8.1 million
real-world SQL queries written by Uber’s data analysts, and found that at least 37% of them involve
only one table and all constraints in the WHERE clause involve only one column. Thus, solving
this problem can significantly reduce the burden of data analysts by reducing more than 1/3 of the
workload in practice.
More importantly, different from most previous NL2SQL datasets (Tang & Mooney, 2001; Price,
1990; Dahl et al., 1994; Li & Jagadish, 2014; Pasupat & Liang, 2015; Yin et al., 2015), the WikiSQL
task has several properties that we would like. First, it provides a large-scale dataset so that a
neural network can be effectively trained. Note, previously famous data sets such as (Dahl et al.,
1994) contain less than 10,000 examples; but modern deep learning models, e.g., a sequence-to-
sequence model, typically require much more data. WikiSQL mitigates this issue by providing a
larger dataset. Second, it employs crowd-sourcing to collect the natural language questions created
by human beings, so that it can help to overcome the issue that a well-trained model may overfit to
template-synthesized descriptions.
Third, we are interested in the SQL synthesis problem in an enterprise setting. In such a setting, the
database may contain billions of users’ sensitive information, and thus how to handle the scalability
of the data and how to ensure privacy are important problems. Therefore, we prefer a solution that
synthesizes SQL queries from only the natural language description and the table schema. Such
requirements make many existing studies on question-answering (e.g., Sun et al. (2016)) and ta-
ble content-based SQL synthesis approaches (e.g., Yin et al. (2015)) unsuitable. In this sense, the
WikiSQL task perfectly fits our requirement to mitigate the scalability and privacy issues.
Fourth, the data is split so that the training, dev, and test set do not share tables. This helps to evaluate
an approach’s capability to generalize to an unseen schema. Previous datasets may have one or more
of such four properties, but to the best of our knowledge, we are not aware of any NL2SQL dataset
having all these properties except WikiSQL.
Besides these benefits, WikiSQL is still a challenging task. Zhong et al. (2017) report that the state-
of-the-art task-agnostic semantic parsing model (Dong & Lapata, 2016) can achieve an execution
accuracy of merely 37%, while the prevous state-of-the-art model for this task can achieve an exe-
cution accuracy of around 60%. Therefore, we believe tackling the WikiSQL task is a meaningful

3
Natural Language Question

SELECT WHERE
SELECT Column1 Columnn
Column
OP1 … OPn
SELECT $AGG $COLUMN SELECT
WHERE $COLUMN $OP $VALUE Aggregator VALUE1 VALUEn
(AND $COLUMN $OP $VALUE)*
(a) SQL Sketch (b) Graphical illustration of the dependency in a sketch

Figure 2: Sketch syntax and the dependency in a sketch

and challenging first step toward eventually solving the NL2SQL problem. We consider building
and tackling the SQL synthesis task of more complex queries as important future work.

3 SQLNet

In this section, we present our SQLNet solution to tackle the WikiSQL task. Different from ex-
isting semantic parsing models (Dong & Lapata, 2016) which are designed to be agnostic to the
output grammar, our basic idea is to employ a sketch, which highly aligns with the SQL grammar.
Therefore, SQLNet only needs to fill in the slots in the sketch rather than to predict both the output
grammar and the content.
The sketch is designed to be generic enough so that all SQL queries of interest can be expressed
by the sketch. Therefore, using the sketch does not hinder our approach’s generalizability. We will
explain the details of a sketch in Section 3.1.
The sketch captures the dependency of the predictions to make. By doing so, the prediction of the
value of one slot is only conditioned on the values of those slots that it depends on. This avoids the
“order matters” problem in a sequence-to-sequence model, in which one prediction is conditioned
on all previous predictions (Vinyals et al., 2016). To make predictions based on a sketch, we de-
velop two techniques, sequence-to-set and column attention. We will explain the details of these
techniques in Section 3.2.
We combine all techniques to design a SQLNet neural network to synthesize a SQL query from a
natural language question and a table schema. In Section 3.3, we present the details of SQLNet and
training details to surpass previous state-of-the-art approach without using reinforcement learning.

3.1 S KETCH - BASED QUERY SYNTHESIS

The SQL sketch that we employ is formally stated in Figure 2a. The tokens in bold (i.e., SELECT,
WHERE, and AND) indicate the SQL keywords. The tokens starting with “$” indicate the slot to
be filled. The name following the “$” indicates the type of the prediction. For example, the $AGG
slot can be filled with either an empty token or one of the aggregation operators, such as SUM and
MAX. The $COLUMN and the $VALUE slots need be filled with a column name and a sub-string of
the question respectively. The $OP slot can take a value from {=, <, >}. The notion (...)∗ employ
a regular expression to indicate zero or more AND clauses.
The dependency graph of the sketch is illustrated in Figure 2b. All slots whose values are to be
predicted are illustrated as boxes, and each dependency is depicted as a directed edge. For example,
the box of OP1 has two incoming edges from Column1 and the natural language question respec-
tively. These edges indicate that the prediction of the value for OP1 depends on both the values of
Column1 and the natural language question. We can view our model as a graphical model based on
this dependency graph, and the query synthesis problem as an inference problem on the graph. From
this perspective, we can see that the prediction of one constraint is independent with another, and

4
thus our approach can fundamentally avoid the “order-matters” problem in a sequence-to-sequence
model.
Note that although it is simple, this sketch is expressive enough to represent all queries in the Wik-
iSQL task. Our SQLNet approach is not limited to this sketch only. To synthesize more complex
SQL queries, we can simply employ a sketch that supports a richer syntax. In fact, the state-of-
the-art approach on the WikiSQL task, i.e., Seq2SQL (Zhong et al., 2017), can also be viewed as a
sketch-based approach. In particular, Seq2SQL predicts for $AGG and $COLUMN separately from
the WHERE clause. However, Seq2SQL generates the WHERE clause using a sequence-to-sequence
model. Thus it still suffers the “order-matters” problem.

3.2 S EQUENCE - TO - SET PREDICTION USING COLUMN ATTENTION

In this section, we use the prediction of a column name in the WHERE clause as an example to explain
the ideas of a sequence-to-set model and column attention. We will explain the full SQLNet model
in Section 3.3.

Sequence-to-set. Intuitively, the column names appearing in the WHERE clause constitute a subset
of the full set of all column names. Therefore, instead of generating a sequence of column names,
we can simply predict which column names appear in this subset of interest. We refer to this idea as
sequence-to-set prediction.
In particular, we compute the probability Pwherecol (col|Q), where col is a column name and Q is
the natural language question. To this aim, one idea is to compute Pwherecol (col|Q) as

Pwherecol (col|Q) = σ(uTc Ecol + uTq EQ ) (1)

where σ is the sigmoid function, Ecol and EQ are the embeddings of the column name and the
natural language question respectively, and uc and uq are two column vectors of trainable variables.
Here, the embeddings Ecol and EQ can be computed as the hidden states of a bi-directional LSTM
running on top of the sequences of col and Q respectively. Note the two LSTMs to encode the
column names and the question do not share their weights. The dimensions of uc , uq , Ecol , EQ are
all d, which is the dimension of the hidden states of the LSTM.
In doing so, the decision of whether or not to include a particular column in the WHERE clause can
be made independently to other columns by examining Pwherecol (col|Q).

Column attention. Equation (1) has a problem of using EQ . Since it is computed as the hidden
states of the natural language question only, it may not be able to remember the particular informa-
tion useful in predicting a particular column name. For example, in the question in Figure 1, the
token “number” is more relevant to predicting the column “No.” in the WHERE clause. However,
the token “player” is more relevant to predicting the “player” column in the SELECT clause.
The embedding should reflect the most relevant information in the natural language question when
predicting on a particular column.
To incorporate this intuition, we design the column attention mechanism to compute EQ|col instead
of EQ . In particular, we assume HQ is a matrix of d×L, where L is the length of the natural language
question. The i-th column of HQ represents the hidden states output of the LSTM corresponding to
the i-th token of the question.
We compute the attention weights w for each token in the question. In particular, w is a L-dimension
column vector, which is computed as

w = softmax(v) vi = (Ecol )T W HQ
i
∀i ∈ {1, ..., L}
i
where vi indicates the i-th dimension of v, HQ indicates the i-th column of HQ , and W is a trainable
matrix of size d × d.
After the attention weights w are computed, we can compute EQ|col as the weighted sum of each
token’s LSTM hidden output based on w:
EQ|col = HQ w

5
We can replace EQ with EQ|col in Equation (1) to get the column attention model:
Pwherecol (col|Q) = σ(uTc Ecol + uTq EQ|col ) (2)
In fact, we find that adding one more layer of affine transformation before the σ operator can improve
the prediction performance by around 1.5%. Thus, we get the final model for predicting column
names in the WHERE clause:
Pwherecol (col|Q) = σ((ucol T col col
a ) tanh(Uc Ecol + Uq EQ|col )) (3)
where Uccol and Uqcol are trainable matrices of size d × d, and ucol
a is a d-dimensional trainable
vector.
We want to highlight that column attention is a special instance of the generic attention mechanism
to compute the attention map on a question conditioned on the column names. We will show in our
evaluation that this mechanism can improve upon a sequence-to-set model by around 3 points.

3.3 SQLNet MODEL AND TRAINING DETAILS

In this section, we present the full SQLNet model and training details. As illustrated in Figure 2b,
the predictions of the SELECT clause and WHERE clause are separated. In the following, we first
present the model for generating the WHERE clause and then the SELECT clause. In the end, we
describe more training details which significantly help to improve the prediction accuracy.

3.3.1 P REDICTING THE WHERE CLAUSE

The WHERE clause is the most complex structure to predict in the WikiSQL task. Our SQLNet
model first predicts the set of columns that appear in the WHERE clause based on Section 3.2, and
then for each column it generates the constraint by predicting the OP and VALUE slots. We describe
them below.

Column slots. After Pwherecol (col|Q) is computed based on Equation (3), SQLNet needs to
decide which columns to include in the WHERE. One approach is to set a threshold τ ∈ (0, 1), so
that all columns with Pwherecol (col|Q) ≥ τ are chosen.
However, we find that an alternative approach can typically give a better performance. We now
explain this approach. In particular, we use a network to predict the total number K of columns to
be included in the subset, and choose the top-K columns with the highest Pwherecol (col|Q) to form
the column names in the WHERE clause.
We observe that most queries have a limited number of columns in their WHERE clauses. Therefore,
we set an upper-bound N on the number of columns to choose, and thus we cast the problem to
predict the number of columns as a (N + 1)-way classification problem (from 0 to N ). In particular,
we have
P#col (K|Q) = softmax(U1#col tanh(U2#col EQ|Q ))i
where U1#col and U2#col are trainable matrices of size (N + 1) × d and d × d respectively. The
notion softmax(...)i indicates the i-th dimension of the softmax output, and we will use this notion
throughout the rest of the description. SQLNet chooses the number of columns K that maximizes
P#col (K|Q).
In our evaluation, we simply choose N = 4 to simplify our evaluation setup. But note that we can
get rid of the hyper-parameter N by employing a variant-length prediction model, such as the one
for the SELECT column prediction model that will be discussed in Section 3.3.2.

OP slot. For each column in the WHERE clause, predicting the value of its OP slot is a 3-way
classifications: the model needs to choose from {=, >, <}. Therefore, we compute
Pop (i|Q, col) = softmax(U1op tanh(Ucop Ecol + Uqop EQ|col ))i
where col is the column under consideration, U1op , Ucop , Uqop are trainable matrices of size 3 × d,
d × d, and d × d respectively. Note that EQ|col is used in the right-hand side. This means that
SQLNet uses column attention for OP prediction to capture the dependency in Figure 2b.

6
VALUE slot. For the VALUE slot, we need to predict a substring from the natural language ques-
tion. To this end, SQLNet employs a sequence-to-sequence structure to generate the sub-string.
Note that, here the order of the tokens in the VALUE slot indeed matters. Therefore, using a
sequence-to-sequence structure is reasonable.
The encoder phase still employs a bi-directional LSTM. The decoder phase computes the distribution
of the next token using a pointer network (Vinyals et al., 2015a; Yang et al., 2016) with the column
attention mechanism. In particular, consider the hidden state of the previously generated sequence is
i
h, and the LSTM output for each token in the natural language question is HQ . Then the probability
of the next token in VALUE can be computed as
Pval (i|Q, col, h) = softmax(a(h))
a(h)i = (uval )T tanh(U1val HQ
i
+ U2val Ecol + U3val h) ∀i ∈ {1, ..., L}
where uval val val
a is a d-dimensional trainable vector, Uh , Uc , Uq
val
are three trainable matrices of size
d × d, and L is the length of the natural language question. Note that the computation of the a(h)i
is using the column attention mechanism, which is similar in the computation of EQ|col .
Note that Pval (i|Q, col, h) represents the probability that the next token to generate is the i-th token
in the natural language question.SQLNet simply chooses the most probable one for each step to
generate the sequence. Note that the hENDi token also appears in the question. The SQLNet model
stops generating for VALUE when the hENDi token is predicted.

3.3.2 P REDICTING THE SELECT CLAUSE

The SELECT clause has an aggregator and a column name. The prediction of the column name
in the SELECT clause is quite similar to the WHERE clause. The main difference is that for the
SELECT clause, we only need to select one column among all. Therefore, we compute
Pselcol (i|Q) = softmax(sel)i
seli = (usel T sel sel
a ) tanh(Uc Ecoli + Uq EQ|coli ) ∀i ∈ {1, ..., C}
Here, usel sel sel col col
a , Uc , Uq are similar to ua , Uc , Uq
col
in (3), and C is the total number of columns.
Notice that each different dimension of the vector sel is computed based on a corresponding column
col i . The model will predict the column coli that maximizes Pselcol (i|Q).
For the aggregator, assuming the predicted column name for the SELECT clause is col , we can
simply compute
Pagg (i|Q, col) = softmax(U agg tanh(Ua EQ|col ))i
where Ua is a trainable matrix of size 6 × d. Notice that the prediction of the aggregator shares a
similar structure as OP.

3.3.3 T RAINING DETAILS

In this section, we present more details to make our experiments reproducible. We also emphasize
on the details that can improve our model’s performance.

Input encoding model details. Both natural language descriptions and column names are treated
as a sequence of tokens. We use the Stanford CoreNLP tokenizer (Manning et al., 2014) to parse
the sentence. Each token is represented as a one-hot vector and fed into a word embedding vector
before feeding them into the bi-directional LSTM. To this end, we use the GloVe word embedding
(Pennington et al., 2014).

Training details. We need a special loss to train the sequence-to-set model. Intuitively, we design
the loss to reward the correct prediction while penalizing the wrong prediction. In particular, given a
question Q and a set of C columns col, assume y is a C-dimensional vector where yj = 1 indicates
that the j-th column appears in the ground truth of the WHERE clause; and yj = 0 otherwise.
Then we minimize the following weighted negative log-likelihood loss to train the sub-model for
Pwherecol :
X C
loss(col, Q, y) = − (αyj log Pwherecol (colj |Q) + (1 − yj ) log(1 − Pwherecol (colj |Q))
j=1

7
In this function, the weight α is hyper-parameter to balance the positive data versus negative data.
In our evaluation, we choose α = 3. For all other sub-module besides Pwherecol , we minimize the
standard cross-entropy loss.
We choose the size of the hidden states to be 100. We use the Adam optimizer (Kingma & Ba, 2014)
with a learning rate 0.001. We train the model for 200 epochs and the batch size is 64. We randomly
re-shuffle the training data in each epoch.

Weight sharing details. The model contains multiple LSTMs for predicting different slots in the
sketch. In our evaluation, we find that using different LSTM weights for predicting different slots
yield better performance than making them share the weights. However, we find that sharing the
same word embedding vector helps to improve the performance. Therefore, different components
in SQLNet only share the word embedding.

Training the word embedding. In Seq2SQL, Zhong et al. (2017) suggest that the word embed-
ding for tokens appearing in GloVe should be fixed during training. However, we observe that the
performance can be boosted by 2 points when we allow the word embedding to be updated during
training. Therefore, we initialize the word embedding with GloVe as discussed above, and allow
them to be trained during the Adam updates after 100 epochs.

4 E VALUATION

In this section, we evaluate SQLNet versus the state-of-the-art approach, i.e., Seq2SQL (Zhong
et al., 2017), on the WikiSQL dataset.
In the following, we first present the evaluation setup. Then we present the comparison between
our approach and Seq2SQL on the query synthesis accuracy, as well as a break-down comparison
on different sub-tasks. In the end, we propose another variant of the WikiSQL dataset to reflect
another application scenario of the SQL query synthesis task and present our evaluation results of
our approach versus Seq2SQL.

4.1 E VALUATION SETUP

In this work, we focus on the WikiSQL dataset (Zhong et al., 2017). The dataset was updated on
October 16, 2017. In our evaluation, we use the updated version.
We compare our work with Seq2SQL, the state-of-the-art approach on the WikiSQL task. We com-
pare SQLNet with Seq2SQL using three metrics to evaluate the query synthesis accuracy:

1. Logical-form accuracy. We directly compare the synthesized SQL query with the ground
truth to check whether they match each other. This metric is used in (Zhong et al., 2017).
2. Query-match accuracy. We convert the synthesized SQL query and the ground truth into a
canonical representation and compare whether two SQL queries match exactly. This metric
can eliminate the false negatives due to only the ordering issue.
3. Execution accuracy. We execute both the synthesized query and the ground truth query
and compare whether the results match to each other. This metric is used in (Zhong et al.,
2017).

We are also interested in the break-down results on different sub-tasks: (1) the aggregator in the
SELECT clause; (2) the column in the SELECT clause; and (3) the WHERE clause. Due to the
different structure, it is hard to make a further fine-grained comparison.
We implement SQLNet using PyTorch (Facebook, 2017). For the baseline approach in our compar-
ison, i.e., Seq2SQL, we compare our results with the numbers reported by Zhong et al. (2017).
However, Zhong et al. (2017) do not include the break-down results for different sub-tasks, and
the source code is not available. To solve this issue, we re-implement Seq2SQL by ourselves. For
evaluations whose results are not reported in (Zhong et al., 2017), we report the results from our
re-implementation and compare SQLNet against those as the baseline.

8
dev test
Acclf Accqm Accex Acclf Accqm Accex
Seq2SQL (Zhong et al. (2017)) 49.5% - 60.8% 48.3% - 59.4%
Seq2SQL (ours) 52.5% 53.5% 62.1% 50.8% 51.6% 60.4%
SQLNet - 63.2% 69.8% - 61.3% 68.0%

Table 1: Overall result on the WikiSQL task. Acclf , Accqm , and Accex indicate the logical form,
query-match and the execution accuracy respectively.

dev test
Accagg Accsel Accwhere Accagg Accsel Accwhere
Seq2SQL (ours) 90.0% 89.6% 62.1% 90.1% 88.9% 60.2%
Seq2SQL (ours, C-order) - - 63.3% - - 61.2%
SQLNet (Seq2set) - - 69.1% - - 67.1%
SQLNet (Seq2set+CA) 90.1% 91.1% 72.1% 90.3% 90.4% 70.0%
SQLNet (Seq2set+CA+WE) 90.1% 91.5% 74.1% 90.3% 90.9% 71.9%

Table 2: Break down result on the WikiSQL dataset. Seq2SQL (C-order) indicates that after
Seq2SQL generates the WHERE clause, we convert both the prediction and the ground truth into
a canonical order when being compared. Seq2set indicates that the sequence-to-set technique is
employed. +CA indicates that column attention is used. +WE indicates that the word embedding is
allowed to be trained. Accagg and Accsel indicate the accuracy on the aggregator and column pre-
diction accuracy on the SELECT clause, and Accwhere indicates the accuracy to generate the WHERE
clause.

4.2 E VALUATION ON THE W IKI SQL TASK

Table 1 presents the results for query synthesis accuracy of our approach and Seq2SQL. We first
observe that our re-implementation of Seq2SQL yields better result than that reported in (Zhong
et al., 2017). Since we do not have access to the source code of the original implementation, we
cannot analyze the reason.
We observe that SQLNet outperforms Seq2SQL (even our version) by a large margin. On the
logical-form metric, SQLNet outperforms our re-implementation of Seq2SQL by 10.7 points on
the dev set and by 10.5 points on the test set. These advancements are even larger to reach 13.7
points and 13.0 points respectively if we compare with the original results reported in (Zhong et al.,
2017). Note that even if we eliminate the false negatives of Seq2SQL by considering the query-
match accuracy, the gap is only closed by 1 point, and still remains as large as 9.7 points. We
attribute the reason to that Seq2SQL employs a sequence-to-sequence model and thus suffers the
“order-matters” problem, while our sequence-to-set-based approach can entirely solve this issue.
On the execution accuracy metric, SQLNet is better than Seq2SQL (reported in Zhong et al. (2017))
by 9.0 points and 8.6 points respectively on the dev and test sets. Although they are still large, the
advancements are not as large as those on the query-match metric. This phenomenon shows that, for
some of the queries that Seq2SQL cannot predict exactly correct, (e.g., maybe due to the lack of one
constraint in the WHERE clause), the execution results are still correct. We want to highlight that the
execution accuracy is sensitive to the data in the table, which contributes to the difference between
query-match accuracy and execution accuracy.

4.3 A BREAK - DOWN ANALYSIS ON THE W IKI SQL TASK

We would like to further analyze SQLNet’s and Seq2SQL’s performance on different sub-tasks as
well as the improvement provided by different techniques in SQLNet. The results are presented in
Table 2.
We observe that on the SELECT clause prediction, the accuracy is around 90%. This shows that the
SELECT clause is less challenging to predict than the WHERE clause. SQLNet’s accuracy on the

9
Question: SQL:
When twente came in third place and SELECT season
ajax was the winner what are the seasons? WHERE winner = ajax AND third place = twente

Figure 3: An example of disordered pairs.

SELECT column prediction better than Seq2SQL. We attribute this improvement to the reason that
SQLNet employs column attention.
We observe that the biggest advantage of SQLNet over Seq2SQL is on the WHERE clause’s predic-
tion accuracy. The improvement on the WHERE clause prediction is around 11 points to 12 points.
Notice that the order of the constraints generated by Seq2SQL matters. To eliminate this effect,
we evaluate the accuracy based on a canonical order, i.e., Seq2SQL (ours, C-order), in a similar
way as the query-match accuracy. This metric will improve Seq2SQL’s accuracy by 1 point, which
obeys our observation on the overall query-match accuracy of Seq2SQL. However, we still observe
that the SQLNet can outperform Seq2SQL by a large margin. From the break-down analysis, we
can observe that the improvement from the usage of a sequence-to-set architecture is the largest to
achieve around 6 points. The column attention further improves a sequence-to-set only model by 3
points, while allowing training word embedding gives another 2 points’ improvement.
Note that the improvement on the SELECT prediction is around 2 points. The improvements from
two clauses add up to the 13 points to 14 points improvements in total.

4.4 T HE “ ORDER - MATTERS ” EFFECT

In this section, we document the “order-matters” phenomenon in the WikiSQL dataset, and study its
impact on the Seq2SQL model. In WikiSQL, the constraints in the WHERE clause are listed in the
ascending order of the column ID. We say a description-query pair is disordered if a clause in the
description corresponding to a larger column ID appears earlier than another one corresponding to
a smaller column ID. One example is shown in Figure 3. Note, in this example, a user is also likely
to ask “When ajax was the winner and twente came in third place what are the seasons?”. Thus, no
matter how we assign column IDs, one of the two descriptions is disordered with respect to the SQL
query.
We calculate the number of disordered pairs as follows. In fact, for each clause col op val, since val
must appear in the description as a substring, we can locate its first appearance in the description.
By examining these locations for different constraints matches the order of the column IDs, we can
examine whether a the description-query pair is disordered or not. From the WikiSQL dataset, we
can calculate the percentage of disordered pairs as follows:

training dev test

% of disordered samples 6.2% 6.5% 6.2%
% of non-disordered samples (≥ 2 columns) 23.7% 24.4% 24.7%
% of samples with zero or one column 70.1% 69.1% 69.1%

In this table, we split all samples into three parts: (1) disordered samples; (2) non-disordered samples
involving at least 2 columns in the WHERE clause; and (3) the rest containing at most one column
in the WHERE clause. In fact, the second third does not cause the “order-matters” problem at all.
Although we observe non-disordered samples are majority, the disordered ones constitute a non-
negligible portion, i.e., > 6%. Therefore, this portion may likely cause the “order-matters” problem.
We then examine whether the disordered samples indeed contributes to the “order-matters” phe-
nomenon, and whether our proposed seq2set technique helps to mitigate the issue. To do this, we
evaluate the effectiveness of Seq2SQL (our implementation) and the SQLNet (using sequence-to-
set only) on the three parts. We train both models on the entire dataset. Then we split the dev set
and the test set into three parts in the same way as above.
We report the results in Table 3. We observe that our sequence-to-set technique improves the per-
formance on the disordered subset by a larger margin (i.e., 9.5% to 16.5%) on both the dev set and

10
disordered non-disordered (≥ 2 col) rest
Accqm Accex Accqm Accex Accqm Accex
Seq2SQL (ours) 19.5% 33.6% 35.9% 53.9% 62.8% 67.5%
dev SQLNet (Seq2set) 33.1% 43.1% 44.0% 55.4% 65.3% 69.7%
Improvement 13.6% 9.5% 8.1% 1.5% 2.5% 2.2%
Seq2SQL (ours) 18.8% 34.5% 33.4% 50.1% 61.1% 66.5%
test SQLNet (Seq2set) 35.3% 46.2% 43.1% 54.0% 63.7% 68.9%
Improvement 16.5% 11.7% 9.7% 3.9% 2.6% 2.4%

Table 3: Evaluation results on disordered set versus non-disordered set. Accqm , and Accex indicate
the query-match and the execution accuracy respectively.

dev test
Acclf Accqm Accex Acclf Accqm Accex
Seq2SQL (ours) 54.5% 55.6% 63.8% 54.8% 55.6% 63.9%
SQLNet - 65.5% 71.5% - 64.4% 70.3%

Table 4: Overall result on the WikiSQL variant dataset.

the test set than on the other two parts. This observation clearly shows that our sequence-to-set
technique can mitigate the “order-matters” effect.
On the other hand, we observe that on the non-disordered set with two or more columns in the
WHERE constraints, the Accqm scores are also boosted by more than 8 percentage points. We also
attribute this to the “order-matters” effect: the disordered samples in the training set actually hurt
the performance of a Seq2SQL model on the non-disordered samples; and since the sequence-to-
set technique does not suffer from the “order-matters” issue at all, its advantage over Seq2SQL is
large. The Accex is relatively small, i.e., 1.5% to 3.9%. We observe that for some descriptions the
Seq2SQL model actually generates some queries with fewer constraints in the WHERE clause than
the ground truth; but since both queries produce the same results, they are counted as correct when
computing Accex .
Finally, on all samples with one or less constraints in the WHERE clause, the improvements of SQL-
Net over Seq2SQL is not significant. Such samples do not have the “order-matters” issue at all.
We attribute the improvement to the fact that SQLNet generates the constraint following the sketch,
which leverages more grammar information than a sequence-to-sequence model used in Seq2SQL.
This result provides further evidence to show that the improvement on the other two parts are mainly
related to the “order-matters” issue.

4.5 E VALUATION ON A VARIANT OF THE W IKI SQL TASK

In practice, a machine learning model is frequently retrained periodically to reflect the latest dataset.
Therefore, it is more often that when a model is trained, the table in the test set is already seen in the
training set. The original WikiSQL dataset is split so that the training, dev, and test sets are disjoint
in their sets of tables, and thus it does not approximate this application scenario very well.
To better understand different model’s performance in this alternative application scenario, we re-
shuffle the data, so that all the tables appear at least once in the training set.
On this new dataset, we evaluate both SQLNet and Seq2SQL, and the results are presented in
Table 4. We observe that all metrics of both approaches are improved. We attribute this to that
all tables in the test set are observed by the models in the training set. This observation meets our
expectation. The improvement of SQLNet over Seq2SQL (our implementation) remains the same
across different metrics.

11
5 R ELATED W ORK
The study of translating natural language into SQL queries has a long history (Warren & Pereira,
1982; Androutsopoulos et al., 1993; 1995; Popescu et al., 2003; 2004; Li et al., 2006; Giordani &
Moschitti, 2012; Zhang & Sun, 2013; Li & Jagadish, 2014; Wang et al., 2017). Earlier work focuses
on specific databases and requires additional customization to generalize to each new database.
Recent work considers mitigating this issue by incorporating users’ guidance (Li & Jagadish, 2014;
Iyer et al., 2017). In contrast, SQLNet does not rely on human in the loop. Another direction
incorporates the data in the table as an additional input (Pasupat & Liang, 2015; Mou et al., 2016).
We argue that such an approach may suffer scalability and privacy issues when handling large scale
user databases.
SQLizer (Yaghmazadeh et al., 2017) is a related work handling the same application scenario. SQL-
izer is also a sketch-based approach so that it is not restricted to any specific database. Different
from our work, SQLizer (Yaghmazadeh et al., 2017) relies on an off-the-shelf semantic parser (Be-
rant et al., 2013; Manning et al., 2014) to translate a natural language question into a sketch, and
then employs programming language techniques such as type-directed sketch completion and auto-
matic repairing to iteratively refine the sketch into the final query. Since SQLizer does not require
database-specific training and its code is not available, it is unclear how SQLizer will perform on the
WikiSQL task. In this work, we focus on neural network approaches to handle the NL2SQL tasks.
Seq2SQL (Zhong et al., 2017) is the most relevant work and achieves the state-of-the-art on the
WikiSQL task. We use Seq2SQL as the baseline in our work. Our SQLNet approach enjoys all the
benefits of Seq2SQL, such as generalizability to an unseen schema and overcoming the inefficiency
of a sequence-to-sequence model. Our approach improves over Seq2SQL in that by proposing a
sequence-to-set-based approach, we eliminate the sequence-to-sequence structure when the order
does not matter, so that we do not require reinforcement learning at all. These techniques enable
SQLNet to outperform Seq2SQL by 9 points to 13 points.
The problem to parse a natural language to SQL queries can be considered as a special instance to
the more generic semantic parsing problem. There have been many works considering parsing a
natural language description into a logical form (Zelle & Mooney, 1996; Wong & Mooney, 2007;
Zettlemoyer & Collins, 2007; 2012; Artzi & Zettlemoyer, 2011; 2013; Cai & Yates, 2013; Reddy
et al., 2014; Liang et al., 2011; Quirk et al., 2015; Chen et al., 2016). Although they are not handling
the SQL generation problem, we observe that most of them need to be fine-tuned to the specific
domain of interest, and may not generalize.
Dong & Lapata (2016) provide a generic approach, i.e., a sequence-to-tree model, to handle the
semantic parsing problem, which yields the state-of-the-art results on many tasks. This approach
is evaluated in (Zhong et al., 2017), and it has been demonstrated less effective than the Seq2SQL
approach. Thus, we do not include it in our comparison.
Some earlier approaches Wong & Mooney (2007) propose using Synchronous Context-Free Gram-
mars (SCFG) to construct a semantic parser. However, a production rule in a SCFG is essentially a
translation rule from the source language (e.g., English) to the target (e.g., SQL). Therefore, such
an approach requires much more efforts to construct a SCFG and is less robust. On the other hand,
our sketch is only a subset of SQL, which is simple to derive from the full SQL specification. Thus,
our approach is more practical than previous SCFG-based approaches. Some recent works Rabi-
novich et al. (2017) propose incorporation grammar information during program generation. These
approaches are similar to the Seq2SQL baseline that we have compared in our paper. In fact, we con-
sider Seq2SQL as a special version of Rabinovich et al. (2017) that incorporates pointer networks
for WHERE clause generation and reinforcement learning to boost the performance.

6 C ONCLUSION
In this paper, we propose an approach, SQLNet, to handle an NL2SQL task. We observe that
all existing approaches employing a sequence-to-sequence model suffer from the “order-matters”
problem when the order does not matter. Previous attempts using reinforcement learning to solve this
issue bring only a small improvement, e.g., by around 2 points. In our work, SQLNet fundamentally
solves the “order-matters” problem by employing a sequence-to-set model to generate SQL queries

12
when order does not matter. We further introduce the column attention mechanism, which can
further boost a sequence-to-set model’s performance. In total, we observe that our SQLNet system
can improve over the prior art, i.e., Seq2SQL, by a large margin ranging from 9 points to 13 points
on various metrics. This demonstrates that our approach can effectively solve the “order-matters”
problem, and shed new light on novel solutions to structural generation problems when order does
not matter.

R EFERENCES
Rajeev Alur, Rastislav Bodik, Garvit Juniwal, Milo MK Martin, Mukund Raghothaman, Sanjit A
Seshia, Rishabh Singh, Armando Solar-Lezama, Emina Torlak, and Abhishek Udupa. Syntax-
guided synthesis. In Formal Methods in Computer-Aided Design (FMCAD), 2013, pp. 1–8. IEEE,
2013.
I Androutsopoulos, G Ritchie, and P Thanisch. Masque/sql an efficient and portable natural language
query interface for relational databases. Database technical paper, Department of AI, University
of Edinburgh, 1993.
Ion Androutsopoulos, Graeme D Ritchie, and Peter Thanisch. Natural language interfaces to
databases–an introduction. Natural language engineering, 1(1):29–81, 1995.
Yoav Artzi and Luke Zettlemoyer. Bootstrapping semantic parsers from conversations. In Pro-
ceedings of the conference on empirical methods in natural language processing, pp. 421–432.
Association for Computational Linguistics, 2011.
Yoav Artzi and Luke Zettlemoyer. Weakly supervised learning of semantic parsers for mapping
instructions to actions. Transactions of the Association for Computational Linguistics, 1:49–62,
2013. ISSN 2307-387X.
Jonathan Berant, Andrew Chou, Roy Frostig, and Percy Liang. Semantic parsing on freebase from
question-answer pairs. In EMNLP, 2013.
James Bornholt, Emina Torlak, Dan Grossman, and Luis Ceze. Optimizing synthesis with metas-
ketches. 51(1):775–788, 2016.
Qingqing Cai and Alexander Yates. Large-scale semantic parsing via schema matching and lexicon
extension. In ACL, 2013.
Xinyun Chen, Chang Liu, Richard Shin, Dawn Song, and Mingcheng Chen. Latent attention for
if-then program synthesis. arXiv preprint arXiv:1611.01867, 2016.
Deborah A Dahl, Madeleine Bates, Michael Brown, William Fisher, Kate Hunicke-Smith, David
Pallett, Christine Pao, Alexander Rudnicky, and Elizabeth Shriberg. Expanding the scope of the
atis task: The atis-3 corpus. In Proceedings of the workshop on Human Language Technology,
pp. 43–48. Association for Computational Linguistics, 1994.
Li Dong and Mirella Lapata. Language to logical form with neural attention. In ACL, 2016.
Facebook. Pytorch, 2017. URL http://pytorch.org/.
Alessandra Giordani and Alessandro Moschitti. Translating questions to SQL queries with genera-
tive parsers discriminatively reranked. In COLING, 2012.
Ronghang Hu, Jacob Andreas, Marcus Rohrbach, Trevor Darrell, and Kate Saenko. Learn-
ing to reason: End-to-end module networks for visual question answering. arXiv preprint
arXiv:1704.05526, 2017.
Srinivasan Iyer, Ioannis Konstas, Alvin Cheung, Jayant Krishnamurthy, and Luke Zettlemoyer.
Learning a neural semantic parser from user feedback. CoRR, abs/1704.08760, 2017.
Noah Johnson, Joseph P Near, and Dawn Song. Practical differential privacy for sql queries using
elastic sensitivity. arXiv preprint arXiv:1706.09479, 2017.

13
Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. CoRR,
abs/1412.6980, 2014.
Fei Li and HV Jagadish. Constructing an interactive natural language interface for relational
databases. Proceedings of the VLDB Endowment, 8(1):73–84, 2014.
Yunyao Li, Huahai Yang, and HV Jagadish. Constructing a generic natural language interface for
an xml database. In EDBT, volume 3896, pp. 737–754. Springer, 2006.
Percy Liang, Michael I Jordan, and Dan Klein. Learning dependency-based compositional seman-
tics. pp. 590–599, 2011.
Christopher D. Manning, Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J. Bethard, and David
McClosky. The Stanford CoreNLP natural language processing toolkit. In Association for Com-
putational Linguistics (ACL) System Demonstrations, pp. 55–60, 2014.
Lili Mou, Zhengdong Lu, Hang Li, and Zhi Jin. Coupling distributed and symbolic execution for
natural language queries. arXiv preprint arXiv:1612.02741, 2016.
Emilio Parisotto, Abdel-rahman Mohamed, Rishabh Singh, Lihong Li, Dengyong Zhou, and Push-
meet Kohli. Neuro-symbolic program synthesis. In ICLR, 2017.
Panupong Pasupat and Percy Liang. Compositional semantic parsing on semi-structured tables.
CoRR, abs/1508.00305, 2015.
Jeffrey Pennington, Richard Socher, and Christopher Manning. Glove: Global vectors for word
representation. In Proceedings of the 2014 conference on empirical methods in natural language
processing (EMNLP), pp. 1532–1543, 2014.
Ana-Maria Popescu, Oren Etzioni, and Henry Kautz. Towards a theory of natural language interfaces
to databases. In Proceedings of the 8th international conference on Intelligent user interfaces, pp.
149–157. ACM, 2003.
Ana-Maria Popescu, Alex Armanasu, Oren Etzioni, David Ko, and Alexander Yates. Modern natural
language interfaces to databases: Composing statistical parsing with semantic tractability. In Pro-
ceedings of the 20th international conference on Computational Linguistics, pp. 141. Association
for Computational Linguistics, 2004.
P. J. Price. Evaluation of spoken language systems: the atis domain. In Speech and Natural Lan-
guage: Proceedings of a Workshop Held at Hidden Valley, Pennsylvania, June 24-27,1990, pp.
91–95, 1990.
Chris Quirk, Raymond J Mooney, and Michel Galley. Language to code: Learning semantic parsers
for if-this-then-that recipes. In ACL (1), pp. 878–888, 2015.
Maxim Rabinovich, Mitchell Stern, and Dan Klein. Abstract syntax networks for code generation
and semantic parsing. arXiv preprint arXiv:1704.07535, 2017.
Siva Reddy, Mirella Lapata, and Mark Steedman. Large-scale semantic parsing without question-
answer pairs. Transactions of the Association for Computational Linguistics, 2:377–392, 2014.
Armando Solar-Lezama, Liviu Tancau, Rastislav Bodik, Sanjit Seshia, and Vijay Saraswat. Combi-
natorial sketching for finite programs. ACM SIGOPS Operating Systems Review, 40(5):404–415,
2006.
Huan Sun, Hao Ma, Xiaodong He, Wen-tau Yih, Yu Su, and Xifeng Yan. Table cell search for
question answering. In Proceedings of the 25th International Conference on World Wide Web, pp.
771–782. International World Wide Web Conferences Steering Committee, 2016.
Lappoon R. Tang and Raymond J. Mooney. Using multiple clause constructors in inductive logic
programming for semantic parsing. In Proceedings of the 12th European Conference on Machine
Learning, pp. 466–477, Freiburg, Germany, 2001.
Oriol Vinyals, Meire Fortunato, and Navdeep Jaitly. Pointer networks. In Advances in Neural
Information Processing Systems, pp. 2692–2700, 2015a.

14
Oriol Vinyals, Łukasz Kaiser, Terry Koo, Slav Petrov, Ilya Sutskever, and Geoffrey Hinton. Gram-
mar as a foreign language. In NIPS, 2015b.
Oriol Vinyals, Samy Bengio, and Manjunath Kudlur. Order matters: Sequence to sequence for sets.
In ICLR, 2016.
Chenglong Wang, Alvin Cheung, and Rastislav Bodik. Synthesizing highly expressive sql queries
from input-output examples. In Proceedings of the 38th ACM SIGPLAN Conference on Program-
ming Language Design and Implementation, pp. 452–466. ACM, 2017.
David HD Warren and Fernando CN Pereira. An efficient easily adaptable system for interpreting
natural language queries. Computational Linguistics, 8(3-4):110–122, 1982.
Yuk Wah Wong and Raymond Mooney. Learning synchronous grammars for semantic parsing with
lambda calculus. In Proceedings of the 45th Annual Meeting of the Association of Computational
Linguistics, pp. 960–967, Prague, Czech Republic, June 2007. Association for Computational
Linguistics.
Navid Yaghmazadeh, Yuepeng Wang, Isil Dillig, and Thomas Dillig. Sqlizer: Query synthesis from
natural language. In OOPSLA, 2017.
Zichao Yang, Phil Blunsom, Chris Dyer, and Wang Ling. Reference-aware language models. arXiv
preprint arXiv:1611.01628, 2016.
Pengcheng Yin, Zhengdong Lu, Hang Li, and Ben Kao. Neural enquirer: Learning to query tables
with natural language. arXiv preprint arXiv:1512.00965, 2015.
John M Zelle and Raymond J Mooney. Learning to parse database queries using inductive logic
programming. In Proceedings of the national conference on artificial intelligence, pp. 1050–
1055, 1996.
Luke S. Zettlemoyer and Michael Collins. Online learning of relaxed ccg grammars for parsing to
logical form. In In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural
Language Processing and Computational Natural Language Learning (EMNLP-CoNLL-2007,
pp. 678–687, 2007.
Luke S Zettlemoyer and Michael Collins. Learning to map sentences to logical form: Structured
classification with probabilistic categorial grammars. arXiv preprint arXiv:1207.1420, 2012.
Sai Zhang and Yuyin Sun. Automatically synthesizing sql queries from input-output examples. In
Automated Software Engineering (ASE), 2013 IEEE/ACM 28th International Conference on, pp.
224–234. IEEE, 2013.
Victor Zhong, Caiming Xiong, and Richard Socher. Seq2sql: Generating structured queries from
natural language using reinforcement learning. arXiv preprint arXiv:1709.00103, 2017.

P66 B1 M5 Lufthansa PDF
100% (1)
P66 B1 M5 Lufthansa PDF
196 pages
Main Idea and Theme
100% (1)
Main Idea and Theme
44 pages
QPDEC2010 Sem 1
No ratings yet
QPDEC2010 Sem 1
16 pages
1711 04436v1
No ratings yet
1711 04436v1
13 pages
NLQ 262290 5914375 NLQ
No ratings yet
NLQ 262290 5914375 NLQ
8 pages
Nl2sql
No ratings yet
Nl2sql
12 pages
Seq 2 SQL
No ratings yet
Seq 2 SQL
13 pages
A Survey On Text-to-SQL Parsing: Concepts, Methods, and Future Directions
No ratings yet
A Survey On Text-to-SQL Parsing: Concepts, Methods, and Future Directions
19 pages
2504.08600v1
No ratings yet
2504.08600v1
18 pages
Enhancing Text-To-SQL Capabilities of Large Language Models
No ratings yet
Enhancing Text-To-SQL Capabilities of Large Language Models
22 pages
Semantic Parsing For Complex Data Retrieval: Targeting Query Plans vs. SQL For No-Code Access To Relational Databases
No ratings yet
Semantic Parsing For Complex Data Retrieval: Targeting Query Plans vs. SQL For No-Code Access To Relational Databases
17 pages
In Context Reinforcement Learning Based Retrieval Augmented Generation for Text to SQL
No ratings yet
In Context Reinforcement Learning Based Retrieval Augmented Generation for Text to SQL
8 pages
Structure-guided Large Language Models For
No ratings yet
Structure-guided Large Language Models For
24 pages
STaR SQL Self Taught Reasoner for Text to SQL
No ratings yet
STaR SQL Self Taught Reasoner for Text to SQL
11 pages
new
No ratings yet
new
29 pages
A Natural Language Interface To Relational Databases Using An Online Analytic Processing Hypercube
No ratings yet
A Natural Language Interface To Relational Databases Using An Online Analytic Processing Hypercube
18 pages
Syntax_and_Relation_Enhanced_Query_Generation_for_
No ratings yet
Syntax_and_Relation_Enhanced_Query_Generation_for_
12 pages
SQLPaLM
No ratings yet
SQLPaLM
61 pages
Project Report -4
No ratings yet
Project Report -4
5 pages
3583140.3583165
No ratings yet
3583140.3583165
14 pages
Large Language Model Enhanced Text-to-SQL Generation- A Survey
No ratings yet
Large Language Model Enhanced Text-to-SQL Generation- A Survey
18 pages
Few-Shot Text-to-SQL Translation Using Structure
No ratings yet
Few-Shot Text-to-SQL Translation Using Structure
28 pages
research paper
No ratings yet
research paper
32 pages
2409.16751v1
No ratings yet
2409.16751v1
18 pages
RESDSQL
No ratings yet
RESDSQL
9 pages
Plan Then Generate Controlled Data to-Text Generation via Planning
No ratings yet
Plan Then Generate Controlled Data to-Text Generation via Planning
15 pages
NLIDB_PPT(33247)
No ratings yet
NLIDB_PPT(33247)
18 pages
670e4e23bdd7d170839060aa2023.findings-emnlp.227
No ratings yet
670e4e23bdd7d170839060aa2023.findings-emnlp.227
32 pages
2403.09732v4
No ratings yet
2403.09732v4
15 pages
db gpt hub 2024
No ratings yet
db gpt hub 2024
17 pages
10 LLM Based SQL Generation Wi
No ratings yet
10 LLM Based SQL Generation Wi
6 pages
Formation of SQL From Natural Language Query Using NLP: Uma M Sneha V Sneha G
No ratings yet
Formation of SQL From Natural Language Query Using NLP: Uma M Sneha V Sneha G
5 pages
Advanced SAS Interview Questions You'll Most Likely Be Asked
From Everand
Advanced SAS Interview Questions You'll Most Likely Be Asked
Vibrant Publishers
No ratings yet
Base paper
No ratings yet
Base paper
10 pages
Sequential_Feature_Augmentation_for_Robust_Text-to-SQL
No ratings yet
Sequential_Feature_Augmentation_for_Robust_Text-to-SQL
7 pages
Base Paper
No ratings yet
Base Paper
6 pages
2024 Lrec-Main 539
No ratings yet
2024 Lrec-Main 539
19 pages
Mark
No ratings yet
Mark
3 pages
2.1 Review of Literature: "SQL Generation and PL/SQL Execution From Natural Language Processing"
No ratings yet
2.1 Review of Literature: "SQL Generation and PL/SQL Execution From Natural Language Processing"
11 pages
E-SQL Direct Schema Linking via Question Enrichment In
No ratings yet
E-SQL Direct Schema Linking via Question Enrichment In
20 pages
RATSQL
No ratings yet
RATSQL
12 pages
SCHEMA-R1: A REASONING TRAINING APPROACH FOR SCHEMA LINKING IN TEXT-TO-SQL TASK
No ratings yet
SCHEMA-R1: A REASONING TRAINING APPROACH FOR SCHEMA LINKING IN TEXT-TO-SQL TASK
11 pages
T2S Retrieval
No ratings yet
T2S Retrieval
16 pages
tr#252
No ratings yet
tr#252
98 pages
MCS-SQL: Leveraging Multiple Prompts and Multiple-Choice Selection For Text-to-SQL Generation
No ratings yet
MCS-SQL: Leveraging Multiple Prompts and Multiple-Choice Selection For Text-to-SQL Generation
17 pages
PS-SQL_Phrase-based_Schema-Linking_with_Pre-trained_Language_Models_for_Text-to-SQL_Parsing
No ratings yet
PS-SQL_Phrase-based_Schema-Linking_with_Pre-trained_Language_Models_for_Text-to-SQL_Parsing
5 pages
Project Report -3
No ratings yet
Project Report -3
3 pages
Chess Contextual Harnessing for Efficient SQL Synthesis
No ratings yet
Chess Contextual Harnessing for Efficient SQL Synthesis
41 pages
Chase SQL
No ratings yet
Chase SQL
30 pages
Translation and Optimization of Logic Queries, The Algebraic Approach
No ratings yet
Translation and Optimization of Logic Queries, The Algebraic Approach
8 pages
din sql research paper
No ratings yet
din sql research paper
34 pages
Natural Language To SQL in Low-Code Platforms
No ratings yet
Natural Language To SQL in Low-Code Platforms
11 pages
HCteam_IT_Proposal
No ratings yet
HCteam_IT_Proposal
15 pages
JavaScript Data Structures Explained: A Practical Guide with Examples
From Everand
JavaScript Data Structures Explained: A Practical Guide with Examples
William E. Clark
No ratings yet
LLM Based Survey Text 1741015993
No ratings yet
LLM Based Survey Text 1741015993
20 pages
preprints202402.0693.v1
No ratings yet
preprints202402.0693.v1
9 pages
Robustfill: Neural Program Learning Under Noisy I/O: Gettoken (Alpha, - 1) - ,' - ' - Tocase (Proper, Gettoken (Alpha, 1) )
No ratings yet
Robustfill: Neural Program Learning Under Noisy I/O: Gettoken (Alpha, - 1) - ,' - ' - Tocase (Proper, Gettoken (Alpha, 1) )
18 pages
CCECE.2019.8861892
No ratings yet
CCECE.2019.8861892
4 pages
2302.13007v3
No ratings yet
2302.13007v3
12 pages
NL2SQL_handbook
No ratings yet
NL2SQL_handbook
181 pages
Solid-SQL Enhanced Schema-linking Based in-context Learning For
No ratings yet
Solid-SQL Enhanced Schema-linking Based in-context Learning For
11 pages
Logical Reasoning
No ratings yet
Logical Reasoning
12 pages
Two End-To-End Quantum-Inspired Deep Neural Networks for Text Classification
No ratings yet
Two End-To-End Quantum-Inspired Deep Neural Networks for Text Classification
11 pages
Neelima-01.01.25 to 17.07.25
No ratings yet
Neelima-01.01.25 to 17.07.25
1 page
On_Duty_report_01.07.2025-15.07.2025
No ratings yet
On_Duty_report_01.07.2025-15.07.2025
188 pages
apply
No ratings yet
apply
1 page
cnu poly
No ratings yet
cnu poly
3 pages
require doc
No ratings yet
require doc
52 pages
25 Jp Plag Report
No ratings yet
25 Jp Plag Report
2 pages
After Five Yr
No ratings yet
After Five Yr
1 page
Solar Scopus
No ratings yet
Solar Scopus
7 pages
achievements
No ratings yet
achievements
1 page
SFPE As Per
No ratings yet
SFPE As Per
8 pages
Ijcna 2022 o 04
No ratings yet
Ijcna 2022 o 04
14 pages
18IF805
No ratings yet
18IF805
99 pages
CN_LabMCQ
No ratings yet
CN_LabMCQ
6 pages
PTS Session Plan Moni TM1
100% (1)
PTS Session Plan Moni TM1
10 pages
Computer MCQ 1000
No ratings yet
Computer MCQ 1000
125 pages
Spyware: Computer Hacking
No ratings yet
Spyware: Computer Hacking
14 pages
Log
No ratings yet
Log
20 pages
OEM Folder For KSS Readme
No ratings yet
OEM Folder For KSS Readme
5 pages
Electro Techniques
No ratings yet
Electro Techniques
2 pages
Module 2 PPP
No ratings yet
Module 2 PPP
37 pages
The Official PRINCE2 Agile Accreditor Sample Examination Papers Terms of Use
No ratings yet
The Official PRINCE2 Agile Accreditor Sample Examination Papers Terms of Use
14 pages
Mesan All Catalogue
No ratings yet
Mesan All Catalogue
450 pages
Hima OPC Server Manual
100% (3)
Hima OPC Server Manual
36 pages
Lec 2
No ratings yet
Lec 2
28 pages
Git HG Rosetta Stone
No ratings yet
Git HG Rosetta Stone
1 page
How To Choose Right CATV Modulator
No ratings yet
How To Choose Right CATV Modulator
7 pages
Docslide - Us - Stages of Management Consulting Engagement Part II
No ratings yet
Docslide - Us - Stages of Management Consulting Engagement Part II
17 pages
Pre-Assessment Test for the Deso Tss Training 2025 (1)
No ratings yet
Pre-Assessment Test for the Deso Tss Training 2025 (1)
20 pages
Techo-Commercial Queries
No ratings yet
Techo-Commercial Queries
3 pages
GDP Write Up PDF
100% (2)
GDP Write Up PDF
23 pages
Nikon Metrology API v3.7 - Reference Manual
No ratings yet
Nikon Metrology API v3.7 - Reference Manual
30 pages
OS5000 Compass Manual
No ratings yet
OS5000 Compass Manual
38 pages
Brother ADS 1000W-1500W User Manual PDF
No ratings yet
Brother ADS 1000W-1500W User Manual PDF
246 pages
Practical-starting-pages
No ratings yet
Practical-starting-pages
3 pages
Software Trimble RealWorks Vs TBC Comparativa 0520
No ratings yet
Software Trimble RealWorks Vs TBC Comparativa 0520
21 pages
70-247Configuring and Deploying a Private Cloud With System Center 2012
No ratings yet
70-247Configuring and Deploying a Private Cloud With System Center 2012
728 pages
Infoistic Company Profile PDF
No ratings yet
Infoistic Company Profile PDF
11 pages
An Introduction To MATLAB: 10.3.1 Creating Matrices
No ratings yet
An Introduction To MATLAB: 10.3.1 Creating Matrices
28 pages
MPS110-120 Users Manual-2019
No ratings yet
MPS110-120 Users Manual-2019
22 pages