JsonServer User Manual
JsonServer User Manual
JsonServer User Manual
Design philosophy
JSON data is popular among web applications. While today’s solutions for
managing JSON have been mostly centered on the concept of NoSQL, we
adopt a different approach, advocating that old SQL technologies have great
values in managing JSON data and can deliver abundant functionality and
exceptional performance. The goal of JSON Server is to augment SQL
databases with efficient JSON processing, so that SQL users enjoy superior
performance and functionality, compared with other NoSQL alternatives.
The second observation that drives our design is that SQL databases are
products of years of research and development. When we examined today’s
JSON offerings, we found that most of their functions have counterparts in
SQL databases. Missing functions can too be found in SQL. We believe that
it is a waste to abandon mature SQL technologies and re-invent them in the
context of JSON. Hence, JSON Server is built on top of SQL databases,
aiming to re-use SQL DB’s functions whenever possible. By using SQL
functions appropriately, the physical data representation and runtime
behavior of JSON Server closely resemble those of native JSON databases.
In fact, SQL databases provide many more sophisticated optimizations that
have been neglected by native JSON databases, giving JSON Server
unparallel performance advantages.
Features
JSON Server is a DLL library through which you manage JSON documents
in SQL Server (version 2012 and onward) or MariaDB (10.0 and onward).
It provides features a standard JSON database is expected to have. In
addition, since JSON Server relies on SQL DBs, it inherits many features
in the relational world that are rarely supported in native JSON databases.
2
JSON Server offers the following major features:
SQL-related features. JSON Server inherits many features from the SQL
database it connects to. Below is a short list of features that are crucial to
administration tasks:
4
JSON Databases and Collections
JSON Server uses SQL databases to host JSON data. JSON Server connects
to an existing SQL database using a connection string. It does not provide
any API’s to manage the database. Any database-level management, such
as creating the database, setting the database properties and adding user
accounts, should be done by executing SQL statements through the
corresponding console of SQL Server and MariaDB.
JSON Server maintains all data and meta-data using SQL objects such as
tables, indexes and stored procedures. These objects are visible to SQL
users with appropriate permissions. As a middleware, JSON Server does
not prevent you from modifying these objects. It is advised that you apply
database operations, such as replication and backup, to the entire database
and do not change these SQL objects. Doing so will result in an
inconsistent/corrupted state of data, from which JSON Server is not able to
recover.
JsonServerConnection jdb =
new JsonServerConnection(connectionString);
try {
// Connects to a database. Creates objects needed by
// JSON Server if they do not exist.
jdb.Open(true);
}
catch(DatabaseException e) {
// Exception handling goes here
}
JsonServerConnection jdb =
new JsonServerConnection(connectionString,DatabaseType.Mariadb);
try {
// Connects to a database. Creates objects needed by
// JSON Server if they do not exist.
jdb.Open(true);
}
catch(DatabaseException e) {
// Exception handling goes here
}
Note that when connecting to MariaDB, you must have the .NET driver for
MySQL installed.
6
}
catch(DatabaseException e) {
// Exception handling goes here
}
When a JSON document is inserted, JSON Server first parses the string
using a JSON parser (i.e., Json.NET). An exception will be thrown by the
JSON parser if the input string is not a valid JSON object. And the whole
transaction will be aborted.
8
catch(JsonDmlException e) {
// Exception handling goes here
}
leaves fewer footprints in the transaction log and therefore achieves better
performance than loading them separately. Further, inserting multiple
documents at a time incurs less overhead on accessing collection-level
meta-data. While the optimal parameter may be data-wise, a few thousand
documents in one invocation is a reasonable number.
10
Querying JSON
Once JSON documents are loaded to the database, you use a query language
to retrieve them. JSON Server supports 3 query languages: MongoDB query
language, Azure DocumentDB query language and JSON Server query
language. The former two are languages of today’s two pouplar JSON
stores. If you are familiar with them, you may directly go to Chapter “Query
Execution” to learn the API’s to execute queries and retrieve results.
JSON Server also provides its own query language, a language that is akin
to XQuery (and LINQ) with slight adjustments to fit the JSON semantics
and application needs. The goal is to provide high expressive power to allow
applications to express complex business logic. This chapter gives an
introduction of the language with numerous examples. It is still helpful to
review XQuery materials, as there are many excellent resources online.
To give an overview of the query syntax, the following query retrieves first
name(s) of the director(s) of “Kung Fu Panda” from movie documents:
FOR md IN (“MovieCollection”)
FOR fn IN md.director.*.firstName
WHERE md.title = “Kung Fu Panda”
SELECT fn AS DirectorFirstName
In the following, we walk through each query construct and provide detailed
examples, showing how to use the language to query JSON data to fit
various application needs. We use the following document as a document
example.
{
“movieID”:101,
“title”:“Kung Fu Panda”,
“releaseDate”:6/06/2008,
“directorList”:[
{
“directorID”:9999,
“firstName”:“John”,
“lastName”:“Stevenson”
},
{
“directorID”:7777,
“firstName”:“Mark”,
“lastName”:“Osborne”,
“DOB”:9/17/1970
}
]
}
Collection variables
12
A collection variable is a variable defined in a FOR clause and bound to
the JSON documents in the specified collection. The collection name is
given in the quotes following the IN keyword. A
JsonQueryCompileException exception will be thrown if the
specified collection name does not exist in the database.
Path expressions
Path expressions are a central construct to locate fields/values inside JSON
documents. A path expression starts with a head variable, followed by a
sequence of field names separated by dots. The field-name sequence is
evaluated w.r.t. a collection of context objects, objects that are binding
values of the head variable of the path. Starting from a context object, the
segment [.nodeName] navigates to all the fields of the object and selects
those whose names match nodeName. The nodeName can be wildcard *,
meaning any field is a match.
FOR m1 IN (“MovieCollection”)
FOR fn IN md.directorList.*.firstName
WHERE md.title = “Kung Fu Panda”
SELECT fn AS DirectorFirstName
This query returns movies directed by someone whose first name is “John”.
The LET clause is a clause defining temporary variables. In this example,
variable dir is bound to the elements of the array directorList. This
temporary variable establishes the context objects of the path expression in
the WHERE clause. Semantically, this path expression is equivalent to the
one that replaces the head variable dir with its full name:
FOR md IN (“MovieCollection”)
WHERE md.directorList.*.firstName = “John”
SELECT Doc(md)
Types
The JSON data model loosely defines a few data types, such as string,
boolean and number. Due to the flexibility of JSON, a path expression may
be bound to values of different types. An example is that two documents
have the same field, but the field values are of different data types, with one
being a string and the other being a number. Such type ambiguity may lead
to unexpected results and/or query semantics. For instance, inequality
comparison has quite different semantics when applying to strings and
numbers.
In this query language, you can include type information in path expressions
to disambiguate data types. The conventional [.nodeName] navigation
in a path expression can be coupled with the type specification, in the form
of [.nodeName:$typeName]. From a context object, only the child
field whose name and type match nodeName and typeName will be
selected.
FOR md IN (“MovieCollection”)
LET dir := md.directorList.[*:$Object]
WHERE dir.firstName = “John”
14
SELECT Doc(md)
Number
Int
Long
Float
Double
Boolean
Bytes
String
Date
Array
Object
Some of the above types are from JSON, while other are defined in the
query language. The type Number is a superset of Int, Long, Float and
Double. And the type String is a superset of ShortString and
LongString. You can use whichever type that fits the need.
While the language provides type specifications, you can opt not to use
them. When type specifications are missing and there is ambiguity, JSON
Server will locate all types of data matching path expressions. The runtime
behavior is then determined by the underlying SQL database: at runtime
SQL DB will try to perform type casting when needed. We refer you to
other online resources for details on rules of type casting in SQL Server and
MariaDB. If type casting causes no issues, the query will be executed
successfully. If type casting fails, a runtime exception will be thrown by
SQL Server or MariaDB.
Joins
Joins are a cornerstone for expressing relationships. They are a product of
expressiveness and are unavoidable in expressive query languages. In JSON
Server, you can express two types of joins: structural joins and value joins.
A structural join is a join that specifies the structural relationship of two
variables in a nested document. It is implicitly described by the intersections
of the variables' path expressions. A value join specifies a relationship
between two variables based on a comparison of their values. This type of
joins is similar to relational joins in SQL and are expressed in the WHERE
clause using boolean expressions.
FOR md IN (“MovieCollection”)
LET fn := md.directorList.*.firstName
LET ln := md.directorList.*.lastName
WHERE fn = “John” AND ln = “Stevenson”
SELECT md.movieID AS MovieID
The above query matches movie documents against two predicates on the
first and last names of directors. In each iteration (or for each movie), the
two variables fn and ln are bound to first and last names of the movie’s
director(s). The subtlety is that the first and last names are not necessarily
from the same director, as directorList is an array field and fn and
ln may be bound to different array elements. A movie directed by two
directors with one named “John Doe” and the other named “Alan
Stevenson”' is also a match to the query.
FOR md IN (“MovieCollection”)
FOR dir IN md.directorList.*
LET fn := dir.firstName
LET ln := dir.lastName
WHERE fn = “John” AND ln = “Stevenson”
16
SELECT md.movieID AS MID, dir.directorID AS DID
This query finds movies directed by “John Stevenson”'. Now fn and ln are
defined inside every iteration of directors, so their binding values refer to
the same person.
In this query, the search criterion is that at least one director is more than 40
years old when the movie is released.
Nested queries
JSON Server’s query language supports nesting in two ways: EXISTS and
IN clauses, and path expressions. EXISTS and IN clauses are no strangers
to people who are familiar with SQL. The usage of path expressions in the
WHERE clause also achieve similar semantics. Specifically, when a path
expression is used as a scalar expression in the WHERE clause, it is bound
to one or more fields/values in a specific iteration. As long as one such value
satisfies the predicates, values of iterative variables in the current iteration
survive.
FOR md IN (“MovieCollection”)
WHERE md.directorList.*.firstName = “John”
SELECT Doc(md)
This query returns movies directed by someone whose first name is “John”.
While there may be more than one director of a movie, as long as one of
them is named “John”, the movie will be returned. The same query can be
expressed using a sub-query as follows:
FOR md IN (“MovieCollection”)
WHERE EXISTS (
FOR dir IN md.directorList.*
WHERE dir.firstName = “John”
SELECT "director ID":dir.directorID
)
SELECT Doc(md)
Array functions
Two fuctions associated with JSON arrays are Position() and
Array(). Position() takes input as a variable or a path expression and
outputs the position of an array element. For either input format, only if the
binding values are array members does the function have semantics. When
a binding value of the input variable/expression is not an array member, the
evaluation of the function is null.
FOR md IN (“MovieCollection”)
LET dir := md.directorList.*
WHERE EXISTS (
FOR dir in md.directorList.*
Where Position(dir)>1 AND Position(dir)<=3 AND
dir.firstName = “John”
SELECT dir
)
SELECT Doc(md)
In this example, the variable dir is bound to directors in the array field
directorList in a subquery. By using the function Pos(), the
subquery targets directors whose array positions are within a specific range.
18
i.e., Doc(dir). They are arranged by the order of Position(n)–
directors’ positions in the orginal array.
Deleting documents
JSON Server’s API’s to insert documents are simple; the inputs of the
insertion method are document string(s) and the collection name. But to
delete documents, you need a full query language to specify the matching
criteria. The DELETE statement is similar to the SELECT statement, except
that there is only one FOR clause defining a collection variable and that the
SELECT clause is replaced with the DELETE clause. By semantics,
documents bound to the collection variable and satisfying the WHERE clause
will be deleted from the database.
FOR md IN (““MovieCollection””)
LET dir := md.directorList.*
WHERE Position(dir) > 3
DELETE md
This statement deletes all movie documents that have more than 3 directors.
Query Execution
20
Executing MongoDB and Azure Document query languages
In addition to the default query language described in last chapter, JSON
Server also supports two other languages from MongoDB and Azure
DocumentDB, two popular JSON stores. To execute queries other than the
default the query language, use QueryType to indicate the type of the
query to be executed.
var q = @"db.MovieCollection.find( { title: 'Kung Fu Panda' } )";
IDataReader reader = jdb.ExecuteReader(q, QueryType.MongoDB);
while (reader.Read())
{
Console.WriteLine(reader[0]);
}
reader.Close();
Transactions
Transactions are an integral part of JSON Server. Every operation is
transactional-safe. In prior discussions, transactions are implicit; they are
committed when InsertJson(), ExecuteReader() or
ExecuteNonReader() returns. But you can control transactions
explicitly. In JSON Server, you start a transaction by instantiating a
JsonTransaction object. The object has similar methods as
IDbTransaction, such as Commit(), Rollback() and
Dispose(). This object is passed into InsertJson(),
ExecuteReader() or ExecuteNonReader() to execute the
statement within the specified transaction.
using JsonServer;
……
try
{
jdb.Open();
JSONTransaction jtx = jdb.BeginTransaction();
jdb.InsertJSON(doc1, “MovieRepo”, jtx);
jdb.InsertJSON(doc2, “MovieRepo”, jtx);
jtx.Commit();
}
catch(Exception e1) {
try {
jtx.Rollback();
}
catch(Exception e2) {
Among all operations, document insertion may change the schema of the
physical SQL objects that hold the data and the meta-data of JSON
documents. When the schema of a SQL object has been changed, the
transaction cannot proceed and JSON Server will abort this transaction
automatically. In such cases, you need to re-try the transaction. Since all
needed physical changes have been persisted, new attempts will succeed
eventually.
22
Indexes
One-dimensional indexes
A one-dimensional index is defined for binding values of a FOR variable in
a JSON query block. Since JSON variables are specified by path
expressions, this JSON variable is bound to fields or values of either
homogeneous or heterougenous types. Depending on the binding values,
one or more SQL indexes will be created physically.
FOR md IN (“MovieCollection”)
FOR fn IN md.directorList.*.firstName
CREATE INDEX fname_index ON (fn)
The statement creates an index on first names of directors of movie
documents. Similar to the DELETE statement, you execute the index
statement through ExecuteNonReader().
using JsonServer;
try {
……
string indexStatement = @“
FOR md IN (“MovieCollection”)
FOR fn IN md.directorList.*.firstName
CREATE INDEX fname_index ON fn”;
jdb.ExecuteNonQuery(indexStatement);
jdb.Close();
……
}
catch (Exception e) {
// Exception handling
}
Multi-dimensional indexes
A multi-dimensional index is an index on more than one JSON variable.
Though similar to multi-column indexes in SQL, an important limitation for
JSON is that for each combination of indexed values, they must NOT satisfy
one of the following conditions:
When one of the above conditions holds, the indexed value combinations
will be Cartesian product of binding values of individual variables,
producing many value combinations that do not physically appear in the
original JSON documents, thereby exploding index sizes and complicating
index maintenance. This is analogous to SQL indexes in that you can create
a multi-dimensional index on columns from the same table, but not on
columns from more than one table.
FOR md IN (“MovieCollection”)
FOR dir IN md.directorList.*
LET fn := dir.firstName
LET ln := dir.lastName
CREATE INDEX name_index ON (fn, ln)
This statement creates an index on first and last names of directors. Since
both fn and ln variables are bound to the same director element in the
directorList array, this two-dimensional index is legitimate.
FOR md IN (“MovieCollection”)
FOR fn := md.directorList.*.firstName
FOR ln := md.directorList.*.lastName
CREATE INDEX name_index ON (fn, ln)
Note that in this statement there is no intermediate variable dir, and fn
and ln may be bound to different directors. As a result, the (fn, ln)
pair will create first-last name combinations that do not physically appear
in JSON documents. When you try to create such an index through
ExecuteNonQuery(), a JSONQueryCompileException
exception will be thrown.
24