Skip to content

Commit de36004

Browse files
committed
Add basic JSON_TABLE() functionality
JSON_TABLE() allows JSON data to be converted into a relational view and thus used, for example, in a FROM clause, like other tabular data. Data to show in the view is selected from a source JSON object using a JSON path expression to get a sequence of JSON objects that's called a "row pattern", which becomes the source to compute the SQL/JSON values that populate the view's output columns. Column values themselves are computed using JSON path expressions applied to each of the JSON objects comprising the "row pattern", for which the SQL/JSON query functions added in 6185c97 are used. To implement JSON_TABLE() as a table function, this augments the TableFunc and TableFuncScanState nodes that are currently used to support XMLTABLE() with some JSON_TABLE()-specific fields. Note that the JSON_TABLE() spec includes NESTED COLUMNS and PLAN clauses, which are required to provide more flexibility to extract data out of nested JSON objects, but they are not implemented here to keep this commit of manageable size. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Author: Amit Langote <amitlangote09@gmail.com> Author: Jian He <jian.universality@gmail.com> Reviewers have included (in no particular order): Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
1 parent a9d6c38 commit de36004

35 files changed

+2943
-52
lines changed

doc/src/sgml/func.sgml

+334
Original file line numberDiff line numberDiff line change
@@ -18859,6 +18859,340 @@ DETAIL: Missing "]" after array dimensions.
1885918859
</tgroup>
1886018860
</table>
1886118861
</sect2>
18862+
18863+
<sect2 id="functions-sqljson-table">
18864+
<title>JSON_TABLE</title>
18865+
<indexterm>
18866+
<primary>json_table</primary>
18867+
</indexterm>
18868+
18869+
<para>
18870+
<function>JSON_TABLE</function> is an SQL/JSON function which
18871+
queries <acronym>JSON</acronym> data
18872+
and presents the results as a relational view, which can be accessed as a
18873+
regular SQL table. You can use <function>JSON_TABLE</function> inside
18874+
the <literal>FROM</literal> clause of a <literal>SELECT</literal>,
18875+
<literal>UPDATE</literal>, or <literal>DELETE</literal> and as data source
18876+
in a <literal>MERGE</literal> statement.
18877+
</para>
18878+
18879+
<para>
18880+
Taking JSON data as input, <function>JSON_TABLE</function> uses a JSON path
18881+
expression to extract a part of the provided data to use as a
18882+
<firstterm>row pattern</firstterm> for the constructed view. Each SQL/JSON
18883+
value given by the row pattern serves as source for a separate row in the
18884+
constructed view.
18885+
</para>
18886+
18887+
<para>
18888+
To split the row pattern into columns, <function>JSON_TABLE</function>
18889+
provides the <literal>COLUMNS</literal> clause that defines the
18890+
schema of the created view. For each column, a separate JSON path expression
18891+
can be specified to be evaluated against the row pattern to get an SQL/JSON
18892+
value that will become the value for the specified column in a given output
18893+
row.
18894+
</para>
18895+
18896+
<para>
18897+
The rows produced by <function>JSON_TABLE</function> are laterally
18898+
joined to the row that generated them, so you do not have to explicitly join
18899+
the constructed view with the original table holding <acronym>JSON</acronym>
18900+
data.
18901+
</para>
18902+
18903+
<para>
18904+
The syntax is:
18905+
</para>
18906+
18907+
<synopsis>
18908+
JSON_TABLE (
18909+
<replaceable>context_item</replaceable>, <replaceable>path_expression</replaceable> <optional> AS <replaceable>json_path_name</replaceable> </optional> <optional> PASSING { <replaceable>value</replaceable> AS <replaceable>varname</replaceable> } <optional>, ...</optional> </optional>
18910+
COLUMNS ( <replaceable class="parameter">json_table_column</replaceable> <optional>, ...</optional> )
18911+
<optional> { <literal>ERROR</literal> | <literal>EMPTY</literal> } <literal>ON ERROR</literal> </optional>
18912+
)
18913+
18914+
<phrase>
18915+
where <replaceable class="parameter">json_table_column</replaceable> is:
18916+
</phrase>
18917+
<replaceable>name</replaceable> FOR ORDINALITY
18918+
| <replaceable>name</replaceable> <replaceable>type</replaceable>
18919+
<optional> FORMAT JSON <optional>ENCODING <literal>UTF8</literal></optional></optional>
18920+
<optional> PATH <replaceable>path_expression</replaceable> </optional>
18921+
<optional> { WITHOUT | WITH { CONDITIONAL | <optional>UNCONDITIONAL</optional> } } <optional> ARRAY </optional> WRAPPER </optional>
18922+
<optional> { KEEP | OMIT } QUOTES <optional> ON SCALAR STRING </optional> </optional>
18923+
<optional> { ERROR | NULL | EMPTY { ARRAY | OBJECT } | DEFAULT <replaceable>expression</replaceable> } ON EMPTY </optional>
18924+
<optional> { ERROR | NULL | EMPTY { ARRAY | OBJECT } | DEFAULT <replaceable>expression</replaceable> } ON ERROR </optional>
18925+
| <replaceable>name</replaceable> <replaceable>type</replaceable> EXISTS <optional> PATH <replaceable>path_expression</replaceable> </optional>
18926+
<optional> { ERROR | TRUE | FALSE | UNKNOWN } ON ERROR </optional>
18927+
</synopsis>
18928+
18929+
<para>
18930+
Each syntax element is described below in more detail.
18931+
</para>
18932+
18933+
<variablelist>
18934+
<varlistentry>
18935+
<term>
18936+
<literal><replaceable>context_item</replaceable>, <replaceable>path_expression</replaceable> <optional> <literal>AS</literal> <replaceable>json_path_name</replaceable> </optional> <optional> <literal>PASSING</literal> { <replaceable>value</replaceable> <literal>AS</literal> <replaceable>varname</replaceable> } <optional>, ...</optional></optional></literal>
18937+
</term>
18938+
<listitem>
18939+
<para>
18940+
The input data to query (<replaceable>context_item</replaceable>),
18941+
the JSON path expression defining the query (<replaceable>path_expression</replaceable>)
18942+
with an optional name (<replaceable>json_path_name</replaceable>), and an
18943+
optional <literal>PASSING</literal> clause, which can provide data values
18944+
to the <replaceable>path_expression</replaceable>. The result of the input
18945+
data evaluation using the aforementioned elements is called the
18946+
<firstterm>row pattern</firstterm>, which is used as the source for row
18947+
values in the constructed view.
18948+
</para>
18949+
</listitem>
18950+
</varlistentry>
18951+
18952+
<varlistentry>
18953+
<term>
18954+
<literal>COLUMNS</literal> ( <replaceable>json_table_column</replaceable> <optional>, ...</optional> )
18955+
</term>
18956+
<listitem>
18957+
18958+
<para>
18959+
The <literal>COLUMNS</literal> clause defining the schema of the
18960+
constructed view. In this clause, you can specify each column to be
18961+
filled with an SQL/JSON value obtained by applying a JSON path expression
18962+
against the row pattern. <replaceable>json_table_column</replaceable> has
18963+
the following variants:
18964+
</para>
18965+
18966+
<variablelist>
18967+
<varlistentry>
18968+
<term>
18969+
<replaceable>name</replaceable> <literal>FOR ORDINALITY</literal>
18970+
</term>
18971+
<listitem>
18972+
<para>
18973+
Adds an ordinality column that provides sequential row numbering starting
18974+
from 1.
18975+
</para>
18976+
</listitem>
18977+
</varlistentry>
18978+
18979+
<varlistentry>
18980+
<term>
18981+
<literal><replaceable>name</replaceable> <replaceable>type</replaceable>
18982+
<optional><literal>FORMAT JSON</literal> <optional>ENCODING <literal>UTF8</literal></optional></optional>
18983+
<optional> <literal>PATH</literal> <replaceable>path_expression</replaceable> </optional></literal>
18984+
</term>
18985+
<listitem>
18986+
<para>
18987+
Inserts an SQL/JSON value obtained by applying
18988+
<replaceable>path_expression</replaceable> against the row pattern into
18989+
the view's output row after coercing it to specified
18990+
<replaceable>type</replaceable>.
18991+
</para>
18992+
<para>
18993+
Specifying <literal>FORMAT JSON</literal> makes it explicit that you
18994+
expect the value to be a valid <type>json</type> object. It only
18995+
makes sense to specify <literal>FORMAT JSON</literal> if
18996+
<replaceable>type</replaceable> is one of <type>bpchar</type>,
18997+
<type>bytea</type>, <type>character varying</type>, <type>name</type>,
18998+
<type>json</type>, <type>jsonb</type>, <type>text</type>, or a domain over
18999+
these types.
19000+
</para>
19001+
<para>
19002+
Optionally, you can specify <literal>WRAPPER</literal> and
19003+
<literal>QUOTES</literal> clauses to format the output. Note that
19004+
specifying <literal>OMIT QUOTES</literal> overrides
19005+
<literal>FORMAT JSON</literal> if also specified, because unquoted
19006+
literals do not constitute valid <type>json</type> values.
19007+
</para>
19008+
<para>
19009+
Optionally, you can use <literal>ON EMPTY</literal> and
19010+
<literal>ON ERROR</literal> clauses to specify whether to throw the error
19011+
or return the specified value when the result of JSON path evaluation is
19012+
empty and when an error occurs during JSON path evaluation or when
19013+
coercing the SQL/JSON value to the specified type, respectively. The
19014+
default for both is to return a <literal>NULL</literal> value.
19015+
</para>
19016+
<note>
19017+
<para>
19018+
This clause is internally turned into and has the same semantics as
19019+
<function>JSON_VALUE</function> or <function>JSON_QUERY</function>.
19020+
The latter if the specified type is not a scalar type or if either of
19021+
<literal>FORMAT JSON</literal>, <literal>WRAPPER</literal>, or
19022+
<literal>QUOTES</literal> clause is present.
19023+
</para>
19024+
</note>
19025+
</listitem>
19026+
</varlistentry>
19027+
19028+
<varlistentry>
19029+
<term>
19030+
<replaceable>name</replaceable> <replaceable>type</replaceable>
19031+
<literal>EXISTS</literal> <optional> <literal>PATH</literal> <replaceable>path_expression</replaceable> </optional>
19032+
</term>
19033+
<listitem>
19034+
<para>
19035+
Inserts a boolean value obtained by applying
19036+
<replaceable>path_expression</replaceable> against the row pattern
19037+
into the view's output row after coercing it to specified
19038+
<replaceable>type</replaceable>.
19039+
</para>
19040+
<para>
19041+
The value corresponds to whether applying the <literal>PATH</literal>
19042+
expression to the row pattern yields any values.
19043+
</para>
19044+
<para>
19045+
The specified <replaceable>type</replaceable> should have a cast from the
19046+
<type>boolean</type> type.
19047+
</para>
19048+
<para>
19049+
Optionally, you can use <literal>ON ERROR</literal> to specify whether to
19050+
throw the error or return the specified value when an error occurs during
19051+
JSON path evaluation or when coercing SQL/JSON value to the specified
19052+
type. The default is to return a boolean value
19053+
<literal>FALSE</literal>.
19054+
</para>
19055+
<note>
19056+
<para>
19057+
This clause is internally turned into and has the same semantics as
19058+
<function>JSON_EXISTS</function>.
19059+
</para>
19060+
</note>
19061+
</listitem>
19062+
</varlistentry>
19063+
</variablelist>
19064+
19065+
<note>
19066+
<para>
19067+
In each variant of <replaceable>json_table_column</replaceable> described
19068+
above, if the <literal>PATH</literal> clause is omitted, path expression
19069+
<literal>$.<replaceable>name</replaceable></literal> is used, where
19070+
<replaceable>name</replaceable> is the provided column name.
19071+
</para>
19072+
</note>
19073+
19074+
</listitem>
19075+
</varlistentry>
19076+
19077+
<varlistentry>
19078+
<term>
19079+
<literal>AS</literal> <replaceable>json_path_name</replaceable>
19080+
</term>
19081+
<listitem>
19082+
19083+
<para>
19084+
The optional <replaceable>json_path_name</replaceable> serves as an
19085+
identifier of the provided <replaceable>path_expression</replaceable>.
19086+
The name must be unique and distinct from the column names.
19087+
</para>
19088+
</listitem>
19089+
</varlistentry>
19090+
19091+
<varlistentry>
19092+
<term>
19093+
{ <literal>ERROR</literal> | <literal>EMPTY</literal> } <literal>ON ERROR</literal>
19094+
</term>
19095+
<listitem>
19096+
19097+
<para>
19098+
The optional <literal>ON ERROR</literal> can be used to specify how to
19099+
handle errors when evaluating the top-level
19100+
<replaceable>path_expression</replaceable>. Use <literal>ERROR</literal>
19101+
if you want the errors to be thrown and <literal>EMPTY</literal> to
19102+
return an empty table, that is, a table containing 0 rows. Note that
19103+
this clause does not affect the errors that occur when evaluating
19104+
columns, for which the behavior depends on whether the
19105+
<literal>ON ERROR</literal> clause is specified against a given column.
19106+
</para>
19107+
</listitem>
19108+
</varlistentry>
19109+
</variablelist>
19110+
19111+
<para>Examples</para>
19112+
19113+
<para>
19114+
In the examples that follow, the following table containing JSON data
19115+
will be used:
19116+
19117+
<programlisting>
19118+
CREATE TABLE my_films ( js jsonb );
19119+
19120+
INSERT INTO my_films VALUES (
19121+
'{ "favorites" : [
19122+
{ "kind" : "comedy", "films" : [
19123+
{ "title" : "Bananas",
19124+
"director" : "Woody Allen"},
19125+
{ "title" : "The Dinner Game",
19126+
"director" : "Francis Veber" } ] },
19127+
{ "kind" : "horror", "films" : [
19128+
{ "title" : "Psycho",
19129+
"director" : "Alfred Hitchcock" } ] },
19130+
{ "kind" : "thriller", "films" : [
19131+
{ "title" : "Vertigo",
19132+
"director" : "Alfred Hitchcock" } ] },
19133+
{ "kind" : "drama", "films" : [
19134+
{ "title" : "Yojimbo",
19135+
"director" : "Akira Kurosawa" } ] }
19136+
] }');
19137+
</programlisting>
19138+
19139+
</para>
19140+
<para>
19141+
The following query shows how to use <function>JSON_TABLE</function> to
19142+
turn the JSON objects in the <structname>my_films</structname> table
19143+
to a view containing columns for the keys <literal>kind</literal>,
19144+
<literal>title</literal>, and <literal>director</literal> contained in
19145+
the original JSON along with an ordinality column:
19146+
19147+
<programlisting>
19148+
SELECT jt.* FROM
19149+
my_films,
19150+
JSON_TABLE (js, '$.favorites[*]' COLUMNS (
19151+
id FOR ORDINALITY,
19152+
kind text PATH '$.kind',
19153+
title text PATH '$.films[*].title' WITH WRAPPER,
19154+
director text PATH '$.films[*].director' WITH WRAPPER)) AS jt;
19155+
</programlisting>
19156+
19157+
<screen>
19158+
id | kind | title | director
19159+
----+----------+--------------------------------+----------------------------------
19160+
1 | comedy | ["Bananas", "The Dinner Game"] | ["Woody Allen", "Francis Veber"]
19161+
2 | horror | ["Psycho"] | ["Alfred Hitchcock"]
19162+
3 | thriller | ["Vertigo"] | ["Alfred Hitchcock"]
19163+
4 | drama | ["Yojimbo"] | ["Akira Kurosawa"]
19164+
(4 rows)
19165+
</screen>
19166+
19167+
</para>
19168+
<para>
19169+
The following is a modified version of the above query to show the
19170+
usage of <literal>PASSING</literal> arguments in the filter specified in
19171+
the top-level JSON path expression and the various options for the
19172+
individual columns:
19173+
19174+
<programlisting>
19175+
SELECT jt.* FROM
19176+
my_films,
19177+
JSON_TABLE (js, '$.favorites[*] ? (@.films[*].director == $filter)'
19178+
PASSING 'Alfred Hitchcock' AS filter, 'Vertigo' AS filter2
19179+
COLUMNS (
19180+
id FOR ORDINALITY,
19181+
kind text PATH '$.kind',
19182+
title text FORMAT JSON PATH '$.films[*].title' OMIT QUOTES,
19183+
director text PATH '$.films[*].director' KEEP QUOTES)) AS jt;
19184+
</programlisting>
19185+
19186+
<screen>
19187+
id | kind | title | director
19188+
----+----------+---------+--------------------
19189+
1 | horror | Psycho | "Alfred Hitchcock"
19190+
2 | thriller | Vertigo | "Alfred Hitchcock"
19191+
(2 rows)
19192+
</screen>
19193+
19194+
</para>
19195+
</sect2>
1886219196
</sect1>
1886319197

1886419198
<sect1 id="functions-sequence">

src/backend/commands/explain.c

+18-3
Original file line numberDiff line numberDiff line change
@@ -4087,9 +4087,24 @@ ExplainTargetRel(Plan *plan, Index rti, ExplainState *es)
40874087
}
40884088
break;
40894089
case T_TableFuncScan:
4090-
Assert(rte->rtekind == RTE_TABLEFUNC);
4091-
objectname = "xmltable";
4092-
objecttag = "Table Function Name";
4090+
{
4091+
TableFunc *tablefunc = ((TableFuncScan *) plan)->tablefunc;
4092+
4093+
Assert(rte->rtekind == RTE_TABLEFUNC);
4094+
switch (tablefunc->functype)
4095+
{
4096+
case TFT_XMLTABLE:
4097+
objectname = "xmltable";
4098+
break;
4099+
case TFT_JSON_TABLE:
4100+
objectname = "json_table";
4101+
break;
4102+
default:
4103+
elog(ERROR, "invalid TableFunc type %d",
4104+
(int) tablefunc->functype);
4105+
}
4106+
objecttag = "Table Function Name";
4107+
}
40934108
break;
40944109
case T_ValuesScan:
40954110
Assert(rte->rtekind == RTE_VALUES);

src/backend/executor/execExpr.c

+10-1
Original file line numberDiff line numberDiff line change
@@ -2436,7 +2436,16 @@ ExecInitExprRec(Expr *node, ExprState *state,
24362436
{
24372437
JsonExpr *jsexpr = castNode(JsonExpr, node);
24382438

2439-
ExecInitJsonExpr(jsexpr, state, resv, resnull, &scratch);
2439+
/*
2440+
* No need to initialize a full JsonExprState For
2441+
* JSON_TABLE(), because the upstream caller tfuncFetchRows()
2442+
* is only interested in the value of formatted_expr.
2443+
*/
2444+
if (jsexpr->op == JSON_TABLE_OP)
2445+
ExecInitExprRec((Expr *) jsexpr->formatted_expr, state,
2446+
resv, resnull);
2447+
else
2448+
ExecInitJsonExpr(jsexpr, state, resv, resnull, &scratch);
24402449
break;
24412450
}
24422451

src/backend/executor/execExprInterp.c

+2
Original file line numberDiff line numberDiff line change
@@ -4370,6 +4370,8 @@ ExecEvalJsonExprPath(ExprState *state, ExprEvalStep *op,
43704370
break;
43714371
}
43724372

4373+
/* JSON_TABLE_OP can't happen here */
4374+
43734375
default:
43744376
elog(ERROR, "unrecognized SQL/JSON expression op %d",
43754377
(int) jsexpr->op);

0 commit comments

Comments
 (0)