Commit 5ca68d6

author
Artur Zakirov
committed
substring_similarity documentation fixes
1 parent 99e3c7b commit 5ca68d6

1 file changed

+53
-0
lines changed

doc/src/sgml/pgtrgm.sgml

Lines changed: 53 additions & 0 deletions
@@ -84,6 +84,17 @@
 identical).
 </entry>
 </row>
+<row>
+<entry><function>substring_similarity(text, text)</function><indexterm><primary>substring_similarity</primary></indexterm></entry>
+<entry><type>real</type></entry>
+<entry>
+Returns a number that indicates how similar the first string is
+to the most similar substring of the second string. The range of
+the result is zero (indicating that the two strings are completely
+dissimilar) to one (indicating that the first string is identical
+to a substring of the second string).
+</entry>
+</row>
 <row>
 <entry><function>show_trgm(text)</function><indexterm><primary>show_trgm</primary></indexterm></entry>
 <entry><type>text[]</type></entry>
@@ -111,6 +122,24 @@
 Returns the same value passed in.
 </entry>
 </row>
+<row>
+<entry><function>show_substring_limit()</function><indexterm><primary>show_substring_limit</primary></indexterm></entry>
+<entry><type>real</type></entry>
+<entry>
+Returns the current substring similarity threshold used by the <literal>&lt;%</>
+operator. This value is the minimum substring similarity between
+two phrases for them to match.
+</entry>
+</row>
+<row>
+<entry><function>set_substring_limit(real)</function><indexterm><primary>set_substring_limit</primary></indexterm></entry>
+<entry><type>real</type></entry>
+<entry>
+Sets the current substring similarity threshold that is used by
+the <literal>&lt;%</> operator. The threshold must be between
+0 and 1 (default is 0.6). Returns the same value passed in.
+</entry>
+</row>
 </tbody>
 </tgroup>
 </table>
@@ -136,6 +165,15 @@
 <function>set_limit</>.
 </entry>
 </row>
+<row>
+<entry><type>text</> <literal>&lt;%</literal> <type>text</></entry>
+<entry><type>boolean</type></entry>
+<entry>
+Returns <literal>true</> if its arguments have a substring similarity
+that is greater than the current substring similarity threshold set by
+<function>set_substring_limit</>.
+</entry>
+</row>
 <row>
 <entry><type>text</> <literal>&lt;-&gt;</literal> <type>text</></entry>
 <entry><type>real</type></entry>
@@ -203,6 +241,21 @@ SELECT t, t &lt;-&gt; '<replaceable>word</>' AS dist
 a small number of the closest matches is wanted.
 </para>
 
+<para>
+You can also use an index on the <structfield>t</> column for substring
+similarity. For example:
+<programlisting>
+SELECT t, substring_similarity('<replaceable>word</>', t) AS sml
+FROM test_trgm
+WHERE '<replaceable>word</>' &lt;% t
+ORDER BY sml DESC, t;
+</programlisting>
+This will return all values in the text column that have a substring
+that is sufficiently similar to <replaceable>word</>, sorted from best
+match to worst. The index will be used to make this a fast operation
+even over very large data sets.
+</para>
+
 <para>
 Beginning in <productname>PostgreSQL</> 9.1, these index types also support
 index searches for <literal>LIKE</> and <literal>ILIKE</>, for example
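
For readers who want an intuition for what "the most similar substring" means in the patch above, the following is a rough Python sketch and not the pg_trgm implementation. It assumes pg_trgm-style trigrams (each word padded with two leading spaces and one trailing space), uses the ratio of shared to total distinct trigrams as the similarity, and considers only runs of whole words as candidate substrings. The names `trigrams`, `similarity`, and `substring_similarity_sketch` are hypothetical and exist only for this illustration; the extension's real algorithm may score differently.

```python
import re


def trigrams(text):
    """Return the set of trigrams of text, padding each word with two
    leading spaces and one trailing space (assumed pg_trgm-style)."""
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            grams.add(padded[i:i + 3])
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def substring_similarity_sketch(needle, haystack):
    """Best similarity between needle and any run of consecutive words
    in haystack (a simplification made for this sketch)."""
    words = re.findall(r"[a-z0-9]+", haystack.lower())
    best = 0.0
    for start in range(len(words)):
        for end in range(start + 1, len(words) + 1):
            candidate = " ".join(words[start:end])
            best = max(best, similarity(needle, candidate))
    return best


if __name__ == "__main__":
    phrase = "two words here"
    # Whole-string similarity is diluted by the unrelated words ...
    print(similarity("word", phrase))
    # ... while the substring variant scores against the best-matching part.
    print(substring_similarity_sketch("word", phrase))
```

Under these assumptions the substring score is never lower than the whole-string score, which is why a threshold on it (as with the `<%` operator) can match a short query against a longer phrase.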
