Commit 7fc614c

Docs review for unaccent: fix grammar, markup, etc.
1 parent 1dab218 commit 7fc614c

1 file changed (+51, -45 lines changed)

doc/src/sgml/unaccent.sgml

Lines changed: 51 additions & 45 deletions
@@ -1,3 +1,5 @@
1+
<!-- $PostgreSQL: pgsql/doc/src/sgml/unaccent.sgml,v 1.6 2010/08/25 02:12:00 tgl Exp $ -->
2+
13
<sect1 id="unaccent">
24
<title>unaccent</title>
35

@@ -6,24 +8,24 @@
68
</indexterm>
79

810
<para>
9-
<filename>unaccent</> removes accents (diacritic signs) from a lexeme.
10-
It's a filtering dictionary, that means its output is
11-
always passed to the next dictionary (if any), contrary to the standard
12-
behavior. Currently, it supports most important accents from European
13-
languages.
11+
<filename>unaccent</> is a text search dictionary that removes accents
12+
(diacritic signs) from lexemes.
13+
It's a filtering dictionary, which means its output is
14+
always passed to the next dictionary (if any), unlike the normal
15+
behavior of dictionaries. This allows accent-insensitive processing
16+
for full text search.
1417
</para>
1518

1619
<para>
17-
Limitation: Current implementation of <filename>unaccent</>
18-
dictionary cannot be used as a normalizing dictionary for
19-
<filename>thesaurus</filename> dictionary.
20+
The current implementation of <filename>unaccent</> cannot be used as a
21+
normalizing dictionary for the <filename>thesaurus</filename> dictionary.
2022
</para>
21-
23+
2224
<sect2>
2325
<title>Configuration</title>
2426

2527
<para>
26-
A <literal>unaccent</> dictionary accepts the following options:
28+
An <literal>unaccent</> dictionary accepts the following options:
2729
</para>
2830
<itemizedlist>
2931
<listitem>
@@ -43,105 +45,109 @@
4345
<itemizedlist>
4446
<listitem>
4547
<para>
46-
Each line represents pair: character_with_accent character_without_accent
48+
Each line represents a pair, consisting of a character with accent
49+
followed by a character without accent. The first is translated into
50+
the second. For example,
4751
<programlisting>
4852
&Agrave; A
4953
&Aacute; A
50-
&Acirc; A
54+
&Acirc; A
5155
&Atilde; A
52-
&Auml; A
53-
&Aring; A
54-
&AElig; A
56+
&Auml; A
57+
&Aring; A
58+
&AElig; A
5559
</programlisting>
5660
</para>
5761
</listitem>
5862
</itemizedlist>
5963

6064
<para>
61-
Look at <filename>unaccent.rules</>, which is installed in
62-
<filename>$SHAREDIR/tsearch_data/</>, for an example.
65+
A more complete example, which is directly useful for most European
66+
languages, can be found in <filename>unaccent.rules</>, which is installed
67+
in <filename>$SHAREDIR/tsearch_data/</> when the <filename>unaccent</>
68+
module is installed.
6369
</para>
6470
</sect2>
6571

6672
<sect2>
6773
<title>Usage</title>
6874

6975
<para>
70-
Running the installation script creates a text search template
71-
<literal>unaccent</> and a dictionary <literal>unaccent</>
76+
Running the installation script <filename>unaccent.sql</> creates a text
77+
search template <literal>unaccent</> and a dictionary <literal>unaccent</>
7278
based on it, with default parameters. You can alter the
7379
parameters, for example
7480

7581
<programlisting>
76-
=# ALTER TEXT SEARCH DICTIONARY unaccent (RULES='my_rules');
82+
mydb=# ALTER TEXT SEARCH DICTIONARY unaccent (RULES='my_rules');
7783
</programlisting>
7884

7985
or create new dictionaries based on the template.
8086
</para>
8187

8288
<para>
83-
To test the dictionary, you can try
84-
89+
To test the dictionary, you can try:
8590
<programlisting>
86-
=# select ts_lexize('unaccent','Hôtel');
87-
ts_lexize
91+
mydb=# select ts_lexize('unaccent','H&ocirc;tel');
92+
ts_lexize
8893
-----------
8994
{Hotel}
9095
(1 row)
9196
</programlisting>
9297
</para>
93-
98+
9499
<para>
95-
Filtering dictionary are useful for correct work of
96-
<function>ts_headline</function> function.
100+
Here is an example showing how to insert the
101+
<filename>unaccent</> dictionary into a text search configuration:
97102
<programlisting>
98-
=# CREATE TEXT SEARCH CONFIGURATION fr ( COPY = french );
99-
=# ALTER TEXT SEARCH CONFIGURATION fr
103+
mydb=# CREATE TEXT SEARCH CONFIGURATION fr ( COPY = french );
104+
mydb=# ALTER TEXT SEARCH CONFIGURATION fr
100105
ALTER MAPPING FOR hword, hword_part, word
101106
WITH unaccent, french_stem;
102-
=# select to_tsvector('fr','Hôtels de la Mer');
103-
to_tsvector
107+
mydb=# select to_tsvector('fr','H&ocirc;tels de la Mer');
108+
to_tsvector
104109
-------------------
105110
'hotel':1 'mer':4
106111
(1 row)
107112

108-
=# select to_tsvector('fr','Hôtel de la Mer') @@ to_tsquery('fr','Hotels');
109-
?column?
113+
mydb=# select to_tsvector('fr','H&ocirc;tel de la Mer') @@ to_tsquery('fr','Hotels');
114+
?column?
110115
----------
111116
t
112117
(1 row)
113-
=# select ts_headline('fr','Hôtel de la Mer',to_tsquery('fr','Hotels'));
114-
ts_headline
118+
119+
mydb=# select ts_headline('fr','H&ocirc;tel de la Mer',to_tsquery('fr','Hotels'));
120+
ts_headline
115121
------------------------
116-
&lt;b&gt;Hôtel&lt;/b&gt;de la Mer
122+
&lt;b&gt;H&ocirc;tel&lt;/b&gt; de la Mer
117123
(1 row)
118-
119124
</programlisting>
120125
</para>
121126
</sect2>
122127

123128
<sect2>
124-
<title>Function</title>
129+
<title>Functions</title>
125130

126131
<para>
127-
<function>unaccent</> function removes accents (diacritic signs) from
128-
argument string. Basically, it's a wrapper around
129-
<filename>unaccent</> dictionary.
132+
The <function>unaccent()</> function removes accents (diacritic signs) from
133+
a given string. Basically, it's a wrapper around the
134+
<filename>unaccent</> dictionary, but it can be used outside normal
135+
text search contexts.
130136
</para>
131137

132138
<indexterm>
133139
<primary>unaccent</primary>
134140
</indexterm>
135141

136142
<synopsis>
137-
unaccent(<optional><replaceable class="PARAMETER">dictionary</replaceable>, </optional> <replaceable class="PARAMETER">string</replaceable>)
138-
returns <type>text</type>
143+
unaccent(<optional><replaceable class="PARAMETER">dictionary</replaceable>, </optional> <replaceable class="PARAMETER">string</replaceable>) returns <type>text</type>
139144
</synopsis>
140145

141146
<para>
147+
For example:
142148
<programlisting>
143-
SELECT unaccent('unaccent', 'Hôtel');
144-
SELECT unaccent('Hôtel');
149+
SELECT unaccent('unaccent', 'H&ocirc;tel');
150+
SELECT unaccent('H&ocirc;tel');
145151
</programlisting>
146152
</para>
147153
</sect2>
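
Note on the rules-file format documented in this diff: each line holds a pair, a character with accent followed by a character without accent, and the first is translated into the second. The sketch below illustrates only that mapping idea in Python. It is not PostgreSQL's implementation; the helper names `parse_rules` and `apply_rules` are hypothetical, and the `ô o` line is an assumed extra rule added so that the `Hôtel` example from the Usage section can be reproduced.

```python
# Illustrative sketch of an unaccent-style rules table (not PostgreSQL code).
# parse_rules/apply_rules are hypothetical helper names.

def parse_rules(text):
    """Build a dict from lines of the form '<accented> <unaccented>'."""
    rules = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:  # ignore blank or malformed lines
            accented, plain = parts
            rules[accented] = plain
    return rules


def apply_rules(rules, s):
    """Translate each character that has a rule; leave the rest unchanged."""
    return "".join(rules.get(ch, ch) for ch in s)


# The first seven pairs mirror the sample in the diff; 'ô o' is an assumption.
SAMPLE = "À A\nÁ A\nÂ A\nÃ A\nÄ A\nÅ A\nÆ A\nô o\n"

rules = parse_rules(SAMPLE)
print(apply_rules(rules, "Hôtel"))  # prints "Hotel"
```

Characters without a matching rule pass through unchanged, so plain ASCII input comes back as-is.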
