Commit 39658fb

Moloejoe authored and gitbook-bot committed
GITBOOK-101: No subject
1 parent afa35aa commit 39658fb

1 file changed: +8 -8 lines changed

pgml-cms/docs/product/vector-database.md

Lines changed: 8 additions & 8 deletions
@@ -33,14 +33,14 @@ UPDATE
 SET embedding = pgml.embed('intfloat/e5-small', "Address");
 ```
 
-```
+```sql
 UPDATE 5000
 ```
 
 That's it. We just embedding 5,000 "Address" values with a single SQL query. Let's take a look at what we got:
 
-```
-postgresml=# SELECT
+```sql
+SELECT
 "Address",
 (embedding::real[])[1:5]
 FROM usa_house_prices
@@ -79,7 +79,7 @@ ORDER BY
 LIMIT 3;
 ```
 
-```
+```sql
 Address
 ----------------------------------------
 1 Infinite Loop, Cupertino, California
@@ -104,8 +104,8 @@ When searching for a nearest neighbor match, `pgvector` picks the closest centro
 
 The number of lists in an IVFFlat index is configurable when creating the index. The more lists are created, the faster you can search it, but the nearest neighbor approximation becomes less precise. The best number of lists for a dataset is typically its square root, e.g. if a dataset has 5,000,000 vectors, the number of lists should be:
 
-```
-postgresml=# SELECT round(sqrt(5000000)) AS lists;
+```sql
+SELECT round(sqrt(5000000)) AS lists;
 lists
 -------
 2236
@@ -124,8 +124,8 @@ WITH (lists = 71);
 
 71 is the approximate square root of 5,000 rows we have in that table. With the index created, if we `EXPLAIN` the query we just ran, we'll get an "Index Scan" on the cosine distance index:
 
-```
-postgresml=# EXPLAIN SELECT
+```sql
+EXPLAIN SELECT
 "Address"
 FROM usa_house_prices
 ORDER BY
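
The changed file quotes two list counts derived from the square-root rule of thumb: 2236 for 5,000,000 vectors and 71 for 5,000 rows. As a quick sanity check, here is a minimal Python sketch of that rule. The helper name `ivfflat_lists` is hypothetical and is not part of `pgvector` or PostgresML; it only reproduces the arithmetic described in the docs text.

```python
import math


def ivfflat_lists(row_count: int) -> int:
    """Suggest an IVFFlat list count using the square-root rule of thumb.

    Hypothetical helper for illustration only; the docs compute the same
    value in SQL with round(sqrt(n)).
    """
    if row_count < 1:
        raise ValueError("row_count must be a positive integer")
    # Round to the nearest integer and never return fewer than one list.
    return max(1, round(math.sqrt(row_count)))


if __name__ == "__main__":
    for rows in (5_000_000, 5_000):
        print(rows, ivfflat_lists(rows))
```

Both values agree with the figures in the diff, so the commit's change of fence language does not affect the numbers shown in the docs.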
