
Commit afa35aa

Moloejoe authored and gitbook-bot committed
GITBOOK-100: Move APIs into Introduction
1 parent 07470ed commit afa35aa

51 files changed, with 2162 additions and 922 deletions

pgml-cms/docs/SUMMARY.md

Lines changed: 40 additions & 34 deletions
@@ -9,37 +9,42 @@
99
* [Import your data](introduction/getting-started/import-your-data/README.md)
1010
* [CSV](introduction/getting-started/import-your-data/csv.md)
1111
* [Foreign Data Wrapper](introduction/getting-started/import-your-data/foreign-data-wrapper.md)
12-
* [Machine Learning](introduction/machine-learning/README.md)
13-
* [Natural Language Processing](introduction/machine-learning/natural-language-processing/README.md)
14-
* [Embeddings](introduction/machine-learning/natural-language-processing/embeddings.md)
15-
* [Fill Mask](introduction/machine-learning/natural-language-processing/fill-mask.md)
16-
* [Question Answering](introduction/machine-learning/natural-language-processing/question-answering.md)
17-
* [Summarization](introduction/machine-learning/natural-language-processing/summarization.md)
18-
* [Text Classification](introduction/machine-learning/natural-language-processing/text-classification.md)
19-
* [Text Generation](introduction/machine-learning/natural-language-processing/text-generation.md)
20-
* [Text-to-Text Generation](introduction/machine-learning/natural-language-processing/text-to-text-generation.md)
21-
* [Token Classification](introduction/machine-learning/natural-language-processing/token-classification.md)
22-
* [Translation](introduction/machine-learning/natural-language-processing/translation.md)
23-
* [Zero-shot Classification](introduction/machine-learning/natural-language-processing/zero-shot-classification.md)
24-
* [Supervised Learning](introduction/machine-learning/supervised-learning/README.md)
25-
* [Data Pre-processing](introduction/machine-learning/supervised-learning/data-pre-processing.md)
26-
* [Regression](introduction/machine-learning/supervised-learning/regression.md)
27-
* [Classification](introduction/machine-learning/supervised-learning/classification.md)
28-
* [Hyperparameter Search](introduction/machine-learning/supervised-learning/hyperparameter-search.md)
29-
* [Joint Optimization](introduction/machine-learning/supervised-learning/joint-optimization.md)
30-
* [Unsupervised Learning](introduction/machine-learning/unsupervised-learning.md)
31-
* [SDKs](introduction/machine-learning/sdks/README.md)
32-
* [Overview](introduction/machine-learning/sdks/overview.md)
33-
* [Getting Started](introduction/machine-learning/sdks/getting-started.md)
34-
* [OpenSourceAI](introduction/machine-learning/sdks/opensourceai.md)
35-
* [Collections](introduction/machine-learning/sdks/collections.md)
36-
* [Pipelines](introduction/machine-learning/sdks/pipelines.md)
37-
* [Search](introduction/machine-learning/sdks/search.md)
38-
* [Tutorials](introduction/machine-learning/sdks/tutorials/README.md)
39-
* [Semantic Search](introduction/machine-learning/sdks/tutorials/semantic-search.md)
40-
* [Semantic Search using Instructor model](introduction/machine-learning/sdks/tutorials/semantic-search-using-instructor-model.md)
41-
* [Extractive Question Answering](introduction/machine-learning/sdks/tutorials/extractive-question-answering.md)
42-
* [Summarizing Question Answering](introduction/machine-learning/sdks/tutorials/summarizing-question-answering.md)
12+
* [APIs](introduction/apis/README.md)
13+
* [SQL Extensions](introduction/apis/sql-extensions/README.md)
14+
* [pgml.deploy()](introduction/apis/sql-extensions/pgml.deploy.md)
15+
* [pgml.embed()](introduction/apis/sql-extensions/pgml.embed.md)
16+
* [pgml.generate()](introduction/apis/sql-extensions/pgml.generate.md)
17+
* [pgml.predict()](introduction/apis/sql-extensions/pgml.predict/README.md)
18+
* [Batch Predictions](introduction/apis/sql-extensions/pgml.predict/batch-predictions.md)
19+
* [pgml.train()](introduction/apis/sql-extensions/pgml.train/README.md)
20+
* [Regression](introduction/apis/sql-extensions/pgml.train/regression.md)
21+
* [Classification](introduction/apis/sql-extensions/pgml.train/classification.md)
22+
* [Clustering](introduction/apis/sql-extensions/pgml.train/clustering.md)
23+
* [Data Pre-processing](introduction/apis/sql-extensions/pgml.train/data-pre-processing.md)
24+
* [Hyperparameter Search](introduction/apis/sql-extensions/pgml.train/hyperparameter-search.md)
25+
* [Joint Optimization](introduction/apis/sql-extensions/pgml.train/joint-optimization.md)
26+
* [pgml.transform()](introduction/apis/sql-extensions/pgml.transform/README.md)
27+
* [Fill Mask](introduction/apis/sql-extensions/pgml.transform/fill-mask.md)
28+
* [Question Answering](introduction/apis/sql-extensions/pgml.transform/question-answering.md)
29+
* [Summarization](introduction/apis/sql-extensions/pgml.transform/summarization.md)
30+
* [Text Classification](introduction/apis/sql-extensions/pgml.transform/text-classification.md)
31+
* [Text Generation](introduction/apis/sql-extensions/pgml.transform/text-generation.md)
32+
* [Text-to-Text Generation](introduction/apis/sql-extensions/pgml.transform/text-to-text-generation.md)
33+
* [Token Classification](introduction/apis/sql-extensions/pgml.transform/token-classification.md)
34+
* [Translation](introduction/apis/sql-extensions/pgml.transform/translation.md)
35+
* [Zero-shot Classification](introduction/apis/sql-extensions/pgml.transform/zero-shot-classification.md)
36+
* [pgml.tune()](introduction/apis/sql-extensions/pgml.tune.md)
37+
* [Client SDKs](introduction/apis/client-sdks/README.md)
38+
* [Overview](introduction/apis/client-sdks/getting-started.md)
39+
* [OpenSourceAI](introduction/apis/client-sdks/opensourceai.md)
40+
* [Collections](introduction/apis/client-sdks/collections.md)
41+
* [Pipelines](introduction/apis/client-sdks/pipelines.md)
42+
* [Search](introduction/apis/client-sdks/search.md)
43+
* [Tutorials](introduction/apis/client-sdks/tutorials/README.md)
44+
* [Semantic Search](introduction/apis/client-sdks/tutorials/semantic-search.md)
45+
* [Semantic Search using Instructor model](introduction/apis/client-sdks/tutorials/semantic-search-using-instructor-model.md)
46+
* [Extractive Question Answering](introduction/apis/client-sdks/tutorials/extractive-question-answering.md)
47+
* [Summarizing Question Answering](introduction/apis/client-sdks/tutorials/summarizing-question-answering.md)
4348

4449
## Product
4550

@@ -58,12 +63,13 @@
5863
* [Chatbots](use-cases/chatbots.md)
5964
* [Search](use-cases/improve-search-results-with-machine-learning.md)
6065
* [Embeddings](use-cases/embeddings/README.md)
61-
* [Generating LLM embeddings with open source models in PostgresML](use-cases/embeddings/generating-llm-embeddings-with-open-source-models-in-postgresml/README.md)
62-
* [Tuning vector recall while generating query embeddings in the database](use-cases/embeddings/generating-llm-embeddings-with-open-source-models-in-postgresml/tuning-vector-recall-while-generating-query-embeddings-in-the-database.md)
66+
* [Generating LLM embeddings with open source models](use-cases/embeddings/generating-llm-embeddings-with-open-source-models-in-postgresml.md)
67+
* [Tuning vector recall while generating query embeddings in the database](use-cases/embeddings/tuning-vector-recall-while-generating-query-embeddings-in-the-database.md)
6368
* [Personalize embedding results with application data in your database](use-cases/embeddings/personalize-embedding-results-with-application-data-in-your-database.md)
64-
* [Time-series Forecasting](use-cases/time-series-forecasting.md)
69+
* [Supervised Learning](use-cases/supervised-learning.md)
6570
* [Fraud Detection](use-cases/fraud-detection.md)
6671
* [Recommendation Engine](use-cases/recommendation-engine.md)
72+
* [Time-series Forecasting](use-cases/time-series-forecasting.md)
6773

6874
## Resources
6975
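The table-of-contents changes above move most pages under `introduction/machine-learning/` to `introduction/apis/`. As a rough illustration, the sketch below rewrites old links with prefix rules. The rules are inferred from the SUMMARY.md entries and are an assumption, not a confirmed list of file renames; `PATH_RULES` and `migrate_path` are hypothetical names.

```python
# Prefix rewrites inferred from the SUMMARY.md diff (an assumption, not a
# confirmed rename list). More specific rules come first so they win.
PATH_RULES = [
    ("introduction/machine-learning/sdks/overview.md",
     "introduction/apis/client-sdks/README.md"),
    ("introduction/machine-learning/natural-language-processing/embeddings.md",
     "introduction/apis/sql-extensions/pgml.embed.md"),
    ("introduction/machine-learning/sdks/",
     "introduction/apis/client-sdks/"),
    ("introduction/machine-learning/natural-language-processing/",
     "introduction/apis/sql-extensions/pgml.transform/"),
    ("introduction/machine-learning/supervised-learning/",
     "introduction/apis/sql-extensions/pgml.train/"),
]


def migrate_path(old_path: str) -> str:
    """Return the new location for a docs path, or the path unchanged."""
    for old_prefix, new_prefix in PATH_RULES:
        if old_path.startswith(old_prefix):
            # Swap the matching prefix and keep the rest of the path.
            return new_prefix + old_path[len(old_prefix):]
    return old_path
```

Paths that match no rule, such as the `use-cases/` pages, are returned unchanged.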

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
1+
# APIs
2+
3+
## Introduction
4+
5+
PostgresML adds extensions to the PostgreSQL database and provides separate Client SDKs in JavaScript and Python that leverage the database to implement common ML & AI use cases.
6+
7+
The extensions provide all of the ML & AI functionality, like training and inference, via SQL APIs. They are designed to be used directly by ML practitioners who implement dozens of different use cases on their own machine learning models.
8+
9+
We also provide Client SDKs that implement best practices on top of the SQL APIs, to ease adoption and implement common application use cases, like chatbots or search engines.
10+
11+
## SQL Extensions
12+
13+
Postgres is designed to be _**extensible**_. This has created a rich open-source ecosystem of additional functionality built around the core project. Some [extensions](https://www.postgresql.org/docs/current/contrib.html) are included in the base Postgres distribution, and others are available via the [PostgreSQL Extension Network](https://pgxn.org/). \
14+
\
15+
There are 2 foundational extensions included in a PostgresML deployment that provide functionality inside the database through SQL APIs.
16+
17+
* **pgml** - provides Machine Learning and Artificial Intelligence APIs with access to more than 50 ML algorithms to train classification, clustering and regression models on your own data, and to perform dozens of tasks with thousands of models downloaded from HuggingFace.
18+
* **pgvector** - provides indexing and search functionality on vectors, in addition to the traditional application database storage, including JSON and plain text, provided by PostgreSQL.
19+
20+
Learn more about developing with the [sql-extensions](sql-extensions/ "mention")
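
To make the pgvector bullet above more concrete, here is a minimal pure-Python sketch of what a cosine-distance search computes, the same ordering that pgvector's `<=>` operator expresses in SQL. It is a brute-force illustration only and does not show how pgvector indexes vectors; the function names and sample rows are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query, rows, limit=2):
    """Return the ids of the `limit` rows closest to `query`."""
    ranked = sorted(rows, key=lambda row: cosine_distance(query, row[1]))
    return [row_id for row_id, _ in ranked[:limit]]


# Hypothetical (id, embedding) rows standing in for a table of vectors.
rows = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [0.9, 0.1])]
print(nearest([1.0, 0.0], rows))  # prints ['a', 'c']
```

An index such as HNSW avoids scanning every row, at the cost of returning approximate rather than exact neighbors.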
21+
22+
## Client SDKs
23+
24+
PostgresML provides client SDKs that streamline ML & AI use cases in both JavaScript and Python. With these SDKs, you can manage the database tables related to documents, text chunks, text splitters, LLMs (large language models), and embeddings. The SDKs can also index LLM embeddings using pgvector with HNSW for fast and accurate queries.
25+
26+
These SDKs delegate all work to the extensions running in the database, which minimizes software and hardware dependencies that need to be maintained at the application layer, as well as securing data and models inside the data center. Our SDKs minimize data transfer to maximize performance, efficiency, security and reliability.
27+
28+
Learn more about developing with the [client-sdks](client-sdks/ "mention")
29+
30+
31+
32+
33+
34+
##

pgml-cms/docs/introduction/machine-learning/sdks/overview.md renamed to pgml-cms/docs/introduction/apis/client-sdks/README.md

Lines changed: 2 additions & 3 deletions
@@ -1,4 +1,4 @@
1-
# Overview
1+
# Client SDKs
22

33
### Key Features
44

@@ -16,10 +16,9 @@
1616

1717
### How the SDK Works
1818

19-
SDK streamlines the development of vector search applications by abstracting away the complexities of database management and indexing. Here's an overview of how the SDK works:
19+
SDK streamlines the development of vector search applications by abstracting away the complexities of database management and indexing. Here's an overview of how the SDK works:
2020

2121
* **Automatic Document and Text Chunk Management**: The SDK provides a convenient interface to manage documents and pipelines, automatically handling chunking and embedding for you. You can easily organize and structure your text data within the PostgreSQL database.
2222
* **Open Source Model Integration**: With the SDK, you can seamlessly incorporate a wide range of open source models to generate high-quality embeddings. These models capture the semantic meaning of text and enable powerful analysis and search capabilities.
2323
* **Embedding Indexing**: The Python SDK utilizes the PgVector extension to efficiently index the embeddings generated by the open source models. This indexing process optimizes search performance and allows for fast and accurate retrieval of relevant results.
2424
* **Querying and Search**: Once the embeddings are indexed, you can perform vector-based searches on the documents and text chunks stored in the PostgreSQL database. The SDK provides intuitive methods for executing queries and retrieving search results.
25-
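
The chunking step mentioned in the list above can be pictured with a small sketch. This is not the SDK's splitter: `split_into_chunks` is a hypothetical helper that assumes a fixed-size word window with overlap, which is one common way to split text before embedding.

```python
def split_into_chunks(text, chunk_size=5, overlap=2):
    """Split text into word windows of `chunk_size` sharing `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
```

Overlapping windows keep some context on both sides of a chunk boundary, which can help a search match text that straddles two chunks.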
Lines changed: 28 additions & 30 deletions
@@ -1,7 +1,5 @@
11
# Collections
22

3-
4-
53
Collections are the organizational building blocks of the SDK. They manage all documents and related chunks, embeddings, tsvectors, and pipelines.
64

75
## Creating Collections
@@ -11,35 +9,35 @@ By default, collections will read and write to the database specified by `DATABA
119
### **Default `DATABASE_URL`**
1210

1311
{% tabs %}
14-
{% tab title="Python" %}
15-
```python
16-
collection = Collection("test_collection")
17-
```
18-
{% endtab %}
19-
2012
{% tab title="JavaScript" %}
2113
```javascript
2214
collection = pgml.newCollection("test_collection")
2315
```
2416
{% endtab %}
17+
18+
{% tab title="Python" %}
19+
```python
20+
collection = Collection("test_collection")
21+
```
22+
{% endtab %}
2523
{% endtabs %}
2624

2725
### **Custom DATABASE\_URL**
2826

2927
Create a Collection that reads from a different database than that set by the environment variable `DATABASE_URL`.
3028

3129
{% tabs %}
32-
{% tab title="Python" %}
33-
```python
34-
collection = Collection("test_collection", CUSTOM_DATABASE_URL)
35-
```
36-
{% endtab %}
37-
3830
{% tab title="JavaScript" %}
3931
```javascript
4032
collection = pgml.newCollection("test_collection", CUSTOM_DATABASE_URL)
4133
```
4234
{% endtab %}
35+
36+
{% tab title="Python" %}
37+
```python
38+
collection = Collection("test_collection", CUSTOM_DATABASE_URL)
39+
```
40+
{% endtab %}
4341
{% endtabs %}
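
The precedence described above, a custom URL when one is given and the `DATABASE_URL` environment variable otherwise, can be sketched in plain Python. `resolve_database_url` is a hypothetical helper for application code, not part of the SDK, and it makes no claim about how the SDK resolves the URL internally.

```python
import os


def resolve_database_url(custom_url=None):
    """Pick the connection string: explicit argument first, then the environment."""
    if custom_url:
        return custom_url
    url = os.environ.get("DATABASE_URL")
    if url is None:
        raise RuntimeError("DATABASE_URL is not set and no custom URL was given")
    return url
```

The result could then be passed as the second argument when creating a collection, as in the examples above.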
4442

4543
## Upserting Documents
@@ -49,6 +47,22 @@ Documents are dictionaries with two required keys: `id` and `text`. All other ke
4947
**Upsert documents with metadata**
5048

5149
{% tabs %}
50+
{% tab title="JavaScript" %}
51+
```javascript
52+
const documents = [
53+
{
54+
id: "Document One",
55+
text: "document one contents...",
56+
},
57+
{
58+
id: "Document Two",
59+
text: "document two contents...",
60+
},
61+
];
62+
await collection.upsert_documents(documents);
63+
```
64+
{% endtab %}
65+
5266
{% tab title="Python" %}
5367
```python
5468
documents = [
@@ -67,20 +81,4 @@ collection = Collection("test_collection")
6781
await collection.upsert_documents(documents)
6882
```
6983
{% endtab %}
70-
71-
{% tab title="JavaScript" %}
72-
```javascript
73-
const documents = [
74-
{
75-
id: "Document One",
76-
text: "document one contents...",
77-
},
78-
{
79-
id: "Document Two",
80-
text: "document two contents...",
81-
},
82-
];
83-
await collection.upsert_documents(documents);
84-
```
85-
{% endtab %}
8684
{% endtabs %}
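
Since documents are dictionaries with two required keys, `id` and `text`, a small check before upserting can surface malformed input early. The sketch below is a hypothetical helper (`validate_documents` is not an SDK function) and assumes nothing beyond those two required keys.

```python
REQUIRED_KEYS = ("id", "text")


def validate_documents(documents):
    """Raise ValueError if any document lacks a required key, else return the list."""
    for index, document in enumerate(documents):
        missing = [key for key in REQUIRED_KEYS if key not in document]
        if missing:
            raise ValueError(
                f"document {index} is missing required keys: {missing}"
            )
    return documents


documents = validate_documents([
    {"id": "Document One", "text": "document one contents..."},
    {"id": "Document Two", "text": "document two contents..."},
])
```

The validated list can then be passed to `collection.upsert_documents(documents)` as shown in the tabs above.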
