Commit 92f208c

Moloejoe authored and gitbook-bot committed
GITBOOK-126: No subject
1 parent 5626365 commit 92f208c

44 files changed: +81 -136 lines changed

pgml-cms/docs/README.md

Lines changed: 6 additions & 6 deletions
@@ -4,27 +4,27 @@ description: The key concepts that make up PostgresML.

 # Overview

-PostgresML is a complete MLOps platform built on PostgreSQL. 
+PostgresML is a complete MLOps platform built on PostgreSQL.

 > _Move the models to the database_, _rather than continuously moving the data to the models._

-The data for ML & AI systems is inherently larger and more dynamic than the models. It's more efficient, manageable and reliable to move the models to the database, rather than continuously moving the data to the models. PostgresML allows you to take advantage of the fundamental relationship between data and models, by extending the database with the following capabilities and goals:
+The data for ML & AI systems is inherently larger and more dynamic than the models. It's more efficient, manageable and reliable to move the models to the database, rather than continuously moving the data to the models\_.\_ PostgresML allows you to take advantage of the fundamental relationship between data and models, by extending the database with the following capabilities and goals:

 * **Model Serving** - _**GPU accelerated**_ inference engine for interactive applications, with no additional networking latency or reliability costs.
 * **Model Store** - Download _**open-source**_ models including state of the art LLMs from HuggingFace, and track changes in performance between versions.
 * **Model Training** - Train models with _**your application data**_ using more than 50 algorithms for regression, classification or clustering tasks. Fine tune pre-trained models like LLaMA and BERT to improve performance.
-* **Feature Store** - _**Scalable**_ access to model inputs, including vector, text, categorical, and numeric data. Vector database, text search, knowledge graph and application data all in one _**low-latency**_ system. 
+* **Feature Store** - _**Scalable**_ access to model inputs, including vector, text, categorical, and numeric data. Vector database, text search, knowledge graph and application data all in one _**low-latency**_ system.

 <figure><img src=".gitbook/assets/ml_system.svg" alt="Machine Learning Infrastructure (2.0) by a16z"><figcaption><p>PostgresML handles all of the functions typically performed by a cacophony of services, <a href="https://a16z.com/emerging-architectures-for-modern-data-infrastructure/">described by a16z</a></p></figcaption></figure>

-These capabilities are primarily provided by two open-source software projects, that may be used independently, but are designed to be used with the rest of the Postgres ecosystem, including trusted extensions like pgvector and pg\_partman.&#x20;
+These capabilities are primarily provided by two open-source software projects, that may be used independently, but are designed to be used with the rest of the Postgres ecosystem, including trusted extensions like pgvector and pg\_partman.

 * **pgml** is an open source extension for PostgreSQL. It adds support for GPUs and the latest ML & AI algorithms _**inside**_ the database with a SQL API and no additional infrastructure, networking latency, or reliability costs.
 * **PgCat** is an open source proxy pooler for PostgreSQL. It abstracts the scalability and reliability concerns of managing a distributed cluster of Postgres databases. Client applications connect only to the proxy, which handles load balancing and failover, _**outside**_ of any single database.

 <figure><img src=".gitbook/assets/architecture.png" alt="PostgresML architectural diagram" width="275"><figcaption><p>A PostgresML deployment at scale</p></figcaption></figure>

-In addition, PostgresML provides [native language SDKs](https://github.com/postgresml/postgresml/tree/master/pgml-sdks/pgml) to implement best practices for common ML & AI applications. The JavaScript and Python SDKs are generated from the core Rust SDK, to provide the same API, correctness and efficiency across all application runtimes.&#x20;
+In addition, PostgresML provides [native language SDKs](https://github.com/postgresml/postgresml/tree/master/pgml-sdks/pgml) to implement best practices for common ML & AI applications. The JavaScript and Python SDKs are generated from the core Rust SDK, to provide the same API, correctness and efficiency across all application runtimes.

 SDK clients can perform advanced machine learning tasks in a single SQL request, without having to transfer additional data, models, hardware or dependencies to the client application. For example:

@@ -36,6 +36,6 @@ SDK clients can perform advanced machine learning tasks in a single SQL request,
 * Forecasting timeseries data for key metrics with complex metadata
 * Fraud and anomaly detection with application data

-Our goal is to provide access to Open Source AI for everyone. PostgresML is under continuous development to keep up with the rapidly evolving use cases for ML & AI, and we release non breaking changes with minor version updates in accordance with SemVer. We welcome contributions to our [open source code and documentation](https://github.com/postgresml).&#x20;
+Our goal is to provide access to Open Source AI for everyone. PostgresML is under continuous development to keep up with the rapidly evolving use cases for ML & AI, and we release non breaking changes with minor version updates in accordance with SemVer. We welcome contributions to our [open source code and documentation](https://github.com/postgresml).

 We can host your AI database in our cloud, or you can run our Docker image locally with PostgreSQL, pgml, pgvector and NVIDIA drivers included.

pgml-cms/docs/SUMMARY.md

Lines changed: 37 additions & 34 deletions
@@ -9,40 +9,43 @@
 * [Import your data](introduction/getting-started/import-your-data/README.md)
 * [CSV](introduction/getting-started/import-your-data/csv.md)
 * [Foreign Data Wrapper](introduction/getting-started/import-your-data/foreign-data-wrapper.md)
-* [APIs](introduction/apis/README.md)
-* [SQL Extensions](introduction/apis/sql-extensions/README.md)
-* [pgml.deploy()](introduction/apis/sql-extensions/pgml.deploy.md)
-* [pgml.embed()](introduction/apis/sql-extensions/pgml.embed.md)
-* [pgml.generate()](introduction/apis/sql-extensions/pgml.generate.md)
-* [pgml.predict()](introduction/apis/sql-extensions/pgml.predict/README.md)
-* [Batch Predictions](introduction/apis/sql-extensions/pgml.predict/batch-predictions.md)
-* [pgml.train()](introduction/apis/sql-extensions/pgml.train/README.md)
-* [Regression](introduction/apis/sql-extensions/pgml.train/regression.md)
-* [Classification](introduction/apis/sql-extensions/pgml.train/classification.md)
-* [Clustering](introduction/apis/sql-extensions/pgml.train/clustering.md)
-* [Data Pre-processing](introduction/apis/sql-extensions/pgml.train/data-pre-processing.md)
-* [Hyperparameter Search](introduction/apis/sql-extensions/pgml.train/hyperparameter-search.md)
-* [Joint Optimization](introduction/apis/sql-extensions/pgml.train/joint-optimization.md)
-* [pgml.transform()](introduction/apis/sql-extensions/pgml.transform/README.md)
-* [Fill Mask](introduction/apis/sql-extensions/pgml.transform/fill-mask.md)
-* [Question Answering](introduction/apis/sql-extensions/pgml.transform/question-answering.md)
-* [Summarization](introduction/apis/sql-extensions/pgml.transform/summarization.md)
-* [Text Classification](introduction/apis/sql-extensions/pgml.transform/text-classification.md)
-* [Text Generation](introduction/apis/sql-extensions/pgml.transform/text-generation.md)
-* [Text-to-Text Generation](introduction/apis/sql-extensions/pgml.transform/text-to-text-generation.md)
-* [Token Classification](introduction/apis/sql-extensions/pgml.transform/token-classification.md)
-* [Translation](introduction/apis/sql-extensions/pgml.transform/translation.md)
-* [Zero-shot Classification](introduction/apis/sql-extensions/pgml.transform/zero-shot-classification.md)
-* [pgml.tune()](introduction/apis/sql-extensions/pgml.tune.md)
-* [Client SDKs](introduction/apis/client-sdks/README.md)
-* [Overview](introduction/apis/client-sdks/getting-started.md)
-* [Collections](introduction/apis/client-sdks/collections.md)
-* [Pipelines](introduction/apis/client-sdks/pipelines.md)
-* [Vector Search](introduction/apis/client-sdks/search.md)
-* [Document Search](introduction/apis/client-sdks/document-search.md)
-* [Tutorials](introduction/apis/client-sdks/tutorials/README.md)
-* [Semantic Search](introduction/apis/client-sdks/tutorials/semantic-search.md)
-* [Semantic Search Using Instructor Model](introduction/apis/client-sdks/tutorials/semantic-search-1.md)
+
+## API
+
+* [Overview](api/apis.md)
+* [SQL Extension](api/sql-extensions/README.md)
+* [pgml.deploy()](api/sql-extensions/pgml.deploy.md)
+* [pgml.embed()](api/sql-extensions/pgml.embed.md)
+* [pgml.generate()](api/sql-extensions/pgml.generate.md)
+* [pgml.predict()](api/sql-extensions/pgml.predict/README.md)
+* [Batch Predictions](api/sql-extensions/pgml.predict/batch-predictions.md)
+* [pgml.train()](api/sql-extensions/pgml.train/README.md)
+* [Regression](api/sql-extensions/pgml.train/regression.md)
+* [Classification](api/sql-extensions/pgml.train/classification.md)
+* [Clustering](api/sql-extensions/pgml.train/clustering.md)
+* [Data Pre-processing](api/sql-extensions/pgml.train/data-pre-processing.md)
+* [Hyperparameter Search](api/sql-extensions/pgml.train/hyperparameter-search.md)
+* [Joint Optimization](api/sql-extensions/pgml.train/joint-optimization.md)
+* [pgml.transform()](api/sql-extensions/pgml.transform/README.md)
+* [Fill Mask](api/sql-extensions/pgml.transform/fill-mask.md)
+* [Question Answering](api/sql-extensions/pgml.transform/question-answering.md)
+* [Summarization](api/sql-extensions/pgml.transform/summarization.md)
+* [Text Classification](api/sql-extensions/pgml.transform/text-classification.md)
+* [Text Generation](api/sql-extensions/pgml.transform/text-generation.md)
+* [Text-to-Text Generation](api/sql-extensions/pgml.transform/text-to-text-generation.md)
+* [Token Classification](api/sql-extensions/pgml.transform/token-classification.md)
+* [Translation](api/sql-extensions/pgml.transform/translation.md)
+* [Zero-shot Classification](api/sql-extensions/pgml.transform/zero-shot-classification.md)
+* [pgml.tune()](api/sql-extensions/pgml.tune.md)
+* [Client SDK](api/client-sdks/README.md)
+* [Overview](api/client-sdks/getting-started.md)
+* [Collections](api/client-sdks/collections.md)
+* [Pipelines](api/client-sdks/pipelines.md)
+* [Vector Search](api/client-sdks/search.md)
+* [Document Search](api/client-sdks/document-search.md)
+* [Tutorials](api/client-sdks/tutorials/README.md)
+* [Semantic Search](api/client-sdks/tutorials/semantic-search.md)
+* [Semantic Search Using Instructor Model](api/client-sdks/tutorials/semantic-search-1.md)

 ## Product

Lines changed: 6 additions & 6 deletions
@@ -1,4 +1,4 @@
-# APIs
+# Overview

 ## Introduction

@@ -8,9 +8,9 @@ The extensions provide all of the ML & AI functionality via SQL APIs, like train

 We also provide Client SDKs that implement the best practices on top of the SQL APIs, to ease adoption and implement common application use cases in applications, like chatbots or search engines.

-## SQL Extensions
+## SQL Extension

-Postgres is designed to be _**extensible**_. This has created a rich open-source ecosystem of additional functionality built around the core project. Some [extensions](https://www.postgresql.org/docs/current/contrib.html) are include in the base Postgres distribution, but others are also available via the [PostgreSQL Extension Network](https://pgxn.org/).\
+PostgreSQL is designed to be _**extensible**_. This has created a rich open-source ecosystem of additional functionality built around the core project. Some [extensions](https://www.postgresql.org/docs/current/contrib.html) are include in the base Postgres distribution, but others are also available via the [PostgreSQL Extension Network](https://pgxn.org/).\
 \
 There are 2 foundational extensions included in a PostgresML deployment that provide functionality inside the database through SQL APIs.

@@ -19,11 +19,11 @@ There are 2 foundational extensions included in a PostgresML deployment that pro

 Learn more about developing with the [sql-extensions](sql-extensions/ "mention")

-## Client SDKs
+## Client SDK

-PostgresML provides client SDKs that streamline ML & AI use cases in both JavaScript and Python. With these SDKs, you can seamlessly manage various database tables related to documents, text chunks, text splitters, LLM (Language Model) models, and embeddings. By leveraging the SDK's capabilities, you can efficiently index LLM embeddings using pgvector with HNSW for fast and accurate queries.
+PostgresML provides a client SDK that streamlines ML & AI use cases in both JavaScript and Python. With this SDK, you can seamlessly manage various database tables related to documents, text chunks, text splitters, LLM (Language Model) models, and embeddings. By leveraging the SDK's capabilities, you can efficiently index LLM embeddings using pgvector with HNSW for fast and accurate queries.

-These SDKs delegate all work to the extensions running in the database, which minimizes software and hardware dependencies that need to be maintained at the application layer, as well as securing data and models inside the data center. Our SDKs minimize data transfer to maximize performance, efficiency, security and reliability.
+The SDK delegates all work to the extension running in the database, which minimizes software and hardware dependencies that need to be maintained at the application layer, as well as securing data and models inside the data center. Our SDK minimizes data transfer to maximize performance, efficiency, security and reliability.

 Learn more about developing with the [client-sdks](client-sdks/ "mention")

File renamed without changes.

pgml-cms/docs/introduction/apis/client-sdks/collections.md renamed to pgml-cms/docs/api/client-sdks/collections.md

Lines changed: 0 additions & 4 deletions
@@ -1,7 +1,3 @@
----
-description: Organizational building blocks of the SDK. Manage all documents and related chunks, embeddings, tsvectors, and pipelines.
----
-
 # Collections

 Collections are the organizational building blocks of the SDK. They manage all documents and related chunks, embeddings, tsvectors, and pipelines.
File renamed without changes.
File renamed without changes.

pgml-cms/docs/introduction/apis/client-sdks/pipelines.md renamed to pgml-cms/docs/api/client-sdks/pipelines.md

Lines changed: 0 additions & 4 deletions
@@ -1,7 +1,3 @@
----
-description: Pipelines are composed of a model, splitter, and additional optional arguments.
----
-
 # Pipelines

 `Pipeline`s define the schema for the transformation of documents. Different `Pipeline`s can be used for different tasks.&#x20;
File renamed without changes.
File renamed without changes.
