diff --git a/LICENSE b/LICENSE index 2452b1a..6d2920a 100755 --- a/LICENSE +++ b/LICENSE @@ -1,6 +1,6 @@ MIT License -Copyright (c) 2023-2024 Cequence.io +Copyright (c) 2023-2025 Cequence.io Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal diff --git a/README.md b/README.md index dc0b5f2..65b6d02 100755 --- a/README.md +++ b/README.md @@ -1,12 +1,12 @@ # Pinecone Scala Client 🗂️ -[![version](https://img.shields.io/badge/version-1.1.0-green.svg)](https://cequence.io) [![License](https://img.shields.io/badge/License-MIT-lightgrey.svg)](https://opensource.org/licenses/MIT) ![GitHub Stars](https://img.shields.io/github/stars/cequence-io/pinecone-scala?style=social) [![Twitter Follow](https://img.shields.io/twitter/follow/0xbnd?style=social)](https://twitter.com/0xbnd) +[![version](https://img.shields.io/badge/version-1.3.3-green.svg)](https://cequence.io) [![License](https://img.shields.io/badge/License-MIT-lightgrey.svg)](https://opensource.org/licenses/MIT) ![GitHub Stars](https://img.shields.io/github/stars/cequence-io/pinecone-scala?style=social) [![Twitter Follow](https://img.shields.io/twitter/follow/0xbnd?style=social)](https://twitter.com/0xbnd) -This is an intuitive async Scala client for Pinecone API supporting all the available vector and index/collection operations/endpoints, provided in two convenient services called [PineconeVectorService](./pinecone-core/src/main/scala/io/cequence/pineconescala/service/PineconeVectorService.scala) and [PineconeIndexService](./pinecone-core/src/main/scala/io/cequence/pineconescala/service/PineconeIndexService.scala). The supported calls are: +This is an intuitive async full-fledged Scala client for Pinecone API supporting all the available index, vector, collection, inference and assistant operations/endpoints, provided in two convenient services called [PineconeVectorService](./pinecone-core/src/main/scala/io/cequence/pineconescala/service/PineconeVectorService.scala) and [PineconeIndexService](./pinecone-core/src/main/scala/io/cequence/pineconescala/service/PineconeIndexService.scala). The supported calls are: * **Vector Operations**: [describeIndexStats](https://docs.pinecone.io/reference/api/2024-07/data-plane/describeindexstats), [query](https://docs.pinecone.io/reference/api/2024-07/data-plane/query), [delete](https://docs.pinecone.io/reference/api/2024-07/data-plane/delete), [fetch](https://docs.pinecone.io/reference/api/2024-07/data-plane/fetch), [update](https://docs.pinecone.io/reference/api/2024-07/data-plane/update), and [upsert](https://docs.pinecone.io/reference/api/2024-07/data-plane/upsert) * **Collection Operations**: [listCollections](https://docs.pinecone.io/reference/api/2024-07/control-plane/list_collections), [createCollection](https://docs.pinecone.io/reference/api/2024-07/control-plane/create_collection), [describeCollection](https://docs.pinecone.io/reference/api/2024-07/control-plane/describe_collection), and [deleteCollection](https://docs.pinecone.io/reference/api/2024-07/control-plane/delete_collection) * **Index Operations**: [listIndexes](https://docs.pinecone.io/reference/api/2024-07/control-plane/list_indexes), [creatIndex](https://docs.pinecone.io/reference/api/2024-07/control-plane/create_index), [describeIndex](https://docs.pinecone.io/reference/api/2024-07/control-plane/describe_index), [deleteIndex](https://docs.pinecone.io/reference/api/2024-07/control-plane/delete_index), and [configureIndex](https://docs.pinecone.io/reference/api/2024-07/control-plane/configure_index) -* **Inference Operations**: [embedData](https://docs.pinecone.io/reference/api/2024-07/inference/generate-embeddings), and [rerank](https://docs.pinecone.io/reference/api/2024-10/inference/rerank) +* **Inference Operations**: [embedData](https://docs.pinecone.io/reference/api/2024-07/inference/generate-embeddings), [rerank](https://docs.pinecone.io/reference/api/2024-10/inference/rerank), and [evaluate](https://docs.pinecone.io/reference/api/assistant/metrics_alignment) * **Assistant Operations**: [listAssistants](https://docs.pinecone.io/reference/api/2024-07/assistant/list-assistants), [createAssistant](https://docs.pinecone.io/reference/api/2024-07/assistant/create-assistant), [describeAssistant](https://docs.pinecone.io/reference/api/2024-07/assistant/describe-assistant), [deleteAssistant](https://docs.pinecone.io/reference/api/2024-07/assistant/delete-assistant), [listFiles](https://docs.pinecone.io/reference/api/2024-07/assistant/list-files), [uploadFile](https://docs.pinecone.io/reference/api/2024-07/assistant/create-file), [describeFile](https://docs.pinecone.io/reference/api/2024-07/assistant/describe-file), [deleteFile](https://docs.pinecone.io/reference/api/2024-07/assistant/delete-file), [chatWithAssistant](https://docs.pinecone.io/reference/api/2024-07/assistant/chat-completion-assistant) - these operations are provided by two services: `PineconeAssistantService` and `PineconeAssistantFileService` @@ -24,7 +24,7 @@ The currently supported Scala versions are **2.12, 2.13**, and **3**. To pull the library you have to add the following dependency to your *build.sbt* ``` -"io.cequence" %% "pinecone-scala-client" % "1.1.0" +"io.cequence" %% "pinecone-scala-client" % "1.3.3" ``` or to *pom.xml* (if you use maven) @@ -33,7 +33,7 @@ or to *pom.xml* (if you use maven) io.cequence pinecone-scala-client_2.12 - 1.1.0 + 1.3.3 ``` @@ -437,6 +437,18 @@ Examples: ) ``` +- Evaluate Q&A + +```scala + pineconeInferenceService.evaluate( + question = "What are the capital cities of France, England and Spain?", + answer = "Paris is a city of France and Barcelona of Spain", + groundTruthAnswer = "Paris is the capital city of France, London of England and Madrid of Spain" + ).map { response => + println(response) + } +``` + ** Assistant Operations** - List assistants @@ -549,7 +561,7 @@ pinecone-scala-client { } ``` -2. _I got an exception like `com.typesafe.config.ConfigException$UnresolvedSubstitution: pinecone-scala-client.conf @ jar:file:.../io/cequence/pinecone-scala-client_2.13/1.1.0/pinecone-scala-client_2.13-1.1.0.jar!/pinecone-scala-client.conf: 4: Could not resolve substitution to a value: ${PINECONE_SCALA_CLIENT_API_KEY}`. What should I do?_ +2. _I got an exception like `com.typesafe.config.ConfigException$UnresolvedSubstitution: pinecone-scala-client.conf @ jar:file:.../io/cequence/pinecone-scala-client_2.13/1.3.3/pinecone-scala-client_2.13-1.3.3.jar!/pinecone-scala-client.conf: 4: Could not resolve substitution to a value: ${PINECONE_SCALA_CLIENT_API_KEY}`. What should I do?_ Set the env. variable `PINECONE_SCALA_CLIENT_API_KEY`. If you don't have one register [here](https://app.pinecone.io/?sessionType=signup). diff --git a/build.sbt b/build.sbt index 38d0fda..ca97947 100755 --- a/build.sbt +++ b/build.sbt @@ -8,7 +8,7 @@ val scala33 = "3.3.1" ThisBuild / organization := "io.cequence" ThisBuild / scalaVersion := scala212 -ThisBuild / version := "1.1.0" +ThisBuild / version := "1.3.3" ThisBuild / isSnapshot := false lazy val core = (project in file("pinecone-core")) diff --git a/examples/src/main/resources/logback.xml b/examples/src/main/resources/logback.xml new file mode 100644 index 0000000..d6c50cb --- /dev/null +++ b/examples/src/main/resources/logback.xml @@ -0,0 +1,26 @@ + + + + %d{YYYY-MM-dd HH:mm:ss.SSS} %-5level %logger{36} - %msg%n + + + + + ./application.log + true + + %d{YYYY-MM-dd HH:mm:ss.SSS} %-5level %logger{36} - %msg%n + + + + + + + + + + + \ No newline at end of file diff --git a/examples/src/main/scala/io/cequence/pineconescala/demo/CreateDenseEmbeddings.scala b/examples/src/main/scala/io/cequence/pineconescala/demo/CreateDenseEmbeddings.scala new file mode 100644 index 0000000..c972497 --- /dev/null +++ b/examples/src/main/scala/io/cequence/pineconescala/demo/CreateDenseEmbeddings.scala @@ -0,0 +1,26 @@ +package io.cequence.pineconescala.demo + +import io.cequence.pineconescala.domain.EmbeddingModelId +import io.cequence.pineconescala.domain.settings.{EmbeddingsInputType, GenerateEmbeddingsSettings} + +// run me - env. variable PINECONE_SCALA_CLIENT_API_KEY must be set +object CreateDenseEmbeddings extends PineconeDemoApp { + + override protected def exec = { + pineconeInferenceService.createEmbeddings( + inputs = Seq( + "What are the capital cities of France, England and Spain?", + "Paris is the capital city of France and Barcelona of Spain", + "Paris is the capital city of France, London of England and Madrid of Spain" + ), + settings = GenerateEmbeddingsSettings( + model = EmbeddingModelId.llama_text_embed_v2, + input_type = Some(EmbeddingsInputType.Query), + dimension = Some(2048) + ) + ).map { response => + println(response) + println("Dims: " + response.data.map(_.values.size).mkString(", ")) + } + } +} diff --git a/examples/src/main/scala/io/cequence/pineconescala/demo/CreateSparseEmbeddings.scala b/examples/src/main/scala/io/cequence/pineconescala/demo/CreateSparseEmbeddings.scala new file mode 100644 index 0000000..82d1b4b --- /dev/null +++ b/examples/src/main/scala/io/cequence/pineconescala/demo/CreateSparseEmbeddings.scala @@ -0,0 +1,30 @@ +package io.cequence.pineconescala.demo + +import io.cequence.pineconescala.domain.EmbeddingModelId +import io.cequence.pineconescala.domain.settings.{EmbeddingsInputType, GenerateEmbeddingsSettings} + +// run me - env. variable PINECONE_SCALA_CLIENT_API_KEY must be set +object CreateSparseEmbeddings extends PineconeDemoApp { + + override protected def exec = { + pineconeInferenceService.createSparseEmbeddings( + inputs = Seq( + "What are the capital cities of France, England and Spain?", + "Paris is the capital city of France and Barcelona of Spain", + "Paris is the capital city of France, London of England and Madrid of Spain" + ), + settings = GenerateEmbeddingsSettings( + model = EmbeddingModelId.pinecone_sparse_english_v0, + input_type = Some(EmbeddingsInputType.Passage), + return_tokens = Some(true) + ) + ).map { response => + response.data.foreach { data => + println(data.sparse_indices.mkString(", ")) + } + response.data.foreach { data => + println(data.toSparseVector.indices.mkString(", ")) + } + } + } +} diff --git a/examples/src/main/scala/io/cequence/pineconescala/demo/Evaluate.scala b/examples/src/main/scala/io/cequence/pineconescala/demo/Evaluate.scala new file mode 100644 index 0000000..19befe8 --- /dev/null +++ b/examples/src/main/scala/io/cequence/pineconescala/demo/Evaluate.scala @@ -0,0 +1,15 @@ +package io.cequence.pineconescala.demo + +// run me - env. variable PINECONE_SCALA_CLIENT_API_KEY must be set +object Evaluate extends PineconeDemoApp { + + override protected def exec = { + pineconeInferenceService.evaluate( + question = "What are the capital cities of France, England and Spain?", + answer = "Paris is a city of France and Barcelona of Spain", + groundTruthAnswer = "Paris is the capital city of France, London of England and Madrid of Spain" + ).map { response => + println(response) + } + } +} diff --git a/examples/src/main/scala/io/cequence/pineconescala/demo/ListAllVectorIds.scala b/examples/src/main/scala/io/cequence/pineconescala/demo/ListAllVectorIds.scala new file mode 100644 index 0000000..948aeba --- /dev/null +++ b/examples/src/main/scala/io/cequence/pineconescala/demo/ListAllVectorIds.scala @@ -0,0 +1,25 @@ +package io.cequence.pineconescala.demo + +import io.cequence.pineconescala.service.PineconeServiceConsts + +// run me - env. variables PINECONE_SCALA_CLIENT_API_KEY and PINECONE_SCALA_CLIENT_ENV must be set +object ListAllVectorIds extends PineconeDemoApp with PineconeServiceConsts { + + private lazy val indexName = "auto-gpt-test" + private lazy val namespace = "auto-gpt" + + override protected def exec = { + for { + vectorService <- createPineconeVectorService(indexName) + + queryResponse <- vectorService.listAllVectorsIDs( + namespace = namespace, + batchLimit = Some(20) + ) + } yield { + val ids = queryResponse.map(_.id) + println(s"Vector Ids: ${ids.size}") + } + } + +} diff --git a/examples/src/main/scala/io/cequence/pineconescala/demo/PineconeVectorLongDemo.scala b/examples/src/main/scala/io/cequence/pineconescala/demo/PineconeVectorLongDemo.scala index 367f676..a20c875 100644 --- a/examples/src/main/scala/io/cequence/pineconescala/demo/PineconeVectorLongDemo.scala +++ b/examples/src/main/scala/io/cequence/pineconescala/demo/PineconeVectorLongDemo.scala @@ -4,7 +4,8 @@ import akka.actor.ActorSystem import akka.stream.Materializer import io.cequence.pineconescala.domain.settings.QuerySettings import io.cequence.pineconescala.domain.{PVector, SparseVector} -import io.cequence.pineconescala.service.PineconeVectorServiceFactory +import io.cequence.pineconescala.service.PineconeIndexServiceFactory.FactoryImplicits +import io.cequence.pineconescala.service.{PineconeIndexServiceFactory, PineconeVectorServiceFactory} import scala.concurrent.ExecutionContext import scala.util.Random @@ -15,11 +16,18 @@ object PineconeVectorLongDemo extends App { implicit val ec: ExecutionContext = ExecutionContext.global implicit val materializer: Materializer = Materializer(ActorSystem()) - private val indexName = "auto-gpt-test" + private val indexName = "auto-gpt" private val testIds = Seq("666", "667") + private val namespace = "test" + + val pineconeIndexService = PineconeIndexServiceFactory().asOne { for { + indexes <- pineconeIndexService.listIndexes + + _ = println(s"Indexes: ${indexes.mkString(", ")}") + pineconeVectorService <- PineconeVectorServiceFactory(indexName).map( _.getOrElse(throw new IllegalArgumentException(s"index '${indexName}' not found")) ) @@ -47,33 +55,39 @@ object PineconeVectorLongDemo extends App { PVector( id = testIds(1), values = Seq.fill(stats.dimension)(Random.nextDouble), - sparseValues = Some( - SparseVector( - indices = Seq(4, 5, 6), - values = Seq(-0.12, 0.57, 0.69) - ) - ), +// sparseValues = Some( +// SparseVector( +// indices = Seq(4, 5, 6), +// values = Seq(-0.12, 0.57, 0.69) +// ) +// ), metadata = Map( "is_relevant" -> "very much so", "food_quality" -> "burritos are the best!" ) ) ), - namespace = "my_namespace" + namespace ) _ = println(s"Upserted ${vectorUpsertedCount} vectors.") fetchResponse <- pineconeVectorService.fetch( ids = testIds, - namespace = "my_namespace" + namespace ) - _ = println(s"Fetched ${fetchResponse.vectors.keySet.size} vectors.") + _ = println(s"Fetched ${fetchResponse.vectors.keySet.size} vectors: ${fetchResponse.vectors.keySet.mkString(", ")}") queryResponse <- pineconeVectorService.query( vector = fetchResponse.vectors(testIds(0)).values, - namespace = "my_namespace", + namespace, +// sparseVector = Some( +// SparseVector( +// indices = Seq(4, 5, 6), +// values = Seq(-0.12, 0.57, 0.69) +// ) +// ), settings = QuerySettings( topK = 5, includeValues = true, @@ -85,7 +99,7 @@ object PineconeVectorLongDemo extends App { queryResponse2 <- pineconeVectorService.queryById( id = testIds(0), - namespace = "my_namespace", + namespace, settings = QuerySettings( topK = 5, includeValues = true, @@ -95,9 +109,9 @@ object PineconeVectorLongDemo extends App { _ = println(s"Query by id matched ${queryResponse2.matches.size} vectors.") - _ <- pineconeVectorService.update( + updateResponse <- pineconeVectorService.update( id = testIds(0), - namespace = "my_namespace", + namespace, values = fetchResponse.vectors(testIds(0)).values.map(_ / 100), sparseValues = Some( SparseVector( @@ -110,25 +124,42 @@ object PineconeVectorLongDemo extends App { ) ) - _ = println(s"Update finished.") + _ = println(s"Update finished. Updated ${updateResponse} vectors.") +// _ <- pineconeVectorService.update( +// id = testIds(0), +// namespace, +// values = fetchResponse.vectors(testIds(0)).values.map(_ / 100), +// sparseValues = Some( +// SparseVector( +// indices = Seq(1, 2, 3), +// values = Seq(8.8, 7.7, 2.2) +// ) +// ), +// setMetaData = Map( +// "solid_info" -> "this is the source of the truth" +// ) +// ) + + _ = println(s"Metadata update finished.") fetchResponse2 <- pineconeVectorService.fetch( ids = Seq(testIds(0)), - namespace = "my_namespace" + namespace ) - _ = println(s"Fetched ${fetchResponse2.vectors.keySet.size} vectors.") + _ = println(fetchResponse.vectors(testIds(0)).values.mkString(", ")) + _ = println(s"Fetched ${fetchResponse2.vectors.keySet.size} vectors.\n${fetchResponse2.vectors.head._2.values.mkString(", ")}\n${fetchResponse2.vectors.head._2.metadata}") _ <- pineconeVectorService.delete( ids = testIds, - namespace = "my_namespace" + namespace ) _ = println(s"Delete finished.") fetchResponse3 <- pineconeVectorService.fetch( ids = testIds, - namespace = "my_namespace" + namespace ) _ = println(s"Fetched ${fetchResponse3.vectors.keySet.size} vectors after delete.") diff --git a/openai-examples/README.md b/openai-examples/README.md index 3b595c9..55307bd 100644 --- a/openai-examples/README.md +++ b/openai-examples/README.md @@ -6,7 +6,7 @@ This is a ready-to-fork, example/demo project demonstrating how to use [Pinecone The demo app can be found in [PineconeOpenAIDemo](./src/main/scala/io/cequence/pineconeopenai/demo/PineconeOpenAIDemo.scala). The following env. variables are expected: - `PINECONE_SCALA_CLIENT_API_KEY` -- `PINECONE_SCALA_CLIENT_ENV` +- `PINECONE_SCALA_CLIENT_ENV` (optional) - `OPENAI_SCALA_CLIENT_API_KEY` - `OPENAI_SCALA_CLIENT_ORG_ID` (optional) diff --git a/openai-examples/src/main/resources/logback.xml b/openai-examples/src/main/resources/logback.xml new file mode 100644 index 0000000..d6c50cb --- /dev/null +++ b/openai-examples/src/main/resources/logback.xml @@ -0,0 +1,26 @@ + + + + %d{YYYY-MM-dd HH:mm:ss.SSS} %-5level %logger{36} - %msg%n + + + + + ./application.log + true + + %d{YYYY-MM-dd HH:mm:ss.SSS} %-5level %logger{36} - %msg%n + + + + + + + + + + + \ No newline at end of file diff --git a/pinecone-client/src/main/scala/io/cequence/pineconescala/JsonFormats.scala b/pinecone-client/src/main/scala/io/cequence/pineconescala/JsonFormats.scala index 7a6095c..7d6f00e 100644 --- a/pinecone-client/src/main/scala/io/cequence/pineconescala/JsonFormats.scala +++ b/pinecone-client/src/main/scala/io/cequence/pineconescala/JsonFormats.scala @@ -6,7 +6,7 @@ import io.cequence.pineconescala.domain.settings.{EmbeddingsInputType, Embedding import io.cequence.pineconescala.domain.settings.EmbeddingsInputType.{Passage, Query} import io.cequence.pineconescala.domain.{Metric, PVector, PodType, SparseVector, response} import io.cequence.wsclient.JsonUtil -import io.cequence.wsclient.JsonUtil.{JsonOps, enumFormat, toJson} +import io.cequence.wsclient.JsonUtil.enumFormat import play.api.libs.json._ import play.api.libs.functional.syntax._ @@ -87,13 +87,22 @@ object JsonFormats { Json.format[ServerlessIndexInfo] // embeddings - implicit lazy val embeddingUsageInfoReads: Reads[EmbeddingsUsageInfo] = - Json.reads[EmbeddingsUsageInfo] + implicit lazy val embeddingUsageInfoFormat: Format[EmbeddingsUsageInfo] = + Json.format[EmbeddingsUsageInfo] implicit lazy val embeddingInfoReads: Reads[EmbeddingsInfo] = Json.reads[EmbeddingsInfo] - implicit lazy val embeddingValuesReads: Reads[EmbeddingsValues] = - Json.reads[EmbeddingsValues] - implicit lazy val embeddingResponseReads: Reads[GenerateEmbeddingsResponse] = - Json.reads[GenerateEmbeddingsResponse] + implicit lazy val denseEmbeddingValuesReads: Reads[DenseEmbeddingsValues] = + Json.reads[DenseEmbeddingsValues] + implicit lazy val denseEmbeddingResponseReads: Reads[EmbeddingsResponse.Dense] = + Json.reads[EmbeddingsResponse.Dense] + implicit lazy val sparseEmbeddingValuesReads: Reads[SparseEmbeddingsValues] = + ( + (__ \ "sparse_values").read[Seq[Double]] and + (__ \ "sparse_indices").read[Seq[Long]] and + (__ \ "sparse_tokens").readWithDefault[Seq[String]](Nil) + )(SparseEmbeddingsValues.apply _) + + implicit lazy val sparseEmbeddingResponseReads: Reads[EmbeddingsResponse.Sparse] = + Json.reads[EmbeddingsResponse.Sparse] implicit lazy val embeddingsInputTypeWrites: Writes[EmbeddingsInputType] = enumFormat( Query, @@ -187,4 +196,12 @@ object JsonFormats { Json.format[RerankedDocument] } implicit lazy val rerankResponseFormat: Format[RerankResponse] = Json.format[RerankResponse] -} + + // evaluate + implicit lazy val factFormat: Format[Fact] = Json.format[Fact] + implicit lazy val evaluateUsageFormat: Format[EvaluateUsage] = Json.format[EvaluateUsage] + implicit lazy val evaluatedFactFormat: Format[EvaluatedFact] = Json.format[EvaluatedFact] + implicit lazy val reasoningFormat: Format[Reasoning] = Json.format[Reasoning] + implicit lazy val metricsFormat: Format[Metrics] = Json.format[Metrics] + implicit lazy val evaluateResponseFormat: Format[EvaluateResponse] = Json.format[EvaluateResponse] +} \ No newline at end of file diff --git a/pinecone-client/src/main/scala/io/cequence/pineconescala/service/EndPoint.scala b/pinecone-client/src/main/scala/io/cequence/pineconescala/service/EndPoint.scala index dabe3a0..a4e0c94 100644 --- a/pinecone-client/src/main/scala/io/cequence/pineconescala/service/EndPoint.scala +++ b/pinecone-client/src/main/scala/io/cequence/pineconescala/service/EndPoint.scala @@ -1,6 +1,5 @@ package io.cequence.pineconescala.service -import io.cequence.pineconescala.domain.settings.IndexSettings.{CreatePodBasedIndexSettings, CreateServerlessIndexSettings} import io.cequence.wsclient.domain.NamedEnumValue sealed abstract class EndPoint(value: String = "") extends NamedEnumValue(value) @@ -9,7 +8,6 @@ object EndPoint { case object assistants extends EndPoint("assistant/assistants") case object chat extends EndPoint("assistant/chat") case object describe_index_stats extends EndPoint - case object embed extends EndPoint case object files extends EndPoint("assistant/files") case object query extends EndPoint case object vectors_delete extends EndPoint("vectors/delete") @@ -20,7 +18,9 @@ object EndPoint { case object collections extends EndPoint case object databases extends EndPoint case object indexes extends EndPoint - case object rerank extends EndPoint + case class embed(prefix: String) extends EndPoint(s"${prefix}embed") + case class rerank(prefix: String) extends EndPoint(s"${prefix}rerank") + case class evaluate(prefix: String) extends EndPoint(s"${prefix}assistant/evaluation/metrics/alignment") } // TODO: rename to Param @@ -69,4 +69,7 @@ object Tag { case object top_n extends Tag case object return_documents extends Tag case object rank_fields extends Tag + case object question extends Tag + case object answer extends Tag + case object ground_truth_answer extends Tag } diff --git a/pinecone-client/src/main/scala/io/cequence/pineconescala/service/PineconeAssistantFileServiceImpl.scala b/pinecone-client/src/main/scala/io/cequence/pineconescala/service/PineconeAssistantFileServiceImpl.scala index 93c1117..0f37789 100644 --- a/pinecone-client/src/main/scala/io/cequence/pineconescala/service/PineconeAssistantFileServiceImpl.scala +++ b/pinecone-client/src/main/scala/io/cequence/pineconescala/service/PineconeAssistantFileServiceImpl.scala @@ -1,7 +1,6 @@ package io.cequence.pineconescala.service import akka.stream.Materializer -import com.typesafe.config.{Config, ConfigFactory} import io.cequence.pineconescala.PineconeScalaClientException import io.cequence.pineconescala.domain.response.{ ChatCompletionResponse, @@ -40,7 +39,7 @@ class PineconeAssistantFileServiceImpl( requestContext = WsRequestContext( authHeaders = Seq( ("Api-Key", apiKey) - // ("X-Pinecone-API-Version", "2024-07") + // ("X-Pinecone-API-Version", apiVersion) ), explTimeouts = explicitTimeouts ) @@ -95,7 +94,7 @@ class PineconeAssistantFileServiceImpl( // FIXME: provide support for end point param followed by URL suffix endPointParam = Some(s"$assistantName/chat/completions"), bodyParams = jsonBodyParams( - Tag.messages -> Some(Json.toJson(messages.map(UserMessage))) + Tag.messages -> Some(Json.toJson(messages.map(UserMessage.apply))) ) ).map(_.asSafeJson[ChatCompletionResponse]) diff --git a/pinecone-client/src/main/scala/io/cequence/pineconescala/service/PineconeAssistantServiceImpl.scala b/pinecone-client/src/main/scala/io/cequence/pineconescala/service/PineconeAssistantServiceImpl.scala index 48f9ebc..757049e 100644 --- a/pinecone-client/src/main/scala/io/cequence/pineconescala/service/PineconeAssistantServiceImpl.scala +++ b/pinecone-client/src/main/scala/io/cequence/pineconescala/service/PineconeAssistantServiceImpl.scala @@ -1,7 +1,6 @@ package io.cequence.pineconescala.service import akka.stream.Materializer -import com.typesafe.config.{Config, ConfigFactory} import io.cequence.pineconescala.domain.response.{ Assistant, DeleteResponse, @@ -35,7 +34,7 @@ class PineconeAssistantServiceImpl( requestContext = WsRequestContext( authHeaders = Seq( ("Api-Key", apiKey), - ("X-Pinecone-API-Version", "2024-07") + ("X-Pinecone-API-Version", apiVersion) ), explTimeouts = explicitTimeouts ) diff --git a/pinecone-client/src/main/scala/io/cequence/pineconescala/service/PineconeIndexServiceImpl.scala b/pinecone-client/src/main/scala/io/cequence/pineconescala/service/PineconeIndexServiceImpl.scala index 1b1cf23..56379b5 100644 --- a/pinecone-client/src/main/scala/io/cequence/pineconescala/service/PineconeIndexServiceImpl.scala +++ b/pinecone-client/src/main/scala/io/cequence/pineconescala/service/PineconeIndexServiceImpl.scala @@ -175,7 +175,7 @@ private final class PineconePodPineconeBasedImpl( replicas: Option[Int], podType: Option[PodType] ): Future[ConfigureIndexResponse] = - execPATCRich( + execPATCHRich( indexesEndpoint, endPointParam = Some(indexName), bodyParams = jsonBodyParams( @@ -255,7 +255,10 @@ abstract class PineconeIndexServiceImpl[S <: IndexSettings]( override protected val engine: WSClientEngine = PlayWSClientEngine( coreUrl, requestContext = WsRequestContext( - authHeaders = Seq(("Api-Key", apiKey)), + authHeaders = Seq( + "Api-Key" -> apiKey, + "X-Pinecone-API-Version" -> apiVersion + ), explTimeouts = explicitTimeouts ) ) diff --git a/pinecone-client/src/main/scala/io/cequence/pineconescala/service/PineconeInferenceServiceImpl.scala b/pinecone-client/src/main/scala/io/cequence/pineconescala/service/PineconeInferenceServiceImpl.scala index 424130f..99c68e8 100644 --- a/pinecone-client/src/main/scala/io/cequence/pineconescala/service/PineconeInferenceServiceImpl.scala +++ b/pinecone-client/src/main/scala/io/cequence/pineconescala/service/PineconeInferenceServiceImpl.scala @@ -1,13 +1,13 @@ package io.cequence.pineconescala.service import akka.stream.Materializer -import io.cequence.pineconescala.domain.response.{GenerateEmbeddingsResponse, RerankResponse} +import io.cequence.pineconescala.domain.response.{EmbeddingsResponse, EvaluateResponse, RerankResponse} import io.cequence.pineconescala.domain.settings.{GenerateEmbeddingsSettings, RerankSettings} import io.cequence.wsclient.ResponseImplicits._ import io.cequence.wsclient.service.ws.{PlayWSClientEngine, Timeouts} import io.cequence.pineconescala.JsonFormats._ import io.cequence.pineconescala.PineconeScalaClientException -import io.cequence.wsclient.domain.WsRequestContext +import io.cequence.wsclient.domain.{Response, WsRequestContext} import io.cequence.wsclient.service.WSClientEngine import io.cequence.wsclient.service.WSClientWithEngineTypes.WSClientWithEngine @@ -25,13 +25,16 @@ private class PineconeInferenceServiceImpl( override protected type PEP = EndPoint override protected type PT = Tag + private val regularURL = "api.pinecone.io/" + private val prodURL = "prod-1-data.ke.pinecone.io/" + // we use play-ws backend override protected val engine: WSClientEngine = PlayWSClientEngine( - coreUrl = "https://api.pinecone.io/", + coreUrl = "https://", // TODO: change to regularURL eventually requestContext = WsRequestContext( authHeaders = Seq( - ("Api-Key", apiKey), - ("X-Pinecone-API-Version", "2024-10") + "Api-Key" -> apiKey, + "X-Pinecone-API-Version" -> apiVersion ), explTimeouts = explicitTimeouts ) @@ -49,9 +52,25 @@ private class PineconeInferenceServiceImpl( override def createEmbeddings( inputs: Seq[String], settings: GenerateEmbeddingsSettings - ): Future[GenerateEmbeddingsResponse] = + ): Future[EmbeddingsResponse.Dense] = + createDenseSparseEmbeddingsAux(inputs, settings).map( + _.asSafeJson[EmbeddingsResponse.Dense] + ) + + override def createSparseEmbeddings( + inputs: Seq[String], + settings: GenerateEmbeddingsSettings + ): Future[EmbeddingsResponse.Sparse] = + createDenseSparseEmbeddingsAux(inputs, settings).map( + _.asSafeJson[EmbeddingsResponse.Sparse] + ) + + private def createDenseSparseEmbeddingsAux( + inputs: Seq[String], + settings: GenerateEmbeddingsSettings + ): Future[Response] = execPOST( - EndPoint.embed, + EndPoint.embed(regularURL), bodyParams = jsonBodyParams( Tag.inputs -> Some( inputs.map(input => Map("text" -> input)) @@ -60,12 +79,12 @@ private class PineconeInferenceServiceImpl( Tag.parameters -> Some( Map( "input_type" -> settings.input_type.map(_.toString), - "truncate" -> settings.truncate.toString + "truncate" -> settings.truncate.toString, + "return_tokens" -> settings.return_tokens, + "dimension" -> settings.dimension ) ) ) - ).map( - _.asSafeJson[GenerateEmbeddingsResponse] ) /** @@ -85,10 +104,10 @@ private class PineconeInferenceServiceImpl( override def rerank( query: String, documents: Seq[Map[String, Any]], - settings: RerankSettings = DefaultSettings.Rerank + settings: RerankSettings ): Future[RerankResponse] = execPOST( - EndPoint.rerank, + EndPoint.rerank(regularURL), bodyParams = jsonBodyParams( Tag.query -> Some(query), Tag.documents -> Some(documents), @@ -106,6 +125,22 @@ private class PineconeInferenceServiceImpl( _.asSafeJson[RerankResponse] ) + override def evaluate( + question: String, + answer: String, + groundTruthAnswer: String + ): Future[EvaluateResponse] = + execPOST( + EndPoint.evaluate(prodURL), + bodyParams = jsonBodyParams( + Tag.question -> Some(question), + Tag.answer -> Some(answer), + Tag.ground_truth_answer -> Some(groundTruthAnswer) + ) + ).map( + _.asSafeJson[EvaluateResponse] + ) + override protected def handleErrorCodes( httpCode: Int, message: String diff --git a/pinecone-client/src/main/scala/io/cequence/pineconescala/service/PineconeVectorServiceImpl.scala b/pinecone-client/src/main/scala/io/cequence/pineconescala/service/PineconeVectorServiceImpl.scala index fe08384..b3c61fa 100644 --- a/pinecone-client/src/main/scala/io/cequence/pineconescala/service/PineconeVectorServiceImpl.scala +++ b/pinecone-client/src/main/scala/io/cequence/pineconescala/service/PineconeVectorServiceImpl.scala @@ -3,7 +3,7 @@ package io.cequence.pineconescala.service import akka.stream.Materializer import com.typesafe.config.{Config, ConfigFactory} import io.cequence.pineconescala.JsonFormats._ -import io.cequence.pineconescala.PineconeScalaClientException +import io.cequence.pineconescala.{PineconeScalaClientException, PineconeScalaClientMetadataSizeExceededException} import io.cequence.pineconescala.domain.response._ import io.cequence.pineconescala.domain.settings.QuerySettings import io.cequence.pineconescala.domain.{PVector, SparseVector} @@ -43,8 +43,11 @@ private class PineconeVectorServiceImpl( // we use play-ws backend override protected val engine: WSClientEngine = PlayWSClientEngine( coreUrl, - requestContext = WsRequestContext( - authHeaders = Seq(("Api-Key", apiKey)), + requestContext = WsRequestContext( + authHeaders = Seq( + "Api-Key" -> apiKey, + "X-Pinecone-API-Version" -> apiVersion + ), explTimeouts = explicitTimeouts ) ) @@ -57,6 +60,7 @@ private class PineconeVectorServiceImpl( override def query( vector: Seq[Double], namespace: String, + sparseVector: Option[SparseVector], settings: QuerySettings ): Future[QueryResponse] = execPOST( @@ -68,7 +72,7 @@ private class PineconeVectorServiceImpl( Tag.filter -> (if (settings.filter.nonEmpty) Some(settings.filter) else None), Tag.includeValues -> Some(settings.includeValues), Tag.includeMetadata -> Some(settings.includeMetadata), - Tag.sparseVector -> settings.sparseVector.map(Json.toJson(_)(sparseVectorFormat)) + Tag.sparseVector -> sparseVector.map(Json.toJson(_)(sparseVectorFormat)) ) ).map( _.asSafeJson[QueryResponse] @@ -87,13 +91,38 @@ private class PineconeVectorServiceImpl( Tag.topK -> Some(settings.topK), Tag.filter -> (if (settings.filter.nonEmpty) Some(settings.filter) else None), Tag.includeValues -> Some(settings.includeValues), - Tag.includeMetadata -> Some(settings.includeMetadata), - Tag.sparseVector -> settings.sparseVector.map(Json.toJson(_)(sparseVectorFormat)) + Tag.includeMetadata -> Some(settings.includeMetadata) ) ).map( _.asSafeJson[QueryResponse] ) + override def listAllVectorsIDs( + namespace: String, + batchLimit: Option[Int], + prefix: Option[String] + ): Future[Seq[VectorId]] = { + + // aux recursive function + def listVectorsAux( + namespace: String, + paginationToken: Option[String] = None + ): Future[Seq[VectorId]] = { + listVectorIDs(namespace, batchLimit, paginationToken, prefix).flatMap { response => + val nextToken = response.pagination.flatMap(_.next) + nextToken.map { nextToken => + listVectorsAux(namespace, Some(nextToken)).map { vectors => + response.vectors ++ vectors + } + }.getOrElse( + Future.successful(response.vectors) + ) + } + } + + listVectorsAux(namespace) + } + override def listVectorIDs( namespace: String, limit: Option[Int], @@ -169,17 +198,17 @@ private class PineconeVectorServiceImpl( values: Seq[Double], sparseValues: Option[SparseVector], setMetaData: Map[String, String] - ): Future[Unit] = + ): Future[String] = execPOST( EndPoint.vectors_update, bodyParams = jsonBodyParams( Tag.id -> Some(id), Tag.namespace -> Some(namespace), - Tag.values -> Some(values), + Tag.values -> (if (values.nonEmpty) Some(values) else None), Tag.sparseValues -> sparseValues.map(Json.toJson(_)), Tag.setMetadata -> (if (setMetaData.nonEmpty) Some(setMetaData) else None) ) - ).map(_ => ()) + ).map(_.string) override def upsert( vectors: Seq[PVector], @@ -205,7 +234,14 @@ private class PineconeVectorServiceImpl( httpCode: Int, message: String ): Nothing = - throw new PineconeScalaClientException(s"Code ${httpCode} : ${message}") + httpCode match { + // {"code":3,"message":"Metadata size is 52722 bytes, which exceeds the limit of 40960 bytes per vector","details":[] + case 400 if message.contains("\"code\":3") && message.contains("exceeds the limit") => + throw new PineconeScalaClientMetadataSizeExceededException(message) + + case _ => + throw new PineconeScalaClientException(s"Code ${httpCode} : ${message}") + } } object PineconeVectorServiceFactory extends PineconeServiceFactoryHelper { diff --git a/pinecone-core/src/main/scala/io/cequence/pineconescala/PineconeScalaClientException.scala b/pinecone-core/src/main/scala/io/cequence/pineconescala/PineconeScalaClientException.scala index 6d60fc2..c9aa390 100644 --- a/pinecone-core/src/main/scala/io/cequence/pineconescala/PineconeScalaClientException.scala +++ b/pinecone-core/src/main/scala/io/cequence/pineconescala/PineconeScalaClientException.scala @@ -4,6 +4,10 @@ class PineconeScalaClientException(message: String, cause: Throwable) extends Ru def this(message: String) = this(message, null) } +class PineconeScalaClientMetadataSizeExceededException(message: String, cause: Throwable) extends PineconeScalaClientException(message, cause) { + def this(message: String) = this(message, null) +} + class PineconeScalaClientTimeoutException(message: String, cause: Throwable) extends PineconeScalaClientException(message, cause) { def this(message: String) = this(message, null) } diff --git a/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/EmbeddingModelId.scala b/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/EmbeddingModelId.scala index b3a00f7..96b2021 100644 --- a/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/EmbeddingModelId.scala +++ b/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/EmbeddingModelId.scala @@ -1,5 +1,10 @@ package io.cequence.pineconescala.domain object EmbeddingModelId { + // 2048 input tokens + val llama_text_embed_v2 = "llama-text-embed-v2" + // 507 input tokens, dim 1024 (dense) val multilingual_e5_large = "multilingual-e5-large" + // 512 input tokens + val pinecone_sparse_english_v0 = "pinecone-sparse-english-v0" } diff --git a/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/RerankModelId.scala b/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/RerankModelId.scala index 237cdfd..cc9c36e 100644 --- a/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/RerankModelId.scala +++ b/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/RerankModelId.scala @@ -1,5 +1,10 @@ package io.cequence.pineconescala.domain object RerankModelId { + // 1024 input tokens val bge_reranker_v2_m3 = "bge-reranker-v2-m3" + // 4096 input tokens + val cohere_rerank_3_5 = "cohere-rerank-3.5" + // 512 input tokens + val pinecone_rerank_v0 = "pinecone-rerank-v0" } diff --git a/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/SparseVector.scala b/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/SparseVector.scala index c9a57cf..9f5ad0b 100644 --- a/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/SparseVector.scala +++ b/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/SparseVector.scala @@ -1,6 +1,6 @@ package io.cequence.pineconescala.domain case class SparseVector( - indices: Seq[Int], + indices: Seq[Long], values: Seq[Double] ) \ No newline at end of file diff --git a/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/response/EmbeddingsResponse.scala b/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/response/EmbeddingsResponse.scala new file mode 100644 index 0000000..4173f20 --- /dev/null +++ b/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/response/EmbeddingsResponse.scala @@ -0,0 +1,45 @@ +package io.cequence.pineconescala.domain.response + +import io.cequence.pineconescala.domain.SparseVector + +sealed trait EmbeddingsResponse + +object EmbeddingsResponse { + + case class Dense( + model: String, + data: Seq[DenseEmbeddingsValues], + usage: EmbeddingsUsageInfo + ) + + case class Sparse( + model: String, + data: Seq[SparseEmbeddingsValues], + usage: EmbeddingsUsageInfo + ) +} + +case class DenseEmbeddingsValues(values: Seq[Double]) + +case class SparseEmbeddingsValues( + sparse_values: Seq[Double], + sparse_indices: Seq[Long], + // TODO: is it even supported? + sparse_tokens: Seq[String] +) { + def toSparseVector = + SparseVector( + indices = sparse_indices, + values = sparse_values + ) +} + +case class EmbeddingsUsageInfo( + total_tokens: Int +) + +@Deprecated +case class EmbeddingsInfo( + embedding: Seq[Double], + index: Int +) diff --git a/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/response/EvaluateResponse.scala b/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/response/EvaluateResponse.scala new file mode 100644 index 0000000..d55810b --- /dev/null +++ b/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/response/EvaluateResponse.scala @@ -0,0 +1,32 @@ +package io.cequence.pineconescala.domain.response + +case class EvaluateResponse( + metrics: Metrics, + reasoning: Reasoning, + usage: EvaluateUsage +) + +case class Metrics( + correctness: Double, + completeness: Double, + alignment: Double +) + +case class Reasoning( + evaluated_facts: List[EvaluatedFact] +) + +case class EvaluatedFact( + fact: Fact, + entailment: String +) + +case class Fact( + content: String +) + +case class EvaluateUsage( + prompt_tokens: Int, + completion_tokens: Int, + total_tokens: Int +) diff --git a/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/response/GenerateEmbeddingsResponse.scala b/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/response/GenerateEmbeddingsResponse.scala deleted file mode 100644 index e827e60..0000000 --- a/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/response/GenerateEmbeddingsResponse.scala +++ /dev/null @@ -1,18 +0,0 @@ -package io.cequence.pineconescala.domain.response - -case class GenerateEmbeddingsResponse( - data: Seq[EmbeddingsValues], - model: String, - usage: EmbeddingsUsageInfo -) - -case class EmbeddingsValues(values: Seq[Double]) - -case class EmbeddingsInfo( - embedding: Seq[Double], - index: Int -) - -case class EmbeddingsUsageInfo( - total_tokens: Int -) diff --git a/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/settings/GenerateEmbeddingsSettings.scala b/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/settings/GenerateEmbeddingsSettings.scala index 8241597..094c22d 100644 --- a/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/settings/GenerateEmbeddingsSettings.scala +++ b/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/settings/GenerateEmbeddingsSettings.scala @@ -10,7 +10,14 @@ case class GenerateEmbeddingsSettings( input_type: Option[EmbeddingsInputType] = None, // The number of dimensions the resulting output embeddings should have. Only supported in text-embedding-3 and later models. - truncate: EmbeddingsTruncate = EmbeddingsTruncate.End + truncate: EmbeddingsTruncate = EmbeddingsTruncate.End, + + // TODO: is it even supported? + @Deprecated + return_tokens: Option[Boolean] = None, + + // Dimension of the vector to return. Supported by Dense: llama-text-embed-v2 + dimension: Option[Int] = None, ) { def withPassageInputType = copy(input_type = Some(EmbeddingsInputType.Passage)) def withQueryInputType = copy(input_type = Some(EmbeddingsInputType.Query)) diff --git a/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/settings/QuerySettings.scala b/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/settings/QuerySettings.scala index 87f814e..e2474a2 100644 --- a/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/settings/QuerySettings.scala +++ b/pinecone-core/src/main/scala/io/cequence/pineconescala/domain/settings/QuerySettings.scala @@ -1,7 +1,5 @@ package io.cequence.pineconescala.domain.settings -import io.cequence.pineconescala.domain.SparseVector - case class QuerySettings( // The number of results to return for each query. topK: Int, @@ -14,9 +12,5 @@ case class QuerySettings( includeValues: Boolean, // Indicates whether metadata is included in the response as well as the ids. - includeMetadata: Boolean, - - // Vector sparse data. - // Represented as a list of indices and a list of corresponded values, which must be the same length. - sparseVector: Option[SparseVector] = None, + includeMetadata: Boolean ) \ No newline at end of file diff --git a/pinecone-core/src/main/scala/io/cequence/pineconescala/service/PineconeAssistantFileService.scala b/pinecone-core/src/main/scala/io/cequence/pineconescala/service/PineconeAssistantFileService.scala index 1f97c0c..d351019 100644 --- a/pinecone-core/src/main/scala/io/cequence/pineconescala/service/PineconeAssistantFileService.scala +++ b/pinecone-core/src/main/scala/io/cequence/pineconescala/service/PineconeAssistantFileService.scala @@ -1,6 +1,10 @@ package io.cequence.pineconescala.service -import io.cequence.pineconescala.domain.response.{ChatCompletionResponse, DeleteResponse, FileResponse} +import io.cequence.pineconescala.domain.response.{ + ChatCompletionResponse, + DeleteResponse, + FileResponse +} import io.cequence.wsclient.service.CloseableService import java.io.File @@ -13,11 +17,11 @@ import scala.concurrent.Future * * The following services are supported: * - * - listFiles - * - uploadFile - * - describeFile - * - deleteFile - * - chatWithAssistant + * - listFiles + * - uploadFile + * - describeFile + * - deleteFile + * - chatWithAssistant * * @since July * 2024 @@ -27,7 +31,8 @@ trait PineconeAssistantFileService extends CloseableService { /** * This operation returns a list of all files in an assistant. * - * @param assistantName The name of the assistant to get files of. + * @param assistantName + * The name of the assistant to get files of. * @return */ def listFiles(assistantName: String): Future[Seq[FileResponse]] @@ -35,38 +40,56 @@ trait PineconeAssistantFileService extends CloseableService { /** * This operation uploads a file to a specified assistant. * - * @param assistantName The name of the assistant to upload file to. - * @param file A file to upload. - * @param displayFileName The name of the file to be displayed. + * @param assistantName + * The name of the assistant to upload file to. + * @param file + * A file to upload. + * @param displayFileName + * The name of the file to be displayed. * @return */ - def uploadFile(assistantName: String, file: File, displayFileName: Option[String] = None): Future[FileResponse] + def uploadFile( + assistantName: String, + file: File, + displayFileName: Option[String] = None + ): Future[FileResponse] /** - * - * @param assistantName The name of the assistant to get file from. - * @param fileId The UUID of the file to be described. + * @param assistantName + * The name of the assistant to get file from. + * @param fileId + * The UUID of the file to be described. * @return */ - def describeFile(assistantName: String, fileId: UUID): Future[Option[FileResponse]] + def describeFile( + assistantName: String, + fileId: UUID + ): Future[Option[FileResponse]] /** - * * @param assistantName - * @param fileId The UUID of the file to be described. + * @param fileId + * The UUID of the file to be described. * @return */ - def deleteFile(assistantName: String, fileId: UUID): Future[DeleteResponse] - + def deleteFile( + assistantName: String, + fileId: UUID + ): Future[DeleteResponse] /** - * This operation queries the completions endpoint of a Pinecone Assistant. - * For guidance and examples, see the chat with assistant guide. + * This operation queries the completions endpoint of a Pinecone Assistant. For guidance and + * examples, see the chat with assistant guide. * - * @param assistantName The name of the assistant to be described. - * @param messages An array of objects that represent the messages in a conversation. - * @return The ChatCompletionModel describes the response format of a chat request + * @param assistantName + * The name of the assistant to be described. + * @param messages + * An array of objects that represent the messages in a conversation. + * @return + * The ChatCompletionModel describes the response format of a chat request */ - def chatWithAssistant(assistantName: String, messages: Seq[String]): Future[ChatCompletionResponse] - + def chatWithAssistant( + assistantName: String, + messages: Seq[String] + ): Future[ChatCompletionResponse] } diff --git a/pinecone-core/src/main/scala/io/cequence/pineconescala/service/PineconeInferenceService.scala b/pinecone-core/src/main/scala/io/cequence/pineconescala/service/PineconeInferenceService.scala index 3464e16..766c0ce 100644 --- a/pinecone-core/src/main/scala/io/cequence/pineconescala/service/PineconeInferenceService.scala +++ b/pinecone-core/src/main/scala/io/cequence/pineconescala/service/PineconeInferenceService.scala @@ -1,6 +1,10 @@ package io.cequence.pineconescala.service -import io.cequence.pineconescala.domain.response.{GenerateEmbeddingsResponse, RerankResponse} +import io.cequence.pineconescala.domain.response.{ + EvaluateResponse, + EmbeddingsResponse, + RerankResponse +} import io.cequence.pineconescala.domain.settings.{GenerateEmbeddingsSettings, RerankSettings} import io.cequence.wsclient.service.CloseableService @@ -13,6 +17,8 @@ import scala.concurrent.Future * The following services are supported: * * - createEmbeddings + * - rerank + * - evaluate * * @since May * 2024 @@ -20,7 +26,7 @@ import scala.concurrent.Future trait PineconeInferenceService extends CloseableService with PineconeServiceConsts { /** - * Uses the specified model to generate embeddings for the input sequence. + * Uses the specified model to generate dense embeddings for the input sequence. * * @param inputs * Input sequence for which to generate embeddings. @@ -28,27 +34,78 @@ trait PineconeInferenceService extends CloseableService with PineconeServiceCons * @return * list of embeddings inside an envelope * - * @see Pinecone Doc + * @see + * Pinecone + * Doc */ // TODO: rename to embedData to be consistent with the API def createEmbeddings( inputs: Seq[String], settings: GenerateEmbeddingsSettings = DefaultSettings.GenerateEmbeddings - ): Future[GenerateEmbeddingsResponse] + ): Future[EmbeddingsResponse.Dense] + + /** + * Uses the specified model to generate sparse embeddings for the input sequence. + * + * @param inputs + * Input sequence for which to generate embeddings. + * @param settings + * @return + * list of embeddings inside an envelope + * + * @see + * Pinecone + * Doc + */ + def createSparseEmbeddings( + inputs: Seq[String], + settings: GenerateEmbeddingsSettings = DefaultSettings.GenerateEmbeddings + ): Future[EmbeddingsResponse.Sparse] /** * Using a reranker to rerank a list of items for a query. * - * @param query The query to rerank documents against (required) - * @param documents The documents to rerank (required) + * @param query + * The query to rerank documents against (required) + * @param documents + * The documents to rerank (required) * @param settings * @return * - * @see Pinecone Doc + * @see + * Pinecone + * Doc */ def rerank( query: String, documents: Seq[Map[String, Any]], settings: RerankSettings = DefaultSettings.Rerank ): Future[RerankResponse] + + /** + * Evaluate an answer + * + * The metrics_alignment endpoint evaluates the correctness, completeness, and alignment of a + * generated answer with respect to a question and a ground truth answer. The correctness and + * completeness are evaluated based on the precision and recall of the generated answer with + * respect to the ground truth answer facts. Alignment is the harmonic mean of correctness + * and completeness. + * + * Note: Originally in the Pinecone API this function is part of Assistant API. + * + * @param question + * The question for which the answer was generated. + * @param answer + * The generated answer. + * @param groundTruthAnswer + * The ground truth answer to the question. + * @return + */ + def evaluate( + question: String, + answer: String, + groundTruthAnswer: String + ): Future[EvaluateResponse] } diff --git a/pinecone-core/src/main/scala/io/cequence/pineconescala/service/PineconeServiceConsts.scala b/pinecone-core/src/main/scala/io/cequence/pineconescala/service/PineconeServiceConsts.scala index 5e24499..a2b1004 100644 --- a/pinecone-core/src/main/scala/io/cequence/pineconescala/service/PineconeServiceConsts.scala +++ b/pinecone-core/src/main/scala/io/cequence/pineconescala/service/PineconeServiceConsts.scala @@ -13,6 +13,8 @@ trait PineconeServiceConsts { protected val configFileName = "pinecone-scala-client.conf" + protected val apiVersion = "2025-04" // "2025-01" + object DefaultSettings { val Query = QuerySettings( @@ -37,7 +39,8 @@ trait PineconeServiceConsts { ) val GenerateEmbeddings = GenerateEmbeddingsSettings( - model = EmbeddingModelId.multilingual_e5_large + model = EmbeddingModelId.multilingual_e5_large, + input_type = Some(EmbeddingsInputType.Query) ) val Rerank = RerankSettings( diff --git a/pinecone-core/src/main/scala/io/cequence/pineconescala/service/PineconeVectorService.scala b/pinecone-core/src/main/scala/io/cequence/pineconescala/service/PineconeVectorService.scala index f5c9d93..496fa72 100644 --- a/pinecone-core/src/main/scala/io/cequence/pineconescala/service/PineconeVectorService.scala +++ b/pinecone-core/src/main/scala/io/cequence/pineconescala/service/PineconeVectorService.scala @@ -1,4 +1,4 @@ -package io.cequence.pineconescala.service +package io.cequence.pineconescala.service import io.cequence.pineconescala.domain.{PVector, SparseVector} import io.cequence.pineconescala.domain.response._ @@ -8,54 +8,71 @@ import io.cequence.pineconescala.domain.settings.QuerySettings import scala.concurrent.Future /** - * Central service to access all Pinecone vector operations/endpoints as defined at the API ref. page + * Central service to access all Pinecone vector operations/endpoints as defined at the API ref. page * * The following services are supported: * - * - describeIndexStats - * - query - by vector or by id (queryById) - * - delete - by filter or ids, or delete all - * - fetch - * - update - * - upsert - * - listVectorIDs + * - describeIndexStats + * - query - by vector or by id (queryById) + * - delete - by filter or ids, or delete all + * - fetch + * - update + * - upsert + * - listVectorIDs * - * @since Apr 2023 + * @since Apr + * 2023 */ trait PineconeVectorService extends PineconeServiceConsts { /** - * The DescribeIndexStats operation returns statistics about the index's contents, including the vector count per namespace and the number of dimensions. + * The DescribeIndexStats operation returns statistics about the index's contents, including + * the vector count per namespace and the number of dimensions. * - * @return IndexStats - * @see Pinecone Doc + * @return + * IndexStats + * @see + * Pinecone Doc */ def describeIndexStats: Future[IndexStats] /** - * The Query operation searches a namespace, using a query vector. - * It retrieves the ids of the most similar items in a namespace, along with their similarity scores. + * The Query operation searches a namespace, using a query vector. It retrieves the ids of + * the most similar items in a namespace, along with their similarity scores. * - * @param vector The query vector. This should be the same length as the dimension of the index being queried. - * @param namespace The namespace to query. - * @return model or None if not found - * @see Pinecone Doc + * @param vector + * The query vector. This should be the same length as the dimension of the index being + * queried. + * @param sparseVector + * Represented as a list of indices and a list of corresponded values, which must be the + * same length. + * @param namespace + * The namespace to query. + * @return + * model or None if not found + * @see + * Pinecone Doc */ def query( vector: Seq[Double], namespace: String, + sparseVector: Option[SparseVector] = None, settings: QuerySettings = DefaultSettings.Query ): Future[QueryResponse] /** - * The Query operation searches a namespace, using an unique id of the vector. - * It retrieves the ids of the most similar items in a namespace, along with their similarity scores. + * The Query operation searches a namespace, using an unique id of the vector. It retrieves + * the ids of the most similar items in a namespace, along with their similarity scores. * - * @param id The unique ID of the vector to be used as a query vector. + * @param id + * The unique ID of the vector to be used as a query vector. * @param namespace * @param settings - * @return QueryResult - * @see Pinecone Doc + * @return + * QueryResult + * @see + * Pinecone Doc */ def queryById( id: String, @@ -64,38 +81,59 @@ trait PineconeVectorService extends PineconeServiceConsts { ): Future[QueryResponse] /** - * The list operation lists the IDs of vectors in a single namespace. - * An optional prefix can be passed to limit the results to IDs with a common prefix. + * The list operation lists the IDs of vectors in a single namespace. An optional prefix can + * be passed to limit the results to IDs with a common prefix. * * Note: This operation seems to be working only for serverless indexes. * - * It returns up to 100 IDs at a time by default in sorted order (bitwise/"C" collation). - * If the limit parameter is set, list returns up to that number of IDs instead. - * Whenever there are additional IDs to return, the response also includes a pagination_token that you can use to get the next batch of IDs. - * When the response does not includes a pagination_token, there are no more IDs to return. + * It returns up to 100 IDs at a time by default in sorted order (bitwise/"C" collation). If + * the limit parameter is set, list returns up to that number of IDs instead. Whenever there + * are additional IDs to return, the response also includes a pagination_token that you can + * use to get the next batch of IDs. When the response does not includes a pagination_token, + * there are no more IDs to return. * * @param namespace * @param limit * @param paginationToken * @param prefix * - * @return List of vector IDs wrapped in a ListVectorIdsResponse - * @see Pinecone Doc + * @return + * List of vector IDs wrapped in a ListVectorIdsResponse + * @see + * Pinecone Doc */ def listVectorIDs( namespace: String, limit: Option[Int] = None, paginationToken: Option[String] = None, - prefix: Option[String] = None, + prefix: Option[String] = None ): Future[ListVectorIdsResponse] + /** + * Same as [[listVectorIDs]] but returns all the IDs in the namespace. + * + * @param namespace + * @param batchLimit + * @param prefix + * @return + */ + def listAllVectorsIDs( + namespace: String, + batchLimit: Option[Int] = None, + prefix: Option[String] = None + ): Future[Seq[VectorId]] + /** * The Delete operation deletes vectors, by id, from a single namespace. * - * @param ids Vectors to delete. - * @param namespace The namespace to delete vectors from, if applicable. - * @return N/A - * @see Pinecone Doc + * @param ids + * Vectors to delete. + * @param namespace + * The namespace to delete vectors from, if applicable. + * @return + * N/A + * @see + * Pinecone Doc */ def delete( ids: Seq[String], @@ -105,11 +143,13 @@ trait PineconeVectorService extends PineconeServiceConsts { /** * The Delete operation deletes vectors, by the metadata filter, from a single namespace. * - * @param filter The metadata filter here will be used to select the vectors to delete. - * See https://www.pinecone.io/docs/metadata-filtering/. + * @param filter + * The metadata filter here will be used to select the vectors to delete. See + * https://www.pinecone.io/docs/metadata-filtering/. * @param namespace * @return - * @see Pinecone Doc + * @see + * Pinecone Doc */ def delete( filter: Map[String, String], @@ -121,20 +161,23 @@ trait PineconeVectorService extends PineconeServiceConsts { * * @param namespace * @return - * @see Pinecone Doc + * @see + * Pinecone Doc */ def deleteAll( namespace: String ): Future[Unit] /** - * The Fetch operation looks up and returns vectors, by ID, from a single namespace. - * The returned vectors include the vector data and/or metadata. + * The Fetch operation looks up and returns vectors, by ID, from a single namespace. The + * returned vectors include the vector data and/or metadata. * - * @param id The vector IDs to fetch. Does not accept values containing spaces. + * @param id + * The vector IDs to fetch. Does not accept values containing spaces. * @param namespace * @return - * @see Pinecone Doc + * @see + * Pinecone Doc */ def fetch( ids: Seq[String], @@ -142,17 +185,25 @@ trait PineconeVectorService extends PineconeServiceConsts { ): Future[FetchResponse] /** - * The Update operation updates vector in a namespace. - * If a value is included, it will overwrite the previous value. - * If a set_metadata is included, the values of the fields specified in it will be added or overwrite the previous value. + * The Update operation updates vector in a namespace. If a value is included, it will + * overwrite the previous value. If a set_metadata is included, the values of the fields + * specified in it will be added or overwrite the previous value. * - * @param id Vector's unique id. - * @param namespace The namespace containing the vector to update. - * @param values Vector data. - * @param sparseValues Vector sparse data. Represented as a list of indices and a list of corresponded values, which must be the same length. - * @param setMetaData Metadata to set for the vector. - * @return N/A - * @see Pinecone Doc + * @param id + * Vector's unique id. + * @param namespace + * The namespace containing the vector to update. + * @param values + * Vector data. + * @param sparseValues + * Vector sparse data. Represented as a list of indices and a list of corresponded values, + * which must be the same length. + * @param setMetaData + * Metadata to set for the vector. + * @return + * N/A + * @see + * Pinecone Doc */ def update( id: String, @@ -160,16 +211,20 @@ trait PineconeVectorService extends PineconeServiceConsts { values: Seq[Double], sparseValues: Option[SparseVector] = None, setMetaData: Map[String, String] = Map() - ): Future[Unit] + ): Future[String] /** - * The Upsert operation writes vectors into a namespace. - * If a new value is upserted for an existing vector id, it will overwrite the previous value. + * The Upsert operation writes vectors into a namespace. If a new value is upserted for an + * existing vector id, it will overwrite the previous value. * - * @param vectors An array containing the vectors to upsert. Recommended batch limit is 100 vectors. - * @param namespace This is the namespace name where you upsert vectors. - * @return The number of vectors upserted. - * @see Pinecone Doc + * @param vectors + * An array containing the vectors to upsert. Recommended batch limit is 100 vectors. + * @param namespace + * This is the namespace name where you upsert vectors. + * @return + * The number of vectors upserted. + * @see + * Pinecone Doc */ def upsert( vectors: Seq[PVector], @@ -180,4 +235,4 @@ trait PineconeVectorService extends PineconeServiceConsts { * Closes the underlying ws client, and releases all its resources. */ def close(): Unit -} \ No newline at end of file +} diff --git a/project/Dependencies.scala b/project/Dependencies.scala index 3d732ee..3c7833f 100644 --- a/project/Dependencies.scala +++ b/project/Dependencies.scala @@ -1,6 +1,6 @@ object Dependencies { object Versions { - val wsClient = "0.5.8" + val wsClient = "0.7.2" } }