Feat: Allowing evaluations using Ragas Metrics in EvalTask #5197
This PR enables evaluation using the Ragas framework alongside the existing Vertex metrics.
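As a rough illustration of the intended usage (a hypothetical sketch, not code from this PR; the exact import paths, accepted metric types, dataset column names, and the `"fluency"` / experiment values below are assumptions), a Ragas metric instance would be passed in the same `metrics` list as the built-in Vertex metrics:

```python
import pandas as pd
from vertexai.evaluation import EvalTask  # exact module path may differ
from ragas.metrics import faithfulness    # pre-built Ragas metric instance

# Small illustrative dataset; the required column names depend on the metrics used.
eval_dataset = pd.DataFrame(
    {
        "prompt": ["What is the boiling point of water at sea level?"],
        "response": ["Water boils at 100 °C (212 °F) at sea level."],
        "context": ["At standard atmospheric pressure, water boils at 100 °C."],
    }
)

eval_task = EvalTask(
    dataset=eval_dataset,
    metrics=[
        "fluency",     # existing Vertex model-based metric
        faithfulness,  # Ragas metric, evaluated in the separate Ragas pass
    ],
    experiment="ragas-integration-demo",  # hypothetical experiment name
)

result = eval_task.evaluate()
print(result.summary_metrics)
```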
Implementation Details
Ragas metric evaluation is executed in a separate loop after the main executor loop in which the Vertex metrics are evaluated. This separate implementation was necessary because:
- Ragas performs evaluation asynchronously, while the existing evaluation infrastructure uses multi-threading.
- Combining these approaches led to several runtime errors:
  - `BlockingIOError: [Errno 35] Resource temporarily unavailable` raised inside gRPC polling callbacks
  - "Future attached to a different loop" errors when async Ragas calls were invoked on one event loop but processed by another (a minimal reproduction follows this list)
  - Synchronous Ragas functions (wrappers around the async implementations) caused similar loop conflicts
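The loop-affinity failure in particular can be reproduced outside the evaluation code entirely; the snippet below is a standalone illustration (not taken from the PR) in which a future created on one event loop is awaited from a task running on another:

```python
import asyncio

# A future created on one event loop...
loop_a = asyncio.new_event_loop()
foreign_future = loop_a.create_future()

async def await_foreign_future():
    # ...awaited from a coroutine running on a different loop.
    await foreign_future

loop_b = asyncio.new_event_loop()
try:
    loop_b.run_until_complete(await_foreign_future())
except RuntimeError as err:
    # e.g. "Task ... got Future <Future pending> attached to a different loop"
    print(err)
finally:
    loop_a.close()
    loop_b.close()
```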
Attempted Solutions
Multiple approaches were tested to integrate Ragas within the existing evaluation loop.
Final Solution
The chosen implementation runs Ragas metrics separately after the main evaluation loop completes, preserving both the multi-threaded performance of the existing evaluation system and the asynchronous benefits of Ragas, while avoiding runtime conflicts between the two approaches.
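A minimal sketch of this two-phase structure is shown below; the helper names and signatures are placeholders for illustration, not the functions actually added in `_evaluation.py`:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

# Placeholder helpers -- names and signatures are illustrative only.
def compute_vertex_metric(metric, row):
    """Stand-in for a blocking call to the Vertex evaluation service."""
    return {metric: 1.0}

async def compute_ragas_metric(metric, row):
    """Stand-in for an async Ragas scoring call."""
    await asyncio.sleep(0)
    return {metric: 1.0}

def evaluate(rows, vertex_metrics, ragas_metrics):
    results = []

    # Phase 1: Vertex metrics keep the existing multi-threaded executor.
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(compute_vertex_metric, metric, row)
            for row in rows
            for metric in vertex_metrics
        ]
        results.extend(future.result() for future in futures)

    # Phase 2: Ragas metrics run afterwards on a single dedicated event loop,
    # so no coroutine is ever scheduled on a loop owned by another thread.
    async def run_ragas():
        return await asyncio.gather(
            *(
                compute_ragas_metric(metric, row)
                for row in rows
                for metric in ragas_metrics
            )
        )

    results.extend(asyncio.run(run_ragas()))
    return results
```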
The diagram illustrates the functional organization within `_evaluation.py` where the changes have been implemented. Yellow boxes indicate functions that import from the Ragas framework.
Testing
A complete end-to-end example demonstrating the implementation is available in the accompanying gist, which shows successful execution without runtime errors:
https://gist.github.com/sahusiddharth/39030eb6318a16b7cdc3d30c6a7c458b