Internal Server Error (500) when using Gemini API with Inspect framework #573

@lennijusten

Description of the bug:

I consistently get an Internal Server Error (HTTP 500) when using the Google Gemini API through the Inspect evaluation framework. The error is raised during the SDK's generate_content_async call, at the point where it awaits the client's generate_content request. I'm currently on the free trial.

Steps to Reproduce

  1. Set up an evaluation using the Inspect framework
  2. Configure the evaluation to use the Gemini model (in my case, google/gemini-1.5-pro)
  3. Run the evaluation
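
To check whether the 500 comes from the API itself rather than from Inspect, the same request can be sent straight through the SDK. The following is a minimal sketch, not the code Inspect runs: the `reproduce` and `main` names, the prompt, and the `GOOGLE_API_KEY` environment variable are assumptions for illustration, and `gemini-1.5-pro` is assumed to be the SDK-side name for `google/gemini-1.5-pro`.

```python
import asyncio
import os


async def reproduce(prompt: str = "Say hello.", model_name: str = "gemini-1.5-pro") -> str:
    """Send one prompt straight to the SDK, with Inspect out of the loop."""
    # Imported lazily so this file still loads where the SDK is not installed.
    import google.generativeai as genai

    # Assumption: the API key is exported as GOOGLE_API_KEY.
    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    model = genai.GenerativeModel(model_name)
    # Same async entry point that appears in the traceback.
    response = await model.generate_content_async(prompt)
    return response.text


def main() -> None:
    """Run the reproduction once and print the model's reply."""
    print(asyncio.run(reproduce()))
```

Calling `main()` runs the request once. If this direct call also fails with a 500, that would suggest the problem is on the API side and not in the Inspect integration.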

Error Message

InternalServerError: 500 An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting
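
Since the message asks the caller to retry, one possible client-side workaround is to wrap the request in a retry loop with exponential backoff. This is a sketch under stated assumptions, not what Inspect or the SDK does internally: `TransientServerError` and `call_with_backoff` are hypothetical names, and in real use the `except` clause would catch `google.api_core.exceptions.InternalServerError` instead. Retrying only helps if the 500 is transient.

```python
import asyncio
import random


class TransientServerError(Exception):
    """Hypothetical stand-in for the SDK's InternalServerError."""


async def call_with_backoff(make_call, max_attempts=5, base_delay=1.0,
                            max_delay=30.0, sleep=asyncio.sleep):
    """Await make_call(), retrying on TransientServerError with backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await make_call()
        except TransientServerError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Double the delay on each failure, capped at max_delay,
            # then scale by random jitter so retries do not synchronize.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            await sleep(delay * random.uniform(0.5, 1.0))
```

Here `make_call` is any zero-argument function returning an awaitable, for example a lambda wrapping the model request. The `sleep` parameter exists only so the wait can be replaced in tests.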

Environment

  • Operating System: macOS Sonoma 14.5
  • Python version: 3.12.4

Package Versions

  • google-ai-generativelanguage: 0.6.6
  • google-api-core: 2.19.2
  • google-api-python-client: 2.143.0
  • google-auth: 2.34.0
  • google-auth-httplib2: 0.2.0
  • google-generativeai: 0.7.2
  • googleapis-common-protos: 1.65.0
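
For anyone comparing environments, the versions above can be collected with the standard library alone. This is a small sketch: the `PACKAGES` list simply mirrors the names in this report, and `installed_versions` and `main` are hypothetical helper names.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names copied from the list in this report.
PACKAGES = [
    "google-ai-generativelanguage",
    "google-api-core",
    "google-api-python-client",
    "google-auth",
    "google-auth-httplib2",
    "google-generativeai",
    "googleapis-common-protos",
]


def installed_versions(names=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def main():
    """Print one 'name: version' line per package."""
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Calling `main()` inside the project's virtual environment prints the list in the same order as above.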

Full error traceback:

╭─ benchmarks/gpqa (78 samples): google/gemini-1.5-pro ─────────────────────────────────────────────────────────────────────╮
│ ╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮   dataset: (samples) │
│ │ /Users/lenni/Documents/GitHub/biology-benchmarks/.venv/lib/python3.12/site-packages/inspect_ai/… │       scorer: choice │
│ │ in task_run                                                                                      │                      │
│ │                                                                                                  │                      │
│ │ /Users/lenni/Documents/GitHub/biology-benchmarks/.venv/lib/python3.12/site-packages/inspect_ai/… │                      │
│ │ in task_run_sample                                                                               │                      │
│ │                                                                                                  │                      │
│ │ /Users/lenni/Documents/GitHub/biology-benchmarks/.venv/lib/python3.12/site-packages/inspect_ai/… │                      │
│ │ in __call__                                                                                      │                      │
│ │                                                                                                  │                      │
│ │ /Users/lenni/Documents/GitHub/biology-benchmarks/.venv/lib/python3.12/site-packages/inspect_ai/… │                      │
│ │ in task_run_sample                                                                               │                      │
│ │                                                                                                  │                      │
│ │ /Users/lenni/Documents/GitHub/biology-benchmarks/.venv/lib/python3.12/site-packages/inspect_ai/… │                      │
│ │ in solve                                                                                         │                      │
│ │                                                                                                  │                      │
│ │ /Users/lenni/Documents/GitHub/biology-benchmarks/.venv/lib/python3.12/site-packages/inspect_ai/… │                      │
│ │ in generate                                                                                      │                      │
│ │                                                                                                  │                      │
│ │ /Users/lenni/Documents/GitHub/biology-benchmarks/.venv/lib/python3.12/site-packages/inspect_ai/… │                      │
│ │ in task_generate                                                                                 │                      │
│ │                                                                                                  │                      │
│ │ /Users/lenni/Documents/GitHub/biology-benchmarks/.venv/lib/python3.12/site-packages/inspect_ai/… │                      │
│ │ in generate                                                                                      │                      │
│ │                                                                                                  │                      │
│ │ /Users/lenni/Documents/GitHub/biology-benchmarks/.venv/lib/python3.12/site-packages/inspect_ai/… │                      │
│ │ in _generate                                                                                     │                      │
│ │                                                                                                  │                      │
│ │ /Users/lenni/Documents/GitHub/biology-benchmarks/.venv/lib/python3.12/site-packages/tenacity/as… │                      │
│ │ in async_wrapped                                                                                 │                      │
│ │                                                                                                  │                      │
│ │ /Users/lenni/Documents/GitHub/biology-benchmarks/.venv/lib/python3.12/site-packages/tenacity/as… │                      │
│ │ in __call__                                                                                      │                      │
│ │                                                                                                  │                      │
│ │ /Users/lenni/Documents/GitHub/biology-benchmarks/.venv/lib/python3.12/site-packages/tenacity/as… │                      │
│ │ in iter                                                                                          │                      │
│ │                                                                                                  │                      │
│ │ /Users/lenni/Documents/GitHub/biology-benchmarks/.venv/lib/python3.12/site-packages/tenacity/_u… │                      │
│ │ in inner                                                                                         │                      │
│ │                                                                                                  │                      │
│ │ /Users/lenni/Documents/GitHub/biology-benchmarks/.venv/lib/python3.12/site-packages/tenacity/__… │                      │
│ │ in <lambda>                                                                                      │                      │
│ │                                                                                                  │                      │
│ │ /opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.1… │                      │
│ │ in result                                                                                        │                      │
│ │                                                                                                  │                      │
│ │   446 │   │   │   │   if self._state in [CANCELLED, CANCELLED_AND_NOTIFIED]:                     │                      │
│ │   447 │   │   │   │   │   raise CancelledError()                                                 │                      │
│ │   448 │   │   │   │   elif self._state == FINISHED:                                              │                      │
│ │ ❱ 449 │   │   │   │   │   return self.__get_result()                                             │                      │
│ │   450 │   │   │   │                                                                              │                      │
│ │   451 │   │   │   │   self._condition.wait(timeout)                                              │                      │
│ │   452                                                                                            │                      │
│ │                                                                                                  │                      │
│ │ /opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.1… │                      │
│ │ in __get_result                                                                                  │                      │
│ │                                                                                                  │                      │
│ │   398 │   def __get_result(self):                                                                │                      │
│ │   399 │   │   if self._exception:                                                                │                      │
│ │   400 │   │   │   try:                                                                           │                      │
│ │ ❱ 401 │   │   │   │   raise self._exception                                                      │                      │
│ │   402 │   │   │   finally:                                                                       │                      │
│ │   403 │   │   │   │   # Break a reference cycle with the exception in self._exception            │                      │
│ │   404 │   │   │   │   self = None                                                                │                      │
│ │                                                                                                  │                      │
│ │ /Users/lenni/Documents/GitHub/biology-benchmarks/.venv/lib/python3.12/site-packages/tenacity/as… │                      │
│ │ in __call__                                                                                      │                      │
│ │                                                                                                  │                      │
│ │ /Users/lenni/Documents/GitHub/biology-benchmarks/.venv/lib/python3.12/site-packages/inspect_ai/… │                      │
│ │ in generate                                                                                      │                      │
│ │                                                                                                  │                      │
│ │ /Users/lenni/Documents/GitHub/biology-benchmarks/.venv/lib/python3.12/site-packages/inspect_ai/… │                      │
│ │ in generate                                                                                      │                      │
│ │                                                                                                  │                      │
│ │ /Users/lenni/Documents/GitHub/biology-benchmarks/.venv/lib/python3.12/site-packages/google/gene… │                      │
│ │ in generate_content_async                                                                        │                      │
│ │                                                                                                  │                      │
│ │   382 │   │   │   │   │   )                                                                      │                      │
│ │   383 │   │   │   │   return await generation_types.AsyncGenerateContentResponse.from_aiterato   │                      │
│ │   384 │   │   │   else:                                                                          │                      │
│ │ ❱ 385 │   │   │   │   response = await self._async_client.generate_content(                      │                      │
│ │   386 │   │   │   │   │   request,                                                               │                      │
│ │   387 │   │   │   │   │   **request_options,                                                     │                      │
│ │   388 │   │   │   │   )                                                                          │                      │
│ │                                                                                                  │                      │
│ │ /Users/lenni/Documents/GitHub/biology-benchmarks/.venv/lib/python3.12/site-packages/google/ai/g… │                      │
│ │ in generate_content                                                                              │                      │
│ │                                                                                                  │                      │
│ │    403 │   │   self._client._validate_universe_domain()                                          │                      │
│ │    404 │   │                                                                                     │                      │
│ │    405 │   │   # Send the request.                                                               │                      │
│ │ ❱  406 │   │   response = await rpc(                                                             │                      │
│ │    407 │   │   │   request,                                                                      │                      │
│ │    408 │   │   │   retry=retry,                                                                  │                      │
│ │    409 │   │   │   timeout=timeout,                                                              │                      │
│ │                                                                                                  │                      │
│ │ /Users/lenni/Documents/GitHub/biology-benchmarks/.venv/lib/python3.12/site-packages/google/api_… │                      │
│ │ in retry_wrapped_func                                                                            │                      │
│ │                                                                                                  │                      │
│ │   227 │   │   │   sleep_generator = exponential_sleep_generator(                                 │                      │
│ │   228 │   │   │   │   self._initial, self._maximum, multiplier=self._multiplier                  │                      │
│ │   229 │   │   │   )                                                                              │                      │
│ │ ❱ 230 │   │   │   return await retry_target(                                                     │                      │
│ │   231 │   │   │   │   functools.partial(func, *args, **kwargs),                                  │                      │
│ │   232 │   │   │   │   predicate=self._predicate,                                                 │                      │
│ │   233 │   │   │   │   sleep_generator=sleep_generator,                                           │                      │
│ │                                                                                                  │                      │
│ │ /Users/lenni/Documents/GitHub/biology-benchmarks/.venv/lib/python3.12/site-packages/google/api_… │                      │
│ │ in retry_target                                                                                  │                      │
│ │                                                                                                  │                      │
│ │   157 │   │   # This function explicitly must deal with broad exceptions.                        │                      │
│ │   158 │   │   except Exception as exc:                                                           │                      │
│ │   159 │   │   │   # defer to shared logic for handling errors                                    │                      │
│ │ ❱ 160 │   │   │   _retry_error_helper(                                                           │                      │
│ │   161 │   │   │   │   exc,                                                                       │                      │
│ │   162 │   │   │   │   deadline,                                                                  │                      │
│ │   163 │   │   │   │   sleep,                                                                     │                      │
│ │                                                                                                  │                      │
│ │ /Users/lenni/Documents/GitHub/biology-benchmarks/.venv/lib/python3.12/site-packages/google/api_… │                      │
│ │ in _retry_error_helper                                                                           │                      │
│ │                                                                                                  │                      │
│ │   209 │   │   │   RetryFailureReason.NON_RETRYABLE_ERROR,                                        │                      │
│ │   210 │   │   │   original_timeout,                                                              │                      │
│ │   211 │   │   )                                                                                  │                      │
│ │ ❱ 212 │   │   raise final_exc from source_exc                                                    │                      │
│ │   213 │   if on_error_fn is not None:                                                            │                      │
│ │   214 │   │   on_error_fn(exc)                                                                   │                      │
│ │   215 │   if deadline is not None and time.monotonic() + next_sleep > deadline:                  │                      │
│ │                                                                                                  │                      │
│ │ /Users/lenni/Documents/GitHub/biology-benchmarks/.venv/lib/python3.12/site-packages/google/api_… │                      │
│ │ in retry_target                                                                                  │                      │
│ │                                                                                                  │                      │
│ │   152 │                                                                                          │                      │
│ │   153 │   for sleep in sleep_generator:                                                          │                      │
│ │   154 │   │   try:                                                                               │                      │
│ │ ❱ 155 │   │   │   return await target()                                                          │                      │
│ │   156 │   │   # pylint: disable=broad-except                                                     │                      │
│ │   157 │   │   # This function explicitly must deal with broad exceptions.                        │                      │
│ │   158 │   │   except Exception as exc:                                                           │                      │
│ │                                                                                                  │                      │
│ │ /Users/lenni/Documents/GitHub/biology-benchmarks/.venv/lib/python3.12/site-packages/google/api_… │                      │
│ │ in __await__                                                                                     │                      │
│ │                                                                                                  │                      │
│ │    85 │   │   │   response = yield from self._call.__await__()                                   │                      │
│ │    86 │   │   │   return response                                                                │                      │
│ │    87 │   │   except grpc.RpcError as rpc_error:                                                 │                      │
│ │ ❱  88 │   │   │   raise exceptions.from_grpc_error(rpc_error) from rpc_error                     │                      │
│ │    89                                                                                            │                      │
│ │    90                                                                                            │                      │
│ │    91 class _WrappedStreamResponseMixin(Generic[P], _WrappedCall):                               │                      │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────────╯                      │
│ InternalServerError: 500 An internal error has occurred. Please retry or report in                                        │
│ https://developers.generativeai.google/guide/troubleshooting  

Actual vs expected behavior:

No response

Any other information you'd like to share?

No response

Labels

  • component:api (Issues related to the API, not the SDK.)
  • status:triaged (Issue/PR triaged to the corresponding sub-team)
  • type:question (Support-related issues)
