Description
Is your feature request related to a problem? Please describe.
In Storage, a list objects operation returns a response body with both a list of items and a list of prefixes. The items are objects ordered lexicographically by name, whereas the prefixes operate like a directory listing.
The storage objects.list API allows a query parameter maxResults that represents the maximum combined number of entries in items[] and prefixes[] to return in a single page of responses.
While the Iterator class supports maxResults, it only accounts for the maximum number of items.
Describe the solution you'd like
Is there a way to have HTTPIterator.max_results and HTTPIterator.num_results account for results other than items from the response body? Currently, only items are counted toward num_results and, consequently, _has_next_page():
def _items_iter(self):
    """Iterator for each item returned."""
    for page in self._page_iter(increment=False):
        for item in page:
            self.num_results += 1
            yield item
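To illustrate the requested behavior, here is a minimal, self-contained sketch of an iterator that counts both prefixes and items toward max_results. It is not the library's implementation: the class name CombinedCountIterator, the _exhausted helper, and the in-memory list of response dicts standing in for HTTP calls are all assumptions made for illustration.

```python
from typing import Any, Dict, Iterator, List, Optional


class CombinedCountIterator:
    """Hypothetical iterator that counts items AND prefixes toward max_results."""

    def __init__(self, pages: List[Dict[str, Any]], max_results: Optional[int] = None):
        # `pages` is a stand-in for successive objects.list response bodies.
        self._pages = pages
        self.max_results = max_results
        self.num_results = 0
        self.prefixes: List[str] = []

    def _exhausted(self) -> bool:
        # True once the combined count has reached the cap (if a cap is set).
        return self.max_results is not None and self.num_results >= self.max_results

    def __iter__(self) -> Iterator[Any]:
        for response in self._pages:
            # Prefixes are collected on the side but still count as results.
            for prefix in response.get("prefixes", []):
                if self._exhausted():
                    return
                self.prefixes.append(prefix)
                self.num_results += 1
            # Items are yielded to the caller and counted as before.
            for item in response.get("items", []):
                if self._exhausted():
                    return
                self.num_results += 1
                yield item


pages = [
    {"prefixes": ["a/", "b/"], "items": ["x", "y"]},
    {"prefixes": ["c/"], "items": ["z"]},
]
it = CombinedCountIterator(pages, max_results=3)
items = list(it)
```

In this sketch prefixes are consumed before items within each page; that ordering is an arbitrary choice, and a real change would need to decide how the cap is split between the two lists.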
The GCS objects.list API call returns a response body with the following structure:
{
  "kind": "storage#objects",
  "nextPageToken": string,
  "prefixes": [
    string
  ],
  "items": [
    objects Resource
  ]
}
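Given that structure, the combined entry count that maxResults refers to can be computed from a single response body. The following is a small sketch; the helper name count_entries and the sample payload are made up for illustration and are not part of any library.

```python
import json


def count_entries(response_body: str) -> int:
    """Return the combined number of items and prefixes in one response body."""
    response = json.loads(response_body)
    # Either key may be absent when the page has no entries of that kind.
    return len(response.get("items", [])) + len(response.get("prefixes", []))


# Illustrative payload only; real item resources carry many more fields.
sample = json.dumps(
    {
        "kind": "storage#objects",
        "prefixes": ["logs/", "tmp/"],
        "items": [{"name": "readme.txt"}],
    }
)
total = count_entries(sample)
```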
Additional context
Customers are concerned about running into memory issues when all prefixes are returned and max_results has no effect. Internal b/401531977
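from google.cloud import storage

client = storage.Client()
# bucket_name is a placeholder for the name of an existing bucket.
it = client.list_blobs(
    bucket_name,
    delimiter='/',
    prefix="some_prefix_pattern",
    max_results=100,
)
list(it)  # consume the iterator so that it.prefixes is populated
print(len(it.prefixes))  # All 755 prefixes are returned despite max_results=100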