[Feat] Support slice for reader #43

Merged
haochengxia merged 2 commits into main from hxia/slice
Aug 12, 2025
Conversation

haochengxia
Collaborator

@haochengxia haochengxia commented Aug 12, 2025

Example

import logging

import libcachesim as lcs

# Show debug-level log messages
logging.basicConfig(level=logging.DEBUG)

URI = "s3://cache-datasets/cache_dataset_oracleGeneral/2007_msr/msr_hm_0.oracleGeneral.zst"
reader = lcs.TraceReader(
    trace=URI,
    trace_type=lcs.TraceType.ORACLE_GENERAL_TRACE,
    reader_init_params=lcs.ReaderInitParam(ignore_obj_size=False),
)

# Slice: the first three requests
for req in reader[:3]:
    print(req.obj_id, req.obj_size)

# Slice: the requests at indices 1, 2, and 3
for req in reader[1:4]:
    print(req.obj_id, req.obj_size)

# Plain iteration: rewind the reader, then stop manually after four requests
reader.reset()
read_n_req = 4
for req in reader:
    if read_n_req <= 0:
        break
    print(req.obj_id, req.obj_size)
    read_n_req -= 1

@haochengxia haochengxia changed the title Support slice for reader [Feat] Support slice for reader Aug 12, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Summary of Changes

Hello @haochengxia, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request adds slice and index-based access to the trace readers in the libcachesim library, so users can read a specific range of requests from a large trace dataset without iterating through the entire stream. It also includes smaller changes to cache management and reader utility functions.

Highlights

  • Slice and Indexing Support for Readers: The SyntheticReader and TraceReader classes now implement the __getitem__ method, enabling direct access to individual requests by index (e.g., reader[5]) and iteration over ranges of requests using slice notation (e.g., reader[:10], reader[5:15]). This significantly enhances the flexibility of reading trace data.
  • New Example for Slice Reader Functionality: A new example script, examples/slice_reader.py, has been added to demonstrate the practical usage of the new slice and indexing capabilities for TraceReader, providing clear guidance for users.
  • Dynamic Cache Size Adjustment: A set_cache_size method has been added to the Cache class and exposed to Python, allowing for dynamic adjustment of the cache's capacity during runtime.
  • TraceFormat Enumeration Export: The TraceFormat enumeration has been exported to Python, providing clearer distinctions between different trace file formats (binary, text, invalid) within the Python interface.
  • Improved skip_n_req Return Value: The skip_n_req function's Python binding now returns the actual number of requests successfully skipped, offering more precise feedback on its operation.
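To make the first highlight concrete, here is a minimal sketch of how a `__getitem__` method can accept both an integer index and a slice. `ToyReader` and `Request` are hypothetical in-memory stand-ins written for illustration; this is not the code added to `TraceReader` or `SyntheticReader` in this pull request.

```python
from dataclasses import dataclass
from typing import Iterator, List, Union


@dataclass
class Request:
    obj_id: int
    obj_size: int


class ToyReader:
    """Hypothetical in-memory reader; not libcachesim's TraceReader."""

    def __init__(self, requests: List[Request]) -> None:
        self._requests = requests

    def __len__(self) -> int:
        return len(self._requests)

    def __iter__(self) -> Iterator[Request]:
        return iter(self._requests)

    def __getitem__(self, key: Union[int, slice]):
        if isinstance(key, slice):
            # slice.indices() resolves omitted or negative bounds and clamps
            # them to the reader length.
            start, stop, step = key.indices(len(self))
            return (self._requests[i] for i in range(start, stop, step))
        if isinstance(key, int):
            if key < 0:
                key += len(self)
            if not 0 <= key < len(self):
                raise IndexError("request index out of range")
            return self._requests[key]
        raise TypeError(f"indices must be int or slice, not {type(key).__name__}")


reader = ToyReader([Request(obj_id=i, obj_size=100 * (i + 1)) for i in range(6)])
first_three = [r.obj_id for r in reader[:3]]  # requests 0, 1, 2
middle = [r.obj_id for r in reader[1:4]]      # requests 1, 2, 3
single = reader[5].obj_id                     # one request by index
```

In this sketch a slice returns a lazy generator rather than a list, so a caller can loop over `reader[:3]` without materializing the range first. Whether the real readers behave the same way is not stated in this conversation.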

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a valuable feature by adding slicing support to TraceReader and SyntheticReader, which greatly improves data access flexibility. The implementation for SyntheticReader is solid. However, the implementation for TraceReader has some critical performance and correctness issues, particularly in the TraceReaderSliceIterator and __getitem__ methods, that need to be addressed. Additionally, the new set_cache_size C++ binding is potentially unsafe. My review includes detailed feedback on these points.
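The review summary does not spell out the specific issues it found. As general background only, a common way to serve a slice from a forward-only stream is to discard the first `start` items once and then read sequentially up to `stop`. The sketch below uses the standard library's `itertools.islice` for this; `slice_stream` and `fake_trace` are hypothetical names, and this is not the `TraceReaderSliceIterator` from the pull request.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def slice_stream(stream: Iterable[T], start: int, stop: int) -> Iterator[T]:
    """Yield items start..stop-1 from a forward-only stream in a single pass."""
    if start < 0 or stop < 0:
        # Negative indices would need the stream length, which is unknown here.
        raise ValueError("start and stop must be non-negative")
    # islice consumes and discards the first `start` items once, then yields
    # until `stop`; it never rewinds or re-reads earlier items.
    return islice(stream, start, stop)


def fake_trace(n: int) -> Iterator[int]:
    """Hypothetical stand-in for a trace: lazily yields object ids 0..n-1."""
    for obj_id in range(n):
        yield obj_id


# Only the first four items of the million-item stream are ever produced.
window = list(slice_stream(fake_trace(1_000_000), 1, 4))
```

The cost of this approach grows with `stop` rather than with the length of the trace, which is the property a slice over a large trace file generally wants.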

@haochengxia haochengxia merged commit 39edefb into main Aug 12, 2025
32 checks passed
@haochengxia haochengxia deleted the hxia/slice branch August 12, 2025 17:04