Add string interning #2325

Merged
merged 3 commits into from
Nov 16, 2020

coolreader18
Member

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `./rustpython-norm benchmarks/pystone.py 500000` | 78.248 ± 1.790 | 76.982 | 79.513 | 1.12 ± 0.05 |
| `./rustpython-opt benchmarks/pystone.py 500000` | 70.099 ± 2.543 | 68.301 | 71.897 | 1.00 |

@coolreader18
Member Author

Ah, crap, I meant to make the rustpython_bytecode::bytecode flattening a separate commit, one sec

@coolreader18 coolreader18 force-pushed the coolreader18/string-intern branch from 039086c to 3fc525d Compare November 9, 2020 02:59
@coolreader18 coolreader18 force-pushed the coolreader18/string-intern branch 2 times, most recently from 2448976 to 9fa7014 Compare November 9, 2020 04:10
@coolreader18
Member Author

coolreader18 commented Nov 9, 2020

Alright, this is good for review now

@coolreader18 coolreader18 force-pushed the coolreader18/string-intern branch from 9fa7014 to 2f232ec Compare November 9, 2020 21:52
Member

@youknowone youknowone left a comment

Would you explain a bit about PyStrExact?
I'm curious how this works in CPython and what benefit it brings to RustPython compared to a bare PyStr.

@@ -172,37 +181,39 @@ pub enum ConversionFlag {
    Repr,
}

pub type NameIdx = usize;
Member

Is this intended to be used with arithmetic operations? Otherwise, I think struct(usize) would be better.
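
For illustration, the newtype suggested in this comment could look roughly like the sketch below. This is not the code from the PR: `NameTable` and its methods are hypothetical names made up for the example, and only `NameIdx` comes from the diff. The point is that a tuple struct cannot be mixed up with an arbitrary `usize` or used in arithmetic by accident.

```rust
/// Hypothetical newtype version of `NameIdx`: a tuple struct wrapping `usize`
/// instead of a plain `type NameIdx = usize;` alias.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct NameIdx(pub usize);

/// Hypothetical table of names, indexed only through `NameIdx`.
pub struct NameTable {
    names: Vec<String>,
}

impl NameTable {
    pub fn new() -> Self {
        NameTable { names: Vec::new() }
    }

    /// Stores a name and returns the typed index it was stored at.
    pub fn push(&mut self, name: &str) -> NameIdx {
        self.names.push(name.to_string());
        NameIdx(self.names.len() - 1)
    }

    /// Looks a name up by typed index; out-of-range indices yield `None`.
    pub fn get(&self, idx: NameIdx) -> Option<&str> {
        self.names.get(idx.0).map(|s| s.as_str())
    }
}
```

With the alias, `idx + 1` compiles silently; with the struct, callers have to unwrap `idx.0` explicitly before doing arithmetic.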

Comment on lines +117 to +120
bytecode::BorrowedConstant::Str { value } if value.len() <= 20 => {
    vm.intern_string(value).into_object()
}
bytecode::BorrowedConstant::Str { value } => vm.ctx.new_str(value),
Member

Is constant interning also affected by string length?

Member Author

@coolreader18 coolreader18 Nov 14, 2020

Oh, I think I confused it with this, it should be interned still
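
As a rough sketch of what interning every string constant regardless of length means, here is a minimal standalone interner. It is not RustPython's `vm.intern_string`; the `Interner` type, its `HashSet<Rc<str>>` pool, and the method names are assumptions made for the example.

```rust
use std::collections::HashSet;
use std::rc::Rc;

/// Hypothetical string interner: keeps one shared allocation per distinct string.
#[derive(Default)]
pub struct Interner {
    pool: HashSet<Rc<str>>,
}

impl Interner {
    /// Returns the pooled copy of `s`, adding it to the pool on first use.
    /// There is no length cutoff: long and short strings are treated alike.
    pub fn intern(&mut self, s: &str) -> Rc<str> {
        if let Some(existing) = self.pool.get(s) {
            return Rc::clone(existing);
        }
        let fresh: Rc<str> = Rc::from(s);
        self.pool.insert(Rc::clone(&fresh));
        fresh
    }

    /// Number of distinct strings currently pooled.
    pub fn len(&self) -> usize {
        self.pool.len()
    }
}
```

Two calls with equal contents return pointers to the same allocation, so equality between interned strings can be decided by pointer comparison.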

@coolreader18
Member Author

> Would you explain a bit about PyStrExact?

Yeah, so it's a lot easier to deal with interning if you know that everything in the dict is strictly a str, because subclasses' __eq__ or __hash__ implementations might raise an exception. I might actually make a PyRefExact<T> struct, though, to make it general over any type.
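
To make the idea concrete, here is a simplified sketch of an "exact" wrapper. It does not reflect RustPython's real object model: `Value`, `StrExact`, and `try_from_value` are hypothetical names, and a two-variant enum stands in for "exact str" versus "instance of a str subclass". The wrapper can only be constructed from an exact string, so code holding one can hash and compare it without running user-defined `__eq__` or `__hash__`.

```rust
/// Hypothetical stand-in for a Python object that is some kind of string.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    /// An exact `str`.
    Str(String),
    /// An instance of a `str` subclass, which may override `__eq__`/`__hash__`.
    StrSubclass { class: String, value: String },
}

/// Hypothetical exact-type wrapper: only constructible from `Value::Str`.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct StrExact(String);

impl StrExact {
    /// Succeeds for exact strings and rejects subclass instances.
    pub fn try_from_value(v: &Value) -> Option<StrExact> {
        match v {
            Value::Str(s) => Some(StrExact(s.clone())),
            Value::StrSubclass { .. } => None,
        }
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

An intern table keyed by `StrExact` then gets infallible hashing and equality from the derived `Hash` and `Eq` impls.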

@coolreader18 coolreader18 merged commit 64b951e into master Nov 16, 2020
@coolreader18 coolreader18 deleted the coolreader18/string-intern branch November 16, 2020 15:07