-
Notifications
You must be signed in to change notification settings - Fork 7.8k
Context Sensitive Lexer #1054
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Context Sensitive Lexer #1054
Conversation
e3c8519
to
94a4100
Compare
static unsigned int in_class = 0; | ||
static zend_bool in_const_list = 0; | ||
|
||
ZEND_ASSERT(in_class >= 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in_class is an unsigned int ... I would be very surprised if it became smaller than zero ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a test with static unsigned int in_class = -1;
and it caused strange bugs without errors here :P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the unsigned and kept the assertion because it's currently not very obvious to detect bugs if somebody decrements in_class
in wrong conditions.
8f862e7
to
ddea7e8
Compare
4c384aa
to
636e210
Compare
It's great idea! |
a8effa2
to
1fe57cd
Compare
I have a question though, does this also apply to classes, interfaces or traits, or only to properties, methods and constants? Lets say if scalar type hinting is passed and the scalar names become reserved words, while I have an Int class and a String class. Will this RFC solve the problem? |
Hi,
No. I tried a more ambitious RFC before that targeted namespaces, classes, A pure lexical approach, without significant changes on the lexer Anyway, the current solution is a great start and already mitigates a lot
Hope that's useful information. Thanks, |
I see, thanks for your response, I will be looking forward to the more ambitious case-sensitive patch too. It's unfortunate that scalar names will become reserved words, I have mixed feelings about scalar type hinting. Part of me like it since its a step towards strongly typed PHP I was hoping for, but reserving scalar types makes it inconvenient for me. I know Levi Morrison proposed reserve some types in PHP 7, I am okay with his proposal though as I fully understand why this may be necessary. |
#define ZEND_RESERVED_PRIVATE 4 | ||
#define ZEND_RESERVED_PROTECTED 5 | ||
#define ZEND_RESERVED_PUBLIC 6 | ||
#define ZEND_RESERVED_CLASS 7 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the distinction between the individual reserved keywords really relevant here? It look to me like it would be good enough to just return a boolean from the check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could be useful in case one wants to check for a specific identifier to give more specialized err messages, like: https://github.com/php/php-src/pull/1054/files#diff-3a8139128d4026ce0cb0c86beba4e6b9R4511
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've seen the error message check, but the distinction seems really weird there ... it's pretty much the same message with slightly different wording. That does not seem worthwhile.
yy_push_state(ST_LOOKING_FOR_SR_NAME); | ||
} | ||
return ','; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about const FOO = [1, isset(BAR[0])];
will the isset
be lexed as T_ISSET or as T_STRING? It should be a T_ISSET. We currently don't support any language constructors in constants I think, but that's just because they don't happen to be implemented, we may want to add them in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
T_ISSET because the ST_LOOKING_FOR_SR_NAME pops itself after first match so in const FOO = [1, isset(BAR[0])];
only FOO
will be the T_STRING.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand your response. In that case in_class
and in_const_list
should be 1, so that would push LOOKING_FOR_SR_NAME
after the ,
- or not? How does it distinguish between A = 1, B = 2
and A = [1, B]
for example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅
OK. Fixed with marcioAlmada@a8b1f21. Thanks ^^
superseded by #1158 |
This aims to be a consistent implementation of semi-reserved words grammar on OO context and OO access context.
RFC: https://wiki.php.net/rfc/context_sensitive_lexer