-
-
Notifications
You must be signed in to change notification settings - Fork 9.6k
[HttpFoundation] implement session locking for PDO #10908
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
tests are broken |
Yeah I'm on it |
Done. |
Thanks @Tobion. |
This PR was merged into the 2.3 branch. Discussion ---------- [HttpFoundation] implement session locking for PDO | Q | A | ------------- | --- | Bug fix? | yes | New feature? | no | BC breaks? | no | Deprecations? | no | Tests pass? | yes | Fixed tickets | #4976 for PDO | License | MIT This is probably the first Session Handler for databases that actually works with locking. I've seen many implementations of session handlers (mostly only for one database vendor) while researching and none used locking. Not even the [PHPs SQLite session handler](https://github.com/php/php-src/blob/PHP-5.3/ext/sqlite/sess_sqlite.c) or [PECL Postgres Handler](http://svn.php.net/viewvc/pecl/session_pgsql/trunk/session_pgsql.c?revision=326806&view=markup) implemented locking correctly which is probably the reason why they have been discontinued. [Zend Session](https://github.com/zendframework/zf2/blob/master/library/Zend/Session/SaveHandler/DbTableGateway.php) seems not to use locking either. But it saves the lifetime together with the session which seems like a good idea because you could have different lifetimes for different sessions. - Implements session locking for MySQL, Postgres, Oracle, SQL Server and SQLite. Only tested it for MySQL. So would be good if someone can confirm it works as intended on the other databases as well. - Also removed the custom RuntimeException which is not useful and a PDOException extends RuntimeException anyway, so no BC break. - I added a default for the table name to be in line with the DoctrineSessionHandler. - Check session.gc_maxlifetime in read(). Imagine we have only ever one user on an app. If maxlifetime is not checked in read, his session would never expire! What I don't get is why PHP calls gc() after read() instead of calling it before... Strange decision. For this reason I also had to do the following to improve performance. - I delay gc() to close() so that it is executed outside the transactional and blocking read-write process. This way, pruning expired sessions does not block them from being started while the current session is used. - Fixed time update for Oracle and SQL Server. Commits ------- 50ec828 [HttpFoundation] implement session locking for PDO
This does break our implementation, even if we use a different connection for the session management. Our session management enforces a write sequence, so on every session write, we remember a timestamp in the session in combination with the written data. If a session wants to write its data, it has to check if in the meantime newer data has been written. This forces our session management to read the session data again before writing it, resulting in a BadMethodCallException, because the read wants to begin a transaction - which it already started at the beginning of the session. |
This sounds like a completely different thing and not the approach that PHP or symfony uses. You would need to write the session each time you call |
We don't write on every session modification. That's why we need to check if we are operating on out of date data on session_write_close. This is the point, where you could merge intelligently or indeed decide to lose your data to avoid overwriting newer data. |
The point that read() is never called twice from PHP without closing it first again (you cannot start a session twice) and SessionHandlerInterface is not supposed to be called manually. They are callbacks in from session lifecycle. Also since this session handler implements blocking, read data is always up to date. Otherwise it would block. |
Yeah, that makes sense. The idea behind our implementation, that forces the correct write sequence in parallel calls was, that you don't have to block. We've got pages, that use a lot of ajax requests and we don't want them to run one after another due to blocking. |
Yes I see your point. I think the main problem is that people don't use the session right. If you have a read-only session, all you need to do is start the session and immediatly save it (Session->save(), i.e. session_write_close). Then you can continue reading from the session. Blocking would be reduced to the time between read and write which is extremely short since there is nothing in between. There should be a read-only session implementation out-of-the-box that makes this clear. |
That's a good idea. Such an implementation would be a hint for the right behavior and at the same time provide the tool to do it correctly. |
@fabpot Even though this locking behavior is the way it should be, there are probably people for which this will cause problems. Either they use same database connection and thus other statements would be wrapped in the session transaction. Or they want a read-only session but leave the session open very long which slows down their app because of blocking. |
@Tobion Arg, that's going to make my life really miserable. Next time, it would be better that we revert ASAP, but well, I will revert, retag, ... |
given that some people have already updated to the existing tag, it is better to tag a new release that retag the current one |
"retag" in the sense of creating a new tag. |
ok, reverted in 2.3 and 2.4, kept in master/2.5 |
* 2.3: Revert "bug #10908 [HttpFoundation] implement session locking for PDO (Tobion)" bumped Symfony version to 2.3.15 updated VERSION for 2.3.14 update CONTRIBUTORS for 2.3.14 updated CHANGELOG for 2.3.14 Conflicts: src/Symfony/Component/HttpKernel/Kernel.php
* 2.4: Revert "bug #10908 [HttpFoundation] implement session locking for PDO (Tobion)" bumped Symfony version to 2.3.15 updated VERSION for 2.3.14 update CONTRIBUTORS for 2.3.14 updated CHANGELOG for 2.3.14
@fabpot I hope your life is back to good state again :) Btw, in 2.4 it's actually not reverted yet: https://github.com/symfony/symfony/blob/2.4/src/Symfony/Component/HttpFoundation/Session/Storage/Handler/PdoSessionHandler.php |
Thanks for the effort @fabpot and @Tobion |
What I mean with bad design is that some session implementations use the SessionHandlerInterface but do not actually make use of PHPs session lifecycle. Laravel for example calls the SessionHandler methods manually instead of relying on phps session callbacks (https://github.com/laravel/framework/blob/master/src/Illuminate/Session/Store.php). On top of it, it does not not call open() or close() which of course won't work as soon as the internal implementation changes slightly. About the second connection. There is no way to really lock a row without using a transaction. So yes I think for the general case it's better to use a second connection (see also #10931 that only connects when session is used). If you serialize your database operations correctly, you could also be fine with one connection. For example when you make sure that you don't WRITE to the DB on the same connection between Session->start and Session->save. So when you delay your database access operation after you closed the session, you're fine with one connection. |
@Tobion Can you open an issue on the docs? We must document what you wrote here as I think it is super important to make this very clear to our users. Thanks. |
* 2.5: Revert "bug #10908 [HttpFoundation] implement session locking for PDO (Tobion)"
reverted everywhere except in master/2.6. |
I would like to pitch in here as well (full disclosure: I work in the same company as bestform). Reading and persisting the session is not free (from a performance perspective). Assuming we have a website with multiple modules - some of which may require write access to the session - we generally want to avoid multiple read and write operations on the session objects. So once the session is opened we keep it open until all modules have been processed. My understanding is, that this is also the symfony way of handling the session (I have access to it via request - the session is closed once the php process terminates. So once I open a session all other request would have to wait until my current request terminates - a nightmare in an ajax world.) I am ignoring the problematic of doubling the number of db connections here since bestform already addressed this (I do consider this an issue though). Perhaps I am missing a key concept in the way sessions are supposed to be handelt (in symfony) - if so I would really appreciate it if you could point me in the right direction. But as it stands right now I see this change having a negative effect on the behaviour of an application both from a performance and behaviour point of view. I do appreciate the difficulty associated with implementing a locking session manager. We implement a lazy locking solution using mysql GET_LOCK - which brings its own set of problems and is of course much to specific to be part of a framework that might user a db other than mysql - so that is no help here either |
|
So my assumption is correct, that symfony handles the session close the exact same way php will per default (ie. close on termination). It follows then, that the change in the PDO handler will have the same effect on a pure symfony project as it does in our context. And yes, if we know that we require a read only we could close the session again. In most use cases I am aware of we have many more reads than writes. It feels a bit awkward to make the read only case the exception. And then there is the Problem of known that you will not require a write - this may not always be right after opening the session (especially when dealing with a multi-module setup). As to why someone might expect a PDO session handler from being non locking: I am not aware of any that are locking - this is also why we had to add our own locking on top. In fact, one of the "advantages" to moving from a file based to a db or memcache based session storage is the lack of locking (there are other, more important reasons for the switch, but this is also one). I may be wrong, but I think it is a fairly safe bet, that most people assume that a db session handler is non-locking. Although I assume that our solution can not be used in this case it may still be useful if I add a short description here.
This way a process that requires a session lock can get one when the lock is needed - and only then. All other processes can use the session without having to worry about locking - as long as they are aware that their changes may be discarded due to write sequence enforcing). |
What do you consider a "pure symfony project"? Symfony is a framework so the developers are resposible to close the session when write access is not required anymore. The only reason why there is no locking database session handler is that nobody put in the effort to do it correctly. Btw, you are wrong. Native memcached session handler also locks and the Redis session handler in a bundle also implemented locking recently. So the timestamped session data that you implemented will only detect race conditions. But it doesn't do anything against it. So it's not a solution to the problem. |
I assume that most ppl will leave the session close to php - although I may be wrong. As for memcache: although it is true, that the native handler has the option (it did not always ) there are other memcache session implementations that do not use the default implementation. I was using this only as an example. Even though the reason for there not being a locking PDO driver may be that no one put in the effort, it comes down to the same thing: up until now the driver did not lock - so many applications will be build with this assumation. That alone is no reason to keep things as they are, but it may still be well worth it to take consider the consequences. Our incoming requests checks if there is a lock on the session only if it also requires a lock (we also check if we are the owning thread to prevent blocking if for some reason a thread tries to open multiple locks). All other request will get a read only session (they will be unable to write if they changed anything in the session). I guess we can adjust to the change though - at least if the transaction based solution is replaced by something else to avoid the additional connection overhead. |
To sum up your implementation:
Btw, this resembles optimistic concurrency control. But I don't see how this can work for sessions. Optimistic concurrency usually involves reapplying/merging the changes manually and then retry the write. But since the session data is written at runtime by the app written by the dev, manual conflict resolution is not possible. |
IMO if you have a lot of ajax request it would be better to use something like scache for your session data. |
@Tobion the lock has an effect - it ensures that the changes protected by the lock make it to storage. I am not suggesting our solution as one to use here - It did the trick for us, but is not generic enough as it stands. So I am assuming that the locking behaviour will be kept (we can deal with that). Will the transaction solution also be kept? |
I'm sure the locking behavior is the right thing. People just need to be aware of it and design their apps this way. My ideal solution from an library point of view would be to expose a locking mode option.
Probably not every database supports advisory locks. Sqlite doesn't AFAIK. @rnuernberg You are welcome to help with that. The main problem for me is to find a solution for all these different database systems (and to actually test it). |
@Tobion I think MemcachedSessionHandler also has no locking implementation currently. I get the feeling that locking should not be the responsibility of the storage but rather of \Symfony\Component\HttpFoundation\Session\Session |
Yes we know. See #4976. |
i see. sounds like the two tickets are essentially the same. If we need an advisory lock that works for different dbs it may make sense to completely decouple the locking from the handler. Once we do that, we could move it to the session - and thus provide locking for other implementations as well. One could still provide a transaction based PDO Locking handler as one option - so we would cover all bases while allowing all session handlers to benefit from it. I'll look a bit into it to see if we can make a meaningful contribution. |
This "ticket" is a PR. |
@Tobion you write, that this is a PR (pull request?) - this means we should be taking the conversion somewhere else? I'll be glad to oblige - but have no idea where I should be taking it instead... I wrote up a short solution suggestion (which I would be will to implement/contribute to it we find the solution workable). I have added it below - again, if this is the wrong place let me know :) Motivation: Using a non blocking sessions is performance relevant:
Problem: using non blocking sessions produces race conditions when two processes try to update the same session We can reduce the number of race conditions if we use a smart merge strategy when writing the sessions. Although this reduces the number of race conditions it does not eliminate them. This may be fine when the data written to session is not critical (ie. where we are willing to accept the data loss risk in favour of the performance increase). Blocking Sessions would solve the race condition problematic, but they also force each request to wait for every other request. The associated performance hit could be reduced if each request knew if this is a read only or a write access. It can be assumed that legacy code would need to be refactored to provided this information - and even then it may be very hard to predict if the request requires a write. Suggested solution:
Locking would be implemented in LockingHandler which would be passed to the Session class (or the session class could be wrapped by a SessionLock Decorator which then in turn gets an instance of a LockingHandler). It most likely makes sense to use the session store for locking (db, memcached ) - but this is not a must (it may not be feasible to implement (a sensible) locking in some session storage - such as sqlite). By moving the locking into the session class (away from the session handler) we can provide consistent behavior independent of the session handler used |
@rnuernberg there is another case where non-blocking sessions don't produce the expected behavior (and people have reported it many times): if you write to the session and then perform some logic in the kernel.terminate event after the response is sent to the client (possible only when using PHP-FPM as other SAPIs don't support ending the connection before the end of the request handling) and you let PHP write the session by itself at the end of the process (which is the default), the next request sent by the client is likely to be processed before the session is written, thus giving you an outdated session (this happens when the response was a redirection as the next request is sent very fast after the response) |
and your suggestion of creating a different implementation of SessionInterface to handle the locking won't work. the SessionInterface is a wrapper around the PHP session. The session handler on the other hand is hooking inside the PHP session storage. The locking needs to happen inside the PHP session system, not around it |
@stof we addressed that issue by forcing redirects through a redirect service. it executes the redirect via shutdown function registered as the last shutdown - thus we can be sure the session commit is completed before the redirect is executed. I am afraid I don't follow why the locking needs to happen inside the PHP session. Note that I am only suggesting to provide the SessionInterface with a method to open non-blocking. It may be necessary to also wrap the handler with another SessionHandlerProxy to do the heavy lifting - but this still separates the locking logic from the session handler itself and allows us to user different locking implementations independent of the session handler that is concerned with session storage... |
@rnuernberg I implemented the different lock strategies, I suggested, in #10931. Also it's possible to call multiple read() or write() consecusively in a manual way as it was a requirement for @bestform in #10908 (comment) Also I have an idea to implement optimistic concurrency control (basically what you suggested) inside https://github.com/symfony/symfony/blob/master/src/Symfony/Component/HttpFoundation/Session/Storage/Handler/WriteCheckSessionHandler.php . You said you would extract locking from the handler. I don't think it's a good idea. A use-case would be to store sessions in mysql but implement locking via filesystem or sqlite for example. That's quite awkward without any benefit. And just extracting the logic but still using mysql both for storage and locking is not easily possible without much coupling. |
This PR was merged into the 2.6-dev branch. Discussion ---------- [HttpFoundation] enhance PdoSessionHandler | Q | A | ------------- | --- | Bug fix? | yes | New feature? | yes | BC breaks? | yes | Deprecations? | no | Tests pass? | yes | Fixed tickets | #5483, #2067, #2382, #9029 | License | MIT 0. [x] Continuation of locking implementation (#10908): Implement different locking strategies - `PdoSessionHandler::LOCK_TRANSACTIONAL` (default): Issues a real row lock but requires a transaction - `PdoSessionHandler::LOCK_ADVISORY`: app-level lock, safe as long as only the PdoSessionHandler accesses sessions, advantage is it does not require a transaction (not implemented for oracle or sqlsrv yet) - `PdoSessionHandler::LOCK_NONE`: basically what is was before, prone to race conditions, means the last session write wins 1. [x] Save session data as binary: Encoding session data was definitely the wrong solution. Session data is binary text (esp. when using other session.serialize_handler) that must stay as-is and thus must also be safed in a binary column. Base64 encoding session data just decreses performance and increases storage costs and is semantically wrong because it does not have a character encoding. That saving null bytes in Posgres won't work on a character column is also documented > First, binary strings specifically allow storing octets of value zero and other "non-printable" octets (usually, octets outside the range 32 to 126). Character strings disallow zero octets, and also disallow any other octet values and sequences of octet values that are invalid according to the database's selected character set encoding. http://www.postgresql.org/docs/9.1/static/datatype-binary.html#DATATYPE-BINARY-TABLE 2. [x] Implement lazy connections that are only opened when session is used by either passing a dsn string explicitly or falling back to session.save_path ini setting. Fixes #9029 3. [x] add a create table method that creates the correct table depending on database vendor. This makes the class self-documenting and standalone useable. 5. [x] add lifetime column to session table which allows to have different lifetimes for each session 6. [x] add isSessionExpired() method to be able to distinguish between a new session and one that expired due to inactivity, e.g. to display flash message to user 7. [x] added upgrade and changelog notes Commits ------- 1bc6680 [HttpFoundation] implement different locking strategies for sessions 6f5748e adjust sqlite table definition 5978fcf added upgrade and changelog notes for PdoSessionHandler 182a5d3 [HttpFoundation] add create table method to pdo session handler e79229d [HttpFoundation] allow different lifetime per session af1bb1f add test for null byte in session data 251238d [HttpFoundation] implement lazy connect for pdo session handler 7dad54c [HttpFoundation] remove base64 encoding of session data
Hello! I'm not sure if it is the right place to ask such a question, but discussion (especially points of @rnuernberg) looks pretty relative to the problems I experience in my Symfony 2 application. I have an Angular.js application that uses a lot of parallel AJAX requests to the Symfony 2 backend and I desperately need them to work as fast as possible, the responsiveness of the graphical interface fully depends on it. At one point, I've realized that requests are not processed in parallel, but actually placed in some kind of a queue and processed one after another sequentially. After an investigation I've found that Symfony uses a native filesystem storage for session data, and session file is locked for each request those blocking all other requests. To counter this, I've switched to Today, I've noticed the same blocking behavior in the application and thoroughly tested if I'm not using session directly in my code, I'm using it only for user authentication and all my requests can be considered read only from the session perspective. How do I revert to the previous behavior of the session handler? Is it possible to treat all requests as read only by default and only obtain a session lock when actually required? I do believe this subject is not covered in the documentation at all. Thank you! |
@slavafomin do you need to use a session-based user auth for something looking like an API ? Using a stateless auth method would allow to avoid such issue. Btw, you could call |
Thank you for your suggestions @stof. Right now we are using session auth and we can't migrate to stateless auth right away, however, I totally agree with you that this is the right course of action. I've decided to disable the locking for the time being using Also, maybe we need to add a chapter on locking issues somewhere in this article: http://symfony.com/doc/current/components/http_foundation/session_configuration.html. |
This is probably the first Session Handler for databases that actually works with locking. I've seen many implementations of session handlers (mostly only for one database vendor) while researching and none used locking. Not even the PHPs SQLite session handler or PECL Postgres Handler implemented locking correctly which is probably the reason why they have been discontinued. Zend Session seems not to use locking either. But it saves the lifetime together with the session which seems like a good idea because you could have different lifetimes for different sessions.
Only tested it for MySQL. So would be good if someone can confirm it works as intended on the other databases as well.