Skip to content
This repository was archived by the owner on Jan 30, 2020. It is now read-only.

Support for automatic reconnection attempts after server goes away #66

Closed
wants to merge 11 commits into from

Conversation

mbaynton
Copy link
Contributor

From my requests initially made in issue #59, this PR adds support for retrying the connection.

@heiglandreas
Copy link
Member

Thank you for your contribution! 🎉

I'll try to go through it this weekend!

@mbaynton
Copy link
Contributor Author

mbaynton commented Jun 15, 2017

Ok, that's as far as I think I'll go with this until I hear from you, @heiglandreas . I am suspicious that coveralls' algorithm has changed rather than that there is actually a decrease in coverage. (Look at others' PRs as well, everything is supposedly decreasing by 0.7%)

@ThaDafinser
Copy link

Isn't this a bit dangerous? If the reconnect "count" goes up, you should at least know it? (logging)
Since then you should check the root of the problem

@mbaynton
Copy link
Contributor Author

mbaynton commented Jun 16, 2017

@ThaDafinser, maybe for some sites? I suspect the most common cause for needing to reconnect in the wild will be the one I'm experiencing: we have a number of scripts that will bind and maybe run one search, and then hang out for a while while waiting for user input or doing other processing, and then try to query the ldap server again on the same socket, which the server has timed out. Nothing unexpected or extraordinary happening in that scenario, so I don't want a log entry about it.

If a site wanted to keep a close eye on client/server ldap connection reliability, both the server and the openldap client libs underlying the php implementation can be configured to produce those logs for you to chew on.

Hypothetically, if one wanted to add this to zend-ldap, I'm not seeing a logging interface already utilized by the codebase to hook into. The site desiring this feature could extend Ldap with a class aware of that sites' logging infrastructure and override shouldReconnect().

@ThaDafinser
Copy link

and then hang out for a while while waiting for user input or doing other processing

i'm not sure if it's good to do it that way generally, but maybe there is no other way for your special use case.

But in that use case i understand why you don't require logging for that.

@mbaynton
Copy link
Contributor Author

@heiglandreas just a friendly request to take a look at these PRs if you get a chance. I'm adding my git repo to composer for now. Thanks!

@heiglandreas
Copy link
Member

@mbaynton So this PR adds the ability to reconnect transparently to an LDAP-Server. What I don't quite understand is - under what circumstances does that happen? The default lifetime of a request usually is so short, that such a timeout shouldn't happen at all. So what is your explicit usecase for this? Having the changes in the library causes us to maintain it for some time and I'd like to get to know what benefit it brings…

Copy link
Member

@heiglandreas heiglandreas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't change existing tests but instead introduce new ones. The existing tests need to pass as they are otherwise we have BC-issues which we don't want.

}

protected static function getStandardOptions()
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should never change existing tests.That can easily introduce BC issues. Please write a new test that tests the new behaviour. The current behaviour should still be working AS-IS and that is tested by these unchanged tests…

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the reason here was because I needed the options array in my ReconnectTest class. I will duplicate this code in ReconnectTest instead.

$ldap_mod_del = $this->getFunctionMock('Zend\\Ldap', 'ldap_mod_add');
$ldap_mod_del->expects($this->once())
$ldap_mod_add = $this->getFunctionMock('Zend\\Ldap', 'ldap_mod_add');
$ldap_mod_add->expects($this->once())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not change existing tests. Especially when the test has nothing to do with the newly added features.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see what this was. testAddingAttributesFails() had clearly been copied from testRemovingAttributesFails(), but the $ldap_mod_del local variable name hadn't been updated. Was just trying to tidy up a bit; it wouldn't have a material impact on the test, but I will revert it.

BuiltinFunctionMocks::$ldap_set_option_mock->enable();
}

public function tearDown()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't change the existsing test. If you need special stuff for new tests you should move them to a separate test-class

@mbaynton
Copy link
Contributor Author

@heiglandreas Your concern about added maintenance makes sense if you do not see the utility of the change, so thanks for asking me about that. I'd say that it's not the lifetime of a request which is potentially problematic, but rather the length of time between requests while the TCP connection remains open and unused, during which an ldap server might rightly decide to terminate the bind and close the connection. Without an abstraction layer like zend-ldap that's a really annoying possibility to have to check for and handle every time you want to touch ldap.

It's the same general deal as a connection to an SQL database that is periodically queried, and presumably the impetus for eg. this module. Of course, with php being primarily used on the web, the lifetime of many entire php script executions is so short that this doesn't apply, but php is perfectly suited to general purpose scripting tasks, and I'd say its popular libraries such as zend-ldap ought to be up for the task too. It's in non-web contexts where I frequently encounter this issue.

Python seems to be more popular for general purpose scripting so I thought perhaps they'd already implemented this. I'm not a python guy but it took me only about 5 minutes to find this feature in pyldap: https://github.com/pyldap/pyldap/blob/630955779e90b80f7f04763df1ebb0196af269eb/Lib/ldap/ldapobject.py#L1022

I'll take a look at the tests later today, it's been long enough that I don't recall immediately why I was tinkering with them.

@heiglandreas
Copy link
Member

Hey @mbaynton. Thanks for your clarification! I was already assuming that it's an issue within long-running scripts. While it's perfectly sensible to use PHP within such contexts it's not the main domain of PHP. And I'm not sure whether we should introduce a solution to a problem the zend-framework wasn't built for in the first place. But perhaps @weierophinney can shed some light there.

But even when we decide to not make that a use-case for zend-ldap, it would be great to at least have a way to notify the calling code (by pushing a notification or by the code being able to pull the information) that the connection was closed/terminated/timed out. So having at least that would be a great possibility.

@mbaynton
Copy link
Contributor Author

@heiglandreas gotcha. I confess, I'm a Symfony developer ;) so I'll defer to you on what the Zend framework was built for, but on the surface it looks like non-web usage is also encouraged?

...I'd have gone with a Symfony component for ldap if there was one remotely as developed or mature as zend-ldap. But you're the only game in town per-se, so I hope it can be incorporated here.

@weierophinney
Copy link
Member

And I'm not sure whether we should introduce a solution to a problem the zend-framework wasn't built for in the first place. But perhaps @weierophinney can shed some light there.

ZF components target a variety of contexts, based on the use cases presented. For instance, zend-http's client supports timeouts and retries, as does zend-feed's Reader subcomponent; zend-db allows for persistent connections.

I think supporting reconnects would be great. Logging can be added pretty easily by extending the client and overriding the methods where it occurs to introduce logging. I think having some sort of documentation demonstrating that would be ideal when providing this feature.

The only reason I would be reluctant to support the feature is if it poses a heavy maintenance burden; as such, I'll defer to @heiglandreas 's judgment; my main point is I do not see a conflict with what we support in other components.

@heiglandreas
Copy link
Member

Thanks @weierophinney for the feedback.

So let's tackle this then! @mbaynton if you could have a look for the tests that would be great! Ad don't worry about the code-coverage…

Copy link
Member

@heiglandreas heiglandreas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All in all a lot of changes that add a lot of stuff to the already rather lengthy LDAP-Class. Would moving that stuff into an extended ReconnectLdap-class be an option? (The name isn't really good, so when you find something more appropriate that would be much appreciated!)

src/Ldap.php Outdated
@@ -92,7 +111,7 @@ public function __construct($options = [])
*/
public function __destruct()
{
$this->disconnect();
$this->unbind();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that it does the same. But from a naming POV I'd definitely prefer the disconnect here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't find any justification for why I changed this. Will remove the change.

src/Ldap.php Outdated
$this->lastConnectBindParams[$method][$parameter],
$property
);
return self::coalesce(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What point am I missing here? Can this return be reached at all?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, thanks for catching this!

I went for the version that includes the isset() check, but all possible values will always be set in this PR anyway (or I'd add another test that covers the unset case!). It becomes relevant when one attempts to merge this with PR#64 to support SASL binds.

src/Ldap.php Outdated
$this->boundUser = $username;
return $this;
} else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this else obsolete? When $bind was not false we returned and so there's no need for an else

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at this again but it looks right to me -- if $bind is false, the else provides an opportunity to try again if the user set a nonzero reconnectAttempts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here as well: Consider this code:

if ($bind !== false) {
    $this->boundUser = $username;
    return $this;
}

if ($this->shouldReconnect($this->resource)) {
    return $this;
}

It does the same, as it allows an opportunity to try again when $bind is false. But without the else

src/Ldap.php Outdated

throw $zle;
}

protected function shouldReconnect($resource)
{
if ($this->reconnectCount < $this->getReconnectsToAttempt()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we reverse this if/else so that the else-condition will be checked first? That would allow us to return early and skip the else

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I don't understand. There's only one conditional here, e.g. it's an else not an else if, so I'm not sure what you mean by the 'else'-condition. Did you want to check ldap_errno($resource) === -1 before $this->reconnectCount < $this->getReconnectsToAttempt()?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking along these lines:

protected function shouldReconnect($resource) 
{
    if ($this->reconnectCount >= $this->getReconnectsToAttempt() || ldap_errno($resource) !== -1 {
        $this->reconnectsAttempted = $this->reconnectCount;
        $this->reconnectCount = 0;
        return false;
    }
    
    $this->reconnectCount++;
    $this->reconnectSleep();
    
    try {
        $this->connect();
        $this->bind();
        $this->reconnectsAttempted = $this->reconnectCount;
        $this->reconnectCount = 0;
         return true;
    } catch (LdapException $e) {
        if ($e->getCode() !== -1) {
            return false;
        }
    }
    return $this->shouldReconnect($this->getResource());
}

It's just shifting code around and altering the if-condition a bit but it cleans up indentation levels and also removes the not necessary elses…

src/Ldap.php Outdated
$this->reconnectCount = 0;
return true;
} catch (LdapException $e) {
if ($e->getCode() === -1) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when there is an LDAPException with a code different to -1? we silently do nothing? Or should we rethrow the Exception? Alternatively perhaps throwing a new Exception would be an option as well…

Copy link
Contributor Author

@mbaynton mbaynton Sep 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. As it's written I think the code "Does The Right Thing," but it is not very self-documenting. If you look at everywhere Ldap::shouldReconnect() is used, there was already code immediately following whatever ldap_*() function call that checks the return value and converts failures expressed in return values to exceptions.

I'll add an explicit return false; to Ldap::shouldReconnect() to make it clear that this is the intended outcome in the case of a code different from -1

@mbaynton
Copy link
Contributor Author

PR #71 is needed for the build to pass.

Copy link
Member

@heiglandreas heiglandreas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please have a look at the two comments. If that code needs to be there, we have a backwards-compatibility issue I'd like to avoid! Please see whether that can be resolved!

@@ -80,6 +80,7 @@ public function testOptionsGetter()
'useStartTls' => false,
'optReferrals' => false,
'tryUsernameSplit' => true,
'reconnectAttempts' => 0,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that really need to be set? Because if there's no fallback and that needs to be set that is a Backwards incompatibility! And I'd like to avoid that!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't create a BC breakage. Ldap::getOptions() has always returned the current values of all supported options, but you don't have to pass anything new to the constructor or setOptions().

@@ -109,6 +110,7 @@ public function testConfigObject()
'useStartTls' => false,
'optReferrals' => false,
'tryUsernameSplit' => true,
'reconnectAttempts' => 0,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that really need to be set? Because if there's no fallback and that needs to be set that is a Backwards incompatibility! And I'd like to avoid that!

@mbaynton
Copy link
Contributor Author

Hey @heiglandreas did you see my response re: breaking changes? Do you understand or disagree?

@heiglandreas
Copy link
Member

Seen it and agree! On the go to phpsouthafrica ATM so I‘ll get into it in more detail tomorrow or the day after! ;)

src/Ldap.php Outdated
$this->reconnectCount = 0;
return true;
} catch (LdapException $e) {
if ($e->getCode() === -1) {
Copy link
Contributor Author

@mbaynton mbaynton Sep 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. As it's written I think the code "Does The Right Thing," but it is not very self-documenting. If you look at everywhere Ldap::shouldReconnect() is used, there was already code immediately following whatever ldap_*() function call that checks the return value and converts failures expressed in return values to exceptions.

I'll add an explicit return false; to Ldap::shouldReconnect() to make it clear that this is the intended outcome in the case of a code different from -1

$isAdded = ldap_add($resource, $dn->toString(), $entry);
ErrorHandler::stop();
} while ($isAdded === false && $this->shouldReconnect($resource));

if ($isAdded === false) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an example of why we don't need to throw anything in shouldReconnect()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious, I'm not sure why this ended up being displayed here, it isn't about the change to the tests.

Copy link
Member

@heiglandreas heiglandreas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider the two code examples I added to perhaps make clearer what I was thinking about…

src/Ldap.php Outdated

throw $zle;
}

protected function shouldReconnect($resource)
{
if ($this->reconnectCount < $this->getReconnectsToAttempt()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking along these lines:

protected function shouldReconnect($resource) 
{
    if ($this->reconnectCount >= $this->getReconnectsToAttempt() || ldap_errno($resource) !== -1 {
        $this->reconnectsAttempted = $this->reconnectCount;
        $this->reconnectCount = 0;
        return false;
    }
    
    $this->reconnectCount++;
    $this->reconnectSleep();
    
    try {
        $this->connect();
        $this->bind();
        $this->reconnectsAttempted = $this->reconnectCount;
        $this->reconnectCount = 0;
         return true;
    } catch (LdapException $e) {
        if ($e->getCode() !== -1) {
            return false;
        }
    }
    return $this->shouldReconnect($this->getResource());
}

It's just shifting code around and altering the if-condition a bit but it cleans up indentation levels and also removes the not necessary elses…

src/Ldap.php Outdated
$this->boundUser = $username;
return $this;
} else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here as well: Consider this code:

if ($bind !== false) {
    $this->boundUser = $username;
    return $this;
}

if ($this->shouldReconnect($this->resource)) {
    return $this;
}

It does the same, as it allows an opportunity to try again when $bind is false. But without the else

@mbaynton
Copy link
Contributor Author

mbaynton commented Sep 26, 2017

@heiglandreas I see what you mean and your preferred way to express the logic is fine with me; I've updated it.

So about the automated tests, it's failing against PHP 5.6 because you need to approve PR#71 ;), but my theory about the failures on PHP 7.0 is that's the one running coverage analysis, and it is slowing down enough to actually experience a timeout with the low value I configured in slapd.conf :(. Ick.

Maybe run two separate slapd server instances in the CI environment, one on a different port used only by the reconnect tests? Got any better ideas?

Edit 9/27: It looks like this tool combined with a kernel from late 2015 or better could target and close the specific socket, forcing the reconnect logic to be tested without relying on timeouts. This would require moving to a very up-to-date version of ss and the kernel; for example it's not in Ubuntu 16.04.

@mbaynton
Copy link
Contributor Author

@heiglandreas, the network administrator at my office and I figured out another way to produce connectivity issues that force the reconnection code to be exercised. It does require some notable changes to the CI / Vagrant environments though, such as

  • TravisCI requires sudo
  • Another port forward is created in Vagrant, and the php built-in webserver listens on it as a simple channel for signaling when to produce ldap connection dropouts.

Have a look at https://github.com/mbaynton/zend-ldap/pull/1/files for all the changes. If this looks good to you I'll merge them to this PR.

It does result in completely passing tests.

@mbaynton
Copy link
Contributor Author

mbaynton commented Oct 9, 2017

Hey @heiglandreas. I'd hate to lose momentum on this PR. Want me to merge my CI setup changes in so the tests pass?

@heiglandreas
Copy link
Member

Hey @mbaynton. I'd hate to loose the containerized infrastructure on travis for this project. But the changes needed for testing the dropped connection seem too need this.

Though at the moment I'm much more concerned about the failing tests due to the missing or "wrong" LDAP-Extension…

@mbaynton
Copy link
Contributor Author

@heiglandreas Yes, you're right about not being able to use a container-based environment with Travis. Can't be helped as far as I know though.

About the ldap.so warnings, that's not causing the tests to fail and also not related to this PR. See for example the other PR for ldap_mod_replace, whose test run produces the same warnings but is passing.

I'll go ahead and merge my CI changes into this PR so we can see a passing test run.

Rework reconnect tests to cause established connections to actively be dropped, rather than waiting on a timeout.
@heiglandreas
Copy link
Member

Hey Mike. Perfect! Thanks!

It was far too early this morning…

@mbaynton
Copy link
Contributor Author

mbaynton commented Nov 6, 2017

@heiglandreas, by the way about those ldap.so warnings, just noticed I had already actually tracked down and fixed the ldap.so warnings in the Travis-CI builds, if you wanna go ahead and merge #65 ;) Or this one. That'd be cool too.

@mbaynton
Copy link
Contributor Author

@heiglandreas, looks like where we left off with this the changes were approved and the tests were passing. Is there anything more you need from me to help the process of getting it merged?

heiglandreas pushed a commit to heiglandreas/zend-ldap that referenced this pull request Jul 3, 2018
Support for automatic reconnection attempts after server goes away
@heiglandreas
Copy link
Member

I've found the issue! The PR introduces a firewall for testing purposes. But that firewall was missing the rule to allow traffic through the LDAPS-port. That wasn't necessary at the time you created it… 😄

heiglandreas added a commit to heiglandreas/zend-ldap that referenced this pull request Jul 4, 2018
Support for automatic reconnection attempts after server goes away
heiglandreas added a commit that referenced this pull request Jul 5, 2018
This is necessary to make use of the reconnect-possibilities introduced
with #66

The code is taken from PR #69
@heiglandreas heiglandreas self-assigned this Jul 5, 2018
@heiglandreas heiglandreas added this to the 2.10.0 milestone Jul 5, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants