Skip to content

[Java] CWE-295 - Incorrect Hostname Verification #3581

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

intrigus-lgtm
Copy link
Contributor

@intrigus-lgtm intrigus-lgtm commented May 27, 2020

Hostname verification is an essential part of Transport Layer Security (TLS) and HTTPS.

When a TLS connection is established there are two important steps:
First, the chain of trust is verified that means it is checked whether the certificate has been issued by a trusted certificate authority.
Second, the hostname (that is being connected to) needs to be verified against the certificate.
That means it is checked whether the certificate is actually for the hostname.

If the hostname is not verified an attacker could present any certificate with a valid chain of trust, but that is not issued for the hostname at all.

Many posts tell developers that have problems with hostname verification to accept any certificate as valid in the case of a mismatch.
These problems usually are configuration problems that should be fixed instead.

Java verifies HTTPS hostname by default when using URL/HttpsUrlConnection.
But when a protocol uses TLS (SSLEngine/SSLSocket) directly, the hostname is not verified by default.
That's where the second part of my query (IncorrectHostnameVerifier.ql) comes into play.

[This PR and PR description is WIP]
[I've also noticed #3550, which seems to focus on the chain of trust vaildation]

@intrigus-lgtm
Copy link
Contributor Author

The QL code can probably be considered finished and reviewed already :)
Most work will be in the missing documentation, tests and examples.

@luchua-bc
Copy link
Contributor

luchua-bc commented May 28, 2020

@intrigus-lgtm My PR #3550 does cover both hostname verifier and certificate trust manager although my implementation of HostnameVerifier.verify() only checks the most common scenario of always returning "true" for now, which is the same as your AlwaysAcceptingVerifyMethod and AlwaysTrueHostnameVerifier.java. Mine is testTrustAllHostnameOfAnonymousClass and testTrustAllHostnameOfVariable.

Your DangerousVerifyMethod checks more sub-cases of HostnameVerifier.verify(), which is nice. However, my PR covers quite a few additional cases of hostname verification related to SSLEngine/SSLContext and third-party libraries. So our PRs are indeed for the same CWE issue and vulnerability.

Also CodeQL does have a built-in library semmle.code.java.security.Encryption, which contains types like HostnameVerifier already, so you don't need to recreate your own TypeHostnameVerifier.

You can merge those after my PR #3550 is merged.

@luchua-bc

@intrigus-lgtm
Copy link
Contributor Author

Also CodeQL does have a built-in library semmle.code.java.security.Encryption, which contains types like HostnameVerifier already, so you don't need to recreate your own TypeHostnameVerifier.

Thank you for the hint, I missed these classes.

@intrigus-lgtm My PR #3550 does cover both hostname verifier and certificate trust manager although my implementation of HostnameVerifier.verify() only checks the most common scenario of always returning "true" for now, which is the same as your AlwaysAcceptingVerifyMethod and AlwaysTrueHostnameVerifier.java. Mine is testTrustAllHostnameOfAnonymousClass and testTrustAllHostnameOfVariable.

Your DangerousVerifyMethod checks more sub-cases of HostnameVerifier.verify(), which is nice. However, my PR covers quite a few additional cases of hostname verification related to SSLEngine/SSLContext and third-party libraries. So our PRs are indeed for the same CWE issue and vulnerability.

What do you think about the idea to remove your "trust all" verifier query and focusing on instances where hostname verification has not been enabled?
My query would then focus on the actual implementation of verifiers and yours on not enabled verifiers :)

@luchua-bc
Copy link
Contributor

Also CodeQL does have a built-in library semmle.code.java.security.Encryption, which contains types like HostnameVerifier already, so you don't need to recreate your own TypeHostnameVerifier.

Thank you for the hint, I missed these classes.

@intrigus-lgtm My PR #3550 does cover both hostname verifier and certificate trust manager although my implementation of HostnameVerifier.verify() only checks the most common scenario of always returning "true" for now, which is the same as your AlwaysAcceptingVerifyMethod and AlwaysTrueHostnameVerifier.java. Mine is testTrustAllHostnameOfAnonymousClass and testTrustAllHostnameOfVariable.
Your DangerousVerifyMethod checks more sub-cases of HostnameVerifier.verify(), which is nice. However, my PR covers quite a few additional cases of hostname verification related to SSLEngine/SSLContext and third-party libraries. So our PRs are indeed for the same CWE issue and vulnerability.

What do you think about the idea to remove your "trust all" verifier query and focusing on instances where hostname verification has not been enabled?
My query would then focus on the actual implementation of verifiers and yours on not enabled verifiers :)

In my opinion, I think it makes sense to have all hostname verifier related queries in one place under a single directory. This facilitates lookup, usage, and further enhancements by other researchers. The CodeQL team may be able to help with the merge and possible revamp. As my PR is still under review so its query may not be finalized, and this PR is not reviewed yet, let's wait and see what's the take of the CodeQL team on this one.

Regards,
@luchua-bc

@intrigus-lgtm intrigus-lgtm force-pushed the java-hostname-verifier-cwe-295 branch from 4341325 to 22c934c Compare May 28, 2020 21:44
@intrigus-lgtm intrigus-lgtm force-pushed the java-hostname-verifier-cwe-295 branch from 22c934c to 3f1d01d Compare June 1, 2020 21:44
@intrigus-lgtm intrigus-lgtm marked this pull request as ready for review June 1, 2020 21:45
@intrigus-lgtm intrigus-lgtm requested a review from a team as a code owner June 1, 2020 21:45
@anticomputer
Copy link

anticomputer commented Jun 24, 2020

@intrigus-lgtm as a status update: it seems there is significant overlap between this PR and the one we received prior in #3550

Our current policy on query collision is to consider queries for consideration of award in the order we receive them and only consider subsequent queries for award if the initial query is not accepted.

The initial query we received is still under evaluation, but we wanted to provide you with some clarity on the decision making process here. If that query is accepted for award it will preclude this query from further consideration.

(EDIT: to clarify, by collision we (the security lab team) mean queries that produce essentially the same TP result set, this is still under eval in this case)

We will update the bounty FAQ accordingly as well. Our apologies for any confusion.

/cc @aschackmull

@intrigus-lgtm
Copy link
Contributor Author

intrigus-lgtm commented Jun 24, 2020

@intrigus-lgtm as a status update: it seems there is significant overlap between this PR and the one we received prior in #3550

Our current policy on query collision is to consider queries for consideration of award in the order we receive them and only consider subsequent queries for award if the initial query is not accepted.

The initial query we received is still under evaluation, but we wanted to provide you with some clarity on the decision making process here. If that query is accepted for award it will preclude this query from further consideration.

We will update the bounty FAQ accordingly as well. Our apologies for any confusion.

/cc @aschackmull

(Sorry for my ranting...)
IMHO there is no significant overlap.
The only thing that is overlapping is https://github.com/github/codeql/pull/3581/files#diff-325abc11a872a316989663be65407dffR21-R49
And the overlapping code 6d1ba3f#diff-27964f9a40f920f977c018d1b0bb5d1eR60-R70:
doesn't work correctly...
It will simply select any Method that contains a ReturnStmt that returns true.
Even if the return is unreachable or if the return does verify the hostname.

import java
import semmle.code.java.security.Encryption
class TrustAllHostnameVerifier extends RefType {
  TrustAllHostnameVerifier() {
    this.getASupertype*() instanceof HostnameVerifier and
    exists(Method m, ReturnStmt rt |
      m.getDeclaringType() = this and
      m.hasName("verify") and
      rt.getEnclosingCallable() = m and
      rt.getResult().(BooleanLiteral).getBooleanValue() = true
    )
  }
}

https://lgtm.com/query/8313609408411285911/
https://lgtm.com/projects/g/payara/Payara/snapshot/8d3d02cfa8481d0e6ca2b5605232e3eb25b82c54/files/appserver/load-balancer/admin/src/main/java/org/glassfish/loadbalancer/admin/cli/connection/SSLHostNameVerifier.java?sort=name&dir=ASC&mode=heatmap#L66

ALSO:

If I remove the colliding code, this becomes eligible again?
Or how should I interpret this statement:

If a bounty is awarded to the first received submission, this makes any colliding submissions ineligible for award.

Does this mean that the whole submission is ineligible?
If so, why?
There are definitely parts that are completely distinct...

@luchua-bc
Copy link
Contributor

@intrigus-lgtm Sorry I don't quite understand your comment below for the code in my query #3550:

the overlapping code 6d1ba3f#diff-27964f9a40f920f977c018d1b0bb5d1eR60-R70:
doesn't work correctly...
It will simply select any Method that contains a ReturnStmt that returns true.
Even if the return is unreachable or if the return does verify the hostname.

With my design, the code is to detect the most common use case of improper hostname verification, which is:

@Override
public boolean verify(String hostname, SSLSession session) {
	return true; // Noncompliant
}

The implementation of the code is:

class TrustAllHostnameVerifier extends RefType {
  TrustAllHostnameVerifier() {
    this.getASupertype*() instanceof HostnameVerifier and
    exists(Method m, ReturnStmt rt |
      m.getDeclaringType() = this and
      m.hasName("verify") and
      rt.getEnclosingCallable() = m and
      rt.getResult().(BooleanLiteral).getBooleanValue() = true
    )
  }
}

The code works correctly with the following test cases in my test program of PR #3550:

	/**
	 * Test the implementation of trusting all hostnames as an anonymous class
	 */
	public void testTrustAllHostnameOfAnonymousClass() {
		HttpsURLConnection.setDefaultHostnameVerifier(new HostnameVerifier() {
			@Override
			public boolean verify(String hostname, SSLSession session) {
				return true; // Noncompliant
			}
		});
	}

	/**
	 * Test the implementation of trusting all hostnames as a variable
	 */
	public void testTrustAllHostnameOfVariable() {
		HostnameVerifier verifier = new HostnameVerifier() {
			@Override
			public boolean verify(String hostname, SSLSession session) {
				return true; // Noncompliant
			}
		};
		HttpsURLConnection.setDefaultHostnameVerifier(verifier);
	}

If the return value is not the boolean literal true, the query can detect as expected. Please elaborate why the query code above doesn't work as expected then I will be more than happy to fix it. I have received a lot of great advices and comments from many people in the CodeQL community, which helped me to enhance my queries and contribute to the project.

My understanding is that your query in this PR detects more cases like the scenario that the hostname variable passed in as a method parameter is not being used in the method body for decision making. Am I right? If this is the case, how common are these cases in the 27 of the 87 projects detected by this query? That is, how many projects overridden the HostnameVerifier.verify method doesn't return hard-coded true value detected by both queries? I think it's a nice enhancement but more information will be helpful to the team.

Thanks,
@luchua-bc

@anticomputer
Copy link

@intrigus-lgtm as a status update: it seems there is significant overlap between this PR and the one we received prior in #3550
Our current policy on query collision is to consider queries for consideration of award in the order we receive them and only consider subsequent queries for award if the initial query is not accepted.
The initial query we received is still under evaluation, but we wanted to provide you with some clarity on the decision making process here. If that query is accepted for award it will preclude this query from further consideration.
We will update the bounty FAQ accordingly as well. Our apologies for any confusion.
/cc @aschackmull

(Sorry for my ranting...)
IMHO there is no significant overlap.
The only thing that is overlapping is https://github.com/github/codeql/pull/3581/files#diff-325abc11a872a316989663be65407dffR21-R49
And the overlapping code 6d1ba3f#diff-27964f9a40f920f977c018d1b0bb5d1eR60-R70:
doesn't work correctly...
It will simply select any Method that contains a ReturnStmt that returns true.
Even if the return is unreachable or if the return does verify the hostname.

import java
import semmle.code.java.security.Encryption
class TrustAllHostnameVerifier extends RefType {
  TrustAllHostnameVerifier() {
    this.getASupertype*() instanceof HostnameVerifier and
    exists(Method m, ReturnStmt rt |
      m.getDeclaringType() = this and
      m.hasName("verify") and
      rt.getEnclosingCallable() = m and
      rt.getResult().(BooleanLiteral).getBooleanValue() = true
    )
  }
}

https://lgtm.com/query/8313609408411285911/
https://lgtm.com/projects/g/payara/Payara/snapshot/8d3d02cfa8481d0e6ca2b5605232e3eb25b82c54/files/appserver/load-balancer/admin/src/main/java/org/glassfish/loadbalancer/admin/cli/connection/SSLHostNameVerifier.java?sort=name&dir=ASC&mode=heatmap#L66

ALSO:

If I remove the colliding code, this becomes eligible again?
Or how should I interpret this statement:

If a bounty is awarded to the first received submission, this makes any colliding submissions ineligible for award.

Does this mean that the whole submission is ineligible?
If so, why?
There are definitely parts that are completely distinct...

@intrigus-lgtm as a status update: it seems there is significant overlap between this PR and the one we received prior in #3550
Our current policy on query collision is to consider queries for consideration of award in the order we receive them and only consider subsequent queries for award if the initial query is not accepted.
The initial query we received is still under evaluation, but we wanted to provide you with some clarity on the decision making process here. If that query is accepted for award it will preclude this query from further consideration.
We will update the bounty FAQ accordingly as well. Our apologies for any confusion.
/cc @aschackmull

(Sorry for my ranting...)
IMHO there is no significant overlap.
The only thing that is overlapping is https://github.com/github/codeql/pull/3581/files#diff-325abc11a872a316989663be65407dffR21-R49
And the overlapping code 6d1ba3f#diff-27964f9a40f920f977c018d1b0bb5d1eR60-R70:
doesn't work correctly...
It will simply select any Method that contains a ReturnStmt that returns true.
Even if the return is unreachable or if the return does verify the hostname.

import java
import semmle.code.java.security.Encryption
class TrustAllHostnameVerifier extends RefType {
  TrustAllHostnameVerifier() {
    this.getASupertype*() instanceof HostnameVerifier and
    exists(Method m, ReturnStmt rt |
      m.getDeclaringType() = this and
      m.hasName("verify") and
      rt.getEnclosingCallable() = m and
      rt.getResult().(BooleanLiteral).getBooleanValue() = true
    )
  }
}

https://lgtm.com/query/8313609408411285911/
https://lgtm.com/projects/g/payara/Payara/snapshot/8d3d02cfa8481d0e6ca2b5605232e3eb25b82c54/files/appserver/load-balancer/admin/src/main/java/org/glassfish/loadbalancer/admin/cli/connection/SSLHostNameVerifier.java?sort=name&dir=ASC&mode=heatmap#L66

ALSO:

If I remove the colliding code, this becomes eligible again?
Or how should I interpret this statement:

If a bounty is awarded to the first received submission, this makes any colliding submissions ineligible for award.

Does this mean that the whole submission is ineligible?
If so, why?
There are definitely parts that are completely distinct...

Our (the Security Lab) evaluation of the query is mostly based on the result sets they generate. So queries that are aimed at finding the same core issues, we consider as potentially colliding submissions.

We're still evaluating the result sets between the two queries, if your query finds a distinct, or significant addition to the results, then we will push the evaluation forward.

We'd like to inspire collaboration between authors of queries that are focused on the same area since the ultimate goal is to build the best query for the QL community, so we're thinking about how to best facilitate that going forward. I think this is a good test case for that scenario and we'll have some deeper discussion internally on how to make this a positive experience.

TL;DR: we're still looking at your result set, if it's distinct enough from the other query result set, we'll push it through for further award eval. After the bounty evaluation process for the two queries completes, and there is a distinct difference in TP findings between the two, we'd love to see collaboration to build a final query (where it makes sense) that joins the result sets.

@anticomputer
Copy link

@intrigus-lgtm Update: the seclab team just concluded that your query finds a qualifying distinct TP result set and we've moved it forward in the award evaluation pipeline.

Once the bounty process completes for both you and @luchua-bc we'd love to see a feature diff and integration effort if you're both up for it :)

@luchua-bc
Copy link
Contributor

Sounds like a good plan to me. I'd also like to see a feature diff and integration effort so that we can have all code related to hostname verification in a central place to make the repository cleaner and facilitate usage by all security researchers as well as further enhancements.

@intrigus-lgtm
Copy link
Contributor Author

@anticomputer @luchua-bc apologies for my heated comment above.

Once everything got merged we should indeed integrate everything in a central place 👍

@luchua-bc
Copy link
Contributor

@intrigus-lgtm Will it be better to merge with code of PR #3550 already in the repository now or merge after this PR is merged into the repository? The latter requires another PR.

I think it may be desired to move code in this PR from the CWE-295 folder to the CWE-273 folder then add the following tag to both queries:

 * @tags security
 *       external/cwe/cwe-273
 *       external/cwe/cwe-295

Feel free to have the duplicate code related to hostname verifier removed from the code already in the repository or merged in a way you'd like as long as everything in a central place.

@luchua-bc

@adityasharad adityasharad changed the base branch from master to main August 14, 2020 18:34
@smowton
Copy link
Contributor

smowton commented Oct 13, 2020

@intrigus-lgtm I'm helping out with reviewing this PR; what's your take on it at the moment? Are you up for adapting this into an improvement atop the now-merged #3550?

@smowton smowton self-assigned this Oct 13, 2020
@intrigus-lgtm
Copy link
Contributor Author

@intrigus-lgtm I'm helping out with reviewing this PR; what's your take on it at the moment? Are you up for adapting this into an improvement atop the now-merged #3550?

I'm currently cleaning up my notifications and other stuff.
I plan to work on this some time this month, but I may also work on another of my QL-PRs that feel "closer" to the finish line.

@intrigus-lgtm
Copy link
Contributor Author

Superseded by #4771

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants