Skip to content

Remove tfexec, allow TF_ environment vars and log them #2264

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Jun 16, 2022

Conversation

spikecurtis
Copy link
Contributor

Fixes #1359

Couple of notes:

I've refactored our use of terraform and in the process made some changes that I believe bring us in line with what we are actually doing:

  • Dropped calls to terraform state pull because I don't believe we are supporting remote terraform state; instead it is stored in Coder
  • commands are directly passed the Context that gets cancelled when we read a Cancel message off the stream.

Signed-off-by: Spike Curtis <spike@coder.com>
Signed-off-by: Spike Curtis <spike@coder.com>
@spikecurtis spikecurtis requested a review from a team June 11, 2022 00:29
Comment on lines 10 to 12
type Logger interface {
Log(l *proto.Log) error
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this interface primarily to merge the Provision and Parse log streams? I think abstracting this makes it confusing to follow the code, and just to save ~30 lines of code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the idea is to be able to write generic logging code, like LogWriter() that work on either kind of stream. This is something I found really useful when I prototyped my docker-compose provisioner, so I think it's really useful to have generic logging routines.

Copy link
Member

@mafredri mafredri Jun 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this logger interface could be ok, but I don't see much of a use-case for having the stream abstractions (not yet at least).

Instead we could just define a fakeLogger in our tests that is passed to terraform. This would also allow us to get rid of the mocks which I think adds further abstraction and has the downside of "not reading like Go code".

type fakeLogger struct{
        logs []*proto.Log
}

func (l *fakeLogger) Log(log *proto.Log) error {
        l.logs = append(l.logs, log)
        return nil
}

All we then need to do is assert that terraform added the right log messages to l.logs.

return w, done
}

func readAndLog(logger Logger, r io.Reader, done chan<- any, level proto.LogLevel) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my read over the code, this channel is only used for testing, but I'm not sure why it's required because we already ensure the output.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The channel is used in product code, e.g. in executor.go:141. The purpose is to allow the code that calls logging routines to ensure that the logging is finished before moving on to other things. Without this, you could end up with out-of-order logs, or worse, a log response after a complete response.

Comment on lines 60 to 72
func TestParseLogger_Log(t *testing.T) {
t.Parallel()

mStream := new(mocks.ParseStream)
defer mStream.AssertExpectations(t)

l := &proto.Log{Level: proto.LogLevel_ERROR, Output: "an error"}
mStream.On("Send", mock.MatchedBy(withLog(l))).Return(nil)

uut := provisionersdk.NewParseLogger(mStream)
err := uut.Log(l)
require.NoError(t, err)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems this log abstraction tests that: a provisioner logger sends a provision log type, and a parse logger sends a parse log type. I feel like this is a lot of code to do that. Would this be caught by many other tests in the codebase?

Copy link
Contributor Author

@spikecurtis spikecurtis Jun 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the compiler actually ensures that the types match up. This tests that a call to Log results in a call to Send and that the field values match up. Pretty basic, I know, but it's a baseline for the test below for LogWriter().

These small, basic, unit tests are important right by the code they test. It might be the case that other tests in the codebase actually cover this function, but it breaks encapsulation to depend on that for test coverage; changes elsewhere shouldn't cause us to leave this code exposed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't disagree, but the decomposition of this comes at a relatively high cost of maintainability. Adding ~400 lines for logging seems like a lot right now (I'm including mock code and tests in that count), considering what we gain from it.

I'm not convinced this abstraction reduces complexity for an implementor and seems premature when we only have a single provisioner emitting logs or using this abstraction. It's hard for me to understand as a contributor the purpose of this abstraction when it's only used in a single place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lines of code is a poor metric for complexity in the first place, and counting autogenerated mocks and test code in your count is counterproductive because it penalizes code that is thoroughly tested versus code that is not.

This abstraction increases maintainability because it separates concerns and keeps logging boilerplate out of the provisioner functions which bloat those functions and makes them harder to read.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is some basis of lines of code to complexity in solving a problem. It's not strong, but if the difference is a significant factor it can be a warning of complexity. This code is ~45 lines of code in main (https://github.com/coder/coder/blob/main/provisioner/terraform/provision.go#L150-L195).

There are no tests in the Terraform provider that guarantee logs are sent, so I'd be in favor of moving the testing to there for now and abstracting this once we add providers. I agree with the separation of concerns, but adding an interface that's only been implemented a single time certainly adds cognitive overhead when navigating the codebase.

There are presently no parse logs sent from the Terraform provider, so half of the purpose for this abstraction isn't actually used (apart from tests) at all. It increases maintainability assuming there are many provisioners, but we only have one.

Copy link
Member

@kylecarbs kylecarbs Jun 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to get others' perspectives on this too, it's very possible I'm being too pedantic. I care a lot about all most of the code having a clear purpose for its structure, hence my hesitation to abstract before multiple usages pop up.

Copy link
Member

@mafredri mafredri Jun 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I already touched on this a bit in another comment. But I kind of agree with Kyle here, at least some of the abstractions. I'm fine with the Logger interface (I'm not sure how we'd do it otherwise), but I think the other logic could be moved into the terraform package. It's usually best to wait for 2+ (more the merrier) use-cases before creating an abstraction, that way we can be more confident that we're creating the right abstraction.

And there's no need to be afraid to copy-paste code between packages until enough use-cases surface! As saying GOes:

A little copying is better than a little dependency.

Copy link
Member

@mafredri mafredri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't looked much at the terraform package before, so I haven't 100% reviewed all changes, but added my comments and thoughts nonetheless.

cmd.Env = env
if err = cmd.Run(); err != nil {
errString, _ := io.ReadAll(stdErr)
return xerrors.Errorf("%s: %w", errString, err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: It's good practice to describe what action failed, helps with debugging (could be something more detailed than this).

Suggested change
return xerrors.Errorf("%s: %w", errString, err)
return xerrors.Errorf("command run failed: %s: %w", errString, err)

// Provision executes `terraform apply`.
func (t *terraform) Provision(stream proto.DRPCProvisioner_ProvisionStream) error {
// Provision executes `terraform apply` or `terraform plan` for dry runs.
func (t *server) Provision(stream proto.DRPCProvisioner_ProvisionStream) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Type renamed without renaming receiver.

if strings.HasPrefix(e, "TF_") {
parts := strings.SplitN(e, "=", 2)
if len(parts) != 2 {
panic("os.Environ() returned vars not in key=value form")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need a panic here, it's safe to assume os.Environ() returns the correct format. (The out of bounds access would panic anyway.)

parts[1] = "<value redacted>"
}
err := logger.Log(&proto.Log{
Level: proto.LogLevel_WARN,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: I'm not sure why warn was picked, but would info be more descriptive?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

passing terraform environment variables could alter behavior in ways we have not predicted, so that's why I picked WARN.

"TF_REGISTRY_DISCOVERY_RETRY": true,
"TF_REGISTRY_CLIENT_TIMEOUT": true,
"TF_CLI_CONFIG_FILE": true,
"TF_IGNORE": true,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nice!


err = logger.Log(&proto.Log{Level: logLevel, Output: log.Message})
if err != nil {
// Not much we can do. We can't log because logging is itself breaking!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: We could pass a slog down to the executor (and pass as arg here or make it a method) and be able to catch these as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did think about that, but then you end up with two "loggers" and that's just real confusing.

Comment on lines 10 to 12
type Logger interface {
Log(l *proto.Log) error
}
Copy link
Member

@mafredri mafredri Jun 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this logger interface could be ok, but I don't see much of a use-case for having the stream abstractions (not yet at least).

Instead we could just define a fakeLogger in our tests that is passed to terraform. This would also allow us to get rid of the mocks which I think adds further abstraction and has the downside of "not reading like Go code".

type fakeLogger struct{
        logs []*proto.Log
}

func (l *fakeLogger) Log(log *proto.Log) error {
        l.logs = append(l.logs, log)
        return nil
}

All we then need to do is assert that terraform added the right log messages to l.logs.

})
}

func LogWriter(logger Logger, level proto.LogLevel) (io.WriteCloser, <-chan any) {
Copy link
Member

@mafredri mafredri Jun 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function seems very specific to how we are running terraform and moves some of the granularity out of terraform, whilst creating a dependency provisionersdk. Some thoughts:

  • We're assuming output will be newline delimited (scanner)
  • We cannot control log level on a per-log line basis (may be relevant for future provisioners)
  • This function creates an io.Pipe but never closes r or w, closing w is implicitly up to the caller (the done channel makes it a bit less clear we're supposed to close)
  • In acutal uses, we're relying on exec.Command to close w (i.e. there's a lot of implicitness going on)

Comment on lines 60 to 72
func TestParseLogger_Log(t *testing.T) {
t.Parallel()

mStream := new(mocks.ParseStream)
defer mStream.AssertExpectations(t)

l := &proto.Log{Level: proto.LogLevel_ERROR, Output: "an error"}
mStream.On("Send", mock.MatchedBy(withLog(l))).Return(nil)

uut := provisionersdk.NewParseLogger(mStream)
err := uut.Log(l)
require.NoError(t, err)
}
Copy link
Member

@mafredri mafredri Jun 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I already touched on this a bit in another comment. But I kind of agree with Kyle here, at least some of the abstractions. I'm fine with the Logger interface (I'm not sure how we'd do it otherwise), but I think the other logic could be moved into the terraform package. It's usually best to wait for 2+ (more the merrier) use-cases before creating an abstraction, that way we can be more confident that we're creating the right abstraction.

And there's no need to be afraid to copy-paste code between packages until enough use-cases surface! As saying GOes:

A little copying is better than a little dependency.

Signed-off-by: Spike Curtis <spike@coder.com>
@spikecurtis
Copy link
Contributor Author

I've dropped parse support and moved the logging stuff into the terraform package. One consequence of this is that the unit tests end up being in the terraform package instead of terraform_test because none of the logging code is exported. I don't personally mind a mix of internal and external tests of a package, but something to be aware of.

Signed-off-by: Spike Curtis <spike@coder.com>
@spikecurtis spikecurtis requested a review from kylecarbs June 15, 2022 13:55
@@ -0,0 +1,63 @@
// nolint:testpackage
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

executor_internal_test would make the linter happy! I personally enjoy this idiom, but it's quite subjective.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as in executor_internal_test.go or something else?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, exactly

@spikecurtis spikecurtis force-pushed the spike/1359_no_tfexec branch from a41ea7d to fb6ced5 Compare June 16, 2022 17:32
@spikecurtis spikecurtis enabled auto-merge (squash) June 16, 2022 17:35
@spikecurtis spikecurtis merged commit 552dad6 into main Jun 16, 2022
@spikecurtis spikecurtis deleted the spike/1359_no_tfexec branch June 16, 2022 17:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Manually set terraform env variable prevents template creation
3 participants