Skip to content

[Kinesis] add Scala kinesis-mock build behind feature flag #12559

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 16, 2025

Conversation

gregfurman
Copy link
Contributor

Motivation

We've observed the NodeJS build of kinesis mock to encounter performance issues at higher volumes of requests -- namely with larger payload sizes.

This PR adds the original Scala build as an alternative engine for our Kinesis server behind the KINESIS_MOCK_PROVIDER_ENGINE flag.

Notes

Changes

Functionality

  • Splits up the KinesisMockServer to include a KinesisMockScalaServer and KinesisMockNodeServer
  • KinesisMockPackage can now either use the KinesisMockScalaPackageInstaller (downloads from GitHub) or KinesisMockNodePackageInstaller (downloads from NPM)

Config

  • Adds a KINESIS_MOCK_PROVIDER_ENGINE flag for switching to Scala build of Kinesis Mock. Valid values are node or scala -- where empty or invalid values will always default to node.
  • KINESIS_MOCK_MAXIMUM_HEAP_SIZE sets the maximum Java heap size corresponding to the '-Xmx' flag
  • KINESIS_MOCK_INITIAL_HEAP_SIZE sets the initial Java heap size corresponding to the '-Xms' flag

Testing

  • Creates a TestKinesisMockScala(TestKinesis) subclass that runs all tests in tests/aws/services/kinesis/test_kinesis.py::TestKinesis but with the configuration value monkeypatched to use the Scala build of Kinesis mock instead of NodeJS.

@gregfurman gregfurman added area: performance Make LocalStack go rocket-fast aws:kinesis Amazon Kinesis semver: minor Non-breaking changes which can be included in minor releases, but not in patch releases labels Apr 25, 2025
@gregfurman gregfurman added this to the 4.4 milestone Apr 25, 2025
@gregfurman gregfurman requested a review from dfangl April 25, 2025 13:39
@gregfurman gregfurman self-assigned this Apr 25, 2025
Copy link

github-actions bot commented Apr 25, 2025

S3 Image Test Results (AMD64 / ARM64)

  2 files    2 suites   8m 41s ⏱️
488 tests 438 ✅  50 💤 0 ❌
976 runs  876 ✅ 100 💤 0 ❌

Results for commit a089a68.

♻️ This comment has been updated with latest results.

Copy link

github-actions bot commented Apr 25, 2025

LocalStack Community integration with Pro

    2 files  ± 0      2 suites  ±0   1h 43m 32s ⏱️ + 1m 3s
4 441 tests +16  4 058 ✅ +16  383 💤 ±0  0 ❌ ±0 
4 443 runs  +16  4 058 ✅ +16  385 💤 ±0  0 ❌ ±0 

Results for commit a089a68. ± Comparison against base commit b961fee.

♻️ This comment has been updated with latest results.

@gregfurman gregfurman marked this pull request as ready for review April 25, 2025 17:22
Comment on lines 122 to 136
"-XX:+UseG1GC",
"-XX:MaxGCPauseMillis=500",
"-XX:+UseGCOverheadLimit",
"-XX:+ExplicitGCInvokesConcurrent",
"-XX:+HeapDumpOnOutOfMemoryError",
"-XX:+ExitOnOutOfMemoryError",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to set all these flags? Do they improve performance in any way? Especially the heap dump seems like something our customers do not want, and G1GC is generally the default GC in all recent java versions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MaxGCPauseMillis was for performance reasons and the ExitOnOutOfMemoryError was because I didn't want the server to stall when encountering an OOM.

Will remove:

  • UseG1GC since I was unaware it was the default (as per your comment)
  • UseGCOverheadLimit since it does nothing with G1GC
  • HeapDumpOnOutOfMemoryError since it'll spam the logs quite a bit

Comment on lines 60 to 63
@cached_property
def engine(self) -> KinesisMockEngine:
return KinesisMockEngine(config.KINESIS_MOCK_PROVIDER_ENGINE)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To determine which Kinesis mock backend to run:

if kinesismock_package.engine == KinesisMockEngine.SCALA:
return KinesisMockScalaServer(
port=port,
exe_path=kinesis_mock_path,
log_level=log_level,
latency=latency,
data_dir=persist_path,
account_id=account_id,
)
return KinesisMockNodeServer(
port=port,
exe_path=kinesis_mock_path,
log_level=log_level,
latency=latency,
data_dir=persist_path,
account_id=account_id,
)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case - I am not that happy with the control here. The Servers should control which package is installed, not the package which server is used, in my opinion.
Could simplify this a bit, by having two packages, and installing the right one depending on the configuration, and then use the right server?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah sure happy to have 2 packages 👍

Copy link
Contributor Author

@gregfurman gregfurman May 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dfangl How should we go about configuring the plugin for the LPM?

Should we have a seperate kinesis-mock and kinesis-mock-scala plugin or should this implementation be unified to cater for both node + scala builds i.e

@package(name="kinesis-mock")
def kinesismock_package() -> Package:
    from localstack.services.kinesis.packages import kinesismock_package

    return kinesismock_package

@package(name="kinesis-mock-scala")
def kinesismock_scala_package() -> Package:
    from localstack.services.kinesis.packages import kinesismock_scala_package
    
    return kinesismock_scala_package

OR

@package(name="kinesis-mock")
def kinesismock_package() -> Package:
    from localstack.services.kinesis.packages import (
        KinesisMockEngine,
        kinesismock_package,
        kinesismock_scala_package,
    )

    if KinesisMockEngine(config.KINESIS_MOCK_PROVIDER_ENGINE) == KinesisMockEngine.SCALA:
        return kinesismock_scala_package

    return kinesismock_package

Copy link
Contributor Author

@gregfurman gregfurman May 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dfangl FYI I spoke to Alex re: the above and we went with the second approach ☝️ Nvm: going to comment on your latest review

@gregfurman gregfurman modified the milestones: 4.4, 4.5 May 6, 2025
@gregfurman gregfurman force-pushed the scala/kinesis-mock branch from 4eed69f to cc634d2 Compare May 12, 2025 15:36
@gregfurman gregfurman requested a review from dfangl May 12, 2025 15:38
Copy link
Member

@dfangl dfangl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Way cleaner than before in my opinion 👍

Comment on lines +18 to +22
def _missing_(cls, value: str | Any) -> str:
# default to 'node' if invalid enum
if not isinstance(value, str):
return cls(cls.NODE)
return cls.__members__.get(value.upper(), cls.NODE)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Not entirely happy with this fallback logic here - it is effectively unused, as we only ever check if it is scala anyway, and we would silently accept obviously wrong values. Also, if we ever switch the default, we have to do it in multiple places. Nothing to hold back the merge, but I think we could potentially make this easier.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I suppose this is unideal but I wanted to make sure (since we're feature-flagging this strategy) that nothing gets broken if incorrect values are used. Definitely erring on the defensive side.

What have we done in past for situations like this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually we would fallback, but at least with a warning in the log.

@gregfurman gregfurman merged commit 9990b6f into master May 16, 2025
39 checks passed
@gregfurman gregfurman deleted the scala/kinesis-mock branch May 16, 2025 13:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: performance Make LocalStack go rocket-fast aws:kinesis Amazon Kinesis semver: minor Non-breaking changes which can be included in minor releases, but not in patch releases
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants