kinesisfirehose: decompress CloudWatch logs and extract message fields #33691

Open
1 of 2 tasks
Tietew opened this issue Mar 5, 2025 · 2 comments
Labels
@aws-cdk/aws-kinesisfirehose Related to Amazon Kinesis Data Firehose effort/medium Medium work item – several days of effort feature-request A feature should be added or improved. p2

Comments

@Tietew
Contributor

Tietew commented Mar 5, 2025

Describe the feature

Data Firehose supports the following convenient conversions: decompressing CloudWatch Logs and extracting message fields from them.

Use Case

  • To use Firehose Data Format Conversion (Parquet, ORC) or Dynamic partitioning, decompression must be enabled.
  • These features let us avoid explicitly decompressing and extracting log entries when analyzing collected logs.

Proposed Solution

They are implemented as processors.

CDK code:

new firehose.DeliveryStream(this, 'DeliveryStream', {
  destination: new firehose.S3Bucket(bucket, {
    decompressCloudWatchLogs: true,
    extractMessageFields: true,
  }),
});

will generate a CloudFormation template like:

{
  "ExtendedS3DestinationDescription": {
    // other configurations
    "ProcessingConfiguration": {
      "Enabled": true,
      "Processors": [
        {
          "Type": "Decompression",
          "Parameters": [
            {
              "ParameterName": "CompressionFormat",
              "ParameterValue": "GZIP"
            }
          ]
        },
        {
          "Type": "CloudWatchLogProcessing",
          "Parameters": [
            {
              "ParameterName": "DataMessageExtraction",
              "ParameterValue": "true"
            }
          ]
        }
      ]
    }
  }
}

A custom Lambda processor will be appended after decompression and extraction.
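The mapping from the two proposed booleans to the processor list could look roughly like the sketch below. It is only an illustration: `renderProcessingConfiguration` and the interfaces are hypothetical names, not part of aws-cdk-lib, and the check that extraction requires decompression is an assumption about how the options would be validated.

```typescript
// Hypothetical option names, taken from the proposal above.
interface CloudWatchLogOptions {
  decompressCloudWatchLogs?: boolean;
  extractMessageFields?: boolean;
}

interface ProcessorParameter {
  ParameterName: string;
  ParameterValue: string;
}

interface Processor {
  Type: string;
  Parameters: ProcessorParameter[];
}

interface ProcessingConfiguration {
  Enabled: boolean;
  Processors: Processor[];
}

// Illustrative helper (not a CDK API): builds the processor list in the
// order decompression, extraction, then any custom processors.
function renderProcessingConfiguration(
  options: CloudWatchLogOptions,
  customProcessors: Processor[] = [],
): ProcessingConfiguration | undefined {
  // Assumption: message extraction is only valid when decompression is on.
  if (options.extractMessageFields && !options.decompressCloudWatchLogs) {
    throw new Error('extractMessageFields requires decompressCloudWatchLogs');
  }

  const processors: Processor[] = [];
  if (options.decompressCloudWatchLogs) {
    processors.push({
      Type: 'Decompression',
      Parameters: [{ ParameterName: 'CompressionFormat', ParameterValue: 'GZIP' }],
    });
  }
  if (options.extractMessageFields) {
    processors.push({
      Type: 'CloudWatchLogProcessing',
      Parameters: [{ ParameterName: 'DataMessageExtraction', ParameterValue: 'true' }],
    });
  }
  // Custom processors (for example a Lambda) come after the built-in ones.
  processors.push(...customProcessors);

  return processors.length > 0 ? { Enabled: true, Processors: processors } : undefined;
}
```

With both flags set and no custom processor, this returns the same two-entry `Processors` array as the template above.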

Other Information

Related issues and PRs:

Acknowledgements

  • I may be able to implement this feature request
  • This feature might incur a breaking change

CDK version used

2.181.0

Environment details (OS name and version, etc.)

Linux

@Tietew Tietew added feature-request A feature should be added or improved. needs-triage This issue or PR still needs to be triaged. labels Mar 5, 2025
@github-actions github-actions bot added the @aws-cdk/aws-kinesisfirehose Related to Amazon Kinesis Data Firehose label Mar 5, 2025
@pahud
Contributor

pahud commented Mar 5, 2025

Thank you. I guess we probably need to add DecompressionProcessor and CloudWatchLogMessageExtractionProcessor implementations and maybe update the S3Bucket destination to accept multiple processors.

Anyways, we appreciate your pull request and we'll take a look when it's ready.

@pahud pahud added p2 effort/medium Medium work item – several days of effort and removed needs-triage This issue or PR still needs to be triaged. labels Mar 5, 2025
@Tietew
Contributor Author

Tietew commented Mar 6, 2025

@pahud Thank you for your suggestion!

But the reasons why I chose booleans instead of processor classes are:

  • AWS management console only shows us checkboxes with no additional parameters.
  • The processors should be specified in order: decompression - extraction - format conversion - custom lambda processor - appending delimiter
  • The parameters of the processors should have predefined fixed values.

Anyway, I will investigate the behavior further.
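To illustrate the ordering argument, here is a minimal sketch that puts whatever stages are enabled into the fixed order listed above. The stage names are hypothetical labels mirroring that list, not Firehose processor type identifiers, and `orderStages` is not a CDK API.

```typescript
// Fixed stage order from the list above; the names are illustrative labels only.
const STAGE_ORDER = [
  'decompression',
  'extraction',
  'formatConversion',
  'customLambda',
  'appendDelimiter',
] as const;

type Stage = typeof STAGE_ORDER[number];

// Returns the enabled stages in the fixed order, dropping duplicates,
// regardless of the order in which the caller listed them.
function orderStages(enabled: Stage[]): Stage[] {
  const wanted = new Set<Stage>(enabled);
  return STAGE_ORDER.filter((stage) => wanted.has(stage));
}
```

Because the order is fixed, boolean flags carry all the information the user needs to supply; with processor classes, the construct would have to re-sort or validate whatever list it receives.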
