
[Filebeat]Add azure input #14092

Closed
narph opened this issue Oct 16, 2019 · 6 comments
Assignees: narph
Labels: discuss (Issue needs further discussion.), Filebeat, release-highlight, Team:Integrations (Label for the Integrations team)

narph (Contributor) commented Oct 16, 2019

Two alternatives were created to add support for an azure input in x-pack/filebeat:

1. Using the kafka input and creating a wrapper around it.

Example configuration:

- type: azure
  eventhub: "{eventhub name}"
  consumer_group: "{consumer group}"
  connection_string: "Endpoint=sb://..."

Example output:

      {
        "_index" : "filebeat-8.0.0-2019.10.09",
        "_type" : "_doc",
        "_id" : "eeFm1W0BBEs",
        "_score" : 1.0,
        "_source" : {
          "@timestamp" : "2019-10-09T16:13:59.659Z",
          "input" : {
            "type" : "azure"
          },
          "agent" : {
            "version" : "8.0.0",
            "type" : "filebeat",
            "ephemeral_id" : "34ds3dsfsfs-e536-494b-abf6-a002a1ec7cd2",
            "hostname" : "DESKTOP-RFOOE09",
            "id" : "646dgdgd-8320-4632-a508-98226184ad8e"
          },
          "ecs" : {
            "version" : "1.1.0"
          },
          "host" : {
            "name" : "DESKTOP-RFOOE09",
            "id" : "1e50b6e1-9710-4164-a8f0-34fsf45fsfs",
            "hostname" : "DESKTOP-RFOOE09",
            "architecture" : "x86_64",
            "os" : {
              "platform" : "windows",
              "version" : "10.0",
              "family" : "windows",
              "name" : "Windows 10 Pro",
              "kernel" : "10.0.18362.388 (WinBuild.160101.0800)",
              "build" : "18362.388"
            }
          },
          "azure" : {
            "operationName" : "NetworkSecurityGroupCounters",
            "properties" : {
              "matchedConnections" : 0,
              "primaryIPv4Address" : "10.1.7.6",
              "ruleName" : "DefaultRule_AllowVnetInBound",
              "subnetPrefix" : "10.1.7.0/24",
              "type" : "allow",
              "vnetResourceGuid" : "{...}",
              "direction" : "In",
              "macAddress" : "..."
            },
            "resourceId" : "...",
            "systemId" : "...",
            "input" : {
              "partition" : 3,
              "offset" : 47536,
              "key" : "",
              "headers" : [ ],
              "topic" : "insights-operational-logs"
            },
            "time" : "2019-10-09T16:13:59.6590000Z",
            "category" : "NetworkSecurityGroupRuleCounter"
          }
        }
      }

The POC will try to map the "time" message field and return all properties inside the message.
@roncohen , let me know if this is what you were thinking about.

POC #14093

2. Using the Azure Event Hubs SDK and creating an azure input from scratch:

POC #14882
In progress; will update this ticket with configuration and event details soon.

cc: @exekias

@narph narph self-assigned this Oct 16, 2019
@narph narph added Team:Integrations Label for the Integrations team discuss Issue needs further discussion. Filebeat Filebeat labels Oct 16, 2019
roncohen (Contributor) commented:
Something like that makes sense to me. I wonder if we should even call it "eventhub"? Can we imagine other Azure filebeat modules in the future?

exekias (Contributor) commented Oct 17, 2019

Thank you for opening this @narph!

From the example output I see that you already parsed the resulting JSON. Is this something we can do with a generic mapping? I think that we are currently outputting this JSON as raw text, and parsing happens later in the pipeline.

About the name: we have a pubsub input whose name ended up as google-pubsub (https://www.elastic.co/guide/en/beats/filebeat/master/filebeat-input-google-pubsub.html). I think we should go for azure-eventhub here.

narph (Contributor, Author) commented Oct 17, 2019

@roncohen , @exekias , thank you for your quick feedback.
Azure logs can be read from other mediums as well (storage accounts, the Azure Monitor logs API, etc.), so it definitely makes sense to call it azure-eventhub or azure-kafka. It will also be consistent with the current naming scheme.

regarding:

From the example output I see that you already parsed the resulting JSON. Is this something we can do with a generic mapping? I think that we are currently outputting this JSON as raw text, and parsing happens later in the pipeline.

At the moment I am just overriding the kafka input function that creates the beat.Event object and reading all the properties inside the JSON object. No additional processing of the JSON is involved.

Excuse the messy code, I was just getting the plumbing in place for this POC:

func mapEvents(messages []string, kafkaFields common.MapStr) []beat.Event {
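For readers following along, here is a minimal, hypothetical sketch of what such a mapping function could look like. It stands in for the libbeat types (beat.Event, common.MapStr) with plain maps so it is self-contained; it is not the POC implementation, just an illustration of lifting the parsed message properties plus the kafka/eventhub metadata into one event:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mapEvents parses each raw message body as JSON and merges the result
// with the transport metadata fields (partition, offset, topic, ...).
// Hypothetical stand-in for the POC's
// mapEvents(messages []string, kafkaFields common.MapStr) []beat.Event.
func mapEvents(messages []string, kafkaFields map[string]interface{}) []map[string]interface{} {
	var events []map[string]interface{}
	for _, msg := range messages {
		fields := map[string]interface{}{"input": kafkaFields}
		var parsed map[string]interface{}
		if err := json.Unmarshal([]byte(msg), &parsed); err == nil {
			// Lift every property of the message into the event,
			// including the "time" field used for the timestamp.
			for k, v := range parsed {
				fields[k] = v
			}
		} else {
			// Fall back to keeping the raw body as a plain message.
			fields["message"] = msg
		}
		events = append(events, fields)
	}
	return events
}

func main() {
	msgs := []string{`{"time":"2019-10-09T16:13:59.6590000Z","category":"NetworkSecurityGroupRuleCounter"}`}
	meta := map[string]interface{}{"partition": 3, "topic": "insights-operational-logs"}
	evts := mapEvents(msgs, meta)
	fmt.Println(evts[0]["category"]) // prints NetworkSecurityGroupRuleCounter
}
```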

faec (Contributor) commented Oct 17, 2019

Thanks! Can you clarify, is there specific functionality here that isn't possible with the existing kafka input / azure module? Is the idea to use the Azure libraries directly, so that it can work on Azure Event Hubs that don't have kafka compatibility enabled, or is this about better / more precise metadata handling, or something else? An Azure-specific input may make sense but I want to be clear about what we're gaining compared to a configuration-based solution.

narph (Contributor, Author) commented Oct 21, 2019

hi @faec, a few things could be gained here: as you mentioned above, metadata handling (replacing the kafka-to-eventhub conceptual mapping), and some message pre-processing, like automatically splitting the JSON messages (users don't necessarily have to know about the 'expand_event_list_from_field' setting).
The initial plan is not to use the azure libraries directly (although the added value would indeed be removing the kafka-enabled requirement) but to still reuse the kafka input somehow.
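To illustrate the pre-processing mentioned above: Azure diagnostic logs typically arrive as a single envelope wrapping a "records" array, and the input could split that into individual events automatically. A sketch of the idea, assuming plain maps rather than the actual input's types:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// splitRecords expands an Azure-style envelope {"records":[...]} into
// one document per record; messages without a "records" array are
// returned as a single document. Sketch only, not the input's code.
func splitRecords(message []byte) ([]map[string]interface{}, error) {
	var envelope struct {
		Records []map[string]interface{} `json:"records"`
	}
	if err := json.Unmarshal(message, &envelope); err != nil {
		return nil, err
	}
	if len(envelope.Records) == 0 {
		// No "records" array: treat the whole message as one event.
		var single map[string]interface{}
		if err := json.Unmarshal(message, &single); err != nil {
			return nil, err
		}
		return []map[string]interface{}{single}, nil
	}
	return envelope.Records, nil
}

func main() {
	msg := []byte(`{"records":[{"operationName":"A"},{"operationName":"B"}]}`)
	recs, _ := splitRecords(msg)
	fmt.Println(len(recs)) // prints 2
}
```

This is what 'expand_event_list_from_field: records' achieves via configuration; doing it inside the input removes that step for the user.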

narph (Contributor, Author) commented Mar 26, 2020

Closing this; #14882 added the new azure-eventhub input.
