Otel Collector 를 통한 Health Checker

Splunk/Observability

by 야솔아빠 2023. 3. 27. 14:53

Health Checker — Splunk Observability Cloud documentation

Was this topic useful? Was this documentation topic helpful? Please select Yes No Please specify the reason Please select The topic did not answer my question(s) I found an error I did not like the topic organization Other Enter your email address, and som

docs.splunk.com

2. agent_config.yaml

# Default configuration file for the Linux (deb/rpm) and Windows MSI collector packages

# If the collector is installed without the Linux/Windows installer script, the following
# environment variables are required to be manually defined or configured below:
# - SPLUNK_ACCESS_TOKEN: The Splunk access token to authenticate requests
# - SPLUNK_API_URL: The Splunk API URL, e.g. https://api.us0.signalfx.com
# - SPLUNK_BUNDLE_DIR: The path to the Smart Agent bundle, e.g. /usr/lib/splunk-otel-collector/agent-bundle
# - SPLUNK_COLLECTD_DIR: The path to the collectd config directory for the Smart Agent, e.g. /usr/lib/splunk-otel-collector/agent-bundle/run/collectd
# - SPLUNK_HEC_TOKEN: The Splunk HEC authentication token
# - SPLUNK_HEC_URL: The Splunk HEC endpoint URL, e.g. https://ingest.us0.signalfx.com/v1/log
# - SPLUNK_INGEST_URL: The Splunk ingest URL, e.g. https://ingest.us0.signalfx.com
# - SPLUNK_TRACE_URL: The Splunk trace endpoint URL, e.g. https://ingest.us0.signalfx.com/v2/trace

extensions:
  health_check:
    endpoint: 0.0.0.0:13133
  http_forwarder:
    ingress:
      endpoint: 0.0.0.0:6060
    egress:
      endpoint: "${SPLUNK_API_URL}"
      # Use instead when sending to gateway
      #endpoint: "${SPLUNK_GATEWAY_URL}"
  smartagent:
    bundleDir: "${SPLUNK_BUNDLE_DIR}"
    collectd:
      configDir: "${SPLUNK_COLLECTD_DIR}"
  zpages:
    #endpoint: 0.0.0.0:55679
  memory_ballast:
    # In general, the ballast should be set to 1/3 of the collector's memory, the limit
    # should be 90% of the collector's memory.
    # The simplest way to specify the ballast size is set the value of SPLUNK_BALLAST_SIZE_MIB env variable.
    size_mib: ${SPLUNK_BALLAST_SIZE_MIB}

receivers:
  fluentforward:
    endpoint: 127.0.0.1:8006
  hostmetrics:
    collection_interval: 10s
    scrapers:
      cpu:
      disk:
      filesystem:
      memory:
      network:
      # System load average metrics https://en.wikipedia.org/wiki/Load_(computing)
      load:
      # Paging/Swap space utilization and I/O metrics
      paging:
      # Aggregated system process count metrics
      processes:
      # System processes metrics, disabled by default
      # process:
  jaeger:
    protocols:
      grpc:
        endpoint: 0.0.0.0:14250
      thrift_binary:
        endpoint: 0.0.0.0:6832
      thrift_compact:
        endpoint: 0.0.0.0:6831
      thrift_http:
        endpoint: 0.0.0.0:14268
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  # This section is used to collect the OpenTelemetry Collector metrics
  # Even if just a Splunk APM customer, these metrics are included
  prometheus/internal:
    config:
      scrape_configs:
      - job_name: 'otel-collector'
        scrape_interval: 10s
        static_configs:
        - targets: ['0.0.0.0:8888']
        metric_relabel_configs:
          - source_labels: [ __name__ ]
            regex: '.*grpc_io.*'
            action: drop
  smartagent/signalfx-forwarder:
    type: signalfx-forwarder
    listenAddress: 0.0.0.0:9080
  smartagent/processlist:
    type: processlist
  signalfx:
    endpoint: 0.0.0.0:9943
    # Whether to preserve incoming access token and use instead of exporter token
    # default = false
    #access_token_passthrough: true
  zipkin:
    endpoint: 0.0.0.0:9411
  smartagent/health-checker_55679:
    type: collectd/health-checker
    host: 127.0.0.1
    port: 55679
    tcpCheck: true
    #url: http://{{.Host}}:{{.Port}}/
    #jsonKey: "your_key"
    #jsonVal: "1"
  smartagent/health-checker_22:
    type: collectd/health-checker
    host: 127.0.0.1
    port: 22
    tcpCheck: true

processors:
  batch:
  # Enabling the memory_limiter is strongly recommended for every pipeline.
  # Configuration is based on the amount of memory allocated to the collector.
  # For more information about memory limiter, see
  # https://github.com/open-telemetry/opentelemetry-collector/blob/main/processor/memorylimiter/README.md
  memory_limiter:
    check_interval: 2s
    limit_mib: ${SPLUNK_MEMORY_LIMIT_MIB}

  # Detect if the collector is running on a cloud system, which is important for creating unique cloud provider dimensions.
  # Detector order is important: the `system` detector goes last so it can't preclude cloud detectors from setting host/os info.
  # Resource detection processor is configured to override all host and cloud attributes because instrumentation
  # libraries can send wrong values from container environments.
  # https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/resourcedetectionprocessor#ordering
  resourcedetection:
    detectors: [gcp, ecs, ec2, azure, system]
    override: true

  # Optional: The following processor can be used to add a default "deployment.environment" attribute to the logs and
  # traces when it's not populated by instrumentation libraries.
  # If enabled, make sure to enable this processor in the pipeline below.
  #resource/add_environment:
    #attributes:
      #- action: insert
        #value: staging/production/...
        #key: deployment.environment

exporters:
  # Traces
  sapm:
    access_token: "${SPLUNK_ACCESS_TOKEN}"
    endpoint: "${SPLUNK_TRACE_URL}"
  # Metrics + Events
  signalfx:
    access_token: "${SPLUNK_ACCESS_TOKEN}"
    api_url: "${SPLUNK_API_URL}"
    ingest_url: "${SPLUNK_INGEST_URL}"
    # Use instead when sending to gateway
    #api_url: http://${SPLUNK_GATEWAY_URL}:6060
    #ingest_url: http://${SPLUNK_GATEWAY_URL}:9943
    sync_host_metadata: true
    correlation:
  # Logs
  splunk_hec:
    token: "${SPLUNK_HEC_TOKEN}"
    endpoint: "${SPLUNK_HEC_URL}"
    source: "otel"
    sourcetype: "otel"
  # Send to gateway
  otlp:
    endpoint: "${SPLUNK_GATEWAY_URL}:4317"
    tls:
      insecure: true
  # Debug
  logging:
    loglevel: debug

service:
  extensions: [health_check, http_forwarder, zpages, memory_ballast, smartagent]
  pipelines:
    traces:
      receivers: [jaeger, otlp, smartagent/signalfx-forwarder, zipkin]
      processors:
      - memory_limiter
      - batch
      - resourcedetection
      #- resource/add_environment
      exporters: [sapm, signalfx]
      # Use instead when sending to gateway
      #exporters: [otlp, signalfx]
    metrics:
      receivers: [hostmetrics, otlp, signalfx, smartagent/signalfx-forwarder, smartagent/health-checker_55679, smartagent/health-checker_22]
      processors: [memory_limiter, batch, resourcedetection]
      exporters: [signalfx]
      # Use instead when sending to gateway
      #exporters: [otlp]
    metrics/internal:
      receivers: [prometheus/internal]
      processors: [memory_limiter, batch, resourcedetection]
      # When sending to gateway, at least one metrics pipeline needs
      # to use signalfx exporter so host metadata gets emitted
      exporters: [signalfx]
    logs/signalfx:
      receivers: [signalfx, smartagent/processlist]
      processors: [memory_limiter, batch, resourcedetection]
      exporters: [signalfx]
      # Use instead when sending to gateway
      #exporters: [otlp]
    logs:
      receivers: [fluentforward, otlp]
      processors:
      - memory_limiter
      - batch
      - resourcedetection
      #- resource/add_environment
      exporters: [splunk_hec]
      # Use instead when sending to gateway
      #exporters: [otlp]

3. restart splunk-otel-collector

sudo systemctl restart splunk-otel-collector

위 예제는 55679, 22 두개 port check 하기 위한 테스트

'Splunk > Observability' 카테고리의 다른 글

AWS Auto Scaling Group 개수가 안맞는 현상 (0)	2023.05.18
Alert message 전송시 metric의 custom property 값 활용 (0)	2023.04.03
[Log Observer Connect] Severity 설정 (0)	2023.03.07
Alert message 의 GMT 시간을 local time으로 변 (0)	2022.12.27
AWS metric이 검색 / 추가 (0)	2022.12.26