Filerelay Container

Filerelay container image

The filerelay container image uses Filebeat to stream log data and is configured through a bind mount. A local mount supplies the runtime configuration and runtime history, and identifies the logs to be aggregated; this allows state to persist across restarts. The container also reads the logs to be shipped from shared volumes.

The expected directory structure is shown below. It can be created on the pod/Docker host and mounted so that the filerelay container sees this layout.

 - filerelay
   + conf
   + data
   + logs

The configuration file filebeat.yml must be present in the conf subdirectory. The data directory tracks the log-shipping state and is used for start, stop, and crash recovery. The logs directory holds the runtime logs and is useful for troubleshooting.
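
A minimal sketch of creating this layout on the host is shown below. The host path /opt/inv/shipper/podfilerelay is an assumption chosen to match the podman example later on this page; adjust it to your environment.

$ sudo mkdir -p /opt/inv/shipper/podfilerelay/{conf,data,logs}
# place the Filebeat configuration in the conf subdirectory
$ sudo cp filebeat.yml /opt/inv/shipper/podfilerelay/conf/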

Configuration

A sample configuration file is shown below.

The path section of the configuration file specifies the directories used by the runtime. These are sourced from the shared volume mounted into the container.

Next is the inputs section, which lists the paths that are crawled to locate log files to be forwarded for indexing.

The general section provides optional tags and custom fields, such as the name of the shipper that publishes the network data.
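
The sample configuration below does not include these settings; a hedged example of what they could look like is shown here. The shipper name, tag, and field values are illustrative only and not required by the image.

# Illustrative general settings; choose values that suit your deployment.
name: "filerelay-appserver-01"
tags: ["filerelay", "podman"]
fields:
  env: production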

The output section specifies the address of the Logstash listener that will process the log events. Ensure the container has network access to the listener.

The last section sets the log level for the shipper. When running at debug level, a subsystem can be selected for logging by setting logging.selectors to "beat", "publish", "service", or "*" to enable all selectors.
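
The sample configuration below omits the logging section; an illustrative example of one is shown here. The "publish" selector is only an example of restricting debug output to a single subsystem.

# Illustrative logging settings; "*" would enable all selectors.
logging.level: debug
logging.selectors: ["publish"]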

#=========================== Path properties ================================
path.home: /filerelay
path.conf: /filerelay/conf
path.data: /filerelay/data
path.logs: /filerelay/logs

filebeat.inputs:
- type: log
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /var/log/*.log

#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["127.0.0.1:9711"]

Configuring the agent in podman

An example is shown below for reference. Adjust the host network, volume, and file-sharing security options for the mounts to suit your deployment topology.

$ sudo podman pull invariantio/filerelay

$ sudo podman run --net=host                     \
    -v /opt/inv/shipper/podfilerelay:/filerelay  \
    --security-opt label=disable                 \
    -v /opt/inv/shipper:/logs:ro                 \
     invariantio/filerelay

Param                                          Description
--net=host                                     Allows the container to use the host network.
-v /opt/inv/shipper/podfilerelay:/filerelay    A persistent mount for the conf, data, and logs directories used by the agent.
-v /opt/inv/shipper:/logs:ro                   A shared, read-only directory used to source the logs for shipping.
--security-opt label=disable                   Disables SELinux label separation for the mounts; the :Z volume flag can be used instead.
invariantio/filerelay                          The container image of the file shipper.
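
After starting the container, it can be verified with standard podman commands; the paths below assume the mounts from the example above.

$ sudo podman ps --filter ancestor=invariantio/filerelay
$ ls /opt/inv/shipper/podfilerelay/logs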
