Version information
This version is compatible with:
- Puppet Enterprise 2023.2.x, 2023.1.x, 2023.0.x, 2021.7.x, 2021.6.x, 2021.5.x, 2021.4.x, 2021.3.x, 2021.2.x, 2021.1.x, 2021.0.x, 2019.8.x, 2019.7.x, 2019.5.x, 2019.4.x, 2019.3.x, 2019.2.x, 2019.1.x, 2019.0.x, 2018.1.x, 2017.3.x, 2017.2.x, 2016.4.x
- Puppet >= 4.10.0 < 8.0.0
Start using this module
Add this module to your Puppetfile:
mod 'ssm-nifi', '0.8.0'
Learn more about managing modules with a Puppetfile.
Puppet module ssm-nifi
Description
Install and configure the Apache NiFi dataflow automation software.
Setup
What nifi affects
This module will download the Apache NiFi tarball to /var/tmp/. Please make sure you have space for this file.
The tarball will be unpacked to a subdirectory under /opt/nifi by default, where it will require about the same disk space. For ease of access, the symlink /opt/nifi/current will point to the managed nifi directory.
By default, NiFi stores logs, state and configuration within the installation directory. This module changes this behaviour.
The module will create /var/opt/nifi for persistent storage outside the software install root. It will also configure the following NiFi properties to use directories under this path:
- nifi.content.repository.directory.default
- nifi.database.directory
- nifi.documentation.working.directory
- nifi.flowfile.repository.directory
- nifi.nar.working.directory
- nifi.provenance.repository.directory.default
- nifi.web.jetty.working.directory
The module will create /var/log/nifi and configure NiFi to write log files to this directory. NiFi handles log rotation by itself. See Managing logs for more information.
The module will create /opt/nifi/conf to store Puppet-managed configuration files. The NiFi-generated configuration files and the flow.xml configuration archive will also be stored here.
Setup Requirements
NiFi requires a Java Runtime Environment. NiFi 1.14.0 runs on Java 8 or Java 11.
NiFi requires approximately 1.3 GiB of download, temporary storage and unpacked storage. Ensure /opt/nifi and /var/tmp have room for the downloaded and unpacked software.
When installing on local infrastructure, consider downloading the distribution tarballs, validating them with the Apache distribution keys, and storing them in a local repository. Adjust the configuration variables to point to your local repository. The NiFi download page also documents how to verify the integrity and authenticity of the downloaded files.
Beginning with nifi
Add dependency modules to your puppet environment:
- puppet/archive
- puppet/systemd
- puppetlabs/inifile
- puppetlabs/stdlib
You need to ensure java 8 or 11 is installed. If in doubt, use this module:
- puppetlabs/java
By default, NiFi 1.14.0 and later starts with a self-signed TLS certificate, listens on the lo interface only, and generates a random username and password for access. You will need to add NiFi properties to override this. Follow the NiFi administration guide for configuration, or see the example further down in this README.
Usage
To download and install NiFi, include the module. This will download NiFi, unpack it under /opt/nifi/nifi-<version>, and start the service with default configuration and storage locations.
By default, NiFi is not available over the network. It will bind to 127.0.0.1 port 8443, using HTTPS with a self-signed certificate. To make NiFi available over the network, you will need to ensure it listens on an external interface. Set the property nifi.web.https.host to a hostname or an external IP address. To change the port number, set nifi.web.https.port.
A minimal manifest for installing Java and NiFi, then making NiFi available over the network is:
class { 'java': }
class { 'nifi':
  nifi_properties => {
    'nifi.web.https.host' => $trusted['certname'],
  },
}
Class['java'] -> Class['nifi::service']
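As a minimal sketch of the port override mentioned above, the same nifi_properties hash can carry nifi.web.https.port. The port 9443 is only an illustrative value chosen for this example, not a module default.

```puppet
class { 'nifi':
  nifi_properties => {
    # Listen on the node's certname instead of 127.0.0.1
    'nifi.web.https.host' => $trusted['certname'],
    # Illustrative port; NiFi uses 8443 unless this property is set
    'nifi.web.https.port' => 9443,
  },
}
```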
Using a specific version of NiFi
This module installs a specific version of NiFi. If a newer version of NiFi has been released, the older one will generally no longer be downloadable from the Apache download CDN site. In that case, you will need to adjust the module parameters version and download_checksum:
class { 'nifi':
  version           => 'x.y.z',
  download_checksum => 'abcde...', # sha256 checksum
}
The SHA256 checksum of the NiFi tar.gz is available on the NiFi download page.
Hosting NiFi on a local repository
NiFi is a big download. Please consider hosting a copy locally for your own use. To use a local repository, set the download_url, download_checksum and version parameters.
Example using puppet manifests:
class { 'nifi':
version => '1.14.0',
download_checksum => '858e12bce1da9bef24edbff8d3369f466dd0c48a4f9892d4eb3478f896f3e68b',
download_url => 'https://repo.example.com/nifi/nifi-1.14.0-bin.tar.gz',
}
Example using hieradata:
include nifi
nifi::version: "1.14.0"
nifi::download_checksum: "858e12bce1da9bef24edbff8d3369f466dd0c48a4f9892d4eb3478f896f3e68b"
nifi::download_url: "https://repo.example.com/nifi/nifi-1.14.0-bin.tar.gz"
Please keep download_url, download_checksum and version in sync. The URL, checksum and version should match. Otherwise, Puppet will become confused.
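One way to reduce the risk of the URL and version drifting apart is to derive the URL from a single variable. This is only a sketch: $nifi_version is a local variable introduced for this example, and repo.example.com is the placeholder repository used above. The checksum still has to be updated by hand whenever the version changes.

```puppet
# Single source of truth for the version (variable name is illustrative)
$nifi_version = '1.14.0'

class { 'nifi':
  version           => $nifi_version,
  # Must be the sha256 checksum of the tarball for $nifi_version
  download_checksum => '858e12bce1da9bef24edbff8d3369f466dd0c48a4f9892d4eb3478f896f3e68b',
  # The URL is built from the same version string
  download_url      => "https://repo.example.com/nifi/nifi-${nifi_version}-bin.tar.gz",
}
```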
To set NiFi properties, like the 'sensitive properties key', add them to the nifi_properties class parameter. Example:
class { 'nifi':
nifi_properties => {
'nifi.sensitive.props.key' => 'keep it secret, keep it safe',
},
}
(I recommend you use hiera-eyaml to store this somewhat securely.)
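A hedged sketch of how this could look in a wrapping profile: the key arrives as a class parameter (for example from Hiera, possibly encrypted with hiera-eyaml) and is wrapped in Sensitive before being handed to the module. The class name profile::nifi and the parameter name sensitive_props_key are assumptions made for this example. It relies on the Nifi::Property type accepting Sensitive[String], as listed in the reference below.

```puppet
class profile::nifi (
  # Looked up automatically from the Hiera key profile::nifi::sensitive_props_key
  String[1] $sensitive_props_key,
) {
  class { 'nifi':
    nifi_properties => {
      # Sensitive() keeps the value out of Puppet logs and reports
      'nifi.sensitive.props.key' => Sensitive($sensitive_props_key),
    },
  }
}
```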
Example: Configuring TLS
NiFi will generate a self-signed TLS certificate by default.
Using a trusted TLS certificate is outside the scope of this module. A good place for it is a "profile" class wrapping the nifi module with other bits and pieces.
This example is based on the PKI configuration paths on the Red Hat OS family. It assumes you have certificates and keys stored under /etc/pki/tls, and use the system CA trust store under /etc/pki/ca-trust.
The puppetlabs-java_ks module is used to manage the Java keystore file used by NiFi.
class profile::nifi (
  Stdlib::Fqdn $hostname = $trusted['certname'],
  # The default is wrapped in Sensitive() to match the declared data type
  Sensitive[String] $keystorepassword = Sensitive('changeme'),
) {
  $hostcert = "/etc/pki/tls/certs/${hostname}.pem"
  $hostprivkey = "/etc/pki/tls/private/${hostname}.pem"
  class { 'java':
    package => 'java-11-openjdk-headless',
    before  => Class['nifi::service'],
  }
  class { 'nifi':
    nifi_properties => {
      # Web properties
      'nifi.web.https.host'            => $hostname,
      # TLS properties
      # - Key store generated in this profile
      'nifi.security.keystore'         => '/opt/nifi/config/keystore.jks',
      'nifi.security.keystoreType'     => 'jks',
      'nifi.security.keystorePasswd'   => $keystorepassword,
      # - Default system trust store path and password for Java on Red Hat
      'nifi.security.truststore'       => '/etc/pki/ca-trust/extracted/java/cacerts',
      'nifi.security.truststoreType'   => 'jks',
      'nifi.security.truststorePasswd' => 'changeit',
    },
  }
  java_ks { "${hostname}:/opt/nifi/config/keystore.jks":
    ensure      => latest,
    password    => $keystorepassword,
    certificate => $hostcert,
    private_key => $hostprivkey,
    require     => Class['nifi::config'],
    before      => Class['nifi::service'],
  }
}
Clustering NiFi
To create a cluster, set the cluster class parameter to true, and add cluster members to the cluster_nodes hash. This configures the cluster to use ZooKeeper for shared state.
NiFi requires you to set nifi.sensitive.props.key to the same string on all cluster nodes.
If you cluster NiFi and also override the authorizers.xml file, ensure you also include the cluster nodes in this file.
Also, you need to configure TLS:
- Generate TLS certificates
- Set the property nifi.cluster.protocol.is.secure = true
Or continue without TLS:
- Set the property nifi.web.http.port
class profile::nifi {
  class { 'java':
    package => 'java-11-openjdk-headless',
    before  => Class['nifi::service'],
  }
  class { 'nifi':
    cluster         => true,
    nifi_properties => {
      'nifi.sensitive.props.key' => 'a shared secret for encrypting properties',
    },
    cluster_nodes   => {
      'node1.example.com' => { 'id' => 1 },
      'node2.example.com' => { 'id' => 2 },
      'node3.example.com' => { 'id' => 3 },
    },
  }
}
In addition to the clustering parameters, add certificates from a trusted Certificate Authority for cluster communication, following the TLS example in this README.
NiFi user authentication
User authentication is managed using the nifi.login.identity.providers.configuration.file and nifi.security.user.login.identity.provider properties. On a fresh install, NiFi uses the single-user-provider. A random username and password are created and written to the nifi-app.log file. This is documented at https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#user_authentication
This module does not manage login identity provider configuration. If you want to connect your NiFi to Active Directory or another LDAP server, you need to manage these properties and provide a file.
class profile::nifi {
  $login_identity_providers = '/opt/nifi/conf/custom-login-identity-providers.xml'
  class { 'nifi':
    nifi_properties => {
      'nifi.login.identity.providers.configuration.file' => $login_identity_providers,
      'nifi.security.user.login.identity.provider'       => 'my-custom-identity-provider',
    },
  }
  $template_params = {
    # [...]
  }
  file { $login_identity_providers:
    content => epp('profile/nifi/my-custom-login-identity-providers.epp', $template_params),
    # [...]
  }
}
NiFi user authorization
Authorization is managed using the nifi.authorizer.configuration.file and nifi.security.user.authorizer properties. This is documented at https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#multi-tenant-authorization
This module manages /opt/nifi/conf/authorizers.xml to support clustering; it is otherwise similar to the default content.
You can override this file using a collector (using the File <| ... |> {} syntax) to use your own template by overriding the content parameter of the file managed by the nifi module.
class profile::nifi {
  $authorizers = '/opt/nifi/conf/authorizers.xml'
  class { 'nifi':
    nifi_properties => {
      'nifi.authorizer.configuration.file' => $authorizers, # module default, added for clarity
      'nifi.security.user.authorizer'      => 'my-custom-authorizer-provider',
    },
  }
  $template_params = {
    # [...]
  }
  File <| title == $authorizers |> {
    content => epp('profile/nifi/my-custom-authorizers.epp', $template_params),
  }
}
Note: The example above assumes that the module parameter nifi::config_directory is left at its default /opt/nifi/conf.
Managing upgrades
The Upgrade Recommendations list properties which should be set so that NiFi upgrades keep the same configuration and state.
The module has defaults for data storage outside the installation directory. For now, you need to add settings to point to the config_resource_dir used in the examples above:
- nifi.flow.configuration.file
- nifi.flow.configuration.archive.dir
- nifi.authorizer.configuration.file
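The settings above could be sketched as follows. This assumes that the directory meant is the module's config_directory parameter, and that the file and directory names (flow.xml.gz, archive, authorizers.xml) follow NiFi's usual naming. Both are assumptions for illustration and should be checked against your installation.

```puppet
# Assumption: the configuration directory is the module's config_directory
$config_directory = '/opt/nifi/config'

class { 'nifi':
  config_directory => $config_directory,
  nifi_properties  => {
    # File and directory names below are illustrative assumptions
    'nifi.flow.configuration.file'        => "${config_directory}/flow.xml.gz",
    'nifi.flow.configuration.archive.dir' => "${config_directory}/archive",
    'nifi.authorizer.configuration.file'  => "${config_directory}/authorizers.xml",
  },
}
```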
Managing logs
The NiFi logs are written to $nifi::log_directory (default /var/log/nifi).
The directory prevents access for "other", but the files within are otherwise readable. You can use ACLs on the directory to permit access to your favourite log reading program. The puppet-posix_acl module can be used like this:
class profile::nifi (
  Stdlib::Absolutepath $log_directory = '/var/log/nifi',
) {
  class { 'nifi':
    log_directory => $log_directory,
    # [...]
  }
  posix_acl { $log_directory:
    action     => set,
    permission => ['user:logreader:r-x'],
    require    => File[$log_directory],
  }
}
NiFi state management
This module configures NiFi to use /opt/nifi/conf/state-management.xml instead of the ./conf/state-management.xml in the NiFi install directory. The values in this file are NiFi defaults, apart from the local state management directory or the cluster state management connect string.
To override this file with your own values, provide a nifi_properties class parameter which includes nifi.state.management.configuration.file pointing to your own file.
class profile::nifi (
  Stdlib::Absolutepath $custom_state_management = '/path/to/custom/state-management.xml',
) {
  class { 'nifi':
    nifi_properties => {
      'nifi.state.management.configuration.file' => $custom_state_management,
    },
  }
  file { $custom_state_management:
    notify => Class['nifi::service'],
  }
}
Notes and thoughts
About the ZooKeeper connection string. The NiFi administration guide says "This should contain a list of all ZooKeeper instances in the ZooKeeper quorum", while the ZooKeeper overview says "a client connects to one node". This module assumes that the NiFi cluster runs its own ZooKeeper and lets any node connect as a client to any other node.
nifi 1 nifi 2 nifi 3
| | |
zookeeper 1 --- zookeeper 2 --- zookeeper 3
Java Keystore: NiFi administration guide says "JKS is the preferred type", while the "keytool" utility provided by the java package says "JKS is deprecated, use PKCS12".
Limitations
This module is under development, and therefore somewhat light on functionality and sensible defaults.
State management: This module configures rudimentary NiFi state management for local state and with ZooKeeper for cluster state. The redis method is not managed with this module.
To manage more configuration files, add a file resource of your own, and set the related property using the nifi_properties class parameter.
Development
In the Development section, tell other users the ground rules for contributing to your project and how they should submit their work.
Reference
Table of Contents
Classes
Public Classes
- nifi: Manage Apache NiFi
Private Classes
- nifi::config: Manage configuration for Apache NiFi
- nifi::install: Install Apache NiFi
- nifi::service: Manage the Apache NiFi service
Data types
Classes
nifi
Install, configure and run Apache NiFi
The hash must be structured like { 'fqdn.example.com' => { 'id' => 1 },... }
Examples
Defaults
include nifi
Downloading from a different repository
class { 'nifi':
version => 'x.y.z',
download_url => 'https://my.local.repo.example.com/apache/nifi/nifi-x.y.z.tar.gz',
download_checksum => 'abcde...',
}
Configuring a NiFi cluster
class { 'nifi':
cluster => true,
cluster_nodes => {
'nifi-1.example.com' => { 'id' => 1 },
'nifi-2.example.com' => { 'id' => 2 },
'nifi-3.example.com' => { 'id' => 3 },
}
}
Parameters
The following parameters are available in the nifi
class:
version
user
group
download_url
download_checksum
download_checksum_type
download_tmp_dir
service_limit_nofile
service_limit_nproc
install_root
var_directory
log_directory
config_directory
nifi_properties
cluster
cluster_nodes
zookeeper_connect_string
zookeeper_client_port
zookeeper_secure_client_port
zookeeper_use_secure_client_port
initial_admin_identity
version
Data type: String
The version of Apache NiFi. This must match the version in the tarball. This is used for managing files, directories and paths in the service.
Default value: '1.15.3'
user
Data type: String
The user owning the nifi installation files, and running the service.
Default value: 'nifi'
group
Data type: String
The group owning the nifi installation files, and running the service.
Default value: 'nifi'
download_url
Data type: String
Where to download the binary installation tarball from.
Default value: "https://dlcdn.apache.org/nifi/${version}/nifi-${version}-bin.tar.gz"
download_checksum
Data type: String
The expected checksum of the downloaded tarball. This is used for verifying the integrity of the downloaded tarball.
Default value: 'c77fe8e4bc534f16fd5482832285e0bde07495308f31fd6d0fbb3118042daed4'
download_checksum_type
Data type: String
The checksum type of the downloaded tarball. This is used for verifying the integrity of the downloaded tarball.
Default value: 'sha256'
download_tmp_dir
Data type: Stdlib::Absolutepath
Temporary directory for downloading the tarball.
Default value: '/var/tmp'
service_limit_nofile
Data type: Integer
The limit on number of open files permitted for the service. Used for LimitNOFILE= in nifi.service.
Default value: 50000
service_limit_nproc
Data type: Integer
The limit on number of processes permitted for the service. Used for LimitNPROC= in nifi.service.
Default value: 10000
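As a small sketch, both service limits can be raised together for a busy node. The numbers are arbitrary illustrative values, not recommendations from the module.

```puppet
class { 'nifi':
  # Rendered as LimitNOFILE= in nifi.service
  service_limit_nofile => 100000,
  # Rendered as LimitNPROC= in nifi.service
  service_limit_nproc  => 20000,
}
```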
install_root
Data type: Stdlib::Absolutepath
The root directory of the nifi installation.
Default value: '/opt/nifi'
var_directory
Data type: Stdlib::Absolutepath
The root of the writable paths used by NiFi. NiFi will create directories beneath this path. This will implicitly add NiFi properties for working directories and repositories.
Default value: '/var/opt/nifi'
log_directory
Data type: Stdlib::Absolutepath
The directory where NiFi stores its user, app and bootstrap logs. NiFi will create log files beneath this path and take care of log rotation and deletion.
Default value: '/var/log/nifi'
config_directory
Data type: Stdlib::Absolutepath
Directory for NiFi version independent configuration files to be kept across NiFi version upgrades. NiFi will also write generated configuration files to this directory. This is used in addition to the "./conf" directory within each NiFi installation.
Default value: '/opt/nifi/config'
nifi_properties
Data type: Hash[String,Nifi::Property]
Hash of parameter key/values to be added to conf/nifi.properties.
Default value: {}
cluster
Data type: Boolean
If true, enables the built-in ZooKeeper cluster for shared configuration and state management. The cluster_nodes parameter is used to configure the ZooKeeper cluster, and each NiFi node will connect to its local ZooKeeper instance.
Default value: false
cluster_nodes
Data type: Hash[ Stdlib::Fqdn, Struct[{id => Integer[1,255]}] ]
A hash of zookeeper cluster nodes and their ID. The ID must be an integer between 1 and 255, unique in the cluster, and must not be changed once set.
Default value: {}
zookeeper_connect_string
Data type: Optional[String]
The ZooKeeper connect string is autogenerated from the cluster_nodes parameter as well as the ZooKeeper client port parameters. To override this, set the connect string using this parameter.
Default value: undef
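A hedged sketch of overriding the generated connect string. The zk*.example.com host names are hypothetical, and the comma-separated host:port form is the usual ZooKeeper connect string format. How the module treats the embedded ZooKeeper when this parameter is set is not described here, so treat this as an illustration of the parameter only.

```puppet
class { 'nifi':
  cluster                  => true,
  cluster_nodes            => {
    'nifi-1.example.com' => { 'id' => 1 },
    'nifi-2.example.com' => { 'id' => 2 },
    'nifi-3.example.com' => { 'id' => 3 },
  },
  # Hypothetical hosts; 2281 matches the default zookeeper_secure_client_port
  zookeeper_connect_string => 'zk1.example.com:2281,zk2.example.com:2281,zk3.example.com:2281',
}
```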
zookeeper_client_port
Data type: Stdlib::Port::Unprivileged
When clustering NiFi, this port is used by NiFi clustering and state management. It is used for unencrypted communication between the NiFi ZooKeeper client and the embedded ZooKeeper server.
Depending on the module parameter zookeeper_use_secure_client_port, NiFi will use either this port or the port controlled by the parameter zookeeper_secure_client_port.
Default value: 2181
zookeeper_secure_client_port
Data type: Stdlib::Port::Unprivileged
When clustering NiFi, this port is used by NiFi clustering and state management. It is used for encrypted communication between the NiFi ZooKeeper client and the embedded ZooKeeper server.
Depending on the module parameter zookeeper_use_secure_client_port, NiFi will use either this port or the port controlled by the parameter zookeeper_client_port.
Default value: 2281
zookeeper_use_secure_client_port
Data type: Boolean
Controls whether the NiFi cluster will use TLS to connect to the embedded ZooKeeper.
If true, NiFi will use TLS and connect to the zookeeper_secure_client_port.
If false, NiFi will use cleartext communication to connect to ZooKeeper on the zookeeper_client_port.
Default value: true
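A minimal sketch of switching a cluster to the cleartext client port, using only the parameters documented above. The node names are the placeholders from the cluster example, and 2181 is the documented default written out for clarity.

```puppet
class { 'nifi':
  cluster                          => true,
  cluster_nodes                    => {
    'nifi-1.example.com' => { 'id' => 1 },
    'nifi-2.example.com' => { 'id' => 2 },
    'nifi-3.example.com' => { 'id' => 3 },
  },
  # Use cleartext communication with the embedded ZooKeeper
  zookeeper_use_secure_client_port => false,
  # Documented default, shown explicitly
  zookeeper_client_port            => 2181,
}
```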
initial_admin_identity
Data type: Optional[String]
Default value: undef
Data types
Nifi::Property
The Nifi::Property data type.
Alias of
Variant[Boolean, Integer, String, Sensitive[String]]
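To illustrate the alias, the sketch below passes one value of each permitted type in nifi_properties. The property names are taken from this README; the values are examples only.

```puppet
class { 'nifi':
  nifi_properties => {
    'nifi.web.https.host'             => 'nifi.example.com',        # String
    'nifi.web.https.port'             => 8443,                      # Integer
    'nifi.cluster.protocol.is.secure' => true,                      # Boolean
    'nifi.sensitive.props.key'        => Sensitive('example only'), # Sensitive[String]
  },
}
```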
Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
0.8.0 - 2020-01-21
Added
- Add zookeeper_* class parameters. These are used for clustering NiFi using the embedded zookeeper.
Changed
- Permit Sensitive[String] as NiFi property values.
0.7.2 - 2022-01-19
Changed
- Install NiFi version 1.15.3 by default
0.7.1 - 2022-01-19
Fixed
- Update missing REFERENCE for 0.7.0 changes
- Update the puppet/systemd module in documentation and test fixtures
0.7.0 - 2022-01-18
Changed
- Install NiFi version 1.15.2 by default
- Data type validation of NiFi properties. Valid keys are String, and values must be of type Boolean, Integer or String.
Added
- Manage a log directory /var/log/nifi, configurable with the log_directory parameter.
- Manage a configuration directory /opt/nifi/config, configurable with the config_directory parameter. This is used for configuration files intended to survive an upgrade of NiFi.
- Manage state management configuration file state-management.xml in the config_directory. See README to use your own state management configuration.
- Manage authorizations configuration file authorizers.xml in the config_directory to add cluster nodes and optionally an initial admin identity. See README for how to use your own authorization configuration.
- Manage a NiFi cluster with the cluster and cluster_nodes parameters. This enables the built-in zookeeper for cluster state management as well as authorization for cluster nodes. NiFi clustering with this module assumes you have a source of TLS keys, certificates and CA trust. See README for configuring a cluster.
0.6.0 - 2021-12-20
Changed
- Install NiFi version 1.15.1 by default
0.5.0 - 2021-08-09
Changed
- Install NiFi version 1.14.0 by default
0.4.0
Fixed
- Use user and group parameters when managing the install directory instead of hardcoded 'nifi'
- Fix missing parameter var_directory
Changed
- Install NiFi version 1.13.2 by default.
- Update module with PDK 2.1.0
- Adjust upper bounds for dependencies on puppet and modules
- Improve documentation example for basic usage
Added
- Add documentation example for NiFi cluster
0.3.1 - 2020-07-09
Fixed
- Fix syntax error in module metadata (#5)
0.3.0 - 2020-03-17
Added
- Add acceptance testing
Changed
- Install NiFi version 1.11.3 by default.
- Set NiFi local state directory to ${var_directory}/state/local to ensure it survives future NiFi upgrades. To avoid losing state when upgrading to this module version, stop the NiFi service, move /opt/nifi/nifi-${version}/state/local to /var/opt/nifi/state/local, then run Puppet to configure NiFi.
0.2.0 - 2020-02-22
Added
- Management of nifi.properties
Changed
- Default nifi version to install is now 1.11.1.
0.1.1 - 2020-01-22
Changed
- systemd now starts the process as a simple instead of a forking service.
0.1.0 - 2020-01-14
Added
- Initial release.
- Download, install and start Apache NiFi.
Dependencies
- puppetlabs/stdlib (>= 4.13.0 < 8.0.0)
- puppetlabs/inifile (>= 1.0.0 < 6.0.0)
- puppetlabs/java (>= 6.0.0 < 8.0.0)
- puppet/archive (>= 4.0.0 < 6.0.0)
- puppet/systemd (>= 2.0.0 < 4.0.0)