Puppet Class: profile::analytics::refinery::job::test::gobblin

Defined in:
modules/profile/manifests/analytics/refinery/job/test/gobblin.pp

Overview

SPDX-License-Identifier: Apache-2.0

Class profile::analytics::refinery::job::test::gobblin

Declares Gobblin jobs that import data from Kafka into Hadoop. (Gobblin is a replacement for Camus.)

These jobs will eventually be moved to Airflow.

Parameters

gobblin_jar_file

Path to the shaded jar used to launch Gobblin. Set this in your role hiera to a versioned gobblin-wmf jar, which is usually deployed alongside the analytics/refinery artifacts.

ensure_timers

Set this parameter to disable the Gobblin test jobs, effectively pausing ingestion into Hadoop. This may be necessary for short periods, such as during HDFS maintenance work.

Parameters:

  • gobblin_jar_file (Stdlib::Unixpath) (defaults to: lookup('profile::analytics::refinery::job::test::gobblin_jar_file'))
  • ensure_timers (String) (defaults to: lookup('profile::analytics::refinery::job::test::gobblin::ensure_timers', { 'default_value' => 'present' }))
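
To make the two parameters concrete, the following is a minimal sketch of a resource-like declaration that pauses the test jobs. The jar path is a hypothetical placeholder, and 'absent' is an assumed value for ensure_timers (the source only documents the default, 'present'). In practice both values would normally be set through hiera under the lookup keys listed above rather than declared inline.

```puppet
# Sketch only: values below are illustrative assumptions, not taken from the manifest.
class { 'profile::analytics::refinery::job::test::gobblin':
    # Hypothetical placeholder path; point this at the versioned gobblin-wmf shaded jar.
    gobblin_jar_file => '/path/to/gobblin-wmf-shaded.jar',
    # Assumed value for disabling the timers; the documented default is 'present'.
    ensure_timers    => 'absent',
}
```

Because both parameters default to lookup() calls, an explicit declaration like this overrides whatever hiera would otherwise supply.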


# File 'modules/profile/manifests/analytics/refinery/job/test/gobblin.pp', line 20

class profile::analytics::refinery::job::test::gobblin(
    Stdlib::Unixpath $gobblin_jar_file = lookup('profile::analytics::refinery::job::test::gobblin_jar_file'),
    String $ensure_timers = lookup('profile::analytics::refinery::job::test::gobblin::ensure_timers', { 'default_value' => 'present' }),
) {
    require ::profile::analytics::refinery
    $refinery_path = $::profile::analytics::refinery::path

    # analytics-test-hadoop gobblin jobs should all use analytics-test-hadoop.sysconfig.properties.
    Profile::Analytics::Refinery::Job::Gobblin_job {
        sysconfig_properties_file => "${refinery_path}/gobblin/common/analytics-test-hadoop.sysconfig.properties",
        gobblin_jar_file          => $gobblin_jar_file,
    }

    # Intervals use systemd calendar syntax. This one fires every 10 minutes.
    profile::analytics::refinery::job::gobblin_job { 'webrequest_test':
        interval => '*-*-* *:00/10:00',
    }

    # Hourly, at 5 minutes past the hour.
    profile::analytics::refinery::job::gobblin_job { 'event_default_test':
        interval => '*-*-* *:05:00',
    }

    # Hourly, at 10 minutes past the hour.
    profile::analytics::refinery::job::gobblin_job { 'eventlogging_legacy_test':
        interval => '*-*-* *:10:00',
    }
}