A B C D E F G I J K L M N O P R S T V W 

A

addSingleTaskWriterOutputToExistingDir(Path, Path, WorkUnitState, int, ParallelRunner) - Method in class org.wikimedia.gobblin.copy.BaseDataPublisher
 
addWriterOutputToExistingDir(Path, Path, WorkUnitState, int, ParallelRunner) - Method in class org.wikimedia.gobblin.copy.BaseDataPublisher
 
addWriterOutputToExistingDir(Path, Path, WorkUnitState, int, ParallelRunner) - Method in class org.wikimedia.gobblin.copy.TimePartitionedDataPublisher
This method needs to be overridden for TimePartitionedDataPublisher: because the output folder structure contains a timestamp, the files have to be moved recursively.
addWriterOutputToNewDir(Path, Path, WorkUnitState, int, ParallelRunner) - Method in class org.wikimedia.gobblin.copy.BaseDataPublisher
 
addWriterOutputToNewDir(Path, Path, WorkUnitState, int, ParallelRunner) - Method in class org.wikimedia.gobblin.copy.TimePartitionedDataPublisher
This method needs to be overridden for TimePartitionedDataPublisher: because the output folder structure contains a timestamp, the files have to be moved recursively.
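As an illustration of the recursive move described in the two TimePartitionedDataPublisher entries above, here is a minimal sketch using java.nio.file on a local file system. It is not the Gobblin implementation, which works with Hadoop Path objects and a ParallelRunner. The names RecursiveMoveSketch, moveFilesRecursively and demo are hypothetical and exist only for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class RecursiveMoveSketch {

    // Moves every regular file under writerOutput to the same relative
    // location under publisherOutput, creating parent directories as needed.
    // (Hypothetical helper; local file system only.)
    public static void moveFilesRecursively(Path writerOutput, Path publisherOutput) {
        try {
            List<Path> files;
            try (Stream<Path> walk = Files.walk(writerOutput)) {
                files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
            }
            for (Path file : files) {
                // Keep the time-bucket sub-path (e.g. 2015/04/08/15) intact.
                Path target = publisherOutput.resolve(writerOutput.relativize(file));
                Files.createDirectories(target.getParent());
                Files.move(file, target, StandardCopyOption.REPLACE_EXISTING);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a throwaway directory tree, moves it, and reports whether the
    // file ended up under the same time-bucket path in the destination.
    public static boolean demo() {
        try {
            Path src = Files.createTempDirectory("writer-output");
            Path dst = Files.createTempDirectory("publisher-output");
            Path bucket = src.resolve("2015/04/08/15");
            Files.createDirectories(bucket);
            Files.write(bucket.resolve("part-0.txt"), "record\n".getBytes());
            moveFilesRecursively(src, dst);
            return Files.exists(dst.resolve("2015/04/08/15/part-0.txt"))
                && !Files.exists(bucket.resolve("part-0.txt"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("moved: " + demo());
    }
}
```

A flat move of the top-level directory entries would not be enough here, because the files sit several levels down inside the time-bucket folders.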
ALL_TOPICS - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
AVG_RECORD_MILLIS - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 

B

BaseDataPublisher - Class in org.wikimedia.gobblin.copy
A basic implementation of SingleTaskDataPublisher that publishes the data from the writer output directory to the final output directory.
BaseDataPublisher(State) - Constructor for class org.wikimedia.gobblin.copy.BaseDataPublisher
 
BOOTSTRAP_WITH_OFFSET - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
build() - Method in class org.wikimedia.gobblin.writer.SimpleStringWriterBuilder
Build a DataWriter.
build() - Method in class org.wikimedia.gobblin.writer.TimestampedStringRecordDataWriterBuilder
Build a DataWriter.
buildEncoders() - Method in class org.wikimedia.gobblin.writer.SimpleStringWriterBuilder
 
buildStreamCompressor(Map<String, Object>) - Static method in class org.wikimedia.gobblin.compression.CompressionFactory
 
bytesWritten() - Method in class org.wikimedia.gobblin.writer.SimpleStringWriter
Get the number of bytes written.
bytesWritten() - Method in class org.wikimedia.gobblin.writer.TimestampedRecordWriterWrapper
 

C

cleanup() - Method in class org.wikimedia.gobblin.writer.TimestampedRecordWriterWrapper
 
close() - Method in class org.wikimedia.gobblin.copy.BaseDataPublisher
 
close() - Method in class org.wikimedia.gobblin.copy.Kafka1ConsumerClient
 
close() - Method in class org.wikimedia.gobblin.publisher.TimePartitionedFlagDataPublisher
 
close() - Method in class org.wikimedia.gobblin.writer.TimestampedRecordWriterWrapper
 
closer - Variable in class org.wikimedia.gobblin.copy.BaseDataPublisher
 
commit() - Method in class org.wikimedia.gobblin.writer.TimestampedRecordWriterWrapper
 
commitOffsetsAsync(Map<KafkaPartition, Long>) - Method in class org.wikimedia.gobblin.copy.Kafka1ConsumerClient
Commit offsets to Kafka asynchronously.
commitOffsetsSync(Map<KafkaPartition, Long>) - Method in class org.wikimedia.gobblin.copy.Kafka1ConsumerClient
Commit offsets to Kafka synchronously.
committed(KafkaPartition) - Method in class org.wikimedia.gobblin.copy.Kafka1ConsumerClient
Returns the last committed offset for a KafkaPartition.
CompressionFactory - Class in org.wikimedia.gobblin.compression
Duplication of CompressionFactory to use a modified instance of GzipCodec generating '.gz' extensions instead of '.gzip'.
consume(KafkaPartition, long, long) - Method in class org.wikimedia.gobblin.copy.Kafka1ConsumerClient
 
consume() - Method in class org.wikimedia.gobblin.copy.Kafka1ConsumerClient
 
CONVERTER_TIMESTAMPED_RECORD_WRAPPED - Static variable in class org.wikimedia.gobblin.converter.TimestampedRecordConverterWrapper
 
convertRecord(String, TimestampedRecord<I>, WorkUnitState) - Method in class org.wikimedia.gobblin.converter.TimestampedRecordConverterWrapper
 
convertSchema(String, WorkUnitState) - Method in class org.wikimedia.gobblin.converter.TimestampedRecordConverterWrapper
 
create(Config) - Method in class org.wikimedia.gobblin.copy.Kafka1ConsumerClient.Factory
 
createDestinationDescriptor(WorkUnitState, int) - Method in class org.wikimedia.gobblin.copy.BaseDataPublisher
Create the destination dataset descriptor.
CURRENT_TIMESTAMP_AS_DEFAULT_KEY - Static variable in class org.wikimedia.gobblin.kafka.Kafka1TimestampedRecordExtractor
 

D

decodeKafkaMessage(KafkaConsumerRecord) - Method in class org.wikimedia.gobblin.kafka.Kafka1TimestampedRecordExtractor
We need to override this protected method because the Kafka timestamp information is not available in the abstract method KafkaExtractor.decodeRecord(ByteArrayBasedKafkaRecord).
decodeRecord(ByteArrayBasedKafkaRecord) - Method in class org.wikimedia.gobblin.kafka.Kafka1TimestampedRecordExtractor
Unreachable, as KafkaExtractor.decodeKafkaMessage(KafkaConsumerRecord) is overridden so that it does not use this method.
DEFAULT_BOOTSTRAP_WITH_OFFSET - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
DEFAULT_GOBBLIN_KAFKA_CONSUMER_CLIENT_FACTORY_CLASS - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
DEFAULT_GOBBLIN_KAFKA_SHOULD_ENABLE_DATASET_STATESTORE - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
DEFAULT_KEY_DESERIALIZER - Static variable in class org.wikimedia.gobblin.copy.Kafka1ConsumerClient
 
DEFAULT_MAX_POSSIBLE_OBSERVED_LATENCY_IN_HOURS - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
DEFAULT_NAMESPACE_NAME - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
DEFAULT_OBSERVED_LATENCY_MEASUREMENT_ENABLED - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
DEFAULT_OBSERVED_LATENCY_PRECISION - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
DEFAULT_PUBLISHER_TIME_PARTITION_FLAG - Static variable in class org.wikimedia.gobblin.publisher.TimePartitionedFlagDataPublisher
 
DEFAULT_RESET_ON_OFFSET_OUT_OF_RANGE - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
DEFAULT_TABLE_TYPE - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
DEFAULT_TIMESTAMP_COLUMN - Static variable in class org.wikimedia.gobblin.writer.partitioner.JsonStringTimeBasedWriterPartitioner
 
DEFAULT_TIMESTAMP_FORMAT - Static variable in class org.wikimedia.gobblin.writer.partitioner.JsonStringTimeBasedWriterPartitioner
 

E

EARLIEST_OFFSET - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
EXTRACTED_TIMESTAMP_TYPES_KEY - Static variable in class org.wikimedia.gobblin.kafka.Kafka1TimestampedRecordExtractor
 

F

Factory() - Constructor for class org.wikimedia.gobblin.copy.Kafka1ConsumerClient.Factory
 
flagPublisher - Variable in class org.wikimedia.gobblin.publisher.TimePartitionedDataPublisherWithFlag
 
flush() - Method in class org.wikimedia.gobblin.writer.SimpleStringWriter
Flush the staging file.
flush() - Method in class org.wikimedia.gobblin.writer.TimestampedRecordWriterWrapper
 

G

getEarliestOffset(KafkaPartition) - Method in class org.wikimedia.gobblin.copy.Kafka1ConsumerClient
 
getExtractor(WorkUnitState) - Method in class org.wikimedia.gobblin.kafka.Kafka1TimestampedRecordSource
 
getFilteredTopics(SourceState) - Method in class org.wikimedia.gobblin.copy.KafkaSource
Return topics to be processed filtered by job-level whitelist and blacklist.
getFilteredTopics(SourceState) - Method in class org.wikimedia.gobblin.kafka.WikimediaKafkaSource
Finds Kafka topics to ingest.
getKey() - Method in class org.wikimedia.gobblin.copy.Kafka1ConsumerClient.Kafka1ConsumerRecord
 
getLatestOffset(KafkaPartition) - Method in class org.wikimedia.gobblin.copy.Kafka1ConsumerClient
 
getMetrics() - Method in class org.wikimedia.gobblin.copy.Kafka1ConsumerClient
 
getNonKafkaTimestamp(String) - Method in class org.wikimedia.gobblin.writer.partitioner.TimestampedRecordOrJsonStringTimeBasedWriterPartitioner
 
getNonKafkaTimestamp(P) - Method in class org.wikimedia.gobblin.writer.partitioner.TimestampedRecordTimeBasedWriterPartitioner
 
getOutputFilePath() - Method in class org.wikimedia.gobblin.writer.TimestampedRecordWriterWrapper
 
getPublisherFileSystem(State) - Static method in class org.wikimedia.gobblin.publisher.TimePartitionedFlagDataPublisher
 
getPublisherOutputDir(WorkUnitState, int) - Method in class org.wikimedia.gobblin.copy.BaseDataPublisher
Get the output directory path this BaseDataPublisher will write to.
getRecordTimestamp(String) - Method in class org.wikimedia.gobblin.utils.JsonStringTimestampExtractor
 
getRecordTimestamp(String) - Method in class org.wikimedia.gobblin.writer.partitioner.JsonStringTimeBasedWriterPartitioner
 
getRecordTimestamp(TimestampedRecord<P>) - Method in class org.wikimedia.gobblin.writer.partitioner.TimestampedRecordTimeBasedWriterPartitioner
 
getSchema() - Method in class org.wikimedia.gobblin.kafka.Kafka1TimestampedRecordExtractor
 
getTag() - Method in class org.wikimedia.gobblin.compression.GzipCodec
 
getTimestamp() - Method in class org.wikimedia.gobblin.copy.Kafka1ConsumerClient.Kafka1ConsumerRecord
 
getTimestampType() - Method in class org.wikimedia.gobblin.copy.Kafka1ConsumerClient.Kafka1ConsumerRecord
 
getTopics() - Method in class org.wikimedia.gobblin.copy.Kafka1ConsumerClient
 
getValue() - Method in class org.wikimedia.gobblin.copy.Kafka1ConsumerClient.Kafka1ConsumerRecord
 
getWorkunits(SourceState) - Method in class org.wikimedia.gobblin.copy.KafkaSource
 
getWriterPartitionerTimestampColumns(State, int, int) - Static method in class org.wikimedia.gobblin.writer.partitioner.JsonStringTimeBasedWriterPartitioner
Utility function for getting a timestamp column.
getWriterPartitionerTimestampFormat(State, int, int) - Static method in class org.wikimedia.gobblin.writer.partitioner.JsonStringTimeBasedWriterPartitioner
Utility function for getting a timestamp format.
GOBBLIN_CONFIG_KEY_DESERIALIZER_CLASS_KEY - Static variable in class org.wikimedia.gobblin.copy.Kafka1ConsumerClient
 
GOBBLIN_CONFIG_VALUE_DESERIALIZER_CLASS_KEY - Static variable in class org.wikimedia.gobblin.copy.Kafka1ConsumerClient
 
GOBBLIN_KAFKA_CONSUMER_CLIENT_FACTORY_CLASS - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
GOBBLIN_KAFKA_EXTRACT_ALLOW_TABLE_TYPE_NAMESPACE_CUSTOMIZATION - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
GOBBLIN_KAFKA_SHOULD_ENABLE_DATASET_STATESTORE - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
GzipCodec - Class in org.wikimedia.gobblin.compression
Gzip codec using a '.gz' extension instead of '.gzip'.
GzipCodec() - Constructor for class org.wikimedia.gobblin.compression.GzipCodec
 

I

init(WorkUnitState) - Method in class org.wikimedia.gobblin.converter.TimestampedRecordConverterWrapper
 
initialize() - Method in class org.wikimedia.gobblin.copy.BaseDataPublisher
 
initialize() - Method in class org.wikimedia.gobblin.publisher.TimePartitionedFlagDataPublisher
Deprecated.
isSpeculativeAttemptSafe() - Method in class org.wikimedia.gobblin.writer.SimpleStringWriter
 
isTimestampLogAppend() - Method in class org.wikimedia.gobblin.copy.Kafka1ConsumerClient.Kafka1ConsumerRecord
 
isTopicQualified(KafkaTopic) - Method in class org.wikimedia.gobblin.copy.KafkaSource
Whether a KafkaTopic is qualified to be pulled.

J

JsonStringTimeBasedWriterPartitioner - Class in org.wikimedia.gobblin.writer.partitioner
A TimeBasedWriterPartitioner for byte[] containing json.
JsonStringTimeBasedWriterPartitioner(State) - Constructor for class org.wikimedia.gobblin.writer.partitioner.JsonStringTimeBasedWriterPartitioner
 
JsonStringTimeBasedWriterPartitioner(State, int, int) - Constructor for class org.wikimedia.gobblin.writer.partitioner.JsonStringTimeBasedWriterPartitioner
 
JsonStringTimestampExtractor - Class in org.wikimedia.gobblin.utils
A utility class parsing JSON from byte[] and extracting a timestamp from the data.
JsonStringTimestampExtractor(Collection<String>, String) - Constructor for class org.wikimedia.gobblin.utils.JsonStringTimestampExtractor
 
JsonStringTimestampExtractor.TimestampFormat - Enum in org.wikimedia.gobblin.utils
 

K

Kafka1ConsumerClient<K,V> - Class in org.wikimedia.gobblin.copy
A GobblinKafkaConsumerClient that uses the Kafka 1.1 consumer client.
Kafka1ConsumerClient(Config, Consumer<K, V>) - Constructor for class org.wikimedia.gobblin.copy.Kafka1ConsumerClient
 
Kafka1ConsumerClient.Factory - Class in org.wikimedia.gobblin.copy
A factory class to instantiate Kafka1ConsumerClient.
Kafka1ConsumerClient.Kafka1ConsumerRecord<K,V> - Class in org.wikimedia.gobblin.copy
A record returned by Kafka1ConsumerClient.
Kafka1ConsumerRecord(ConsumerRecord<K, V>) - Constructor for class org.wikimedia.gobblin.copy.Kafka1ConsumerClient.Kafka1ConsumerRecord
 
Kafka1TimestampedRecordExtractor<P> - Class in org.wikimedia.gobblin.kafka
Extracts TimestampedRecord records from Kafka using Kafka1 client.
Kafka1TimestampedRecordExtractor(WorkUnitState) - Constructor for class org.wikimedia.gobblin.kafka.Kafka1TimestampedRecordExtractor
 
Kafka1TimestampedRecordSource - Class in org.wikimedia.gobblin.kafka
 
Kafka1TimestampedRecordSource() - Constructor for class org.wikimedia.gobblin.kafka.Kafka1TimestampedRecordSource
 
KAFKA_OFFSET_LOOKBACK - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
kafkaConsumerClient - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
kafkaConsumerClientPool - Variable in class org.wikimedia.gobblin.copy.KafkaSource
 
KafkaSource<S,D> - Class in org.wikimedia.gobblin.copy
A Source implementation for Kafka source.
KafkaSource() - Constructor for class org.wikimedia.gobblin.copy.KafkaSource
 

L

LATEST_OFFSET - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
LEADER_HOSTANDPORT - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
LEADER_ID - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
lineageInfo - Variable in class org.wikimedia.gobblin.copy.BaseDataPublisher
 
lineageInfo - Variable in class org.wikimedia.gobblin.copy.KafkaSource
 

M

MAX_POSSIBLE_OBSERVED_LATENCY_IN_HOURS - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
metadataMergers - Variable in class org.wikimedia.gobblin.copy.BaseDataPublisher
 
metaDataWriterFileSystemByBranches - Variable in class org.wikimedia.gobblin.copy.BaseDataPublisher
 
movePath(ParallelRunner, State, Path, Path, int) - Method in class org.wikimedia.gobblin.copy.BaseDataPublisher
 

N

NEAREST_OFFSET - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
NUM_TOPIC_PARTITIONS - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
numBranches - Variable in class org.wikimedia.gobblin.copy.BaseDataPublisher
 

O

OBSERVED_LATENCY_MEASUREMENT_ENABLED - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
OBSERVED_LATENCY_PRECISION - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
OFFSET_FETCH_EPOCH_TIME - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
OFFSET_FETCH_TIMER - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
OFFSET_LOOKBACK - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
org.wikimedia.gobblin - package org.wikimedia.gobblin
 
org.wikimedia.gobblin.compression - package org.wikimedia.gobblin.compression
 
org.wikimedia.gobblin.converter - package org.wikimedia.gobblin.converter
 
org.wikimedia.gobblin.copy - package org.wikimedia.gobblin.copy
 
org.wikimedia.gobblin.kafka - package org.wikimedia.gobblin.kafka
 
org.wikimedia.gobblin.publisher - package org.wikimedia.gobblin.publisher
 
org.wikimedia.gobblin.utils - package org.wikimedia.gobblin.utils
 
org.wikimedia.gobblin.writer - package org.wikimedia.gobblin.writer
 
org.wikimedia.gobblin.writer.partitioner - package org.wikimedia.gobblin.writer.partitioner
 

P

parallelRunnerCloser - Variable in class org.wikimedia.gobblin.copy.BaseDataPublisher
 
parallelRunners - Variable in class org.wikimedia.gobblin.copy.BaseDataPublisher
 
parallelRunnerThreads - Variable in class org.wikimedia.gobblin.copy.BaseDataPublisher
 
ParallelRunnerWithTouch - Class in org.wikimedia.gobblin.publisher
 
ParallelRunnerWithTouch(int, FileSystem) - Constructor for class org.wikimedia.gobblin.publisher.ParallelRunnerWithTouch
 
PARTITION_ID - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
permissions - Variable in class org.wikimedia.gobblin.copy.BaseDataPublisher
 
populateClientPool(int, GobblinKafkaConsumerClient.GobblinKafkaConsumerClientFactory, Config) - Method in class org.wikimedia.gobblin.copy.KafkaSource
 
PREVIOUS_HIGH_WATERMARK - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
PREVIOUS_LATEST_OFFSET - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
PREVIOUS_LOW_WATERMARK - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
PREVIOUS_OFFSET_FETCH_EPOCH_TIME - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
PREVIOUS_START_FETCH_EPOCH_TIME - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
PREVIOUS_STOP_FETCH_EPOCH_TIME - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
publish(Collection<? extends WorkUnitState>) - Method in class org.wikimedia.gobblin.publisher.TimePartitionedDataPublisherWithFlag
 
publishData(WorkUnitState) - Method in class org.wikimedia.gobblin.copy.BaseDataPublisher
 
publishData(Collection<? extends WorkUnitState>) - Method in class org.wikimedia.gobblin.copy.BaseDataPublisher
 
publishData(WorkUnitState, int, boolean, Set<Path>) - Method in class org.wikimedia.gobblin.copy.BaseDataPublisher
 
publishData(Collection<? extends WorkUnitState>) - Method in class org.wikimedia.gobblin.publisher.TimePartitionedFlagDataPublisher
 
PUBLISHER_PUBLISHED_FLAGS_KEY - Static variable in class org.wikimedia.gobblin.publisher.TimePartitionedFlagDataPublisher
 
PUBLISHER_TIME_PARTITION_FLAG_KEY - Static variable in class org.wikimedia.gobblin.publisher.TimePartitionedFlagDataPublisher
 
publisherFileSystemByBranches - Variable in class org.wikimedia.gobblin.copy.BaseDataPublisher
 
publisherFinalDirOwnerGroupsByBranches - Variable in class org.wikimedia.gobblin.copy.BaseDataPublisher
 
publisherOutputDirs - Variable in class org.wikimedia.gobblin.copy.BaseDataPublisher
 
publishMetadata(Collection<? extends WorkUnitState>) - Method in class org.wikimedia.gobblin.copy.BaseDataPublisher
Merge all of the metadata output from each work-unit and publish the merged record.
publishMetadata(WorkUnitState) - Method in class org.wikimedia.gobblin.copy.BaseDataPublisher
Publish metadata for each branch.
publishMetadata(Collection<? extends WorkUnitState>) - Method in class org.wikimedia.gobblin.publisher.TimePartitionedFlagDataPublisher
 
publishMultiTaskData(WorkUnitState, int, Set<Path>) - Method in class org.wikimedia.gobblin.copy.BaseDataPublisher
This method publishes task output data for the given WorkUnitState, but if there are output data of other tasks in the same folder, it may also publish those data.

R

RECORD_CREATION_TIMESTAMP_FIELD - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
RECORD_CREATION_TIMESTAMP_UNIT - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
RECORD_LEVEL_SLA_MINUTES_KEY - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
recordPublisherOutputDirs(Path, Path, int) - Method in class org.wikimedia.gobblin.copy.BaseDataPublisher
 
recordsWritten() - Method in class org.wikimedia.gobblin.writer.SimpleStringWriter
Get the number of records written.
recordsWritten() - Method in class org.wikimedia.gobblin.writer.TimestampedRecordWriterWrapper
 
RESET_ON_OFFSET_OUT_OF_RANGE - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
retrierConfig - Variable in class org.wikimedia.gobblin.copy.BaseDataPublisher
 

S

settingsListToMap(Collection<String>) - Method in class org.wikimedia.gobblin.kafka.WikimediaKafkaSource
 
shouldPublishMetadataFirst() - Method in class org.wikimedia.gobblin.copy.BaseDataPublisher
The BaseDataPublisher relies on publishData() to create and clean-up the output directories, so data has to be published before the metadata can be.
shouldRetry - Variable in class org.wikimedia.gobblin.copy.BaseDataPublisher
 
shutdown(SourceState) - Method in class org.wikimedia.gobblin.copy.KafkaSource
 
SimpleStringWriter - Class in org.wikimedia.gobblin.writer
Copied and updated from SimpleDataWriter.
SimpleStringWriter(SimpleStringWriterBuilder, State) - Constructor for class org.wikimedia.gobblin.writer.SimpleStringWriter
 
SimpleStringWriterBuilder - Class in org.wikimedia.gobblin.writer
A DataWriterBuilder for building DataWriter that writes String.
SimpleStringWriterBuilder() - Constructor for class org.wikimedia.gobblin.writer.SimpleStringWriterBuilder
 
START_FETCH_EPOCH_TIME - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
STOP_FETCH_EPOCH_TIME - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
subscribe(String) - Method in class org.wikimedia.gobblin.copy.Kafka1ConsumerClient
Subscribe to a Kafka topic. TODO: add multi-topic support.
subscribe(String, GobblinConsumerRebalanceListener) - Method in class org.wikimedia.gobblin.copy.Kafka1ConsumerClient
Subscribe to a Kafka topic with a GobblinConsumerRebalanceListener. TODO: add multi-topic support.

T

TAG - Static variable in class org.wikimedia.gobblin.compression.GzipCodec
 
TimePartitionedDataPublisher - Class in org.wikimedia.gobblin.copy
For time partition jobs, writer output directory is $GOBBLIN_WORK_DIR/task-output/{extractId}/{tableName}/{partitionPath}, where partition path is the time bucket, e.g., 2015/04/08/15.
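The directory layout in the entry above can be sketched as follows. This is a minimal illustration, assuming an hourly bucket formatted as yyyy/MM/dd/HH in UTC; the time zone, the class name PartitionPathSketch, the helper names, and the sample work directory and extract ID are assumptions for the example, not values taken from the library.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class PartitionPathSketch {

    // Hourly time bucket, e.g. 2015/04/08/15 (UTC is an assumption here).
    private static final DateTimeFormatter HOURLY =
        DateTimeFormatter.ofPattern("yyyy/MM/dd/HH").withZone(ZoneOffset.UTC);

    // Formats a record timestamp (epoch milliseconds) as a partition path.
    public static String partitionPath(long epochMillis) {
        return HOURLY.format(Instant.ofEpochMilli(epochMillis));
    }

    // Assembles {workDir}/task-output/{extractId}/{tableName}/{partitionPath}.
    public static String writerOutputDir(String workDir, String extractId,
                                         String tableName, long epochMillis) {
        return String.join("/", workDir, "task-output", extractId, tableName,
            partitionPath(epochMillis));
    }

    public static void main(String[] args) {
        long ts = Instant.parse("2015-04-08T15:30:00Z").toEpochMilli();
        // prints /tmp/gobblin-work/task-output/extract_1/my_table/2015/04/08/15
        System.out.println(writerOutputDir("/tmp/gobblin-work", "extract_1", "my_table", ts));
    }
}
```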
TimePartitionedDataPublisher(State) - Constructor for class org.wikimedia.gobblin.copy.TimePartitionedDataPublisher
 
TimePartitionedDataPublisherWithFlag - Class in org.wikimedia.gobblin.publisher
 
TimePartitionedDataPublisherWithFlag(State) - Constructor for class org.wikimedia.gobblin.publisher.TimePartitionedDataPublisherWithFlag
 
TimePartitionedFlagDataPublisher - Class in org.wikimedia.gobblin.publisher
IMPORTANT NOTES:
- Gobblin published folders are expected to be in the form {PUBLISHERDIR}/{TABLE_NAME}/{PARTITION}. This means that the "writer.file.path.type" property is expected to be "tablename".
- The Gobblin writer partition scheme is expected to follow time order when sorted alphabetically.
Publisher output flag path: {PUBLISHERDIR}/{TABLE_NAME}/{PARTITION}/{FLAG}
For topics having crossed time-partitions boundary across all their kafka-partitions.
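The two expectations in the notes above (the flag path layout, and partition names that sort alphabetically in time order) can be illustrated with a small sketch. It does not reproduce the publisher's logic. The class FlagPathSketch, its helper methods, and the flag name "_FLAG" are hypothetical and used only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class FlagPathSketch {

    // Assembles {PUBLISHERDIR}/{TABLE_NAME}/{PARTITION}/{FLAG}.
    public static String flagPath(String publisherDir, String tableName,
                                  String partition, String flag) {
        return String.join("/", publisherDir, tableName, partition, flag);
    }

    // Returns the partitions in plain alphabetical (lexicographic) order.
    public static List<String> sortedCopy(List<String> partitions) {
        List<String> copy = new ArrayList<>(partitions);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        // Zero-padded partitions: alphabetical order matches time order.
        // prints [2021/01/09/23, 2021/01/10/00]
        System.out.println(sortedCopy(Arrays.asList("2021/01/10/00", "2021/01/09/23")));

        // Unpadded partitions: alphabetical order no longer matches time order.
        // prints [2021/1/10, 2021/1/9]
        System.out.println(sortedCopy(Arrays.asList("2021/1/9", "2021/1/10")));

        // prints /data/out/my_table/2021/01/09/23/_FLAG
        System.out.println(flagPath("/data/out", "my_table", "2021/01/09/23", "_FLAG"));
    }
}
```

The unpadded case shows why the partition scheme matters: a scheme without fixed-width fields breaks the alphabetical ordering the notes rely on.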
TimePartitionedFlagDataPublisher(State) - Constructor for class org.wikimedia.gobblin.publisher.TimePartitionedFlagDataPublisher
 
TIMESTAMP_COLUMNS_KEY - Static variable in class org.wikimedia.gobblin.writer.partitioner.JsonStringTimeBasedWriterPartitioner
 
TIMESTAMP_FORMAT_KEY - Static variable in class org.wikimedia.gobblin.writer.partitioner.JsonStringTimeBasedWriterPartitioner
 
TimestampedRecord<P> - Class in org.wikimedia.gobblin
 
TimestampedRecord(P, Optional<Long>) - Constructor for class org.wikimedia.gobblin.TimestampedRecord
 
TimestampedRecordConverterWrapper<I,O> - Class in org.wikimedia.gobblin.converter
 
TimestampedRecordConverterWrapper() - Constructor for class org.wikimedia.gobblin.converter.TimestampedRecordConverterWrapper
 
TimestampedRecordOrJsonStringTimeBasedWriterPartitioner - Class in org.wikimedia.gobblin.writer.partitioner
A TimeBasedWriterPartitioner for Timestamped byte[] records whose payload contains json.
TimestampedRecordOrJsonStringTimeBasedWriterPartitioner(State) - Constructor for class org.wikimedia.gobblin.writer.partitioner.TimestampedRecordOrJsonStringTimeBasedWriterPartitioner
 
TimestampedRecordOrJsonStringTimeBasedWriterPartitioner(State, int, int) - Constructor for class org.wikimedia.gobblin.writer.partitioner.TimestampedRecordOrJsonStringTimeBasedWriterPartitioner
 
TimestampedRecordTimeBasedWriterPartitioner<P> - Class in org.wikimedia.gobblin.writer.partitioner
A TimeBasedWriterPartitioner for byte[] containing json.
TimestampedRecordTimeBasedWriterPartitioner(State) - Constructor for class org.wikimedia.gobblin.writer.partitioner.TimestampedRecordTimeBasedWriterPartitioner
 
TimestampedRecordTimeBasedWriterPartitioner(State, int, int) - Constructor for class org.wikimedia.gobblin.writer.partitioner.TimestampedRecordTimeBasedWriterPartitioner
 
TimestampedRecordWriterWrapper<P> - Class in org.wikimedia.gobblin.writer
Copied and updated from SimpleDataWriter.
TimestampedRecordWriterWrapper(DataWriter<P>) - Constructor for class org.wikimedia.gobblin.writer.TimestampedRecordWriterWrapper
Initialize a new metadata wrapper.
TimestampedStringRecordDataWriterBuilder - Class in org.wikimedia.gobblin.writer
A DataWriterBuilder for building DataWriter that writes a TimestampedRecord containing a byte[] payload.
TimestampedStringRecordDataWriterBuilder() - Constructor for class org.wikimedia.gobblin.writer.TimestampedStringRecordDataWriterBuilder
 
TOPIC_BLACKLIST - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
TOPIC_NAME - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
TOPIC_WHITELIST - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
TOPICS_MOVE_TO_LATEST_OFFSET - Static variable in class org.wikimedia.gobblin.copy.KafkaSource
 
touchPath(Path) - Method in class org.wikimedia.gobblin.publisher.ParallelRunnerWithTouch
 

V

valueOf(String) - Static method in enum org.wikimedia.gobblin.utils.JsonStringTimestampExtractor.TimestampFormat
Returns the enum constant of this type with the specified name.
values() - Static method in enum org.wikimedia.gobblin.utils.JsonStringTimestampExtractor.TimestampFormat
Returns an array containing the constants of this enum type, in the order they are declared.

W

WikimediaKafkaSource<S,D> - Class in org.wikimedia.gobblin.kafka
 
WikimediaKafkaSource() - Constructor for class org.wikimedia.gobblin.kafka.WikimediaKafkaSource
 
wrappedWriter - Variable in class org.wikimedia.gobblin.writer.TimestampedRecordWriterWrapper
 
write(String) - Method in class org.wikimedia.gobblin.writer.SimpleStringWriter
Write a source record to the staging file.
write(TimestampedRecord<P>) - Method in class org.wikimedia.gobblin.writer.TimestampedRecordWriterWrapper
 
writerFileSystemByBranches - Variable in class org.wikimedia.gobblin.copy.BaseDataPublisher
 

Copyright © 2021. All rights reserved.