Posted to dev@gobblin.apache.org by "Azmal Sheik (JIRA)" <ji...@apache.org> on 2017/11/23 07:02:00 UTC

[jira] [Comment Edited] (GOBBLIN-321) CSV to HDFS ISSUE

    [ https://issues.apache.org/jira/browse/GOBBLIN-321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16263892#comment-16263892 ] 

Azmal Sheik edited comment on GOBBLIN-321 at 11/23/17 7:01 AM:
---------------------------------------------------------------

The gobblin dist is version 10, but I loaded the version 11 core and core-base jars into lib.
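
One way to check for this kind of mix-up is to list every jar in the lib directory that actually contains the class in question. The following is only a minimal sketch, not part of Gobblin: the function name `jars_containing` and the script name used below are hypothetical, and it assumes the jars sit directly in one directory. Zero hits means the class is not on that path at all; hits in jars from different releases suggest mixed versions.

```python
import sys
import zipfile
from pathlib import Path


def jars_containing(lib_dir, class_name):
    """Return the names of jars directly under lib_dir that contain class_name."""
    # A class a.b.C is stored inside a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that carry a .jar suffix but are not valid archives.
            continue
    return hits


if __name__ == "__main__":
    # Usage: python find_class.py <lib_dir> <fully.qualified.ClassName>
    for name in jars_containing(sys.argv[1], sys.argv[2]):
        print(name)
```

For the job in this issue, it could be run against the directory named in job.jars, for example `python find_class.py /home/ndxmetadata/Ravi/Gobblin/gobblin-dist/lib gobblin.source.extractor.filebased.TextFileBasedSource`.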


was (Author: sheik5azmal):
10

> CSV to HDFS ISSUE
> -----------------
>
>                 Key: GOBBLIN-321
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-321
>             Project: Apache Gobblin
>          Issue Type: Bug
>            Reporter: Azmal Sheik
>            Priority: Critical
>              Labels: beginner, newbie, starter
>         Attachments: gobblin-current.log, job.txt
>
>
> I was trying to load CSV file data into HDFS with the job configuration below, but I'm getting a class-not-found error. I checked lib/gobblin-core.jar and the class TextFileBasedSource is present, yet the job still reports that the class was not found.
> Can anyone help here?
> Here are the job and the logs.
> *JOB:*
> ###################### job configuration file ######################
> job.name=json-gobblin-hdfs
> job.group=Gobblin-Json-Demo
> job.description=Publishing JSON data from files to HDFS in Avro format.
> job.jars=/home/ndxmetadata/Ravi/Gobblin/gobblin-dist/lib/
> job.lock.enabled=false
> distcp.persist.dir=/home/ndxmetadata/Ravi/Gobblin/gobblin-dist/
> source.class=gobblin.source.extractor.filebased.TextFileBasedSource
> converter.classes="gobblin.converter.StringSchemaInjector,gobblin.converter.csv.CsvToJsonConverter,gobblin.converter.avro.JsonIntermediateToAvroConverter"
> writer.builder.class=gobblin.writer.AvroDataWriterBuilder
> source.entity=
> source.filebased.data.directory=file://home/ndxmetadata/Ravi/Gobblin/sample
> gobblin.converter.schemaInjector.schema=SCHEMA
> converter.csv.to.json.delimiter=","
> extract.table.name=CsvToAvro
> extract.namespace=gobblin.example
> extract.table.type=APPEND_ONLY
> # source data schema
> source.schema={"namespace":"example.avro", "type":"record", "name":"User", "fields":[{"name":"name", "type":"string"}, {"name":"favorite_number",  "type":"int"}, {#"name":"favorite_color", "type":"string"}]}
> gobblin.converter.schemaInjector.schema=SCHEMA
> converter.csv.to.json.delimiter=","
> # quality checker configuration properties
> qualitychecker.task.policies=gobblin.policies.count.RowCountPolicy,gobblin.policies.schema.SchemaCompatibilityPolicy
> qualitychecker.task.policy.types=OPTIONAL,OPTIONAL
> qualitychecker.row.policies=gobblin.policies.schema.SchemaRowCheckPolicy
> qualitychecker.row.policy.types=OPTIONAL
> # data publisher class to be used
> data.publisher.type=gobblin.publisher.BaseDataPublisher
> # writer configuration properties
> writer.destination.type=HDFS
> writer.output.format=AVRO
> fs.uri=hdfs://........:8020/
> writer.fs.uri=hdfs://.......:8020/
> state.store.fs.uri=hdfs://:8020/
> mr.job.root.dir=/user/ndxmetadata/output/working
> state.store.dir=/user/ndxmetadata/output/state-store
> writer.staging.dir=/user/ndxmetadata/output/task-staging
> writer.output.dir=/user/ndxmetadata/output/task-output
> data.publisher.final.dir=/user/ndxmetadata/output/
> ---------------------------------------------------------------------------
> Logs are attached below.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)