Posted to dev@griffin.apache.org by Yuchen Zhang <Yu...@microsoft.com.INVALID> on 2019/02/22 07:10:19 UTC

can't load text-dir data source

Hi there,

I'm new to Apache Griffin and Hadoop, and I'm trying to set up a Griffin test environment.
I'm trying to compute an accuracy measure for a text data source on Hadoop, but I get the errors below:
2019-02-19 09:21:06 WARN  DataSource:36 - load data source [src] fails
2019-02-19 09:21:06 WARN  DataSource:36 - load data source [tgt] fails

However, the same data can be loaded through a SparkSession: spark.read.text("hdfs:///griffin/src/src.txt");

Could anyone help me figure out whether there is a problem in my configuration?

Here's my text resource file:
[cid:image001.png@01D4CAC0.B7F42770]

Here's the DQ config:
[cid:image002.png@01D4CAC0.B7F42770]

Thanks.


RE: can't load text-dir data source

Posted by Yuchen Zhang <Yu...@microsoft.com.INVALID>.
Hi Eugene,

Thank you for the reply.
I didn't configure these properties; they should have default values, right?
As far as I understand, these properties configure file paths for the Hadoop system. I can reach the text resource file through both Hadoop fs commands and a SparkSession.
I also tried adding these properties to the config file with reachable file paths, but the error still occurs.
So this does not seem to be the root cause.

Thanks,
Yuchen Zhang

From: Eugene Liu <li...@apache.org>
Sent: Friday, February 22, 2019 3:17 PM
To: Yuchen Zhang <Yu...@microsoft.com.INVALID>; dev@griffin.apache.org
Subject: Re: can't load text-dir data source

Yuchen,

Could you check your Hadoop config, /apache/hadoop/etc/hadoop/hdfs-site.xml?

For example, check whether the folders in properties like these are valid:

    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///data/hadoop-data/nn</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///data/hadoop-data/dn</value>
    </property>
    <property>
        <name>dfs.namenode.checkpoint.dir</name>
        <value>file:///data/hadoop-data/snn</value>
    </property>


Re: can't load text-dir data source

Posted by Eugene Liu <li...@apache.org>.
Yuchen,

Could you check your Hadoop config, /apache/hadoop/etc/hadoop/hdfs-site.xml?

For example, check whether the folders in properties like these are valid:

    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///data/hadoop-data/nn</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///data/hadoop-data/dn</value>
    </property>
    <property>
        <name>dfs.namenode.checkpoint.dir</name>
        <value>file:///data/hadoop-data/snn</value>
    </property>
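
A quick way to run this kind of check is sketched below in Python, using only the standard library. It is an illustration rather than anything shipped with Hadoop or Griffin: the helper names read_properties and check_local_dirs are hypothetical, and it assumes the properties sit inside the usual <configuration> root element of hdfs-site.xml and that their values are local file:// URIs (possibly comma-separated). It only reports whether each configured folder exists on the machine where it runs.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Property names taken from the snippet above.
DIR_PROPERTIES = (
    "dfs.namenode.name.dir",
    "dfs.datanode.data.dir",
    "dfs.namenode.checkpoint.dir",
)


def read_properties(xml_text):
    """Return {name: value} for every <property> element in the XML text."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def check_local_dirs(props, names=DIR_PROPERTIES):
    """Return (property, local_path, is_directory) for each configured folder."""
    report = []
    for name in names:
        if name not in props:
            continue  # property is not set in this file, so nothing to check
        for uri in props[name].split(","):
            # "file:///data/x" and a bare "/data/x" both yield the path "/data/x".
            path = urlparse(uri.strip()).path
            report.append((name, path, os.path.isdir(path)))
    return report


if __name__ == "__main__":
    # Demo against a throwaway config; point this at a real hdfs-site.xml instead.
    with tempfile.TemporaryDirectory() as tmp:
        sample = (
            "<configuration>"
            "<property><name>dfs.namenode.name.dir</name>"
            "<value>file://" + tmp + "</value></property>"
            "</configuration>"
        )
        for name, path, ok in check_local_dirs(read_properties(sample)):
            print(name, path, "OK" if ok else "MISSING")
```

To check a real installation, read the file's contents (for example with open("/apache/hadoop/etc/hadoop/hdfs-site.xml").read()) and pass the text to read_properties. Properties that are absent from the file are skipped rather than reported as missing.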
