Posted to dev@griffin.apache.org by Yuchen Zhang <Yu...@microsoft.com.INVALID> on 2019/02/22 07:10:19 UTC
can't load text-dir data source
Hi there,
I'm new to Apache Griffin and the Hadoop ecosystem, and I'm trying to set up a Griffin test environment.
For now, I'm trying to run an accuracy measure on a text data source on Hadoop, but I get the errors below:
2019-02-19 09:21:06 WARN DataSource:36 - load data source [src] fails
2019-02-19 09:21:06 WARN DataSource:36 - load data source [tgt] fails
But the data source can be loaded through a SparkSession: spark.read.text("hdfs:///griffin/src/src.txt");
Could anyone help me figure out whether there are any problems in my configuration?
Here's my text resource file:
[image: image001.png, not preserved in this plain-text copy]
Here's the DQ config:
[image: image002.png, not preserved in this plain-text copy]
Thanks.
RE: can't load text-dir data source
Posted by Yuchen Zhang <Yu...@microsoft.com.INVALID>.
Hi Eugene,
Thank you for the reply.
I didn't configure these properties; there should be default values for them, right?
As far as I understand, these properties configure the file paths used by the Hadoop system. I can reach the text resource file both with hadoop fs commands and through a Spark session.
I also tried adding these properties, with reachable file paths, to the config file, but the error persists.
So this does not seem to be the root cause.
Thanks,
Yuchen Zhang
From: Eugene Liu <li...@apache.org>
Sent: Friday, February 22, 2019 3:17 PM
To: Yuchen Zhang <Yu...@microsoft.com.INVALID>; dev@griffin.apache.org
Subject: Re: can't load text-dir data source
Yuchen,
Could you check your Hadoop config, /apache/hadoop/etc/hadoop/hdfs-site.xml,
and verify that the folders in properties like these are valid?
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///data/hadoop-data/nn</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///data/hadoop-data/dn</value>
</property>
<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>file:///data/hadoop-data/snn</value>
</property>
Re: can't load text-dir data source
Posted by Eugene Liu <li...@apache.org>.
Yuchen,
Could you check your Hadoop config, /apache/hadoop/etc/hadoop/hdfs-site.xml,
and verify that the folders in properties like these are valid?
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///data/hadoop-data/nn</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///data/hadoop-data/dn</value>
</property>
<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>file:///data/hadoop-data/snn</value>
</property>
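
The check suggested above can be scripted. The following is only a minimal sketch, not part of Griffin or Hadoop: the helper names (read_properties, local_path, check_dirs) are hypothetical, and it assumes the script runs on the host whose local directories the properties point to. It parses an hdfs-site.xml file and reports whether each configured directory exists. A property that is absent from the file is reported as not set, since Hadoop then falls back to its built-in defaults, which this sketch does not resolve.

```python
import os
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Property names taken from the message above.
DIR_PROPERTIES = (
    "dfs.namenode.name.dir",
    "dfs.datanode.data.dir",
    "dfs.namenode.checkpoint.dir",
)


def read_properties(hdfs_site_path):
    """Return {name: value} for every <property> element in the file."""
    root = ET.parse(hdfs_site_path).getroot()
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def local_path(value):
    """Turn a file:// URI into a local path; leave bare paths unchanged."""
    parsed = urlparse(value)
    return parsed.path if parsed.scheme == "file" else value


def check_dirs(hdfs_site_path, names=DIR_PROPERTIES):
    """Map each property name to (local path(s), all exist as directories).

    Properties missing from the file map to (None, False).
    """
    props = read_properties(hdfs_site_path)
    report = {}
    for name in names:
        if name not in props:
            report[name] = (None, False)
            continue
        # A value may be a comma-separated list; every entry must exist.
        paths = [local_path(v.strip()) for v in props[name].split(",") if v.strip()]
        exists = bool(paths) and all(os.path.isdir(p) for p in paths)
        report[name] = (",".join(paths), exists)
    return report
```

Calling check_dirs("/apache/hadoop/etc/hadoop/hdfs-site.xml") with the path from the message above returns one entry per property. An entry whose second element is False points at a directory that is missing or a property that is not set in the file. This only validates the local storage folders; it does not exercise the Griffin data source itself.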