Issue with importDirectory
Posted to user@accumulo.apache.org by Al Krinker <al...@gmail.com> on 2014/05/01 22:46:06 UTC
So I am trying to create my own RFile and write it to Accumulo... in a nutshell:
I create my RFile and two directories, one that will contain the file and
one for failures, both required by importDirectory.
import java.util.Date;
import org.apache.accumulo.core.conf.AccumuloConfiguration;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.file.FileSKVWriter;
import org.apache.accumulo.core.file.rfile.RFile;
import org.apache.accumulo.core.file.rfile.RFileOperations;
import org.apache.accumulo.core.security.ColumnVisibility;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;

Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://blah:9000/");
conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
FileSystem fs = FileSystem.get(conf);

Path input = new Path("/accumulo/temp1/testing/");
Path output = new Path("/accumulo/temp1/testing/my_output");
fs.mkdirs(input);
fs.mkdirs(output);

// FILE_TYPE is a config key defined elsewhere in my code
String extension = conf.get(FILE_TYPE);
if (extension == null || extension.isEmpty()) {
    extension = RFile.EXTENSION;
}
String filename = "/accumulo/temp1/testing/my_input/testFile." + extension;
Path file = new Path(filename);
if (fs.exists(file)) {
    file.getFileSystem(conf).delete(file, false);
}

FileSKVWriter out = RFileOperations.getInstance().openWriter(filename, fs, conf,
        AccumuloConfiguration.getDefaultConfiguration());
out.startDefaultLocalityGroup();
long timestamp = (new Date()).getTime();
Key key = new Key(new Text("row_1"), new Text("cf"), new Text("cq"),
        new ColumnVisibility(), timestamp);
Value value = new Value("".getBytes());
out.append(key, value);
out.close();
At this point I can ssh into my namenode and see the file and two
directories.
Then I try to bulk import it:
String instanceName = "blah";
String zooServers = "blah:2181,blah:2181";
String userName = "<username>"; // provide username
String password = "<password>"; // provide password

// Connect
Instance inst = new ZooKeeperInstance(instanceName, zooServers);
Connector conn = inst.getConnector(userName, password);

TableOperations ops = conn.tableOperations();
ops.delete("mynewtesttable");
ops.create("mynewtesttable");
ops.importDirectory("mynewtesttable", input.toString(), output.toString(), false);
The exception that I am getting is:
SEVERE: null
org.apache.accumulo.core.client.AccumuloException: Bulk import directory
/accumulo/temp1/testing does not exist!
I tried playing around with the file/directory owner, manually setting it
to accumulo and then hadoop, but no luck.
I checked hdfs-site and I have:
<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>
Any ideas or guesses about what might be wrong?
Re: Issue with importDirectory
Posted by David Medinets <da...@gmail.com>.
I'm heading home but I can play with this tomorrow. On the positive side, I
have D4M reading the data that I wrote from Java. So that's nice. :)
Re: Issue with importDirectory
Posted by Al Krinker <al...@gmail.com>.
Hey Josh,
I checked HDFS and it was there... The issue, and I have to thank one of
my friends who ran into it before, was this:
When importDirectory runs, it uses CachedConfiguration, so it was picking
up my local configuration instead of the one pointing at my Hadoop HDFS.
All I did to solve it was add CachedConfiguration.setInstance(conf);
right after I created conf and pointed it at HDFS.
Worked perfectly... I was able to create a new RFile and write it to a table
in Accumulo. The code that I posted works (plus the fix) for anyone
interested.
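For anyone who wants the fix in context, here is roughly where the call goes
(a minimal sketch; the wrapper class name is just for illustration, and the
rest of the setup is the same as in my original post):

import org.apache.accumulo.core.util.CachedConfiguration;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class BulkImportSetup { // hypothetical class, just to make it compile
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "hdfs://blah:9000/");
        conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");

        // The fix: make Accumulo's cached Hadoop configuration use this conf,
        // so importDirectory resolves the bulk import paths against HDFS
        // instead of whatever the default (local) configuration points at.
        CachedConfiguration.setInstance(conf);

        FileSystem fs = FileSystem.get(conf);
        // ... create the RFile and call importDirectory exactly as in my first email ...
    }
}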
Anyway, that was it... and thank you Josh for your feedback! You are
awesome :)
Re: Issue with importDirectory
Posted by Josh Elser <jo...@gmail.com>.
Probably best to start in HDFS. Check to see if the directory you
thought you made actually exists (/accumulo/temp1/testing).
It's possible that you wrote that file to local FS instead of HDFS.
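Something like this (an untested sketch; the class name is just for
illustration) will tell you which filesystem your client actually resolved
and whether the directory is there:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckBulkDir {
    public static void main(String[] args) throws Exception {
        // With no fs.default.name set, this resolves to the local
        // filesystem (file:///), not HDFS -- which would explain the
        // "Bulk import directory does not exist" error.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        System.out.println("Resolved filesystem: " + fs.getUri());
        System.out.println("Dir exists: " + fs.exists(new Path("/accumulo/temp1/testing")));
    }
}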