Posted to user@accumulo.apache.org by Tim Israel <ti...@timisrael.com> on 2014/09/17 21:57:22 UTC

Import/Export problems from 1.5.1 -> 1.6.0?

Hi all,

I posted something similar on the slider mailing list and was directed
here.  After debugging further, it doesn't seem like this is a slider issue.

I have some tables that were exported from another cluster running Accumulo
1.5.1 on hoya and I'm trying to import them in Accumulo 1.6.0 on Slider
0.50.2.  This target cluster is Kerberized but Accumulo is running in
simple authentication mode.

The exported table was distcp'd to a cluster configured with slider.

The table was imported successfully via the accumulo shell.  The files get
moved to
/user/accumulo/.slider/cluster/slideraccumulo/database/data/tables/1
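
For reference, the import itself was just the shell's importtable command,
along the lines of (table name and import directory are illustrative):

root@instance> importtable table1_copy /tmp/table1_export_dest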

However, if I scan the imported table, accumulo complains with the
following exception:
Failed to open file hdfs://cluster/accumulo/tables/1/b-000005c/I000005d.rf
File does not exist: /accumulo/tables/1/b-000005c/I000005d.rf

I can scan the table if I move the files from
/user/accumulo/.slider/cluster/slideraccumulo/database/data/tables/1 to
/accumulo/tables/1
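
Roughly, as the HDFS user that owns the Accumulo directories, that move was
something like:

$ hdfs dfs -mkdir -p /accumulo/tables/1
$ hdfs dfs -mv '/user/accumulo/.slider/cluster/slideraccumulo/database/data/tables/1/*' /accumulo/tables/1/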

I pulled accumulo-site from the slider publisher and saw that
instance.volumes is set as follows:
hdfs://cluster/user/accumulo/.slider/cluster/slideraccumulo/database/data

Any suggestions would be greatly appreciated.

Thanks,

Tim

Re: Import/Export problems from 1.5.1 -> 1.6.0?

Posted by Keith Turner <ke...@deenlo.com>.
I am thinking of dusting off the patch in ACCUMULO-2145 and trying to add
an upgrade test case using it.  I also need to get that patch pushed.


Re: Import/Export problems from 1.5.1 -> 1.6.0?

Posted by Josh Elser <jo...@gmail.com>.
(for archival purposes)

I (re)stumbled on this and, after digging some more, realized that there
is a bigger issue.

https://issues.apache.org/jira/browse/ACCUMULO-3215

Ultimately, the import process in 1.6.0 and 1.6.1 is incorrect and
generates incorrect entries in the accumulo.metadata table, which render
the imported table unusable.


Re: Import/Export problems from 1.5.1 -> 1.6.0?

Posted by Tim Israel <ti...@timisrael.com>.
Billie,

Thank you for the recommendation. I had been hunting for the deprecated
properties that would point to /accumulo.

Your suggestion worked great and fixed the importtable operation.

I set the following properties in my appConfig.json and regenerated my
client's accumulo-site.xml:
instance.dfs.dir=/user/accumulo/.slider/cluster/slideraccumulo/database/data
instance.dfs.uri=hdfs://cluster
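
For anyone finding this later: in the regenerated accumulo-site.xml those
come out as the usual Hadoop property entries, i.e.

<property>
  <name>instance.dfs.uri</name>
  <value>hdfs://cluster</value>
</property>
<property>
  <name>instance.dfs.dir</name>
  <value>/user/accumulo/.slider/cluster/slideraccumulo/database/data</value>
</property>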

I'm currently using slider-0.50.2-incubating-rc0, so I'll have to give
${USER} and ${CLUSTER_NAME} a look at a later time.

Thanks!

Tim


Re: Import/Export problems from 1.5.1 -> 1.6.0?

Posted by Billie Rinaldi <bi...@gmail.com>.
It looks like the import table operation is creating file entries using
relative paths in the metadata table, and their names are being resolved
using the deprecated instance.dfs.dir and instance.dfs.uri properties.
This seems like a bug.  I think a workaround for the problem would be to
set those deprecated properties to match your instance.volumes property.  I
know slider is setting instance.volumes for you, but if you want to verify
this fixes the problem, it would probably be enough to set instance.dfs.dir
to /user/accumulo/.slider/cluster/slideraccumulo/database/data in your app
config (maybe /user/${USER}/.slider/cluster/${CLUSTER_NAME}/database/data
would work if you're using the develop branch).
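
To illustrate the mechanism with hypothetical values: an imported tablet
can end up with a pre-1.6-style relative file entry in accumulo.metadata
along the lines of

1< file:/b-000005c/I000005d.rf []    <size>,<entries>

and a relative path like that gets resolved to
${instance.dfs.uri}${instance.dfs.dir}/tables/1/b-000005c/I000005d.rf
rather than against instance.volumes. With the defaults (instance.dfs.dir
of /accumulo, instance.dfs.uri of the default filesystem), that would give
hdfs://cluster/accumulo/tables/1/..., matching the error you're seeing.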


Re: Import/Export problems from 1.5.1 -> 1.6.0?

Posted by Tim Israel <ti...@timisrael.com>.
Josh,
I've sent an email directly to you with the zip, since I'm not sure how
the mailing list handles attachments.

For the benefit of the mailing list, the files (and their contents) are as
follows:

distcp.txt
------------
hdfs://cluster/user/accumulo/.slider/cluster/slideraccumulo/database/data/tables/2/default_tablet/F000009g.rf
hdfs://cluster/user/accumulo/.slider/cluster/slideraccumulo/database/data/tables/2/default_tablet/F000009n.rf
hdfs://cluster/tmp/table1_export/exportMetadata.zip

exportMetadata.zip/accumulo_export_info.txt
-------------------------------------------------------------
exportVersion:1
srcInstanceName:instancename
srcInstanceID:b458b1bb-f613-4c3c-a399-d3f275a634da
srcZookeepers:CensoredZK1,CensoredZK2,CensoredZK3
srcTableName:table1_exp
srcTableID:3
srcDataVersion:6
srcCodeVersion:1.6.0

exportMetadata.zip/table_config.txt
-----------------------------------------------
table.constraint.1=org.apache.accumulo.core.constraints.DefaultKeySizeConstraint
table.iterator.majc.vers=20,org.apache.accumulo.core.iterators.user.VersioningIterator
table.iterator.majc.vers.opt.maxVersions=1
table.iterator.minc.vers=20,org.apache.accumulo.core.iterators.user.VersioningIterator
table.iterator.minc.vers.opt.maxVersions=1
table.iterator.scan.vers=20,org.apache.accumulo.core.iterators.user.VersioningIterator
table.iterator.scan.vers.opt.maxVersions=1
table.split.threshold=100M

exportMetadata.zip/metadata.bin
-----------------------------------------------
<binary file containing more metadata>

Thanks,

Tim


Re: Import/Export problems from 1.5.1 -> 1.6.0?

Posted by Josh Elser <jo...@gmail.com>.
Hi Tim,

Any possibility that you can provide the exportMetadata.zip and the 
distcp.txt?

Fair warning: the data from that table won't be included, but some
split points might be included in metadata.bin (inside
exportMetadata.zip), which *might* contain something sensitive. Make sure
you double-check that.
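
(If it helps with that check: the source table's split points should also
be viewable with the shell's getsplits command on the source instance,
e.g. "getsplits -t <table>", before you decide whether to share the zip.)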

I'll see if I can reproduce what you saw. It definitely seems strange.

- Josh


Re: Import/Export problems from 1.5.1 -> 1.6.0?

Posted by Tim Israel <ti...@timisrael.com>.
Upon further investigation, it looks like I can't even follow the steps
outlined in the import/export documentation for 1.6.0
(http://accumulo.apache.org/1.6/examples/export.html).  I get the same
error outlined in my first post.
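
The steps I followed were roughly those on that page (names illustrative):

root@instance> offline -t table1
root@instance> exporttable -t table1 /tmp/table1_export
$ hadoop distcp -f hdfs://cluster/tmp/table1_export/distcp.txt /tmp/table1_export_dest
root@instance> importtable table1_copy /tmp/table1_export_dest

Scanning the imported table then fails with: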

[shell.Shell] ERROR: java.lang.RuntimeException:
org.apache.accumulo.core.client.impl.AccumuloServerException: Error on
server <SERVER_NAME>:58444 <-- port chosen by slider

Accumulo Recent Logs
----
Failed to open file hdfs://cluster/accumulo/tables/1/b-000005c/I000005d.rf
File does not exist: /accumulo/tables/1/b-000005c/I000005d.rf


I tried the export/import procedure on my Accumulo 1.5.1 cluster and got
the expected result (i.e., the table is imported and can be scanned without
error).


Tim
