Posted to common-issues@hadoop.apache.org by "David Sidlo (JIRA)" <ji...@apache.org> on 2016/08/18 17:43:20 UTC

[jira] [Comment Edited] (HADOOP-13510) "hadoop fs -getmerge" docs, .../dir does not work, .../dir/* works.

    [ https://issues.apache.org/jira/browse/HADOOP-13510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15426869#comment-15426869 ] 

David Sidlo edited comment on HADOOP-13510 at 8/18/16 5:43 PM:
---------------------------------------------------------------

The issue may depend on the total dataset size.

The following command does not work, but it does work once "/*" is appended to the source path. When it works, 17 files get merged and the resulting file is about 4G.
> hdfs dfs -getmerge hdfs://production/apps/hive/warehouse/dgs_tmp.db xxx

[ds_dsidlo@prdslsldsafht11 ~]$ hdfs dfs -ls hdfs://production/apps/hive/warehouse/dgs_tmp.db/*
Found 17 items
-rw-r--r--   3 ds_dsidlo hdfs  275883517 2015-12-11 15:04 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints/000000_0
-rw-r--r--   3 ds_dsidlo hdfs  273756223 2015-12-11 15:04 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints/000001_0
-rw-r--r--   3 ds_dsidlo hdfs  141912289 2015-12-11 15:04 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints/000002_0
-rw-r--r--   3 ds_dsidlo hdfs  141916055 2015-12-11 15:04 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints/000003_0
-rw-r--r--   3 ds_dsidlo hdfs  141912300 2015-12-11 15:04 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints/000004_0
-rw-r--r--   3 ds_dsidlo hdfs  141913088 2015-12-11 15:04 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints/000005_0
-rw-r--r--   3 ds_dsidlo hdfs  141914384 2015-12-11 15:04 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints/000006_0
-rw-r--r--   3 ds_dsidlo hdfs  141915583 2015-12-11 15:04 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints/000007_0
-rw-r--r--   3 ds_dsidlo hdfs  141912741 2015-12-11 15:04 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints/000008_0
-rw-r--r--   3 ds_dsidlo hdfs  131615833 2015-12-11 15:04 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints/000009_0
-rw-r--r--   3 ds_dsidlo hdfs  130790330 2015-12-11 15:04 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints/000010_0
-rw-r--r--   3 ds_dsidlo hdfs  130257009 2015-12-11 15:04 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints/000011_0
-rw-r--r--   3 ds_dsidlo hdfs  129981971 2015-12-11 15:04 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints/000012_0
-rw-r--r--   3 ds_dsidlo hdfs  129647880 2015-12-11 15:04 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints/000013_0
-rw-r--r--   3 ds_dsidlo hdfs  129046552 2015-12-11 15:04 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints/000014_0
-rw-r--r--   3 ds_dsidlo hdfs  128769076 2015-12-11 15:04 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints/000015_0
-rw-r--r--   3 ds_dsidlo hdfs   72740496 2015-12-11 15:04 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints/000016_0
Found 3 items
-rw-r--r--   3 ds_dsidlo hdfs   26091915 2015-12-09 15:06 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints_fin/000000_0
-rw-r--r--   3 ds_dsidlo hdfs   26061567 2015-12-09 15:06 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints_fin/000001_0
-rw-r--r--   3 ds_dsidlo hdfs   26117465 2015-12-09 15:06 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints_fin/000002_0
Found 10 items
-rw-r--r--   3 ds_dsidlo hdfs  260920570 2015-12-11 14:39 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints_test/000000_0
-rw-r--r--   3 ds_dsidlo hdfs  258917310 2015-12-11 14:39 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints_test/000001_0
-rw-r--r--   3 ds_dsidlo hdfs  258702653 2015-12-11 14:38 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints_test/000002_0
-rw-r--r--   3 ds_dsidlo hdfs  257919368 2015-12-11 14:39 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints_test/000003_0
-rw-r--r--   3 ds_dsidlo hdfs  257411938 2015-12-11 14:38 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints_test/000004_0
-rw-r--r--   3 ds_dsidlo hdfs  257154203 2015-12-11 14:39 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints_test/000005_0
-rw-r--r--   3 ds_dsidlo hdfs  256840629 2015-12-11 14:39 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints_test/000006_0
-rw-r--r--   3 ds_dsidlo hdfs  256269772 2015-12-11 14:39 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints_test/000007_0
-rw-r--r--   3 ds_dsidlo hdfs  256005409 2015-12-11 14:38 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints_test/000008_0
-rw-r--r--   3 ds_dsidlo hdfs   68796269 2015-12-11 14:38 hdfs://production/apps/hive/warehouse/dgs_tmp.db/hints_test/000009_0
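
The listing above prints three separate "Found N items" groups, which suggests the glob matched three subdirectories (hints, hints_fin, hints_test) rather than files sitting directly under dgs_tmp.db. As a rough sketch for checking that, the helper below groups `hdfs dfs -ls` output by parent directory and reports a file count and byte total for each. The function name `summarize_by_parent` is made up for illustration, and the parsing assumes the 8-column layout shown above with no spaces in paths.

```shell
# Hypothetical helper: summarize `hdfs dfs -ls` output by parent directory.
# Reads listing text on stdin; prints "<parent> <file count> <total bytes>".
summarize_by_parent() {
  awk '
    # Regular-file rows start with "-" and have 8 fields: the size is
    # field 5 and the path is field 8. "Found N items" lines and
    # directory rows (starting with "d") are skipped.
    /^-/ && NF == 8 {
      parent = $8
      sub("/[^/]*$", "", parent)   # strip the final path component
      count[parent]++
      bytes[parent] += $5
    }
    END {
      for (p in count) printf "%s %d %.0f\n", p, count[p], bytes[p]
    }
  ' | sort
}
```

It could be fed directly from the command used above, for example `hdfs dfs -ls 'hdfs://production/apps/hive/warehouse/dgs_tmp.db/*' | summarize_by_parent`. Going by the "Found" lines, the counts should come out as 17, 3 and 10.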

The following works, but the resulting file is only 1k.
> hdfs dfs -getmerge hdfs://production/user/ds_dsidlo xxx

[ds_dsidlo@prdslsldsafht11 ~]$ hdfs dfs -ls hdfs://production/user/ds_dsidlo
Found 10 items
drwx------   - ds_dsidlo ds_dsidlo          0 2015-12-18 05:00 hdfs://production/user/ds_dsidlo/.Trash
drwxr-xr-x   - ds_dsidlo ds_dsidlo          0 2016-07-01 18:06 hdfs://production/user/ds_dsidlo/.hiveJars
drwxr-xr-x   - ds_dsidlo ds_dsidlo          0 2016-07-18 10:57 hdfs://production/user/ds_dsidlo/.sparkStaging
drwx------   - ds_dsidlo ds_dsidlo          0 2016-07-07 17:16 hdfs://production/user/ds_dsidlo/.staging
-rw-r--r--   3 ds_dsidlo ds_dsidlo       1402 2015-07-17 14:33 hdfs://production/user/ds_dsidlo/data.json
drwxr-xr-x   - ds_dsidlo ds_dsidlo          0 2015-08-06 20:03 hdfs://production/user/ds_dsidlo/logEvents.parquet
drwxr-xr-x   - ds_dsidlo ds_dsidlo          0 2015-08-05 20:59 hdfs://production/user/ds_dsidlo/logEvents.save
drwxr-xr-x   - ds_dsidlo ds_dsidlo          0 2015-08-03 18:51 hdfs://production/user/ds_dsidlo/logEvents.txt
-rw-r--r--   3 ds_dsidlo ds_dsidlo         90 2015-08-30 22:53 hdfs://production/user/ds_dsidlo/testdata.csv
drwxr-xr-x   - ds_dsidlo ds_dsidlo          0 2015-10-22 15:00 hdfs://production/user/ds_dsidlo/user_audit
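
In the listing above, only two of the ten entries are regular files (data.json at 1402 bytes and testdata.csv at 90 bytes), 1492 bytes in total, which is in line with the 1k result. That is consistent with getmerge concatenating only the regular files found directly under the given directory, though it does not prove it. That reading is an assumption here, not something confirmed in this thread. A small sketch for pulling those numbers out of any `hdfs dfs -ls` output follows; the function name `top_level_summary` is hypothetical.

```shell
# Hypothetical helper: split an `hdfs dfs -ls` listing into directories
# and regular files, and total the bytes held by the regular files.
top_level_summary() {
  awk '
    /^d/ { dirs++ }                    # directory rows
    /^-/ { files++; bytes += $5 }      # regular-file rows; size is field 5
    END  { printf "dirs=%d files=%d file_bytes=%.0f\n", dirs, files, bytes }
  '
}
```

For example: `hdfs dfs -ls hdfs://production/user/ds_dsidlo | top_level_summary`. A directory that reports `files=0` would be a case where, under the assumption above, a plain directory argument yields an empty merge result.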

So the total dataset size may make a difference.



was (Author: dsidlo@gmail.com):
The issue may depend on the total dataset size.

The following command does not work, but it does work once "/*" is appended to the source path. When it works, 17 files get merged and the resulting file is about 4G.
> hdfs dfs -getmerge hdfs://production/apps/hive/warehouse/dgs_tmp.db xxx

hdfs dfs -getmerge hdfs://production/apps/hive/warehouse/dgs_tmp.db/* xxx
hdfs dfs -getmerge hdfs://production/user/ds_dsidlo xxx
hdfs dfs -getmerge hdfs://production/tmp/ds_dsidlo
hdfs dfs -getmerge hdfs://production/tmp/ds_dsidlo.xx xxx


The following works, but the resulting file is only 1k.
> hdfs dfs -getmerge hdfs://production/user/ds_dsidlo xxx



So the total dataset size may make a difference.


> "hadoop fs -getmerge" docs, .../dir does not work, .../dir/* works.
> -------------------------------------------------------------------
>
>                 Key: HADOOP-13510
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13510
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.7.1
>         Environment: HDP 2.4.2
>            Reporter: David Sidlo
>            Priority: Minor
>              Labels: dfs, fs, getmerge, hadoop, hdfs
>
> Docs indicate that the following command would work...
>    hadoop fs -getmerge -nl /src /opt/output.txt
> For me, it results in a zero-length file /opt/output.txt.
> But the following does work...
>    hadoop fs -getmerge -nl /src/* /opt/output.txt


