Posted to hdfs-user@hadoop.apache.org by ZORAIDA HIDALGO SANCHEZ <zo...@tid.es> on 2013/11/20 17:25:13 UTC

Missing records from HDFS

Hi all,

My job is not reading all of the input records. The input directory holds a set of files containing 6000000 records in total, but only 5997000 are processed; the Map Input Records counter reports 5997000.
I also downloaded the files with getmerge to check how many records come back, and that gives the correct number (6000000).
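
One way to narrow down a gap like this is to count records file by file on a local copy of the input directory and compare each figure with the Map Input Records counter of the matching map task. The sketch below is only an illustration: it assumes newline-delimited text records (the record format is not stated in this thread), and `count_records` and `count_per_file` are hypothetical helper names, not part of Hadoop.

```python
import os


def count_records(path):
    """Count newline-delimited records in one file.

    Assumption: one record per line. A final line without a trailing
    newline is still counted as a record.
    """
    count = 0
    last = b"\n"  # treat an empty file as ending cleanly
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            count += chunk.count(b"\n")
            last = chunk[-1:]
    if last != b"\n":
        count += 1
    return count


def count_per_file(directory):
    """Map each regular file directly under `directory` to its record count."""
    counts = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            counts[name] = count_records(path)
    return counts
```

Calling `count_per_file` on the local copy and summing the values should reproduce the getmerge total; a file whose count has no counterpart among the map tasks is a candidate for the missing 3000 records.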

Do you have any suggestions?

Thanks.

________________________________

This message is intended exclusively for its addressee. Our policy on sending and receiving email is available at the link below; we only send and receive email on the basis of the terms set out at:
http://www.tid.es/ES/PAGINAS/disclaimer.aspx

Re: Missing records from HDFS

Posted by Azuryy Yu <az...@gmail.com>.
What's your Hadoop version, and which InputFormat are you using?

Are these files in a single directory, or are there many subdirectories? How did
you configure the input path in your main?
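
The directory layout matters because Hadoop's FileInputFormat skips, by default, files whose names start with `_` or `.`, and whether nested directories are read depends on the Hadoop version and the job configuration. As a rough aid for answering these questions, the sketch below groups the files in a local copy of the input directory by where they sit. It is an assumption-laden illustration, not a reproduction of Hadoop's listing logic; `classify_inputs` is a hypothetical helper name.

```python
import os


def classify_inputs(root):
    """Group files under `root` into top-level, hidden, and nested paths.

    "hidden" means some component of the relative path starts with
    "_" or "." (for example "_SUCCESS" or "_logs/history").
    """
    groups = {"top_level": [], "hidden": [], "nested": []}
    for current, dirs, files in os.walk(root):
        dirs.sort()  # make the traversal order deterministic
        rel_dir = os.path.relpath(current, root)
        for name in sorted(files):
            rel_path = name if rel_dir == "." else os.path.join(rel_dir, name)
            parts = rel_path.split(os.sep)
            if any(part.startswith(("_", ".")) for part in parts):
                groups["hidden"].append(rel_path)
            elif rel_dir == ".":
                groups["top_level"].append(rel_path)
            else:
                groups["nested"].append(rel_path)
    return groups
```

If the "hidden" or "nested" groups contain data files, counting the records in those files and comparing the result with the shortfall is a quick way to test whether the job's input listing is leaving them out.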



