Posted to dev@tez.apache.org by Suma Shivaprasad <su...@gmail.com> on 2014/08/21 08:16:17 UTC

Hive on Tez Counters

Hi,

I need information on where to find detailed job counters for Hive on Tez. I
am running this on an HDP cluster with Hive 0.13 and see only the following
job counters for Hive on Tez in the YARN application logs, which I retrieved
with (yarn logs -applicationId ...).

a. I cannot see any ReduceOperator counters, and DESERIALIZE_ERRORS is the
only counter present for MapOperator.
b. CPU_MILLISECONDS is negative in some cases. Is CPU_MILLISECONDS accurate?
c. What does COMMITTED_HEAP_BYTES indicate?
d. Is there any other place I should be checking the counters?

[[File System Counters
FILE: BYTES_READ=512,
FILE: BYTES_WRITTEN=3079881,
FILE: READ_OPS=0, FILE: LARGE_READ_OPS=0, FILE: WRITE_OPS=0, HDFS:
BYTES_READ=8215153, HDFS: BYTES_WRITTEN=0, HDFS: READ_OPS=3, HDFS:
LARGE_READ_OPS=0, HDFS: WRITE_OPS=0]

[org.apache.tez.common.counters.TaskCounter SPILLED_RECORDS=222543,
GC_TIME_MILLIS=172, *CPU_MILLISECONDS=-19700*,
PHYSICAL_MEMORY_BYTES=667566080, VIRTUAL_MEMORY_BYTES=1887797248,
COMMITTED_HEAP_BYTES=1011023872, INPUT_RECORDS_PROCESSED=222543,
OUTPUT_RECORDS=222543,
OUTPUT_BYTES=23543896,
OUTPUT_BYTES_WITH_OVERHEAD=23989024, OUTPUT_BYTES_PHYSICAL=3079369,
ADDITIONAL_SPILLS_BYTES_WRITTEN=0, ADDITIONAL_SPILLS_BYTES_READ=0,
ADDITIONAL_SPILL_COUNT=0]


[*org.apache.hadoop.hive.ql.exec.MapOperator*$Counter DESERIALIZE_ERRORS=0]]

Thanks
Suma
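
For readers who want to work with a dump like the one above programmatically, here is a minimal sketch, assuming the counters appear as NAME=value pairs in the text returned by yarn logs. The function name parse_counters and the regular expression are illustrative assumptions, not part of Tez, Hive, or YARN.

```python
import re

# Matches NAME=value pairs such as CPU_MILLISECONDS=-19700 or
# "HDFS: BYTES_READ=8215153". The optional asterisks cover the bold
# markers that the mail client left around one counter in the dump.
COUNTER_RE = re.compile(r"\*?((?:[A-Z]+: )?[A-Z_]+)=(-?\d+)\*?")


def parse_counters(dump):
    """Return a flat {counter_name: int_value} dict from a counter dump."""
    # Collapse line wraps so "HDFS:\nBYTES_READ=..." becomes one token run.
    flat = " ".join(dump.split())
    return {name: int(value) for name, value in COUNTER_RE.findall(flat)}


sample = """[[File System Counters
FILE: BYTES_READ=512, HDFS:
BYTES_READ=8215153]
[org.apache.tez.common.counters.TaskCounter SPILLED_RECORDS=222543,
GC_TIME_MILLIS=172, *CPU_MILLISECONDS=-19700*]]"""

for name, value in parse_counters(sample).items():
    print(name, value)
```

Keeping the "FILE: " and "HDFS: " prefixes in the key avoids collisions between the two file systems' BYTES_READ counters.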

Re: Hive on Tez Counters

Posted by Siddharth Seth <ss...@apache.org>.
0.5 is almost out; the vote should close within a day. You'll have to
use a build of Hive from the Tez branch, though.

TEZ-1118 is the patch you'll need to pick up to fix the CPU counters. This
should apply directly on the 0.4 branch, if that's the approach you want to
take.


On Mon, Aug 25, 2014 at 5:38 AM, Suma Shivaprasad <
sumasai.shivaprasad@gmail.com> wrote:

> Hi Siddharth/Gunther,
>
> Thanks for replying to my queries. I was particularly interested in
> the CPU counter
> since I was doing some benchmarking on queries. Can you please clarify if I
> just blindly take a mod(CPU counter) for all tasks and add them up..would
> they be fine..or should I take a patch from the fix and apply it on Tez 0.4
>  to get it working until 0.5 is released?
>
> Thanks
> Suma
>
>
> On Fri, Aug 22, 2014 at 2:55 AM, Gunther Hagleitner <
> ghagleitner@hortonworks.com> wrote:
>
> > Hive logs the same counters regardless of whether you run with Tez or MR.
> > We've removed some counters in hive 0.13 (HIVE-4518) - the specific one
> > you're looking for might be in that list.
> >
> > Thanks,
> > Gunther.
> >
> >
> > On Thu, Aug 21, 2014 at 11:13 AM, Siddharth Seth <ss...@apache.org>
> wrote:
> >
> > > I'll let Hive folks answer the questions about the Hive counters.
> > >
> > > In terms of the CPU counter - that was a bug in Tez-0.4.0, which has
> been
> > > fixed in 0.5.0.
> > >
> > > COMMITTED_HEAP_BYTES just represents the memory available to the JVM
> > > (Runtime.getRuntime().totalMemory()). This will only vary if the VM is
> > > started with a different Xms and Xmx option.
> > >
> > > In terms of Tez, the application logs are currently the best place.
> Hive
> > > may expose these in a more accessible manner though.
> > >
> > >
> > > On Wed, Aug 20, 2014 at 11:16 PM, Suma Shivaprasad <
> > > sumasai.shivaprasad@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > Needed info on where I can get detailed job counters for Hive on Tez.
> > Am
> > > > running this on a HDP cluster with Hive 0.13 and see only the
> following
> > > job
> > > > counters through Hive Tez in Yarn application logs which I got
> through(
> > > > yarn logs -applicationId ...) .
> > > >
> > > > a. Cannot see any ReduceOperator counters and also only
> > > DESERIALIZE_ERRORS
> > > > is the only counter present in MapOperator
> > > > b. The CPU_MILLISECONDS in some cases in -ve. Is CPU_MILLISECONDS
> > > accurate
> > > > c. What does COMMITTED_HEAP_BYTES indicate?
> > > > d. Is there any other place I should be checking the counters?
> > > >
> > > > [[File System Counters
> > > > FILE: BYTES_READ=512,
> > > > FILE: BYTES_WRITTEN=3079881,
> > > > FILE: READ_OPS=0, FILE: LARGE_READ_OPS=0, FILE: WRITE_OPS=0, HDFS:
> > > > BYTES_READ=8215153, HDFS: BYTES_WRITTEN=0, HDFS: READ_OPS=3, HDFS:
> > > > LARGE_READ_OPS=0, HDFS: WRITE_OPS=0]
> > > >
> > > > [org.apache.tez.common.counters.TaskCounter SPILLED_RECORDS=222543,
> > > > GC_TIME_MILLIS=172, *CPU_MILLISECONDS=-19700*,
> > > > PHYSICAL_MEMORY_BYTES=667566080, VIRTUAL_MEMORY_BYTES=1887797248,
> > > > COMMITTED_HEAP_BYTES=1011023872, INPUT_RECORDS_PROCESSED=222543,
> > > > OUTPUT_RECORDS=222543,
> > > > OUTPUT_BYTES=23543896,
> > > > OUTPUT_BYTES_WITH_OVERHEAD=23989024, OUTPUT_BYTES_PHYSICAL=3079369,
> > > > ADDITIONAL_SPILLS_BYTES_WRITTEN=0, ADDITIONAL_SPILLS_BYTES_READ=0,
> > > > ADDITIONAL_SPILL_COUNT=0]
> > > >
> > > >
> > > > [*org.apache.hadoop.hive.ql.exec.MapOperator*$Counter
> > > > DESERIALIZE_ERRORS=0]]
> > > >
> > > > Thanks
> > > > Suma
> > > >
> > >
> >
> >
>

Re: Hive on Tez Counters

Posted by Suma Shivaprasad <su...@gmail.com>.
Hi Siddharth/Gunther,

Thanks for replying to my queries. I was particularly interested in the CPU
counter since I was doing some benchmarking on queries. Can you please
clarify: if I just blindly take the absolute value, mod(CPU counter), for all
tasks and add them up, would the result be fine? Or should I take the patch
for the fix and apply it on Tez 0.4 to get it working until 0.5 is released?

Thanks
Suma
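
The thread does not confirm that the absolute value of a negative CPU_MILLISECONDS reading is meaningful; the reply only points to the TEZ-1118 patch. Until a patched build is in use, a cautious alternative is to treat negative readings as invalid and report how many were dropped. The sketch below assumes the per-task values have already been collected; the function name sum_cpu_millis and all readings except -19700 are hypothetical.

```python
def sum_cpu_millis(per_task_values):
    """Sum CPU_MILLISECONDS over tasks, skipping negative readings.

    Returns (total_ms, skipped_count) so the caller can judge how much
    of the benchmark data was affected by the counter bug.
    """
    valid = [v for v in per_task_values if v >= 0]
    return sum(valid), len(per_task_values) - len(valid)


# Hypothetical per-task readings; only -19700 appears in the thread.
readings = [15200, -19700, 8300]
total_ms, skipped = sum_cpu_millis(readings)
print(f"total={total_ms} ms, skipped={skipped} task(s)")
```

If many tasks are skipped, the total understates real CPU use, so the benchmark numbers are better regenerated on a build that includes the fix.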


On Fri, Aug 22, 2014 at 2:55 AM, Gunther Hagleitner <
ghagleitner@hortonworks.com> wrote:

> Hive logs the same counters regardless of whether you run with Tez or MR.
> We've removed some counters in hive 0.13 (HIVE-4518) - the specific one
> you're looking for might be in that list.
>
> Thanks,
> Gunther.
>
>
> On Thu, Aug 21, 2014 at 11:13 AM, Siddharth Seth <ss...@apache.org> wrote:
>
> > I'll let Hive folks answer the questions about the Hive counters.
> >
> > In terms of the CPU counter - that was a bug in Tez-0.4.0, which has been
> > fixed in 0.5.0.
> >
> > COMMITTED_HEAP_BYTES just represents the memory available to the JVM
> > (Runtime.getRuntime().totalMemory()). This will only vary if the VM is
> > started with a different Xms and Xmx option.
> >
> > In terms of Tez, the application logs are currently the best place. Hive
> > may expose these in a more accessible manner though.
> >
> >
> > On Wed, Aug 20, 2014 at 11:16 PM, Suma Shivaprasad <
> > sumasai.shivaprasad@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > Needed info on where I can get detailed job counters for Hive on Tez.
> Am
> > > running this on a HDP cluster with Hive 0.13 and see only the following
> > job
> > > counters through Hive Tez in Yarn application logs which I got through(
> > > yarn logs -applicationId ...) .
> > >
> > > a. Cannot see any ReduceOperator counters and also only
> > DESERIALIZE_ERRORS
> > > is the only counter present in MapOperator
> > > b. The CPU_MILLISECONDS in some cases in -ve. Is CPU_MILLISECONDS
> > accurate
> > > c. What does COMMITTED_HEAP_BYTES indicate?
> > > d. Is there any other place I should be checking the counters?
> > >
> > > [[File System Counters
> > > FILE: BYTES_READ=512,
> > > FILE: BYTES_WRITTEN=3079881,
> > > FILE: READ_OPS=0, FILE: LARGE_READ_OPS=0, FILE: WRITE_OPS=0, HDFS:
> > > BYTES_READ=8215153, HDFS: BYTES_WRITTEN=0, HDFS: READ_OPS=3, HDFS:
> > > LARGE_READ_OPS=0, HDFS: WRITE_OPS=0]
> > >
> > > [org.apache.tez.common.counters.TaskCounter SPILLED_RECORDS=222543,
> > > GC_TIME_MILLIS=172, *CPU_MILLISECONDS=-19700*,
> > > PHYSICAL_MEMORY_BYTES=667566080, VIRTUAL_MEMORY_BYTES=1887797248,
> > > COMMITTED_HEAP_BYTES=1011023872, INPUT_RECORDS_PROCESSED=222543,
> > > OUTPUT_RECORDS=222543,
> > > OUTPUT_BYTES=23543896,
> > > OUTPUT_BYTES_WITH_OVERHEAD=23989024, OUTPUT_BYTES_PHYSICAL=3079369,
> > > ADDITIONAL_SPILLS_BYTES_WRITTEN=0, ADDITIONAL_SPILLS_BYTES_READ=0,
> > > ADDITIONAL_SPILL_COUNT=0]
> > >
> > >
> > > [*org.apache.hadoop.hive.ql.exec.MapOperator*$Counter
> > > DESERIALIZE_ERRORS=0]]
> > >
> > > Thanks
> > > Suma
> > >
> >
>
>

Re: Hive on Tez Counters

Posted by Gunther Hagleitner <gh...@hortonworks.com>.
Hive logs the same counters regardless of whether you run with Tez or MR.
We removed some counters in Hive 0.13 (HIVE-4518); the specific one
you're looking for might be in that list.

Thanks,
Gunther.


On Thu, Aug 21, 2014 at 11:13 AM, Siddharth Seth <ss...@apache.org> wrote:

> I'll let Hive folks answer the questions about the Hive counters.
>
> In terms of the CPU counter - that was a bug in Tez-0.4.0, which has been
> fixed in 0.5.0.
>
> COMMITTED_HEAP_BYTES just represents the memory available to the JVM
> (Runtime.getRuntime().totalMemory()). This will only vary if the VM is
> started with a different Xms and Xmx option.
>
> In terms of Tez, the application logs are currently the best place. Hive
> may expose these in a more accessible manner though.
>
>
> On Wed, Aug 20, 2014 at 11:16 PM, Suma Shivaprasad <
> sumasai.shivaprasad@gmail.com> wrote:
>
> > Hi,
> >
> > Needed info on where I can get detailed job counters for Hive on Tez. Am
> > running this on a HDP cluster with Hive 0.13 and see only the following
> job
> > counters through Hive Tez in Yarn application logs which I got through(
> > yarn logs -applicationId ...) .
> >
> > a. Cannot see any ReduceOperator counters and also only
> DESERIALIZE_ERRORS
> > is the only counter present in MapOperator
> > b. The CPU_MILLISECONDS in some cases in -ve. Is CPU_MILLISECONDS
> accurate
> > c. What does COMMITTED_HEAP_BYTES indicate?
> > d. Is there any other place I should be checking the counters?
> >
> > [[File System Counters
> > FILE: BYTES_READ=512,
> > FILE: BYTES_WRITTEN=3079881,
> > FILE: READ_OPS=0, FILE: LARGE_READ_OPS=0, FILE: WRITE_OPS=0, HDFS:
> > BYTES_READ=8215153, HDFS: BYTES_WRITTEN=0, HDFS: READ_OPS=3, HDFS:
> > LARGE_READ_OPS=0, HDFS: WRITE_OPS=0]
> >
> > [org.apache.tez.common.counters.TaskCounter SPILLED_RECORDS=222543,
> > GC_TIME_MILLIS=172, *CPU_MILLISECONDS=-19700*,
> > PHYSICAL_MEMORY_BYTES=667566080, VIRTUAL_MEMORY_BYTES=1887797248,
> > COMMITTED_HEAP_BYTES=1011023872, INPUT_RECORDS_PROCESSED=222543,
> > OUTPUT_RECORDS=222543,
> > OUTPUT_BYTES=23543896,
> > OUTPUT_BYTES_WITH_OVERHEAD=23989024, OUTPUT_BYTES_PHYSICAL=3079369,
> > ADDITIONAL_SPILLS_BYTES_WRITTEN=0, ADDITIONAL_SPILLS_BYTES_READ=0,
> > ADDITIONAL_SPILL_COUNT=0]
> >
> >
> > [*org.apache.hadoop.hive.ql.exec.MapOperator*$Counter
> > DESERIALIZE_ERRORS=0]]
> >
> > Thanks
> > Suma
> >
>


Re: Hive on Tez Counters

Posted by Siddharth Seth <ss...@apache.org>.
I'll let Hive folks answer the questions about the Hive counters.

As for the CPU counter, that was a bug in Tez 0.4.0, which has been
fixed in 0.5.0.

COMMITTED_HEAP_BYTES just represents the memory available to the JVM
(Runtime.getRuntime().totalMemory()). This will only vary if the JVM is
started with different -Xms and -Xmx values.
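
To put the memory counters from the original post on a readable scale, here is a small sketch that converts them from bytes to MiB. The helper name to_mib is an assumption for illustration; the three values are copied from the dump in the first message.

```python
MIB = 1024 * 1024  # bytes per mebibyte


def to_mib(num_bytes):
    """Convert a byte count to mebibytes."""
    return num_bytes / MIB


# Values taken from the TaskCounter group in the original post.
memory_counters = {
    "PHYSICAL_MEMORY_BYTES": 667566080,
    "VIRTUAL_MEMORY_BYTES": 1887797248,
    "COMMITTED_HEAP_BYTES": 1011023872,
}

for name, value in memory_counters.items():
    print(f"{name}: {to_mib(value):.1f} MiB")
```

Read this way, the COMMITTED_HEAP_BYTES value in the post is a little under 1 GiB of heap committed to that task's JVM.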

In terms of Tez, the application logs are currently the best place. Hive
may expose these in a more accessible manner though.


On Wed, Aug 20, 2014 at 11:16 PM, Suma Shivaprasad <
sumasai.shivaprasad@gmail.com> wrote:

> Hi,
>
> Needed info on where I can get detailed job counters for Hive on Tez. Am
> running this on a HDP cluster with Hive 0.13 and see only the following job
> counters through Hive Tez in Yarn application logs which I got through(
> yarn logs -applicationId ...) .
>
> a. Cannot see any ReduceOperator counters and also only DESERIALIZE_ERRORS
> is the only counter present in MapOperator
> b. The CPU_MILLISECONDS in some cases in -ve. Is CPU_MILLISECONDS accurate
> c. What does COMMITTED_HEAP_BYTES indicate?
> d. Is there any other place I should be checking the counters?
>
> [[File System Counters
> FILE: BYTES_READ=512,
> FILE: BYTES_WRITTEN=3079881,
> FILE: READ_OPS=0, FILE: LARGE_READ_OPS=0, FILE: WRITE_OPS=0, HDFS:
> BYTES_READ=8215153, HDFS: BYTES_WRITTEN=0, HDFS: READ_OPS=3, HDFS:
> LARGE_READ_OPS=0, HDFS: WRITE_OPS=0]
>
> [org.apache.tez.common.counters.TaskCounter SPILLED_RECORDS=222543,
> GC_TIME_MILLIS=172, *CPU_MILLISECONDS=-19700*,
> PHYSICAL_MEMORY_BYTES=667566080, VIRTUAL_MEMORY_BYTES=1887797248,
> COMMITTED_HEAP_BYTES=1011023872, INPUT_RECORDS_PROCESSED=222543,
> OUTPUT_RECORDS=222543,
> OUTPUT_BYTES=23543896,
> OUTPUT_BYTES_WITH_OVERHEAD=23989024, OUTPUT_BYTES_PHYSICAL=3079369,
> ADDITIONAL_SPILLS_BYTES_WRITTEN=0, ADDITIONAL_SPILLS_BYTES_READ=0,
> ADDITIONAL_SPILL_COUNT=0]
>
>
> [*org.apache.hadoop.hive.ql.exec.MapOperator*$Counter
> DESERIALIZE_ERRORS=0]]
>
> Thanks
> Suma
>
