You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@spark.apache.org by kant kodali <ka...@gmail.com> on 2017/06/25 09:27:31 UTC

Question on Spark code

Hi All,

I came across this file
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/internal/Logging.scala
and I am wondering what is the purpose of this? Especially it doesn't
prevent any string concatenation and also the if checks are already done by
the library itself right?

Thanks!

Re: Question on Spark code

Posted by Herman van Hövell tot Westerflier <hv...@databricks.com>.
I am not getting the question. The logging trait does exactly what is says
on the box, I don't see what string concatenation has to do with it.

On Sun, Jun 25, 2017 at 11:27 AM, kant kodali <ka...@gmail.com> wrote:

> Hi All,
>
> I came across this file https://github.com/apache/spark/blob/master/core/
> src/main/scala/org/apache/spark/internal/Logging.scala and I am wondering
> what is the purpose of this? Especially it doesn't prevent any string
> concatenation and also the if checks are already done by the library itself
> right?
>
> Thanks!
>
>


-- 

Herman van Hövell

Software Engineer

Databricks Inc.

hvanhovell@databricks.com

+31 6 420 590 27

databricks.com

[image: http://databricks.com] <http://databricks.com/>



[image: Announcing Databricks Serverless. The first serverless data science
and big data platform. Watch the demo from Spark Summit 2017.]
<http://go.databricks.com/announcing-databricks-serverless>

Re: Question on Spark code

Posted by Steve Loughran <st...@hortonworks.com>.
On 25 Jun 2017, at 20:57, kant kodali <ka...@gmail.com>> wrote:

impressive! I need to learn more about scala.

What I mean stripping away conditional check in Java is this.

static final boolean isLogInfoEnabled = false;

public void logMessage(String message) {
    if(isLogInfoEnabled) {
        log.info<http://log.info/>(message)
    }
}

If you look at the byte code the dead if check will be removed.





Generally it's skipped in Java too now people move to SLF4J APIs, which does on-demand string expansion

LOG.info<http://LOG.info>("network IO failure from {}  source to {}", src, dest, ex). That only builds the final string callis src.toString() and dest.toString() when needed; handling null values too. So you can skip those guards everywhere. But the string template is still constructed; it's not free, and there's some merit in maintaining the guard @ debug level, though I don't personally bother.

The spark one takes a closure, so it can do much more. However, you shouldn't do anything with side effects, or indeed, anything prone to throwing exceptions. Always try to write .toString() methods which are robust against null values, that is: valid for the entire life of an instance. Your debuggers will appreciate it too.


Re: Question on Spark code

Posted by Steve Loughran <st...@hortonworks.com>.
On 25 Jun 2017, at 20:57, kant kodali <ka...@gmail.com>> wrote:

impressive! I need to learn more about scala.

What I mean stripping away conditional check in Java is this.

static final boolean isLogInfoEnabled = false;

public void logMessage(String message) {
    if(isLogInfoEnabled) {
        log.info<http://log.info/>(message)
    }
}

If you look at the byte code the dead if check will be removed.





Generally it's skipped in Java too now people move to SLF4J APIs, which does on-demand string expansion

LOG.info<http://LOG.info>("network IO failure from {}  source to {}", src, dest, ex). That only builds the final string callis src.toString() and dest.toString() when needed; handling null values too. So you can skip those guards everywhere. But the string template is still constructed; it's not free, and there's some merit in maintaining the guard @ debug level, though I don't personally bother.

The spark one takes a closure, so it can do much more. However, you shouldn't do anything with side effects, or indeed, anything prone to throwing exceptions. Always try to write .toString() methods which are robust against null values, that is: valid for the entire life of an instance. Your debuggers will appreciate it too.


Re: Question on Spark code

Posted by kant kodali <ka...@gmail.com>.
impressive! I need to learn more about scala.

What I mean stripping away conditional check in Java is this.

static final boolean isLogInfoEnabled = false;

public void logMessage(String message) {
    if(isLogInfoEnabled) {
        log.info(message)
    }
}

If you look at the byte code the dead if check will be removed.








On Sun, Jun 25, 2017 at 12:46 PM, Sean Owen <so...@cloudera.com> wrote:

> I think it's more precise to say args like any expression are evaluated
> when their value is required. It's just that this special syntax causes
> extra code to be generated that makes it effectively a function passed, not
> value, and one that's lazily evaluated. Look at the bytecode if you're
> curious.
>
> An if conditional is pretty trivial to evaluate here. I don't think that
> guidance is sound. The point is that it's not worth complicating the caller
> code in almost all cases by checking the guard condition manually.
>
> I'm not sure what you're referring to, but no, no compiler can elide these
> conditions. They're based on runtime values that can change at runtime.
>
> scala has an @elidable annotation which you can use to indicate to the
> compiler that the declaration should be entirely omitted when configured to
> elide above a certain detail level. This is how scalac elides assertions if
> you ask it to and you can do it to your own code. But this is something
> different, not what's happening here, and a fairly niche/advanced feature.
>
>
> On Sun, Jun 25, 2017 at 8:25 PM kant kodali <ka...@gmail.com> wrote:
>
>> @Sean Got it! I come from Java world so I guess I was wrong in assuming
>> that arguments are evaluated during the method invocation time. How about
>> the conditional checks to see if the log is InfoEnabled or DebugEnabled?
>> For Example,
>>
>> if (log.isInfoEnabled) log.info(msg)
>>
>> I hear we should use guard condition only when the string "msg"
>> construction is expensive otherwise we will be taking a performance hit
>> because of the additional "if" check unless the "log" itself is declared
>> static final and scala compiler will strip away the "if" check and produce
>> efficient byte code. Also log.info does the log.isInfoEnabled check
>> inside the body prior to logging the messages.
>>
>> https://github.com/qos-ch/slf4j/blob/master/slf4j-
>> simple/src/main/java/org/slf4j/simple/SimpleLogger.java#L509
>> https://github.com/qos-ch/slf4j/blob/master/slf4j-
>> simple/src/main/java/org/slf4j/simple/SimpleLogger.java#L599
>>
>> Please correct me if I am wrong.
>>
>>
>>
>>
>> On Sun, Jun 25, 2017 at 3:04 AM, Sean Owen <so...@cloudera.com> wrote:
>>
>>> Maybe you are looking for declarations like this. "=> String" means the
>>> arg isn't evaluated until it's used, which is just what you want with log
>>> statements. The message isn't constructed unless it will be logged.
>>>
>>> protected def logInfo(msg: => String) {
>>>
>>>
>>> On Sun, Jun 25, 2017 at 10:28 AM kant kodali <ka...@gmail.com> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I came across this file https://github.com/
>>>> apache/spark/blob/master/core/src/main/scala/org/apache/
>>>> spark/internal/Logging.scala and I am wondering what is the purpose of
>>>> this? Especially it doesn't prevent any string concatenation and also the
>>>> if checks are already done by the library itself right?
>>>>
>>>> Thanks!
>>>>
>>>>
>>

Re: Question on Spark code

Posted by kant kodali <ka...@gmail.com>.
impressive! I need to learn more about scala.

What I mean stripping away conditional check in Java is this.

static final boolean isLogInfoEnabled = false;

public void logMessage(String message) {
    if(isLogInfoEnabled) {
        log.info(message)
    }
}

If you look at the byte code the dead if check will be removed.








On Sun, Jun 25, 2017 at 12:46 PM, Sean Owen <so...@cloudera.com> wrote:

> I think it's more precise to say args like any expression are evaluated
> when their value is required. It's just that this special syntax causes
> extra code to be generated that makes it effectively a function passed, not
> value, and one that's lazily evaluated. Look at the bytecode if you're
> curious.
>
> An if conditional is pretty trivial to evaluate here. I don't think that
> guidance is sound. The point is that it's not worth complicating the caller
> code in almost all cases by checking the guard condition manually.
>
> I'm not sure what you're referring to, but no, no compiler can elide these
> conditions. They're based on runtime values that can change at runtime.
>
> scala has an @elidable annotation which you can use to indicate to the
> compiler that the declaration should be entirely omitted when configured to
> elide above a certain detail level. This is how scalac elides assertions if
> you ask it to and you can do it to your own code. But this is something
> different, not what's happening here, and a fairly niche/advanced feature.
>
>
> On Sun, Jun 25, 2017 at 8:25 PM kant kodali <ka...@gmail.com> wrote:
>
>> @Sean Got it! I come from Java world so I guess I was wrong in assuming
>> that arguments are evaluated during the method invocation time. How about
>> the conditional checks to see if the log is InfoEnabled or DebugEnabled?
>> For Example,
>>
>> if (log.isInfoEnabled) log.info(msg)
>>
>> I hear we should use guard condition only when the string "msg"
>> construction is expensive otherwise we will be taking a performance hit
>> because of the additional "if" check unless the "log" itself is declared
>> static final and scala compiler will strip away the "if" check and produce
>> efficient byte code. Also log.info does the log.isInfoEnabled check
>> inside the body prior to logging the messages.
>>
>> https://github.com/qos-ch/slf4j/blob/master/slf4j-
>> simple/src/main/java/org/slf4j/simple/SimpleLogger.java#L509
>> https://github.com/qos-ch/slf4j/blob/master/slf4j-
>> simple/src/main/java/org/slf4j/simple/SimpleLogger.java#L599
>>
>> Please correct me if I am wrong.
>>
>>
>>
>>
>> On Sun, Jun 25, 2017 at 3:04 AM, Sean Owen <so...@cloudera.com> wrote:
>>
>>> Maybe you are looking for declarations like this. "=> String" means the
>>> arg isn't evaluated until it's used, which is just what you want with log
>>> statements. The message isn't constructed unless it will be logged.
>>>
>>> protected def logInfo(msg: => String) {
>>>
>>>
>>> On Sun, Jun 25, 2017 at 10:28 AM kant kodali <ka...@gmail.com> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I came across this file https://github.com/
>>>> apache/spark/blob/master/core/src/main/scala/org/apache/
>>>> spark/internal/Logging.scala and I am wondering what is the purpose of
>>>> this? Especially it doesn't prevent any string concatenation and also the
>>>> if checks are already done by the library itself right?
>>>>
>>>> Thanks!
>>>>
>>>>
>>

Re: Question on Spark code

Posted by Sean Owen <so...@cloudera.com>.
I think it's more precise to say args like any expression are evaluated
when their value is required. It's just that this special syntax causes
extra code to be generated that makes it effectively a function passed, not
value, and one that's lazily evaluated. Look at the bytecode if you're
curious.

An if conditional is pretty trivial to evaluate here. I don't think that
guidance is sound. The point is that it's not worth complicating the caller
code in almost all cases by checking the guard condition manually.

I'm not sure what you're referring to, but no, no compiler can elide these
conditions. They're based on runtime values that can change at runtime.

scala has an @elidable annotation which you can use to indicate to the
compiler that the declaration should be entirely omitted when configured to
elide above a certain detail level. This is how scalac elides assertions if
you ask it to and you can do it to your own code. But this is something
different, not what's happening here, and a fairly niche/advanced feature.


On Sun, Jun 25, 2017 at 8:25 PM kant kodali <ka...@gmail.com> wrote:

> @Sean Got it! I come from Java world so I guess I was wrong in assuming
> that arguments are evaluated during the method invocation time. How about
> the conditional checks to see if the log is InfoEnabled or DebugEnabled?
> For Example,
>
> if (log.isInfoEnabled) log.info(msg)
>
> I hear we should use guard condition only when the string "msg"
> construction is expensive otherwise we will be taking a performance hit
> because of the additional "if" check unless the "log" itself is declared
> static final and scala compiler will strip away the "if" check and produce
> efficient byte code. Also log.info does the log.isInfoEnabled check
> inside the body prior to logging the messages.
>
>
> https://github.com/qos-ch/slf4j/blob/master/slf4j-simple/src/main/java/org/slf4j/simple/SimpleLogger.java#L509
>
> https://github.com/qos-ch/slf4j/blob/master/slf4j-simple/src/main/java/org/slf4j/simple/SimpleLogger.java#L599
>
> Please correct me if I am wrong.
>
>
>
>
> On Sun, Jun 25, 2017 at 3:04 AM, Sean Owen <so...@cloudera.com> wrote:
>
>> Maybe you are looking for declarations like this. "=> String" means the
>> arg isn't evaluated until it's used, which is just what you want with log
>> statements. The message isn't constructed unless it will be logged.
>>
>> protected def logInfo(msg: => String) {
>>
>>
>> On Sun, Jun 25, 2017 at 10:28 AM kant kodali <ka...@gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> I came across this file
>>> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/internal/Logging.scala
>>> and I am wondering what is the purpose of this? Especially it doesn't
>>> prevent any string concatenation and also the if checks are already done by
>>> the library itself right?
>>>
>>> Thanks!
>>>
>>>
>

Re: Question on Spark code

Posted by Sean Owen <so...@cloudera.com>.
I think it's more precise to say args like any expression are evaluated
when their value is required. It's just that this special syntax causes
extra code to be generated that makes it effectively a function passed, not
value, and one that's lazily evaluated. Look at the bytecode if you're
curious.

An if conditional is pretty trivial to evaluate here. I don't think that
guidance is sound. The point is that it's not worth complicating the caller
code in almost all cases by checking the guard condition manually.

I'm not sure what you're referring to, but no, no compiler can elide these
conditions. They're based on runtime values that can change at runtime.

scala has an @elidable annotation which you can use to indicate to the
compiler that the declaration should be entirely omitted when configured to
elide above a certain detail level. This is how scalac elides assertions if
you ask it to and you can do it to your own code. But this is something
different, not what's happening here, and a fairly niche/advanced feature.


On Sun, Jun 25, 2017 at 8:25 PM kant kodali <ka...@gmail.com> wrote:

> @Sean Got it! I come from Java world so I guess I was wrong in assuming
> that arguments are evaluated during the method invocation time. How about
> the conditional checks to see if the log is InfoEnabled or DebugEnabled?
> For Example,
>
> if (log.isInfoEnabled) log.info(msg)
>
> I hear we should use guard condition only when the string "msg"
> construction is expensive otherwise we will be taking a performance hit
> because of the additional "if" check unless the "log" itself is declared
> static final and scala compiler will strip away the "if" check and produce
> efficient byte code. Also log.info does the log.isInfoEnabled check
> inside the body prior to logging the messages.
>
>
> https://github.com/qos-ch/slf4j/blob/master/slf4j-simple/src/main/java/org/slf4j/simple/SimpleLogger.java#L509
>
> https://github.com/qos-ch/slf4j/blob/master/slf4j-simple/src/main/java/org/slf4j/simple/SimpleLogger.java#L599
>
> Please correct me if I am wrong.
>
>
>
>
> On Sun, Jun 25, 2017 at 3:04 AM, Sean Owen <so...@cloudera.com> wrote:
>
>> Maybe you are looking for declarations like this. "=> String" means the
>> arg isn't evaluated until it's used, which is just what you want with log
>> statements. The message isn't constructed unless it will be logged.
>>
>> protected def logInfo(msg: => String) {
>>
>>
>> On Sun, Jun 25, 2017 at 10:28 AM kant kodali <ka...@gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> I came across this file
>>> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/internal/Logging.scala
>>> and I am wondering what is the purpose of this? Especially it doesn't
>>> prevent any string concatenation and also the if checks are already done by
>>> the library itself right?
>>>
>>> Thanks!
>>>
>>>
>

Re: Question on Spark code

Posted by kant kodali <ka...@gmail.com>.
@Sean Got it! I come from Java world so I guess I was wrong in assuming
that arguments are evaluated during the method invocation time. How about
the conditional checks to see if the log is InfoEnabled or DebugEnabled?
For Example,

if (log.isInfoEnabled) log.info(msg)

I hear we should use guard condition only when the string "msg"
construction is expensive otherwise we will be taking a performance hit
because of the additional "if" check unless the "log" itself is declared
static final and scala compiler will strip away the "if" check and produce
efficient byte code. Also log.info does the log.isInfoEnabled check inside
the body prior to logging the messages.

https://github.com/qos-ch/slf4j/blob/master/slf4j-simple/src/main/java/org/slf4j/simple/SimpleLogger.java#L509
https://github.com/qos-ch/slf4j/blob/master/slf4j-simple/src/main/java/org/slf4j/simple/SimpleLogger.java#L599

Please correct me if I am wrong.




On Sun, Jun 25, 2017 at 3:04 AM, Sean Owen <so...@cloudera.com> wrote:

> Maybe you are looking for declarations like this. "=> String" means the
> arg isn't evaluated until it's used, which is just what you want with log
> statements. The message isn't constructed unless it will be logged.
>
> protected def logInfo(msg: => String) {
>
>
> On Sun, Jun 25, 2017 at 10:28 AM kant kodali <ka...@gmail.com> wrote:
>
>> Hi All,
>>
>> I came across this file https://github.com/apache/spark/blob/master/core/
>> src/main/scala/org/apache/spark/internal/Logging.scala and I am
>> wondering what is the purpose of this? Especially it doesn't prevent any
>> string concatenation and also the if checks are already done by the library
>> itself right?
>>
>> Thanks!
>>
>>

Re: Question on Spark code

Posted by kant kodali <ka...@gmail.com>.
@Sean Got it! I come from Java world so I guess I was wrong in assuming
that arguments are evaluated during the method invocation time. How about
the conditional checks to see if the log is InfoEnabled or DebugEnabled?
For Example,

if (log.isInfoEnabled) log.info(msg)

I hear we should use guard condition only when the string "msg"
construction is expensive otherwise we will be taking a performance hit
because of the additional "if" check unless the "log" itself is declared
static final and scala compiler will strip away the "if" check and produce
efficient byte code. Also log.info does the log.isInfoEnabled check inside
the body prior to logging the messages.

https://github.com/qos-ch/slf4j/blob/master/slf4j-simple/src/main/java/org/slf4j/simple/SimpleLogger.java#L509
https://github.com/qos-ch/slf4j/blob/master/slf4j-simple/src/main/java/org/slf4j/simple/SimpleLogger.java#L599

Please correct me if I am wrong.




On Sun, Jun 25, 2017 at 3:04 AM, Sean Owen <so...@cloudera.com> wrote:

> Maybe you are looking for declarations like this. "=> String" means the
> arg isn't evaluated until it's used, which is just what you want with log
> statements. The message isn't constructed unless it will be logged.
>
> protected def logInfo(msg: => String) {
>
>
> On Sun, Jun 25, 2017 at 10:28 AM kant kodali <ka...@gmail.com> wrote:
>
>> Hi All,
>>
>> I came across this file https://github.com/apache/spark/blob/master/core/
>> src/main/scala/org/apache/spark/internal/Logging.scala and I am
>> wondering what is the purpose of this? Especially it doesn't prevent any
>> string concatenation and also the if checks are already done by the library
>> itself right?
>>
>> Thanks!
>>
>>

Re: Question on Spark code

Posted by Sean Owen <so...@cloudera.com>.
Maybe you are looking for declarations like this. "=> String" means the arg
isn't evaluated until it's used, which is just what you want with log
statements. The message isn't constructed unless it will be logged.

protected def logInfo(msg: => String) {

On Sun, Jun 25, 2017 at 10:28 AM kant kodali <ka...@gmail.com> wrote:

> Hi All,
>
> I came across this file
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/internal/Logging.scala
> and I am wondering what is the purpose of this? Especially it doesn't
> prevent any string concatenation and also the if checks are already done by
> the library itself right?
>
> Thanks!
>
>

Re: Question on Spark code

Posted by Herman van Hövell tot Westerflier <hv...@databricks.com>.
I am not getting the question. The logging trait does exactly what is says
on the box, I don't see what string concatenation has to do with it.

On Sun, Jun 25, 2017 at 11:27 AM, kant kodali <ka...@gmail.com> wrote:

> Hi All,
>
> I came across this file https://github.com/apache/spark/blob/master/core/
> src/main/scala/org/apache/spark/internal/Logging.scala and I am wondering
> what is the purpose of this? Especially it doesn't prevent any string
> concatenation and also the if checks are already done by the library itself
> right?
>
> Thanks!
>
>


-- 

Herman van Hövell

Software Engineer

Databricks Inc.

hvanhovell@databricks.com

+31 6 420 590 27

databricks.com

[image: http://databricks.com] <http://databricks.com/>



[image: Announcing Databricks Serverless. The first serverless data science
and big data platform. Watch the demo from Spark Summit 2017.]
<http://go.databricks.com/announcing-databricks-serverless>

Re: Question on Spark code

Posted by Sean Owen <so...@cloudera.com>.
Maybe you are looking for declarations like this. "=> String" means the arg
isn't evaluated until it's used, which is just what you want with log
statements. The message isn't constructed unless it will be logged.

protected def logInfo(msg: => String) {

On Sun, Jun 25, 2017 at 10:28 AM kant kodali <ka...@gmail.com> wrote:

> Hi All,
>
> I came across this file
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/internal/Logging.scala
> and I am wondering what is the purpose of this? Especially it doesn't
> prevent any string concatenation and also the if checks are already done by
> the library itself right?
>
> Thanks!
>
>