Posted to dev@spark.apache.org by Nicholas Chammas <ni...@gmail.com> on 2017/02/18 03:15:33 UTC

Will .count() always trigger an evaluation of each row?

Especially during development, people often use .count() or
.persist().count() to force evaluation of all rows — exposing any problems,
e.g. due to bad data — and to load data into cache to speed up subsequent
operations.
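(For concreteness, a minimal sketch of the pattern I mean; the input path is
made up:)

val df = spark.read.json("/data/raw/events")   // hypothetical input
df.persist()                                   // lazy: only marks df for caching
df.count()                                     // today this computes every row and fills the cache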

But as the optimizer gets smarter, I’m guessing it will eventually learn
that it doesn’t have to do all that work to give the correct count. (This
blog post
<https://databricks.com/blog/2017/02/16/processing-trillion-rows-per-second-single-machine-can-nested-loop-joins-fast.html>
suggests that something like this is already happening.) This will change
Spark’s practical behavior while technically preserving semantics.

What will people need to do then to force evaluation or caching?

Nick

Re: Will .count() always trigger an evaluation of each row?

Posted by zero323 <ms...@gmail.com>.
Technically speaking it is still possible, because the SQL CACHE TABLE
command is eager by default (unlike .cache()/.persist() on a DataFrame):


df.createOrReplaceTempView("df")   // expose the DataFrame as a temporary view
spark.sql("CACHE TABLE df")        // eagerly scans the view and caches it
spark.table("df")                  // a DataFrame backed by the cached data







Re: Will .count() always trigger an evaluation of each row?

Posted by Ryan Blue <rb...@netflix.com.INVALID>.
I think it is a great idea to have a way to force execution to build a
cached dataset.

The use case for this that we see the most is to build broadcast tables.
Right now, there's a 5-minute timeout for building a broadcast table. That's
plenty of time if the data is sitting in a table, but we see a lot of users
who have a dataframe with a complicated query plan that they know is small
enough to broadcast. If that query plan takes several stages to execute, the
job can fail because of the timeout. I usually recommend caching/persisting
the content and then running the broadcast join to avoid the timeout.
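(A rough sketch of that workaround; complicatedSmallDf, largeDf and the join
key are made-up names standing in for the real plans:)

import org.apache.spark.sql.functions.broadcast

val small = complicatedSmallDf.cache()    // the multi-stage plan we know is small
small.count()                             // materialize the cache before the broadcast timeout starts
val result = largeDf.join(broadcast(small), "id")   // the broadcast build now reads from the cache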

I realize that the right solution is to get rid of the timeout when
building a broadcast table, but forcing materialization is useful for
things like this. I'd like to see a legitimate way to do it, since people
currently rely on count.

rb

On Sun, Feb 19, 2017 at 3:14 AM, assaf.mendelson <as...@rsa.com>
wrote:

> I am not saying you should cache everything, just that it is a valid use
> case.



-- 
Ryan Blue
Software Engineer
Netflix

RE: Will .count() always trigger an evaluation of each row?

Posted by "assaf.mendelson" <as...@rsa.com>.
I am not saying you should cache everything, just that it is a valid use case.


From: Jörn Franke [via Apache Spark Developers List] [mailto:ml-node+s1001551n21026h95@n3.nabble.com]
Sent: Sunday, February 19, 2017 12:13 PM
To: Mendelson, Assaf
Subject: Re: Will .count() always trigger an evaluation of each row?

I think your example relates to scheduling, e.g. it makes sense to use Oozie or similar to fetch the data at specific points in time.

I am also not a big fan of caching everything. In a multi-user cluster with a lot of applications you waste a lot of resources and make everybody less efficient.


Re: Will .count() always trigger an evaluation of each row?

Posted by Jörn Franke <jo...@gmail.com>.
I think your example relates to scheduling, e.g. it makes sense to use Oozie or similar to fetch the data at specific points in time.

I am also not a big fan of caching everything. In a multi-user cluster with a lot of applications you waste a lot of resources and make everybody less efficient.


RE: Will .count() always trigger an evaluation of each row?

Posted by "assaf.mendelson" <as...@rsa.com>.
Actually, when I did a simple test on parquet (spark.read.parquet(“somefile”).cache().count()) the UI showed me that the entire file is cached. Is this just a fluke?

In any case I believe the question is still valid: how do we make sure a dataframe is actually cached?
Consider for example a case where we read from a remote host (which is costly) and we want to make sure the original read is done at a specific time (when the network is less crowded).
I for one have used .count() until now, but if that is not guaranteed to populate the cache, how would I do it? Of course I could always save the dataframe to disk, but that would cost a lot more in performance than I would like…

As for doing mapPartitions on the Dataset, wouldn’t that cause every row to be deserialized into the case class? That could also be heavy.
Maybe cache should take an eagerness flag that defaults to the current lazy behaviour, so we could call something like .cache(true) to materialize immediately (similar to the eager flag we have on checkpoint).
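(For comparison, a rough sketch of the checkpoint behaviour I mean, if I read the 2.1 API right; the directory is just an example:)

spark.sparkContext.setCheckpointDir("/tmp/spark-checkpoints")   // example directory
val materialized = df.checkpoint(eager = true)                  // computes df now and writes it out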
Assaf.






Re: Will .count() always trigger an evaluation of each row?

Posted by Matei Zaharia <ma...@gmail.com>.
Count is different on DataFrames and Datasets from RDDs. On RDDs, it always evaluates everything, but on DataFrame/Dataset, it turns into the equivalent of "select count(*) from ..." in SQL, which can be done without scanning the data for some data formats (e.g. Parquet). On the other hand though, caching a DataFrame / Dataset does require everything to be cached.
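(As a small illustration, assuming a Parquet input with a made-up path:)

val df = spark.read.parquet("/data/events")
df.count()        // may be answered from Parquet metadata without materializing every row
df.rdd.count()    // converting to an RDD first forces every row to be produced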

Matei



Re: Will .count() always trigger an evaluation of each row?

Posted by Sean Owen <so...@cloudera.com>.
I think the right answer is "don't do that" but if you really had to you
could trigger a Dataset operation that does nothing per partition. I
presume that would be more reliable because the whole partition has to be
computed to make it available in practice. Or, go so far as to loop over
every element.
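(Roughly something like this, as a sketch, with the row type spelled out:)

import org.apache.spark.sql.Row

df.foreachPartition((it: Iterator[Row]) => ())   // computes each partition and discards it
df.foreach((_: Row) => ())                       // or touch every element individually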
