
Posted to dev@freemarker.apache.org by Christoph Rüger <c....@synesty.com> on 2019/04/02 10:10:16 UTC

Re: Lambda Expressions - filter list without <#list> directive

On Mon, Apr 1, 2019 at 00:38, Daniel Dekany <ddekany@apache.org> wrote:

> Monday, February 25, 2019, 10:19:25 AM, Christoph Rüger wrote:
>
> [snip]
> >> > Overall those lambdas allow some great new use cases with pretty concise
> >> > syntax. That is good. The ?filter and ?map built-ins are very cool already.
> >> > If some existing built-ins will be made "smarter" (as you said), then it
> >> > will be a great enhancement.
> >> >
> >> > Speaking for us, performance and memory usage is always a concern. So it
> >> > would be good to keep an eye on avoiding large new in-memory structures.
> >>
> >> Template authors can always do something like
> >> <#assign cheapProducts = prods?filter(...)>, and that will collect
> >> everything into a List internally, as it's eager processing. If an
> >> aggregating ?groupBy is concerning, then this is even more so.
> >
> >
> > Hmm... well.... difficult....
> >
> > So, *<#assign cheapProducts = prods?filter(...)>* will create a new list,
> > while the following does not:
> >
> >  <#list products?filter(it -> it.price < 1000) as product>
> >     ${product.name}
> >   </#list>
> >
> > Right?
> >
> > Earlier you wrote:
> >
> > *"Of course these built-ins aren't specific to #list, they can be
> > used anywhere. Naturally, they can be chained as well:"*
> >
> > What would be a downside of allowing ?filter / ?map only in #list?
>
> It makes them significantly less useful. A major annoyance that users
> run into in FTL is that you can't build a new sequence (wrapped List)
> based on an existing one. They may want to do that, and then assign it
> to a variable, or pass it to a custom macro as an argument. (OK, you can,
> with sequence concatenation, but that's very inefficient if you
> use it to build a sequence element by element.) ?filter and ?map
> would help in a lot of those use cases where the users wanted to build a
> new sequence.
>
> Speaking of which, nothing stops users from building gigantic
> sequences from some iterable you provide with sequence concatenation.
> Sure, it's less tempting to do than applying ?filter and ?map, but
> still... users do it.


> > I guess it makes it more complicated to explain.
>
> That too, and also it must be quite off-putting to discover such a
> limitation. After all, it's natural to expect that if you can have
> <#list someExp>, then you can factor out someExp into an earlier
> assignment.
>
> Besides, if we really wanted to limit these to #list, then they would
> be part of the #list syntax, like <#list xs as x where x.foo>.
> Kind of like SQL (yuck).


> > Difficult tradeoffs here :) Sorry sometimes I get confused by jumping
> > between lazy and eager processing.
>
> Well, if you fear users jumping on ?filter/?map outside #list for no
> good enough reason, there can be some option to handle that. But I
> don't think restricting the usage to #list is a good compromise as the
> default.
>

I agree. Just keep it as it is.
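
To make the eager/lazy distinction above concrete, here is a rough Java sketch. It is only an analogy and not FreeMarker's implementation: the class EagerVsLazySketch and both method names are hypothetical, and the 1000 price limit is borrowed from the template example quoted earlier. The first method corresponds to assigning the filtered result to a variable (a new list is built), the second to iterating the filtered result directly in #list (matches are passed on one by one).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration only; not FreeMarker code.
final class EagerVsLazySketch {

    /** Eager: every match is copied into a new in-memory List first. */
    static List<Integer> collectCheap(Iterator<Integer> prices) {
        List<Integer> result = new ArrayList<>();
        while (prices.hasNext()) {
            Integer price = prices.next();
            if (price < 1000) {
                result.add(price);
            }
        }
        return result;
    }

    /** Lazy: each match goes straight to the consumer; no intermediate List is built. */
    static void forEachCheap(Iterator<Integer> prices, Consumer<Integer> sink) {
        while (prices.hasNext()) {
            Integer price = prices.next();
            if (price < 1000) {
                sink.accept(price);
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> prices = Arrays.asList(1500, 200, 900);
        System.out.println(collectCheap(prices.iterator()));
        forEachCheap(prices.iterator(), price -> System.out.println(price));
    }
}
```

Both variants visit the same elements; the difference is only whether the matches are held in memory all at once.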


>
> >> I'm not sure how efficiently a configuration setting could catch these
> >> cases, or if it should be addressed on that level.
> >>
> >
> > Maybe let's postpone configurability discussion a bit until the above is
> > more clear.
>
> In the light of the above, I think we can start thinking about that
> now.
>

On the topic of configurability: would it be possible to programmatically
influence the Collection (Sequence) that is created under the hood,
e.g. by specifying a factory? I ask because we are using something like
this
(https://dzone.com/articles/a-filebasedcollection-in-java-for-big-collections)
in other places for large collections. I know it is very specific, but I just
wanted to bring it up.


> On a different topic... I have added some optimizations to ?size. So for
> example if you have <#if list?filter(foo)?size != 0>, then not only will it
> not build any in-memory list (?size never does that, as it only has to
> count), but it only fetches elements until it finds one that matches
> the filter predicate, as at that point the comparison result is already
> known. This works for comparison with any integer literal, and with
> other comparison operators as well.
>
> (Also, list?map(foo)?size basically falls back to doing list?size.
> Admittedly that doesn't have much practical application, but still...)
>
>
These are great optimizations.
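
A minimal Java sketch of the short-circuit idea described above, for illustration only. This is not FreeMarker's actual code: the class ShortCircuitSizeSketch, the method matchCountExceeds and the fetched counter are hypothetical names. It answers the question "is the number of matches greater than a threshold?" and stops reading the source as soon as the answer is known, which is the behaviour described for ?filter(...)?size != 0 (threshold 0).

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Predicate;

// Hypothetical illustration only; not FreeMarker code.
final class ShortCircuitSizeSketch {

    /** Number of elements pulled from the source by the last call (for demonstration). */
    static int fetched;

    /**
     * Answers "do more than `threshold` elements match?" and stops reading
     * the source as soon as the answer is known.
     */
    static <T> boolean matchCountExceeds(Iterator<T> source, Predicate<? super T> predicate, int threshold) {
        fetched = 0;
        if (threshold < 0) {
            return true; // a count is never negative, so nothing needs to be read
        }
        int matches = 0;
        while (source.hasNext()) {
            T item = source.next();
            fetched++;
            if (predicate.test(item)) {
                matches++;
                if (matches > threshold) {
                    return true; // decided; the remaining elements are never fetched
                }
            }
        }
        return false; // source exhausted without exceeding the threshold
    }

    public static void main(String[] args) {
        Iterator<Integer> prices = Arrays.asList(1500, 200, 900, 3000, 50).iterator();
        // Analogous to: list?filter(p -> p < 1000)?size != 0
        boolean anyCheap = matchCountExceeds(prices, p -> p < 1000, 0);
        System.out.println(anyCheap + ", fetched " + fetched + " of 5 elements");
    }
}
```

With the sample data the second element already matches, so this prints "true, fetched 2 of 5 elements". Other comparison operators could be reduced to the same question with a different threshold or a negated result.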


> --
> Thanks,
>  Daniel Dekany
>
>


Re: Lambda Expressions - filter list without <#list> directive

Posted by Dániel Dékány <dd...@apache.org>.

I was thinking about this, and realized that there are two problems with "where" (though maybe "where" is still better than "filter" overall):

1. I bet users will often write products?where(price < 1000), instead of products?where(p -> p.price < 1000). In SQL, the context is implicitly the table row, so you don't have a lambda argument. With ?filter the same mistake is less likely, as your brain doesn't switch to SQL mode.

2. We also have ?map, ?take_while and ?drop_while. These are Java Stream API names. So after seeing these, many will think there must be ?filter as well. (Of course the error message can tell them that it's "?where" instead, but it can still be annoying.)

For now I left it as ?filter, but it's still not too late to change it. Does anyone else have an opinion?
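
As a reference point for the naming question: in the Java Stream API, filter keeps the elements for which the predicate is true. A small sketch follows; the User class and the namesMatching method are made-up names for this illustration, not part of FreeMarker.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical illustration only; the User class is made up for this sketch.
final class FilterKeepsMatchesSketch {

    static final class User {
        final String name;
        final boolean inactive;

        User(String name, boolean inactive) {
            this.name = name;
            this.inactive = inactive;
        }
    }

    /** Returns the names of the users for which the condition is true. */
    static List<String> namesMatching(List<User> users, Predicate<User> condition) {
        return users.stream()
                .filter(condition) // keeps matches, drops everything else
                .map(user -> user.name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(new User("ann", false), new User("bob", true));
        System.out.println(namesMatching(users, user -> user.inactive));
        System.out.println(namesMatching(users, user -> !user.inactive));
    }
}
```

This prints [bob] and then [ann]: filtering by "inactive" yields the inactive users, it does not remove them.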


Re: Lambda Expressions - filter list without <#list> directive

Posted by Denis Bredelet <br...@mac.com.INVALID>.

> On Jul 2, 2019, at 20:29, Pete Helgren <pe...@valadd.com> wrote:
> 
> As a more casual Java programmer, the "where" option is much clearer to me. I spend more time using FM syntax than changing the Java underneath, so from a "fading memory" standpoint, "where" would lead to fewer "What the....?" moments,  for me at least.

I prefer « where » for the reasons Daniel mentioned, also SQL uses WHERE.

I think SQL has as many users as Javascript, no?

— Denis.


Re: Lambda Expressions - filter list without <#list> directive

Posted by Pete Helgren <pe...@valadd.com>.

As a more casual Java programmer, the "where" option is much clearer to 
me. I spend more time using FM syntax than changing the Java underneath, 
so from a "fading memory" standpoint, "where" would lead to fewer "What 
the....?" moments,  for me at least.

Pete Helgren
www.petesworkshop.com
GIAC Secure Software Programmer-Java
Twitter - Sys_i_Geek  IBM_i_Geek

On 7/2/2019 2:08 PM, Christoph Rüger wrote:
> Good point. It seems you are not the first to stumble on that one.
> I quickly searched around and found:
>
> Similar question on SO:
> https://stackoverflow.com/questions/45939202/filter-naming-convention
> Javascript: filter :
> https://developer.mozilla.org/de/docs/Web/JavaScript/Reference/Global_Objects/Array/filter
> Spark SQL -> "where" is an alias for "filter":
> https://stackoverflow.com/a/33887122/135535
> <https://stackoverflow.com/questions/33885979/difference-between-filter-and-where-in-scala-spark-sql>
> -> search for "filter" or "where" on
> https://spark.apache.org/docs/1.5.2/api/scala/index.html#org.apache.spark.sql.DataFrame
> R Statistics Language : filter
> https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html#filter-rows-with-filter
>
> Python: filter https://www.geeksforgeeks.org/filter-in-python/
> Ruby: they use select:
> https://www.codementor.io/tips/8247613177/how-to-filter-arrays-of-data-in-ruby
> Kotlin: filter:
> https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.collections/filter.html
>
> These languages rank near the top of the Stack Overflow survey:
> https://insights.stackoverflow.com/survey/2019#technology-_-programming-scripting-and-markup-languages
>
> I agree that "where" reads pretty nicely. I like it. But "filter" seems to be
> found in multiple common languages supporting lambda-like syntax.
> Python and R are especially common in the data science / statistics
> community, which is a different target group than e.g. Java programmers.
> Also, web developers these days are doing lots of JavaScript to build "html"
> websites / templates - and JavaScript also uses "filter".
>
> My vote would still go for "filter", because I think we are working on
> lists of objects and objects are closer to "programming" than to "sql".
> Maybe the "where" alias would be a compromise - but it might also be confusing
> to have both.
>
> What do others think?
>
> Thanks
> Christoph
>
>
>
>
>
>
>
> On Tue, Jul 2, 2019 at 20:27, Daniel Dekany <ddekany@apache.org> wrote:
>> I wonder if "filter" is a good name. For Java 8 programmers it's
>> a given, but otherwise I find it confusing, as it's not clear if you
>> specify what to filter out, or what to keep. Worse, I believe in everyday
>> English "foo filter" or "filters foo" means removing foo-s because
>> you don't want them, which is just the opposite of the meaning in
>> Java. So I think "where", which is familiar to many from SQL (for
>> most Java programmers as well, but also for non-Java programmers),
>> would be better. Consider:
>>
>>    users?filter(user -> user.inactive)
>>
>> VS
>>
>>    users?where(user -> user.inactive)
>>
>> The first can be easily misunderstood as removing the inactive users,
>> while the meaning of the second is obvious.
>>
>>
>> Tuesday, July 2, 2019, 2:57:52 PM, Christoph Rüger wrote:
>>
>>> Thanks for the heads up. Very nice. We will run our test suite to see if
>>> those tests are still green.
>>>
>>> On Mon, Jul 1, 2019 at 09:30, Daniel Dekany <ddekany@freemail.hu> wrote:
>>>> Since then I have also made a change that ensures that if the lambda
>>>> argument is null (which in FTL is the same as if the variable isn't
>>>> there at all), then it will not fall back to find an identically named
>>>> variable in higher variable scopes. This is important when doing
>>>> things like:
>>>>
>>>>    <#-- filters out null-s -->
>>>>    myList?filter(it -> it??)
>>>>
>>>> because if some day someone adds a variable called "it" to the
>>>> data-model, then suddenly the above won't filter out the null-s.
>>>>
>>>> The same thing was always an issue with #list loop variables as well,
>>>> also with #nested arguments. So I have added a configuration setting
>>>> called "fallbackOnNullLoopVariable", which is by default true
>>>> (unfortunate historical baggage... but we can't break backward
>>>> compatibility). If you set it to false, then this will print "N/A" at
>>>> null list items, rather than "Clashing variable in higher scope":
>>>>
>>>> <#assign it = "Clashing variable in higher scope">
>>>> <#list myList as it>
>>>>    ${it!'N/A'}
>>>> </#list>
>>>>
>>>> These changes are pushed and deployed to the Apache snapshot Maven
>>>> repo in both branches.
>>>>
>>>>
>>>> So, apart from documentation, the local lambda feature is about ready,
>>>> or so I hope. I'm worried about rough edges though, so I think I will add
>>>> lambda support to some more built-ins (?seq_contains, ?sort_by), and
>>>> explore some more use cases... If you have your own that you actually
>>>> keep running into, or that you want to be in 2.3.29, let me know.
>>>>
>>>>
>>>> Monday, June 24, 2019, 1:59:21 AM, Daniel Dekany wrote:
>>>>
>>>>> Well, I'm not exactly fast nowadays either... Anyway, I have pushed
>>>>> and deployed to the snapshot repo the changes I was talking about
>>>>> recently. That is, ?map or ?filter won't make a sequence out of an
>>>>> enumerable non-sequence (typically an Iterator) anymore. The concern
>>>>> was that if hugeResultSet is an Iterator because it's
>>>>> huge, then someone might write:
>>>>>
>>>>>    <#assign transformed = hugeResultSet?map(it -> something(it))>
>>>>>    <#list transformed as it>
>>>>>
>>>>> instead of just
>>>>>
>>>>>    <#list hugeResultSet?map(it -> something(it)) as it>
>>>>>
>>>>> and thus consume a lot of memory without realizing it. So now if
>>>>> hugeResultSet wasn't already a sequence (List-like), the assignment
>>>>> will be an error, since we can't safely store a lazily transformed
>>>>> collection (lambdas will break), and we can't condense it down to a
>>>>> sequence (List-like thing) automatically either, as that might
>>>>> consume too much memory. If hugeResultSet was a sequence, then it's
>>>>> not an error, as we assume that keeping all of it in memory is fine,
>>>>> as the original was stored there as well (in practice, most of the
>>>>> time... in principle we can't know).
>>>>>
>>>>> Now if the user feels confident about it, they can still write:
>>>>>
>>>>>    <#assign transformed = hugeResultSet?map(it -> something(it))?sequence>
>>>>>
>>>>> Similarly, hugeResultSet?map(it -> something(it))[index] will be an
>>>>> error, as [index] is for sequences only, and ?map will not change a
>>>>> non-sequence to a sequence anymore. Again, if the user feels
>>>>> confident about it, they can write hugeResultSet?map(it ->
>>>>> something(it))?sequence[index].
>>>>>
>>>>> An interesting consequence of these is that ?sequence is now a bit
>>>>> smarter than before. For example, if you write myIterator?sequence[n], it
>>>>> will not fetch the elements into an in-memory sequence, it just skips n
>>>>> elements from myIterator, and returns the nth one. Similarly,
>>>>> myIterator?sequence?size won't store the elements in memory, it just
>>>>> counts them.
>>>>>
>>>>> As an interesting note, these two are also identically efficient:
>>>>>
>>>>>    <#assign seq = hugeResultSet?filter(it -> something(it))?sequence>
>>>>>    <#assign seq = hugeResultSet?sequence?filter(it -> something(it))>
>>>>>
>>>>> In both cases the actual conversion to a sequence (in-memory list)
>>>>> happens only just before assigning the value to seq. Once again,
>>>>> ?sequence now just means "it's OK to treat this as a sequence, however
>>>>> inefficient it is", and not "convert it to sequence right now".
>>>>>
>>>>>
>>>>> Friday, June 7, 2019, 10:38:50 AM, Christoph Rüger wrote:
>>>>>
>>>>>> These optimisations sound great. I will try to run some tests within the
>>>>>> next weeks. A bit busy lately.
>>>>>> Thanks
>>>>>> Christoph
>>>>>>
>>>>>> On Wed, May 29, 2019 at 23:55, Daniel Dekany <ddekany@apache.org> wrote:
>>>>>>> Tuesday, April 2, 2019, 12:10:16 PM, Christoph Rüger wrote:
>>>>>>>
>>>>>>> [snip]
>>>>>>>>> Well, if you fear users jumping on ?filter/?map outside #list for no
>>>>>>>>> good enough reason, there can be some option to handle that. But I
>>>>>>>>> don't think restricting the usage to #list is a good compromise as the
>>>>>>>>> default.
>>>>>>>> I agree. Just keep as it is.
>>>>>>>>
>>>>>>>>>>> I'm not sure how efficiently could a configuration setting catch these
>>>>>>>>>>> cases, or if it should be addressed on that level.
>>>>>>>>>> Maybe let's postpone configurability discussion a bit until the above is
>>>>>>>>>> more clear.
>>>>>>>>> In the light of the above, I think we can start thinking about that
>>>>>>>>> now.
>>>>>>>> On that note on configurability: Would it be possible to programmatically
>>>>>>>> influence the Collection (Sequence) which is created under the hood?
>>>>>>>> E.g. by specifying a Factory? I ask because we are using something like
>>>>>>>> this (
>>>>>>>> https://dzone.com/articles/a-filebasedcollection-in-java-for-big-collections)
>>>>>>>> in other places for large collections. I know it is very specific, but just
>>>>>>>> wanted to bring it up.
>>>>>>> [snip]
>>>>>>>
>>>>>>> I think a good approach would be to ban the *implicit* collection of
>>>>>>> the result, when the filtered/mapped source is an Iterator, or other
>>>>>>> similar stream-like object that's often used for enumerating a huge
>>>>>>> number of elements. So for example, let's say you have this:
>>>>>>>
>>>>>>>    <#assign xs2 = xs?filter(f)>
>>>>>>>
>>>>>>> If xs is List-like, then this will work. Since the xs List fits into
>>>>>>> memory (although a List can be backed by disk, that's rather
>>>>>>> rare), hopefully it's not the kind of data amount that can't fit into
>>>>>>> memory again (as xs2). On the other hand, if xs is an
>>>>>>> Iterator-like object, then the above statement fails, with the hint
>>>>>>> that xs?filter(f)?sequence would work, but might consume a lot of
>>>>>>> memory.
>>>>>>>
>>>>>>> This is also consistent with how xs[i] works in the existing
>>>>>>> FreeMarker versions. That only works if xs is List-like (an FTL
>>>>>>> sequence). While xs[i] would be trivial to implement even if xs is
>>>>>>> Iterator-like, we don't do that as it's not efficient for a high i,
>>>>>>> and so the template author is probably not meant to do that. If he
>>>>>>> knows what he's doing though, he can write xs?sequence[i]. Yes, that's
>>>>>>> very inefficient if you only use [] once on that sequence, but you see
>>>>>>> the logic. map/filter breaks it, as xs?filter(f)[i] works even if xs
>>>>>>> is an Iterator, because filter/map currently always returns a
>>>>>>> sequence. If xs is Iterator-like, then I want filter/map to return an
>>>>>>> Iterator-like as well, so then [] will fail on it.
>>>>>>>
>>>>>>> As a side note, I will make ?sequence smarter too, so that
>>>>>>> xs?sequence[i] won't actually build a sequence if xs is Iterator-like.
>>>>>>> It just has to skip the first i elements after all. (The ?sequence is
>>>>>>> still required there. It basically says: "I know what I'm doing, treat
>>>>>>> this as a sequence.")
>>>>>>>
>>>>>>> --
>>>>>>> Thanks,
>>>>>>>   Daniel Dekany
>>>>>>>
>>>>>>>
>>>> --
>>>> Thanks,
>>>>   Daniel Dekany
>>>>
>>>>
>>> --
>>> Christoph Rüger, Managing Director
>>> Synesty <https://synesty.com/> - Connect and automate without
>>> programming - automation, interfaces, data feeds
>>>
>>> Xing: https://www.xing.com/profile/Christoph_Rueger2
>>> LinkedIn: http://www.linkedin.com/pub/christoph-rueger/a/685/198
>>>
>> --
>> Thanks,
>>   Daniel Dekany
>>
>>

Re: Lambda Expressions - filter list without <#list> directive

Posted by Christoph Rüger <c....@synesty.com>.

Good point. It seems you are not the first to stumble on that one.
I quickly searched around and found:

Similar question on SO:
https://stackoverflow.com/questions/45939202/filter-naming-convention
Javascript: filter :
https://developer.mozilla.org/de/docs/Web/JavaScript/Reference/Global_Objects/Array/filter
Spark SQL -> "where" is an alias for "filter":
https://stackoverflow.com/a/33887122/135535
<https://stackoverflow.com/questions/33885979/difference-between-filter-and-where-in-scala-spark-sql>
-> search for "filter" or "where" on
https://spark.apache.org/docs/1.5.2/api/scala/index.html#org.apache.spark.sql.DataFrame
R Statistics Language : filter
https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html#filter-rows-with-filter

Python: filter https://www.geeksforgeeks.org/filter-in-python/
Ruby: they use select:
https://www.codementor.io/tips/8247613177/how-to-filter-arrays-of-data-in-ruby
Kotlin: filter:
https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.collections/filter.html

These languages rank near the top of the Stack Overflow survey:
https://insights.stackoverflow.com/survey/2019#technology-_-programming-scripting-and-markup-languages

I agree that "where" reads pretty nicely. I like it. But "filter" seems to be
found in multiple common languages supporting lambda-like syntax.
Python and R are especially common in the data science / statistics
community, which is a different target group than e.g. Java programmers.
Also, web developers these days are doing lots of JavaScript to build "html"
websites / templates - and JavaScript also uses "filter".

My vote would still go for "filter", because I think we are working on
lists of objects and objects are closer to "programming" than to "sql".
Maybe the "where" alias would be a compromise - but it might also be confusing
to have both.

What do others think?

Thanks
Christoph







On Tue, Jul 2, 2019 at 20:27, Daniel Dekany <ddekany@apache.org> wrote:

> I wonder if "filter" is a good name. For Java 8 programmers it's
> a given, but otherwise I find it confusing, as it's not clear if you
> specify what to filter out, or what to keep. Worse, I believe in everyday
> English "foo filter" or "filters foo" means removing foo-s because
> you don't want them, which is just the opposite of the meaning in
> Java. So I think "where", which is familiar to many from SQL (for
> most Java programmers as well, but also for non-Java programmers),
> would be better. Consider:
>
>   users?filter(user -> user.inactive)
>
> VS
>
>   users?where(user -> user.inactive)
>
> The first can be easily misunderstood as removing the inactive users,
> while the meaning of the second is obvious.
>
>
> Tuesday, July 2, 2019, 2:57:52 PM, Christoph Rüger wrote:
>
> > Thanks for the heads up. Very nice. We will run our test suite to see if
> > those tests are still green.
> >
> > On Mon, Jul 1, 2019 at 09:30, Daniel Dekany <ddekany@freemail.hu> wrote:
> >
> >> Since then I have also made a change that ensures that if the lambda
> >> argument is null (which in FTL is the same as if the variable isn't
> >> there at all), then it will not fall back to find an identically named
> >> variable in higher variable scopes. This is important when doing
> >> things like:
> >>
> >>   <#-- filters out null-s -->
> >>   myList?filter(it -> it??)
> >>
> >> because if some day someone adds a variable called "it" to the
> >> data-model, then suddenly the above won't filter out the null-s.
> >>
> >> The same thing was always an issue with #list loop variables as well,
> >> also with #nested arguments. So I have added a configuration setting
> >> called "fallbackOnNullLoopVariable", which is by default true
> >> (unfortunate historical baggage... but we can't break backward
> >> compatibility). If you set it to false, then this will print "N/A" at
> >> null list items, rather than "Clashing variable in higher scope":
> >>
> >> <#assign it = "Clashing variable in higher scope">
> >> <#list myList as it>
> >>   ${it!'N/A'}
> >> </#list>
> >>
> >> These changes are pushed and deployed to the Apache snapshot Maven
> >> repo in both branches.
> >>
> >>
> >> So, apart from documentation, the local lambda feature is about ready,
> >> or so I hope. I'm worried about rough edges though, so I think I will add
> >> lambda support to some more built-ins (?seq_contains, ?sort_by), and
> >> explore some more use cases... If you have your own that you actually
> >> keep running into, or that you want to be in 2.3.29, let me know.
> >>
> >>
> >> Monday, June 24, 2019, 1:59:21 AM, Daniel Dekany wrote:
> >>
> >> > Well, I'm not exactly fast nowadays either... Anyway, I have pushed
> >> > and deployed to the snapshot repo the changes I was talking about
> >> > recently. That is, ?map or ?filter won't make a sequence out of an
> >> > enumerable non-sequence (typically an Iterator) anymore. The concern
> >> > was that if hugeResultSet is an Iterator because it's
> >> > huge, then someone might write:
> >> >
> >> >   <#assign transformed = hugeResultSet?map(it -> something(it))>
> >> >   <#list transformed as it>
> >> >
> >> > instead of just
> >> >
> >> >   <#list hugeResultSet?map(it -> something(it)) as it>
> >> >
> >> > and thus consume a lot of memory without realizing it. So now if
> >> > hugeResultSet wasn't already a sequence (List-like), the assignment
> >> > will be an error, since we can't safely store a lazily transformed
> >> > collection (lambdas will break), and we can't condense it down to a
> >> > sequence (List-like thing) automatically either, as that might
> >> > consume too much memory. If hugeResultSet was a sequence, then it's
> >> > not an error, as we assume that keeping all of it in memory is fine,
> >> > as the original was stored there as well (in practice, most of the
> >> > time... in principle we can't know).
> >> >
> >> > Now if the user feels confident about it, they can still write:
> >> >
> >> >   <#assign transformed = hugeResultSet?map(it -> something(it))?sequence>
> >> >
> >> > Similarly, hugeResultSet?map(it -> something(it))[index] will be an
> >> > error, as [index] is for sequences only, and ?map will not change a
> >> > non-sequence to a sequence anymore. Again, if the user feels
> >> > confident about it, they can write hugeResultSet?map(it ->
> >> > something(it))?sequence[index].
> >> >
> >> > An interesting consequence of these is that ?sequence is now a bit
> >> > smarter than before. For example, if you write myIterator?sequence[n], it
> >> > will not fetch the elements into an in-memory sequence, it just skips n
> >> > elements from myIterator, and returns the nth one. Similarly,
> >> > myIterator?sequence?size won't store the elements in memory, it just
> >> > counts them.
> >> >
> >> > As an interesting note, these two are also identically efficient:
> >> >
> >> >   <#assign seq = hugeResultSet?filter(it -> something(it))?sequence>
> >> >   <#assign seq = hugeResultSet?sequence?filter(it -> something(it))>
> >> >
> >> > In both cases the actual conversion to a sequence (in-memory list)
> >> > happens only just before assigning the value to seq. Once again,
> >> > ?sequence now just means "it's OK to treat this as a sequence, however
> >> > inefficient it is", and not "convert it to sequence right now".
> >> >
> >> >

-- 
Synesty GmbH
Moritz-von-Rohr-Str. 1a
07745 Jena
Tel.: +49 3641 5596493
Internet: https://synesty.com
Privacy information: https://synesty.com/datenschutz

Managing Director: Christoph Rüger

Registered office: Jena
Commercial Register B at the District Court: Jena

Commercial register number: HRB 508766
VAT ID: DE287564982

Re: Lambda Expressions - filter list without <#list> directive

Posted by Daniel Dekany <dd...@apache.org>.

I wonder if "filter" is a good name. For Java 8 programmers it's a
given, but otherwise I find it confusing, as it's not clear whether you
specify what to filter out or what to keep. Worse, I believe that in
everyday English "foo filter" or "filters foo" means removing foos
because you don't want them, which is just the opposite of the meaning
in Java. So I think "where", which is familiar to many from SQL (to
most Java programmers as well, but also to non-Java programmers),
would be better. Consider:

  users?filter(user -> user.inactive)

VS

  users?where(user -> user.inactive)

The first can be easily misunderstood as removing the inactive users,
while the meaning of the second is obvious.
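
For comparison, here is the Java 8 behavior referred to above, as a
small self-contained sketch. The User class and the inactiveNames
helper are made up for illustration; only Stream.filter itself is the
real API. It shows that filter keeps the elements for which the
predicate is true, so filtering on "inactive" yields the inactive users.

```java
import java.util.List;
import java.util.stream.Collectors;

class FilterKeepsMatches {
    // Hypothetical stand-in for a user object from the data-model.
    static final class User {
        final String name;
        final boolean inactive;

        User(String name, boolean inactive) {
            this.name = name;
            this.inactive = inactive;
        }
    }

    // Stream.filter KEEPS the elements for which the predicate returns true,
    // so this returns the names of the inactive users, not the active ones.
    static List<String> inactiveNames(List<User> users) {
        return users.stream()
                .filter(user -> user.inactive)
                .map(user -> user.name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<User> users = List.of(
                new User("ann", false),
                new User("bob", true));
        System.out.println(inactiveNames(users));
    }
}
```

Read with the "where" wording, the same call would say "users where
user.inactive", which matches what the code actually returns.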


Tuesday, July 2, 2019, 2:57:52 PM, Christoph Rüger wrote:

> Thanks for the heads up. Very nice. We will run our test suite to see if
> those tests are still green.

-- 
Thanks,
 Daniel Dekany

Re: Lambda Expressions - filter list without <#list> directive

Posted by Christoph Rüger <c....@synesty.com>.

Thanks for the heads up. Very nice. We will run our test suite to see if
those tests are still green.

On Mon, 1 July 2019 at 09:30, Daniel Dekany <ddekany@freemail.hu> wrote:

> Since then I have also made a change that ensures that if the lambda
> argument is null (which in FTL is the same as if the variable isn't
> there at all), then it will not fall back to find an identically named
> variable in higher variable scopes. This is important when doing
> things like:
>
>   <#-- filters out null-s -->
>   myList?filter(it -> it??)
>
> because if some day someone adds a variable called "it" to the
> data-model, then suddenly the above won't filter out the null-s.
>
> The same thing was always an issue with #list loop variables as well,
> also with #nested arguments. So I have added a configuration setting
> called "fallbackOnNullLoopVariable", which is by default true
> (unfortunate historical baggage... but we can't break backward
> compatibility). If you set it to false, then this will print "N/A" at
> null list items, rather than "Clashing variable in higher scope":
>
> <#assign it = "Clashing variable in higher scope">
> <#list myList as it>
>   ${it!'N/A'}
> </#list>
>
> These changes are pushed and deployed to the Apache snapshot Maven
> repo in both branches.
>
>
> So, apart from documentation, the local lambda feature is about ready,
> or so I hope. I'm worried about rough edges though, so I think I will add
> lambda support to some more built-ins (?seq_contains, ?sort_by), and
> explore some more use cases... If you have use cases of your own that you
> keep running into, or that you want to see in 2.3.29, let me know.
>

-- 
Christoph Rüger, Managing Director
Synesty <https://synesty.com/> - Connect and automate without
programming - automation, interfaces, data feeds

Xing: https://www.xing.com/profile/Christoph_Rueger2
LinkedIn: http://www.linkedin.com/pub/christoph-rueger/a/685/198


Re: Lambda Expressions - filter list without <#list> directive

Posted by Daniel Dekany <dd...@freemail.hu>.

Since then I have also made a change that ensures that if the lambda
argument is null (which in FTL is the same as if the variable isn't
there at all), then it will not fall back to find an identically named
variable in higher variable scopes. This is important when doing
things like:

  <#-- filters out null-s -->
  myList?filter(it -> it??)

because if some day someone adds a variable called "it" to the
data-model, then suddenly the above won't filter out the null-s.

The same thing was always an issue with #list loop variables as well,
also with #nested arguments. So I have added a configuration setting
called "fallbackOnNullLoopVariable", which is by default true
(unfortunate historical baggage... but we can't break backward
compatibility). If you set it to false, then this will print "N/A" at
null list items, rather than "Clashing variable in higher scope":

<#assign it = "Clashing variable in higher scope">
<#list myList as it>
  ${it!'N/A'}
</#list>
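
To illustrate the difference the setting makes, here is a toy Java
model of the variable lookup. This is only a sketch of the behavior
described above, not FreeMarker code: the resolve and orDefault methods
and the two-map scope model are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

class NullLoopVariableSketch {
    // Toy lookup: a local scope (loop variable or lambda argument) in front
    // of an outer scope. With fallbackOnNull = true, a null local value is
    // treated as "not there" and the outer scope is consulted instead.
    static String resolve(String name, Map<String, String> local,
                          Map<String, String> outer, boolean fallbackOnNull) {
        if (local.containsKey(name)) {
            String value = local.get(name);
            if (value != null || !fallbackOnNull) {
                return value; // may be null, i.e. "missing" for the template
            }
        }
        return outer.get(name);
    }

    // Rough equivalent of ${it!'N/A'}: use a default if the value is missing.
    static String orDefault(String value, String defaultValue) {
        return value != null ? value : defaultValue;
    }

    public static void main(String[] args) {
        Map<String, String> outer = new HashMap<>();
        outer.put("it", "Clashing variable in higher scope");
        Map<String, String> local = new HashMap<>();
        local.put("it", null); // a null list item

        System.out.println(orDefault(resolve("it", local, outer, true), "N/A"));
        System.out.println(orDefault(resolve("it", local, outer, false), "N/A"));
    }
}
```

With the fallback enabled the null item silently picks up the outer
"it"; with it disabled the null stays null and the default is used.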

These changes are pushed and deployed to the Apache snapshot Maven
repo in both branches.


So, apart from documentation, the local lambda feature is about ready,
or so I hope. I'm worried about rough edges though, so I think I will add
lambda support to some more built-ins (?seq_contains, ?sort_by), and
explore some more use cases... If you have use cases of your own that you
keep running into, or that you want to see in 2.3.29, let me know.


Monday, June 24, 2019, 1:59:21 AM, Daniel Dekany wrote:

> Well, I'm not exactly fast nowadays either... Anyway, I have pushed
> and deployed to the snapshot repo the changes I was talking about
> recently. That is, ?map or ?filter won't make a sequence out of an
> enumerable non-sequence (typically an Iterator) anymore. The concern
> was that if hugeResultSet is an Iterator because it's huge, then
> someone might write:
>
>   <#assign transformed = hugeResultSet?map(it -> something(it))>
>   <#list transformed as it>
>
> instead of just
>
>   <#list hugeResultSet?map(it -> something(it)) as it>
>
> and thus consuming a lot of memory without realizing it. So now if
> hugeResultSet wasn't already a sequence (List-like), the assignment
> will be an error, since we can't safely store a lazily transformed
> collection (lambdas will break), and we can't condense it down to a
> sequence (List-like thing) automatically either, as that might
> consume too much memory. If hugeResultSet was a sequence, then it's
> not an error, as we assume that keeping all of it in memory is fine,
> as the original was stored there as well (in practice, most of the
> time... in principle we can't know).
>
> Now if the user feels confident about it, they can still write:
>
>   <#assign transformed = hugeResultSet?map(it -> something(it))?sequence>
>
> Similarly, hugeResultSet?map(it -> something(it))[index] will be an
> error, as [index] is for sequences only, and ?map will not change a
> non-sequence to a sequence anymore. Similarly, if the user feels
> confident about it, they can write hugeResultSet?map(it ->
> something(it))?sequence[index].
>
> An interesting consequence of these is that ?sequence is now a bit
> smarter than before. For example, if you write myIterator?sequence[n],
> it will not fetch the elements into an in-memory sequence; it just
> skips n elements from myIterator and returns the nth one. Similarly,
> myIterator?sequence?size won't store the elements in memory, it just
> counts them.
>
> As an interesting note, these two are also identically efficient:
>
>   <#assign seq = hugeResultSet?filter(it -> something(it))?sequence>
>   <#assign seq = hugeResultSet?sequence?filter(it -> something(it))>
>
> In both cases the actual conversion to a sequence (in-memory list)
> happens only just before assigning the value to seq. Once again,
> ?sequence now just means "it's OK to treat this as a sequence, however
> inefficient it is", and not "convert it to sequence right now".
>

-- 
Thanks,
 Daniel Dekany

Re: Lambda Expressions - filter list without <#list> directive

Posted by Daniel Dekany <dd...@apache.org>.

Well, I'm not exactly fast nowadays either... Anyway, I have pushed
and deployed to the snapshot repo the changes I was talking about
recently. That is, ?map or ?filter won't make a sequence out of an
enumerable non-sequence (typically an Iterator) anymore. The concern
was that if hugeResultSet is an Iterator because it's huge, then
someone might write:

  <#assign transformed = hugeResultSet?map(it -> something(it))>
  <#list transformed as it>

instead of just

  <#list hugeResultSet?map(it -> something(it)) as it>

and thus consuming a lot of memory without realizing it. So now if
hugeResultSet wasn't already a sequence (List-like), the assignment
will be an error, since we can't safely store a lazily transformed
collection (lambdas will break), and we can't condense it down to a
sequence (List-like thing) automatically either, as that might
consume too much memory. If hugeResultSet was a sequence, then it's
not an error, as we assume that keeping all of it in memory is fine,
as the original was stored there as well (in practice, most of the
time... in principle we can't know).
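
As a rough Java analogy of the memory concern (not FreeMarker's
implementation; lazyMap and toList are hypothetical helper names), a
lazily mapped Iterator transforms one element at a time, while
collecting it into a List holds every transformed element at once:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

class LazyMapSketch {
    // Lazily transformed view: an element is mapped only when it is requested,
    // so nothing is buffered, no matter how long the source is.
    static <T, R> Iterator<R> lazyMap(Iterator<T> source, Function<T, R> f) {
        return new Iterator<R>() {
            @Override
            public boolean hasNext() {
                return source.hasNext();
            }

            @Override
            public R next() {
                return f.apply(source.next());
            }
        };
    }

    // Explicit opt-in, comparable in spirit to ?sequence on assignment:
    // every remaining element ends up in memory.
    static <T> List<T> toList(Iterator<T> it) {
        List<T> result = new ArrayList<>();
        while (it.hasNext()) {
            result.add(it.next());
        }
        return result;
    }

    public static void main(String[] args) {
        Iterator<Integer> source = List.of(1, 2, 3).iterator();
        Iterator<Integer> mapped = lazyMap(source, x -> x * 10);
        // Looping over 'mapped' directly corresponds to using ?map inside #list.
        while (mapped.hasNext()) {
            System.out.println(mapped.next());
        }
    }
}
```

Looping over the lazy view is the cheap path; calling toList is the
step that the assignment no longer performs implicitly.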

Now if the user feels confident about it, they can still write:

  <#assign transformed = hugeResultSet?map(it -> something(it))?sequence>

Similarly, hugeResultSet?map(it -> something(it))[index] will be an
error, as [index] is for sequences only, and ?map will not change a
non-sequence to a sequence anymore. Similarly, if the user feels
confident about it, they can write hugeResultSet?map(it ->
something(it))?sequence[index].

An interesting consequence of these is that ?sequence is now a bit
smarter than before. For example, if you write myIterator?sequence[n],
it will not fetch the elements into an in-memory sequence; it just
skips n elements from myIterator, and returns the element at index n.
Similarly, myIterator?sequence?size won't store the elements in
memory, it just counts them.
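
The skipping and counting can be sketched in Java roughly like this.
Again, this is only an illustration under my own assumptions, not the
real FreeMarker code: IteratorView, elementAt and size are
hypothetical names, and throwing on an out-of-range index is a choice
of this sketch, not a statement about what FTL does in that case.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; not part of the FreeMarker API.
final class IteratorView {

    // Returns the element at zero-based index n by discarding the first n
    // elements. No element is stored, so memory use does not grow with n.
    static <T> T elementAt(Iterator<T> it, int n) {
        if (n < 0) {
            throw new IndexOutOfBoundsException("Negative index: " + n);
        }
        for (int skipped = 0; skipped < n; skipped++) {
            if (!it.hasNext()) {
                throw new IndexOutOfBoundsException("Index out of range: " + n);
            }
            it.next();
        }
        if (!it.hasNext()) {
            throw new IndexOutOfBoundsException("Index out of range: " + n);
        }
        return it.next();
    }

    // Counts the elements by consuming the iterator, without storing them.
    static int size(Iterator<?> it) {
        int count = 0;
        while (it.hasNext()) {
            it.next();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(elementAt(List.of("a", "b", "c").iterator(), 2));
        System.out.println(size(List.of("a", "b", "c").iterator()));
    }
}
```

Note that both operations consume the iterator, so in this sketch each
call needs a fresh iterator.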

As an interesting note, these two are also equally efficient:

  <#assign seq = hugeResultSet?filter(it -> something(it))?sequence>
  <#assign seq = hugeResultSet?sequence?filter(it -> something(it))>

In both cases the actual conversion to a sequence (in-memory list)
happens only just before assigning the value to seq. Once again,
?sequence now just means "it's OK to treat this as a sequence, however
inefficient it is", and not "convert it to sequence right now".
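
A rough Java analogy for why the order doesn't matter: if the filter
is a lazy wrapper, then the only place where a list gets built is the
final collecting step, and it's built only once. This is a sketch
under my own assumptions (LazyFilter, filter and collect are
hypothetical names), not how FreeMarker implements it:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Hypothetical illustration only; not part of the FreeMarker API.
final class LazyFilter {

    // Wraps the source iterator; elements are tested only when requested.
    static <T> Iterator<T> filter(Iterator<T> source, Predicate<? super T> keep) {
        return new Iterator<T>() {
            private T pending;          // next matching element, if already found
            private boolean hasPending; // whether pending holds a valid element

            @Override
            public boolean hasNext() {
                // Advance the source until a matching element is found or it ends.
                while (!hasPending && source.hasNext()) {
                    T candidate = source.next();
                    if (keep.test(candidate)) {
                        pending = candidate;
                        hasPending = true;
                    }
                }
                return hasPending;
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                T result = pending;
                pending = null;
                hasPending = false;
                return result;
            }
        };
    }

    // The single point where an in-memory list is actually built.
    static <T> List<T> collect(Iterator<T> it) {
        List<T> result = new ArrayList<>();
        while (it.hasNext()) {
            result.add(it.next());
        }
        return result;
    }

    public static void main(String[] args) {
        Iterator<Integer> evens = filter(List.of(1, 2, 3, 4, 5, 6).iterator(), x -> x % 2 == 0);
        System.out.println(collect(evens));
    }
}
```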

Friday, June 7, 2019, 10:38:50 AM, Christoph Rüger wrote:

> These optimisations sound great. I will try to run some tests within the
> next weeks. A bit busy lately.
> Thanks
> Christoph
>
> On Wed, May 29, 2019 at 23:55, Daniel Dekany <ddekany@apache.org> wrote:
>
>> Tuesday, April 2, 2019, 12:10:16 PM, Christoph Rüger wrote:
>>
>> [snip]
>> >> Well, if you fear users jumping on ?filter/?map outside #list for no
>> >> good enough reason, there can be some option to handle that. But I
>> >> don't think restricting the usage to #list is a good compromise as the
>> >> default.
>> >
>> > I agree. Just keep as it is.
>> >
>> >> >> I'm not sure how efficiently could a configuration setting catch these
>> >> >> cases, or if it should be addressed on that level.
>> >> >
>> >> > Maybe let's postpone configurability discussion a bit until the above is
>> >> > more clear.
>> >>
>> >> In the light of the above, I think we can start thinking about that
>> >> now.
>> >
>> > On that note on configurability: Would it be possible to programmatically
>> > influence the Collection (Sequence) which is created under the hood?
>> > E.g. by specifying a Factory? I ask because we are using something like
>> > this (
>> > https://dzone.com/articles/a-filebasedcollection-in-java-for-big-collections)
>> > in other places for large collections. I know it is very specific, but just
>> > wanted to bring it up.
>> [snip]
>>
>> I think a good approach would be to ban the *implicit* collection of
>> the result, when the filtered/mapped source is an Iterator, or other
>> similar stream-like object that's often used for enumerating a huge
>> number of elements. So for example, let's say you have this:
>>
>>   <#assign xs2 = xs?filter(f)>
>>
>> If xs is List-like, then this will work. Since the xs List fits into
>> the memory (although a List can be backed by disk, that's rather
>> rare), hopefully it's not the kind of data amount that can't fit into
>> the memory again (as xs2). On the other hand, if xs is an
>> Iterator-like object, then the above statement fails, with the hint
>> that xs?filter(f)?sequence would work, but might consume a lot of
>> memory.
>>
>> This is also consistent with how xs[i] works in the existing
>> FreeMarker versions. That only works if xs is List-like (an FTL
>> sequence). While xs[i] would be trivial to implement even if xs is
>> Iterator-like, we don't do that as it's not efficient for a high i,
>> and so the template author is probably not meant to do that. If he
>> knows what he's doing though, he can write xs?sequence[i]. Yes, that's
>> very inefficient if you only use [] once on that sequence, but you see
>> the logic. map/filter breaks it, as xs?filter(f)[i] works even if xs
>> is an Iterator, because filter/map currently always returns a
>> sequence. If xs is Iterator-like, then I want filter/map to return an
>> Iterator-like as well, so then [] will fail on it.
>>
>> As a side note, I will make ?sequence smarter too, so that
>> xs?sequence[i] won't actually build a sequence if xs is Iterator-like.
>> It just has to skip the first i elements after all. (The ?sequence is
>> still required there. It basically says: "I know what I'm doing, treat
>> this as a sequence.")
>>
>> --
>> Thanks,
>>  Daniel Dekany
>>
>>
>

-- 
Thanks,
 Daniel Dekany

Re: Lambda Expressions - filter list without <#list> directive

Posted by Christoph Rüger <c....@synesty.com>.

These optimisations sound great. I will try to run some tests within the
next weeks. A bit busy lately.
Thanks
Christoph

On Wed, May 29, 2019 at 23:55, Daniel Dekany <ddekany@apache.org> wrote:

> Tuesday, April 2, 2019, 12:10:16 PM, Christoph Rüger wrote:
>
> [snip]
> >> Well, if you fear users jumping on ?filter/?map outside #list for no
> >> good enough reason, there can be some option to handle that. But I
> >> don't think restricting the usage to #list is a good compromise as the
> >> default.
> >
> > I agree. Just keep as it is.
> >
> >> >> I'm not sure how efficiently could a configuration setting catch these
> >> >> cases, or if it should be addressed on that level.
> >> >
> >> > Maybe let's postpone configurability discussion a bit until the above is
> >> > more clear.
> >>
> >> In the light of the above, I think we can start thinking about that
> >> now.
> >
> > On that note on configurability: Would it be possible to programmatically
> > influence the Collection (Sequence) which is created under the hood?
> > E.g. by specifying a Factory? I ask because we are using something like
> > this (
> > https://dzone.com/articles/a-filebasedcollection-in-java-for-big-collections)
> > in other places for large collections. I know it is very specific, but just
> > wanted to bring it up.
> [snip]
>
> I think a good approach would be to ban the *implicit* collection of
> the result, when the filtered/mapped source is an Iterator, or other
> similar stream-like object that's often used for enumerating a huge
> number of elements. So for example, let's say you have this:
>
>   <#assign xs2 = xs?filter(f)>
>
> If xs is List-like, then this will work. Since the xs List fits into
> the memory (although a List can be backed by disk, that's rather
> rare), hopefully it's not the kind of data amount that can't fit into
> the memory again (as xs2). On the other hand, if xs is an
> Iterator-like object, then the above statement fails, with the hint
> that xs?filter(f)?sequence would work, but might consume a lot of
> memory.
>
> This is also consistent with how xs[i] works in the existing
> FreeMarker versions. That only works if xs is List-like (an FTL
> sequence). While xs[i] would be trivial to implement even if xs is
> Iterator-like, we don't do that as it's not efficient for a high i,
> and so the template author is probably not meant to do that. If he
> knows what he's doing though, he can write xs?sequence[i]. Yes, that's
> very inefficient if you only use [] once on that sequence, but you see
> the logic. map/filter breaks it, as xs?filter(f)[i] works even if xs
> is an Iterator, because filter/map currently always returns a
> sequence. If xs is Iterator-like, then I want filter/map to return an
> Iterator-like as well, so then [] will fail on it.
>
> As a side note, I will make ?sequence smarter too, so that
> xs?sequence[i] won't actually build a sequence if xs is Iterator-like.
> It just has to skip the first i elements after all. (The ?sequence is
> still required there. It basically says: "I know what I'm doing, treat
> this as a sequence.")
>
> --
> Thanks,
>  Daniel Dekany
>
>

-- 
Synesty GmbH
Moritz-von-Rohr-Str. 1a
07745 Jena
Tel.: +49 3641 5596493
Internet: https://synesty.com
Privacy information: https://synesty.com/datenschutz

Managing Director: Christoph Rüger

Registered office: Jena
Commercial register B at the district court: Jena

Commercial register number: HRB 508766
VAT ID: DE287564982

Re: Lambda Expressions - filter list without <#list> directive

Posted by Daniel Dekany <dd...@apache.org>.

Tuesday, April 2, 2019, 12:10:16 PM, Christoph Rüger wrote:

[snip]
>> Well, if you fear users jumping on ?filter/?map outside #list for no
>> good enough reason, there can be some option to handle that. But I
>> don't think restricting the usage to #list is a good compromise as the
>> default.
>
> I agree. Just keep as it is.
>
>> >> I'm not sure how efficiently could a configuration setting catch these
>> >> cases, or if it should be addressed on that level.
>> >
>> > Maybe let's postpone configurability discussion a bit until the above is
>> > more clear.
>>
>> In the light of the above, I think we can start thinking about that
>> now.
>
> On that note on configurability: Would it be possible to programmatically
> influence the Collection (Sequence) which is created under the hood?
> E.g. by specifying a Factory? I ask because we are using something like
> this (
> https://dzone.com/articles/a-filebasedcollection-in-java-for-big-collections)
> in other places for large collections. I know it is very specific, but just
> wanted to bring it up.
[snip]

I think a good approach would be to ban the *implicit* collection of
the result, when the filtered/mapped source is an Iterator, or other
similar stream-like object that's often used for enumerating a huge
number of elements. So for example, let's say you have this:

  <#assign xs2 = xs?filter(f)>

If xs is List-like, then this will work. Since the xs List fits into
the memory (although a List can be backed by disk, that's rather
rare), hopefully it's not the kind of data amount that can't fit into
the memory again (as xs2). On the other hand, if xs is an
Iterator-like object, then the above statement fails, with the hint
that xs?filter(f)?sequence would work, but might consume a lot of
memory.

This is also consistent with how xs[i] works in the existing
FreeMarker versions. That only works if xs is List-like (an FTL
sequence). While xs[i] would be trivial to implement even if xs is
Iterator-like, we don't do that as it's not efficient for a high i,
and so the template author is probably not meant to do that. If he
knows what he's doing though, he can write xs?sequence[i]. Yes, that's
very inefficient if you only use [] once on that sequence, but you see
the logic. map/filter breaks it, as xs?filter(f)[i] works even if xs
is an Iterator, because filter/map currently always returns a
sequence. If xs is Iterator-like, then I want filter/map to return an
Iterator-like as well, so then [] will fail on it.

As a side note, I will make ?sequence smarter too, so that
xs?sequence[i] won't actually build a sequence if xs is Iterator-like.
It just has to skip the first i elements after all. (The ?sequence is
still required there. It basically says: "I know what I'm doing, treat
this as a sequence.")

-- 
Thanks,
 Daniel Dekany