Posted to dev@jena.apache.org by Kristian Rosenvold <kr...@apache.org> on 2014/06/25 09:46:12 UTC

Java8 and iterators

I've been acquainting myself with the jena code base by trying to find
out how to add idiomatically correct java8-style iteration to jena
(while staying java7 compatible).
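
One possible shape for java8-style iteration, as a sketch only: the JDK's
StreamSupport can wrap any java.util.Iterator in a Stream. The class
IteratorStreams and the helper asStream are hypothetical names, not part of
jena, and a list iterator stands in for a jena iterator. The code itself
needs java8, so it could not live in sources compiled for java7.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class IteratorStreams {
    // Hypothetical helper: wraps a plain Iterator in a sequential, ordered Stream.
    static <T> Stream<T> asStream(Iterator<T> it) {
        Spliterator<T> split =
                Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED);
        return StreamSupport.stream(split, false); // false = not parallel
    }

    public static void main(String[] args) {
        // A list iterator stands in for something like the result of listStatements().
        Iterator<String> it =
                Arrays.asList("s1 p o", "s2 p o", "s1 q o").iterator();
        List<String> s1 = asStream(it)
                .filter(t -> t.startsWith("s1"))
                .collect(Collectors.toList());
        System.out.println(s1); // prints [s1 p o, s1 q o]
    }
}
```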

One thing I am wondering about is the logic to attempt consistent
iterations over changing models for the iterators that are
returned by for instance listStatements (as opposed to throwing
ConcurrentModificationException when the underlying model changes). I
can certainly see that it has some interesting merits when it comes to
analytical manipulation/modification of models, but even on my first
attempt at actually modifying a model I was iterating over it fell
over because the consistency is only partial. Is there any
documentation as to what operations are permitted/safe/unsafe while
iterating?

Furthermore, I observed that there is a fairly substantial cost to
maintaining this style of iteration. Using a simple flyweight to
implement "Statement" I can easily make them 3x faster than the
current iterators. Our application really only runs in 2
"modes": rdf creation or analyzing immutable rdf data. When analyzing
immutable data the current implementation really seems to generate a
lot of hot air, and I'm tempted to suggest that 2 different modes of
iteration could be supported: fail-fast and "consistent".
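
To illustrate the flyweight idea in general terms (a sketch only: TripleView,
FlyweightRows and the array-backed store are made up for this example and
are not jena classes), the iterator below hands out one shared view object
and re-points it at the next row instead of allocating an object per element.
It assumes the underlying data does not change during iteration.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

class FlyweightRows implements Iterable<FlyweightRows.TripleView> {
    // Hypothetical read-only view over one row; stands in for a Statement.
    static final class TripleView {
        private String[] row;
        String subject()   { return row[0]; }
        String predicate() { return row[1]; }
        String object()    { return row[2]; }
    }

    private final String[][] rows; // assumed immutable while iterating

    FlyweightRows(String[][] rows) { this.rows = rows; }

    @Override
    public Iterator<TripleView> iterator() {
        return new Iterator<TripleView>() {
            private final TripleView view = new TripleView(); // one shared instance
            private int next = 0;

            @Override public boolean hasNext() { return next < rows.length; }

            @Override public TripleView next() {
                if (!hasNext()) throw new NoSuchElementException();
                view.row = rows[next++]; // re-point the flyweight, no allocation
                return view;
            }

            @Override public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }

    public static void main(String[] args) {
        FlyweightRows data = new FlyweightRows(new String[][] {
            {"s1", "p", "o1"}, {"s2", "p", "o2"}
        });
        for (TripleView t : data) {
            System.out.println(t.subject() + " " + t.predicate() + " " + t.object());
        }
    }
}
```

The trade-off is that a caller must not hold on to the returned view across
calls to next(), since every call overwrites what the view points at.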

Currently I'm just dabbling around with the code (can be seen here
https://github.com/krosenvold/jena/compare/java8iterators) and I
expect to keep on with that for some time, but I'd appreciate your
thoughts on this topic :)

Kristian

Re: Java8 and iterators

Posted by Andy Seaborne <an...@apache.org>.
On 25/06/14 09:26, Claude Warren wrote:
> I believe that for the in-memory models the iterators fail if the model is
> modified. However, for TDB (Andy, keep me honest here) if you are using
> transactions you can modify the graph and the iterator will not fail.

Iterators in TDB fail if they are inside the same transaction and the
data changes.  It's not even on a "maybe" basis - they check and if the
data changes at all, the next call on the iterator will fail.

Across transactions everything is safe - essentially, the two
transactions see different data.

---

ARQ etc. tends to use Iter and its operations - they are fairly
Stream<T>-like in behaviour.

* .remove is generally unsupported
* execution is over the whole iterator (no passing half-consumed iterators
around), c.f. Stream terminal operations.
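
For comparison, here is how the two properties in the list above show up in
plain java8. This is a sketch using only java.util (it does not use Iter, and
StreamLikeDemo is a made-up name): a stream is consumed by one terminal
operation and cannot be reused, and an iterator over an unmodifiable
collection rejects remove.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;

class StreamLikeDemo {
    // Returns true if a second terminal operation on the same Stream is rejected.
    static boolean secondUseFails() {
        Stream<String> s = Stream.of("a", "b", "c");
        s.count();             // terminal operation consumes the stream
        try {
            s.count();         // reusing the same stream is not allowed
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Returns true if remove() is unsupported on a read-only iterator.
    static boolean removeUnsupported() {
        List<String> readOnly =
                Collections.unmodifiableList(Arrays.asList("a", "b"));
        Iterator<String> it = readOnly.iterator();
        it.next();
        try {
            it.remove();
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("second use fails: " + secondUseFails());
        System.out.println("remove unsupported: " + removeUnsupported());
    }
}
```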

	Andy



Re: Java8 and iterators

Posted by Claude Warren <cl...@xenei.com>.
I believe that for the in-memory models the iterators fail if the model is
modified. However, for TDB (Andy, keep me honest here) if you are using
transactions you can modify the graph and the iterator will not fail.

I don't know what SDB used to do, but I suspect that it allowed you to
modify the model while iterating.

I believe that TDB and SDB do this because the query creates a copy of the
data in the result that is returned, so it is "detached" from the data
store.

I found this to be a problem in the past as well.  The application side can
solve the problem by calling toList() on the extended iterator and then
iterating over that.  Of course this won't work if you are attempting to
write code that will work against either style of iterator.  So a mechanism
is needed to determine whether the iterators for a model are "fail-fast" or
"consistent" for the current state of the model.  For example, using
transactions may make the iterator "consistent" while the same model without
transactions might be "fail-fast".
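
The copy-then-iterate workaround can be sketched with plain collections. This
is an illustration only: a Set<String> stands in for the model,
SnapshotIteration is a made-up name, and new ArrayList<>(...) plays the role
that toList() plays on an extended iterator.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class SnapshotIteration {
    // Adds a derived entry for every existing one, iterating over a copy
    // so that the live set can be modified inside the loop.
    static Set<String> addDerived(Set<String> live) {
        List<String> snapshot = new ArrayList<>(live); // detached copy
        for (String s : snapshot) {
            live.add(s + "-derived"); // safe: the loop reads the copy only
        }
        return live;
    }

    public static void main(String[] args) {
        Set<String> live = new LinkedHashSet<>();
        live.add("a");
        live.add("b");
        System.out.println(addDerived(live)); // prints [a, b, a-derived, b-derived]
    }
}
```

The cost is a full copy of the results before the first element is processed,
so memory use grows with the size of the result.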

Claude




-- 
I like: Like Like - The likeliest place on the web
<http://like-like.xenei.com>
LinkedIn: http://www.linkedin.com/in/claudewarren

Re: Java8 and iterators

Posted by Chris Dollin <ch...@epimorphics.com>.
On Wednesday, June 25, 2014 09:46:12 AM Kristian Rosenvold wrote:
> I've been acquainting myself with the jena code base by trying to find
> out how to add idiomatically correct java8-style iteration to jena
> (while staying java7 compatible).
> 
> ..., but even on my first
> attempt at actually modifying a model I was iterating over ...

The tradition in Java and Jena has been: don't do that. Else BOOM.

Has the Java approach changed in java8?

> it fell
> over because the consistency is only partial.

Could you be more specific about what you did and what happened?

> Is there any
> documentation as to what operations are permitted/safe/unsafe while
> iterating ?

"Don't change the model and continue iterating."
 
At the time Jena iterators were designed/built, Java iterators over
collections had that same restriction; it avoids the inconvenience 
of defining and implementing some less explosive behaviour.
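
The restriction on standard collections is easy to reproduce. A minimal
sketch with ArrayList, whose iterators are documented as fail-fast (this
shows java.util behaviour only, not what any particular Jena model does, and
FailFastDemo is a made-up name):

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

class FailFastDemo {
    // Returns true if modifying the list during iteration is detected.
    static boolean modifyWhileIterating() {
        List<String> list = new ArrayList<>();
        list.add("a");
        list.add("b");
        list.add("c");
        try {
            for (String s : list) {
                if (s.equals("a")) {
                    list.add("d"); // structural change mid-iteration
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // thrown by the next call to the iterator
        }
    }

    public static void main(String[] args) {
        System.out.println("detected: " + modifyWhileIterating());
    }
}
```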

If we were to have different modes -- fail-fast vs consistent -- I
think they'd have to be visible in the type system, which in turn
suggests that we'd have two versions of every iterator, which does
not feel like a good thing. But maybe there's something better
we can do?

Chris

-- 
"How am I to understand if you won't teach me?"             - Trippa, /Falling/

Epimorphics Ltd, http://www.epimorphics.com
Registered address: Court Lodge, 105 High Street, Portishead, Bristol BS20 6PT
Epimorphics Ltd. is a limited company registered in England (number 7016688)