Posted to users@jena.apache.org by "Daniel Ferreira (theiostream)" <bn...@gmail.com> on 2015/04/11 18:33:38 UTC

TDB and Google App Engine

I'm attempting to get my Java website that uses TDB for caching online.
However, for it to work for my purposes on GAE I'd have to adapt the TDB
library for connecting with the Google Cloud Storage API instead of just
reading stuff from the filesystem.

Has anyone tried doing this before, or is patching TDB the only choice in
my case?

Thank you,
Daniel Ferreira.

Re: TDB and Google App Engine

Posted by Claude Warren <cl...@xenei.com>.
Another option is to use my BloomGraph implementation (
https://github.com/Claudenw/BloomGraph). While it is experimental, it does
appear to be slightly faster than SDB for large data sets.  It currently runs
on top of MySQL, but other implementations would be fairly easy.  It also
has an in-memory version.







Re: TDB and Google App Engine

Posted by Stian Soiland-Reyes <st...@apache.org>.
I can only think of perhaps an in-memory layer and something clever with
the transaction log. It would add a 700 MB load to the startup time, though.

But how would you even get this transaction log distributed with the App
Engine?

For me, going down the SDB route sounds like a much easier plumbing job
than building a new App Engine storage layer into TDB.

Perhaps a good question is what you are looking for in TDB. For instance,
do you mainly do reads or updates? Are data modifications just additions, or
also removals and changes?

What is the tie to the Google App Engine? Is it just $cloud, or is it
already bound up with the rest of the architecture?

Would it be worth having a look at Hadoop and Jena Elephas? If nothing
else, just looking at the storage bit in Elephas could be useful.

Re: TDB and Google App Engine

Posted by Andy Seaborne <an...@apache.org>.
On 12/04/15 15:21, Daniel Ferreira (theiostream) wrote:
> The amount of data is around a 700MB TDB directory.

OK - so not trivial but not gigantic.

(I usually think in number of triples and quads)
>
> The only interface Google Cloud Storage provides in its Java API (
> https://cloud.google.com/appengine/docs/java/googlecloudstorageclient/javadoc/)
> is GcsInputChannel, which extends java.nio.channels.ReadableByteChannel.
> Honestly, I'm not too familiar with Java IO to know whether it fits your
> description.

Sorry - it seems that ReadableByteChannel isn't seekable.  It can only 
read sequentially through a file.

Someone else might see ways to do this but, as far as I can see, that
reduces the options to reading everything locally (probably not viable)
and SDB over Google Cloud SQL.
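
To illustrate the "read everything locally" option, here is a minimal
sketch that drains a sequential-only ReadableByteChannel into memory and
then serves arbitrary byte ranges from the local copy. The class and
method names (InMemoryCopySketch, readFully, slice) are hypothetical, and
a ByteArrayInputStream stands in for the remote channel; nothing here
uses the Google Cloud Storage or TDB APIs. For a database of around
700 MB, the whole copy would have to fit in memory and be reloaded at
startup, which is why this route is probably not viable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class InMemoryCopySketch {
    /** Drains a sequential-only channel into a byte array (hypothetical helper). */
    static byte[] readFully(ReadableByteChannel channel) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ByteBuffer buf = ByteBuffer.allocate(8192);
        // read() returns -1 at end of stream; until then, copy what was read.
        while (channel.read(buf) != -1) {
            buf.flip();
            out.write(buf.array(), 0, buf.limit());
            buf.clear();
        }
        return out.toByteArray();
    }

    /** Once the data is local, any byte range can be fetched directly. */
    static byte[] slice(byte[] data, int offset, int length) {
        return Arrays.copyOfRange(data, offset, offset + length);
    }

    public static void main(String[] args) throws IOException {
        byte[] source = "0123456789abcdef".getBytes(StandardCharsets.US_ASCII);
        // Stand-in for a remote, sequential-only channel (an assumption).
        try (ReadableByteChannel ch =
                 Channels.newChannel(new ByteArrayInputStream(source))) {
            byte[] local = readFully(ch);
            // Ranges can now be read in any order.
            System.out.println(new String(slice(local, 10, 4), StandardCharsets.US_ASCII)); // abcd
            System.out.println(new String(slice(local, 2, 3), StandardCharsets.US_ASCII));  // 234
        }
    }
}
```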

	Andy



Re: TDB and Google App Engine

Posted by "Daniel Ferreira (theiostream)" <bn...@gmail.com>.
The amount of data is around a 700MB TDB directory.

The only interface Google Cloud Storage provides in its Java API (
https://cloud.google.com/appengine/docs/java/googlecloudstorageclient/javadoc/)
is GcsInputChannel, which extends java.nio.channels.ReadableByteChannel.
Honestly, I'm not too familiar with Java IO to know whether it fits your
description.

I'd also like to avoid using SDB and SQL as much as I can, unless patching
TDB into being able to talk with Google Cloud Storage proves to be
practically impossible.


Re: TDB and Google App Engine

Posted by Andy Seaborne <an...@apache.org>.
On 11/04/15 17:33, Daniel Ferreira (theiostream) wrote:
> I'm attempting to get my Java website that uses TDB for caching online.
> However, for it to work for my purposes on GAE I'd have to adapt the TDB
> library for connecting with the Google Cloud Storage API instead of just
> reading stuff from the filesystem.
>
> Has anyone tried doing this before, or is patching TDB the only choice in
> my case?
>
> Thank you,
> Daniel Ferreira.
>

Hmm - tricky.

The best option will depend on how much data you are planning for in the 
database.

TDB (direct mode) makes use of random access to the files - does Google 
Cloud Storage offer that?  i.e. does it offer a seekable interface?  If 
it does, then adding another implementation of BlockAccess should work. 
If not, it's a bit of a problem.
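
As a rough illustration of what "random access" means here, the sketch
below reads fixed-size blocks out of order from a local file using
positioned reads on java.nio.channels.FileChannel. The class name, the
readBlock helper and the 8-byte block size are hypothetical choices for
the example; this is not TDB's BlockAccess interface, only the kind of
seekable read that a storage backend would need to support.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class BlockReadSketch {
    // Hypothetical block size for illustration; not a value taken from TDB.
    static final int BLOCK_SIZE = 8;

    /** Reads block number blockId by jumping straight to its byte offset. */
    static byte[] readBlock(FileChannel channel, long blockId) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(BLOCK_SIZE);
        long offset = blockId * BLOCK_SIZE;
        // Positioned read: independent of the channel's current position.
        while (buf.hasRemaining()) {
            int n = channel.read(buf, offset + buf.position());
            if (n < 0) {
                throw new IOException("Block " + blockId + " is beyond end of file");
            }
        }
        return buf.array();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("blocks", ".dat");
        try {
            // Three 8-byte blocks: A..., B..., C...
            Files.write(file, "AAAAAAAABBBBBBBBCCCCCCCC".getBytes(StandardCharsets.US_ASCII));
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
                // Blocks are fetched out of order, with no sequential scan.
                System.out.println(new String(readBlock(ch, 2), StandardCharsets.US_ASCII)); // CCCCCCCC
                System.out.println(new String(readBlock(ch, 0), StandardCharsets.US_ASCII)); // AAAAAAAA
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

A channel that only supports sequential reads has no equivalent of the
positioned read used in readBlock, so reaching a given block would mean
reading and discarding everything before it.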

The other possibility that occurs to me is to use SDB with the Google 
Cloud SQL.  It's MySQL underneath and SDB has MySQL support.

	Andy