Posted to dev@hawq.apache.org by Kyle Dunn <kd...@pivotal.io> on 2017/03/13 23:57:23 UTC

Questions about filesystem / filespace / tablespace

Hello devs -

I'm doing some reading about HAWQ tablespaces here:
http://hdb.docs.pivotal.io/212/hawq/ddl/ddl-tablespace.html

I want to understand the flow of things; please correct me on the following
assumptions:

1) Create a filesystem (not *really* supported after HAWQ init) - the
default is obviously [lib]HDFS[3]:
      SELECT * FROM pg_filesystem;

2) Create a filespace, referencing the above file system:
      CREATE FILESPACE testfs ON hdfs
      ('localhost:8020/fs/testfs') WITH (NUMREPLICA = 1);

3) Create a tablespace, referencing the above filespace:
      CREATE TABLESPACE fastspace FILESPACE testfs;

4) Create objects referencing the above tablespace, or set it as the
database's default:
      CREATE DATABASE testdb WITH TABLESPACE=fastspace;

Given this set of steps, is it true (*in theory*) that an arbitrary filesystem
(i.e. a storage backend) could be added to HAWQ using *existing* APIs?

I realize the nuances of this are significant, but conceptually I'd like to
gather some details, mainly in support of this
<https://issues.apache.org/jira/browse/HAWQ-1270> ongoing JIRA discussion.
I'm daydreaming about whether this neat tool:
https://github.com/s3fs-fuse/s3fs-fuse could be useful for an S3 spike
(which also seems to kind of work on Google Cloud, when interoperability
<https://github.com/s3fs-fuse/s3fs-fuse/issues/109#issuecomment-286222694>
mode is enabled). By its Linux FUSE nature, it implements the lion's share
of the required pg_filesystem functions; in fact, maybe we could use system
calls from glibc directly in this situation (somewhat like
<http://www.linux-mag.com/id/7814/>).
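
To make that idea concrete, here is a rough sketch of what I'm picturing: thin
wrappers around ordinary glibc/POSIX calls operating on paths under a FUSE
mount. The names and signatures below are invented for illustration; they are
not the actual pg_filesystem UDF interface.

    /*
     * Illustrative sketch only: these names and signatures are invented and
     * are not the real pg_filesystem UDF interface. The point is just that,
     * against a FUSE mount, the "filesystem" primitives collapse into thin
     * wrappers around ordinary glibc/POSIX calls.
     */
    #include <fcntl.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* Open a file living under the FUSE mount, e.g. /mnt/s3fs/... */
    static int fusefs_open(const char *path, int flags, mode_t mode)
    {
        return open(path, flags, mode);   /* FUSE handles the S3 translation */
    }

    static ssize_t fusefs_read(int fd, void *buf, size_t len)
    {
        return read(fd, buf, len);
    }

    static ssize_t fusefs_write(int fd, const void *buf, size_t len)
    {
        return write(fd, buf, len);
    }

    static int fusefs_close(int fd)
    {
        return close(fd);
    }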

Curious to get some feedback.


Thanks,
Kyle
-- 
*Kyle Dunn | Data Engineering | Pivotal*
Direct: 303.905.3171 | Email: kdunn@pivotal.io

Re: Questions about filesystem / filespace / tablespace

Posted by Kyle Dunn <kd...@pivotal.io>.
Good discussion here, everyone; thank you for all the insight and background!

@Paul - my point about allowing tablespaces to be created with LOCATION
'/some/local/path' is mainly that FUSE is just a special way to mount things
into the normal Linux VFS path. If we were able to support that syntax, we
could point a tablespace directly at a FUSE-mounted location with no
significant extra work in pg_filesystem, etc. (at least for a demo).

@Ming - I've started working off a branch
<https://github.com/kdunn926/incubator-hawq/tree/pluggable-fuse/contrib/fusebackend>
in my local repo; any and all discussion and contributions are welcome!
The information about the call stack is very helpful in understanding where
we need to implement things. I mostly guessed right, and hence have something
that "builds" but is not quite functional yet.

The point about UPDATE is a very valid one. I tried "updating" (a random
write on) a file in the FUSE-mounted S3 bucket... it does work, but I suspect
that under the hood it's re-uploading the whole object. It's too bad we don't
have a way to mark certain tables with the syntax that is valid for them... or
maybe we could reuse the strategy that prevents DML against catalog tables,
allowing only INSERT / DROP / TRUNCATE?
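
For reference, the kind of "update" test I mean is roughly the following (the
mount point is hypothetical; s3fs most likely services the in-place write by
re-uploading the whole object):

    /* Minimal check of an in-place ("random") write on an s3fs mount.
     * The mount point below is hypothetical; the write succeeds at the VFS
     * level, but s3fs most likely re-uploads the entire object behind it. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        const char *path = "/mnt/s3bucket/testfile";   /* hypothetical mount */
        int fd = open(path, O_WRONLY);
        if (fd < 0) { perror("open"); return 1; }

        /* overwrite 5 bytes at offset 1024 - a classic UPDATE-style write */
        if (pwrite(fd, "hello", 5, 1024) != 5)
            perror("pwrite");

        close(fd);
        return 0;
    }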

@Zhanwei
The background story about HAWQ / GoH is quite interesting. It is definitely a
logical starting point to use FUSE, both then and now, I think. I wonder how
the performance penalty will affect this demo / spike. In my mind, a big
advantage of a pluggable storage design, for cloud storage in this case, is
that we have more performance potential if we cater to more nodes / more
parallelism, rather than trying to maximize primarily for per-node throughput.
The inherent latency of networked filesystems like S3, GCS, and WASB is beyond
our control, so we could more easily accept an inherent FUSE degradation, with
the expectation that you just instantiate more nodes to compensate.

As I dig into this more, I do see many of the concerns you raised - a lot of
specialization around HDFS in the core HAWQ code (data structures, methods,
etc.). I'm still not deterred, but it is pretty clear more generalization will
be required to expose this support in a robust way.



Hopefully we can keep this conversation going. I'll spend cycles on this
spike/demo as long as I can or until I'm told not to. :-)



Cheers,
Kyle

--
*Kyle Dunn | Data Engineering | Pivotal*
Direct: 303.905.3171 | Email: kdunn@pivotal.io

Re: Questions about filesystem / filespace / tablespace

Posted by Zhanwei Wang <ap...@wangzw.org>.
Hi Kyle

Let me share some history about HAWQ. It goes back about six years…

When we started designing HAWQ, we first implemented a demo version of it; of course it was not called HAWQ at that time. It was called GoH (Greenplum on HDFS). The first implementation was quite simple: we mounted HDFS on the local filesystem with FUSE and ran GPDB on it. We quickly found that the performance was unacceptable.

We then decided to replace the storage layer of GPDB to make it work with HDFS, and we implemented a “pluggable filesystem” layer and added the pg_filesystem object to GPDB. That was HAWQ as of about early 2012.

At first we wanted to adopt the HDFS C API exactly, because it is almost the de facto standard, but we found that it could not meet our requirements. So, based on the HDFS C API, we implemented a wrapper around it as our API standard. Any dynamic library that implements this API can be loaded into HAWQ and registered in the pg_filesystem catalog, and then used to access files on the target filesystem without modifying HAWQ code.

But this “pluggable filesystem” was never officially marked as a feature of HAWQ. We never tested it with any filesystem other than HDFS. And as far as I know, some newer APIs were never added to the pg_filesystem catalog, for historical reasons. So I do not think the “pluggable filesystem” can work today without some changes and bug fixes.

A pluggable filesystem is appealing, but unfortunately it never got enough priority, and the previous design may not be suitable anymore. I think this is a good chance to rethink how we can achieve this goal and make it happen.




Zhanwei Wang

HashData
http://www.hashdata.cn





Re: Questions about filesystem / filespace / tablespace

Posted by Ming Li <ml...@pivotal.io>.
Hi Kyle,


      If we keep all these filesystems similar to hdfs, i.e. supporting append
only, then the changes should be much smaller. I think we can go ahead and
implement a demo for it if we have the resources; we may encounter problems,
but we can find solutions/workarounds for them.
--------------------

      For your question about the relationship between the 3 source code files,
below is my understanding (the code was not written by me, so my understanding
may not be completely correct):
(1) bin/gpfilesystem/hdfs/gpfshdfs.c -- implements all the APIs listed in the
hdfs tuple of the pg_filesystem catalog; it directly calls libhdfs3 APIs to
access the hdfs filesystem. The reason for making it a wrapper is to define
all these APIs as UDFs, so that we can easily support a similar filesystem by
adding a similar tuple to pg_filesystem and adding code similar to this file,
without changing any of the places that call these APIs. Also, because they
are UDFs, we can upgrade an old hawq binary to add a new filesystem.
(2) backend/storage/file/filesystem.c -- because all the APIs in (1) are in the
form of UDFs, we need a conversion layer if we want to call them directly.
This file is responsible for converting normal hdfs calls in the hawq kernel
into UDF calls.
(3) backend/storage/file/fd.c -- because the OS limits the number of open file
descriptors, PostgreSQL/HAWQ uses an LRU buffer to cache all opened file
handles. The hdfs APIs in this file manage file handles the same way as for
native filesystems. These functions call the APIs in (2) to interact with
hdfs.

     In a word, the calling stack is:  (3) --> (2) --> (1) --> libhdfs3 API.
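
To make the layering concrete, here is a toy sketch in C. All the names are
invented for illustration and are not the actual HAWQ symbols; the bottom
layer just uses POSIX open(), the way a FUSE-backed filesystem could, where
gpfshdfs.c calls libhdfs3 instead.

    #include <fcntl.h>
    #include <unistd.h>

    /* (1) per-filesystem wrapper (gpfshdfs.c-style): the UDF body that calls
     *     the client library. */
    static int fs_udf_open(const char *path, int flags)
    {
        return open(path, flags);
    }

    /* (2) filesystem.c-style shim: turns an ordinary C call from the kernel
     *     into an invocation of the UDF in (1); the real code goes through
     *     the UDF call machinery instead of a direct C call. */
    static int FsOpen(const char *path, int flags)
    {
        return fs_udf_open(path, flags);
    }

    /* (3) fd.c-style entry point: in HAWQ this level also maintains the LRU
     *     cache of open handles so the process stays under the OS limit on
     *     open file descriptors. */
    static int VfsOpen(const char *path, int flags)
    {
        int fd = FsOpen(path, flags);
        /* ... register fd in the handle cache here ... */
        return fd;
    }
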
-------------------

    On the last question, about tablespaces: PostgreSQL introduced them so that
users can map different tablespaces to different paths, and those paths can be
mounted with different filesystems on Linux. But the filesystem API is the
same for all of them, and so is the functionality (UPDATE in place is
supported). So we cannot directly use tablespaces to handle this scenario. I
also cannot guess how much effort would be needed, because I did not
participate in the hdfs filesystem support in the original hawq release.


That's my opinion; any corrections or suggestions are welcome! Hope it
helps!  Thanks.



Re: Questions about filesystem / filespace / tablespace

Posted by Paul Guo <pa...@gmail.com>.
Hi Kyle,

I'm not sure whether I understand your point correctly, but with FUSE, which
allows userspace filesystem implementations on Linux, users treat the
filesystem (e.g. S3 in your example) as ordinary local storage and access it
via standard syscalls like open, close, read, and write, although some
behaviours or syscalls may not be supported. That means that for queries over
a FUSE fs, you can probably access the files using the interfaces in fd.c
directly (I'm not sure whether some hacking is needed). But for this kind of
distributed filesystem, library access is usually preferred over FUSE access
because of: 1) performance (you could look at how FUSE works to see the long
call paths it adds to every file access), and 2) stability (you add the FUSE
kernel part to your software stack, and in my experience it is really painful
to handle some of its failure modes). For such storage, I'd really prefer
other solutions: library access as in hawq, or external tables, whatever.

Actually, a long time ago I saw FUSE over hdfs in a real production
environment, so I'm curious whether anyone has tried querying through such a
setup before and compared it with hawq for performance, etc.





Re: Questions about filesystem / filespace / tablespace

Posted by Kyle Dunn <kd...@pivotal.io>.
Ming -

Great points about append-only. One potential work-around is to split a
table over multiple backend storage objects (a new object for each append
operation), then, maybe as part of VACUUM, perform object compaction. For
GCP, the server-side compaction capability for objects is called compose
<https://cloud.google.com/storage/docs/gsutil/commands/compose>. For AWS,
you can emulate this behavior using multipart upload
<http://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadInitiate.html> -
demonstrated concretely with the Ruby SDK here
<https://aws.amazon.com/blogs/developer/efficient-amazon-s3-object-concatenation-using-the-aws-sdk-for-ruby/>.
Azure actually supports append blobs
<https://blogs.msdn.microsoft.com/windowsazurestorage/2015/04/13/introducing-azure-storage-append-blob/>
natively.
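
As a local-filesystem analogy of that idea (every path and name below is made
up; on GCS/S3/Azure the compaction step would be compose, multipart copy, or
an append blob instead):

    #include <stdio.h>

    /* Each INSERT/append lands in its own segment object. */
    static void append_segment(const char *dir, int segno, const char *data)
    {
        char path[256];
        FILE *f;
        snprintf(path, sizeof(path), "%s/seg.%d", dir, segno);
        f = fopen(path, "w");
        if (!f)
            return;
        fputs(data, f);
        fclose(f);
    }

    /* "VACUUM"-style compaction: concatenate the segments into one object. */
    static void compact_segments(const char *dir, int nsegs, const char *outpath)
    {
        FILE *out = fopen(outpath, "w");
        if (!out)
            return;
        for (int i = 0; i < nsegs; i++) {
            char path[256], buf[4096];
            size_t n;
            FILE *in;
            snprintf(path, sizeof(path), "%s/seg.%d", dir, i);
            in = fopen(path, "r");
            if (!in)
                continue;
            while ((n = fread(buf, 1, sizeof(buf), in)) > 0)
                fwrite(buf, 1, n, out);
            fclose(in);
        }
        fclose(out);
    }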

For the FUSE exploration, can you (or anyone else) help me understand the
relationship and/or call graph between these different implementations?

   - backend/storage/file/filesystem.c
   - bin/gpfilesystem/hdfs/gpfshdfs.c
   - backend/storage/file/fd.c

I feel confident that everything HDFS-related ultimately uses
libhdfs3/src/client/Hdfs.cpp, but it seems like a convoluted path to get
there from the backend code.

Also, it looks like normal Postgres allows tablespaces to be created like
this:

      CREATE TABLESPACE fastspace LOCATION '/mnt/sda1/postgresql/data';

This is much simpler than wrapping glibc calls and is exactly what would be
necessary if using FUSE modules + mount points to handle a "pluggable"
backend. Maybe you (or someone) can advise how much effort it would take to
bring "local:// FS" tablespace support back? It is potentially less than
trying to unravel all of the HDFS-specific implementation scattered around the
backend code.


Thanks,
Kyle

-- 
*Kyle Dunn | Data Engineering | Pivotal*
Direct: 303.905.3171 | Email: kdunn@pivotal.io

Re: Questions about filesystem / filespace / tablespace

Posted by Ming Li <ml...@pivotal.io>.
Hi Kyle,

Good investigation!

I think we can first add a tuple similar to the hdfs one in pg_filesystem,
then implement all the APIs referenced by this tuple so that they call the
FUSE API.

However, because HAWQ is designed for hdfs, which means an append-only
filesystem, when we support other types of filesystem we should investigate
how to handle the performance and transaction issues. The performance can be
investigated after we implement a demo, but the transaction issue should be
decided beforehand. An append-only filesystem doesn't support UPDATE in place,
and the inserted data is tracked by file length in pg_aoseg.pg_aoseg_xxxxx or
pg_parquet.pg_parquet_xxxxx.
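
To illustrate the "tracked by file length" idea, here is a small sketch (the
types and names are invented, not the actual catalog structures): appended
bytes only become visible once the committed length is advanced, so an aborted
append is simply never exposed.

    #include <stdint.h>
    #include <stdio.h>

    /* Invented stand-in for a pg_aoseg entry: the committed logical length. */
    typedef struct AoSegEntry {
        int64_t eof;
    } AoSegEntry;

    /* Append at the physical end of the segment file, but only publish the
     * new length when the transaction commits. */
    static void ao_append(FILE *seg, AoSegEntry *entry, const void *buf,
                          size_t len, int committed)
    {
        fseek(seg, 0, SEEK_END);
        fwrite(buf, 1, len, seg);
        if (committed)
            entry->eof += (int64_t) len;   /* catalog update at commit time */
    }

    /* Scans read only up to the committed logical EOF, so bytes written by
     * an aborted append are never seen. */
    static size_t ao_read(FILE *seg, const AoSegEntry *entry, int64_t offset,
                          void *buf, size_t len)
    {
        if (offset >= entry->eof)
            return 0;
        if ((int64_t) len > entry->eof - offset)
            len = (size_t) (entry->eof - offset);
        fseek(seg, (long) offset, SEEK_SET);
        return fread(buf, 1, len, seg);
    }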

Thanks.




