Posted to user@drill.apache.org by Prabhakar Bhosaale <bh...@gmail.com> on 2020/04/11 13:58:37 UTC

Querying encrypted JSON file

Hi All,
I have an encrypted JSON file. Is there any way in Drill to query the
encrypted JSON file? Thanks

Regards
Prabhakar

Re: Querying encrypted JSON file

Posted by Paul Rogers <pa...@yahoo.com.INVALID>.
Hi Prabhakar,

Looking at the Drill code, the existing compression support (via "codecs") is in the FileSystemPlugin class [1]. It looks like Drill uses the compression codec feature of Hadoop [2], which is built around the CompressionCodec interface [3].

This means that you just need to use the standard Hadoop mechanisms to define a custom codec [4].

If you are storing JSON, it might be worthwhile to combine compression and encryption, since JSON files tend to be large (especially if the JSON is indented). Perhaps one of the existing Hadoop codecs (see [2]) might do the job for you.
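
As a rough illustration of combining the two, here is a minimal Java sketch of the stream layering such a custom codec could perform: gzip first, then encrypt on write, and the reverse on read. It is not Drill or Hadoop code. The class name EncryptedGzipSketch, the 16-byte IV header layout, the choice of AES/CTR, and the all-zero demo key are assumptions made for the example. AES/CTR is unauthenticated, so a real design would probably prefer an authenticated mode and proper key management.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch: the stream wrapping a custom "decrypting codec" might do.
class EncryptedGzipSketch {
    private static final String TRANSFORMATION = "AES/CTR/NoPadding"; // assumption
    private static final int IV_LENGTH = 16; // AES block size, stored as a file header

    private static Cipher cipher(int mode, byte[] key, byte[] iv)
            throws GeneralSecurityException {
        Cipher c = Cipher.getInstance(TRANSFORMATION);
        c.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return c;
    }

    // Write path: compress first, then encrypt. Ciphertext looks random,
    // so compressing after encryption would gain nothing.
    static byte[] write(byte[] json, byte[] key)
            throws IOException, GeneralSecurityException {
        byte[] iv = new byte[IV_LENGTH];
        new SecureRandom().nextBytes(iv);
        ByteArrayOutputStream file = new ByteArrayOutputStream();
        file.write(iv); // the IV is not secret; it travels with the file
        try (OutputStream out = new GZIPOutputStream(
                new CipherOutputStream(file, cipher(Cipher.ENCRYPT_MODE, key, iv)))) {
            out.write(json);
        }
        return file.toByteArray();
    }

    // Read path: the kind of stream a codec's createInputStream could return.
    // The caller sees plain JSON bytes; decryption and gunzip happen underneath.
    static InputStream open(InputStream raw, byte[] key)
            throws IOException, GeneralSecurityException {
        byte[] iv = new byte[IV_LENGTH];
        new DataInputStream(raw).readFully(iv);
        return new GZIPInputStream(
                new CipherInputStream(raw, cipher(Cipher.DECRYPT_MODE, key, iv)));
    }

    public static void main(String[] args) throws Exception {
        byte[] key = new byte[16]; // all-zero demo key, for illustration only
        byte[] stored = write("{\"id\": 1}".getBytes(StandardCharsets.UTF_8), key);
        try (InputStream in = open(new ByteArrayInputStream(stored), key)) {
            ByteArrayOutputStream plain = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            for (int n; (n = in.read(buf)) != -1; ) {
                plain.write(buf, 0, n);
            }
            System.out.println(new String(plain.toByteArray(), StandardCharsets.UTF_8));
        }
    }
}
```

In a real codec, the open() logic would sit behind Hadoop's CompressionCodec interface and be registered through the usual Hadoop configuration, so the JSON reader itself would not need to change.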

Here it might be worth pointing out that you'll need a file system to store the files. If your use case is small enough that your files fit on a single machine, you can use a single Drillbit to query local files. If the set of files is large, then one node will not provide adequate performance so you'll need a Drill cluster. For that, you'll need a distributed file system: HDFS, MapR-FS, S3 or whatever.


Note also that JSON is a convenient, but inefficient, format. If you have to encrypt files, we already suggested compressing them as well. However, JSON files are not block-splittable: if you have a big JSON file, it must be read in a single thread. (Not as much of a problem if you instead have many smaller files.) A format such as Parquet is better suited for queries. So, if you must convert your file to encrypt it, consider converting the files to Parquet to get better query performance. Drill can even do the conversion for you with the CREATE TABLE AS (CTAS) command.
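
A minimal sketch of what the CTAS conversion could look like when driven from Java over JDBC. The file path, the target table name, and the class name JsonToParquetSketch are hypothetical. The code assumes the Drill JDBC driver is on the classpath and that a writable workspace such as dfs.tmp is configured. Without a JDBC URL argument it only prints the statements.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

// Hypothetical sketch: convert a JSON file to Parquet with Drill's CTAS.
class JsonToParquetSketch {

    // Builds the two statements: select Parquet as the output format,
    // then create the table from the JSON file.
    static List<String> ctasStatements(String jsonPath, String targetTable) {
        return List.of(
            "ALTER SESSION SET `store.format` = 'parquet'",
            "CREATE TABLE " + targetTable
                + " AS SELECT * FROM dfs.`" + jsonPath + "`");
    }

    public static void main(String[] args) throws SQLException {
        // Path and table name are placeholders for illustration.
        List<String> statements =
            ctasStatements("/data/archive/orders.json", "dfs.tmp.`orders_parquet`");

        if (args.length == 0) {
            // No connection URL given: just show what would be run.
            statements.forEach(System.out::println);
            return;
        }

        // args[0] is a Drill JDBC URL, e.g. jdbc:drill:drillbit=localhost
        try (Connection conn = DriverManager.getConnection(args[0]);
             Statement stmt = conn.createStatement()) {
            for (String sql : statements) {
                stmt.execute(sql);
            }
        }
    }
}
```

The same two statements can be run directly from the Drill shell. The Java wrapper only matters if the conversion is part of an automated archiving job.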


Thanks,
- Paul


[1] https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSystemPlugin.java#L141

[2] https://netjs.blogspot.com/2018/04/data-compression-in-hadoop.html

[3] https://hadoop.apache.org/docs/current/api/org/apache/hadoop/io/compress/CompressionCodec.html 

[4] https://stackoverflow.com/questions/37608227/adding-custom-code-to-hadoop-spark-compression-codec

    On Sunday, April 12, 2020, 12:03:13 AM PDT, Prabhakar Bhosaale <bh...@gmail.com> wrote:  
 
Hi Paul,
Thanks for the details. As of now I have not settled on any encryption
technique, as I first wanted to understand Drill's capabilities for encryption
and decryption.
To give you more details on my requirement: I will be archiving data in
JSON format from a database, and that archived data will be accessed using
Drill for reporting purposes. I am already zipping up the JSON files using gzip,
but for security reasons I need to encrypt the files as well. Thx

Regards
Prabhakar



On Sun, Apr 12, 2020, 11:38 Paul Rogers <pa...@yahoo.com.invalid> wrote:

> Hi Prabhakar,
>
> Depending on how you perform encryption, you may be able to treat it
> similarly to compression. Drill handles compression (zip, gzip, etc.) via an
> extra layer of functionality on top of any format plugin. That means,
> rather than writing a new JSON file reader, you write a new compression
> plugin (which will actually do decryption). I have not added one of these,
> but I'll poke around to see if I can find some pointers.
>
> On the other hand, if encryption is part of the access protocol (such as
> S3), then you can configure it via the S3 client.
>
> Can you describe a bit more how you encrypt your files and what is needed
> to decrypt?
>
>
> Thanks,
> - Paul
>
>
>
>    On Saturday, April 11, 2020, 10:39:15 PM PDT, Prabhakar Bhosaale <
> bhosale.p.v@gmail.com> wrote:
>
>  Hi Ted,
> Thanks for your reply. Could you please give some more details on how to
> write such a file format and how to use it? Any pointers will be
> appreciated. Thx
>
> Regards
> Prabhakar
>
> On Sun, Apr 12, 2020, 00:19 Ted Dunning <te...@gmail.com> wrote:
>
> > Yes.
> >
> > You need to write a special file format for that, though.
> >
> >
> > On Sat, Apr 11, 2020 at 6:58 AM Prabhakar Bhosaale <
> bhosale.p.v@gmail.com>
> > wrote:
> >
> > > Hi All,
> > > I have an encrypted JSON file. Is there any way in Drill to query the
> > > encrypted JSON file? Thanks
> > >
> > > Regards
> > > Prabhakar
> > >
> >
>
  

Re: Querying encrypted JSON file

Posted by Prabhakar Bhosaale <bh...@gmail.com>.
Hi Paul,
Thanks for the details. As of now I have not settled on any encryption
technique, as I first wanted to understand Drill's capabilities for encryption
and decryption.
To give you more details on my requirement: I will be archiving data in
JSON format from a database, and that archived data will be accessed using
Drill for reporting purposes. I am already zipping up the JSON files using gzip,
but for security reasons I need to encrypt the files as well. Thx

Regards
Prabhakar




Re: Querying encrypted JSON file

Posted by Paul Rogers <pa...@yahoo.com.INVALID>.
Hi Prabhakar,

Depending on how you perform encryption, you may be able to treat it similarly to compression. Drill handles compression (zip, gzip, etc.) via an extra layer of functionality on top of any format plugin. That means that, rather than writing a new JSON file reader, you write a new compression plugin (which will actually do decryption). I have not added one of these, but I'll poke around to see if I can find some pointers.

On the other hand, if encryption is part of the access protocol (such as S3), then you can configure it via the S3 client.

Can you describe a bit more how you encrypt your files and what is needed to decrypt?


Thanks,
- Paul

 


Re: Querying encrypted JSON file

Posted by Prabhakar Bhosaale <bh...@gmail.com>.
Hi Ted,
Thanks for your reply. Could you please give some more details on how to
write such a file format and how to use it? Any pointers will be
appreciated. Thx

Regards
Prabhakar


Re: Querying encrypted JSON file

Posted by Ted Dunning <te...@gmail.com>.
Yes.

You need to write a special file format for that, though.

