Posted to dev@drill.apache.org by Houssem Hosni <Ho...@lip6.fr> on 2018/02/05 17:43:50 UTC

PCAP files with Apache Drill and Sergeant R

Hi,
I am writing in the hope of getting some help from you.
I am working on analysis and prediction models built on large pcap
files.
Can Apache Drill, together with the R Sergeant library, help me in this context?
The pcap files (MAWI) are very large, and they are available on the
web (http://mawi.wide.ad.jp/mawi/samplepoint-F/2018/). I want to access
them via Apache Drill and then analyse them using the Sergeant package
(R), which works well with Drill.
Should I copy those large MAWI pcap files from the web to Amazon S3 and
then access them with Drill, or is it possible to access them directly
without Amazon storage?
What steps should I start with?
Many thanks in advance for considering my request.
Best regards,
Houssem Hosni
LIP6 - Sorbonne University
houssem.hosni@lip6.fr
Place Jussieu, 75005 Paris.
Tel: (+0033)0644087200



RE: PCAP files with Apache Drill and Sergeant R

Posted by Kunal Khatua <kk...@mapr.com>.
I don’t think you can (or would even want to) access them directly, assuming that the HTTP link you shared is your intended way of accessing the data.

Bringing them into Amazon S3 will make it easier to spin up Drill and access the data, and you could even use the 'tmp' workspace or create temporary tables within a Drill session to work on the data without having to repeatedly pull in the raw data from S3. 
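
As a rough illustration of the suggestion above, here is a minimal Python sketch that copies a pcap file from S3 into the 'tmp' workspace with a CREATE TABLE AS statement, submitted through Drill's REST endpoint (/query.json). Several things here are assumptions rather than details from this thread: a Drillbit reachable on localhost:8047, a storage plugin registered under the name `s3` with the pcap format enabled, and the hypothetical object path `mawi/sample.pcap`. It uses a regular table in dfs.tmp rather than a session-scoped temporary table, since whether a session persists across REST calls depends on the setup.

```python
import json
import urllib.request

# Assumption: a Drillbit running locally with the default web port.
DRILL_URL = "http://localhost:8047/query.json"


def build_ctas_payload(table_name, source_path, plugin="s3"):
    """Build the JSON body for a CTAS that copies a pcap file into dfs.tmp.

    `plugin` is the name of the storage plugin that points at the bucket;
    "s3" is a hypothetical name and must match your Drill configuration.
    """
    sql = (
        f"CREATE TABLE dfs.tmp.`{table_name}` AS "
        f"SELECT * FROM {plugin}.`{source_path}`"
    )
    return {"queryType": "SQL", "query": sql}


def run_query(payload, url=DRILL_URL):
    """POST the payload to Drill and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical table name and object path, for illustration only.
    payload = build_ctas_payload("mawi_sample", "mawi/sample.pcap")
    print(json.dumps(payload, indent=2))
    # run_query(payload)  # uncomment once a Drillbit is actually running
```

Once the table exists, later queries (from Sergeant or anywhere else) can read dfs.tmp.`mawi_sample` instead of pulling the raw pcap from S3 again.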
