Posted to dev@spot.apache.org by Salvatore Elio <El...@sia.eu> on 2017/09/15 12:17:43 UTC

DNS ingestion

Hello,



For an internal project, we have developed a different ingestion process for DNS, in order to ingest data in real time and to support early enrichment of the ingested data.

The ingestion is split into two processes:



1)     From DNS data to Kafka - an Akka Streams job based on Pcap4j (https://github.com/kaitoy/pcap4j) which:

a.     loops through all the UDP packets filtered on port 53, using Pcap4j;

b.     converts the Pcap4j packet objects to Avro using Twitter Bijection;

c.     sends the Avro objects to Kafka.
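
To make the shape of steps a-c concrete, here is a minimal sketch in Python using only the standard library. It is not the Akka Streams job itself: the function names `parse_dns_question` and `encode_record` and the record fields are hypothetical, the header layout is the standard one from RFC 1035, and JSON stands in for the Avro encoding so that the example stays self-contained.

```python
import json
import struct


def parse_dns_question(payload: bytes) -> dict:
    """Decode the header and first question of a raw DNS message."""
    if len(payload) < 12:
        raise ValueError("DNS message shorter than its 12-byte header")
    # Fixed header: id, flags, then four 16-bit section counts.
    txid, flags, qdcount, _an, _ns, _ar = struct.unpack("!6H", payload[:12])
    if qdcount == 0:
        raise ValueError("DNS message carries no question")
    labels, pos = [], 12
    # The query name is a run of length-prefixed labels ending in a zero byte.
    while payload[pos] != 0:
        length = payload[pos]
        if length & 0xC0:
            raise ValueError("compressed names are not handled in this sketch")
        labels.append(payload[pos + 1 : pos + 1 + length].decode("ascii"))
        pos += 1 + length
    qtype, qclass = struct.unpack("!2H", payload[pos + 1 : pos + 5])
    return {
        "id": txid,
        "is_response": bool(flags & 0x8000),  # QR bit: 0 = query, 1 = response
        "qname": ".".join(labels),
        "qtype": qtype,
        "qclass": qclass,
    }


def encode_record(record: dict) -> bytes:
    """Serialize a record for the message bus (JSON stands in for Avro)."""
    return json.dumps(record, sort_keys=True).encode("utf-8")


# A hand-built query for "example.org", type A, class IN.
query = (
    struct.pack("!6H", 0x1234, 0x0100, 1, 0, 0, 0)
    + b"\x07example\x03org\x00"
    + struct.pack("!2H", 1, 1)
)
record = parse_dns_question(query)
message = encode_record(record)  # bytes that a producer would send to Kafka
```

In the real job the payload would come from the UDP packets captured on port 53, and `message` would be handed to a Kafka producer instead of being kept in a variable.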



2) From Kafka to Hive - a Spark Streaming job that reads new messages from Kafka and writes them to a partitioned Parquet folder on HDFS, readable by Hive/Impala.
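
As an illustration of what "partitioned folder" means here, the sketch below maps a message timestamp to a Hive-style `key=value` directory, which is the layout Hive and Impala read as table partitions. The partition keys `y`, `m`, `d`, `h`, the base path `/data/dns`, and the function name `partition_path` are assumptions made for the example, not necessarily what the Spark Streaming job uses.

```python
from datetime import datetime, timezone


def partition_path(base: str, epoch_seconds: float) -> str:
    """Return the Hive-style partition directory for a message timestamp."""
    # Use UTC so that the same packet always lands in the same partition,
    # whatever the time zone of the machine running the job.
    ts = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return f"{base}/y={ts.year}/m={ts.month}/d={ts.day}/h={ts.hour}"


# Example: a message captured on 2017-09-15 at 12:17:43 UTC.
captured_at = datetime(2017, 9, 15, 12, 17, 43, tzinfo=timezone.utc).timestamp()
path = partition_path("/data/dns", captured_at)
```

Each micro-batch would then write its Parquet files under the directory computed for its records, so that queries restricted to a given hour only touch that folder.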
​

We would like to know your thoughts about this, and whether it could be integrated into Apache Spot. If it is of interest, we can share the code.





Thanks
