Posted to user@spark.apache.org by aj...@thedatateam.in on 2019/06/16 13:48:00 UTC
Spark read csv option - capture exception in a column in permissive mode
Hi Team,
Can we have another column that gives the reason a record was marked
corrupt when reading a CSV in permissive mode?
Thanks,
Ajay
Re: Spark read csv option - capture exception in a column in permissive mode
Posted by "Anselmi Rodriguez, Agustina, Vodafone UK" <ag...@vodafone.com>.
You can sort of hack this by reading the file as an RDD[String] and implementing a custom parser, e.g.:

val rddRows = rdd.map(parseMyCols)

def parseMyCols(rawVal: String): Row = {
  parse(rawVal) match {
    case Success(parsedRowValues) => Row(parsedRowValues :+ "": _*)
    case Failure(exception)       => Row(nullList :+ exception.getMessage: _*)
  }
}
Hope this helps
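A more complete, self-contained sketch of that approach (plain Scala, no Spark; the comma-split parser and the `expectedCols` parameter are illustrative stand-ins for the `parse` function above):

```scala
import scala.util.{Try, Success, Failure}

// Hypothetical stand-in for the poster's parse(): split a raw CSV line
// into fields and require the expected number of columns.
def parseWithError(rawVal: String, expectedCols: Int): Seq[String] =
  Try {
    val fields = rawVal.split(",", -1).map(_.trim).toSeq
    require(fields.length == expectedCols,
      s"expected $expectedCols fields, got ${fields.length}")
    fields
  } match {
    // Success: keep the parsed fields and append an empty error column.
    case Success(fields)    => fields :+ ""
    // Failure: null out the data columns and record the exception message.
    case Failure(exception) => Seq.fill(expectedCols)(null: String) :+ exception.getMessage
  }
```

In Spark each returned Seq would then be wrapped in a Row (Row(values: _*)) against a schema that includes the extra error column.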
On 17 Jun 2019, at 06:31, Ajay Thompson <aj...@thedatateam.in> wrote:
There's a column which captures the corrupted record. However, the exception isn't captured; capturing it in another column would be very useful.
On Mon, 17 Jun, 2019, 10:56 AM Gourav Sengupta, <go...@gmail.com> wrote:
Hi,
It already does, I think; you just have to add the column to the schema that you are using to read.
Regards,
Gourav
On Sun, Jun 16, 2019 at 2:48 PM <aj...@thedatateam.in> wrote:
Hi Team,
Can we have another column that gives the reason a record was marked corrupt when reading a CSV in permissive mode?
Thanks,
Ajay
Re: Spark read csv option - capture exception in a column in permissive mode
Posted by Ajay Thompson <aj...@thedatateam.in>.
There's a column which captures the corrupted record. However, the
exception isn't captured; capturing it in another column would be
very useful.
On Mon, 17 Jun, 2019, 10:56 AM Gourav Sengupta, <go...@gmail.com>
wrote:
> Hi,
>
> It already does, I think; you just have to add the column to the schema
> that you are using to read.
>
> Regards,
> Gourav
>
> On Sun, Jun 16, 2019 at 2:48 PM <aj...@thedatateam.in> wrote:
>
>> Hi Team,
>>
>>
>>
>> Can we have another column which gives the corrupted record reason in
>> permissive mode while reading csv.
>>
>>
>>
>> Thanks,
>>
>> Ajay
>>
>
Re: Spark read csv option - capture exception in a column in permissive mode
Posted by Gourav Sengupta <go...@gmail.com>.
Hi,
It already does, I think; you just have to add the column to the schema
that you are using to read.
Regards,
Gourav
On Sun, Jun 16, 2019 at 2:48 PM <aj...@thedatateam.in> wrote:
> Hi Team,
>
>
>
> Can we have another column which gives the corrupted record reason in
> permissive mode while reading csv.
>
>
>
> Thanks,
>
> Ajay
>
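Gourav's suggestion can be sketched as follows (the column name, schema, and file path are illustrative; note that the corrupt-record column holds the raw text of the bad row, not the parse exception that Ajay is asking for):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types._

object CorruptRecordDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("corrupt-record-demo")
      .master("local[*]")
      .getOrCreate()

    // Declare the expected schema plus a string column to receive the raw
    // text of any row that fails to parse.
    val schema = StructType(Seq(
      StructField("id", IntegerType, nullable = true),
      StructField("amount", DoubleType, nullable = true),
      StructField("_corrupt_record", StringType, nullable = true)
    ))

    val df = spark.read
      .option("header", "true")
      .option("mode", "PERMISSIVE")                         // the default mode
      .option("columnNameOfCorruptRecord", "_corrupt_record")
      .schema(schema)
      .csv("data/input.csv")                                // illustrative path

    // Rows that parsed cleanly have _corrupt_record = null; malformed rows
    // keep their raw text there. Spark does not expose the exception itself.
    df.filter(df("_corrupt_record").isNotNull).show(truncate = false)
  }
}
```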