Posted to users@zeppelin.apache.org by Muhammad Rezaul Karim <re...@yahoo.com> on 2016/11/17 03:44:42 UTC

ArrayIndexOutOfBoundsException on Zeppelin notebook example

Hi All,
I have the following Scala code (taken from https://zeppelin.apache.org/docs/0.6.2/quickstart/tutorial.html#data-retrieval) that deals with the sample Bank-details data:
-----------------------------------------------------------*--------------------------------------------------------------------
val bankText = sc.textFile("/home/asif/zeppelin-0.6.2-bin-all/bin/bank-full.csv")
case class Bank(age:Integer, job:String, marital:String, education:String, balance:Integer)

// split each line, filter out header (starts with "age"), and map it into Bank case class
val bank = bankText.map(s=>s.split(";")).filter(s=>s(0)!="\"age\"").map(
    s=>Bank(s(0).toInt,
            s(1).replaceAll("\"", ""),
            s(2).replaceAll("\"", ""),
            s(3).replaceAll("\"", ""),
            s(5).replaceAll("\"", "").toInt
        )
)
// convert to DataFrame and create temporary table
bank.toDF().registerTempTable("bank")
-----------------------------------------------------------*--------------------------------------------------------------------
The above code segment runs successfully. However, when I try to execute bank.collect(), I get the following error:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 6.0 failed 1 times, most recent failure: Lost task 1.0 in stage 6.0 (TID 7, localhost): java.lang.ArrayIndexOutOfBoundsException: 2
    at $anonfun$3.apply(<console>:91)
    at $anonfun$3.apply(<console>:89)

Moreover, the SQL queries below fail with the same error message (i.e., ArrayIndexOutOfBoundsException: 2):
1. %sql select age, count(1) from bank where age < 30 group by age order by age
2. %sql select age, count(1) from bank where age < ${maxAge=30} group by age order by age
3. %sql select age, count(1) from bank where marital="${marital=single,single|divorced|married}" group by age order by age
Note: However, the following SQL statements run without any error:
1. %sql select age from bank
2. %sql select * from bank

I don't understand what I am doing wrong here. Could someone please help me fix this?

 Thanks and Regards,
---------------------------------
Md. Rezaul Karim 
PhD Researcher, Insight Centre for Data Analytics 
National University of Ireland Galway
E-mail: rezaul.karim@insight-centre.org
Web: www.insight-centre.org
Phone: +353892311519

Re: ArrayIndexOutOfBoundsException on Zeppelin notebook example

Posted by Hyung Sung Shim <hs...@nflabs.com>.
Good to hear it helps.

Re: ArrayIndexOutOfBoundsException on Zeppelin notebook example

Posted by Muhammad Rezaul Karim <re...@yahoo.com>.
Hi Shim,
Now it works perfectly. Thank you so much. I come from a Java background and am still learning Scala.

 Thanks and Regards,
---------------------------------
Md. Rezaul Karim 
PhD Researcher, Insight Centre for Data Analytics 
National University of Ireland Galway
E-mail: rezaul.karim@insight-centre.org
Web: www.insight-centre.org
Phone: +353892311519 


Re: ArrayIndexOutOfBoundsException on Zeppelin notebook example

Posted by Hyung Sung Shim <hs...@nflabs.com>.
Hello Muhammad.

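Please check your bank-full.csv file first. You can also filter on the number of
fields in your Scala code, so that rows with too few fields are dropped before
they are mapped into the Bank case class, for example:

val bank = bankText.map(s => s.split(";")).filter(s => s.size > 5).filter(s => s(0) != "\"age\"")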

Hope this helps.
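
For readers following along, here is a minimal sketch of how the suggested length filter fits into the full pipeline. It is only an illustration: it runs on a plain Scala Seq instead of the Spark RDD, and the sample lines (including the two malformed ones) are made up, not taken from bank-full.csv. The same filter and map calls are assumed to apply unchanged to bankText in the notebook.

```scala
// Hypothetical sample lines standing in for bank-full.csv.
// The last two are deliberately malformed (an empty line and a truncated row).
val sampleLines = Seq(
  "\"age\";\"job\";\"marital\";\"education\";\"default\";\"balance\"",
  "58;\"management\";\"married\";\"tertiary\";\"no\";2143",
  "44;\"technician\";\"single\";\"secondary\";\"no\";29",
  "",
  "33;\"unknown\""
)

case class Bank(age: Integer, job: String, marital: String, education: String, balance: Integer)

val bank = sampleLines
  .map(_.split(";"))
  .filter(_.length > 5)            // drop rows with too few fields, so s(5) is always valid
  .filter(s => s(0) != "\"age\"")  // drop the header row
  .map(s => Bank(
    s(0).toInt,
    s(1).replaceAll("\"", ""),
    s(2).replaceAll("\"", ""),
    s(3).replaceAll("\"", ""),
    s(5).replaceAll("\"", "").toInt
  ))

bank.foreach(println)
```

Without the length filter, the truncated row splits into only two fields, so reading s(2) throws ArrayIndexOutOfBoundsException: 2, which matches the error reported above.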





Re: ArrayIndexOutOfBoundsException on Zeppelin notebook example

Posted by Dayong <wi...@gmail.com>.
Try debugging your code in an IDE. Look at your array s, since the exception complains about an array index.

Thanks,
Wd
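
One way to look at the array without an IDE is to list the lines that split into too few fields. The sketch below is an assumption-laden illustration, not part of the original advice: it uses a plain Scala Seq with made-up sample lines, and the name shortRows is hypothetical. Spark's RDD also offers zipWithIndex, map and filter, so a similar check could likely be run on bankText and inspected with take.

```scala
// Hypothetical sample lines; line 3 is empty and line 4 is truncated.
val rawLines = Seq(
  "\"age\";\"job\";\"marital\";\"education\";\"default\";\"balance\"",
  "58;\"management\";\"married\";\"tertiary\";\"no\";2143",
  "",
  "33;\"unknown\""
)

// Pair each line with its 1-based line number and its field count,
// then keep only the rows that are too short for s(5) to exist.
val shortRows = rawLines.zipWithIndex
  .map { case (line, idx) => (idx + 1, line, line.split(";").length) }
  .filter { case (_, _, fieldCount) => fieldCount <= 5 }

shortRows.foreach { case (lineNo, line, fieldCount) =>
  println(s"line $lineNo has $fieldCount field(s): '$line'")
}
```

Any line reported here would break the mapping into Bank, because the code reads fields up to index 5.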
