Posted to dev@spark.apache.org by Manoj Kumar <ma...@gmail.com> on 2015/01/02 01:54:18 UTC

Highly interested in contributing to spark

Hello,

I am Manoj (https://github.com/MechCoder), an undergraduate student highly
interested in Machine Learning. I have contributed to SymPy and
scikit-learn as part of Google Summer of Code projects and my bachelor's
thesis. I have a few quick (non-technical) questions before I dive into the
issue tracker.

Are the issues marked trivial the easy-to-fix ones that I could try before
attempting slightly more ambitious ones? Also, I would like to know if
Apache Spark takes part in Google Summer of Code projects under the Apache
Software Foundation. It would be really great if it does!

Looking forward!

-- 
Godspeed,
Manoj Kumar,
Mech Undergrad
http://manojbits.wordpress.com

Re: Highly interested in contributing to spark

Posted by "Ganelin, Ilya" <Il...@capitalone.com>.
I might be seeing a similar error - I'm trying to build behind a proxy. I
was able to build until recently, but now when I run mvn clean package, I
get the following errors:

I would love to know what's going on here.

Exception in thread "pool-1-thread-1" java.lang.ExceptionInInitializerError
Exception in thread "main" java.lang.ExceptionInInitializerError
	at java.lang.J9VMInternals.ensureError(J9VMInternals.java:186)
	at java.lang.J9VMInternals.recordInitializationFailure(J9VMInternals.java:175)
	at javax.crypto.KeyAgreement.getInstance(Unknown Source)
	at com.ibm.jsse2.lb.h(lb.java:129)
	at com.ibm.jsse2.lb.a(lb.java:165)
	at com.ibm.jsse2.l$c_.a(l$c_.java:18)
	at com.ibm.jsse2.l.a(l.java:172)
	at com.ibm.jsse2.m.a(m.java:38)
	at com.ibm.jsse2.m.h(m.java:21)
	at com.ibm.jsse2.qc.a(qc.java:110)
	at com.ibm.jsse2.qc.<init>(qc.java:822)
	at com.ibm.jsse2.SSLSocketFactoryImpl.createSocket(SSLSocketFactoryImpl.java:10)
	at org.apache.maven.wagon.providers.http.httpclient.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:274)
	at org.apache.maven.wagon.providers.http.httpclient.impl.conn.HttpClientConnectionOperator.upgrade(HttpClientConnectionOperator.java:167)
	at org.apache.maven.wagon.providers.http.httpclient.impl.conn.PoolingHttpClientConnectionManager.upgrade(PoolingHttpClientConnectionManager.java:329)
	at org.apache.maven.wagon.providers.http.httpclient.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:392)
	at org.apache.maven.wagon.providers.http.httpclient.impl.execchain.MainClientExec.execute(MainClientExec.java:218)
	at org.apache.maven.wagon.providers.http.httpclient.impl.execchain.ProtocolExec.execute(ProtocolExec.java:194)
	at org.apache.maven.wagon.providers.http.httpclient.impl.execchain.RetryExec.execute(RetryExec.java:85)
	at org.apache.maven.wagon.providers.http.httpclient.impl.execchain.RedirectExec.execute(RedirectExec.java:108)
	at org.apache.maven.wagon.providers.http.httpclient.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:186)
	at org.apache.maven.wagon.providers.http.httpclient.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
	at org.apache.maven.wagon.providers.http.AbstractHttpClientWagon.execute(AbstractHttpClientWagon.java:756)
	at org.apache.maven.wagon.providers.http.AbstractHttpClientWagon.fillInputData(AbstractHttpClientWagon.java:854)
	at org.apache.maven.wagon.StreamWagon.getInputStream(StreamWagon.java:116)
	at org.apache.maven.wagon.StreamWagon.getIfNewer(StreamWagon.java:88)
	at org.apache.maven.wagon.StreamWagon.get(StreamWagon.java:61)
	at org.eclipse.aether.connector.wagon.WagonRepositoryConnector$GetTask.run(WagonRepositoryConnector.java:660)
	at org.eclipse.aether.util.concurrency.RunnableErrorForwarder$1.run(RunnableErrorForwarder.java:67)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1177)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.lang.Thread.run(Thread.java:857)
Caused by: java.lang.SecurityException: Cannot set up certs for trusted CAs
	at javax.crypto.b.<clinit>(Unknown Source)
	... 30 more
Caused by: java.lang.SecurityException: Cannot locate policy or framework files!
	at javax.crypto.b.c(Unknown Source)
	at javax.crypto.b.access$600(Unknown Source)
	at javax.crypto.b$0.run(Unknown Source)
	at java.security.AccessController.doPrivileged(AccessController.java:333)
	... 31 more
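
A quick way to check whether this failure depends on Maven and the proxy at
all is sketched below. The trace fails inside
javax.crypto.KeyAgreement.getInstance while the JVM's crypto classes
initialize, so requesting a KeyAgreement from a standalone program should hit
the same error if the JVM itself cannot find its policy files. The class name
JceCheck and the "DiffieHellman" algorithm are illustrative assumptions, not
taken from the trace.

```java
import java.security.NoSuchAlgorithmException;
import javax.crypto.KeyAgreement;

public class JceCheck {
    // Requesting any KeyAgreement forces the javax.crypto framework to
    // initialize, which is the step that fails in the trace above.
    // "DiffieHellman" is an assumed choice of a standard algorithm name.
    public static String keyAgreementAlgorithm() throws NoSuchAlgorithmException {
        KeyAgreement ka = KeyAgreement.getInstance("DiffieHellman");
        return ka.getAlgorithm();
    }

    public static void main(String[] args) throws Exception {
        // Print which JVM installation is in use, since the policy files
        // are looked up relative to the JVM's own installation.
        System.out.println("java.home = " + System.getProperty("java.home"));
        System.out.println("JCE initialized, algorithm = " + keyAgreementAlgorithm());
    }
}
```

If this runs cleanly under the same JVM that mvn uses, the JVM's crypto setup
is probably fine and the cause is more likely specific to the Maven
environment; if it throws the same SecurityException, the JVM installation
itself is the more likely suspect.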


On 1/2/15, 1:13 PM, "Manoj Kumar" <ma...@gmail.com> wrote:

>Hello,
>
>Thanks for your quick comments and encouragement.
>
>I tried building Spark from source using build/sbt assembly.
>
>However, it fails while downloading
>https://repo1.maven.org/maven2/org/scala-lang/scala-library/2.10.4/scala-library-2.10.4.jar
>with SSL certificate errors. I understand that it is due to this problem
>(http://apache-spark-user-list.1001560.n3.nabble.com/sbt-sbt-assembly-fails-with-ssl-certificate-error-td3046.html),
>but I'm not sure why it still uses https when this PR
>https://github.com/apache/spark/pull/209 has fixed it. Any help would be
>greatly appreciated.
>
>
>
>
>
>
>On Fri, Jan 2, 2015 at 11:51 AM, Nick Pentreath <ni...@gmail.com>
>wrote:
>
>> Oh, actually I was confusing this with another project; yours was not LSH,
>> sorry!
>>
>>
>> On Fri, Jan 2, 2015 at 8:19 AM, Nick Pentreath <ni...@gmail.com> wrote:
>>
>>> I'm sure Spark will sign up for GSoC again this year - and I'd be
>>> surprised if there was not some interest now for projects :)
>>>
>>> If I have the time at that point in the year I'd be happy to mentor a
>>> project in MLlib but will have to see how my schedule is at that point!
>>>
>>> Manoj, perhaps some of the locality sensitive hashing stuff you did for
>>> scikit-learn could find its way to Spark or spark-projects.
>>>
>>>
>>> On Fri, Jan 2, 2015 at 6:28 AM, Reynold Xin <rx...@databricks.com> wrote:
>>>
>>>> Hi Manoj,
>>>>
>>>> Thanks for the email.
>>>>
>>>> Yes - you should start with the starter task before attempting larger
>>>> ones.
>>>> Last year I signed up as a mentor for GSoC, but no student signed up. I
>>>> don't think I'd have time to be a mentor this year, but others might.
>>>>
>>>>
>>>> On Thu, Jan 1, 2015 at 4:54 PM, Manoj Kumar <manojkumarsivaraj334@gmail.com> wrote:
>>>>
>>>> > Hello,
>>>> >
>>>> > I am Manoj (https://github.com/MechCoder), an undergraduate student
>>>> > highly interested in Machine Learning. I have contributed to SymPy and
>>>> > scikit-learn as part of Google Summer of Code projects and my
>>>> > bachelor's thesis. I have a few quick (non-technical) questions before
>>>> > I dive into the issue tracker.
>>>> >
>>>> > Are the issues marked trivial the easy-to-fix ones that I could try
>>>> > before attempting slightly more ambitious ones? Also, I would like to
>>>> > know if Apache Spark takes part in Google Summer of Code projects under
>>>> > the Apache Software Foundation. It would be really great if it does!
>>>> >
>>>> > Looking forward!
>>>> >
>>>> > --
>>>> > Godspeed,
>>>> > Manoj Kumar,
>>>> > Mech Undergrad
>>>> > http://manojbits.wordpress.com
>>>> >
>>>>
>>>
>>>
>>
>
>
>-- 
>Godspeed,
>Manoj Kumar,
>Intern, Telecom ParisTech
>Mech Undergrad
>http://manojbits.wordpress.com



Re: Highly interested in contributing to spark

Posted by Manoj Kumar <ma...@gmail.com>.
Hello,

Thanks for your quick comments and encouragement.

I tried building Spark from source using build/sbt assembly.

However, it fails while downloading
https://repo1.maven.org/maven2/org/scala-lang/scala-library/2.10.4/scala-library-2.10.4.jar
with SSL certificate errors. I understand that it is due to this problem
(http://apache-spark-user-list.1001560.n3.nabble.com/sbt-sbt-assembly-fails-with-ssl-certificate-error-td3046.html),
but I'm not sure why it still uses https when this PR
https://github.com/apache/spark/pull/209 has fixed it. Any help would be
greatly appreciated.






On Fri, Jan 2, 2015 at 11:51 AM, Nick Pentreath <ni...@gmail.com>
wrote:

> Oh, actually I was confusing this with another project; yours was not LSH,
> sorry!
>
>
> On Fri, Jan 2, 2015 at 8:19 AM, Nick Pentreath <ni...@gmail.com>
> wrote:
>
>> I'm sure Spark will sign up for GSoC again this year - and I'd be
>> surprised if there was not some interest now for projects :)
>>
>> If I have the time at that point in the year I'd be happy to mentor a
>> project in MLlib but will have to see how my schedule is at that point!
>>
>> Manoj, perhaps some of the locality sensitive hashing stuff you did for
>> scikit-learn could find its way to Spark or spark-projects.
>>
>>
>> On Fri, Jan 2, 2015 at 6:28 AM, Reynold Xin <rx...@databricks.com> wrote:
>>
>>> Hi Manoj,
>>>
>>> Thanks for the email.
>>>
>>> Yes - you should start with the starter task before attempting larger
>>> ones.
>>> Last year I signed up as a mentor for GSoC, but no student signed up. I
>>> don't think I'd have time to be a mentor this year, but others might.
>>>
>>>
>>> On Thu, Jan 1, 2015 at 4:54 PM, Manoj Kumar <manojkumarsivaraj334@gmail.com> wrote:
>>>
>>> > Hello,
>>> >
>>> > I am Manoj (https://github.com/MechCoder), an undergraduate student
>>> > highly interested in Machine Learning. I have contributed to SymPy and
>>> > scikit-learn as part of Google Summer of Code projects and my
>>> > bachelor's thesis. I have a few quick (non-technical) questions before
>>> > I dive into the issue tracker.
>>> >
>>> > Are the issues marked trivial the easy-to-fix ones that I could try
>>> > before attempting slightly more ambitious ones? Also, I would like to
>>> > know if Apache Spark takes part in Google Summer of Code projects under
>>> > the Apache Software Foundation. It would be really great if it does!
>>> >
>>> > Looking forward!
>>> >
>>> > --
>>> > Godspeed,
>>> > Manoj Kumar,
>>> > Mech Undergrad
>>> > http://manojbits.wordpress.com
>>> >
>>>
>>
>>
>


-- 
Godspeed,
Manoj Kumar,
Intern, Telecom ParisTech
Mech Undergrad
http://manojbits.wordpress.com

Re: Highly interested in contributing to spark

Posted by Nick Pentreath <ni...@gmail.com>.
Oh, actually I was confusing this with another project; yours was not LSH, sorry!

On Fri, Jan 2, 2015 at 8:19 AM, Nick Pentreath <ni...@gmail.com>
wrote:

> I'm sure Spark will sign up for GSoC again this year - and I'd be surprised if there was not some interest now for projects :)
> If I have the time at that point in the year I'd be happy to mentor a project in MLlib but will have to see how my schedule is at that point!
> Manoj, perhaps some of the locality sensitive hashing stuff you did for scikit-learn could find its way to Spark or spark-projects.
> On Fri, Jan 2, 2015 at 6:28 AM, Reynold Xin <rx...@databricks.com> wrote:
>> Hi Manoj,
>> Thanks for the email.
>> Yes - you should start with the starter task before attempting larger ones.
>> Last year I signed up as a mentor for GSoC, but no student signed up. I
>> don't think I'd have time to be a mentor this year, but others might.
>> On Thu, Jan 1, 2015 at 4:54 PM, Manoj Kumar <ma...@gmail.com>
>> wrote:
>>> Hello,
>>>
>>> I am Manoj (https://github.com/MechCoder), an undergraduate student highly
>>> interested in Machine Learning. I have contributed to SymPy and
>>> scikit-learn as part of Google Summer of Code projects and my bachelor's
>>> thesis. I have a few quick (non-technical) questions before I dive into the
>>> issue tracker.
>>>
>>> Are the issues marked trivial the easy-to-fix ones that I could try before
>>> attempting slightly more ambitious ones? Also, I would like to know if
>>> Apache Spark takes part in Google Summer of Code projects under the Apache
>>> Software Foundation. It would be really great if it does!
>>>
>>> Looking forward!
>>>
>>> --
>>> Godspeed,
>>> Manoj Kumar,
>>> Mech Undergrad
>>> http://manojbits.wordpress.com
>>>

Re: Highly interested in contributing to spark

Posted by Nick Pentreath <ni...@gmail.com>.
I'm sure Spark will sign up for GSoC again this year - and I'd be surprised if there was not some interest now for projects :)

If I have the time at that point in the year I'd be happy to mentor a project in MLlib but will have to see how my schedule is at that point!

Manoj, perhaps some of the locality sensitive hashing stuff you did for scikit-learn could find its way to Spark or spark-projects.

On Fri, Jan 2, 2015 at 6:28 AM, Reynold Xin <rx...@databricks.com> wrote:

> Hi Manoj,
> Thanks for the email.
> Yes - you should start with the starter task before attempting larger ones.
> Last year I signed up as a mentor for GSoC, but no student signed up. I
> don't think I'd have time to be a mentor this year, but others might.
> On Thu, Jan 1, 2015 at 4:54 PM, Manoj Kumar <ma...@gmail.com>
> wrote:
>> Hello,
>>
>> I am Manoj (https://github.com/MechCoder), an undergraduate student highly
>> interested in Machine Learning. I have contributed to SymPy and
>> scikit-learn as part of Google Summer of Code projects and my bachelor's
>> thesis. I have a few quick (non-technical) questions before I dive into the
>> issue tracker.
>>
>> Are the issues marked trivial the easy-to-fix ones that I could try before
>> attempting slightly more ambitious ones? Also, I would like to know if
>> Apache Spark takes part in Google Summer of Code projects under the Apache
>> Software Foundation. It would be really great if it does!
>>
>> Looking forward!
>>
>> --
>> Godspeed,
>> Manoj Kumar,
>> Mech Undergrad
>> http://manojbits.wordpress.com
>>

Re: Highly interested in contributing to spark

Posted by Reynold Xin <rx...@databricks.com>.
Hi Manoj,

Thanks for the email.

Yes - you should start with the starter task before attempting larger ones.
Last year I signed up as a mentor for GSoC, but no student signed up. I
don't think I'd have time to be a mentor this year, but others might.


On Thu, Jan 1, 2015 at 4:54 PM, Manoj Kumar <ma...@gmail.com>
wrote:

> Hello,
>
> I am Manoj (https://github.com/MechCoder), an undergraduate student highly
> interested in Machine Learning. I have contributed to SymPy and
> scikit-learn as part of Google Summer of Code projects and my bachelor's
> thesis. I have a few quick (non-technical) questions before I dive into the
> issue tracker.
>
> Are the issues marked trivial the easy-to-fix ones that I could try before
> attempting slightly more ambitious ones? Also, I would like to know if
> Apache Spark takes part in Google Summer of Code projects under the Apache
> Software Foundation. It would be really great if it does!
>
> Looking forward!
>
> --
> Godspeed,
> Manoj Kumar,
> Mech Undergrad
> http://manojbits.wordpress.com
>