Posted to dev@lucene.apache.org by Kelsea Flores <kj...@dons.usfca.edu> on 2018/08/17 03:05:16 UTC

Getting started

Hi, my name is Kelsea Flores and I am a senior at the University of San
Francisco. I will be graduating in December 2018 with a Bachelor’s in
Computer Science. I am currently contributing to the SIS project; however,
I am also interested in working on Lucene.


Many of the projects that I have worked on so far have been centered around
data structures, searching, and sorting. In my Software Development class,
I worked individually to build a search engine in Java. The first part of
this four-part project was to write a Java program that processes all HTML files
in a directory and its subdirectories, cleans and parses the HTML into
words, and builds an inverted index to store the mapping from words to the
documents and positions within those documents where those words were
found. The second part of the project was to support exact search and
partial search by parsing a query file, generating a sorted list of search
results from the inverted index, and writing those results to a JSON file.
The third part consisted of extending part two to support multithreading by
making a thread-safe inverted index and using a work queue to build and
search an inverted index using multiple threads. The final part of the
project was to support building the index from the web instead of a
directory of text files using multithreading, an inverted index, sockets,
and HTTP.



Would anyone be willing to help me find work that would be suitable for me
to contribute to?

Thanks for taking the time to read this and I look forward to hearing from
you.

Kelsea Flores

Re: Getting started

Posted by Steve Rowe <sa...@gmail.com>.
Hi Kelsea,

Welcome!  

You can find information about Lucene’s development process here: https://wiki.apache.org/lucene-java/HowToContribute . In particular, the section “Getting your feet wet: where to begin?”, near the bottom of the page, should interest you.

Good luck,

--
Steve
www.lucidworks.com


