Posted to dev@lucene.apache.org by Leonardo Foderaro <st...@gmail.com> on 2015/09/10 14:38:36 UTC

Introducing Alba, a small framework to simplify Solr plugins development

Hi everyone,

this is my first post on this list and my first open-source project, so
please don't expect too much from either of them.

I've spent the last few weeks trying to understand how to create Solr
plugins, so I started a simple project (a plugin itself) which evolved into
a small framework named Alba (the Italian word for 'sunrise'), aimed at
simplifying their development. In short, each plugin is just an
annotated method:

@AlbaPlugin(name="myPluginsLibrary")
public class MyPlugins {

    @DocTransformer(name="helloworld")
    public void hello(SolrDocument doc) {
        doc.setField("message", "Hello, World!");
    }

    @FunctionQuery(name="len", description="returns the length of a string")
    public Integer len(
            @Param(name="string", description="the string to measure") String s) {
        return s.length();
    }

}

and this is how you call it, assuming 'author' is a valid field in your
schema:

fl=[alba name="helloworld"],alba(len,string=author),message
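
When this parameter is sent over HTTP, the brackets, quotes, spaces and
commas have to be URL-encoded. A minimal sketch in plain Java, assuming a
local Solr at http://localhost:8983/solr and a core named 'mycore' (both
hypothetical, as is the AlbaRequestExample class; none of this is part of
Alba itself):

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Alba: builds a /select URL whose
// 'fl' parameter carries the transformer and function calls shown above.
final class AlbaRequestExample {

    static String selectUrl(String baseUrl, String query, String fieldList) {
        return baseUrl + "/select?q=" + encode(query) + "&fl=" + encode(fieldList);
    }

    private static String encode(String value) {
        // URLEncoder.encode(String, Charset) is available from Java 10 on.
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String fl = "[alba name=\"helloworld\"],alba(len,string=author),message";
        // Base URL and core name are assumptions made for this example.
        System.out.println(selectUrl("http://localhost:8983/solr/mycore", "*:*", fl));
    }
}
```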

Plugins currently supported are:

- FunctionQuery
- ResponseWriter
- RequestHandler
- SearchComponent
- DocTransformer

Of course it's far from complete and I still have a lot to learn
about Solr (not to mention Java itself!). Nonetheless, working on it
is a terrific learning experience for me, and I think it could evolve into
something useful.

At http://github.com/leonardofoderaro/ you can find the project with a
(still-in-progress) tutorial in the wiki and some related repos, e.g. the
plugins built in the tutorial or the script used to generate the sample
dataset.

I still have many questions about Solr, but first I'd like to ask you if
you think it's a good idea. Any feedback is very welcome.

Kind regards,
Leonardo

Re: Introducing Alba, a small framework to simplify Solr plugins development

Posted by Leonardo Foderaro <st...@gmail.com>.
Thank you for sharing, it looks like a challenging project.
I'm not sure whether Alba is the right tool,
but if you want to try it for a simple proof of concept,
I will gladly help you decide whether it is.
I also agree with Alexandre: I'm not sure whether this thread
is more appropriate here on the dev list or on the users list;
if needed, we can continue it there.

thanks
leo

On Mon, Sep 14, 2015 at 4:27 PM, Alexandre Rafalovitch <ar...@gmail.com>
wrote:

> On 14 September 2015 at 07:55, Toke Eskildsen <te...@statsbiblioteket.dk>
> wrote:
> > The idea is to introduce named filters, where the construction of the
> > filters themselves is done internally in Solr.
>
> That would be a custom query parser, right? Just thinking out loud.
>
> Regards,
>    Alex.
> P.S. Also, is this conversation suitable for the DEV mailing list? Should
> it move to User?
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>

Re: Introducing Alba, a small framework to simplify Solr plugins development

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
On 14 September 2015 at 07:55, Toke Eskildsen <te...@statsbiblioteket.dk> wrote:
> The idea is to introduce named filters, where the construction of the
> filters themselves is done internally in Solr.

That would be a custom query parser, right? Just thinking out loud.

Regards,
   Alex.
P.S. Also, is this conversation suitable for the DEV mailing list? Should
it move to User?


Re: Introducing Alba, a small framework to simplify Solr plugins development

Posted by Erik Hatcher <er...@gmail.com>.
Toke - this (named filters that can be combined in boolean expressions) sounds like https://issues.apache.org/jira/browse/SOLR-7276 - whatcha think?

	Erik




> On Sep 14, 2015, at 7:55 AM, Toke Eskildsen <te...@statsbiblioteket.dk> wrote:
> 
> On Mon, 2015-09-14 at 12:34 +0200, Leonardo Foderaro wrote:
> 
>> Should you have any issue or suggestion on how to improve it please
>> let me know. 
> 
> I can explain my planned project, as it seems relevant in a broader
> scope. Maybe you can tell me if such a project fits into your framework?
> 
> 
> We have a SolrCloud setup with billions of documents, with 200-300M
> documents in each shard. We need to define multiple "sub-corpora", with
> a granularity that can be at single-document-level. In Solr-speak that
> could be done with filters. A filter could be (id:1234 OR id:5678),
> which is easy enough. But that does not scale to millions of IDs.
> 
> The idea is to introduce named filters, where the construction of the
> filters themselves is done internally in Solr.
> 
> Creating a filter could be a call with a user-specified name (aka
> filter-ID) and a URL to a filter-setup. The filter-setup would just be
> a list of queries, one on each line:
> id:1234
> id:5678
> domain:example.com
> id:7654
> The lines are processed one at a time and each match is OR'ed to the
> named filter being constructed. As this is a streaming process, there is
> no real limit to the size.
> 
> Using a previously constructed named filter would (guessing here) be a
> matter of writing a small alba-annotated class that takes the filter-ID
> as input and returns the corresponding custom-made Filter, which really
> is just a list of docIDs underneath (probably represented as a bitmap).
> 
> 
> - Toke Eskildsen, State and University Library, Denmark
> 
> 
> 
> 


Re: Introducing Alba, a small framework to simplify Solr plugins development

Posted by Toke Eskildsen <te...@statsbiblioteket.dk>.
On Mon, 2015-09-14 at 12:34 +0200, Leonardo Foderaro wrote:

> Should you have any issue or suggestion on how to improve it please
> let me know. 

I can explain my planned project, as it seems relevant in a broader
scope. Maybe you can tell me if such a project fits into your framework?


We have a SolrCloud setup with billions of documents, with 200-300M
documents in each shard. We need to define multiple "sub-corpora", with
a granularity that can be at single-document-level. In Solr-speak that
could be done with filters. A filter could be (id:1234 OR id:5678),
which is easy enough. But that does not scale to millions of IDs.
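
To make the scaling problem concrete, here is a minimal sketch of how such
a filter string could be assembled from a list of IDs. The class and method
names are hypothetical, and the IDs are assumed to contain no characters
that need query escaping:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: joins field:value clauses into one OR'ed filter.
final class IdFilterExample {

    // Assumes the IDs need no escaping for the query syntax.
    static String orFilter(String field, List<String> ids) {
        return ids.stream()
                .map(id -> field + ":" + id)
                .collect(Collectors.joining(" OR ", "(", ")"));
    }

    public static void main(String[] args) {
        // prints (id:1234 OR id:5678)
        System.out.println(orFilter("id", Arrays.asList("1234", "5678")));
    }
}
```

The string grows linearly with the number of IDs, and every clause has to
be sent and parsed on each request, which is why this approach stops being
practical long before millions of IDs.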

The idea is to introduce named filters, where the construction of the
filters themselves is done internally in Solr.

Creating a filter could be a call with a user-specified name (aka
filter-ID) and a URL to a filter-setup. The filter-setup would just be
a list of queries, one on each line:
 id:1234
 id:5678
 domain:example.com
 id:7654
The lines are processed one at a time and each match is OR'ed to the
named filter being constructed. As this is a streaming process, there is
no real limit to the size.
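
The streaming construction described above could be sketched roughly as
follows. This is only an illustration under stated assumptions: the
'matcher' function is a hypothetical stand-in for whatever runs one query
against the index and returns the matching docIDs, and java.util.BitSet
stands in for the bitmap that would be used inside Solr:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.BitSet;
import java.util.function.Function;

// Hypothetical sketch: builds one filter by OR'ing the matches of each line.
final class NamedFilterBuilder {

    // 'matcher' is an assumption: it maps one query line to its matching docIDs.
    static BitSet build(Reader filterSetup, Function<String, BitSet> matcher) {
        BitSet filter = new BitSet();
        try (BufferedReader lines = new BufferedReader(filterSetup)) {
            String line;
            // Only one line is held in memory at a time, so the size of the
            // filter-setup is not bounded by the heap.
            while ((line = lines.readLine()) != null) {
                String query = line.trim();
                if (query.isEmpty()) {
                    continue; // skip blank lines
                }
                filter.or(matcher.apply(query));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return filter;
    }
}
```

In this sketch only the resulting bitmap grows with the index; the list of
queries itself is never materialized.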

Using a previously constructed named filter would (guessing here) be a
matter of writing a small alba-annotated class that takes the filter-ID
as input and returns the corresponding custom-made Filter, which really
is just a list of docIDs underneath (probably represented as a bitmap).
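
A rough sketch of the lookup side, again with java.util.BitSet standing in
for the bitmap. NamedFilterRegistry and its methods are hypothetical names;
how such a class would be wired into Solr or annotated for Alba is not
shown, since the thread does not describe an Alba annotation for filters:

```java
import java.util.BitSet;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry: maps a filter-ID to a previously built docID bitmap.
final class NamedFilterRegistry {

    private final Map<String, BitSet> filters = new ConcurrentHashMap<>();

    void register(String filterId, BitSet docIds) {
        // Store a copy so later changes by the caller do not leak in.
        filters.put(filterId, (BitSet) docIds.clone());
    }

    BitSet lookup(String filterId) {
        BitSet stored = filters.get(filterId);
        if (stored == null) {
            throw new IllegalArgumentException("Unknown filter: " + filterId);
        }
        // Return a copy so the stored bitmap stays unchanged.
        return (BitSet) stored.clone();
    }

    // Restricts a set of hits to the named sub-corpus (bitwise AND).
    BitSet restrict(String filterId, BitSet hits) {
        BitSet result = lookup(filterId);
        result.and(hits);
        return result;
    }
}
```

One caveat for a real implementation: Lucene's internal docIDs are not
stable across segment merges, so a stored bitmap would have to be rebuilt
or remapped when the index changes.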


- Toke Eskildsen, State and University Library, Denmark






Re: Introducing Alba, a small framework to simplify Solr plugins development

Posted by Leonardo Foderaro <st...@gmail.com>.
Hi Toke,
I'm glad to know that.
That's exactly why I'm writing Alba: to flatten the initial learning curve
of the Solr plugin architecture as much as possible, so you can run a
quick test and evaluate whether a custom plugin is the right tool for a
particular task (sometimes you can get the same result by changing the
config), or even just explore what can be done with plugins.

As soon as I can find some time I'll try to add more features and more
examples to the wiki.

Should you have any issue or suggestion on how to improve it please let me
know.

Thanks
Leonardo


On Mon, Sep 14, 2015 at 9:58 AM, Toke Eskildsen <te...@statsbiblioteket.dk>
wrote:

> On Thu, 2015-09-10 at 14:38 +0200, Leonardo Foderaro wrote:
> > @AlbaPlugin(name="myPluginsLibrary")
> > public class MyPlugins {
> >
> >     @DocTransformer(name="helloworld")
> >      public void hello(SolrDocument doc) {
> >          doc.setField("message", "Hello, World!");
> >      }
> >
> [... http://github.com/leonardofoderaro/]
>
> This is very timely for me, as I'll have to dig into Solr plugin writing
> before the year is over.
>
> > I still have many questions about Solr, but first I'd like to ask you
> > if you think it's a good idea. Any feedback is very welcome.
>
> I know very little about writing plugins, so I am in no position to judge
> how much Alba helps with that. From what I can see in your GitHub
> repository, it seems very accessible, though.
>
> Thank you for sharing,
> Toke Eskildsen, State and University Library, Denmark
>
>
>
>
>

Re: Introducing Alba, a small framework to simplify Solr plugins development

Posted by Toke Eskildsen <te...@statsbiblioteket.dk>.
On Thu, 2015-09-10 at 14:38 +0200, Leonardo Foderaro wrote:
> @AlbaPlugin(name="myPluginsLibrary")
> public class MyPlugins {
> 
>     @DocTransformer(name="helloworld")
>      public void hello(SolrDocument doc) {
>          doc.setField("message", "Hello, World!");
>      }
> 
[... http://github.com/leonardofoderaro/]

This is very timely for me, as I'll have to dig into Solr plugin writing
before the year is over.

> I still have many questions about Solr, but first I'd like to ask you
> if you think it's a good idea. Any feedback is very welcome.

I know very little about writing plugins, so I am in no position to judge
how much Alba helps with that. From what I can see in your GitHub
repository, it seems very accessible, though.

Thank you for sharing,
Toke Eskildsen, State and University Library, Denmark




