Posted to solr-commits@lucene.apache.org by Apache Wiki <wi...@apache.org> on 2006/11/09 01:42:43 UTC

[Solr Wiki] Update of "TaskList" by HossMan

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The following page has been changed by HossMan:
http://wiki.apache.org/solr/TaskList

The comment on the change is:
gutting all of the "DONE" items, and clarifying the two lists a little bit

------------------------------------------------------------------------------
- This is a list of ideas for improving Solr.  All users should feel free to add new ideas to this page.
+ This is a list of ideas for improving Solr.  
+ 
+ All users should feel free to add new ideas to this page, or add links to other wiki pages containing more involved designs.
+ 
+ Users should also feel free to open "New Feature", "Improvement", or "Wish" issues in [http://issues.apache.org/jira/browse/SOLR Jira] -- particularly if they already have code that makes progress towards the idea.
  
  Many of the ideas on this page have been discussed on the [http://www.nabble.com/Solr-f14479.html Solr mailing lists]; you should search there for more information.
  
- == Near Term Tasks ==
+ == Simple Non-Invasive Tasks ==
+ 
+ This section is for ideas that are relatively straightforward or don't involve major changes to the Solr codebase.  People who are eager to "give back" to the Solr community but don't have a lot of familiarity with the Solr code base may be interested in taking on these Tasks...
+ 
   * Create a "Powered By Solr" icon that people can include in their applications if they so choose.
-  * [DONE] Move webapp from ROOT to solr
-  * Allow multiple independent Solr webapps in the same app server.
-    * [DONE] [http://www.nabble.com/multiple+solr+webapps-t1469811.html#a3991310 JNDI] support implemented
-    * we need more docs on how to configure this in resin/jetty (!TomCat is done)
-  * [DONE] provide alternate way to specify where solr configuration and index directories are (don't have to depend on cwd)
-     * (solr.solr.home system property implemented ... JNDI provides a second method)
   * alternate ways of indexing (it currently requires an HTTP POST of an XML document)
-    * be able to load a text file, with variable delimiters
+    * be able to load a text file, with variable delimiters (see also [http://issues.apache.org/jira/browse/SOLR-66 SOLR-66])
     * point it at a database and give it SQL
-  * [DONE] try and clean up css for IE
-  * [DONE] clean up the admin pages more, refactor repeated code, remove stuff that doesn't work
-  * document how user can supply a plugin, since we can't depend on the previous method we had used with resin (which added a webapp external directory to the classpath). .. java code is just a classpath issue that can be documented, but what about stylesheets or other types of files?
-  * [DONE] Simple faceted browsing (grouping) support in the standard query handler [http://issues.apache.org/jira/browse/SOLR-44 see bug]
-    * group by field (provide counts for each distinct value in that field)
-    * group by (query1, query2, query3, query4, query5)
-  * [DONE] Highlighter integration/support
-    * [DONE] Allow specification of begin and end markup for highlighted regions, perhaps with namespace like query params highlight.start=<b> highlight.end=</b>
-    * [DONE] Allow specification of default highlighter markup via init params in solrconfig.xml
-    * [DONE] decide on the best default... <B></B>, <em></em>, or <SPAN class="highlight"></SPAN> 
   * a !DateTime field (or Query Parser extension) that allows flexible input for easier human entered queries
     * date math in the Query Parser would also be useful for "baking" filter queries into configs, ie...
       * `expireDate:[now TO *]`
@@ -34, +25 @@

   * [DONEish?] good multi-field querying support integrated with standard request handler, or as a separate handler
     * see: DisMaxRequestHandler
   * support for max disjunction and minNrShouldMatch in query parser (really a Lucene item) 
-  * [DONE] Refactor standard request handler to use the new PluginUtils
-  * UnitsFilter... 17" => 17 inch, etc
+  * !UnitsFilter... 17" => 17 inch, etc
+  * Admin query interface: add highlighting options, query writer options, facet options (see also [http://issues.apache.org/jira/browse/SOLR-67 SOLR-67]) 
-  * !QueryResponseWriter...
-      * add init() method that gets passed info from solrconfig.xml (like !SolrRequestHandler)
-      * [DONE via getContentType()] add mechanism for controlling mime/type (add new method !QueryResponseWriter.getMimeType(!SolrQueryRequest) ? ... change return type of !QueryResponseWriter.write to String ?)
-  * Admin query interface: add highlighting options, query writer options 
   * Documentation
-    * [DONE] Analyzer components available from the schema
-    * [DONE] Document how to set up distribution/replication
-    * [DONE] Standard Query Handler: query params
     * result XML format - needed, or self-explanatory?
-    * [DONE] update XML format
-    * [DONE] analysis stuff (available tokenizers and filters)
-    * Installation/Configuration Wiki for Tomcat (and other servlet containers)
     * Java Docs
        * good overview.html
-       * pacakge.html for every package
+       * package.html for every package
        * class level documentation for every class
+       * detailed method javadocs for every method in all of the "pluggable" classes and every method in a key class used when writing a request handler...
-       * detailed method javadocs for every method in all of the "plugable" classes...
-          * [DONE] !SolrRequestHandler
           * !SolrCache ... in progress
           * !SolrEventListener 
-          * [DONE] SolrInfoMBean
-          * [DONE] !CacheRegenerator
-          * [DONE] !TokenizerFactory, !BaseTokenizerFactory
-          * [DONE] !TokenFilterFactory, !BaseTikenFilterFactory
           * !UpdateHandler
           * !FieldType ... in progress
-          * [DONE] !QueryResponseWriter
-       * detailed javadocs all methods in the key classes used when dealing with a request
-          * [DONE] !NamedList
-          * [DONE] !SolrIndexSearcher
-          * [DONE] !SolrQueryRequest, !SolrQueryRequestBase
-          * [DONE] !SolrQueryResponse
-          * [DONE] !DocList, !DocSlice
-          * [DONE] !DocSet
-          * [DONE] !DocIterator
-          * [DONE] !IndexSchema
           * !SchemaField
-          * [DONE] !SolrQueryParser
-          * [DONE] !QueryParsing
-    * Wiki page giving an overview of what things are "plugable"
-  * Replication
-    * [DONE] clean up the scripts, provide an alternate configuration mechanism (in the CNET internal version,
-      they had macro expansion performed on them) 
   * [MOSTLY DONE] Testing (see [https://issues.apache.org/jira/browse/SOLR-3 SOLR-3])
-    * [DONE] make more JUnit tests
-    * [DONE] try to make the current big testapp more modular, and maybe make a junit test out of it.
     * Either eliminate testapp, or refactor it to use new test harness and document it as a performance testing tool.
   * !SolrQueryParser Configuration in schema.xml
     * make more options configurable via schema.xml besides operator ([http://www.nabble.com/QueryParser-default-operator---AND-tf1977319.html#a5425106 discussion])
     * refactor option setting into a utility (possibly in !IndexSchema) so people constructing a !SolrQueryParser instance directly get the built in defaults. ([http://www.nabble.com/QueryParser-default-operator---AND-tf1977319.html#a5425271 discussion])
   * Live demo server or application (perhaps host on apache lucene zone)
     * Mailing List Index?
-  * Better handling of arbitrary XML charsets: see [http://www.nabble.com/double-curl-calls-in-post.sh--tf2287469.html#a6369448 1] and [http://www.nabble.com/wana-use-CJKAnalyzer-tf2303256.html#a6451918 2]
  
- == Ideas for the future ==
+ == Big Ideas for The Future ==
+ 
+ 
+ This section should be used for ideas that are more involved and may require major changes to the Solr codebase, and definitely should involve a lot of discussion among developers about the appropriate way to tackle them...
+ 
    * Alternate replication strategy that can work on Windows?
     * NTFS w/ WinXP or later does support hard links for files (and cygwin "ln" works for files).  The current replication scripts could be ported to cygwin.
    * Support for IndexPartitioning within a single solr webapp instance
    * A more powerful query language allowing one to express complicated logic without resorting to a custom Java query handler plugin.
    * Make use of [http://jakarta.apache.org/hivemind/ HiveMind] or Spring for configuration & dependency injection
-   * [JSON DONE] Alternate output formats (JSON?)
-   * utilize Lucene's new field selector / lazy field loading mechanism to speed up requests that select only a few stored fields out of many.  Beware interaction with the DocCache... it may need to be modified or bypassed.
+   * utilize Lucene's new field selector / lazy field loading mechanism to speed up requests that select only a few stored fields out of many.  Beware interaction with the DocCache... it may need to be modified or bypassed. (see also [http://issues.apache.org/jira/browse/SOLR-52 SOLR-52])
    * Implement some ideas for ComplexFacetingBrainstorming
    * Implement some ideas to MakeSolrMoreSelfService
    * support for an optional "timestamp" style field in schemas which always want every doc to include the datetime the document was added to the index.  This might be a special case (like the uniqueKey field) or it could involve more general "default" support for fields and fieldtypes (ie: `<field>` and `<fieldtype>` declarations could include a `default="..."` attribute that gets put into any document that doesn't already have a value for that field, with the underlying !FieldType parsing the text each time it's used, so that the !DateField class can convert `default="now"` to the current time).
@@ -107, +67 @@

    * A "user query" parameter for standard request handler, much like what dismax handler has, for unstructured queries entered from a search box
    * refactor and separate update XML parsing from update handling... possibly implement support for JSON updates.
    * refactor all of the JSP pages into servlets so a JDK/JSP compiler isn't needed (the current JSPs are very sparse on presentation, and use no custom tags, so there is almost no advantage to them being JSPs)
+  * Better handling of arbitrary XML charsets: see [http://www.nabble.com/double-curl-calls-in-post.sh--tf2287469.html#a6369448 1] and [http://www.nabble.com/wana-use-CJKAnalyzer-tf2303256.html#a6451918 2]