Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2018/08/18 17:39:26 UTC

[GitHub] RestfulBlue opened a new issue #6189: Lucene indexing for free form text

RestfulBlue opened a new issue #6189: Lucene indexing for free form text
URL: https://github.com/apache/incubator-druid/issues/6189
 
 
   Currently Druid uses classic inverted (bitmap) indexes to index string columns, but these are not very useful for free-form text. It is already possible to disable indexing so that such columns carry no index overhead, but it would be very useful to be able to enable full-text search on them.
   For example, the configuration could look like this:
   ```json
       {
         "type": "string",
         "name": "additional_info",
         "indexType": "unindexed" // without bitmap
       },
       {
         "type": "string",
         "name": "hostname",
         "indexType": "default" // current inverted index
       },
       {
         "type": "string",
         "name": "log_record",
         "indexType": "lucene" // lucene indexing
       }
   ```
   
   With this option, Druid could be used to store almost everything related to monitoring and log data, making it possible to get fast results for queries like this:
   ```sql
   select
      time_floor(__time, 'PT1H'), count(*)
   from
      system_logs
   where
      log_record satisfy '*something*'
      and hostname = 'node1'
   group by
      time_floor(__time, 'PT1H')
   order by
      time_floor(__time, 'PT1H')
   ```
   
   Here `satisfy` would apply a Lucene filter such as `log_record:*errormessage*`.
   
   Adding full-text search would make Druid a universal tool for monitoring and logging across different systems. Currently, filtering by free-form text requires an almost full scan, which does not perform well, so such data has to be stored in Solr or Elasticsearch instead.
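
To illustrate the difference the request is about, here is a minimal Python sketch, not Druid or Lucene code. It contrasts an index keyed on whole column values (roughly what a classic inverted index on a string column offers) with an index keyed on individual tokens. The sample rows, the whitespace tokenizer, and the helper names `build_value_index`, `build_token_index`, and `wildcard_lookup` are all assumptions made up for this example.

```python
from collections import defaultdict
from fnmatch import fnmatchcase

# Hypothetical log_record values; row ids are their list positions.
rows = [
    "connection timeout while contacting node1",
    "disk quota exceeded on /var/log",
    "connection reset by peer",
]


def build_value_index(values):
    """Map each complete column value to the set of row ids holding it."""
    index = defaultdict(set)
    for row_id, value in enumerate(values):
        index[value].add(row_id)
    return index


def build_token_index(values):
    """Map each lowercased, whitespace-separated token to its row ids."""
    index = defaultdict(set)
    for row_id, value in enumerate(values):
        for token in value.lower().split():
            index[token].add(row_id)
    return index


def wildcard_lookup(index, pattern):
    """Union the row ids of every indexed term that matches a glob pattern."""
    matches = set()
    for term, row_ids in index.items():
        if fnmatchcase(term, pattern):
            matches |= row_ids
    return matches


value_index = build_value_index(rows)
token_index = build_token_index(rows)

# A whole-value index only answers equality on the full string,
# so looking up a single word finds nothing.
value_hits = value_index.get("connection", set())

# A token index answers the same lookup directly.
token_hits = token_index.get("connection", set())

# A wildcard is resolved against the term dictionary, not against every row.
wildcard_hits = wildcard_lookup(token_index, "*time*")
```

With free-form text nearly every row holds a distinct value, so the whole-value dictionary grows about as large as the column itself, and a substring filter over it is close to a full scan. The token dictionary is usually far smaller, which is why a text index can answer such filters faster. A real Lucene index adds analyzers, scoring, and more efficient term structures that this sketch leaves out.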
