Posted to solr-user@lucene.apache.org by Raymond Xie <xi...@gmail.com> on 2018/04/06 11:07:22 UTC

Urgent! How to retrieve the whole message in the Solr search result?

I am using Solr for the following search need:

raw data: in FIX format. It's OK if you don't know what that is; treat it as
CSV with a special delimiter.

parsed data: derived from the raw data. Every message is converted to JSON
in one common format, covering all 100+ fields.

Example:

Raw data (the delimiter is \u0001, shown here as a space):

8=FIX.4.4 9=653 35=RIO 1=TEST 11=337912000000002 38=1 44=2.0 39=A 40=2
49=VIPER 50=JPNIK01 54=1 55=JNI253D8.OS 56=XSVC 59=0 75=20180350 100=XOSE
10039=viperooe 10241=viperooe 150=A 372=D 122=20180320-08:08:35.038
10066=20180320-08:08:35.038 10436=20180320-08:08:35.038 202=25375.0
52=20180320-08:08:35.088 60=20180320-08:08:35.088
10071=20180320-08:08:35.088 11210=337912000000002 37=337912000000002
10184=337912000000002 201=1 29=4 10438=RIO.4.5 10005=178 10515=178
10518=178 581=13 660=102 1133=G 528=P 10104=Y 10202=APMKTMAKING
10208=APAC.VIPER.OOE 10217=Y 10292=115 11032=-1 382=0 10537=XOSE 15=JPY
167=OPT 48=179492540 455=179492540 22=101 456=101 151=1.0 421=JPN 10=200

Parsed data, in JSON:

{"122": "20180320-08:08:35.038", "49": "VIPER", "382": "0", "151": "1.0",
"9": "653", "10071": "20180320-08:08:35.088", "15": "JPY", "56": "XSVC",
"54": "1", "10202": "APMKTMAKING", "10537": "XOSE", "10217": "Y", "48":
"179492540", "201": "1", "40": "2", "8": "FIX.4.4", "167": "OPT", "421":
"JPN", "10292": "115", "10184": "337912000000002", "456": "101", "11210":
"337912000000002", "1133": "G", "10515": "178", "10": "200", "11032": "-1",
"10436": "20180320-08:08:35.038", "10518": "178", "11":
"337912000000002", *"75":
"20180320"*, "10005": "178", "10104": "Y", "35": "RIO", "10208":
"APAC.VIPER.OOE", "59": "0", "60": "20180320-08:08:35.088", "528": "P",
"581": "13", "1": "TEST", "202": "25375.0", "455": "179492540", "55":
"JNI253D8.OS", "100": "XOSE", "52": "20180320-08:08:35.088", "10241":
"viperooe", "150": "A", "10039": "viperooe", "39": "A", "10438": "RIO.4.5",
"38": "1", *"37": "337912000000002"*, "372": "D", "660": "102", "44":
"2.0", "10066": "20180320-08:08:35.038", "29": "4", "50": "JPNIK01", "22":
"101"}
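
For readers unfamiliar with FIX, the raw-to-parsed step above can be sketched roughly as follows. This is only an illustration, assuming each field is a tag=value pair separated by the SOH character (0x01); the function name parse_fix is hypothetical, and repeated tags (repeating groups) are not handled.

```python
SOH = "\x01"  # FIX field delimiter (ASCII 0x01)


def parse_fix(raw: str, delimiter: str = SOH) -> dict:
    """Split a raw FIX message into a {tag: value} dict of strings.

    Assumption: every tag occurs once. A repeated tag would overwrite
    the earlier value, so repeating groups need extra handling.
    """
    fields = {}
    for pair in raw.split(delimiter):
        if not pair:
            continue  # skip the empty piece after a trailing delimiter
        tag, _, value = pair.partition("=")
        fields[tag] = value
    return fields


# A shortened version of the sample message, for illustration only.
raw = SOH.join(["8=FIX.4.4", "35=RIO", "37=337912000000002", "75=20180320"]) + SOH
parsed = parse_fix(raw)
print(parsed["37"], parsed["75"])
```

All values stay strings here, which matches the parsed JSON shown above.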

The fields used for searching are order_id (tag 37) and trd_date (tag 75). I
will create the schema with these two fields added to it:

<field name="37" type="text_general" indexed="true" stored="false"
multiValued="true"/>
<field name="75" type="text_general" indexed="true" stored="false"
multiValued="true"/>

At the moment I can get the result with:
http://192.168.112.141:8983/solr/fix_messages/select?q=37:337912000000002
where 37 is the order_id field and 337912000000002 is the value to search
for in field "37".


The result I get is:

{
  "responseHeader":{
    "status":0,
    "QTime":6,
    "params":{
      "q":"37:337912000000002"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "122":["20180320-08:08:35.038"],
        "49":["VIPER"],
        "382":[0],
        "151":[1.0],
        "9":[653],
        "10071":["20180320-08:08:35.088"],
        "15":["JPY"],
        "56":["XSVC"],
        "54":[1],
        "10202":["APMKTMAKING"],

........

I need the result to be shown as follows:

1. the order_id: the term of "order_id" must be displayed instead of its
actual tag 37;
2. the trd_date: the term of "trd_date" must be displayed in the result;
3. the whole message: the whole and raw message must be displayed in the
result;
4. the two fields of order_id and trd_date must be highlighted.

Can anyone tell me how to do this? Thank you very much in advance.

*------------------------------------------------*
*Sincerely yours,*


*Raymond*

Re: Urgent! How to retrieve the whole message in the Solr search result?

Posted by Stefan Matheis <st...@mathe.is>.
Raymond,

please don't get this the wrong way - it's entirely possible that you
didn't mean it like it sounded to me ...

"Urgent" is a term you can use with your colleagues or a supplier of yours
... but not on a mailing list like this. We're doing this in our spare time,
trying to help out others working in the same area.

Pointers regarding your questions:

> 1. the order_id: the term of "order_id"
> must be displayed instead of its
> actual tag 37;
> 2. the trd_date: the term of "trd_date"
> must be displayed in the result;

Then you either have to index it like that or apply such a mapping at
runtime. Solr returns whatever data you stored when you indexed it.

You are responsible for mapping the source fields to the names you want;
Solr doesn't care about that. It's just field names and their values ...
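
A minimal sketch of such a mapping, assuming the parsed message is a plain dict keyed by FIX tag. The TAG_NAMES table and the rename_tags helper are hypothetical names, not anything Solr provides. The same function could be applied before indexing or to each returned document at runtime.

```python
# Hypothetical mapping from FIX tag numbers to readable field names.
TAG_NAMES = {"37": "order_id", "75": "trd_date"}


def rename_tags(fields: dict, names: dict = TAG_NAMES) -> dict:
    """Return a copy of `fields` with known tags renamed.

    Keys that have no entry in `names` are kept unchanged.
    """
    return {names.get(tag, tag): value for tag, value in fields.items()}


parsed = {"37": "337912000000002", "75": "20180320", "35": "RIO"}
renamed = rename_tags(parsed)
print(renamed)
```

If you rename at index time, the schema needs fields with those names instead of "37" and "75".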

> 3. the whole message: the whole and
> raw message must be displayed in the
> result;

Again, you're responsible for it. If you need it, introduce another field
in your schema where you store the original message in whatever format you
like.
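
As a rough sketch, assuming a hypothetical field called raw_message that is declared with stored="true" in the schema, the document sent to Solr could carry the original message next to the searchable fields. The helper name build_solr_doc is made up for illustration; the payload is only built here, not sent.

```python
import json


def build_solr_doc(raw: str, fields: dict) -> dict:
    """Combine the searchable fields with the untouched raw message.

    Assumption: "raw_message" is a hypothetical stored field in the
    schema; Solr only returns fields that are stored.
    """
    doc = dict(fields)          # copy, so the caller's dict is not modified
    doc["raw_message"] = raw    # original FIX text, kept verbatim
    return doc


raw = "8=FIX.4.4\x0137=337912000000002\x0175=20180320\x01"
doc = build_solr_doc(raw, {"order_id": "337912000000002", "trd_date": "20180320"})

# Solr's JSON update handler accepts a JSON array of documents.
payload = json.dumps([doc])
print(payload)
```

json.dumps escapes the SOH delimiter as \u0001, so the raw text survives the round trip through JSON.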

> 4. the two fields of order_id and
> trd_date must be highlighted.

In Solr, "highlighting" typically means marking the part(s) of one or more
words that matched the term(s) used for querying ... but given your sample,
which is just a simple filter, I don't think we're talking about the same
thing ... do we?

What is it you're thinking about when you say those fields need to be
highlighted?
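
In case Solr's own highlighting is what you mean, here is a sketch of such a request. It assumes the fields were indexed under the names order_id and trd_date and are stored, since Solr can only highlight stored fields. The helper name highlight_query_url is hypothetical; hl and hl.fl are the standard highlighting parameters.

```python
from urllib.parse import urlencode


def highlight_query_url(base_url: str, field: str, value: str, hl_fields: list) -> str:
    """Build a /select URL that asks Solr for highlight snippets."""
    params = {
        "q": f"{field}:{value}",
        "hl": "true",                  # turn highlighting on
        "hl.fl": ",".join(hl_fields),  # fields to produce snippets for
    }
    return f"{base_url}/select?{urlencode(params)}"


url = highlight_query_url(
    "http://192.168.112.141:8983/solr/fix_messages",
    "order_id",
    "337912000000002",
    ["order_id", "trd_date"],
)
print(url)
```

The snippets come back in a separate "highlighting" section of the response, keyed by each document's unique key, not inside the docs themselves.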

HTH,
Stefan
