Posted to dev@opennlp.apache.org by Damiano Porta <da...@gmail.com> on 2017/03/03 16:03:14 UTC

BUG in NameSample

Hello everybody,

I think I found a bug in NameSample. This is the use case:

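import opennlp.tools.namefind.NameSample;
import opennlp.tools.util.Span;

String[] tokens = new String[] {
    "0",
    "1",
    "2",
    "3",
    "4",
    ",",
    "6",
    "7",
    "8"
};

// Spans are deliberately NOT in start-offset order.
Span[] spans = new Span[] {
    new Span(7, 8, "zipcode"),
    new Span(1, 7, "address"),
};

NameSample n = new NameSample(tokens, spans, true);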

Then I call:

System.out.println(n.toString());

*0 <START:address> 1 2 3 4 , 6 <START:zipcode> <END> 7 <END> 8*    // WRONG

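As you can see, the tags come out in the wrong order: the <START:zipcode> tag
is printed before the <END> tag of "address". I passed the spans unsorted,
with the "zipcode" entity first (7-8) and then "address" (1-7).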

If I sort the spans by start offset, I get the correct output:

*0 <START:address> 1 2 3 4 , 6 <END> <START:zipcode> 7 <END> 8* // CORRECT
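
To make the ordering effect easier to reproduce without OpenNLP on the classpath, here is a minimal, self-contained sketch. It is not the library's code: SpanOrderDemo, SimpleSpan, render and sortedByStart are hypothetical names, and render only assumes that the tags are emitted token by token while visiting the spans in the order they are stored. Under that assumption it produces both outputs shown above.

```java
import java.util.Arrays;
import java.util.Comparator;

public class SpanOrderDemo {

    // Hypothetical stand-in for opennlp.tools.util.Span:
    // token offsets [start, end) plus an entity type.
    static final class SimpleSpan {
        final int start;
        final int end;
        final String type;

        SimpleSpan(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    // Assumed rendering: before each token, visit the spans in array order
    // and emit a START tag, then an END tag, for every span that matches.
    static String render(String[] tokens, SimpleSpan[] spans) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i <= tokens.length; i++) {
            for (SimpleSpan s : spans) {
                if (s.start == i) {
                    sb.append("<START:").append(s.type).append("> ");
                }
                if (s.end == i) {
                    sb.append("<END> ");
                }
            }
            if (i < tokens.length) {
                sb.append(tokens[i]).append(' ');
            }
        }
        return sb.toString().trim();
    }

    // Workaround: sort a copy of the spans by start offset before rendering.
    static SimpleSpan[] sortedByStart(SimpleSpan[] spans) {
        SimpleSpan[] copy = spans.clone();
        Arrays.sort(copy, Comparator.comparingInt((SimpleSpan s) -> s.start));
        return copy;
    }

    public static void main(String[] args) {
        String[] tokens = {"0", "1", "2", "3", "4", ",", "6", "7", "8"};
        SimpleSpan[] spans = {
            new SimpleSpan(7, 8, "zipcode"),
            new SimpleSpan(1, 7, "address"),
        };
        System.out.println(render(tokens, spans));                // unsorted
        System.out.println(render(tokens, sortedByStart(spans))); // sorted
    }
}
```

With the real classes, Span implements Comparable<Span>, so calling Arrays.sort(spans) before constructing the NameSample should work as the same workaround; whether NameSample ought to sort the spans itself is the open question here.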

I am using OpenNLP 1.7.0, but I see that 1.7.2 has the same toString() method.

Can you confirm this?
Regards

Damiano