Posted to dev@devicemap.apache.org by "eberhard speer jr." <se...@ducis.net> on 2013/06/25 16:46:37 UTC

DeviceMapClient - results

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

OK, results...

So, I used the test data set :

https://svn.apache.org/repos/asf/incubator/devicemap/trunk/openddr/test-data/src/main/resources/test-data/dmap_20130522.txt

leaving aside the desktop issue and the other stuff like bots and
plain junk strings...

When everything was set up it *flew* thru the +47k ua-strings in 35
seconds !

The result data can be found here :

http://www.ducis.net/static/result_20130625.zip

it is a pipe-separated file with header :

Parser : time taken in ms
DMap : DeviceMapClient claimed device
UserAgent : useragent string
OpenDdr : 'Old' openddr claimed device

The best thing to do is to import the lot into a database...

then weed out the records WHERE DMap = 'unknown' : these are devices which
no longer occur in the current XML resources *OR* cases where
DeviceMapClient.classify returned 'Nothing'

This leaves 17,919 records to compare.
Of these 5,042 (28%) match, i.e. : both DeviceMapClient and the 'old'
OpenDDR agree on the DeviceId
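
For anyone who prefers a script over a database import, the filter and comparison described above can be sketched roughly as follows. This is only a sketch under assumptions: it supposes the header row literally reads Parser|DMap|UserAgent|OpenDdr, that no user-agent string contains a pipe itself, and that weeded-out records carry the literal value 'unknown'. The function name match_rate is made up for illustration.

```python
import csv

def match_rate(lines):
    """Count compared and matching records in the pipe-separated result data.

    `lines` is any iterable of text lines, e.g. an open file.
    Assumed (not confirmed) header: Parser|DMap|UserAgent|OpenDdr.
    """
    # QUOTE_NONE: user-agent strings may contain quote characters.
    reader = csv.DictReader(lines, delimiter="|", quoting=csv.QUOTE_NONE)
    compared = matched = 0
    for row in reader:
        dmap = (row.get("DMap") or "").strip()
        if dmap == "unknown":
            continue  # weed out, as in WHERE DMap = 'unknown'
        compared += 1
        if dmap == (row.get("OpenDdr") or "").strip():
            matched += 1
    return compared, matched

# Tiny made-up sample, only to show the expected input shape.
sample = [
    "Parser|DMap|UserAgent|OpenDdr",
    "1|SGH-i917|ua one|SGH-i917",
    "2|unknown|ua two|Nokia7370",
    "1|genericPhone|ua three|SGH-i917",
]
compared, matched = match_rate(sample)
```

With the real file one would pass open(path, newline="") instead of the sample list; 5,042 matches out of 17,919 compared records works out to roughly 28%.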

I picked out a few strings and ran them in the simple console app to
double-check and the results were identical.

Time to bring back some regex I fear :-(

esjr
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (MingW32)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJRya1NAAoJEOxywXcFLKYc1WAH/2t7eJE4r4kbH8gBYYVv9UWj
HvOzHARdv3K5iAVsKKsSgrFIP/0Rqp49INqieE79bLwrwfE8TCVgieh4LhIFa7gl
ZtihVthNrD+dWcFW6iitUL9JIS57lfe5sXow4PxIhs+2nyHTT0kjABAbWSt4pQYV
lZwU5eGQLYHwGv1tZfm7ceonm49j8HV7zXrz54IQ0R77FZXUQKMfoLYv/w7fB76R
5E/BN41Ei9XI1XkfPowlJ7L99k320T4C2z+eOIn80yDsrnhegW1+kOxljXbL7jFf
YefSkayF/Ss6/IkzMNBNJxXt33S+l4FPAit8zocjn0bKl6IPSXdAfOud9Sb7K0U=
=awDC
-----END PGP SIGNATURE-----

Re: DeviceMapClient - results

Posted by Reza <re...@yahoo.com>.
So I noticed a few things. First, I think there may be something wrong with your client. For example, when I run

'Browser Mozilla/4.0 (compatible; MSIE 7.0; Windows Phone OS 7.0; Trident/3.1; IEMobile/7.0; SAMSUNG; SGH-i917)'


Thru the java client, I get:

2013-06-25 14:56:42,120 [dmapjclient] classify: Browser Mozilla/4.0 (compatible; MSIE 7.0; Windows Phone OS 7.0; Trident/3.1; IEMobile/7.0; SAMSUNG; SGH-i917)'
2013-06-25 14:56:42,121 [dmapjclient] Hit candidate: samsungsgh => genericPhone
2013-06-25 14:56:42,130 [dmapjclient] Hit candidate: mozilla40compatible => desktopDevice
2013-06-25 14:56:42,131 [dmapjclient] Hit candidate: i917 => SGH-i917
Classify result: 'SGH-i917'

In your results, you get 'Sprint M370'.

Also, what DDR data are you using? I'm using OpenDDR 1.18. So for these 2 user agents:

lg-t300 UNTRUSTED/1.0

Nokia7370


There are no patterns for them. Your .NET client found patterns, so I'm guessing we are using different DDR data.

One of the main issues I found with my client implementation is that it only supports 1 pattern per device. A lot of devices have multiple patterns, and only 1 pattern is being considered. I plan on fixing this by allowing a device to have multiple patterns. I'm not sure how we are going to keep our changes in sync, since I think we have some divergence in our algorithms. I may just run your test set thru the java client and manually compare.
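
To make the multiple-patterns idea concrete, here is a rough sketch in Python. It is not the java or .NET client code. It assumes user agents and patterns are normalized by lowercasing and dropping everything except letters and digits (which is what the 'Hit candidate' lines above suggest), and it only collects candidates without guessing how the final winner is picked. The names normalize, build_index and hit_candidates, and the second 'SGH-i917' pattern, are made up for illustration.

```python
import re

def normalize(text):
    # Assumption: lowercase and drop spaces/symbols, e.g.
    # "Mozilla/4.0 (compatible" -> "mozilla40compatible".
    return re.sub(r"[^a-z0-9]", "", text.lower())

def build_index(devices):
    """Map every normalized pattern to its device id.

    `devices` maps a device id to a LIST of patterns, so a device
    is no longer limited to a single pattern.
    """
    index = {}
    for device_id, patterns in devices.items():
        for pattern in patterns:
            index[normalize(pattern)] = device_id
    return index

def hit_candidates(user_agent, index):
    # Return (pattern, device id) for every pattern found in the
    # normalized user agent. Choosing the winner is left out on purpose.
    ua = normalize(user_agent)
    return [(p, d) for p, d in index.items() if p in ua]

# Patterns taken from the log above, plus one hypothetical extra
# pattern ("SGH-i917") to show two patterns on one device.
devices = {
    "SGH-i917": ["i917", "SGH-i917"],
    "genericPhone": ["samsungsgh"],
    "desktopDevice": ["mozilla40compatible"],
}
index = build_index(devices)
ua = ("Mozilla/4.0 (compatible; MSIE 7.0; Windows Phone OS 7.0; "
      "Trident/3.1; IEMobile/7.0; SAMSUNG; SGH-i917)")
hits = hit_candidates(ua, index)
```

Keeping the index keyed by pattern means adding a second or third pattern to a device is just another entry, with no change to the lookup itself.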

When we get these 3 issues straightened out, we should see a lot more parity between the algorithms. This is even with the new algorithm ignoring spaces/symbols/regex, which I plan on addressing shortly too.

