Posted to dev@devicemap.apache.org by "eberhard speer jr." <se...@ducis.net> on 2013/06/23 17:12:02 UTC

device map java client - .Net version

Hi,

I managed to get a .Net version of Reza's Java client up and running;
simple enough...

Nice!

Using the same user-agent strings for 'testing', I obtain a similar
average of around 1 ms:
slowest : HTC Aria : 6.5651 ms
fastest : iPhone   : 0.1801 ms

Nice indeed!
Next: a much larger test set to see if the device IDs match the
ones returned by the 'old' version...
I had thought about a 'parser' along the lines of Reza's current Java
client but shelved it, thinking it might return the wrong device ID in
some cases. Well, I guess now we're going to find out...

I'll use this test data:

https://svn.apache.org/repos/asf/incubator/devicemap/trunk/openddr/test-data/src/main/resources/test-data/dmap_20130522.txt


esjr

Re: device map java client - .Net version

Posted by Reza <re...@yahoo.com>.
So I did some more tests. I had to widen the set of pattern separator chars from just space to: " -_/\". Just space was a bit too naive. I also increased the word group threshold from 3 to 4. Eberhard, can you make this change before doing your tests?

http://svn.apache.org/viewvc/incubator/devicemap/trunk/devicemapjava/src/main/java/org/apache/devicemap/client/DeviceMapClient.java?view=markup

Line 72, 76
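
To illustrate the effect of the wider separator set, here is a minimal sketch of a tokenizer that splits on a configurable set of characters. The class name, method name, and sample string are made up for illustration and are not taken from DeviceMapClient.java; the real code may tokenize differently.

```java
import java.util.ArrayList;
import java.util.List;

public class SeparatorDemo {
    // Hypothetical helper: splits 'input' on any character contained in
    // 'separators' and drops empty tokens. Not the actual client code.
    public static List<String> split(String input, String separators) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : input.toCharArray()) {
            if (separators.indexOf(c) >= 0) {
                // Separator reached: close the current token, if any.
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String sample = "Foo-Bar_1/2 baz"; // made-up input
        System.out.println(split(sample, " "));      // space only
        System.out.println(split(sample, " -_/\\")); // space, -, _, /, \
    }
}
```

With space as the only separator the sample yields two tokens; with the wider set it yields five, so shorter patterns get a chance to match on their own.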



Re: device map java client - .Net version

Posted by Reza <re...@yahoo.com>.
Nice, good to hear. I tried to keep it as simple as possible, nothing too exotic :]

Another thing I want to do is make sure the algorithm is accurate. For example, I just checked in a fix to always choose the longest matching pattern:

http://svn.apache.org/viewvc/incubator/devicemap/trunk/devicemapjava/src/main/java/org/apache/devicemap/client/DeviceMapClient.java?view=markup

Line 105

So let me know if you see any bad classifications.

If this algorithm is suitable, it should be simple to port it over to other languages.

Just to explain how it works: all patterns are stripped of regex syntax and normalized into pure lowercase alphanumerics. Example:

DROID.?BIONIC.?4G => droidbionic4g
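
A minimal sketch of this normalization step, assuming it amounts to lowercasing and dropping every character that is not a letter or digit. The class and method names are hypothetical, and the actual client may strip regex syntax differently (for example, this naive version would turn an escape such as \d into a literal "d").

```java
import java.util.Locale;

public class PatternNormalizer {
    // Hypothetical helper: lowercase the text, then remove everything
    // except a-z and 0-9. An assumption, not the actual client code.
    public static String normalize(String text) {
        return text.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9]", "");
    }

    public static void main(String[] args) {
        // Example pattern from the message above.
        System.out.println(normalize("DROID.?BIONIC.?4G")); // droidbionic4g
    }
}
```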


The same treatment is given to the input string. All possible single, double, and triple word combinations (consecutive words only) are passed through the pattern index. Right now spaces are used as the default token separator. Example:

This (1234.5 Agent) Test =>
 this
 this12345
 this12345agent
 12345
 12345agent
 12345agenttest
 agent
 agenttest
 test
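
The list above can be reproduced with a short sketch: split the input on spaces, normalize each word, then emit every run of one to three consecutive words. The class name, method names, and the maxWords parameter are assumptions for illustration; this is not the code in DeviceMapClient.java.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class WordGroups {
    // Same assumed normalization as for patterns: lowercase alphanumerics only.
    static String normalize(String text) {
        return text.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9]", "");
    }

    // Hypothetical helper: returns all groups of 1..maxWords consecutive
    // normalized words, in order of starting position.
    public static List<String> candidates(String input, int maxWords) {
        List<String> words = new ArrayList<>();
        for (String raw : input.split(" ")) {
            String word = normalize(raw);
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        List<String> groups = new ArrayList<>();
        for (int start = 0; start < words.size(); start++) {
            StringBuilder group = new StringBuilder();
            for (int len = 0; len < maxWords && start + len < words.size(); len++) {
                group.append(words.get(start + len));
                groups.add(group.toString());
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        // Input from the example above; prints one candidate per line.
        for (String candidate : candidates("This (1234.5 Agent) Test", 3)) {
            System.out.println(candidate);
        }
    }
}
```

Each candidate would then be looked up in the pattern index; how that index is built and queried is outside this sketch.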

Then some simple rules are used to choose the best match and filter out false positives (incomplete TwoStepDeviceBuilder patterns). I think this approach will be pretty accurate. If not, adjustments can be made. So let me know.
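
As a rough sketch of the "longest pattern wins" rule mentioned earlier, the following picks the longest candidate that appears in a pattern index. The index layout (normalized pattern mapped to device ID), the names, and the device IDs are all invented for illustration; the false-positive filtering for TwoStepDeviceBuilder patterns is not modeled here.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LongestMatch {
    // Hypothetical helper: returns the device ID for the longest candidate
    // present in the index, or null when nothing matches.
    public static String best(List<String> candidates, Map<String, String> index) {
        String bestPattern = null;
        for (String candidate : candidates) {
            boolean longer = bestPattern == null || candidate.length() > bestPattern.length();
            if (index.containsKey(candidate) && longer) {
                bestPattern = candidate;
            }
        }
        return bestPattern == null ? null : index.get(bestPattern);
    }

    public static void main(String[] args) {
        Map<String, String> index = new HashMap<>();
        index.put("droid", "device-a");          // made-up device IDs
        index.put("droidbionic4g", "device-b");
        List<String> candidates = Arrays.asList("droid", "droidbionic4g", "4g");
        System.out.println(best(candidates, index)); // device-b
    }
}
```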

If we want to use this algorithm across our clients, then we should adjust our pattern data to better suit this so we don't get conflicts down the road.

So keep me posted on any sort of accuracy results.

