Posted to dev@devicemap.apache.org by "eberhard speer jr." <se...@ducis.net> on 2014/07/08 10:38:29 UTC

Test Data

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

I do not know whether DeviceMap should 'release' test data the way it
does official releases. Maintaining a collection of 'known' test
data-sets is *vital*. These sets could then be tagged not so much
with a 'version' as with a 'description'.

A 'good' data-set, like a good antique -- except here things can't be
new enough -- has 'provenance' : geographical spread of IPs [APNIC,
LACNIC, ARIN,...], an idea of the 'market' segment and a time-frame.
So, I renew my plea : *please* send user-agent strings [web-access logs] !

I have developed a system to extract this info from web-access logs
and retain the pertinent data without violating anyone's privacy [of
course, if a vendor tags the UA string with a 'finger-print', you
could argue it uniquely identifies 'someone'].
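
As a rough illustration of the idea (a minimal sketch only, not the system mentioned above): assuming the logs are in the common Apache/NCSA "combined" format, the user-agent field can be pulled out and counted while the client IP, user name, timestamp, request and referer are never stored. The function name extract_user_agents is made up for this example.

```python
import re
from collections import Counter

# Assumed input: Apache/NCSA "combined" log format, i.e.
#   host ident user [time] "request" status bytes "referer" "user-agent"
# Only the last quoted field is captured; everything else is matched and dropped.
_QUOTED = r'"(?:[^"\\]|\\.)*"'
_COMBINED = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] ' + _QUOTED + r' \d{3} \S+ ' + _QUOTED +
    r' "(?P<ua>(?:[^"\\]|\\.)*)"'
)


def extract_user_agents(lines):
    """Return a Counter of user-agent strings found in the given log lines.

    Lines that do not match the assumed format, and empty or "-" user-agents,
    are skipped. No IP address, user name, timestamp or URL is retained.
    """
    counts = Counter()
    for line in lines:
        match = _COMBINED.match(line)
        if not match:
            continue
        ua = match.group("ua").strip()
        if ua and ua != "-":
            counts[ua] += 1
    return counts
```

Writing out only the keys of the returned counter, one per line, gives a plain list of ua-strings. This sketch does nothing about the 'provenance' side (registry, market segment, time-frame), and a UA string carrying a vendor 'finger-print' would still pass through unchanged.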

For this release I did include a small file of user-agents for
testing [ua_strings.txt]. It is part of the DeviceMap Console
'demo'/client. It is made so that if you set up the package 'as is'
per the instructions, you can start DeviceMapConsole.exe, and it will
load the XML resources from URL and then run through the supplied
test file of some 10,000 ua-strings as a demo.

One 'test' data-set I think we *must* have is one with at least one
UA-string per device/pattern in the DDR, which I think is currently
not the case.
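
A minimal sketch of how such a coverage check could look, under stated assumptions: device_ids is the list of device ids taken from the DDR, and classify is a stand-in for whatever client does the matching (it takes a ua-string and returns a device id, or None when nothing matches). Both names are hypothetical and not part of any DeviceMap API.

```python
def uncovered_devices(device_ids, ua_strings, classify):
    """Return the sorted device ids that no test ua-string resolves to.

    classify is a caller-supplied function (hypothetical here):
    ua-string -> device id, or None if the string is not recognised.
    """
    seen = set()
    for ua in ua_strings:
        device_id = classify(ua)
        if device_id is not None:
            seen.add(device_id)
    # Anything in the DDR list that was never hit has no test string yet.
    return sorted(set(device_ids) - seen)
```

An empty result would mean the test file has at least one ua-string per device; anything returned is a device still waiting for a sample.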

Oh, did I mention : *please* send user-agent strings !

esjr
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (MingW32)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJTu64FAAoJEOxywXcFLKYcGWUH/198XDWUGEAId0l9TvorXb4e
4UdUm8HzfyMl9mP0xNaXY3Bfq6JAHh33gFGgklUbEKdHQMyZlmrV+cXESfhkCc31
ZtjCdysDAzov/u/HcC5YTKFs+8fr1RDrqRIQrA7tGqhuxd/BhkJXsOboP04SKWGT
L6+Z7GltuFKHxr0fTC2pvpWGusr8lN8dHsPhdTpFfGTKajGcAO6e2xK14mh1D78p
maPTMVwRfDjJicDvQHBuGcbxdI/v++NPAGcI3NdpvSUmg14x8q2Sk1KoeoZbjmLo
0laZGkHlWHbAPGdneR5MRY6sO43n9BOzmXOxIFM+HfmXvG6eUiH3Dhi6bKjjreE=
=fa1n
-----END PGP SIGNATURE-----

Re: Test Data

Posted by Werner Keil <we...@gmail.com>.
Hi,

Thanks for the input.


On Tue, Jul 8, 2014 at 10:38 AM, eberhard speer jr. <se...@ducis.net>
wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi,
>
> I do not know whether DeviceMap should 'release' test data the way it
> does official releases. Maintaining a collection of 'known' test
> data-sets is *vital*. These sets could then be tagged not so much
> with a 'version' as with a 'description'.
>
>

We may find a different tag for those, but the general pattern of artifact
naming should be the same as for the "real" data. (Bertrand started with an
OSGi-like name; Reza has since proposed changing them to something like
"devicemap-data", which seems more common for most Apache projects.)


> A 'good' data-set, like a good antique -- except here things can't be
> new enough -- has 'provenance' : geographical spread of IPs [APNIC,
> LACNIC, ARIN,...], an idea of the 'market' segment and a time-frame.
> So, I renew my plea : *please* send user-agent strings [web-access logs] !
>
> I have developed a system to extract this info from web-access logs
> and retain the pertinent data without violating anyone's privacy [of
> course, if a vendor tags the UA string with a 'finger-print', you
> could argue it uniquely identifies 'someone'].
>
> For this release I did include a small file of user-agents for
> testing [ua_strings.txt]. It is part of the DeviceMap Console
> 'demo'/client. It is made so that if you set up the package 'as is'
> per the instructions, you can start DeviceMapConsole.exe, and it will
> load the XML resources from URL and then run through the supplied
> test file of some 10,000 ua-strings as a demo.
>
> One 'test' data-set I think we *must* have is one with at least one
> UA-string per device/pattern in the DDR, which I think is currently
> not the case.
>
> Oh, did I mention : *please* send user-agent strings !
>
>
Send them where, JIRA? We don't have a user-agent collecting service at the
moment, or did you just set one up?


Werner


> esjr