Posted to dev@devicemap.apache.org by Konstantin Papkovskiy <ko...@papkovskiy.com> on 2015/03/28 16:10:35 UTC

DM effectiveness

Hello all,

I tested how effective DeviceMap is at user device detection. I ran tests on a
dataset of 1M user agents (mostly mobile). Here are my results.

Top 10 devices

Device ID        # of detections   % of all
genericAndroid            277842     27.78%
iPad                       64881      6.49%
iPhone                     46803      4.68%
NokiaN8-00                 34517      3.45%
GT-I9300                   24215      2.42%
unknown                    18023      1.80%
GT-I9100                   15950      1.60%
GT-P3100                   14758      1.48%
Nokia5800d                 14004      1.40%
GT-P5100                   12332      1.23%


DeviceMap data version: 1.0.2
DeviceMap gem version: 0.1.1
Number of user agents: 1 000 000
Number of unique user agents: 18 743
Number of successful detections: 981 977
Detection rate: 98.20 %
Number of unique successful detections: 1 027
Detection rate (unique UA): 5.48 %
*Detection rate (without genericAndroid): 70.42 %*

As you can see, if I don't count "genericAndroid" and "unknown", the
detection rate is about 70 %, which is rather low. If you add some of the
most popular Android UAs to DM, its effectiveness will improve drastically.
But it will get progressively harder to increase this metric.
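
For anyone who wants to re-check the arithmetic, here is a minimal sketch in
Python. It assumes that "successful detections" means every result except
"unknown" (1 000 000 - 18 023 = 981 977, which matches the figure above) and
that the last rate additionally subtracts the "genericAndroid" hits. The
helper name rate() is only for illustration. Under those assumptions the
last rate comes out at roughly 70.4 %.

```python
# Figures copied from the summary and the "Top 10 devices" table above.
total_uas = 1_000_000
unique_uas = 18_743
detected = 981_977          # assumed: everything except "unknown"
unique_detected = 1_027
generic_android = 277_842   # hits that only resolved to "genericAndroid"


def rate(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


overall = rate(detected, total_uas)
unique = rate(unique_detected, unique_uas)
# Assumption: "without genericAndroid" means subtracting those hits
# from the successful detections before dividing by the total.
specific = rate(detected - generic_android, total_uas)

print(f"Detection rate: {overall:.2f} %")
print(f"Detection rate (unique UA): {unique:.2f} %")
print(f"Detection rate (without genericAndroid): {specific:.1f} %")
```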

I can provide the most popular UAs from our logs if that would be useful.


Regards

Konstantin

Re: DM effectiveness

Posted by Werner Keil <we...@gmail.com>.
Konstantin/all,

Thanks for the input and your findings.
The idea of the DDR for server-side recognition is to provide at least a
"device class", not necessarily the exact device, especially where devices
change a lot.
"generic Android" or "generic Windows" seems too broad, but we should be
able to introduce further layers of "parent" device families, say something
like a "10 inch Galaxy", which would allow us to bundle certain popular
devices. If a new device with the same metrics can get recognized, that'll
help, especially for users in need of UI optimization.
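
To make the "parent family" idea concrete, here is a rough sketch in Python.
It is not how DeviceMap stores its data; the class DeviceNode, the id
"galaxy10inch" and all attribute names and values are hypothetical and only
illustrate attribute lookup that falls back through parent families.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DeviceNode:
    """One entry in a hypothetical device hierarchy."""
    device_id: str
    parent: Optional["DeviceNode"] = None
    attributes: dict = field(default_factory=dict)

    def resolve(self, name: str):
        """Look up an attribute, falling back to parent families."""
        node = self
        while node is not None:
            if name in node.attributes:
                return node.attributes[name]
            node = node.parent
        return None


# Illustrative hierarchy: exact device -> device family -> generic OS class.
generic_android = DeviceNode("genericAndroid", attributes={"device_os": "Android"})
galaxy_10_inch = DeviceNode(
    "galaxy10inch",
    parent=generic_android,
    attributes={"is_tablet": True, "display_inches": 10.1},
)
gt_p5100 = DeviceNode("GT-P5100", parent=galaxy_10_inch, attributes={"model": "GT-P5100"})

print(gt_p5100.resolve("device_os"))
print(gt_p5100.resolve("is_tablet"))
```

A new device with the same metrics would only need an entry pointing at the
family node to inherit everything that matters for UI optimization.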

I had a chat with a fellow Codemotion speaker from Microsoft after his
great final-session talk about browsers and Web app UX. He said UAs often
lie, but promised I could ask him for a Spartan preview as soon as they
can share one. He said Spartan will be backward compatible in its UA, but
we'll certainly see interesting developments in devices such as the Nokia
Lumia if they get a Windows 10 upgrade and therefore a new browser,
too ;-)

Greetings from Rome,
Werner
On 29.03.2015 at 00:57, "Reza Naghibi" <re...@apache.org> wrote:

> I made the following epic for these devices:
>
> https://issues.apache.org/jira/browse/DMAP-154
>
> A lot of them boil down to a handful of device classes, so it shouldn't be
> too hard to get a large chunk of these into our next release.

Re: DM effectiveness

Posted by Reza Naghibi <re...@apache.org>.
I made the following epic for these devices:

https://issues.apache.org/jira/browse/DMAP-154

A lot of them boil down to a handful of device classes, so it shouldn't be
too hard to get a large chunk of these into our next release.

On Sat, Mar 28, 2015 at 7:06 PM, Konstantin Papkovskiy <
konstantin@papkovskiy.com> wrote:

> Probably the UA dataset isn't very representative, hence such a low
> detection rate.
>
>
> > Also, over 90% of these unknown devices are Android. Konstantin, is there
> > any value in that Android identification for you?
>
>
> Yes, it is useful. For example, we gather statistics about device OS.
>
> > What device attributes are important here? It's worth noting that in 2.0, we
> > can still parse out the device name for unknown well-formed Android user
> > agents. But obviously we know nothing else about said devices.
>
>
> Device name alone isn't very useful. It really depends on use cases for DM.
> In earlier emails I mentioned which attributes are important for us.

Re: DM effectiveness

Posted by Konstantin Papkovskiy <ko...@papkovskiy.com>.
Probably the UA dataset isn't very representative, hence such a low
detection rate.


> Also, over 90% of these unknown devices are Android. Konstantin, is there
> any value in that Android identification for you?


Yes, it is useful. For example, we gather statistics about device OS.

> What device attributes are important here? It's worth noting that in 2.0, we
> can still parse out the device name for unknown well-formed Android user
> agents. But obviously we know nothing else about said devices.


Device name alone isn't very useful. It really depends on use cases for DM.
In earlier emails I mentioned which attributes are important for us.


On Sat, Mar 28, 2015 at 11:06 PM, Reza Naghibi <re...@apache.org> wrote:

> Attached are the unknown (and generic) devices with counts. So as
> Konstantin noted, these comprise a little less than 30% of the devices. The
> first 40 entries in the attachment comprise around 13% of this latter set,
> so by adding these 40 devices, we can increase the accuracy of the original
> list to a tad under 85%. Adding the top 100 devices would bring that almost
> to 90% accuracy.
>
> Also, over 90% of these unknown devices are Android. Konstantin, is there
> any value in that Android identification for you?
>
> What device attributes are important here? It's worth noting that in 2.0,
> we can still parse out the device name for unknown well-formed Android user
> agents. But obviously we know nothing else about said devices.
>
> It's also worth noting these are mostly Russian devices :)

Re: DM effectiveness

Posted by Reza Naghibi <re...@apache.org>.
Attached are the unknown (and generic) devices with counts. So as
Konstantin noted, these comprise a little less than 30% of the devices. The
first 40 entries in the attachment comprise around 13% of this latter set,
so by adding these 40 devices, we can increase the accuracy of the original
list to a tad under 85%. Adding the top 100 devices would bring that almost
to 90% accuracy.
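
As a rough illustration of how such an estimate can be produced, here is a
small Python sketch. The counts below are invented for the example and are
not taken from the attachment; the function name coverage_after_adding is
also just an assumption.

```python
def coverage_after_adding(known_hits: int, total: int,
                          missing_counts: dict, top_n: int) -> float:
    """Share of traffic (in %) recognised once the top_n most frequent
    missing devices have been added to the data."""
    ranked = sorted(missing_counts.values(), reverse=True)
    return 100.0 * (known_hits + sum(ranked[:top_n])) / total


# Hypothetical numbers: 10 000 requests, 7 000 already recognised,
# the rest spread over devices that are not yet in the data.
missing = {"device-a": 500, "device-b": 300, "device-c": 150, "device-d": 50}
total = 10_000
known = 7_000

for n in (0, 1, 2, 4):
    print(n, coverage_after_adding(known, total, missing, n))
```

Because the counts are sorted in descending order, every additional device
contributes less than the previous one, which is why the gain flattens out.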

Also, over 90% of these unknown devices are Android. Konstantin, is there
any value in that Android identification for you?

What device attributes are important here? It's worth noting that in 2.0, we
can still parse out the device name for unknown well-formed Android user
agents. But obviously we know nothing else about said devices.
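
To show what "parsing out the device name" can look like, here is a sketch in
Python. It is not the DeviceMap 2.0 implementation, just an assumption about
the common Android UA layout, where the model is the last ";"-separated token
of the platform section and is followed by " Build/<id>". The function name
and the sample UA strings are made up for the example.

```python
import re
from typing import Optional


def android_device_name(user_agent: str) -> Optional[str]:
    """Return the model token of a well-formed Android UA, or None."""
    # Find the parenthesised platform section that mentions Android.
    section = re.search(r"\(([^)]*Android[^)]*)\)", user_agent)
    if section is None:
        return None
    # The model is expected in the last token, before " Build/".
    last_token = section.group(1).split(";")[-1].strip()
    name, separator, _ = last_token.partition(" Build/")
    name = name.strip()
    return name if separator and name else None


samples = [
    "Mozilla/5.0 (Linux; U; Android 4.0.4; en-gb; GT-I9300 Build/IMM76D) "
    "AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30",
    "Mozilla/5.0 (Linux; Android 4.4.2; Nexus 5 Build/KOT49H) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.0.0 Mobile Safari/537.36",
    "Mozilla/5.0 (iPhone; CPU iPhone OS 8_1 like Mac OS X) AppleWebKit/600.1.4",
]
for ua in samples:
    print(android_device_name(ua))
```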

It's also worth noting these are mostly Russian devices :)


On Sat, Mar 28, 2015 at 2:21 PM, Konstantin Papkovskiy <
konstantin@papkovskiy.com> wrote:

> I made the sample UA list from logs of our backend servers for mobile apps.
> Here is the list:
> https://www.dropbox.com/s/ne35o5etd7oj40f/ua_mobile_sample.csv.zip?dl=0
>
> You are welcome.

Re: DM effectiveness

Posted by Konstantin Papkovskiy <ko...@papkovskiy.com>.
I made the sample UA list from logs of our backend servers for mobile apps.
Here is the list:
https://www.dropbox.com/s/ne35o5etd7oj40f/ua_mobile_sample.csv.zip?dl=0

You are welcome.

On Sat, Mar 28, 2015 at 8:12 PM, Reza Naghibi <re...@apache.org> wrote:

> If you can make JIRA tickets for the missing devices, that would be great.
> We still have one more 1.0.x release scheduled for the spring/summer.
>
> Also, yes, if you can upload the user-agent list you used somewhere, that
> would be great. Where did you get these user agents from?
>
> thanks!

Re: DM effectiveness

Posted by Reza Naghibi <re...@apache.org>.
If you can make JIRA tickets for the missing devices, that would be great.
We still have one more 1.0.x release scheduled for the spring/summer.

Also, yes, if you can upload the user-agent list you used somewhere, that
would be great. Where did you get these user agents from?

thanks!

On Sat, Mar 28, 2015 at 11:10 AM, Konstantin Papkovskiy <
konstantin@papkovskiy.com> wrote:

> Hello all,
>
> I tested how effective DeviceMap is at user device detection. I ran tests on a
> dataset of 1M user agents (mostly mobile). Here are my results.
>
> Top 10 devices
>
> Device ID        # of detections   % of all
> genericAndroid            277842     27.78%
> iPad                       64881      6.49%
> iPhone                     46803      4.68%
> NokiaN8-00                 34517      3.45%
> GT-I9300                   24215      2.42%
> unknown                    18023      1.80%
> GT-I9100                   15950      1.60%
> GT-P3100                   14758      1.48%
> Nokia5800d                 14004      1.40%
> GT-P5100                   12332      1.23%
>
>
> DeviceMap data version: 1.0.2
> DeviceMap gem version: 0.1.1
> Number of user agents: 1 000 000
> Number of unique user agents: 18 743
> Number of successful detections: 981 977
> Detection rate: 98.20 %
> Number of unique successful detections: 1 027
> Detection rate (unique UA): 5.48 %
> *Detection rate (without genericAndroid): 70.42 %*
>
> As you can see, if I don't count "genericAndroid" and "unknown", the
> detection rate is about 70 %, which is rather low. If you add some of the
> most popular Android UAs to DM, its effectiveness will improve drastically.
> But it will get progressively harder to increase this metric.
>
> I can provide the most popular UAs from our logs if that would be useful.
>
>
> Regards
>
> Konstantin
>