Posted to issues@solr.apache.org by "Alexandre Rafalovitch (Jira)" <ji...@apache.org> on 2021/03/21 01:36:00 UTC

[jira] [Commented] (SOLR-15189) Tracking downloads on new Solr site

    [ https://issues.apache.org/jira/browse/SOLR-15189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17305594#comment-17305594 ] 

Alexandre Rafalovitch commented on SOLR-15189:
----------------------------------------------

We don't have enough data yet for really deep insights, but it is already interesting. We have a dual-linked property configured (classic and GA4); I jumped around a bit trying to understand both.

A couple of things that popped out:
 # 93% of our users are on desktop. That's higher than normal, but perhaps reflects that we are delivering software, not a service as such.
 # We get about 3x as many users on work days as on weekends, so this seems to correspond to work-time interest.
 # The home page is the most popular, which is usual; the second most popular (half of that) is downloads (good); everything else drops off by half again or worse.
 # The guide page is only 5% of visits; I find that under-promoted, though this does not count usage of the actual reference guide pages (no GA there).
 # The landing (first) page is the home page for 67% and downloads for 15%; the rest are much lower and don't get much (first page) attention.
 # Half of the traffic arrives from Organic Search (and that's basically all Google); 30% is direct, which may be copy/pastes, PDF links, etc.; interestingly, we get a tiny (but more than Bing) amount of traffic from the links on localhost:8983 (clearly from our Admin UI links); we don't have a partner promoting Solr so much that they are driving noticeable traffic to us.
 # Most of the social traffic (70%) is from Twitter; our presence on Hacker News, Reddit and even Stack Overflow is not effective.
 # Most visitors are from the USA, then China, India, and Germany, and then a heavy drop-off; I would be curious who
 # 83% of users are marked as new, though this may be an issue of time and of cookie-clearing technologies.

A couple of notes, looking for feedback/+-1:
 * I really, really wish we would wire the Reference Guide for analytics, even for 30 days or so. I think (hope) that may change a lot of the above numbers and, even more importantly, would give us some visibility into user flow through the information.
 * Search keywords that lead to the site (as well as errors) are mostly in Search Console, but it failed to validate through the GA4 tag. We can validate by uploading a file to the site instead. I recommend doing it because, again, this is about getting access to the information Google already collects (regardless of GA, actually).
 * In GA4, I saw a file_download event and it even had a 'filename' attribute, but it would only show that attribute for the last 30 minutes. I could not figure out how it was set up and/or how to see the values of the filename attribute over a longer period of time; if there is any information on whether/how that was set up, it would be good to know.

> Tracking downloads on new Solr site
> -----------------------------------
>
>                 Key: SOLR-15189
>                 URL: https://issues.apache.org/jira/browse/SOLR-15189
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: Jan Høydahl
>            Assignee: Alexandre Rafalovitch
>            Priority: Major
>         Attachments: Analytics All Web Site Data All Traffic 20210219-20210320.pdf, Analytics All Web Site Data Pages 20210219-20210320.pdf
>
>
> On lucene.apache.org we use Google Analytics tracking
> {quote}GOOGLE_ANALYTICS_TRACKING_ID = 'UA-94576-12'
> {quote}
> I think the reason was so that we could estimate downloads from mirrors by counting the number of clicks on the links from the download pages. But is anyone ever looking at or publishing those numbers?
> The ASF wants projects to stop using 3rd party tracking of users and instead ask INFRA for aggregated stats for the page. WDYT? Should we
>  # Remove trackers from both sites and rely on stats from infra
>  # Continue using Google Analytics, but have someone actually publish numbers from it every month?
>  # Use some other way of counting downloads?
> h2. What do we get without a tracker?
> INFRA provides anonymous page view stats here [https://uls.apache.org/exports/lucene.apache.org.yaml], which give some insight, but not into downloads specifically. We see 12k visits to the Solr downloads page last month, but we don't know how many of those clicked...
> {code:java}
> Sheet3:
>   Name: Most visited pages, past month
>   Values:
>     /solr/index.html: 33604
>     /index.html: 27588
>     /solr/downloads.html: 12118
>     /core/2_9_4/queryparsersyntax.html: 11135
>     /core/index.html: 10353
>     /solr/guide/solr-tutorial.html: 9734
>     /solr/resources.html: 8014
>     /solr/features.html: 7046
>     /solr/guide/8_8/solr-tutorial.html: 6099
>     /solr/news.html: 5843
>     /solr/guide/6_6/the-standard-query-parser.html: 5216
>     /solr/guide/index.html: 4430
>     /solr/guide/6_6/common-query-parameters.html: 4379
>     /core/downloads.html: 3644
> {code}
> There's an interesting section at the bottom of that YAML page; I wonder if it could be enabled in some way.
> {code}
> Sheet6:
>   Name: Downloads, past month
>   Values: {}
> {code}
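
As a rough illustration of what the INFRA page-view export quoted above can already tell us without a tracker, here is a minimal Python sketch. It hard-codes the "Most visited pages, past month" values from the issue description rather than fetching the live YAML, and the names SHEET3 and parse_counts are made up for this example. It only compares page views; it says nothing about how many visitors actually clicked a download link.

```python
# Page-view counts copied from the "Most visited pages, past month" sheet
# quoted in the issue description (not fetched live).
SHEET3 = """\
/solr/index.html: 33604
/index.html: 27588
/solr/downloads.html: 12118
/core/2_9_4/queryparsersyntax.html: 11135
/core/index.html: 10353
/solr/guide/solr-tutorial.html: 9734
/solr/resources.html: 8014
/solr/features.html: 7046
/solr/guide/8_8/solr-tutorial.html: 6099
/solr/news.html: 5843
/solr/guide/6_6/the-standard-query-parser.html: 5216
/solr/guide/index.html: 4430
/solr/guide/6_6/common-query-parameters.html: 4379
/core/downloads.html: 3644
"""


def parse_counts(text):
    """Parse 'path: count' lines into a dict (hypothetical helper)."""
    counts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last colon so the path stays intact.
        path, _, value = line.rpartition(":")
        counts[path.strip()] = int(value)
    return counts


counts = parse_counts(SHEET3)
home = counts["/solr/index.html"]
downloads = counts["/solr/downloads.html"]

# Downloads page views relative to the Solr home page.
downloads_ratio = downloads / home

# Combined views of the listed reference guide pages.
guide_total = sum(n for path, n in counts.items() if path.startswith("/solr/guide/"))

print(f"downloads / home page views: {downloads_ratio:.2f}")
print(f"listed /solr/guide/ page views: {guide_total}")
```

The same parsing could be pointed at the full export once it is downloaded, though the real file has more structure (sheet names, nested values) than this excerpt, so a proper YAML parser would be the safer choice there.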


