Posted to dev@ignite.apache.org by Dmitriy Setrakyan <ds...@apache.org> on 2015/09/02 01:36:06 UTC

Re: website changes

Raul,

Sorry for the late reply, but better late than never :)

On Mon, Aug 24, 2015 at 4:30 AM, Raul Kripalani <ra...@apache.org> wrote:

> wget spider output here:
> https://gist.github.com/raulk/7d6713aa7b3d21ecaacd
>
> No issues with regards to the domain migration, but we have 404 in
> robots.txt and some jquery JS across many pages.
>
> I also ran a spider on our readme.io docs, and it was quite OK except that
> it found these 404s:
>
> --2015-08-24 12:16:43--
> https://apacheignite.readme.io/docs/distributed-closures%22  HTTP/1.1 404
> Not Found
> --2015-08-24 12:26:04--
> https://apacheignite.readme.io/docs/%7B%7Burl('v'%20+%20v.version)%7D%7D
>  HTTP/1.1 404 Not Found
>
>
Raul, is there a way to find out the referrer pages that have these links?
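
One possible way to answer this is to record, while crawling, which page each link was found on. The following is a minimal sketch in Python, assuming the pages have already been fetched into a dict of URL to HTML source. The sample page URL and its HTML are hypothetical and only illustrate the idea; they are not taken from the real docs.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the raw href/src attribute values found in one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def referrers_by_target(pages):
    """Map each absolute link target to the set of pages that reference it.

    `pages` is a dict of page URL -> HTML source (fetched elsewhere).
    """
    referrers = defaultdict(set)
    for page_url, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        for link in collector.links:
            # Resolve the link the way a browser or spider would.
            referrers[urljoin(page_url, link)].add(page_url)
    return referrers


# Hypothetical page content, for illustration only.
pages = {
    "https://apacheignite.readme.io/docs/getting-started":
        '<a href="distributed-closures%22">closures</a>',
}
broken = "https://apacheignite.readme.io/docs/distributed-closures%22"
print(sorted(referrers_by_target(pages)[broken]))
```

Looking up each 404 URL from the spider output in the resulting map would list the pages that need fixing.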


> With regards to the jquery URL references:
>
> --2015-08-24 12:19:31--
> http://ignite.apache.org/use-cases/spark/js/jquery-1.11.1.min.js
>   HTTP/1.1 404 Not Found
> --2015-08-24 12:19:27--
> http://ignite.apache.org/use-cases/caching/js/jquery-1.11.1.min.js
>   HTTP/1.1 404 Not Found
>

Prachi, any chance you can look at these?


>
> Are some examples. I guess these HTMLs are referring to jquery in the
> context directory rather than in a common directory.
>
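
The guess above about the context directory can be illustrated with standard URL resolution. This is a small sketch in Python; the page URL is hypothetical, and whether the jquery file really lives under /js/ at the site root is an assumption.

```python
from urllib.parse import urljoin

# Hypothetical page URL under the use-cases section.
page = "http://ignite.apache.org/use-cases/spark/index.html"

# A relative reference resolves against the page's own directory,
# which reproduces the URL that returned 404 in the spider output.
relative = urljoin(page, "js/jquery-1.11.1.min.js")

# A root-relative reference resolves against the site root
# (that the file is served from /js/ is an assumption).
root_relative = urljoin(page, "/js/jquery-1.11.1.min.js")

print(relative)
print(root_relative)
```

If the pages use the first form, switching the script reference to a root-relative path would be one way to make every page load the same shared copy.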
> Do you think it makes sense to add a robots.txt for SEO purposes?
>

Raul, I am not sure what we would put into robots.txt. Is there any benefit
in having this file vs not having it?


>
> Regards,
>
> *Raúl Kripalani*
> Apache Camel PMC Member & Committer | Enterprise Architect, Open Source
> Integration specialist
> http://about.me/raulkripalani | http://www.linkedin.com/in/raulkripalani
> http://blog.raulkr.net | twitter: @raulvk
>
> On Mon, Aug 24, 2015 at 10:45 AM, Dmitriy Setrakyan <dsetrakyan@apache.org
> >
> wrote:
>
> > Igniters,
> >
> > I have updated the Ignite website to reflect the project graduation
> (turned
> > out that many links were not working).
> >
> > Would be nice if the community clicked around and verified that all the
> > links are working and all the wording and examples are correct.
> >
> > Thanks,
> > D.
> >
>

Re: website changes

Posted by Konstantin Boudnik <co...@apache.org>.
On Tue, Sep 01, 2015 at 04:36PM, Dmitriy Setrakyan wrote:
> Raul,
> 
> Sorry for the late reply, but better late than never :)
> 
> On Mon, Aug 24, 2015 at 4:30 AM, Raul Kripalani <ra...@apache.org> wrote:
> 
> > wget spider output here:
> > https://gist.github.com/raulk/7d6713aa7b3d21ecaacd
> >
> > No issues with regards to the domain migration, but we have 404 in
> > robots.txt and some jquery JS across many pages.

The fact that some crawler is looking for robots.txt (or certain JS files)
doesn't oblige us to provide them.

Cos


Re: website changes

Posted by Konstantin Boudnik <co...@apache.org>.
On Thu, Sep 10, 2015 at 05:38PM, Prachi Garg wrote:
> I've created Google site verification for both the http and https versions of
> the Ignite website. I've also added sitemap.xml and robots.txt.
> 
> If I may, I would like to suggest a server-side 301 redirect from http to
> https version of the website to ensure that users and search engines are
> directed to the preferred protocol (I'm guessing https pages are preferred
> over http pages).

That's a good idea! I guess not everyone just installs HTTPS Everywhere and
forgets that websites still handle non-secure transport ;)

Cos

> Thanks,
> 
> -Prachi
> 
> On Tue, Sep 1, 2015 at 5:16 PM, Raul Kripalani <ra...@evosent.com> wrote:
> 
> > robots.txt is not functionally important in this context because we don't
> > have content we want to exclude from crawling.
> >
> > But it's always wise to serve at least a generic one because you don't know
> > if Google penalises sites that return an HTTP 404 for this file in terms of
> > SEO. I wouldn't be surprised if it did. And it's a simple file to add.
> >
> > Moreover, I would suggest creating a google-site-verification with a
> > sitemap controlled by us (if it hasn't been done yet - I'm on mobile now
> > and it's a pain to check). And also to hint the crawler to also crawl
> > readme.io and the javadoc. Currently if I google the term IgniteContext I
> > get the 1.1.0 javadoc page, which tells me that Google needs some hints to
> > crawl more recent javadocs better.
> >
> > Regards,
> > Raúl.

Re: website changes

Posted by Prachi Garg <pg...@gridgain.com>.
I've created Google site verification for both the http and https versions of
the Ignite website. I've also added sitemap.xml and robots.txt.

If I may, I would like to suggest a server-side 301 redirect from http to
https version of the website to ensure that users and search engines are
directed to the preferred protocol (I'm guessing https pages are preferred
over http pages).
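
To make the suggestion concrete, here is a rough sketch in Python of what a server-side 301 redirect does. It is not how the Ignite site is actually served; on the real web server this would normally be a configuration rule. The helper name `https_location`, the handler class, the default host, and port 8080 are all assumptions made for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def https_location(host, path):
    """Build the https URL for a plain-http request's Host header and path."""
    # Drop an explicit port such as :80 (simplified; no IPv6 literal support).
    hostname = host.split(":", 1)[0]
    return "https://" + hostname + path


class RedirectToHttps(BaseHTTPRequestHandler):
    """Answers every GET/HEAD with a permanent redirect to the https URL."""

    def do_GET(self):
        host = self.headers.get("Host", "ignite.apache.org")
        self.send_response(301)  # 301 = Moved Permanently
        self.send_header("Location", https_location(host, self.path))
        self.end_headers()

    do_HEAD = do_GET


def serve(port=8080):
    """Run the redirecting server; the port is an arbitrary local choice."""
    HTTPServer(("", port), RedirectToHttps).serve_forever()
```

Calling `serve()` and requesting any path over plain http would return a 301 whose Location header points at the same path on https, which is the signal that tells browsers and crawlers which protocol is preferred.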


Thanks,

-Prachi


Re: website changes

Posted by Raul Kripalani <ra...@evosent.com>.
robots.txt is not functionally important in this context because we don't
have content we want to exclude from crawling.

But it's always wise to serve at least a generic one because you don't know
if Google penalises sites that return an HTTP 404 for this file in terms of
SEO. I wouldn't be surprised if it did. And it's a simple file to add.
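
As an illustration of what a generic file could look like, here is a minimal sketch in Python that checks a permissive robots.txt with the standard library parser. The file contents are an assumption about what "generic" means here, the URL being checked is only an example, and Python's parser is not the one Google uses.

```python
from urllib.robotparser import RobotFileParser

# A generic robots.txt that excludes nothing:
# an empty Disallow value allows every path for every crawler.
ROBOTS_TXT = """\
User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Example URL only; any path on the site should be allowed.
allowed = parser.can_fetch("Googlebot", "http://ignite.apache.org/use-cases/spark/")
print(allowed)
```

Serving a file like this keeps crawlers from receiving a 404 without restricting anything.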

Moreover, I would suggest creating a google-site-verification with a
sitemap controlled by us (if it hasn't been done yet - I'm on mobile now
and it's a pain to check). And also to hint the crawler to also crawl
readme.io and the javadoc. Currently if I google the term IgniteContext I
get the 1.1.0 javadoc page, which tells me that Google needs some hints to
crawl more recent javadocs better.

Regards,
Raúl.