Topic: robots.txt

aenor

robots.txt
« on: November 15, 2009, 09:23:57 pm »
I couldn't find a bug tracker or anything similar, so I am posting my feature request in this forum.
It would be nice to have a bug/feature tracker such as Trac, Redmine or Bugzilla. That especially helps the developers keep an overview. If there is already such a thing, it could be promoted more prominently on the website.

Now to the feature request:
Please add a robots.txt in the root directory with the following content:
Code:
User-agent: *
Disallow: /

That stops compliant search engines from crawling the login page, so its content does not end up in their indexes.
On the one hand, it is useless to have a login page indexed. On the other hand, an indexed login page makes it possible to find OpenGoo installations through e.g. Google, and possibly even to identify certain versions (or at least versions prior to a specific release, if the login page's title or some string on it changed), which can then be used to exploit possible security holes.
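For anyone who wants to check what those two lines do before deploying them, here is a minimal sketch using Python's standard `urllib.robotparser`. The host `example.com`, the bot name `ExampleBot` and the login path are placeholders chosen for illustration, not real OpenGoo values.

```python
from urllib.robotparser import RobotFileParser

# The two lines proposed above, as they would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder URLs; the query string is only an assumed login address.
    for url in (
        "http://example.com/",
        "http://example.com/index.php?c=access&a=login",
    ):
        print(url, is_allowed("ExampleBot", url))
```

Keep in mind that robots.txt is only a request: well-behaved crawlers honour it, but it is not an access control and does not protect the login page itself.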

a2opinion

Re: robots.txt
« Reply #1 on: November 16, 2009, 01:43:24 pm »
Good call.

It's an easy thing for us to add ourselves, but we shouldn't have to. It should be super easy to implement, too.

ignacio

Re: robots.txt
« Reply #2 on: November 27, 2009, 04:29:16 pm »
It won't have any effect unless you install OpenGoo in the document root of your server, am I right?

aenor

Re: robots.txt
« Reply #3 on: November 29, 2009, 10:56:40 am »
Quote from ignacio: "It won't have any effect unless you install OpenGoo in the document root of your server, am I right?"

That is right: it will only apply to users who run OpenGoo in the document root, e.g. on a subdomain. But it won't "harm" others. That is exactly the way Roundcube and phpMyAdmin handle it.
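The reason is that crawlers request robots.txt only from the root of the host, never from a subdirectory. A small sketch of that lookup, assuming placeholder hosts (`example.com`, `opengoo.example.com`) and a hypothetical helper name `robots_txt_url`:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL a crawler would consult for page_url.

    Only the scheme and host are kept; the path, query and fragment of the
    page are discarded, so a robots.txt in a subdirectory is never requested.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # Installed in a subdirectory: the shipped file at /opengoo/robots.txt is ignored.
    print(robots_txt_url("http://example.com/opengoo/index.php"))
    # Installed in the document root of a subdomain: the shipped file is used.
    print(robots_txt_url("http://opengoo.example.com/index.php"))
```

So for a subdirectory install the rules would have to be copied into the site's own top-level robots.txt, with the paths prefixed accordingly.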

a2opinion

Re: robots.txt
« Reply #4 on: December 18, 2009, 12:24:37 pm »
Just discovered today that putting this in your robots.txt will prevent Google from fetching your calendar feed if you have imported it into Google Calendar.

internalkernel

Re: robots.txt
« Reply #5 on: February 08, 2010, 10:52:29 pm »
Yep, that's exactly how I came across this thread. After upgrading from 1.6.1 to 1.6.2, the robots.txt file must have come to life, and Google is complaining about not being able to crawl the contents of an exported iCal feed.

My workaround was to add:

Code:
User-agent: Googlebot
Allow: /*.php?c=feed&a=ical_export$

to the top of my robots.txt.
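The `*` and trailing `$` in that Allow line are wildcard extensions that Googlebot understands; Python's standard `urllib.robotparser` does not interpret them. To see which URLs the pattern covers, here is a rough sketch of the matching as Google documents it (`*` matches any run of characters, a trailing `$` anchors the end of the URL). The function name `rule_matches` and the sample paths are made up for illustration.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Check a robots.txt path pattern with '*' and a trailing '$' against path."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and let each '*' match any run of characters.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, like robots.txt prefix matching.
    return re.match(regex, path) is not None


if __name__ == "__main__":
    allow = "/*.php?c=feed&a=ical_export$"
    for path in (
        "/index.php?c=feed&a=ical_export",      # the exported feed
        "/index.php?c=feed&a=ical_export&x=1",  # extra parameter after the match
        "/index.php?c=access&a=login",          # assumed login address
    ):
        print(path, rule_matches(allow, path))
```

One caveat: Google's documentation says a crawler obeys only the most specific `User-agent` group that matches it. A `Googlebot` group containing nothing but this Allow line may therefore leave the rest of the installation open to Googlebot, so it is probably worth repeating `Disallow: /` inside that group.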

a2opinion

Re: robots.txt
« Reply #6 on: February 26, 2010, 09:24:40 pm »
Oh yeah, that's much better than just killing it off, which is what I was doing.