  1. #1
    Member
    Join Date
    May 18th, 2005
    Posts
    119
    AE Script and DYSE
    Will having the following line in the robots.txt file keep the pages from being spidered and indexed?

    Disallow: /cgi-bin/

  2. #2
    ABW Ambassador cusimano's Avatar
    Join Date
    January 18th, 2005
    Location
    Toronto, Canada
    Posts
    1,369
    Assuming that the spider obeys the /robots.txt file (some spiders don't), that line will stop URLs that start with /cgi-bin/ from being spidered, such as /cgi-bin/ae.pl?PARAMETERS

    If you are using the AE virtual directory (mod_rewrite) so that products show up in a virtual directory (e.g. mydomain.com/amazon/), then you would need a Disallow statement for that virtual directory, e.g.:

    User-agent: *
    Disallow: /amazon/

    DySE always uses a virtual directory (mod_rewrite), so you would need a Disallow for the virtual directory where the DySE store is located. Disallowing /cgi-bin/ would have no effect for a DySE store. For example, mydomain.com/calendars/ would require:

    User-agent: *
    Disallow: /calendars/
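
As a rough check of the rules above, here is a minimal sketch using Python's standard `urllib.robotparser`. It assumes a spider that obeys robots.txt; the helper `is_allowed` and the page name `2006-wall.html` are made up for illustration and are not part of AE or DySE.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if a spider that obeys robots.txt may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Rules with only the /cgi-bin/ line, then with the store directory added.
CGI_ONLY = "User-agent: *\nDisallow: /cgi-bin/\n"
WITH_STORE = CGI_ONLY + "Disallow: /calendars/\n"

script_url = "http://mydomain.com/cgi-bin/ae.pl?PARAMETERS"
store_url = "http://mydomain.com/calendars/2006-wall.html"  # hypothetical page

print(is_allowed(CGI_ONLY, script_url))   # script URL under /cgi-bin/
print(is_allowed(CGI_ONLY, store_url))    # store page, /cgi-bin/ rule only
print(is_allowed(WITH_STORE, store_url))  # store page, /calendars/ rule added
```

The second call is the DySE case described above: with only the /cgi-bin/ rule, the store page under /calendars/ is still fetchable, and it is blocked only once the virtual directory has its own Disallow line.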

    Yours truly,
    Cusimano.Com Corporation
    per: David Cusimano

  3. #3
    Member
    Join Date
    May 18th, 2005
    Posts
    119
    If the store is on a subdomain, how would one deal with that in the robots.txt file?

  4. #4
    ABW Ambassador cusimano's Avatar
    Join Date
    January 18th, 2005
    Location
    Toronto, Canada
    Posts
    1,369
    Upload your robots.txt file to your server so that it shows up at:

    http://subdomain.domain.com/robots.txt

    That is, upload the robots.txt file to the directory where your subdomain's home page is located.
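
To illustrate why the file has to sit at the subdomain's root, here is a small sketch of how a spider could derive the robots.txt address from any page URL: it keeps the scheme and host and replaces the rest with /robots.txt. The function name `robots_url_for` and the sample page paths are assumptions for illustration only.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url_for(page_url: str) -> str:
    """Build the robots.txt URL for the host that serves `page_url`."""
    parts = urlsplit(page_url)
    # Keep scheme and host; drop the path, query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url_for("http://subdomain.domain.com/store/item.html"))
print(robots_url_for("http://domain.com/store/item.html"))
```

Each host name gets its own robots.txt, so rules in the main domain's file are not consulted for pages on the subdomain.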

    Yours truly,
    Cusimano.Com Corporation
    per: David Cusimano

  5. #5
    Member
    Join Date
    May 18th, 2005
    Posts
    119
    Say my subdomain (for Rockler, for example) is woodworking.domain.com, and in the root of /woodworking, where the robots.txt file is, I see cgi-bin and dyse. Would I still use Disallow: /woodworking?

    It seems odd to do this.

    Thanks

  6. #6
    ABW Ambassador cusimano's Avatar
    Join Date
    January 18th, 2005
    Location
    Toronto, Canada
    Posts
    1,369
    In the /woodworking directory create a robots.txt file containing:

    User-agent: *
    Disallow: /

    When a search engine spider wants to determine if http://woodworking.domain.com/something/something.html is spiderable, the spider reads http://woodworking.domain.com/robots.txt. From what you have described, that URL corresponds to the file /woodworking/robots.txt on your server, so your server reads that file and sends its contents to the spider.
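
A quick sketch of what those two lines do, again with Python's standard `urllib.robotparser` and assuming a well-behaved spider. The helper `blocked_urls`, the spider name "ExampleBot" and the sample URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Contents of the robots.txt served at http://woodworking.domain.com/robots.txt
SUBDOMAIN_RULES = ["User-agent: *", "Disallow: /"]


def blocked_urls(rules, urls, agent="ExampleBot"):
    """Return the URLs that the given rules tell `agent` not to fetch."""
    parser = RobotFileParser()
    parser.parse(rules)
    return [u for u in urls if not parser.can_fetch(agent, u)]


urls = [
    "http://woodworking.domain.com/",
    "http://woodworking.domain.com/something/something.html",
    "http://woodworking.domain.com/cgi-bin/ae.pl?PARAMETERS",
]

# Every path starts with "/", so "Disallow: /" matches all of them.
print(blocked_urls(SUBDOMAIN_RULES, urls) == urls)
```

Because the paths in robots.txt are relative to the host name, there is no need to mention /woodworking in the rule itself.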

    Yours truly,
    Cusimano.Com Corporation
    per: David Cusimano
