  1. #1
    pph Expert! Gordon
    Join Date
    January 18th, 2005
    Location
    Edmonton Canada
    Posts
    5,781
    Can anybody help me with this please?
    I want to remove some URLs from Google's index as I have changed the site's navigation methods. They are still listing URLs like mysite.com/?b= and mysite.com/?a= and mysite.com/?ip=

    I am wondering if there is anything I can put in either the .htaccess or the robots.txt file that will stop them listing (or get them to remove as dead links) all URLs that start with mysite.com/?b=, mysite.com/?a= or mysite.com/?ip=

    Thanks in advance for any help.

    [edited to add] Am I correct in thinking that these lines in my robots.txt file would do what I am asking above?

    User-agent: Googlebot
    Disallow: /*?b=
    User-agent: Googlebot
    Disallow: /*?a=
    User-agent: Googlebot
    Disallow: /*?ip=

    or maybe like this
    User-agent: Googlebot
    Disallow: /*?ip=*
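
    or, combining them all under one User-agent group (assuming Googlebot reads the wildcards the same way), maybe:
    User-agent: Googlebot
    Disallow: /*?b=
    Disallow: /*?a=
    Disallow: /*?ip=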


    Thanks again
    Last edited by Gordon; August 4th, 2005 at 02:24 PM.
    One day parasites and their ilk will be made illegal, I bet a few Lawyers will be pissed off when the day comes.
    Mr. Spitzer is fetching it nearer

    YouTrek

  2. #2
    Lite On The Do, Heavy On The Nuts Donuts
    Join Date
    January 18th, 2005
    Location
    Winter Park, FL
    Posts
    6,930
    How about the new Google Sitemaps deal, where you tell them what the pages of your site are? I would guess that would remove the ones you don't tell them exist...

  3. #3
    pph Expert! Gordon
    Join Date
    January 18th, 2005
    Location
    Edmonton Canada
    Posts
    5,781
    Thanks Donuts. I've not managed to get the sitemap program working yet; my server does not allow me to run a script on it. I am with httpme.com.
    One day parasites and their ilk will be made illegal, I bet a few Lawyers will be pissed off when the day comes.
    Mr. Spitzer is fetching it nearer

    YouTrek

  4. #4
    Crazy like a fox suzigeek
    Join Date
    January 18th, 2005
    Posts
    1,096
    I believe htaccess would be the route to go.

    I'm fairly new to writing rewrite rules, but I think I've read somewhere that you can write a rule to send all those types of URLs to a 404 page, or do a redirect to the appropriate page - a 301 is "moved permanently" and a 302 is temporary. You would have to get up to speed with regular expressions.
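
    Maybe something like this - untested, just a sketch from what I remember of the tutorials, with made-up file names:
    # .htaccess - needs mod_rewrite available on the server
    RewriteEngine On
    # permanent (301) redirect from an old page to its replacement
    RewriteRule ^oldpage\.html$ http://www.mysite.com/newpage.html [R=301,L]
    # or tell the bots a page is gone for good (sends a 410)
    RewriteRule ^retiredpage\.html$ - [G]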

    The robots.txt file basically tells the bots what directories they are allowed to access. It doesn't tell them that a page has moved or is no longer available (I think).

    Do a search for mod_rewrite and .htaccess. There's a good forum on it out there. I had some good tutorials bookmarked but have switched computers and left those bookmarks behind.

    Good Luck!!
    Suz~~GearGirl~~

  5. #5
    Lite On The Do, Heavy On The Nuts Donuts
    Join Date
    January 18th, 2005
    Location
    Winter Park, FL
    Posts
    6,930
    Quote Originally Posted by Gordon
    Thanks Donuts. I've not managed to get the sitemap program working yet; my server does not allow me to run a script on it. I am with httpme.com.
    I believe G lets you upload a text file of them - skipping all the software, the sitemap generator, and the XML and scripting requirements that they have. See it here:
    http://www.google.com/webmasters/sit.../en/other.html

    Look for "Text file" at the top of the page and the last section at the bottom of the page.
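
    If I remember right, the text file is just a plain list of full URLs, one per line - something like this (made-up URLs):
    http://www.mysite.com/
    http://www.mysite.com/page1.html
    http://www.mysite.com/page2.html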

  6. #6
    Member
    Join Date
    January 18th, 2005
    Location
    Choctaw, Oklahoma
    Posts
    114
    Quote Originally Posted by suzigeek
    I believe htaccess would be the route to go. [...]
    A 301 redirect in your .htaccess will tell Google this is a permanent change, and Google will act accordingly.

    Here's what you put in your .htaccess file:

    Redirect 301 /thisisoldfile.html http://www.thisisnewfile.com/newfile.html

    Use 301 since it is a permanent change (and I read somewhere Google does not like 302s anyway).
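
    One thing, though - as far as I know Redirect only matches the path, not the query string, so for Gordon's mysite.com/?b= style URLs you would probably need mod_rewrite instead. Untested sketch, domain made up:
    RewriteEngine On
    # match requests for / where the query string starts with a=, b= or ip=
    RewriteCond %{QUERY_STRING} ^(a|b|ip)=
    # send them to the home page with a 301; the trailing ? drops the old query string
    RewriteRule ^$ http://www.mysite.com/? [R=301,L]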

    Peace & Blessings,
    White Wolf
    Blessed Be,
    White Wolf

  7. #7
    ABW Ambassador Radegast
    Join Date
    January 18th, 2005
    Posts
    1,978
    Are there any disadvantages to redirecting non-existent pages to the index page?
    ie:
    ErrorDocument 404 http://<mydomain.com>/index.html

  8. #8
    notary sojac Herb ԿԬ
    Join Date
    January 18th, 2005
    Location
    Central/Western NY State
    Posts
    7,741
    whoa, folks!
    The last time I went to google.com/addurl I saw a box where you could report a dead URL.

  9. #9
    ABW Ambassador
    Join Date
    January 18th, 2005
    Location
    Los Angeles
    Posts
    4,053
    No problem with 301 redirects sending absent pages to the right ones.
