Thread: Twitter's robots.txt question:

  1. #1

    Twitter's robots.txt question:

    Twitter's robots.txt appears to disallow everything, but search engines are still crawling and indexing everybody's profile pages. Why?

  2. #2
    Join Date
    Jan 2011
    Posts
    1

    Twitter doesn't disallow all URLs. If they wanted to do that, they'd use "Disallow: /". Their "Disallow: /*?" rule disallows all URLs containing a ? for robots that recognize wildcards (Google and Yahoo only, as far as I know). Other robots treat the * as a literal character, so for them the rule only matches paths that literally begin with "/*?". Twitter profile, tweet, etc. pages don't use a ? in the URL (no HTTP GET parameters), so the rule doesn't apply to them and search engines index them.
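
    To make the difference concrete, here is a rough Python sketch of the two interpretations described above. It is not how any particular crawler is implemented; the function names and the example paths ("/jack", "/search?q=robots") are made up for illustration, and it only handles the * wildcard plus a trailing $ anchor.

```python
import re

def rule_to_regex(rule: str, wildcards: bool) -> re.Pattern:
    """Compile a robots.txt path rule into a regex matched from the start of the path."""
    if not wildcards:
        # Robot without wildcard support: '*' is just another character,
        # so the rule is a plain literal prefix.
        return re.compile(re.escape(rule))
    # Wildcard-aware robot: '*' matches any run of characters,
    # and a trailing '$' anchors the end of the URL.
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))

def is_disallowed(url_path: str, rule: str, wildcards: bool = True) -> bool:
    """Return True if the rule matches the path (hypothetical helper)."""
    return rule_to_regex(rule, wildcards).match(url_path) is not None

# Hypothetical paths, for illustration only.
print(is_disallowed("/jack", "/*?"))                               # False: no '?' in the path
print(is_disallowed("/search?q=robots", "/*?"))                    # True: path contains '?'
print(is_disallowed("/jack", "/"))                                 # True: "Disallow: /" blocks everything
print(is_disallowed("/search?q=robots", "/*?", wildcards=False))   # False: literal "/*?" prefix not present
```

    So a profile path with no ? falls through the "/*?" rule either way, which fits with those pages getting indexed.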
