# See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
# and http://tool.motoricerca.info/robots-checker.phtml for a handy robots.txt checker
# and http://phpweby.com/services/robots to ensure that the big boys (e.g. Google, Yahoo) are still allowed in

# CW: disallowed 23/05/2011 as it has created a whole load of malformed urls - each of which was causing our app
# to generate an email exception report (and the bot gives us no benefit as it is an anti-plagiarism site)
User-agent: TurnitinBot
Disallow: /

# CW: disallowed 01/09/2011 : see trac #3
User-agent: plukkie
Disallow: /

# #584 : stop the good crawlers from getting the you_eye pages
# #769 : overruled, so now good crawlers can get the you_eye pages
# User-agent: *
# Disallow: /offices/*/you_eye*
# Disallow: /offices/*/historical_sales*
# #740 : Rob asked to have this removed.
#
# #717 : and stop them from getting to the default-redirected page when coming from wellsteadandwellstead.co.uk
# User-agent: *
# Disallow: /offices/bournemouth?wellstead=ferndown$

# CW: add in our site map, #82
Sitemap: http://www.youhome.co.uk/sitemap.xml

# TB: popup information urls, presumably extracted from our js. #731
User-agent: *
Disallow: */popup_information?ids=*

User-agent: *
Disallow: */popup_information

# CW: prevent Google from hitting some JS-only urls
User-agent: *
Disallow: */extra_info?*