# /robots.txt file for http://webcrawler.com/
# mail webmaster@webcrawler.com for constructive criticism

User-agent: webcrawler
Disallow:

User-agent: googlebot
Disallow:

User-agent: slurp
Disallow:

User-agent: inktomi
Disallow:

User-agent: hotbot
Disallow:

User-agent: lycra
Disallow:

User-agent: *
Disallow: /tmp
Disallow: /logs

robot-id: googlebot
robot-name: Googlebot
robot-cover-url: http://www.googlebot.com/
robot-details-url: http://www.googlebot.com/bot.html
robot-owner-name: Google Inc.
robot-owner-url: http://www.google.com/
robot-owner-email: googlebot@google.com
robot-status: active
robot-purpose: indexing
robot-type: standalone
robot-platform: Linux
robot-availability: none
robot-exclusion: yes
robot-exclusion-useragent: googlebot
robot-noindex: yes
robot-host: googlebot.com
robot-from: yes
robot-useragent: Googlebot/2.X (+http://www.googlebot.com/bot.html)
robot-language: c++
robot-description: Google's crawler
robot-history: Developed by Google Inc
robot-environment: commercial
modified-date: Wed Jan 7 00:00:01 PST 2007
modified-by: googlebot@google.com

robot-id: slurp
robot-name: Inktomi Slurp
robot-cover-url: http://www.inktomi.com/
robot-details-url: http://www.inktomi.com/slurp.html
robot-owner-name: Inktomi Corporation
robot-owner-url: http://www.inktomi.com/
robot-owner-email: slurp@inktomi.com
robot-status: active
robot-purpose: indexing, statistics
robot-type: standalone
robot-platform: unix
robot-availability: none
robot-exclusion: yes
robot-exclusion-useragent: slurp
robot-noindex: yes
robot-host: *.inktomi.com
robot-from: yes
robot-useragent: Slurp/2.0
robot-language: C/C++
robot-description: Indexing documents for the HotBot search engine (www.hotbot.com), collecting Web statistics
robot-history: Switch from Slurp/1.0 to Slurp/2.0 November 1996
robot-environment: service
modified-date: Wed Jan 7 00:00:01 PST 2007
modified-by: slurp@inktomi.com

robot-id: wz101
robot-name: WebZinger
robot-details-url: http://www.imaginon.com/wzindex.html
robot-cover-url: http://www.imaginon.com
robot-owner-name: ImaginOn, Inc
robot-owner-url: http://www.imaginon.com
robot-owner-email: info@imaginon.com
robot-status: active
robot-purpose: indexing
robot-type: standalone
robot-platform: windows95, windowsNT 4, mac, solaris, unix
robot-availability: binary
robot-exclusion: no
robot-exclusion-useragent: none
robot-noindex: no
robot-host: http://www.imaginon.com/wzindex.html *
robot-from: no
robot-useragent: none
robot-language: java
robot-description: commercial Web Bot that accepts plain text queries, uses webcrawler, lycos or excite to get URLs, then visits sites. If the user's filter parameters are met, downloads one picture and a paragraph of text. Plays back slide show format of one text paragraph plus image from each site.
robot-history: developed by ImaginOn in 1996 and 1997
robot-environment: commercial
modified-date: Wed Jan 7 00:00:01 PST 2007
modified-by: schwartz@imaginon.com

robot-id: architext
robot-name: ArchitextSpider
robot-cover-url: http://www.excite.com/
robot-details-url:
robot-owner-name: Architext Software
robot-owner-url: http://www.atext.com/spider.html
robot-owner-email: spider@atext.com
robot-status:
robot-purpose: indexing, statistics
robot-type: standalone
robot-platform:
robot-availability:
robot-exclusion: yes
robot-exclusion-useragent:
robot-noindex: no
robot-host: *.atext.com
robot-from: yes
robot-useragent: ArchitextSpider
robot-language: perl 5 and c
robot-description: Its purpose is to generate a Resource Discovery database, and to generate statistics. The ArchitextSpider collects information for the Excite and WebCrawler search engines.
robot-history:
robot-environment:
modified-date: Wed Jan 7 00:00:01 PST 2007
modified-by:
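
# A minimal sketch of how the User-agent/Disallow rules at the top of this
# file might be checked, assuming Python's standard urllib.robotparser.
# The agent name "examplebot", the sample paths, and the helper allowed()
# are hypothetical and used for illustration only; real crawlers may
# interpret the rules differently.

```python
import urllib.robotparser

# The User-agent/Disallow rules copied from the top of this file.
RULES = """\
User-agent: webcrawler
Disallow:

User-agent: googlebot
Disallow:

User-agent: slurp
Disallow:

User-agent: inktomi
Disallow:

User-agent: hotbot
Disallow:

User-agent: lycra
Disallow:

User-agent: *
Disallow: /tmp
Disallow: /logs
"""

# Parse the rules from memory instead of fetching them over the network.
rp = urllib.robotparser.RobotFileParser()
rp.parse(RULES.splitlines())


def allowed(agent, path):
    """Return True if `agent` may fetch `path` under the rules above."""
    return rp.can_fetch(agent, "http://webcrawler.com" + path)


if __name__ == "__main__":
    # "examplebot" is a hypothetical crawler that matches only the "*" record.
    for agent, path in [
        ("googlebot", "/tmp/report.html"),
        ("examplebot", "/tmp/report.html"),
        ("examplebot", "/logs/access.log"),
        ("examplebot", "/index.html"),
    ]:
        print(agent, path, allowed(agent, path))
```

# Under this parser, an empty "Disallow:" places no restriction on the named
# crawlers, while any other crawler falls through to the "*" record and is
# kept out of /tmp and /logs. The robot-id records carry no access rules and
# are ignored by the parser.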