"Crawler" (sometimes also called a "robot" or "spider") is a generic term for any program that is used to automatically discover and scan websites by following links from one webpage to another. Google's main crawler is called Googlebot. This table lists information about the common Google crawlers you may see in your referrer logs, and how to specify them in robots.txt, the robots meta tags, and the X-Robots-Tag HTTP directives.

The following table shows the crawlers used by various products and services at Google. The AdsBot crawlers check ad quality on Android web pages, iPhone web pages, desktop web pages, and Android app pages, and they obey the AdsBot-Google robots rules.

Overview of Google crawlers (user agents)
Crawlers
APIs-Google
User agent token
APIs-Google
Full user agent string
APIs-Google (+https://developers.google.com/webmasters/APIs-Google.html)
AdsBot Mobile Web Android
User agent token
AdsBot-Google-Mobile
Full user agent string
Mozilla/5.0 (Linux; Android 5.0; SM-G920A) AppleWebKit (KHTML, like Gecko) Chrome Mobile Safari (compatible; AdsBot-Google-Mobile; +http://www.google.com/mobile/adsbot.html)
AdsBot Mobile Web
User agent token
AdsBot-Google-Mobile
Full user agent string
Mozilla/5.0 (iPhone; CPU iPhone OS 14_7_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.2 Mobile/15E148 Safari/604.1 (compatible; AdsBot-Google-Mobile; +http://www.google.com/mobile/adsbot.html)
AdsBot
User agent token
AdsBot-Google
Full user agent string
AdsBot-Google (+http://www.google.com/adsbot.html)
AdSense
User agent token
Mediapartners-Google
Full user agent string
Mediapartners-Google
Googlebot Image
User agent tokens
Googlebot-Image
Googlebot
Full user agent string
Googlebot-Image/1.0
Googlebot News
User agent tokens
Googlebot-News
Googlebot
Full user agent string
The Googlebot-News user agent uses the various Googlebot user agent strings.
Googlebot Video
User agent tokens
Googlebot-Video
Googlebot
Full user agent string
Googlebot-Video/1.0
Googlebot Desktop
User agent token
Googlebot
Full user agent strings
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Chrome/W.X.Y.Z Safari/537.36
Googlebot/2.1 (+http://www.google.com/bot.html)
Googlebot Smartphone
User agent token
Googlebot
Full user agent string
Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Mobile AdSense
User agent token
Mediapartners-Google
Full user agent string
(Various mobile device types) (compatible; Mediapartners-Google/2.1; +http://www.google.com/bot.html)
Mobile Apps Android
User agent token
AdsBot-Google-Mobile-Apps
Full user agent string
AdsBot-Google-Mobile-Apps
Feedfetcher
User agent token
FeedFetcher-Google
Full user agent string
FeedFetcher-Google; (+http://www.google.com/feedfetcher.html)
Google Read Aloud
User agent token
Google-Read-Aloud
Full user agent strings
Current agents:
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.118 Safari/537.36 (compatible; Google-Read-Aloud; +https://developers.google.com/search/docs/crawling-indexing/overview-google-crawlers)
Mozilla/5.0 (Linux; Android 7.0; SM-G930V Build/NRD90M) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.125 Mobile Safari/537.36 (compatible; Google-Read-Aloud; +https://developers.google.com/search/docs/crawling-indexing/overview-google-crawlers)
Former agent (deprecated): google-speakr
Google Favicon
User agent token
Full user agent string
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.75 Safari/537.36 Google Favicon
Google StoreBot
User agent token
Storebot-Google
Full user agent strings
Desktop agent: Mozilla/5.0 (X11; Linux x86_64; Storebot-Google/1.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36
Mobile agent: Mozilla/5.0 (Linux; Android 8.0; Pixel 2 Build/OPD3.170816.012; Storebot-Google/1.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Mobile Safari/537.36
Google Site Verifier
User agent token
Google-Site-Verification
Full user agent string
Mozilla/5.0 (compatible; Google-Site-Verification/1.0)
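One practical use of the user agent tokens above is classifying crawler hits in your server logs. The following is only an illustrative sketch: the token list is abbreviated, and the `crawler_token` helper is a hypothetical example, not part of any Google tooling.

```python
# Abbreviated list of user agent tokens from the table above.
# More specific tokens come first, because some tokens (e.g. AdsBot-Google)
# are substrings of others (e.g. AdsBot-Google-Mobile).
TOKENS = [
    "AdsBot-Google-Mobile-Apps",
    "AdsBot-Google-Mobile",
    "AdsBot-Google",
    "Mediapartners-Google",
    "Googlebot-Image",
    "Googlebot-News",
    "Googlebot-Video",
    "Storebot-Google",
    "Googlebot",
]

def crawler_token(user_agent):
    """Return the first matching Google crawler token in a UA string, or None."""
    for token in TOKENS:
        if token in user_agent:
            return token
    return None

ua = ("Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) "
      "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 "
      "Mobile Safari/537.36 (compatible; Googlebot/2.1; "
      "+http://www.google.com/bot.html)")
print(crawler_token(ua))  # Googlebot
```

Note that a matching user agent string alone does not prove a request came from Google, since any client can claim these strings.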
A note about Chrome/W.X.Y.Z in user agents
Wherever you see the string Chrome/W.X.Y.Z in the user agent strings in the table, W.X.Y.Z is a placeholder that represents the version of the Chrome browser used by that user agent: for example, 41.0.2272.96. This version number increases over time to match the latest Chromium release version used by Googlebot. If you are searching your logs or filtering your server for a user agent with this pattern, use wildcards for the version number rather than specifying an exact version number.
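As a sketch of the wildcard approach, the following regular expression matches the Googlebot smartphone user agent from the table regardless of the Chrome version number (the pattern itself is an illustrative assumption, not an official one):

```python
import re

# Match any Chrome version (the W.X.Y.Z placeholder) instead of a fixed one.
pattern = re.compile(
    r"Chrome/[\d.]+ Mobile Safari/537\.36 \(compatible; Googlebot/2\.1;"
)

ua = ("Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) "
      "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.5112.79 "
      "Mobile Safari/537.36 (compatible; Googlebot/2.1; "
      "+http://www.google.com/bot.html)")

print(bool(pattern.search(ua)))  # True
```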
User agents in robots.txt
Where several user agents are recognized in the robots.txt file, Google will follow the most specific. If you want all of Google to be able to crawl your pages, you don't need a robots.txt file at all. If you want to block or allow all of Google's crawlers from accessing some of your content, you can do this by specifying Googlebot as the user agent. For example, if you want all your pages to appear in Google Search, and if you want AdSense ads to appear on your pages, you don't need a robots.txt file. Similarly, if you want to block some pages from Google altogether, blocking the Googlebot user agent will also block all of Google's other user agents.
But if you want more fine-grained control, you can get more specific. For example, you might want all your pages to appear in Google Search, but you don't want images in your personal directory to be crawled. In this case, use robots.txt to disallow the Googlebot-Image user agent from crawling the files in your personal directory (while allowing Googlebot to crawl all files), like this:
User-agent: Googlebot
Disallow:

User-agent: Googlebot-Image
Disallow: /personal

To take another example, say that you want ads on all your pages, but you don't want those pages to appear in Google Search. Here, you'd block Googlebot, but allow the Mediapartners-Google user agent, like this:

User-agent: Googlebot
Disallow: /

User-agent: Mediapartners-Google
Disallow:

Some pages use multiple robots meta tags to specify directives for different crawlers, like this:

<meta name="robots" content="nofollow">
<meta name="googlebot" content="noindex">

In this case, Google will use the sum of the negative directives, and Googlebot will follow both the noindex and nofollow directives. For more detail, see the documentation on controlling how Google crawls and indexes your site.
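You can sanity-check rules like the first example with Python's standard urllib.robotparser. This is only a sketch: the rules are fed in directly rather than fetched, and note that Python's parser applies the first matching group in file order (and matches agents by substring), whereas Google itself follows the most specific user agent regardless of order, so the more specific group is listed first here.

```python
from urllib.robotparser import RobotFileParser

# Googlebot-Image group first, so the Python parser doesn't match
# "Googlebot-Image" against the broader "Googlebot" group.
rules = """
User-agent: Googlebot-Image
Disallow: /personal

User-agent: Googlebot
Disallow:
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# Googlebot may crawl everything; Googlebot-Image is blocked from /personal.
print(parser.can_fetch("Googlebot", "/personal/photo.jpg"))        # True
print(parser.can_fetch("Googlebot-Image", "/personal/photo.jpg"))  # False
print(parser.can_fetch("Googlebot-Image", "/index.html"))          # True
```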
Controlling crawl speed
Each Google crawler accesses sites for a specific purpose and at different rates. Google uses algorithms to determine the optimal crawl rate for each site. If a Google crawler is crawling your site too often, you can reduce the crawl rate.
Retired Google crawlers
The following Google crawlers are no longer in use, and are only noted here for historical reference.
Duplex on the web
Supported the Duplex on the web service.

Web Light
Checked for the presence of the no-transform header whenever a user clicked your page in search under appropriate conditions. The Web Light user agent was used only for explicit browse requests of a human visitor, and so it ignored robots.txt rules, which are used to block automated crawling requests.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2022-12-19 UTC.