Where do I put robots.txt?

It must live at the site root, accessible at https://example.com/robots.txt. Subdomains each need their own copy because crawlers don't share rules across them. If your site runs on HTTPS, make sure the HTTPS host serves the file too.
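
As a rough illustration of the "one file per host" rule, the sketch below derives the robots.txt location that governs any given page URL. The function name `robots_txt_url` and the sample URLs are made up for this example; only the standard library is used.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL that governs page_url (hypothetical helper)."""
    parts = urlsplit(page_url)
    # Keep the scheme and host (including any port); drop path, query, fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://example.com/blog/post?id=7"))
print(robots_txt_url("https://shop.example.com/cart"))
```

The two calls return different locations because the second URL sits on a subdomain, which is why each subdomain needs its own file.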

What's the difference between Allow and Disallow?

Disallow names paths that should not be crawled. Allow grants explicit permission, which is mainly useful for carving exceptions out of a broader Disallow. For example, Disallow: /admin/ blocks the section, while a follow-up Allow: /admin/public/ keeps that one subdirectory crawlable.
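
One way to picture how an Allow exception overrides a broader Disallow is the longest-match rule described in the Robots Exclusion Protocol standard (RFC 9309): the rule with the longest matching path wins, and Allow wins a tie. The sketch below is a simplified model under that assumption. It handles plain path prefixes only (no wildcards), the function name `is_allowed` is hypothetical, and individual crawlers may behave differently.

```python
from typing import List, Tuple


def is_allowed(path: str, rules: List[Tuple[str, str]]) -> bool:
    """rules is a list of ("allow" | "disallow", path_prefix) pairs."""
    best_len = -1
    allowed = True  # no matching rule means the path may be crawled
    for kind, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # empty values and non-matching prefixes are skipped
        is_allow = kind == "allow"
        # The longer prefix wins; on equal length, Allow wins.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


rules = [("disallow", "/admin/"), ("allow", "/admin/public/")]
print(is_allowed("/admin/settings", rules))    # False
print(is_allowed("/admin/public/faq", rules))  # True
print(is_allowed("/blog/", rules))             # True
```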

Can I block specific bots?

Yes. A User-agent block targets a single crawler by name. User-agent: BadBot followed by Disallow: / locks that bot out entirely, while a separate User-agent: * section keeps everyone else under different rules. A crawler follows the group that most specifically matches its name, wherever that group sits in the file, so ordering the file from specific to general helps human readers rather than changing the result.
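
To see per-bot groups in action, the sketch below feeds a small file to Python's standard `urllib.robotparser`. The bot names and URLs are placeholders, and the wildcard group is deliberately listed first to show that this parser picks the named group regardless of position. It is one parser's behavior, not a guarantee about every crawler.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

User-agent: BadBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# BadBot matches its own group and is blocked everywhere.
print(parser.can_fetch("BadBot", "https://example.com/blog/"))      # False
# Any other bot falls back to the wildcard group.
print(parser.can_fetch("OtherBot", "https://example.com/blog/"))    # True
print(parser.can_fetch("OtherBot", "https://example.com/admin/x"))  # False
```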

What is crawl budget?

Crawl budget is the number of URLs a search engine is willing and able to fetch from a site in a given period. On a large site, that ceiling matters because the crawler may not reach every page often enough. Blocking low-value paths in robots.txt steers the budget toward important pages and is a common technical SEO win for ecommerce and big content sites.

Should I include a sitemap?

Yes, always. Adding a Sitemap: line costs nothing and helps crawlers find your full URL list. You can include multiple Sitemap lines if your site uses several files, and you should still submit them directly in Google Search Console for the live status reporting.
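
For a quick check that multiple Sitemap lines are picked up, the sketch below reads them back with the standard library's `RobotFileParser.site_maps()` (available in Python 3.8 and later). The two sitemap file names are invented for the example.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap-pages.xml
Sitemap: https://example.com/sitemap-products.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns a list of URLs, or None when the file has no Sitemap line.
sitemaps = parser.site_maps()
for url in sitemaps or []:
    print(url)
```

Sitemap lines stand apart from the User-agent groups, so they can go anywhere in the file.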

Is robots.txt secure?

No, and treating it like a security mechanism is a frequent mistake. Honest crawlers honor the rules; hostile scrapers ignore them entirely. Anything that truly needs to stay private should sit behind authentication or network-level access controls instead.

Is the data sent anywhere?

No. The generator builds the file as plain text inside your browser. The path information you enter stays local until you upload the resulting file to your own server.

Robots.txt Generator

Generate robots.txt files online with crawler rules and sitemap directives. Free robots.txt generator for SEO and search engine control.

Rule #1

User-Agent

Disallow Paths

/admin/
/private/

Allow Paths (exceptions)

Sitemap URL

Crawl Delay (seconds)

Generated robots.txt

User-agent: *
Disallow: /admin/
Disallow: /private/

Important Notes:

Place robots.txt in your website root (e.g., example.com/robots.txt)
robots.txt is public - don't rely on it to hide sensitive URLs
Use Allow to create exceptions within Disallowed paths
Not all bots respect robots.txt - use authentication for sensitive content

About Robots.txt Generator

Generate robots.txt files to control how search engine crawlers access your website. Set rules for different user agents, specify allowed and disallowed paths, and include your sitemap URL.
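
The assembly step can be sketched in a few lines. The code below is not the generator's actual implementation. It is a minimal Python model of the same idea, and the `Rule` class and `build_robots_txt` function are hypothetical names. Crawl-delay is included because the form offers it, though it is a non-standard directive that some crawlers ignore.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Rule:
    """One User-agent group (hypothetical structure for this sketch)."""
    user_agent: str = "*"
    disallow: List[str] = field(default_factory=list)
    allow: List[str] = field(default_factory=list)
    crawl_delay: Optional[int] = None


def build_robots_txt(rules: List[Rule], sitemap: Optional[str] = None) -> str:
    blocks = []
    for rule in rules:
        lines = [f"User-agent: {rule.user_agent}"]
        # With no paths to block, emit an empty Disallow (allows everything).
        lines += [f"Disallow: {p}" for p in rule.disallow] or ["Disallow:"]
        lines += [f"Allow: {p}" for p in rule.allow]
        if rule.crawl_delay is not None:
            lines.append(f"Crawl-delay: {rule.crawl_delay}")
        blocks.append("\n".join(lines))
    if sitemap:
        blocks.append(f"Sitemap: {sitemap}")
    # Blank line between groups, single trailing newline at the end.
    return "\n\n".join(blocks) + "\n"


print(build_robots_txt(
    [Rule(disallow=["/admin/", "/private/"])],
    sitemap="https://example.com/sitemap.xml",
))
```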

How to Use Robots.txt Generator

Set the default policy

Decide whether the wildcard User-agent block should allow everything or restrict by default. Most sites start permissive and add specific Disallow rules from there.

Add the paths you want hidden

List the directories crawlers should skip, like /admin/, /private/, or internal search result pages. Add Allow rules to carve exceptions back out if needed.

Add per-crawler blocks if necessary

Want to lock out a specific scraper or grant Googlebot looser rules? Add a named User-agent section with its own Disallow and Allow lines.

Reference your sitemap

Drop in a Sitemap: line pointing at your sitemap.xml. It's a small addition that helps crawlers discover the full URL list, especially on sites with sparse internal linking.

Save and upload to the site root

Save the generated text as a file named robots.txt and upload it so it's reachable at https://your-domain/robots.txt. Crawlers won't find it anywhere else.

When to Use Robots.txt Generator

Telling crawlers where they're welcome

A well-formed robots.txt steers Googlebot, Bingbot, and the rest of the polite crawlers toward what matters and away from what doesn't. Common candidates for blocking are admin and login flows, internal search results, and development sections that shouldn't appear in the public index. The generator builds the syntax so you don't have to remember the exact directive ordering.

Keeping internal areas out of search

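Account pages, internal tools, staging environments, and API endpoints rarely belong in search results. robots.txt is the standard way to ask well-behaved crawlers to skip them. It's not a security boundary (malicious scrapers ignore it), and a disallowed URL can still show up in results if other pages link to it, so anything that must stay out of the index needs a noindex directive or authentication instead. For the bots that respect the protocol, though, it cleanly handles crawl access.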

Spending crawl budget on the right pages

Large sites have a finite amount of attention from search engines. If Googlebot wastes its allotment on faceted-navigation duplicates and calendar archives, your important pages get crawled less often. Disallowing low-value paths concentrates the budget where it matters and is one of the higher-leverage technical SEO levers on big sites.

Pointing crawlers at your sitemap

Including a Sitemap: line in robots.txt is the simplest way to make sure crawlers find your full URL list, especially on sites with sparse internal linking. It complements (rather than replaces) submitting the sitemap directly through Google Search Console and Bing Webmaster Tools.

Robots.txt Generator Examples

Open the doors to everyone

Input

Allow all crawlers everywhere

Output

User-agent: *
Disallow:
Sitemap: https://example.com/sitemap.xml

The most permissive robots.txt you can ship. The empty Disallow value means every path is fair game, and the Sitemap line still gives crawlers a hand. Plenty of small sites never need anything more elaborate.

Block a couple of paths

Input

Hide /admin/ and /private/, leave everything else

Output

User-agent: *
Disallow: /admin/
Disallow: /private/
Allow: /
Sitemap: https://example.com/sitemap.xml

Workhorse pattern for typical CMS-driven sites. Two Disallow rules carve out the protected sections, the explicit Allow keeps the rest open, and the sitemap reference helps crawlers reach every public page efficiently.

Per-crawler rules

Input

Block one specific bot, allow the rest

Output

User-agent: *
Disallow:

User-agent: BadBot
Disallow: /

User-agent: Googlebot
Allow: /

When a single misbehaving crawler causes problems, you can target it by name. The catch-all section permits everyone, the BadBot block forbids that one user agent entirely, and the explicit Googlebot section makes its allowance unambiguous.

Tips & Best Practices for Robots.txt Generator

1. robots.txt is a visibility convention, not a security control. Honest crawlers obey it; aggressive scrapers ignore it. Anything that genuinely needs to stay private should sit behind authentication or IP rules.
2. The file has to live at the site root, exactly at /robots.txt. Subdomains each need their own copy. Crawlers don't check anywhere else.
3. Validate before you ship. A misplaced slash can block an entire site from being crawled. The robots.txt report in Google Search Console shows the file Google fetched and flags parse errors, and the URL Inspection tool reports whether a specific URL is blocked.
4. Mind the difference between an empty Disallow value and Disallow: /. The first allows everything, the second blocks everything. One character, opposite outcomes.
5. More specific User-agent blocks beat the wildcard. If you have rules under both User-agent: * and User-agent: Googlebot, Googlebot follows its named block and ignores the catch-all entirely.
6. Always include the Sitemap line, even if you also submit the sitemap manually. It costs nothing and helps crawlers that arrive without prior knowledge of your site.
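
A local pre-upload check can catch the empty-Disallow versus Disallow: / mix-up before a crawler ever sees the file. The sketch below uses Python's standard `urllib.robotparser` and a hypothetical helper named `blocked_urls`, with placeholder URLs. The standard-library parser may not mirror every search engine's matching rules, so treat it as a first pass rather than a replacement for the search engines' own tools.

```python
from typing import List
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt: str, must_crawl: List[str], agent: str = "*") -> List[str]:
    """Return the URLs from must_crawl that the given rules would block."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in must_crawl if not parser.can_fetch(agent, url)]


# Pages that should always stay crawlable (placeholder URLs).
MUST_CRAWL = [
    "https://example.com/",
    "https://example.com/products/widget",
]

# Empty Disallow value: nothing is blocked.
print(blocked_urls("User-agent: *\nDisallow:\n", MUST_CRAWL))
# Disallow: / blocks every URL in the list.
print(blocked_urls("User-agent: *\nDisallow: /\n", MUST_CRAWL))
```

Running a check like this against a short list of key URLs makes an accidental site-wide block visible immediately.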

Frequently Asked Questions

What is robots.txt?

It's a small text file that lives at the root of a site and tells search engine crawlers which paths they're welcome to fetch. The syntax pairs User-agent lines with Disallow and Allow directives. The protocol dates back to 1994, was standardized as RFC 9309 in 2022, and is honored by every major crawler from Googlebot on down.
