host.tools

robots.txt parser

HTTP /api/v1/http/robots

Fetch and parse robots.txt — User-agent groups, Disallow/Allow, Sitemap, Crawl-delay.

https://imkeo.app/robots.txt · HTTP 200 · 2483 bytes · 0 User-agent groups
Raw robots.txt (as returned by the server: an HTML page, not plaintext)
<!DOCTYPE html>
<html lang="zh-CN">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>360搜索,SO靠谱</title>
    <style>
        body {
            margin: 0;
            padding: 0;
            height: 100vh;
            display: flex;
            justify-content: center;
            align-items: center;
            background-color: #f0f0f0;
        }
        iframe {
            width: 100%;
            height: 100%;
            border: none;
        }
    </style>
    <script type="text/javascript">
        var gk_isXlsx = false;
        var gk_xlsxFileLookup = {};
        var gk_fileData = {};
        function filledCell(cell) {
          return cell !== '' && cell != null;
        }
        function loadFileData(filename) {
        if (gk_isXlsx && gk_xlsxFileLookup[filename]) {
            try {
                var workbook = XLSX.read(gk_fileData[filename], { type: 'base64' });
                var firstSheetName = workbook.SheetNames[0];
                var worksheet = workbook.Sheets[firstSheetName];

                // Convert sheet to JSON to filter blank rows
                var jsonData = XLSX.utils.sheet_to_json(worksheet, { header: 1, blankrows: false, defval: '' });
                // Filter out blank rows (rows where all cells are empty, null, or undefined)
                var filteredData = jsonData.filter(row => row.some(filledCell));

                // Heuristic to find the header row by ignoring rows with fewer filled cells than the next row
                var headerRowIndex = filteredData.findIndex((row, index) =>
                  row.filter(filledCell).length >= filteredData[index + 1]?.filter(filledCell).length
                );
                // Fallback
                if (headerRowIndex === -1 || headerRowIndex > 25) {
                  headerRowIndex = 0;
                }

                // Convert filtered JSON back to CSV
                var csv = XLSX.utils.aoa_to_sheet(filteredData.slice(headerRowIndex)); // Create a new sheet from filtered array of arrays
                csv = XLSX.utils.sheet_to_csv(csv, { header: 1 });
                return csv;
            } catch (e) {
                console.error(e);
                return "";
            }
        }
        return gk_fileData[filename] || "";
        }
        </script>
</head>
<body>
    <iframe src="https://www.so.com/" title="百度首页"></iframe>
</body>
</html>
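The HTML body above explains the header line: the server answered this 200 with a web page rather than a plaintext robots.txt, so there are no directives for the parser to group. A minimal sketch of the same sanity check in Python, standard library only (the group-counting heuristic is illustrative, not necessarily what host.tools runs):

# Fetch the same URL and decide whether the body is actually robots.txt.
import urllib.request

url = "https://imkeo.app/robots.txt"
with urllib.request.urlopen(url, timeout=10) as resp:
    content_type = resp.headers.get("Content-Type", "")
    body = resp.read().decode("utf-8", errors="replace")

# A valid robots.txt is plain text; an HTML body (as returned here)
# yields zero parseable groups even though the status is 200.
if "text/html" in content_type.lower() or body.lstrip().lower().startswith(("<!doctype", "<html")):
    print("Server returned HTML, not robots.txt -> 0 User-agent groups")
else:
    # Heuristic group count: a run of consecutive User-agent lines
    # opens one group (RFC 9309 groups rules this way).
    groups, prev_was_ua = 0, False
    for line in body.splitlines():
        field = line.split("#", 1)[0].strip().lower()
        is_ua = field.startswith("user-agent:")
        if is_ua and not prev_was_ua:
            groups += 1
        prev_was_ua = is_ua
    print(f"{groups} User-agent groups")

Run against https://imkeo.app, this prints the HTML warning, matching the 0-group result above.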
How to use robots.txt parser
  1. Paste your input

     Enter the value at the top — domain, IP, URL, email, ASN, hash, whatever fits this tool. The smart input auto-detects type.

  2. Click "Inspect"

     host.tools issues real probes (DNS, HTTP, TCP, TLS, WHOIS where applicable) and renders the result in milliseconds.

  3. Open the API tab

     Every web tool has a sibling /api/v1/http/robots JSON endpoint with the same payload. One copy-as-curl click and you're scripting it; see the example under "API equivalent" below.

Why this matters

robots.txt is how a site declares its crawl policy to search engines and bots. Auditing it catches accidental blanket Disallows, sensitive paths advertised in plain text, and misconfigured servers that answer with HTML instead of a parseable file, exactly the failure captured above.

API equivalent
/api/v1/http/robots?q=https%3A%2F%2Fimkeo.app
curl -s 'https://host.tools/api/v1/http/robots?q=https%3A%2F%2Fimkeo.app'
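A sketch of the same call from Python, standard library only. The https://host.tools base URL is inferred from the site name, and no response fields are guessed: the payload is pretty-printed so you can read the real schema off the copy-as-curl output.

# Query the JSON endpoint and dump whatever comes back.
import json
import urllib.parse
import urllib.request

target = "https://imkeo.app"  # any URL or domain the tool accepts
api = ("https://host.tools/api/v1/http/robots?q="  # base URL assumed
       + urllib.parse.quote(target, safe=""))

with urllib.request.urlopen(api, timeout=10) as resp:
    payload = json.load(resp)

# Print the full payload rather than guessing field names.
print(json.dumps(payload, indent=2))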
Embed this tool
<iframe src="https://host.tools/http/robots?q={INPUT}&embed=1"
  width="100%" height="600" frameborder="0"></iframe>

Drop into any HTML page. The embed=1 flag hides nav and footer.
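One gotcha: {INPUT} must be percent-encoded before it is substituted, or characters like : and / in a target URL break the query string. A sketch of building the embed snippet, assuming the same https://host.tools base URL:

# Percent-encode the target before dropping it into the embed URL.
from urllib.parse import quote

target = "https://imkeo.app"
src = "https://host.tools/http/robots?q=" + quote(target, safe="") + "&embed=1"
print(f'<iframe src="{src}" width="100%" height="600" frameborder="0"></iframe>')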

FAQ · robots.txt parser

Common questions

Is robots.txt parser free?
Yes — every tool is free on the web with a 200/hour rate limit per IP. The matching API endpoint /api/v1/http/robots is free up to 100 requests/hour, no key required.
Where does the data come from?
Real-time probes against authoritative sources (DNS root, RIRs, registries, the target server itself), plus partner data feeds from hostinfo.com (GeoIP/ASN) and hostcheck.com (reputation).
How fresh are the results?
Live by default. Cached for 5 minutes to make repeat queries instant; pass ?nocache=1 for a forced refresh.
Can I run this from the command line?
Yes — every tool ships with a copy-as-curl. There's also an official CLI: host.tools http robots YOUR_INPUT.
Can I monitor results over time?
Pro tier lets you schedule any tool to run every 1/5/15/60 min and alert on diff. See monitors.
host.tools Pro

Run robots.txt parser on a schedule. Get pinged when it changes.

Pro gets you bulk lookups, monitors, webhook alerts, history, exports and 10,000 API calls/day. $19/mo.

  • Schedule any tool — every 1, 5, 15, 60 min
  • Diff against last run, alert on change
  • Webhook + email + Slack + PagerDuty + OpsGenie
  • Bulk CSV upload, 1,000 inputs per job
  • Export results as CSV / NDJSON / Excel
  • 90-day history, comparison view
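
Pro monitors handle the scheduling, diffing, and alerting for you. As a rough free-tier stand-in (and a sketch of the mechanism), the loop below polls the JSON endpoint and reports when the payload changes; the https://host.tools base URL is assumed, and the 15-minute interval stays well under the free 100 requests/hour API limit.

# Poll the robots endpoint, diff against the last payload, report changes.
import json
import time
import urllib.parse
import urllib.request

API = ("https://host.tools/api/v1/http/robots?q="  # base URL assumed
       + urllib.parse.quote("https://imkeo.app", safe=""))

last = None
while True:
    with urllib.request.urlopen(API, timeout=10) as resp:
        current = json.load(resp)
    if last is not None and current != last:
        print(f"{time.ctime()}: robots.txt result changed")  # hook your alert here
    last = current
    time.sleep(15 * 60)  # 4 calls/hour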