AI search company Perplexity is suspected of stealth crawling websites, ignoring established guidelines. Allegations include changing identities to mimic regular browsers and evading firewalls. Cloudflare's testing revealed that Perplexity's crawlers, when blocked, adopted generic user agents and used varied IPs and ASNs. This behavior violates accepted internet standards, particularly the robots.txt protocol that governs bot access to site content. Cloudflare emphasizes the importance of crawler transparency amid increasing AI usage for information gathering, which is critical for the integrity of website operations.
Perplexity crawls websites under its own name before switching methods to avoid detection when traffic is blocked, violating internet standards and guidelines.
Cloudflare found that crawlers altered their identity by mimicking regular browsers like Chrome and evading firewall rules with changing IP addresses and ASNs.
Perplexity's techniques of stealth crawling challenge the transparency standards for bots, raising concerns over ethical practices in AI-driven information retrieval.
The robots.txt protocol allows website owners to control automated access, and Cloudflare stresses the necessity of transparency regarding crawler identity and purpose.
Collection
[
|
...
]