aiHit - All about our web crawler

Welcome to the aiHit Search Engine Web Crawler

It is likely you have reached this server because you have seen its IP listed in your visitor logs and you are curious as to why we should be looking at your website.

Who we are and what we do

aiHit is a UK-based company, registered with both the Information Commissioner's Office and the DMA, which collects information about companies across the world to discover trends in various market sectors. We are only interested in company information. We deliberately do not collect information from personal websites, personal blogs, other websites detected as being personal in nature, or website directories. When we discover information that is clearly not about a company, we discard it. We have a number of servers in the UK, USA, and Ukraine which index company websites in mainly English-speaking countries. These servers always identify themselves as being owned by aiHit.

Unlike other search engines, we do not rank information according to some esoteric criteria. To us, all companies are equally important; only our customers can decide how companies should be ranked. Our customers include some of the world's leading company credit reference agencies, corporate data providers, and research companies.

Our policy for crawling

Our crawler is designed to actively prevent disruption to servers and websites. You will never see us making thousands of requests to your site in a very short period of time. If you see any crawler identifying itself as “aiHit” accessing your website faster than one or two requests per second, then it is another crawler that is illegally using our name in an attempt to appear legitimate. If you are a hosting provider, we may be crawling up to three websites at the same time across all your servers in a single subnet.
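If you would like to check the request rate in your own visitor logs, the following is a minimal sketch in Python. It assumes you have already extracted the request times (in seconds) for visits identifying themselves as “aiHit”. The function names and the threshold of two requests per second are illustrative assumptions based on the policy above, not part of any aiHit tool.

```python
from collections import deque


def peak_requests_per_second(timestamps, window=1.0):
    """Return the largest number of requests seen in any `window`-second span.

    `timestamps` is an iterable of request times in seconds (for example,
    Unix times parsed from your own access log).
    """
    recent = deque()  # requests that fall inside the current window
    peak = 0
    for t in sorted(timestamps):
        recent.append(t)
        # Drop requests that are a full window (or more) older than this one.
        while t - recent[0] >= window:
            recent.popleft()
        peak = max(peak, len(recent))
    return peak


def exceeds_stated_rate(timestamps, max_per_second=2):
    """True if the peak rate is above the assumed limit of two per second."""
    return peak_requests_per_second(timestamps) > max_per_second


# Hypothetical request times, in seconds, for one visitor.
sample = [0.0, 0.4, 0.9, 1.5, 5.0]
print(peak_requests_per_second(sample))
print(exceeds_stated_rate(sample))
```

A result above the limit suggests the visitor is not following the policy described above, and may therefore be another crawler using our name.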

Like all well-behaved search engines, our crawlers respect the rules provided by you in a “robots.txt” file. As defined by web standards, your “robots.txt” file must be in the root directory of your website. In other words, for a domain such as “abc.com” the “robots.txt” file must appear at “abc.com/robots.txt”. If you observe a crawler examining parts of your website that are specifically forbidden by a properly constructed “robots.txt” file, then it is another crawler using our name in an attempt to appear legitimate.
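To see how a set of rules would be interpreted, here is a minimal sketch using the robots.txt parser in Python's standard library. The sample rules, the “/private/” path, the “https” scheme, and the use of “aiHit” as the user-agent token are assumptions made for illustration; check your own logs for the exact name our crawler reports.

```python
from urllib.robotparser import RobotFileParser


def robots_url(domain):
    """Build the root-level robots.txt address for a domain (https assumed)."""
    return f"https://{domain}/robots.txt"


# Hypothetical rules: keep the crawler named "aiHit" out of /private/,
# and place no restrictions on any other crawler.
SAMPLE_RULES = """\
User-agent: aiHit
Disallow: /private/

User-agent: *
Disallow:
"""


def is_allowed(rules_text, user_agent, url):
    """Return True if `user_agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())  # parse in memory, no network access
    return parser.can_fetch(user_agent, url)


print(robots_url("abc.com"))
print(is_allowed(SAMPLE_RULES, "aiHit", "https://abc.com/private/report.html"))
print(is_allowed(SAMPLE_RULES, "aiHit", "https://abc.com/about.html"))
```

Python's parser is only one interpretation of the rules, so treat its answer as a guide when checking your file rather than as a definitive statement of how any particular crawler behaves.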

More about “robots.txt” and how to contact us

Before contacting us, it is important that you check your “robots.txt” file is properly formatted and has accurate rules.

Follow this link to discover more about “robots.txt” and how to contact us if you believe we have breached our policy.

About robots.txt...