Entireweb

About Entireweb Search Technology

Entireweb is committed to the goal of providing the freshest and most useful search database on the planet. We believe that one key element of a great search engine is including the best pages and keeping them updated; a huge index alone does not guarantee great search results.

Entireweb's index currently encompasses several hundred million documents and receives about 100 million searches every month.

Features of Entireweb Search Technology

Here are some key features of our custom-developed search technology:

  • Super-scalable architecture. Entireweb Search Technology (EST) is designed to handle collections of over 100 billion documents distributed across thousands of servers, at loads of hundreds of queries per second. Entireweb's main index today consists of several hundred million full-text indexed documents.
  • Advanced query capabilities. The search engine handles complex hierarchical AND, OR, NOT, phrase and proximity queries, along with several special search modes, such as field and meta tag search and restrictions by language, region, continent, IP address, host, file type and much more.
  • Quick. Typical query response times are below 0.2 seconds, even at loads of several hundred queries per second.
  • File types. We are currently able to full-text index HTML, PDF, Word, Excel, PowerPoint, plain text and XML documents. A plug-in architecture makes adding new parsers trivial.
  • Intelligent crawler. With one of the smartest crawling algorithms available, Entireweb makes sure to index important and frequently changing documents often, while not wasting resources on other documents. Our crawlers currently handle tens of millions of documents every day.
  • Extreme fault tolerance. Every server in the search engine can be arbitrarily replicated and run either as a hot spare or for load balancing.
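
To illustrate what hierarchical AND, OR, NOT and phrase queries mean in practice, here is a minimal sketch in Python. It is not Entireweb's implementation (the production engine is written in C and its internals are not described here); it assumes a small in-memory positional inverted index, and the names `build_index`, `evaluate` and the tuple-based query format are hypothetical, chosen only for this example.

```python
from collections import defaultdict


def build_index(docs):
    """Build a positional inverted index: term -> {doc_id: [word positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def evaluate(query, index, all_ids):
    """Evaluate a nested query tuple and return the set of matching doc ids.

    Query format (an assumption of this sketch):
      ("TERM", word), ("AND", q1, q2, ...), ("OR", q1, q2, ...),
      ("NOT", q), ("PHRASE", word1, word2, ...)
    """
    op, *args = query
    if op == "TERM":
        return set(index.get(args[0].lower(), {}))
    if op == "AND":
        return set.intersection(*(evaluate(q, index, all_ids) for q in args))
    if op == "OR":
        return set.union(*(evaluate(q, index, all_ids) for q in args))
    if op == "NOT":
        return all_ids - evaluate(args[0], index, all_ids)
    if op == "PHRASE":
        words = [w.lower() for w in args]
        # Only documents containing every word can contain the phrase.
        candidates = evaluate(("AND", *(("TERM", w) for w in words)), index, all_ids)
        hits = set()
        for doc_id in candidates:
            # The phrase matches if the words occur at consecutive positions.
            for start in index[words[0]][doc_id]:
                if all(start + i in index[w][doc_id] for i, w in enumerate(words)):
                    hits.add(doc_id)
                    break
        return hits
    raise ValueError(f"unknown operator: {op}")


# Tiny made-up collection, for illustration only.
docs = {
    1: "fast search engine",
    2: "search engine index",
    3: "fast index crawler",
}
index = build_index(docs)
all_ids = set(docs)

# Documents that contain "fast" but not "engine".
fast_not_engine = evaluate(("AND", ("TERM", "fast"), ("NOT", ("TERM", "engine"))), index, all_ids)

# Documents that contain the exact phrase "search engine".
phrase_hits = evaluate(("PHRASE", "search", "engine"), index, all_ids)
```

Because queries are nested tuples, operators can be combined to any depth, which is what "hierarchical" refers to. A proximity query could be added in the same style by relaxing the consecutive-position check to a maximum distance between word positions.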

100% Developed in Sweden

Entireweb's search technology, which is unique in Europe, was developed by the Entireweb team in Halmstad, Sweden. All software used in the search engine is owned and developed by Entireweb. The company therefore builds its next-generation search technologies on the experience and know-how gained from successfully developing and running global-scale search technology over the last 8 years.

Entireweb embraces Linux as its primary platform. The core search technology consists of hundreds of thousands of lines of C code, and runs on a large cluster of servers running a highly customized version of Linux.