Billions of Google searches every day….
The Google search engine is a tool we have grown to rely on as an integral part of our modern lives, but have you ever stopped to consider how it works? Here’s a bite-sized summary of how Google returns accurate results from over 60 trillion web pages in a fraction of a second.
Crawling and Indexing
When you search the web using Google you are not actually searching the web, you are searching Google’s index of the web. The index contains over 60 trillion individual pages and over 100 million gigabytes of data.
Google uses software known as web crawlers, or “spiders”, to discover publicly available web pages. Crawlers follow the links found on web pages (much as you would when browsing the web) and send information from those pages back to Google’s servers. The crawl process starts with a list of web addresses from past crawls, as well as sitemaps provided by website owners. The crawlers literally go from link to link, paying particular attention to new sites, changes to existing sites and dead links they find.
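The crawl process described above can be sketched as a breadth-first walk over links. This is only a toy model: the “web” here is an in-memory dictionary of made-up addresses standing in for real HTTP fetches, not anything Google actually runs.

```python
from collections import deque

# A toy "web": each address maps to the list of links found on that page.
# All addresses are hypothetical stand-ins for real pages.
WEB = {
    "a.com": ["b.com", "c.com"],
    "b.com": ["a.com", "dead.com"],
    "c.com": ["b.com"],
}

def crawl(seeds):
    """Breadth-first crawl: follow links outward from a seed list,
    recording every page discovered and any dead links along the way."""
    frontier = deque(seeds)            # pages waiting to be fetched
    discovered, dead = set(seeds), set()
    while frontier:
        page = frontier.popleft()
        links = WEB.get(page)
        if links is None:              # the page no longer exists: a dead link
            dead.add(page)
            continue
        for link in links:
            if link not in discovered:
                discovered.add(link)
                frontier.append(link)
    return discovered - dead, dead
```

Starting from past crawls (the seed list), the frontier grows as new links are found, which is why new sites are picked up simply by being linked to.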
As Google gathers information via the crawl process it creates an index of that information – a bit like the index you’d find at the back of a book. The index contains information relating to the words and their locations, as well as when pages were published, whether those pages contain pictures and videos, which other websites link to the pages and much more.
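The book-index analogy maps neatly onto what is usually called an inverted index: each word points to the pages it appears on. A minimal sketch, using hypothetical page content (a real index would also store positions, dates, media and link data, as the paragraph above notes):

```python
# Hypothetical pages standing in for crawled documents.
pages = {
    "example.com/coffee": "how to brew coffee at home",
    "example.com/tea":    "how to brew tea",
}

def build_index(pages):
    """Map each word to the set of pages it appears on --
    the same idea as an index at the back of a book."""
    index = {}
    for url, text in pages.items():
        for word in text.split():
            index.setdefault(word, set()).add(url)
    return index

index = build_index(pages)
```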
At the most basic level, when you do a search, Google looks up your search words in its index and returns the pages that contain those words.
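That basic lookup step amounts to intersecting the page sets for each query word. A sketch, using a tiny hand-written index (the words and addresses are purely illustrative):

```python
# A tiny hand-written index: word -> pages containing it.
index = {
    "brew":   {"example.com/coffee", "example.com/tea"},
    "coffee": {"example.com/coffee"},
    "tea":    {"example.com/tea"},
}

def search(index, query):
    """Return the pages that contain every word of the query:
    the intersection of each word's page set."""
    postings = [index.get(word, set()) for word in query.split()]
    if not postings:
        return set()
    return set.intersection(*postings)
```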
For an average search query there are thousands, if not millions, of web pages that contain useful information. Algorithms are the computer processes that interpret your query and turn it into answers. Google’s current algorithms use more than 200 unique signals to best guess what you might be looking for. These signals include the words on the site, the freshness of the content, the domain’s authority and the location you are searching from.
Based on these clues, Google pulls relevant documents from its index and ranks the results in order of perceived importance.
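One simple way to picture ranking by many signals is a weighted score per page. The signal names, scores and weights below are entirely made up for illustration; Google’s actual 200+ signals and their weighting are not public.

```python
# Hypothetical per-page signal scores in [0, 1].
candidates = {
    "example.com/new": {"word_match": 0.9, "freshness": 0.8, "authority": 0.3},
    "example.com/old": {"word_match": 0.9, "freshness": 0.2, "authority": 0.9},
}

# Made-up weights for illustration only.
WEIGHTS = {"word_match": 0.5, "freshness": 0.2, "authority": 0.3}

def rank(candidates):
    """Order pages by a weighted combination of their signal scores,
    highest perceived importance first."""
    def score(signals):
        return sum(WEIGHTS[name] * value for name, value in signals.items())
    return sorted(candidates, key=lambda page: score(candidates[page]),
                  reverse=True)
```

Here the established domain outranks the fresher page because authority is weighted more heavily than freshness, which shows how the chosen weights, not any single signal, decide the final order.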
All this happens in about 1/8th of a second.
Google’s algorithms are constantly changing. The changes begin as ideas in the minds of Google’s technicians; after experimentation and testing, they are rolled out across the web.
Google continually fights web spam in order to maintain the quality and relevance of its results. Most spam removal is done automatically by algorithmic filters; however, Google sometimes examines questionable documents by hand and, if it finds them to be spam, takes manual action against the webmasters responsible.
Google updates its algorithm as many as 500 times per year. This is done to counter webmasters who try to manipulate the search results in their own favour using manufactured SEO tactics, and to ensure the most relevant results for users.
The Search Engine
Behind your simple page of search results is a deep and complex system, carefully designed and tested by Google’s technicians to provide relevant results for more than one hundred billion searches per month.
Will you think differently about how Google works next time you use the search engine?