How search engines changed with proxies

Search engines all keep their indexing, ranking, and sorting algorithms under lock and key. During the early days of the most popular search engine – Google – ranking signals were relatively simple. Those were the days when keywords and backlinks were almost all that was needed.

If you were to ask how many ranking factors Google has today, the honest answer is that no one knows exactly. But most SEO professionals will tell you there are well over 200 of them.

Clearly, search engines have become significantly more complex. Part of the reason is that the web itself has grown vastly larger and more complicated. But search engines are also more complex today because people are constantly trying to reverse engineer the hidden ranking factors for their own benefit.

Do search engines themselves need proxies?

If you were to build a search engine from scratch today, you would need proxy servers to make it function properly. The underlying technology of a search engine is relatively simple. It runs an automated script that crawls through a website, downloads the HTML, and analyzes the content.

Assuming the content is deemed relevant, it’s added to an index. Users can then type a query into the search bar, which runs through the index to find the content they need. Of course, the internet is now so vast and advanced that such a simplistic search engine would be considered a failure.
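The crawl-index-search loop described above can be sketched in a few lines. This is a minimal illustration, not a production crawler – the relevance check is skipped, and all function names and sample data are invented for the example.

```python
# Minimal sketch of a search engine's core loop: crawl a page,
# extract its text, build an inverted index, and query it.
# All names and data here are illustrative, not from a real system.
from html.parser import HTMLParser
from urllib.request import urlopen

class TextExtractor(HTMLParser):
    """Collects the visible text from a downloaded HTML page."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())

def crawl(url):
    """Download a page's HTML and return its text content."""
    html = urlopen(url).read().decode("utf-8", errors="replace")
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.chunks)

def build_index(pages):
    """Map each word to the set of pages it appears on (an inverted index)."""
    index = {}
    for url, text in pages.items():
        for word in set(text.lower().split()):
            index.setdefault(word, set()).add(url)
    return index

def search(index, query):
    """Return the pages that contain every word in the query."""
    results = [index.get(word, set()) for word in query.lower().split()]
    return set.intersection(*results) if results else set()
```

Real engines layer ranking signals, deduplication, and politeness rules on top of this skeleton, but the crawl-then-index structure is the same.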

Since a crawler needs to send thousands or even millions of requests to a website to index every piece of content, it’s likely your regular IP address would get banned. Additionally, some content may only be accessible if you’re from a particular country or region, making proxies a necessity.

Websites rarely ban established search engines like Google, as nearly everyone wants their content indexed. These engines still need to acquire location-based content, but they likely use their own servers instead of third-party proxy providers – although we can’t be certain.

How proxies impacted search engines

Then there’s the other end of the spectrum. SEO professionals and enthusiasts have always wanted to figure out how search engines rank websites and what influences the top positions.

If you were a regular user, that would be impossible to figure out. Although Google provides guidelines, most of them are vague (such as “produce good content”). SEO professionals, however, have figured out that they can scrape search engines at scale and combine the results with internal data from websites to start gaining insights into ranking factors.

Proxies have to be used, however, as most search engines quickly ban users that are sending too many requests. Additionally, localized content won’t be served effectively without hundreds of IP addresses.
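The basic idea of spreading requests across many IP addresses can be sketched as follows. The proxy addresses below are placeholders from the reserved TEST-NET range, and the function name is invented – in practice the pool would come from a proxy provider.

```python
# Sketch of rotating scraper requests across a proxy pool so that
# no single IP address sends too many requests.
# The proxy addresses are placeholders (TEST-NET range), not real proxies.
import itertools
import urllib.request

PROXIES = [
    "http://203.0.113.10:8080",
    "http://203.0.113.11:8080",
    "http://203.0.113.12:8080",
]

# cycle() loops over the pool forever, so requests are spread evenly.
proxy_cycle = itertools.cycle(PROXIES)

def fetch_via_proxy(url):
    """Send the request through the next proxy in the rotation."""
    proxy = next(proxy_cycle)
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    )
    return opener.open(url, timeout=10).read()
```

A geotargeted scraper would extend this by grouping proxies per country, so that localized search results are fetched through an IP address in the right region.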

As a result, SEO professionals created dedicated tools (which are also sold to other professionals) to understand how and why search engines rank some content over other content. Some of these insights can be remarkably accurate and specific.

A great example of understanding and abusing ranking systems was an SEO contest held back in 2018: someone managed to rank a rhinoplasty website in English search results even though the entire website was written in Latin.

Search engines, of course, know that there are tools that purport to reveal ranking factors. So, if those insights become too accurate, search engines have to tweak their algorithms to prevent people from abusing rankings.

Even when it serves legitimate SEO use cases, people knowing too much about how search engines rank content can cause issues. The result is that search engines have to play a constant cat-and-mouse game.

And that cat-and-mouse game is only possible because of large-scale data collection from search engines, which in turn depends on proxies. So, in the end, proxies have a great deal of influence over the way search engines function.

It’s all too easy to see proxies as negative – as tools that reveal some sort of secret ranking system and allow people to abuse it. But many of the changes and tweaks made to the algorithms over the years have been aimed at improving search results.

In some sense, SEO tools and proxy-powered web data collection provide indirect competition to search engines. They don’t take away these companies’ revenue, but they do nudge them to constantly improve and tweak their ranking algorithms – benefiting everyone in the long run.


This article was produced as part of TechRadarPro’s Expert Insights channel where we feature the best and brightest minds in the technology industry today. The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you are interested in contributing find out more here: https://www.techradar.com/news/submit-your-story-to-techradar-pro