The basic algorithm is quite straightforward; it's the scale and the edge cases that make it hard to compete.
"Ideally", from a pure data perspective, everyone would have all the data and all the processing power to search through it on their own with whatever algorithm they prefer: a massive P2P network of per-person datacenters.
Back in reality, that's wildly impractical. So we get a few search engines with huge entry costs, each offering more value the larger it gets… which leads to lock-in, attempts to game their algorithms, filtering, monetization, and all the other familiar issues.
There is an experimental distributed open source search engine: https://dawnsearch.org/
It has a series of issues of its own, though.
Per-user weighting was out of the reach of hardware 20 years ago… and is still out of reach for anything other than very large distributed systems. No single machine is currently capable of holding even the index for the ~200 million active websites in RAM, much less the ~800 billion webpages in the Wayback Machine. Multiple page attributes… yes, that would be great, but again things escalate quickly. The closest "hope" would be some sort of LLM on the scale of hundreds of trillions of parameters… and even that might fall short.
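To put rough numbers on that, here's a back-of-envelope sizing sketch. The per-site page count and the bytes-per-page figure are pure assumptions (real indexes vary wildly); only the 200 million / 800 billion figures come from above.

```python
# Back-of-envelope index sizing. Every per-unit figure is an assumption.
active_sites = 200_000_000         # ~active websites (figure from above)
pages_per_site = 100               # ASSUMED average pages per site
index_bytes_per_page = 1_000       # ASSUMED compressed postings + metadata

site_index = active_sites * pages_per_site * index_bytes_per_page
wayback_pages = 800_000_000_000    # ~pages in the Wayback Machine (from above)
wayback_index = wayback_pages * index_bytes_per_page

print(site_index / 1e12, "TB")     # ~20 TB: feasible on disk, not in RAM
print(wayback_index / 1e15, "PB")  # ~0.8 PB: beyond any single machine
```

Even with these deliberately optimistic assumptions, the active-web index lands in the tens of terabytes (disk-plausible, hopeless for RAM-speed per-user reranking) and the archive-scale index in petabytes.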
Distributed indexes, with queries getting shared among peers, mean that privacy goes out the window. Homomorphic encryption could potentially help with that, but it demands even more hardware.
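To illustrate the idea (and why it's so expensive), here is a toy, textbook Paillier sketch, a scheme that is additively homomorphic: a peer could sum encrypted relevance scores without ever seeing them. This is purely illustrative, not anything DawnSearch actually does; the primes are tiny and insecure, and real deployments use ~2048-bit moduli, which is where the hardware cost comes from.

```python
import math
import random

def paillier_keygen(p=1009, q=1013):
    # Toy primes for illustration only; real keys use ~1024-bit primes each.
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)           # valid because we fix g = n + 1 below
    return (n,), (lam, mu, n)

def encrypt(pub, m):
    (n,) = pub
    n2 = n * n
    r = random.randrange(1, n)     # random blinding factor, coprime to n
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    # c = g^m * r^n mod n^2, with g = n + 1
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(priv, c):
    lam, mu, n = priv
    n2 = n * n
    l = (pow(c, lam, n2) - 1) // n  # the L(x) = (x - 1) / n function
    return (l * mu) % n

pub, priv = paillier_keygen()
a = encrypt(pub, 42)
b = encrypt(pub, 58)
# Homomorphic property: multiplying ciphertexts adds the plaintexts.
total = decrypt(priv, (a * b) % (pub[0] ** 2))
print(total)  # 100, computed without decrypting a or b individually
```

Note that even this single addition costs several big-integer modular exponentiations; scaling that to ranking billions of documents per query is the hardware problem in a nutshell.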
TL;DR: it’s being researched, but it’s hard.