Programmer and sysadmin (DevOps?), wannabe polymath in tech, science and the mind. Neurodivergent, disabled, burned out, and close to throwing in the towel, but still liking ponies 🦄 and sometimes willing to discuss stuff.

  • 5 Posts
  • 1.25K Comments
Joined 2 years ago
Cake day: June 26th, 2023

  • There is an experimental distributed open source search engine: https://dawnsearch.org/

    It has a series of issues of its own, though.

    Per-user weighting was out of the reach of hardware 20 years ago… and is still out of the reach of anything other than very large distributed systems. No single machine is currently capable of holding even the index for the ~200 million active websites, much less the ~800 billion webpages in the Wayback Machine. Multiple page attributes… yes, that would be great, but again things escalate quickly. The closest “hope” would be some sort of LLM on the scale of hundreds of trillions of parameters… and even that might fall short.
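
    To get a feel for how quickly things escalate, here is a rough back-of-the-envelope sketch. Only the ~800 billion page count comes from the paragraph above; the index bytes per page, the RAM per machine, and the naive "one weight per user per page" model are placeholder assumptions for illustration, not measurements of any real engine.

```python
# Back-of-the-envelope sizing. Everything marked ASSUMPTION is a placeholder.
PAGES = 800_000_000_000              # ~800 billion pages (figure quoted above)
INDEX_BYTES_PER_PAGE = 2_000         # ASSUMPTION: index footprint per page
RAM_BYTES_PER_MACHINE = 2 * 10**12   # ASSUMPTION: 2 TB of RAM per machine


def index_size_bytes(pages, bytes_per_page):
    """Total index size if every page costs the same number of bytes."""
    return pages * bytes_per_page


def machines_needed(total_bytes, bytes_per_machine):
    """Machines required to hold total_bytes, rounded up."""
    return -(-total_bytes // bytes_per_machine)  # integer ceiling division


def per_user_weights_bytes(pages, users, bytes_per_weight):
    """Naive worst case: one stored weight per (user, page) pair."""
    return pages * users * bytes_per_weight


if __name__ == "__main__":
    total = index_size_bytes(PAGES, INDEX_BYTES_PER_PAGE)
    print("index bytes:", total)
    print("machines:", machines_needed(total, RAM_BYTES_PER_MACHINE))
    # ASSUMPTION: 1 million users, 1 byte per weight.
    print("per-user weights bytes:", per_user_weights_bytes(PAGES, 10**6, 1))
```

    Change the placeholders to whatever you find more realistic; the point is only that the totals grow multiplicatively with pages and users.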

    Distributed indexes, with queries getting shared among peers, mean that privacy goes out the window. Homomorphic encryption could potentially help with that, but that requires even more hardware.

    TL;DR: it’s being researched, but it’s hard.


  • The basic algorithm is quite straightforward; it’s the scale and edge cases that make it hard to compete.
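
    As a minimal sketch of what “straightforward” can mean here, assume the basic algorithm is a plain inverted index with TF-IDF scoring. That is my choice for illustration, not a description of any particular engine, and the function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Build an inverted index: term -> {doc_id: term count in that doc}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(text.lower().split()).items():
            index[term][doc_id] = count
    return index


def search(index, n_docs, query):
    """Return doc ids ranked by summed TF-IDF over the query terms."""
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, {})
        if not postings:
            continue  # unknown term contributes nothing
        # Rarer terms get a higher weight; +1 keeps common terms above zero.
        idf = math.log(n_docs / len(postings)) + 1.0
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Highest score first; ties broken by doc id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    docs = {
        "a": "ponies like open source search",
        "b": "search engines index the web",
        "c": "ponies ponies ponies",
    }
    print(search(build_index(docs), len(docs), "ponies search"))
```

    Note how document "c" wins just by repeating a word. That is exactly the kind of edge case (spam, gaming, freshness, duplicates) that the toy version ignores.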

    “Ideally”, from a pure data perspective, everybody would have all the data and all the processing power to search through it on their own with whatever algorithm they prefer, like a massive P2P network of per-person datacenters.

    Back to reality, that’s pretty much insanely impossible. So we get a few search engines, with huge entry costs, offering more value the larger they get… which leads to lock-in, trying to game their algorithms, filtering, monetization, and all the other issues.



  • Reddit’s moderation bots have been extremely trigger happy for many years.

    I got my main account, 10+ years club, suspended… appealed it, and got banned. Then every account I had ever logged into with the same IP, app, or browser as the banned one, at any moment in the past, got banned in cascade.

    Once you get on Reddit’s bad side, there’s no going back. Suspensions add flags to Reddit’s internal “shadow profile” of every account ever linked in any way. They all become more likely to get flagged and suspended, which gets them flagged even more in turn, until Reddit’s ban-evasion system kicks in. Then, they’re all toast.

    To add insult to injury… once triggered, the bots go back through your history, applying the most recent moderation guidelines retroactively. Over the following months, the account kept getting notifications about old comments being removed, followed by subreddit bans.



  • Education is supposed to teach “how to learn to learn”.

    Left to his own devices, then, without knowing quite what to ask or how to interpret the responses, the man in this case study “did his own research”

    The whole thing with “do your own research” is kind of funny:

    • some use it to avoid explaining their points
    • others use it to come up with a lot of nonsense
    • while the proper way to begin any “research” is to… ask an expert.

    Nobody has ended up in a psych hold just by reading a bunch of Wikipedia articles, asking ChatGPT… then consulting a doctor.


  • Keywords: NPU, unified RAM

    Apple is doing it, AMD is doing it, phones are doing it.

    GPUs with dedicated VRAM are an inefficient way of doing inference. They’ve been great for researching what type of NPU may be the best one, but that’s been answered already for LLMs. The current step is achieving mass production.

    5 years sounds realistic, unless WW3.