LEAKED: A New List Reveals Top Websites Meta Is Scraping of Copyrighted Content to Train Its AI

geneva_convenience@lemmy.ml · 3 days ago

mindbleach@sh.itjust.works · 3 days ago

Always amused when leftist instances treat intellectual property like it’s real.

irelephant [he/him]@lemmy.dbzer0.com · 12 hours ago

It’s not, but scraping is annoyingly resource-intensive.

Vendetta9076@sh.itjust.works · 2 days ago

IP debate aside, LLM scrapers absolutely annihilate system resources. I host a WordPress site, and before I set up Cloudflare Labyrinth my whole server would get DDoS’d at least twice a day.