Salamander

  • 87 Posts
  • 312 Comments
Joined 4 years ago
Cake day: December 19th, 2021

  • I have experienced issues both over Tor and over clearnet. The Tor front-end runs on its own server, but it connects to the mander server. So, the server hosting the Tor front-end sees the exit node connecting to it, and the mander server then receives the requests via that Tor server. Some bandwidth is used on both servers, because the data travels from mander to the Tor front-end and then to the exit node. There is also a separate server that hosts and serves the images.

    What I see is not a bandwidth problem, though. The database queries seem to be the bottleneck. There is a limited number of connections to the database, and some of the queries are complex and CPU-heavy. It is the intense searching through the database that appears to throttle the website.
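    A toy sketch of why this throttles everything (hypothetical illustration, not Lemmy's actual code): a fixed-size connection pool behaves like a semaphore, so slow queries hold a slot and cheap queries queue behind them.

```python
import threading
import time

# Hypothetical model of a database connection pool: POOL_SIZE slots,
# each query holds one slot for its whole duration.
POOL_SIZE = 5
pool = threading.Semaphore(POOL_SIZE)

def run_query(duration):
    with pool:                # wait for a free connection slot
        time.sleep(duration)  # stand-in for query execution time

start = time.monotonic()
threads = [threading.Thread(target=run_query, args=(0.1,)) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start

# 10 queries over 5 slots run in roughly two 0.1 s batches,
# so total time is about double a single query.
print(f"{elapsed:.2f}s")
```

    With one long-running query in the mix, every request behind it waits, which matches the observed lag.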









  • I do have a wall with similar boxes. From the image, I am not sure if they are the same size. I just measured one of my small drawers and it is 14 cm x 5.5 cm x 5 cm. Since I have many different tiny components, I quickly ran out of space when I tried to give each component its own drawer.

    But I think that I might be able to do a better job with these if I take everything out and start organizing again. I set the rules for how to place things before I started buying SMD components, and many of the through-hole components can be combined without a problem. An improvement would be to find something like this but with many more, much smaller boxes.





  • Thank you.

    A few days ago, I blocked several IP ranges to solve this. I unblocked them about two days ago in an attempt to solve some federation issues… The bots from this IP range came back.

    This time I blocked only the IP range that has the most bot-like activity. Hopefully that resolves it.



  • For mander.xyz it has been bot scrapers. That time you are mentioning, the scraping happened via the onion front end that I am hosting for easier access over Tor. Yesterday an army of bots scraping via Alibaba Cloud servers made the server unusable for a few minutes. The instance would receive a flood of requests from the same IP range (47.79.0.0/16), and denying that full IP range fixed the problem.
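    The check behind that kind of range block can be sketched in a few lines (illustrative only; the function name is made up, but the CIDR is the one from the logs):

```python
from ipaddress import ip_address, ip_network

# The Alibaba-cloud range that was denied, as mentioned above.
BLOCKED = ip_network("47.79.0.0/16")

def is_blocked(client_ip: str) -> bool:
    # True if the client falls anywhere inside the denied /16.
    return ip_address(client_ip) in BLOCKED

print(is_blocked("47.79.12.34"))   # inside the /16 -> True
print(is_blocked("203.0.113.7"))   # outside it -> False
```

    In practice the deny would live in the firewall or reverse proxy rather than application code, but the membership test is the same idea.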

    Some instances implement anti-bot measures. For example, https://sopuli.xyz/ makes use of Anubis. I think that instances behind Cloudflare get some protection too. I am considering using Anubis for mander.xyz, but for now I have just been dealing with this manually, as it does not happen too often.



  • Thanks!

    I don’t see those specific IPs, nor 16514. But now I see what scrapers tend to look like in the logs :)

    I am now pretty sure that the cause was scraper-like activity coming from the Mlmym front-end that I am serving over an onion site. I am not sure if it randomly started misbehaving or if a Tor scraper was using it.

    After blocking this, federation was restored, performance increased, and CPU use came down.


  • I just realized that it is not a ‘scraper’; the requests came from the server that I am using to provide an interface to the site as an onion site. The number of requests was suspiciously high, so maybe a bot is scraping through Tor. I will leave it off for a few days and see if I can turn it back on later.



  • I did not update or change anything in the past few days.

    But, now that you mentioned an AI scraper, I looked into the logs and noticed some heavy requests to the API from a specific IP.

    The requests look consistent with scraping - just consistently and continuously issuing GET requests to different API endpoints.

    XX.XX.XX.XX - - [14/Oct/2025:21:28:54 +0000] "GET /api/v3/community/list?limit=20&sort=TopAll HTTP/2.0" 403 107 "-" "Mlmym"
    
    

    I have started denying their requests and it is the first thing that seems to have actually helped!

    I don’t want to speak too early but I think you may have identified the cause. Thanks!
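    For anyone hunting the same thing: scraper-like volume in access-log lines like the one above can be tallied per (IP, user-agent) with a short script (a rough sketch, not what I actually ran):

```python
import re
from collections import Counter

# Matches combined-format access-log lines: IP, request, status, size,
# referer, and user-agent (the last quoted field).
LOG_RE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d+ \d+ "[^"]*" "([^"]*)"'
)

# Sample line, copied from the log excerpt above.
lines = [
    'XX.XX.XX.XX - - [14/Oct/2025:21:28:54 +0000] '
    '"GET /api/v3/community/list?limit=20&sort=TopAll HTTP/2.0" '
    '403 107 "-" "Mlmym"',
]

counts = Counter()
for line in lines:
    m = LOG_RE.match(line)
    if m:
        counts[(m.group(1), m.group(2))] += 1

# The heaviest (IP, user-agent) pairs float to the top.
print(counts.most_common(3))
```

    Feeding it a real log and sorting by count makes a single misbehaving front-end or bot stand out quickly.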


  • So far, I have been able to ‘control’ the CPU use by setting limits on the process that pulls data from the database (pool size, CPU, memory).

    This does release some of the CPU for other tasks, but I think that what creates the lag might actually be the clogged database queries. So, constraining those resources might not solve the lag problem.
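    As a rough sketch of that kind of limit (service name and values are assumptions for illustration, not my actual config), a Docker Compose deployment can cap the container that talks to the database:

```yaml
# Hypothetical docker-compose fragment: limit CPU and memory for the
# service issuing the database queries.
services:
  lemmy:
    deploy:
      resources:
        limits:
          cpus: "2.0"
          memory: 2g
```

    This bounds how much the process can consume, but as noted above it does not shorten the queries themselves.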