At some point digital security turns into physical security, and there are national security interests that have fine-tuned their detection logic on these kinds of "buggy" behavior.
If you patch it, you'd need to find another way to de-anonymize those users.
Yeah, I don't know how anybody stays sane without it. I have a list of over a thousand ASNs I blackhole at this point...
Mine is a daily bash cron job that fetches a text-based database and uses grep to build an nftables-apply script with all the IPs for the blocked ASNs. I keep meaning to share it, but it's embarrassingly messy and I haven't had time to clean it up...
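For anyone wanting to try the same idea, here is a rough Python sketch of the "database in, nftables script out" step. It is not the script described above. The input format (one "prefix ASN" pair per line), the table name "inet filter", and the set names blocked4/blocked6 are all assumptions made for illustration, and the sets are assumed to already exist with "flags interval".

```python
import ipaddress

def build_nft_script(db_lines, blocked_asns, table="inet filter",
                     v4_set="blocked4", v6_set="blocked6"):
    """Turn 'prefix ASN' lines into nft commands for the blocked ASNs.

    Assumed (hypothetical) input format: "192.0.2.0/24 AS64500" per line.
    blocked_asns is a set of ASN numbers as strings, without the "AS" prefix.
    """
    v4, v6 = [], []
    for line in db_lines:
        parts = line.split()
        if len(parts) < 2 or parts[0].startswith("#"):
            continue  # skip blanks, comments, malformed lines
        asn = parts[1].upper().removeprefix("AS")
        if asn not in blocked_asns:
            continue
        try:
            net = ipaddress.ip_network(parts[0], strict=False)
        except ValueError:
            continue  # not a parseable prefix
        (v4 if net.version == 4 else v6).append(net)

    out = []
    for name, nets in ((v4_set, v4), (v6_set, v6)):
        # Start from an empty set so stale prefixes do not linger.
        out.append(f"flush set {table} {name}")
        # Merge adjacent/overlapping prefixes to keep the set small.
        merged = list(ipaddress.collapse_addresses(nets))
        if merged:
            elems = ", ".join(str(n) for n in merged)
            out.append(f"add element {table} {name} {{ {elems} }}")
    return "\n".join(out) + "\n"
```

The output is meant to be written to a file and loaded with "nft -f"; a rule matching on the sets still has to exist in your ruleset for anything to be dropped.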
It's been a real game of cat and mouse over the last few years. I used to do daily iptables updates to block repeat scrapers on the small niche stats site I run. About 5-6 years ago it became more common to see broader ranges, so I started blocking ASNs, which worked great (esp. for the regulars like Alibaba, Tencent, compromised DigitalOcean/OVH, ...). In the last 2-3 years, though, the overall bot traffic has skyrocketed. It's easy to spot bot activity after the fact (no requests to the CDN for static assets, user agent changes from one request to the next, predictable ID enumeration, etc.) but not in real time. They're also often using residential proxies, and Cloudflare bot detection has become pretty bad.
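The after-the-fact signals listed above can be sketched as a small log pass in Python. This is only an illustration: the record shape (ip, path, user_agent), the "/static/" prefix, the trailing numeric ID in the path, and both thresholds are assumptions, not anything from the site being described.

```python
import re
from collections import defaultdict

ID_RE = re.compile(r"/(\d+)$")  # assumed URL shape: ".../<numeric id>"

def flag_suspects(requests, min_run=5, min_signals=2):
    """requests: iterable of (ip, path, user_agent) in arrival order.

    Returns {ip: [signal, ...]} for IPs showing at least min_signals signals.
    """
    by_ip = defaultdict(list)
    for ip, path, ua in requests:
        by_ip[ip].append((path, ua))

    flagged = {}
    for ip, hits in by_ip.items():
        paths = [p for p, _ in hits]
        reasons = []
        # Signal 1: never fetched a static asset (assumed "/static/" prefix).
        if not any(p.startswith("/static/") for p in paths):
            reasons.append("no-static-assets")
        # Signal 2: more than one user agent from the same address.
        if len({ua for _, ua in hits}) > 1:
            reasons.append("user-agent-changes")
        # Signal 3: a long run of consecutive IDs (n, n+1, n+2, ...).
        ids = [int(m.group(1)) for p in paths if (m := ID_RE.search(p))]
        run = longest = 1 if ids else 0
        for a, b in zip(ids, ids[1:]):
            run = run + 1 if b == a + 1 else 1
            longest = max(longest, run)
        if longest >= min_run:
            reasons.append("id-enumeration")
        if len(reasons) >= min_signals:
            flagged[ip] = reasons
    return flagged
```

Grouping by IP is exactly what residential proxies defeat, so this only works on scrapers that reuse an address, which matches the "easy after the fact, hard in real time" point.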
Arms races suck. I've managed to find a few L7 tricks to catch the residential proxies and serve them an empty 200, but there are obvious trivial workarounds on the other end and if I start talking about them in public they won't last long... I wish I could share :/
Cloudflare is so easy to defeat, and almost everyone in the scraping industry is selling solutions that bypass it automatically. hCaptcha solving is also really cheap nowadays.
It would break the internet to make this available to the average person. A large swath would actively choose to block stuff like: all of Meta, Alphabet, Apple, Amazon, etc etc etc.
Anyhoo, now you mention it this is the tack I am going to take in my own network, thanks!
It's a real pain in the ass because in the absence of ASN-based blocking, you often have to give something a long list of IP ranges in CIDR notation, and be certain you don't "miss" even one IPv4 /23 or /24, or a crawler will get through.
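A quick way to check for a "missed" range is to compare the prefixes an ASN announces against your blocklist. A minimal Python sketch, assuming you already have both lists as CIDR strings (where the announced prefixes come from is left open, and the addresses in the usage lines are reserved benchmarking ranges):

```python
import ipaddress

def uncovered(announced, blocklist):
    """Return announced prefixes not fully contained in any single blocklist entry."""
    blocks = [ipaddress.ip_network(b) for b in blocklist]
    missing = []
    for prefix in announced:
        net = ipaddress.ip_network(prefix)
        # subnet_of() raises TypeError on mixed IPv4/IPv6, so compare versions first.
        covered = any(net.version == b.version and net.subnet_of(b) for b in blocks)
        if not covered:
            missing.append(str(net))
    return missing

# Usage: the /24 that slipped through is the only thing reported.
announced = ["198.18.0.0/23", "198.18.2.0/24", "198.18.3.0/24"]
blocklist = ["198.18.0.0/23", "198.18.2.0/24"]
print(uncovered(announced, blocklist))
```

Note that a prefix covered only by the union of several smaller blocklist entries is still reported as missing; collapsing the blocklist with ipaddress.collapse_addresses first reduces that.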
A nation-state actor picking the right time to sabotage a tiny part of the key rotation process. On Monday someone cut major fiber lines, on Tuesday DENIC is failing.
Unironically yeah, we are at the level of weaponizable sophistication where this metaphorical dick waving you are suggesting is probably something that happens.
On Monday there was a huge outage affecting several cities quite close to Frankfurt because someone cut a major fiber line; today DENIC is having a party, and right when everyone is drunk this happens because some post-rotation task cannot be completed.
It's perfectly fine for fingerprinting though. Innocuous artifacts in file formats such as custom matrices, digits on the seventh decimal position of a floating point number or millisecond-precision timestamps allow identification and cross-referencing of internet users.
Just last week I noticed that when a reddit user uploads a screenshot taken on macOS as a PNG image to a reddit post, the PNG will still contain uniquely identifying information about the monitor that is attached to the macOS system and when it was last calibrated. You can deduce the type of MacBook they are using from the screen resolution and see when they switched machines once you notice a different monitor calibration timestamp. All of that from a single PNG image uploaded by the user themselves. If those two pieces of information are not stored in the PNG, you know that they must be a Windows or Linux user.
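If you want to look for this kind of leftover yourself, a PNG is just a signature followed by length/type/data/CRC chunks, so listing them takes a few lines of Python. This is a minimal sketch that only lists chunk types and sizes; it does not verify CRCs or decode anything. That the display details sit in the embedded color profile (the iCCP chunk) is my assumption, not something stated above.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Standard ancillary chunk types that commonly carry metadata.
METADATA_TYPES = {"iCCP", "eXIf", "tEXt", "zTXt", "iTXt", "tIME", "pHYs"}

def list_png_chunks(data):
    """Return [(chunk_type, data_length), ...] for raw PNG bytes."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    chunks = []
    pos = 8
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        chunks.append((ctype.decode("ascii", "replace"), length))
        pos += 12 + length
        if ctype == b"IEND":
            break
    return chunks

def metadata_chunks(data):
    """Only the chunks from METADATA_TYPES, i.e. the likely breadcrumbs."""
    return [c for c in list_png_chunks(data) if c[0] in METADATA_TYPES]
```

Usage would be along the lines of metadata_chunks(open("screenshot.png", "rb").read()), where the filename is a placeholder.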
It's these small breadcrumbs all over the place which make forensics so interesting.
Someone having met Epstein is about as interesting and concerning as having met Theodore McCarrick [1] or Harvey Weinstein [2].
Those kinds of people aren't cartoon villains who only meet with people as part of their villainous activities.
The vast majority of their meetings are for their other activities and interests. Most people meeting with McCarrick, for example, were meeting him for the reasons they would meet with any Archbishop (or priest or bishop, depending on when they met him), or met with him for some other mutual, legal interest or business reason.
Same with Weinstein. Most people he met would be meeting for the same reason they would meet any producer, or for some other mutual, legal interest or business reason.
And same with Epstein. Epstein fancied himself as a patron of the sciences and made contact with a lot of scientists over their research and possible funding of their labs. He also fancied himself a philanthropist and had many contacts related to that.
[1] Former Archbishop of Newark and of Washington.
[2] One of the biggest and most successful Hollywood movie producers.
Did you measure the performance impact of having multiple trees in a single file vs. having one tree per file? I'd assume one per file is faster, is that correct?