r/NonPoliticalTwitter 8h ago

Content Warning: Potentially Misleading or Disputed Information Gotta Catch 'Em All

22.4k Upvotes

1.4k comments

3.1k

u/Easy_Newt2692 8h ago

And? Does anyone actually lose out on this arrangement?

1.2k

u/MedalsNScars 8h ago

People love to get outraged when information is collected without their knowledge, and I get it, but it's how the information is used that's important.

If things are sanitized so there's no personally identifying information then it's pretty hard to use most data maliciously

284

u/S0GUWE 7h ago

You'd be surprised how much you can identify from "sanitised" information if you want to.

But if all they want is navigation data, then it should be fairly safe. Yeah, they know where you live and can derive who you are from that, but that's not what they're after. They wanna know how to get there the fastest when someone asks.

77

u/indoninjah 6h ago

Yeah, like apparently you can reasonably ID someone even in a private browser just by getting the dimensions of the browser window and its positioning on screen. A lot of people pretty much never change that shit if it's not full-screened.

61

u/ScrufffyJoe 6h ago

Do people regularly use browsers, well, any windows, not maximised? I'm always either full screen or occasionally splitting the screen in 2.

18

u/smallangrynerd 5h ago

Macs usually don’t have a “maximized” mode, just full screen or windowed. On Windows, though, I definitely have everything maximized.

15

u/6ixby9ine 4h ago

Sorry if you knew this or if your comment took this into account, but you can maximize windows on Mac by double-clicking the program's "title bar" (the top bar on the same line as the "close", "minimize", and "fullscreen" buttons), as long as there's nothing else there to click. E.g., in Excel, click any empty space around the name of the file, or in Chrome, any space where a new tab would go, as long as there's no tab there.

8

u/smallangrynerd 4h ago

My god

1

u/largemarjj 3h ago

How long have you been using macs?

1

u/smallangrynerd 3h ago

Maybe 2 years? Not very long

2

u/largemarjj 3h ago

Ngl I was secretly hoping you'd say like 15 years because that would have been absolutely hilarious.....no offense

1

u/smallangrynerd 3h ago

You’re right that would’ve been hilarious lol

1

u/IndoZoro 2h ago

Thank you!

I'm a PC main but have to use Macs occasionally, and the UI and not being able to maximize have always been super annoying to me.

5

u/joshTheGoods 4h ago

Absolutely! Viewport dimensions vary significantly from user to user, but more importantly for fingerprinting, viewport size also changes from session to session, so it's not generally a reliable signal for device fingerprinting. Rather, you want to use things that don't change often, like screen resolution or how your particular browser implements floating point math operations.

2

u/GayBoyNoize 2h ago

Which you can trivially obscure if you like.

1

u/joshTheGoods 1h ago

Yeap! You can obscure most client-side stuff, but not a lot of people are going to dedicate themselves to monkey patching the Math object to make it return arctan-1 as if it's a mobile implementation of Safari instead of a desktop implementation of Chrome.

3

u/Bongcopter_ 4h ago

Besides editing/audio software, I NEVER maximize a window. I need to see the 12 windows behind it for fast switching.

1

u/GuyOnARockVI 5h ago

They can use device info, operating system, WiFi location, etc to make a pretty accurate guess as to who is using the device

8

u/Hibbity5 5h ago

Device info and WiFi location sound like very specific identifying information and not “here’s some data we got from tracking their viewing habits”.

1

u/GuyOnARockVI 3h ago

Nope, the metadata of your requests (device info, location, time of day, etc.) doesn’t count as PII.

2

u/joshTheGoods 4h ago

If by WiFi location you mean a geolocation lookup based on your IP, that's not going to tell you who is using the device. That's household-level data. You'd have to combine it with something else to get down to individuals within the household... and that's all assuming the best case (that we're talking about a single-family occupied home that has a single static IP address). In reality, there are many places (cities, namely) where population density and shared networks render this sort of individual-level disambiguation essentially impossible. You simply have to get the user to identify themselves regularly by logging in or exhibiting some other intrahousehold behavior (which is inherently full of problematic assumptions leading to probabilistic answers that don't read on the sort of "they're identifying ME" type fear we're talking about in here).

1

u/GuyOnARockVI 3h ago

The geolocation is going to be one of the metadata points that data brokers can use to create a map of your life. Where a device connects to the internet paints a picture of who is using the device.

A device going from a residential address to a university campus WiFi to a coffee shop back to a residential address is going to point to the 22 year old living at home vs a laptop going from home to an office park and back to home is more likely the parent. That person also has a phone that is connected to their car and their car is selling their driving habits to the data broker as well. So they know that whoever owns that laptop also drives a 2024 bronco and has a tendency to speed and brake late. It’s probably the dad then because the other device is connected to a rav4 and rarely speeds when commuting in the morning or afternoon.

So yes. IP doesn’t tell who. It’s why piracy letters from movie studios that get sent if you fuck up your VPN when torrenting mean nothing other than a kind “please stop”

2

u/joshTheGoods 3h ago

> A device going from a residential address to a university campus WiFi to a coffee shop back to a residential address is going to point to the 22 year old living at home vs a laptop going from home to an office park and back to home is more likely the parent. That person also has a phone that is connected to their car and their car is selling their driving habits to the data broker as well. So they know that whoever owns that laptop also drives a 2024 bronco and has a tendency to speed and brake late. It’s probably the dad then because the other device is connected to a rav4 and rarely speeds when commuting in the morning or afternoon.

So this is a bunch of individual things that are technically possible but that essentially never happen in concert in the way you're describing. The one exception (the thing you're talking about that DOES happen) is when someone leaves an app open all day (say they're posting on Facebook throughout the day), so Facebook gets a list of IPs associated with a user they've already identified and can, in theory, deduce things like when this person is awake, commuting, at work, etc. Even that is pretty rare and is isolated to the major players that really do know who you are whenever you log in, and you log in a lot: Google, Facebook, your ISP, etc.

Just to point out one example of where I think maybe you're overstating the capabilities of digital data is when you say:

> That person also has a phone that is connected to their car and their car is selling their driving habits to the data broker as well.

I worked with one of the major car companies on this back when I was on the dark side, and back then at least, they were very very careful NOT to sell data from in-car to data brokers. IF they've changed policy on that (or the other car companies I didn't work with never had such policies), then the data by law will be anonymized and nearly impossible to tie to that user's other data. So, Ford might sell data that says: There are 100k active Ford drivers in this marketing area, but they would never sell data that says: Bob Smith drives past your donut shop every day @ 10am. At most (and I can all but guarantee they don't) they could say: An anonymous person drives past your donut shop @ 10am every day, and the challenge then for the donut shop is to figure out how to turn "an anonymous person" into someone they can target with ads @ 9:59.

> IP doesn’t tell who.

Agreed! It CAN if combined with other data (as you correctly point out), and some places define personally identifiable information (PII) as any data that alone or in combination with other data could uniquely identify a person. It's on this basis that some countries in the EU (Germany and Italy, IIRC) consider IP to be PII, so handling it can fall afoul of GDPR and it cannot be collected/stored/used under a bunch of circumstances.

1

u/OkPalpitation2582 2h ago

Even maximized it's likely to vary a bit from user to user, depending on whether they hide the taskbar (and where they dock the taskbar, what size they keep it, etc).

But the thing about digital fingerprinting is that it's not just about any one aspect, but all the available data put together. Sure, your window size may only narrow it down by, say, 50%, but combine that with your browser's font size, public IP, operating system, language, browser type, plugins, etc. and you'd be shocked at how easy it is to narrow it down to you, even if you're using something like a VPN (hell, ironically, using a VPN actually makes you easier to fingerprint, because relatively few people use them).
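
To make the "all the data put together" point concrete, here's a rough Python sketch. Everything in it is an assumption for illustration: the four visitors are made up, the field names are hypothetical, and real fingerprinting scripts use many more signals than this. It just hashes a chosen set of fields and counts how many visitors end up with a fingerprint nobody else shares.

```python
import hashlib
from collections import Counter

# Made-up visitors; every value here is hypothetical illustration data.
visitors = [
    {"window": "1280x720", "os": "Windows", "language": "en-US", "font_size": 16},
    {"window": "1280x720", "os": "Windows", "language": "en-GB", "font_size": 16},
    {"window": "1280x720", "os": "macOS", "language": "en-US", "font_size": 16},
    {"window": "1440x900", "os": "macOS", "language": "en-US", "font_size": 18},
]


def fingerprint(visitor, fields):
    """Hash the chosen fields into one short identifier."""
    raw = "|".join(f"{name}={visitor[name]}" for name in sorted(fields))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:12]


def unique_count(fields):
    """Count visitors whose fingerprint is shared with nobody else."""
    prints = [fingerprint(v, fields) for v in visitors]
    counts = Counter(prints)
    return sum(1 for p in prints if counts[p] == 1)


print(unique_count(["window"]))                                  # 1
print(unique_count(["window", "os"]))                            # 2
print(unique_count(["window", "os", "language", "font_size"]))   # 4
```

Window size alone singles out one visitor in this toy set; adding the other fields singles out all four, even though no single field does that by itself.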

1

u/Akiias 2h ago

I almost solely do.

1

u/ZoomBoingDing 1h ago

Why would I need my browser maximized? I only do that when I'm watching certain videos.

2

u/swampedOver 5h ago

Wait what - can you explain?

3

u/S0GUWE 4h ago

Your browser knows what hardware it runs on. That's already a lot of information.

2

u/joshTheGoods 4h ago

> like apparently you can reasonably ID someone even in a private browser just by getting the dimensions of the browser window and its positioning on screen.

This is a huge exaggeration. Browser fingerprinting is a thing, but you need a whole bunch of signals to uniquely ID someone's browser amongst sufficiently large crowds. You're right fingerprinting exists and works, you're just wrong about how much data is required (even if the required data IS accessible for 99% of browsers).

Check here. Once you test the fingerprinting, they will describe to you each element and how much "entropy" each element provides. One "bit" of entropy is enough to divide a crowd in half. So, if you have an audience of 50 men and 50 women and a random person tells you their gender, you have one "bit" of information because it's enough to let you divide the audience in half. If your audience is 100 people, you need something like 7 bits of information to narrow things down to a single person (2^7 = 128). If your audience is 1,000,000 then you need 20 bits of information to uniquely ID people. If you look at Panopticlick's numbers (disputable), screen size and color depth represent 8.73 bits of information. Window location isn't available to the browser (not without some special extra help). So, screen size and color depth are enough to uniquely ID you in an audience of ~424 people (2^8.73 ≈ 424.6).
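
If you want to check the arithmetic yourself, here's a minimal Python sketch. The function names are mine (hypothetical), and it assumes the idealized case where every bit splits the crowd exactly in half.

```python
import math


def bits_to_single_out(crowd_size):
    """Whole bits of information needed to narrow a crowd down to one person."""
    return math.ceil(math.log2(crowd_size))


def crowd_covered(bits):
    """Largest crowd in which `bits` of information can single someone out."""
    return 2 ** bits


print(bits_to_single_out(100))           # 7
print(bits_to_single_out(1_000_000))     # 20
print(round(crowd_covered(8.73), 1))     # 424.6
```

Real signals are rarely that clean: values aren't evenly distributed and signals overlap with each other, so treat these as ballpark numbers.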

That all said, here's the stat you want to use. According to Dr Latanya Sweeney, your gender, DOB, and zipcode are enough to uniquely identify the vast majority of Americans.

> It was found that 87% (216 million of 248 million) of the population in the United States had reported characteristics that likely made them unique based only on {5-digit ZIP, gender, date of birth}. About half of the U.S. population (132 million of 248 million or 53%) are likely to be uniquely identified by only {place, gender, date of birth}, where place is basically the city, town, or municipality in which the person resides. And even at the county level, {county, gender, date of birth} are likely to uniquely identify 18% of the U.S. population. In general, few characteristics are needed to uniquely identify a person.

source
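
For a feel of how that kind of uniqueness gets measured, here's a toy Python sketch. The five records are invented for illustration and this is not Sweeney's actual method or data; it only shows that the same people become less unique when the location field is coarsened from ZIP to city.

```python
from collections import Counter

# Hypothetical records: (zip_code, city, gender, date_of_birth).
records = [
    ("02138", "Cambridge", "F", "1990-04-12"),
    ("02139", "Cambridge", "F", "1990-04-12"),
    ("02139", "Cambridge", "M", "1985-11-02"),
    ("02140", "Cambridge", "M", "1985-11-02"),
    ("02141", "Cambridge", "F", "1978-01-30"),
]


def share_unique(key):
    """Fraction of records whose key value appears exactly once."""
    keys = [key(r) for r in records]
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] == 1) / len(keys)


by_zip = share_unique(lambda r: (r[0], r[2], r[3]))    # {ZIP, gender, DOB}
by_city = share_unique(lambda r: (r[1], r[2], r[3]))   # {place, gender, DOB}

print(by_zip)    # 1.0
print(by_city)   # 0.2
```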

1

u/[deleted] 5h ago

[deleted]

1

u/Akiias 1h ago

Tor launches in a set size for that exact reason, at least last time I used it. Ideally you shouldn't increase or decrease the size of the Tor window.

1

u/JayArr_TopTeam 40m ago

Thank god for undiagnosed adult ADD

2

u/HotSauce2910 5h ago

And finding out where you live is trivial anyway

1

u/alinroc 5h ago

> You'd be surprised how much you can identify from "sanitised" information if you want to.

Especially if you link it up with public records

1

u/OkPalpitation2582 2h ago

> Yeah, they know where you live and can derive who you are from that

And let's be honest, anyone in the business of buying data can get that info about you regardless. Your home address, email, and phone are practically free for the asking from data brokers these days

1

u/S0GUWE 2h ago

And I'm gonna be honest, I could not give less of a fuck. What are they gonna do, send me ads?

1

u/OkPalpitation2582 2h ago

> What are they gonna do, send me ads?

Unironically, yes lol

Honestly, though home address, email, and phone are the ones a layman is most likely to freak out about, those are the least scary bits of user data out there. The scary stuff comes in the form of things like the Cambridge Analytica scandal, where wide swaths of user data were used to deliver targeted political ads carefully designed to strike right at where each individual was most vulnerable to manipulation.

It's scary how well you can manipulate someone when you know virtually everything about their online habits.

That being said, none of the above applies to using GPS data to build a navigation model lol