benjojo posted 17 May 2024 09:34 +0000

Ha! The Discord GDPR/Data Export thing reveals that it's running models to figure out what gender you are. If you go to /activity/analytics/events-*.json and grep for predicted_gender you get something like:

{
  "user_id": "282657081457115136",
  "predicted_gender": "male",
  "probability": 0.8413839340209961,
  "prob_male": 0.8413839340209961,
  "prob_female": 0.11650349199771881,
  "prob_non_binary_gender_expansive": 0.04211260750889778,
  "model_version": "2024-05-08T00:00:00.000000Z",
  "day_pt": "2024-05-15 00:00:00 UTC"
}

Anyway, they seem to have this datapoint _over time_! Meaning you can make a graph of how male/female/NB you are according to Discord. Here is mine:

A graph showing various levels of "prob_male" (the highest over all time) from 2022 to 2024
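For anyone who wants to pull the same series out of their own export, here is a rough sketch rather than anything official. It assumes each events-*.json file holds one JSON object per line (not confirmed in this thread), and that the records have the fields shown in the sample above. The function names and the `package/activity/analytics` path are made up for illustration.

```python
import json
from pathlib import Path


def read_export(analytics_dir):
    """Yield raw lines from every events-*.json file in the given folder."""
    for path in sorted(Path(analytics_dir).glob("events-*.json")):
        with open(path, encoding="utf-8") as fh:
            yield from fh


def gender_series(lines):
    """Return (day_pt, prob_male, prob_female, prob_nb) tuples sorted by day.

    Assumes one JSON object per line; lines that do not parse are skipped.
    """
    rows = []
    for line in lines:
        # Cheap text filter first, the same idea as grepping the file.
        if "predicted_gender" not in line:
            continue
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(rec, dict) or "predicted_gender" not in rec:
            continue
        rows.append((
            rec["day_pt"],
            rec["prob_male"],
            rec["prob_female"],
            rec["prob_non_binary_gender_expansive"],
        ))
    # day_pt looks like "2024-05-15 00:00:00 UTC", so string order is date order.
    return sorted(rows)


if __name__ == "__main__":
    # Hypothetical location of the unpacked data package.
    for row in gender_series(read_export("package/activity/analytics")):
        print(*row, sep="\t")
```

The tab-separated output can be pasted into a spreadsheet or fed to whatever plotting tool you prefer.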

LonM@social.vivaldi... replied 17 May 2024 10:42 +0000
in reply to: https://toot.cat/users/nickcolley/statuses/112455779819977335

@nickcolley @benjojo It's illegal to ask for data that isn't specifically required for delivering the service. I guess that's how they justify data prediction - just that it means they don't have to ask for it.

Data protection allows you to request that any decisions made about you not be done via automated processes if you are not happy with the result. So I guess you could demand that Discord have a human manually go through the data and decide your gender for you, and Discord would be legally compelled to do so? If enough people demanded that, it would certainly bog them down.

benjojo replied 17 May 2024 09:55 +0000
in reply to: https://benjojo.co.uk/u/benjojo/h/3RDQ97768tt28Tk321

It also seems that the same archive has a guess at how old you are too. Discord has gotten this entirely wrong, except one time.

{
  "user_id": "282657081457115136",
  "predicted_age": "35+",
  "probability": 0.7547529339790344,
  "prob_13_17": 0.0005852651665918529,
  "prob_18_24": 0.014580278657376766,
  "prob_25_34": 0.23008151352405548,
  "prob_35_over": 0.7547529339790344,
  "model_version": "2024-03-20T00:00:00.000000Z",
  "day_pt": "2024-03-27 00:00:00 UTC"
}

A graph showing probability over time for age, this graph shows discord thinks that I am 35+, when I am actually 29
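A quick way to sanity-check one of these records: in the sample above, the "probability" field matches the largest of the per-bucket "prob_*" values, and the buckets add up to roughly 1. The sketch below only checks that against the record quoted here. The helper name `top_bucket` is made up, and whether every record in the export behaves this way is an assumption.

```python
def top_bucket(record, prefix="prob_"):
    """Return (best_key, best_value, total) over the per-class prob_* fields."""
    # "probability" does not start with "prob_", so it is left out here.
    probs = {k: v for k, v in record.items() if k.startswith(prefix)}
    best = max(probs, key=probs.get)
    return best, probs[best], sum(probs.values())


# The age record quoted above.
age_record = {
    "user_id": "282657081457115136",
    "predicted_age": "35+",
    "probability": 0.7547529339790344,
    "prob_13_17": 0.0005852651665918529,
    "prob_18_24": 0.014580278657376766,
    "prob_25_34": 0.23008151352405548,
    "prob_35_over": 0.7547529339790344,
    "model_version": "2024-03-20T00:00:00.000000Z",
    "day_pt": "2024-03-27 00:00:00 UTC",
}

best, value, total = top_bucket(age_record)
print(best, value, round(total, 6))
```

The same helper works on the gender records, since those also use the "prob_" prefix for each class.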

arikb@mastodon.sdf.o.. replied 17 May 2024 10:06 +0000
in reply to: https://benjojo.co.uk/u/benjojo/h/7y6svX4M692gVB727w

@benjojo following your toot I requested the data dump, it doesn't have the "/activity/analytics" folder.

The readme file states that "you will not have Analytics or Modeling folders in your data package if you've opted out of those activities" so thank you to past me, for opting me out of this at some point though I forgot about it.

I did notice a LOT of information about logins going back to 2018, and I wonder why Discord needs to store all of it. IP addresses and kernel versions included.