For those 87 million people probably wondering what was actually done with their data, I went back to Christopher Wylie, the ex-Cambridge Analytica employee who blew the whistle on the company’s problematic operations in the Observer. According to Wylie, all you need to know is a little bit about data science, a little bit about bored rich women, and a little bit about human psychology.

Step one, he says, over the phone as he scrambles to catch a train: “When you’re building an algorithm, you first need to create a training set.” That is: no matter what you want to use fancy data science to discover, you first need to gather data the old-fashioned way. Before you can use Facebook likes to predict a person’s psychological profile, you need to get a few hundred thousand people to do a 120-question personality quiz.

The “training set” refers, then, to that data in its entirety: the Facebook likes, the personality tests, and everything else you want to learn from. Most important, it needs to contain your “feature set”: “The underlying data that you want to make predictions on,” Wylie says. “In this case, it’s Facebook data, but it could be, for example, text, like natural language, or it could be clickstream data” (the complete record of your browsing activity on the web). “Those are all the features that you want to predict.”

At the other end, you need your “target variables”: in Wylie’s words, “the things that you’re trying to predict for. So in this case, personality traits or political orientation, or what have you.”
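To make the vocabulary concrete, here is a minimal sketch in Python of what a training set with a feature set and a target variable might look like. Everything in it is an assumption for illustration: the page names, the `openness` scores, and the helper functions are hypothetical, and the per-page averaging at the end is a deliberately naive stand-in, not the method Cambridge Analytica used.

```python
# Toy, made-up records. Each person has likes (the features) and a
# quiz-derived score (the target). All names and numbers are hypothetical.
training_set = [
    {"likes": {"page_a", "page_b"}, "openness": 0.8},
    {"likes": {"page_b", "page_c"}, "openness": 0.4},
    {"likes": {"page_a"}, "openness": 0.9},
    {"likes": {"page_c"}, "openness": 0.3},
]


def build_feature_columns(records):
    """Return a sorted list of every page liked by anyone in the records."""
    return sorted({page for r in records for page in r["likes"]})


def to_feature_matrix(records, columns):
    """Encode each person's likes as a row of 0/1 indicators, one per page."""
    return [[1 if page in r["likes"] else 0 for page in columns] for r in records]


def to_target_vector(records, target_name):
    """Pull out the value we are trying to predict for each person."""
    return [r[target_name] for r in records]


def page_averages(X, y, columns):
    """Mean target score among the people who liked each page."""
    averages = {}
    for j, page in enumerate(columns):
        scores = [y[i] for i, row in enumerate(X) if row[j] == 1]
        averages[page] = sum(scores) / len(scores)
    return averages


def predict(likes, averages):
    """Naive guess: average the per-page means over the pages we recognise."""
    known = [averages[p] for p in likes if p in averages]
    if not known:
        return None  # no overlap with the training data, so no guess
    return sum(known) / len(known)


columns = build_feature_columns(training_set)      # feature names
X = to_feature_matrix(training_set, columns)       # feature set
y = to_target_vector(training_set, "openness")     # target variable
averages = page_averages(X, y, columns)
guess = predict({"page_a", "page_c"}, averages)    # a person with no quiz score
```

The point of the sketch is the split Wylie describes: `X` holds the underlying data you predict from, `y` holds the quiz result you are trying to predict, and only people who have both can go into the training set.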