The link between Twitter use and sex: context, correlation and courting on OkCupid
Which is bigger, the earth or the sun?
This is just one of the hundreds of questions OkCupid users can answer in order to increase the accuracy of the dating site’s algorithm. Each question is multiple choice: you pick your own answer, select the answer you’d hope your ideal match would pick, and rank the question in terms of importance.
In this case, if you answered that the earth is bigger than the sun, you’re not alone: data published in a blog post by the company in 2010 shows that
- over 10% of straight women
- slightly less than 10% of gay women
- 5% of gay men
- and just under 5% of straight men
also believe the earth is larger than the sun (OkTrends 2010).
The team at OkCupid love to correlate data sets. My favourites among the bizarre examples from their blog include “Odds of Masturbating Today: people who use Twitter every day vs everyone else” and “Could you imagine yourself killing someone? Yes answer = 82% implied odds of first-date sex” (OkTrends 2011a and 2011b).
Of course, many of these data correlations are made tongue-in-cheek, and the sense of humour of OkCupid’s co-founder Christian Rudder shows. However, the underlying sentiment of OkTrends, and of Rudder’s recently released book, Dataclysm, is that the online dating giant is in possession of the kind of data goldmine any marketer would covet. Whether a brand wants to find out what Stuff Gay People Like or how often people bathe or shower, OkCupid has data sets from literally hundreds of thousands of respondents.
OkCupid is owned by IAC/InterActiveCorp, a company which currently ranks 8th in a Nielsen list of the top 10 global web parent companies (or 5th in the US list). IAC owns many of the net’s most visited sites, including about.com, CollegeHumor, Vimeo, Urbanspoon and fellow dating sites match.com and Tinder. Aside from the benefits data sharing would have internally for IAC, OkCupid also sells targeted advertising space on the dating site itself. What better way for an advertiser to tailor its message than with detailed personal information about its audience? And with the hundreds of questions posed under the guise of matchmaking, OkCupid certainly gathers a lot of personal information.
But having access to big sets of data is not nearly enough to understand the individuals who provided those numbers. Big Data critics and new media commentators danah boyd and Kate Crawford explain that without the context in which the information was provided, it might actually mean less than data analysts would like to think. They point out the issues with the “problematic underlying ethos that bigger is better, that quantity necessarily means quality” when it comes to building consumer profiles or analysing behaviour using Big Data (boyd & Crawford 2011, p. 6).
Turning to the possible context within which data might be supplied to OkCupid by its users, one would assume that most individuals are hoping to attract a suitable mate. But even within this overarching context, there are many possible variations. Who is looking for love and who is looking merely for a hook-up? This would define the kind of details they might publish about themselves and certainly influence their answers to OkCupid’s matchmaking questions. How many people simply couldn’t be bothered answering more than a few questions, or any questions at all, and how does this impact OkCupid’s data? What about those who set up multiple profiles for various reasons: to experiment with their identity, to see what qualities get more positive responses, or for other unknown purposes (see user comment cited in my previous article)? Are there certain types of people who are drawn to online dating, and does OkCupid attract a different audience from other sites?
How could these variables impact the quality of information an advertiser could hope to draw out of OkCupid’s users? As boyd and Crawford state, “taken out of context, data lose meaning and value. Context matters” (boyd & Crawford 2011, p. 8).
There are, of course, also ethical considerations. A few months ago Rudder proudly announced that “We experiment with humans!” (OkTrends 2014), outlining several examples in which the matchmaker toyed with its users’ hearts. In 2013, OkCupid removed user photos for a day, establishing, in Rudder’s words, that “people are exactly as shallow as their technology allows them to be.”
In yet another experiment, OkCupid sought to test the accuracy of its algorithm by suggesting unlikely pairings to see if they’d work out. They mostly didn’t, showing that there is merit to the company’s maths. Perhaps Rudder took a calculated risk here: by revealing to its users (and potential future users) that it can meddle with what they see on the site at any given moment, OkCupid is actually a) showcasing just how accurate its matching algorithm is and b) flaunting the massive amount of user behavioural insight it has available to those who are willing to pay for it.
As on most websites featuring memberships, users have tacitly agreed to OkCupid’s terms upon sign-up, whether they have read them or not. Even upon reading them, many users may not fully understand the implications of data sharing on the site. Even with understanding, there may be situations in which identity information is used in a context not intended, or viewed by parties who should never be privy to sensitive and deeply personal information, including sexuality, sexual fantasies, drug use, faith and political views. The “opportunity to opt out” is also problematic: subjects of any ethical research should opt in, not opt out.
Even though this data may be public to a certain degree, this doesn’t mean that it should be used. As boyd and Crawford point out, “there is a considerable difference between being in public and being public” (boyd & Crawford 2011, p. 12). I might walk in public down the street, but in all likelihood will choose not to tell everyone my name, my deepest fantasies, whether I could imagine killing someone, or whether I think the earth is larger than the sun.
boyd, danah & Crawford, Kate, 2011, ‘Six Provocations for Big Data’, paper presented at the Oxford Internet Institute Decade in Internet Time Symposium