Pete Warden’s plan to release data from more than 215 million Facebook profiles for academic research is critical to the next phase of the Web’s evolution. We’ll never get Web 3.0 without a deeper understanding of the conversation that social media enables.
The work that begins this week will lead to vital revelations about human interaction and the way we live online — but it’s also important to maintain a little perspective about the limits of this kind of data.
- Facebook doesn’t stand for everyone. It might feel like everyone and their mother is on Facebook, but they’re not. Not yet, anyway. While the network is open to everyone, it’s going to be more appealing to certain kinds of people, so don’t expect a given Facebook population to represent its offline equivalent.
- Privacy counts. This data doesn’t come from Facebook. It comes from a Web crawler, which means it sees only what you choose to make public.
- There may be a gap between what people believe and what they say on Facebook. Take, for example, Warden’s assertion that “Texans are more likely to be fans of the Dallas Cowboys than God.” That’s a funny, memorable statement. But it could very easily be taken out of context. That statement doesn’t mean there are more Dallas Cowboys fans than there are believers in God in Texas. It doesn’t mean that Texans care more about football than God. It just means that more Texans who happen to be on Facebook chose to associate themselves with a particular sports fan page than with a particular religious fan page. There are as many possible reasons for this as there are Facebook users in Texas. Don’t leap to conclusions.
- Our social graphs are irrational. I have three siblings, but only one of them is recognized as such on Facebook. That doesn’t mean I don’t care about the other two. I haven’t gotten around to labeling them yet. I’m willing to bet most people have a quirk or two like that in their social network. Because our relationships and our data aren’t reported in a standardized way, there are always going to be strange flaws in the data.
But these caveats don’t mean the information won’t be useful. Far from it. It will teach us an incredible amount about how people use Facebook and about how they choose to publicly interact with each other online. It may also provide some other interesting sociological tidbits, and I think it’s OK to keep those discoveries in mind. Just remember that it’s not the unvarnished truth, and you probably shouldn’t make any life-changing decisions based on it.
What questions are you hoping this treasure trove of social data will answer? What applications do you see for this kind of data?
Image credit: Talaj, via iStock