Wednesday, December 9, 2015

The Problem of Poor Data



The lights glare down on a sparkling stage, where several young women nervously face the judges. The year is 2007, and the future Miss Teen USA is in this group. The judge proceeds down the aisle. Unknown to all present, a piece of YouTube comedy gold is about to be born. Turning to Miss South Carolina, she asks her to explain the problem of political unawareness in the United States. In a moment, it becomes obvious that the contestant is the queen of the unaware. As she stammers, a cascade of disconnected words tumbles from her mouth: "personally believe... and the maps... Iraq". The blended mess of buzzwords fades out and the audience claps uneasily. On top of the many data points already evident on the internet, the world now has a new reason to distrust teenage selfie-posters as a reliable source of information.

Chuckle at this story though we may, it underlines a serious issue that may be illustrated with a simple Venn diagram.
The world we live in takes its financial advice from advertisers, accepts dieting tips from fashion magazines, and awards the camera to whoever represents an opinion in the most obnoxious way possible. The casual manner in which we toss around statistics is particularly disturbing. An old joke claims that 87 percent of statistics are made up on the spot, but the truth of the statement overshadows the humor. Statistics are supposed to be the gatekeepers of truth, but all too often, pundits and presidents misquote or mislead using these very tools. Although such data is invaluable, social integrity and decency both dictate that statistics should be regulated so that they do not skew the truth.



In no way am I raising a call to arms against those who use statistics. On the contrary, statistics are perhaps the most solid bedrock one can use to support a claim. A quick scan of all sorts of historical incidents easily brings this to light. It's a story we know too well by now. The tobacco industry hid behind a wall of happy-go-lucky commercials, only to have a wrecking ball of studies break through its smokescreen. The same pattern showed up a few decades later surrounding climate change, and still shows up every time there is a health or human rights conference. In all of these situations, statistics were the hammer that drove the final nail in the coffin. Unfortunately, the back end of the same hammer keeps trying to pull the nails out. The reason it took so much solid research to overcome smoking bias was that for every study used to prove the risk of cancer, another report was published denying the risk. Climate change is consistently swept under the rug. Even a country such as North Korea can paint itself in rosy colors if it can control the sample interviewed.

Usually, the driving power behind such blatantly biased results is a blatantly biased experiment. Data collection is a remarkably delicate process, subject to more manipulation than a cat faced with a laser pointer. For example, one report claimed that over 70% of Americans were in favor of legalizing marijuana, including 67% of Republicans, the sworn enemies of fun and freedom. However, a closer look at this survey reveals a few discrepancies. For one thing, the poll was conducted by a group known as the MPP. This name may sound innocent enough, until the hyperlink takes you to the Marijuana Policy Project homepage, with a banner that proclaims "we change laws". As if this name weren't bad enough, that 67% figure rounds nicely to two out of three. It is entirely possible that the survey takers made their way down to Berkeley, CA, went to the local hangouts, and asked all the college students their views on smoking the marijuana like a cigarette. On such a street, finding more than three Republicans may prove quite the challenge.
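To see how much damage a convenient choice of street corner can do, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the two groups, their sizes, and their support rates are hypothetical and are not taken from the MPP poll or any other real survey. It simply compares a sample drawn from the whole population with one drawn from a single friendly hangout.

```python
import random


def support_rate(answers):
    """Fraction of respondents who answered yes."""
    return sum(answers) / len(answers)


def simulate(seed=0):
    rng = random.Random(seed)
    # Hypothetical population (invented numbers, not real polling data):
    # a small "campus" group that mostly says yes, and a much larger
    # group elsewhere that mostly says no.
    campus = [rng.random() < 0.90 for _ in range(10_000)]
    elsewhere = [rng.random() < 0.40 for _ in range(90_000)]
    population = campus + elsewhere

    fair_sample = rng.sample(population, 1_000)  # drawn from everyone
    biased_sample = rng.sample(campus, 1_000)    # drawn from one hangout only
    return (
        support_rate(population),
        support_rate(fair_sample),
        support_rate(biased_sample),
    )


if __name__ == "__main__":
    true_rate, fair_rate, biased_rate = simulate()
    print(f"whole population: {true_rate:.2f}")
    print(f"fair sample:      {fair_rate:.2f}")
    print(f"biased sample:    {biased_rate:.2f}")
```

Under these made-up numbers, the fair sample should land close to the true rate, while the hangout sample reports something close to unanimous support. Both samples are the same size, so the gap comes entirely from where the interviewers chose to stand.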

Similar studies can be found for nearly any topic, ranging anywhere from gun control to gay rights. The more hot-button an issue, the more backing there will be. Once you know the obvious trends, such statistics are easy to spot. Unfortunately, emotion is a more powerful trigger than facts, and the cycle perpetuates itself. The greatest perpetrators are public opinion polls. Shoddy reporting leads to charged reactions, which in turn lead to a very easy-to-steer mob mentality. In such instances, the careful rewording of a question can have incredible power. "Do you think homosexuals hold the same rights as others?" will yield a far milder result than a question that contains such words as "oppressed" or "discriminated". Leading the witness happens far more often outside of the courtroom than inside of it.

Cat in Shoes
Based on a True Story
As the problem is highlighted, the need for a change is obvious. Although there are several possible solutions, the simplest and most effective is one that has been used for years. Misinformation is a plague whose only cure is regulation. Several years ago, I was sent to the Redbox in search of a family movie. My little sister was in tow. As I browsed through the selection, her eye caught a movie entitled "The True Story of Puss in Boots". Upon seeing this, she instantly confused it with the popular Shrek spin-off with a similar name. Exasperated, I attempted to explain the concept of a mockbuster to my dear sister. As my coup de grâce, I cleverly pointed out that there was no mention of Shrek anywhere on the box. However, unyielding screaming in public usually beats out logic. Thus, I spent the next two hours watching the CGI mess that marked the low point of William Shatner's career.

In this example, Shrek was the undeniable, trademarked proof that separated the real deal from the raw deal. Statistics needs such a guardian. We need to sort the chocolate chip goodness of strong, true statistics from the nefariously identical oatmeal raisin lookalikes. It would be a simple matter to set up a committee that acts as the snopes.com of the statistical world. Every cute bar graph displayed across the news would require some watermark in the corner, signifying that it had been approved by a decent statistician.

Some may argue the ethical implications of such data control. Others may question its plausibility. Both are strong concerns that never seem to fade from the echo chamber of special interest groups. To those who say such practices are unethical, the answer is obvious. Although advertisements almost always exaggerate, there is a point where the line crosses from hyperbole to hypocrisy. Surely the demand for accurate information eclipses the right to what may be falsely dubbed "free speech".

As for those who claim it is impractical, their argument is much more solid. Even so, the fundamentals of statistics easily clarify this problem. To find the length of the average trout, you do not measure every fish in the lake. Similarly, I will be the first to admit that it is not reasonable to police every statistic. However, in the same way that Legos command a fondness that Mega Bloks never will, so too will the mark of a checked statistic command a trust that unchecked figures never will. There will, of course, be those who refuse to submit their figures for review, but in the process, they will rob themselves of an authority that makes or breaks a sale.
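For the skeptical, the trout claim is easy to check on a pretend lake. The following is a rough Python sketch, not a fishing manual: the lake, the number of fish, and the lengths (centimeters, bell-shaped around 30) are all invented assumptions, and the function names are my own. It measures 200 randomly chosen fish and reports the sample average along with its standard error, a rough gauge of how far off that average is likely to be.

```python
import random
import statistics


def make_lake(n_fish=50_000, seed=1):
    """Build a hypothetical lake: trout lengths in cm (invented numbers)."""
    rng = random.Random(seed)
    return [rng.gauss(30, 5) for _ in range(n_fish)]


def estimate_mean_length(lake, sample_size, seed=0):
    """Estimate the average length from a simple random sample of the lake."""
    rng = random.Random(seed)
    sample = rng.sample(lake, sample_size)  # every fish equally likely to be picked
    mean = statistics.mean(sample)
    # Standard error of the mean: sample standard deviation / sqrt(sample size).
    std_error = statistics.stdev(sample) / sample_size ** 0.5
    return mean, std_error


if __name__ == "__main__":
    lake = make_lake()
    mean, std_error = estimate_mean_length(lake, sample_size=200)
    print(f"sample of 200: {mean:.1f} cm (standard error {std_error:.2f} cm)")
    print(f"whole lake:    {statistics.mean(lake):.1f} cm")
```

The catch, of course, is the word "randomly". The arithmetic only works when every fish has the same chance of ending up in the net, which is exactly the condition a careless or motivated survey breaks.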

Another essential step is to avoid producing skewed results through poorly designed surveys. Most large polling institutions are already very good at preventing this. It's the smaller ones that often experience problems. The reason telephone surveys exist in the first place is to create a wider, less biased sample. For the same reason, pollsters often ask different versions of the same question to eliminate response bias. Such practices ought to be observed across the field. In addition, it is the responsibility of every citizen to participate in such surveys. Uncomfortable though it may be, elections show us what happens when only the extremists show up to voice their opinions.

The problem of shoddy statistics, although certainly not new, is still not set in stone. It is a problem that can be uprooted. In the modern world, information is one of the most important commodities available. So many people who think they have it are being paid with counterfeit bills. Now is the time to choose decency over control. Now is the time to choose enlightenment over ignorance. Although it will take the cooperation of all involved, the world's perception of data could be changed very easily. In this assertion, I advocate for no side except for the truth. If truthful reporting leads to some uncomfortable truths, let them come to light. Whatever the political cost, it is cheaper than the cost of ignorance.


1. "Uhhh...what Did She Just Say?? Miss Teen South Carolina 2007 - Caitlin Upton." YouTube. YouTube. Web. 9 Dec. 2015.

2. Example of the Marijuana Policy Project from Wong, David. "5 Easy Ways to Spot a B.S. News Story on the Internet." Cracked.com. 27 Feb. 2013. Web. 9 Dec. 2015.
