Friday, February 29, 2008

Top 10 Commandments of Statistical Inference: #6

So, the 6th commandment of Statistical Inference is:

Thou shalt not covet thy colleague's data.

This sounds like a pretty easy one, but even now I am amazed to see people look at other people's data and "want what they have." Why is this so bad? Coveting can drive people to reach beyond what they should in an effort to find "statistical significance" and keep up with the Joneses. How do they do this? Maybe when the numbers don't match up, they use a less stringent model. Perhaps they refuse to use an adjustment (a multiple-comparison correction, say) when the situation calls for it. This leads them to travel in and out of the grey areas of statistics. These tactics are egregious enough, but then there are those who understand statistics even less and simply wonder why their data doesn't look like someone else's. Many times I have been asked why my results don't look like someone else's, and sometimes I have been all but blamed for it or considered a bad statistician. On a few occasions I have even been prodded to make the data look more favorable. I refused, much to the person's chagrin, because far more than data integrity was on the line.
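
To make the "adjustment" point concrete, here is a minimal sketch in Python (assuming NumPy and SciPy; the twenty tests and group sizes are made up for illustration). It runs many comparisons on pure noise and shows how skipping a multiple-comparison correction manufactures "significance":

```python
# Run 20 t-tests on pure noise (no real differences anywhere) and see how
# often "significance" appears when no adjustment is used.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_tests, alpha = 20, 0.05

p_values = []
for _ in range(n_tests):
    a = rng.normal(0, 1, 30)  # both groups drawn from the same population
    b = rng.normal(0, 1, 30)
    p_values.append(stats.ttest_ind(a, b).pvalue)

hits = sum(p < alpha for p in p_values)
print(f"Unadjusted 'significant' results out of {n_tests}: {hits}")

# A Bonferroni adjustment divides alpha by the number of tests.
adj_hits = sum(p < alpha / n_tests for p in p_values)
print(f"Bonferroni-adjusted 'significant' results: {adj_hits}")
```

On a typical run, a test or two comes up "significant" by chance alone without the adjustment, which is exactly the kind of result a coveting analyst is tempted to report.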

Bottom line, data is data. It can be made into anything that you want, but only those that truly understand it and use it correctly will learn and help improve the situation. Wishing to have the results of others is not an issue in itself; if it helps you drive improvement towards that goal. It’s when it drives you to look the other way in the analysis where it causes problems.

Wednesday, February 27, 2008

Top 10 Commandments of Statistical Inference: #7

The seventh commandment of Statistical Inference is:

Thou shalt not bear false witness against thy control group.

To understand this commandment, first you must know what a control group is. Typically, when doing a scientific study, one looks at the difference between two groups that start out statistically the same. The first group is the experimental group, which receives the treatment; the second, the control group, does not receive the treatment (e.g., it gets a placebo). So, how can someone bear false witness (lie) about the control group? The first and most common way is not making sure the control group is statistically the same as the experimental group. If you do not sample properly, and then claim the groups are the "same" when they are not, your results are suspect. You can also introduce confounding variables into the mix by not controlling both groups properly. You then have no idea whether what you see in the experiment is happening because of the treatment or because of variables you did not control for. Finally, all people can be biased, even researchers. They want to prove their theory. If the experiment tests the effectiveness of a treatment in any way, they may unknowingly favor the experimental group. This is why researchers do "double-blind" studies, in which neither the test group nor the researcher knows who belongs in which group.
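
As an illustration, here is a minimal sketch of the kind of check I mean (Python with NumPy and SciPy; the baseline measure and group sizes are hypothetical): compare the groups on a baseline variable before crediting the treatment with anything.

```python
# Before trusting treatment-vs-control results, check that the two groups
# actually look alike on a baseline measure (e.g., a pre-treatment score).
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Hypothetical baseline values for each group (in practice, your real data).
control_baseline = rng.normal(50, 10, 100)
treatment_baseline = rng.normal(50, 10, 100)

t_stat, p_value = stats.ttest_ind(control_baseline, treatment_baseline)
print(f"Baseline difference: t = {t_stat:.2f}, p = {p_value:.3f}")

# A large p-value is no proof of equivalence, but a small one is a red flag
# that the groups were not sampled from the same population.
if p_value < 0.05:
    print("Warning: groups differ at baseline -- randomization may be broken.")
```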

Not all testing is strict scientific testing, but make sure that you are sampling the correct groups, controlling for the correct variables, and not allowing biases to enter into the results.

Tuesday, February 26, 2008

Top 10 Commandments of Statistical Inference: #8

In our last few posts, we talked about two of the top 10 commandments of statistical inference. We learned that thou shalt not infer causal relationships from statistical significance, and thou shalt not apply large sample approximation in vain. Today, the number 8 commandment:

Thou shalt not worship the 0.05 significance level!

This is one I run into all the time. It goes with my earlier post, "Statistical Significance Is Not A License." For some reason, a common mistake is to focus solely on the 0.05 significance level, and I think there are a couple of reasons for it. First, it is what is taught in schools for the most part. Thinking back to my first few stats classes, it seemed simplest to just teach "above or below 0.05." Second, there is the misunderstanding and overreaching we tend to do once we have statistical significance. Finally, of course, there can be genuine statistical reasons, but when I ask people why the 0.05 level is so important to them, few understand it that deeply.

The main thing to remember is what the level really means. It means that you are 95% confident, given your model's assumptions, and are willing to take a 5% risk of declaring a difference where there is none (a Type I error). So, the real question is: how important is that risk to you? If you are in a lab, then a 0.01 significance level might be what you shoot for. In a sociological study, perhaps a 0.10-0.15 level will satisfy you. Whatever the case, it depends on how strict you wish to be and what situation you are in.
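
A quick simulation makes the trade-off concrete (a sketch in Python with NumPy and SciPy; the group sizes and number of simulated experiments are arbitrary). Under a true null hypothesis, whatever alpha you pick is roughly the false-alarm rate you have signed up for:

```python
# Under a true null (no difference between groups), the chosen alpha is
# approximately the rate of false positives you will see in the long run.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_experiments = 10_000

p_values = np.empty(n_experiments)
for i in range(n_experiments):
    a = rng.normal(0, 1, 25)  # no true difference between these groups
    b = rng.normal(0, 1, 25)
    p_values[i] = stats.ttest_ind(a, b).pvalue

for alpha in (0.01, 0.05, 0.10, 0.15):
    rate = (p_values < alpha).mean()
    print(f"alpha = {alpha:.2f}: false-positive rate ~ {rate:.3f}")
```

There is nothing sacred about the 0.05 line in that output; it is simply one point on a dial you are responsible for setting.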

Monday, February 25, 2008

Top 10 Commandments of Statistical Inference: #9

OK, so last post we discussed the first rule (or number 10) of statistical inference. That was: Thou shalt not infer causal relationships from statistical significance. Number 9 is:

Thou shalt not apply large sample approximation in vain.

This is a trickier concept. For the most part, the larger the sample size, the closer the sample is to representing the population. Right? Most statistical models are built on this assumption, and therefore the larger your sample size, the "easier" it becomes to find statistical significance. Even a first-year stats student can point this out: just look in the back of any statistics book at the t-distribution table and watch what happens to the t-value needed for significance as the sample size grows. It goes down.

In fact, I remember one of my interview questions for my first job was:

You have one sample with a correlation of .30 that is not significant, and another sample with a correlation of .28 that is statistically significant. Why would the smaller value be significant?

Because significance has little to do with strength and a larger sample size can help “find significance.”

In other words, don't use large sample sizes just to find significance. It is important to take the correct sample size for the statistical model you are using. Each model is different, and if you don't know the assumptions of your model, you should seek out and ask about the ramifications of a large sample size. In short, know thy model!
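
To put the interview question in code: the p-value for a correlation comes from t = r·sqrt((n − 2)/(1 − r²)) with n − 2 degrees of freedom, so the same r becomes "significant" once n is large enough. A minimal sketch (Python with SciPy; the sample sizes are hypothetical, chosen only to illustrate):

```python
# Significance of a correlation depends on the sample size, not just on r:
# t = r * sqrt((n - 2) / (1 - r^2)), with n - 2 degrees of freedom.
import math
from scipy import stats

def corr_p_value(r, n):
    """Two-sided p-value for a Pearson correlation r from a sample of size n."""
    t = r * math.sqrt((n - 2) / (1 - r**2))
    return 2 * stats.t.sf(abs(t), df=n - 2)

# Hypothetical sample sizes chosen to mirror the interview question:
print(f"r = 0.30, n = 30:  p = {corr_p_value(0.30, 30):.3f}")   # not significant
print(f"r = 0.28, n = 500: p = {corr_p_value(0.28, 500):.2g}")  # highly significant
```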

Thursday, February 21, 2008

Top 10 Commandments of Statistical Inference: #10

All,

I am starting a new series on the Top 10 Commandments of Statistical Inference. In my first job as a statistical consultant I was given this list, and it has been hanging on my wall ever since. I wish I could claim it as mine, but I can't. For today:

Number 10: Thou shalt not infer causal relationships from statistical significance.

This seems like it should be number 1, but it isn't. It goes back to my last post. All too often, we see statistical significance and use it as a license to do what we want, or to say that x caused y. Bottom line: there are very few situations, if any, in which you can infer a causal relationship from finding significance. Only if you are able to control for every outside variable, and can directly manipulate the variables you are testing, could you indicate causality. Of course, that is next to impossible. Therefore, as mentioned before, you can only state: "Based on what I know, I am X% confident that what I found in the study, I would find in the general population."
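
A tiny simulation shows why significance is not causation. In this sketch (Python with NumPy and SciPy; all numbers are made up), a hidden variable z drives both x and y, so x and y come out significantly correlated even though neither causes the other:

```python
# A lurking variable can produce a highly "significant" correlation between
# x and y even though neither one causes the other.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 200

z = rng.normal(0, 1, n)        # the hidden common cause
x = z + rng.normal(0, 0.5, n)  # x is driven by z, not by y
y = z + rng.normal(0, 0.5, n)  # y is driven by z, not by x

r, p = stats.pearsonr(x, y)
print(f"x-y correlation: r = {r:.2f}, p = {p:.2g}")
# Strongly significant -- yet intervening on x would do nothing to y.
```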

Wednesday, February 20, 2008

Statistical Significance Is Not A License

Please go check out Avinash Kaushik's blog post about statistical significance. I found it very helpful, and I like that he begins to discuss how we must use statistics when testing our assumptions. He also points to Brian Teasley's stats calculator. I pulled it down and tried to find the assumptions underneath. I am contacting both authors to see if I can get those, and I will let you know my thoughts.

However, one concern I have is that it struck an all too familiar ring in my ear. I am increasingly seeing "statistical significance" become a license to do whatever we want. I want to remind everyone what statistical significance really means. Simply put, in most cases we deal with, statistical significance indicates that you are x% confident that what you found in your test or sample, you would also find in the general population. In the case of an A/B test, it simply says: I am x% confident that there is a difference between A and B, and that what I found in my testing, I would find in the population. That is ALL it tells you. Furthermore, it is contingent upon you doing the right test the right way in the first place. Having statistical significance does not mean what you found was really right. Wrong assumptions, wrong manipulations, and wrong sampling are the issues I find most often. The sampling piece can be the most problematic. You could run a test one month and find results, then run the same test the next month and find wildly different results, both statistically significant! What went wrong? You probably do not have your assumptions or sampling down pat. Make sure you do before you test. Otherwise, "statistical significance" can change from a license to do what you want into a pink slip!
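
For the A/B case, here is a minimal sketch of the kind of test a calculator like that might run, a two-proportion z-test (Python with SciPy; the conversion counts are hypothetical). Notice what the p-value does and does not say:

```python
# Two-sided z-test for a difference between two conversion rates. All it can
# say is "this difference is unlikely if A and B were really the same" --
# and only if the samples were drawn properly in the first place.
import math
from scipy import stats

def ab_test(conv_a, n_a, conv_b, n_b):
    """p-value for the difference between two observed conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    return 2 * stats.norm.sf(abs(z))

# Hypothetical numbers: 120/2000 conversions for A vs. 160/2000 for B.
print(f"p-value: {ab_test(120, 2000, 160, 2000):.3f}")
# "Significant" here means "A and B likely differ" -- not "B is the answer."
```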

Again, I will let you know what I find out about the calculator!

Friday, February 8, 2008

Synergy

In business, most of the time, it doesn't happen. People tend to work in their own areas and on their own sides, leaving two or more extremely strong wheels to spin on their own.

As an analyst who was initially trained in a "conservative" setting, I find it is sometimes hard to create synergy with those who do not think in tight theoretical ways. When I first started my career, I was pretty idealistic and very conservative with data. All things had to be "balanced." This made it extremely difficult to create synergy with those who did not need things balanced. As I moved on in my career, I quickly realized that more flexibility was needed on my side. This was accelerated when I began working for a photomask company as the corporate statistician. You see, photomask manufacturing is pretty much N-of-1 manufacturing, and as you know, statistics in manufacturing is all about replication. So, I quickly found that the key was not to focus on what was different and how to "fit" a statistical model to non-replicates, but to find out what really was the same, what WAS replicated, and control the heck out of that. It worked well, and it led to my first two publications. However, I never really was part of a synergy, since my main focus was to pound out reports and think about how to analyze the same data differently.

I then moved to my current position, and marketing was a new area of focus for me. It was much more fluid than the "lab" of a research facility. What I did realize, however, is that there was some synergy happening within the company. People were working together, not in silos. Lately, we have really started to pick up momentum, and it is very exciting. Things are "coming together" in a way few analysts actually get to see. Most of us typically sit back, pound out reports, and think of new ways to get and analyze data. What is really exciting is when the metrics begin to line up with the corporate identity. That is a great feeling for an analyst.

So, what's my point? To create synergy, everyone needs to change, and it takes time. It took me years and different situations to change from a "theoretical" statistician and move closer to the middle. Yet it can't be just one person. It will not work if one person moves all the way over to the other. Others also have to come toward the middle. When it does happen, though, it can be a very exciting time for everyone.

Monday, February 4, 2008

More Polls

OK, this primary on the Democratic side is going to be a wild one (at least in the news). We have another "swing" in a poll. The CNN/Opinion Research poll reported today indicates that Obama has now "erased" the gap between himself and Clinton with one day to go. Compare this to a few months ago, when Clinton had a "significant" lead over Obama. What does it mean? Not much.

Here's why. Although I appreciate that they state the current poll has a 4.5-point margin of error and do not say he is now in the lead, it tells us little to nothing about how things will shake out tomorrow. This is a national survey. They are again sampling from a population who may not even live in a state that votes tomorrow, and even those who do may never vote in a primary. So, it means nothing for the number of delegates that Obama or Clinton could pick up. But it sure makes for great headlines, which is the scary part. By dissecting each poll and showing these "wild swings," the media is creating news, not reporting it. The casual observer who sees this may decide to hitch themselves to the winner's wagon, and that may have a small effect on tomorrow's outcome.
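
As a side note, that "4.5 point" figure is a margin of error, and for a simple random sample it falls out of a standard formula. A quick sketch (Python; the sample sizes are illustrative, and real polls are rarely simple random samples):

```python
# 95% margin of error for a proportion: MoE ~ z * sqrt(p * (1 - p) / n),
# computed at p = 0.5, the worst case.
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Margin of error in percentage points for a poll of n respondents."""
    return 100 * z * math.sqrt(p * (1 - p) / n)

# A 4.5-point margin corresponds to roughly 475 respondents.
for n in (300, 475, 1000):
    print(f"n = {n:4d}: +/- {margin_of_error(n):.1f} points")
```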

Maybe more interesting is the vote in California, which votes tomorrow. There was a large fluctuation from two weeks ago, when Clinton had a double-digit lead, to a poll on Sunday that shows an insignificant lead for Clinton (within the 4.5 points). How interesting is this? Not as interesting as they want it to seem. Could it be Oprah? Could it be Maria Shriver? Or could it just be bad sampling? I still maintain that a poll should not fluctuate this much within a two-week period if the sampling is right (whether the premise, delivery, or results of the poll are right or not). Bottom line: any time you are sampling from the same population, there should not be such a fluctuation, even if you are asking the wrong thing. In my work, the first things I ask when I see something like this are, "Did I get my sampling right?" and "Am I asking the same questions of the same population?" In the case of these polls, I say probably not.

So, sorry, maybe Oprah isn’t responsible for such a wild swing after all. Who could be? Hmmm, didn’t Edwards just drop out in the last two weeks? A point that is lost on them I suppose…

Friday, February 1, 2008

Primary Polls

So, why have political polls been so wrong? I get asked this from time to time. Well, it is a complicated answer. First, we must address which polls have the problems. The first type (the ones that seem so wrong) are the pre-voting polls taken weeks or days prior to voting. The second type, the exit polls, are taken directly after the voter has voted. Obviously, these are much more accurate (although in the last two elections, even these have failed much more frequently).

Let's focus on the pre-voting polls. These polls are taken months, weeks, and days before the vote. To keep things simple: people from particular demographics are sampled and polled about who they would vote for in the upcoming primary. In the recent primaries, these polls have been WAY off. Why? It can be quite complicated, but I think several things are at play. First, the models are broken. Many people still live in a world where they think the old social model exists. It doesn't. No longer can we typecast people according to strict demographics. Where you could once count on a particular demographic to react or vote one way, you can no longer do so. Why? People have so much more information at their fingertips thanks to technology. In years past, people got their information from regional sources and perhaps one national news source, and they could be swayed more easily since they only got a couple of views. Now, people are inundated with news 24 hours a day, 7 days a week. They also no longer have to rely on social networks in their own vicinity; they can converse with people across the world who actually hold many of their views, creating micro-groups of like-minded people. The one person on an island no longer exists. In other words, the old models are no longer accurate.

This leads to the second issue, which is sampling. If the models are broken, surely the sampling is as well. When you rely on asking a few people to predict the whole, you must have the correct samples in place. Because of what was stated above, the samples are undoubtedly wrong. How can you tell? Look at how fast the same polls change from week to week, or in some cases day to day. One polling center can have wild swings. This is no fluke; if your sample is not accurate, this can happen any time you attempt to predict. Furthermore, when polling a primary, you may be asking people who have no plan to vote in the primary at all. Another reason for the wild swings? Because of the information explosion, people change their minds much more quickly than before. We are a society of instant news and change, which makes prediction that much harder.
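
Here is roughly what poll-to-poll movement should look like if the sampling really were right: a quick simulation (Python with NumPy; the support level, poll size, and number of polls are made up) of repeated polls of a fixed population:

```python
# Repeated polls of a fixed electorate: how much should they fluctuate
# by sampling error alone?
import numpy as np

rng = np.random.default_rng(2008)
true_support = 0.48   # hypothetical fixed level of support
n_respondents = 900   # respondents per poll
n_polls = 50          # e.g., roughly one poll a day for several weeks

results = rng.binomial(n_respondents, true_support, n_polls) / n_respondents
print(f"Lowest poll:  {100 * results.min():.1f}%")
print(f"Highest poll: {100 * results.max():.1f}%")
print(f"Spread:       {100 * (results.max() - results.min()):.1f} points")
# The spread stays within a handful of points. Double-digit swings point to
# broken sampling or a genuinely changing population, not chance.
```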

So, what should you look for with Super Tuesday coming up? Well, certainly do not look too deeply into the polls to tell you what is going to happen. The only way to be sure who will come out ahead is to watch the actual results come in.