Wednesday, March 19, 2008

Top 10 Commandments of Statistical Inference: #3

We are currently working through the 10 commandments of Statistical Inference. The 3rd commandment is:

Thou shalt not make statistical inference in the absence of a model.

The 4th commandment was to honor the assumptions of your model, and we discussed why that is important. However, some people go even further down the road of insanity and not only “misplace” the assumptions, but misplace the model itself. Any situation in which you wish to draw a statistical inference calls for the use of a model. This ensures that you are “following the rules” of the prescribed model. I am amazed at the backlash this sometimes gets: the “why can’t you just give me an answer” or “why do you need to make it so complicated” complaints. In our society of “quick hit answers,” there is often a belief that there is no time to set up the proper model. I used to work for a person who said “it is better to ask forgiveness than permission” and forced the collection and analysis of data beyond our ability to model the data correctly. What happened? Well, sure, we got an answer, and he went on his merry little way…only to have to come back and do the exact same experiment because the results of the first one proved to be invalid. The result was lost time, money, and resources. If the design had been set up correctly in the beginning, it would have taken 3 days to run. He wanted it in 1. He got it in 1, and then spent months trying to backtrack to get to the answer he would have gotten in 3 days…

So, the next time someone tells you, “I can model it and it will take x amount of time,” you have every right to ask why, but make sure your push for a faster initial result doesn’t cause long-term problems.

Monday, March 10, 2008

Top 10 Commandments of Statistical Inference: #4

Now we are really cooking with the Ten Commandments of Statistical Inference. A long time ago, I wrote an entry about how one should really look at can vs. should. The 4th commandment really speaks to that point:

Thou Shalt honor the assumptions of thy model!

From the engineer who chooses to calculate a performance index on an unstable process to the marketer who uses an independent-samples t-test when there is a correlation between their samples, I find this to be the most commonly broken commandment. Unfortunately, I also think it is one of the most dangerous that we have talked about in this series. All models have assumptions, and it is important to make sure that you satisfy these assumptions. Otherwise, your results are suspect at best and downright worthless most of the time. Usually, at this point I get the argument “but mathematically, I can calculate it.” Sure, mathematically you CAN calculate anything. But theoretically, should you?
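To make the t-test case concrete, here is a minimal sketch using only the Python standard library. The data are made up for illustration: the same 10 subjects measured before and after a change, so the two samples are correlated. It computes the t statistic two ways, once pretending the samples are independent and once respecting the pairing.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Hypothetical data: the same 10 subjects measured before and after a change,
# so the two samples are correlated (paired), not independent.
before = [10, 12, 9, 14, 11, 13, 8, 15, 12, 10]
after = [11, 13, 10, 14, 12, 14, 9, 16, 13, 11]


def independent_t(a, b):
    """Welch two-sample t statistic; assumes the samples are independent."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(b) - mean(a)) / se


def paired_t(a, b):
    """Paired t statistic; works on the within-subject differences."""
    diffs = [y - x for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


t_ind = independent_t(before, after)
t_paired = paired_t(before, after)
print(f"independent t = {t_ind:.2f}")
print(f"paired t      = {t_paired:.2f}")
```

With these made-up numbers the independent-samples statistic comes out below 1, while the paired statistic is about 9. The data are identical; only the assumption changed, and so did the conclusion. In other data sets the error can run in the opposite direction, which is exactly why the assumption has to be checked rather than guessed.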

So, why is this so common? Because there is a misunderstanding of what the assumptions are in the first place. Compound that with the advent of many software programs that make it easier and more user-friendly to calculate results. Now, don’t get me wrong, this is not a bad thing. I myself do not want to go back to hand calculations or Excel formulas. However, if you have never seen the formula and never spent the time understanding the assumptions, then you may not have a grasp of whether what you are doing is correct. Sure, plug in numbers and you will get an answer, but is it right?

How important is this to get right? Let’s put it this way: if a doctor diagnoses you wrong but does everything else right according to his diagnosis, is he right? No, there is no way we would let him get away with it. So, why do we let it pass in statistics?

Friday, March 7, 2008

Top 10 Commandments of Statistical Inference: #5

Well, I am glad the last post was well received, so now I think it is safe for me to start counting down the last 5 commandments of Statistical Inference. Number five:

Thou shalt not adulterate thy model to obtain statistical significance.

Now, when you first look at this you think…Adulterate? But it does make sense. It comes down to some of our previous discussions: make sure you do not knowingly (or unknowingly) allow extraneous variables or inferior ingredients into your model. Make sure you take steps to control for the things you can control for. Sometimes, it is as easy as excluding certain people or certain parameters. Other times you may have to really think practically about what CAN affect your model and control for that. In the marketing world it might mean controlling the time of day when your sends go out. In an experimental design in the lab you may want to run tests at separate times and add a blocking variable. Use common sense and your own knowledge of the situation. This is not a “math” problem per se. Of course there are some techniques available to help, but this typically requires you to sit down, map out your process, brainstorm about all the things that can affect your design, and control the heck out of them when you can. This is always my favorite part of designing experiments. It’s when you can be creative. Next post we will discuss how to understand and fit your situation into a needed model (rather than the other way around!).

Wednesday, March 5, 2008

Statistical Humor Gone Awry!

So, we have gone through the first 5 commandments of Statistical Inference. I am going to move to the next 5 in the next few posts, but before I do it has come to my attention that by presenting these, I have gotten myself into a little bit of a dilemma. My attempt at statistics humor may have gone awry!

I get the feeling that people are wondering when it is OK to do statistical analysis, or whether it is EVER correct to do statistical analysis. Worse yet, they may think statistics can only be done in a laboratory with white coats!

First and foremost let me assure you I do believe wholeheartedly in statistical analysis ;-) If I didn’t, well, I wouldn’t have worked for the last decade plus in the arena. Secondly, I am not the anti-layman stats guy in the ivory tower throwing pennies and guessing the probability of one hitting someone. Far from it. I am actually a trained psychologist, not a statistics major, so I wholeheartedly believe that the term statistician can encompass people who are not “classically” trained but apply statistics to solve problems. Lastly, the 10 commandments are a tongue-in-cheek attempt at humor that some of us stats geeks really enjoy. Sadly, when I first received “the list” from one of my co-workers, all the stats geeks I was working with at the time stopped whatever they were doing and ran the Xerox copier out of paper.

Bottom line, I think anyone can perform valid statistical testing, but it must be valid, and it must follow the rules of what the models were designed for. If you can do this, you will have a wonderful design and results. If not, you are going to find yourself in a mess and you won’t really know why! If you need help, find someone who does understand all the nuances. That’s why they are there!

Friday, February 29, 2008

Top 10 Commandments of Statistical Inference: #6

So, the 6th commandment of Statistical Inference is:

Thou shalt not covet thy colleague’s data.

This sounds like a pretty easy one, but I am amazed to see even now people who view other people’s data and “want what they have.” Why is this so bad? This can drive people to reach beyond what they should in an effort to find “statistical significance” to keep up with the Joneses. How do they do this? Maybe when the numbers don’t match up they use a less stringent model. Perhaps they refuse to use an adjustment when the situation calls for it. This leads them to travel in and out of the grey that is statistics. These tactics are egregious enough, but then there are those who understand statistics even less and wonder why their data doesn’t look like someone else’s. Many times I have been questioned, and sometimes almost blamed or considered a bad statistician, when the data doesn’t look like someone else’s. On a few occasions I have been prodded to make it look more favorable. I refused, much to the person’s chagrin, as they did not understand that much more than just data integrity was on the line.

Bottom line, data is data. It can be made into anything that you want, but only those who truly understand it and use it correctly will learn and help improve the situation. Wishing to have the results of others is not an issue in itself if it helps you drive improvement towards that goal. It’s when it drives you to look the other way in the analysis that it causes problems.

Wednesday, February 27, 2008

Top 10 Commandments of Statistical Inference: #7

The seventh commandment of Statistical Inference is:

Thou shalt not bear false witness against thy control group.

To understand this commandment, first you must know what a control group is. Typically, when doing a scientific study, one looks at the difference between two groups that start out statistically the same. The first group is the experimental group, which receives the treatment; the second, the control group, does not receive the treatment (e.g., it gets a placebo). So, how can someone bear false witness (lie) about the control group? The first, and most common, way is not making sure the control group is statistically the same as the experimental group. If you do not sample properly and then say the control group is the “same” as the experimental group when it is not, this could affect your results. Also, you could have introduced confounding variables into the mix by not controlling both groups properly. You now have no idea if what you see in the experiment is happening because of the treatment or because of variables you did not control for. Finally, all people can be biased, even researchers. They want to prove their theory. If the experiment is testing in any way the effectiveness of a treatment, they may have a tendency to favor the experimental group unknowingly. This is why researchers do “double-blind” studies in which neither the participants nor the researcher know who belongs in which group.
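As a rough illustration of what “statistically the same” means in practice, here is a small Python sketch of random assignment followed by a balance check. The participants, the age covariate, and the group sizes are all invented for the example; a real study would check every covariate that could plausibly matter.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical participants with one covariate (age) we want balanced.
participants = [{"id": i, "age": rng.randint(18, 70)} for i in range(200)]

# Random assignment: shuffle, then split in half.
shuffled = participants[:]
rng.shuffle(shuffled)
treatment, control = shuffled[:100], shuffled[100:]

# Balance check: the groups should look alike on the covariate before treatment.
age_gap = abs(mean(p["age"] for p in treatment) - mean(p["age"] for p in control))
print(f"mean age gap between groups: {age_gap:.1f} years")
```

The point of the shuffle is that nobody, including the researcher, chooses who lands in which group. If the balance check shows a large gap anyway, that is a signal to look at the sampling before looking at the results.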

Not all testing is strict scientific testing, but make sure that you are sampling the correct groups, controlling for the correct variables, and not allowing biases to enter into the results.

Tuesday, February 26, 2008

Top 10 Commandments of Statistical Inference: #8

In our last two posts, we talked about two of the top 10 commandments of statistical inference. We learned that: Thou shalt not infer causal relationships from statistical significance and thou shalt not apply large sample approximation in vain. Today, for the number 8 commandment:

Thou shalt not worship the 0.05 significance level!

This is one I run into all the time. It goes with my earlier post on “Statistical Significance Is Not A License.” For some reason, a common mistake is to focus solely on a 0.05 significance level, and I think there are a couple of reasons. First, it is what is taught in schools for the most part. I remember back to my first few stats classes, and it seems the easiest approach was to just teach “Above or Below 0.05.” Second, it comes from the misunderstanding and overreaching that we tend to do when we have statistical significance. Finally, of course, there can be statistical reasons, but when I ask people why the 0.05 level is so important to them, few understand it this deeply.

The main thing to remember is what it really means. It means that, provided your model’s assumptions hold, you are willing to accept a 5% risk of declaring a difference when there really is none (Type I Error), which is where the “95% confident” language comes from. So, the real question is, how important is this to you? If you are in a lab, then a 0.01 significance level might be what you shoot for. In a sociological study, perhaps a 0.10-0.15 level will satisfy you. Whatever the case, it will depend on how strict you wish to be and what situation you are in.
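One way to see what the significance level buys you is to simulate experiments where there is truly no difference and count how often each threshold cries wolf. The sketch below is an illustration only: it assumes normally distributed data with a known variance of 1, uses a simple z-test, and the sample size and number of runs are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

rng = random.Random(0)
std_normal = NormalDist()


def aa_test_p_value(n=50):
    """Two-sided z-test p-value for two samples from the SAME N(0, 1) population."""
    a = [rng.gauss(0, 1) for _ in range(n)]
    b = [rng.gauss(0, 1) for _ in range(n)]
    z = (mean(a) - mean(b)) / sqrt(2 / n)  # known variance of 1 in each group
    return 2 * (1 - std_normal.cdf(abs(z)))


p_values = [aa_test_p_value() for _ in range(2000)]

# Share of "no real difference" experiments that would be declared significant.
false_positive_rate = {
    alpha: sum(p < alpha for p in p_values) / len(p_values)
    for alpha in (0.01, 0.05, 0.10)
}
for alpha, rate in false_positive_rate.items():
    print(f"alpha = {alpha:.2f}: false positive rate = {rate:.3f}")
```

Each rate should land close to its own alpha. Nothing is special about 0.05; it is simply the false alarm rate you agreed to tolerate, and a different situation can justify a different one.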

Monday, February 25, 2008

Top 10 Commandments of Statistical Inference: #9

OK, so last post we discussed the first rule (or, number 10) of statistical inference. That was, Thou shalt not infer causal relationships from statistical significance. Number 9 is:

Thou shalt not apply large sample approximation in vain.

This is more of a tricky concept. For the most part, the larger the sample size, the closer you are to the population. Right? In other words, the larger the sample, the better it represents the population. Most statistical models are built on this idea, and therefore the larger the sample size you have, the “easier” it becomes to find statistical significance. Even a first-year stats student can point this out. Just look at the t-distribution table in the back of any statistics book and watch what happens to the t-value needed to find significance as the sample size grows. It goes down….

In fact, I remember one of my interview questions for my first job was:

You have one sample with a correlation value of .30 and no significance and another value of .28 and statistical significance. Why would a smaller value have significance?

Because significance has little to do with strength and a larger sample size can help “find significance.”

In other words, don’t use large sample sizes just to find significance. It is important to take the correct sample size for the statistical model you are using. Each model is different, and if you don’t know the assumptions of the model, you should seek out someone and ask about the ramifications of a large sample size. In other words, know thy model!
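The interview question can be worked through in a few lines of Python. This is a sketch under stated assumptions: the question did not give sample sizes, so the values of 20 and 100 below are invented, and the p-values use Fisher’s z-transformation with a normal approximation rather than an exact t-test.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist

std_normal = NormalDist()


def correlation_p_value(r, n):
    """Approximate two-sided p-value for H0: rho = 0 via Fisher's z-transform."""
    z = atanh(r) * sqrt(n - 3)
    return 2 * (1 - std_normal.cdf(abs(z)))


def critical_r(n, alpha=0.05):
    """Smallest |r| that reaches significance at level alpha for sample size n."""
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    return tanh(z_crit / sqrt(n - 3))


# Sample sizes are hypothetical; the interview question did not state them.
p_small = correlation_p_value(0.30, n=20)
p_large = correlation_p_value(0.28, n=100)
print(f"r = 0.30, n = 20:  p = {p_small:.3f}")
print(f"r = 0.28, n = 100: p = {p_large:.3f}")

for n in (10, 30, 100, 1000):
    print(f"n = {n:4d}: |r| needed for significance = {critical_r(n):.3f}")
```

Under these assumptions the weaker correlation is the significant one, purely because of the larger sample. The loop at the end shows the same thing from the other side: the correlation needed to reach significance keeps shrinking as n grows.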

Thursday, February 21, 2008

Top 10 Commandments of Statistical Inference: #10

All,

I am starting a new series on the Top 10 Commandments of Statistical Inference. In my first job as a statistical consultant I was given this and it’s been hanging on my wall ever since. I wish I could claim it was mine, but I can’t. For today:

Number 10: Thou shalt not infer causal relationships from statistical significance.

This seems like it should be number 1, but it isn’t. This goes back to my last post. All too many times, we see statistical significance and use it as a license to do what we want, or to say that x caused y. Bottom line, there are very few, if any, situations in which you can infer a causal relationship from finding significance. Only if you are able to control for every outside variable, and are able to directly manipulate the variables you are testing, could you indicate causality. Of course this is next to impossible. Therefore, as mentioned before, you can only state “Based off of what I know, I can indicate that I am X% confident that what I found in the study, I would find in the general population.”
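A small simulation shows how an uncontrolled outside variable can produce a strong, easily “significant” relationship with no causation behind it. Everything here is made up for illustration: a hidden variable z drives both x and y, and x has no effect on y at all.

```python
import random
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))


rng = random.Random(7)
n = 2000

# A hidden variable z drives both x and y; x has no effect on y at all.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

r_raw = pearson_r(x, y)

# Once z is accounted for (subtracted out), the x-y relationship disappears.
r_adjusted = pearson_r([xi - zi for xi, zi in zip(x, z)],
                       [yi - zi for yi, zi in zip(y, z)])

print(f"correlation of x and y:          {r_raw:.2f}")
print(f"correlation after controlling z: {r_adjusted:.2f}")
```

With 2,000 observations the raw correlation of roughly 0.5 would pass any significance test, yet it says nothing about x causing y. In real data you rarely get to see z, which is the whole problem.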

Wednesday, February 20, 2008

Statistical Significance Is Not A License

Please go check out Avinash Kaushik’s blog post about statistical significance. I found his blog very helpful, and in the entry I like the fact that he begins to discuss how we must use statistics when testing our assumptions. He also points to Brian Teasley’s stats calculator. I pulled this down and tried to find the assumptions underneath. I am contacting both to see if I can get those, and I will let you know my thoughts.

However, one concern I have is that it brought an all too familiar ring to my ear. I am increasingly seeing “Statistical Significance” become a license to do what we want. I want to remind everyone what statistical significance really means. Simply put, in most cases that we are dealing with, statistical significance indicates that you are x% confident that what you found in your testing or sample, you would find in the general population. In the case of an A/B test, it simply tells you that you are x% confident that there is a difference between A and B, and that what you found in your testing, you would find in the population. That is ALL that it tells you. Furthermore, it is contingent upon you doing the right test the right way in the first place. So, even if you have statistical significance, it does not mean what you found was really right. Wrong assumptions, wrong manipulations, and wrong sampling are the issues I find the most often. The sampling piece can be the most problematic. You could do a test in one month and find results, and do the same test the next month and find widely different results…both being significant! What went wrong? You probably do not have your assumptions or sampling down pat. Make sure you do that before you test. Otherwise, “statistical significance” can change from the license to do what you want to a pink slip!
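For readers who want to see what sits underneath a typical A/B significance check, here is a minimal sketch of a pooled two-proportion z-test in Python. This is one common approach, not necessarily what the calculator mentioned above does, and the conversion counts are invented. It also assumes the visitors were randomly and independently assigned to A and B, which is exactly the kind of assumption that has to hold before the number means anything.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided pooled z-test for a difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical counts: 200 of 5,000 convert on A, 250 of 5,000 on B.
p_value = two_proportion_p_value(200, 5000, 250, 5000)
print(f"p-value = {p_value:.4f}")
```

A small p-value here only says the gap between A and B is unlikely to be sampling noise, given the assumptions. It does not say why B did better, and it does not protect you if the two groups were sampled differently.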

Again, I will let you know what I find out about the calculator!

Friday, February 8, 2008

Synergy

In business, most of the time, it doesn’t happen. People tend to work on their own area and own sides leaving two or more extremely strong wheels to spin on their own.

As an analyst who was initially trained in a “conservative” setting, I find that it is sometimes hard to create synergy with those who do not think in tight theoretical ways. When I first started my career, I was pretty idealistic and very conservative with data. All things had to be “balanced.” This made it extremely difficult to create synergy with those who do not have to have things balanced. As I moved on with my career, I quickly realized that more flexibility was needed on my side. This was accelerated when I began working for a photomask company as the corporate statistician. You see, photomask manufacturing is pretty much N-of-1 manufacturing, and as you know statistics in manufacturing is all about replication. So, I quickly found out the key was not to focus on what was different and how to “fit” a statistical model to non-replicates, but to find out really what was the same, what WAS replicated, and control the heck out of that. It worked well, and led to my first two publications. However, I never really was part of a synergy, since my main focus was to pound out reports and focus on how to analyze the same data differently.

I then moved to my current position, and marketing was a new area of focus for me. It was much more fluid than the “lab” of a research facility. What I did realize, however, is that there was some synergy happening within the company. People were working together, not in silos. Lately, we are really starting to pick up momentum, and it is very exciting. Things are “coming together” in a way few analysts actually get to see. Most of us typically sit back, pound out reports, and think of new ways to get and analyze data. What is really exciting, though, is when the metrics begin to line up with the corporate identity. That is a great feeling for an analyst.

So, what’s my point? To create synergy, everyone needs to change and it takes time. It took me years and different situations, to change from a “theoretical” statistician and come closer to the middle. Yet, it can’t just be one person. It will not work if one person moves all the way to the other person. Others also have to come towards the middle. When it does happen though, it can be a very exciting time for everyone.

Monday, February 4, 2008

More Polls

OK, this primary on the Democratic side is going to be a wild one (at least in the news). We have another “swing” with a poll. The CNN/Opinion Research Poll that was reported today has indicated that Obama has now “erased” a gap between himself and Clinton with one day to go. Compare this to a few months ago when Clinton had a “significant” lead over Obama. Meaning? Not much.

Here’s why. Although I appreciate that they state the current poll has a 4.5 point margin of error and do not say he is now in the lead, it means little to nothing as to how things will shake out tomorrow. This is a national survey. So, they are again sampling from a population of people who may not even live in a state that will vote tomorrow, and even if they do, who knows if they will even vote in a primary. So, it means nothing for the number of delegates that Obama or Clinton could pick up. But it sure makes for great headlines, which is the scary part. By dissecting each poll and showing these “wild swings,” the media is creating news, not reporting it. Casual observers who see this may decide to hitch themselves to the wagon of the winner, and it may have a small effect on the outcome tomorrow.

Maybe more interesting is the vote in California, which is voting tomorrow. There was a large fluctuation between two weeks ago, when Clinton had a double-digit lead, and a poll on Sunday that shows an insignificant lead for Clinton (within the 4.5 points). How interesting is this? Not as interesting as they want it to seem. Could it be Oprah? Could it be Maria Shriver? Or could it just be bad sampling? I am still on point to say that a poll should not fluctuate this much within a two-week period if the sampling is right (whether the premise, delivery, or results of the poll are right or not). Bottom line, anytime you are sampling from the same population there should not be such a fluctuation, even if you are asking the wrong thing. In my work, the first thing I look to when I see something like we see here is, “Did I get my sampling right?” “Am I asking the same questions from the same population?” In the case of these polls, I say probably not.
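To put a number on how much a poll should move when the sampling is right, here is a short Python sketch. It assumes a simple random sample and uses the usual normal approximation for a single proportion. The sample size of 475 is hypothetical, chosen only because it gives a margin near the 4.5 points quoted; the actual poll sizes are not given here.

```python
import random
from math import sqrt
from statistics import NormalDist


def margin_of_error(n, p=0.5, confidence=0.95):
    """Normal-approximation margin of error for one proportion from n respondents."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p * (1 - p) / n)


n = 475  # hypothetical sample size; the polls' actual sizes are not given here
moe = margin_of_error(n)
print(f"margin of error for n = {n}: {moe * 100:.1f} points")

# Repeat the "same" poll many times against one fixed population (true support 50%).
rng = random.Random(1)
true_support = 0.50
results = [sum(rng.random() < true_support for _ in range(n)) / n for _ in range(1000)]
within = sum(abs(r - true_support) <= moe for r in results) / len(results)
print(f"share of repeated polls within the margin: {within:.2f}")
```

When the population and the sampling stay the same, the large majority of repeated polls land inside the margin. A double-digit swing in two weeks is therefore a sign that the population changed, the sampling changed, or both.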

So, sorry, maybe Oprah isn’t responsible for such a wild swing after all. Who could be? Hmmm, didn’t Edwards just drop out in the last two weeks? A point that is lost on them I suppose…

Friday, February 1, 2008

Primary Polls

So, why have political polls been so wrong? I get asked this from time to time. Well, it is a complicated answer. First, we must address which polls have the problems. The first type of poll (the one that seems so wrong) is the pre-voting poll, taken weeks or days prior to the vote. The second type, the exit poll, is taken directly after the voter has voted. Obviously, this one is much more accurate (although in the last two elections, even these have been failing much more frequently).

Let’s focus on the pre-voting polls. These polls are taken months, weeks, and days before the vote. To keep things simple, people from a particular demographic are sampled and polled about who they would vote for in the upcoming primary. In the recent primaries, they have been WAY off. Why? Well, it can be quite complicated, but I think there are several things at play. First, the models are broken. Many people are still living in a world where they think that the old social model is still in existence. This is not true. No longer can we typecast people according to strict demographics. Where you could once count on a particular demographic to react or vote one way, you can no longer do so. Why? People have so much more information at their fingertips due to technology. In years past, people would get their information from regional and perhaps one national news source, and they could be swayed more easily since they only got a couple of views. Now, people are inundated with news 24 hours a day, 7 days a week. They also no longer have to count on social networking with people in their vicinity, but rather can converse with people from all across the world who actually hold many of their views, creating micro-groups of people with the same thoughts. The one person on an island no longer exists. In other words, the old models are no longer accurate.

This leads to the second issue, which is sampling. If the models are broken, surely the sampling is as well. When you rely on asking a few people to predict the whole, you must have the correct samples in place. Because of what was stated above, undoubtedly the samples are wrong. How can you tell? Look and see how fast the same polls are changing from week to week or in some cases day to day. One polling center can have wild swings. This is no fluke. If your sample is not accurate, this can happen anytime you are attempting to predict. Furthermore, when polling a primary, you may be asking people who have no plans to vote in the primary. Another reason for the wild swings? Because of the information explosion, people tend to change their minds much more quickly than before. We are a society of instant news and change, which makes it that much harder to predict.

So, what to look for with Super Tuesday coming up? Well, certainly do not look too far into the polls to tell you what is going to happen. The only way to be sure who will come out ahead is to watch the actual results come in.