December 8, 2009

Can statistics predict the future?

The article by Gillian Tett and Peter Thal Larsen, “Market faith goes out the window as the ‘model monkey’ loses track of reality,” describes how companies and major banks use complex mathematical modeling systems to trade on the market, and how bank employees who trade only after entering inputs into a model came to be called “F9 model monkeys.” It emphasizes a problem that some hedge funds and banks are experiencing: trading losses that arise because these institutions base their business strategies on a set of complex models that are not accurate. Companies like Ford and General Motors face credit downgrades because they rely on banks that lean heavily on mathematical models that are either derived from the theories of Nobel Prize-winning economists or partly untested, rather than grounded in real-world behavior. The article also describes how banks started developing their own models, such as “senior” and “mezzanine” models, for pricing collateralized debt obligations (CDOs) and selling them to clients. Banks sell the simpler models to the public and keep the riskier, more complex ones for themselves. Different banks use different models for CDO pricing and market prediction, and this has caused financial damage to institutions like JPMorgan and Deutsche Bank.
Discussion Question: Why do you need to be careful about what your model is telling you?
We need to be careful about what the model is telling us because, in the article, the models used for market prediction and CDO pricing were all based on regression analysis, specifically linear regression. Regression analysis lets us develop a model to predict the value of a numerical variable, in this case the CDO price. The variable we wish to predict is the dependent variable (the CDO price), and the variables we use to make the prediction are the independent variables (all the debt and equity products, the complex mathematical models and formulas, and the hard work of the economists). The fact is that there are always unexpected changes in the market, and relying heavily on the models alone leaves us unable to adapt to those changes, thus causing losses. We have to take into account that there is always room for error, market changes, and changes in demand, so actual values will never fall exactly on a straight regression line; the results will always vary around it. We can use the standard error of the estimate to measure the variation around that line, and check the assumptions of regression to assess the errors, but I guess that would be the next step that the “F9 model monkeys” will take.
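To make the idea of variation around the regression line concrete, here is a minimal Python sketch that fits a straight line by least squares and computes the standard error of the estimate. The five data points are made up purely for illustration (they are not CDO prices or anything from the article), and the function names are my own.

```python
import math

def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def standard_error_of_estimate(x, y, intercept, slope):
    """Typical distance of the observed y values from the fitted line."""
    n = len(x)
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(sse / (n - 2))  # n - 2 degrees of freedom for a line

# Hypothetical data: an input variable and an observed price.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(x, y)
s_yx = standard_error_of_estimate(x, y, b0, b1)
print(f"fitted line: y = {b0:.2f} + {b1:.2f} * x")
print(f"standard error of the estimate: {s_yx:.3f}")
```

Even with data this tidy, the standard error is not zero: the observed values scatter around the line. A model user who reads only the fitted line and ignores that scatter is trusting the model more than the data justify.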
The second article is about stock prediction: how stock prices can supposedly be predicted from certain events (for example, prices will go up if a team from the original NFL wins the Super Bowl) and how stockbrokers and investors are lured into data mining. In this article, Jason Zweig explains that brokers on Wall Street keep trying to find an explanation or a cause that predicts stock market returns, not realizing that because the stock market generates such massive amounts of information, the relationships they find are often purely coincidental. That is what draws them into data mining. Data-mined numbers are so irresistible and alluring that people invest billions of dollars based on hypothetical results that no one knows will work in the real world. To prove the point about data mining, Mr. Leinweber ran an experiment that predicted US stock prices from annual butter production in Bangladesh over the previous 13 years. By also tossing in US cheese production and the total sheep population of both the US and Bangladesh for the same 13 years, he was able to predict US stock returns with 99% accuracy. However, there is no explanation or legitimate reason why US stock returns would be determined by Bangladeshi dairy and livestock figures, so the results are coincidental. There are a couple of rules for avoiding the data-mining trap. The first and most important is that the results have to make sense. The second is to look at the data in separate pieces. The last is to give it some time; hypothetical results will not last once they face reality in the real world.
Discussion Question: What is the difference between correlation and causation?
Correlation describes the degree of relationship between two variables, while causation describes the degree to which one variable causes the other. In this article we can clearly see that correlation does not automatically imply causation. Certain events may be correlated, like US stock prices and Bangladeshi butter production, but there is no explanation of how one causes the other. It looks very appealing, and we would like to be able to predict US stocks from Bangladeshi livestock or from the Super Bowl winner, but we cannot, not until there is some kind of scientific or statistical explanation that makes sense of the relationship. People want to believe what they see, and often believe what they want to believe regardless of the facts. They tend to think that everything is a chain reaction. Once they see a relationship that works, they attach a cause to it without basing it on logical reasoning. We have to make sure not to get caught by data mining; we have to be able to distinguish between what is real and what is just a mirage.
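The data-mining trap is easy to reproduce. The sketch below is my own illustration, not Mr. Leinweber's actual experiment: it generates 13 "years" of pure random noise as fake stock returns, then searches 1,000 equally random candidate series for the one that correlates best with them. All names and sizes here are assumptions chosen for the demonstration.

```python
import math
import random

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

rng = random.Random(0)   # fixed seed so the run is repeatable
YEARS = 13               # same length as the 13-year sample in the article
N_CANDIDATES = 1000      # hypothetical number of unrelated series searched

# Fake "stock returns" and fake "predictors": all pure noise, no real link.
stock_returns = [rng.gauss(0, 1) for _ in range(YEARS)]
candidates = [[rng.gauss(0, 1) for _ in range(YEARS)]
              for _ in range(N_CANDIDATES)]

abs_correlations = sorted(abs(pearson_r(c, stock_returns)) for c in candidates)
typical_r = abs_correlations[N_CANDIDATES // 2]   # a middle-of-the-pack series
best_r = abs_correlations[-1]                     # the data-mined "winner"

print(f"typical |r| among candidates: {typical_r:.2f}")
print(f"best |r| found by searching:  {best_r:.2f}")
```

The best candidate looks like a strong predictor even though every series is noise by construction. With only 13 data points and enough candidates to search through, an impressive correlation is almost guaranteed to turn up, which is why the result has to make sense before it can be trusted.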

October 26, 2009

Statistics Saves Lives!

One of the articles that I read, "Do Cholesterol Drugs Do Any Good?", is about cholesterol-lowering drugs called statins. The article focuses on the argument between money-making pharmaceutical companies and science. According to the article, most of the people taking cholesterol-lowering drugs have no chance of benefiting from them; instead they run a risk of harm. The only people who might benefit are those with some kind of underlying condition, such as heart disease or high blood pressure. Among people who don't have any heart condition, those over the age of 65 and women receive no benefit from statin drugs regardless of their cholesterol levels. One trial of a statin called Lipitor found that, for every 100 people followed for 3 1/3 years, three people on a placebo and two people on Lipitor had a heart attack. So 100 people had to take Lipitor for all those years for just one person to benefit from it. Here is the question: why take the drug at all? Nevertheless, pharmaceutical companies like Pfizer, Merck, Bristol-Myers Squibb, and Schering-Plough are still marketing and selling these drugs with no problems. It has been drilled into millions of Americans that a high level of bad cholesterol is a straight road to the grave and that the only way to lower it to a healthy level is by taking statins. The article also mentions other drugs with a similar story, such as Avandia, which is widely used by people with diabetes. Clinical trials show that it increases the risk of heart attacks, and there is little evidence that the drug actually helps patients with diabetes. Other examples are hormone replacement therapy, which the article links to heart disease, and anti-psychotic medications, which were less effective than a placebo in reducing aggression in people with intellectual disability. Most people, and most doctors, are not aware of these facts.
It all depends on how the statistics and the results are presented. For example, pharmaceutical companies advertise the big percentage drops in heart attacks while obscuring the true number needed to treat (NNT). But when it comes to side effects, they turn the message around by saying that only 1 in 100 people suffer a side effect, even if that represents a 50% increase. So we can see that, unfortunately, the way our healthcare system runs is not based on data and statistical benefits; it is based on what produces big bucks.
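The gap between the advertised percentage and the NNT is just arithmetic. Here is a short Python sketch using the Lipitor trial figures quoted earlier in this post (3 heart attacks per 100 people on placebo, 2 per 100 on the drug); the function name and layout are my own.

```python
def risk_summary(control_events, treated_events, group_size):
    """Return absolute risk reduction, relative risk reduction, and NNT."""
    control_risk = control_events / group_size
    treated_risk = treated_events / group_size
    absolute_reduction = control_risk - treated_risk
    relative_reduction = absolute_reduction / control_risk
    number_needed_to_treat = 1 / absolute_reduction
    return absolute_reduction, relative_reduction, number_needed_to_treat

# Figures from the trial described above: per 100 people over 3 1/3 years.
arr, rrr, nnt = risk_summary(control_events=3, treated_events=2, group_size=100)

print(f"relative risk reduction: {rrr:.1%}")   # the number that gets advertised
print(f"absolute risk reduction: {arr:.1%}")   # the number that matters to a patient
print(f"number needed to treat:  {nnt:.0f}")
```

The same trial can be described as cutting heart attacks by about a third or as helping one person in a hundred. Both statements are true; which one is quoted depends on who is presenting the result.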
The second article is "The Median Isn't the Message" by Stephen Jay Gould. In it, Dr. Gould tells his personal story and how he dealt with it. He was diagnosed with a rare and serious form of abdominal cancer called mesothelioma, which is usually caused by exposure to asbestos. After doing some research he learned that this type of cancer is incurable, with a median mortality of only eight months after discovery. However, Dr. Gould also found that, among people with the same type of cancer and the same age, class, health, and socioeconomic status, those with positive attitudes tend to live longer. Gould is convinced that people need an adequate understanding of statistics to evaluate the real meaning of "median mortality of eight months" and to see the variation around the central tendency as a real chance of exceeding the eight-month expectation, rather than interpreting it as "I will probably be dead in eight months." The data followed a right-skewed distribution: the left side is bounded at zero, which represents death at the moment of diagnosis, while the right tail extends far out, indicating years of life. Dr. Gould was one of those people who analyzed the statistical data available on this type of cancer and concluded that he belonged in the extended tail of the right-skewed distribution curve. He decided to exceed the eight-month prognosis and beat the doctors' expectations. Dr. Gould lived for 20 years from the day he was diagnosed, exceeding everyone's expectations.
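To see why a median of eight months does not mean "dead in eight months," here is a small simulation of a right-skewed survival distribution. It is only a sketch: the lognormal shape and its spread are assumptions I picked so that the median comes out at eight months, and the numbers are not actual mesothelioma data.

```python
import math
import random
import statistics

rng = random.Random(1)   # fixed seed so the run is repeatable

# Assumption: survival times are lognormal with a median of 8 months.
# A lognormal with parameters (mu, sigma) has median exp(mu), so mu = ln(8).
MEDIAN_MONTHS = 8
survival_months = [rng.lognormvariate(math.log(MEDIAN_MONTHS), 1.0)
                   for _ in range(10_000)]

sample_median = statistics.median(survival_months)
sample_mean = statistics.mean(survival_months)
share_over_two_years = sum(m > 24 for m in survival_months) / len(survival_months)
longest = max(survival_months)

print(f"median survival: {sample_median:.1f} months")
print(f"mean survival:   {sample_mean:.1f} months")
print(f"share living more than 2 years: {share_over_two_years:.1%}")
print(f"longest simulated survival: {longest:.0f} months")
```

By definition half of the simulated patients outlive the median, and because the curve is squeezed against zero on the left and stretched on the right, the mean sits well above the median and a meaningful share of patients live for years. That long right tail is where Dr. Gould placed himself.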
Both of these articles relate easily to everyday statistics. It all depends on how someone interprets the numbers. In the case of cholesterol-lowering drugs, scientists say that the benefits of not taking the drug outweigh the risk of having a heart attack. It is not worth spending all that money and suffering all those side effects on drugs that may end up having no benefit in lowering one's LDL cholesterol, or that have only a small effect that could have been achieved with some dieting and simple physical activity. Obviously, if people knew all these facts they would most likely not take the drug, but who would tell them?! Will the pharmaceutical companies that want to increase their profits and returns to shareholders, the doctors who are well compensated by the drug makers for promoting and prescribing their drugs, or the physicians who simply don't know these facts tell patients the truth about taking these drugs? No, most likely not. Statistics will! With a little research, people can learn the truth about the drugs they are taking and make better decisions about their health. The only problem is that we want to believe our doctors and our pharmacists; we tend to believe that by taking the "magic" pill we will feel better.
In the case of Dr. Gould, he chose to be in the right tail of the distribution curve, and he lived longer than expected. With the help of statistics he was able to see what his chances of living longer were, and which factors would help him achieve that goal. Every day we make decisions based on factors that most likely come from statistics, often without even realizing it. Take going to see a movie in a theater, for example. We look at the reviews and ratings first to see whether the movie is good and worth seeing. Well, this is statistics. It shows how many people saw it, how many people liked it, and what the age groups were. Or when we make a purchase, we like to see how many people bought the product, and things like that. Sooner or later everyone faces a dilemma in life that weighs pros and cons, good or bad, life or death. Luckily we have statistics, which gives us the choice to fight or not to fight.

September 20, 2009

Statistics is everywhere!

It all began when Mr. Etzioni, a professor of computer science and engineering at the University of Washington, realized that airfares can be predicted with the help of computers. All he needed was data on seat supply and demand and an algorithm to predict how airlines' systems were going to price those seats. Together with Hugh Crean, Mr. Etzioni created an airfare search engine called Farecast, which predicts how much the price of an airline ticket will rise or fall over the upcoming days. Farecast buys data on seat availability and prices from ITA Software, which sells the same information to travel agents, travel web sites, computer reservation services, and airlines. Here is how it works: the computer uses an algorithm that focuses on the volatility of airline prices and its relation to airline inventory. When the traveler picks the dates and destination of the desired trip, Farecast generates a list of available flights, ordered from cheapest to most expensive, and then tells you which fares are going to go down. Farecast offers information on almost all carriers except JetBlue and Southwest, and the company promises that once it covers all of the major cities in the United States, it will go international by adding foreign routes. Farecast also tells you the best days to travel, displays the cheapest tickets from the city you're traveling from, and shows which departure and arrival times have the cheapest fares. It works similarly to other sites like Zillow.com, which is used to estimate real estate prices, and Inrix, which uses GPS receivers to predict traffic, so that if a freeway is jammed, for example, its computers alert the driver to the least busy alternative route.

Leo McCloskey, the founder of Enologix, believes that by using computers and mathematics, winemakers can compute taste quality from statistical correlations between a wine's chemistry and critics' tasting scores. The system takes grape samples and extracts the juice to measure some of its chemical compounds. Enologix runs each sample through a liquid chromatograph to separate and then measure the compounds. Software then compares the chemistry of the projected wine with benchmark examples: bottled wines that were previously analyzed and rated by growers and wine critics. The outcome of this testing is a score on a 100-point scale, analogous to those used by famous wine critics. The system divides wine into four categories: the lower the tannin, which makes wine taste bitter and astringent, the higher the category and thus the better the quality. It focuses on chemicals such as terpenes, phenols, and anthocyanins, which are responsible for the texture, aroma, taste, and color that determine the quality of the wine. By using this modern technology, winemakers can predict their own critical scores with 95 percent accuracy.

Overall, the bigger picture here is that everything in the world has to do with statistics. The more modernized we become, the more statistics we use. We may not notice it, but everything we do is related on some level to statistics. Our everyday decisions are based on statistics. Whether it’s something simple like traveling or shopping, or something more complicated like buying a house or choosing a school for your child, all of these are based on statistics. Even a simple activity like food shopping is clearly related to statistics. Supermarkets like Shoprite and Pathmark use statistical data to figure out which products should go on sale this week and which should have their prices raised. As we know, people tend to buy more things when they are on sale, or when they are more accessible and convenient. For example, statistics show that if a product is placed on the middle shelf of the grocery store, people buy more of it just because it is within easy reach, compared with the same product placed on the upper or lower shelves, where they would have to reach or bend for it. Timing matters too when you are looking to buy a house or a car. The best time to buy a house is toward the end of the year, because prices tend to be lower in winter. Statistically, more people tend to buy cars right after New Year’s, because new cars drop in price around that time.

The government constantly uses statistics for many economic and political purposes. Statistics are used to prevent government deficits and surpluses, to forecast government spending, and to analyze the state of the economy. Statistics are also used during elections. Presidential polls help predict which candidate will win the election. However, these polls can be misleading at times, because not everyone who responds to a poll will go out and vote. Poll results can therefore turn out to be the opposite of the election results. Today, during the economic crisis, the government uses statistics to calculate the current economic status of the country, to compare unemployment rates, and to predict the future of our economy. Agencies such as the FDA, the WHO, and other major organizations use statistical data to test new drugs and to calculate how much vaccine will be needed to help prevent the H1N1 virus (swine flu).

Personally, I deal with statistics all the time. I work as a dental hygienist in a dental office, and we always analyze our performance and patient flow, and set daily production goals, using the statistical data available to us through a special dental software package called Dentrix. Then we build our patient schedules based on those production goals. Moreover, my yearly bonuses are based on the production from my own patient pool. My boss uses statistical data that shows how many patients I treated and what my production for the year was. It calculates how many dental cleanings I did, how many x-rays I took, and what my overall performance was. Lastly, he calculates a percentage of overall hygiene production and adds it as my bonus.

Today’s society is completely overtaken by statistics. Everywhere you go, and everything you do, is probably observed and transformed into statistical data. How much exercise a person should do on average to stay healthy, or how much coffee an average person consumes in a year: all of it is statistics. Every decision we make and every step we take in life leads back to statistics. Statistics is the way to understand the world around us.