A ton of comments here told me that the Facebook bookmarklet I made had broken. That happened because Facebook changed the name of the file they were using. I’ve updated the post, so the bookmarklet there should work now.
Hackers with Opinions
There is this big thing on the internet these days where computer people write essays. And it’s weird, because these people are supposed to be techie people, you know, they just code and talk about binary, not write essays. But a very large genre seems to be emerging on the web: hackers with opinions, the people who code and can also write elegantly. When I think of hackers with opinions, I think mostly of writing in the style of Paul Graham, who writes essays on art, ideas, painting, and technology. But I’ve stumbled upon tons of writing like this recently, where common wisdom and tech wisdom become one and the same. It’s weird stuff. And kinda funny, but very interesting.
Predicting Gender of Facebook Statuses
I just did a research project for my CS224N class where we tried to predict the gender of different Facebook statuses. We also made a fun tool that will generate typical male and female Facebook statuses. Check out our research here.
What is the probability you get this question right, given you know Bayes’ rule?
So to rephrase the question:
P(question right | you know Bayes’ rule) = … by Bayes’ Rule
P ( you know Bayes’ rule | question right ) P ( question right ) / P ( you know Bayes’ rule)
That’s kind of confusing.
What struck me is how wrong our intuitions about probability can be. For example, take this question, or riddle:
If you have a test that is 99% accurate for detecting swine flu (if you have swine, it will return positive 99% of the time, and if you don’t, it will return negative 99% of the time) and only 1% of the population has swine flu, what is the probability that you actually have swine flu given that you test positive? (Stop to think for a second before you answer.)
Most people will probably say 99%, because the test was 99% accurate. But what that really means is that the test will return a positive result 99% of the time if you have swine flu: P ( + | swine ) = .99. This is not the same as the question we are asking, namely: what is the probability that you have swine flu given that you test positive, or P ( swine | + )?
This mix-up goes by several related names: the conditional probability fallacy, the confusion of the inverse, and the base rate fallacy. The Monty Hall Problem is a related riddle. See Bayes’ theorem for a more technical analysis.
But the answer is not 99%. It is 50%. Bizarre. So why is this?
Take this numerical example to help illustrate. Say you have a town of 101 people, where 100 of them are perfectly healthy and 1 of them has the swine. Ok.
The question still remains: What is the probability that you have the swine given you test positive? Or what is P ( swine | + ) ?
According to Bayes’ rule that is P ( + | swine ) P ( swine ) / P ( + ).
What that is really saying is that the probability you have swine given you test positive is the ratio of all the people who actually have the swine and tested positive over all the people who tested positive.
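Written out, "all the people who tested positive" splits into true positives and false positives. Here is a sketch of that expansion with the riddle's numbers plugged in. It assumes the test also comes back positive for 1% of healthy people, which the riddle's "99% accurate" suggests but does not spell out, and the label "healthy" is just shorthand for "does not have swine flu".

```latex
\begin{aligned}
P(\text{swine} \mid +)
  &= \frac{P(+ \mid \text{swine})\,P(\text{swine})}
          {P(+ \mid \text{swine})\,P(\text{swine}) + P(+ \mid \text{healthy})\,P(\text{healthy})} \\
  &= \frac{0.99 \times 0.01}{0.99 \times 0.01 + 0.01 \times 0.99} = 0.5
\end{aligned}
```

The two terms in the denominator are equal, which is exactly why the answer lands on one half.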
So let’s go back to the numerical example.
1 person had swine and they tested positive.
1 person had a false positive: out of the 100 healthy people, there was a 1% error rate.
So we had 1 TRUE positive / 2 TOTAL positives or 50%!
The confusion here is several-fold. It is very hard to keep a conditional probability separate from its inverse, and even harder when the two events have very different probabilities. One error is that we forget to take into account how rare the swine was in the first place (in the example, only about 1% had it). Since so many more people don’t have it, even a 1% error rate among them produces about as many false positives as there are true positives.
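To see how much the rarity matters, here is a minimal Python sketch that applies Bayes' rule at a few different prevalence values. The function name `posterior` and its parameter names are made up for illustration, and the 1% false positive rate is the same assumption the town example uses.

```python
def posterior(prevalence, sensitivity=0.99, false_positive_rate=0.01):
    """Return P(swine | +) using Bayes' rule.

    prevalence          -- P(swine), the base rate in the population
    sensitivity         -- P(+ | swine)
    false_positive_rate -- P(+ | healthy), assumed to be 1% here
    """
    true_positives = sensitivity * prevalence
    false_positives = false_positive_rate * (1.0 - prevalence)
    # P(+) is the sum of both ways of testing positive.
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Same test, different base rates.
    for prevalence in (0.001, 0.01, 0.1, 0.5):
        print(f"prevalence {prevalence:>5.1%} -> P(swine | +) = {posterior(prevalence):.1%}")
```

At 1% prevalence this gives the 50% from the riddle. The rarer the disease, the lower the number goes, even though the test itself never changes.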
It’s very interesting, and very non-intuitive. Many doctors presented with this problem get it very wrong!
Let me know if this is interesting to you, clarifies the topic, or confuses it.