Our Nature Communications paper – ice-like phonons in liquid water

Our paper, “The hydrogen-bond network of water supports propagating optical phonon-like modes”, was published on January 4th in Nature Communications (full open access pdf). A press release about our work has been issued by the Stony Brook Newsroom and picked up by the news aggregator Phys.org.

Our work shows that propagating vibrations, or phonons, can exist in water, just as in ice. The work analyzes both experimental data and the results of extensive molecular dynamics simulations performed with a rigid model (TIP4P/eps), a flexible model (TIP4P/2005f), and an ab initio-based polarizable model (TTM3F). Many of these simulations were performed on the new supercomputing cluster at Stony Brook’s Institute for Advanced Computational Science.

Continue reading

3 Comments

Filed under non-technical, research, Uncategorized

“Accelerando” by Charles Stross

Accelerando
by Charles Stross
2006, 415 pg

“There is an intrinsic unknowability about the technological singularity. Most writers leave it safely offstage or invent reasons why it doesn’t happen. Not Charles Stross. Accelerando lives up to its name, and is the most unflinching look into radical optimism I’ve seen.” – Vernor Vinge

During winter break I finally read Accelerando. I say “finally” because this book was first recommended to me in 2009 at the (now defunct) Rensselaer Polytechnic Institute transhumanism club. Accelerando is notable as perhaps the first novel whose storyline traverses directly through a technological singularity.
Continue reading

Leave a Comment

Filed under books, singularity, technology, Uncategorized

Maximum entropy priors

How do we assign priors?

If we don’t have any prior knowledge, then the obvious solution is to use the principle of indifference. This principle says that if we have no reason for suspecting one outcome over any other, then all outcomes must be considered equally likely. Jakob Bernoulli called this the “principle of insufficient reason”, a play on the “principle of sufficient reason”, which asserts that everything must have a reason or cause. That may be so, but if we are ignorant of the reasons, we cannot say that one outcome will be more likely than any other.
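As a tiny numerical illustration (my own sketch, not from the post): among the distributions we could assign to N outcomes, the uniform assignment of the principle of indifference is also the one that maximizes the Shannon entropy, which is where the maximum entropy prior of the title comes from.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in bits) of a discrete distribution p."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]          # treat 0 * log(0) as 0
    return -np.sum(p * np.log2(p))

# Principle of indifference: six outcomes, all equally likely
uniform = np.full(6, 1 / 6)
# Any other assignment, e.g. one favoring the first outcome
skewed = np.array([0.5, 0.1, 0.1, 0.1, 0.1, 0.1])

print(entropy(uniform))   # log2(6) ~ 2.585 bits, the maximum possible
print(entropy(skewed))    # ~ 2.161 bits, strictly lower
```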
Continue reading

Leave a Comment

Filed under Bayesian inference, machine learning, python, Uncategorized

More counterintuitive Bayesian reasoning problems

Remember how in my last post I said Bayesian reasoning is counter-intuitive? It’s simultaneously maddening and fascinating: provided we accept the assumptions that go into the choice of models/hypotheses and the prior, applying Bayes’ theorem gives the correct probability for each model/hypothesis in light of the evidence presented.

Last time I gave the canonical example of a test for a disease. Humans tend not to take into account that if the base rate (the overall frequency of having the disease) is very low, then a positive result on a test may not be very meaningful. If the probability of the test giving a false positive is 1%, and the base rate is also 1%, then the chance that you have the disease given a positive result is only about 50% (assuming the test always detects the disease when it is present). Once you understand this, the non-intuitive nature goes away.
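For concreteness, here is a minimal numerical sketch of that calculation (assuming, as in the numbers above, a 1% base rate, a 1% false positive rate, and a test that always detects the disease when present):

```python
# Posterior probability of disease given a positive test result,
# via Bayes' theorem: P(disease | +) = P(+ | disease) P(disease) / P(+)

base_rate = 0.01        # P(disease), the overall frequency of the disease
false_positive = 0.01   # P(+ | no disease)
sensitivity = 1.0       # P(+ | disease); assumed perfect for this example

# Total probability of a positive test (the normalizing factor)
p_positive = sensitivity * base_rate + false_positive * (1 - base_rate)

posterior = sensitivity * base_rate / p_positive
print(posterior)  # ~0.503 -- only about a 50% chance of having the disease
```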

Here I will give two more examples of highly non-intuitive Bayesian problems.

The Monty Hall problem
The famous Monty Hall problem can be solved with Bayes’ rule. A statement of the problem is:

There are three doors, labelled 1, 2, 3. A single prize has been hidden behind one of them. You get to select one door. Initially your chosen door will not be opened. Instead, the gameshow host will open one of the other two doors, and he will do so in such a way as to not reveal the prize. After this, you will be given a fresh choice between the two remaining closed doors – you can stick with your first choice, or switch to the other closed door.

Imagine you picked door 1. The gameshow host opens door 3, revealing no prize. Should you (a.) stick with door 1, (b.) switch to door 2, or (c.) does it make no difference?

If you have not heard of this problem before, think it over a while.

The Bayes’ theorem solution is as follows: Denote the possible outcomes as:
 D_1 = the prize is behind door 1
 D_2 = the prize is behind door 2
 D_3 = the prize is behind door 3

We know  P(D_1) = P(D_2) = P(D_3) = \frac{1}{3} before any door is opened. The question is: what is  P(D_2 | E)? We use the symbol  E to denote the data/evidence we have, which is the fact that the gameshow host opened door 3 and revealed no prize. We can solve this using Bayes’ theorem:

 P(D_2|E) = \frac{P(E |D_2)P(D_2) }{ P(E) }
 P(D_2|E) = \frac{P(E |D_2)P(D_2) }{P(E |D_1)P(D_1) +P(E |D_2)P(D_2)}
 \quad = \frac{(1)(\frac{1}{3})}{ (\frac{1}{2})(\frac{1}{3}) + (1)(\frac{1}{3})}
 \quad = \frac{2}{3}

Therefore it is better to switch to door 2. By switching to door 2, we double our chance of winning from  \frac{1}{3} to  \frac{2}{3}. The tricky part is the normalizing factor  P(E), where we must consider the probability that the gameshow host opens door 3 when the prize is behind door 1 (= 1/2) and when the prize is behind door 2 (= 1).

Note the following mind-blowing shortcut to solving the problem:

Since door 3 was opened, we know  P(D_3) = 0. The gameshow host did nothing to interfere with door 1. Thus  P(D_1) = 1/3 as it was in the beginning. Now, we know  P(D_1) + P(D_2) + P(D_3) = 1, so  P(D_2) = 2/3!
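If you would rather check the result numerically than trust the algebra, here is a small Monte Carlo sketch of the game (an illustration of mine, not part of the original derivation):

```python
import random

def play(switch, trials=100_000):
    """Simulate the Monty Hall game and return the fraction of wins."""
    wins = 0
    for _ in range(trials):
        prize = random.randint(1, 3)
        choice = 1  # we always pick door 1, as in the example above
        # The host opens a door that is neither our choice nor the prize
        openable = [d for d in (1, 2, 3) if d != choice and d != prize]
        opened = random.choice(openable)
        if switch:
            # Switch to the remaining closed door
            choice = next(d for d in (1, 2, 3) if d != choice and d != opened)
        wins += (choice == prize)
    return wins / trials

print(play(switch=False))  # ~1/3
print(play(switch=True))   # ~2/3
```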

Bayesian model comparison
Bayes’ theorem allows us to compare the likelihoods of different models being true. To take a concrete example, let’s assume we have a black box with a button attached. Whenever we hit the button, a light on top of the box blinks either green or red. We hit the button a number of times, obtaining a sequence:

GRRGGRGRRGRG…

Let’s say we model the black box as a ‘bent coin’. Thus, our model says that each outcome is statistically independent of previous outcomes and the probability of getting green is  p_g. Using Bayes’ rule, we can infer the most likely value of  p_g for this model, and compute the probability of any  p_g given a sequence of observations. In the interest of space, I won’t solve it here.

We might have a different model, though. We may model the black box as a die. If the die lands on 1, the green light goes on; otherwise, the red light goes on. This corresponds to our earlier, more general model, but with  p_g fixed at  p_g = \frac{1}{6}. The first model has a free parameter,  p_g, while the second model does not.

The method of Bayesian model comparison can tell us which model is more likely. Instead of analyzing the situation with a single model, we now consider both models at the same time. I like to use the term ‘metamodel’ for this. In our metamodel, we assume equal prior probabilities for each model. We denote the sequence of flashes we observed as  s. Model 1 is denoted  \mathcal {H}_1, and model 2 is denoted  \mathcal {H}_2 (the symbol  \mathcal{H} stands for ‘hypothesis’, a word which we take as synonymous with ‘model’).

 P(\mathcal{H}_1 | s) = \frac{ P( s | \mathcal{H}_1) P(\mathcal{H}_1) }{ P(s) }
 P(\mathcal{H}_2 | s) = \frac{ P( s | \mathcal{H}_2) P(\mathcal{H}_2) }{ P(s) }

The relative probability of model 1 over model 2 is encoded in the ratio of the posterior probabilities (since the priors are equal, they cancel):

 \frac{P(\mathcal{H}_1 | s) }{P(\mathcal{H}_2 | s) } = \frac{P(s| \mathcal{H}_1)}{P(s|\mathcal{H}_2)}

The ratio tells us the relative probability that model 1 is correct. Note that the absolute probabilities of model 1 and model 2 can be computed from this ratio using the fact that  P(\mathcal{H}_1|s)+P(\mathcal{H}_2 |s)=1. That’s all on model comparison for now. A more detailed discussion can be found in MacKay’s book.
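As a concrete sketch of how this plays out (my own illustration; the specific counts are hypothetical), suppose we observed g green and r red flashes. Giving the bent-coin model a uniform prior over  p_g, its evidence integrates to a Beta function, while the die model assigns green a fixed probability of 1/6; with equal priors on the two models, the posterior ratio is just the evidence ratio:

```python
import numpy as np
from scipy.special import betaln

def log_evidence_bent_coin(g, r):
    """H1: bent coin, p_g unknown with a uniform prior on [0, 1].
    P(s | H1) = integral over p of p^g (1-p)^r dp = Beta(g+1, r+1)."""
    return betaln(g + 1, r + 1)

def log_evidence_die(g, r, p_g=1/6):
    """H2: die, p_g fixed at 1/6."""
    return g * np.log(p_g) + r * np.log(1 - p_g)

g, r = 7, 5   # hypothetical counts of green and red flashes
log_ratio = log_evidence_bent_coin(g, r) - log_evidence_die(g, r)
print(np.exp(log_ratio))   # P(s|H1) / P(s|H2); > 1 favors the bent-coin model
```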

The case of the blood stains
This problem is taken directly from MacKay’s book:

Two people have left traces of their own blood at the scene of a crime. A suspect, Oliver, is tested and found to have type ‘O’ blood. The blood groups of the two traces are found to be of type ‘O’ (a common type in the local population, having frequency 60%) and of type ‘AB’ (a rare type, with frequency 1%). Do these data (type ‘O’ and ‘AB’ blood were found at scene) give evidence in favour of the proposition that Oliver was one of the two people present at the crime?

At first glance, a lawyer may easily convince the jury that the presence of the type ‘O’ blood stain provides evidence that Oliver was present. The lawyer may argue that while the weight it carries may be small, since type ‘O’ is fairly common, the presence of type ‘O’ blood should nonetheless count as positive evidence. However, this is not the case!

Denote the proposition ‘the suspect and one unknown person were present’ by  S. The alternative,  \bar{S}, states that “two unknown people from the population were present”.

If we assume that the suspect, Oliver, was present, then the likelihood of the data is simply the probability that the unknown second person has blood type ‘AB’:
 P(D|S) = p_{AB} = .01

The likelihood in the other case is the probability that two unknown people drawn from the population have blood types ‘AB’ and ‘O’:
 P(D|\bar{S}) = 2 p_{AB}p_{O} = .012

The second case is more likely. The likelihood ratio is
 \frac{.01}{.012} \approx .83

Thus the data actually provides weak evidence against the supposition that Oliver was present. Why is this?

We can gain some insight by first considering another suspect, Alberto, who has blood type  AB. We denote the hypothesis that Alberto was present by  S', and the hypothesis that he wasn’t present by  \bar{S'}. In this case, the likelihood ratio is:

 \frac{P(D|S')}{P(D|\bar{S'})}=\frac{p_{O}}{2p_{AB}p_{O}}=\frac{.6}{(2)(.01)(.6)}=50
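Here is the same arithmetic restated as a short code sketch (mine, not MacKay’s), computing both likelihood ratios at once:

```python
p_O, p_AB = 0.6, 0.01   # population frequencies of type 'O' and type 'AB'

# Two unknown people: one 'O' stain and one 'AB' stain
# (factor of 2 because either person could have left either stain)
p_data_given_two_unknowns = 2 * p_AB * p_O      # = 0.012

# Oliver (type 'O') present: the unknown second person must be type 'AB'
lr_oliver = p_AB / p_data_given_two_unknowns    # ~0.83, weak evidence against

# Alberto (type 'AB') present: the unknown second person must be type 'O'
lr_alberto = p_O / p_data_given_two_unknowns    # = 50, strong evidence for

print(lr_oliver, lr_alberto)
```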

Clearly, in this case, the evidence does support the hypothesis that Alberto was there. MacKay elaborates: (my emphasis added)

Now let us change the situation slightly; imagine that 99% of people are of blood type O, and the rest are of type AB. Only these two blood types exist in the population. The data at the scene are the same as before. Consider again how these data influence our beliefs about Oliver, a suspect of type O, and Alberto, a suspect of type AB. Intuitively, we still believe that the presence of the rare AB blood provides positive evidence that Alberto was there. But does the fact that type O blood was detected at the scene favour the hypothesis that Oliver was present? If this were the case, that would mean that regardless of who the suspect is, the data make it more probable they were present; everyone in the population would be under greater suspicion, which would be absurd. The data may be compatible with any suspect of either blood type being present, but if they provide evidence for some theories, they must also provide evidence against other theories.

Here is another way of thinking about this: imagine that instead of two people’s blood stains there are ten (independent stains), and that in the entire local population of one hundred, there are ninety type O suspects and ten type AB suspects. Consider a particular type O suspect, Oliver: without any other information, and before the blood test results come in, there is a one in 10 chance that he was at the scene, since we know that 10 out of the 100 suspects were present. We now get the results of blood tests, and find that nine of the ten stains are of type AB, and one of the stains is of type O. Does this make it more likely that Oliver was there? No, there is now only a one in ninety chance that he was there, since we know that only one person present was of type O.

MacKay continues to elaborate this problem by doing a more explicit calculation. In the end the conclusion is:

If there are more type O stains than the average number expected under hypothesis  \bar{S}, then the data give evidence in favour of the presence of Oliver. Conversely, if there are fewer type O stains than the expected number under  \bar{S}, then the data reduce the probability of the hypothesis that he was there.

Note the similarity with the disease test example. The base rates of the blood types must be considered.

Bayesian statistics in court
Ideally, a jury would apply Bayesian reasoning to rank the likelihood of different hypotheses. Denoting the hypothesis that the suspect was present by  S, the weight of the evidence is encoded in the posterior ratio  \frac{P(S|D)}{P(\bar{S}|D)}. In the words of MacKay:

“In my view, a jury’s task should generally be to multiply together carefully evaluated likelihood ratios from each independent piece of evidence with an equally carefully reasoned prior probability.”
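In code, MacKay’s prescription amounts to something like the following sketch (the numbers are hypothetical and purely illustrative): the posterior odds are the prior odds multiplied by the likelihood ratio of each independent piece of evidence.

```python
# Hypothetical, illustrative numbers only
prior_odds = 1 / 1000                  # prior odds that the suspect was present
likelihood_ratios = [50, 0.83, 4.2]    # one ratio per independent piece of evidence

posterior_odds = prior_odds
for lr in likelihood_ratios:
    posterior_odds *= lr

# Convert the final odds back to a probability for the verdict
posterior_prob = posterior_odds / (1 + posterior_odds)
print(posterior_odds, posterior_prob)
```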

The potential for Bayesian methods to improve the criminal justice system is huge. The issue, though, is that statistics can also be easily manipulated by subtle changes in the inputs, and judges and juries can easily be misled if they have no understanding of statistics. One solution is to train the jury in Bayesian statistics during the course of the case, and this has been used by lawyers to help juries understand complicated blood stain DNA evidence. However, many judges (who usually lack a deep understanding of statistics) are immensely skeptical of whether a jury can properly analyze complex statistical data without being hoodwinked. From their point of view, statistics are too opaque.

There remains the question of how much confidence can be placed in a jury to properly apply Bayesian methodology, even after training. Should juries be quizzed on their ability to do Bayesian reasoning before being allowed to deliberate? The challenge is to explain complicated statistical methodologies in a way that lay people can understand, and no solution has yet been found that all parties agree upon. For this reason, the use of Bayesian statistics in courts has been banned in the UK. Obviously, this is not at all an optimal situation.

Leave a Comment

Filed under Bayesian inference, machine learning, statistics, Uncategorized

Exclusion zone water

Note: this is rather technical. 

This spring I had the pleasure of speaking briefly with a distinguished engineer, inventor, businessperson, and benefactor of science. He explained how he has recently become interested in the work of Prof. Gerald Pollack, who discovered what he calls the “4th phase of water”. The very term “4th phase of water” immediately raised an alarm bell in my head, since there are actually 19 or so known phases of water. I decided to check out what this “4th phase” was. It turns out this ‘phase’ has so far only been observed at the boundary with an odd material called Nafion, so really, it’s interfacial water with special properties, not a new phase of the liquid itself. My research focus for the past three years has been understanding the microscopic details underlying the dielectric properties of water. I am very interested in the structure and behavior of water around proteins and dissolved ions (and have read numerous papers on the subject), so naturally I am interested in Dr. Pollack’s claims. Additionally, Pollack has shown that he can use the exclusion zone phenomenon to build a device that filters out nanospheres, and he claims his discovery can be used for desalination technology. He has not yet actually presented a functioning desalination apparatus, but he has filed a patent for the technology.
Continue reading

13 Comments

Filed under pseudoscience, research, technical, Uncategorized

A physicist’s visualization of Bayes’ theorem

Have you noticed that everyone is talking about Bayes’ theorem nowadays?

Bayes’ theorem itself is not very complicated. The human mind, however, is extremely bad at gaining an intuitive grasp of reasoning based on Bayes’ theorem (Bayesian reasoning). The counter-intuitive nature of Bayesian reasoning, combined with the jargon and intellectual baggage that usually accompany descriptions of Bayes’ theorem, can make it difficult to wrap one’s mind around. I am a very visual thinker, so I quickly came up with my own visualization of the theorem. A little Googling shows that there are many different ways of visualizing Bayes’ theorem, and a few months ago I came across one that I found somewhat perplexing. Even though mathematical truths are universal, they are internalized differently by every individual. I would love to hear whether others find my visualization approach useful. It is a very physicist-oriented visualization.
Continue reading

3 Comments

Filed under Mathematics, technical

Moore’s law – Kurzweil vs. Thiel

Everyone knows what Moore’s law is – processors become exponentially cheaper over time. There are a multitude of more precise formulations. Moore’s original 1965 formulation was quite precise – it stated that the number of components per chip achievable at the lowest cost per component doubles every year; in 1975 he revised the doubling time to every two years. Personally, I prefer Ray Kurzweil’s formulation – the computational power (measured in calculations per second) available for $1000 doubles every ~2 years. The two versions are nearly identical, but using calculations per second per $1000 also takes into account how well the transistors are used (layout) and driven (clock speed).
Continue reading

Leave a Comment

Filed under computers, economics, Musings, singularity, technology

Crackpot Nobelists

I have mixed feelings about the Nobel Prize. To some degree, I sympathize with Feynman:

I agree with Feynman that the discoveries people make about the nature of reality are more important than titles and distinctions. The true test of scientific work is what it tells us about reality. Of course “importance” is subjective, but it can be defined operationally in terms of how much a discovery implies about the workings of the universe, as made manifest through the breadth of experimental results it helps explain. The importance of a discovery can also be defined through its impact on technology and society. The latter definition is actually closer to Nobel’s intent, although the former version of “importance” usually implies the latter as well. The invention of the transistor, a singular event achieved by a few people, surely passes the test. Scrolling through the Nobel prizes in physics, they all appear to pass these operational tests at the highest level. Still, more generally, prizes and awards are often corrupted by human cognitive biases and by the insularity of certain social networks – i.e. winning the prize becomes more about ‘who you know’ than ‘what you discovered’. Continue reading

2 Comments

Filed under Musings, non-technical, pseudoscience

Explaining the uncertainty principle correctly

“I think I can safely say that nobody understands quantum mechanics.”
– Richard Feynman, in The Character of Physical Law (1965)


One of the greatest things about being a physics major is that you learn quantum theory and can proudly say that you understand it. When I say “understand” I mean that you are able to apply it to make predictions about reality. As Feynman rightly pointed out, nobody understands quantum mechanics in the sense of understanding why it is the way it is. Fortunately, we don’t have to understand why it is the way it is in order to use it to make predictions. Using quantum mechanics we can, in principle, understand almost everything in the universe, in the sense of being able to predict things. In this respect, quantum mechanics is the highest pinnacle of human thought, since it subsumes almost all of reality into a theory based on just a few axioms and equations.
Continue reading

2 Comments

Filed under quantum mechanics, quantum theory

What’s your evidence threshold?

As we communicate science to the public, it is also important to explain how science works and how to distinguish science from pseudoscience. After years in the field, scientists generally have their ‘bullshit’ detectors pretty finely calibrated, which leads to a type of illusion of transparency. In other words, scientists often underestimate how difficult it is for people outside their field to distinguish established science from pseudoscience. The reason for this is not just that pseudoscience is ubiquitous on the internet. The most persistent and damaging pseudoscience (homeopathy, climate change denial, anti-vaccination, etc.) is justified using studies published in peer-reviewed journals by PhDs. Peddlers of pseudoscience use these studies to support their claims while ignoring all the other studies which refute them. The vast majority of the public will have no knowledge of the follow-up literature (or of the details of the experimental design and the intricacies of statistical significance). Good science reporting will be careful not to make wide-reaching claims on the basis of a single study, and will mention the statistical significance of results, but such reporting is the exception, not the rule. As a result, articles on pseudoscience websites can be effectively indistinguishable from much run-of-the-mill science news reporting.
Continue reading

Leave a Comment

Filed under Uncategorized