The art of expecting p-values

In this post, I try to present the intuition behind the fact that, when studying real effects, one usually should not expect p-values near the 0.05 threshold. If you don’t read quantitative research, you may want to skip this one. If you think I’m wrong about something, please leave a comment and set the record straight!

Recently, I attended a presentation by a visiting senior scholar. He spoke about how his group had discovered a surprising but welcome correlation between two measures, and subsequently managed to replicate the result. What struck me was his choice of words:

“We found this association, which was barely significant. So we replicated it with the same sample size of ~250, and found that the correlation was almost the same as before and, as expected, of similar statistical significance (p < 0.05)”.

This highlights a threefold, often implicit (but WRONG), mental model:

[EDIT: due to Markus’ comments, I realised the original, off-the-top-of-my-head examples were numerically impossible and changed them a bit. Also, added stuff in brackets that the post hopefully clarifies as you read on.]

  1. “Replications with a sample size similar to the original should produce p-values similar to the original.”
    • Example: in subsequent studies with n = 100 each, a correlation (p = 0.04) should replicate as the same correlation (p ≈ 0.04) [this happens about 0.02% of the time when population r is 0.3; in these cases you actually observe an r≈0.19]
  2. “P-values are linearly related to sample size, i.e. a bigger sample gives you proportionately more small p-values.”
    • Example: a correlation (n = 100, p = 0.04) should replicate as a correlation of about the same size when n = 400, with e.g. a p ≈ 0.02. [in the above-mentioned case, the replication gives observed r±0.05 about 2% of the time, but the p-value is smaller than 0.0001 for the replication]
  3. “We study real effects.” [we should think a lot more about how our observations could have come by in the absence of a real effect!]

It is obvious that the third point is contentious, and I won’t consider it here much. But the first two points are less clear, although the confusion is understandable if one has learned and always applied Jurassic (pre-Bem) statistics.

[Note: “statistical power” or simply “power” is the probability of finding an effect, if it really exists. The more obvious an effect is, and the bigger your sample size, the better are your chances of detecting these real effects – i.e. you have bigger power. You want to be pretty sure your study detects what it’s designed to detect, so you may want to have a power of 90%, for example.]

Figure 1. A lottery machine. Source: Wikipedia

To get a handle on how p behaves, we must understand the nature of p-values as random variables 1. They are much like the balls in a lottery machine, with values between zero and one marked on them. The lottery machine of real effects has disproportionately more low (e.g. < 0.01) values on the balls, while the lottery machine of null effects contains a “fair” distribution of numbers on balls (where each number is as likely as any other). If this doesn’t make sense yet, read on.

Let us exemplify this with a simulation. Figure 2 shows the expected distribution of p-values, when we do 10 000 studies with one t-test each, and every time report the p of the test. You can think of this as 9999 replications with the same sample size as the original.

Figure 2: p-value distribution for 10 000 simulated studies, under 50% power when the alternative hypothesis is true. (When power increases, the curve gets pushed even farther to the left, leaving next to no p-values over 0.01)

Now, if we did just six studies with the parameters laid out above, we could see a set of p-values like {0.002, 0.009, 0.024, 0.057, 0.329, 0.479}, half of them (the first three) being “significant”. If we had 80% power to detect the difference we are looking for, about 80% of the p-values would be “significant”. As an additional note, with 50% power, 4% of the 10 000 studies give a p between 0.04 and 0.05. With 80% power, this number goes down to 3%. For 97.5% power, only 0.5% of studies (yes, five for every thousand studies) are expected to give such a “barely significant” p-value.
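For readers who want to check these percentages without running 10 000 simulated studies, here is a minimal Python sketch. It is not the code behind the figures: it assumes a two-sided z-test (a normal approximation to the t-test), and the function names are made up for this illustration. Under that assumption the shares come out close to the 4%, 3% and roughly half a percent quoted above.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def power_two_sided(delta, alpha=0.05):
    """Power of a two-sided z-test whose test statistic is N(delta, 1)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(delta - z_crit) + STD_NORMAL.cdf(-delta - z_crit)

def delta_for_power(power, alpha=0.05):
    """Find the shift delta >= 0 that gives the requested power (bisection)."""
    lo, hi = 0.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if power_two_sided(mid, alpha) < power:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def share_between(p_low, p_high, power, alpha=0.05):
    """Probability that the two-sided p-value lands between p_low and p_high."""
    delta = delta_for_power(power, alpha)
    z_outer = STD_NORMAL.inv_cdf(1 - p_low / 2)   # |z| that gives p = p_low
    z_inner = STD_NORMAL.inv_cdf(1 - p_high / 2)  # |z| that gives p = p_high
    upper_tail = STD_NORMAL.cdf(z_outer - delta) - STD_NORMAL.cdf(z_inner - delta)
    lower_tail = STD_NORMAL.cdf(-z_inner - delta) - STD_NORMAL.cdf(-z_outer - delta)
    return upper_tail + lower_tail

for power in (0.5, 0.8, 0.975):
    share = share_between(0.04, 0.05, power)
    print(f"power {power:.3f}: share of p in [0.04, 0.05] = {share:.3f}")
```

Setting power to 0.05 (i.e. no effect at all) makes the same function return 1%, which is the flat "null" lottery machine.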

The senior scholar mentioned in the beginning was studying correlations. They work the same way. The animation below shows how p-values are distributed for different sample sizes, if we do 10 000 studies with every sample size (i.e. every frame is 10 000 studies with that sample size). The samples are from a population where the real correlation is 0.3. The red dotted line is p = 0.05.

Figure 3. P-value distributions for different sample sizes, when studying a real correlation of 0.3. Each frame is 10 000 replications with a given sample size. If pic doesn’t show, click here for the gif (and/or try another browser).

The next animation zooms in on “significant” p-values in the same way as in figure 2 (though the largest bar goes off the roof quickly here). As you can see, it is almost impossible to get a p-value close to 5% with large power. Thus, there is no way we should “expect” a p-value over 0.01 when we replicate a real effect with large power. Very low p-values are always more probable than “barely significant” ones.

Figure 4. Zooming in on the “significant” p-values. It is more probable to get a very low p than a barely significant one, even with small samples. If pic doesn’t show, click here for the gif.
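The same point can be sketched for correlations in a few lines of Python. This is a rough stand-in for the animations, not the code that produced them: it assumes bivariate normal data with a true correlation of 0.3, uses the Fisher z approximation for the p-value, and runs only 1 000 studies per sample size to keep it quick. All names are hypothetical.

```python
import math
import random
from statistics import NormalDist

def correlation_p_value(xs, ys):
    """Two-sided p-value for Pearson's r, using the Fisher z approximation."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))

def simulate_shares(n, rho=0.3, n_studies=1_000, seed=1):
    """Return (share with p < 0.05, share with 0.04 <= p < 0.05) for sample size n."""
    rng = random.Random(seed)
    noise_sd = math.sqrt(1 - rho ** 2)
    significant = barely = 0
    for _ in range(n_studies):
        xs = [rng.gauss(0, 1) for _ in range(n)]
        ys = [rho * x + noise_sd * rng.gauss(0, 1) for x in xs]
        p = correlation_p_value(xs, ys)
        significant += p < 0.05
        barely += 0.04 <= p < 0.05
    return significant / n_studies, barely / n_studies

results = {n: simulate_shares(n) for n in (50, 100, 250)}
for n, (significant, barely) in results.items():
    print(f"n = {n}: p < 0.05 in {significant:.1%}, barely significant in {barely:.1%}")
```

With n around 250, as in the senior scholar's replication, nearly every simulated study is significant and hardly any land between 0.04 and 0.05.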

But what if there is no effect? In this case, every p-value is equally likely (see Figure 5). This means that, in the long run, getting a p = 0.01 is just as likely as getting a p = 0.97 and, by implication, 5% of all p-values are under 0.05. Therefore, the share of studies that generate a p between 0.04 and 0.05 is 1%. Remember how this percentage was 0.5% (five in a thousand) when the alternative hypothesis was true under 97.5% power? Indeed, when power is high, these “barely significant” p-values may actually speak for the null, not the alternative hypothesis! The same goes for e.g. p = 0.024, when power is 99% [see here].

Figure 5. p-value distribution when the null hypothesis is true. Every p is just as likely as any other.
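To see the flat distribution for yourself, here is a small Python sketch under simple assumptions: each simulated study draws 30 observations from a standard normal population whose true mean is exactly zero, and tests that mean with a two-sided z-test (known standard deviation). The sample size, seed and function name are arbitrary choices for this illustration.

```python
import random
from statistics import NormalDist, fmean

def simulate_null_p_values(n_studies=20_000, n=30, seed=1):
    """Simulate two-sided z-test p-values when the null hypothesis is true."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    p_values = []
    for _ in range(n_studies):
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = fmean(sample) * n ** 0.5  # sd is known to be 1, so z is N(0, 1) under the null
        p_values.append(2 * (1 - std_normal.cdf(abs(z))))
    return p_values

p_values = simulate_null_p_values()
share_significant = sum(p < 0.05 for p in p_values) / len(p_values)
share_barely = sum(0.04 <= p < 0.05 for p in p_values) / len(p_values)
print(f"p < 0.05: {share_significant:.3f}")
print(f"0.04 <= p < 0.05: {share_barely:.3f}")
```

Up to simulation noise, about 5% of the p-values fall under 0.05 and about 1% fall between 0.04 and 0.05, as any other slice of the same width would.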

Consider the lottery machine analogy again. Does it make better sense now?

The lottery machine of real effects has disproportionately more low (e.g. < 0.01) values on the balls, while the lottery machine of null effects contains a “fair” distribution of numbers on balls (each number is as likely as any other).

Let’s look at one more visualisation of the same thing:

Figure 6. The percentages of “statistically significant” p-values evolving as sample size increases. If the gif doesn’t show, you’ll find it here.

Aside: when the effect one studies is enormous, sample size naturally matters less. I calculated Cohen’s d for the Asch 2 line segment study, and a whopping d = 1.59 emerged. This is surely a very unusual effect size in psychological experiments, and leads to high statistical power even under low sample sizes. In such a case, by the logic presented above, one should be extremely cautious of p-values closer to 0.05 than zero.

Understanding all this is vital in interpreting past research. We never know what the data generating system has been (i.e. are p-values extracted from a distribution under the null, or under the alternative), but the data gives us hints about what is more likely. Let us take an example from a social psychology classic, Moscovici’s “Towards a theory of conversion behaviour” 3. The article reviews results, which are then used to support a nuanced theory of minority influence. Low p-values are taken as evidence for an effect.

Based on what we learned earlier about the distribution of p-values under the null vs. the alternative, we can now see under which hypothesis the p-values are more likely to occur. The tool to use here is called the p-curve 4, and it is presented in Figure 7.

Figure 7. A quick-and-dirty p-curve of Moscovici (1980). See this link for the data you can paste onto p-checker or p-curve.

You can directly see how a large portion of the p-values is in the 0.05 region, whereas you would expect them to cluster near 0.01. The p-curve analysis (from the p-curve website) shows that evidential value, if there is any, is inadequate (Z = -2.04, p = .0208). Power is estimated to be 5%, consistent with the null hypothesis being true.
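The counting behind a quick-and-dirty p-curve is simple enough to sketch in Python. This is only the binning step, not the p-curve app's statistical test, and the p-values below are made up for illustration; they are not the values from Moscovici (1980).

```python
from bisect import bisect_right

BIN_EDGES = [0.01, 0.02, 0.03, 0.04, 0.05]  # upper edges of the five p-curve bins

def p_curve_counts(p_values):
    """Count significant p-values in the bins [0, .01), [.01, .02), ..., [.04, .05)."""
    counts = [0] * len(BIN_EDGES)
    for p in p_values:
        if 0 <= p < 0.05:  # non-significant results are left out of a p-curve
            counts[bisect_right(BIN_EDGES, p)] += 1
    return counts

# Hypothetical p-values, for illustration only.
hypothetical_p_values = [0.003, 0.012, 0.031, 0.038, 0.041, 0.044, 0.047, 0.049, 0.20]
counts = p_curve_counts(hypothetical_p_values)
for upper, count in zip(BIN_EDGES, counts):
    print(f"[{upper - 0.01:.2f}, {upper:.2f}): {'#' * count}")
```

A pile-up in the last bin, as in this made-up example, is the pattern to be wary of; a set of studies on a real effect with decent power should pile up in the first bin instead.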

The null being true may or may not have been the case here. But looking at the curve might have helped researchers, who spent some forty years trying unsuccessfully to replicate the pattern of Moscovici’s afterimage study results 5.

In a recent talk, I joked about a bunch of researchers who tour around holiday resorts every summer, making people fill in IQ tests. Each summer they keep the results which show p < 0.05 and scrap the others, eventually ending up in the headlines with a nice meta-analysis of the results.

Don’t be those guys.


Disclaimer: the results discussed here may not generalise to some more complex models, where the p-value is not uniformly distributed under the null. I don’t know much about those cases, so please feel free to educate me!

Code for the animated plots is here. It was inspired by code from Daniel Lakens, whose blog post inspired this piece. Check out his MOOC here. Additional thanks to Jim Grange for advice on gif making and Alexander Etz for constructive comments.


  1. Murdoch, D. J., Tsai, Y.-L. & Adcock, J. P-Values are Random Variables. The American Statistician 62, 242–245 (2008).
  2. Asch, S. E. Studies of independence and conformity: I. A minority of one against a unanimous majority. Psychological monographs: General and applied 70, 1 (1956).
  3. Moscovici, S. in Advances in Experimental Social Psychology 13, 209–239 (Elsevier, 1980).
  4. Simonsohn, U., Simmons, J. P. & Nelson, L. D. Better P-curves: Making P-curve analysis more robust to errors, fraud, and ambitious P-hacking, a Reply to Ulrich and Miller (2015). J Exp Psychol Gen 144, 1146–1152 (2015).
  5. Smith, J. R. & Haslam, S. A. Social psychology: Revisiting the classic studies. (SAGE Publications, 2012).

The legacy of social psychology

To anyone teaching psychology.

In this post I express some concerns about the prestige given to ‘classic’ studies, which are widely taught in undergraduate social psychology courses around the world. I argue that rather than just demonstrating a bunch of clever but dodgy experiments, we could teach undergraduates to evaluate studies for themselves. To exemplify this, I quickly demonstrate power, Bayes factors, the p-checker app and the GRIM test.

psychology’s foundations are built not of theory but with the rock of classic experiments

Christian Jarrett

Here is an out-of-context quote from Sanjay Srivastava from a while back:


This got me thinking about why and how we teach classic studies.

Psychologists usually lack the luxury of well-behaving theories. Some have thus proposed that the classic experiments, which have survived in the literature until the present, serve as the bedrock of our knowledge 1. In the introduction to a book retelling the stories of classic studies in social psychology 2, the authors note that classical studies have “played an important role in setting the research agenda for the field as it has progressed over time” and “serve as common points of reference for researchers, teachers and students alike”. The authors continue by pointing out that many of these classics lacked sophistication, but that this in fact is a feature of their enduring appeal, as laypeople can understand the “points” the studies make. Exposing the classics to modern statistical methods would thus miss their point.

Now, this makes me wonder: if the point of a study is not to assess the existence of a phenomenon, what in the world may it be? One answer would be to serve as historical examples of practices no longer considered scientific, but I doubt this is what’s normally thought. Notwithstanding, I wanted to dip into the “foundations” of our knowledge by demonstrating the use of some more-or-less recently developed tools on a widely known article. According to Google Scholar, the Festinger and Carlsmith cognitive dissonance experiment 3 has been cited over three thousand times, so its influence is hard to downplay.


But first, a necessary digression: statistical power is the probability of detecting a “significant” effect of the postulated size, if the null hypothesis is false. As explained in Brunner & Schimmack 4, it is an interesting anomaly that the statistical power of studies in psychology is usually small, but almost all of them end up finding these “significant” results. As to how small: average power is unlikely to exceed 50% 5–7, and for small (conventional?) effect sizes, the mean has been shown to be as low as 24%. As a recent replication project regarding the ego depletion effect 8 exemplified, a highly “replicable” (as judged by the published record) phenomenon may turn out to be a fluke when null findings are taken into account. This has recently made psychologists consider the uncomfortable possibility that entire research lines consisting of “accumulated scientific evidence” may in fact not contain that much evidence 9,10.

So, what is the statistical power of Festinger and Carlsmith? Using G*Power 11, it turns out that they had an 80% chance to discover a humongous effect of d = 0.9, and only a coin flip’s probability to find a (still large) effect of d = 0.64. Now, if an underpowered study finds an effect, with current practices it is likely to be exaggerated, and/or even of the wrong sign 12. Here would be a nice opportunity to demonstrate these concepts to students.
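If G*Power is not at hand, the two power figures can be approximated by brute force. The Python sketch below is an illustration under stated assumptions, not a reproduction of the G*Power calculation: it assumes two groups of n = 20 with normally distributed scores and equal variances, a two-sided independent-samples t-test at the 5% level, and a hard-coded critical t-value for 38 degrees of freedom. The function names are invented for this example.

```python
import random

N_PER_GROUP = 20       # assumed group size
T_CRIT_DF38 = 2.024    # two-sided 5% critical value of Student's t, df = 38

def sample_variance(xs):
    """Unbiased sample variance."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

def simulated_power(d, n_studies=10_000, seed=1):
    """Share of simulated two-group studies with |t| above the critical value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        control = [rng.gauss(0, 1) for _ in range(N_PER_GROUP)]
        treated = [rng.gauss(d, 1) for _ in range(N_PER_GROUP)]
        # With equal group sizes the pooled variance is the mean of the two variances.
        pooled_var = (sample_variance(control) + sample_variance(treated)) / 2
        mean_diff = sum(treated) / N_PER_GROUP - sum(control) / N_PER_GROUP
        t = mean_diff / (pooled_var * 2 / N_PER_GROUP) ** 0.5
        hits += abs(t) > T_CRIT_DF38
    return hits / n_studies

power_large = simulated_power(0.9)
power_medium = simulated_power(0.64)
print(f"d = 0.90: power is roughly {power_large:.2f}")
print(f"d = 0.64: power is roughly {power_medium:.2f}")
```

The simulated values should land near 0.8 and 0.5 respectively, give or take simulation noise.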

Considering the low power, it may not come as a surprise that the evidence the study provided was low to begin with. A Bayes Factor (BF) is an indicator of evidence for one hypothesis, in relation to another. In this case, a BF of ~3 moves an impartial observer from being 50% sure the experiment works to being 75% sure, or a skeptic from being 25% sure to being 50% sure that the effect is small instead of nil.
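The belief updating here is just odds arithmetic: posterior odds equal prior odds times the Bayes factor. A minimal Python sketch (the function name is made up for this illustration):

```python
def posterior_probability(prior_probability, bayes_factor):
    """Update a prior probability with a Bayes factor via the odds form of Bayes' rule."""
    prior_odds = prior_probability / (1 - prior_probability)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1 + posterior_odds)

print(posterior_probability(0.50, 3))  # impartial observer: 0.75
print(posterior_probability(0.25, 3))  # skeptic starting at 25%
```

The same function makes it easy to show students how little a BF of 3 moves someone who starts out very doubtful, e.g. at a prior of 5%.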

It would be relatively simple to introduce Bayes Factors with this study. The effect of a prior scale in this case does not matter much for reasonable choices, as exemplified with a plot made in JASP with two clicks:

Figure 1: Bayes factor robustness check for the main finding of the dissonance study. Plotted by JASP, using n = 20 for both groups, a t-value of 2.48 and a Cauchy prior scale of 0.4.

Nowadays it is easy to check whether a paper correctly reports test statistics and their associated p-values. The p-checker app (this link feeds the relevant statistics to the app) can do this, and it turns out that most of the t-values in the paper are incorrectly rounded down (assuming that “significant at the 0.08 level” means p < 0.08). You can demonstrate this by including the link on your slides, using it to go to p-checker and choosing “p-values correct?”.

Finally, you can look at the study using the GRIM test 13, which evaluates if the reported means are mathematically possible. As it turns out, a quarter of the reported means in the table with the main results do not pass the test. One more time: 25% of the reported means are mathematically impossible. The most likely explanation for this is shoddy reporting of means or accidental misreporting of sample sizes, but I find it telling that—to my knowledge, at least—the issue has not come up in fifty years of scientific investigation.

Figure 2: Main results table of the Festinger & Carlsmith study. Circled means are mathematically impossible given the reported sample sizes.
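The logic of the GRIM test fits in a few lines of Python. The sketch below is a simplified illustration rather than the published procedure in full: it assumes every response is a whole number, so that the sum of n responses is an integer, and it treats rounding boundaries leniently. The example means are hypothetical, not the values from the Festinger & Carlsmith table.

```python
import math

def grim_consistent(reported_mean, n, decimals=2):
    """Can some integer sum of n whole-number responses round to the reported mean?"""
    half_unit = 0.5 * 10 ** -decimals
    # Range of sums whose mean would round to the reported value
    # (the tiny tolerance keeps values exactly on a rounding boundary).
    lowest_sum = math.ceil((reported_mean - half_unit) * n - 1e-9)
    highest_sum = math.floor((reported_mean + half_unit) * n + 1e-9)
    return lowest_sum <= highest_sum

# With n = 20, possible means are multiples of 0.05.
print(grim_consistent(1.35, 20))   # True: 27 / 20 = 1.35
print(grim_consistent(1.33, 20))   # False: no integer sum gives 1.33
print(grim_consistent(-0.45, 20))  # True: -9 / 20 = -0.45
```

A mean that fails this check is not proof of wrongdoing; as noted above, a misreported sample size or a typo produces the same result.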

Now, even though I have doubts about this study, as well as the process by which the theory has “evolved” 14, it does not mean that cognitive dissonance effects do not exist. It is just that the research may not have been able to capture the essence of this everyday phenomenon (which, if it exists, can influence behaviour without the help of academics). Under the traditional paradigm of psychological science, fraught with publication bias and unhelpful incentives 10, a Registered Replication Report (RRR) type of work would be needed, and even that could only test one operationalisation. As an undergraduate, I would have been exhilarated to hear early about how and why such initiatives work, and why the approach is much more informative than any single experiment.

Let us return to the notion of the bedrock of psychology consisting of classic experiments rather than of theories as in the natural sciences 1. Perhaps we need a more solid foundation, regardless of whether some flashy findings from decades ago happened to spur a progressive-ish 15,16 line of research.

How would such a foundation come to be? Maybe teaching could play a role?


  1. Jarrett, C. Foundations of sand? The Psychologist 21, 756–759 (2008).
  2. Smith, J. R. & Haslam, S. A. Social psychology: Revisiting the classic studies. (SAGE Publications, 2012).
  3. Festinger, L. & Carlsmith, J. M. Cognitive consequences of forced compliance. The Journal of Abnormal and Social Psychology 58, 203–210 (1959).
  4. Brunner, J. & Schimmack, U. How replicable is psychology? A comparison of four methods of estimating replicability on the basis of test statistics in original studies. (2016).
  5. Button, K. S. et al. Power failure: why small sample size undermines the reliability of neuroscience. Nat Rev Neurosci 14, 365–376 (2013).
  6. Cohen, J. Things I have learned (so far). American psychologist 45, 1304 (1990).
  7. Sedlmeier, P. & Gigerenzer, G. Do studies of statistical power have an effect on the power of studies? Psychological bulletin 105, 309 (1989).
  8. Hagger, M. S. et al. A multi-lab pre-registered replication of the ego-depletion effect. Perspectives on Psychological Science (2016).
  9. Earp, B. D. & Trafimow, D. Replication, falsification, and the crisis of confidence in social psychology. Front. Psychol 6, 621 (2015).
  10. Smaldino, P. E. & McElreath, R. The Natural Selection of Bad Science. arXiv preprint arXiv:1605.09511 (2016).
  11. Faul, F., Erdfelder, E., Lang, A.-G. & Buchner, A. G*Power 3: a flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behav Res Methods 39, 175–191 (2007).
  12. Gelman, A. & Carlin, J. Beyond Power Calculations Assessing Type S (Sign) and Type M (Magnitude) Errors. Perspectives on Psychological Science 9, 641–651 (2014).
  13. Brown, N. J. L. & Heathers, J. A. J. The GRIM Test: A Simple Technique Detects Numerous Anomalies in the Reporting of Results in Psychology. Social Psychological and Personality Science (2016). doi:10.1177/1948550616673876
  14. Aronson, E. in The science of social influence: Advances and future progress (ed. Pratkanis, A. R.) 17–82 (Psychology Press, 2007).
  15. Lakatos, I. History of science and its rational reconstructions. (Springer, 1971).
  16. Meehl, P. E. Appraising and amending theories: The strategy of Lakatosian defense and two principles that warrant it. Psychological Inquiry 1, 108–141 (1990).


How lack of transparency feeds the beast

This is a presentation I held for the young researchers branch of the Finnish Psychological Society. I show how low power and lack of transparency can lead to weird situations, where the published literature contains little or no knowledge.


We had big fun with Markus Mattsson and Leo Aarnio in a seminar, presenting to a great audience of eager young researchers.

The slides for my talk are here:

If you’re interested in more history and solutions, check out Felix Schönbrodt’s slides here. Some pictures were made adapting code from a wonderful Coursera MOOC by Daniel Lakens. For Bayes, check out Alexander Etz’s blog.

Oh, and for the monster analogy; this piece made me think of it.

Getting Started With Bayes

This post presents a Bayesian roundtable I convened for the EHPS/DHP 2016 health psychology conference. Slides for the three talks are included.

So, we kicked off the session with Susan Michie and acknowledged Jamie Brown who was key in making it happen, but could not attend.


Robert West was the first to present; you’ll find his slides, “Bayesian analysis: a brief introduction”, here. This presentation gave a brief introduction to Bayes and how belief updating with Bayes Factors works.

I was the second speaker, building on Robert’s presentation. Here are slides for my talk, where I introduced some practical resources to get started with Bayes. The slides are also embedded below (some slides got corrupted by Slideshare, so the ones in the .ppt link are a bit nicer).

The third and final presentation was by Niall Bolger. In his talk, he gave a great example of how using Bayes in a multilevel model enabled him to incorporate more realistic assumptions and, consequently, evaporate a finding he had considered somewhat solid. His slides, “Bayesian Estimation: Implications for Modeling Intensive Longitudinal Data”, are here.

Let me know if you don’t agree with something (especially in my presentation) or have ideas regarding how to improve the methods in (especially health) psychology research!

A short intro to what’s up

In the slides below, I present what I’m currently (August 2016) up to on the health psychology front, and what I may be doing in the next couple of years, regarding employee well-being.



Note 1: If it’s not obvious, there are a lot of people besides me making these projects happen. I’m especially indebted to Nelli Hankonen, the principal investigator of both of them.

Note 2: The (Finnish only) web site of the Let’s Move It intervention is here.

Introducing Raistlin Laplace

[See English version here.]


Raistlin Laplace has just received a diagnosis from his psychiatrist, and on a spring day he sits down to read it on a bench in New York’s sunny Central Park. Thin lips sound out the verdict held by bony fingers: “Epistemological noise hypersensitivity”. It had something to do with how patterns (the signal) are picked out from the midst of noise; how, for example, a radio tuned between channels tells little about the musical tastes of those who compile the playlists, because there is too much noise relative to the signal. On the other hand, the human brain is a pattern recognition machine without equal, and can effortlessly detect satanic verses in music played backwards or Jesus in a dog’s anus. Mr. Laplace’s problem was the reverse of the approach taken by the crowd searching for hidden codes in the Bible: an obsessive avoidance of illusory patterns created by randomness. The diagnosis made sense in a way, but he had long ago stopped trusting things that made sense.

Raistlin spent his childhood in a small village called Naples in the United States. The name had sounded suitably European to his French-Russian immigrant parents, who wanted to offer their only child a tolerant environment to grow up in, in some less war-prone corner of the world. Only after moving did it dawn on them that Naples was in fact a redneck village, where parents spent a large part of their day commuting and children spent theirs in bloody fights with the youths of neighbouring villages.

Raistlin had always looked sickly, although he was hardly ever ill. His frail and pallid appearance, along with the contrast between his piercing blue eyes and pitch-black hair, had the superstitious elderly whispering from the start. Aware of this, and having discovered the local library at elementary school age, he found that he took immense pleasure in the books of the fringe science corner, absorbing everything from occultism and shamanism to new age volumes. He never cared for sports, or for that matter anything else that interested other children, and would have liked best simply to spend time alone with his books.

His information universe collapsed for the first time in junior high when, having tried everything, he had to admit that the rituals and techniques in the books of the fringe science corner did not work as promised after all. Not everything written by adults was infallible; not all information contained knowledge. But this was only the beginning.

On a rainy autumn day in New York, 28-year-old Raistlin sealed the envelope containing his startup company’s bankruptcy application and wondered what had gone wrong. He had done everything right: read the right books, followed the tactics of successful companies, listened to hundreds of hours of sales training tapes drawing on popular psychology, taken calculated risks, and worked for years with a never-give-up attitude. At some point the money simply ran out and, with creditors breathing down his neck, no more came in. In the dim light of his studio apartment, Raistlin wondered with slowly growing horror how authors knew that the things they claimed to know to be true actually were true. Did they really differ in some way from him, who could at this moment have been the head of a successful technology company, if only a few small things had happened to go differently?

The man did not sleep that night. Through his mind ran the countless hours he had spent with newspapers without learning anything about how the world works. The science news reporting new breakthroughs, all of which had afterwards turned out to be premature; all those books whose authors believed that their experiences were due to their own actions and abilities rather than to random events.

Two years later he read nothing but peer-reviewed scientific articles, until the discussion that followed mathematician-statistician John Ioannidis’s article “Why most published research findings are false” convinced him that scientific knowledge could not be trusted either. The relationship between information and knowledge, which he had learned about in junior high, began to turn into an obsession: Raistlin wanted no more information, he thirsted for knowledge. Pure mathematics, with its provable axioms, finally offered exactly this, and our frail learning addict threw himself into it wholeheartedly, ending up working at a bank through a relative’s recommendation. He set himself the goal of avoiding all information that was not kosher: if the signal was weak relative to the noise, the gates of his mind stayed firmly shut.

This led to an unexpected problem: the more he tried to isolate himself from “useless nonsense”, the more sensitive to it he became. On the rare occasions when he still walked outside, the shock headlines of the tabloids hit him in the pit of his stomach. Advertisements sent him into a rage. He also began to avoid social situations when he realised how easily good stories stayed to haunt his mind. He worked as a risk analyst, and did not want to start fearing airplanes because some acquaintance of an acquaintance of an acquaintance had been through something horrible on a plane that made an emergency landing. As his compulsions (such as rapidly repeating the names of Russian mathematicians whenever someone argued their position with anecdotes) grew worse, Raistlin’s worried employer steered him towards professional help.

On a Central Park bench warmed by the strengthening rays of the spring sun, Raistlin, examining his diagnosis, had promised his psychiatrist that he would start in a therapy group. The anxiety medication he was taking had also begun to work, which led him to buy a tabloid from a nearby hot dog vendor and even read a couple of pages of it. It no longer felt so bad, about as sensible as his diagnosis: the psychiatrist had explained that epistemological noise hypersensitivity meant anxiety, related to the origin of knowledge, about there being no signal to be found amid the noise after all, and about realising at the moment of death that one has lived one’s life reacting to ghosts seen in the noise of the mind instead of real phenomena.

A few hours later a young student walking a dog was delighted to find the day’s paper in the park, took it home and opened it over a bowl of cereal. It looked otherwise almost untouched, but after several of the articles someone had written, in very small but sure handwriting: “Kolmogorov. Kolmogorov. Kolmogorov.”

Introducing Raistlin Laplace

In this post, you meet Raistlin Laplace. You will hear more of him at a later time. Please find the Finnish version here.


Raistlin Laplace has just received a diagnosis from his psychiatrist. It’s a sunny day of early spring, as he sits down on a bench in New York’s Central Park and opens an envelope. His thin lips hesitate upon the judgement held in his bony fingers: “Epistemological Hypersensitivity”. It had something to do with how patterns are distinguished from the midst of noise. Like how a radio tuned in the middle of two channels doesn’t tell much of the DJs’ music taste; too much noise, too little signal. The human brain is a signal detection machine without comparison, as it can detect satanic verses in backwards-played metal music or see Jesus in a dog’s anus (has happened). But Mr. Laplace’s problem was the opposite of that of those who seek secret codes in the Bible. He was obsessed with avoiding randomness-created illusory patterns. The diagnosis made sense in a way, but he had long ago given up trust in things that made sense.

Raistlin spent his youth in a small town called Naples in southwest Florida. It had sounded aptly European to his French-Russian immigrant parents, who wanted to offer their only child a more tolerant environment from a less war-prone part of the world. It wasn’t immediately clear that Naples was, in fact, a red-neck village where parents spent most of their days commuting, and children in bloody fights with the youngsters of nearby villages.

Raistlin had always had a sickly appearance, although he was seldom ill. He had a feeble posture and pale complexion, combined with the contrast between his icy blue eyes and jet-black hair. This was more than enough to make the superstitious elderly whisper. Knowing this, and upon discovering the local library in elementary school, he realised he took great delight in the books found at the corner marked occultism. He devoured everything from shamanism to theosophy and new age. Sports, or for that matter anything else which interested other children, he couldn’t care less about. Having a hard time fitting in, he would’ve most wanted just to spend time alone with his books.

The first time his information universe collapsed was when, in junior high, he had to admit that the rituals and techniques of the occult-corner didn’t work as promised. Not everything adults wrote was unerring; not all information was knowledge. But the shock waned quickly and little did he know that this was only the beginning.

On a rainy New York day, 28-year-old Raistlin sealed the envelope containing a bankruptcy application of his startup company. He pondered on what had gone wrong. He had done everything right; read all the right books, followed the strategies of highly successful companies, listened to hundreds of hours of popular psychology-inspired sales training tapes, taken educated risks, and for years worked with a relentless, never-give-up attitude. At some point the money just ran out and, as creditors breathed down his neck, more wasn’t coming. In the dim lighting of his studio apartment, Raistlin felt horror escalate. How could those writers, who so confidently spew out facts of the world, actually know how things truly worked? Were they really different from him, who, had any of the myriad small things gone differently, could now well be the CEO of a highly successful tech company?

He didn’t sleep that night. He watched an agonising replay of all those hours he had spent reading newspapers without learning anything about how the world actually worked. All the popular science news touting great new discoveries, all of which had later turned out to be premature. All the books written by those who thought their success was caused by their own actions and aptitude, instead of random occurrences of serendipity.

Two years later he only read peer-reviewed scientific journals. That is, until the discussion which followed mathematician-statistician John Ioannidis’ article “Why most published research findings are false” persuaded him of the fallibility of the scientific method (outside of physics, at least). The relationship between information and knowledge, which he had learnt about in junior high, began turning into an obsession: Raistlin wanted no more information; he hungered for knowledge. Pure mathematics with its provable axioms finally offered just this, and our bony learning addict delved into it. By a stroke of luck and a relative’s recommendation, he ended up working in a bank. He vowed to avoid all information which wasn’t kosher; if the signal-to-noise ratio was low, the gates of his mind remained sealed.

This resulted in an unexpected problem: the more he aspired to isolate himself from “useless nonsense”, the more sensitive to it he became. On those few days he strolled outside, the shock headlines at newspaper stands turned his stomach to knots. Advertisements filled him with outrage. He also started avoiding social situations when he realised how easy it was for good stories to get stuck in his brain. He worked as a risk analyst, and didn’t want to start fearing airplanes just because some acquaintance of an acquaintance had experienced dread during an emergency landing. Compulsions—like fast repetition of names of old Russian mathematicians, when someone used anecdotes to advocate a position—got worse and eventually his worried employer steered him towards professional help.

Rays of intensifying sunlight warmed up Raistlin’s bench and whispered promises of summer to the people wandering about Central Park. Raistlin had promised his psychiatrist to begin participating in a therapy group. The anxiety meds he took had also started to kick in, which made him buy a newspaper from a nearby hot dog stand and even read a couple of pages. It didn’t feel as bad anymore, perhaps about as sensible as his diagnosis. The psychiatrist had explained that epistemological hypersensitivity meant anxiety stemming from the origin of knowledge. That there wouldn’t be a signal in the noise. Fear of realising, upon the moment of one’s death, that one had spent one’s life reacting to ghosts the mind saw in the noise, instead of real phenomena.

Some hours later, a young student walking a dog in the park stumbled, to her delight, upon a pristine newspaper. She took it home and opened it in front of a bowl of cereal. At first glance, the paper looked untouched. Only after reading several articles did she notice some very small but resolute handwriting. In the margins, someone had written: “Kolmogorov. Kolmogorov. Kolmogorov.”