Affordance Mapping to Manage Complex Systems: Planning a Children’s Party

I’ve recently followed with interest Dave Snowden’s development of “Estuarine Mapping”, also known as “Affordance Mapping” – a complex systems framework for designing and de-risking change initiatives (see the link at the end of this post). After taking part in training sessions and facilitating some mapping exercises with groups, I found myself in want of a metaphor that didn’t require an understanding of coastal geography.

Enter the world of children’s parties. Snowden has a famous anecdote about organising a party for kids, which brilliantly illustrates the folly of applying traditional management techniques to complex systems. Inspired by this tale, I’ve reimagined it here as a simplified depiction of the Affordance Mapping process. So here we go.

Picture yourself tasked with organising a birthday bash for a group of energetic seven-year-olds. But instead of reaching for a conventional party-planning checklist, you decide to employ the Affordance Mapping process. What would you do?

First, you’d start by surveying the party landscape. You’d identify all the elements that could influence the party – from the near-immovable dining table to the ever-shifting moods of the kids. We’ll call these our party elements.

Next, you’d create a map of these elements. On one axis, you’d have how much energy it takes to change each element – moving the dining table would be high energy, while changing the music playlist would be low. On the other axis, you’d have how long it takes to make these changes – getting pizza delivered, or setting up a bouncy castle might take an hour, while changing a game rule could be instant.

Now, you’d draw a line in the top right corner. Everything above this line is beyond your control – things you absolutely can’t change, like the fact that Tommy’s allergic to peanuts. You’d draw a second line for things that are outside your direct control but amenable to change in collaboration with other parents, like the rule that the party should end by 6 PM. Finally, you’d mark a zone in the bottom left corner for elements that change too easily and might need stabilising, like the kids’ attention spans or the volume level.

The result might look something like this:

The exciting part is the middle area. This is where you can actually make changes to improve the party: the things you can manage. But you can also try to make some elements more manageable via (de)stabilisation efforts, or remove some altogether.

For example, you might decide to:

  1. Keep some elements as they are (the classic musical chairs game)
  2. Remove others that aren’t fun (the complicated crafts project your spouse found on Pinterest)
  3. Modify some to make them more enjoyable (have the kids organise themselves into a line arranged by height when moving outdoors once the cake is finished)

You’d come up with small experiments to test these ideas. Maybe you’d try introducing a new party game like “freeze dance” to alleviate boredom while waiting for transitions from one activity to the next, or rearranging the gift-opening area. You’d also think about how changing one element might affect others – will having a water balloon toss right before snack time lead to damp clothes?

Finally, you’d plan how to amplify emergent positive side-effects and mitigate negative ones. You’d also redraw your party map before next year’s party. This way, you’re always working towards a more fun and dynamic party, understanding that some elements will always be shifting (like the kids’ favourite songs) while others stay constant (like the need for cake).

Technical note. The items on the map, in the lingo of the complex systems philosopher Alicia Juarrero, represent “constraints”: things that modulate a system’s behaviour. In complex systems, these are intertwined in such deep ways that their effects are seldom amenable to an analysis of linear causality. To change a system’s macro-level state, you execute multiple parallel micro-interventions that aim to affect these constraints. For a recent open access book chapter outlining the rationale, see here: As through a glass darkly: a complex systems approach to futures.

What Behaviour Change Science is Not

Due to frequent misconceptions about the topic, I wanted to outline a via negativa description of this thing called behaviour change science: in other words, what is it not? This is part of a series of posts clarifying the perspective I take in instructing a virtual course on behaviour change in complex systems at the New England Complex Systems Institute (NECSI). The course mixes behaviour change with complex systems science along with practical collaboration tools for making sense of the world in order to act in it.

Behaviour change science refers to an interdisciplinary approach, often rooted in social psychology, to studying how human behaviour changes and can be changed. The approach is motivated by the fact that many large problems we face today – be they about spreading misinformation, preventing non-communicable diseases, taking climate action, or preparing for pandemics – involve human action as a major part of both the problem and its solution.

Based on many conversations regarding confusions around the topic, there is a need to clarify five points.

First, “behaviour change” in the current context is understood in a broad sense of the term, synonymous with human action, not as e.g. behaviourism. As such, it encompasses not only individuals, but also other scales of observation from dyads to small groups, communities and society at large. Social ecological models, for example, encourage us to think in such a multiscale manner, considering how individuals are embedded within larger systems. Methods for achieving change tend to differ for each scale; e.g. impacting communities entails different tools than impacting individuals (but we can also unify these scales). And the people I talk to in behaviour change understand that action arises from interaction (though they may lack the specific terminology).

Second, in the behaviour change context the term intervention is understood in a broader sense than “nudges” that mess with people’s lives. A behaviour change intervention denotes any intentional change effort in a system, from communication campaigns to community development workshops and structural measures such as regulation and taxation. Even at the individual level, behaviour change interventions do not need to imply that an individual’s life is tampered with in a top-down manner; in fact, the best way to change behaviour is often to provide resources which enable the individual to act in better alignment with goals they already have. Interventions can and do change environments that hamper those goals, or provide social resources and connections which enable individuals to take action with their compatriots.

Third, behaviour change is not an activity taken up by actors standing outside the system being intervened upon. Instead, best practices of intervention design compel us to work with stakeholders and communities when planning and implementing interventions. This imperative goes back to Kurt Lewin’s action research, where participatory problem solving is combined with research activities. Leadership in social psychology is often defined not as the actions of a particular high-ranking role, but as actions available to any individual in a system. Behaviour change practice is the same. To exaggerate only slightly: “Birds do it, bees do it, even educated fleas do it”.

Fourth, while interventions can be thought of as “events in systems”, some of which produce lasting effects while others wash away, viewing interventions as transient programme-like entities can narrow our thinking about how to enable incremental, evolutionary, bottom-up behaviour change. Governance, after all, is conducted by local stakeholders in constant contact with the system, who have more leeway to adjust actions without fear of breaking an evaluation protocol, and who hopefully have “skin in the game” long after intervention designers have moved on.

Fifth, nothing compels an intervention designer to infuse something novel into a system. For example, reverse translation studies what already works in practice, while aiming to learn how to replicate success elsewhere. De-implementation, on the other hand, studies what does not work, with the goal of removing practices causing harm. In fact, “Primum non nocere” – first, do no harm – is the single most important principle for behaviour change interventions.

Making sense of human action

Understanding and influencing human behaviour is usually not a simple endeavour. Behaviours are shaped by a multitude of interacting factors across different scales, from the individual to the societal, and occur within systems of systems. Developing effective behaviour change interventions requires grappling with this complexity. The approach taken in traditional behaviour change science uses behaviour change theories to make this complexity more manageable. I view these as more akin to heuristic frameworks with practical utility – codification attempts of “what works for whom and when” – than to theories in the natural science sense.

If you want a schematic of how I see behaviour change science, it might be something like the triangle below. It’s a somewhat silly representation, but what the triangle tries to convey is that complex systems expertise sets out strategic priorities: which futures should we pursue, and what kinds of methods make sense to get us going (the key word is often evolution).

Behaviour change science, on the other hand, is much more tactical, offering tools and frameworks to understand how to make things happen closer to where the rubber hits the road.

But we will go nowhere unless we can harness the collective intelligence of stakeholders and organisation or community members. This is why collaboration methods are essential. I will teach some of the ones I’ve found most useful in the course I mentioned in the intro.

If you want to learn more about the intersection of complex systems science and behaviour change, have a look at my Google Scholar profile, or see these posts:

Crafting Policies for an Interconnected World

This piece was originally published as: Heino, M. T. J., Bilodeau, S., Fox, G., Gershenson, C., & Bar-Yam, Y. (2023). Crafting Policies for an Interconnected World. WHN Science Communications, 4(10), 1–1. https://doi.org/10.59454/whn-2310-348

While our knowledge expands faster than ever, our ability to anticipate and respond to global challenges or opportunities remains limited. A political upheaval in one country, a technological innovation in another, or an epidemic in a far-away city – any of these can create a global change cascade with many unexpected repercussions. Why is this? A significant part of the answer lies in our increased global connectivity, which produces both new risks and novel opportunities for collaborative action. 

In this rapidly evolving world, proactive and adaptive public policies are paramount, with a primary focus on human well-being, rights, and needs. The COVID-19 pandemic serves as a stark reminder that while traditional political and economic systems claim to represent public interests and allocate resources optimally, there’s often a gap between claim and reality. That people vote for political leaders doesn’t guarantee they will focus on public well-being or the availability of resources. A genuine human-centered focus on well-being, satisfaction, and quality of life becomes indispensable.

Reflecting on our pandemic response, mostly hierarchy-based and bureaucratic, we observed glaring operational shortcomings: delayed responses, disjointed actions, and ineffective execution of preparedness plans [1]. However, what has been less discussed is the insight that the crisis offers into the role of uncertainty due to nonlinear risks in shaping policy outcomes. 

Complex systems may present unseen, extreme risks that can spiral into catastrophic failures if left unaddressed early on. These failures can occur upon reaching instabilities and “tipping points” that result in abrupt large-scale losses of well-being or resilience of a system, be it an ecosystem or a social system such as a nation [2–4].

The poor understanding of such non-linear risks is apparent through the ongoing phases of the pandemic, where those who called for increased precaution were often accused of “fearmongering”. A misinterpretation of human reactions is a likely contributor: contrary to common belief, people do not usually panic in emergencies. Instead, they tend to respond in constructive, cooperative ways if given clear and accurate information. The widespread belief in mass panic during disasters belongs to a group of misconceptions studied in social psychology under the umbrella term of “disaster myths” [5–7]. The real danger lies in creating a false sense of security. If such a sense is shattered due to an unexpected event and lack of preparation, the fallout can be far more damaging in terms of physical, mental, and economic impact, not to mention loss of trust. Thus, the general recommendation for communication is to not downplay threats. Instead, authorities need to offer the public clear information about potential risks and, crucially, guidance on how to prepare and respond effectively. This guidance has the potential to transform anxiety and passivity into positive self-organized action [8].

Human action lies at the core of many contemporary challenges, from climate change to public health crises. After all, it is human behavior – collective and nonlinear – that fuels the uncertainty of the modern world. The recognition of how traditional approaches can fall short in our increasingly volatile and complex contexts has led to increased demand for “strategic behavioral public policy” [9].

How can we advance our understanding of human behavior linked to instabilities and tipping points, and turn that understanding into capabilities for policymakers? The key is to understand how networks of dependencies between people link behaviors across a system. Complex systems science [10], as a field of study, involves understanding how different parts of a system interact with each other, creating emergent properties at multiple scales that cannot be predicted by studying the parts individually: There is no tsunami in a water molecule, no trusting relationship in an isolated interaction, no behavioral pattern in a single act, and no pandemic in an isolated infection [11]. Yet, the transformative potential of combining behavioral science with an understanding of complex systems science, a crucial tool for decision-making under uncertainty, remains largely untapped.

There are significant opportunities in weaving complex systems perspectives into human-centered public policy, infusing a deeper understanding of uncertainty into the heart of policy-making. A fusion of behavioral insights with an understanding of complex systems is not merely an intellectual exercise but a crucial tool for decision-making in crisis conditions and under uncertainty. As some examples:

  1. It urges us to prepare for uncommon events, like pandemics with impacts surpassing those of major conflicts like World War II. This realization comes as we discover that what would be extremely rare events in isolated systems can become relatively frequent in an interconnected world [12–14]. A long-standing example is how economic crises, which many experts considered rare enough to be negligible, have repeatedly caught us off-guard.
  2. It emphasizes the importance of adaptability in seizing unforeseen opportunities and minimizing potential damages. Central to this adaptability is the concept of “optionality.” This means maintaining a broad array of choices and opportunities, allowing for increased adaptability and selective application based on evolving circumstances. Recognizing that we cannot anticipate every twist and turn of the future, our best approach is indeed to embrace evolutionary strategies; creating systems that effectively solve problems, instead of trying to solve each unique problem separately [15]. An important takeaway is that instead of over-optimizing for current conditions, investing in buffers and exploration – even if they seem redundant – becomes vital when the future is uncertain.
  3. It empowers us to distribute decision-making power to collaborative teams. This is because teams can solve many more high complexity problems than individuals can, and significant portions of the modern world are becoming too complex for even the most competent individuals to fully grasp [16,17].

However, integrating these insights is easier said than done. The shift requires significant capacity building among policymakers. It begins with understanding why novel approaches are necessary, and ensuring the adequate systems for preparedness are empowered. Training programs can help policymakers grasp the concepts of risk, uncertainty, and complex systems.

Developing human-centric policies under uncertainty

One recent training programme to improve competence in behavioral and complex systems insights [18] emphasized three factors of the policy development process: co-creation, iteration, and creativity. These are briefly outlined below.

  • Co-creation: Ideal teams addressing complex challenges have members with a diversity of backgrounds and expertise, where everyone is able to contribute their knowledge to shared action. Much can be achieved by limiting the influence of hierarchy and enabling interaction between team members and other stakeholders; formal approaches include e.g. the implementation of “red teams” [19]. Those who are most impacted by the plans need to play a key role in the process. They are often citizens, who can provide critical information and expertise about the local environment [20,21].
  • Iteration: Mistakes naturally occur as an intrinsic part of gaining experience, developing the ability to tackle complex challenges, and building organizations to address them. In general, ideas and systems for responding to complex contexts need to be allowed to evolve through (parallel) small-scale experiments and feasibility tests in real-world contexts. Feasibility testing should leverage the aforementioned optionality, retaining the ability to roll back in case of unforeseen negative consequences – or to amplify positive aspects that are only revealed upon observing how the plan interacts with its context [21,22].
  • Creativity: Excessive fear and stress impede innovation. If the design process is science-based, inclusive, and supports learning from weaknesses revealed by iterative explorations that can safely fail, we need not be afraid to try something different or outside of the box. In fact, this is where the most innovative solutions often come from.

Drawing on our earlier discussion on complex systems and human behavior, we understand that in the face of sudden threats, there is a critical need for nimbleness. Rapid response units, representing the frontline of our defense, should possess the autonomy to act, unencumbered by political hindrances. An example would be fire departments’ autonomy to respond to emergencies within pre-set and commonly agreed-upon protocols. The lessons from the pandemic and the insights from complex systems thinking underscore this. But how do we reconcile swift action with informed decision-making?

Transparent, educated communication, and trust based on the experience of success, can potentially bridge this gap. Science is how we understand the consequences of actions, and selecting the best consequences is essential for global risks. By ensuring policymakers and the public are informed and aligned, we can address risks head-on, anchored in commonly-held values and backed by science. As we lean into the practices discussed earlier, such as co-creation and iteration, our mindset too must evolve. Embracing new, sometimes unconventional, approaches will enable us to sidestep past policy pitfalls, especially those painfully highlighted by recent global events. Protecting rapid response teams from political interference upgrades our societal apparatus to confront the multifaceted challenges of our time. 

Learning anticipatory adaptation

Our ultimate aim is clear: proactivity. Rather than reacting once harm is done, we need to anticipate, adapt, and equip policymakers with the necessary insights and tools using a multidisciplinary approach that includes behavioral and complexity sciences. We can respond to the unpredictable, ensuring society is robust and resilient. This necessitates a collective call-to-action, urging citizens and organizations to develop institutions and inform policy makers to empower communities to thrive amidst uncertainties.

Bibliography

[1] Heino MT, Bilodeau S, Bar-Yam Y, Gershenson C, Raina S, Ewing A, et al. Building Capacity for Action: The Cornerstone of Pandemic Response. WHN Sci Commun 2023;4:1–1. https://doi.org/10.59454/whn-2306-015.

[2] Scheffer M, Bolhuis JE, Borsboom D, Buchman TG, Gijzel SMW, Goulson D, et al. Quantifying resilience of humans and other animals. Proc Natl Acad Sci 2018:201810630. https://doi.org/10/gfqjqr.

[3] Heino M, Proverbio D, Resnicow K, Marchand G, Hankonen N. Attractor landscapes: A unifying conceptual model for understanding behaviour change across scales of observation 2022. https://doi.org/10.31234/osf.io/3rxyd.

[4] Scheffer M, Borsboom D, Nieuwenhuis S, Westley F. Belief traps: Tackling the inertia of harmful beliefs. Proc Natl Acad Sci 2022;119:e2203149119. https://doi.org/10.1073/pnas.2203149119.

[5] Clark DO, Patrick DL, Grembowski D, Durham ML. Socioeconomic status and exercise self-efficacy in late life. J Behav Med 1995;18:355–76. https://doi.org/10/bjddw6.

[6] Drury J, Novelli D, Stott C. Psychological disaster myths in the perception and management of mass emergencies: Psychological disaster myths. J Appl Soc Psychol 2013;43:2259–70. https://doi.org/10.1111/jasp.12176.

[7] Drury J, Reicher S, Stott C. COVID-19 in context: Why do people die in emergencies? It’s probably not because of collective psychology. Br J Soc Psychol 2020;59:686–93. https://doi.org/10/gg3hr4.

[8] Orbell S, Zahid H, Henderson CJ. Changing Behavior Using the Health Belief Model and Protection Motivation Theory. In: Hamilton K, Cameron LD, Hagger MS, Hankonen N, Lintunen T, editors. Handb. Behav. Change, Cambridge: Cambridge University Press; 2020, p. 46–59. https://doi.org/10.1017/9781108677318.004.

[9] Schmidt R, Stenger K. Behavioral brittleness: the case for strategic behavioral public policy. Behav Public Policy 2021:1–26. https://doi.org/10.1017/bpp.2021.16.

[10] Siegenfeld AF, Bar-Yam Y. An Introduction to Complex Systems Science and Its Applications. Complexity 2020;2020:6105872. https://doi.org/10/ghthww.

[11] Heino MTJ. Understanding and shaping complex social psychological systems: Lessons from an emerging paradigm to thrive in an uncertain world 2023. https://doi.org/10.31234/osf.io/qxa4n.

[12] Cirillo P, Taleb NN. Tail risk of contagious diseases. Nat Phys 2020;16:606–13. https://doi.org/10/ggxf5n.

[13] Rauch EM, Bar-Yam Y. Long-range interactions and evolutionary stability in a predator-prey system. Phys Rev E 2006;73:020903. https://doi.org/10/d9zbc4.

[14] Taleb NN. Statistical Consequences of Fat Tails: Real World Preasymptotics, Epistemology, and Applications. Illustrated Edition. STEM Academic Press; 2020.

[15] Bar-Yam Y. Engineering Complex Systems: Multiscale Analysis and Evolutionary Engineering. In: Braha D, Minai AA, Bar-Yam Y, editors. Complex Eng. Syst. Sci. Meets Technol., Berlin, Heidelberg: Springer; 2006, p. 22–39. https://doi.org/10.1007/3-540-32834-3_2.

[16] Bar-Yam Y. Why Teams? N Engl Complex Syst Inst 2017. https://necsi.edu/why-teams (accessed August 9, 2023).

[17] Bar-Yam Y. Complexity rising: From human beings to human civilization, a complexity profile. Encycl Life Support Syst 2002.

[18] Hankonen N, Heino MTJ, Saurio K, Palsola M, Puukko S. Developing and evaluating behavioural and systems insights training for public servants: a feasibility study. Unpublished manuscript, 2023.

[19] UK Ministry of Defence (MOD). Red Teaming Handbook. GOVUK 2021. https://www.gov.uk/government/publications/a-guide-to-red-teaming (accessed August 9, 2023).

[20] Tan Y-R, Agrawal A, Matsoso MP, Katz R, Davis SLM, Winkler AS, et al. A call for citizen science in pandemic preparedness and response: beyond data collection. BMJ Glob Health 2022;7:e009389. https://doi.org/10.1136/bmjgh-2022-009389.

[21] Joint Research Centre, European Commission, Rancati A, Snowden D. Managing complexity (and chaos) in times of crisis: a field guide for decision makers inspired by the Cynefin framework. Luxembourg: Publications Office of the European Union; 2021.

[22] Skivington K, Matthews L, Simpson SA, Craig P, Baird J, Blazeby JM, et al. A new framework for developing and evaluating complex interventions: update of Medical Research Council guidance. BMJ 2021;374:n2061. https://doi.org/10.1136/bmj.n2061.

What Does “Behaviour Change Science” Study?

This is an introductory post about this paper. The paper introduces the object of study in “behaviour change science”: complex systems, which include most human systems from individuals to communities and nations.

In a health psychology conference many years ago (when we still travelled for that sort of thing), I wandered into the conference venue a bit late, and the sessions had already started. There was just one other person in the hallways, looking a bit lost. I was scared to death of another difficult-to-escape presentation cavalcade about how someone came up with p-values under 0.05, so I made some joke about our confusion and ended up preventing his attendance, too. Turned out he was a physicist recently hired in a behavioural medicine research group, sent to the conference to get his first bearings about the field. Understandably, he was confused, with a hint of distress: “I don’t understand a word about what these people talk about. And I’ve been to several sessions already without having seen a single equation!” (nb. if you don’t think this is funny, you’re probably not a social scientist.)

Given that back then I was finding my first bearings in network science, we had a lot to talk about during the rest of the conference. I don’t remember much about the conference, but I remember him making an excellent point about learning: The best way to learn anything is to talk to someone who’s just learned about the thing. While not yet mega-experts, they still have an idea of where you stand, and can hence make things much more understandable than those who already swim in a sea of concepts unfamiliar to you.

In a recent paper about behaviour change as a topic of research, we tried to do exactly this. I know I’m crossing the chasm where I’m not yet the mega-expert, but am already losing the ability to see what people in my field find hard to grasp. I presented the paper in a research seminar and people found it quite challenging, but on the other hand, I’ve never seen such ultra-positivity from reviewers. So maybe it’s helpful to some.

This impeccably written manuscript provides a thorough, state-of-the-art review of complex adaptive systems, particularly in the context of behavior change, and it does an excellent job explaining difficult concepts.

– Reviewer 2

Here’s a quick test to see if it might be valuable to you. Have a look at this table, and if you think all is clear, you can skip the piece with good conscience:

I also made a video introduction to the topic. If you’re in a rush, you can just run through a pdf of the slides.

If you’re in an even bigger rush, the picture below gives a quick synopsis. To find out more, check out this post. You might also be interested in What Behaviour Change Science is Not.

Complexity perspectives on behaviour change interventions

I had the great pleasure to be involved in organising a symposium on the topic of my dissertation. Many if not most societal problems are both behavioural and complex; hence the speakers’ backgrounds varied from systems science and psychology to social work and physics. Below is a list of video links along with a short synopsis of the talks. See here for other symposia in the Behaviour Change Science and Policy series.

A live-tweeting thread on the 1st day is here, and on the 2nd day (not including presentations by me, Nanne Isokuortti or Ira Alanko) here. See here for the official web page, and here for the YouTube playlist!

Nelli Hankonen: Opening words & introduction to the Behaviour Change Science & Policy (BeSP) project

  • See here for videos of previous symposia (I: Intervention evaluation & field experiments; II: Behavioural insights in developing public policy and interventions; III: Reverse translation: Practice-based evidence; IV: Creating real-world impact: Implementation and dissemination of behaviour change interventions)

Marijn de Bruijn: Integrating Behavioural Science in COVID-19 Prevention Efforts – The Dutch Case

  • Behaviour change efforts for COVID-19 protective behaviours are operations on a complex system’s user experience: A virus is the problem, but behaviour is the solution.
  • Knee-jerk communication responses of health officials can be improved upon by using methods derived from what works in real-world behavioural science interventions.
  • Protective behaviours entail feedback dynamics: for example, crowding leads to difficulty maintaining distance, which leads to perceiving that others don’t consider it important, which leads to more crowding, etc.

Nelli Hankonen: Why is it Useful to Consider Complexity Insights in Behaviour Change Research?

  • Complexity-informed approaches to intervention have been around for a long time, but only recently has analytical methodology become widely available.
  • There are important differences between “complicated” and “complex” behavioural interventions.
  • By not taking the complexity perspective into account, we may be missing opportunities to properly design interventions.

Olli-Pekka Heinonen: Complexity-Informed Policymaking

  • If a civil servant wants to be effective, maximum control doesn’t work – even what constitutes “progress” can be difficult to ascertain.
  • Systems, such as society, move: what worked yesterday might not work today.
  • Hence continuous learning, adaptation and experimenting are not optional for societal decision-making.

Gwen Marchand: Complexity Science in the Design and Evaluation of Behaviour Interventions

  • What does it mean to define behavior and behavior change from a complex systems perspective?
  • Focal units and well-defined timescales are key considerations for the design and research of interventions
  • Context acts to constrain and afford possible states for behavior change related to intervention

Jari Saramäki: How do Behaviours, Ideas, and Contagious Diseases Spread Through Networks?

  • People are embedded in networks that influence their behaviour and health
  • Network structure – how the networks’ links are organized – strongly affects this influence
  • Interventions that modify network structure can be used to promote or hinder the spread of influence or contagion.

Matti Heino: Studying Complex Motivation Systems – Capturing Dynamical Patterns of Change in Data from Self-assessments and Wearable Technology

  • Analysis of living beings involves addressing interconnected, turbulent processes that vary across time.
  • Recruiting fewer individuals and collecting more data on fewer variables may be a considerably beneficial tradeoff to better understand the dynamics of a psychological phenomenon.
  • Methods to deal with such data include building networks of networks (multiplex recurrence networks) and assessing early warning signals of sudden gains or losses.

If you’re interested in the links, download my slides here. I actually forgot to show what a multiplex network of variables combined from several theories looks like (you don’t condition on all other variables, so you can combine variables from different frameworks without their meaning changing, as it would in a regression-based analysis). Anyway, it looks like this:

A single person’s multiplex recurrence network, i.e. a network of recurrence networks of work motivation variables queried daily for 30+ days. Colored connectors are relationships which can’t be attributed to randomness.
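
If recurrence networks are new to you, here’s a toy sketch of how one layer of such a network could be built. The data, threshold and variable below are invented for illustration; this is not the code behind the figure above.

```python
import numpy as np

# Made-up daily self-reports for one variable (e.g. motivation on a 1-7 scale, ~30 days)
rng = np.random.default_rng(3)
series = rng.integers(1, 8, size=30)

# Recurrence network for this variable: days are nodes, and two days are linked
# if their values are "close enough" (here: within one scale point, an arbitrary choice)
distance = np.abs(series[:, None] - series[None, :])
adjacency = (distance <= 1).astype(int)
np.fill_diagonal(adjacency, 0)  # no self-loops

print(adjacency.sum() // 2, "edges among", len(series), "days")
# A multiplex recurrence network stacks one such layer per variable
# and links each day to itself across the layers.
```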

Nanne Isokuortti: From Exploration to Sustainment – Understanding Complex Implementation in Public Social Services

  • Illustrate the complexity in an implementation process with a real-world case example
  • Introduce Exploration, Preparation, Implementation, and Sustainment (EPIS) Framework
  • Provide suggestions on how to aid implementation in complex settings

Ira Alanko: The AuroraAI Programme

  • The Finnish public sector is taking active steps to utilise AI to make using services easier
  • AI has opened a window for a systemic shift towards human-centricity in Finland
  • The AuroraAI-network is a collection of different components, not a platform or collection of chatbots

Daniele Proverbio: Smooth or Abrupt? How Dynamical Systems Change Their State

  • Natural phenomena don’t necessarily follow smooth and linear patterns while evolving.
  • Abrupt changes are common in complex, non-linear systems. These are arguably the future of scientific research.
  • There exist a limited number of transition classes. Understanding their main drivers could lead to useful insights and applications.

Ken Resnicow:  Behavior Change is a Complex Process. How does that impact theory, research and practice?

  • Behavior change is a complex, non-linear process.
  • Sudden change is more enduring than gradual change.
  • Failure to replicate prior interventions can be understood from a complexity lens.

(nb. on the last talk: personally, I’m not a huge fan of mediation analysis, moderated or otherwise. Stay tuned for an interview where I discuss the topic at some length with Fred Hasselman)

Notes from the symposium by Grace Lau

Their mean doesn’t work for you

In this post, I present a property of averages I found surprising. Undoubtedly this is self-evident to statisticians and people who can think multi-variately, but personally I needed to see it to get a grasp of it. If you’re a researcher, make sure you do the single-item quiz before reading, to see how well your intuitions compare to those of others!

UPDATE: The finding regarding average intervention participants’ prevalence is published in this paper, in case you want a citable reference for it.

Ooo-oh! Don’t believe what they say is true
Ooo-oh! Their system doesn’t work for you
Ooo-oh! You can be what you want to be
Ooo-oh! You don’t have to join their f*king army

– Anti-Flag: Their System Doesn’t Work For You

In his book “The End of Average”, Todd Rose relates a curious story. In the late 1940s, the US Air Force saw a lot of planes crashing, and those crashes couldn’t be attributed to pilot error or equipment malfunction. On one particularly bad day, 17 pilots crashed without an obvious reason. As everything from cockpits to helmets had been built to conform to the average pilot of 1926, they brought in Lt. Gilbert Daniels to see if pilots had gotten bigger since then. Daniels measured 4063 pilots—who were preselected to not deviate from the average too much—on ten dimensions: height, chest circumference, arm length, thigh circumference, and so forth.

Before Daniels began, the general assumption was that these pilots were mostly if not exclusively average, and Daniels’ task was to find the most accurate point estimate. But he had a more fundamental idea in mind. He defined “average” generously as a person who falls within the 30% band around the middle, i.e. the median ±15%, and looked at whether each individual fulfilled that criterion on all ten bodily dimensions.

So, how big a proportion of pilots were found to be average by this metric?

Zero.

[Figure: excerpt from Daniels, Gilbert S. “The Average Man?” Air Force Aerospace Medical Research Lab, Wright-Patterson AFB, OH, 1952.]

This may be surprising, until you realise that each additional dimension brings with it a new “objective”, making it less likely that someone achieves all of them. But in fact, only about a fourth were average on any single dimension, and already fewer than ten percent were average on two dimensions.

As you saw in the quiz, I wanted to figure out how big a proportion of our intervention participants could be described as “average” by Daniels’ definition, on four outcome measures. The answer?

A lousy 1.5 percent.

I’m a bit slow, so I had to do a bit of simulation to get a better grasp of the phenomenon (code here). First, I simulated 700 intervention participants, who were hypothetically measured on four random, uncorrelated, normally distributed variables. What I found was that 0.86% of this sample were “average” by the same definition as before. But what if we changed the definition?

Here’s what happens:

[Figure: proportion of the simulated sample described as “average”, for increasingly broad definitions of average (uncorrelated variables)]

As you can see, you’ll describe more than half of the sample only when you extend the definition of “average” to about the middle 85% (i.e. the median ±42.5%).
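
If you want to poke at the numbers yourself, here’s a minimal sketch of the same simulation (not the original code linked above, so the seed and the exact percentages will differ):

```python
import numpy as np

rng = np.random.default_rng(2017)
n, k = 700, 4                          # 700 simulated participants, four uncorrelated outcomes
data = rng.standard_normal((n, k))

def share_average(x, band=0.30):
    """Share of rows inside the middle `band` (median +/- band/2) on every column at once."""
    lo = np.quantile(x, 0.5 - band / 2, axis=0)
    hi = np.quantile(x, 0.5 + band / 2, axis=0)
    return ((x >= lo) & (x <= hi)).all(axis=1).mean()

for band in (0.30, 0.50, 0.70, 0.85):  # progressively looser definitions of "average"
    print(f"middle {band:.0%}: {share_average(data, band):.1%} of the sample is 'average'")
```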

But what if the variables were highly correlated? I also simulated 700 independent participants with four variables, which were correlated almost perfectly (within-individual r = 0.99) with each other. Still, only 22.9% of participants were described by defining average as the middle 30% around the median. For other definitions, see the plot below.

[Figure: proportion of the simulated sample described as “average”, for increasingly broad definitions of average (highly correlated variables)]
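
The correlated case can be sketched similarly, assuming an equicorrelated multivariate normal (again, just an illustration rather than the original code):

```python
import numpy as np

rng = np.random.default_rng(2017)
n, k, r = 700, 4, 0.99
cov = np.full((k, k), r)
np.fill_diagonal(cov, 1.0)                         # unit variances, r = 0.99 between variables
data = rng.multivariate_normal(np.zeros(k), cov, size=n)

lo, hi = np.quantile(data, [0.35, 0.65], axis=0)   # middle 30% band for each variable
print(((data >= lo) & (data <= hi)).all(axis=1).mean())  # roughly a fifth, despite r = 0.99
```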

What have we learned? First of all: when you see averages, do not go assuming that they describe individuals. If you’re designing an intervention, you don’t just want to see which determinants correlate highly with the target behaviour on average, or seem changeable in the sense that the mean on those variables is not very high to begin with in your target group (see the CIBER approach, if you’re starting from scratch and want to get a preliminary handle on the data). This is because a single individual is unlikely to have the average standing on more than, say, two of the determinants, and individuals are who you’re generally looking to target. One thing you could do is a cluster analysis, where you’d look for the determinant profile best associated with e.g. hospital visits (or attitude/intention), and try to target the changeable determinants within that profile.
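
As a rough sketch of what that could look like – all data, variable names and the number of clusters below are invented purely for illustration:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Invented data: four determinants (e.g. attitude, norms, self-efficacy, intention) and an outcome
determinants = rng.standard_normal((700, 4))
hospital_visits = rng.poisson(2, size=700)

clusters = KMeans(n_clusters=3, n_init=10, random_state=1).fit_predict(determinants)
for c in range(3):
    members = clusters == c
    print(f"cluster {c}: profile {determinants[members].mean(axis=0).round(2)}, "
          f"mean visits {hospital_visits[members].mean():.2f}")
# You'd then look at which profile is associated with the outcome of interest,
# and target the changeable determinants within that profile.
```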

As a corollary: If you, your child, or your relationship doesn’t seem to conform to the dimensions of an average person in your city, or a particular age group, or whatever, this is completely normal! Whenever you see yourself falling behind the average, remember that there are plenty of dimensions where you land above it.

But wait, what happened to the USAF’s problem of planes crashing? Well, the air force told the plane manufacturers to fix the problem of cockpits which didn’t fit any individuals. The manufacturers said it was impossible and extremely costly. But when the air force refused to listen to excuses, cheap and easy solutions appeared quickly. Adjustable seats—now standard equipment in cars—are an example of the new design philosophy of individual fit, where we don’t try to fit the individual to the system, but the system to the individual.

Let us conclude with Daniels’ introduction section:

[Figure: excerpt from the introduction of Daniels (1952)]

Four additional notes about the average:

Note 1: Here’s a very nice Google Talks presentation of this and extended topics!

Note 2: There’s a curious tendency to think that deviations from the average represent “error” regardless of domain, whereas it’s self-evident that individuals can survive both if they’re e.g. big and bulky, or small and fast. With psychological measurement, is it not madness to think all participants have an attitude score, which comes from a normal distribution with a common mean for all participants? To inject reality in the situation, each participant may have their own mean, which changes over time. But that’s a story for another post.

Note 3: I’m taking it for granted that we already know the average is a useless statistic to begin with, unless you know the variation around it, so I won’t pound on that further. But remember that variables generally aren’t perfectly normally distributed, as in the above simulations; my guess is that the situation would be even worse in those cases. Here’s a blog post you may want to check out: On Average, You’re Using the Wrong Average.

Note 4: Did I already say that you generally shouldn’t make individual-level conclusions based on between-individual data, unless ergodicity holds (which, in psychology, would be quite weird)? See a short video here!

 

Is it possible to unveil intervention mechanisms under complexity?

In this post, I wonder what complex systems, as well as the nuts and bolts of mediation analysis, imply for studying processes of health psychological interventions.

Say we make a risky prediction and find an intervention effect that replicates well (never mind for now that replicability is practically never tested in health psychology). We could then go on to investigating boundary conditions and intricacies of the effect. What’s sometimes done is a study of “mechanisms of action”, also endorsed by the MRC guidelines for process evaluation (1), as well as the Workgroup for Intervention Development and Evaluation Research (WIDER) (2). In such a study, we investigate whether the intervention worked as we thought it should have worked (in other words, we test the program theory; see previous post). It would be spectacularly useful to decision makers if we could disentangle the mechanisms of the intervention: “by increasing autonomy support, autonomous motivation goes up and physical activity ensues”. But attempting to evaluate this opens a spectacular can of worms.

Complex interventions include multiple interacting components, targeting several facets of a behaviour on different levels of the environment the individual operates in (1). This environment itself can be described as a complex system (3). In complex, adaptive systems such as society or a human being, causality is a thorny issue (4): feedback loops, manifold interactions between variables over time, path-dependence and sensitivity to initial conditions make it challenging at best to state “a causes b” (5). But what does it even mean to say something causes something else?

Bollen (6) presents three conditions for causal inference: isolation, association and direction. Isolation means that no other variable can reasonably cause the outcome. This is usually impossible to achieve strictly, which is why researchers usually aim to control for covariates and thus reach a condition of pseudo-isolation. A common but not often acknowledged problem is overfitting: adding covariates to a model also means fitting the measurement error they carry with them. Association means there should be a connection between the cause and the effect – in real life, usually a probabilistic one. In the social sciences, a problem arises as everything is more or less correlated with everything else, and high-dimensional datasets suffer from the “curse of dimensionality”. Direction, self-evidently, means that the effect should flow from the cause to the outcome, not the other way around. This is highly problematic in complex systems. For an example in health psychology, it seems obvious that depression symptoms (e.g. anxiety and insomnia) feed each other, resulting in self-reinforcing feedback loops (7).

When we consider the act of making efficient inferences, we want to be able to falsify our theories of the world (9); something that’s only recently really starting to be understood among psychologists (10). An easy-ish way to go about this is to define the smallest effect size of interest (SESOI) a priori, ensure one has proper statistical power, and attempt to reject the hypotheses that effects are larger than the upper bound of the SESOI and lower than the lower bound. This procedure, also known as equivalence testing (11), allows for the falsification of statistical hypotheses in situations where a SESOI can be determined. But when testing program theories of complex interventions, there may be no such luxury.
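
For concreteness, here’s a minimal sketch of the two one-sided tests behind equivalence testing, with made-up data and an arbitrary SESOI (a toy illustration, not a recommendation for any particular bound):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
treatment = rng.normal(0.1, 1, 200)   # made-up intervention-group scores
control = rng.normal(0.0, 1, 200)     # made-up control-group scores
sesoi = 0.4                           # smallest effect size of interest, set a priori

diff = treatment.mean() - control.mean()
se = np.sqrt(treatment.var(ddof=1) / len(treatment) + control.var(ddof=1) / len(control))
df = len(treatment) + len(control) - 2  # approximate degrees of freedom

# Two one-sided tests (TOST): reject "effect <= -SESOI" and reject "effect >= +SESOI"
p_lower = 1 - stats.t.cdf((diff + sesoi) / se, df)
p_upper = stats.t.cdf((diff - sesoi) / se, df)
print("statistically equivalent to zero within ±SESOI:", max(p_lower, p_upper) < 0.05)
```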

The notion of non-linear interactions with feedback loops makes causality in a complex system an evasive concept. If we’re dealing with complexity, even minuscule effects can be meaningful when they interact with other effects: small effects can have huge influences down the line (“the butterfly effect” in nonlinear dynamics; 8). It is hence difficult to determine the SESOI for intermediate links in the chain from intervention to outcome. And if we only say we expect an effect to be “any positive number”, the postulated processes, as described in intervention program theories, become unfalsifiable: if a correlation of 0.001 between intervention participation and a continuous variable would corroborate a theory, one would need more than six million participants to detect it (at 80% power and an alpha of 5%; see also 12, p. 30). If researchers are unable to reject the null hypothesis of no effect, they cannot determine whether there is evidence for a null effect, or whether a larger sample would have been needed (e.g. 13).
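
The six-million figure can be checked with a back-of-the-envelope calculation via the Fisher z approximation, assuming a one-sided test (since the prediction is merely “any positive number”):

```python
import numpy as np
from scipy import stats

r, alpha, power = 0.001, 0.05, 0.80
z_alpha = stats.norm.ppf(1 - alpha)          # one-sided test
z_beta = stats.norm.ppf(power)
n = ((z_alpha + z_beta) / np.arctanh(r)) ** 2 + 3
print(f"required n ≈ {n:,.0f}")              # a bit over six million participants
```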

Side note: One could use Bayes factors to compare whether a point null data generator (effect size being zero) would predict the data better than, for example, an alternative model where most effects are near zero but half of them over d = 0.2. But still, the smaller the effects you consider potentially important, the less well the data can distinguish between the alternative and null models. A better option could be to estimate how probable it is that the effect has a positive sign (as demonstrated here).

In sum, researchers are faced with an uncomfortable trade-off: either they specify a SESOI (and thus a hypothesis) which does not reflect the theory under test, or they accept unfalsifiability.

A common way to study mechanisms is to conduct a mediation analysis, where one variable’s (X) impact on another (Y) is modelled to pass through a third variable (M). In its classical form, one expects the path X-Y to go near zero when M is added to the model.
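
In code, the classical setup amounts to something like the sketch below; the data are simulated, and the variable names (motivation, physical activity) are just placeholders:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
X = rng.binomial(1, 0.5, n).astype(float)   # intervention (1) vs control (0)
M = 0.5 * X + rng.standard_normal(n)        # hypothetical mediator, e.g. autonomous motivation
Y = 0.5 * M + rng.standard_normal(n)        # hypothetical outcome, e.g. physical activity

total = sm.OLS(Y, sm.add_constant(X)).fit()                          # c path: X -> Y
direct = sm.OLS(Y, sm.add_constant(np.column_stack([X, M]))).fit()   # c' and b paths
a_path = sm.OLS(M, sm.add_constant(X)).fit().params[1]               # a path: X -> M

print("total effect c:    ", round(total.params[1], 2))
print("direct effect c':  ", round(direct.params[1], 2))   # shrinks toward zero once M is in the model
print("indirect effect ab:", round(a_path * direct.params[2], 2))
```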

The good news is that nowadays we can do power analyses for both simple and complex mediation models (14). The bad news is that, in the presence of randomisation of X but not M, the observed M-Y relation entails strong assumptions which are usually ignored (15). Researchers should e.g. justify why there exist no other mediating variables than the ones in the model; leaving variables out is effectively the same as assuming their effect to be zero. Also, the investigator should demonstrate why no omitted variables affect both M and Y – if there are such variables, the causal effect estimate may be distorted at best and misleading at worst.

Now that we know it’s bad to omit variables, how do we avoid overfitting the model (i.e. being fooled by reading too much into what the data say)? It is very common for seemingly supported theories to fail to generalise to slightly different situations or other samples (16), and subgroup claims regularly fail to pan out in new data (17). Some solutions include ridge regression in the frequentist framework and regularising priors in the Bayesian one, but the simplest (though not the easiest) solution would be cross-validation. In cross-validation, you basically divide your sample into two (or even up to n) parts, use the first one to explore and the second one to “replicate” the analysis. Unfortunately, you need a large enough sample to be able to break it into parts.
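
A minimal split-half sketch of the idea, again with simulated data rather than a real intervention dataset:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.standard_normal((400, 10))            # ten candidate determinants
y = 0.3 * X[:, 0] + rng.standard_normal(400)  # in truth, only the first one matters

X_explore, X_confirm, y_explore, y_confirm = train_test_split(X, y, test_size=0.5, random_state=0)

model = LinearRegression().fit(X_explore, y_explore)        # "explore" half: fit freely
print("in-sample R^2:", round(model.score(X_explore, y_explore), 2))
print("held-out R^2: ", round(model.score(X_confirm, y_confirm), 2))  # the honest estimate
```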

What does all this tell us? Mainly, that investigators would do well to heed Kenny’s (18) admonition: “mediation is not a thoughtless routine exercise that can be reduced down to a series of steps. Rather, it requires a detailed knowledge of the process under investigation and a careful and thoughtful analysis of data”. I would conjecture that researchers often lack such process knowledge. It may also be that under complexity, the exact processes become both unknown and unknowable (19). Tools like structural equation modelling are wonderful, but I’m curious whether they are up to the task of advising us about how to live in interconnected systems, where trends and cascades are bound to happen, and everything causes everything else.

These are just relatively disorganised thoughts, and I’m curious to hear if someone can offer some hope for the situation. Specifically, hearing of interventions that work consistently and robustly would definitely make my day.

ps. If you’re interested in replication matters in health psychology, there’s an upcoming symposium on the topic at EHPS17 featuring Martin Hagger, Gjalt-Jorn Peters, Rik Crutzen, Marie Johnston and me. My presentation is titled “Disentangling replicable mechanisms of complex interventions: What to expect and how to avoid fooling ourselves?”

pps. A recent piece in Lancet (20) called for a complex systems model of evidence for public health. Here’s a small conversation with the main author, regarding the UK Medical Research Council’s take on the subject. As you see, the science seems to be in some sort of a limbo/purgatory-type of place currently, but smart people are working on it so I have hope 🙂


https://twitter.com/harryrutter/status/876219437430517761

 

Bibliography

 

  1. Moore GF, Audrey S, Barker M, Bond L, Bonell C, Hardeman W, et al. Process evaluation of complex interventions: Medical Research Council guidance. BMJ. 2015 Mar 19;350:h1258.
  2. Abraham C, Johnson BT, de Bruin M, Luszczynska A. Enhancing reporting of behavior change intervention evaluations. JAIDS J Acquir Immune Defic Syndr. 2014;66:S293–S299.
  3. Shiell A, Hawe P, Gold L. Complex interventions or complex systems? Implications for health economic evaluation. BMJ. 2008 Jun 5;336(7656):1281–3.
  4. Sterman JD. Learning from Evidence in a Complex World. Am J Public Health. 2006 Mar 1;96(3):505–14.
  5. Resnicow K, Page SE. Embracing Chaos and Complexity: A Quantum Change for Public Health. Am J Public Health. 2008 Aug 1;98(8):1382–9.
  6. Bollen KA. Structural equations with latent variables. New York: John Wiley. 1989;
  7. Borsboom D. A network theory of mental disorders. World Psychiatry. 2017 Feb;16(1):5–13.
  8. Hilborn RC. Sea gulls, butterflies, and grasshoppers: A brief history of the butterfly effect in nonlinear dynamics. Am J Phys. 2004 Apr;72(4):425–7.
  9. LeBel EP, Berger D, Campbell L, Loving TJ. Falsifiability Is Not Optional. Accepted pending minor revisions at Journal of Personality and Social Psychology. [Internet]. 2017 [cited 2017 Apr 21]. Available from: https://osf.io/preprints/psyarxiv/dv94b/
  10. Morey R D, Lakens D. Why most of psychology is statistically unfalsifiable. GitHub [Internet]. in prep. [cited 2016 Oct 23]; Available from: https://github.com/richarddmorey/psychology_resolution
  11. Lakens D. Equivalence Tests: A Practical Primer for t-Tests, Correlations, and Meta-Analyses [Internet]. 2016 [cited 2017 Feb 24]. Available from: https://osf.io/preprints/psyarxiv/97gpc/
  12. Dienes Z. Understanding Psychology as a Science: An Introduction to Scientific and Statistical Inference. Palgrave Macmillan; 2008. 185 p.
  13. Dienes Z. Using Bayes to get the most out of non-significant results. Quant Psychol Meas. 2014;5:781.
  14. Schoemann AM, Boulton AJ, Short SD. Determining Power and Sample Size for Simple and Complex Mediation Models. Soc Psychol Personal Sci. 2017 Jun 15;194855061771506.
  15. Bullock JG, Green DP, Ha SE. Yes, but what’s the mechanism? (don’t expect an easy answer). J Pers Soc Psychol. 2010;98(4):550–8.
  16. Yarkoni T, Westfall J. Choosing prediction over explanation in psychology: Lessons from machine learning. FigShare; 2016. https://dx.doi.org/10.6084/m9.figshare.2441878.v1
  17. Wallach JD, Sullivan PG, Trepanowski JF, Sainani KL, Steyerberg EW, Ioannidis JPA. Evaluation of Evidence of Statistical Support and Corroboration of Subgroup Claims in Randomized Clinical Trials. JAMA Intern Med. 2017 Apr 1;177(4):554–60.
  18. Kenny DA. Reflections on mediation. Organ Res Methods. 2008;11(2):353–358.
  19. Bar-Yam Y. The limits of phenomenology: From behaviorism to drug testing and engineering design. Complexity. 2016 Sep 1;21(S1):181–9.
  20. Rutter H, Savona N, Glonti K, Bibby J, Cummins S, Finegood DT, et al. The need for a complex systems model of evidence for public health. The Lancet [Internet]. 2017 Jun 13 [cited 2017 Jun 17];0(0). Available from: http://www.thelancet.com/journals/lancet/article/PIIS0140-6736(17)31267-9/abstract

 

The scientific foundation of intervention evaluation

In the post-replication-crisis world, people are increasingly arguing that even applied people should actually know what they’re doing when they do what they call science. In this post I expand upon some points I made in these slides about the philosophy of science behind hypothesis testing in interventions.

How does knowledge grow when we do intervention research? Evaluating whether an intervention worked can be phrased in relatively straightforward terms: “there was a predicted change in the pre-specified outcome”. This is, of course, a simplification. But try to contrast it with the attempt to phrase what you mean when you want to claim how the intervention worked, or why it did not. To do this, you need to spell out the program theory* of the intervention, which explicates the logic and causal assumptions behind intervention development.

* Also referred to as programme logic, intervention logic, theory-based (or driven) evaluation, theory of change, theory of action, impact pathway analysis, or programme theory-driven evaluation science… (Rogers, 2008). These terms are equivalent for the purposes of this piece.

The way I see it (for a more systematic approach, see intervention mapping), we have background theories (Theory of Planned Behaviour, Self-Determination Theory, etc.) and knowledge from earlier studies, which we synthesise into a program theory. This knowledge informs us about how we believe an intervention in our context would achieve its goals, regarding the factors (“determinants”) that determine the target behaviour. From (or during the creation of) this mesh of substantive theory and accompanying assumptions, we deduce a boxes-and-arrows diagram, which describes the causal mechanisms at play. These assumed causal mechanisms then help us derive a substantive hypothesis (e.g. “intervention increases physical activity”), which informs a statistical hypothesis (e.g. “accelerometer-measured metabolic equivalent units will be statistically significantly higher in the intervention group than the control group”). The statistical hypothesis then dictates what sort of observations we should be expecting. I call this the causal stream; each one of the entities follows from what came before it.

[Figure: the causal and inferential streams, running from program theory through hypotheses to observations]

The inferential stream runs in the other direction. Hopefully, the observations are informative enough that we can make judgements regarding the statistical hypothesis. The statistical hypothesis’ fate then informs the substantive hypothesis, and whether our theory upstream gets corroborated (supported). Right?

Not so fast. What we derived the substantive and statistical hypotheses from was not only the program theory (T) we wanted to test. We also had all the other theories the program theory was drawn from (i.e. auxiliary theories, At), as well as an assumption that the accelerometers measure physical activity as they are supposed to, and other assumptions about instruments (Ai). Not only this, we assume that the intervention was delivered as planned and all other presumed experimental conditions (Cn) hold, and that there are no other systematic, unmeasured contextual effects that mess with the results (“all other things being equal”; a ceteris paribus condition, Cp).


We now come to a logical implication (“observational conditional”) for testing theories (Meehl, 1990b, p. 119, 1990a, p. 109). Oi is the observation of an intervention having taken place, and Op is an observation of increased physical activity:

(T and At and Ai and Cn and Cp) → (Oi → Op)

[Technically, the first arrow should be logical entailment, but that’s not too important here.] The first bracket can be thought of as “all our assumptions hold”, the second bracket as “if we observe the intervention, then we should observe increased physical activity”. The whole thing thus roughly means “if our assumptions (T, A, C) hold, we should observe a thing (i.e. Oi → Op)”.

Now here comes falsifiability: if we observe an intervention but no increase in physical activity, the second bracket comes out false, and by modus tollens the conjunction in the first bracket must be false as well. By elementary logic, we must conclude that one or more of the elements in the first bracket is false – the big problem is that we don't know which element(s) failed! And what if the experiment pans out? It's not just our theory that's been corroborated, but the whole bundle of assumptions. This is known as the Duhem-Quine problem, and it has brought misery to countless induction-loving people for decades.
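A toy sketch of the point (the five variable names simply mirror the symbols in the conditional above): when Oi is observed but Op is not, only the assignment where all five assumptions are true gets ruled out, so the data alone leave 31 candidate explanations for the failure.

```python
# Duhem-Quine in miniature: a failed prediction refutes the conjunction
# (T and At and Ai and Cn and Cp), but not any particular conjunct.
from itertools import product

assumptions = ["T", "At", "Ai", "Cn", "Cp"]

# Every truth-value assignment except "all true" is still on the table.
survivors = [combo for combo in product([True, False], repeat=5) if not all(combo)]

print(f"{len(survivors)} of {2 ** 5} assignments remain consistent with the failed prediction")
for combo in survivors[:3]:  # a few examples of possible culprits
    failed = [name for name, value in zip(assumptions, combo) if not value]
    print("possible culprit(s):", ", ".join(failed))
```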

EDIT: As Tal Yarkoni pointed out, this corroboration can be negligible unless one is making a risky prediction. See the damn strange coincidence condition below.

[Figure: a failed prediction and the chain of links]

EDIT: There was a great comment by Peter Holtz. Knowledge grows when we identify the weakest links in the mix of theoretical and auxiliary assumptions, and see if we can falsify them. And things do get awkward if we abandon falsification.

If wearing an accelerometer increases physical activity in itself (say, people who receive an intervention become more conscious of their activity monitoring, and thus exhibit more pronounced measurement effects when told to wear an accelerometer), you obviously don't conclude the increase is due to the program theory's effectiveness. Nor would you be very impressed by setups where you'd likely get the same result whether the program theory was right or wrong. In other words, you want a situation where, if the program theory were false, you would doubt a priori that among those who increased their physical activity, many would have undergone the intervention. This is called the theoretical risk: the prior probability p(Op|Oi), i.e. the probability of observing an increase in physical activity given that the person underwent the intervention, should be low absent the theory (Meehl, 1990b, p. 199, mistyped in Meehl, 1990a, p. 110), and the lower the probability, the more impressive the prediction. In other words, spontaneous improvement absent the program theory should be a damn strange coincidence.
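As a rough numerical illustration (all probabilities below are invented for the example, not estimates from any study), here is how much more a successful prediction corroborates the theory bundle when the predicted outcome would have been improbable without it:

```python
# Meehl's theoretical risk, sketched with Bayes' rule: the same successful
# prediction shifts belief a lot when p(Op|Oi) absent the theory is low
# ("damn strange coincidence"), and barely at all when it is high.
def posterior(prior_theory: float, p_obs_with_theory: float, p_obs_without_theory: float) -> float:
    """Posterior credence in the theory bundle after the predicted outcome is observed."""
    evidence = (prior_theory * p_obs_with_theory
                + (1 - prior_theory) * p_obs_without_theory)
    return prior_theory * p_obs_with_theory / evidence

prior = 0.5  # assumed prior credence in the theory bundle

print("risky prediction:  ", round(posterior(prior, 0.8, 0.05), 2))  # outcome unlikely without the theory
print("unrisky prediction:", round(posterior(prior, 0.8, 0.70), 2))  # outcome likely anyway
```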

Note that solutions for handling the Duhem-Quine mess have been proposed in both the frequentist (e.g. error-statistical piecewise testing; Mayo, 1996) and the Bayesian (Howson & Urbach, 2006) frameworks.

What is a theory, anyway?

A lot of the above discussion hangs on what we mean by a "theory" – and consequently, on whether we should apply the process of theory testing to intervention program theories. [Some previous discussion here.] One could argue that saying "if I push this button, my PC will start" is not a scientific theory, and that interventions use theories but logic models do not capture them. It has been said that if the theoretical assumptions underpinning an intervention don't hold, the intervention will fail, but that this doesn't make an intervention evaluation a test of the theory. This view has been defended by arguing that the behaviour change theories underlying an intervention may work even though, for example, the intervention targets the wrong cognitive processes.

To me it seems these are all part of the intervention program theory, which we're looking to make inferences from. If you're testing statistical hypotheses, you should have substantive hypotheses you believe are informed by the statistical ones, and those come from a theory – it doesn't matter whether it's a general theory-of-everything or one that applies in a very specific context, such as the situation of your target population.

Now, here’s a question for you:

If the process described above doesn’t look familiar and you do hypothesis testing, how do you reckon your approach produces knowledge?

Note: I'm not saying it doesn't (though that's an option), just curious about alternative approaches. I know that e.g. Mayo's error-statistical perspective is superior to what's presented here, but I've yet to find an exposition of it I could thoroughly understand.

Please share your thoughts and let me know where you think this goes wrong!

With thanks to Rik Crutzen for comments on a draft of this post.

ps. If you're interested in replication matters in health psychology, there's an upcoming symposium on the topic in EHPS17 featuring Martin Hagger, Gjalt-Jorn Peters, Rik Crutzen, Marie Johnston and me. My presentation is titled "Disentangling replicable mechanisms of complex interventions: What to expect and how to avoid fooling ourselves?"

pps. Paul Meehl’s wonderful seminar Philosophical Psychology can be found in video and audio formats here.

Bibliography:

Abraham, C., Johnson, B. T., de Bruin, M., & Luszczynska, A. (2014). Enhancing reporting of behavior change intervention evaluations. JAIDS Journal of Acquired Immune Deficiency Syndromes, 66, S293–S299.

Dienes, Z. (2008). Understanding Psychology as a Science: An Introduction to Scientific and Statistical Inference. Palgrave Macmillan.

Dienes, Z. (2014). Using Bayes to get the most out of non-significant results. Frontiers in Psychology, 5, 781. https://doi.org/10.3389/fpsyg.2014.00781

Hilborn, R. C. (2004). Sea gulls, butterflies, and grasshoppers: A brief history of the butterfly effect in nonlinear dynamics. American Journal of Physics, 72(4), 425–427. https://doi.org/10.1119/1.1636492

Howson, C., & Urbach, P. (2006). Scientific reasoning: the Bayesian approach. Open Court Publishing.

Lakatos, I. (1971). History of science and its rational reconstructions. Springer. Retrieved from http://link.springer.com/chapter/10.1007/978-94-010-3142-4_7

Mayo, D. G. (1996). Error and the growth of experimental knowledge. University of Chicago Press.

Meehl, P. E. (1990a). Appraising and amending theories: The strategy of Lakatosian defense and two principles that warrant it. Psychological Inquiry, 1(2), 108–141.

Meehl, P. E. (1990b). Why summaries of research on psychological theories are often uninterpretable. Psychological Reports, 66(1), 195–244. https://doi.org/10.2466/pr0.1990.66.1.195

Moore, G. F., Audrey, S., Barker, M., Bond, L., Bonell, C., Hardeman, W., … Baird, J. (2015). Process evaluation of complex interventions: Medical Research Council guidance. BMJ, 350, h1258. https://doi.org/10.1136/bmj.h1258

Rogers, P. J. (2008). Using Programme Theory to Evaluate Complicated and Complex Aspects of Interventions. Evaluation, 14(1), 29–48. https://doi.org/10.1177/1356389007084674

Shiell, A., Hawe, P., & Gold, L. (2008). Complex interventions or complex systems? Implications for health economic evaluation. BMJ, 336(7656), 1281–1283. https://doi.org/10.1136/bmj.39569.510521.AD

 

Evaluating intervention program theories – as theories

How do we figure out whether our ideas worked out? To me, it seems that in psychology we seldom think rigorously about this question, despite having been criticised for dubious inferential practices for at least half a century. You can download a PDF of my talk at the Finnish National Institute for Health and Welfare (THL) here, or see the slide show at the end of this post. Please solve the three problems in the summary slide! 🙂

TLDR: is there a reason why evaluating intervention program theories shouldn't follow the process of scientific inference?

[Summary slide]