Selected Publications

Citation info:

Reproducibility is a defining feature of science, but the extent to which it characterizes current research is unknown. We conducted replications of 100 experimental and correlational studies published in three psychology journals using high-powered designs and original materials when available. Replication effects were half the magnitude of original effects, representing a substantial decline. Ninety-seven percent of original studies had statistically significant results. Thirty-six percent of replications had statistically significant results; 47% of original effect sizes were in the 95%; confidence interval of the replication effect size; 39% of effects were subjectively rated to have replicated the original result; and if no bias in original results is assumed, combining original and replication results left 68% with statistically significant effects. Correlational tests suggest that replication success was better predicted by the strength of original evidence than by characteristics of the original and replication teams.

Recent Publications

Recent Posts

Anomalous diffusion: Individual particle tracking vs. Ensemble averages

A recent article by Bos et al. (2017)1 in which the authors compared network measures derived from cross-sectional (between-individual) and within-individual data sparked some discussion on facebook.

The authors conclude that the outcomes of network analyses based on time-ordered ensemble averages are not the same as the outcomes based on the time evolution of individual trajectories.

I argued this result is to be expected, because I think the data-generating process underlying time and trial series of human physiology and performance should be considered a non-ergodic process and the particular analyses used to construct the graphical model (the Gaussian Graphical Model) assume ergodic processes. Simply put, the ergodic assumption is that the space-averaged behaviour of some variable observed in an ensemle will be the same as the time-averaged behaviour observed in an individual (… in the limit of infite time and space). If it is violated, this is a problem for such models, because it implies the statistical properties of the process change over time and one cannot depend (fullly) on universal laws of probability to predict or infer properties and behaviour of a system.

This post originated as a short tutorial on analysing stationarity and other properties of time series, the current text is an expansion of the page kindly tweeted by Eiko Fried:

Particles are individuals too!

I often use the Complex Sysems argument to suggest measurements should be considered non-ergodic (many interactions between processes across multiple scales that are mostly multiplicative instead of additive), however, it turns out that studies of individual particle tracking (both biological and physical ‘particles’) also have to deal with ergodicity-breaking, ageing and non-stationarity. This behaviour is called anomalous diffusion, which turns out to be more common in nature than the classical normal diffusion processes studied by Brown, Einstein and many other prominent scientists.

The following quotes are from the article Anomalous diffusion models and their properties: non-stationarity, non-ergodicity, and ageing at the centenary of single particle tracking which provides an excellent review and perspective.2 The authors list the conditions for normal diffusion, they should sound familiar because these assunmptions are the same as for most statistical analyses used in the social and life sciences:

The conditions assumed by Einstein in his derivation of the diffusion equation are (i) the independence of individual particles, (ii) the existence of a sufficiently small time scale beyond which individual displacements are statistically independent, and (iii) the property that the particle displacements during this time scale correspond to a typical mean free path distributed symmetrically in positive or negative directions.

So, we need:

  • Independence of indivdual measurement objects in a sample
  • Memorylessness regarding any interactions between individuals and their environment. The after-effects of interactions should be short-lived and not affect behaviour in the long run (no long-range correlations).
  • Randomly distributed deviations from central tendency (Gaussian error distribution).

What happens when we violate these assumptions?

In anomalous diffusion processes, at least one of these fundamental assumptions is violated, and the strong convergence to the Gaussian according to the central limit theorem broken. In particular, by departing from one or more of the assumptions (i)–(iii), we find that there exist many different generalisations of the Einstein–Smoluchowski diffusion picture.

Ok, but what does that mean?

The fact that we consider this large range of anomalous diffusion processes is the non-universal nature of anomalous diffusion itself. Once we leave the realm of Brownian motion, we lose the confines of the central limit theorem forcing the processes to converge to the Gaussian behaviour predicted by Einstein.


Quite commonly such analyses of time series from experiment or simulations are performed in terms of time averaged observables, in particular, the time averaged MSD [Mean squared displecement = Mean squared ‘error’]. We point out that the physical interpretation of the obtained behaviour of such time averages in terms of the typically available ensemble approaches may be treacherous : many of the anomalous diffusion processes discussed herein lead to a disparity between the ensemble and the time averaged observable, for instance, between the ensemble and time averaged MSDs […] even in the limit of long measurement times. Moreover, it turns out that individual results for time averages […] appear to be irreproducible, despite long measurement times.

To summarise, when non-ergodicity, non-stationarity and ageing play a role in the timeseries measurements of variables of interest, one cannot rely on ensemble statistics to yield valid inferences, in fact they should be considered treacherous, and individual time series averages are irreproducible.

What follows are some tests that have been developped (in different scientific disciplines) to check assumptions of non-stationarity (and more).

This is by no means an exhaustive account of tests (or even a strategy that I would use myself, but that is for another post).

A timeseries of daily mood: down

Unzip and load these interesting ESM data provided by:

Kossakowski, J. J., Groot, P. C., Haslbeck, J. M., Borsboom, D., & Wichers, M. (2016, December 12). Data from “Critical Slowing Down as a Personalized Early Warning Signal for Depression.” Retrieved from

There is a timecode in the dataset indicating when the participant was promted to nswer some questions. Here, we use it to create a time series object and plot it.

beep <- dmy_hms(paste0(df.ts$date," ",df.ts$beeptime,":00"),tz = "Europe/Amsterdam")
y <- xts(df.ts$mood_down, = beep)
plot(y, main = "Time series of variable 'mood_down'")

Test for level and trend stationarity

# Load some Time Series libraries
# BARTELS RANK TEST [rank version of von NEUMANN's Ratio Test for Randomness]
# H0: randomness
# H1: nonrandomness
randtests::bartels.rank.test(df.ts$mood_down, alternative = "two.sided")
>   Bartels Ratio Test
> data:  df.ts$mood_down
> statistic = -16.606, n = 1474, p-value < 2.2e-16
> alternative hypothesis: nonrandomness

# H0: randomness
# H1: trend
randtests::bartels.rank.test(df.ts$mood_down, alternative = "left.sided")
>   Bartels Ratio Test
> data:  df.ts$mood_down
> statistic = -16.606, n = 1474, p-value < 2.2e-16
> alternative hypothesis: trend

# H0: randomness
# H1: systematic oscillation
randtests::bartels.rank.test(df.ts$mood_down, alternative = "right.sided")
>   Bartels Ratio Test
> data:  df.ts$mood_down
> statistic = -16.606, n = 1474, p-value = 1
> alternative hypothesis: systematic oscillation

# H0: randomness
# H1: upward or downward trend
randtests::cox.stuart.test(na.exclude(df.ts$mood_down), alternative = "two.sided")
>   Cox Stuart test
> data:  na.exclude(df.ts$mood_down)
> statistic = 246, n = 362, p-value = 7.196e-12
> alternative hypothesis: non randomness

# H0: randomness
# H1: upward trend
randtests::cox.stuart.test(na.exclude(df.ts$mood_down), alternative = "right.sided")
>   Cox Stuart test
> data:  na.exclude(df.ts$mood_down)
> statistic = 246, n = 362, p-value = 3.598e-12
> alternative hypothesis: increasing trend

# H0: randomness
# H1: downward trend
randtests::cox.stuart.test(na.exclude(df.ts$mood_down), alternative = "left.sided")
>   Cox Stuart test
> data:  na.exclude(df.ts$mood_down)
> statistic = 246, n = 362, p-value = 1
> alternative hypothesis: decreasing trend

# H0: level stationarity
# H1: unit root [not level stationary]
tseries::kpss.test(na.exclude(df.ts$mood_down), lshort = TRUE, null = "Level")
>   KPSS Test for Level Stationarity
> data:  na.exclude(df.ts$mood_down)
> KPSS Level = 1.6737, Truncation lag parameter = 8, p-value = 0.01

# H0: trend stationarity 
# H1: not trend stationary
tseries::kpss.test(na.exclude(df.ts$mood_down), lshort = TRUE, null = "Trend")
>   KPSS Test for Trend Stationarity
> data:  na.exclude(df.ts$mood_down)
> KPSS Trend = 0.086699, Truncation lag parameter = 8, p-value = 0.1

Test for AR, ARCH or an optimal ARIMA process

First inspect the partial autocorrelation function to get an idea of the AR order.

acf(df.ts$mood_down, na.action = na.pass)
pacf(df.ts$mood_down, na.action = na.pass)

One could conclude there’s at least \(AR(3)\) involved, however there are also possible long-range correlations in these data. I am not an initiate of the how-to-interpret-(P)ACF-to-ARfiMA-orders-cult, so let’s do some tests!

# H0: time series follows some AR process
# H1: time series cannot be considered some AR process
> $test.stat
> [1] 5.359887
> $p.value
> [1] 0.02074511
> $order
> [1] 16

Not an AR process, how about ARCH or a best fitting ARiMA?

# H0: time series follows some AR process
# H1: time series cannot be considered some ARCH process
TSA::McLeod.Li.test(y = df.ts$mood_down, plot = TRUE, omit.initial = TRUE)

# H0: time series follows some AR process
# H1: time series cannot be considered some ARiMA process
TSA::McLeod.Li.test(object = forecast::auto.arima(df.ts$mood_down), plot = TRUE, omit.initial = TRUE)

This test finds no convincing evidence of the timeseries being either ARCH or ARiMA.

Structural decomposition

It is also possible to decompose the time series in different sources of variation. Let’s calculate the average frequency between beeps (in hours), and use that as a guess for any periodicity that may occur in the timeseries.

(beepMean <- mean(time_length(diff(beep), unit = "hours")))
> [1] 3.886332
ts0 <- ts(df.ts$mood_down, frequency = round(beepMean))
fit<-StructTS(ts0, type="BSM")
par(mfrow = c(4, 1)) # to give appropriate aspect ratio for next plot.
plot(cbind(fitted(fit), resids=resid(fit)), main= "Basic Structural Model for 'mood_down'")

There is a lot going on here, the residuals and level fluctuations show we probably should not assume a linear ergodic stochastic change process.


This timeseries is:

  • Not level stationary
  • Not random
  • Not oscillatory
  • Not generated by some AR process
  • Not ARCH or generated by a best fitting ARiMA process
  • Contains long range correlations in PACF, e.g. lags 6, 9, 13, 25 and beyond (remember it is 25 * 4 hours!)
  • Upward Trend (which is stationary)

This is likely a violation of the strong ergodic condition / strong memorylessness property:

The statistical properties of a between-individual cross-sectional sample of the same theoretical process will be very different from a within-individual sample in a non-trivial way if the process can be considered to be some form of anomalous diffusion.

So, do not use any analytic techniques that depend on the ESM data and underlying process to be stationary and ergodic if these tests indicate the assumptions are violated.

NEXT: Dynamics in state space

What can we do to study non-ergodic processes?

Again from the anomalous diffussion article:

We note that non-ergodicity […] is not restricted to the spatial diffusion of particles, but similar principles hold for certain processes revealing non-exponential dynamics, such as the blinking behaviour of individual quantum dots or laser cooling. To physically interpret such measurements we need to understand the time averages of individual time series. As we will see, this requires information beyond the conventional ensemble averages for a variety of anomalous diffusion processes.

This resonates with some great advice from Peter Molenaar in Todd Roses’s book The End of Average:

“first analyse, then aggregate”

When I find the time I’ll post some examples of how to represent and quantify state space dynamics of individual time series.

For now, the plots below should make clear the occurrence of state stransitions is not distributed according to a uniform or Gaussian pdf.


df.ts2 = data.frame(
  Signal = round(head(df.ts$mood_down, -1),0),
  NextSignal = round(tail(df.ts$mood_down, -1),0),
  Weight = 1,
  Label = "mood_down",
  Size  = 0,
  Time = factor(round(seq(1,150, length.out = max(index(head(beep,-1))))))

N = length(df.ts2$Signal)

dfcat <- round(runif(N,min=-3,max=3))
dfcat2 <- data.frame(
  Signal = round(head(dfcat, -1),0),
  NextSignal = round(tail(dfcat, -1),0),
  Weight = 1,
  Label = "random_uniform",
  Time = factor(round(seq(1,150, length.out = max(index(head(beep,-2))))))

dftot <- rbind(dfcat2,df.ts2)

ggplot(dftot, aes(xbucket = Signal, 
                   ybucket = NextSignal, 
                   fill = factor(NextSignal), 
                   weight = Weight) )+
  stat_marimekko(color = 'black', xlabelyposition = -0.1) + 
  scale_y_continuous("Relative frequency") +
  scale_x_continuous("Current value") +
  scale_fill_discrete("Next value") +
  facet_wrap(~Label) +
  ggtitle(label = "State tranisitions: Random numbers vs. 'I feel down'",subtitle = paste("box dimensions represent relative frequencies of",N,"values")) +
  theme(axis.text.x =  element_blank(), 
    axis.ticks.x = element_blank(), 
    strip.background = element_rect(fill="grey90"),
    strip.text = element_text(face = "bold"),
    plot.background = element_blank(), 
    panel.border = element_blank(), 
    panel.background = element_blank(), 
    panel.grid = element_blank()) 

If we consider the values \(-3\) to \(3\) to span the possibe states the ‘system’ can be in, then a crude way to study the dynamics of the states is to look at a lag-1 return plot.


tmp   <-$Signal,df.ts2$NextSignal))

# Set the size to frequency of occurrence of each combination of values
for(c in seq_along(tmp$Freq)){
  ID <- (df.ts2$Signal%in%tmp$Var1[c])&(df.ts2$NextSignal%in%tmp$Var2[c])
    df.ts2$Size[ID] <- tmp$Freq[c]

ga <- ggplot(df.ts2,aes(x=Signal,y=NextSignal, frame = Time)) +
  geom_path(colour = "grey80") +
  geom_point(aes(size=Size)) +
  scale_y_continuous("Next value") +
  scale_x_continuous("Current value") +
  scale_size_continuous("Frequency of\n pair (Curren,Next)") + 
  ggtitle(label = "Window", subtitle = "Return plot") +
  theme_bw() + coord_equal()

gganimate(ga, interval=.2, "returnplot.gif")

It is appears to be the case the system is attracted to specific states. Perhaps the dynamics can be described as transitions between more or less stable states, or stable temporal patterns of recurring states.

to be continued …

  1. Bos, F. M., Snippe, E., de Vos, S., Hartmann, J. A., Simons, C. J., van der Krieke, L., & Wichers, M. (2017). Can We Jump from Cross-Sectional to Dynamic Interpretations of Networks? Implications for the Network Perspective in Psychiatry. Psychotherapy and Psychosomatics, 86(3), 175-177

  2. Metzler, R., Jeon, J. H., Cherstvy, A. G., & Barkai, E. (2014). Anomalous diffusion models and their properties: non-stationarity, non-ergodicity, and ageing at the centenary of single particle tracking. Physical Chemistry Chemical Physics, 16(44), 24128-24164.



No reason, this is the accidental mathematician category.

Pandigital numbers

Pandigital numbers aren’t that special, it’s a number that has all digits from 1-9, or, 0-9 in it.

For example: 381654729 is a pandigital number (with some extra properties)

Pandigital functions

As Complexity Scientists (and musicians and artists in general) know:

Interesting things will happen when the degrees of freedom available to a system to generate its behaviour, are reduced.

Now, the pandigital constraint can become very interesting when applied to functions, or rather, mathematical operators. The rules are:

  • Take a pandigital number
  • Stick any mathematical operator between any cluster of digits
  • Attempt to get an interesting outcome

Turns out such a pandigital formula is able to approximate the trancendental number \(e\) with uncanny precision!

\[e\approx\left (1 + 9^{-4^{6*7}} \right)^{3^{2^{85}}}\]

It was discovered by Richard Sabey in 2004.

The Numberphile channel has a great video on it:


A beginning is the time for taking the most delicate care that the balances are correct.

– Fom Manual of Muad’Dib by Prinses Irulan (Herbert, 1965)

It’s been a while…

If you still have a while…


This is a demonstration of using R in the context of hypothesis testing by means of Effect Size Confidence Intervals.


A Post-Publication Peer-Review (3PR) of Time, Money, and Morality (Gino & Mogilne, 2013).



Badges to Acknowledge Open Practices

The aim is to specify a standard by which we can say that a scientific study has been conducted in accordance with open-science principles and provide visual icons to allow advertising of such good behaviours.


This is an example of using the custom widget to create your own homepage section.

I am a teaching instructor for the following courses at:

Graduate School Behavioural Science:

– Dynamics of Complex Systems (Go to Course Materials 2016-2017)

School for Pedagogical and Educational Sciences:

– Introduction to Scientific Methods