
When you can (and when you absolutely should not) show a standard error in your plots

Patrice Salomé of Plant Editors explains the correct use of standard error.

We are very used to seeing plots with error bars around data points. These data points are most often the mean of a number of technical replicates collected during a given experiment; you will likely assume that the error bar represents the standard deviation (SD) or the standard error of the mean (SEM).

You may have also read the instructions for authors and noticed how many journals request that you show individual data points when you have fewer than six (or thirty) replicates.

These two paragraphs have something in common (think of it as their intersection in a Venn diagram): SD and SEM can only be shown when specific conditions are met. The journals trust that you know this, but want to verify your knowledge of basic statistics all the same.

When you perform an experiment and measure a parameter, you do so across several replicates. When you calculate the mean and SD (or SEM), you make one major assumption: that all these replicate values will follow a normal distribution, centered on the mean. The normal distribution describes how the data points should behave, and how many of them will be within given intervals relative to the mean:

Illustration of a normal curve with one, two, and three standard deviations from the mean.
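
If you would like to check the numbers behind such a curve yourself, here is a minimal Python sketch. It assumes an ideal normal distribution, and the function name fraction_within is only an illustrative choice; it relies on the standard relationship between the normal distribution and the error function.

```python
import math

def fraction_within(k):
    """Fraction of an ideal normal distribution lying within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

# Coverage for one, two, and three standard deviations around the mean
coverage = {k: round(fraction_within(k), 4) for k in (1, 2, 3)}
print(coverage)
```

This gives roughly 68%, 95%, and 99.7% of values within one, two, and three SDs of the mean, respectively.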

Of course, you cannot perform (close to) infinite numbers of replicates, but there is a minimum number of replicates that need to be collected before you can assume your data follow a normal distribution. This number is typically three, as a compromise between statistical precision and experimental cost. If you only have two replicates, you cannot assume that their associated values follow a normal distribution, as they will only define a line.

So repeat after me: I swear to never show a mean and SD (or SEM) when n is below 3. Please also note that n = 3 should not come from three independent experiments, each with one replicate, or from two independent experiments, one with one replicate and one with two replicates. It would be best to stick to the “3 x >3” rule of three independent experiments (these would be your biological replicates), each consisting of at least three technical replicates. (Not sure about biological and technical replicates? Learn more in this blog post!)
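
If you analyze your data with a script, you can build this oath into it. The sketch below is one possible way to do so, not a standard tool: the names MIN_REPLICATES and summarize are made up for illustration, and the measurements are hypothetical. It returns the mean, SD, and SEM only when there are at least three replicates, and otherwise hands back the individual values so that you plot those instead.

```python
import math
import statistics

MIN_REPLICATES = 3  # threshold taken from the rule described above

def summarize(values):
    """Return mean, SD, and SEM only when there are enough replicates."""
    n = len(values)
    if n < MIN_REPLICATES:
        # Too few replicates: report the individual data points instead
        return {"n": n, "points": list(values)}
    sd = statistics.stdev(values)  # sample SD (n - 1 in the denominator)
    sem = sd / math.sqrt(n)        # SEM is the SD divided by the square root of n
    return {"n": n, "mean": statistics.mean(values), "sd": sd, "sem": sem}

print(summarize([4.0, 6.0]))       # hypothetical data, two replicates only
print(summarize([4.0, 5.0, 6.0]))  # hypothetical data, three replicates
```

Note that this check only counts values; it cannot tell whether your three values come from proper technical replicates within one experiment, so that part remains your responsibility.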

And what can you do if your data do not follow a normal distribution? Just plot the individual data points, either alone or in addition to a violin plot or a boxplot, which represent the distribution of the data without making any assumptions.
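
To make the boxplot option a little more concrete, here is a small sketch of the five numbers a boxplot typically displays (minimum, first quartile, median, third quartile, maximum), computed with Python's standard library. The function name five_number_summary and the data are hypothetical, and plotting programs may use slightly different quartile conventions, so treat the exact quartile values as one reasonable choice rather than the definitive one.

```python
import statistics

def five_number_summary(values):
    """Minimum, quartiles, and maximum: the values a boxplot typically displays."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }

measurements = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical data
summary = five_number_summary(measurements)
print(summary)
```

None of these numbers relies on the data following a normal distribution, which is why they remain safe to show alongside the individual data points.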

Want to have our experienced science editors have a look at your manuscript? Contact us here!