RG: Do we actually need (or understand) more than basic statistics?

Link to the original Q&A thread at ResearchGate

This is another topic worth looking at, and especially worth thinking about. I copy my answer here; it is to some extent off-topic (you will need to follow the link above to read the original post and the other answers):

Students I have supervised frequently seem to think that statistical tests come first, rather than being a source of guidance on how far we can stretch the inferences we make by “looking at the data” and derived summaries. They just describe effects as statistically significant or not. This results in very boring “results” sections lacking the information that the reader wants to know. When I read a paper I want to know the direction and size of an effect and what patterns are present in the data; if there is a test, it should help us decide how much precaution we need to use until additional evidence becomes available. Many students and experienced researchers who “worship” p-values and strict risk levels ignore how powerful and important the careful design of experiments is, and how the frequently seen use of “approximate” randomization procedures, or the approach of repeating an experiment until the results become significant, invalidates the p-values they report.
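The point about repeating an experiment until the results become significant can be made concrete with a small simulation (a sketch of my own, not part of the original thread): even when the null hypothesis is true, checking the p-value repeatedly as data accumulate and stopping at the first “significant” result pushes the false positive rate well above the nominal 5 %.

```python
import math
import random

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def peeking_trial(rng, max_n=100, peek_every=10, alpha=0.05):
    """Simulate one 'experiment' under a true null (mean 0, sd 1),
    testing after every `peek_every` observations and stopping as soon
    as p < alpha. Returns True if a 'significant' result was found."""
    total = 0.0
    for n in range(1, max_n + 1):
        total += rng.gauss(0.0, 1.0)
        if n % peek_every == 0:
            z = total / math.sqrt(n)  # sample mean divided by its SE
            if two_sided_p(z) < alpha:
                return True
    return False

rng = random.Random(42)
n_sim = 2000
rate = sum(peeking_trial(rng) for _ in range(n_sim)) / n_sim
# With peeking, the realized false positive rate ends up well above
# the nominal 0.05 used at each individual look.
print(f"False positive rate with optional stopping: {rate:.3f}")
```

The nominal 5 % applies only to a single test at a pre-planned sample size; each extra “peek” is an extra chance to cross the threshold by luck alone.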

[edited 5 min later] As I read again what I wrote it feels off-topic, but what I am trying to say is that not only the proliferation of p-values, and especially the use of fixed risk levels, but also, many times, how results are presented, reflects a much bigger problem: statistics being taught as a mechanical and exact science based on clear and fixed rules. Oversimplifying the subtleties and the degree of subjectivity involved in any data analysis, especially in relation to which assumptions are reasonable and how the experimental protocol determines which assumptions are tenable, is simply not the most useful training for anybody doing experimental research. So, in my opinion, yes, we need to understand much more than basic statistics in terms of principles, but this does not mean that we need to know advanced statistical procedures unless we use them or assess work that uses them.

 

RG: What prevents you from using a p-value other than 0.05 as your statistical significance cut-off?

Link to the original Q&A thread at ResearchGate

Even though there were already 84 answers, I added my own answer:

… for me choosing the critical p-value is not a statistical question. It belongs in the realm of the real-world effective cost of making the wrong decision. In research, it mainly relates to balancing “false positive” and “false negative” decisions. So, mostly informally, researchers sometimes set the critical value at 0.1 (10%) when replication is low. On the other hand, when we have many replicates, we will find statistically significant differences that are biologically irrelevant. [Added only here: The 5 % tends to work not too badly for the number of replicates used by many of us.]
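The flip side with many replicates can be shown with a quick calculation (my own illustration, not part of the original answer): holding the observed mean difference fixed at a biologically trivial value, the p-value of a two-sample z test shrinks towards zero as the number of replicates grows.

```python
import math

def two_sample_z_p(diff, sd, n):
    """Two-sided p-value of a two-sample z test for an observed mean
    difference `diff`, common standard deviation `sd`, n per group."""
    se = sd * math.sqrt(2.0 / n)
    return math.erfc(abs(diff / se) / math.sqrt(2))

# A tiny, arguably irrelevant difference: 0.05 units when sd = 1.
for n in (20, 200, 2000, 20000):
    print(f"n = {n:>5} per group: p = {two_sample_z_p(0.05, 1.0, n):.2g}")
```

With 20 replicates per group this difference is nowhere near significance; with 20 000 it is highly “significant”, even though the effect itself has not changed at all.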

In my opinion, in every scientific publication, whatever critical value we use for discussing and interpreting the results, the actual p-values should always be given. Not doing so just discards valuable information. Of course, one historical reason for not reporting actual values was the laborious calculation involved in obtaining them by interpolation from printed tables.

The situation has far-reaching consequences when dealing with legal compliance studies, environmental impact assessment, or safety. I would not want to take a 1 in 20 risk of making the wrong decision concerning a possibly lethal side-effect of a new medicine, while it might be acceptable to take that risk when comparing the new medicine to a currently used medicine known to be highly effective [but maybe not if comparing against a placebo]. In such cases, rather than balancing the risks of false positive and false negative decisions, we would want to minimize one of them. In other words, minimize the probability of the type of mistake that we need or want to avoid.

I have avoided statistical jargon to make this understandable to more readers. Statisticians call these Type I and Type II errors, and there is plenty of literature on them. In any case, I feel most comfortable with Tukey’s view on hypothesis testing, and his idea that we can NEVER ACCEPT the null hypothesis. We can either get evidence that A > B or that A < B, the alternative being that we do not have enough evidence to decide which one is bigger. Of course, in practice, using power analysis we can assess whether we could have detected a difference that would be relevant in practice. However, this is conceptually very different from accepting that there is no difference or no effect.
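The role of power analysis here can be sketched with a short calculation (the function and the numbers are my own, using the usual normal approximation): rather than “accepting” no effect after a non-significant test, we can report the probability that our design would have detected an effect of a size we consider relevant.

```python
import math
from statistics import NormalDist

def power_two_sample_z(effect, sd, n, alpha=0.05):
    """Approximate power of a two-sided two-sample z test to detect a
    true mean difference `effect`, common sd, n replicates per group."""
    std_norm = NormalDist()
    z_crit = std_norm.inv_cdf(1 - alpha / 2)
    se = sd * math.sqrt(2.0 / n)
    shift = abs(effect) / se
    # Probability that |Z| exceeds the critical value under the alternative.
    return (1 - std_norm.cdf(z_crit - shift)) + std_norm.cdf(-z_crit - shift)

# Could a design with 5 replicates per group detect a difference as
# large as one standard deviation?
print(f"power = {power_two_sample_z(effect=1.0, sd=1.0, n=5):.2f}")
```

With only 5 replicates per group, even a one-standard-deviation difference would be detected only about a third of the time, so a non-significant result in such a design tells us very little.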

[I would like to see students, and teachers, commenting on this problem, and how this fits with their understanding of the use of statistics in real situations. Please, just comment below. I will respond to any comments, and write a follow-up post on the effect of using different numbers of replicates on inferences derived from data].

Thinking, Fast and Slow

Daniel Kahneman (2012) Thinking, Fast and Slow. Penguin Books, London. ISBN 978-0-141-03357-0.

I am currently reading this book, and I am finding it extremely interesting. Understanding how and why we make choices is important for everybody. If you are a scientist, or aspire to be one, understanding why we accept some experimental results more readily than others, and why we are more comfortable with some hypotheses than with others, is of fundamental importance: both to guard against bias and to present our new ideas in a way that makes them more acceptable. Continue reading “Thinking, Fast and Slow”

Why are plants green?

This is a frequently posed question that has no unique or simple answer. Prof. Lars Olof Björn has written a section on this in his book Photobiology: the science of life and light which is much more detailed than this short post. The problem with this question is that it can mean different things to different people. I will start by separating its different aspects into separate, better-defined questions that are easier to answer: Continue reading “Why are plants green?”

How to Write a Great Research Paper

Abstract

Professor Simon Peyton Jones, Microsoft Research, gives a guest lecture at the University of Cambridge on writing. Seven simple suggestions: don’t wait – write; identify your key idea; tell a story; nail your contributions; put related work at the end; put your readers first; listen to your readers.

http://www.youtube.com/watch?v=g3dkRsTqdDA

via How to Write a Great Research Paper – YouTube.

The colour of light reflected by plants

Plants and many animals can perceive “colours” that are invisible to us. For plants, near-infrared radiation just outside the limit of human vision (called “far-red” by plant biologists) plays a very important role. It is possible to photograph the invisible with modified or special cameras, but also with normal cameras with the help of filters.

Some years ago, in the summer, I took some photographs in Joensuu with the help of my son Tomás. These photographs illustrate the colour of the light reflected by plants. They were taken with a digital camera (Olympus E-510, 50 mm f 1:2 objective) mounted on a tripod on a day with broken clouds in the sky. The FR photograph was taken with an optical “IR” long-pass filter that blocks all visible light and ultraviolet radiation (wavelengths shorter than 720 nm). The RED, BLUE and GREEN filtered images are just the three channels of the camera’s sensor, selected in post-processing from an image taken without an optical filter. These four images are displayed in greyscale. The unfiltered colour image is also included for comparison. [Last edited on 2018-01-06]

Continue reading “The colour of light reflected by plants”