Today there was an interesting article in Helsingin Sanomat.

http://www.hs.fi/ulkomaat/a1415168967244

The book by Mary Somerville that is mentioned in the article is in the public domain and available, like many other old and interesting books, through the Internet Archive.

https://archive.org/details/onconnexionphys00somegoog

I was particularly impressed by the preface written in the 1830s:

[Image: Mary Somerville]

In one of the comments in HS a reader writes that programming is boring… I disagree: just coding may be boring, but designing software and algorithms is anything but boring!

Another reader correctly says that Ada is not a supercomputer language. In a way it was meant to be when it was designed, but in the sense of being a tool for creating reliable and bug-free software. However, like some other languages designed by a committee, it ended up being too complex and inconsistent, and because of this difficult to use as a general-purpose language.

Series title: Statistics: Changes since I was an undergrad

Abstract. I took my first course in Statistics 37 years ago. How we do statistics has changed dramatically since then. The amount of data we produce and analyse has also increased enormously. However, different research communities are making use of these new possibilities to very different extents. Even Biology curricula at different universities differ substantially in their emphasis on “quantitative” methods and “numerical/mathematical” literacy. All branches of Biology are becoming more quantitative. By reflecting on how advances in computers and computing science (and in the methods these advances have made possible) have opened a whole new way of approaching data analysis, I hope I will make you rethink your approach to data analysis.

If you are planning to participate next year in my course 526052 Using R for reproducible data analysis, I recommend that you attend this series of talks as a gentle introduction to the subject. If you attend you can get credits, either through the DDPS or in the regular seminar series in Plant Biology.

Place: Biocenter 3, Room 5405 (5th floor, the room in front of the stairs)

Two hours are reserved for talk plus discussion.

Please let me know by e-mail if you intend to participate.

Part 1: Increased ease of computation

Monday 17 November, 10:15-12:00

This first talk focuses on describing the advances in computing hardware and software and why they are relevant to data analysis. I will also briefly mention the now fashionable “Data Science” and “Big Data” concepts and the currently fuzzy boundary between statistics and programming.

Part 2: Advances in theory and methods

Tuesday 18 November, 13:15-15:00

If your statistical knowledge is limited to the “traditional” methods, I hope to introduce you to the new possibilities brought about by lifting the limitations that used to prevent us from using computation-intensive methods and analysing big data sets. In contrast, if you are a young researcher, well versed in modern methods, you will hopefully still find my talk interesting from a historical perspective, as it gives a glimpse of the limitations we had to deal with in the recent past, and of how they influenced, and still influence, the traditional ways of treating biological data. This talk focuses mostly on statistical theory and methods. However, no specialised methods like those used in molecular biology or vegetation analysis will be described in this talk.

Part 3: Examples of modern methods using R

Wednesday 19 November, 13:15-15:00

In this talk I will present some examples of types of analyses that have become available to any biologist thanks to the increase in computing capacity and the development of new theory and methods that make use of these new possibilities. The aim is not to teach you how to apply these methods, but instead to give an idea of how broad an array of methods is currently available to anyone with access to a run-of-the-mill personal computer or, failing this, a cheap cloud server.

Part 4: Reproducible research and data analysis

Thursday 20 November, 13:15-15:00

This talk introduces the currently hot topic of research accountability and repeatability. I will discuss why this openness is needed, how it can be achieved in practice, and how modern software and modern combinations of old software make it possible to achieve this goal rather painlessly even for complex data analyses. I will also reflect on the origins of these ideas in computer programming, in particular the concept of literate programming proposed by Donald Knuth in the early 1980s.

A Blog for Plant Science Students

A scrapbook of thoughts on plants, methods, careers and science in general

Author Archives: Pedro J Aphalo

The Countess of Lovelace wrote a computer program, and Mary Somerville guided her (Helsingin Sanomat, 7.11.2014)

A series of four seminars on Statistics (week 47)
