Misbehaviour and fault tolerance in machine learning systems

Our interview paper on misbehaviour and fault tolerance in machine learning systems was recently published in the Journal of Systems and Software. Machine learning (ML) provides us with numerous opportunities by allowing ML systems to adapt to new situations and contexts. At the same time, this adaptability raises uncertainties concerning the run-time product quality and dependability of these systems, such as their reliability and security. Systems can be tested and monitored, but this alone does not protect against faults and failures in the adapted ML systems themselves.

We studied software designs that aim to introduce fault tolerance into ML systems so that possible problems in the ML components of the systems can be avoided. The research was conducted as a case study, and its data was collected through five semi-structured interviews with experienced software architects.

We present a conceptualisation of the misbehaviour of ML systems, the perceived role of fault tolerance, and the designs used. The problems in the systems arise from problems in the inputs, concept drift, bugs and inaccuracies in the models, faulty deployment of the models, and not really understanding what the utilised model does. Common patterns for incorporating ML components into designs in a fault-tolerant fashion have started to emerge. ML models are, for example, guarded by monitoring the inputs and their distribution and by enforcing business rules on acceptable outputs. Multiple, specialised ML models are used to adapt to the variations and changes in the surrounding world, and simpler fail-over techniques, such as default outputs, are put in place to keep systems up and running in the face of problems.
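To make the guarding pattern concrete, here is a minimal sketch in Python, assuming a hypothetical pricing model with a predict method; the input range, price cap, and default output are illustrative placeholders, not designs reported by the interviewees:

```python
# A minimal sketch of guarding an ML component, not code from the paper.
# The model interface, expected input range, business rule (price cap),
# and default output are all hypothetical placeholders.

def guarded_predict(model, feature, feature_range=(0.0, 1.0),
                    price_cap=100.0, default_output=10.0):
    """Run a hypothetical pricing model behind input and output guards."""
    low, high = feature_range
    # Input guard: reject inputs outside the range seen in the training data.
    if not (low <= feature <= high):
        return default_output
    try:
        prediction = model.predict(feature)
    except Exception:
        # Fail-over: a default output keeps the system running if the model fails.
        return default_output
    # Output guard: enforce a business rule on acceptable outputs.
    return min(prediction, price_cap)
```

In this sketch, the same default output doubles as the fail-over mechanism for both unexpected inputs and model failures, mirroring the simple keep-the-system-running approach described above.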

However, the general role of these patterns is not widely acknowledged. This is mainly due to the relative immaturity of using ML as part of a complete software system: beyond model training, the field still lacks established frameworks and practices for implementing, operating, and maintaining software that utilises ML. ML software engineering needs further analysis and development on all fronts.

The full paper can be read here (open access): https://doi.org/10.1016/j.jss.2021.111096

Validation Methods For AI Systems

Our systematic literature review on validation methods for AI systems was recently published in the Journal of Systems and Software. Artificial intelligence (AI) – especially in the form of machine learning (ML) – has steadily, yet firmly, made its way into our lives. Suggestions on what to buy, what to see, and what to listen to – all works of AI. Easily available tools make these powerful techniques implementable even for those with little to no knowledge of, or experience with, the implications the intelligent components can have on the systems built around them. This raises the question: how can we trust these systems?

The paper studies the methods used to validate practical AI systems, as reported in the research literature. In other words, we classified and described the methods used to ensure that AI systems with potential and/or actual use work as intended and designed. The review was based on 90 relevant papers, narrowed down from an initial set of more than 1100 papers. The systems presented in the papers were analysed based on their domain, task, complexity, and applied validation methods. The systems performed 18 different kinds of tasks in 14 different application domains.

The validation methods were synthesized into a taxonomy consisting of different forms of trials and simulations, model-centred approaches, and expert opinions. Trials were further divided into trials in real and in mock environments, and simulations could be conducted fully virtually or with hardware, or even the entire system, in the validation loop. With the provided descriptions and examples, the taxonomy can readily be used as a basis for further attempts to synthesize the validation of AI systems, or even to propose general ideas on how to validate such systems.
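For readers who prefer to see the shape of the taxonomy at a glance, the top-level categories and the subdivisions mentioned above can be sketched as a simple Python structure; this is only an outline of the classification, and any finer-grained subcategories from the paper are omitted here:

```python
# An outline of the taxonomy's structure as described in the text above;
# subcategories not mentioned in the text are omitted rather than invented.
VALIDATION_TAXONOMY = {
    "trials": ["real environment", "mock environment"],
    "simulations": ["fully virtual", "hardware in the loop", "system in the loop"],
    "model-centred approaches": [],
    "expert opinions": [],
}
```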

To ensure the dependability of the systems beyond the initial validation – for which we use the umbrella term “continuous validation” – the papers implemented failure monitors, safety channels, redundancy, voting, and input and output restrictions. These mechanisms were, however, described to a lesser degree in the papers.
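As an illustration of how some of these mechanisms can fit together, the sketch below combines redundancy, voting, and an output restriction with a safe default; the models, the allowed label set, and the default are hypothetical placeholders rather than a design taken from any of the reviewed papers:

```python
# Illustrative sketch only: majority voting over redundant models with a
# simple output restriction and a safe default for graceful degradation.
from collections import Counter

def vote_with_restrictions(models, features, allowed_labels, safe_default):
    """Return the majority label of redundant models, falling back to a
    safe default when no allowed label wins a clear majority."""
    votes = []
    for model in models:
        try:
            label = model.predict(features)
        except Exception:
            continue  # a failure monitor could also log and flag this model
        if label in allowed_labels:  # output restriction
            votes.append(label)
    if not votes:
        return safe_default  # safety channel: degrade gracefully
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(models) / 2 else safe_default
```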

The full paper can be read here (open access): https://doi.org/10.1016/j.jss.2021.111050

A good start for the Autumn semester – 4 papers accepted to Profes 2016

Profes, an International Conference on Product-Focused Software Process Improvement, is among the top recognized software development and process improvement conferences. Profes 2016, which will be held in Trondheim, Norway, on November 22-24th, 2016, had tough competition, as close to 80 submissions were received. Thus, at ESE we are glad to announce the acceptance of four papers. The papers with the following titles and author lists were accepted and will be made available here once they have been published:

  • Transitioning Towards Continuous Experimentation in a Large Software Product and Service Development Organization – A Case Study: Sezin Gizem Yaman, Fabian Fagerholm, Myriam Munezero, Jürgen Münch, Mika Aaltola, Christina Palmu and Tomi Männistö
  • Towards Continuous Customer Satisfaction and Experience Management: A Measurement Framework Design Case in Wireless B2B Industry: Petri Kettunen, Mikko Ämmälä, Tanja Sauvola, Susanna Teppola, Jari Partanen, Simo Rontti
  • Supporting management of hybrid OSS communities – A stakeholder analysis approach: Hanna Mäenpää, Tero Kojo, Myriam Munezero, Terhi Kilamo, Fabian Fagerholm, Mikko Nurmela, Tomi Männistö
  • DevOps Adoption Benefits and Challenges in Practice: A Case Study: Leah Riungu-Kalliosaari, Simo Mäkinen, Lucy Ellen Lwakatare, Juha Tiihonen and Tomi Männistö

Congratulations to all the authors!