Validation Methods For AI Systems

Our systematic literature review on validation methods for AI systems was recently published in the Journal of Systems and Software. Artificial intelligence (AI) – especially in the form of machine learning (ML) – has steadily, yet firmly, made its way into our lives. Suggestions on what to buy, what to see, what to listen to – all works of AI. Easily available tools make these powerful techniques implementable even for those with little to no knowledge of, or experience with, the implications the intelligent components can have on the systems built around them. This raises the question: how can we trust these systems?

The paper studies the methods used to validate practical AI systems, as reported in the research literature. In other words, we classified and described the methods used to ensure that AI systems with potential or actual use work as intended and designed. The review was based on 90 relevant papers, narrowed down from an initial set of more than 1100 papers. The systems presented in the papers were analysed based on their domain, task, complexity, and applied validation methods. The systems performed 18 different kinds of tasks in 14 different application domains.

The validation methods were synthesized into a taxonomy consisting of different forms of trials and simulations, model-centred approaches, and expert opinions. Trials were further divided into trials in real and mock environments, and simulations could be conducted fully virtually or with hardware – or even the entire system – in the validation loop. With the provided descriptions and examples, the taxonomy can readily serve as a basis for further attempts to synthesize the validation of AI systems, or even for proposing general guidelines on how to validate such systems.

To ensure the dependability of the systems beyond the initial validation – for which we coined the umbrella term “continuous validation” – the papers implemented failure monitors, safety channels, redundancy, voting, and input and output restrictions. These were, however, described to a lesser degree in the papers.

The full paper can be read here (open access): https://doi.org/10.1016/j.jss.2021.111050

Digital world needs architecting more than ever

The slogan of our Computer Science department is “Architects of the digital world” (cs.helsinki.fi). The meaning of this slogan can be understood and approached from different perspectives. In this blog post, we aim to do our part by addressing what architecting the digital world means from the viewpoint of software and, in particular, that of software engineering research, where the term (software) architecture has been partially reserved for a technical interpretation.

The digital world – and software in it – is already all around us. Many devices, such as cars, are almost mobile computers, carrying massive amounts of software onboard to control their behavior; our surroundings, such as homes, are packed with devices that run software, for example to connect to our mobile devices and tell us how they are doing; many if not most services are being digitalized – or have a digital component; and even many of our social interactions take place in the digital world. Increasing digitalization just keeps expanding all of this.

So where are the architecture and architecting in all of the above? Briefly, in the blueprint that defines the DNA of the system as a whole. On a small scale, software is code created by software developers with the assistance of software-based tools, frameworks, cloud environments, and so on. However, when things get big, complex, integrated, interoperable, or critical, conscious architecting is needed to give the whole a reasonable shape and to ensure it has the desired properties. Often referred to as architecturally significant, such properties may include the protection of privacy, availability of services, modifiability of the software for different uses, fault tolerance, and energy efficiency, to mention a few. Furthermore, as modern systems and their software evolve throughout their lifespans, only future-proof architectural designs will keep the software systems sustainable. In software engineering research and practice, these are the kinds of concerns that architects and architecture design methods aim to address.

To elaborate, architecting means, first of all, understanding what is expected of the digital world. Designing and maintaining the complex, large, and often interconnected software systems of the digital world demands robust architecting. Architecting is then about making the important design decisions that give the digital world its shape and that answer the question of how to position the load-bearing structures. Architecting is also about risk management for those parts of the digital world that have not yet been realised. When planning something that does not yet exist, an architect needs to envision technical solutions with acceptably low feasibility risk for meeting the expectations over the entire life cycle.

It seems inevitable that the future digital world will start shaping itself with little overall control or planning, as more and more systems are constructed that depend on each other in their operations. In this new context, the role of software engineering research is to try to understand where things may start cracking and crumbling – and then see what architectural solutions would be needed to solidify the digital world. This naturally requires understanding the expectations society sets for the digital world. In addition to the expected needs, the digital world sets constraints on software systems. For example, as more and more regulated software systems become connected, the need for architecture work becomes more pressing. Architecture becomes the essential communication tool that enables risk managers and regulatory affairs professionals to effectively perform their duties in the areas of safety, cybersecurity, and privacy.

To summarize: the architecturally significant expectations – those that are fundamental to the design, hard to change if wrong, and that enable or prevent the expected properties of the digital world – include, for example:

  • When and how should the software protect the privacy of personal data, particularly when pieces of data are combined from distinct sources?
  • How to make public-sector software systems interoperable, open, and cost-efficient – with some open and crowdsourced architecting?
  • How to control the risks related to systems critical to society or individual human life, taking into account their various intertwined dependencies (e.g., banking systems currently used also for general-purpose user identification purposes)?
  • How to take into account the sustainability goals of the future digital world (e.g., electricity consumption)?
  • How to design software systems in which artificial intelligence, including machine learning, can be embedded in an understandable, trustworthy, transparent, and testable manner? 

This is where the architects of the digital world are needed – to help make the architectural decisions regarding the software that runs the world. This is not only about technical issues and possibilities but also about understanding the contexts of use, making decisions about policies that constrain technical solutions, creating incentives for collaboration for the common benefit, and ensuring that the voice of minorities is also heard. It is therefore in the interest of society that the architectural decisions of important services and systems of the digital world, and their justifications, are transparent and open to scrutiny.

As a research group in software engineering, and architectural topics in particular, these are the issues we envision encountering on our journey as we live up to the slogan of the department: “Architects of the digital world”.

– the ESE Team

Nordic Agile Survey 2020

We have continued our agile survey research with Nitor, started in 2018 (see Agile Now in Finland Continued):

  • In November 2020, we presented further results of the two survey rounds conducted in 2018 and 2019: ICSOB 2020.
  • In December 2020, in collaboration with Karlstad University, we finished another survey round targeted at respondents in Finland and Sweden. We received a substantial set of responses, and we are currently working on the analysis to publish the research results. The questionnaire included items about the impacts of the current pandemic situation and its potential relation to the agility of companies.

 

Collective Executions: Coordinating Joint Device Behavior in the Fog

This is a re-post from JSS Editor’s Selection on Medium. Read the original post here.

The contemporary computing environment has shifted from single-node computing to increasingly distributed computing as the number of interconnected devices has grown. Despite this shift in the computing paradigm, the way we think about software has not dramatically changed over time: a software application is still considered to be executed by a single entity at a time — one entity possibly coordinating the others.

The interconnected entities can together be considered as the runtime environment.

With our novel concept of Collective Executions, we have taken the first steps towards a new kind of computing paradigm and programming model in which multiple devices dynamically form the runtime environment and execute the same piece of software at the same time. The devices — as well as other entities, like humans and services — are assigned to certain roles in the execution. Underneath, a novel framework forms this runtime and enables such collective software execution by taking care of things like state synchronization and device coordination.

Coalescence and disintegration — a novel way of thinking about the application state

At the very core of this model are the concepts of coalescence and disintegration. In physics, these concepts describe the phenomena in which two water drops merge into each other or split apart. The analogy in our approach is that the software instances running on different entities can likewise be merged into one instance. Then the state synchronization between the newly joined devices begins. Similarly, when disintegration takes place, the state synchronization ends.

Coalescence and disintegration take place when the distance between the executing entities meets certain thresholds in an n-dimensional space. For example, the distance can be physical — like the proximity of other devices executing the same application — or it can be, for instance, social distance: family members, friends, or people having some other social relationship may cause the application instances to coalesce and jointly start executing the same application.

Figure 1: Coalescence and Disintegration of Collective Executions

The distance can also be measured in the virtual world, where today common interests connect millions of people: people liking the same video game trailers could then jointly start a multiplayer game.

More formally, we define the distance between two instances of a collective execution as

d(p, q) = max_i dist_i(p, q),

where p and q are instances of the Collective Execution E and dist_i is their mutual distance along dimension i. As the value of the distance is the maximum over the dimensions, it is the so-called Chebyshev distance between the instances.
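As a concrete illustration, the Chebyshev distance over a few hypothetical dimensions can be computed as follows (the dimension names and threshold value below are ours, chosen purely for illustration):

```python
def chebyshev_distance(p, q):
    """Maximum per-dimension distance between two execution instances."""
    return max(abs(p[dim] - q[dim]) for dim in p)

# Hypothetical instances positioned along physical, social, and virtual dimensions
p = {"physical": 0.2, "social": 0.1, "virtual": 0.5}
q = {"physical": 0.4, "social": 0.3, "virtual": 0.6}

d = chebyshev_distance(p, q)     # max(0.2, 0.2, 0.1) ≈ 0.2
COALESCENCE_THRESHOLD = 0.25     # assumed threshold for illustration
should_coalesce = d <= COALESCENCE_THRESHOLD
```

When the distance drops below the threshold along every dimension, the instances are allowed to coalesce; when it grows past the threshold again, they disintegrate.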

An Example Application

To illustrate how Collective Executions work in practice, let’s consider the following PhotoSharing example: a group of people has gathered together. The group consists of friends and family members, so the distance in the social world is short. This can act as a trigger that enables their PhotoSharing executions to coalesce.

From the user’s perspective, the idea of PhotoSharing is to initiate discussion among the people who have gathered together. Hence, while PhotoSharing is running, it also notices that the physical distance is under the threshold value used to proactively suggest viewing their most recent, previously unshared photos related to a specific topic — recent trips abroad or family life, for instance (common interests; distance in the virtual world). The Collective Execution then forms the state by synchronizing the unseen photos and the currently shown photo among the participating devices. One device at a time acts as the coordinator and directs the others to show a certain photo. The other devices act as screens, showing the same selected photo at the same time. This is depicted in Figure 2.

Figure 2: PhotoSharing Collective Execution coordinating entities.
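In code, the role assignment in the example above could be sketched roughly as follows. This is a hypothetical API of our own making; the actual framework described in the papers handles coordination and state synchronization underneath:

```python
class PhotoSharingExecution:
    """Sketch of a Collective Execution with coordinator and screen roles."""

    def __init__(self, devices):
        # One device at a time acts as the coordinator; the rest act as screens.
        self.coordinator, *self.screens = devices
        self.state = {"current_photo": None}   # state synchronized across devices

    def show_photo(self, photo):
        # The coordinator updates the shared state ...
        self.state["current_photo"] = photo
        # ... and every screen renders the same photo at the same time.
        return {screen: self.state["current_photo"] for screen in self.screens}

execution = PhotoSharingExecution(["alice-phone", "bob-phone", "carol-tablet"])
shown = execution.show_photo("trip-photo-01.jpg")
# Every screen now shows "trip-photo-01.jpg", as coordinated by alice-phone.
```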

Future Work

It is obvious that there are numerous privacy and security concerns in such a proactive approach. Hence, plenty of research would be required, for instance, to make sure that no personal and intimate data leaks into the wrong hands. Also, the user-experience perspective on security should not be overlooked, to make sure that users a) understand how and by whom their data is being used, and b) are not burdened with constant interruptions regarding data usage or the proactive invocation of activities.

Acknowledgements

This blog post is based on our articles in JSS and IEEE Software, which we at the Department of Computer Science, University of Helsinki, Finland have co-authored with our colleagues at the Departments of Electronics and Communications Engineering and Pervasive Computing, Tampere University, Finland.

References

Mäkitalo, N., Aaltonen, T., Raatikainen, M., Ometov, A., Andreev, S., Koucheryavy, Y., & Mikkonen, T. (2019). Action-Oriented Programming Model: Collective Executions and Interactions in the Fog. Journal of Systems and Software, 157, 110391. DOI: https://doi.org/10.1016/j.jss.2019.110391

Mäkitalo, N., Ometov, A., Kannisto, J., Andreev, S., Koucheryavy, Y., & Mikkonen, T. (2017). Safe, secure executions at the network edge: coordinating cloud, edge, and fog computing. IEEE Software, 35(1), 30–37. DOI: http://doi.org/10.1109/MS.2017.4541037

 

Notes on blockchains

Originally published in the 4APIs project blog on the 24th of August 2020, https://4apis.fi/blog/

The best-known applications of blockchain technology are cryptocurrencies, but there is considerable interest in applying blockchains as a data storage method in various fields. Blockchains can be used to record transactions in a reliable, secure and immutable manner. Transactions are saved to linked blocks that form a digital, cryptographically secured ledger. Each party, or node, in a blockchain maintains a copy of this immutable ledger. Consensus among the parties is achieved by using, for example, proof-of-work.
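The core idea of hash-linked blocks and proof-of-work can be illustrated with a toy sketch. Real blockchains use Merkle trees, networked consensus, and vastly higher mining difficulty; this is only a conceptual illustration:

```python
import hashlib
import json

def block_hash(block):
    """SHA-256 over the block's serialized contents."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def mine(block, difficulty=2):
    """Toy proof-of-work: find a nonce so the hash starts with `difficulty` zeros."""
    block = dict(block, nonce=0)
    while not block_hash(block).startswith("0" * difficulty):
        block["nonce"] += 1
    return block

genesis = mine({"index": 0, "transactions": ["genesis"], "prev_hash": "0" * 64})
block1 = mine({"index": 1, "transactions": ["alice -> bob: 5"],
               "prev_hash": block_hash(genesis)})

# Tampering with an earlier block changes its hash and breaks the link to its
# successor, which is what makes the recorded history effectively immutable.
tampered = dict(genesis, transactions=["genesis", "forged tx"])
assert block1["prev_hash"] != block_hash(tampered)
```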

Blockchains can be permissionless or permissioned. Permissionless, public blockchain systems allow anyone to join the blockchain, whereas permissioned, private blockchain systems use membership control to allow only identified parties to join. In public blockchains, anyone can read or write data, but while reading is free, writing to a blockchain requires paying a fee in cryptocurrency. The fee will be paid to a miner who first completes the proof-of-work to secure a new block containing the transaction data. In private blockchains, the owner of the blockchain can decide on the transaction fees.

Blockchains can be used to eliminate the need for trust among the parties sending transactions to each other. All transactions are visible in the distributed ledger, and tampering with the transaction history would require a malicious party to take control of the majority (51%) of the blockchain network’s mining hash rate.

Ethereum [1] is the most popular permissionless blockchain that allows writing smart contracts on the blockchain. Smart contracts consist of contract or business logic installed on the blockchain system. Parties in the blockchain can execute smart contracts to create transactions that are then validated by other parties and saved to the blockchain.

There are several platforms for building permissioned, private blockchains. Some of the most widely used are Hyperledger [2] and R3’s Corda [3]. Private blockchains are meant to allow saving sensitive data to a blockchain so that only selected parties are able to view it. However, it is also possible to save encrypted, private data to a public blockchain.

Since public blockchains use computationally more expensive consensus protocols and have more nodes, private blockchains can potentially offer better scalability and faster transactions. However, private blockchains are not truly decentralized and, for example, Hyperledger Fabric was found at least in 2019 to have issues with network delays causing desynchronization in the blockchain [4].

There has been a lot of interest in the possibilities of blockchain technology, and hopes of revolutions in many different areas such as finance and Internet of Things (IoT). Blockchains can provide secure ways to manage confidential data and identity information, and thus provide potential use cases also in health care.

However, so far there are only a few fully operational blockchains besides systems related to cryptocurrencies, and Bitcoin [5] remains the most successful real-world application of blockchain technology.

Ethereum was the first blockchain to support the implementation of smart contracts, which enable building decentralized applications (dapps) on the Ethereum blockchain. There are various potential use cases for dapps, and plenty of tutorials on dapp development are available online. Despite this, most dapps have practically no users or transactions. On 9 June 2020, DappRadar [6] listed over 1880 Ethereum dapps, but only 330 of them had had at least one user during the previous seven days. All of the top ten dapps (based on user count during the previous seven days), save one, were related to money exchange, high-risk investments, and decentralized finance. There are some gaming applications on Ethereum (e.g. CryptoKitties, My Crypto Heroes), but most dapps appear to be related to finance and gambling.

In Deloitte’s Global Blockchain Survey 2019, 53% of organizations saw blockchains as critical and among their top five priorities [7]. In the same survey, one of the top five “organizational barriers to greater investment in blockchain technology” was the lack of in-house capabilities. As the need for blockchain professionals is likely to grow in the future, a project has been launched in Finland to provide education on blockchain technology in universities [8].

Supply chains are one area where the use of blockchain technology can potentially streamline the process and reduce paperwork besides creating a transparent, immutable record of the product history. However, Gartner’s report (2019) estimates that “contrary to initial market hype and for the time being, blockchain is not enabling a major digital business revolution, and may not enable one until at least 2028” [9]. This is due to several factors that currently make it challenging for organisations to adopt blockchain technology.

In a blockchain system, there is overhead from replicating the data. For example, in Ethereum, users that host a full node need approximately 180 GB of disk space [10]. However, not all users need to run a full node, as there are also light nodes that store only the block headers and are able to request other information from full nodes. Light nodes can be used, for example, in mobile phones or embedded devices.

For instance, Peker et al. (2020) studied the cost of saving IoT sensor data to the Ethereum blockchain. In their experiment, the cost of storing 6000 data points (256-bit integers) was approximately 335–467 US dollars, depending on the method used [11]. According to other informal estimates, the cost of storing 1 kB of data on the Ethereum blockchain would have been approximately 1.6 US dollars in 2018, and the storage of 1 GB over 1.6 million US dollars. According to Kumar et al. (2020), “the cost of storage on a public blockchain platform can be staggering, a few thousand times higher than on a distributed database system or in the cloud. On a permissioned blockchain system, the cost is likely to be less but still one or two orders of magnitude higher.” [12]
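The two informal per-kB and per-GB estimates above are consistent with each other, as a quick back-of-the-envelope check shows (assuming decimal units, i.e. 1 GB = 10^6 kB; computed in cents to avoid rounding):

```python
cost_per_kb_cents = 160           # ~1.6 USD per kB, the informal 2018 estimate
kb_per_gb = 1_000_000             # decimal units: 1 GB = 10^6 kB
cost_per_gb_usd = cost_per_kb_cents * kb_per_gb // 100
print(cost_per_gb_usd)            # 1600000, i.e. 1.6 million US dollars per GB
```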

Due to mining being computationally expensive, public blockchain systems consume more energy than regular distributed databases. Bitcoin mining is notoriously energy-consuming, and sustainability issues are one area where more research is needed. The popularity of Bitcoin and other cryptocurrencies has also led to various scams and, for example, infected websites harnessing visitors’ computers to mine cryptocurrency.

In their article Kumar et al. (2020) suggest that “blockchain technology should be deployed selectively, mainly for interorganizational transactions among untrusted parties, and in applications that need high levels of provenance and visibility.” For example, tracking the origin and shipping of precious gemstones or other expensive or critical commodities is one area where blockchain systems have been trialled. Regarding supply chains, major challenges for using blockchains include creating legislation and standards all parties can agree on, and getting everyone to use blockchain technology despite additional costs. Joining a consortium is usually necessary to properly utilize blockchains.

To conclude, as blockchains are today a high-cost and high-overhead storage method, careful consideration is needed to determine the proper use cases. A decision should also be made on what data to store in the blockchain, as it might be feasible to store only the most critical parts of the data to save resources. Blockchain technology is being piloted in various fields, and in the future, blockchains are likely to be utilized on a much wider scale.

References and recommended reading

[1] https://ethereum.org/en/

[2] https://www.hyperledger.org/

[3] https://www.r3.com/corda-platform/

[4] Nguyen, T.S.L., Jourjon, G., Potop-Butucaru, M. & Thai, K.L. 2019, “Impact of network delays on Hyperledger Fabric”, INFOCOM 2019 – IEEE Conference on Computer Communications Workshops, INFOCOM WORKSHOPS 2019, pp. 222-227.

[5] https://bitcoin.org/en/

[6] https://dappradar.com/rankings/protocol/eth

[7] Deloitte’s Global Blockchain survey 2019

[8] https://www.eura2014.fi/rrtiepa/projekti.php?projektikoodi=S22027

[9] Gartner 2019: Blockchain Unraveled: Determining Its Suitability for Your Organization https://www.gartner.com/en/doc/3913807-blockchain-unraveled-determining-its-suitability-for-your-organization

[10] https://medium.com/@marcandrdumas/are-ethereum-full-nodes-really-full-an-experiment-b77acd086ca7

[11] Peker, Y.K., Rodriguez, X., Ericsson, J., Lee, S.J. & Perez, A.J. 2020, “A cost analysis of internet of things sensor data storage on blockchain via smart contracts”, Electronics (Switzerland), vol. 9, no. 2.

[12] Kumar, A., Liu, R., Shan, Z. 2020, “Is Blockchain a Silver Bullet for Supply Chain Management? Technical Challenges and Research Opportunities”, Decision Sciences 51 (1), pp. 8-37.

 

ICWE’20 Successfully Completed

ICWE’20 live from Helsinki, Finland, June 9-12, 2020

June 9-12, 2020 were exciting days for Aalto University, Metropolia University of Applied Sciences, Tampere University, and the University of Helsinki. We jointly hosted the 20th International Conference on Web Engineering (ICWE’20, https://icwe2020.webengineering.org/). The conference is one of the key events for the Web Engineering community, and this year it included one day of workshops, tutorials, and a PhD symposium, followed by three days of the main event.

When we first promoted the event in Daejeon, South Korea, in June 2019, the world was a very different place. We were worried about things such as the venue (downtown, easily accessible, with nice places to visit nearby), getting to Finland (Finnair has a great network for entering Europe from various places), and the social program (Helsinki has great restaurants and an option to go to a sauna by the seaside, downtown). However, during the spring things took a different turn, and we had to abandon virtually all of the plans overnight, leaving us two options: either to postpone the event or to go completely online.

 

We decided to go for the latter — what would have been more natural for a community of web engineering researchers and practitioners? This meant that the website became the portal for just about everything related to the conference, including links to live sessions, video presentations, slides, Slack channels, and so on. Despite our worries while organising things, it turned out that it is indeed feasible to organize an international conference completely online. Here are a couple of ideas we adopted that worked well.

  • Staying focused online is harder than staying focused when someone is physically present and presenting. To allow the participants to familiarize themselves with the topics, we collected video presentations of the accepted papers and put them online well before the event.
  • To further focus on the essentials when online, we changed the format of the sessions. Usually, ICWE has had 1.5-hour sessions with three 30-minute presentations, each consisting of a 20-25-minute talk and 5-10 minutes of questions. We shortened the sessions to 60 minutes and the presentations to 20 minutes. Since full videos of the presentations were available for viewing beforehand, the format was a 5-minute recap by the author(s), followed by 15 minutes of interaction, questions, comments, and so on. This worked surprisingly well and, given that communication is slower online than on premises, provided a nice opportunity to interact with the presenters.
  • Keynotes were streamed live to maintain excitement throughout the presentations. The three keynotes, presented by David Bryant (Fellow in Emerging Technologies, Mozilla), Prof. Dr. Olaf Zimmermann (University of Applied Sciences of Eastern Switzerland), and Jaakko Lempinen (Head of AI, YLE), all shared a number of experiences and inspired lively discussion afterwards.
  • Every session in the conference had a Slack channel where topics related to that session could be discussed. Many session chairs used the channel to discuss practicalities with the presenters, and the audience used it to raise questions. This would definitely work at a physical conference, too.

While the event did not require any physical traveling, this is not to say that it was an easy four-day activity. On the contrary, the event was intense, and the intensity of the participants was clearly visible, at least when they had their cameras on. Having the event completely online also meant taking into account details such as time zones when composing the program, which added a layer of complexity to the planning of the event.

Goodbye all you ICWE’20 delegates, and thanks to those of you who helped us to organize the event. It was great to have you all in Finland, even if only remotely.

Kari^2, Markku, Niko and Tommi

 

Discontinuity and Continuous X Within Software Development

Many – if not all – software organizations are currently faced with extraordinary circumstances and highly uncertain business conditions. Hardly any “business as usual” exists. Some of the discontinuities may even become the “new normal”. In these discontinuous times, it is especially apt to consider what continuous activities and capabilities relate to modern software creation and production.

Continuous delivery and continuous deployment (CD) are nowadays mainstream practices in modern software engineering. Such practices coupled with efficient infrastructures make it possible to develop and maintain software systems frequently based on the current feedback and usage conditions. Continuous integration (CI) supports that way of working.

Continuous experimentation facilitates software product creation by reducing uncertainties with systematic experiments (cf. here). Consequently, the more uncertainties the software product faces, the more useful such experimental development approaches with continuous learning may be.

Advancing from and building on the aforementioned developmental capabilities, continuous innovation integrates continuous learning, improvement, and innovation. The continuity of innovation activities and related business processes is especially important in volatile and fast-moving environments, where stable states may not prevail for long and disruptions may blur or even reposition industry boundaries.

We have recently investigated continuous innovation in an industrial case study (see https://doi.org/10.1007/978-3-030-33742-1_13). ICT use may improve organization-wide ideation and the subsequent innovation-process activities by making key information transparent and ubiquitously accessible to all stakeholders. This enables every employee to continuously engage in and contribute to idea generation, development, and validation. Ideally, the knowledge and creative potential of the entire organization is utilized at critical times.

 

IVVES project on the testing of machine learning systems starting

Last week, Business Finland decided to fund our three-year IVVES project (Industrial-grade Verification and Validation of Evolving Systems, https://ivves.weebly.com/). We can now significantly extend our research efforts on the testing, continuous development, and maintenance of machine learning systems together with our partners in Finland, the Netherlands, Sweden, and Canada. We are also planning to set up an interest group for Finnish companies interested in the project. The University of Helsinki’s work in the project is jointly headed by Prof. Tommi Mikkonen and Prof. Jukka K. Nurminen.

Open Source Software Framework for Data Fault Injection to Test Machine Learning Systems

dpEmu is our Python library for emulating data problems in the use and training of machine learning systems. It provides tools for injecting errors into data, running machine learning models with different error parameters, and visualizing the results.

Data-intensive systems are sensitive to the quality of data. Data often has problems due to, for instance, faulty sensors or network issues. The dpEmu framework can emulate faults in data and thereby be used to study how machine learning (ML) systems behave when the data has problems. The Python framework aims for flexibility: users can use predefined fault models or their own dedicated ones. Likewise, different kinds of data (e.g. text, time series, video) can be used, and the system under test can vary from a single ML model to a complicated software system.
The software and a set of Jupyter notebooks illustrating different use cases are available at https://github.com/dpEmu/dpEmu
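The basic idea of data fault injection can be sketched generically as follows. This is our simplified illustration of the concept, not dpEmu's actual API:

```python
import random

def inject_missing(values, fault_prob, seed=0):
    """Randomly replace values with None to emulate, e.g., dropped sensor readings."""
    rng = random.Random(seed)
    return [None if rng.random() < fault_prob else v for v in values]

clean = list(range(100))
for p in (0.0, 0.1, 0.5):
    faulty = inject_missing(clean, fault_prob=p)
    n_missing = sum(v is None for v in faulty)
    # The system under test would now be run on `faulty`, and its quality
    # metric recorded as a function of the fault parameter p.
```

Sweeping the fault parameter and plotting the model's quality metric against it is the core experiment dpEmu automates.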
We just presented the work at the ISSRE conference: Jukka K. Nurminen, Tuomas Halvari, Juha Harviainen, Juha Mylläri, Antti Röyskö, Juuso Silvennoinen, and Tommi Mikkonen. “Software Framework for Data Fault Injection to Test Machine Learning Systems”. 4th IEEE International Workshop on Reliability and Security Data Analysis (RSDA 2019) at the 30th Annual IEEE International Symposium on Software Reliability Engineering (ISSRE 2019), Berlin, Germany, October 2019.