Nordic Agile Survey 2020

We have continued the agile survey research that we started with Nitor in 2018 (see Agile Now in Finland Continued):

  • In November 2020, we presented further results of the two survey rounds conducted in 2018 and 2019: ICSOB 2020.
  • In December 2020, in collaboration with Karlstad University, we finished another survey round targeted at respondents in Finland and Sweden. We received a substantial set of responses, and we are currently analysing them in order to publish the research results. The questionnaire included items about the impacts of the ongoing pandemic and their potential relation to the agility of companies.

 

Collective Executions: Coordinating Joint Device Behavior in the Fog

This is a re-post from JSS Editor’s Selection on Medium. Read the original post here.

The contemporary computing environment has shifted from single-node computing to increasingly distributed computing as the number of interconnected devices has grown. Despite this shift in the computing paradigm, the way we think about software has not changed dramatically over time: a software application is still considered to be executed by a single entity at a time, with one entity possibly coordinating the others.

The interconnected entities can together be considered as the runtime environment.

With our novel concept of Collective Executions, we have taken the first steps towards a new kind of computing paradigm and programming model in which multiple devices dynamically form the runtime environment and execute the same piece of software at the same time. The devices, as well as other entities such as humans and services, are then assigned certain roles in the execution. Underneath, a novel framework forms this runtime and enables collective software execution by handling matters such as state synchronization and device coordination.

Coalescence and disintegration: a novel way of thinking about the application state

At the very core of this model are the concepts of coalescence and disintegration. In physics, these concepts describe the phenomena where two water drops merge into each other, or split. The analogy in our approach is that the software instances running on different entities can likewise be merged into one instance. The state synchronization between the newly joined devices then begins. Similarly, when disintegration takes place, the state synchronization ends.

Coalescence and disintegration take place when the distance between the executing entities meets certain thresholds in n-dimensional space. The distance can be physical, such as the proximity of other devices executing the same application, or it can be, for instance, social: family members, friends or people with some other social relationship may cause the application instances to coalesce and jointly start executing the same application.

Figure 1: Coalescence and Disintegration of Collective Executions

The distance can also be measured in the virtual world, where common interests today connect millions of people: people who like the same video game trailers could then jointly start a multiplayer game.

More formally, we define the distance for two collective execution instances:

$$D_E(p, q) = \max_i \, \mathrm{dist}_i(p, q)$$

where p and q are instances of the Collective Execution E and dist_i is their mutual distance along dimension i. Because the overall distance is the maximum over all dimensions, it is the so-called Chebyshev distance between the instances.
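As a rough illustration of the distance definition, the following Python sketch computes the Chebyshev distance from a set of per-dimension distances and uses it to decide whether two instances could coalesce. The dimension names, the normalisation of each distance to the range 0 to 1, and the single shared threshold are assumptions made for this example; they are not taken from the original articles.

```python
from typing import Mapping


def chebyshev_distance(per_dimension: Mapping[str, float]) -> float:
    """Return the largest per-dimension distance between two instances."""
    if not per_dimension:
        raise ValueError("at least one dimension is required")
    return max(per_dimension.values())


def should_coalesce(per_dimension: Mapping[str, float], threshold: float) -> bool:
    """Hypothetical rule: coalesce when the overall distance is within the threshold."""
    return chebyshev_distance(per_dimension) <= threshold


# Illustrative, normalised distances between two instances p and q.
distances = {"physical": 0.2, "social": 0.1, "virtual": 0.4}

print(chebyshev_distance(distances))    # the largest of the three values
print(should_coalesce(distances, 0.5))  # within the threshold
print(should_coalesce(distances, 0.3))  # the virtual distance is too large
```

Because the maximum is used, a single distant dimension is enough to prevent coalescence, whatever the values of the other dimensions.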

An Example Application

To illustrate how Collective Executions work in practice, let’s consider the following PhotoSharing example. A group of people has gathered together. The group consists of friends and family members, so the distance in the social world is short. This can act as a trigger that enables their PhotoSharing executions to coalesce.

From the user’s perspective, the idea of PhotoSharing is to initiate discussion among the people who have gathered together. Hence, while PhotoSharing is running, it also notices that the physical distance is under the threshold value and proactively suggests viewing the participants’ most recent, previously unshared photos related to a specific topic, such as recent trips abroad or family life (common interests, that is, distance in the virtual world). The Collective Execution then forms the state by synchronizing the unseen photos and the currently shown photo among the participating devices. One device at a time acts as the coordinator and directs the others to show a certain photo. The other devices act as screens, showing the same selected photo at the same time. This is depicted in Figure 2.
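The roles described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class and method names (`Device`, `SharedState`, `PhotoSharingExecution`, `show_next`, `pass_coordination`) are hypothetical, everything runs in a single process, and the real framework's networking and synchronization are not modelled.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SharedState:
    """State synchronized among the devices after coalescence."""
    unseen_photos: List[str] = field(default_factory=list)
    current_photo: Optional[str] = None


@dataclass
class Device:
    name: str
    local_photos: List[str]
    shown: Optional[str] = None  # what this device's screen currently shows


class PhotoSharingExecution:
    """Toy collective execution: one coordinator, all devices act as screens."""

    def __init__(self, devices: List[Device]) -> None:
        if not devices:
            raise ValueError("a collective execution needs at least one device")
        self.devices = list(devices)
        self.coordinator = self.devices[0]
        self.state = SharedState()
        # Coalescence: merge every device's unshared photos into the shared state.
        for device in self.devices:
            self.state.unseen_photos.extend(device.local_photos)

    def show_next(self) -> Optional[str]:
        """The coordinator selects the next photo; every screen shows it."""
        if not self.state.unseen_photos:
            return None
        photo = self.state.unseen_photos.pop(0)
        self.state.current_photo = photo
        for device in self.devices:
            device.shown = photo
        return photo

    def pass_coordination(self, device: Device) -> None:
        """Hand the coordinator role to another participating device."""
        if device not in self.devices:
            raise ValueError("device is not part of this execution")
        self.coordinator = device


phone = Device("phone", ["trip1.jpg", "trip2.jpg"])
tablet = Device("tablet", ["family1.jpg"])
execution = PhotoSharingExecution([phone, tablet])
execution.show_next()
print(phone.shown, tablet.shown)  # both screens show the same photo
```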

Figure 2: PhotoSharing Collective Execution coordinating entities.

Future Work

Obviously, such a proactive approach raises numerous privacy and security concerns. Plenty of research is needed, for instance, to make sure that no personal or intimate data leaks into the wrong hands. The user experience perspective on security should not be overlooked either: users should a) understand how and by whom their data is being used, and b) not be burdened with constant interruptions regarding data usage or the proactive invocation of activities.

Acknowledgements

This blog post is based on our articles in JSS and IEEE Software, which we at the Department of Computer Science, University of Helsinki, Finland, co-authored with our colleagues at the departments of Electronics and Communications Engineering and Pervasive Computing, Tampere University, Finland.

References

Mäkitalo, N., Aaltonen, T., Raatikainen, M., Ometov, A., Andreev, S., Koucheryavy, Y., & Mikkonen, T. (2019). Action-Oriented Programming Model: Collective Executions and Interactions in the Fog. Journal of Systems and Software, 157, 110391. DOI: https://doi.org/10.1016/j.jss.2019.110391

Mäkitalo, N., Ometov, A., Kannisto, J., Andreev, S., Koucheryavy, Y., & Mikkonen, T. (2017). Safe, secure executions at the network edge: coordinating cloud, edge, and fog computing. IEEE Software, 35(1), 30–37. DOI: http://doi.org/10.1109/MS.2017.4541037

 

Notes on blockchains

Originally published in the 4APIs project blog on the 24th of August 2020, https://4apis.fi/blog/

The best-known applications of blockchain technology are cryptocurrencies, but there is considerable interest in applying blockchains as a data storage method in various fields. Blockchains can be used to record transactions in a reliable, secure and immutable manner. Transactions are saved to blocks that are linked to each other by cryptographic hashes and together form a digital ledger. Each party or node in a blockchain maintains a copy of this immutable ledger. Consensus among the parties is achieved by using, for example, proof-of-work.
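To make the ideas of linked blocks and proof-of-work more concrete, here is a minimal Python sketch that uses only the standard library. It is a toy model, not how Bitcoin or Ethereum are implemented: the `Block` layout, the JSON serialisation, the difficulty of three leading zero hex digits and the transaction strings are all assumptions chosen for illustration.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List

DIFFICULTY = 3           # required leading zero hex digits (illustrative value)
GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


@dataclass
class Block:
    index: int
    transactions: List[str]
    previous_hash: str
    nonce: int = 0

    def digest(self) -> str:
        """SHA-256 over the block contents, including the previous block's hash."""
        payload = json.dumps(
            [self.index, self.transactions, self.previous_hash, self.nonce]
        ).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def mine(block: Block) -> Block:
    """Toy proof-of-work: try nonces until the hash has enough leading zeros."""
    while not block.digest().startswith("0" * DIFFICULTY):
        block.nonce += 1
    return block


def build_chain(batches: List[List[str]]) -> List[Block]:
    chain: List[Block] = []
    previous_hash = GENESIS_HASH
    for index, transactions in enumerate(batches):
        block = mine(Block(index, list(transactions), previous_hash))
        chain.append(block)
        previous_hash = block.digest()
    return chain


def is_valid(chain: List[Block]) -> bool:
    """Check both the hash links and the proof-of-work of every block."""
    previous_hash = GENESIS_HASH
    for block in chain:
        if block.previous_hash != previous_hash:
            return False
        if not block.digest().startswith("0" * DIFFICULTY):
            return False
        previous_hash = block.digest()
    return True


chain = build_chain([["alice->bob:5"], ["bob->carol:2"]])
print(is_valid(chain))  # the freshly mined chain passes validation
```

Changing a transaction in an earlier block changes that block's hash, so the link stored in the next block no longer matches and `is_valid` fails unless every later block is mined again.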

Blockchains can be permissionless or permissioned. Permissionless, public blockchain systems allow anyone to join the blockchain, whereas permissioned, private blockchain systems use membership control to allow only identified parties to join. In public blockchains, anyone can read or write data, but while reading is free, writing to a blockchain requires paying a fee in cryptocurrency. The fee is paid to the miner who first completes the proof-of-work to secure a new block containing the transaction data. In private blockchains, the owner of the blockchain can decide on the transaction fees.

Blockchains can be used to eliminate the need for trust among the parties sending transactions to each other. All transactions are visible in the distributed ledger, and tampering with the transaction history would require a malicious party to take control of the majority (51%) of the blockchain network’s mining hash rate.

Ethereum [1] is the most popular permissionless blockchain that allows writing of smart contracts on the blockchain. Smart contracts consist of contracts or business logic that are installed on the blockchain system. Parties in the blockchain can execute smart contracts to create different transactions that are then validated by other parties and saved to the blockchain.

There are several platforms for building permissioned, private blockchains. Some of the most widely used are Hyperledger [2] and R3’s Corda [3]. Private blockchains are meant to allow saving sensitive data to a blockchain so that only selected parties are able to view it. However, it is also possible to save encrypted, private data to a public blockchain.

Since public blockchains use computationally more expensive consensus protocols and have more nodes, private blockchains can potentially offer better scalability and faster transactions. However, private blockchains are not truly decentralized and, for example, Hyperledger Fabric was found in 2019 to have issues with network delays causing desynchronization in the blockchain [4].

There has been a lot of interest in the possibilities of blockchain technology, and hopes of revolutions in many different areas such as finance and Internet of Things (IoT). Blockchains can provide secure ways to manage confidential data and identity information, and thus provide potential use cases also in health care.

However, so far there are only a few fully operational blockchains besides systems related to cryptocurrencies, and Bitcoin [5] remains the most successful real-world application of blockchain technology.

Ethereum was the first blockchain to support the implementation of smart contracts, which enable building decentralized applications (dapps) on the Ethereum blockchain. There are various potential use cases for dapps and plenty of tutorials on dapp development available online. Despite this, most dapps have practically no users or transactions. On the 9th of June 2020, the list of Ethereum dapps on DappRadar [6] showed over 1880 dapps, but only 330 had had at least one user during the previous seven days. All Top Ten dapps (based on user count during the previous seven days), save one, were related to money exchange, high-risk investments and decentralized finance. There are some gaming applications on Ethereum (e.g. CryptoKitties, My Crypto Heroes), but most dapps appear to be related to finances and gambling.

In Deloitte’s Global Blockchain survey 2019, 53% of organizations saw blockchain as critical and among their top five priorities [7]. In the same survey, one of the top five “organizational barriers to greater investment in blockchain technology” was the lack of in-house capabilities. As the need for blockchain professionals is likely to grow in the future, a project has been launched in Finland to provide education on blockchain technology in universities [8].

Supply chains are one area where the use of blockchain technology can potentially streamline the process and reduce paperwork besides creating a transparent, immutable record of the product history. However, Gartner’s report (2019) estimates that “contrary to initial market hype and for the time being, blockchain is not enabling a major digital business revolution, and may not enable one until at least 2028” [9]. This is due to several factors that currently make it challenging for organisations to adopt blockchain technology.

In a blockchain system, there is overhead from replicating the data. For example, in Ethereum, users who host a full node need approximately 180 GB of disk space [10]. However, not all users need to run a full node, as there are also light nodes that store only the block headers and are able to request other information from full nodes. Light nodes can be used, for example, in mobile phones or embedded devices.

Storing data on a public blockchain is also expensive. For instance, Peker et al. (2020) studied the cost of saving IoT sensor data to the Ethereum blockchain. In their experiment, the cost of storing 6000 data points (256-bit integers) was approximately 335 to 467 US dollars depending on the method used [11]. According to other informal estimates, storing 1 kB of data on the Ethereum blockchain would have cost approximately 1.6 US dollars in 2018, and storing 1 GB over 1.6 million US dollars. According to Kumar et al. (2020) “the cost of storage on a public blockchain platform can be staggering, a few thousand times higher than on a distributed database system or in the cloud. On a permissioned blockchain system, the cost is likely to be less but still one or two orders of magnitude higher.” [12]
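The figures above can be put on a common scale with some simple arithmetic. The short Python calculation below is a back-of-the-envelope sketch: it assumes decimal units (1 kB = 1000 bytes, 1 GB = 1,000,000 kB) and counts only the raw payload of the 256-bit integers, ignoring any contract or transaction overhead, so the derived per-kB values are illustrative rather than results reported in [11].

```python
# Experiment reported by Peker et al. (2020): 6000 data points, 256 bits each.
DATA_POINTS = 6000
BYTES_PER_POINT = 256 // 8          # a 256-bit integer takes 32 bytes
COST_RANGE_USD = (335.0, 467.0)     # depending on the storage method used

# Assumption: decimal kilobytes, raw payload only.
total_kb = DATA_POINTS * BYTES_PER_POINT / 1000

cost_per_point = tuple(cost / DATA_POINTS for cost in COST_RANGE_USD)
cost_per_kb = tuple(cost / total_kb for cost in COST_RANGE_USD)

print(f"payload: {total_kb:.0f} kB")
print(f"per data point: {cost_per_point[0]:.3f} to {cost_per_point[1]:.3f} USD")
print(f"per kB: {cost_per_kb[0]:.2f} to {cost_per_kb[1]:.2f} USD")

# Informal 2018 estimate quoted in the text: about 1.6 USD per kB.
USD_PER_KB_2018 = 1.6
KB_PER_GB = 1_000_000               # decimal units (assumption)
usd_per_gb = USD_PER_KB_2018 * KB_PER_GB
print(f"per GB: {usd_per_gb:,.0f} USD")
```

Under these assumptions the experiment works out to roughly 0.06 to 0.08 US dollars per data point and about 1.7 to 2.4 US dollars per kB of payload, which is of the same order as the informal estimate of 1.6 US dollars per kB. With binary units (1 GB = 1,048,576 kB) the per-GB figure would be about 1.68 million US dollars.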

Because mining is computationally expensive, public blockchain systems consume more energy than regular distributed databases. Bitcoin mining is notoriously energy-intensive, and sustainability issues are one area where more research is needed. The popularity of Bitcoin and other cryptocurrencies has also led to various cryptocurrency-related scams and, for example, to infected websites harnessing visitors’ computers to mine Bitcoin.

In their article Kumar et al. (2020) suggest that “blockchain technology should be deployed selectively, mainly for interorganizational transactions among untrusted parties, and in applications that need high levels of provenance and visibility.” For example, tracking the origin and shipping of precious gemstones or other expensive or critical commodities is one area where blockchain systems have been trialled. Regarding supply chains, major challenges for using blockchains include creating legislation and standards all parties can agree on, and getting everyone to use blockchain technology despite additional costs. Joining a consortium is usually necessary to properly utilize blockchains.

To conclude, as blockchains are today a high-cost and high-overhead storage method, careful consideration is needed to determine the proper use cases. A decision should also be made on what data to store on the blockchain, as it might be feasible to store only the most critical parts of the data to save resources. Blockchain technology is being piloted in various fields, and in the future, blockchains are likely to be utilized on a much wider scale.

References and recommended reading

[1] https://ethereum.org/en/

[2] https://www.hyperledger.org/

[3] https://www.r3.com/corda-platform/

[4] Nguyen, T.S.L., Jourjon, G., Potop-Butucaru, M. & Thai, K.L. 2019, “Impact of network delays on Hyperledger Fabric”, INFOCOM 2019 – IEEE Conference on Computer Communications Workshops, INFOCOM WORKSHOPS 2019, pp. 222-227.

[5] https://bitcoin.org/en/

[6] https://dappradar.com/rankings/protocol/eth

[7] Deloitte’s Global Blockchain survey 2019

[8] https://www.eura2014.fi/rrtiepa/projekti.php?projektikoodi=S22027

[9] Gartner 2019: Blockchain Unraveled: Determining Its Suitability for Your Organization https://www.gartner.com/en/doc/3913807-blockchain-unraveled-determining-its-suitability-for-your-organization

[10] https://medium.com/@marcandrdumas/are-ethereum-full-nodes-really-full-an-experiment-b77acd086ca7

[11] Peker, Y.K., Rodriguez, X., Ericsson, J., Lee, S.J. & Perez, A.J. 2020, “A cost analysis of internet of things sensor data storage on blockchain via smart contracts”, Electronics (Switzerland), vol. 9, no. 2.

[12] Kumar, A., Liu, R., Shan, Z. 2020, “Is Blockchain a Silver Bullet for Supply Chain Management? Technical Challenges and Research Opportunities”, Decision Sciences 51 (1), pp. 8-37.

 

ICWE’20 Successfully Completed

ICWE’20 live from Helsinki, Finland, June 9-12, 2020

June 9-12, 2020 were exciting times for Aalto University, Metropolia University of Applied Sciences, Tampere University, and University of Helsinki. We jointly hosted the 20th International Conference on Web Engineering (ICWE’20, https://icwe2020.webengineering.org/). The conference is one of the key events for the Web Engineering community, and this year it included one day of workshops, tutorials, and a PhD symposium, followed by three days of the main event.

When we first promoted the event in Daejeon, South Korea in June 2019, the world was a very different place. We were worried about things such as the venue (downtown, easily accessible, with nice places to visit nearby), getting to Finland (Finnair has a great network for entering Europe from various places) and the social program (Helsinki has great restaurants and an option to go to a sauna by the seaside, in downtown). However, during the spring things took a different turn, and we had to abandon all the plans virtually overnight, leaving us two options: either to postpone the event or to go completely online.

 

We decided to go for the latter. What could have been more natural for a community of web engineering researchers and practitioners? This meant that the website became the portal to just about everything related to the conference, including links to live sessions, video presentations, slides, Slack channels, and so on. Despite our worries while organizing things, it turned out that it is indeed feasible to organize an international conference completely online. Here are a few ideas that we adopted and that worked well.

  • Staying focused online is harder than staying focused when someone is physically present and presenting. To allow the participants to familiarize themselves with the topics, we collected video presentations of the accepted papers and put them online well before the event.
  • To further focus on the essentials when online, we changed the format of the sessions. Usually, ICWE has had 1.5-hour sessions with three 30 min slots, each consisting of a 20-25 min presentation and 5-10 min of questions. We shortened the sessions to 60 min, and the presentations to 20 min. Since full videos of the presentations were available for viewing beforehand, the format of each presentation was 5 min for a recap by the author(s), and 15 min of interaction, questions, comments, and so on. This worked surprisingly well, and, given that communication is slower online than in person, provided a nice opportunity to interact with the presenters.
  • Keynotes were streamed live, to maintain excitement throughout the presentation. The three keynote speakers, David Bryant (Fellow in Emerging Technologies, Mozilla), Prof. Dr. Olaf Zimmermann (University of Applied Sciences of Eastern Switzerland) and Jaakko Lempinen (Head of AI, YLE), all shared a number of experiences and inspired a lively discussion afterwards.
  • Every session in the conference had a Slack channel, where the topics related to that session could be discussed. Many session chairs used the channel to discuss practicalities with the presenters, and the audience used it to raise questions. This would definitely work at a physical conference, too.

While the event did not require any physical traveling, this is not to say that it was an easy four-day activity. Instead, the event was intense, and the intensity was clearly visible in people, at least when they had their camera on. Having the event completely online also meant taking into account details such as time zones while composing the program, which added a layer of complexity to the planning of the event.

Goodbye all you ICWE’20 delegates, and thanks to those of you who helped us to organize the event. It was great to have you all in Finland, even if only remotely.

Kari^2, Markku, Niko and Tommi

 

Discontinuity and Continuous X Within Software Development

Many – if not all – software organizations are currently faced with extraordinary circumstances and highly uncertain business conditions. Hardly any “business-as-usual” exists. Some of the discontinuities may even become the “new normal”. In these discontinuous times, it is especially apt to consider what continuous activities and capabilities relate to modern software creation and production.

Continuous delivery and continuous deployment (CD) are nowadays mainstream practices in modern software engineering. Such practices coupled with efficient infrastructures make it possible to develop and maintain software systems frequently based on the current feedback and usage conditions. Continuous integration (CI) supports that way of working.

Continuous experimentation facilitates software product creation by reducing uncertainties with systematic experiments (cf. here). Consequently, the more uncertainties the software product is faced with, the more useful such experimental development approaches with continuous learning may be.

Advancing from and building on the aforementioned developmental capabilities, continuous innovation integrates continuous learning, improvement and innovation. Continuity of the innovation activities and related business processes is especially important in volatile and fast-moving environments where stable states may not prevail for long and disruptions may blur and even reposition industry boundaries.

We have recently investigated continuous innovation in an industrial case study (see https://doi.org/10.1007/978-3-030-33742-1_13). ICT use may improve organization-wide ideation and the subsequent innovation process activities by making key information transparent and ubiquitously accessible for all stakeholders. That enables every employee to continuously engage and contribute to idea generation, development and validation. Ideally, the knowledge and creative potential of the entire organization is utilized at critical times.

 

IVVES project on the testing of machine learning systems starting

Last week Business Finland decided to fund our three-year IVVES project (Industrial-grade Verification and Validation of Evolving Systems) https://ivves.weebly.com/. We can now significantly extend our research efforts on testing, continuous development, and maintenance of machine learning systems together with our partners in Finland, the Netherlands, Sweden, and Canada. We are also planning to set up an interest group for Finnish companies interested in the project. The University of Helsinki’s work in the project is jointly headed by Prof. Tommi Mikkonen and Prof. Jukka K. Nurminen.

Open Source Software Framework for Data Fault Injection to Test Machine Learning Systems

dpEmu is our Python library for emulating data problems in the use and training of machine learning systems. It provides tools for injecting errors into data, running machine learning models with different error parameters and visualizing the results.

Data-intensive systems are sensitive to the quality of data. Data often has problems due to faulty sensors or network problems, for instance. The dpEmu framework can emulate faults in data, and the faulty data can then be used to study how machine learning (ML) systems work when the data has problems. The Python framework aims for flexibility: users can use predefined fault models or their own dedicated ones. Likewise, different kinds of data (e.g. text, time series, video) can be used, and the system under test can vary from a single ML model to a complicated software system.

The software and a set of Jupyter notebooks illustrating different use cases are available at https://github.com/dpEmu/dpEmu

We just presented the work at the ISSRE conference: Jukka K. Nurminen, Tuomas Halvari, Juha Harviainen, Juha Mylläri, Antti Röyskö, Juuso Silvennoinen, and Tommi Mikkonen. “Software Framework for Data Fault Injection to Test Machine Learning Systems”. 4th IEEE International Workshop on Reliability and Security Data Analysis (RSDA 2019) at 30th Annual IEEE International Symposium on Software Reliability Engineering (ISSRE 2019), Berlin, Germany, October 2019.
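The general idea of data fault injection can be illustrated with a small standard-library Python sketch. Note that this does not use the dpEmu API; the function names, the two fault models (Gaussian noise and missing values) and the trivial "system under test" that estimates the mean of a sensor series are all assumptions made for this example. See the repository above for the library's actual interface.

```python
import random
from statistics import mean
from typing import Dict, List, Optional, Sequence


def add_gaussian_noise(values: Sequence[float], std: float,
                       rng: random.Random) -> List[float]:
    """Fault model 1: add zero-mean Gaussian noise to every reading."""
    return [v + rng.gauss(0.0, std) for v in values]


def drop_values(values: Sequence[float], probability: float,
                rng: random.Random) -> List[Optional[float]]:
    """Fault model 2: replace readings with None with the given probability."""
    return [None if rng.random() < probability else v for v in values]


def estimate_mean(values: Sequence[Optional[float]]) -> float:
    """Stand-in for the system under test: mean of the readings that are present."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("all readings are missing")
    return mean(present)


def run_experiment(clean: Sequence[float], stds: Sequence[float],
                   missing_probability: float, seed: int = 0) -> Dict[float, float]:
    """Run the system under test with different error parameters.

    Returns the absolute error of the estimate for each noise level.
    """
    rng = random.Random(seed)
    truth = mean(clean)
    errors: Dict[float, float] = {}
    for std in stds:
        faulty = drop_values(add_gaussian_noise(clean, std, rng),
                             missing_probability, rng)
        errors[std] = abs(estimate_mean(faulty) - truth)
    return errors


# Illustrative clean sensor series and error parameters.
clean_series = [20.0 + 0.01 * i for i in range(200)]
for std, error in run_experiment(clean_series, [0.0, 0.5, 2.0], 0.1).items():
    print(f"noise std={std}: absolute error of the mean={error:.3f}")
```

The same loop structure carries over to more realistic setups: replace `estimate_mean` with a trained model and the absolute error with the model's evaluation metric.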

Doctoral defence on Continuous Experimentation in Software Engineering

Sezin Yaman from the ESE research group will have a public examination of her PhD thesis titled “Initiating the Transition towards Continuous Experimentation: Empirical Studies with Software Development Teams and Practitioners”. The opponent will be Professor Brian Fitzgerald, University of Limerick. The public event will take place in Room 302, Athena Building, University of Helsinki, on the 25th of October, 2019 at 12 o’clock noon.

See:
https://hy.etapahtuma.fi/flamma-2019/fi?id=56765#.XadD_sRS9hE

https://helda.helsinki.fi/handle/10138/305855