Discussion

From a technical point of view, the largest barriers to properly verifying YOLOv8’s performance were the lack of well-balanced classes and the small size of Dataset 1. The book pages in our data contain varying numbers of instances of each class, which made it difficult to distribute the classes evenly across the training data. We later realized that instances of underrepresented classes could have been cropped from the images and included in the training data as standalone printmarks, with some augmentation applied to them, but there was not enough time to test this. Even leaving class balance aside, we simply need a larger annotated dataset so that our classifier can learn from more diverse data. In conclusion, we can only state that YOLOv8 is able to learn from the ECCO data and does so quite quickly, even on a CPU. However, to build a truly usable model for the printmark detection task, we need a larger, well-balanced dataset.
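A minimal sketch of how such crops could be harvested is shown below, assuming YOLO-format annotations (class id followed by normalized center coordinates, width, and height) and hypothetical paths and class ids; the augmentations (mirroring, contrast adjustment) are only illustrative and would need to be chosen with the ornaments’ symmetry and lettering in mind.

```python
# Minimal sketch (assumed YOLO-format labels, hypothetical paths and class id):
# crop instances of an underrepresented printmark class and save a few augmented
# variants that could be added back into the training data.
from pathlib import Path
from PIL import Image, ImageEnhance, ImageOps

RARE_CLASS_ID = 3                        # hypothetical id of the rare printmark class
IMAGES = Path("dataset1/images")         # hypothetical dataset layout
LABELS = Path("dataset1/labels")
OUT = Path("dataset1/rare_crops")
OUT.mkdir(parents=True, exist_ok=True)

for label_file in LABELS.glob("*.txt"):
    image_file = IMAGES / (label_file.stem + ".jpg")
    if not image_file.exists():
        continue
    page = Image.open(image_file)
    W, H = page.size
    for i, line in enumerate(label_file.read_text().splitlines()):
        cls, xc, yc, w, h = line.split()
        if int(cls) != RARE_CLASS_ID:
            continue
        # Convert the normalized YOLO box (center x/y, width, height) to pixel corners.
        xc, yc, w, h = float(xc) * W, float(yc) * H, float(w) * W, float(h) * H
        box = (int(xc - w / 2), int(yc - h / 2), int(xc + w / 2), int(yc + h / 2))
        crop = page.crop(box)
        crop.save(OUT / f"{label_file.stem}_{i}_orig.png")
        # Illustrative augmentations: a mirrored and a contrast-adjusted copy.
        ImageOps.mirror(crop).save(OUT / f"{label_file.stem}_{i}_flip.png")
        ImageEnhance.Contrast(crop).enhance(1.4).save(OUT / f"{label_file.stem}_{i}_contrast.png")
```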

The clusters labeled as noise contain many images that, at least by human standards, could belong to existing clusters. This could be because significant dimensions were removed by the PCA dimensionality reduction, meaning that the 2D visualization is not a completely accurate representation of the distances between these objects, but it is hard to tell without using metrics.
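One simple metric that would help here is the share of variance retained by the 2D projection; the sketch below shows how this could be checked with scikit-learn, assuming the image embeddings are available as a feature matrix (the random `features` array is only a placeholder).

```python
# Minimal sketch: how much of the original variance survives the 2D PCA projection?
# A low value means the 2D scatter plot understates the true distances between images.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
features = rng.normal(size=(500, 512))   # placeholder for the real ornament embeddings

pca = PCA(n_components=2).fit(features)
retained = pca.explained_variance_ratio_.sum()
print(f"Variance retained by the 2D projection: {retained:.1%}")
```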

In general, the other recovered clusters do a fairly good job of capturing the overall visual details that make images similar. In particular, they captured the style of the headpieces well enough to group together the headpieces that look like artworks, and the letters in the initials do not seem to have a strong bearing on the result. The first initials cluster shows quite a bit of variety in the artwork used, but overall it is stylistically very consistent.

From a humanities point of view, we were able to show that quite reliable clustering is possible with a dataset such as the Tonson dataset. Clustering a less well curated dataset such as Dataset 1, by contrast, is not fine-grained enough. The quality and quantity of the data for each printer need to be better for a task such as printer identification to be reliable. As presented in the work of Maslen (2001), grouping images by human perception is less dependent on the number of images per printer and is therefore better suited to identifying printers with only a sparse number of surviving works. Identifying additional works by known printers who are already associated with a certain number of printing products is thus a more suitable use case for automatic clustering.

A clustering of the quality achieved for Dataset 1 can be used to associate rare ornaments with each other and to find stylistic similarities, as shown in clusters 4, 6, and 39. The results of such a clustering can thus serve as a starting point for further qualitative analysis, as begun in the section on the results of the metadata analysis. That analysis could be extended by research gathering more insight into the individual actors involved and possible relationships between them, such as apprenticeships. Even without considering further aspects, the metadata analysis already revealed rather negative results. If circumstances such as the centrality of London, particularly in the first half of the 18th century, are taken into account, clusters such as clusters 1 and 3, whose works were published in the first two decades of the 18th century, are less expressive, because London was to be expected as the place of publication. Moreover, the works of many clusters that were printed in close temporal proximity were published in different locations, and the difficulty of travel and transport at the time made it highly unlikely that the same printer owned printing houses in several towns far apart.
