Inspired by Juan Sánchez Cotán’s Still Life with Game, Vegetables and Fruit, this series explores how patterns can disappear when data is not treated with care. It reflects on the process of normalization of the training and unseen data, where using a different reference can distort the original structure. Like the absent game in this composition, meaningful insights may vanish when scaling breaks the link to what once held them in view.
Original image source: Museo Nacional del Prado
Observations with variables such as pixel coordinates and color value are given for different layers of the above painting. A machine learning algorithm is trained on a portion of the observations such as the background layer and the objects except the Game which is designated as unseen data.
Upon training the model, statistics from the training set are extracted and are used to scale variables of both the training and the unseen set. With the correct approach, the unseen data is scaled appropriately with respect to the learned information.
When the unseen dataset is scaled with respect to its own statistics and not to the training ones, the original relative structure gets distorted and patterns such as an object on the canvas—like the Game—vanish.
Two brief experiments related to the painting are documented, titled as The Vanishing Effect of Inconsistently Scaled Data.