Poster in Workshop: Methods and Opportunities at Small Scale (MOSS)
Permutations as a testbed for studying the effect of input representations on learning
Sarah Scullen · Davis Brown · Robert Jasper · Henry Kvinge · Helen Jenne
Keywords: data, data representation, permutations, model architecture, deep learning
Quality data is crucial for deep learning. However, relative to progress in model training and data curation, less attention has been paid to how data is encoded and passed to the neural network, i.e., the “data representation.” This is especially true for non-textual domains, where it is often hard to distinguish the intrinsic difficulty of a learning task from difficulties arising merely from the format of the input data. We propose using permutations, which admit multiple natural mathematical representations, as a systematic testbed for studying how the choice of input data representation influences task difficulty and learning outcomes. In our setting, we find that model performance under a given data representation can vary significantly with the number of training examples and the architecture type; with enough examples, however, most tasks are learned regardless of the representation.
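To make the idea of “multiple natural representations” concrete, the sketch below encodes the same permutation in three standard forms: one-line notation, a permutation matrix, and disjoint-cycle notation. This is an illustrative assumption about which encodings one might feed to a model; the abstract does not specify the exact representations used in the paper.

    import numpy as np

    def one_line(perm):
        # One-line notation: position i maps to perm[i], e.g. [2, 0, 1].
        return np.asarray(perm)

    def permutation_matrix(perm):
        # Binary n x n matrix with a 1 at (i, perm[i]) for each position i.
        n = len(perm)
        mat = np.zeros((n, n), dtype=np.int64)
        mat[np.arange(n), perm] = 1
        return mat

    def cycle_notation(perm):
        # Disjoint cycles, e.g. [2, 0, 1] -> [(0, 2, 1)]; fixed points omitted.
        seen, cycles = set(), []
        for start in range(len(perm)):
            if start in seen:
                continue
            cycle, i = [], start
            while i not in seen:
                seen.add(i)
                cycle.append(i)
                i = perm[i]
            if len(cycle) > 1:
                cycles.append(tuple(cycle))
        return cycles

    perm = [2, 0, 1]  # maps 0 -> 2, 1 -> 0, 2 -> 1
    print(one_line(perm))            # [2 0 1]
    print(permutation_matrix(perm))  # 3 x 3 0/1 matrix
    print(cycle_notation(perm))      # [(0, 2, 1)]

All three encodings carry identical information, so any difference in how easily a network learns from them reflects the representation rather than the underlying task.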