Explainable AI is a field of study focused on producing more explainable models while maintaining a high level of learning performance (prediction accuracy), enabling human users to understand, appropriately trust, and effectively manage their interactions with artificially intelligent agents. Nested Abstraction Modeling (NAM) is an experimental machine learning technique designed to amplify the characteristics and biases present in neural networks. The approach works by priming a trained model with random inputs and abstracting its outputs by training a second model roughly 1/200th of the original's size.
Discipline: AI research, explainable AI, machine learning
Role: Software, Interface Design, Data Visualization
Team: Valdis Silins, Alyona Shapovalova, Nataliya Tyshkevich
NAM works by priming GPT-2 with various text inputs, then processing its output with natural language processing techniques such as sentiment analysis. The resulting data is then used as training data for a second model (a nested architecture). While far from state-of-the-art, the model obtained through this process suggests that mathematical logic can be translated into abstractions that are more human-readable (language, visual environments, knowledge graphs), providing a better glimpse into the original model's behavior and tendencies under controlled conditions.
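The pipeline described above can be sketched in miniature. The snippet below is a hypothetical illustration, not NAM's actual implementation: a handful of hand-written strings stand in for the primed first model's generations, a tiny keyword lexicon stands in for a real sentiment-analysis pipeline, and a bag-of-words centroid classifier stands in for the much smaller nested model. All names and data here are invented for the example.

```python
from collections import Counter

# Stand-ins for text generated by the primed first model (step 1).
primed_outputs = [
    "the future looks bright and hopeful",
    "a wonderful and promising discovery",
    "the outlook is grim and hopeless",
    "a terrible and disappointing failure",
]

# Toy sentiment lexicon in place of a real NLP pipeline (step 2).
POSITIVE = {"bright", "hopeful", "wonderful", "promising"}
NEGATIVE = {"grim", "hopeless", "terrible", "disappointing"}

def sentiment(text):
    """Label a generated text by counting lexicon hits."""
    words = set(text.split())
    return "pos" if len(words & POSITIVE) >= len(words & NEGATIVE) else "neg"

# The labelled outputs become training data for the nested model (step 3).
dataset = [(text, sentiment(text)) for text in primed_outputs]

# Nested model: one word-frequency centroid per label.
centroids = {"pos": Counter(), "neg": Counter()}
for text, label in dataset:
    centroids[label].update(text.split())

def nested_predict(text):
    """Classify new text by word overlap with each label's centroid."""
    scores = {lbl: sum(c[w] for w in text.split())
              for lbl, c in centroids.items()}
    return max(scores, key=scores.get)

print(nested_predict("a promising future"))  # → pos
```

The point of the sketch is only the shape of the loop: the big model's behavior is sampled, re-labelled through an interpretable lens, and compressed into a far smaller model that a human can inspect directly.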
The first proof of concept was built in early 2019 using the newly released GPT-2 model by OpenAI. GPT-2's main objective is to predict the next word given all of the previous words within a text. Trained on 40GB of internet text, the model produces highly readable and coherent responses to a given text input, while its biases and ideological inclinations remain inaccessible to its creators and users.
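GPT-2's training objective, predicting the next word from the previous ones, can be illustrated at toy scale. The sketch below uses a simple bigram frequency model as a stand-in for the real transformer; the corpus is invented for the example.

```python
from collections import Counter, defaultdict

# Toy corpus standing in for GPT-2's 40GB of internet text.
corpus = "the model predicts the next word and the next word after that"
tokens = corpus.split()

# Count which word follows which: a bigram language model.
bigrams = defaultdict(Counter)
for prev, nxt in zip(tokens, tokens[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent continuation seen after `word`."""
    counts = bigrams[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # → next
```

GPT-2 does the same thing in spirit, but conditions on the entire preceding context rather than a single word, which is what makes its outputs read as coherent prose.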
NAM's findings and a new working prototype are currently being written up in a research paper to be released by the end of 2019.