
Learning Task Relatedness in Multi-Task Learning at ICMR 2019

Early PhD Blog

Our work on exploiting secondary latent features for task grouping was accepted for oral presentation at ICMR 2019 in Ottawa, Canada. The paper introduces Selective Sharing, a method that uses factorized per-task gradients as a signal for grouping tasks that benefit each other's learning process.

The Approach

Multimedia applications often require concurrent solutions to multiple tasks. These tasks hold clues to each other's solutions, but because their relations can be complex, this property is rarely exploited. When task relations are explicitly defined from domain knowledge, multi-task learning (MTL) offers such concurrent solutions, exploiting the relatedness between multiple tasks performed over the same dataset.
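To make the setup concrete, here is a minimal PyTorch sketch of hard parameter sharing, the common MTL layout this setting assumes: a shared feature extractor feeding identically constructed task-specific heads. The layer sizes and the three-task configuration are illustrative assumptions, not taken from the paper.

```python
import torch.nn as nn

class SharedMTLModel(nn.Module):
    """Hard parameter sharing: one shared trunk, one head per task."""

    def __init__(self, in_dim=128, hidden_dim=64, task_out_dims=(10, 5, 2)):
        super().__init__()
        # Shared input and feature-extraction platform used by every task.
        self.trunk = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
        # Identically constructed task-specific estimators.
        self.heads = nn.ModuleList(
            nn.Linear(hidden_dim, out_dim) for out_dim in task_out_dims
        )

    def forward(self, x):
        features = self.trunk(x)
        return [head(features) for head in self.heads]
```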

In most cases, however, this relatedness is not explicitly defined, and the domain expert knowledge that would define it is not available. To address this, we introduce Selective Sharing, a method that learns inter-task relatedness from secondary latent features while the model trains.

Key Insights

In any deep learning system, gradients flow from the final layers of the model toward the first ones, carrying a corrective signal for the weights and biases along the way. Selective Sharing is based on the assumption that identically constructed task-specific estimators, sharing the same input and feature-extraction platform, manifest a correlation between the back-propagated gradients of related tasks.
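As a rough illustration of how that signal can be probed, the sketch below back-propagates each task's loss separately, collects the gradient on the shared trunk, and compares tasks pairwise. Plain cosine similarity stands in here for the paper's factorized gradient signal, so treat this as an assumption-laden approximation rather than the published method.

```python
import torch
import torch.nn.functional as F

def task_gradient_similarity(model, x, targets, loss_fns):
    """Pairwise cosine similarity of per-task gradients on the shared trunk.

    Hypothetical probe: Selective Sharing factorizes these gradients rather
    than comparing them raw; this only illustrates the underlying signal.
    """
    shared_params = list(model.trunk.parameters())
    outputs = model(x)
    per_task = []
    for out, target, loss_fn in zip(outputs, targets, loss_fns):
        # Gradient of this task's loss w.r.t. the shared weights only.
        grads = torch.autograd.grad(
            loss_fn(out, target), shared_params, retain_graph=True
        )
        per_task.append(torch.cat([g.flatten() for g in grads]))
    stacked = F.normalize(torch.stack(per_task), dim=1)
    return stacked @ stacked.T  # (num_tasks, num_tasks) similarity matrix
```

Values near 1 would suggest two tasks push the shared features in the same direction, while values near -1 suggest they conflict.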

The name Selective Sharing reflects a key property of the method: it lets the model itself select the tasks where sharing should occur, without being explicitly programmed to do so. Using this insight, we can automatically group tasks and let them share knowledge in a mutually beneficial way.
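A simple way to act on such a similarity matrix, sketched under the assumption that tasks whose gradients align above a threshold should share, is a connected-components grouping pass; the paper's actual grouping mechanism may differ.

```python
def group_tasks(similarity, threshold=0.5):
    """Group tasks whose shared-trunk gradients are sufficiently aligned.

    Connected-components clustering over the similarity matrix; the
    threshold is an illustrative hyperparameter, not from the paper.
    """
    num_tasks = similarity.shape[0]
    groups, assigned = [], set()
    for i in range(num_tasks):
        if i in assigned:
            continue
        group, stack = {i}, [i]
        while stack:  # depth-first walk over the threshold graph
            j = stack.pop()
            for k in range(num_tasks):
                if k not in assigned and k not in group and similarity[j, k] > threshold:
                    group.add(k)
                    stack.append(k)
        assigned |= group
        groups.append(sorted(group))
    return groups
```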