If you are interested in working on any of the listed projects, please open an issue on GitHub to track the progress.
A recent paper by researchers from MIT, FAIR, and Berkeley shows how to generate a very small synthetic dataset that is sufficient to train a neural network to good performance on the MNIST dataset. The authors plan to extend the work to larger image datasets and to non-image datasets. Adapting the method to text and EHR data can be both challenging and interesting.
Tongzhou Wang, Jun-Yan Zhu, Antonio Torralba, Alexei A. Efros, Dataset Distillation, arXiv
Victoria Poghosyan is working on this.
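The distillation idea can be sketched on a toy problem: learn the labels of a two-point synthetic dataset so that a linear model trained only on those two points does well on the real data. The regression task and the finite-difference meta-gradient below are illustrative assumptions; the paper backpropagates through the inner training run on image data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "real" dataset: y = 2x + 1 plus a little noise (an assumption for
# illustration; the paper distills MNIST).
X_real = rng.uniform(-1, 1, size=100)
y_real = 2 * X_real + 1 + 0.05 * rng.standard_normal(100)

def inner_train(x_syn, y_syn, steps=20, lr=0.3):
    """Train a linear model (w, b) from a fixed init on the synthetic set."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        err = w * x_syn + b - y_syn
        w -= lr * 2 * np.mean(err * x_syn)
        b -= lr * 2 * np.mean(err)
    return w, b

def real_loss(y_syn):
    w, b = inner_train(x_syn, y_syn)
    return float(np.mean((w * X_real + b - y_real) ** 2))

# The distilled dataset: only 2 synthetic points. For simplicity we keep
# the inputs fixed and learn only the synthetic labels.
x_syn = np.array([-0.5, 0.5])
y_syn = rng.standard_normal(2)

before = real_loss(y_syn)
eps, meta_lr = 1e-4, 0.2
for _ in range(300):
    # Meta-gradient through the inner training run; the paper
    # backpropagates, central finite differences stand in here.
    g = np.zeros(2)
    for i in range(2):
        d = np.zeros(2); d[i] = eps
        g[i] = (real_loss(y_syn + d) - real_loss(y_syn - d)) / (2 * eps)
    y_syn -= meta_lr * g
after = real_loss(y_syn)
print(before, after)  # meta-training should shrink the real-data loss
```

A text or EHR variant would replace the inner model with a sequence model, which makes differentiating through the inner training run much more expensive; that is part of what makes the adaptation challenging.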
There is a paper by DeepMind about the stability of a trained large ConvNet when some of its neurons are removed (Fig. 1). The authors also show that there is a correlation between generalization and robustness to such ablations (Fig. 3b).
Ari S. Morcos, David G.T. Barrett, Neil C. Rabinowitz, Matthew Botvinick, On the importance of single directions for generalization, arXiv
Tatev Mejunts is working on this.
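A minimal version of the ablation experiment can be sketched with a random-feature model standing in for the trained ConvNet (all details below are illustrative assumptions): zero out hidden units one by one and track how the accuracy degrades.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy task: classify the sign of a scalar input.
x = rng.uniform(-1, 1, size=(500, 1))
y = (x[:, 0] > 0).astype(float)

# "Trained network": random ReLU features with a least-squares readout,
# a tiny stand-in for the large ConvNets ablated in the paper.
n_hidden = 64
W = rng.standard_normal((1, n_hidden))
b = rng.standard_normal(n_hidden)
H = np.maximum(0.0, x @ W + b)                      # hidden activations
readout, *_ = np.linalg.lstsq(H, y, rcond=None)

def accuracy(mask):
    """Accuracy after clamping the masked-out hidden units to zero."""
    pred = (H * mask) @ readout
    return float(np.mean((pred > 0.5) == (y > 0.5)))

full = accuracy(np.ones(n_hidden))
accs = []
mask = np.ones(n_hidden)
for i in rng.permutation(n_hidden):                 # ablate units one by one
    mask[i] = 0.0
    accs.append(accuracy(mask))
print(full, accs[n_hidden // 2 - 1])  # accuracy after removing half the units
```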
The task is to attempt to confirm the results of the famous “rethinking generalization” paper for recurrent networks.
Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals, Understanding deep learning requires rethinking generalization, arXiv
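The paper's core observation can be reproduced in miniature: an over-parameterized model fits purely random labels perfectly. Below, Gaussian-kernel regression with one center per training point stands in for a big network (an illustrative assumption); the proposed project would repeat this kind of experiment with recurrent networks on sequence data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random labels carry no signal at all; inputs sit on a grid for simplicity.
n = 50
x = np.linspace(-1, 1, n)
y_random = rng.integers(0, 2, n).astype(float)

# Over-parameterized model: a narrow Gaussian kernel with one center per
# training point, so the model has as many "parameters" as data points.
Phi = np.exp(-((x[:, None] - x[None, :]) ** 2) / 0.0002)
w = np.linalg.solve(Phi + 1e-8 * np.eye(n), y_random)
train_acc = float(np.mean(((Phi @ w) > 0.5) == (y_random > 0.5)))
print(train_acc)  # → 1.0: the model memorizes pure noise
```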
There are two recent pretrained multilingual embeddings:
The task is to validate the quality of these embeddings for transfer learning.
Hakob Tamazyan is working on this.
There is a recent paper by Vetrov’s team that shows the following: if one trains a deep neural net (e.g. a ResNet) on ImageNet and finds two local minima A and B (with different weight initializations), then there exists a point C in the space of weights such that the loss function is almost constant along the straight lines AC and BC. This has not yet been tested on NLP tasks.
Timur Garipov, Pavel Izmailov, Dmitrii Podoprikhin, Dmitry Vetrov, Andrew Gordon Wilson, Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs, arXiv
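The curve-finding idea can be sketched on a 2-D toy surface: pick two "minima" A and B, and train a bend point C so that the loss stays low along the path A-C-B. The circle-shaped loss and the finite-difference optimizer below are illustrative assumptions; the paper trains the curve parameters with SGD on real networks.

```python
import numpy as np

# A 2-D toy loss whose low-loss region is the unit circle, standing in
# for a DNN loss surface with connected low-loss valleys.
def loss(pts):
    r2 = (pts ** 2).sum(axis=-1)
    return (r2 - 1.0) ** 2

A = np.array([1.0, 0.0])    # "minimum" found from one initialization
B = np.array([-1.0, 0.0])   # "minimum" found from another one

def path_loss(C, n=51):
    """Mean loss along the piecewise-linear path A -> C -> B."""
    t = np.linspace(0.0, 1.0, n)[:, None]
    pts = np.vstack([A + t * (C - A), C + t * (B - C)])
    return float(loss(pts).mean())

straight = path_loss((A + B) / 2)   # bend point on the segment = straight line

# Train the bend point C; central finite differences stand in for SGD.
C = np.array([0.0, 0.5])
eps, lr = 1e-5, 0.1
for _ in range(1000):
    g = np.zeros(2)
    for i in range(2):
        d = np.zeros(2); d[i] = eps
        g[i] = (path_loss(C + d) - path_loss(C - d)) / (2 * eps)
    C -= lr * g
bent = path_loss(C)
print(straight, bent)  # the bent path stays in a much lower-loss region
```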
The main repo of the benchmark: YerevaNN/mimic3-benchmarks
There is an interesting paper by the Google Brain team on adversarially reprogramming pretrained neural networks to perform a new task. The idea is demonstrated by reprogramming an ImageNet network to perform MNIST classification.
So far, there is no evidence that it might work on recurrent networks, which makes this task a risky one :) A paper from UC San Diego applied the technique to text classification tasks based on LSTMs and CNNs.
David Karamyan is working on this.
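A tiny sketch of the reprogramming setup, with a frozen linear map standing in for the pretrained network (every detail below is an illustrative assumption, not the paper's setup): the only trainable object is an additive input pattern, plus a hard-coded mapping from old classes to new labels.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "pretrained" classifier over 3 old classes: fixed arbitrary
# weights standing in for pretrained parameters; never updated.
W = np.array([[ 1.5,  0.3, -0.2],
              [-0.5,  0.8,  0.1],
              [ 0.2, -1.0,  0.6]])

# New task: predict the sign of a scalar s.
s = rng.uniform(-1, 1, 200)
y = np.where(s > 0, 1, -1)

def logit_diff(theta, s_i, flip):
    z = theta.copy()
    z[0] += s_i                      # embed the new input into coordinate 0
    logits = W @ z
    return flip * (logits[0] - logits[1])   # old classes {0,1} -> new labels

def train(flip, steps=300, lr=0.05):
    # The adversarial program: an additive input pattern theta, trained
    # with logistic loss while the "network" W stays frozen.
    theta = rng.standard_normal(3)
    for _ in range(steps):
        g = np.zeros(3)
        for s_i, y_i in zip(s, y):
            m = y_i * logit_diff(theta, s_i, flip)
            g += (-y_i * flip / (1.0 + np.exp(m))) * (W[0] - W[1])
        theta -= lr * g / len(s)
    acc = np.mean([y_i * logit_diff(theta, s_i, flip) > 0
                   for s_i, y_i in zip(s, y)])
    return float(acc)

# The class-to-label mapping is part of the program: try both and keep
# the better one.
best = max(train(+1), train(-1))
print(best)  # the frozen classifier now performs the new task
```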
There are lots of papers on “visualizing and understanding” convolutional networks, mostly starting from [1]. In recent years a few similar papers have appeared for RNNs, especially about sentiment analysis [2, 3]. Another recent paper does similar things for RNNs running on EHR notes [4].
[1] Karen Simonyan, Andrea Vedaldi, Andrew Zisserman, Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps, arXiv
[2] Jiwei Li, Xinlei Chen, Eduard Hovy and Dan Jurafsky, Visualizing and Understanding Neural Models in NLP, NAACL-HLT 2016, ACLWEB
[3] Leila Arras, Grégoire Montavon, Klaus-Robert Müller, and Wojciech Samek, Explaining Recurrent Neural Network Predictions in Sentiment Analysis, 8th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, 2017, ACLWEB
[4] Jingshu Liu, Zachariah Zhang, Narges Razavian, Deep EHR: Chronic Disease Prediction Using Medical Notes, arXiv
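The basic saliency-map recipe of Simonyan et al., the gradient of the model output with respect to the input, can be sketched on a toy model (the setup below is an illustrative assumption; central finite differences stand in for backpropagation):

```python
import numpy as np

rng = np.random.default_rng(0)

# 5 input features; only feature 0 carries signal (a toy assumption).
X = rng.standard_normal((300, 5))
y = X[:, 0] + 0.1 * rng.standard_normal(300)

# "Trained model": a least-squares fit standing in for a trained network.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

def model(x):
    return x @ w

def saliency(x, eps=1e-5):
    """|d model / d x_i| per input feature, via central differences."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        d = np.zeros_like(x); d[i] = eps
        g[i] = (model(x + d) - model(x - d)) / (2 * eps)
    return np.abs(g)

sal = saliency(rng.standard_normal(5))
print(sal.argmax())  # → 0: the informative feature gets the largest saliency
```

For an RNN on EHR notes, the same gradient would be taken with respect to the token embeddings, highlighting which words drive a prediction.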
It is generally a hard problem to determine the weights of the individual tasks in a multitask training setting (TODO: any reference?). The experiments on the MIMIC benchmarks showed that the networks overfit on some tasks earlier than on others.
Is it possible to create an architecture that will automatically modify the weights during training? Something similar to …
 Marcin Andrychowicz, Misha Denil, Sergio Gomez, Matthew W. Hoffman, David Pfau, Tom Schaul, Brendan Shillingford, Nando de Freitas, Learning to learn by gradient descent by gradient descent, 2016, arXiv
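One simple baseline for automatic weight adaptation can be sketched as follows: reweight each task's gradient by its share of the current loss, so tasks that have already converged (and are about to overfit) stop dominating the update. The toy quadratic losses and the weighting heuristic below are illustrative assumptions, not a published method.

```python
import numpy as np

# Two toy task losses sharing parameters w, standing in for the
# benchmark tasks trained in one multitask network.
def losses(w):
    return np.array([(w[0] - 1.0) ** 2, 5.0 * (w[1] + 2.0) ** 2])

def grads(w):
    return np.array([[2.0 * (w[0] - 1.0), 0.0],
                     [0.0, 10.0 * (w[1] + 2.0)]])

w = np.zeros(2)
lr = 0.02
for _ in range(500):
    L = losses(w)
    # Heuristic: task weight proportional to its current loss, so the
    # lagging task always gets the larger share of the update.
    lam = L / (L.sum() + 1e-12)
    w -= lr * lam @ grads(w)

final = losses(w)
print(final)  # both tasks driven to low loss
```

In practice the per-task validation losses, rather than training losses, would be the natural signal to reweight by, since the observed problem is early overfitting on some tasks.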