
JSAIT Cover Art, September 22, 2022
Deep Learning Methods for Inverse Problems
Guest editors
Paul Hand
Reinhard Heckel
Jonathan Scarlett

Deep learning methods have emerged as highly successful tools for solving inverse problems. They achieve state-of-the-art performance on tasks such as image denoising, inpainting, super-resolution, and compressive sensing. They are also starting to be used for inverse problems beyond imaging, including problems arising in communications and signal processing, and even problems on non-Euclidean data such as graphs. However, a wide range of important theoretical and practical questions remain unresolved or entirely open, including precise theoretical guarantees for signal recovery, robustness and out-of-distribution generalization, architecture design, and domain-specific applications and challenges. This special issue aims to advance cutting-edge research in this area, with an emphasis on its intersection with information theory.

Jonathan Scarlett    Reinhard Heckel    Miguel R. D. Rodrigues    Paul Hand    Yonina C. Eldar

In recent years, there have been significant advances in the use of deep learning methods in inverse problems such as denoising, compressive sensing, inpainting, and super-resolution. While this line of work has predominantly been driven by practical algorithms and experiments, it has also given rise to a variety of intriguing theoretical problems. In this paper, we survey some of the prominent theoretical developments in this area, focusing in particular on generative priors, untrained neural network priors, and unfolding algorithms. In addition to summarizing existing results on these topics, we highlight several ongoing challenges and open problems.
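
As a point of reference for the generative-prior results surveyed here, recovery from compressive measurements $y = Ax + e$ with a trained generator $G$ is typically posed as an optimization over the latent code (the formulation popularized by Bora et al.); the display below is a standard sketch of that objective rather than anything specific to this survey:

$$ \hat{z} \in \arg\min_{z} \; \big\| A\,G(z) - y \big\|_2^2, \qquad \hat{x} = G(\hat{z}). $$

Untrained-network priors instead optimize the weights of a randomly initialized network rather than a latent code, and unfolding algorithms truncate an iterative solver and train its per-iteration parameters end to end.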

Chen Xu    Xiuyuan Cheng    Yao Xie

Graph prediction problems are prevalent in data analysis and machine learning. The inverse prediction problem, namely inferring input data from given output labels, is of emerging interest in various applications. In this work, we develop the invertible graph neural network (iGNN), a deep generative model that tackles the inverse prediction problem on graphs by casting it as a conditional generative task. The proposed model consists of an invertible sub-network that maps one-to-one from data to an intermediate encoded feature, which allows forward prediction by a linear classification sub-network as well as efficient generation from output labels via a parametric mixture model. The invertibility of the encoding sub-network is ensured by a Wasserstein-2 regularization, which allows free-form layers in the residual blocks. The model scales to large graphs via a factorized parametric mixture model of the encoded feature and remains computationally efficient through the use of GNN layers. The existence of an invertible flow mapping is backed by the theories of optimal transport and diffusion processes, and we prove that graph convolution layers are expressive enough to approximate the theoretical flows on graph data. The proposed iGNN model is examined experimentally on synthetic data, including an example on large graphs, and its empirical advantage is demonstrated on real-world datasets of solar ramping events and traffic flow anomaly detection.
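
The sketch below illustrates the kind of residual encoding block and transport-cost (Wasserstein-2-style) penalty described above; the layer sizes, the use of plain linear layers in place of GNN layers, and the penalty weight are illustrative assumptions, not the authors' exact iGNN implementation.

```python
# Sketch (PyTorch): a free-form residual block whose per-block "movement" is
# penalized, in the spirit of a Wasserstein-2 regularizer that encourages an
# invertible flow without constraining the Lipschitz constant of each layer.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualFlowBlock(nn.Module):
    def __init__(self, dim, hidden=64):
        super().__init__()
        # free-form residual map g; in iGNN these layers can be GNN layers
        self.g = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))

    def forward(self, x):
        delta = self.g(x)            # residual update
        return x + delta, delta      # encoded feature and the movement

def training_loss(blocks, classifier, x, labels, w2_weight=0.1):
    """Classification loss plus a transport-cost penalty on each residual step."""
    penalty = 0.0
    for block in blocks:
        x, delta = block(x)
        penalty = penalty + delta.pow(2).sum(dim=-1).mean()   # squared movement per block
    logits = classifier(x)                                    # linear read-out sub-network
    return F.cross_entropy(logits, labels) + w2_weight * penalty
```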

Shirin Shoushtari    Jiaming Liu    Yuyang Hu    Ulugbek S. Kamilov

There is a growing interest in deep model-based architectures (DMBAs) for solving imaging inverse problems by combining physical measurement models and learned image priors specified using convolutional neural nets (CNNs). For example, well-known frameworks for systematically designing DMBAs include plug-and-play priors (PnP), deep unfolding (DU), and deep equilibrium models (DEQ). While the empirical performance and theoretical properties of DMBAs have been widely investigated, existing work in the area has primarily focused on their performance when the desired image prior is known exactly. This work addresses that gap by providing new theoretical and numerical insights into DMBAs under mismatched CNN priors. Mismatched priors arise naturally when there is a distribution shift between training and testing data, for example, when test images come from a different distribution than the images used for training the CNN prior. They also arise when the CNN prior used for inference is an approximation of some desired statistical estimator (MAP or MMSE). Our theoretical analysis provides explicit error bounds on the solution due to mismatched CNN priors under a set of clearly specified assumptions. Our numerical results compare the empirical performance of DMBAs under realistic distribution shifts and approximate statistical estimators.
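
For orientation, the plug-and-play pattern analyzed here replaces the proximal step of an iterative method with a learned denoiser; in the minimal sketch below, the forward operator `A`, its adjoint `At`, and the pretrained denoiser are placeholders, and the paper's bounds concern what happens when that denoiser was trained on a shifted distribution or only approximates a MAP/MMSE estimator.

```python
# Sketch: PnP proximal-gradient iteration with a plugged-in CNN denoiser.
# `A`, `At` (its adjoint), and `denoiser` are assumed to be supplied by the application.
import numpy as np

def pnp_proximal_gradient(y, A, At, denoiser, step, n_iter=100, x0=None):
    x = At(y) if x0 is None else x0
    for _ in range(n_iter):
        grad = At(A(x) - y)              # gradient of the data-fidelity term
        x = denoiser(x - step * grad)    # learned prior replaces the proximal map
    return x
```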

Jonathan Sauder    Martin Genzel    Peter Jung

Countless signal processing applications involve the reconstruction of signals from a few indirect linear measurements. The design of effective measurement operators is typically constrained by the underlying hardware and physics, posing a challenging and often even discrete optimization task. While the potential of gradient-based learning via the unrolling of iterative recovery algorithms has been demonstrated, it has remained unclear how to leverage this technique when the set of admissible measurement operators is structured and discrete. We tackle this problem by combining unrolled optimization with Gumbel reparametrizations, which enable the computation of low-variance gradient estimates for categorical random variables. Our approach is formalized by GLODISMO (Gradient-based Learning of DIscrete Structured Measurement Operators). This novel method is easy to implement, computationally efficient, and extendable thanks to its compatibility with automatic differentiation. We empirically demonstrate the performance and flexibility of GLODISMO in several prototypical signal recovery applications, verifying that the learned measurement matrices outperform conventional designs based on randomization as well as discrete optimization baselines.
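
The Gumbel reparametrization at the heart of GLODISMO can be illustrated with PyTorch's built-in relaxed categorical sampler; the sketch below learns which entry of each measurement row to activate (a simple, hypothetical instance of a structured, discrete design space) and is not the paper's full method, which trains such a selector jointly with an unrolled recovery algorithm.

```python
# Sketch (PyTorch): learning a discrete selection-type measurement operator with
# Gumbel-softmax samples; gradients reach the logits via the straight-through trick.
import torch
import torch.nn.functional as F

class LearnedRowSelector(torch.nn.Module):
    def __init__(self, m, n):
        super().__init__()
        self.logits = torch.nn.Parameter(torch.zeros(m, n))   # one categorical per row

    def forward(self, x, tau=1.0):
        # hard=True yields one-hot rows in the forward pass with a soft gradient estimate
        S = F.gumbel_softmax(self.logits, tau=tau, hard=True)  # (m, n) 0/1 selection matrix
        return S @ x                                           # y = Sx, differentiable in the logits
```

Training would then backpropagate an end-to-end reconstruction loss through an unrolled recovery algorithm and through these samples, so that the reconstruction error drives the choice of measurements.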

Alireza Naderi    Yaniv Plan

We study the problem of reconstructing a high-dimensional signal $\mathrm{x} \in \mathbb{R}^{n}$ from a low-dimensional noisy linear measurement $\mathrm{y}=\mathrm{M}\mathrm{x}+\mathrm{e} \in \mathbb{R}^{\ell}$, assuming $\mathrm{x}$ admits a certain structure. We model the measurement matrix as $\mathrm{M} = \mathrm{B}\mathrm{A}$, with arbitrary $\mathrm{B} \in \mathbb{R}^{\ell \times m}$ and sub-gaussian $\mathrm{A} \in \mathbb{R}^{m \times n}$, thereby allowing for a family of random measurement matrices which may have heavy tails, dependent rows and columns, and a large dynamic range for the singular values. The structure is either given as a non-convex cone $T \subset \mathbb{R}^{n}$, or is induced via minimizing a given convex function $f(\cdot)$; hence our study is sparsity-free. We prove, in both cases, that an approximate empirical risk minimizer robustly recovers the signal if the effective number of measurements is sufficient, even in the presence of a model mismatch, i.e., when the signal does not exactly admit the model's structure. While in classical compressed sensing the number of independent (sub-)gaussian measurements regulates the possibility of a robust reconstruction, in our setting the effective number of measurements depends on the properties of $\mathrm{B}$. We show that, in this model, the stable rank of $\mathrm{B}$ indicates the effective number of measurements, and an accurate recovery is guaranteed whenever it exceeds, to within a constant factor, the effective dimension of the structure set. We apply our results to the special case of generative priors, i.e., when $\mathrm{x}$ is close to the range of a Generative Neural Network (GNN) with ReLU activation functions. Also, if the GNN has random weights in the last layer, our theory allows a partial Fourier measurement matrix, thus taking the first step towards a theoretical analysis of compressed sensing MRI with GNNs. Our work relies on a recent result in random matrix theory by Jeong et al. (2020).
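
For concreteness, the stable rank appearing above is the standard quantity

$$ \operatorname{sr}(\mathrm{B}) \;=\; \frac{\|\mathrm{B}\|_F^{2}}{\|\mathrm{B}\|_{\mathrm{op}}^{2}}, $$

and the guarantee stated in the abstract can be read, up to constant factors, as requiring $\operatorname{sr}(\mathrm{B})$ to exceed the effective dimension of the structure set $T$, playing the role that the raw measurement count plays in classical compressed sensing.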

Aaron Berk    Simone Brugiapaglia    Babhru Joshi    Yaniv Plan    Matthew Scott    Özgür Yilmaz

In Bora et al. (2017), a mathematical framework was developed for compressed sensing guarantees in the setting where the measurement matrix is Gaussian and the signal structure is the range of a generative neural network (GNN). The problem of compressed sensing with GNNs has since been extensively analyzed when the measurement matrix and/or network weights follow a subgaussian distribution. We move beyond the subgaussian assumption to measurement matrices that are derived by sampling uniformly at random rows of a unitary matrix (including subsampled Fourier measurements as a special case). Specifically, we prove the first known restricted isometry guarantee for generative compressed sensing (GCS) with subsampled isometries, and provide recovery bounds with nearly order-optimal sample complexity, addressing an open problem posed in Scarlett et al. (2022, p. 10). Recovery efficacy is characterized by the coherence, a new parameter that measures the interplay between the range of the network and the measurement matrix. Our approach relies on subspace counting arguments and ideas central to high-dimensional probability. Furthermore, we propose a regularization strategy for training GNNs to have favourable coherence with the measurement operator. We provide compelling numerical simulations that support this regularized training strategy: our strategy yields low-coherence networks that require fewer measurements for signal recovery. This, together with our theoretical results, supports coherence as a natural quantity for characterizing GCS with subsampled isometries.
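
In this setting the measurements take the form $y = \mathrm{P}_{\Omega}\mathrm{U}\,G(z_{\ast}) + e$, where $\mathrm{U}$ is a unitary matrix (the DFT matrix in the subsampled Fourier case), $\mathrm{P}_{\Omega}$ retains a random subset $\Omega$ of its rows, and $G$ is the generative network; a standard recovery formulation, written here only for orientation, is

$$ \hat{z} \in \arg\min_{z} \big\| \mathrm{P}_{\Omega}\mathrm{U}\,G(z) - y \big\|_2^{2}, $$

with the paper's coherence parameter quantifying how well the range of $G$ is spread out with respect to the rows of $\mathrm{U}$ (its precise definition is given in the paper).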

Huan Liu    George Zhang    Jun Chen    Ashish Khisti

We study an extension of lossy compression in which the reconstruction is subject to a distribution constraint that can differ from the source distribution. We formulate our setting as a generalization of optimal transport with an entropy bottleneck to account for the rate constraint due to compression. We provide expressions for the tradeoff between compression rate and the achievable distortion with and without shared common randomness between the encoder and decoder. We study the examples of binary, uniform, and Gaussian sources (in an asymptotic setting) in detail and demonstrate that shared randomness can strictly improve the tradeoff. For the case without common randomness and squared-Euclidean distortion, we show that the optimal solution partially decouples into the problems of optimal compression and transport, and we characterize the penalty associated with fully decoupling them. We provide experimental results, obtained by training deep learning end-to-end compression systems for denoising on the SVHN (Street View House Numbers) dataset and super-resolution on the MNIST (Modified National Institute of Standards and Technology) dataset, that suggest consistency with our theoretical results. Our code is available at https://github.com/liuh127/Cross_domain_LC.
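
Schematically, and only as orientation (the paper gives the precise expressions), the common-randomness variant of this tradeoff for squared-Euclidean distortion can be read as a rate-constrained optimal transport problem between the source law $P_X$ and the prescribed reconstruction law $Q_Y$,

$$ D(R) \;=\; \min_{P_{Y|X}:\; Y \sim Q_Y,\; I(X;Y) \le R} \; \mathbb{E}\big[\|X - Y\|^{2}\big]; $$

the paper also characterizes the case without common randomness, where the achievable tradeoff can be strictly worse.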

Saurav K. Shastri    Rizwan Ahmad    Christopher A. Metzler    Philip Schniter

To solve inverse problems, plug-and-play (PnP) methods replace the proximal step in a convex optimization algorithm with a call to an application-specific denoiser, often implemented using a deep neural network (DNN). Although such methods yield accurate solutions, they can be improved. For example, denoisers are usually designed/trained to remove white Gaussian noise, but the denoiser input error in PnP algorithms is usually far from white or Gaussian. Approximate message passing (AMP) methods provide white and Gaussian denoiser input error, but only when the forward operator is sufficiently random. In this work, for Fourier-based forward operators, we propose a PnP algorithm based on generalized expectation-consistent (GEC) approximation—a close cousin of AMP—that offers predictable error statistics at each iteration, as well as a new DNN denoiser that leverages those statistics. We apply our approach to magnetic resonance (MR) image recovery and demonstrate its advantages over existing PnP and AMP methods.
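
A typical Fourier-based forward operator of the kind targeted here is a masked (subsampled) FFT; the sketch below is a generic single-coil-MRI-style placeholder, not the paper's GEC algorithm, and the final comment indicates how a denoiser could consume the predicted error statistics.

```python
# Sketch: masked-Fourier forward operator and adjoint (single-coil MRI style).
# `mask` is a 0/1 k-space sampling pattern; both functions are placeholders.
import numpy as np

def fourier_forward(x, mask):
    """y = M F x : centered 2-D FFT followed by k-space subsampling."""
    k = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(x), norm="ortho"))
    return mask * k

def fourier_adjoint(y, mask):
    """x = F^H M^H y : zero-fill unsampled locations and invert the FFT."""
    return np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(mask * y), norm="ortho"))

# In a PnP/GEC-style loop, a denoiser that "leverages the statistics" would be
# called as denoiser(x_k, sigma_k), with sigma_k the predicted error level at
# iteration k, rather than assuming white Gaussian noise of a fixed level.
```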

Brandon Y. Feng    Mingyang Xie    Christopher A. Metzler

We present a self-supervised and self-calibrating multi-shot approach to imaging through atmospheric turbulence, called TurbuGAN. Our approach requires no paired training data, adapts itself to the distribution of the turbulence, leverages domain-specific data priors, and can generalize from tens to thousands of measurements. We achieve such functionality through an adversarial sensing framework adapted from CryoGAN (Gupta et al. 2021), which uses a discriminator network to match the distributions of captured and simulated measurements. Our framework builds on CryoGAN by (1) generalizing the forward measurement model to incorporate physically accurate and computationally efficient models for light propagation through anisoplanatic turbulence, (2) enabling adaptation to slightly misspecified forward models, and (3) leveraging domain-specific prior knowledge using pretrained generative networks, when available. We validate TurbuGAN on both computationally simulated and experimentally captured images distorted with anisoplanatic turbulence.
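
The adversarial-sensing loop can be sketched as follows; `image_model` (the current image estimate, possibly reparameterized by a pretrained generator), `turbulence_forward` (a stochastic anisoplanatic-turbulence simulator), and `disc` are placeholders, so this is a schematic of the CryoGAN-style objective rather than TurbuGAN's exact training procedure.

```python
# Sketch (PyTorch): one adversarial-sensing step. The image estimate is updated so
# that simulated measurements become indistinguishable from the captured ones.
import torch
import torch.nn.functional as F

def adversarial_sensing_step(captured, image_model, turbulence_forward, disc,
                             opt_image, opt_disc):
    simulated = turbulence_forward(image_model())      # simulate measurements of the estimate

    # discriminator update: separate captured from simulated measurements
    d_real, d_fake = disc(captured), disc(simulated.detach())
    d_loss = F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real)) \
           + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake))
    opt_disc.zero_grad(); d_loss.backward(); opt_disc.step()

    # image update: fool the discriminator (distribution matching of measurements)
    g_fake = disc(simulated)
    g_loss = F.binary_cross_entropy_with_logits(g_fake, torch.ones_like(g_fake))
    opt_image.zero_grad(); g_loss.backward(); opt_image.step()
    return d_loss.item(), g_loss.item()
```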

Akash S. Doshi    Manan Gupta    Jeffrey G. Andrews

Future wireless systems are trending towards higher carrier frequencies that offer larger communication bandwidth but necessitate the use of large antenna arrays. Signal processing techniques for channel estimation currently deployed in wireless devices do not scale well to this “high-dimensional” regime in terms of performance and pilot overhead. Meanwhile, training deep learning-based approaches for channel estimation requires large labeled datasets mapping pilot measurements to clean channel realizations, which can only be generated offline using simulated channels. In this paper, we develop a novel unsupervised over-the-air (OTA) algorithm that utilizes noisy received pilot measurements to train a deep generative model to output beamspace MIMO channel realizations. Our approach leverages Generative Adversarial Networks (GAN), while using a conditional input to distinguish between Line-of-Sight (LOS) and Non-Line-of-Sight (NLOS) channel realizations. We also present a federated implementation of the OTA algorithm that distributes the GAN training over multiple users and greatly reduces the user side computation. We then formulate channel estimation from a limited number of pilot measurements as an inverse problem and reconstruct the channel by optimizing the input vector of the trained generative model. Our proposed approach significantly outperforms Orthogonal Matching Pursuit on both LOS and NLOS channel models, and EM-GM-AMP – an Approximate Message Passing algorithm – on LOS channel models, while achieving comparable performance on NLOS channel models in terms of the normalized channel reconstruction error. More importantly, our proposed framework has the potential to be trained online using real noisy pilot measurements, is not restricted to a specific channel model and can even be utilized for a federated OTA design of a dataset generator from noisy data.
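
The final reconstruction step described above amounts to latent-space optimization against the pilot measurements; in the sketch below, the trained generator `G`, the pilot matrix `A`, and the received vector `y` are assumed given, tensors are kept real-valued for simplicity (channels are complex-valued in practice), and the LOS/NLOS conditional input is omitted.

```python
# Sketch (PyTorch): channel estimation by optimizing the latent input of a trained
# generative model so that the generated channel matches the received pilots.
import torch

def estimate_channel(y, A, G, latent_dim, n_steps=500, lr=1e-2):
    z = torch.zeros(latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(n_steps):
        loss = torch.sum((A @ G(z) - y) ** 2)   # pilot-domain misfit
        opt.zero_grad(); loss.backward(); opt.step()
    return G(z).detach()                        # reconstructed beamspace channel
```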

Karl Chahine    Rajesh Mishra    Hyeji Kim

Designing reliable codes for channels with feedback, which has significant theoretical and practical importance, is one of the long-standing open problems in coding theory. While there are numerous prior works on analytical codes for channels with feedback, the majority focus on channels with noiseless output feedback, where the optimal coding scheme is still unknown. For channels with noisy feedback, deriving analytical codes becomes even more challenging, and much less is known. Recently, it has been shown that deep learning can, in part, address these challenges and lead to the discovery of new codes for channels with noisy output feedback. Despite this success, three important open problems remain: (a) deep learning-based codes mainly focus on the passive feedback setup, which is shown to be worse than the active feedback setup; (b) deep learning-based codes are hard to interpret or analyze; and (c) they have not been successfully demonstrated over the air on channels with feedback. We address these three challenges. First, we present a learning-based framework for designing codes for channels with active feedback. Second, we analyze the latent features of the learned codes to devise an analytical coding scheme. We show that the approximated analytical code is a non-trivial variation of the state-of-the-art codes, demonstrating that deep learning is a powerful tool for deriving new analytical communication schemes for challenging communication scenarios. Finally, we demonstrate the over-the-air performance of our neural codes by building a wireless testbed consisting of two separate N200 USRPs operating as the transmitter and the receiver. To the best of our knowledge, this is the first over-the-air hardware implementation of neural codes for interactive channels.
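
For orientation, the passive and active feedback setups mentioned above can be written in the standard form (the specific noise statistics and rates are as in the paper): the transmitter sends $x_t$, the receiver observes $y_t = x_t + n_t$, and the transmitter's next symbol is $x_{t+1} = f_{t+1}(m, z_1, \dots, z_t)$ for message $m$, where the feedback is

$$ z_t = \begin{cases} y_t + \tilde{n}_t & \text{(passive feedback: received symbols are relayed as-is),} \\ g_t(y_1, \dots, y_t) + \tilde{n}_t & \text{(active feedback: the receiver encodes what it feeds back).} \end{cases} $$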

Emre Ozfatura    Yulin Shao    Alberto G. Perotti    Branislav M. Popović    Deniz Gündüz

Deep neural network (DNN)-based channel code designs have recently gained interest as an alternative to conventional coding schemes, particularly for channels in which existing codes do not provide satisfactory performance. Coding in the presence of feedback is one such problem, for which promising results have recently been obtained by various DNN-based coding architectures. In this paper, we introduce a novel learning-aided feedback code design, dubbed generalized block attention feedback (GBAF) codes, that achieves orders-of-magnitude improvements in block error rate (BLER) compared to existing solutions. Sequence-to-sequence encoding and block-by-block processing of the message bits are the two important design principles of the GBAF codes, which not only reduce the communication overhead, due to fewer interactions between the transmitter and receiver, but also enable flexible coding rates. GBAF codes also have a modular structure that can be implemented using different neural network architectures. In this work, we employ the popular transformer architecture, which outperforms all prior DNN-based code designs in terms of BLER in the low signal-to-noise ratio regime when the feedback channel is noiseless.