I moved to the University of Vienna for an assistant professorship in machine learning.
I recently finished a preprint with Gitta Kutyniok, Mones Raslan, and Reinhold Schneider on the approximation of parametric maps by deep neural networks. We demonstrate that, under some technical conditions, the size of a neural network approximating a discretised parametric map does not, or only weakly, depend on the size of the discretisation. Instead, the size of these networks is determined by the size of a reduced basis. In this regard, our result constitutes an approximation result in which the curse of dimension is overcome, and it shows that deep learning techniques can efficiently implement model order reduction techniques.
Together with the participants of the Oberwolfach Seminar: Mathematics of Deep Learning, I wrote a (not entirely serious) paper called "The Oracle of DLPhi", proving that deep learning techniques can perform accurate classifications on test data that is entirely uncorrelated with the training data. This, however, requires a couple of non-standard assumptions such as uncountably many data points and the axiom of choice. In a sense, this shows that mathematical results on machine learning need to be approached with a bit of scepticism.
Felix and I submitted our preprint "Equivalence of approximation by convolutional neural networks and fully-connected networks" to the arXiv. In this note, we establish approximation-theoretical results, i.e., lower and upper bounds on approximation fidelity in terms of the number of parameters, for convolutional neural networks. In practice, convolutional neural networks are used to a much greater extent than standard (fully-connected) neural networks, while, traditionally, mathematical analysis has mostly dealt with standard neural networks. We now show that all classical approximation results for standard neural networks imply very similar approximation results for convolutional neural networks.
I will give a mini-lecture on applied harmonic analysis at the PDE-CDT Summer School 2018 at Ripon College. The lecture notes can be found here.
Felix, Mones, and I just uploaded a preprint on topological properties of sets of functions that are representable by neural networks of fixed size. In this work, we analyse simple set-topological properties, such as convexity, closedness, or density, of the set of networks with a fixed architecture. Quite surprisingly, we found that the topology of this set is not particularly convenient for optimisation. Indeed, for all commonly used activation functions, the sets of networks of fixed size are non-convex (not even weakly), nowhere dense, cannot be stably parametrised, and are not closed with respect to L^p norms. For almost all commonly used activation functions except for the parametric ReLU, the non-closedness extends to the uniform norm. In fact, for the parametric ReLU, the associated spaces are closed with respect to the supremum norm if the architecture has only one hidden layer. When training a network, these properties can lead to many local minima of the minimisation problem, exploding coefficients, and very slow convergence.
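To illustrate non-closedness in L^p and the exploding coefficients it brings, here is a minimal sketch of a standard textbook-style example. It is not necessarily the construction used in the preprint, and all function names are hypothetical. The ReLU network n * (relu(x + 1/n) - relu(x)) has two hidden neurons. On [-1, 1] it converges in L^2 to the step function, which is discontinuous and therefore not a ReLU network of any size, while its outer weight n grows without bound.

```python
# Assumed illustration: a fixed two-neuron ReLU architecture whose L^2 limit
# is a discontinuous step function. All names here are hypothetical.

def relu(x):
    return max(x, 0.0)

def steep_ramp(x, n):
    """Two-neuron ReLU network n * (relu(x + 1/n) - relu(x)).

    It equals 0 for x <= -1/n, rises linearly on [-1/n, 0], and equals 1 for x >= 0.
    The outer weight n is the coefficient that explodes as n grows.
    """
    return n * (relu(x + 1.0 / n) - relu(x))

def step(x):
    """Discontinuous limit function (not representable by a ReLU network)."""
    return 1.0 if x >= 0 else 0.0

def l2_error(n, num_cells=200_000):
    """Midpoint-rule approximation of the L^2([-1, 1]) distance to the step."""
    dx = 2.0 / num_cells
    total = 0.0
    for i in range(num_cells):
        x = -1.0 + (i + 0.5) * dx
        total += (steep_ramp(x, n) - step(x)) ** 2
    return (total * dx) ** 0.5

# The exact distance is sqrt(1 / (3 n)), so it shrinks as the weight n grows.
errors = {n: l2_error(n) for n in (10, 100, 1000)}
for n, err in errors.items():
    print(n, err)
```

The uniform distance between these networks and the step function does not tend to zero, since just left of 0 the ramp is close to 1 while the step is 0. This particular sequence therefore only illustrates the L^p statement.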
I created this webpage after I moved from TU Berlin to U Oxford.