Philipp Petersen

Assistant Professor for Machine Learning
at the Faculty of Mathematics
of the University of Vienna.

Faculty of Mathematics,
University of Vienna,
Oskar-Morgenstern-Platz 1,
1090 Wien,
OMP 07.132

Web Pages: Google Scholar, ResearchGate
CV (last updated on 10 July 2019)

Research interests

My research focuses on the following topics:
  • Approximation theory and structural properties of neural networks
  • Application of deep learning in numerical analysis
  • Applied harmonic analysis, in particular, multiscale systems (wavelets, shearlets, and generalisations)


19 November 2020

I uploaded two preprints. First, in "Exponential ReLU Neural Network Approximation Rates for Point and Edge Singularities", Carlo Marcati, Joost A. A. Opschoor, Christoph Schwab and I studied the extent to which deep neural networks can emulate higher-dimensional hp-FEM. By re-approximating hp-FEM, we show that, for a wide range of functions on bounded but quite general domains, neural networks achieve exponentially fast approximation. In a second preprint with Andrei Caragea and Felix Voigtlaender, called "Neural network approximation and estimation of classifiers with classification boundary in a Barron class", we study the approximation and estimation of high-dimensional functions that have structured singularities. In essence, we show that if the singularities are locally of Barron type, then one can approximate and estimate such functions with rates independent of the underlying dimension.
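The mechanism behind exponentially fast ReLU approximation can be made tangible with a classical construction due to Yarotsky (this is an illustration of the general principle, not code from the preprint): composing a small "hat" subnetwork with itself m times yields an approximation of x² whose sup-norm error decays like 4^{-(m+1)} in the depth m.

```python
import numpy as np

relu = lambda t: np.maximum(t, 0.0)

def hat(x):
    # The sawtooth "hat" function as a tiny ReLU network on [0, 1].
    return 2 * relu(x) - 4 * relu(x - 0.5) + 2 * relu(x - 1.0)

xs = np.linspace(0.0, 1.0, 10_001)
approx = xs.copy()
h = xs.copy()
for m in range(1, 7):
    h = hat(h)                    # m-fold composition: depth grows linearly
    approx = approx - h / 4.0**m  # Yarotsky-type approximation of x^2
    err = np.max(np.abs(approx - xs**2))
    print(f"depth ~ {m}: sup-norm error {err:.2e}")
```

Each extra layer multiplies the error by 1/4, so the accuracy is exponential in the depth, while the number of parameters grows only linearly.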

18 June 2020

Simon, Surbhi, Chao, Song, Matthew, Stephan and I created an online seminar series on the Mathematics of Machine Learning.

03 October 2019

I moved to the University of Vienna for an assistant professorship in machine learning.

02 April 2019

I recently finished a preprint with Gitta Kutyniok, Mones Raslan, and Reinhold Schneider on the approximation of parametric maps by deep neural networks. We demonstrate that, under some technical conditions, the size of a neural network approximating a discretised parametric map depends only weakly, if at all, on the size of the discretisation. Instead, the size of these networks is determined by the size of a reduced basis. In this regard, our results constitute an approximation result in which the curse of dimension is overcome, and they show that deep learning techniques can efficiently implement model order reduction techniques.
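The reduced-basis mechanism underlying this can be illustrated on a toy snapshot problem (a made-up example for illustration only; the parametric family and sizes below are hypothetical and not from the preprint): for a family of functions depending smoothly on a parameter, the singular values of the snapshot matrix decay rapidly, so a small basis captures the whole discretised solution manifold regardless of how fine the discretisation is.

```python
import numpy as np

# Hypothetical smooth parametric family, discretised on a fine grid;
# one snapshot (column) per parameter value mu.
grid = np.linspace(0.0, 1.0, 500)
mus = np.linspace(0.5, 2.0, 60)
S = np.column_stack(
    [np.exp(-mu * grid) * np.sin(2 * np.pi * grid) for mu in mus]
)

# A reduced basis from the truncated SVD of the snapshot matrix.
_, sigma, _ = np.linalg.svd(S, full_matrices=False)
rel_errs = []
for n in (2, 4, 8):
    # Best rank-n reconstruction error in the Frobenius norm (Eckart-Young).
    rel = np.sqrt(np.sum(sigma[n:] ** 2) / np.sum(sigma**2))
    rel_errs.append(rel)
    print(f"reduced basis of size {n}: relative snapshot error {rel:.1e}")
```

The error is governed by the (small) reduced-basis size n, not by the 500 grid points, which is the kind of discretisation-independence the result exploits.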

17 January 2019

Together with the participants of the Oberwolfach Seminar: Mathematics of Deep Learning, I wrote a (not entirely serious) paper called "The Oracle of DLPhi" proving that Deep Learning techniques can perform accurate classifications on test data that is entirely uncorrelated with the training data. This, however, requires a couple of non-standard assumptions, such as uncountably many data points and the axiom of choice. In a sense, this shows that mathematical results on machine learning need to be approached with a bit of scepticism.

04 September 2018

Felix and I submitted our preprint "Equivalence of approximation by convolutional neural networks and fully-connected networks" to the arXiv. In this note, we establish approximation-theoretic results, i.e., lower and upper bounds on approximation fidelity in terms of the number of parameters, for convolutional neural networks. In practice, convolutional neural networks are used to a much greater extent than standard neural networks, while, traditionally, mathematical analysis has mostly dealt with standard neural networks. We show that all classical approximation results for standard neural networks imply very similar approximation results for convolutional neural networks.
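One direction of such a correspondence can be made concrete with a standard observation (an illustration, not the construction used in the preprint): a fully-connected layer is exactly a "valid" convolution whose kernels span the entire input, one output channel per dense output neuron.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out = 8, 3
x = rng.standard_normal(d_in)

# A fully-connected layer: y = W x + b.
W = rng.standard_normal((d_out, d_in))
b = rng.standard_normal(d_out)
dense_out = W @ x + b

# The same map as a "valid" convolution whose kernel covers the whole
# input: the kernel of channel c is the (flipped) c-th row of W.
conv_out = np.array([
    np.convolve(x, W[c, ::-1], mode="valid")[0] + b[c]
    for c in range(d_out)
])

assert np.allclose(dense_out, conv_out)
```

Since the two layers compute identical functions, any approximation achieved by the dense layer is trivially achieved by the convolutional one; the interesting mathematical content lies in the converse and in quantitative parameter counts.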

03 September 2018

I will give a mini-lecture on applied harmonic analysis at the PDE-CDT Summer School 2018 at Ripon College. The lecture notes can be found here.

25 June 2018

Felix, Mones, and I just uploaded a preprint on topological properties of sets of functions that are representable by neural networks of fixed size.
In this work, we analyse simple set-topological properties, such as convexity, closedness, and density, of the set of networks with a fixed architecture. Quite surprisingly, we found that the topology of this set is not particularly convenient for optimisation. Indeed, for all commonly-used activation functions, the sets of networks of fixed size are non-convex (not even weakly), nowhere dense, cannot be stably parametrised, and are not closed with respect to L^p norms. For almost all commonly-used activation functions, the parametric ReLU being an exception, the non-closedness extends to the uniform norm. In fact, for the parametric ReLU, the associated spaces are closed with respect to the supremum norm if the architecture has only one hidden layer.
When training a network, these properties can lead to many local minima of the minimisation problem, exploding coefficients, and very slow convergence.
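The link between non-closedness and exploding coefficients can be seen in a textbook example (a standard illustration, not taken verbatim from the preprint): the two-neuron ReLU networks (σ(x+ε) − σ(x))/ε converge in L¹ to a discontinuous step function, which no (continuous) ReLU network can represent, while the weights blow up like 1/ε.

```python
import numpy as np

relu = lambda t: np.maximum(t, 0.0)
xs = np.linspace(-1.0, 1.0, 100_001)
step = (xs > 0).astype(float)  # Heaviside step: not ReLU-representable

errs = []
for eps in (1e-1, 1e-2, 1e-3):
    # Two-neuron ReLU network with weights of size 1/eps.
    f = (relu(xs + eps) - relu(xs)) / eps
    l1 = np.mean(np.abs(f - step)) * 2.0  # approximate L^1([-1,1]) error
    errs.append(l1)
    print(f"eps = {eps:g}: L1 error ~ {l1:.1e}, largest weight = {1 / eps:g}")
```

The approximation error shrinks like ε/2 while the coefficients grow without bound: a minimising sequence with no limit inside the set, which is exactly the kind of behaviour that destabilises training.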

25 June 2018

I created this webpage after I moved from TU Berlin to U Oxford.