Implicitly defined, continuous, differentiable signal representations parameterized by neural networks have emerged as a powerful paradigm, offering many possible benefits over conventional representations. However, current network architectures for such implicit neural representations are incapable of modeling signals with fine detail, and fail to represent a signal's spatial and temporal derivatives, despite the fact that these are essential to many physical signals defined implicitly as the solution to partial differential equations. We propose to leverage periodic activation functions for implicit neural representations and demonstrate that these networks, dubbed sinusoidal representation networks or SIREN, are ideally suited for representing complex natural signals and their derivatives. We analyze SIREN activation statistics to propose a principled initialization scheme and demonstrate the representation of images, wavefields, video, sound, and their derivatives. Further, we show how SIRENs can be leveraged to solve challenging boundary value problems, such as particular Eikonal equations (yielding signed distance functions), the Poisson equation, and the Helmholtz and wave equations. Lastly, we combine SIREN with hypernetworks to learn priors over the space of SIREN functions.
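
As a rough illustration of what a sine-activated MLP looks like, here is a minimal NumPy sketch of a forward pass. The page does not state the initialization bounds, so the values below are assumptions: first-layer weights drawn from U(-1/n, 1/n), later layers from U(-sqrt(6/n)/omega_0, sqrt(6/n)/omega_0) with n the layer's input width, and a frequency factor `omega_0 = 30`. Drawing the biases from the same bounds is a simplification made here, and the names `init_siren` and `siren_forward` are hypothetical.

```python
import numpy as np

def init_siren(layer_sizes, omega_0=30.0, seed=0):
    """Return a list of (W, b) pairs, one per layer.

    The uniform bounds are assumptions (see the text above), not values
    taken from this page.
    """
    rng = np.random.default_rng(seed)
    params = []
    for i, (fan_in, fan_out) in enumerate(zip(layer_sizes[:-1], layer_sizes[1:])):
        if i == 0:
            bound = 1.0 / fan_in                      # first layer
        else:
            bound = np.sqrt(6.0 / fan_in) / omega_0   # later layers
        W = rng.uniform(-bound, bound, size=(fan_in, fan_out))
        b = rng.uniform(-bound, bound, size=fan_out)  # simplification: same bound
        params.append((W, b))
    return params

def siren_forward(params, x, omega_0=30.0):
    """Map a batch of coordinates x of shape (batch, in_dim) to outputs."""
    h = x
    for W, b in params[:-1]:
        h = np.sin(omega_0 * (h @ W + b))  # periodic activation
    W, b = params[-1]
    return h @ W + b                       # linear output layer

# Example: 2D coordinates in, one scalar out.
params = init_siren([2, 16, 16, 1])
outputs = siren_forward(params, np.zeros((5, 2)))
```

The final layer is left linear so the output is not confined to [-1, 1]; whether that matches a given SIREN variant is a design choice to check against the reference code.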

The following results compare SIREN to a variety of network architectures. **TanH**, **ReLU**,
**Softplus**, etc. each denote an MLP of equal size with the respective nonlinearity.
We also compare to the recently proposed positional encoding, combined with a ReLU nonlinearity, denoted
**ReLU P.E.**
SIREN outperforms all baselines by a significant margin, converges significantly faster, and is the only
architecture that accurately represents the gradients of the signal, enabling its use to solve boundary
value problems.

A Siren that maps 2D pixel coordinates to a color may be used to parameterize images. Here, we supervise Siren directly with ground-truth pixel values. Siren not only fits the image with a 10 dB higher PSNR and in significantly fewer iterations than all baseline architectures, but is also the only MLP that accurately represents the first- and second-order derivatives of the image.
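
To make the image setup concrete, the sketch below builds the pixel-coordinate inputs and computes PSNR, the metric quoted above. It is an illustration only: normalizing coordinates to [-1, 1] and assuming pixel values in [0, 1] (`max_value=1.0`) are assumptions made here, and `pixel_grid` and `psnr` are hypothetical helper names.

```python
import numpy as np

def pixel_grid(height, width):
    """Return an array of shape (height * width, 2) with (y, x) coordinates.

    Coordinates are normalized to [-1, 1]; this range is an assumption.
    """
    ys = np.linspace(-1.0, 1.0, height)
    xs = np.linspace(-1.0, 1.0, width)
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    return np.stack([yy.ravel(), xx.ravel()], axis=-1)

def psnr(prediction, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two arrays of equal shape."""
    mse = np.mean((prediction - target) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / mse)

# Each row of `coords` is one network input; the matching pixel color
# is the supervision target for that row.
coords = pixel_grid(4, 3)
```

Because PSNR is logarithmic in the mean squared error, a 10 dB gain corresponds to a tenfold reduction in mean squared error.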

A Siren with a single, time-coordinate input and scalar output may parameterize audio signals. Siren is the only network architecture that succeeds in reproducing the audio signal, both for music and human voice.

A Siren with pixel coordinates together with a time coordinate can be used to parameterize a video. Here, Siren is directly supervised with the ground-truth pixel values, and parameterizes video significantly better than a ReLU MLP.

By supervising only the derivatives of Siren, we can solve Poisson's equation. Siren is again the only architecture that fits image, gradient, and Laplacian domains accurately and swiftly.
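
Supervising derivatives requires the gradient of the network output with respect to its input coordinates. The sketch below computes that gradient for a sine MLP by hand with the chain rule and wraps it in a simple squared-error loss on gradients. It is a sketch under stated assumptions, not the loss used on this page: `siren_value_and_grad` and `gradient_loss` are hypothetical names, `omega_0 = 30` is assumed, and in practice an automatic differentiation framework would replace the manual Jacobian.

```python
import numpy as np

def siren_value_and_grad(params, x, omega_0=30.0):
    """Return f(x) and df/dx for a scalar-output sine MLP at one point x.

    params is a list of (W, b) pairs with W of shape (fan_in, fan_out);
    x has shape (in_dim,).
    """
    h = x
    jac = np.eye(x.shape[0])                  # dh/dx, identity at the input
    for W, b in params[:-1]:
        z = omega_0 * (h @ W + b)             # pre-activation, shape (fan_out,)
        h = np.sin(z)
        # Chain rule: d sin(z)/dx = cos(z) * omega_0 * W^T (dh_prev/dx)
        jac = (np.cos(z)[:, None] * omega_0) * (W.T @ jac)
    W, b = params[-1]
    value = h @ W + b                         # shape (1,)
    grad = W.T @ jac                          # shape (1, in_dim)
    return value[0], grad[0]

def gradient_loss(params, points, target_grads, omega_0=30.0):
    """Mean squared error between network gradients and target gradients."""
    grads = np.stack(
        [siren_value_and_grad(params, p, omega_0)[1] for p in points]
    )
    return np.mean(np.sum((grads - target_grads) ** 2, axis=-1))
```

Note that the derivative of a sine layer is a cosine, that is, a phase-shifted sine, so the gradient of such a network is again built from smooth sinusoids.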


We can recover an SDF from a point cloud and surface normals by solving the Eikonal
equation,
a first-order boundary value problem. SIREN can recover a room-scale scene given only its point cloud
and surface normals, accurately reproducing fine detail, in less than an hour of training.
In contrast to recent work on combining voxel grids with neural implicit representations,
SIREN stores the full scene in the weights of a single, 5-layer neural network, with no 2D or 3D
convolutions, and orders of magnitude fewer parameters.
**Note that these SDFs are not supervised with ground-truth SDF / occupancy values, but rather, are the
result of solving the above Eikonal boundary value problem. This is a significantly harder task,
which requires supervision in the gradient domain (see paper). As a result, architectures whose gradients
are not well-behaved perform worse than SIREN.**
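
The Eikonal constraint requires the gradient of a signed distance function to have unit norm everywhere. The sketch below measures how far a candidate function is from satisfying that constraint, using an analytic sphere SDF as a stand-in for a trained network. It illustrates only the gradient-norm term; the boundary terms on the point cloud and normals are omitted, central finite differences replace automatic differentiation, and the function names are hypothetical.

```python
import numpy as np

def sphere_sdf(points, radius=1.0):
    """Exact signed distance to a sphere centered at the origin."""
    return np.linalg.norm(points, axis=-1) - radius

def numerical_gradient(f, points, eps=1e-4):
    """Central-difference gradient of f at each row of points."""
    grads = np.zeros_like(points)
    for axis in range(points.shape[-1]):
        offset = np.zeros(points.shape[-1])
        offset[axis] = eps
        grads[:, axis] = (f(points + offset) - f(points - offset)) / (2.0 * eps)
    return grads

def eikonal_residual(f, points):
    """Mean absolute deviation of the gradient norm of f from 1."""
    grad_norm = np.linalg.norm(numerical_gradient(f, points), axis=-1)
    return np.mean(np.abs(grad_norm - 1.0))

# Sample points away from the origin, where the sphere SDF is smooth.
rng = np.random.default_rng(0)
points = rng.uniform(0.5, 2.0, size=(100, 3))
residual = eikonal_residual(sphere_sdf, points)  # close to zero for a true SDF
```

A function such as `lambda p: 2.0 * sphere_sdf(p)` has the same zero level set but a gradient norm of 2, so its residual is close to 1; this is why a penalty of this kind constrains the function beyond just locating the surface.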

Here, we use Siren to solve the inhomogeneous Helmholtz equation. ReLU- and Tanh-based architectures fail entirely to converge to a solution.

In the time domain, Siren succeeds in solving the wave equation, while a Tanh-based architecture fails to discover the correct solution.

Check out our related projects on the topic of implicit neural representations!

We identify a key relationship between generalization across implicit neural representations and
meta-learning, and propose to leverage gradient-based meta-learning for learning priors over deep signed distance
functions. This allows us to reconstruct SDFs an order of magnitude faster than the auto-decoder framework,
with no loss in performance!

A continuous, 3D-structure-aware neural scene representation that encodes both geometry and appearance,
is supervised only in 2D via a neural renderer, and generalizes to 3D reconstruction from a single posed 2D image.

We demonstrate that the features learned by neural implicit scene representations are useful for downstream
tasks, such as semantic segmentation, and propose a model that can learn to perform continuous 3D
semantic segmentation on a class of objects (such as chairs) given only a single, 2D (!) semantic label map!

@inproceedings{sitzmann2019siren,
author = {Sitzmann, Vincent
and Martel, Julien N.P.
and Bergman, Alexander W.
and Lindell, David B.
and Wetzstein, Gordon},
title = {Implicit Neural Representations
with Periodic Activation Functions},
booktitle = {arXiv},
year={2020}
}