I am studying for a PhD in Intelligent Cinematography (Yr. 3/3.5). My research focuses on dynamic 3-D reconstruction from video for cinematographic applications. I have worked on NeRFs and Gaussian Splatting for both dynamic scene reconstruction and scene relighting tasks. The project is supervised by Dave Bull and Pui Anantrasirichai and funded by My World.

I completed an MEng in Electronic Engineering with AI (First Class) at the University of Southampton. My dissertation, supervised by Mark Weal, focused on describing philosophical, legal and social decision-making frameworks with machine learning algorithms.


Splatography: Sparse multi-view dynamic Gaussian Splatting for film-making challenge (technical)

International Conference on 3D Vision 2026 (3DV) - page - code - Jan-Aug 2025

Deformable Gaussian Splatting (GS) accomplishes photorealistic dynamic 3-D reconstruction from dense multi-view video (MVV) by learning to deform a canonical GS representation. However, in filmmaking, tight budgets can result in sparse camera configurations, which limits state-of-the-art (SotA) methods when capturing complex dynamic features. To address this issue, we introduce an approach that splits the canonical Gaussians and deformation field into foreground and background components using a sparse set of masks for frames at t=0. Each representation is trained separately with different loss functions during canonical pre-training. Then, during dynamic training, different parameters are modeled for each deformation field following common filmmaking practices. The foreground stage contains diverse dynamic features, so changes in color, position and rotation are learned. The background, which contains film crew and equipment, is typically dimmer and less dynamic, so only changes in point position are learned. Experiments on 3-D and 2.5-D entertainment datasets show that our method produces SotA qualitative and quantitative results, with up to 3 dB higher PSNR at half the model size on 3-D scenes. Unlike the SotA, and without the need for dense mask supervision, our method also produces segmented dynamic reconstructions including transparent and dynamic textures.

WavePlanes: A compact Wavelet representation for Dynamic Neural Radiance Fields (technical)

arXiv - page - code - July-Nov 2024

Dynamic Novel View Synthesis (Dynamic NVS) enhances NVS technologies to model moving 3-D scenes. However, current methods are resource intensive and challenging to compress. To address this, we present WavePlanes, a fast and more compact hex plane representation, applicable to both dynamic Neural Radiance Fields and Gaussian Splatting methods. Rather than modeling many feature scales separately (as done previously), we use the inverse discrete wavelet transform to reconstruct features at varying scales. This leads to a more compact representation and allows us to explore wavelet-based compression schemes for further gains. The proposed compression scheme exploits the sparsity of wavelet coefficients by applying hard thresholding to the wavelet planes and storing the nonzero coefficients and their locations on each plane in a hash map. Compared to the state-of-the-art (SotA), WavePlanes is significantly smaller, less resource demanding and competitive in reconstruction quality. Compared to small SotA models, WavePlanes performs better in both model size and quality of novel views.
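The thresholding-and-hash-map idea described above can be illustrated with a minimal sketch. This is not the WavePlanes implementation: the function names, the nested-list plane layout and the threshold value are all assumptions made for illustration, and the wavelet transform itself is omitted (the input is assumed to already be a plane of wavelet coefficients).

```python
from typing import Dict, List, Tuple

Plane = List[List[float]]
SparsePlane = Dict[Tuple[int, int], float]


def compress_plane(plane: Plane, threshold: float) -> SparsePlane:
    """Hard-threshold a coefficient plane (hypothetical helper).

    Coefficients with magnitude <= threshold are dropped; the rest are
    stored in a hash map keyed by their (row, column) location.
    """
    return {
        (r, c): value
        for r, row in enumerate(plane)
        for c, value in enumerate(row)
        if abs(value) > threshold
    }


def decompress_plane(sparse: SparsePlane, shape: Tuple[int, int]) -> Plane:
    """Rebuild a dense plane, filling dropped coefficients with zeros."""
    rows, cols = shape
    plane = [[0.0] * cols for _ in range(rows)]
    for (r, c), value in sparse.items():
        plane[r][c] = value
    return plane


# Toy 3x3 plane of coefficients (made-up values, mostly near zero).
coeffs = [
    [0.90, 0.01, 0.00],
    [-0.02, -0.75, 0.03],
    [0.00, 0.04, 0.50],
]
sparse = compress_plane(coeffs, threshold=0.05)
restored = decompress_plane(sparse, (3, 3))
```

Only the entries that survive thresholding are stored, so the storage cost scales with the number of significant coefficients rather than with the plane resolution.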

ViVo: A Dataset for Volumetric Video Reconstruction and Compression (dataset)

arXiv (TBD) - page - code - June-June 2024/5

As research on neural volumetric video reconstruction and compression flourishes, there is a need for diverse and realistic datasets, which can be used to develop and validate reconstruction and compression models. However, existing volumetric video datasets lack diverse content in terms of both semantic and low-level features that are commonly present in real-world production pipelines. In this context, we propose a new dataset, ViVo, for VolumetrIc VideO reconstruction and compression. The dataset is faithful to real-world volumetric video production and is the first dataset to extend the definition of diversity to include both human-centric characteristics (skin, hair, etc.) and dynamic visual phenomena (transparent, reflective, liquid, etc.). Each video sequence in this database contains raw data including fourteen multi-view RGB and depth video pairs, synchronized at 30 FPS with per-frame calibration and audio data, and their associated 2-D foreground masks and 3-D point clouds. To demonstrate the use of this database, we have benchmarked three state-of-the-art (SotA) 3-D reconstruction methods and two volumetric video compression algorithms. The obtained results evidence the challenging nature of the proposed dataset and the limitations of existing datasets for both volumetric video reconstruction and compression tasks, highlighting the need to develop more effective algorithms for these applications.

Reviewing Intelligent Cinematography: AI research for camera-based video production (review; 41 pages)

Springer Nature: Artificial Intelligence Review - Sep-Apr 2022/25

The first (comprehensive) review of computer vision research in the context of real video content acquisition for entertainment. To establish a structure, we categorise work by General, Virtual, Live and Aerial production, and within each category we discuss various machine learning applications and their links to other forms of production. We also provide category-specific comments on future work and discuss the social responsibilities involved in conducting ethical research.

Exploring Dynamic Novel View Synthesis Technologies for Cinematography (review/art application)

arXiv - Mar-May 2024

We provide an overview of Dynamic NeRF and Gaussian Splatting research in the context of cinematography and explore the use of these technologies (Nerfacto, 4D-GS and SC-GS) to produce a (very) short film. Topics discussed: (1) Dynamic representations, (2) Articulated models vs scene-based modelling, (3) Data collection.

Towards a Robust Framework for NeRF Evaluation (technical)

arXiv - code - Jan-May 2023

Moving towards robust NeRF evaluation using synthetic datasets for point-based architectures. This work focuses on point quality and the biases involved with various training and test camera distributions, in order to derive metrics for scene complexity.


(Industry Collaboration) Online Trend Detection at Scale

MEng Group Design Project - Client: Senseye Ltd. - Sept-Jan 2021/22

We produced three unsupervised general trend detection models. My part focused on detection using variance-related errors. This allowed for comprehensive evaluation of monotonic and heteroscedastic trends and was capable of detecting various behaviours that classical and SotA approaches cannot. Additionally, I investigated multi-scale detection for non-uniformly distributed data using a binary change-point detection algorithm to control the window size for evaluation. Documentation here.

(Thesis) Generalised Ethical Dilemma Solver

MEng Thesis - Sept-May 2020/21

By relating classical ML techniques to well-defined philosophical, social and ethical frameworks, I developed a machine learning algorithm to resolve casual social dilemmas. Given a set of possible outcomes, this: (1) selects the appropriate philosophical/social/ethical models based on generally accepted criteria, (2) simulates each model and derives scores for the various resolutions, (3) applies a weighted fusion to make a final choice, and (4) provides reasoning for the choice by referring to the selected models and scores. Documentation here.
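As a rough illustration of step (3), the sketch below fuses per-model scores with normalised weights and picks the highest-scoring outcome. It is not the thesis implementation: the function name, the model labels ("utilitarian", "deontological"), the outcome labels and all numbers are hypothetical placeholders.

```python
from typing import Dict, Tuple

Scores = Dict[str, float]


def fuse_scores(
    model_scores: Dict[str, Scores], model_weights: Dict[str, float]
) -> Tuple[str, Scores]:
    """Weighted fusion of outcome scores across models (hypothetical helper).

    Weights are normalised over the models that were actually selected,
    so the fused score stays on the same scale as the inputs.
    """
    total_weight = sum(model_weights[name] for name in model_scores)
    fused: Scores = {}
    for name, scores in model_scores.items():
        weight = model_weights[name] / total_weight
        for outcome, score in scores.items():
            fused[outcome] = fused.get(outcome, 0.0) + weight * score
    best_outcome = max(fused, key=fused.get)
    return best_outcome, fused


# Made-up scores for two candidate resolutions under two selected models.
model_scores = {
    "utilitarian": {"share": 0.8, "keep": 0.4},
    "deontological": {"share": 0.6, "keep": 0.7},
}
model_weights = {"utilitarian": 2.0, "deontological": 1.0}

choice, fused = fuse_scores(model_scores, model_weights)
```

Returning the fused scores alongside the choice keeps the per-outcome evidence available for the reasoning in step (4).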

Solving a Maze blind using a Duelling Deep Q-Network

Model Implementation and Extension - Jan-June 2022 - Not Published

The problem: You are blind, in a maze, and fires appear randomly around you each time you take a step. Can we solve the maze unsupervised? This looks at deep Q-learning approaches to solve the maze and compares results for the Deep-Q, Duelling Deep-Q and Rainbow algorithms. Additionally, I investigated combined experience replay and compared the epsilon-greedy and Boltzmann exploration methods. Code here.
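For readers unfamiliar with the two exploration methods compared above, here is a minimal sketch of each, acting on a list of Q-values for one state. The function names, the example Q-values and the temperature are assumptions for illustration and are not taken from the project code.

```python
import math
import random
from typing import List


def epsilon_greedy(q_values: List[float], epsilon: float, rng: random.Random) -> int:
    """With probability epsilon pick a uniformly random action, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def boltzmann_probs(q_values: List[float], temperature: float) -> List[float]:
    """Softmax over Q-values; a higher temperature gives a flatter distribution."""
    q_max = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def boltzmann(q_values: List[float], temperature: float, rng: random.Random) -> int:
    """Sample an action in proportion to its softmax probability."""
    probs = boltzmann_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


rng = random.Random(0)
q = [0.1, 0.9, 0.3]  # made-up Q-values for three actions
greedy_action = epsilon_greedy(q, epsilon=0.0, rng=rng)
probs = boltzmann_probs(q, temperature=0.5)
sampled_action = boltzmann(q, temperature=0.5, rng=rng)
```

Epsilon-greedy explores uniformly regardless of the Q-values, whereas Boltzmann exploration favours actions with higher estimated value even when it explores.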

(Paper Extension) The Importance of Group-Size Preferences for the Evolution of Cooperation Under the Conditions of Individual Selection

Literature Review and Model Extension - Jan-May 2022 - Not Published

I reproduced and extended the proposed Genetic Algorithm, which focuses on replicating the conditions of grouping and dispersal in bacterial micro-colonies to provide insight into cooperative structures. The extension focused on better representing individuals to allow for imprecisions in their understanding of the simulated environment. Paper and code found here.

(Paper Reproduction) Inspecting Functional Modularity in NNs

Group Project - Jan-June 2022 - Not Published

A reproduction of Are Neural Nets Modular? Inspecting Functional Modularity through Differentiable Weight Masks (ICLR '21). We reproduced the proposed tool for inspecting functional modularity and tested it on a range of NN architectures (namely CNNs and RNNs).

(Internship) Improved Distribution of Physical Sensor Networks under Star Topology

Research Internship at Lurtis Rules - July-Nov 2020 - Rejection

We wrote a short comparative paper evaluating various linear and non-linear strategies for multi-objective sensor distribution in general irregular (agricultural) fields, supervised by Jose M. Pena. The work primarily uses unsupervised differential evolution.










Adrian Azzarelli