
… or software. Essentially, the video standard is fully described and verified against agreed metrics. When using ML, it is sometimes difficult to explain exactly how the implementation operates. While it is perfectly feasible to have flexibility in the final implementation, there must be an understanding of how AI-based algorithms adhere to the specifications and produce standards-compliant bitstreams.

Researchers at Google have used a set of ML tools, including generative adversarial networks (GANs), to enhance the development of ML video compression algorithms. This approach uses ML to generate new algorithms and test them against each other, converging on new, more efficient techniques.

The compression is achieved by exploiting similarities among video frames. This is possible because most of the content is almost identical from one frame to the next, as a typical video contains 30 frames per second. The compression algorithm therefore tries to find the residual information between frames.

Despite achieving impressive compression performance, neural video compression methods struggle to produce realistic outputs: they can generate video that is close to the input but lacks realism. The goal of adding a realism constraint to the neural network is to ensure the output is indistinguishable from real images while staying close to the input video. The main challenge is ensuring the network generalises well to unseen content.

The researchers constructed a generative neural video compression technique that uses a GAN to excel at both synthesis and detail preservation. In video compression, specific frames are selected as key frames (I-frames) and used as a base for reconstructing subsequent frames. These frames are allocated higher bitrates and so retain finer detail. The proposed method follows the same principle, synthesising the dependent frames (P-frames) from the available I-frame in three steps.

First, the ML network synthesises essential details within the I-frame, which will serve as the base for the frames that follow. This is done using a combination of convolutional neural network and GAN components; a discriminator in the GAN is responsible for ensuring I-frame-level detail.

Second, the synthesised details are propagated to where they are needed. A powerful optical flow method called UFlow is used to predict movement between frames. The P-frame component has two auto-encoder parts, one for predicting the optical flow and one for the residual information. These work together to propagate details from the previous step as sharply as possible.

Finally, an auto-encoder is used to determine when to synthesise new details from the I-frame. Since new content can appear in P-frames, the existing details can become irrelevant, and propagating them would distort the output.

[Image: The overall architecture used to compare AI-generated video with the original source (Courtesy of Google)]

[Image: An MPEG-5 Low Complexity Enhancement Video Coding (LCEVC) decoder (Courtesy of Aachen University)]
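To make the residual idea at the core of inter-frame coding concrete, here is a minimal sketch in Python/NumPy. It omits motion compensation, transforms and entropy coding entirely, and the frame sizes and pixel values are arbitrary; it only shows why a mostly unchanged frame yields a cheap-to-code residual.

```python
import numpy as np

def encode_residual(prev_frame: np.ndarray, curr_frame: np.ndarray) -> np.ndarray:
    """Residual = what the previous frame fails to predict."""
    return curr_frame.astype(np.int16) - prev_frame.astype(np.int16)

def decode_residual(prev_frame: np.ndarray, residual: np.ndarray) -> np.ndarray:
    """Current frame = previous frame plus the transmitted residual."""
    return (prev_frame.astype(np.int16) + residual).astype(np.uint8)

rng = np.random.default_rng(0)
prev = rng.integers(0, 200, size=(64, 64), dtype=np.uint8)  # frame t-1
curr = prev.copy()
curr[10:20, 10:20] += 5                                     # small change in frame t

res = encode_residual(prev, curr)
assert np.array_equal(decode_residual(prev, res), curr)
# most of the residual is zero, which is what makes it cheap to code
print("non-zero residual fraction:", np.count_nonzero(res) / res.size)
```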
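The realism constraint can be pictured as one extra term in the training objective. The PyTorch sketch below is an illustrative assumption about how a rate term, a distortion term and an adversarial realism term might be combined; the hinge-style GAN loss and the weights lambda_d and lambda_g are hypothetical choices, not taken from the Google work described above.

```python
import torch
import torch.nn.functional as F

def generator_loss(bits, recon, target, disc_fake,
                   lambda_d: float = 100.0, lambda_g: float = 1.0):
    """Total loss = estimated rate + distortion + adversarial realism term."""
    rate = bits.mean()                      # bits spent, from an entropy model
    distortion = F.mse_loss(recon, target)  # stay close to the input video
    realism = -disc_fake.mean()             # reward fooling the discriminator
    return rate + lambda_d * distortion + lambda_g * realism

def discriminator_loss(disc_real, disc_fake):
    """Hinge loss: push real scores above +1 and fake scores below -1."""
    return F.relu(1.0 - disc_real).mean() + F.relu(1.0 + disc_fake).mean()

# toy usage with random tensors standing in for network outputs
bits = torch.rand(8)                   # per-sample bit estimates
recon = torch.rand(8, 3, 32, 32)       # reconstructed frames
target = torch.rand(8, 3, 32, 32)      # original frames
d_fake = torch.randn(8)                # discriminator scores on reconstructions
d_real = torch.randn(8)                # discriminator scores on originals
print(generator_loss(bits, recon, target, d_fake).item())
print(discriminator_loss(d_real, d_fake).item())
```

Tuning lambda_d against lambda_g is what trades fidelity to the input against perceptual realism, which is the balance the article describes.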
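Propagating details with a predicted flow field amounts to backward-warping the previous reconstruction. The sketch below uses PyTorch's grid_sample for the warp; the constant flow field is a dummy standing in for a UFlow prediction, and the residual tensor is a stand-in for the output of the residual auto-encoder.

```python
import torch
import torch.nn.functional as F

def warp(frame: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Backward-warp frame (N,C,H,W) by flow (N,2,H,W) given in pixels."""
    n, _, h, w = frame.shape
    ys, xs = torch.meshgrid(torch.arange(h, dtype=frame.dtype),
                            torch.arange(w, dtype=frame.dtype), indexing="ij")
    grid_x = xs.unsqueeze(0) + flow[:, 0]   # where to sample, x-coordinate
    grid_y = ys.unsqueeze(0) + flow[:, 1]   # where to sample, y-coordinate
    # normalise pixel coordinates to [-1, 1] as grid_sample expects
    grid = torch.stack((2 * grid_x / (w - 1) - 1,
                        2 * grid_y / (h - 1) - 1), dim=-1)
    return F.grid_sample(frame, grid, align_corners=True)

prev = torch.rand(1, 3, 32, 32)             # previous reconstructed frame
flow = torch.full((1, 2, 32, 32), 2.0)      # dummy flow: 2 px everywhere
pred = warp(prev, flow)                     # motion-compensated prediction
residual = torch.rand(1, 3, 32, 32) * 0.01  # stand-in for decoded residual
recon = pred + residual                     # P-frame reconstruction
print(recon.shape)
```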
