Deep Learning

Deep learning is concerned with learning increasingly abstract, hierarchical representations of data so that the representations still contain relevant information from the original input. Deep learning models constitute the state-of-the-art for virtually all tasks involving high-dimensional, complex data, like images, speech, and video data, as it was shown that they tend to scale very well with more data.

Training a German LLM from Scratch, 14 Nov. 2024 (blog)

This article is not finished and will be updated. The research group I work with has access to a small GPU cluster, which occasionally sits idle. To avoid wasting valuable compute resources (IDLE GPUs essentially burn money through opportunity costs), I decided to train a German GPT-2-style model from scratch, using only German text. Existing German models available on Hugging Face have 137M parameters and a context length of 1024 tokens1, which is quite limited compared to recently released …

Categories: Deep Learning

2794 Words, Tagged with: Deep Learning · Generative Models · LLM

Thumbnail for Training a German LLM from Scratch

Deep learning-based harmonization and super-resolution of Landsat-8 and Sentinel-2 images, 17 May. 2024 (papers)

Our paper Deep learning-based harmonization and super-resolution of Landsat-8 and Sentinel-2 images, which is based on the masters thesis of my colleague Venkatesh Thirugnana Sambandham, has been published in the ISPRS Journal of Photogrammetry and Remote Sensing. This work is an extension of our previous workshop paper on transformers for satellite homogenization. In summary, we find that a simple UNet model provides surprisingly good performance for the satellite homogenization task. We …

Categories: Deep Learning

344 Words, Tagged with: Deep Learning · Superresolution

Thumbnail for Deep learning-based harmonization and super-resolution of Landsat-8 and Sentinel-2 images

Towards Transformer-based Homogenization of Satellite Imagery for Landsat-8 and Sentinel-2, 13 Aug. 2022 (papers)

Our abstract Towards Transformer-based Homogenization of Satellite Imagery for Landsat-8 and Sentinel-2 was accepted for presentation on the Transformers Workshop for Environmental Science. In summary, we somewhat surprisingly found that transformers, a neural network architecture that achieves state-of-the-art results on most tasks it is applied to, does not outperform a vanilla U-Net model on our particular superresolution task.

Categories: Deep Learning

58 Words, Tagged with: ESST · Superresolution

Thumbnail for Towards Transformer-based Homogenization of Satellite Imagery for Landsat-8 and Sentinel-2

Convolutional Filter Visualization, 27 Jul. 2022 (blog)

Deep Neural Networks are black-boxes: they map some input to some output, and we can make them do this surprisingly well. However, we usually have no idea how this mapping works. Particularly Convolutional Neural Networks (CNNs), which employ “convolutions” as filters, achieved some impressive results (before Vision Transformers came along). Filter Visualization can help us understand what kind of patterns the convolutional filters in CNNs detect. Why would we want to do it? § …

Categories: Deep Learning

472 Words, Tagged with: Deep Learning · Explainability · CNN