machine vision thesis topics

The Future of AI Research: 20 Thesis Ideas for Undergraduate Students in Machine Learning and Deep Learning for 2023!

A comprehensive guide for crafting an original and innovative thesis in the field of AI.

By Aarafat Islam on 2023-01-11

“The beauty of machine learning is that it can be applied to any problem you want to solve, as long as you can provide the computer with enough examples.” — Andrew Ng

This article provides a list of 20 potential thesis ideas for an undergraduate program in machine learning and deep learning in 2023. Each thesis idea includes an introduction, which presents a brief overview of the topic and the research objectives. The ideas provided are related to different areas of machine learning and deep learning, such as computer vision, natural language processing, robotics, finance, drug discovery, and more. The article also includes explanations, examples, and conclusions for each thesis idea, which can help guide the research and provide a clear understanding of the potential contributions and outcomes of the proposed research. The article also emphasizes the importance of originality and the need for proper citation in order to avoid plagiarism.

1. Investigating the use of Generative Adversarial Networks (GANs) in medical imaging: A deep learning approach to improve the accuracy of medical diagnoses.

Introduction: Medical imaging is an important tool in the diagnosis and treatment of various medical conditions. However, accurately interpreting medical images can be challenging, especially for less experienced doctors. This thesis aims to explore the use of GANs in medical imaging, in order to improve the accuracy of medical diagnoses.

2. Exploring the use of deep learning in natural language generation (NLG): An analysis of the current state-of-the-art and future potential.

Introduction: Natural language generation is an important field in natural language processing (NLP) that deals with creating human-like text automatically. Deep learning has shown promising results in NLP tasks such as machine translation, sentiment analysis, and question-answering. This thesis aims to explore the use of deep learning in NLG and analyze the current state-of-the-art models, as well as potential future developments.

3. Development and evaluation of deep reinforcement learning (RL) for robotic navigation and control.

Introduction: Robotic navigation and control are challenging tasks, which require a high degree of intelligence and adaptability. Deep RL has shown promising results in various robotics tasks, such as robotic arm control, autonomous navigation, and manipulation. This thesis aims to develop and evaluate a deep RL-based approach for robotic navigation and control and evaluate its performance in various environments and tasks.

4. Investigating the use of deep learning for drug discovery and development.

Introduction: Drug discovery and development is a time-consuming and expensive process, which often involves high failure rates. Deep learning has been used to improve various tasks in bioinformatics and biotechnology, such as protein structure prediction and gene expression analysis. This thesis aims to investigate the use of deep learning for drug discovery and development and examine its potential to improve the efficiency and accuracy of the drug development process.

5. Comparison of deep learning and traditional machine learning methods for anomaly detection in time series data.

Introduction: Anomaly detection in time series data is a challenging task, which is important in various fields such as finance, healthcare, and manufacturing. Deep learning methods have been used to improve anomaly detection in time series data, while traditional machine learning methods have been widely used as well. This thesis aims to compare deep learning and traditional machine learning methods for anomaly detection in time series data and examine their respective strengths and weaknesses.

6. Use of deep transfer learning in speech recognition and synthesis.

Introduction: Speech recognition and synthesis are areas of natural language processing that focus on converting spoken language to text and vice versa. Transfer learning has been widely used in deep learning-based speech recognition and synthesis systems to improve their performance by reusing the features learned from other tasks. This thesis aims to investigate the use of transfer learning in speech recognition and synthesis and how it improves the performance of the system in comparison to traditional methods.

7. The use of deep learning for financial prediction.

Introduction: Financial prediction is a challenging task that requires a high degree of intelligence and adaptability, especially in the field of stock market prediction. Deep learning has shown promising results in various financial prediction tasks, such as stock price prediction and credit risk analysis. This thesis aims to investigate the use of deep learning for financial prediction and examine its potential to improve the accuracy of financial forecasting.

8. Investigating the use of deep learning for computer vision in agriculture.

Introduction: Computer vision has the potential to revolutionize the field of agriculture by improving crop monitoring, precision farming, and yield prediction. Deep learning has been used to improve various computer vision tasks, such as object detection, semantic segmentation, and image classification. This thesis aims to investigate the use of deep learning for computer vision in agriculture and examine its potential to improve the efficiency and accuracy of crop monitoring and precision farming.

9. Development and evaluation of deep learning models for generative design in engineering and architecture.

Introduction: Generative design is a powerful tool in engineering and architecture that can help optimize designs and reduce human error. Deep learning has been used to improve various generative design tasks, such as design optimization and form generation. This thesis aims to develop and evaluate deep learning models for generative design in engineering and architecture and examine their potential to improve the efficiency and accuracy of the design process.

10. Investigating the use of deep learning for natural language understanding.

Introduction: Natural language understanding is a complex task of natural language processing that involves extracting meaning from text. Deep learning has been used to improve various NLP tasks, such as machine translation, sentiment analysis, and question-answering. This thesis aims to investigate the use of deep learning for natural language understanding and examine its potential to improve the efficiency and accuracy of natural language understanding systems.

11. Comparing deep learning and traditional machine learning methods for image compression.

Introduction: Image compression is an important task in image processing and computer vision. It enables faster data transmission and storage of image files. Deep learning methods have been used to improve image compression, while traditional machine learning methods have been widely used as well. This thesis aims to compare deep learning and traditional machine learning methods for image compression and examine their respective strengths and weaknesses.

12. Using deep learning for sentiment analysis in social media.

Introduction: Sentiment analysis in social media is an important task that can help businesses and organizations understand their customers’ opinions and feedback. Deep learning has been used to improve sentiment analysis in social media, by training models on large datasets of social media text. This thesis aims to use deep learning for sentiment analysis in social media, and evaluate its performance against traditional machine learning methods.

13. Investigating the use of deep learning for image generation.

Introduction: Image generation is a task in computer vision that involves creating new images from scratch or modifying existing images. Deep learning has been used to improve various image generation tasks, such as super-resolution, style transfer, and face generation. This thesis aims to investigate the use of deep learning for image generation and examine its potential to improve the quality and diversity of generated images.

14. Development and evaluation of deep learning models for anomaly detection in cybersecurity.

Introduction: Anomaly detection in cybersecurity is an important task that can help detect and prevent cyber-attacks. Deep learning has been used to improve various anomaly detection tasks, such as intrusion detection and malware detection. This thesis aims to develop and evaluate deep learning models for anomaly detection in cybersecurity and examine their potential to improve the efficiency and accuracy of cybersecurity systems.

15. Investigating the use of deep learning for natural language summarization.

Introduction: Natural language summarization is an important task in natural language processing that involves creating a condensed version of a text that preserves its main meaning. Deep learning has been used to improve various natural language summarization tasks, such as document summarization and headline generation. This thesis aims to investigate the use of deep learning for natural language summarization and examine its potential to improve the efficiency and accuracy of natural language summarization systems.

16. Development and evaluation of deep learning models for facial expression recognition.

Introduction: Facial expression recognition is an important task in computer vision and has many practical applications, such as human-computer interaction, emotion recognition, and psychological studies. Deep learning has been used to improve facial expression recognition, by training models on large datasets of images. This thesis aims to develop and evaluate deep learning models for facial expression recognition and examine their performance against traditional machine learning methods.

17. Investigating the use of deep learning for generative models in music and audio.

Introduction: Music and audio synthesis is an important task in audio processing, which has many practical applications, such as music generation and speech synthesis. Deep learning has been used to improve generative models for music and audio, by training models on large datasets of audio data. This thesis aims to investigate the use of deep learning for generative models in music and audio and examine its potential to improve the quality and diversity of generated audio.

18. Comparing deep learning models with traditional algorithms for anomaly detection in network traffic.

Introduction: Anomaly detection in network traffic is an important task that can help detect and prevent cyber-attacks. Deep learning models have been used for this task, and traditional methods such as clustering and rule-based systems are widely used as well. This thesis aims to compare deep learning models with traditional algorithms for anomaly detection in network traffic and analyze the trade-offs between the models in terms of accuracy and scalability.

19. Investigating the use of deep learning for improving recommender systems.

Introduction: Recommender systems are widely used in many applications such as online shopping, music streaming, and movie streaming. Deep learning has been used to improve the performance of recommender systems, by training models on large datasets of user-item interactions. This thesis aims to investigate the use of deep learning for improving recommender systems and compare its performance with traditional content-based and collaborative filtering approaches.

20. Development and evaluation of deep learning models for multi-modal data analysis.

Introduction: Multi-modal data analysis is the task of analyzing and understanding data from multiple sources such as text, images, and audio. Deep learning has been used to improve multi-modal data analysis, by training models on large datasets of multi-modal data. This thesis aims to develop and evaluate deep learning models for multi-modal data analysis and analyze their potential to improve performance in comparison to single-modal models.

I hope that this article has provided you with a useful guide for your thesis research in machine learning and deep learning. Remember to conduct a thorough literature review and to include proper citations in your work, as well as to be original in your research to avoid plagiarism. I wish you all the best of luck with your thesis and your research endeavors!

A list of completed theses and new thesis topics from the Computer Vision Group.

Are you about to start a BSc or MSc thesis? Please read our instructions for preparing and delivering your work.

Below we list possible thesis topics for Bachelor and Master students in the areas of Computer Vision, Machine Learning, Deep Learning and Pattern Recognition. The project descriptions leave plenty of room for your own ideas. If you would like to discuss a topic in detail, please contact the supervisor listed below and Prof. Paolo Favaro to schedule a meeting. Note that for MSc students in Computer Science it is required that the official advisor is a professor in CS.

AI deconvolution of light microscopy images

Level: master.

Background Light microscopy has become an indispensable tool in life sciences research. Deconvolution is an important image processing step for improving the quality of microscopy images: it removes out-of-focus light, increases resolution, and improves the signal-to-noise ratio. Currently, classical deconvolution methods, such as regularisation or blind deconvolution, are implemented in numerous commercial software packages and widely used in research. Recently, AI deconvolution algorithms have been introduced and are being actively developed, as they have shown high application potential.

Aim Adaptation of available AI algorithms for deconvolution of microscopy images. Validation of these methods against state-of-the-art commercially available deconvolution software.

Material and Methods The student will implement and further develop available AI deconvolution methods and acquire test microscopy images of different modalities. The performance of the developed AI algorithms will be validated against available commercial deconvolution software.

AI algorithm development and implementation: 50%.
Data acquisition: 10%.
Comparison of performance: 40%.

Requirements

Interest in imaging.
Solid knowledge of AI.
Good programming skills.

Supervisors Paolo Favaro, Guillaume Witz, Yury Belyaev.

Institutes Computer Vision Group, Digital Science Lab, Microscopy Imaging Center.

Contact Yury Belyaev, Microscopy Imaging Center, [email protected], +41 78 899 0110.

Instance segmentation of cryo-ET images

Level: bachelor/master.

In the 1600s, a pioneering Dutch scientist named Antonie van Leeuwenhoek embarked on a remarkable journey that would forever transform our understanding of the natural world. Armed with simple yet ingenious single-lens microscopes of his own making, he delved into uncharted territory, peering through the lens to reveal the hidden wonders of microscopic structures. Fast forward to today, where cryo-electron tomography (cryo-ET) has emerged as a groundbreaking technique, allowing researchers to study proteins within their natural cellular environments. Proteins, functioning as vital nano-machines, play crucial roles in life, and understanding their localization and interactions is key to both basic research and disease comprehension. However, cryo-ET images pose challenges due to inherent noise and a scarcity of annotated data for training deep learning models.

Credit: S. Albert et al./PNAS (CC BY 4.0)

To address these challenges, this project aims to develop a self-supervised pipeline utilizing diffusion models for instance segmentation in cryo-ET images. Diffusion models learn to generate images by iteratively removing noise, and in doing so capture the underlying patterns of the data; the pipeline uses them to refine and accurately segment cryo-ET images. Self-supervised learning, which relies on unlabeled data, reduces the dependence on extensive manual annotations. Successful implementation of this pipeline could substantially advance structural biology by facilitating the analysis of protein distribution and organization within cellular contexts. Moreover, it has the potential to alleviate the limitations posed by limited annotated data, enabling more efficient extraction of valuable information from cryo-ET images and advancing biomedical applications by enhancing our understanding of protein behavior.

Methods The segmentation pipeline for cryo-electron tomography (cryo-ET) images consists of two stages: training a diffusion model for image generation and training an instance segmentation U-Net using synthetic and real segmentation masks.

1. Diffusion Model Training:
a. Data Collection: Collect and curate cryo-ET image datasets from the EMPIAR database (https://www.ebi.ac.uk/empiar/).
b. Architecture Design: Select an appropriate architecture for the diffusion model.
c. Model Evaluation: Cryo-ET experts will help assess image quality and fidelity through visual inspection and quantitative measures.

2. Building the Segmentation Dataset:
a. Synthetic and real mask generation: Use the trained diffusion model to generate synthetic cryo-ET images. The diffusion process will be seeded from either a real or a synthetic segmentation mask. This will yield pairs of cryo-ET images and segmentation masks.

3. Instance Segmentation U-Net Training:
a. Architecture Design: Choose an appropriate instance segmentation U-Net architecture.
b. Model Evaluation: Evaluate the trained U-Net using precision, recall, and F1 score metrics.

By combining the diffusion model for cryo-ET image generation and the instance segmentation U-Net, this pipeline provides an efficient and accurate approach to segment structures in cryo-ET images, facilitating further analysis and interpretation.
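As a rough illustration of the evaluation step, the sketch below computes pixel-wise precision, recall and F1 between a predicted and a reference binary mask. This is a simplifying assumption rather than the project's prescribed protocol: an instance-level evaluation would first match predicted and reference instances (for example by overlap) before counting hits and misses. The function name `precision_recall_f1` and the toy masks are hypothetical.

```python
import numpy as np

def precision_recall_f1(pred, target):
    """Pixel-wise precision, recall and F1 for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    tp = int(np.logical_and(pred, target).sum())    # predicted and present
    fp = int(np.logical_and(pred, ~target).sum())   # predicted but absent
    fn = int(np.logical_and(~pred, target).sum())   # present but missed
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1

# Toy masks (illustrative only): two pixels agree, one is spurious, one is missed.
pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
target = np.array([[1, 0, 0],
                   [0, 1, 1]])
p, r, f = precision_recall_f1(pred, target)
```

In this toy case there are two true positives, one false positive and one false negative, so all three scores come out at 2/3.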

References
1. Kwon, Diana. "The secret lives of cells-as never seen before." Nature 598.7882 (2021): 558-560.
2. Moebel, Emmanuel, et al. "Deep learning improves macromolecule identification in 3D cellular cryo-electron tomograms." Nature Methods 18.11 (2021): 1386-1394.
3. Rice, Gavin, et al. "TomoTwin: generalized 3D localization of macromolecules in cryo-electron tomograms with structural data mining." Nature Methods (2023): 1-10.

Contacts Prof. Thomas Lemmin Institute of Biochemistry and Molecular Medicine Bühlstrasse 28, 3012 Bern ( [email protected] )

Prof. Paolo Favaro Institute of Computer Science Neubrückstrasse 10 3012 Bern ( [email protected] )

Adding and removing multiple sclerosis lesions in imaging with diffusion networks

Background Multiple sclerosis lesions are the result of demyelination: they appear as dark spots on T1-weighted MRI imaging and as bright spots on FLAIR MRI imaging. Image analysis for MS patients requires both the accurate detection of new and enhancing lesions, and the assessment of atrophy via local thickness and/or volume changes in the cortex. Detection of new and growing lesions is possible using deep learning, but made difficult by the relative lack of training data: meanwhile cortical morphometry can be affected by the presence of lesions, meaning that removing lesions prior to morphometry may be more robust. Existing ‘lesion filling’ methods are rather crude, yielding unrealistic-appearing brains where the borders of the removed lesions are clearly visible.

Aim: Denoising diffusion networks are the current gold standard in MRI image generation [1]: we aim to leverage this technology to remove and add lesions to existing MRI images. This will allow us to create realistic synthetic MRI images for training and validating MS lesion segmentation algorithms, and for investigating the sensitivity of morphometry software to the presence of MS lesions at a variety of lesion load levels.

Materials and Methods: A large, annotated, heterogeneous dataset of MRI data from MS patients, as well as images of healthy controls without white matter lesions, will be available for developing the method. The student will work in a research group with a long track record in applying deep learning methods to neuroimaging data, as well as experience training denoising diffusion networks.

Nature of the Thesis:

Literature review: 10%

Replication of Blob Loss paper: 10%

Implementation of the sliding window metrics: 10%

Training on MS lesion segmentation task: 30%

Extension to other datasets: 20%

Results analysis: 20%

Fig. Results of an existing lesion filling algorithm, showing inadequate performance

Requirements:

Interest/Experience with image processing

Python programming knowledge (Pytorch bonus)

Interest in neuroimaging

Supervisor(s):

PD Dr. Richard McKinley

Institutes: Diagnostic and Interventional Neuroradiology

Center for Artificial Intelligence in Medicine (CAIM), University of Bern

References: [1] Brain Imaging Generation with Latent Diffusion Models, Pinaya et al., accepted in the Deep Generative Models workshop @ MICCAI 2022, https://arxiv.org/abs/2209.07162

Contact: PD Dr. Richard McKinley, Support Centre for Advanced Neuroimaging ( [email protected] )

Improving metrics and loss functions for targets with imbalanced size: sliding window Dice coefficient and loss.

Background The Dice coefficient is the most commonly used metric for segmentation quality in medical imaging, and a differentiable version of the coefficient is often used as a loss function, in particular for small target classes such as multiple sclerosis lesions. The Dice coefficient has the benefit that it is applicable in instances where the target class is in the minority (for example, when segmenting small lesions). However, if lesion sizes are mixed, the loss and the metric are biased towards performance on large lesions, leading smaller lesions to be missed and harming overall lesion detection. A recently proposed loss function (blob loss [1]) aims to combat this by treating each connected component of a lesion mask separately, and claims improvements over Dice loss on lesion detection scores in a variety of tasks.

Aim: The aim of this thesis is twofold. First, to benchmark blob loss against a simple, potentially superior loss for instance detection: sliding window Dice loss, in which the Dice loss is calculated over a sliding window across the area/volume of the medical image. Second, we will investigate whether a sliding window Dice coefficient is better correlated with lesion-wise detection metrics than the Dice coefficient and may serve as an alternative metric capturing both global and instance-wise detection.
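To make the idea concrete, here is a minimal 2D sketch of a sliding window Dice coefficient next to the ordinary global one. It is an illustration under stated assumptions, not the group's implementation: the window size, the stride, the choice to skip windows that are empty in both masks, and the function names `dice` and `sliding_window_dice` are all hypothetical. A loss would use the same computation on soft predictions inside an autodiff framework, typically as one minus the score.

```python
import numpy as np

def dice(pred, target, eps=1e-6):
    """Global Dice coefficient between two masks of equal shape."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    intersection = (pred * target).sum()
    return float((2.0 * intersection + eps) / (pred.sum() + target.sum() + eps))

def sliding_window_dice(pred, target, window=4, stride=2, eps=1e-6):
    """Mean Dice over square windows; windows empty in both masks are skipped."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    height, width = target.shape
    scores = []
    for i in range(0, max(height - window, 0) + 1, stride):
        for j in range(0, max(width - window, 0) + 1, stride):
            p = pred[i:i + window, j:j + window]
            t = target[i:i + window, j:j + window]
            if p.sum() + t.sum() == 0:
                continue  # assumption: background-only windows carry no signal
            scores.append(dice(p, t, eps))
    return float(np.mean(scores)) if scores else 1.0

# Toy case: one large lesion is segmented perfectly, one tiny lesion is missed.
target = np.zeros((8, 8))
target[0:4, 0:4] = 1.0   # large lesion (16 pixels)
target[6, 6] = 1.0       # single-pixel lesion
pred = np.zeros((8, 8))
pred[0:4, 0:4] = 1.0     # only the large lesion is predicted

global_dice = dice(pred, target)
window_dice = sliding_window_dice(pred, target, window=4, stride=4)
```

On this toy example the global Dice stays close to 0.97 even though half of the lesions are missed, while the windowed score drops to about 0.5, which is the kind of sensitivity to small lesions the project sets out to study.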

Materials and Methods: A large, annotated, heterogeneous dataset of MRI data from MS patients will be available for benchmarking the method, as well as our existing codebases for MS lesion segmentation. Extension of the method to other diseases and datasets (such as covered in the blob loss paper) will make the method more plausible for publication. The student will work alongside clinicians and engineers carrying out research in multiple sclerosis lesion segmentation, in particular in the context of our running project supported by the CAIM grant.

Fig. An annotated MS lesion case, showing the variety of lesion sizes

References: [1] blob loss: instance imbalance aware loss functions for semantic segmentation, Kofler et al, https://arxiv.org/abs/2205.08209

Idempotent and partial skull-stripping in multispectral MRI imaging

Background Skull stripping (or brain extraction) refers to the masking of non-brain tissue from structural MRI imaging. Since 3D MRI sequences allow reconstruction of facial features, many data providers supply data only after skull-stripping, making this a vital tool in data sharing. Furthermore, skull-stripping is an important pre-processing step in many neuroimaging pipelines, even in the deep-learning era: while many methods could now operate on data with skull present, they have been trained only on skull-stripped data and therefore produce spurious results on data with the skull present.

High-quality skull-stripping algorithms based on deep learning are now widely available: the most prominent example is HD-BET [1]. A major downside of HD-BET is its behaviour on datasets to which skull-stripping has already been applied: in this case the algorithm falsely identifies brain tissue as skull and masks it. A skull-stripping algorithm F not exhibiting this behaviour would be idempotent: F(F(x)) = F(x) for any image x. Furthermore, legacy datasets from before the availability of high-quality skull-stripping algorithms may still contain images which have been inadequately skull-stripped: currently the only solution to improve the skull-stripping on this data is to go back to the original datasource or to manually correct the skull-stripping, which is time-consuming and prone to error.
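The idempotence property F(F(x)) = F(x) can be checked empirically by applying an operator twice and comparing the results. The sketch below does this with two toy operators on a small array; both are hypothetical stand-ins chosen only to show one idempotent and one non-idempotent behaviour, and neither resembles a real skull-stripping network such as HD-BET.

```python
import numpy as np

def is_idempotent(f, x, atol=1e-6):
    """Return True if applying f twice gives the same result as applying it once."""
    once = f(x)
    twice = f(once)
    return bool(np.allclose(once, twice, atol=atol))

def threshold_strip(x, level=1.5):
    """Toy idempotent operator: zero every value at or below a fixed level."""
    return np.where(x > level, x, 0.0)

def mean_strip(x):
    """Toy non-idempotent operator: zero every value above the current mean.

    Because the mean changes after each pass, a second pass removes more
    values, loosely mimicking a stripper that keeps eating into the brain.
    """
    return np.where(x > x.mean(), 0.0, x)

x = np.array([1.0, 2.0, 6.0])
fixed_ok = is_idempotent(threshold_strip, x)   # second pass changes nothing
drifting_ok = is_idempotent(mean_strip, x)     # second pass removes more values
```

A check of this kind could serve as a regression test during model development, run over a held-out set of already-stripped volumes.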

Aim: In this project, the student will develop an idempotent skull-stripping network which can also handle partially skull-stripped inputs. In the best case, the network will operate well on a large subset of the data we work with (e.g. structural MRI, diffusion-weighted MRI, perfusion-weighted MRI, susceptibility-weighted MRI, at a variety of field strengths) to maximize the future applicability of the network across the teams in our group.

Materials and Methods: Multiple datasets, both publicly available and internal (encompassing thousands of 3D volumes) will be available. Silver standard reference data for standard sequences at 1.5T and 3T can be generated using existing tools such as HD-BET: for other sequences and field strengths semi-supervised learning or methods improving robustness to domain shift may be employed. Robustness to partial skull-stripping may be induced by a combination of learning theory and model-based approaches.

Dataset curation: 10%

Idempotent skull-stripping model building: 30%

Modelling of partial skull-stripping: 10%

Extension of model to handle partial skull: 30%

Results analysis: 10%

Fig. An example of failed skull-stripping requiring manual correction

References: [1] Isensee, F, Schell, M, Pflueger, I, et al. Automated brain extraction of multisequence MRI using artificial neural networks. Hum Brain Mapp. 2019; 40: 4952–4964. https://doi.org/10.1002/hbm.24750

Automated leaf detection and leaf area estimation (for Arabidopsis thaliana)

Correlating plant phenotypes such as leaf area or number of leaves to the genotype (i.e. changes in DNA) is a common goal for plant breeders and molecular biologists. Such data can not only help to understand fundamental processes in nature, but also can help to improve ecotypes, e.g., to perform better under climate change, or reduce fertiliser input. However, collecting data for many plants is very time consuming and automated data acquisition is necessary.

The project aims at building a machine learning model to automatically detect plants in top-view images (see examples below), segment their leaves (see Fig C) and to estimate the leaf area. This information will then be used to determine the leaf area of different Arabidopsis ecotypes. The project will be carried out in collaboration with researchers of the Institute of Plant Sciences at the University of Bern. It will also involve the design and creation of a dataset of plant top-views with the corresponding annotation (provided by experts at the Institute of Plant Sciences).

Contact: Prof. Dr. Paolo Favaro ( [email protected] )

Master Projects at the ARTORG Center

The Gerontechnology and Rehabilitation group at the ARTORG Center for Biomedical Engineering is offering multiple MSc thesis projects to students who are interested in working with real patient data, artificial intelligence and machine learning algorithms. The goal of these projects is to transfer the findings to the clinic in order to solve today’s healthcare problems and thus to improve the quality of life of patients.

Assessment of Digital Biomarkers at Home by Radar.
Comparison of Radar, Seismograph and Ballistocardiography to Monitor Sleep at Home.
Sentimental Analysis in Speech.

Contact: Dr. Stephan Gerber ( [email protected] )

Internship in Computational Imaging at Prophesee

A 6-month internship at Prophesee, Grenoble is offered to a talented Master student.

The topic of the internship is working on burst imaging following the work of Sam Hasinoff, and exploring ways to improve it using event-based vision.

A compensation to cover the expenses of living in Grenoble is offered. Only students that have legal rights to work in France can apply.

Anyone interested can send an email with the CV to Daniele Perrone ( [email protected] ).

Using machine learning applied to wearables to predict mental health

This Master’s project lies at the intersection of psychiatry and computer science and aims to use machine learning techniques to improve health. Using sensors to detect sleep and waking behavior has as yet unexplored potential to reveal insights into health. In this study, we make use of a watch-like device, called an actigraph, which tracks motion to quantify sleep behavior and waking activity. Participants in the study are healthy and depressed adolescents who wear actigraphs for a year, during which time we query their mental health status monthly using online questionnaires. For this Master’s thesis we aim to make use of machine learning methods to predict mental health based on the data from the actigraph. The ability to predict mental health crises based on sleep and wake behavior would provide an opportunity for intervention, significantly impacting the lives of patients and their families. This Master’s thesis is a collaboration between Professor Paolo Favaro at the Institute of Computer Science ( [email protected] ) and Dr Leila Tarokh at the Universitäre Psychiatrische Dienste (UPD) ( [email protected] ). We are looking for a highly motivated individual interested in bridging disciplines.

Bachelor or Master Projects at the ARTORG Center

The Gerontechnology and Rehabilitation group at the ARTORG Center for Biomedical Engineering is offering multiple BSc and MSc thesis projects to students who are interested in working with real patient data, artificial intelligence and machine learning algorithms. The goal of these projects is to transfer the findings to the clinic in order to solve today’s healthcare problems and thus to improve the quality of life of patients.

Machine Learning Based Gait-Parameter Extraction by Using Simple Rangefinder Technology.
Detection of Motion in Video Recordings.
Home-Monitoring of Elderly by Radar.
Gait feature detection in Parkinson's Disease.
Development of an arthroscopic training device using virtual reality.

Contact: Dr. Stephan Gerber ( [email protected] ), Michael Single ( [email protected] )

Dynamic Transformer

Level: bachelor.

Visual Transformers have obtained state-of-the-art classification accuracies [ViT, DeiT, T2T, BoTNet]. A mixture of experts can be used to increase the capacity of a neural network by learning instance-dependent execution pathways in the network [MoE]. In this research project we aim to push transformers to their limit and combine their dynamic attention with MoEs. Compared to the Switch Transformer [Switch], we will use a much more efficient formulation of mixing [CondConv, DynamicConv], and we will apply this idea in the attention part of the transformer, not in the fully connected layer.

Input dependent attention kernel generation for better transformer layers.
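One possible reading of input-dependent kernel generation is sketched below: several expert projection matrices are blended with weights computed from the input itself, in the spirit of [CondConv, DynamicConv], and the blended matrix is then used as a projection inside attention. The function names, the softmax routing and the toy matrices are all assumptions made for illustration; the project description does not fix any of these choices.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]

def mix_experts(x, routing, experts):
    """Blend expert matrices with input-dependent weights, then apply the blend to x."""
    alphas = softmax(matvec(routing, x))  # one mixing weight per expert
    rows, cols = len(experts[0]), len(experts[0][0])
    mixed = [
        [sum(a * e[i][j] for a, e in zip(alphas, experts)) for j in range(cols)]
        for i in range(rows)
    ]
    return matvec(mixed, x), alphas

# Two hypothetical 2x2 experts: identity and coordinate swap.
experts = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
routing = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical routing layer, one row per expert
query, alphas = mix_experts([2.0, 0.0], routing, experts)
print(alphas, query)
```

Because the experts are mixed before being applied, the cost per token is a single matrix-vector product plus the blend, rather than one product per expert.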

Publication Opportunity: Dynamic Neural Networks Meets Computer Vision (a CVPR 2021 Workshop)

Extensions:

The same idea could be extended to other ViT/Transformer based models [DETR, SETR, LSTR, TrackFormer, BERT]

Quantized ViT

Visual Transformers have obtained state of the art classification accuracies [ViT, CLIP, DeiT], but the best ViT models are extremely compute heavy and running them even only for inference (not doing backpropagation) is expensive. Running transformers cheaply by quantization is not a new problem and it has been tackled before for BERT [BERT] in NLP [Q-BERT, Q8BERT, TernaryBERT, BinaryBERT]. In this project we will be trying to quantize pretrained ViT models.

Quantizing ViT models for faster inference and smaller models without losing accuracy
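As a minimal sketch of what quantizing pretrained weights involves, the snippet below applies symmetric per-tensor 8-bit quantization to a short list of weights and measures the rounding error. The scheme, the function names and the example values are assumptions for illustration only; the methods cited above [Q-BERT, Q8BERT, TernaryBERT, BinaryBERT] use more elaborate procedures.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the integer codes."""
    return [v * scale for v in q]

weights = [0.02, -0.51, 0.33, 1.27, -1.0]  # hypothetical weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

The reconstruction error of each weight is bounded by half the scale, which is why a single large outlier weight makes per-tensor quantization coarse for all other weights.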

Publication Opportunity: Binary Networks for Computer Vision 2021 (a CVPR workshop)

Extensions:

Having a fast pipeline for image inference with ViT will allow us to dig deep into the attention of ViT and analyze it. We might be able to prune some attention heads or replace them with static patterns (like local convolution or dilated patterns). We might even be able to replace the transformer with a Performer and increase the throughput further [Performer].
The same idea could be extended to other ViT based models [DETR, SETR, LSTR, TrackFormer, CPTR, BoTNet, T2TViT]
Learning Transferable Visual Models From Natural Language Supervision [CLIP]
Visual Transformers: Token-based Image Representation and Processing for Computer Vision [ViT]
DeiT: Data-efficient Image Transformers [DeiT]
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding [BERT]
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT [Q-BERT]
Q8BERT: Quantized 8Bit BERT [Q8BERT]
TernaryBERT: Distillation-aware Ultra-low Bit BERT [TernaryBERT]
BinaryBERT: Pushing the Limit of BERT Quantization [BinaryBERT]
Rethinking Attention with Performers [Performer]
End-to-End Object Detection with Transformers [DETR]
Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers [SETR]
End-to-end Lane Shape Prediction with Transformers [LSTR]
TrackFormer: Multi-Object Tracking with Transformers [TrackFormer]
CPTR: Full Transformer Network for Image Captioning [CPTR]
Bottleneck Transformers for Visual Recognition [BoTNet]
Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet [T2TViT]

Multimodal Contrastive Learning

Recently contrastive learning has gained a lot of attention for self-supervised image representation learning [SimCLR, MoCo]. Contrastive learning could be extended to multimodal data, like videos (images and audio) [CMC, CoCLR]. Most contrastive methods require large batch sizes (or large memory pools) which makes them expensive for training. In this project we are going to use non batch size dependent contrastive methods [SwAV, BYOL, SimSiam] to train multimodal representation extractors.

Our main goal is to compare the proposed method with the CMC baseline, so we will be working with STL10, ImageNet, UCF101, HMDB51, and NYU Depth-V2 datasets.

Inspired by the recent works on smaller datasets [ConVIRT, CPD], to accelerate the training speed, we could start with two pretrained single-modal models and finetune them with the proposed method.

Extending SwAV to multimodal datasets
Gaining a better understanding of BYOL

Publication Opportunity: MULA 2021 (a CVPR workshop on Multimodal Learning and Applications)

Most knowledge distillation methods for contrastive learners also use large batch sizes (or memory pools) [CRD, SEED]. The proposed method could be extended to knowledge distillation.
One could easily extend this idea to multiview learning. For example, two different networks could work on the same input and be trained with contrastive learning, which may lead to better models [DeiT] by letting the models share their inductive biases.
Self-supervised Co-training for Video Representation Learning [CoCLR]
Learning Spatiotemporal Features via Video and Text Pair Discrimination [CPD]
Audio-Visual Instance Discrimination with Cross-Modal Agreement [AVID-CMA]
Self-Supervised Learning by Cross-Modal Audio-Video Clustering [XDC]
Contrastive Multiview Coding [CMC]
Contrastive Learning of Medical Visual Representations from Paired Images and Text [ConVIRT]
A Simple Framework for Contrastive Learning of Visual Representations [SimCLR]
Momentum Contrast for Unsupervised Visual Representation Learning [MoCo]
Bootstrap your own latent: A new approach to self-supervised Learning [BYOL]
Exploring Simple Siamese Representation Learning [SimSiam]
Unsupervised Learning of Visual Features by Contrasting Cluster Assignments [SwAV]
Contrastive Representation Distillation [CRD]
SEED: Self-supervised Distillation For Visual Representation [SEED]

Robustness of Neural Networks

Neural Networks have been found to achieve surprising performance in several tasks such as classification, detection and segmentation. However, they are also very sensitive to small (controlled) changes to the input. It has been shown that some changes to an image that are not visible to the naked eye may lead the network to output an incorrect label. This thesis will focus on studying recent progress in this area and aim to build a procedure for a trained network to self-assess its reliability in classification or one of the popular computer vision tasks.
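The sensitivity described above can be illustrated with the fast gradient sign method on a tiny logistic-regression classifier. This is a sketch under stated assumptions: the weights, the input and the step size `eps` are made up, and a linear model stands in for the deep networks the thesis would actually study.

```python
import math

def predict_prob(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def sign(value):
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)

def fgsm(w, b, x, y, eps):
    """Shift every input coordinate by eps in the direction that increases the loss."""
    p = predict_prob(w, b, x)
    # For the cross-entropy loss of logistic regression, d(loss)/dx_i = (p - y) * w_i.
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

w, b = [2.0, -3.0, 1.0], 0.0       # hypothetical trained weights
x, y = [0.1, -0.1, 0.1], 1         # input correctly classified as class 1
x_adv = fgsm(w, b, x, y, eps=0.15)
print(predict_prob(w, b, x), predict_prob(w, b, x_adv))
```

In this toy case a change of at most 0.15 per coordinate moves the prediction across the decision boundary. With image inputs of many thousands of pixels, the same effect is typically reached with far smaller per-pixel changes.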

Contact: Paolo Favaro

Masters projects at sitem center

The Personalised Medicine Research Group at the sitem Center for Translational Medicine and Biomedical Entrepreneurship is offering multiple MSc thesis projects to biomedical engineering MSc students that may also be of interest to computer science students.

Automated quantification of cartilage quality for hip treatment decision support
Automated quantification of massive rotator cuff tears from MRI
Deep learning-based segmentation and fat fraction analysis of the shoulder muscles using quantitative MRI
Unsupervised Domain Adaption for Cross-Modality Hip Joint Segmentation

Contact: Dr. Kate Gerber

Internships/Master thesis @ Chronocam

3-6 month internships on event-based computer vision. Chronocam is a rapidly growing startup developing event-based technology, with more than 15 PhDs working on problems like tracking, detection, classification, SLAM, etc. Event-based computer vision has the potential to solve many long-standing problems in traditional computer vision, and this is a super exciting time as this potential is becoming more and more tangible in many real-world applications. For next year we are looking for motivated Master and PhD students with good software engineering skills (C++ and/or Python), and preferably a good computer vision and deep learning background. PhD internships will be more research focused and may lead to a publication. Each intern is offered compensation to cover the expenses of living in Paris. List of some of the topics we want to explore:

Photo-realistic image synthesis and super-resolution from event-based data (PhD)
Self-supervised representation learning (PhD)
End-to-end Feature Learning for Event-based Data
Bio-inspired Filtering using Spiking Networks
On-the-fly Compression of Event-based Streams for Low-Power IoT Cameras
Tracking of Multiple Objects with a Dual-Frequency Tracker
Event-based Autofocus
Stabilizing an Event-based Stream using an IMU
Crowd Monitoring for Low-power IoT Cameras
Road Extraction from an Event-based Camera Mounted in a Car for Autonomous Driving
Sign detection from an Event-based Camera Mounted in a Car for Autonomous Driving
High-frequency Eye Tracking

Email with attached CV to Daniele Perrone.

Contact: Daniele Perrone

Object Detection in 3D Point Clouds

Today we have many 3D scanning techniques that allow us to capture the shape and appearance of objects. It is easier than ever to scan real 3D objects and transform them into a digital model for further processing, such as modeling, rendering or animation. However, the output of a 3D scanner is often a raw point cloud with little to no annotations. The unstructured nature of the point cloud representation makes it difficult to process, e.g. for surface reconstruction. One application is the detection and segmentation of an object of interest. In this project, the student is challenged to design a system that takes a point cloud (a 3D scan) as input and outputs the names of objects contained in the scan. This output can then be used to eliminate outliers or points that belong to the background. The approach involves collecting a large dataset of 3D scans and training a neural network on it.

Contact: Adrian Wälchli

Shape Reconstruction from a Single RGB Image or Depth Map

A photograph accurately captures the world in a moment of time and from a specific perspective. Since it is a projection of the 3D space to a 2D image plane, the depth information is lost. Is it possible to restore it, given only a single photograph? In general, the answer is no. This problem is ill-posed, meaning that many different plausible depth maps exist, and there is no way of telling which one is the correct one. However, if we cover one of our eyes, we are still able to recognize objects and estimate how far away they are. This motivates the exploration of an approach where prior knowledge can be leveraged to reduce the ill-posedness of the problem. Such a prior could be learned by a deep neural network, trained with many images and depth maps.
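The ill-posedness can be made concrete with a short derivation. It assumes an ideal pinhole camera with focal length f and image coordinates (u, v), which is a simplification not stated in the project description.

```latex
% Pinhole projection of a 3D point (X, Y, Z) with Z > 0:
\[
  u = f\,\frac{X}{Z}, \qquad v = f\,\frac{Y}{Z}.
\]
% Scaling the point by any s > 0 leaves the projection unchanged:
\[
  f\,\frac{sX}{sZ} = f\,\frac{X}{Z} = u, \qquad
  f\,\frac{sY}{sZ} = f\,\frac{Y}{Z} = v.
\]
% Hence every point on the ray \{(sX, sY, sZ) : s > 0\} maps to the same pixel,
% and the depth Z cannot be recovered from (u, v) alone.
```

A learned prior resolves this ambiguity only statistically, by preferring depths that are plausible for the recognized objects and scene layout.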

CNN Based Deblurring on Mobile

Deblurring finds many applications in our everyday life. It is particularly useful when taking pictures on handheld devices (e.g. smartphones) where camera shake can degrade important details. Therefore, it is desired to have a good deblurring algorithm implemented directly in the device. In this project, the student will implement and optimize a state-of-the-art deblurring method based on a deep neural network for deployment on mobile phones (Android). The goal is to reduce the number of network weights in order to reduce the memory footprint while preserving the quality of the deblurred images. The result will be a camera app that automatically deblurs the pictures, giving the user a choice of keeping the original or the deblurred image.

Depth from Blur

If an object in front of the camera or the camera itself moves while the shutter is open, the region of motion becomes blurred because the incoming light is accumulated in different positions across the sensor. If there is camera motion, there is also parallax. Thus, a motion-blurred image contains depth information. In this project, the student will tackle the problem of recovering a depth map from a motion-blurred image. This includes the collection of a large dataset of blurred and sharp images or videos using a pair or triplet of GoPro action cameras. Two cameras will be used in stereo to estimate the depth map, and the third captures the blurred frames. This data is then used to train a convolutional neural network that will predict the depth map from the blurry image.
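For the stereo part of the data collection, depth follows from disparity through the standard relation Z = f · B / d. The sketch below assumes a rectified camera pair, a focal length given in pixels and a baseline in meters; the numbers are made up and do not describe the actual GoPro rig.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in meters of a point seen with the given disparity in a rectified stereo pair."""
    if disparity_px <= 0:
        return float("inf")  # zero disparity corresponds to a point at infinity
    return focal_px * baseline_m / disparity_px

# Hypothetical disparities (in pixels) along one image row.
row = [40.0, 20.0, 0.0]
depths = [depth_from_disparity(d, focal_px=800.0, baseline_m=0.1) for d in row]
print(depths)
```

Halving the disparity doubles the estimated depth, so depth resolution degrades quickly for distant objects when the baseline is short.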

Unsupervised Clustering Based on Pretext Tasks

The idea of this project is that we have two types of neural networks that work together: There is one network A that assigns images to k clusters and k (simple) networks of type B perform a self-supervised task on those clusters. The goal of all the networks is to make the k networks of type B perform well on the task. The assumption is that clustering in semantically similar groups will help the networks of type B to perform well. This could be done on the MNIST dataset with B being linear classifiers and the task being rotation prediction.

Adversarial Data-Augmentation

The student designs a data augmentation network that transforms training images in such a way that image realism is preserved (e.g. with a constrained spatial transformer network) and the transformed images are more difficult to classify (trained via adversarial loss against an image classifier). The model will be evaluated for different data settings (especially in the low data regime), for example on the MNIST and CIFAR datasets.

Unsupervised Learning of Lip-reading from Videos

People with sensory impairment (hearing, speech, vision) depend heavily on assistive technologies to communicate and navigate in everyday life. The mass production of media content today makes it impossible to manually translate everything into a common language for assistive technologies, e.g. captions or sign language. In this project, the student employs a neural network to learn a representation for lip-movement in videos in an unsupervised fashion, possibly with an encoder-decoder structure where the decoder reconstructs the audio signal. This requires collecting a large dataset of videos (e.g. from YouTube) of speakers or conversations where lip movement is visible. The outcome will be a neural network that learns an audio-visual representation of lip movement in videos, which can then be leveraged to generate captions for hearing impaired persons.

Learning to Generate Topographic Maps from Satellite Images

Satellite images have many applications, e.g. in meteorology, geography, education, cartography and warfare. They are an accurate and detailed depiction of the surface of the earth from above. Although it is relatively simple to collect many satellite images in an automated way, challenges arise when processing them for use in navigation and cartography. The idea of this project is to automatically convert an arbitrary satellite image, e.g. of a city, to a map of simple 2D shapes (streets, houses, forests) and label them with colors (semantic segmentation). The student will collect a dataset of satellite images and topographic maps and train a deep neural network that learns to map from one domain to the other. The data could be obtained from a Google Maps database or similar.

Optimization of OmniMotion, a tracking algorithm

Martí Farré Farrús · June 2024

This thesis presents Quasi-OmniFastTrack, an improved version of the OmniMotion algorithm for long-term pixel tracking in videos. The key contribution is reducing the computational expense and training time of OmniMotion while maintaining comparable tracking performance. The main bottleneck in OmniMotion was identified to be the NeRF network used for 3D scene representation. Quasi-OmniFastTrack replaces this with a pre-trained depth estimation model, significantly reducing training time, based on the work introduced in OmniFastTrack, hence the name. The invertible neural network for mapping between local and canonical coordinates is retained, but optimized depths are used to lift 2D pixels to 3D. Experiments show that Quasi-OmniFastTrack reduces training time by over 50% compared to OmniMotion while achieving similar qualitative tracking results on sequences with occlusions. Performance degrades somewhat on fast-moving scenes. The ablation studies demonstrate the importance of optimizing the initial depth estimates during training. While not matching OmniMotion's robustness in all scenarios, Quasi-OmniFastTrack offers a compelling speed-accuracy tradeoff, enabling long-term tracking on more videos in practical timeframes. Future work on incorporating other modifications introduced in OmniFastTrack, like long-term semantic features, could further improve tracking consistency.

New Variables of Brain Morphometry: the Potential and Limitations of CNN Regression

Timo Blattner · Sept. 2022

The calculation of variables of brain morphology is computationally very expensive and time-consuming. A previous work showed the feasibility of extracting the variables directly from T1-weighted brain MRI images using a convolutional neural network. We used significantly more data and extended their model to a new set of neuromorphological variables, which could become interesting biomarkers in the future for the diagnosis of brain diseases. The model shows for nearly all subjects a less than 5% mean relative absolute error. This high relative accuracy can be attributed to the low morphological variance between subjects and the ability of the model to predict the cortical atrophy age trend. The model however fails to capture all the variance in the data and shows large regional differences. We attribute these limitations in part to the moderate to poor reliability of the ground truth generated by FreeSurfer. We further investigated the effects of training data size and model complexity on this regression task and found that the size of the dataset had a significant impact on performance, while deeper models did not perform better. Lack of interpretability and dependence on a silver ground truth are the main drawbacks of this direct regression approach.

Home Monitoring by Radar

Lars Ziegler · Sept. 2022

Detection and tracking of humans via UWB radars is a promising and continuously evolving field with great potential for medical technology. This contactless method of acquiring data on a patient's movement patterns is ideal for in-home application. As irregularities in a patient's movement patterns are an indicator of various health problems, including neurodegenerative diseases, the insight this data could provide may enable earlier detection of such problems. In this thesis a signal processing pipeline is presented with which a person's movement is modeled. During an experiment, 142 measurements were recorded by two separate radar systems and one lidar system, each of which consisted of multiple sensors. The models that the signal processing pipeline calculated from these measurements were used to predict the times when a person stood up or sat down. The predictions showed an accuracy of 72.2%.

Revisiting non-learning based 3D reconstruction from multiple images

Aaron Sägesser · Oct. 2021

Arthroscopy consists of challenging tasks and requires skills that, even today, young surgeons still train directly during surgery. Existing simulators are expensive and rarely available. With the growing potential of virtual reality (VR) (head-mounted) devices for simulation and their applicability in the medical context, these devices have become a promising alternative that would be orders of magnitude cheaper and could be made widely available. The overall aim of our project is to build a VR-based training device for arthroscopy, as this would be of great benefit and might even be applicable to other minimally invasive surgery (MIS). This thesis marks a first step of the project, focusing on exploring and comparing well-known algorithms in multi-view stereo (MVS) based 3D reconstruction with respect to imagery acquired by an arthroscopic camera. Simultaneously with this reconstruction, we aim to gain essential measures to compare the VR environment to the real world, as validation of the realism of future VR tasks. We evaluate 3 different feature extraction algorithms with 3 different matching techniques and 2 different algorithms for the estimation of the fundamental (F) matrix. These 18 different setups are evaluated with a reconstruction pipeline embedded in a Jupyter notebook, implemented in Python on top of common computer vision libraries, and compared with imagery generated with a mobile phone as well as with the reconstruction results of the state-of-the-art (SOTA) structure-from-motion (SfM) software COLMAP and Multi-View Environment (MVE). Our comparative analysis reveals the challenges posed by the heavy distortion, fish-eye shape and weak image quality of arthroscopic imagery, as all results are substantially worse using this data. However, there are huge differences between the setups. Scale Invariant Feature Transform (SIFT) and Oriented FAST Rotated BRIEF (ORB) in combination with k-Nearest Neighbour (kNN) matching and Least Median of Squares (LMedS) present the most promising results. Overall, the 3D reconstruction pipeline is a useful tool to foster the process of gaining measurements from the arthroscopic exploration device and to complement the comparative research in this context.

Examination of Unsupervised Representation Learning by Predicting Image Rotations

Eric Lagger · Sept. 2020

In recent years, deep convolutional neural networks have made a lot of progress. Training such a network requires a lot of data, and supervised learning algorithms additionally require that the data be labeled. Labeling data takes a lot of human work, time and money. To avoid these inconveniences, we would like to find systems that do not need labeled data, that is, unsupervised learning algorithms. This is why unsupervised algorithms are important, even though their results are not yet on the same qualitative level as those of supervised algorithms. In this thesis we discuss one such approach and compare its results to those of other papers. A deep convolutional neural network is trained to recognize the rotations that have been applied to a picture: we take a large number of images, apply some simple rotations, and the task of the network is to discover in which direction each image has been rotated. The data does not need to be labeled with a category or anything else. As long as all the pictures are upside down we hope to find some high dimensional patterns for the network to learn.
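The pretext task can be sketched in a few lines: each image is rotated by a multiple of 90 degrees, and the rotation index serves as a label that costs nothing to produce. The choice of exactly four rotations and the function names are assumptions for illustration; the abstract only speaks of "some simple rotations".

```python
def rotate90(image, k):
    """Rotate a 2D list counter-clockwise by k * 90 degrees."""
    for _ in range(k % 4):
        # Transpose, then reverse the row order: one counter-clockwise quarter turn.
        image = [list(row) for row in zip(*image)][::-1]
    return image

def make_rotation_batch(image):
    """Return (rotated_image, label) pairs, where the label is the rotation index 0..3."""
    return [(rotate90(image, k), k) for k in range(4)]

image = [[1, 2],
         [3, 4]]  # hypothetical 2x2 "image"
batch = make_rotation_batch(image)
for rotated, label in batch:
    print(label, rotated)
```

A classifier trained on such pairs never sees a human annotation, yet to solve the task it has to pick up on cues such as object orientation.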

StitchNet: Image Stitching using Autoencoders and Deep Convolutional Neural Networks

Maurice Rupp · Sept. 2019

This thesis explores the prospect of artificial neural networks for image processing tasks. More specifically, it aims to stitch multiple overlapping images to form a bigger, panoramic picture. Until now, this task has been approached solely with “classical”, hardcoded algorithms, while deep learning has at most been used for specific subtasks. This thesis introduces a novel end-to-end neural network approach to image stitching called StitchNet, which uses a pre-trained autoencoder and deep convolutional networks. In addition to presenting several new datasets for the task of supervised image stitching, each with 120’000 training and 5’000 validation samples, this thesis also conducts various experiments with different kinds of existing networks designed for image superresolution and image segmentation, adapted to the task of image stitching. StitchNet outperforms most of the adapted networks in both quantitative and qualitative results.

Facial Expression Recognition in the Wild

Luca Rolshoven · Sept. 2019

The idea of inferring the emotional state of a subject by looking at their face is nothing new. Neither is the idea of automating this process using computers. Researchers used to computationally extract handcrafted features from face images that had proven themselves to be effective and then used machine learning techniques to classify the facial expressions using these features. Recently, there has been a trend towards using deep learning and especially Convolutional Neural Networks (CNNs) for the classification of these facial expressions. Researchers were able to achieve good results on images that were taken in laboratories under the same or at least similar conditions. However, these models do not perform very well on more arbitrary face images with different head poses and illumination. This thesis aims to show the challenges of Facial Expression Recognition (FER) in this wild setting. It presents the currently used datasets and the present state-of-the-art results on one of the biggest facial expression datasets currently available. The contributions of this thesis are twofold. Firstly, I analyze three famous neural network architectures and their effectiveness on the classification of facial expressions. Secondly, I present two modifications of one of these networks that lead to the proposed STN-COV model. While this model does not outperform all of the current state-of-the-art models, it does beat several of them.

A Study of 3D Reconstruction of Varying Objects with Deformable Parts Models

Raoul Grossenbacher · July 2019

This work covers a new approach to 3D reconstruction. In traditional 3D reconstruction one uses multiple images of the same object to calculate a 3D model, taking information gained from the differences between the images, such as camera position, illumination and rotation of the object, to compute a point cloud representing the object. The characteristic trait shared by all these approaches is that one can change almost everything about the image, but it is not possible to change the object itself, because one needs to find correspondences between the images. To be able to use different instances of the same object, we used a 3D DPM model that can find different parts of an object in an image, thereby detecting the correspondences between the different pictures, which we can then use to calculate the 3D model. To put this theory into practice, we gave a 3D DPM model, which was trained to detect cars, pictures of different car brands, where no pair of images showed the same vehicle, and used the detected correspondences and the Factorization Method to compute the 3D point cloud. This technique leads to a completely new approach in 3D reconstruction, because changing the object itself was never done before.

Motion deblurring in the wild replication and improvements

Alvaro Juan Lahiguera · Jan. 2019

Coma Outcome Prediction with Convolutional Neural Networks

Stefan Jonas · Oct. 2018

Automatic Correction of Self-Introduced Errors in Source Code

Sven Kellenberger · Aug. 2018

Neural Face Transfer: Training a Deep Neural Network to Face-Swap

Till Nikolaus Schnabel · July 2018

This thesis explores the field of artificial neural networks with realistic looking visual outputs. It aims at morphing face pictures of a specific identity to look like another individual by only modifying key features, such as eye color, while leaving identity-independent features unchanged. Prior works have covered the topic of symmetric translation between two specific domains but failed to optimize it on faces where only parts of the image may be changed. This work applies a face masking operation to the output at training time, which forces the image generator to preserve colors while altering the face, fitting it naturally inside the unmorphed surroundings. Various experiments are conducted including an ablation study on the final setting, decreasing the baseline identity switching performance from 81.7% to 75.8 % whilst improving the average χ2 color distance from 0.551 to 0.434. The provided code-based software gives users easy access to apply this neural face swap to images and videos of arbitrary crop and brings Computer Vision one step closer to replacing Computer Graphics in this specific area.

A Study of the Importance of Parts in the Deformable Parts Model

Sammer Puran · June 2017

Self-Similarity as a Meta Feature

Lucas Husi · April 2017

A Study of 3D Deformable Parts Models for Detection and Pose-Estimation

Simon Jenni · March 2015

Accelerated Federated Learning on Client Silos with Label Noise: RHO Selection in Classification and Segmentation

Irakli Kelbakiani · May 2024

Federated Learning has recently gained more research interest. This increased attention is caused by factors including the growth of decentralized data, privacy concerns, and new privacy regulations. In Federated Learning, remote servers independently train a model on their local datasets, and the local models are subsequently aggregated into a global model, which achieves better overall performance. Sending local model weights instead of the entire dataset is a significant advantage of Federated Learning over centralized classical machine learning algorithms. Federated Learning involves uploading and downloading model parameters multiple times, so there are multiple communication rounds between the global server and remote client servers, which imposes challenges. The high number of necessary communication rounds not only increases communication overhead and cost but is also a critical limitation for servers with low network bandwidth, which leads to latency and a higher probability of training failures caused by communication breakdowns. To mitigate these challenges, we aim to provide a fast-converging Federated Learning training methodology that decreases the number of necessary communication rounds. We build on the Reducible Holdout Loss Selection (RHO-Loss) batch selection methodology, which “selects low-noise, task-relevant, non-redundant points for training” [1]. We hypothesize that if client silos employ the RHO-Loss methodology and successfully avoid training their local models on noisy and non-relevant samples, clients may offer stable and consistent updates to the global server, which could lead to faster convergence of the global model. Our contribution focuses on investigating the RHO-Loss method in a simulated federated setting for the Clothing1M dataset. We also examine its applicability to medical datasets and check its effectiveness in a simulated federated environment.
Our experimental results show a promising outcome, specifically a reduction in communication rounds for the Clothing1M dataset. However, as the success of the RHO-Loss selection method depends on the availability of sufficient training data for the target RHO model and for the Irreducible RHO model, we emphasize that our contribution applies to those Federated Learning scenarios where client silos hold enough training data to successfully train and benefit from their RHO model on their local dataset.
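The two building blocks of this abstract can be sketched as follows: aggregation of client models into a global model, and RHO-Loss selection, which ranks samples by training loss minus irreducible holdout loss. Weighting clients by dataset size, the function names and all numbers are assumptions for illustration and are not taken from the thesis.

```python
def fed_avg(client_weights, client_sizes):
    """Average client weight vectors, weighting each client by its dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

def rho_select(train_losses, irreducible_losses, k):
    """Return the indices of the k samples with the largest reducible holdout loss."""
    reducible = [t - i for t, i in zip(train_losses, irreducible_losses)]
    ranked = sorted(range(len(reducible)), key=lambda idx: reducible[idx], reverse=True)
    return sorted(ranked[:k])

# Hypothetical per-sample losses on one client batch.
train_losses = [2.5, 2.4, 0.1, 1.5]          # loss under the current local model
irreducible_losses = [2.4, 0.3, 0.05, 0.2]   # loss under the irreducible-loss model
selected = rho_select(train_losses, irreducible_losses, k=2)

# Hypothetical local models from two clients holding 1 and 3 samples.
global_weights = fed_avg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(selected, global_weights)
```

Sample 0 has the highest training loss but is skipped because the irreducible-loss model also fails on it, which suggests label noise. Sample 2 is skipped because it is already learned.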

Amodal Leaf Segmentation

Nicolas Maier · Nov. 2023

Plant phenotyping is the process of measuring and analyzing various traits of plants. It provides essential information on how genetic and environmental factors affect plant growth and development. Manual phenotyping is highly time-consuming; therefore, many computer vision and machine learning based methods have been proposed in the past years to perform this task automatically based on images of the plants. However, the publicly available datasets (in particular, of Arabidopsis thaliana) are limited in size and diversity, making them unsuitable to generalize to new unseen environments. In this work, we propose a complete pipeline able to automatically extract traits of interest from an image of Arabidopsis thaliana. Our method uses a minimal amount of existing annotated data from a source domain to generate a large synthetic dataset adapted to a different target domain (e.g., different backgrounds, lighting conditions, and plant layouts). In addition, unlike the source dataset, the synthetic one provides ground-truth annotations for the occluded parts of the leaves, which are relevant when measuring some characteristics of the plant, e.g., its total area. This synthetic dataset is then used to train a model to perform amodal instance segmentation of the leaves to obtain the total area, leaf count, and color of each plant. To validate our approach, we create a small dataset composed of manually annotated real images of Arabidopsis thaliana, which is used to assess the performance of the models.

Assessment of movement and pose in a hospital bed by ambient and wearable sensor technology in healthy subjects

Tony Licata · Sept. 2022

The use of automated systems describing human motion has become possible in various domains. Most of the proposed systems are designed to work with people moving around in a standing position. Because such a system could be interesting in a medical environment, we propose in this work a pipeline that can effectively predict the motion of people lying in beds. The proposed pipeline is tested with a data set composed of 41 participants executing 7 predefined tasks in a bed. The motion of the participants is measured with video cameras, accelerometers and a pressure mat. Various experiments are carried out with the information retrieved from the data set. Two approaches combining the data from the different measurement technologies are explored. The performance of the different experiments is measured, and the proposed pipeline is composed of the components providing the best results. Finally, we show that the proposed pipeline only needs to use the video cameras, which makes the proposed environment easier to implement in real-life situations.

Machine Learning Based Prediction of Mental Health Using Wearable-measured Time Series

Seyedeh sharareh mirzargar · sept. 2022.

Depression is the second major cause of years spent in disability and has a growing prevalence in adolescents. The recent Covid-19 pandemic has intensified the situation and limited in-person patient monitoring due to distancing measures. Recent advances in wearable devices have made it possible to record the rest/activity cycle remotely with high precision and in real-world contexts. We aim to use machine learning methods to predict an individual's mental health based on wearable-measured sleep and physical activity. Predicting an impending mental health crisis of an adolescent allows for prompt intervention, detection of depression onset or its recurrence, and remote monitoring. To achieve this goal, we train three primary forecasting models: linear regression, random forest, and light gradient boosted machine (LightGBM), and two deep learning models: block recurrent neural network (block RNN) and temporal convolutional network (TCN), on Actigraph measurements to forecast mental health in terms of depression, anxiety, sleepiness, stress, sleep quality, and behavioral problems. Our models achieve a high forecasting performance; the random forest performs best, reaching an accuracy of 98% when forecasting trait anxiety. We perform extensive experiments to evaluate the models' performance in accuracy, generalization, and feature utilization, using a naive forecaster as the baseline. Our analysis shows minimal mental health changes over two months, making the prediction task easily achievable. Due to these minimal changes in mental health, the models tend to primarily use the historical values of mental health evaluation instead of Actigraph features. At the time of this master thesis, the data acquisition step is still in progress. In future work, we plan to train the models on the complete dataset using a longer forecasting horizon to increase the level of mental health changes and perform transfer learning to compensate for the small dataset size.
This interdisciplinary project demonstrates the opportunities and challenges in machine learning based prediction of mental health, paving the way toward using the same techniques to forecast other mental disorders such as internalizing disorder, Parkinson's disease, Alzheimer's disease, etc. and improving the quality of life for individuals who have some mental disorder.
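The role of the naive baseline mentioned above can be illustrated with a minimal sketch. It assumes the naive forecaster is a persistence forecaster (it repeats the last observed score), which the abstract does not specify; the scores below are made up for illustration and are not data from the thesis.

```python
def naive_forecast(history):
    """Persistence baseline (assumed): predict the last observed value."""
    return history[-1]


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical questionnaire scores: (past values, value to forecast).
subjects = [
    ([12, 12, 13], 13),
    ([30, 29, 29], 28),
    ([7, 7, 7], 7),
]

predictions = [naive_forecast(history) for history, _ in subjects]
targets = [target for _, target in subjects]
naive_mae = mean_absolute_error(targets, predictions)
print(f"naive baseline MAE: {naive_mae:.3f}")
```

When scores barely change between visits, this baseline is already hard to beat, which is consistent with the observation that the models lean on historical mental health values rather than Actigraph features.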

CNN Spike Detector: Detection of Spikes in Intracranial EEG using Convolutional Neural Networks

Stefan jonas · oct. 2021.

The detection of interictal epileptiform discharges in the visual analysis of electroencephalography (EEG) is an important but very difficult, tedious, and time-consuming task. There have been decades of research on computer-assisted detection algorithms, most recently focused on using Convolutional Neural Networks (CNNs). In this thesis, we present the CNN Spike Detector, a convolutional neural network to detect spikes in intracranial EEG. Our dataset of 70 intracranial EEG recordings from 26 subjects with epilepsy introduces new challenges in this research field. We report cross-validation results with a mean AUC of 0.926 (± 0.04), an area under the precision-recall curve (AUPRC) of 0.652 (± 0.10) and 12.3 (± 7.47) false positive epochs per minute for a sensitivity of 80%. A visual examination of false positive segments is performed to understand the model behavior leading to a relatively high false detection rate. We notice issues with the evaluation measures and highlight a major limitation of the common approach of detecting spikes using short segments, namely that the network is not capable of considering the greater context of the segment with regard to its origin. For this reason, we present the Context Model, an extension in which the CNN Spike Detector is supplied with additional information about the channel. Results show promising but limited performance improvements. This thesis provides important findings about the spike detection task for intracranial EEG and lays out promising future research directions to develop a network capable of assisting experts in real-world clinical applications.
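The metric "false positive epochs per minute for a sensitivity of 80%" can be sketched as follows. This is one plausible way to compute it, not necessarily the procedure used in the thesis: the function name, the thresholding rule, and the toy scores are all assumptions for illustration.

```python
import math


def fp_per_minute_at_sensitivity(scores, labels, minutes, target=0.8):
    """Pick the highest threshold that reaches the target sensitivity,
    then count false positive epochs per minute at that threshold."""
    pos_scores = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not pos_scores:
        raise ValueError("at least one positive epoch is required")
    # Number of positives that must be detected to reach the target.
    needed = max(1, math.ceil(round(target * len(pos_scores), 9)))
    threshold = pos_scores[needed - 1]
    false_pos = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return threshold, false_pos / minutes


# Toy detector outputs for ten epochs from a two-minute recording.
scores = [0.95, 0.90, 0.85, 0.60, 0.20, 0.70, 0.65, 0.30, 0.10, 0.05]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
threshold, fp_rate = fp_per_minute_at_sensitivity(scores, labels, minutes=2)
print(threshold, fp_rate)
```

Fixing the sensitivity first makes detectors comparable at the same operating point, at the cost of hiding how the false positive rate behaves at other thresholds.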

PolitBERT - Deepfake Detection of American Politicians using Natural Language Processing

Maurice rupp · april 2021.

This thesis explores the application of modern Natural Language Processing techniques to the detection of artificially generated videos of popular American politicians. Instead of focusing on detecting anomalies and artifacts in images and sounds, this thesis focuses on detecting irregularities and inconsistencies in the words themselves, opening up a new possibility to detect fake content. A novel, domain-adapted, pre-trained version of the language model BERT combined with several mechanisms to overcome severe dataset imbalances yielded the best quantitative as well as qualitative results. In addition to creating the largest publicly available dataset of English-speaking politicians, consisting of 1.5 M sentences from over 1000 persons, this thesis conducts various experiments with different kinds of text classification and sequence processing algorithms applied to the political domain. Furthermore, multiple ablations to manage severe data imbalance are presented and evaluated.

A Study on the Inversion of Generative Adversarial Networks

Ramona beck · march 2021.

The desire to use generative adversarial networks (GANs) for real-world tasks such as object segmentation or image manipulation is increasing as synthesis quality improves, which has given rise to an emerging research area called GAN inversion that focuses on exploring methods for embedding real images into the latent space of a GAN. In this work, we investigate different GAN inversion approaches using an existing generative model architecture that takes a completely unsupervised approach to object segmentation and is based on StyleGAN2. In particular, we propose and analyze algorithms for embedding real images into the different latent spaces Z, W, and W+ of StyleGAN following an optimization-based inversion approach, while also investigating a novel approach that allows fine-tuning of the generator during the inversion process. Furthermore, we investigate a hybrid and a learning-based inversion approach, where in the former we train an encoder with embeddings optimized by our best optimization-based inversion approach, and in the latter we define an autoencoder, consisting of an encoder and the generator of our generative model as a decoder, and train it to map an image into the latent space. We demonstrate the effectiveness of our methods as well as their limitations through a quantitative comparison with existing inversion methods and by conducting extensive qualitative and quantitative experiments with synthetic data as well as real images from a complex image dataset. We show that we achieve qualitatively satisfying embeddings in the W and W+ spaces with our optimization-based algorithms, that fine-tuning the generator during the inversion process leads to qualitatively better embeddings in all latent spaces studied, and that the learning-based approach also benefits from a variable generator as well as a pre-training with our hybrid approach. 
Furthermore, we evaluate our approaches on the object segmentation task and show that both our optimization-based and our hybrid and learning-based methods are able to generate meaningful embeddings that achieve reasonable object segmentations. Overall, our proposed methods illustrate the potential of GAN inversion and its application to real-world tasks, especially in the relaxed version of GAN inversion where the weights of the generator are allowed to vary.

Multi-scale Momentum Contrast for Self-supervised Image Classification

Zhao xueqi · dec. 2020.

As supervised learning has matured, research focus has gradually shifted to self-supervised learning. "Momentum Contrast" (MoCo) proposed a new self-supervised learning method and raised the accuracy of self-supervised learning to a new level. Inspired by another article, "Representation Learning by Learning to Count", we hypothesize that dividing a picture into four parts and passing each part through a neural network can further improve the accuracy of MoCo. Unlike the original MoCo, this variant (Multi-scale MoCo) does not pass the augmented image directly through the encoder. Instead, Multi-scale MoCo crops and resizes the augmented image, and the four resulting parts are each passed through the encoder and then summed (the upsampled version does not resize the input but resizes the contrastive samples). This cropping is applied not only to the query q but also to the key k; otherwise the weights of the key encoder might be damaged during the momentum update. This is discussed further in the experiments chapter, which compares the downsampled Multi-scale version with the version that downsamples both. Human object recognition follows the same principle: when people see something familiar, they can still guess the object with high probability even if it is not fully visible. Multi-scale MoCo therefore applies this concept to the pretext part of MoCo, hoping to obtain better feature extraction. This thesis considers three versions of Multi-scale MoCo: a downsampled input samples version, a downsampled input samples and contrast samples version, and an upsampled input samples version. The differences between these versions will be described in more detail later. The neural network architecture comparison includes ResNet50, and the tested data set is STL-10.
The weights obtained in the pretext stage are transferred to the self-supervised learning stage, during which the weights of all layers except the final linear layer are kept frozen (these weights come from the pretext stage).
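The "split into four parts, encode each, then sum" idea can be sketched in a few lines. This is only an illustration under stated assumptions: the image is a plain nested list, `toy_encoder` is a hypothetical stand-in for the ResNet50 encoder, the four parts are taken as non-overlapping quadrants, and the resize step and the momentum-updated key encoder are omitted.

```python
def split_into_quadrants(image):
    """Split a 2D image (list of rows) into four non-overlapping parts."""
    height, width = len(image), len(image[0])
    mid_h, mid_w = height // 2, width // 2
    row_ranges = [(0, mid_h), (mid_h, height)]
    col_ranges = [(0, mid_w), (mid_w, width)]
    return [
        [row[c0:c1] for row in image[r0:r1]]
        for r0, r1 in row_ranges
        for c0, c1 in col_ranges
    ]


def toy_encoder(patch):
    """Hypothetical encoder: maps a patch to a 2-d feature [mean, max]."""
    values = [v for row in patch for v in row]
    return [sum(values) / len(values), max(values)]


def multi_scale_embedding(image, encoder):
    """Encode each of the four parts and sum the features component-wise."""
    features = [encoder(part) for part in split_into_quadrants(image)]
    return [sum(component) for component in zip(*features)]


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
embedding = multi_scale_embedding(image, toy_encoder)
print(embedding)
```

In a real setup the same split-and-sum step would be applied on both the query and the key side, as the abstract argues.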

Self-Supervised Learning Using Siamese Networks and Binary Classifier

Dušan mihajlov · march 2020.

In this thesis, we present several approaches for training a convolutional neural network using only unlabeled data. Our self-supervised learning algorithms are based on the connection between an image patch, i.e., a zoomed image, and its original. Using a siamese neural network architecture, we aim to recognize whether the image patch, which is the input to the first part of the network, comes from the same image presented to the second part. By applying transformations to both images, and using different zoom sizes at different positions, we force the network to extract high-level features with its convolutional layers. On top of our siamese architecture, we have a simple binary classifier that measures the difference between the extracted feature maps and makes a decision. Thus, the classifier can only solve the task correctly when our convolutional layers extract useful representations. We can then use those representations to solve many different tasks related to the data used for unsupervised training. As the main benchmark for all of our models, we use the STL-10 dataset, a widely used benchmark for unsupervised learning, where we train a linear classifier on top of our convolutional layers with a small number of manually labeled images. We also combine our idea with recent work on the same topic, a network called RotNet, which makes use of image rotations and thereby forces the network to learn rotation-dependent features from the dataset. As a result of this combination, we create a new procedure that outperforms the original RotNet.

Learning Object Representations by Mixing Scenes

Lukas zbinden · may 2019.

In the digital age of ever increasing data amassment and accessibility, the demand for scalable machine learning models effective at refining the new oil is unprecedented. Unsupervised representation learning methods present a promising approach to exploit this invaluable yet unlabeled digital resource at scale. However, a majority of these approaches focuses on synthetic or simplified datasets of images. What if a method could learn directly from natural Internet-scale image data? In this thesis, we propose a novel approach for unsupervised learning of object representations by mixing natural image scenes. Without any human help, our method mixes visually similar images to synthesize new realistic scenes using adversarial training. In this process the model learns to represent and understand the objects prevalent in natural image data and makes them available for downstream applications. For example, it enables the transfer of objects from one scene to another. Through qualitative experiments on complex image data we show the effectiveness of our method along with its limitations. Moreover, we benchmark our approach quantitatively against state-of-the-art works on the STL-10 dataset. Our proposed method demonstrates the potential that lies in learning representations directly from natural image data and reinforces it as a promising avenue for future research.

Representation Learning using Semantic Distances

Markus roth · may 2019.

Zero-Shot Learning using Generative Adversarial Networks

Hamed hemati · dec. 2018.

Dimensionality Reduction via CNNs - Learning the Distance between Images

Ioannis glampedakis · sept. 2018.

Learning to Play Othello using Deep Reinforcement Learning and Self Play

Thomas simon steinmann · sept. 2018.

ABA-J Interactive Multi-Modality Tissue Section-to-Volume Alignment: A Brain Atlasing Toolkit for ImageJ

Felix meyenhofer · march 2018.

Learning Visual Odometry with Recurrent Neural Networks

Adrian wälchli · feb. 2018.

In computer vision, Visual Odometry is the problem of recovering the camera motion from a video. It is related to Structure from Motion, the problem of reconstructing the 3D geometry from a collection of images. Decades of research in these areas have brought successful algorithms that are used in applications like autonomous navigation, motion capture, augmented reality and others. Despite the success of these prior works in real-world environments, their robustness is highly dependent on manual calibration and the magnitude of noise present in the images in the form of, e.g., non-Lambertian surfaces, dynamic motion and other forms of ambiguity. This thesis explores an alternative approach to the Visual Odometry problem via Deep Learning, that is, a specific form of machine learning with artificial neural networks. It describes and focuses on the implementation of a recent work that proposes the use of Recurrent Neural Networks to learn dependencies over time due to the sequential nature of the input. Together with a convolutional neural network that extracts motion features from the input stream, the recurrent part accumulates knowledge from the past to make camera pose estimations at each point in time. An analysis of the performance of this system is carried out on real and synthetic data. The evaluation covers several ways of training the network as well as the impact and limitations of the recurrent connection for Visual Odometry.

Crime location and timing prediction

Bernard swart · jan. 2018.

From Cartoons to Real Images: An Approach to Unsupervised Visual Representation Learning

Simon jenni · feb. 2017.

Automatic and Large-Scale Assessment of Fluid in Retinal OCT Volume

Nina mujkanovic · dec. 2016.

Segmentation in 3D using Eye-Tracking Technology

Michele wyss · july 2016.

Accurate Scale Thresholding via Logarithmic Total Variation Prior

Remo diethelm · aug. 2014.

Unsupervised Object Segmentation with Generative Models

Adam jakub bielski · april 2024.

Advances in computer vision have transformed how we interact with technology, driven by significant breakthroughs in scalable deep learning and the availability of large datasets. These technologies now play a crucial role in various applications, from improving user experience through applications like organizing digital photo libraries, to advancing medical diagnostics and treatments. Despite these valuable applications, the creation of annotated datasets remains a significant bottleneck. It is not only costly and labor-intensive but also prone to inaccuracies and human biases. Moreover, it often requires specialized knowledge or careful handling of sensitive information. Among the tasks in computer vision, image segmentation particularly highlights these challenges, with its need for precise pixel-level annotations. This context underscores the need for unsupervised approaches in computer vision, which can leverage the large volumes of unlabeled images produced every day. This thesis introduces several novel methods for learning fully unsupervised object segmentation models using only collections of images. Unlike much prior work, our approaches are effective on complex real-world images and do not rely on any form of annotations, including pre-trained supervised networks, bounding boxes, or class labels. We identify and leverage intrinsic properties of objects – most notably, the cohesive movement of object parts – as powerful signals for driving unsupervised object segmentation. Utilizing innovative generative adversarial models, we employ this principle to either generate segmented objects or directly segment them in a manner that allows for realistic movement within scenes. Our work demonstrates how such generated data can train a segmentation model that effectively generalizes to real-world images.
Furthermore, we introduce a method that, in conjunction with recent advances in self-supervised learning, achieves state-of-the-art results in unsupervised object segmentation. Our methods rely on the effectiveness of Generative Adversarial Networks, which are known to be challenging to train and exhibit mode collapse. We propose a new, more principled GAN loss, whose gradients encourage the generator model to explore missing modes in its distribution, addressing these limitations and enhancing the robustness of generative models.

Novel Techniques for Robust and Generalizable Machine Learning

Abdelhak lemkhenter · sept. 2023.

Neural networks have transcended their status of powerful proof-of-concept machine learning into the realm of a highly disruptive technology that has revolutionized many quantitative fields such as drug discovery, autonomous vehicles, and machine translation. Today, it is nearly impossible to go a single day without interacting with a neural network-powered application. From search engines to on-device photo-processing, neural networks have become the go-to solution thanks to recent advances in computational hardware and an unprecedented scale of training data. Larger and less curated datasets, typically obtained through web crawling, have greatly propelled the capabilities of neural networks forward. However, this increase in scale amplifies certain challenges associated with training such models. Beyond toy or carefully curated datasets, data in the wild is plagued with biases, imbalances, and various noisy components. Given the larger size of modern neural networks, such models run the risk of learning spurious correlations that fail to generalize beyond their training data. This thesis addresses the problem of training more robust and generalizable machine learning models across a wide range of learning paradigms for medical time series and computer vision tasks. The former is a typical example of a low signal-to-noise ratio data modality with a high degree of variability between subjects and datasets. There, we tailor the training scheme to focus on robust patterns that generalize to new subjects and ignore the noisier and subject-specific patterns. To achieve this, we first introduce a physiologically inspired unsupervised training task and then extend it by explicitly optimizing for cross-dataset generalization using meta-learning. 
In the context of image classification, we address the challenge of training semi-supervised models under class imbalance by designing a novel label refinement strategy with higher local sensitivity to minority class samples while preserving the global data distribution. Lastly, we introduce a new Generative Adversarial Networks training loss. Such generative models could be applied to improve the training of subsequent models in the low data regime by augmenting the dataset using generated samples. Unfortunately, GAN training relies on a delicate balance between its components, making it prone to mode collapse. Our contribution consists of defining a more principled GAN loss whose gradients incentivize the generator model to seek out missing modes in its distribution. All in all, this thesis tackles the challenge of training more robust machine learning models that can generalize beyond their training data. This necessitates the development of methods specifically tailored to handle the diverse biases and spurious correlations inherent in the data. It is important to note that achieving greater generalizability in models goes beyond simply increasing the volume of data; it requires meticulous consideration of training objectives and model architecture. By tackling these challenges, this research contributes to advancing the field of machine learning and underscores the significance of thoughtful design in obtaining more resilient and versatile models.

Automated Sleep Scoring, Deep Learning and Physician Supervision

Luigi fiorillo · oct. 2022.

Sleep plays a crucial role in human well-being. Polysomnography is used in sleep medicine as a diagnostic tool to objectively analyze the quality of sleep. Sleep scoring is the procedure of extracting sleep cycle information from the whole-night electrophysiological signals. The scoring is done worldwide by sleep physicians according to the official American Academy of Sleep Medicine (AASM) scoring manual. In the last decades, a wide variety of deep learning based algorithms have been proposed to automate the sleep scoring task. In this thesis we study the reasons why these algorithms fail to be introduced in the daily clinical routine, with the perspective of bridging the existing gap between the automatic sleep scoring models and the sleep physicians. In this light, the primary step is the design of a simplified sleep scoring architecture that also provides an estimate of the model uncertainty. Besides achieving results on par with the most up-to-date scoring systems, we demonstrate the efficiency of ensemble learning based algorithms, together with label smoothing techniques, in both enhancing the performance and calibrating the simplified scoring model. We introduce an uncertainty estimation procedure to identify the most challenging sleep stage predictions and to quantify the disagreement between the predictions given by the model and the annotations given by the physicians. In this thesis we also propose a novel method to integrate the inter-scorer variability into the training procedure of a sleep scoring model. We clearly show that a deep learning model is able to encode this variability, so as to better adapt to the consensus of a group of physician scorers. We finally address the generalization ability of a deep learning based sleep scoring system, further studying its resilience to the sleep complexity and to the AASM scoring rules. We can state that there is no need to train the algorithm strictly following the AASM guidelines.
Most importantly, using data from multiple data centers results in a better performing model compared with training on a single data cohort. The variability among different scorers and data centers needs to be taken into account, more than the variability among sleep disorders.
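For readers unfamiliar with label smoothing, the following minimal sketch shows the standard formulation, in which a one-hot target y over K classes becomes (1 - ε)·y + ε/K. The value ε = 0.1 and the helper name are assumptions for illustration; the thesis may use a different variant.

```python
def smooth_labels(one_hot, epsilon=0.1):
    """Standard label smoothing: (1 - epsilon) * y + epsilon / K."""
    num_classes = len(one_hot)
    return [(1.0 - epsilon) * y + epsilon / num_classes for y in one_hot]


# The five sleep stages of the AASM manual.
STAGES = ["W", "N1", "N2", "N3", "REM"]

# Hard (one-hot) target for an epoch scored as N2, then its smoothed version.
hard_target = [1.0 if stage == "N2" else 0.0 for stage in STAGES]
soft_target = smooth_labels(hard_target, epsilon=0.1)
print(dict(zip(STAGES, soft_target)))
```

Training against the softened target discourages over-confident predictions, which is one reason the technique is commonly associated with better calibration.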

Learning Representations for Controllable Image Restoration

Givi meishvili · march 2022.

Deep Convolutional Neural Networks have sparked a renaissance in all the sub-fields of computer vision. Tremendous progress has been made in the area of image restoration. The research community has pushed the boundaries of image deblurring, super-resolution, and denoising. However, given a distorted image, most existing methods typically produce a single restored output. The tasks mentioned above are inherently ill-posed, leading to an infinite number of plausible solutions. This thesis focuses on designing image restoration techniques capable of producing multiple restored results and granting users more control over the restoration process. Towards this goal, we demonstrate how one could leverage the power of unsupervised representation learning. Image restoration is vital when applied to distorted images of human faces due to their social significance. Generative Adversarial Networks enable an unprecedented level of generated facial details combined with smooth latent space. We leverage the power of GANs towards the goal of learning controllable neural face representations. We demonstrate how to learn an inverse mapping from image space to these latent representations, tuning these representations towards a specific task, and finally manipulating latent codes in these spaces. For example, we show how GANs and their inverse mappings enable the restoration and editing of faces in the context of extreme face super-resolution and the generation of novel view sharp videos from a single motion-blurred image of a face. This thesis also addresses more general blind super-resolution, denoising, and scratch removal problems, where blur kernels and noise levels are unknown. We resort to contrastive representation learning and first learn the latent space of degradations. We demonstrate that the learned representation allows inference of ground-truth degradation parameters and can guide the restoration process. 
Moreover, it enables control over the amount of deblurring and denoising in the restoration via manipulation of latent degradation features.

Learning Generalizable Visual Patterns Without Human Supervision

Simon jenni · oct. 2021.

Owing to the existence of large labeled datasets, Deep Convolutional Neural Networks have ushered in a renaissance in computer vision. However, almost all of the visual data we generate daily - several human lives' worth of it - remains unlabeled and thus out of reach of today’s dominant supervised learning paradigm. This thesis focuses on techniques that steer deep models towards learning generalizable visual patterns without human supervision. Our primary tool in this endeavor is the design of Self-Supervised Learning tasks, i.e., pretext-tasks for which labels do not involve human labor. Besides enabling the learning from large amounts of unlabeled data, we demonstrate how self-supervision can capture relevant patterns that supervised learning largely misses. For example, we design learning tasks that learn deep representations capturing shape from images, motion from video, and 3D pose features from multi-view data. Notably, these tasks’ design follows a common principle: The recognition of data transformations. The strong performance of the learned representations on downstream vision tasks such as classification, segmentation, action recognition, or pose estimation validates this pretext-task design. This thesis also explores the use of Generative Adversarial Networks (GANs) for unsupervised representation learning. Besides leveraging generative adversarial learning to define image transformation for self-supervised learning tasks, we also address training instabilities of GANs through the use of noise. While unsupervised techniques can significantly reduce the burden of supervision, in the end, we still rely on some annotated examples to fine-tune learned representations towards a target task. To improve the learning from scarce or noisy labels, we describe a supervised learning algorithm with improved generalization in these challenging settings.

Learning Interpretable Representations of Images

Attila szabó · june 2019.

Computers represent images with pixels and each pixel contains three numbers for red, green and blue colour values. These numbers are meaningless for humans and they are mostly useless when used directly with classical machine learning techniques like linear classifiers. Interpretable representations are the attributes that humans understand: the colour of the hair, viewpoint of a car or the 3D shape of the object in the scene. Many computer vision tasks can be viewed as learning interpretable representations, for example a supervised classification algorithm directly learns to represent images with their class labels. In this work we aim to learn interpretable representations (or features) indirectly with lower levels of supervision. This approach has the advantage of cost savings on dataset annotations and the flexibility of using the features for multiple follow-up tasks. We made contributions in three main areas: weakly supervised learning, unsupervised learning and 3D reconstruction. In the weakly supervised case we use image pairs as supervision. Each pair shares a common attribute and differs in a varying attribute. We propose a training method that learns to separate the attributes into separate feature vectors. These features then are used for attribute transfer and classification. We also show theoretical results on the ambiguities of the learning task and the ways to avoid degenerate solutions. We show a method for unsupervised representation learning, that separates semantically meaningful concepts. We explain and show ablation studies how the components of our proposed method work: a mixing autoencoder, a generative adversarial net and a classifier. We propose a method for learning single image 3D reconstruction. It is done using only the images, no human annotation, stereo, synthetic renderings or ground truth depth map is needed. We train a generative model that learns the 3D shape distribution and an encoder to reconstruct the 3D shape. 
For that we exploit the notion of image realism. It means that the 3D reconstruction of the object has to look realistic when it is rendered from different random angles. We prove the efficacy of our method from first principles.

Learning Controllable Representations for Image Synthesis

Qiyang hu · june 2019.

In this thesis, our focus is learning a controllable representation and applying the learned controllable feature representation to image synthesis, video generation, and even 3D reconstruction. We propose different methods to disentangle the feature representation in a neural network and analyze the challenges in disentanglement, such as the reference ambiguity and the shortcut problem that arise when using weak labels. We use the disentangled feature representation to transfer attributes between images, such as exchanging the hairstyle between two face images. Furthermore, we study how another type of feature, the sketch, works in a neural network. A sketch can provide the shape and contour of an object, such as the silhouette of a side-view face. We leverage the silhouette constraint to improve 3D face reconstruction from 2D images. A sketch can also provide the moving direction of an object; thus we investigate how one can manipulate the object to follow the trajectory provided by a user sketch. We propose a method to automatically generate video clips from a single image input, using the sketch as motion and trajectory guidance to animate the object in that image. We demonstrate the efficiency of our approaches on several synthetic and real datasets.

Beyond Supervised Representation Learning

Mehdi noroozi · jan. 2019.

The complexity of any information processing task is highly dependent on the space where data is represented. Unfortunately, pixel space is not appropriate for computer vision tasks such as object classification. Traditional computer vision approaches involve a multi-stage pipeline in which images are first transformed to a feature space through a handcrafted function, and the task is then solved in that feature space. The challenge with this approach is the complexity of designing handcrafted functions that extract robust features. Deep learning based approaches address this issue by training a neural network end-to-end on some tasks, which lets the network discover the appropriate representation for the training tasks automatically. It turns out that the image classification task on large-scale annotated datasets yields a representation transferable to other computer vision tasks. However, supervised representation learning is limited by the availability of annotations. In this thesis we study self-supervised representation learning, where the goal is to alleviate these limitations by substituting the classification task with pseudo tasks where the labels come for free. We discuss self-supervised learning by solving jigsaw puzzles, which uses context as a supervisory signal. The rationale behind this task is that the network needs to extract features about object parts and their spatial configurations to solve the jigsaw puzzles. We also discuss a method for representation learning that uses an artificial supervisory signal based on counting visual primitives. This supervisory signal is obtained from an equivariance relation. We use two image transformations in the context of counting: scaling and tiling. The first transformation exploits the fact that the number of visual primitives should be invariant to scale. The second transformation allows us to equate the total number of visual primitives in each tile to that in the whole image.
The most effective transfer strategy is fine-tuning, which restricts one to use the same model or parts thereof for both pretext and target tasks. We discuss a novel framework for self-supervised learning that overcomes limitations in designing and comparing different tasks, models, and data domains. In particular, our framework decouples the structure of the self-supervised model from the final task-specific fine-tuned model. Finally, we study the problem of multi-task representation learning. A naive approach to enhance the representation learned by a task is to train the task jointly with other tasks that capture orthogonal attributes. Having a diverse set of auxiliary tasks imposes challenges on multi-task training from scratch. We propose a framework that allows us to combine arbitrarily different feature spaces into a single deep neural network. We reduce the auxiliary tasks to classification tasks and, consequently, multi-task learning to a multi-label classification task. Nevertheless, combining multiple representation spaces without being aware of the target task might be suboptimal. As our second contribution, we show empirically that this is indeed the case and propose to combine multiple tasks after fine-tuning on the target task.
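
The jigsaw pretext task described in this abstract can be illustrated with a small sketch of how a training sample and its free label might be generated. This is not the thesis's implementation: the 3x3 grid, the tiny fixed permutation set, the toy 6x6 "image", and the function names `split_into_tiles` and `make_jigsaw_sample` are all assumptions made for illustration.

```python
import random


def split_into_tiles(image, grid=3):
    """Cut a square 2D list into grid x grid tiles, in row-major order."""
    size = len(image) // grid
    tiles = []
    for r in range(grid):
        for c in range(grid):
            tile = [row[c * size:(c + 1) * size]
                    for row in image[r * size:(r + 1) * size]]
            tiles.append(tile)
    return tiles


def make_jigsaw_sample(image, permutations, rng):
    """Return (shuffled_tiles, label); the label is the index of the
    permutation that was applied, so no human annotation is needed."""
    tiles = split_into_tiles(image)
    label = rng.randrange(len(permutations))
    order = permutations[label]
    shuffled = [tiles[i] for i in order]
    return shuffled, label


# Hypothetical toy setup: a 6x6 "image" and a small fixed permutation set.
image = [[6 * r + c for c in range(6)] for r in range(6)]
rng = random.Random(0)
identity = list(range(9))
permutations = [identity] + [rng.sample(identity, 9) for _ in range(3)]

shuffled, label = make_jigsaw_sample(image, permutations, rng)
```

A network trained on such samples would receive `shuffled` as input and be asked to predict `label`, which is only possible if it learns something about how the parts fit together spatially.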

Motion Deblurring from a Single Image

Meiguang Jin · Dec. 2018.

With the information explosion, a tremendous number of photos are captured and shared via social media every day. Technically, a photo requires a finite exposure to accumulate light from the scene. Thus, objects moving during the exposure generate motion blur in a photo. Motion blur is an image degradation that makes visual content less interpretable and is therefore often seen as a nuisance. Although motion blur can be reduced by setting a short exposure time, the resulting lack of light has to be compensated for by increasing the sensor’s sensitivity, which inevitably introduces a large amount of sensor noise. This motivates removing motion blur computationally. Motion deblurring is an important problem in computer vision, and it is challenging due to its ill-posed nature, which means the solution is not well defined. Mathematically, a blurry image caused by uniform motion is formed by the convolution of a blur kernel with a latent sharp image. There are potentially infinitely many pairs of blur kernel and latent sharp image that result in the same blurry image. Hence, some prior knowledge or regularization is required to address this problem. Even if the blur kernel is known, restoring the latent sharp image is still difficult because the high-frequency information has been removed. Although the uniform motion deblurring problem can be modeled mathematically, this model only covers in-plane translational camera motion. In practice, motion is more complicated and can be non-uniform. Non-uniform motion blur can come from many sources: camera out-of-plane rotation, scene depth change, object motion, and so on. Thus, it is more challenging to remove non-uniform motion blur. In this thesis, our focus is motion blur removal. We aim to address four challenging motion deblurring problems. We start from the noise-blind image deblurring scenario, where the blur kernel is known but the noise level is unknown.
We introduce an efficient and robust solution based on a Bayesian framework using a smooth generalization of the 0−1 loss to address this problem. Then we study the blind uniform motion deblurring scenario, where both the blur kernel and the latent sharp image are unknown. We explore the relative scale ambiguity between the latent sharp image and the blur kernel to address this issue. Moreover, we study the face deblurring problem and introduce a novel deep learning network architecture to solve it. We also address the general motion deblurring problem; in particular, we aim to recover a sequence of 7 frames, each depicting some instantaneous motion of the objects in the scene.
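
The uniform blur model and the scale ambiguity mentioned in this abstract can be made concrete with a short sketch. It assumes a 1D signal in place of an image, noise-free data, and hand-picked toy values; the helper `convolve_valid` is a hypothetical name, not something taken from the thesis.

```python
def convolve_valid(signal, kernel):
    """1D 'valid' convolution:
    blurry[i] = sum_j kernel[j] * signal[i + K - 1 - j], with K = len(kernel)."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + k - 1 - j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


# Toy latent sharp signal and blur kernel (illustrative values only).
sharp = [0.0, 0.0, 4.0, 0.0, 0.0, 8.0, 8.0, 0.0]
kernel = [0.25, 0.5, 0.25]

# Forward model: the blurry observation is kernel convolved with sharp.
blurry = convolve_valid(sharp, kernel)

# Scale ambiguity: doubling the kernel and halving the signal produces
# exactly the same observation, so the data alone cannot fix the scale.
scaled_kernel = [2.0 * w for w in kernel]
scaled_sharp = [0.5 * v for v in sharp]
same_blurry = convolve_valid(scaled_sharp, scaled_kernel)
```

Here `blurry` and `same_blurry` are identical, which is one small instance of why extra constraints or priors are needed before the pair of kernel and sharp image can be recovered.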

Towards a Novel Paradigm in Blind Deconvolution: From Natural to Cartooned Image Statistics

Daniele Perrone · July 2015.

In this thesis we study the blind deconvolution problem. Blind deconvolution consists in the estimation of a sharp image and a blur kernel from an observed blurry image. Because the blur model admits several solutions it is necessary to devise an image prior that favors the true blur kernel and sharp image. Recently it has been shown that a class of blind deconvolution formulations and image priors has the no-blur solution as global minimum. Despite this shortcoming, algorithms based on these formulations and priors can successfully solve blind deconvolution. In this thesis we show that a suitable initialization can exploit the non-convexity of the problem and yield the desired solution. Based on these conclusions, we propose a novel “vanilla” algorithm stripped of any enhancement typically used in the literature. Our algorithm, despite its simplicity, is able to compete with the top performers on several datasets. We have also investigated a remarkable behavior of a 1998 algorithm, whose formulation has the no-blur solution as global minimum: even when initialized at the no-blur solution, it converges to the correct solution. We show that this behavior is caused by an apparently insignificant implementation strategy that makes the algorithm no longer minimize the original cost functional. We also demonstrate that this strategy improves the results of our “vanilla” algorithm. Finally, we present a study of image priors for blind deconvolution. We provide experimental evidence supporting the recent belief that a good image prior is one that leads to a good blur estimate rather than being a good natural image statistical model. By focusing the attention on the blur estimation alone, we show that good blur estimates can be obtained even when using images quite different from the true sharp image. This allows using image priors, such as those leading to “cartooned” images, that avoid the no-blur solution. 
By using an image prior that produces “cartooned” images, we achieve state-of-the-art results on different publicly available datasets. We therefore suggest a paradigm shift in blind deconvolution: from modeling natural image statistics to modeling cartooned image statistics.
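
The "no-blur solution" mentioned in this abstract can be illustrated on a toy 1D example. The sketch assumes total variation as the image prior, which the abstract itself does not name, and uses hand-picked values; `convolve_valid` and `total_variation` are hypothetical helper names.

```python
def convolve_valid(signal, kernel):
    """1D 'valid' convolution of signal with kernel."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + k - 1 - j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def total_variation(signal):
    """Sum of absolute differences between neighbouring samples
    (an assumed example prior, not necessarily the one used in the thesis)."""
    return sum(abs(b - a) for a, b in zip(signal, signal[1:]))


sharp = [0.0, 0.0, 0.0, 8.0, 0.0, 0.0, 0.0]
true_kernel = [0.5, 0.5]
blurry = convolve_valid(sharp, true_kernel)

# Candidate A: the true pair (sharp, true_kernel) reproduces the data.
# Candidate B: the no-blur pair (blurry, delta kernel) reproduces it too.
delta_kernel = [1.0]
no_blur_fit = convolve_valid(blurry, delta_kernel)

# Both fit the observation exactly, so the prior decides between them.
tv_true = total_variation(sharp)      # prior cost of the true sharp signal
tv_no_blur = total_variation(blurry)  # prior cost of the no-blur candidate
```

In this toy case the no-blur candidate has the lower prior cost, so a formulation that only balances data fit against this prior would prefer it over the true solution.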

New Perspectives on Uncalibrated Photometric Stereo

Thoma Papadhimitri · June 2014.

This thesis investigates the problem of 3D reconstruction of a scene from 2D images. In particular, we focus on photometric stereo, a technique that computes the 3D geometry from at least three images taken from the same viewpoint under different illumination conditions. When the illumination is unknown (uncalibrated photometric stereo), the problem is ambiguous: different combinations of geometry and illumination can generate the same images. First, we solve the ambiguity by exploiting the Lambertian reflectance maxima. These are points defined on curved surfaces where the normals are parallel to the light direction. Then, we propose a solution that can be computed in closed form and thus very efficiently. Our algorithm is also very robust and always yields the same estimate regardless of the initial ambiguity. We validate our method on real-world experiments and achieve state-of-the-art results. In this thesis we also solve, for the first time, the uncalibrated photometric stereo problem under the perspective projection model. We show that, unlike in the orthographic case, one can uniquely reconstruct the normals of the object and the lights given only the input images and the camera calibration (focal length and image center). We also propose a very efficient algorithm, which we validate on synthetic and real-world experiments, and show that the proposed technique is a generalization of the orthographic case. Finally, we investigate the uncalibrated photometric stereo problem in the case where the lights are distributed near the scene. In this case we propose an alternating minimization technique that converges quickly and overcomes the limitations of prior work that assumes distant illumination. We show experimentally that adopting a near-light model for real-world scenes yields very accurate reconstructions.
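
As background for this abstract, the simpler calibrated case (known light directions) can be sketched for a single pixel. This is a textbook-style illustration, not the thesis's uncalibrated method; it assumes a Lambertian surface, distant lights, no shadows, and made-up light directions, and the function names are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(matrix, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(matrix)
    solution = []
    for col in range(3):
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution


def normal_from_intensities(lights, intensities):
    """Lambertian model: intensity_k = albedo * dot(light_k, normal).
    Three known lights give a 3x3 system for g = albedo * normal."""
    g = solve3(lights, intensities)
    albedo = math.sqrt(sum(v * v for v in g))
    return albedo, [v / albedo for v in g]


# Assumed toy scene: one pixel, three known unit light directions.
lights = [[0.0, 0.0, 1.0], [0.6, 0.0, 0.8], [0.0, 0.6, 0.8]]
true_albedo, true_normal = 0.5, [0.0, 0.6, 0.8]
intensities = [
    true_albedo * sum(l * n for l, n in zip(light, true_normal))
    for light in lights
]

albedo, normal = normal_from_intensities(lights, intensities)
```

When the lights are unknown, the same intensities can be explained by other combinations of lights and scaled normals, which is the ambiguity the thesis sets out to resolve.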

Computer Science Thesis Topics

This page provides a comprehensive list of computer science thesis topics, carefully curated to support students in identifying and selecting innovative and relevant areas for their academic research. Whether you are at the beginning of your research journey or are seeking a specific area to explore further, this guide aims to serve as an essential resource. With an expansive array of topics spread across various sub-disciplines of computer science, this list is designed to meet a diverse range of interests and academic needs. From the complexities of artificial intelligence to the intricate designs of web development, each category is equipped with 40 specific topics, offering a breadth of possibilities to inspire your next big thesis project. Explore our guide to find not only a topic that resonates with your academic ambitions but also one that has the potential to contribute significantly to the field of computer science.

1000 Computer Science Thesis Topics and Ideas

Browse computer science thesis topics by category: Artificial Intelligence, Augmented Reality, Big Data Analytics, Bioinformatics, Blockchain Technology, Cloud Computing, Computer Engineering, Computer Vision, Cybersecurity, Data Science, Digital Transformation, Distributed Systems and Networks, Geographic Information Systems (GIS), Human-Computer Interaction (HCI), Image Processing, Information Systems, Information Technology, Internet of Things (IoT), Machine Learning, Neural Networks, Programming, Quantum Computing, Robotics, Software Engineering, and Web Development.

Artificial Intelligence Thesis Topics

Ethical Implications of AI in Decision-Making Processes
The Role of AI in Personalized Medicine: Opportunities and Challenges
Advances in AI-Driven Predictive Analytics in Retail
AI in Autonomous Vehicles: Safety, Regulation, and Technology Integration
Natural Language Processing: Improving Human-Machine Interaction
The Future of AI in Cybersecurity: Threats and Defenses
Machine Learning Algorithms for Real-Time Data Processing
AI and the Internet of Things: Transforming Smart Home Technology
The Impact of Deep Learning on Image Recognition Technologies
Reinforcement Learning: Applications in Robotics and Automation
AI in Finance: Algorithmic Trading and Risk Assessment
Bias and Fairness in AI: Addressing Socio-Technical Challenges
The Evolution of AI in Education: Customized Learning Experiences
AI for Environmental Conservation: Tracking and Predictive Analysis
The Role of Artificial Neural Networks in Weather Forecasting
AI in Agriculture: Predictive Analytics for Crop and Soil Management
Emotional Recognition AI: Implications for Mental Health Assessments
AI in Space Exploration: Autonomous Rovers and Mission Planning
Enhancing User Experience with AI in Video Games
AI-Powered Virtual Assistants: Trends, Effectiveness, and User Trust
The Integration of AI in Traditional Industries: Case Studies
Generative AI Models in Art and Creativity
AI in LegalTech: Document Analysis and Litigation Prediction
Healthcare Diagnostics: AI Applications in Radiology and Pathology
AI and Blockchain: Enhancing Security in Decentralized Systems
Ethics of AI in Surveillance: Privacy vs. Security
AI in E-commerce: Personalization Engines and Customer Behavior Analysis
The Future of AI in Telecommunications: Network Optimization and Service Delivery
AI in Manufacturing: Predictive Maintenance and Quality Control
Challenges of AI in Elderly Care: Ethical Considerations and Technological Solutions
The Role of AI in Public Safety and Emergency Response
AI for Content Creation: Impact on Media and Journalism
AI-Driven Algorithms for Efficient Energy Management
The Role of AI in Cultural Heritage Preservation
AI and the Future of Public Transport: Optimization and Management
Enhancing Sports Performance with AI-Based Analytics
AI in Human Resources: Automating Recruitment and Employee Management
Real-Time Translation AI: Breaking Language Barriers
AI in Mental Health: Tools for Monitoring and Therapy Assistance
The Future of AI Governance: Regulation and Standardization

Augmented Reality Thesis Topics

AR in Medical Training and Surgery Simulation
The Impact of Augmented Reality in Retail: Enhancing Consumer Experience
Augmented Reality for Enhanced Navigation Systems
AR Applications in Maintenance and Repair in Industrial Settings
The Role of AR in Enhancing Online Education
Augmented Reality in Cultural Heritage: Interactive Visitor Experiences
Developing AR Tools for Improved Sports Coaching and Training
Privacy and Security Challenges in Augmented Reality Applications
The Future of AR in Advertising: Engagement and Measurement
User Interface Design for AR: Principles and Best Practices
AR in Automotive Industry: Enhancing Driving Experience and Safety
Augmented Reality for Emergency Response Training
AR and IoT: Converging Technologies for Smart Environments
Enhancing Physical Rehabilitation with AR Applications
The Role of AR in Enhancing Public Safety and Awareness
Augmented Reality in Fashion: Virtual Fitting and Personalized Shopping
AR for Environmental Education: Interactive and Immersive Learning
The Use of AR in Building and Architecture Planning
AR in the Entertainment Industry: Games and Live Events
Implementing AR in Museums and Art Galleries for Interactive Learning
Augmented Reality for Real Estate: Virtual Tours and Property Visualization
AR in Consumer Electronics: Integration in Smart Devices
The Development of AR Applications for Children’s Education
AR for Enhancing User Engagement in Social Media Platforms
The Application of AR in Field Service Management
Augmented Reality for Disaster Management and Risk Assessment
Challenges of Content Creation for Augmented Reality
Future Trends in AR Hardware: Wearables and Beyond
Legal and Ethical Considerations of Augmented Reality Technology
AR in Space Exploration: Tools for Simulation and Training
Interactive Shopping Experiences with AR: The Future of Retail
AR in Wildlife Conservation: Educational Tools and Awareness
The Impact of AR on the Publishing Industry: Interactive Books and Magazines
Augmented Reality and Its Role in Automotive Manufacturing
AR for Job Training: Bridging the Skill Gap in Various Industries
The Role of AR in Therapy: New Frontiers in Mental Health Treatment
The Future of Augmented Reality in Sports Broadcasting
AR as a Tool for Enhancing Public Art Installations
Augmented Reality in the Tourism Industry: Personalized Travel Experiences
The Use of AR in Security Training: Realistic and Safe Simulations

Big Data Analytics Thesis Topics

The Role of Big Data in Improving Healthcare Outcomes
Big Data and Its Impact on Consumer Behavior Analysis
Privacy Concerns in Big Data: Ethical and Legal Implications
The Application of Big Data in Predictive Maintenance for Manufacturing
Real-Time Big Data Processing: Tools and Techniques
Big Data in Financial Services: Fraud Detection and Risk Management
The Evolution of Big Data Technologies: From Hadoop to Spark
Big Data Visualization: Techniques for Effective Communication of Insights
The Integration of Big Data and Artificial Intelligence
Big Data in Smart Cities: Applications in Traffic Management and Energy Use
Enhancing Supply Chain Efficiency with Big Data Analytics
Big Data in Sports Analytics: Improving Team Performance and Fan Engagement
The Role of Big Data in Environmental Monitoring and Sustainability
Big Data and Social Media: Analyzing Sentiments and Trends
Scalability Challenges in Big Data Systems
The Future of Big Data in Retail: Personalization and Customer Experience
Big Data in Education: Customized Learning Paths and Student Performance Analysis
Privacy-Preserving Techniques in Big Data
Big Data in Public Health: Epidemiology and Disease Surveillance
The Impact of Big Data on Insurance: Tailored Policies and Pricing
Edge Computing in Big Data: Processing at the Source
Big Data and the Internet of Things: Generating Insights from IoT Data
Cloud-Based Big Data Analytics: Opportunities and Challenges
Big Data Governance: Policies, Standards, and Management
The Role of Big Data in Crisis Management and Response
Machine Learning with Big Data: Building Predictive Models
Big Data in Agriculture: Precision Farming and Yield Optimization
The Ethics of Big Data in Research: Consent and Anonymity
Cross-Domain Big Data Integration: Challenges and Solutions
Big Data and Cybersecurity: Threat Detection and Prevention Strategies
Real-Time Streaming Analytics in Big Data
Big Data in the Media Industry: Content Optimization and Viewer Insights
The Impact of GDPR on Big Data Practices
Quantum Computing and Big Data: Future Prospects
Big Data in E-Commerce: Optimizing Logistics and Inventory Management
Big Data Talent: Education and Skill Development for Data Scientists
The Role of Big Data in Political Campaigns and Voting Behavior Analysis
Big Data and Mental Health: Analyzing Patterns for Better Interventions
Big Data in Genomics and Personalized Medicine
The Future of Big Data in Autonomous Driving Technologies

Bioinformatics Thesis Topics

The Role of Bioinformatics in Personalized Medicine
Next-Generation Sequencing Data Analysis: Challenges and Opportunities
Bioinformatics and the Study of Genetic Diseases
Computational Models for Understanding Protein Structure and Function
Bioinformatics in Drug Discovery and Development
The Impact of Big Data on Bioinformatics: Data Management and Analysis
Machine Learning Applications in Bioinformatics
Bioinformatics Approaches for Cancer Genomics
The Development of Bioinformatics Tools for Metagenomics Analysis
Ethical Considerations in Bioinformatics: Data Sharing and Privacy
The Role of Bioinformatics in Agricultural Biotechnology
Bioinformatics and Viral Evolution: Tracking Pathogens and Outbreaks
The Integration of Bioinformatics and Systems Biology
Bioinformatics in Neuroscience: Mapping the Brain
The Future of Bioinformatics in Non-Invasive Prenatal Testing
Bioinformatics and the Human Microbiome: Health Implications
The Application of Artificial Intelligence in Bioinformatics
Structural Bioinformatics: Computational Techniques for Molecular Modeling
Comparative Genomics: Insights into Evolution and Function
Bioinformatics in Immunology: Vaccine Design and Immune Response Analysis
High-Performance Computing in Bioinformatics
The Challenge of Proteomics in Bioinformatics
RNA-Seq Data Analysis and Interpretation
Cloud Computing Solutions for Bioinformatics Data
Computational Epigenetics: DNA Methylation and Histone Modification Analysis
Bioinformatics in Ecology: Biodiversity and Conservation Genetics
The Role of Bioinformatics in Forensic Analysis
Mobile Apps and Tools for Bioinformatics Research
Bioinformatics and Public Health: Epidemiological Studies
The Use of Bioinformatics in Clinical Diagnostics
Genetic Algorithms in Bioinformatics
Bioinformatics for Aging Research: Understanding the Mechanisms of Aging
Data Visualization Techniques in Bioinformatics
Bioinformatics and the Development of Therapeutic Antibodies
The Role of Bioinformatics in Stem Cell Research
Bioinformatics and Cardiovascular Diseases: Genomic Insights
The Impact of Machine Learning on Functional Genomics in Bioinformatics
Bioinformatics in Dental Research: Genetic Links to Oral Diseases
The Future of CRISPR Technology and Bioinformatics
Bioinformatics and Nutrition: Genomic Insights into Diet and Health

Blockchain Technology Thesis Topics

Blockchain for Enhancing Cybersecurity in Various Industries
The Impact of Blockchain on Supply Chain Transparency
Blockchain in Healthcare: Patient Data Management and Security
The Application of Blockchain in Voting Systems
Blockchain and Smart Contracts: Legal Implications and Applications
Cryptocurrencies: Market Trends and the Future of Digital Finance
Blockchain in Real Estate: Improving Property and Land Registration
The Role of Blockchain in Managing Digital Identities
Blockchain for Intellectual Property Management
Energy Sector Innovations: Blockchain for Renewable Energy Distribution
Blockchain and the Future of Public Sector Operations
The Impact of Blockchain on Cross-Border Payments
Blockchain for Non-Fungible Tokens (NFTs): Applications in Art and Media
Privacy Issues in Blockchain Applications
Blockchain in the Automotive Industry: Supply Chain and Beyond
Decentralized Finance (DeFi): Opportunities and Challenges
The Role of Blockchain in Combating Counterfeiting and Fraud
Blockchain for Sustainable Environmental Practices
The Integration of Artificial Intelligence with Blockchain
Blockchain Education: Curriculum Development and Training Needs
Blockchain in the Music Industry: Rights Management and Revenue Distribution
The Challenges of Blockchain Scalability and Performance Optimization
The Future of Blockchain in the Telecommunications Industry
Blockchain and Consumer Data Privacy: A New Paradigm
Blockchain for Disaster Recovery and Business Continuity
Blockchain in the Charity and Non-Profit Sectors
Quantum Resistance in Blockchain: Preparing for the Quantum Era
Blockchain and Its Impact on Traditional Banking and Financial Institutions
Legal and Regulatory Challenges Facing Blockchain Technology
Blockchain for Improved Logistics and Freight Management
The Role of Blockchain in the Evolution of the Internet of Things (IoT)
Blockchain and the Future of Gaming: Transparency and Fair Play
Blockchain for Academic Credentials Verification
The Application of Blockchain in the Insurance Industry
Blockchain and the Future of Content Creation and Distribution
Blockchain for Enhancing Data Integrity in Scientific Research
The Impact of Blockchain on Human Resources: Employee Verification and Salary Payments
Blockchain and the Future of Retail: Customer Loyalty Programs and Inventory Management
Blockchain and Industrial Automation: Trust and Efficiency
Blockchain for Digital Marketing: Transparency and Consumer Engagement

Cloud Computing Thesis Topics

Multi-Cloud Strategies: Optimization and Security Challenges
Advances in Cloud Computing Architectures for Scalable Applications
Edge Computing: Extending the Reach of Cloud Services
Cloud Security: Novel Approaches to Data Encryption and Threat Mitigation
The Impact of Serverless Computing on Software Development Lifecycle
Cloud Computing and Sustainability: Energy-Efficient Data Centers
Cloud Service Models: Comparative Analysis of IaaS, PaaS, and SaaS
Cloud Migration Strategies: Best Practices and Common Pitfalls
The Role of Cloud Computing in Big Data Analytics
Implementing AI and Machine Learning Workloads on Cloud Platforms
Hybrid Cloud Environments: Management Tools and Techniques
Cloud Computing in Healthcare: Compliance, Security, and Use Cases
Cost-Effective Cloud Solutions for Small and Medium Enterprises (SMEs)
The Evolution of Cloud Storage Solutions: Trends and Technologies
Cloud-Based Disaster Recovery Solutions: Design and Reliability
Blockchain in Cloud Services: Enhancing Transparency and Trust
Cloud Networking: Managing Connectivity and Traffic in Cloud Environments
Cloud Governance: Managing Compliance and Operational Risks
The Future of Cloud Computing: Quantum Computing Integration
Performance Benchmarking of Cloud Services Across Different Providers
Privacy Preservation in Cloud Environments
Cloud Computing in Education: Virtual Classrooms and Learning Management Systems
Automation in Cloud Deployments: Tools and Strategies
Cloud Auditing and Monitoring Techniques
Mobile Cloud Computing: Challenges and Future Trends
The Role of Cloud Computing in Digital Media Production and Distribution
Security Risks in Multi-Tenancy Cloud Environments
Cloud Computing for Scientific Research: Enabling Complex Simulations
The Impact of 5G on Cloud Computing Services
Federated Clouds: Building Collaborative Cloud Environments
Managing Software Dependencies in Cloud Applications
The Economics of Cloud Computing: Cost Models and Pricing Strategies
Cloud Computing in Government: Security Protocols and Citizen Services
Cloud Access Security Brokers (CASBs): Security Enforcement Points
DevOps in the Cloud: Strategies for Continuous Integration and Deployment
Predictive Analytics in Cloud Computing
The Role of Cloud Computing in IoT Deployment
Implementing Robust Cybersecurity Measures in Cloud Architecture
Cloud Computing in the Financial Sector: Handling Sensitive Data
Future Trends in Cloud Computing: The Role of AI in Cloud Optimization

Computer Engineering Thesis Topics

Advances in Microprocessor Design and Architecture
FPGA-Based Design: Innovations and Applications
The Role of Embedded Systems in Consumer Electronics
Quantum Computing: Hardware Development and Challenges
High-Performance Computing (HPC) and Parallel Processing
Design and Analysis of Computer Networks
Cyber-Physical Systems: Design, Analysis, and Security
The Impact of Nanotechnology on Computer Hardware
Wireless Sensor Networks: Design and Optimization
Cryptographic Hardware: Implementations and Security Evaluations
Machine Learning Techniques for Hardware Optimization
Hardware for Artificial Intelligence: GPUs vs. TPUs
Energy-Efficient Hardware Designs for Sustainable Computing
Security Aspects of Mobile and Ubiquitous Computing
Advanced Algorithms for Computer-Aided Design (CAD) of VLSI
Signal Processing in Communication Systems
The Development of Wearable Computing Devices
Computer Hardware Testing: Techniques and Tools
The Role of Hardware in Network Security
The Evolution of Interface Designs in Consumer Electronics
Biometric Systems: Hardware and Software Integration
The Integration of IoT Devices in Smart Environments
Electronic Design Automation (EDA) Tools and Methodologies
Robotics: Hardware Design and Control Systems
Hardware Accelerators for Deep Learning Applications
Developments in Non-Volatile Memory Technologies
The Future of Computer Hardware in the Era of Quantum Computing
Hardware Solutions for Data Storage and Retrieval
Power Management Techniques in Embedded Systems
Challenges in Designing Multi-Core Processors
System on Chip (SoC) Design Trends and Challenges
The Role of Computer Engineering in Aerospace Technology
Real-Time Systems: Design and Implementation Challenges
Hardware Support for Virtualization Technology
Advances in Computer Graphics Hardware
The Impact of 5G Technology on Mobile Computing Hardware
Environmental Impact Assessment of Computer Hardware Production
Security Vulnerabilities in Modern Microprocessors
Computer Hardware Innovations in the Automotive Industry
The Role of Computer Engineering in Medical Device Technology

Computer Vision Thesis Topics

Deep Learning Approaches to Object Recognition
Real-Time Image Processing for Autonomous Vehicles
Computer Vision in Robotic Surgery: Techniques and Challenges
Facial Recognition Technology: Innovations and Privacy Concerns
Machine Vision in Industrial Automation and Quality Control
3D Reconstruction Techniques in Computer Vision
Enhancing Sports Analytics with Computer Vision
Augmented Reality: Integrating Computer Vision for Immersive Experiences
Computer Vision for Environmental Monitoring
Thermal Imaging and Its Applications in Computer Vision
Computer Vision in Retail: Customer Behavior and Store Layout Optimization
Motion Detection and Tracking in Security Systems
The Role of Computer Vision in Content Moderation on Social Media
Gesture Recognition: Methods and Applications
Computer Vision in Agriculture: Pest Detection and Crop Analysis
Advances in Medical Imaging: Machine Learning and Computer Vision
Scene Understanding and Contextual Inference in Images
The Development of Vision-Based Autonomous Drones
Optical Character Recognition (OCR): Latest Techniques and Applications
The Impact of Computer Vision on Virtual Reality Experiences
Biometrics: Enhancing Security Systems with Computer Vision
Computer Vision for Wildlife Conservation: Species Recognition and Behavior Analysis
Underwater Image Processing: Challenges and Techniques
Video Surveillance: The Evolution of Algorithmic Approaches
Advanced Driver-Assistance Systems (ADAS): Leveraging Computer Vision
Computational Photography: Enhancing Image Capture Techniques
The Integration of AI in Computer Vision: Ethical and Technical Considerations
Computer Vision in the Gaming Industry: From Design to Interaction
The Future of Computer Vision in Smart Cities
Pattern Recognition in Historical Document Analysis
The Role of Computer Vision in the Manufacturing of Customized Products
Enhancing Accessibility with Computer Vision: Tools for the Visually Impaired
The Use of Computer Vision in Behavioral Research
Predictive Analytics with Computer Vision in Sports
Image Synthesis with Generative Adversarial Networks (GANs)
The Use of Computer Vision in Remote Sensing
Real-Time Video Analytics for Public Safety
The Role of Computer Vision in Telemedicine
Computer Vision and the Internet of Things (IoT): A Synergistic Approach
Future Trends in Computer Vision: Quantum Computing and Beyond

Cybersecurity Thesis Topics

Advances in Cryptography: Post-Quantum Cryptosystems
Artificial Intelligence in Cybersecurity: Threat Detection and Response
Blockchain for Enhanced Security in Distributed Networks
The Impact of IoT on Cybersecurity: Vulnerabilities and Solutions
Cybersecurity in Cloud Computing: Best Practices and Tools
Ethical Hacking: Techniques and Ethical Implications
The Role of Human Factors in Cybersecurity Breaches
Privacy-preserving Technologies in an Age of Surveillance
The Evolution of Ransomware Attacks and Defense Strategies
Secure Software Development: Integrating Security in DevOps (DevSecOps)
Cybersecurity in Critical Infrastructure: Challenges and Innovations
The Future of Biometric Security Systems
Cyber Warfare: State-sponsored Attacks and Defense Mechanisms
The Role of Cybersecurity in Protecting Digital Identities
Social Engineering Attacks: Prevention and Countermeasures
Mobile Security: Protecting Against Malware and Exploits
Wireless Network Security: Protocols and Practices
Data Breaches: Analysis, Consequences, and Mitigation
The Ethics of Cybersecurity: Balancing Privacy and Security
Regulatory Compliance and Cybersecurity: GDPR and Beyond
The Impact of 5G Technology on Cybersecurity
The Role of Machine Learning in Cyber Threat Intelligence
Cybersecurity in Automotive Systems: Challenges in a Connected Environment
The Use of Virtual Reality for Cybersecurity Training and Simulation
Advanced Persistent Threats (APT): Detection and Response
Cybersecurity for Smart Cities: Challenges and Solutions
Deep Learning Applications in Malware Detection
The Role of Cybersecurity in Healthcare: Protecting Patient Data
Supply Chain Cybersecurity: Identifying Risks and Solutions
Endpoint Security: Trends, Challenges, and Future Directions
Forensic Techniques in Cybersecurity: Tracking and Analyzing Cyber Crimes
The Influence of International Law on Cyber Operations
Protecting Financial Institutions from Cyber Frauds and Attacks
Quantum Computing and Its Implications for Cybersecurity
Cybersecurity and Remote Work: Emerging Threats and Strategies
IoT Security in Industrial Applications
Cyber Insurance: Risk Assessment and Management
Security Challenges in Edge Computing Environments
Anomaly Detection in Network Security Using AI Techniques
Securing the Software Supply Chain in Application Development

Data Science Thesis Topics

Big Data Analytics: Techniques and Applications in Real-time
Machine Learning Algorithms for Predictive Analytics
Data Science in Healthcare: Improving Patient Outcomes with Predictive Models
The Role of Data Science in Financial Market Predictions
Natural Language Processing: Emerging Trends and Applications
Data Visualization Tools and Techniques for Enhanced Business Intelligence
Ethics in Data Science: Privacy, Fairness, and Transparency
The Use of Data Science in Environmental Science for Sustainability Studies
The Impact of Data Science on Social Media Marketing Strategies
Data Mining Techniques for Detecting Patterns in Large Datasets
AI and Data Science: Synergies and Future Prospects
Reinforcement Learning: Applications and Challenges in Data Science
The Role of Data Science in E-commerce Personalization
Predictive Maintenance in Manufacturing Through Data Science
The Evolution of Recommendation Systems in Streaming Services
Real-Time Data Processing with Stream Analytics
Deep Learning for Image and Video Analysis
Data Governance in Big Data Analytics
Text Analytics and Sentiment Analysis for Customer Feedback
Fraud Detection in Banking and Insurance Using Data Science
The Integration of IoT Data in Data Science Models
The Future of Data Science in Quantum Computing
Data Science for Public Health: Epidemic Outbreak Prediction
Sports Analytics: Performance Improvement and Injury Prevention
Data Science in Retail: Inventory Management and Customer Journey Analysis
Data Science in Smart Cities: Traffic and Urban Planning
The Use of Blockchain in Data Security and Integrity
Geospatial Analysis for Environmental Monitoring
Time Series Analysis in Economic Forecasting
Data Science in Education: Analyzing Trends and Student Performance
Predictive Policing: Data Science in Law Enforcement
Data Science in Agriculture: Yield Prediction and Soil Health
Computational Social Science: Analyzing Societal Trends
Data Science in the Energy Sector: Consumption and Optimization
Personalization Technologies in Healthcare Through Data Science
The Role of Data Science in Content Creation and Media
Anomaly Detection in Network Security Using Data Science Techniques
The Future of Autonomous Vehicles: Data Science-Driven Innovations
Multimodal Data Fusion Techniques in Data Science
Scalability Challenges in Data Science Projects

Digital Transformation Thesis Topics

The Role of Digital Transformation in Business Model Innovation
The Impact of Digital Technologies on Customer Experience
Digital Transformation in the Banking Sector: Trends and Challenges
The Use of AI and Robotics in Digital Transformation of Manufacturing
Digital Transformation in Healthcare: Telemedicine and Beyond
The Influence of Big Data on Decision-Making Processes in Corporations
Blockchain as a Driver for Transparency in Digital Transformation
The Role of IoT in Enhancing Operational Efficiency in Industries
Digital Marketing Strategies: SEO, Content, and Social Media
The Integration of Cyber-Physical Systems in Industrial Automation
Digital Transformation in Education: Virtual Learning Environments
Smart Cities: The Role of Digital Technologies in Urban Planning
Digital Transformation in the Retail Sector: E-commerce Evolution
The Future of Work: Impact of Digital Transformation on Workplaces
Cybersecurity Challenges in a Digitally Transformed World
Mobile Technologies and Their Impact on Digital Transformation
The Role of Digital Twin Technology in Industry 4.0
Digital Transformation in the Public Sector: E-Government Services
Data Privacy and Security in the Age of Digital Transformation
Digital Transformation in the Energy Sector: Smart Grids and Renewable Energy
The Use of Augmented Reality in Training and Development
The Role of Virtual Reality in Real Estate and Architecture
Digital Transformation and Sustainability: Reducing Environmental Footprint
The Role of Digital Transformation in Supply Chain Optimization
Digital Transformation in Agriculture: IoT and Smart Farming
The Impact of 5G on Digital Transformation Initiatives
The Influence of Digital Transformation on Media and Entertainment
Digital Transformation in Insurance: Telematics and Risk Assessment
The Role of AI in Enhancing Customer Service Operations
The Future of Digital Transformation: Trends and Predictions
Digital Transformation and Corporate Governance
The Role of Leadership in Driving Digital Transformation
Digital Transformation in Non-Profit Organizations: Challenges and Benefits
The Economic Implications of Digital Transformation
The Cultural Impact of Digital Transformation on Organizations
Digital Transformation in Transportation: Logistics and Fleet Management
User Experience (UX) Design in Digital Transformation
The Role of Digital Transformation in Crisis Management
Digital Transformation and Human Resource Management
Implementing Change Management in Digital Transformation Projects

Distributed Systems and Networks Thesis Topics

Scalability Challenges in Distributed Systems: Solutions and Strategies
Blockchain Technology: Enhancing Security and Transparency in Distributed Networks
The Role of Edge Computing in Distributed Systems
Designing Fault-Tolerant Systems in Distributed Networks
The Impact of 5G Technology on Distributed Network Architectures
Machine Learning Algorithms for Network Traffic Analysis
Load Balancing Techniques in Distributed Computing
The Use of Distributed Ledger Technology Beyond Cryptocurrencies
Network Function Virtualization (NFV) and Its Impact on Service Providers
The Evolution of Software-Defined Networking (SDN) in Enterprise Environments
Implementing Robust Cybersecurity Measures in Distributed Systems
Quantum Computing: Implications for Network Security in Distributed Systems
Peer-to-Peer Network Protocols and Their Applications
The Internet of Things (IoT): Network Challenges and Communication Protocols
Real-Time Data Processing in Distributed Sensor Networks
The Role of Artificial Intelligence in Optimizing Network Operations
Privacy and Data Protection Strategies in Distributed Systems
The Future of Distributed Computing in Cloud Environments
Energy Efficiency in Distributed Network Systems
Wireless Mesh Networks: Design, Challenges, and Applications
Multi-Access Edge Computing (MEC): Use Cases and Deployment Challenges
Consensus Algorithms in Distributed Systems: From Blockchain to New Applications
The Use of Containers and Microservices in Building Scalable Applications
Network Slicing for 5G: Opportunities and Challenges
The Role of Distributed Systems in Big Data Analytics
Managing Data Consistency in Distributed Databases
The Impact of Distributed Systems on Digital Transformation Strategies
Augmented Reality over Distributed Networks: Performance and Scalability Issues
The Application of Distributed Systems in Smart Grid Technology
Developing Distributed Applications Using Serverless Architectures
The Challenges of Implementing IPv6 in Distributed Networks
Distributed Systems for Disaster Recovery: Design and Implementation
The Use of Virtual Reality in Distributed Network Environments
Security Protocols for Ad Hoc Networks in Emergency Situations
The Role of Distributed Networks in Enhancing Mobile Broadband Services
Next-Generation Protocols for Enhanced Network Reliability and Performance
The Application of Blockchain in Securing Distributed IoT Networks
Dynamic Resource Allocation Strategies in Distributed Systems
The Integration of Distributed Systems with Existing IT Infrastructure
The Future of Autonomous Systems in Distributed Networking

Geographic Information Systems (GIS) Thesis Topics

The Integration of GIS with Remote Sensing for Environmental Monitoring
GIS in Urban Planning: Techniques for Sustainable Development
The Role of GIS in Disaster Management and Response Strategies
Real-Time GIS Applications in Traffic Management and Route Planning
The Use of GIS in Water Resource Management
GIS and Public Health: Tracking Epidemics and Healthcare Access
Advances in 3D GIS: Technologies and Applications
GIS in Agricultural Management: Precision Farming Techniques
The Impact of GIS on Biodiversity Conservation Efforts
Spatial Data Analysis for Crime Pattern Detection and Prevention
GIS in Renewable Energy: Site Selection and Resource Management
The Role of GIS in Historical Research and Archaeology
GIS and Machine Learning: Integrating Spatial Analysis with Predictive Models
Cloud Computing and GIS: Enhancing Accessibility and Data Processing
The Application of GIS in Managing Public Transportation Systems
GIS in Real Estate: Market Analysis and Property Valuation
The Use of GIS for Environmental Impact Assessments
Mobile GIS Applications: Development and Usage Trends
GIS and Its Role in Smart City Initiatives
Privacy Issues in the Use of Geographic Information Systems
GIS in Forest Management: Monitoring and Conservation Strategies
The Impact of GIS on Tourism: Enhancing Visitor Experiences through Technology
GIS in the Insurance Industry: Risk Assessment and Policy Design
The Development of Participatory GIS (PGIS) for Community Engagement
GIS in Coastal Management: Addressing Erosion and Flood Risks
Geospatial Analytics in Retail: Optimizing Location and Consumer Insights
GIS for Wildlife Tracking and Habitat Analysis
The Use of GIS in Climate Change Studies
GIS and Social Media: Analyzing Spatial Trends from User Data
The Future of GIS: Augmented Reality and Virtual Reality Applications
GIS in Education: Tools for Teaching Geographic Concepts
The Role of GIS in Land Use Planning and Zoning
GIS for Emergency Medical Services: Optimizing Response Times
Open Source GIS Software: Development and Community Contributions
GIS and the Internet of Things (IoT): Converging Technologies for Advanced Monitoring
GIS for Mineral Exploration: Techniques and Applications
The Role of GIS in Municipal Management and Services
GIS and Drone Technology: A Synergy for Precision Mapping
Spatial Statistics in GIS: Techniques for Advanced Data Analysis
Future Trends in GIS: The Integration of AI for Smarter Solutions

Human-Computer Interaction (HCI) Thesis Topics

The Evolution of User Interface (UI) Design: From Desktop to Mobile and Beyond
The Role of HCI in Enhancing Accessibility for Disabled Users
Virtual Reality (VR) and Augmented Reality (AR) in HCI: New Dimensions of Interaction
The Impact of HCI on User Experience (UX) in Software Applications
Cognitive Aspects of HCI: Understanding User Perception and Behavior
HCI and the Internet of Things (IoT): Designing Interactive Smart Devices
The Use of Biometrics in HCI: Security and Usability Concerns
HCI in Educational Technologies: Enhancing Learning through Interaction
Emotional Recognition and Its Application in HCI
The Role of HCI in Wearable Technology: Design and Functionality
Advanced Techniques in Voice User Interfaces (VUIs)
The Impact of HCI on Social Media Interaction Patterns
HCI in Healthcare: Designing User-Friendly Medical Devices and Software
HCI and Gaming: Enhancing Player Engagement and Experience
The Use of HCI in Robotic Systems: Improving Human-Robot Interaction
The Influence of HCI on E-commerce: Optimizing User Journeys and Conversions
HCI in Smart Homes: Interaction Design for Automated Environments
Multimodal Interaction: Integrating Touch, Voice, and Gesture in HCI
HCI and Aging: Designing Technology for Older Adults
The Role of HCI in Virtual Teams: Tools and Strategies for Collaboration
User-Centered Design: HCI Strategies for Developing User-Focused Software
HCI Research Methodologies: Experimental Design and User Studies
The Application of HCI Principles in the Design of Public Kiosks
The Future of HCI: Integrating Artificial Intelligence for Smarter Interfaces
HCI in Transportation: Designing User Interfaces for Autonomous Vehicles
Privacy and Ethics in HCI: Addressing User Data Security
HCI and Environmental Sustainability: Promoting Eco-Friendly Behaviors
Adaptive Interfaces: HCI Design for Personalized User Experiences
The Role of HCI in Content Creation: Tools for Artists and Designers
HCI for Crisis Management: Designing Systems for Emergency Use
The Use of HCI in Sports Technology: Enhancing Training and Performance
The Evolution of Haptic Feedback in HCI
HCI and Cultural Differences: Designing for Global User Bases
The Impact of HCI on Digital Marketing: Creating Engaging User Interactions
HCI in Financial Services: Improving User Interfaces for Banking Apps
The Role of HCI in Enhancing User Trust in Technology
HCI for Public Safety: User Interfaces for Security Systems
The Application of HCI in the Film and Television Industry
HCI and the Future of Work: Designing Interfaces for Remote Collaboration
Innovations in HCI: Exploring New Interaction Technologies and Their Applications

Image Processing Thesis Topics

Deep Learning Techniques for Advanced Image Segmentation
Real-Time Image Processing for Autonomous Driving Systems
Image Enhancement Algorithms for Underwater Imaging
Super-Resolution Imaging: Techniques and Applications
The Role of Image Processing in Remote Sensing and Satellite Imagery Analysis
Machine Learning Models for Medical Image Diagnosis
The Impact of AI on Photographic Restoration and Enhancement
Image Processing in Security Systems: Facial Recognition and Motion Detection
Advanced Algorithms for Image Noise Reduction
3D Image Reconstruction Techniques in Tomography
Image Processing for Agricultural Monitoring: Crop Disease Detection and Yield Prediction
Techniques for Panoramic Image Stitching
Video Image Processing: Real-Time Streaming and Data Compression
The Application of Image Processing in Printing Technology
Color Image Processing: Theory and Practical Applications
The Use of Image Processing in Biometric Identification
Computational Photography: Image Processing Techniques in Smartphone Cameras
Image Processing for Augmented Reality: Real-Time Object Overlay
The Development of Image Processing Algorithms for Traffic Control Systems
Pattern Recognition and Analysis in Forensic Imaging
Adaptive Filtering Techniques in Image Processing
Image Processing in Retail: Customer Tracking and Behavior Analysis
The Role of Image Processing in Cultural Heritage Preservation
Image Segmentation Techniques for Cancer Detection in Medical Imaging
High Dynamic Range (HDR) Imaging: Algorithms and Display Techniques
Image Classification with Deep Convolutional Neural Networks
The Evolution of Edge Detection Algorithms in Image Processing
Image Processing for Wildlife Monitoring: Species Recognition and Behavior Analysis
Application of Wavelet Transforms in Image Compression
Image Processing in Sports: Enhancing Broadcasts and Performance Analysis
Optical Character Recognition (OCR) Improvements in Document Scanning
Multi-Spectral Imaging for Environmental and Earth Studies
Image Processing for Space Exploration: Analysis of Planetary Images
Real-Time Image Processing for Event Surveillance
The Influence of Quantum Computing on Image Processing Speed and Security
Machine Vision in Manufacturing: Defect Detection and Quality Control
Image Processing in Neurology: Visualizing Brain Functions
Photogrammetry and Image Processing in Geology: 3D Terrain Mapping
Advanced Techniques in Image Watermarking for Copyright Protection
The Future of Image Processing: Integrating AI for Automated Editing

Information Systems Thesis Topics

The Evolution of Enterprise Resource Planning (ERP) Systems in the Digital Age
Information Systems for Managing Distributed Workforces
The Role of Information Systems in Enhancing Supply Chain Management
Cybersecurity Measures in Information Systems
The Impact of Big Data on Decision Support Systems
Blockchain Technology for Information System Security
The Development of Sustainable IT Infrastructure in Information Systems
The Use of AI in Information Systems for Business Intelligence
Information Systems in Healthcare: Improving Patient Care and Data Management
The Influence of IoT on Information Systems Architecture
Mobile Information Systems: Development and Usability Challenges
The Role of Geographic Information Systems (GIS) in Urban Planning
Social Media Analytics: Tools and Techniques in Information Systems
Information Systems in Education: Enhancing Learning and Administration
Cloud Computing Integration into Corporate Information Systems
Information Systems Audit: Practices and Challenges
User Interface Design and User Experience in Information Systems
Privacy and Data Protection in Information Systems
The Future of Quantum Computing in Information Systems
The Role of Information Systems in Environmental Management
Implementing Effective Knowledge Management Systems
The Adoption of Virtual Reality in Information Systems
The Challenges of Implementing ERP Systems in Multinational Corporations
Information Systems for Real-Time Business Analytics
The Impact of 5G Technology on Mobile Information Systems
Ethical Issues in the Management of Information Systems
Information Systems in Retail: Enhancing Customer Experience and Management
The Role of Information Systems in Non-Profit Organizations
Development of Decision Support Systems for Strategic Planning
Information Systems in the Banking Sector: Enhancing Financial Services
Risk Management in Information Systems
The Integration of Artificial Neural Networks in Information Systems
Information Systems and Corporate Governance
Information Systems for Disaster Response and Management
The Role of Information Systems in Sports Management
Information Systems for Public Health Surveillance
The Future of Information Systems: Trends and Predictions
Information Systems in the Film and Media Industry
Business Process Reengineering through Information Systems
Implementing Customer Relationship Management (CRM) Systems in E-commerce

Information Technology Thesis Topics

Emerging Trends in Artificial Intelligence and Machine Learning
The Future of Cloud Services and Technology
Cybersecurity: Current Threats and Future Defenses
The Role of Information Technology in Sustainable Energy Solutions
Internet of Things (IoT): From Smart Homes to Smart Cities
Blockchain and Its Impact on Information Technology
The Use of Big Data Analytics in Predictive Modeling
Virtual Reality (VR) and Augmented Reality (AR): The Next Frontier in IT
The Challenges of Digital Transformation in Traditional Businesses
Wearable Technology: Health Monitoring and Beyond
5G Technology: Implementation and Impacts on IT
Biometric Technology: Uses and Privacy Concerns
The Role of IT in Global Health Initiatives
Ethical Considerations in the Development of Autonomous Systems
Data Privacy in the Age of Information Overload
The Evolution of Software Development Methodologies
Quantum Computing: The Next Revolution in IT
IT Governance: Best Practices and Standards
The Integration of AI in Customer Service Technology
IT in Manufacturing: Industrial Automation and Robotics
The Future of E-commerce: Technology and Trends
Mobile Computing: Innovations and Challenges
Information Technology in Education: Tools and Trends
IT Project Management: Approaches and Tools
The Role of IT in Media and Entertainment
The Impact of Digital Marketing Technologies on Business Strategies
IT in Logistics and Supply Chain Management
The Development and Future of Autonomous Vehicles
IT in the Insurance Sector: Enhancing Efficiency and Customer Engagement
The Role of IT in Environmental Conservation
Smart Grid Technology: The Role of IT in Energy Management
Telemedicine: The Impact of IT on Healthcare Delivery
IT in the Agricultural Sector: Innovations and Impact
Cyber-Physical Systems: IT in the Integration of Physical and Digital Worlds
The Influence of Social Media Platforms on IT Development
Data Centers: Evolution, Technologies, and Sustainability
IT in Public Administration: Improving Services and Transparency
The Role of IT in Sports Analytics
Information Technology in Retail: Enhancing the Shopping Experience
The Future of IT: Integrating Ethical AI Systems

Internet of Things (IoT) Thesis Topics

Enhancing IoT Security: Strategies for Safeguarding Connected Devices
IoT in Smart Cities: Infrastructure and Data Management Challenges
The Application of IoT in Precision Agriculture: Maximizing Efficiency and Yield
IoT and Healthcare: Opportunities for Remote Monitoring and Patient Care
Energy Efficiency in IoT: Techniques for Reducing Power Consumption in Devices
The Role of IoT in Supply Chain Management and Logistics
Real-Time Data Processing Using Edge Computing in IoT Networks
Privacy Concerns and Data Protection in IoT Systems
The Integration of IoT with Blockchain for Enhanced Security and Transparency
IoT in Environmental Monitoring: Systems for Air Quality and Water Safety
Predictive Maintenance in Industrial IoT: Strategies and Benefits
IoT in Retail: Enhancing Customer Experience through Smart Technology
The Development of Standard Protocols for IoT Communication
IoT in Smart Homes: Automation and Security Systems
The Role of IoT in Disaster Management: Early Warning Systems and Response Coordination
Machine Learning Techniques for IoT Data Analytics
IoT in Automotive: The Future of Connected and Autonomous Vehicles
The Impact of 5G on IoT: Enhancements in Speed and Connectivity
IoT Device Lifecycle Management: From Creation to Decommissioning
IoT in Public Safety: Applications for Emergency Response and Crime Prevention
The Ethics of IoT: Balancing Innovation with Consumer Rights
IoT and the Future of Work: Automation and Labor Market Shifts
Designing User-Friendly Interfaces for IoT Applications
IoT in the Energy Sector: Smart Grids and Renewable Energy Integration
Quantum Computing and IoT: Potential Impacts and Applications
The Role of AI in Enhancing IoT Solutions
IoT for Elderly Care: Technologies for Health and Mobility Assistance
IoT in Education: Enhancing Classroom Experiences and Learning Outcomes
Challenges in Scaling IoT Infrastructure for Global Coverage
The Economic Impact of IoT: Industry Transformations and New Business Models
IoT and Tourism: Enhancing Visitor Experiences through Connected Technologies
Data Fusion Techniques in IoT: Integrating Diverse Data Sources
IoT in Aquaculture: Monitoring and Managing Aquatic Environments
Wireless Technologies for IoT: Comparing LoRa, Zigbee, and NB-IoT
IoT and Intellectual Property: Navigating the Legal Landscape
IoT in Sports: Enhancing Training and Audience Engagement
Building Resilient IoT Systems against Cyber Attacks
IoT for Waste Management: Innovations and System Implementations
IoT in Agriculture: Drones and Sensors for Crop Monitoring
The Role of IoT in Cultural Heritage Preservation: Monitoring and Maintenance

Machine Learning Thesis Topics

Advanced Algorithms for Supervised and Unsupervised Learning
Machine Learning in Genomics: Predicting Disease Propensity and Treatment Outcomes
The Use of Neural Networks in Image Recognition and Analysis
Reinforcement Learning: Applications in Robotics and Autonomous Systems
The Role of Machine Learning in Natural Language Processing and Linguistic Analysis
Deep Learning for Predictive Analytics in Business and Finance
Machine Learning for Cybersecurity: Detection of Anomalies and Malware
Ethical Considerations in Machine Learning: Bias and Fairness
The Integration of Machine Learning with IoT for Smart Device Management
Transfer Learning: Techniques and Applications in New Domains
The Application of Machine Learning in Environmental Science
Machine Learning in Healthcare: Diagnosing Conditions from Medical Images
The Use of Machine Learning in Algorithmic Trading and Stock Market Analysis
Machine Learning in Social Media: Sentiment Analysis and Trend Prediction
Quantum Machine Learning: Merging Quantum Computing with AI
Feature Engineering and Selection in Machine Learning
Machine Learning for Enhancing User Experience in Mobile Applications
The Impact of Machine Learning on Digital Marketing Strategies
Machine Learning for Energy Consumption Forecasting and Optimization
The Role of Machine Learning in Enhancing Network Security Protocols
Scalability and Efficiency of Machine Learning Algorithms
Machine Learning in Drug Discovery and Pharmaceutical Research
The Application of Machine Learning in Sports Analytics
Machine Learning for Real-Time Decision-Making in Autonomous Vehicles
The Use of Machine Learning in Predicting Geographical and Meteorological Events
Machine Learning for Educational Data Mining and Learning Analytics
The Role of Machine Learning in Audio Signal Processing
Predictive Maintenance in Manufacturing Through Machine Learning
Machine Learning and Its Implications for Privacy and Surveillance
The Application of Machine Learning in Augmented Reality Systems
Deep Learning Techniques in Medical Diagnosis: Challenges and Opportunities
The Use of Machine Learning in Video Game Development
Machine Learning for Fraud Detection in Financial Services
The Role of Machine Learning in Agricultural Optimization and Management
The Impact of Machine Learning on Content Personalization and Recommendation Systems
Machine Learning in Legal Tech: Document Analysis and Case Prediction
Adaptive Learning Systems: Tailoring Education Through Machine Learning
Machine Learning in Space Exploration: Analyzing Data from Space Missions
Machine Learning for Public Sector Applications: Improving Services and Efficiency
The Future of Machine Learning: Integrating Explainable AI

Neural Networks Thesis Topics

Innovations in Convolutional Neural Networks for Image and Video Analysis
Recurrent Neural Networks: Applications in Sequence Prediction and Analysis
The Role of Neural Networks in Predicting Financial Market Trends
Deep Neural Networks for Enhanced Speech Recognition Systems
Neural Networks in Medical Imaging: From Detection to Diagnosis
Generative Adversarial Networks (GANs): Applications in Art and Media
The Use of Neural Networks in Autonomous Driving Technologies
Neural Networks for Real-Time Language Translation
The Application of Neural Networks in Robotics: Sensory Data and Movement Control
Neural Network Optimization Techniques: Overcoming Overfitting and Underfitting
The Integration of Neural Networks with Blockchain for Data Security
Neural Networks in Climate Modeling and Weather Forecasting
The Use of Neural Networks in Enhancing Internet of Things (IoT) Devices
Graph Neural Networks: Applications in Social Network Analysis and Beyond
The Impact of Neural Networks on Augmented Reality Experiences
Neural Networks for Anomaly Detection in Network Security
The Application of Neural Networks in Bioinformatics and Genomic Data Analysis
Capsule Neural Networks: Improving the Robustness and Interpretability of Deep Learning
The Role of Neural Networks in Consumer Behavior Analysis
Neural Networks in the Energy Sector: Forecasting and Optimization
The Evolution of Neural Network Architectures for Efficient Learning
The Use of Neural Networks in Sentiment Analysis: Techniques and Challenges
Deep Reinforcement Learning: Strategies for Advanced Decision-Making Systems
Neural Networks for Precision Medicine: Tailoring Treatments to Individual Genetic Profiles
The Use of Neural Networks in Virtual Assistants: Enhancing Natural Language Understanding
The Impact of Neural Networks on Pharmaceutical Research
Neural Networks for Supply Chain Management: Prediction and Automation
The Application of Neural Networks in E-commerce: Personalization and Recommendation Systems
Neural Networks for Facial Recognition: Advances and Ethical Considerations
The Role of Neural Networks in Educational Technologies
The Use of Neural Networks in Predicting Economic Trends
Neural Networks in Sports: Analyzing Performance and Strategy
The Impact of Neural Networks on Digital Security Systems
Neural Networks for Real-Time Video Surveillance Analysis
The Integration of Neural Networks in Edge Computing Devices
Neural Networks for Industrial Automation: Improving Efficiency and Accuracy
The Future of Neural Networks: Towards More General AI Applications
Neural Networks in Art and Design: Creating New Forms of Expression
The Role of Neural Networks in Enhancing Public Health Initiatives
The Future of Neural Networks: Challenges in Scalability and Generalization

Programming Thesis Topics

The Evolution of Programming Paradigms: Functional vs. Object-Oriented Programming
Advances in Compiler Design and Optimization Techniques
The Impact of Programming Languages on Software Security
Developing Programming Languages for Quantum Computing
Machine Learning in Automated Code Generation and Optimization
The Role of Programming in Developing Scalable Cloud Applications
The Future of Web Development: New Frameworks and Technologies
Cross-Platform Development: Best Practices in Mobile App Programming
The Influence of Programming Techniques on Big Data Analytics
Real-Time Systems Programming: Challenges and Solutions
The Integration of Programming with Blockchain Technology
Programming for IoT: Languages and Tools for Device Communication
Secure Coding Practices: Preventing Cyber Attacks through Software Design
The Role of Programming in Data Visualization and User Interface Design
Advances in Game Programming: Graphics, AI, and Network Play
The Impact of Programming on Digital Media and Content Creation
Programming Languages for Robotics: Trends and Future Directions
The Use of Artificial Intelligence in Enhancing Programming Productivity
Programming for Augmented and Virtual Reality: New Challenges and Techniques
Ethical Considerations in Programming: Bias, Fairness, and Transparency
The Future of Programming Education: Interactive and Adaptive Learning Models
Programming for Wearable Technology: Special Considerations and Challenges
The Evolution of Programming in Financial Technology
Functional Programming in Enterprise Applications
Memory Management Techniques in Programming: From Garbage Collection to Manual Control
The Role of Open Source Programming in Accelerating Innovation
The Impact of Programming on Network Security and Cryptography
Developing Accessible Software: Programming for Users with Disabilities
Programming Language Theory: New Models and Approaches
The Challenges of Legacy Code: Strategies for Modernization and Integration
Energy-Efficient Programming: Optimizing Code for Green Computing
Multithreading and Concurrency: Advanced Programming Techniques
The Impact of Programming on Computational Biology and Bioinformatics
The Role of Scripting Languages in Automating System Administration
Programming and the Future of Quantum-Resistant Cryptography
Code Review and Quality Assurance: Techniques and Tools
Adaptive and Predictive Programming for Dynamic Environments
The Role of Programming in Enhancing E-commerce Technology
Programming for Cyber-Physical Systems: Bridging the Gap Between Digital and Physical
The Influence of Programming Languages on Computational Efficiency and Performance

Quantum Computing Thesis Topics

Quantum Algorithms: Development and Applications Beyond Shor’s and Grover’s Algorithms
The Role of Quantum Computing in Solving Complex Biological Problems
Quantum Cryptography: New Paradigms for Secure Communication
Error Correction Techniques in Quantum Computing
Quantum Computing and Its Impact on Artificial Intelligence
The Integration of Classical and Quantum Computing: Hybrid Models
Quantum Machine Learning: Theoretical Foundations and Practical Applications
Quantum Computing Hardware: Advances in Qubit Technology
The Application of Quantum Computing in Financial Modeling and Risk Assessment
Quantum Networking: Establishing Secure Quantum Communication Channels
The Future of Drug Discovery: Applications of Quantum Computing
Quantum Computing in Cryptanalysis: Threats to Current Cryptography Standards
Simulation of Quantum Systems for Material Science
Quantum Computing for Optimization Problems in Logistics and Manufacturing
Theoretical Limits of Quantum Computing: Understanding Quantum Complexity
Quantum Computing and the Future of Search Algorithms
The Role of Quantum Computing in Climate Science and Environmental Modeling
Quantum Annealing vs. Universal Quantum Computing: Comparative Studies
Implementing Quantum Algorithms in Quantum Programming Languages
The Impact of Quantum Computing on Public Key Cryptography
Quantum Entanglement: Experiments and Applications in Quantum Networks
Scalability Challenges in Quantum Processors
The Ethics and Policy Implications of Quantum Computing
Quantum Computing in Space Exploration and Astrophysics
The Role of Quantum Computing in Developing Next-Generation AI Systems
Quantum Computing in the Energy Sector: Applications in Smart Grids and Nuclear Fusion
Noise and Decoherence in Quantum Computers: Overcoming Practical Challenges
Quantum Computing for Predicting Economic Market Trends
Quantum Sensors: Enhancing Precision in Measurement and Imaging
The Future of Quantum Computing Education and Workforce Development
Quantum Computing in Cybersecurity: Preparing for a Post-Quantum World
Quantum Computing and the Internet of Things: Potential Intersections
Practical Quantum Computing: From Theory to Real-World Applications
Quantum Supremacy: Milestones and Future Goals
The Role of Quantum Computing in Genetics and Genomics
Quantum Computing for Material Discovery and Design
The Challenges of Quantum Programming Languages and Environments
Quantum Computing in Art and Creative Industries
The Global Race for Quantum Computing Supremacy: Technological and Political Aspects
Quantum Computing and Its Implications for Software Engineering
Advances in Humanoid Robotics: New Developments and Challenges
Robotics in Healthcare: From Surgery to Rehabilitation
The Integration of AI in Robotics: Enhanced Autonomy and Learning Capabilities
Swarm Robotics: Coordination Strategies and Applications
The Use of Robotics in Hazardous Environments: Deep Sea and Space Exploration
Soft Robotics: Materials, Design, and Applications
Robotics in Agriculture: Automation of Farming and Harvesting Processes
The Role of Robotics in Manufacturing: Increased Efficiency and Flexibility
Ethical Considerations in the Deployment of Robots in Human Environments
Autonomous Vehicles: Technological Advances and Regulatory Challenges
Robotic Assistants for the Elderly and Disabled: Improving Quality of Life
The Use of Robotics in Education: Teaching Science, Technology, Engineering, and Math (STEM)
Robotics and Computer Vision: Enhancing Perception and Decision Making
The Impact of Robotics on Employment and the Workforce
The Development of Robotic Systems for Environmental Monitoring and Conservation
Machine Learning Techniques for Robotic Perception and Navigation
Advances in Robotic Surgery: Precision and Outcomes
Human-Robot Interaction: Building Trust and Cooperation
Robotics in Retail: Automated Warehousing and Customer Service
Energy-Efficient Robots: Design and Utilization
Robotics in Construction: Automation and Safety Improvements
The Role of Robotics in Disaster Response and Recovery Operations
The Application of Robotics in Art and Creative Industries
Robotics and the Future of Personal Transportation
Ethical AI in Robotics: Ensuring Safe and Fair Decision-Making
The Use of Robotics in Logistics: Drones and Autonomous Delivery Vehicles
Robotics in the Food Industry: From Production to Service
The Integration of IoT with Robotics for Enhanced Connectivity
Wearable Robotics: Exoskeletons for Rehabilitation and Enhanced Mobility
The Impact of Robotics on Privacy and Security
Robotic Pet Companions: Social Robots and Their Psychological Effects
Robotics for Planetary Exploration and Colonization
Underwater Robotics: Innovations in Oceanography and Marine Biology
Advances in Robotics Programming Languages and Tools
The Role of Robotics in Minimizing Human Exposure to Contaminants and Pathogens
Collaborative Robots (Cobots): Working Alongside Humans in Shared Spaces
The Use of Robotics in Entertainment and Sports
Robotics and Machine Ethics: Programming Moral Decision-Making
The Future of Military Robotics: Opportunities and Challenges
Sustainable Robotics: Reducing the Environmental Impact of Robotic Systems
Agile Methodologies: Evolution and Future Trends
DevOps Practices: Improving Software Delivery and Lifecycle Management
The Impact of Microservices Architecture on Software Development
Containerization Technologies: Docker, Kubernetes, and Beyond
Software Quality Assurance: Modern Techniques and Tools
The Role of Artificial Intelligence in Automated Software Testing
Blockchain Applications in Software Development and Security
The Integration of Continuous Integration and Continuous Deployment (CI/CD) in Software Projects
Cybersecurity in Software Engineering: Best Practices for Secure Coding
Low-Code and No-Code Development: Implications for Professional Software Development
The Future of Software Engineering Education
Software Sustainability: Developing Green Software and Reducing Carbon Footprints
The Role of Software Engineering in Healthcare: Telemedicine and Patient Data Management
Privacy by Design: Incorporating Privacy Features at the Development Stage
The Impact of Quantum Computing on Software Engineering
Software Engineering for Augmented and Virtual Reality: Challenges and Innovations
Cloud-Native Applications: Design, Development, and Deployment
Software Project Management: Agile vs. Traditional Approaches
Open Source Software: Community Engagement and Project Sustainability
The Evolution of Graphical User Interfaces in Application Development
The Challenges of Integrating IoT Devices into Software Systems
Ethical Issues in Software Engineering: Bias, Accountability, and Regulation
Software Engineering for Autonomous Vehicles: Safety and Regulatory Considerations
Big Data Analytics in Software Development: Enhancing Decision-Making Processes
The Future of Mobile App Development: Trends and Technologies
The Role of Software Engineering in Artificial Intelligence: Frameworks and Algorithms
Performance Optimization in Software Applications
Adaptive Software Development: Responding to Changing User Needs
Software Engineering in Financial Services: Compliance and Security Challenges
User Experience (UX) Design in Software Engineering
The Role of Software Engineering in Smart Cities: Infrastructure and Services
The Impact of 5G on Software Development and Deployment
Real-Time Systems in Software Engineering: Design and Implementation Challenges
Cross-Platform Development Challenges: Ensuring Consistency and Performance
Software Testing Automation: Tools and Trends
The Integration of Cyber-Physical Systems in Software Engineering
Software Engineering in the Entertainment Industry: Game Development and Beyond
The Application of Machine Learning in Predicting Software Bugs
The Role of Software Engineering in Cybersecurity Defense Strategies
Accessibility in Software Engineering: Creating Inclusive and Usable Software
Progressive Web Apps (PWAs): Advantages and Implementation Challenges
The Future of Web Accessibility: Standards and Practices
Single-Page Applications (SPAs) vs. Multi-Page Applications (MPAs): Performance and Usability
The Impact of Serverless Computing on Web Development
The Evolution of CSS for Modern Web Design
Security Best Practices in Web Development: Defending Against XSS and CSRF Attacks
The Role of Web Development in Enhancing E-commerce User Experience
The Use of Artificial Intelligence in Web Personalization and User Engagement
The Future of Web APIs: Standards, Security, and Scalability
Responsive Web Design: Techniques and Trends
JavaScript Frameworks: Vue.js, React.js, and Angular – A Comparative Analysis
Web Development for IoT: Interfaces and Connectivity Solutions
The Impact of 5G on Web Development and User Experiences
The Use of Blockchain Technology in Web Development for Enhanced Security
Web Development in the Cloud: Using AWS, Azure, and Google Cloud
Content Management Systems (CMS): Trends and Future Developments
The Application of Web Development in Virtual and Augmented Reality
The Importance of Web Performance Optimization: Tools and Techniques
Sustainable Web Design: Practices for Reducing Energy Consumption
The Role of Web Development in Digital Marketing: SEO and Social Media Integration
Headless CMS: Benefits and Challenges for Developers and Content Creators
The Future of Web Typography: Design, Accessibility, and Performance
Web Development and Data Protection: Complying with GDPR and Other Regulations
Real-Time Web Communication: Technologies like WebSockets and WebRTC
Front-End Development Tools: Efficiency and Innovation in Workflow
The Challenges of Migrating Legacy Systems to Modern Web Architectures
Microfrontends Architecture: Designing Scalable and Decoupled Web Applications
The Impact of Cryptocurrencies on Web Payment Systems
User-Centered Design in Web Development: Methods for Engaging Users
The Role of Web Development in Business Intelligence: Dashboards and Reporting Tools
Web Development for Mobile Platforms: Optimization and Best Practices
The Evolution of E-commerce Platforms: From Web to Mobile Commerce
Web Security in E-commerce: Protecting Transactions and User Data
Dynamic Web Content: Server-Side vs. Client-Side Rendering
The Future of Full Stack Development: Trends and Skills
Web Design Psychology: How Design Influences User Behavior
The Role of Web Development in the Non-Profit Sector: Fundraising and Community Engagement
The Integration of AI Chatbots in Web Development
The Use of Motion UI in Web Design: Enhancing Aesthetics and User Interaction
The Future of Web Development: Predictions and Emerging Technologies

We trust that this comprehensive list of computer science thesis topics will serve as a valuable starting point for your research endeavors. With 1000 unique and carefully selected topics distributed across 25 key areas of computer science, students are equipped to tackle complex questions and contribute meaningful advancements to the field. As you proceed to select your thesis topic, consider not only your personal interests and career goals but also the potential impact of your research. We encourage you to explore these topics thoroughly and choose one that will not only challenge you but also push the boundaries of technology and innovation.

The Range of Computer Science Thesis Topics

Computer science stands as a dynamic and ever-evolving field that continuously reshapes how we interact with the world. At its core, the discipline encompasses not just the study of algorithms and computation, but a broad spectrum of practical and theoretical knowledge areas that drive innovation in various sectors. This article aims to explore the rich landscape of computer science thesis topics, offering students and researchers a glimpse into the potential areas of study that not only challenge the intellect but also contribute significantly to technological progress. As we delve into the current issues, recent trends, and future directions of computer science, it becomes evident that the possibilities for research are both vast and diverse. Whether you are intrigued by the complexities of artificial intelligence, the robust architecture of networks and systems, or the innovative approaches in cybersecurity, computer science offers a fertile ground for developing thesis topics that are as impactful as they are intellectually stimulating.

Current Issues in Computer Science

One of the prominent current issues in computer science revolves around data security and privacy. As digital transformation accelerates across industries, the massive influx of data generated poses significant challenges in terms of its protection and ethical use. Cybersecurity threats have become more sophisticated, with data breaches and cyber-attacks causing major concerns for organizations worldwide. This ongoing battle demands continuous improvements in security protocols and the development of robust cybersecurity measures. Computer science thesis topics in this area can explore new cryptographic methods, intrusion detection systems, and secure communication protocols to fortify digital defenses. Research could also delve into the ethical implications of data collection and use, proposing frameworks that ensure privacy while still leveraging data for innovation.

Another critical issue facing the field of computer science is the ethical development and deployment of artificial intelligence (AI) systems. As AI technologies become more integrated into daily life and critical infrastructure, concerns about bias, fairness, and accountability in AI systems have intensified. Thesis topics could focus on developing algorithms that address these ethical concerns, including techniques for reducing bias in machine learning models and methods for increasing transparency and explainability in AI decisions. This research is crucial for ensuring that AI technologies promote fairness and do not perpetuate or exacerbate existing societal inequalities.

Furthermore, the rapid pace of technological change presents a challenge in terms of sustainability and environmental impact. The energy consumption of large data centers, the carbon footprint of manufacturing hardware, the growing volume of electronic waste, and the broader effects of high-tech innovations on the environment are significant concerns within computer science. Thesis research in this domain could focus on creating more energy-efficient computing methods, developing algorithms that reduce power consumption, or devising recycling technologies that address the issue of e-waste. This research not only contributes to the field of computer science but also plays a crucial role in ensuring that technological advancement does not come at an unsustainable cost to the environment.

These current issues highlight the dynamic nature of computer science and its direct impact on society. Addressing these challenges through focused research and innovative thesis topics not only advances the field but also contributes to resolving some of the most pressing problems facing our global community today.

Recent Trends in Computer Science

In recent years, computer science has witnessed significant advancements in the integration of artificial intelligence (AI) and machine learning (ML) across various sectors, marking one of the most exciting trends in the field. These technologies are not just reshaping traditional industries but are also at the forefront of driving innovations in areas like healthcare, finance, and autonomous systems. Thesis topics within this trend could explore the development of advanced ML algorithms that enhance predictive analytics, improve automated decision-making, or refine natural language processing capabilities. Additionally, AI’s role in ethical decision-making and its societal impacts offers a rich vein of inquiry for research, focusing on mitigating biases and ensuring that AI systems operate transparently and justly.

Another prominent trend in computer science is the rapid growth of blockchain technology beyond its initial application in cryptocurrencies. Blockchain is proving its potential in creating more secure, decentralized, and transparent networks for a variety of applications, from enhancing supply chain logistics to revolutionizing digital identity verification processes. Computer science thesis topics could investigate novel uses of blockchain for ensuring data integrity in digital transactions, enhancing cybersecurity measures, or even developing new frameworks for blockchain integration into existing technological infrastructures. The exploration of blockchain’s scalability, speed, and energy consumption also presents critical research opportunities that are timely and relevant.

Furthermore, the expansion of the Internet of Things (IoT) continues to be a significant trend, with more devices becoming connected every day, leading to increasingly smart environments. This proliferation poses unique challenges and opportunities for computer science research, particularly in terms of scalability, security, and data management. Thesis topics might focus on optimizing network protocols to handle the massive influx of data from IoT devices, developing solutions to safeguard against IoT-specific security vulnerabilities, or designing innovative applications of IoT in urban planning, smart homes, or healthcare. Research in this area is crucial for advancing the efficiency and functionality of IoT systems and for ensuring they can be safely and effectively integrated into modern life.

These recent trends underscore the vibrant and ever-evolving nature of computer science, reflecting its capacity to influence and transform an array of sectors through technological innovation. The continual emergence of new research topics within these trends not only enriches the academic discipline but also provides substantial benefits to society by addressing practical challenges and enhancing the capabilities of technology in everyday life.

Future Directions in Computer Science

As we look toward the future, one of the most anticipated areas in computer science is the advancement of quantum computing. This emerging technology promises to change problem-solving in fields that require immense computational power, such as cryptography, drug discovery, and complex system modeling. For certain classes of problems, such as factoring large integers and simulating quantum systems, quantum computers could offer speedups beyond the reach of known classical algorithms, with potential breakthroughs in materials science and direct consequences for encryption methods. Computer science thesis topics might explore the theoretical underpinnings of quantum algorithms, the development of quantum-resistant cryptographic systems, or practical applications of quantum computing in industry-specific scenarios. Research in this area not only contributes to the foundational knowledge of quantum information science but also paves the way for the integration of quantum computing into mainstream computing, marking a significant leap forward in computational capabilities.

Another promising direction in computer science is the advancement of autonomous systems, particularly in robotics and vehicle automation. The future of autonomous technologies hinges on improving their safety, reliability, and decision-making processes under uncertain conditions. Thesis topics could focus on the enhancement of machine perception through computer vision and sensor fusion, the development of more sophisticated AI-driven decision frameworks, or ethical considerations in the deployment of autonomous systems. As these technologies become increasingly prevalent, research will play a crucial role in addressing the societal and technical challenges they present, ensuring their beneficial integration into daily life and industry operations.

Additionally, the ongoing expansion of artificial intelligence applications poses significant future directions for research, especially in the realm of AI ethics and policy. As AI systems become more capable and widespread, their impact on privacy, employment, and societal norms continues to grow. Future thesis topics might delve into the development of guidelines and frameworks for responsible AI, studies on the impact of AI on workforce dynamics, or innovations in transparent and fair AI systems. This research is vital for guiding the ethical evolution of AI technologies, ensuring they enhance societal well-being without diminishing human dignity or autonomy.

These future directions in computer science not only highlight the field’s potential for substantial technological advancements but also underscore the importance of thoughtful consideration of their broader implications. By exploring these areas in depth, computer science research can lead the way in not just technological innovation, but also in shaping a future where technology and ethics coexist harmoniously for the betterment of society.

In conclusion, the field of computer science is not only foundational to the technological advancements that characterize the modern age but also crucial in solving some of the most pressing challenges of our time. The potential thesis topics discussed in this article reflect a mere fraction of the opportunities that lie in the realms of theory, application, and innovation within this expansive field. As emerging technologies such as quantum computing, artificial intelligence, and blockchain continue to evolve, they open new avenues for research that could potentially redefine existing paradigms. For students embarking on their thesis journey, it is essential to choose a topic that not only aligns with their academic passions but also contributes to the ongoing expansion of computer science knowledge. By pushing the boundaries of what is known and exploring uncharted territories, students can leave a lasting impact on the field and pave the way for future technological breakthroughs. As we look forward, it’s clear that computer science will continue to be a key driver of change, making it an exciting and rewarding area for academic and professional growth.

Undergraduate Research Topics

How to Contact Faculty for IW/Thesis Advising

Send the professor an e-mail. When you write a professor, be clear that you want a meeting regarding a senior thesis or one-on-one IW project, and briefly describe the topic or idea that you want to work on. Check the faculty listing for email addresses.

*Updated April 9, 2024

Parastoo Abtahi, Room 419

Available for single-semester IW and senior thesis advising, 2024-2025

Research Areas: Human-Computer Interaction (HCI), Augmented Reality (AR), and Spatial Computing
Input techniques for on-the-go interaction (e.g., eye-gaze, microgestures, voice) with a focus on uncertainty, disambiguation, and privacy.
Minimal and timely multisensory output (e.g., spatial audio, haptics) that enables users to attend to their physical environment and the people around them, instead of a 2D screen.
Interaction with intelligent systems (e.g., IoT, robots) situated in physical spaces with a focus on updating users’ mental model despite the complexity and dynamicity of these systems.

Ryan Adams, Room 411

Research areas:

Machine learning driven design
Generative models for structured discrete objects
Approximate inference in probabilistic models
Accelerating solutions to partial differential equations
Innovative uses of automatic differentiation
Modeling and optimizing 3D printing and CNC machining

Andrew Appel, Room 209

Available for Fall 2024 IW advising, only

Research Areas: Formal methods, programming languages, compilers, computer security.
Software verification (for which taking COS 326 / COS 510 is helpful preparation)
Game theory of poker or other games (for which COS 217 / 226 are helpful)
Computer game-playing programs (for which COS 217 / 226 are helpful)
Risk-limiting audits of elections (for which ORF 245 or other knowledge of probability is useful)

Sanjeev Arora, Room 407

Theoretical machine learning, deep learning and its analysis, natural language processing. My advisees would typically have taken a course in algorithms (COS423 or COS 521 or equivalent) and a course in machine learning.
Show that finding approximate solutions to NP-complete problems is also NP-complete (i.e., come up with NP-completeness reductions a la COS 487).
Experimental Algorithms: Implementing and Evaluating Algorithms using existing software packages.
Studying/designing provable algorithms for machine learning and implementing them using packages like scipy and MATLAB, including applications in natural language processing and deep learning.
Any topic in theoretical computer science.

David August, Room 221

Not available for IW or thesis advising, 2024-2025

Research Areas: Computer Architecture, Compilers, Parallelism
Containment-based approaches to security: We have designed and tested a simple hardware+software containment mechanism that stops incorrect communication resulting from faults, bugs, or exploits from leaving the system. Let's explore ways to use containment to solve real problems. Expect to work with corporate security and technology decision-makers.
Parallelism: Studies show much more parallelism than is currently realized in compilers and architectures. Let's find ways to realize this parallelism.
Any other interesting topic in computer architecture or compilers.

Mark Braverman, 194 Nassau St., Room 231

Research Areas: computational complexity, algorithms, applied probability, computability over the real numbers, game theory and mechanism design, information theory.
Topics in computational and communication complexity.
Applications of information theory in complexity theory.
Algorithms for problems under real-life assumptions.
Game theory, network effects
Mechanism design (could be on a problem proposed by the student)

Sebastian Caldas, 221 Nassau Street, Room 105

Research Areas: collaborative learning, machine learning for healthcare. Typically, I will work with students that have taken COS324.
Methods for collaborative and continual learning.
Machine learning for healthcare applications.

Bernard Chazelle, 194 Nassau St., Room 301

Research Areas: Natural Algorithms, Computational Geometry, Sublinear Algorithms.
Natural algorithms (flocking, swarming, social networks, etc).
Sublinear algorithms
Self-improving algorithms
Markov data structures

Danqi Chen, Room 412

My advisees would be expected to have taken a course in machine learning and ideally have taken COS484 or an NLP graduate seminar.
Representation learning for text and knowledge bases
Pre-training and transfer learning
Question answering and reading comprehension
Information extraction
Text summarization
Any other interesting topics related to natural language understanding/generation

Marcel Dall'Agnol, Corwin 034

Research Areas: Theoretical computer science. (Specifically, quantum computation, sublinear algorithms, complexity theory, interactive proofs and cryptography)
Research Areas: Machine learning

Jia Deng, Room 423

Research Areas: Computer Vision, Machine Learning.
Object recognition and action recognition
Deep Learning, autoML, meta-learning
Geometric reasoning, logical reasoning

Adji Bousso Dieng, Room 406

Research areas: Vertaix is a research lab at Princeton University led by Professor Adji Bousso Dieng. We work at the intersection of artificial intelligence (AI) and the natural sciences. The models and algorithms we develop are motivated by problems in those domains and contribute to advancing methodological research in AI. We leverage tools in statistical machine learning and deep learning in developing methods for learning with the data, of various modalities, arising from the natural sciences.

Robert Dondero, Corwin Hall, Room 038

Research Areas: Software engineering; software engineering education.
Develop or evaluate tools to facilitate student learning in undergraduate computer science courses at Princeton, and beyond.
In particular, can code critiquing tools help students learn about software quality?

Zeev Dvir, 194 Nassau St., Room 250

Research Areas: computational complexity, pseudo-randomness, coding theory and discrete mathematics.
Independent Research: I have various research problems related to Pseudorandomness, Coding theory, Complexity and Discrete mathematics - all of which require strong mathematical background. A project could also be based on writing a survey paper describing results from a few theory papers revolving around some particular subject.

Benjamin Eysenbach, Room 416

Research areas: reinforcement learning, machine learning. My advisees would typically have taken COS324.
Applying RL algorithms to problems in science and engineering.
Emergent behavior of RL algorithms on high-fidelity robotic simulators.
Studying how architectures and representations can facilitate generalization.

Christiane Fellbaum, 1-S-14 Green

Research Areas: theoretical and computational linguistics, word sense disambiguation, lexical resource construction, English and multilingual WordNet(s), ontology
Anything having to do with natural language--come and see me with/for ideas suitable to your background and interests. Some topics students have worked on in the past:
Developing parsers, part-of-speech taggers, morphological analyzers for underrepresented languages (you don't have to know the language to develop such tools!)
Quantitative approaches to theoretical linguistics questions
Extensions and interfaces for WordNet (English and WN in other languages),
Applications of WordNet(s), including:
Foreign language tutoring systems,
Spelling correction software,
Word-finding/suggestion software for ordinary users and people with memory problems,
Machine Translation
Sentiment and Opinion detection
Automatic reasoning and inferencing
Collaboration with professors in the social sciences and humanities ("Digital Humanities")

Adam Finkelstein, Room 424

Research Areas: computer graphics, audio.

Robert S. Fish, Corwin Hall, Room 037

Networking and telecommunications
Learning, perception, and intelligence, artificial and otherwise
Human-computer interaction and computer-supported cooperative work
Online education, especially in Computer Science Education
Topics in research and development innovation methodologies including standards, open-source, and entrepreneurship
Distributed autonomous organizations and related blockchain technologies

Michael Freedman, Room 308

Research Areas: Distributed systems, security, networking
Projects related to streaming data analysis, datacenter systems and networks, untrusted cloud storage and applications. Please see my group website at http://sns.cs.princeton.edu/ for current research projects.

Ruth Fong, Room 032

Research Areas: computer vision, machine learning, deep learning, interpretability, explainable AI, fairness and bias in AI
Develop a technique for understanding AI models
Design an AI model that is interpretable by design
Build a paradigm for detecting and/or correcting failure points in an AI model
Analyze an existing AI model and/or dataset to better understand its failure points
Build a computer vision system for another domain (e.g., medical imaging, satellite data, etc.)
Develop a software package for explainable AI
Adapt explainable AI research to a consumer-facing problem

Note: I am happy to advise any project if there's a sufficient overlap in interest and/or expertise; please reach out via email to chat about project ideas.

Tom Griffiths, Room 405

Available for Fall 2024 single-semester IW advising, only

Research areas: computational cognitive science, computational social science, machine learning and artificial intelligence

Note: I am open to projects that apply ideas from computer science to understanding aspects of human cognition in a wide range of areas, from decision-making to cultural evolution and everything in between. For example, we have current projects analyzing chess game data and magic tricks, both of which give us clues about how human minds work. Students who have expertise or access to data related to games, magic, strategic sports like fencing, or other quantifiable domains of human behavior feel free to get in touch.

Aarti Gupta, Room 220

Research Areas: Formal methods, program analysis, logic decision procedures
Finding bugs in open source software using automatic verification tools
Software verification (program analysis, model checking, test generation)
Decision procedures for logical reasoning (SAT solvers, SMT solvers)

Elad Hazan, Room 409

Research interests: machine learning methods and algorithms, efficient methods for mathematical optimization, regret minimization in games, reinforcement learning, control theory and practice
Machine learning, efficient methods for mathematical optimization, statistical and computational learning theory, regret minimization in games.
Implementation and algorithm engineering for control, reinforcement learning and robotics
Implementation and algorithm engineering for time series prediction

Felix Heide, Room 410

Research Areas: Computational Imaging, Computer Vision, Machine Learning (focus on Optimization and Approximate Inference).
Optical Neural Networks
Hardware-in-the-loop Holography
Zero-shot and Simulation-only Learning
Object recognition in extreme conditions
3D Scene Representations for View Generation and Inverse Problems
Long-range Imaging in Scattering Media
Hardware-in-the-loop Illumination and Sensor Optimization
Inverse Lidar Design
Phase Retrieval Algorithms
Proximal Algorithms for Learning and Inference
Domain-Specific Language for Optics Design

Peter Henderson, 302 Sherrerd Hall

Research Areas: Machine learning, law, and policy

Kyle Jamieson, Room 306

Research areas: Wireless and mobile networking; indoor radar and indoor localization; Internet of Things
See other topics on my independent work ideas page (campus IP and CS dept. login req'd)

Alan Kaplan, 221 Nassau Street, Room 105

Research Areas:

Random apps of kindness - mobile application/technology frameworks used to help individuals or communities; topic areas include, but are not limited to: first response, accessibility, environment, sustainability, social activism, civic computing, tele-health, remote learning, crowdsourcing, etc.
Tools automating programming language interoperability - Java/C++, React Native/Java, etc.
Software visualization tools for education
Connected consumer devices, applications and protocols

Brian Kernighan, Room 311

Research Areas: application-specific languages, document preparation, user interfaces, software tools, programming methodology
Application-oriented languages, scripting languages.
Tools; user interfaces
Digital humanities

Zachary Kincaid, Room 219

Research areas: programming languages, program analysis, program verification, automated reasoning
Independent Research Topics:
Develop a practical algorithm for an intractable problem (e.g., by developing practical search heuristics, or by reducing to, or by identifying a tractable sub-problem, ...).
Design a domain-specific programming language, or prototype a new feature for an existing language.
Any interesting project related to programming languages or logic.

Gillat Kol, Room 316

Research area: theory

Aleksandra Korolova, 309 Sherrerd Hall

Research areas: Societal impacts of algorithms and AI; privacy; fair and privacy-preserving machine learning; algorithm auditing.

Advisees typically have taken one or more of COS 226, COS 324, COS 423, COS 424 or COS 445.

Pravesh Kothari, Room 320

Research areas: Theory

Amit Levy, Room 307

Research Areas: Operating Systems, Distributed Systems, Embedded Systems, Internet of Things
Distributed hardware testing infrastructure
Second factor security tokens
Low-power wireless network protocol implementation
USB device driver implementation

Kai Li, Room 321

Research Areas: Distributed systems; storage systems; content-based search and data analysis of large datasets.
Fast communication mechanisms for heterogeneous clusters.
Approximate nearest-neighbor search for high dimensional data.
Data analysis and prediction of in-patient medical data.
Optimized implementation of classification algorithms on manycore processors.

Xiaoyan Li, 221 Nassau Street, Room 104

Research areas: Information retrieval, novelty detection, question answering, AI, machine learning and data analysis.
Explore new statistical retrieval models for document retrieval and question answering.
Apply AI in various fields.
Apply supervised or unsupervised learning in health, education, finance, and social networks, etc.
Any interesting project related to AI, machine learning, and data analysis.

Lydia Liu, Room 414

Research Areas: algorithmic decision making, machine learning and society
Theoretical foundations for algorithmic decision making (e.g. mathematical modeling of data-driven decision processes, societal level dynamics)
Societal impacts of algorithms and AI through a socio-technical lens (e.g. normative implications of worst case ML metrics, prediction and model arbitrariness)
Machine learning for social impact domains, especially education (e.g. responsible development and use of LLMs for education equity and access)
Evaluation of human-AI decision making using statistical methods (e.g. causal inference of long term impact)

Wyatt Lloyd, Room 323

Research areas: Distributed Systems
Caching algorithms and implementations
Storage systems
Distributed transaction algorithms and implementations

Alex Lombardi, Room 312

Research Areas: Theory

Margaret Martonosi, Room 208

Quantum Computing research, particularly related to architecture and compiler issues for QC.
Computer architectures specialized for modern workloads (e.g., graph analytics, machine learning algorithms, mobile applications)
Investigating security and privacy vulnerabilities in computer systems, particularly IoT devices.
Other topics in computer architecture or mobile / IoT systems also possible.

Jonathan Mayer, Sherrerd Hall, Room 307

Available for Spring 2025 single-semester IW, only

Research areas: Technology law and policy, with emphasis on national security, criminal procedure, consumer privacy, network management, and online speech.
Assessing the effects of government policies, both in the public and private sectors.
Collecting new data that relates to government decision making, including surveying current business practices and studying user behavior.
Developing new tools to improve government processes and offer policy alternatives.

Mae Milano, Room 307

Local-first / peer-to-peer systems
Wide-area storage systems
Consistency and protocol design
Type-safe concurrency
Language design
Gradual typing
Domain-specific languages
Languages for distributed systems

Andrés Monroy-Hernández, Room 405

Research Areas: Human-Computer Interaction, Social Computing, Public-Interest Technology, Augmented Reality, Urban Computing
Research interests: developing public-interest socio-technical systems. We are currently creating alternatives to gig work platforms that are more equitable for all stakeholders. For instance, we are investigating the socio-technical affordances necessary to support a co-op food delivery network owned and managed by workers and restaurants. We are exploring novel system designs that support self-governance, decentralized/federated models, community-centered data ownership, and portable reputation systems. We have opportunities for students interested in human-centered computing, UI/UX design, full-stack software development, and qualitative/quantitative user research.
Beyond our core projects, we are open to working on research projects that explore the use of emerging technologies, such as AR, wearables, NFTs, and DAOs, for creative and out-of-the-box applications.

Christopher Moretti, Corwin Hall, Room 036

Research areas: Distributed systems, high-throughput computing, computer science/engineering education
Expansion, improvement, and evaluation of open-source distributed computing software.
Applications of distributed computing for "big science" (e.g. biometrics, data mining, bioinformatics)
Software and best practices for computer science education and study, especially Princeton's 126/217/226 sequence or MOOCs development
Sports analytics and/or crowd-sourced computing

Radhika Nagpal, F316 Engineering Quadrangle

Research areas: control, robotics and dynamical systems

Karthik Narasimhan, Room 422

Research areas: Natural Language Processing, Reinforcement Learning
Autonomous agents for text-based games (https://www.microsoft.com/en-us/research/project/textworld/)
Transfer learning/generalization in NLP
Techniques for generating natural language
Model-based reinforcement learning

Arvind Narayanan, 308 Sherrerd Hall

Research Areas: fair machine learning (and AI ethics more broadly), the social impact of algorithmic systems, tech policy

Pedro Paredes, Corwin Hall, Room 041

My primary research work is in Theoretical Computer Science.

* Research Interest: Spectral Graph theory, Pseudorandomness, Complexity theory, Coding Theory, Quantum Information Theory, Combinatorics.

The IW projects I am interested in advising can be divided into three categories:

1. Theoretical research

I am open to advising research projects on any topic within my research areas of interest. A project could also be based on writing a survey of results from a few papers. Students should have a solid background in math (e.g., elementary combinatorics, graph theory, discrete probability, basic algebra/calculus) and theoretical computer science (226 and 240 material, like big-O/Omega/Theta, basic complexity theory, basic fundamental algorithms). Mathematical maturity is a must.

A (non-exhaustive) list of project topics I'm interested in:
* Explicit constructions of better vertex expanders and/or unique neighbor expanders.
* Constructions of deterministic or random high-dimensional expanders.
* Pseudorandom generators for different problems.
* Topics around the quantum PCP conjecture.
* Topics around quantum error correcting codes and locally testable codes, including constructions, encoding and decoding algorithms.

2. Theory-informed practical implementations of algorithms

Very often, great advances in theoretical research are either not tested in practice or not even feasible to implement in practice. Thus, I am interested in any project that consists of trying to make theoretical ideas applicable in practice. This includes coming up with new algorithms that trade some theoretical guarantees for feasible implementation while trying to retain the soul of the original idea; implementing new algorithms in a suitable programming language; and empirically testing practical implementations and comparing them with benchmarks / theoretical expectations. A project in this area doesn't have to be in my main areas of research; any theoretical result could be suitable for such a project.

Some examples of areas of interest:
* Streaming algorithms.
* Numeric linear algebra.
* Property testing.
* Parallel / Distributed algorithms.
* Online algorithms.

3. Machine learning with a theoretical foundation

I am interested in projects in machine learning that have some mathematical/theoretical component, even if most of the project is applied. This includes topics like mathematical optimization, statistical learning, fairness and privacy.

One particular area I have recently been interested in is rating systems (e.g., chess Elo) and their applications to experts problems.

Final Note: I am also willing to advise any project with any mathematical/theoretical component, even if it's not the main one; please reach out via email to chat about project ideas.

Iasonas Petras, Corwin Hall, Room 033

Research Areas: Information Based Complexity, Numerical Analysis, Quantum Computation.
Prerequisites: Reasonable mathematical maturity. In case of a project related to Quantum Computation a certain familiarity with quantum mechanics is required (related courses: ELE 396/PHY 208).
Possible research topics include:

1. Quantum algorithms and circuits:

i. Design or simulation of quantum circuits implementing quantum algorithms.
ii. Design of quantum algorithms solving/approximating continuous problems (such as Eigenvalue problems for Partial Differential Equations).

2. Information Based Complexity:

i. Necessary and sufficient conditions for tractability of Linear and Linear Tensor Product Problems in various settings (for example worst case or average case).
ii. Necessary and sufficient conditions for tractability of Linear and Linear Tensor Product Problems under new tractability and error criteria.
iii. Necessary and sufficient conditions for tractability of Weighted problems.
iv. Necessary and sufficient conditions for tractability of Weighted Problems under new tractability and error criteria.

3. Topics in Scientific Computation:

i. Randomness, Pseudorandomness, MC and QMC methods and their applications (Finance, etc)

Yuri Pritykin, 245 Carl Icahn Lab

Research interests: Computational biology; Cancer immunology; Regulation of gene expression; Functional genomics; Single-cell technologies.
Potential research projects: Development, implementation, assessment and/or application of algorithms for analysis, integration, interpretation and visualization of multi-dimensional data in molecular biology, particularly single-cell and spatial genomics data.

Benjamin Raphael, Room 309

Research interests: Computational biology and bioinformatics; Cancer genomics; Algorithms and machine learning approaches for analysis of large-scale datasets
Implementation and application of algorithms to infer evolutionary processes in cancer
Identifying correlations between combinations of genomic mutations in human and cancer genomes
Design and implementation of algorithms for genome sequencing from new DNA sequencing technologies
Graph clustering and network anomaly detection, particularly using diffusion processes and methods from spectral graph theory

Vikram Ramaswamy, 035 Corwin Hall

Research areas: Interpretability of AI systems, Fairness in AI systems, Computer vision.
Constructing a new method to explain a model / create an interpretable by design model
Analyzing a current model / dataset to understand bias within the model/dataset
Proposing new fairness evaluations
Proposing new methods to train to improve fairness
Developing synthetic datasets for fairness / interpretability benchmarks
Understanding robustness of models

Ran Raz, Room 240

Research Area: Computational Complexity
Independent Research Topics: Computational Complexity, Information Theory, Quantum Computation, Theoretical Computer Science

Szymon Rusinkiewicz, Room 406

Research Areas: computer graphics; computer vision; 3D scanning; 3D printing; robotics; documentation and visualization of cultural heritage artifacts
Research ways of incorporating rotation invariance into computer vision tasks such as feature matching and classification
Investigate approaches to robust 3D scan matching
Model and compensate for imperfections in 3D printing
Given a collection of small mobile robots, apply control policies learned in simulation to the real robots.

Olga Russakovsky, Room 408

Research Areas: computer vision, machine learning, deep learning, crowdsourcing, fairness & bias in AI
Design a semantic segmentation deep learning model that can operate in a zero-shot setting (i.e., recognize and segment objects not seen during training)
Develop a deep learning classifier that is impervious to protected attributes (such as gender or race) that may be erroneously correlated with target classes
Build a computer vision system for the novel task of inferring what object (or part of an object) a human is referring to when pointing to a single pixel in the image. This includes both collecting an appropriate dataset using crowdsourcing on Amazon Mechanical Turk, creating a new deep learning formulation for this task, and running extensive analysis of both the data and the model

Sebastian Seung, Princeton Neuroscience Institute, Room 153

Research Areas: computational neuroscience, connectomics, "deep learning" neural networks, social computing, crowdsourcing, citizen science
Gamification of neuroscience (EyeWire 2.0)
Semantic segmentation and object detection in brain images from microscopy
Computational analysis of brain structure and function
Neural network theories of brain function

Jaswinder Pal Singh, Room 324

Research Areas: Boundary of technology and business/applications; building and scaling technology companies with special focus at that boundary; parallel computing systems and applications: parallel and distributed applications and their implications for software and architectural design; system software and programming environments for multiprocessors.
Develop a startup company idea, and build a plan/prototype for it.
Explore tradeoffs at the boundary of technology/product and business/applications in a chosen area.
Study and develop methods to infer insights from data in different application areas, from science to search to finance to others.
Design and implement a parallel application. Possible areas include graphics, compression, biology, among many others. Analyze performance bottlenecks using existing tools, and compare programming models/languages.
Design and implement a scalable distributed algorithm.

Mona Singh, Room 420

Research Areas: computational molecular biology, as well as its interface with machine learning and algorithms.
Whole and cross-genome methods for predicting protein function and protein-protein interactions.
Analysis and prediction of biological networks.
Computational methods for inferring specific aspects of protein structure from protein sequence data.
Any other interesting project in computational molecular biology.

Robert Tarjan, 194 Nassau St., Room 308

Research Areas: Data structures; graph algorithms; combinatorial optimization; computational complexity; computational geometry; parallel algorithms.
Implement one or more data structures or combinatorial algorithms to provide insight into their empirical behavior.
Design and/or analyze various data structures and combinatorial algorithms.

Olga Troyanskaya, Room 320

Research Areas: Bioinformatics; analysis of large-scale biological data sets (genomics, gene expression, proteomics, biological networks); algorithms for integration of data from multiple data sources; visualization of biological data; machine learning methods in bioinformatics.
Implement and evaluate one or more gene expression analysis algorithms.
Develop algorithms for assessment of performance of genomic analysis methods.
Develop, implement, and evaluate visualization tools for heterogeneous biological data.

David Walker, Room 211

Research Areas: Programming languages, type systems, compilers, domain-specific languages, software-defined networking and security
Independent Research Topics: Any other interesting project that involves humanitarian hacking, functional programming, domain-specific programming languages, type systems, compilers, software-defined networking, fault tolerance, language-based security, theorem proving, logic or logical frameworks.

Shengyi Wang, Postdoctoral Research Associate, Room 216

Available for Fall 2024 single-semester IW, only

Independent Research topics: Explore Escher-style tilings using (introductory) group theory and automata theory to produce beautiful pictures.

Kevin Wayne, Corwin Hall, Room 040

Research Areas: design, analysis, and implementation of algorithms; data structures; combinatorial optimization; graphs and networks.
Design and implement computer visualizations of algorithms or data structures.
Develop pedagogical tools or programming assignments for the computer science curriculum at Princeton and beyond.
Develop assessment infrastructure and assessments for MOOCs.

Matt Weinberg, 194 Nassau St., Room 222

Research Areas: algorithms, algorithmic game theory, mechanism design, game theoretical problems in {Bitcoin, networking, healthcare}.
Theoretical questions related to COS 445 topics such as matching theory, voting theory, auction design, etc.
Theoretical questions related to incentives in applications like Bitcoin, the Internet, health care, etc. In a little bit more detail: protocols for these systems are often designed assuming that users will follow them. But often, users will actually be strictly happier to deviate from the intended protocol. How should we reason about user behavior in these protocols? How should we design protocols in these settings?

Huacheng Yu, Room 310

data structures
streaming algorithms
design and analyze data structures / streaming algorithms
prove impossibility results (lower bounds)
implement and evaluate data structures / streaming algorithms

Ellen Zhong, Room 314

Opportunities outside the department.

We encourage students to look into doing interdisciplinary computer science research and to work with professors in departments other than computer science. However, every CS independent work project must have a strong computer science element (even if it has other scientific or artistic elements as well). To do a project with an adviser outside of computer science, you must have permission of the department. This can be accomplished by having a second co-adviser within the computer science department or by contacting the independent work supervisor about the project and having them sign the independent work proposal form.

Here is a list of professors outside the computer science department who are eager to work with computer science undergraduates.

Maria Apostolaki, Engineering Quadrangle, C330

Research areas: Computing & Networking, Data & Information Science, Security & Privacy

Branko Glisic, Engineering Quadrangle, Room E330

Documentation of historic structures
Cyber physical systems for structural health monitoring
Developing virtual and augmented reality applications for documenting structures
Applying machine learning techniques to generate 3D models from 2D plans of buildings
Contact: Rebecca Napolitano, rkn2 (@princeton.edu)

Mihir Kshirsagar, Sherrerd Hall, Room 315

Center for Information Technology Policy.

Consumer protection
Content regulation
Competition law
Economic development
Surveillance and discrimination

Sharad Malik, Engineering Quadrangle, Room B224

Design of reliable hardware systems
Verifying complex software and hardware systems

Prateek Mittal, Engineering Quadrangle, Room B236

Internet security and privacy
Social Networks
Privacy technologies, anonymous communication
Network Science
Internet security and privacy: The insecurity of Internet protocols and services threatens the safety of our critical network infrastructure and billions of end users. How can we defend end users as well as our critical network infrastructure from attacks?
Trustworthy social systems: Online social networks (OSNs) such as Facebook, Google+, and Twitter have revolutionized the way our society communicates. How can we leverage social connections between users to design the next generation of communication systems?
Privacy Technologies: Privacy on the Internet is eroding rapidly, with businesses and governments mining sensitive user information. How can we protect the privacy of our online communications? The Tor project (https://www.torproject.org/) is a potential application of interest.

Ken Norman, Psychology Dept, PNI 137

Research Areas: Memory, the brain and computation
Lab: Princeton Computational Memory Lab

Potential research topics

Methods for decoding cognitive state information from neuroimaging data (fMRI and EEG)
Neural network simulations of learning and memory

Caroline Savage

Office of Sustainability, Phone: (609) 258-7513, Email: cs35 (@princeton.edu)

The Campus as Lab program supports students using the Princeton campus as a living laboratory to solve sustainability challenges. The Office of Sustainability has created a list of campus as lab research questions, filterable by discipline and topic, on its website.

An example from Computer Science could include using TigerEnergy, a platform which provides real-time data on campus energy generation and consumption, to study one of the many energy systems or buildings on campus. Three CS students used TigerEnergy to create a live energy heatmap of campus.

Other potential projects include:

Apply game theory to sustainability challenges
Develop a tool to help visualize interactions between complex campus systems, e.g. energy and water use, transportation and storm water runoff, purchasing and waste, etc.
How can we learn (in aggregate) about individuals’ waste, energy, transportation, and other behaviors without impinging on privacy?

Janet Vertesi, Sociology Dept, Wallace Hall, Room 122

Research areas: Sociology of technology; Human-computer interaction; Ubiquitous computing.
Possible projects: At the intersection of computer science and social science, my students have built mixed reality games, produced artistic and interactive installations, and studied mixed human-robot teams, among other projects.

David Wentzlaff, Engineering Quadrangle, Room 228

Computing, Operating Systems, Sustainable Computing.

Instrument Princeton's Green (HPCRC) data center
Investigate power utilization on a processor core implemented in an FPGA
Dismantle and document all of the components in modern electronics. Invent new ways to build computers that can be recycled more easily.
Other topics in parallel computer architecture or operating systems

15 Computer Vision Projects You Can Do Right Now

Computer vision deals with how computers extract meaningful information from images or videos. It has a wide range of applications, including reverse engineering, security inspections, image editing and processing, computer animation, autonomous navigation, and robotics.

In this article, we’re going to explore 15 great OpenCV projects, from beginner-level to expert-level. For each project, you’ll see the essential guides, source code, and datasets, so you can get straight to work on them if you want.

What is Computer Vision?

Computer vision is about helping machines interpret images and videos. It’s the science of capturing a scene through sensors such as cameras and analyzing the resulting digital data to understand what is in it. It’s a broad discipline that’s useful for pattern recognition, robotic positioning, 3D reconstruction, driverless cars, and much more.

The field of computer vision keeps evolving and becoming more impactful thanks to constant technological innovations. As time goes by, it will offer increasingly powerful tools for researchers, businesses, and eventually consumers.

Computer Vision today

Computer vision has become a relatively standard technology in recent years due to the advancement of AI. Many companies use it for product development, sales operations, marketing campaigns, access control, security, and more.

Computer vision has plenty of applications in healthcare (including pathology), industrial automation, military use, cybersecurity, automotive engineering, drone navigation—the list goes on.

How does Computer Vision work?

Machine learning finds patterns by learning from its mistakes. The training data makes a model, which guesses and predicts things. Real-world images are broken down into simple patterns. The computer recognizes patterns in images using a neural network built with many layers.

The first layer takes pixel values and tries to identify edges. The next few layers try to detect simple shapes with the help of those edges. In the end, all of it is put together to understand the image.

It can take thousands, sometimes millions of images, to train a computer vision application. Sometimes even that’s not enough—some facial recognition applications can’t detect people of different skin colors because they’re trained on white people. Sometimes the application might not be able to find the difference between a dog and a bagel. Ultimately, the algorithm will only ever be as good as the data that was used for training it.

OK, enough introduction! Let’s get into the projects.

Beginner level Computer Vision projects

If you’re new or learning computer vision, these projects will help you learn a lot.

1. Edge & Contour Detection

If you’re new to computer vision, this project is a great start. CV applications often detect edges first and then collect other information. There are many edge detection algorithms; the most popular is the Canny edge detector because it’s effective compared to others, although it’s also one of the more complex techniques. Below are the steps for Canny edge detection:

Reduce noise and smooth the image,
Calculate the gradient,
Apply non-maximum suppression,
Apply double thresholding,
Track edges by hysteresis.

Code for Canny edge detection:
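As a rough illustration of the gradient, double-thresholding, and hysteresis stages, here is a minimal NumPy sketch. It is not OpenCV's implementation: it skips Gaussian smoothing and non-maximum suppression, and the function names `sobel_magnitude` and `hysteresis_edges` are made up for this example.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)

def sobel_magnitude(img):
    """Gradient magnitude of a 2-D image using 3x3 Sobel kernels."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")  # repeat border pixels
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            window = padded[dy:dy + h, dx:dx + w]
            gx += SOBEL_X[dy, dx] * window      # horizontal change
            gy += SOBEL_X.T[dy, dx] * window    # vertical change
    return np.hypot(gx, gy)

def hysteresis_edges(mag, low, high):
    """Double threshold, then keep weak pixels connected to strong ones."""
    strong = mag >= high
    weak = (mag >= low) & ~strong
    edges = strong.copy()
    h, w = edges.shape
    while True:
        padded = np.pad(edges, 1, mode="constant")
        near_edge = np.zeros_like(edges)
        for dy in range(3):
            for dx in range(3):
                near_edge |= padded[dy:dy + h, dx:dx + w]
        grown = edges | (weak & near_edge)
        if np.array_equal(grown, edges):
            return edges  # no more weak pixels can be attached
        edges = grown

# Toy image: dark left half, bright right half, so one vertical edge.
image = np.zeros((8, 8))
image[:, 4:] = 1.0
edge_map = hysteresis_edges(sobel_magnitude(image), low=1.0, high=3.0)
```

In a real project you would normally call OpenCV's built-in Canny function instead; the sketch only shows what the two thresholds are doing.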

Contours are curves joining all the continuous points along a boundary that have the same color or intensity. For example, you can detect the shape of a leaf from its border. Contours are an important tool for shape analysis and object detection: the contours of an object are the boundary lines that make up its shape. They’re sometimes also called outlines or edges.

Code to find contours:
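To give a feel for what a contour is, the sketch below marks the boundary pixels of a binary mask with NumPy. This is a simplification and an assumption on our part: a contour finder such as OpenCV's returns ordered lists of points per object, whereas the hypothetical `boundary_pixels` helper here only flags which foreground pixels touch the background.

```python
import numpy as np

def boundary_pixels(mask):
    """Mark foreground pixels that have at least one background 4-neighbour."""
    mask = np.asarray(mask, dtype=bool)
    h, w = mask.shape
    padded = np.pad(mask, 1, mode="constant")  # outside the image is background
    up = padded[0:h, 1:w + 1]
    down = padded[2:h + 2, 1:w + 1]
    left = padded[1:h + 1, 0:w]
    right = padded[1:h + 1, 2:w + 2]
    interior = up & down & left & right        # surrounded on all four sides
    return mask & ~interior

# A filled 3x3 square inside a 5x5 image: the outline is its 8 outer pixels.
square = np.zeros((5, 5), dtype=bool)
square[1:4, 1:4] = True
outline = boundary_pixels(square)
```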

2. Colour Detection & Invisibility Cloak

This project is about detecting color in images. You can use it to edit and recognize colors from images or videos. The most popular project that uses the color detection technique is the invisibility cloak. In movies, invisibility works by doing tasks on a green screen, but here we’ll be doing it by removing the foreground layer. The invisibility cloak process is this:

Capture and store the background frame (just the background),
Detect colors,
Generate a mask,
Generate the final output to create the invisible effect.

It works in the HSV (Hue, Saturation, Value) color space. HSV is one of the three ways that Lightroom lets us change color ranges in photographs. It’s particularly useful for introducing or removing certain colors from an image or scene, such as changing night-time shots to day-time shots (or vice versa). Hue is the color portion, identified as an angle from 0 to 360 degrees. Saturation describes how much grey is mixed into the color: reducing this component toward zero introduces more grey and produces a faded effect.

Value (brightness) works in conjunction with saturation. It describes the brightness or intensity of the color, from 0–100%. So 0 is completely black, and 100 is the brightest and reveals the most color.
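The cloak steps above can be sketched with NumPy and the standard-library `colorsys` module. This is a minimal illustration, not the code from the linked repository: the helper names, the green hue range (90 to 150 degrees), and the saturation and value cut-offs are all assumptions chosen for the example, and the per-pixel loop is far too slow for live video.

```python
import colorsys
import numpy as np

def color_mask(frame_rgb, hue_lo=90.0, hue_hi=150.0, min_sat=0.4, min_val=0.2):
    """True where a pixel's HSV value falls inside the cloak colour range."""
    h, w, _ = frame_rgb.shape
    mask = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            r, g, b = frame_rgb[y, x] / 255.0
            hue, sat, val = colorsys.rgb_to_hsv(r, g, b)  # each in [0, 1]
            in_hue = hue_lo <= hue * 360.0 <= hue_hi
            mask[y, x] = in_hue and sat >= min_sat and val >= min_val
    return mask

def apply_cloak(frame_rgb, background_rgb, mask):
    """Show the stored background wherever the cloak colour was detected."""
    return np.where(mask[..., None], background_rgb, frame_rgb)

# Toy 2x2 scene: stored blue background, grey frame with one green "cloak" pixel.
background = np.full((2, 2, 3), (10, 20, 200), dtype=np.uint8)
frame = np.full((2, 2, 3), 128, dtype=np.uint8)
frame[0, 0] = (0, 255, 0)
cloak = color_mask(frame)
result = apply_cloak(frame, background, cloak)
```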

Github Repo – https://github.com/its-harshil/invisible_cloak
Invisibility Cloak using OpenCV – Guide

3. Text Recognition using OpenCV and Tesseract (OCR)

Here, you use OpenCV and OCR (Optical Character Recognition) on your image to identify each letter and convert them into text. It’s perfect for anyone looking to take information from an image or video and turn it into text-based data. Many apps use OCR, like Google Lens, PDF Scanner, and more.

Ways to detect text from images:

Use OpenCV – popular,
Use Deep Learning models – the newest method,
Use your custom model.


Text Detection using OpenCV

Sample code after processing the image and contour detection:

Text Detection with Tesseract

Tesseract is an open-source OCR engine that can recognize text in 100+ languages, and it’s backed by Google. You can also train it to recognize many other languages.

Code to detect text using tesseract:

4. Face Recognition with Python and OpenCV

It’s been more than two decades since the American television show CSI: Crime Scene Investigation first aired. During that time, facial recognition software has become increasingly sophisticated. Present-day software isn’t limited by superficial features like skin or hair color. Instead, it identifies faces based on facial features that are more stable through changes in appearance, like eye shape and distance between eyes. This type of facial recognition is called “template matching”. You can use OpenCV, deep learning, or a custom database to create facial recognition systems and applications.

Process of detecting a face from an image:

Find face locations and encodings,
Extract features using face embedding,
Face recognition, compare those faces.
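The final comparison step can be illustrated without any face library: once each face has been turned into an embedding vector, recognition reduces to finding the closest known vector. The sketch below assumes toy 3-dimensional embeddings and an illustrative distance tolerance of 0.6; real embeddings are much longer, and `match_face` is a hypothetical helper rather than a library function.

```python
import numpy as np

def match_face(query, known, tolerance=0.6):
    """Return the name of the closest known embedding, or None if too far."""
    best_name, best_dist = None, float("inf")
    for name, embedding in known.items():
        dist = float(np.linalg.norm(np.asarray(query) - np.asarray(embedding)))
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= tolerance else None

# Toy database of embeddings (assumed values, for illustration only).
known_faces = {"alice": [0.0, 0.0, 1.0], "bob": [1.0, 0.0, 0.0]}
who = match_face([0.1, 0.0, 0.9], known_faces)
stranger = match_face([5.0, 5.0, 5.0], known_faces)
```

A lower tolerance makes matching stricter (fewer false matches, more missed ones), so it is worth tuning on your own data.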


Below is the full code for recognizing faces from images:

Code to recognize faces from webcam or live camera:

Face Recognition with OpenCV – Docs
Face Recognition- Guide
AT&T Face database
The Extended Yale Face Database B

5. Object Detection

Object detection is the automatic inference of what an object is in a given image or video frame. It’s used in self-driving cars, tracking, face detection, pose detection, and a lot more. There are three main approaches to object detection: using OpenCV, a machine learning-based approach, and a deep learning-based approach.


Below is the full code to detect objects:

Object Detection (objdetect module)
Detecting Objects – Guide
Object Detection – Tutorial

Intermediate level Computer Vision projects

We’re taking things to the next level with a few intermediate-level projects. These projects will probably be more fun than beginner projects, but also more challenging.

6. Hand Gesture Recognition

In this project, you detect hand gestures and then assign a command to each one. You can even play games with multiple commands using hand gesture recognition.

How gesture recognition works:

Install the PyAutoGUI library, which lets a script control the mouse and keyboard without any user interaction,
Convert each frame into HSV,
Find contours,
Assign a command to a detected value; below we used 5 (fingers of the hand) to jump.

Full code to play the dino game with hand gestures:

Hand Recognition and Gesture Control – Docs
Playing Chrome’s Dinosaur Game using OpenCV – Tutorial
Github Repo

7. Human Pose Detection

Many applications use human pose detection to see how a player plays in a specific game (for example, baseball). The ultimate goal is to locate landmarks on the body. Human pose detection is used in many real-life video and image-based applications, including physical exercise, sign language detection, dance, yoga, and much more.

Deep Learning-based Human Pose Estimation using OpenCV – Tutorial
MPII Human Pose Dataset
Human Pose Evaluator Dataset
Human-Pose-Estimation – Github

8. Road Lane Detection in Autonomous Vehicles

If you want to get into self-driving cars, this project will be a good start. You’ll detect lanes, edges of the road, and a lot more. Lane detection works like this:

Apply the mask,
Do image thresholding (thresholding converts a grayscale image into a binary one: each pixel at or above the specified gray level becomes white, and every other pixel becomes black),
Do the Hough line transform (detecting lane lines).
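The thresholding and Hough steps above can be sketched in plain NumPy. This is a teaching sketch under simplifying assumptions (no region-of-interest mask, 1-degree angle steps, and hypothetical helper names), not the code from the linked projects. Each white pixel votes for every line `rho = x*cos(theta) + y*sin(theta)` that could pass through it, and the cell with the most votes is the strongest line.

```python
import numpy as np

def threshold(gray, level):
    """Binary image: True where the pixel is at or above the gray level."""
    return np.asarray(gray) >= level

def hough_accumulator(binary):
    """Vote for (rho, theta) line parameters; theta covers 0..179 degrees."""
    h, w = binary.shape
    thetas = np.deg2rad(np.arange(180))
    diag = int(np.ceil(np.hypot(h, w)))         # largest possible |rho|
    acc = np.zeros((2 * diag + 1, len(thetas)), dtype=int)
    ys, xs = np.nonzero(binary)
    for x, y in zip(xs, ys):
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[rhos + diag, np.arange(len(thetas))] += 1
    return acc, thetas, diag

def strongest_line(acc, thetas, diag):
    """Return (rho in pixels, theta in degrees) of the most-voted line."""
    rho_idx, theta_idx = np.unravel_index(np.argmax(acc), acc.shape)
    return int(rho_idx) - diag, float(np.rad2deg(thetas[theta_idx]))

# Toy road image: one bright vertical lane marking at x = 5.
road = np.zeros((100, 20), dtype=np.uint8)
road[:, 5] = 255
rho, theta_deg = strongest_line(*hough_accumulator(threshold(road, 128)))
```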

Car Lane Detection – Github
Real-time lane detection for autonomous vehicles – Docs
Real-time Car Lane Detection – Tutorial

9. Pathology Classification

Computer vision is emerging in healthcare. The amount of data that pathologists analyze in a day can be too much to handle. Luckily, deep learning algorithms can identify patterns in large amounts of data that humans wouldn’t notice otherwise. As more images are entered and categorized into groups, the accuracy of these algorithms becomes better and better over time.

Deep learning can help detect various diseases in plants, animals, and humans. For this project, the goal is to take the OCT dataset from Kaggle and classify its images into different classes. The dataset has around 85,000 images. Optical coherence tomography (OCT) is a medical imaging technology that uses light waves to produce high-resolution cross-sectional images of living tissue. It can be used to evaluate thinning skin, broken blood vessels, heart diseases, and many other medical problems.

Over time, OCT has gained the trust of doctors around the globe as a quick and effective diagnostic tool compared with traditional methods. It can also be used to examine tattoo pigments or assess the different layers of a skin graft placed on a burn patient.


Code for Gradcam library used for classification:

Kaggle Datasets Link

10. Fashion MNIST for Image Classification

MNIST is one of the most widely used image datasets: a database of handwritten digits from 0 to 9 with 60,000 training images and 10,000 test images. Inspired by it, researchers at Zalando created Fashion MNIST, which classifies clothes instead of digits. Because MNIST is so well studied, well-tuned models reach accuracy in the 96-99% range on it; Fashion MNIST is intended as a harder drop-in replacement.

Fashion MNIST mirrors the MNIST format: 60,000 training images and 10,000 test images, each a 28x28 grayscale picture of a clothing item from Zalando’s catalog, labeled with one of 10 categories (such as T-shirt/top, trouser, sneaker, or bag).

MNIST colab file
Fashion MNIST Colab file
Handwritten datasets
Fashion MNIST Dataset
Fashion MNIST Tutorial

Advanced level Computer Vision projects

Once you’re an expert in computer vision, you can develop projects from your own ideas. Below are a few advanced-level fun projects you can work with if you have enough skills and knowledge.

11. Image Deblurring using Generative Adversarial Networks

Image deblurring is an interesting technology with plenty of applications. Here, a generative adversarial network (GAN) automatically trains a generative model, like Image DeBlur’s AI algorithm. Before looking into this project, let’s understand what GANs are and how they work.


Generative Adversarial Networks are a deep-learning approach that has shown strong results in various computer vision tasks, such as image super-resolution. However, how best to train these networks remains an open problem. A GAN can be thought of as two networks competing with one another, much like contestants on game shows such as Jeopardy or Survivor. Both sides have a task and adapt their strategy based on the opponent’s moves throughout the game, while trying not to be eliminated first. There are 3 major steps involved in training for deblurring:

Create fake inputs based on noise using the generator,
Train it with both real and fake sets,
Train the whole model.

Application to Image Deblurring
Blind Motion Deblurring Using Conditional Adversarial Networks – Paper
Datasets of blurred street view

12. Image Transformation

With this project, you can transform any image into different forms. For example, you can change a real image into a graphical one. This is kind of a creative and fun project to do. When we use the standard GAN method, it becomes difficult to transform the images, but for this project, most people use Cycle GAN.


The idea behind a GAN is that you train two competing neural networks against each other. One network, called the “generator,” creates new data samples, while the other network (the discriminator) judges whether each sample is real or fake. The generator alters its parameters to try to fool the judge by producing more realistic samples, so both networks improve as training goes on.

CycleGAN is an extension of the GAN architecture. What CycleGAN does is create a cycle of generating the input. Let’s say you’re using Google Translate: you translate English to German, open a new tab, copy the German output, and translate it back to English. The goal is to get back the original input you had. Below is an example of how transforming images to artwork works.
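The round-trip idea can also be written as a small numeric sketch. In CycleGAN the two mappings are neural networks; here they are stand-in arithmetic functions (an assumption made purely so the example runs without a deep learning framework), and the loss is the mean absolute difference between the input and its round-trip reconstruction, covering one direction of the cycle only.

```python
import numpy as np

def cycle_consistency_loss(x, forward, backward):
    """Mean absolute error between x and backward(forward(x))."""
    reconstructed = backward(forward(x))
    return float(np.mean(np.abs(reconstructed - x)))

# Stand-ins for the two generators (hypothetical, not trained networks).
def to_art(photo):
    return 2.0 * photo + 1.0

def to_photo_good(art):
    return (art - 1.0) / 2.0      # exact inverse of to_art

def to_photo_bad(art):
    return art / 2.0              # forgets to undo the +1 shift

photos = np.array([0.0, 0.5, 1.0])
good_loss = cycle_consistency_loss(photos, to_art, to_photo_good)
bad_loss = cycle_consistency_loss(photos, to_art, to_photo_bad)
```

A perfect round trip gives a loss of zero; the further the reconstruction drifts from the original, the larger the penalty during training.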

CycleGAN – Github
Transforming real photos into master artworks with gans – Guide

13. Automatic Colorization of Photos using Deep Neural Networks

When it comes to coloring black and white images, machines have never been able to do an adequate job. They can’t understand the boundary between grey and white, leading to a range of monochromatic hues that seem unrealistic. To overcome this issue, scientists from UC Berkeley, along with colleagues at Microsoft Research, developed a new algorithm that automatically colorizes photographs by using deep neural networks.

Deep neural networks are a very promising technique for image classification because they can learn the composition of an image by looking at many pictures. Densely connected convolutional neural networks (CNNs) have been used to classify images in this study. CNNs are trained with large amounts of labeled data and output a score for each class label given an input image. They can be thought of as feature detectors that are applied to the original input image.

Colourization is the process of adding color to a black and white photo. It can be accomplished by hand, but it’s a tedious process that takes hours or days, depending on the level of detail in the photo. Recently, there’s been an explosion in deep neural networks for image recognition tasks such as facial recognition and text detection. In simple terms, it’s the process of adding colors to grayscale images or videos. However, with the rapid advance of deep learning in recent years, a Convolutional Neural Network (CNN) can colorize black and white images by predicting what the colors should be on a per-pixel basis. This project helps to colorize old photos. As you can see in the image below, it can even properly predict the color of coca-cola, because of the large number of datasets.


14. Vehicle Counting and Classification

Nowadays, many places are equipped with surveillance systems that combine AI with cameras, from government organizations to private facilities. These AI-based cameras help in many ways, and one of the main features is to count the number of vehicles. It can be used to count the number of vehicles passing by or entering any particular place. This project can be used in many areas like crowd counting, traffic management, vehicle number plate, sports, and many more. The process is simple:

Frame differencing,
Image thresholding,
Contour finding,
Image dilation.

And finally, vehicle counting:
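A minimal NumPy sketch of the counting pipeline is shown below. It is an illustration under assumptions, not the code from the linked repository: it works on two grayscale frames, dilates the mask before looking for blobs, replaces contour finding with a simple connected-component count, and uses made-up helper names and thresholds.

```python
from collections import deque
import numpy as np

def moving_mask(prev_gray, curr_gray, level=30):
    """Frame differencing followed by thresholding."""
    diff = np.abs(curr_gray.astype(int) - prev_gray.astype(int))
    return diff >= level

def dilate(mask):
    """Grow every True region by one pixel in all eight directions."""
    h, w = mask.shape
    padded = np.pad(mask, 1, mode="constant")
    out = np.zeros_like(mask)
    for dy in range(3):
        for dx in range(3):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def count_blobs(mask, min_area=4):
    """Count 4-connected regions with at least min_area pixels."""
    h, w = mask.shape
    seen = np.zeros_like(mask)
    count = 0
    for sy, sx in zip(*np.nonzero(mask)):
        if seen[sy, sx]:
            continue
        seen[sy, sx] = True
        queue, area = deque([(sy, sx)]), 0
        while queue:
            y, x = queue.popleft()
            area += 1
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                    seen[ny, nx] = True
                    queue.append((ny, nx))
        if area >= min_area:
            count += 1
    return count

# Toy frames: an empty road, then two bright "vehicles" appear.
previous = np.zeros((20, 20), dtype=np.uint8)
current = previous.copy()
current[2:5, 2:5] = 255
current[12:15, 12:15] = 255
vehicles = count_blobs(dilate(moving_mask(previous, current)))
```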

Vehicle-counting Github
Vehicle Detection Guide

15. Vehicle license plate scanners

A vehicle license plate scanner is a computer vision application that locates license plates and reads their numbers. This technology is used for a variety of purposes, including law enforcement, identifying stolen vehicles, and tracking down fugitives.

A more sophisticated vehicle license plate scanner in computer vision can scan, read and identify hundreds, even thousands of cars per minute with 99% accuracy from distances up to half a mile away in heavy traffic conditions on highways and city streets. This project is very useful in many cases.

The goal is to first detect the license plate and then scan the numbers and text written on it. It’s also referred to as an automatic number plate detection system. The process is simple:

Capture image,
Search for the number plate,
Filter image,
Line separate using row segmentation,
OCR for the numbers and characters.
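The row segmentation step can be sketched with a horizontal projection: any image row that contains at least one "ink" pixel belongs to a text line, and empty rows separate the lines. The helper name `segment_rows` and the toy plate are assumptions for illustration; a real plate image would first need the detection and filtering steps listed above.

```python
import numpy as np

def segment_rows(binary):
    """Return (start, end) row ranges, end exclusive, that contain ink."""
    has_ink = np.asarray(binary, dtype=bool).any(axis=1)
    rows, start = [], None
    for y, ink in enumerate(has_ink):
        if ink and start is None:
            start = y                      # a text line begins
        elif not ink and start is not None:
            rows.append((start, y))        # the text line just ended
            start = None
    if start is not None:                  # line runs to the bottom edge
        rows.append((start, len(has_ink)))
    return rows

# Toy two-line plate: ink in rows 1-2 and rows 5-7 of a 10-row image.
plate = np.zeros((10, 6), dtype=bool)
plate[1:3, 1:5] = True
plate[5:8, 2:4] = True
lines = segment_rows(plate)
```

Each returned range can then be cropped and passed to the OCR step on its own.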

Number Plate Recognition Tutorial
Automatic Number Plate Recognition System for Vehicle Identification Using Optical Character Recognition

Conclusion

And that’s it! Hope you liked the computer vision projects. As a cherry on top, I’ll leave you with several extra projects that you might also be interested in.

Extra projects

Photo Sketching
Collage Mosaic Generator
Blur the Face
Image Segmentation
Sudoku Solver
Object Tracking
Watermarking Images
Image Reverse Search Engine



10 Cutting Edge Research Papers In Computer Vision & Image Generation

January 24, 2019 by Mariya Yao

UPDATE: We’ve also summarized the top 2019 and top 2020 Computer Vision research papers.

Ever since convolutional neural networks began outperforming humans in specific image recognition tasks, research in the field of computer vision has proceeded at breakneck pace.

The basic architecture of CNNs (or ConvNets) was developed in the 1980s. Yann LeCun improved upon the original design in 1989 by using backpropagation to train models to recognize handwritten digits.

We’ve come a long way since then.

In 2018, we saw novel architecture designs that improve upon performance benchmarks and also expand the range of media that machine learning models can analyze. We also saw a number of breakthroughs with media generation which enable photorealistic style transfer, high-resolution image generation, and video-to-video synthesis.

Due to the importance and prevalence of computer vision and image generation for applied and enterprise AI, we did feature some of the papers below in our previous article summarizing the top overall machine learning papers of 2018. Since you might not have read that previous piece, we chose to highlight the vision-related research ones again here.

We’ve done our best to summarize these papers correctly, but if we’ve made any mistakes, please contact us to request a fix. Special thanks also goes to computer vision specialist Rebecca BurWei for generously offering her expertise in editing and revising drafts of this article.

If these summaries of scientific AI research papers are useful for you, you can subscribe to our AI Research mailing list at the bottom of this article to be alerted when we release new summaries. We’re planning to release summaries of important papers in computer vision, reinforcement learning, and conversational AI in the next few weeks.

If you’d like to skip around, here are the papers we featured:

Spherical CNNs
Adversarial Examples that Fool both Computer Vision and Time-Limited Humans
A Closed-form Solution to Photorealistic Image Stylization
Group Normalization
Taskonomy: Disentangling Task Transfer Learning
Self-Attention Generative Adversarial Networks
GANimation: Anatomically-aware Facial Animation from a Single Image
Video-to-Video Synthesis
Everybody Dance Now
Large Scale GAN Training for High Fidelity Natural Image Synthesis

Important Computer Vision Research Papers of 2018

1. Spherical CNNs, by Taco S. Cohen, Mario Geiger, Jonas Koehler, and Max Welling

Original Abstract

Convolutional Neural Networks (CNNs) have become the method of choice for learning problems involving 2D planar images. However, a number of problems of recent interest have created a demand for models that can analyze spherical images. Examples include omnidirectional vision for drones, robots, and autonomous cars, molecular regression problems, and global weather and climate modelling. A naive application of convolutional networks to a planar projection of the spherical signal is destined to fail, because the space-varying distortions introduced by such a projection will make translational weight sharing ineffective.

In this paper we introduce the building blocks for constructing spherical CNNs. We propose a definition for the spherical cross-correlation that is both expressive and rotation-equivariant. The spherical correlation satisfies a generalized Fourier theorem, which allows us to compute it efficiently using a generalized (non-commutative) Fast Fourier Transform (FFT) algorithm. We demonstrate the computational efficiency, numerical accuracy, and effectiveness of spherical CNNs applied to 3D model recognition and atomization energy regression.

Our Summary

Omnidirectional cameras that are already used by cars, drones, and other robots capture a spherical image of their entire surroundings. We could analyze such spherical signals by projecting them to the plane and using CNNs. However, any planar projection of a spherical signal results in distortions. To overcome this problem, the group of researchers from the University of Amsterdam introduces the theory of spherical CNNs, the networks that can analyze spherical images without being fooled by distortions. The approach demonstrates its effectiveness for classifying 3D shapes and Spherical MNIST images as well as for molecular energy regression, an important problem in computational chemistry.

What’s the core idea of this paper?

Planar projections of spherical signals result in significant distortions as some areas look larger or smaller than they really are.
Traditional CNNs are ineffective for spherical images because as objects move around the sphere, they also appear to shrink and stretch (think maps where Greenland looks much bigger than it actually is).
The solution is to use a spherical CNN which is robust to spherical rotations in the input data. By preserving the original shape of the input data, spherical CNNs treat all objects on the sphere equally without distortion.

What’s the key achievement?

Introducing a mathematical framework for building spherical CNNs.
Providing easy-to-use, fast, and memory-efficient PyTorch code for implementing these CNNs.
Demonstrating the effectiveness of spherical CNNs on:
classification of Spherical MNIST images,
classification of 3D shapes,
molecular energy regression.

What does the AI community think?

The paper won the Best Paper Award at ICLR 2018, one of the leading machine learning conferences.

What are future research areas?

Development of a Steerable CNN for the sphere to analyze sections of vector bundles over the sphere (e.g., wind directions).
Expanding the mathematical theory from 2D spheres to 3D point clouds for classification tasks that are invariant under reflections as well as rotations.

What are possible business applications?

the omnidirectional vision for drones, robots, and autonomous cars;
molecular regression problems in computational chemistry;
global weather and climate modeling.

Where can you get implementation code?

The authors provide the original implementation for this research paper on GitHub.

2. Adversarial Examples that Fool both Computer Vision and Time-Limited Humans, by Gamaleldin F. Elsayed, Shreya Shankar, Brian Cheung, Nicolas Papernot, Alex Kurakin, Ian Goodfellow, Jascha Sohl-Dickstein

Original Abstract

Machine learning models are vulnerable to adversarial examples: small changes to images can cause computer vision models to make mistakes such as identifying a school bus as an ostrich. However, it is still an open question whether humans are prone to similar mistakes. Here, we address this question by leveraging recent techniques that transfer adversarial examples from computer vision models with known parameters and architecture to other models with unknown parameters and architecture, and by matching the initial processing of the human visual system. We find that adversarial examples that strongly transfer across computer vision models influence the classifications made by time-limited human observers.

Our Summary

Google Brain researchers ask whether adversarial examples that are not model-specific, and that can fool different computer vision models without access to their parameters and architectures, can also fool time-limited humans. They leverage key ideas from machine learning, neuroscience, and psychophysics to create adversarial examples that do in fact impact human perception in a time-limited setting. Thus, the paper introduces a new class of illusions that are shared between machines and humans.

What’s the core idea of this paper?

As the first step, the researchers use black-box adversarial example construction techniques, which create adversarial examples without access to the model’s architecture or parameters.
Then, they adapt the computer vision models to match the initial processing of the human visual system by:
prepending each model with a retinal layer that pre-processes the input to incorporate some of the transformations performed by the human eye;
performing an eccentricity-dependent blurring of the image to approximate the input which is received by the visual cortex of human subjects through their retinal lattice.
Classification decisions of humans are evaluated in a time-limited setting to detect even subtle effects in human perception.

What’s the key achievement?

Showing that adversarial examples that transfer across computer vision models do also successfully influence the perception of humans.
Demonstrating the similarity between convolutional neural networks and the human visual system.

What does the AI community think?

The paper is widely discussed by the AI community. While most of the researchers are stunned by the results, some argue that we need a stricter definition of an adversarial image, because if humans classify the perturbed picture of a cat as a dog, then it’s probably already a dog, not a cat.

What are future research areas?

Researching which techniques are crucial for the transfer of adversarial examples to humans (e.g., retinal preprocessing, model ensembling).

What are possible business applications?

Practitioners should consider the risk that imagery could be manipulated to cause human observers to have unusual reactions, because adversarial images can affect us below the horizon of awareness.

3. A Closed-form Solution to Photorealistic Image Stylization, by Yijun Li, Ming-Yu Liu, Xueting Li, Ming-Hsuan Yang, Jan Kautz

Original Abstract

Photorealistic image stylization concerns transferring style of a reference photo to a content photo with the constraint that the stylized photo should remain photorealistic. While several photorealistic image stylization methods exist, they tend to generate spatially inconsistent stylizations with noticeable artifacts. In this paper, we propose a method to address these issues. The proposed method consists of a stylization step and a smoothing step. While the stylization step transfers the style of the reference photo to the content photo, the smoothing step ensures spatially consistent stylizations. Each of the steps has a closed-form solution and can be computed efficiently. We conduct extensive experimental validations. The results show that the proposed method generates photorealistic stylization outputs that are more preferred by human subjects as compared to those by the competing methods while running much faster. Source code and additional results are available at https://github.com/NVIDIA/FastPhotoStyle .

Our Summary

A team of scientists at NVIDIA and the University of California, Merced proposes a new solution to photorealistic image stylization, FastPhotoStyle. The method consists of two steps: stylization and smoothing. Extensive experiments show that the suggested approach generates more realistic and compelling images than the previous state of the art. Moreover, thanks to the closed-form solution, FastPhotoStyle can produce the stylized image 49 times faster than traditional methods.

What’s the core idea of this paper?

The goal of photorealistic image stylization is to transfer the style of a reference photo to a content photo while keeping the stylized image photorealistic.
The stylization step is based on the whitening and coloring transform (WCT), which processes images via feature projections. However, WCT was developed for artistic image stylization, and thus often generates structural artifacts in photorealistic image stylization. To overcome this problem, the paper introduces the PhotoWCT method, which replaces the upsampling layers in the WCT with unpooling layers and so preserves more spatial information.
The smoothing step is required to resolve spatially inconsistent stylizations that could arise after the first step. Smoothing is based on a manifold ranking algorithm.
Both steps have a closed-form solution, which means that the solution can be obtained in a fixed number of operations (e.g., convolutions, max-pooling, whitening). Thus, computations are much more efficient compared to the traditional methods.

What’s the key achievement?

Presenting an image stylization method, FastPhotoStyle, which:
outperforms artistic stylization algorithms by rendering far fewer structural artifacts and inconsistent stylizations, and
outperforms photorealistic stylization algorithms by synthesizing not only colors but also patterns in the style photos.
The experiments demonstrate that users prefer FastPhotoStyle results over the previous state of the art in terms of both stylization effects (63.1%) and photorealism (73.5%).
FastPhotoStyle can synthesize an image of 1024 x 512 resolution in only 13 seconds, while the previous state-of-the-art method needs 650 seconds for the same task.

What does the AI community think?

The paper was presented at ECCV 2018, a leading European conference on computer vision.

What are future research areas?

Finding a way to transfer small patterns from the style photo, as they are smoothed away by the suggested method.
Exploring the possibilities to further reduce the number of structural artifacts in the stylized photos.

What are possible business applications?

Content creators in business settings can benefit from photorealistic image stylization, as the tool lets you automatically change the style of any photo based on what fits the narrative.
Photographers also discuss the tremendous impact that this technology can have in real estate photography.

Where can you get implementation code?

The NVIDIA team provides the original implementation for this research paper on GitHub.
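To make the whitening and coloring transform (WCT) described above a little more concrete, here is a minimal NumPy sketch on plain feature matrices. It is not the FastPhotoStyle implementation: it leaves out the encoder and decoder networks, the unpooling layers, and the smoothing step, and the function name and `eps` regularizer are assumptions. The content features are first whitened (decorrelated to unit covariance) and then colored so that their covariance and mean match those of the style features.

```python
import numpy as np

def whiten_and_color(content, style, eps=1e-5):
    """content, style: (channels, n_pixels) feature matrices."""
    c_centered = content - content.mean(axis=1, keepdims=True)
    s_mean = style.mean(axis=1, keepdims=True)
    s_centered = style - s_mean
    identity = np.eye(content.shape[0])

    # Whitening: remove the correlations between content channels.
    cov_c = c_centered @ c_centered.T / (content.shape[1] - 1) + eps * identity
    vals_c, vecs_c = np.linalg.eigh(cov_c)
    whitened = vecs_c @ np.diag(vals_c ** -0.5) @ vecs_c.T @ c_centered

    # Coloring: impose the style channel correlations and the style mean.
    cov_s = s_centered @ s_centered.T / (style.shape[1] - 1) + eps * identity
    vals_s, vecs_s = np.linalg.eigh(cov_s)
    colored = vecs_s @ np.diag(vals_s ** 0.5) @ vecs_s.T @ whitened
    return colored + s_mean

# Random stand-in features (3 channels); real features come from a CNN encoder.
rng = np.random.default_rng(0)
content_feats = rng.normal(size=(3, 500))
style_feats = rng.normal(size=(3, 400)) * np.array([[1.0], [2.0], [0.5]]) + 3.0
stylized = whiten_and_color(content_feats, style_feats)
```

Both operations are fixed matrix computations with no iterative optimization, which is one way to read the closed-form claim in the summary.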

4. Group Normalization, by Yuxin Wu and Kaiming He

Original Abstract

Batch Normalization (BN) is a milestone technique in the development of deep learning, enabling various networks to train. However, normalizing along the batch dimension introduces problems – BN’s error increases rapidly when the batch size becomes smaller, caused by inaccurate batch statistics estimation. This limits BN’s usage for training larger models and transferring features to computer vision tasks including detection, segmentation, and video, which require small batches constrained by memory consumption. In this paper, we present Group Normalization (GN) as a simple alternative to BN. GN divides the channels into groups and computes within each group the mean and variance for normalization. GN’s computation is independent of batch sizes, and its accuracy is stable in a wide range of batch sizes. On ResNet-50 trained in ImageNet, GN has 10.6% lower error than its BN counterpart when using a batch size of 2; when using typical batch sizes, GN is comparably good with BN and outperforms other normalization variants. Moreover, GN can be naturally transferred from pre-training to fine-tuning. GN can outperform its BN-based counterparts for object detection and segmentation in COCO, and for video classification in Kinetics, showing that GN can effectively replace the powerful BN in a variety of tasks. GN can be easily implemented by a few lines of code in modern libraries.

The Facebook AI Research team suggests Group Normalization (GN) as an alternative to Batch Normalization (BN). They argue that BN’s error increases dramatically for small batch sizes. This limits the usage of BN when working with large models to solve computer vision tasks that require small batches due to memory constraints. In contrast, Group Normalization is independent of batch size, as it divides the channels into groups and computes the mean and variance for normalization within each group. The experiments confirm that GN outperforms BN in a variety of tasks, including object detection, segmentation, and video classification.

Group Normalization is a simple alternative to Batch Normalization, especially in scenarios where the batch size tends to be small, such as computer vision tasks that require high-resolution input.
GN explores only the layer dimensions, and thus its computation is independent of batch size. Specifically, GN divides channels, or feature maps, into groups and normalizes the features within each group.
Group Normalization can be easily implemented in a few lines of code in PyTorch and TensorFlow.
Introducing Group Normalization, a new and effective normalization method.
GN’s accuracy is stable across a wide range of batch sizes because its computation is independent of batch size. For example, GN demonstrated a 10.6% lower error than its BN-based counterpart for ResNet-50 on ImageNet with a batch size of 2.
GN can also be transferred from pre-training to fine-tuning. The experiments show that GN can outperform its BN counterparts for object detection and segmentation on the COCO dataset and for video classification on the Kinetics dataset.
The paper received an honorable mention at ECCV 2018, the leading European conference on computer vision.
It is also the second most popular paper in 2018 based on the people’s libraries at Arxiv Sanity Preserver.
Applying group normalization to sequential or generative models.
Investigating GN’s performance on learning representations for reinforcement learning.
Exploring if GN combined with a suitable regularizer will improve results.
Business applications that rely on BN-based models for object detection, segmentation, video classification and other computer vision tasks that require high-resolution input may benefit from moving to GN-based models as they are more accurate in these settings.
The Facebook AI Research team provides Mask R-CNN baseline results and models trained with Group Normalization.
A PyTorch implementation of Group Normalization is also available on GitHub.
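
The core idea summarized above (split the channels into groups, then normalize within each group of each sample) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' code: the function name `group_norm` and the parameter names `num_groups`, `gamma`, `beta`, and `eps` are our own assumptions, and the input is assumed to be laid out as (batch, channels, height, width).

```python
import numpy as np

def group_norm(x, num_groups, gamma=None, beta=None, eps=1e-5):
    """Illustrative Group Normalization over an (N, C, H, W) array.

    gamma and beta are optional per-channel scale and shift vectors of shape (C,).
    All names here are hypothetical and chosen for this sketch.
    """
    n, c, h, w = x.shape
    if c % num_groups != 0:
        raise ValueError("number of channels must be divisible by num_groups")
    # Split the channel axis into groups: (N, G, C // G, H, W).
    grouped = x.reshape(n, num_groups, c // num_groups, h, w)
    # Statistics are computed per sample and per group, never across the batch.
    mean = grouped.mean(axis=(2, 3, 4), keepdims=True)
    var = grouped.var(axis=(2, 3, 4), keepdims=True)
    normalized = ((grouped - mean) / np.sqrt(var + eps)).reshape(n, c, h, w)
    # Optional learned per-channel scale and shift.
    if gamma is not None:
        normalized = normalized * gamma.reshape(1, c, 1, 1)
    if beta is not None:
        normalized = normalized + beta.reshape(1, c, 1, 1)
    return normalized
```

Because no statistic is taken over the batch axis, normalizing one image alone gives the same result as normalizing it inside a larger batch, which is the property that makes GN stable for small batch sizes. In practice you would likely use a library layer (PyTorch, for instance, ships `torch.nn.GroupNorm`) rather than a hand-written version.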

5. Taskonomy: Disentangling Task Transfer Learning, by Amir R. Zamir, Alexander Sax, William Shen, Leonidas J. Guibas, Jitendra Malik, and Silvio Savarese

Do visual tasks have a relationship, or are they unrelated? For instance, could having surface normals simplify estimating the depth of an image? Intuition answers these questions positively, implying existence of a structure among visual tasks. Knowing this structure has notable values; it is the concept underlying transfer learning and provides a principled way for identifying redundancies across tasks, e.g., to seamlessly reuse supervision among related tasks or solve many tasks in one system without piling up the complexity.

We propose a fully computational approach for modeling the structure of space of visual tasks. This is done via finding (first and higher-order) transfer learning dependencies across a dictionary of twenty six 2D, 2.5D, 3D, and semantic tasks in a latent space. The product is a computational taxonomic map for task transfer learning. We study the consequences of this structure, e.g. nontrivial emerged relationships, and exploit them to reduce the demand for labeled data. For example, we show that the total number of labeled datapoints needed for solving a set of 10 tasks can be reduced by roughly 2/3 (compared to training independently) while keeping the performance nearly the same. We provide a set of tools for computing and probing this taxonomical structure including a solver that users can employ to devise efficient supervision policies for their use cases.

Assertions of the existence of a structure among visual tasks have been made by many researchers since the early years of modern computer science. And now Amir Zamir and his team make an attempt to actually find this structure. They model it using a fully computational approach and discover lots of useful relationships between different visual tasks, including the nontrivial ones. They also show that by taking advantage of these interdependencies, it is possible to achieve the same model performance with the labeled data requirements reduced by roughly ⅔.

A model aware of the relationships among different visual tasks demands less supervision, uses less computation, and behaves in more predictable ways.
A fully computational approach to discovering the relationships between visual tasks is preferable because it avoids imposing prior, and possibly incorrect, assumptions: the priors are derived from either human intuition or analytical knowledge, while neural networks might operate on different principles.
Identifying relationships between 26 common visual tasks.
Showing how this structure helps in discovering types of transfer learning that will be most effective for each visual task.
Creating a new dataset of 4 million images of indoor scenes from 600 buildings, annotated with 26 tasks.
The paper won the Best Paper Award at CVPR 2018, the key conference on computer vision and pattern recognition.
The results are very important because large-scale labeled datasets are not available for most real-world tasks.
Moving away from a model in which common visual tasks are entirely defined by humans, toward an approach in which human-defined visual tasks are viewed as observed samples composed of computationally found latent subtasks.
Exploring the possibility to transfer the findings to not entirely visual tasks, e.g. robotic manipulation.
Relationships discovered in this paper can be used to build more effective visual systems that will require less labeled data and lower computational costs.

6. Self-Attention Generative Adversarial Networks, by Han Zhang, Ian Goodfellow, Dimitris Metaxas, Augustus Odena

In this paper, we propose the Self-Attention Generative Adversarial Network (SAGAN) which allows attention-driven, long-range dependency modeling for image generation tasks. Traditional convolutional GANs generate high-resolution details as a function of only spatially local points in lower-resolution feature maps. In SAGAN, details can be generated using cues from all feature locations. Moreover, the discriminator can check that highly detailed features in distant portions of the image are consistent with each other. Furthermore, recent work has shown that generator conditioning affects GAN performance. Leveraging this insight, we apply spectral normalization to the GAN generator and find that this improves training dynamics. The proposed SAGAN achieves the state-of-the-art results, boosting the best published Inception score from 36.8 to 52.52 and reducing Frechet Inception distance from 27.62 to 18.65 on the challenging ImageNet dataset. Visualization of the attention layers shows that the generator leverages neighborhoods that correspond to object shapes rather than local regions of fixed shape.

Traditional convolutional GANs have demonstrated some very promising results in image synthesis. However, they have at least one important weakness: convolutional layers alone fail to capture geometrical and structural patterns in images. Since convolution is a local operation, it is hardly possible for an output at the top-left position to have any relation to the output at the bottom-right. The paper introduces a simple solution to this problem: incorporating the self-attention mechanism into the GAN framework. This solution, combined with several stabilization techniques, helps the Self-Attention Generative Adversarial Networks (SAGANs) achieve state-of-the-art results in image synthesis.

Convolutional layers alone are computationally inefficient for modeling long-range dependencies in images. On the contrary, a self-attention mechanism incorporated into the GAN framework will enable both the generator and the discriminator to efficiently model relationships between widely separated spatial regions.
The self-attention module calculates response at a position as a weighted sum of the features at all positions.
Applying spectral normalization for both generator and discriminator – the researchers argue that not only the discriminator but also the generator can benefit from spectral normalization, as it can prevent the escalation of parameter magnitudes and avoid unusual gradients.
Using separate learning rates for the generator and the discriminator to compensate for the problem of slow learning in a regularized discriminator and make it possible to use fewer generator steps per discriminator step.
Showing that self-attention module incorporated into the GAN framework is, in fact, effective in modeling long-range dependencies.
Spectral normalization applied to the generator stabilizes GAN training.
Utilizing imbalanced learning rates speeds up training of regularized discriminators.
Achieving state-of-the-art results in image synthesis by boosting the Inception Score from 36.8 to 52.52 and reducing Fréchet Inception Distance from 27.62 to 18.65.
“The idea is simple and intuitive yet very effective, plus easy to implement.” – Sebastian Raschka, assistant professor of Statistics at the University of Wisconsin-Madison.
Exploring the possibilities to reduce the number of weird samples generated by GANs.
Image synthesis with GANs can replace expensive manual media creation for advertising and e-commerce purposes.
PyTorch and TensorFlow implementations of Self-Attention GANs are available on GitHub.
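
To make the statement "the response at a position is a weighted sum of the features at all positions" concrete, here is a minimal NumPy sketch of a self-attention step over a flattened feature map. It is an illustration under our own assumptions rather than the authors' implementation: the names `w_query`, `w_key`, `w_value`, and `gamma` are hypothetical, the projections are plain matrices standing in for 1x1 convolutions, and the residual scale `gamma` is a fixed number here instead of a learned parameter.

```python
import numpy as np

def self_attention(features, w_query, w_key, w_value, gamma=0.0):
    """Illustrative self-attention over a flattened feature map.

    features: (C, N) array, where N = H * W spatial positions.
    w_query, w_key: (C_k, C) projection matrices; w_value: (C, C).
    Returns the output features (C, N) and the attention weights (N, N).
    """
    query = w_query @ features   # (C_k, N)
    key = w_key @ features       # (C_k, N)
    value = w_value @ features   # (C, N)
    # scores[j, i] measures how strongly position j attends to position i.
    scores = query.T @ key       # (N, N)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)        # each row sums to 1
    # Column j of `attended` is sum_i weights[j, i] * value[:, i].
    attended = value @ weights.T  # (C, N)
    # Residual connection: gamma = 0 leaves the input features unchanged.
    return gamma * attended + features, weights
```

Every output position mixes information from all N positions, so a feature in the top-left corner can directly influence one in the bottom-right, which a single convolution cannot do. The price is an N x N attention matrix, so the cost grows quadratically with the number of spatial positions.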

7. GANimation: Anatomically-aware Facial Animation from a Single Image, by Albert Pumarola, Antonio Agudo, Aleix M. Martinez, Alberto Sanfeliu, Francesc Moreno-Noguer

Recent advances in Generative Adversarial Networks (GANs) have shown impressive results for task of facial expression synthesis. The most successful architecture is StarGAN, that conditions GANs generation process with images of a specific domain, namely a set of images of persons sharing the same expression. While effective, this approach can only generate a discrete number of expressions, determined by the content of the dataset. To address this limitation, in this paper, we introduce a novel GAN conditioning scheme based on Action Units (AU) annotations, which describes in a continuous manifold the anatomical facial movements defining a human expression. Our approach allows controlling the magnitude of activation of each AU and combine several of them. Additionally, we propose a fully unsupervised strategy to train the model, that only requires images annotated with their activated AUs, and exploit attention mechanisms that make our network robust to changing backgrounds and lighting conditions. Extensive evaluation show that our approach goes beyond competing conditional generators both in the capability to synthesize a much wider range of expressions ruled by anatomically feasible muscle movements, as in the capacity of dealing with images in the wild.

The paper introduces a novel GAN model that can generate anatomically-aware facial animations from a single image under changing backgrounds and illumination conditions. It advances prior work, which had addressed the problem only for editing discrete emotion categories and for portrait images. The approach renders a wide range of emotions by encoding facial deformations as Action Units. The resulting animations show a remarkably smooth and consistent transformation across frames, even with challenging lighting conditions and backgrounds.

Facial expressions can be described in terms of Action Units (AUs), which anatomically describe the contractions of specific facial muscles. For example, the facial expression for ‘fear’ is generally produced with the following activations: Inner Brow Raiser (AU1), Outer Brow Raiser (AU2), Brow Lowerer (AU4), Upper Lid Raiser (AU5), Lid Tightener (AU7), Lip Stretcher (AU20) and Jaw Drop (AU26). The magnitude of each AU defines the extent of emotion.
A model for synthetic facial animation is based on the GAN architecture, which is conditioned on a one-dimensional vector indicating the presence/absence and the magnitude of each Action Unit.
To circumvent the need for pairs of training images of the same person under different expressions, a bidirectional generator is used to both transform an image into a desired expression and transform the synthesized image back into the original pose.
To handle images under changing backgrounds and illumination conditions, the model includes an attention layer that focuses the action of the network only in those regions of the image that are relevant to convey the novel expression.
Introducing a novel GAN model for face animation in the wild that can be trained in a fully unsupervised manner and generate visually compelling images with remarkably smooth and consistent transformation across frames even with challenging light conditions and non-real world data.
Demonstrating how a wider range of emotions can be generated by interpolating between emotions the GAN has already seen.
Applying the introduced approach to video sequences.
The technology that automatically animates the facial expression from a single image can be applied in several areas, including the fashion and e-commerce business, the movie industry, and photography technologies.
The authors provide the original implementation of this research paper on GitHub.
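
Since the generator is conditioned on a vector of Action Unit magnitudes, a smooth animation can be described as a sequence of conditioning vectors that move gradually from one expression to another. The sketch below shows only that interpolation step, not the GAN itself. The function names, the assumption that AU magnitudes are normalized to the range 0 to 1, and the linear blending rule are ours for illustration.

```python
def interpolate_action_units(source_aus, target_aus, alpha):
    """Blend two Action Unit vectors; alpha=0 gives source, alpha=1 gives target.

    Both vectors are plain lists of AU magnitudes (assumed to lie in [0, 1]).
    """
    if len(source_aus) != len(target_aus):
        raise ValueError("AU vectors must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return [(1.0 - alpha) * s + alpha * t for s, t in zip(source_aus, target_aus)]


def animation_conditions(source_aus, target_aus, num_frames):
    """Return one conditioning vector per frame, from source to target."""
    if num_frames < 2:
        raise ValueError("need at least two frames")
    return [
        interpolate_action_units(source_aus, target_aus, i / (num_frames - 1))
        for i in range(num_frames)
    ]
```

Each vector in the returned list would be passed to the generator together with the input image to render one frame. Because the vectors change in small steps, the rendered expression changes gradually rather than jumping between discrete emotion classes.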

8. Video-to-Video Synthesis, by Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Guilin Liu, Andrew Tao, Jan Kautz, Bryan Catanzaro

We study the problem of video-to-video synthesis, whose goal is to learn a mapping function from an input source video (e.g., a sequence of semantic segmentation masks) to an output photorealistic video that precisely depicts the content of the source video. While its image counterpart, the image-to-image synthesis problem, is a popular topic, the video-to-video synthesis problem is less explored in the literature. Without understanding temporal dynamics, directly applying existing image synthesis approaches to an input video often results in temporally incoherent videos of low visual quality. In this paper, we propose a novel video-to-video synthesis approach under the generative adversarial learning framework. Through carefully-designed generator and discriminator architectures, coupled with a spatio-temporal adversarial objective, we achieve high-resolution, photorealistic, temporally coherent video results on a diverse set of input formats including segmentation masks, sketches, and poses. Experiments on multiple benchmarks show the advantage of our method compared to strong baselines. In particular, our model is capable of synthesizing 2K resolution videos of street scenes up to 30 seconds long, which significantly advances the state-of-the-art of video synthesis. Finally, we apply our approach to future video prediction, outperforming several state-of-the-art competing systems.

Researchers from NVIDIA have introduced a novel video-to-video synthesis approach. The framework is based on conditional GANs. Specifically, the method couples a carefully designed generator and discriminator with a spatio-temporal adversarial objective. The experiments demonstrate that the suggested vid2vid approach can synthesize high-resolution, photorealistic, temporally coherent videos on a diverse set of input formats including segmentation masks, sketches, and poses. It can also predict future frames, with results far superior to those of the baseline models.

Each output frame is generated from the current source frame, the past two source frames, and the past two generated frames.
Conditional image discriminator ensures that each output frame resembles a real image given the same source image.
Conditional video discriminator ensures that consecutive output frames resemble the temporal dynamics of a real video given the same optical flow.
Foreground-background prior in the generator design further improves the synthesis performance of the proposed model.
Using a soft occlusion mask instead of a binary one makes it possible to better handle the “zoom in” scenario: details can be added by gradually blending the warped pixels and the newly synthesized pixels.
Generating high-resolution (2048×2048), photorealistic, temporally coherent videos up to 30 seconds long.
Outputting several videos with different visual appearances depending on sampling different feature vectors.
Outperforming the baseline models in future video prediction.
Converting semantic labels into realistic real-world videos.
Generating multiple outputs of talking people from edge maps.
Generating an entire human body given a pose.
“NVIDIA’s new vid2vid is the first open-source code that lets you fake anybody’s face convincingly from one source video. […] interesting times ahead…”, Gene Kogan, an artist and a programmer.
The paper has also received some criticism over the concern that it can be used to create deepfakes or tampered videos which can deceive people.
Using object tracking information to make sure that each object has a consistent appearance across the whole video.
Researching if training the model with coarser semantic labels will help reduce the visible artifacts that appear after semantic manipulations (e.g., turning trees into buildings).
Adding additional 3D cues, such as depth maps, to enable synthesis of turning cars.
Marketing and advertising can benefit from the opportunities created by the vid2vid method (e.g., replacing the face or even the entire body in the video). However, this should be used with caution, keeping in mind the ethical considerations.
The NVIDIA team provides the original implementation of this research paper on GitHub.
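
The soft occlusion mask mentioned above can be illustrated with a short NumPy sketch of the blending step: each output pixel is a weighted mix of the previous frame warped by optical flow and a freshly synthesized pixel. The function name and the mask convention (1 means "use the newly synthesized pixel", 0 means "reuse the warped pixel") are assumptions made for this example, and the warping and synthesis networks themselves are not shown.

```python
import numpy as np

def blend_frame(warped_previous, synthesized, occlusion_mask):
    """Blend a flow-warped previous frame with newly synthesized pixels.

    warped_previous, synthesized: (H, W, 3) float arrays.
    occlusion_mask: (H, W) or (H, W, 1) array of soft values in [0, 1].
    """
    mask = np.clip(np.asarray(occlusion_mask, dtype=float), 0.0, 1.0)
    if mask.ndim == warped_previous.ndim - 1:
        mask = mask[..., np.newaxis]  # broadcast the mask over color channels
    # A binary mask would pick one source per pixel; a soft mask mixes both.
    return (1.0 - mask) * warped_previous + mask * synthesized
```

With a binary mask, a pixel either keeps the warped content or is replaced entirely. With intermediate values such as 0.25, the frame keeps most of the warped content while gradually adding new detail, which is what helps when the camera zooms in and the warped pixels become blurry.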

9. Everybody Dance Now, by Caroline Chan, Shiry Ginosar, Tinghui Zhou, Alexei A. Efros

This paper presents a simple method for “do as I do” motion transfer: given a source video of a person dancing we can transfer that performance to a novel (amateur) target after only a few minutes of the target subject performing standard moves. We pose this problem as a per-frame image-to-image translation with spatio-temporal smoothing. Using pose detections as an intermediate representation between source and target, we learn a mapping from pose images to a target subject’s appearance. We adapt this setup for temporally coherent video generation including realistic face synthesis. Our video demo can be found at https://youtu.be/PCBTZh41Ris .

UC Berkeley researchers present a simple method for generating videos in which amateur dancers perform like professionals. To take part, all you need to do is record a few minutes of yourself performing some standard moves and then pick a video of the dance you want to repeat. The neural network does the rest: it treats the problem as per-frame image-to-image translation with spatio-temporal smoothing. By conditioning the prediction at each frame on that of the previous time step for temporal smoothness, and by applying a specialized GAN for realistic face synthesis, the method produces temporally coherent videos with realistic faces.

A pre-trained state-of-the-art pose detector creates pose stick figures from the source video.
Global pose normalization is applied to account for differences between the source and target subjects in body shapes and locations within the frame.
Normalized pose stick figures are mapped to the target subject.
To make videos smooth, the researchers suggest conditioning the generator on the previously generated frame and then giving both images to the discriminator. Gaussian smoothing on the pose keypoints further reduces jitter.
To generate more realistic faces, the method includes an additional face-specific GAN that brushes up the face after the main generation is finished.
Suggesting a novel approach to motion transfer that outperforms a strong baseline (pix2pixHD), according to both qualitative and quantitative assessments.
Demonstrating that face-specific GAN adds considerable detail to the output video.
“Overall I thought this was really fun and well executed. Looking forward to the code release so that I can start training my dance moves.”, Tom Brown, member of technical staff at Google Brain.
“’Everybody Dance Now’ from Caroline Chan, Alyosha Efros and team transfers dance moves from one subject to another. The only way I’ll ever dance well. Amazing work!!!”, Soumith Chintala, AI Research Engineer at Facebook.
Replacing pose stick figures with temporally coherent inputs and representation specifically optimized for motion transfer.
“Do as I do” motion transfer might be applied to replace subjects when creating marketing and promotional videos.
A PyTorch implementation of this research paper is available on GitHub.
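
The jitter-reduction step can be illustrated with a small NumPy sketch that applies Gaussian smoothing to pose keypoints. We assume here that the smoothing runs along the time axis and that keypoints are stored as an array of shape (frames, joints, 2). The function name, the `sigma` and `radius` parameters, and the edge padding are our own choices for the example, not details taken from the paper.

```python
import numpy as np

def smooth_keypoints(keypoints, sigma=1.0, radius=None):
    """Temporally smooth pose keypoints with a Gaussian kernel.

    keypoints: (T, K, 2) array of (x, y) coordinates per frame and joint.
    sigma: standard deviation of the Gaussian, measured in frames.
    """
    keypoints = np.asarray(keypoints, dtype=float)
    if radius is None:
        radius = int(3 * sigma + 0.5)
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()  # weights sum to 1, so constant poses are unchanged
    # Repeat the first and last frames so the output keeps all T frames.
    padded = np.pad(keypoints, ((radius, radius), (0, 0), (0, 0)), mode="edge")
    num_frames = keypoints.shape[0]
    smoothed = np.zeros_like(keypoints)
    for k, weight in enumerate(kernel):
        smoothed += weight * padded[k:k + num_frames]
    return smoothed
```

A larger `sigma` removes more frame-to-frame jitter from the detected poses but also softens fast, intentional movements, so in a dance setting the value would need to be tuned by eye.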

10. Large Scale GAN Training for High Fidelity Natural Image Synthesis, by Andrew Brock, Jeff Donahue, and Karen Simonyan

Despite recent progress in generative image modeling, successfully generating high-resolution, diverse samples from complex datasets such as ImageNet remains an elusive goal. To this end, we train Generative Adversarial Networks at the largest scale yet attempted, and study the instabilities specific to such scale. We find that applying orthogonal regularization to the generator renders it amenable to a simple “truncation trick”, allowing fine control over the trade-off between sample fidelity and variety by truncating the latent space. Our modifications lead to models which set the new state of the art in class-conditional image synthesis. When trained on ImageNet at 128×128 resolution, our models (BigGANs) achieve an Inception Score (IS) of 166.3 and Frechet Inception Distance (FID) of 9.6, improving over the previous best IS of 52.52 and FID of 18.65.

The DeepMind team finds that current techniques are sufficient for synthesizing high-resolution, diverse images from available datasets such as ImageNet and JFT-300M. In particular, they show that Generative Adversarial Networks (GANs) can generate images that look very realistic if they are trained at a very large scale, i.e., using two to four times as many parameters and eight times the batch size compared to prior art. These large-scale GANs, or BigGANs, are the new state of the art in class-conditional image synthesis.

GANs perform much better with the increased batch size and number of parameters.
Applying orthogonal regularization to the generator makes the model responsive to a specific technique (“truncation trick”), which provides control over the trade-off between sample fidelity and variety.
Demonstrating that GANs can benefit significantly from scaling.
Building models that allow explicit, fine-grained control of the trade-off between sample variety and fidelity.
Discovering instabilities of large-scale GANs and characterizing them empirically.
Achieving an Inception Score (IS) of 166.3 on ImageNet at 128×128 resolution, compared with the previous best IS of 52.52.
Achieving a Fréchet Inception Distance (FID) of 9.6 on the same task, compared with the previous best FID of 18.65.
The paper is under review for ICLR 2019.
Since BigGAN generators became available on TF Hub, AI researchers from all over the world have been playing with BigGANs to generate dogs, watches, bikini images, Mona Lisa, seashores and many more.
Moving to larger datasets to mitigate GAN stability issues.
Replacing expensive manual media creation for advertising and e-commerce purposes.
A BigGAN demo implemented in TensorFlow is available to use on Google’s Colab tool.
Aaron Leong has a GitHub repository for BigGAN implemented in PyTorch.
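
The "truncation trick" concerns how the latent vector is sampled at generation time. A minimal NumPy sketch, assuming a standard normal latent prior and assuming that out-of-range values are resampled until they fall below a chosen threshold, could look as follows. The function name and parameters are hypothetical, and the generator itself is not shown.

```python
import numpy as np

def truncated_latents(batch_size, dim, threshold, rng=None):
    """Sample latent vectors whose entries all satisfy |z| <= threshold.

    Entries outside the range are resampled rather than clipped, so the
    result follows a truncated normal distribution.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal((batch_size, dim))
    out_of_range = np.abs(z) > threshold
    while out_of_range.any():
        # Redraw only the offending entries and check them again.
        z[out_of_range] = rng.standard_normal(int(out_of_range.sum()))
        out_of_range = np.abs(z) > threshold
    return z
```

A small threshold keeps the latents close to the center of the distribution, which tends to favor fidelity of individual samples at the cost of variety. A large threshold approaches ordinary sampling and restores variety. Very small thresholds make the resampling loop slower, since most draws are rejected.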




About Mariya Yao

Mariya is the co-author of Applied AI: A Handbook For Business Leaders and former CTO at Metamaven. She "translates" arcane technical concepts into actionable business advice for executives and designs lovable products people actually want to use. Follow her on Twitter at @thinkmariya to raise your AI IQ.


Computer Vision Group TUM School of Computation, Information and Technology Technical University of Munich


Boltzmannstrasse 3
85748 Garching [email protected]


We are currently actively working on the following research topics:

Inquiries for Bachelor and Master projects are always welcome.

We have also done research in many other fields. If you are interested in any of these, do not hesitate to contact us - we are always happy to get back to those topics and do exciting research there!



How to find a good thesis topic in computer vision.

“What are some good thesis topics in Computer Vision?”

This is a common question that people ask in forums – and it’s an important question to ask for two reasons:

There’s nothing worse than starting over in research because the path you decided to take turned out to be a dead end.
There’s also nothing worse than being stuck with a generally good topic but one that doesn’t interest you at all. A “good” thesis topic has to be one that interests you and will keep you involved and stimulated for as long as possible.

For these reasons, it’s best to do as much research as you can to avoid the above pitfalls or your days of research will slowly become torturous for you – and that would be a shame because computer vision can truly be a lot of fun 🙂

So, down to business.

The purpose of this post is to propose ways to find that one perfect topic that will keep you engaged for months (or years) to come – and something you’ll be proud to talk about amongst friends and family.

I’ll start the discussion off by saying that your search strategy for topics depends entirely on whether you’re preparing for a Master’s thesis or a PhD. The former can be more general, the latter is (nearly always) very fine-grained specific. Let’s start with undergraduate topics first.

Undergraduate Studies

I’ll propose here three steps you can take to assist in your search: looking at the applications of computer vision, examining the OpenCV library, and talking to potential supervisors.

Applications of Computer Vision

Computer Vision has so many uses in the world. Why not look through a comprehensive list of them and see if anything on that list draws you in? Here’s one such list I collected from the British Machine Vision Association:

agriculture
augmented reality
autonomous vehicles (big one nowadays!)
character recognition
industrial quality inspection
face recognition
gesture analysis
image restoration
medical image analysis
pollution monitoring
process control
remote sensing
robotics (e.g. navigation)
security and surveillance

Go through this list and work out if something stands out for you. Perhaps your family is involved in agriculture? Look up how computer vision is helping in this field! The Economist wrote a fascinating article entitled “The Future of Agriculture” in which they discuss, among other things, the use of drones to monitor crops, create contour maps of fields, etc. Perhaps Computer Vision can assist with some of these tasks? Look into this!

OpenCV is the best library out there for image and video processing (I’ll be writing a lot more about it on this blog). Other libraries do exist that do certain specific things a little better, e.g. Tracking.js, which performs things like tracking inside the browser, but generally speaking, there’s nothing better than OpenCV.

On the topic of searching for thesis topics, I recall once reading a suggestion of going through the functions that OpenCV has to offer and seeing if anything sticks out at you there. A brilliant idea. Work down the list of the OpenCV documentation. Perhaps face recognition interests you? There are so many interesting projects where this can be utilised!

Talk to potential supervisors

You can’t go past this suggestion. Every academic has ideas constantly buzzing around their head. Academics are immersed in their field of research and are always talking to people in industry to look for interesting projects that they could get funding for. Go and talk to the academics at your university that are involved in Computer Vision. I’m sure they’ll have at least one project proposal ready to go for you.

You should also run any ideas of yours past them that may have emerged from the two previous steps. Or at least mention things that stood out for you (e.g. agriculture). They may be able to come up with something themselves.

PhD Studies

Well, if you’ve reached this far in your studies then chances are you have a fairly good idea of how this all works now. I won’t patronise you too much, then. But I will mention three points that I wish someone had told me prior to starting my PhD adventure:

You should be building your research topic around a supervisor. They’ve been in the field for a long time and know where the niches and dead ends are. Use their experience! If there’s a supervisor who is constantly publishing in object tracking, then doing research with them in this area makes sense.
If your supervisor has a ready-made topic for you, CONSIDER TAKING IT. I can’t stress this enough. Usually the first year of your PhD involves you searching (often blindly) around various fields in Computer Vision and then just going deeper and deeper into one specific area to find a niche. If your supervisor has a topic on hand for you, this means that you are already one year ahead of the crowd. And that means one year saved of frustration because searching around in a vast realm of publications can be daunting – believe me, I’ve been there.
Avoid going into trending topics. For example, object recognition using Convolutional Neural Networks is a topic that currently everyone is going crazy about in the world of Computer Vision. This means that in your studies, you will be competing for publications with big players (e.g. Google) who have money, manpower, and computer power at their disposal. You don’t want to enter into this war unless you are confident that your supervisor knows what they’re doing and/or your university has the capabilities to play in this big league also.

Spending time looking for a thesis topic is time well spent. It could save you from future pitfalls. With respect to undergraduate thesis topics, looking at Computer Vision applications is one place to start. The OpenCV library is another. And talking to potential supervisors at your university is also a good idea.

With respect to PhD thesis topics, it’s important to take into consideration the fields of expertise of your potential supervisors and then search for topics in these areas. If these supervisors have ready-made topics for you, it is worth considering them to save you a lot of time and stress in the first year or so of your studies. Finally, it’s usually good to avoid trending topics because of the people you will be competing against for publications.

But the bottom line is, devote time to finding a topic that truly interests you. It’ll be the difference between wanting to get out of bed to do more and more research in your field or dreading each time you have to walk into your Computer Science building in the morning.


Research Topics

Biomedical Imaging

The current plethora of imaging technologies such as magnetic resonance imaging (MR), computed tomography (CT), positron emission tomography (PET), optical coherence tomography (OCT), and ultrasound provide great insight into the different anatomical and functional processes of the human body.

Computer Vision

Computer vision is the science and technology of teaching a computer to interpret images and video as well as a typical human. Technically, computer vision encompasses the fields of image/video processing, pattern recognition, biological vision, artificial intelligence, augmented reality, mathematical modeling, statistics, probability, optimization, 2D sensors, and photography.

Image Segmentation/Classification

Extracting information from a digital image often depends on first identifying desired objects or breaking down the image into homogeneous regions (a process called 'segmentation') and then assigning these objects to particular classes (a process called 'classification'). This is a fundamental part of computer vision, combining image processing and pattern recognition techniques.
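As a rough illustration of the two stages, the sketch below segments a tiny synthetic image by thresholding and 4-connected region labelling, then classifies each region by its pixel count. The function names, the threshold, and the "large"/"small" rule are all hypothetical choices made for this example; real pipelines use far richer features and classifiers.

```python
from collections import deque

import numpy as np


def segment(image, threshold):
    """Label 4-connected regions of pixels brighter than `threshold`."""
    mask = image > threshold
    labels = np.zeros(image.shape, dtype=int)
    current = 0
    for start in zip(*np.nonzero(mask)):
        if labels[start]:
            continue  # pixel already belongs to a labelled region
        current += 1
        labels[start] = current
        queue = deque([start])
        while queue:
            r, c = queue.popleft()
            for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                inside = 0 <= nr < image.shape[0] and 0 <= nc < image.shape[1]
                if inside and mask[nr, nc] and not labels[nr, nc]:
                    labels[nr, nc] = current
                    queue.append((nr, nc))
    return labels, current


def classify(labels, n_regions, min_large_area=4):
    """Assign each region a class from its pixel count (a toy rule)."""
    classes = {}
    for region in range(1, n_regions + 1):
        area = int((labels == region).sum())
        classes[region] = "large" if area >= min_large_area else "small"
    return classes


# Synthetic test image (an assumption for illustration only).
image = np.zeros((6, 6))
image[0:2, 0:2] = 1.0   # a 2x2 bright blob (area 4)
image[4, 4] = 1.0       # a single bright pixel (area 1)

labels, n_regions = segment(image, threshold=0.5)
region_classes = classify(labels, n_regions)
print(n_regions, region_classes)
```

Keeping segmentation and classification as separate functions mirrors the description above: the first stage only decides which pixels belong together, and the second only decides what each region is.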

Multiresolution Techniques

The VIP lab has a particularly extensive history with multiresolution methods, and a significant number of research students have explored this theme. Multiresolution methods are very broad, essentially meaning that an image or video is modeled, represented, or features extracted on more than one scale, somehow allowing both local and non-local phenomena.

Remote Sensing

Remote sensing, or the science of capturing data of the earth from airplanes or satellites, enables regular monitoring of land, ocean, and atmosphere expanses, representing data that cannot be captured using any other means. A vast amount of information is generated by remote sensing platforms and there is an obvious need to analyze the data accurately and efficiently.

Scientific Imaging

Scientific Imaging refers to working on two- or three-dimensional imagery taken for a scientific purpose, in most cases acquired either through a microscope or remotely-sensed images taken at a distance.

Stochastic Models

In many image processing, computer vision, and pattern recognition applications, there is often a large degree of uncertainty associated with factors such as the appearance of the underlying scene within the acquired data, the location and trajectory of the object of interest, the physical appearance (e.g., size, shape, color, etc.) of the objects being detected, etc.

Video Analysis

Video analysis is a field within computer vision that involves the automatic interpretation of digital video using computer algorithms. Although humans are readily able to interpret digital video, developing algorithms for the computer to perform the same task has been highly elusive and is now an active research field.

Evolutionary Deep Intelligence

Deep learning has shown considerable promise in recent years, producing tremendous results and significantly improving the accuracy of a variety of challenging problems when compared to other machine learning methods.

Discovery Radiomics

Radiomics, which involves the high-throughput extraction and analysis of a large amount of quantitative features from medical imaging data to characterize tumor phenotype in a quantitative manner, is ushering in a new era of imaging-driven quantitative personalized cancer decision support and management.

Sports Analytics

Sports Analytics is a growing field in computer vision that analyzes visual cues from images to provide statistical data on players, teams, and games. Want to know how a player's technique improves the quality of the team? Can a team, based on their defensive position, increase their chances of reaching the finals? These are a few out of a plethora of questions that are answered in sports analytics.


Kindson The Genius

Providing the best learning experience for professionals

10 Machine Learning Project (Thesis) Topics for 2020

Are you looking for some interesting project ideas for your thesis, project or dissertation? Then a machine learning topic would be a very good choice. I have outlined 10 different topics. These topics are really good because you can easily obtain the dataset (I will provide the link to the dataset) and you can also get some support from me. Let me know if you need any support in preparing your thesis.

You can leave a comment below in the comment area.

1. Machine Learning Model for Classification and Detection of Breast Cancer (Classification)

The data is provided by the Oncology department and details instances and related attributes which are nine in all.

You can obtain the dataset from here

2. Intelligent Internet Ads Generation (Classification)

This is one of the most interesting topics for me. The reason is that the revenue generated or spent by an ad campaign depends not just on the volume of the ads, but also on their relevance. Therefore it is possible to increase revenue and reduce spending by developing a Machine Learning model that selects relevant ads with a high level of accuracy. The dataset provides a collection of ads as well as the structure and geometry of the ads.

Get the ads dataset from here

3. Feature Extraction for National Census Data (Clustering)

This looks like big data stuff. But no! It’s simply a dataset you can use for analysis. It is the actual data obtained from the US census in 1990. There are 68 attributes for each of the records, and clustering would be performed to identify trends in the data.

You can obtain the census dataset from here

4. Movie Outcome Prediction (Classification)

This is quite a demanding project, but it’s quite interesting. Models already exist that predict the ratings of movies on a scale of 0 to 10 or 1 to 5. But this takes it a step further: you actually need to determine the outcome of the movie. The dataset is a large multivariate dataset covering movie directors, cast, individual roles of the actors, remarks, studios and relevant documents.

You can get the movies dataset from here

5. Forest Fire Area Coverage Prediction (Regression)

This project has been classified as difficult, but I don’t think so. The objective is to predict the area affected by forest fires. The dataset includes relevant meteorological information and other parameters taken from a region of Portugal.

You can get the fire dataset from here

6. Atmospheric Ozone Level Analysis and Detection (Clustering)

Two ground ozone datasets are provided for this. Data includes temperatures at various times of the day as well as wind speed. The data included in the dataset was collected in a span of 6 years from 1998 to 2004.

You can get the Ozone dataset from here

7. Crime Prediction in New York City (Regression)

If you have watched the TV series ‘Person of Interest’, created by Jonathan Nolan, then you will appreciate the possibility of predicting violent criminal activities before they actually occur. The dataset contains historical data on crime rates and the types of crime occurring per region.

You can get the crime dataset from here

8. Sentiment Analysis on Amazon ECommerce User Reviews (Classification)

The dataset for this project is derived from user review comments from Amazon users. The model should be able to perform analysis on the training dataset and come up with a model that classifies the reviews based on sentiments. Granularity can be improved by generating predictions based on location and other factors.

You can get the reviews dataset from here

9. Home Electrical Power Consumption Analysis (Regression)

Everyone uses electricity at home. Or rather, almost everyone! Would it not be great to have a system that helps to predict electricity consumption? The training dataset provided for this project includes features such as the size of the home, duration and more.

You can get the dataset from here

10. Predictive Modelling of Individual Human Knowledge (Classification and Clustering)

Here the available dataset provides a collection of data about an individual on a subject matter. You are required to create a model that tries to quantify the amount of knowledge the individual has on the given subject. You can be creative by also trying to infer the performance of the user on certain exams.

I hope these 10 Machine Learning project topics will be helpful to you.

Thanks for reading and do leave a comment below if you need some support

kindsonthegenius

Kindson Munonye is currently completing his doctoral program in Software Engineering in Budapest University of Technology and Economics


Thesis Topics

This list includes topics for potential bachelor or master theses, guided research, projects, seminars, and other activities. Search with Ctrl+F for desired keywords, e.g. ‘machine learning’ or others.

PLEASE NOTE: If you are interested in any of these topics, click the respective supervisor link to send a message with a simple CV, grade sheet, and topic ideas (if any). We will answer shortly.

Of course, your own ideas are always welcome!

Spatial Explicit Machine Learning

Type of work:

Guided Research
Earth Observation
Machine Learning
Remote Sensing
Spatial Awareness Modeling
Spatial Transferability

Description:

Machine learning models designed and trained to work on a specific region are not necessarily transferable to other, spatially different regions. Including a spatially explicit component is necessary to differentiate behaviors and predictions according to spatial location. However, it is not clear what the best way to use this spatial information is, or which kinds of models work best for spatial transferability. In this topic, global remote sensing data will be used for supervised learning in different Earth observation applications.

Feel free to reach out if you have any questions or ideas regarding the topic.

Image Super-Resolution in both ways

auto-encoder
deep learning
single image super-resolution

The goal of this project is to develop and evaluate a novel dual-decoder architecture for image super-resolution (SR) [1]. This architecture will utilize a single encoder to extract features from an input image, followed by two decoders: one trained to map the features to a low-resolution (LR) output, and the other to map the features to a high-resolution (HR) output. This approach aims to enhance the SR performance by leveraging the complementary learning objectives of both decoders. The goal of the work is to try different architectures and to analyze different loss formulations as well as the feature space learned by the encoder.

[1] Hitchhiker’s Guide to Super-Resolution: Introduction and Recent Advances
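To make the dual-decoder idea concrete, here is a minimal NumPy sketch of the forward pass and a combined loss. It is only an illustration under stated assumptions: the encoder and both decoders are plain linear maps, the image sizes, the feature dimension, and the `lr_weight` factor are hypothetical, and no training loop is shown. The actual project would use learned deep networks and may combine the two objectives differently.

```python
import numpy as np

rng = np.random.default_rng(0)


def downsample(img, factor=2):
    """Average-pool a square image by `factor` to simulate an LR input."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))


# Hypothetical sizes: 4x4 LR input, 8x8 HR target, 32-dimensional feature space.
lr_size, hr_size, feat_dim = 4, 8, 32

# Linear stand-ins for the shared encoder and the two decoders.
W_enc = rng.normal(scale=0.1, size=(feat_dim, lr_size * lr_size))
W_dec_lr = rng.normal(scale=0.1, size=(lr_size * lr_size, feat_dim))
W_dec_hr = rng.normal(scale=0.1, size=(hr_size * hr_size, feat_dim))


def forward(lr_img):
    """Encode once, then decode the same features to an LR and an HR output."""
    features = np.tanh(W_enc @ lr_img.ravel())
    lr_out = (W_dec_lr @ features).reshape(lr_size, lr_size)
    hr_out = (W_dec_hr @ features).reshape(hr_size, hr_size)
    return features, lr_out, hr_out


def dual_loss(lr_img, hr_img, lr_weight=0.5):
    """HR reconstruction error plus a weighted LR reconstruction error."""
    _, lr_out, hr_out = forward(lr_img)
    hr_loss = np.mean((hr_out - hr_img) ** 2)
    lr_loss = np.mean((lr_out - lr_img) ** 2)
    return hr_loss + lr_weight * lr_loss


hr_img = rng.random((hr_size, hr_size))
lr_img = downsample(hr_img)
loss = dual_loss(lr_img, hr_img)
print(f"combined loss: {loss:.4f}")
```

Because both decoders read the same `features` vector, gradients from both loss terms would shape the encoder during training, which is the complementary-objective effect the project description refers to.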

Applying TaylorShift to Transformer-based Image Super-Resolution Models

vision transformer

The aim of this project is to integrate the TaylorShift [1] attention mechanism into the SwinIR model to enhance the efficiency and performance of image super-resolution (SR) [2]. By leveraging the linear complexity of TaylorShift, we intend to improve the processing speed and reduce the memory footprint of SwinIR without compromising its high accuracy in generating high-resolution images from low-resolution inputs. Image super-resolution is a crucial task in computer vision that aims to enhance the resolution of images, making them clearer and more detailed. SwinIR (Swin Transformer for Image Restoration) has shown state-of-the-art performance in various image restoration tasks, including super-resolution. However, the quadratic complexity of its attention mechanism can be a bottleneck, especially for high-resolution images. TaylorShift, a novel reformulation of the Taylor softmax function, addresses this issue by reducing the complexity of the attention mechanism from quadratic to linear. This enables efficient processing of long sequences and high-resolution images while maintaining the ability to capture intricate token-to-token interactions.

[1] TaylorShift: Shifting the Complexity of Self-Attention from Squared to Linear (and Back) using Taylor-Softmax
[2] Hitchhiker’s Guide to Super-Resolution: Introduction and Recent Advances
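For orientation, the following NumPy sketch shows attention computed with a second-order Taylor softmax, where exp(x) is replaced by 1 + x + x²/2. This is the direct form, which still builds the full n × n score matrix and therefore remains quadratic in the number of tokens; the linear-time reformulation described in [1] is not reproduced here. The function names, the choice of order two, and the token and feature sizes are assumptions for illustration.

```python
import numpy as np


def taylor_softmax(scores):
    """Row-wise softmax with exp(x) replaced by 1 + x + x**2 / 2."""
    # 1 + x + x^2/2 = ((x + 1)^2 + 1) / 2 >= 0.5, so every weight is positive.
    approx = 1.0 + scores + 0.5 * scores ** 2
    return approx / approx.sum(axis=-1, keepdims=True)


def taylor_attention(Q, K, V):
    """Direct (quadratic-cost) attention using the Taylor softmax."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])   # (n, n) score matrix
    return taylor_softmax(scores) @ V


rng = np.random.default_rng(0)
n_tokens, dim = 6, 4                          # hypothetical sizes
Q, K, V = (rng.normal(size=(n_tokens, dim)) for _ in range(3))

weights = taylor_softmax(Q @ K.T / np.sqrt(dim))
out = taylor_attention(Q, K, V)
print(out.shape)
```

A direct version like this can serve as a correctness reference when checking a faster implementation against it on small inputs.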

Sherlock Holmes goes AI - Generative comics art of detective scenes and identikits

Bias in image generation models
Deep Learning Frameworks
Frontend visualization
Speech-To-Text, Text-to-Image Models
Transformers, Diffusion Models, Hugging Face

Sherlock Holmes is taking the statement of the witness. The witness is describing the appearance of the perpetrator and the forensic setting they still remember. Your task as the AI investigator will be to generate a comic sketch of the scene and phantom images of the accused person based on the spoken statement of the witness. For this you will use state-of-the-art transformers and visualize the output in an application. As AI investigator you will detect, qualify and quantify bias in the images which are produced by different generation models you have chosen.

This work is embedded in the DFKI KI4Pol lab together with the law enforcement agencies. The stories are fictional; you will not work on true crime.

Requirements:

German level B1/2 or equivalent
Outstanding academic achievements
Motivational cover letter

Knowledge Graphs for Real Estate Management

corporate memory
knowledge graph

Real estate management is complex and involves a wide variety of information sources and information objects needed to carry out its processes. A corporate memory can help here by analyzing and modeling the information space so that knowledge services become possible. The task is to design an ontology for real estate management and to develop an example scenario. Good German language skills are required for the materials and for working with the application partners.

Fault and Efficiency Prediction in High Performance Computing

Master Thesis
event data modelling
survival modelling
time series

High use of resources is thought to be an indirect cause of failures in large cluster systems, but little work has systematically investigated the role of high resource usage in system failures, largely due to the lack of a comprehensive resource monitoring tool which resolves resource use by job and node. This project studies log data of the DFKI Kaiserslautern high performance cluster to consider the predictability of adverse events (node failure, GPU freeze) and energy usage, and to identify the most relevant data within. The second supervisor for this work is Joachim Folz.

Data is available via a Prometheus-compatible system:

Node exporter
DCGM exporter
Slurm exporter
Linking Resource Usage Anomalies with System Failures from Cluster Log Data
Deep Survival Models
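Since the topic is tagged with survival modelling, a classical starting point before any deep survival model is the Kaplan-Meier estimator. The sketch below is a minimal NumPy version under stated assumptions: the uptimes and failure flags are made-up numbers, not cluster data, and `kaplan_meier` is a hypothetical helper written for this example.

```python
import numpy as np


def kaplan_meier(durations, observed):
    """Return event times and the Kaplan-Meier survival estimate at each."""
    durations = np.asarray(durations, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    times = np.unique(durations[observed])       # distinct failure times
    survival = []
    s = 1.0
    for t in times:
        at_risk = np.sum(durations >= t)         # nodes still under observation
        failures = np.sum((durations == t) & observed)
        s *= 1.0 - failures / at_risk
        survival.append(s)
    return times, np.array(survival)


# Hypothetical node uptimes in hours. False marks a node that was still
# running when monitoring stopped (a censored observation).
uptime_hours = [5, 8, 8, 12, 20, 20]
failed = [True, True, False, True, False, True]

times, survival = kaplan_meier(uptime_hours, failed)
for t, s in zip(times, survival):
    print(f"t = {t:4.0f} h  S(t) = {s:.3f}")
```

Censored nodes still count towards the at-risk set until their last observed time, which is what distinguishes this estimate from simply counting the fraction of failed nodes.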

Feel free to reach out if the topic sounds interesting or if you have ideas related to this work. We can then brainstorm a specific research question together. Link to my personal website.

Construction & Application of Enterprise Knowledge Graphs in the E-Invoicing Domain

Guided Research Project
knowledge graphs
knowledge services
linked data
semantic web

In recent years, knowledge graphs have received a lot of attention both in industry and in science. Knowledge graphs consist of entities and relationships between them and allow integrating new knowledge arbitrarily. Famous instances in industry are knowledge graphs by Microsoft, Google, Facebook or IBM. But beyond these, knowledge graphs are also adopted in more domain-specific scenarios such as e-Procurement, e-Invoicing and purchase-to-pay processes. The objective of theses and projects is to explore particular aspects of constructing and/or applying knowledge graphs in the domain of purchase-to-pay processes and e-Invoicing.
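The entity-and-relationship structure can be illustrated with a tiny in-memory triple store. Everything in this sketch is hypothetical: the `KnowledgeGraph` class, the identifiers such as `invoice:1001`, and the predicate names are invented for the example and do not come from any real e-Invoicing vocabulary.

```python
from collections import defaultdict


class KnowledgeGraph:
    """A minimal store of (subject, predicate, object) triples."""

    def __init__(self):
        self.triples = set()
        self.by_subject = defaultdict(set)

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))
        self.by_subject[subject].add((predicate, obj))

    def objects(self, subject, predicate):
        """All objects linked to `subject` through `predicate`, sorted."""
        return sorted(o for p, o in self.by_subject[subject] if p == predicate)


kg = KnowledgeGraph()
kg.add("invoice:1001", "issuedBy", "supplier:acme")
kg.add("invoice:1001", "refersTo", "order:77")
kg.add("order:77", "placedBy", "buyer:globex")

# A new kind of fact can be added later without changing any schema.
kg.add("invoice:1001", "hasStatus", "paid")

print(kg.objects("invoice:1001", "issuedBy"))
print(kg.objects("invoice:1001", "hasStatus"))
```

Production systems would typically use RDF stores and shared vocabularies instead, but the same triple pattern underlies them.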

Anomaly detection in time-series

explainability

This topic involves working on deep neural networks to make time-series anomaly detection more robust. An important aspect of this process is the explainability of the decisions taken by the network.

Time Series Forecasting Using Transformer Networks

time series forecasting
transformer networks

Transformer networks have emerged as a competent architecture for modeling sequences. This research will primarily focus on using transformer networks for forecasting time series (multivariate/univariate) and may also involve fusing knowledge into the machine learning architecture.

Topics on Machine Learning under Imperfect Supervision

This dissertation comprises several studies addressing supervised learning problems where the supervision is imperfect. Firstly, we investigate the margin conditions in active learning. Active learning is characterized by its special mechanism where the learner can sample freely over the feature space and make the most of the limited labeling budget by querying the most informative labels. Our primary focus is to discern critical conditions under which certain active learning algorithms can outperform the optimal passive learning minimax rate. Within a non-parametric multi-class classification framework, our results reveal that the uniqueness of Bayes labels across the feature space serves as the pivotal determinant for the superiority of active learning over passive learning. Secondly, we study the estimation of central mean subspace (CMS), and its application in transfer learning. We show that a fast parametric convergence rate is achievable via estimating the expected smoothed gradient outer product, for a general class of covariate distributions that admits Gaussian or heavier-tailed distributions. When the link function is a polynomial with a degree of at most r and the covariates follow the standard Gaussian, we show that the prefactor depends on the ambient dimension d as d^r. Furthermore, we show that under a transfer learning setting, an oracle rate of prediction error as if the CMS is known is achievable, when the source training data is abundant. Finally, we present an innovative application involving the utilization of weak (noisy) labels for addressing an Individual Tree Crown (ITC) segmentation challenge. Here, the objective is to delineate individual tree crowns within a 3D LiDAR scan of tropical forests, with only 2D noisy manual delineations of crowns on RGB images available as a source of weak supervision. We propose a refinement algorithm designed to enhance the performance of existing unsupervised learning methodologies for the ITC segmentation problem.
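The link between a gradient outer product and the central mean subspace can be seen in a toy single-index example. The sketch below is an assumption-laden illustration, not the dissertation's estimator: it uses the exact gradient of a known quadratic link function instead of a smoothed gradient estimated from data, and the direction `b`, the dimension, and the sample size are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 2000                         # hypothetical ambient dimension and sample size

# Hypothetical single-index model y = (b . x)**2 with a unit-norm direction b,
# so the central mean subspace is the line spanned by b.
b = np.array([1.0, 2.0, 0.0, 0.0, -1.0])
b /= np.linalg.norm(b)

X = rng.normal(size=(n, d))            # standard Gaussian covariates
grads = 2.0 * (X @ b)[:, None] * b     # exact gradient of (b . x)**2 at each sample

# Empirical expected gradient outer product: the average of g g^T over the sample.
egop = grads.T @ grads / n

eigvals, eigvecs = np.linalg.eigh(egop)    # eigenvalues in ascending order
top_direction = eigvecs[:, -1]
alignment = abs(top_direction @ b)         # absolute value removes the sign ambiguity
print(f"alignment with the true direction: {alignment:.6f}")
```

Every gradient here is a multiple of `b`, so the averaged outer product has rank one and its leading eigenvector recovers the subspace. With gradients estimated from noisy data, the recovery would only be approximate.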

Computer science
Machine learning--Statistical methods
Transfer learning (Machine learning)
Active learning
Bayesian statistical decision theory
Crowns (Botany)
Gaussian processes



Currently Available Theses Topics

We offer these current topics directly to Bachelor and Master students at TU Darmstadt, who can feel free to DIRECTLY contact the thesis advisor if they are interested in one of these topics. Excellent external students from another university may be accepted but are required to first email Jan Peters before contacting any other lab member for a thesis topic. Note that we cannot provide funding for any of these thesis projects.

We highly recommend that you take either our robotics and machine learning lectures (Robot Learning, Statistical Machine Learning) or those of our colleagues (Grundlagen der Robotik, Probabilistic Graphical Models and/or Deep Learning). Even more important to us is that you take both Robot Learning: Integrated Project, Part 1 (Literature Review and Simulation Studies) and Part 2 (Evaluation and Submission to a Conference) before doing a thesis with us.

In addition, we are usually happy to devise new topics on request to suit the abilities of excellent students. Please DIRECTLY contact the thesis advisor if you are interested in one of these topics. When you contact the advisor, it would be nice if you could mention (1) WHY you are interested in the topic (dreams, parts of the problem, etc), and (2) WHAT makes you special for the projects (e.g., class work, project experience, special programming or math skills, prior work, etc.). Supplementary materials (CV, grades, etc) are highly appreciated. Of course, such materials are not mandatory but they help the advisor to see whether the topic is too easy, just about right or too hard for you.

Only contact *ONE* potential advisor at the same time! If you contact a second one without first concluding discussions with the first advisor (i.e., decide for or against the thesis with her or him), we may not consider you at all. Only if you are super excited for at most two topics send an email to both supervisors, so that the supervisors are aware of the additional interest.

FOR FB16+FB18 STUDENTS: If you are a student from another department at TU Darmstadt (e.g., ME, EE, IST), you need an additional formal supervisor who officially issues the topic. Please do not try to arrange your home department advisor by yourself but let the supervising IAS member get in touch with that person instead. Multiple professors from other departments have complained that they were asked to co-supervise before being contacted by our advising lab member.

NEW THESES START HERE

Data-Driven Bimanual Robotic Grasping

Scope: Bachelor/Master thesis Advisor: Vignesh Prasad and Alap Kshirsagar Added: 2024-04-25 Start: ASAP Topic:

Grasping is one of the most fundamental and challenging tasks in the robotic manipulation of objects. Most of the prior work on robotic grasping has focused on grasping with a single gripper and several large-scale datasets have been developed in recent years to tackle the problem of single-arm grasping in 3D by utilizing deep-learning techniques [1,2]. But many tasks in industrial and domestic environments require bimanual grasps. Bimanual grasps are required for manipulation of large, deformable or fragile objects. This project seeks to develop a data-driven technique for bimanual robotic grasp generation from visual input. We will utilize a large-scale dataset of simulated bimanual grasps [3] to train a bimanual grasp pose generation model. The method will be evaluated in simulation as well as on a real robot.

Requirements

Strong Python programming skills
Knowledge in Machine Learning / Supervised Learning
Experience with deep learning libraries is a plus

Interested students can apply by sending an e-mail to [email protected] and attaching the documents mentioned below:

Curriculum Vitae
Motivation letter explaining why you would like to work on this topic and why you are the perfect candidate

References
[1] C. Eppner, A. Mousavian, and D. Fox, “ACRONYM: A Large-Scale Grasp Dataset Based on Simulation,” in Proceedings - IEEE International Conference on Robotics and Automation, 2021, vol. 2021-May, pp. 6222–6227, doi: 10.1109/ICRA48506.2021.9560844.
[2] A. Mousavian, C. Eppner, and D. Fox, “6-DOF GraspNet: Variational grasp generation for object manipulation,” in Proceedings of the IEEE International Conference on Computer Vision, 2019, vol. 2019-October, pp. 2901–2910, doi: 10.1109/ICCV.2019.00299.
[3] G. Zhai et al., “DA2 Dataset: Toward Dexterity-Aware Dual-Arm Grasping,” IEEE Robot. Autom. Lett., vol. 7, no. 4, pp. 8941–8948, 2022.

Imitation Learning for High-Speed Robot Air Hockey

Scope: Master thesis Advisor: Puze Liu and Julen Urain De Jesus Start: ASAP Topic:

High-speed reactive motion is one of the fundamental capabilities robots need to achieve human-level behavior. Optimization-based methods struggle to meet real-time requirements when the problem is non-convex and contains constraints. Reinforcement learning requires extensive reward engineering to achieve the desired performance. Imitation learning, on the other hand, gathers human knowledge directly from data collection and enables robots to learn natural movements efficiently. In this thesis, we explore how imitation learning can be performed in a complex robot Air Hockey task. The robot needs to learn not only low-level skills, but also high-level tactics from human demonstrations.

Good Knowledge in Robotics

References
* Chi, Cheng, et al. "Diffusion policy: Visuomotor policy learning via action diffusion." arXiv preprint arXiv:2303.04137 (2023).
* Liu, Puze, et al. "Robot reinforcement learning on the constraint manifold." Conference on Robot Learning. PMLR (2022).
* Pan, Yunpeng, et al. "Imitation learning for agile autonomous driving." The International Journal of Robotics Research 39.2-3 (2020).

Interested students can apply by sending an e-mail to [email protected] and attaching the required documents mentioned above.

Walk your network: investigating neural network’s location in Q-learning methods.

Scope: Master thesis Advisor: Theo Vincent Start: Flexible Topic:

Q-learning methods are at the heart of Reinforcement Learning. They have been shown to outperform humans on some complex tasks such as playing video games [1]. In robotics, where the action space is in most cases continuous, actor-critic methods rely on Q-learning methods to learn the critic [2]. Although Q-learning methods have been extensively studied in the past, little focus has been placed on the way the online neural network is exploring the space of Q functions. Most approaches focus on crafting a loss that would make the agent learn better policies [3]. Here, we offer a thesis that focuses on the position of the online Q neural network in the space of Q functions. The student will first investigate this idea on simple problems before comparing the performance to strong baselines such as DQN or REM [1, 4] on Atari games. Depending on the result, the student might as well get into MuJoCo and compare the results with SAC [2]. The student will be welcome to propose some ideas as well.
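As a reminder of the basic update that these methods build on, here is a minimal tabular Q-learning sketch on a toy chain environment. It is an illustration only: the five-state chain, the reward, and the hyperparameters are hypothetical, and a lookup table stands in for the neural network that the thesis would actually study.

```python
import numpy as np

n_states, n_actions = 5, 2          # actions: 0 = left, 1 = right
goal = n_states - 1                 # reaching the right end finishes an episode
alpha, gamma, epsilon = 0.5, 0.9, 0.2


def step(state, action):
    """Move along the chain; arriving at the goal gives reward 1."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    reward = 1.0 if next_state == goal else 0.0
    return next_state, reward, next_state == goal


rng = np.random.default_rng(0)
Q = np.zeros((n_states, n_actions))

for _ in range(500):
    state, done = 0, False
    while not done:
        if rng.random() < epsilon:
            action = int(rng.integers(n_actions))          # explore
        else:
            best = np.flatnonzero(Q[state] == Q[state].max())
            action = int(rng.choice(best))                 # exploit, ties broken at random
        next_state, reward, done = step(state, action)
        # Q-learning target: reward plus the discounted best next-state value.
        target = reward + (0.0 if done else gamma * Q[next_state].max())
        Q[state, action] += alpha * (target - Q[state, action])
        state = next_state

greedy_policy = Q[:goal].argmax(axis=1)
print(greedy_policy)
```

The sequence of `Q` tables produced during training is, in this tabular setting, the trajectory through the space of Q functions that the thesis description refers to.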

Highly motivated students can apply by sending an email to [email protected] . Please attach your CV, a grade sheet and clearly state why you are interested in this topic. Students who have followed the Reinforcement Learning or Robot Learning course will be prioritized.

Knowledge in Reinforcement Learning

References
[1] Mnih, Volodymyr, et al. "Human-level control through deep reinforcement learning." Nature 518.7540 (2015): 529-533.
[2] Haarnoja, Tuomas, et al. "Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor." International Conference on Machine Learning. PMLR, 2018.
[3] Hessel, Matteo, et al. "Rainbow: Combining improvements in deep reinforcement learning." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 32. No. 1. 2018.
[4] Agarwal, R., Schuurmans, D. & Norouzi, M. (2020). An Optimistic Perspective on Offline Reinforcement Learning. International Conference on Machine Learning (ICML).

Co-optimizing Hand and Action for Robotic Grasping of Deformable objects

This project aims to advance deformable object manipulation by co-optimizing robot gripper morphology and control policies. The project will involve utilizing existing simulation environments for deformable object manipulation [2] and implementing a method to jointly optimize gripper morphology and grasp policies within the simulation.

Required Qualification:

Familiarity with deep learning libraries such as PyTorch or Tensorflow

Preferred Qualification:

Attendance of the lectures "Statistical Machine Learning", "Computational Engineering and Robotics" and "Robot Learning"

Application Requirements:

Interested students can apply by sending an e-mail to [email protected] and attaching the required documents mentioned above.

References: [1] Xu, Jie, et al. "An End-to-End Differentiable Framework for Contact-Aware Robot Design." Robotics: Science & Systems. 2021. [2] Huang, Isabella, et al. "DefGraspNets: Grasp Planning on 3D Fields with Graph Neural Nets." arXiv preprint arXiv:2303.16138 (2023).

Geometry-Aware Diffusion Models for Robotics

In this thesis, you will work on developing an imitation learning algorithm using diffusion models for robotic manipulation tasks, such as the ones in [2, 3, 4], but taking into account the geometry of the task space.

If this sounds interesting, please send an email to [email protected] and [email protected] , and possibly attach your CV, highlighting the relevant courses you took in robotics and machine learning.

What's in it for you:

You get to work on an exciting topic at the intersection of deep-learning and robotics
We will supervise you closely throughout your thesis
Depending on the results, we will aim for an international conference publication

Requirements:

Be motivated -- we will support you a lot, but we expect you to contribute a lot too
Robotics knowledge
Experience setting up deep learning pipelines -- from data collection, architecture design, training, and evaluation
PyTorch -- especially experience writing good parallelizable code (i.e., runs fast in the GPU)

References: [1] https://arxiv.org/abs/2112.10752 [2] https://arxiv.org/abs/2308.01557 [3] https://arxiv.org/abs/2209.03855 [4] https://arxiv.org/abs/2303.04137 [5] https://arxiv.org/abs/2205.09991

Learning Latent Representations for Embodied Agents

Interested students can apply by sending an E-Mail to [email protected] and attaching the required documents mentioned below.

Experience with TensorFlow/PyTorch
Familiarity with core Machine Learning topics
Experience programming/controlling robots (either simulated or real world)
Knowledgeable about different robot platforms (quadrupeds and bipedal robots)
Resume / CV
Cover letter explaining why this topic fits you well and why you are an ideal candidate

References: [1] Ho and Ermon. "Generative adversarial imitation learning" [2] Arenz, et al. "Efficient Gradient-Free Variational Inference using Policy Search"

Characterizing Fear-induced Adaptation of Balance by Inverse Reinforcement Learning

Interested students can apply by sending an E-Mail to [email protected] and attaching the required documents mentioned below.

Basic knowledge of reinforcement learning
Hands-on experience with reinforcement learning or inverse reinforcement learning
Cognitive science background

References: [1] Maki, et al. "Fear of Falling and Postural Performance in the Elderly" [2] Davis et al. "The relationship between fear of falling and human postural control" [3] Ho and Ermon. "Generative adversarial imitation learning"

Timing is Key: CPGs for regularizing Quadruped Gaits learned with DRL

To tackle this problem we want to utilize Central Pattern Generators (CPGs), which can generate timings for ground contacts for the four feet. The policy gets rewarded for complying with the contact patterns of the CPGs. This leads to a straightforward way of regularizing and steering the policy to a natural gait without posing too strong restrictions on it. We first want to manually find fitting CPG parameters for different gait velocities and later move to learning those parameters in an end-to-end fashion.
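The contact-timing idea above can be pictured with a very simplified sketch. It assumes a plain phase oscillator per foot rather than a full CPG model, a hypothetical foot order (front-left, front-right, rear-left, rear-right), and a reward equal to the fraction of feet whose contact state matches the desired pattern. None of these choices come from the project description.

```python
def cpg_contacts(t, frequency, offsets, duty_factor):
    """Desired ground contact per foot at time t (seconds).

    Each foot has a phase in [0, 1) that advances at `frequency` Hz,
    shifted by its own offset. The foot should be on the ground while
    its phase is below `duty_factor`.
    """
    return [((frequency * t + off) % 1.0) < duty_factor for off in offsets]


# Hypothetical trot pattern: diagonal feet share a phase.
# Assumed foot order: front-left, front-right, rear-left, rear-right.
TROT_OFFSETS = (0.0, 0.5, 0.5, 0.0)


def contact_reward(desired, actual):
    """Fraction of feet whose measured contact matches the desired contact."""
    matches = sum(d == a for d, a in zip(desired, actual))
    return matches / len(desired)


desired = cpg_contacts(t=0.0, frequency=2.0, offsets=TROT_OFFSETS, duty_factor=0.5)
reward = contact_reward(desired, actual=[True, True, False, True])
```

In this sketch the manually tuned CPG parameters would be `frequency`, `offsets` and `duty_factor`, for example one set per target velocity. Learning them end to end would mean treating them as trainable quantities instead of constants.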

Highly motivated students can apply by sending an E-Mail to [email protected] and attaching the required documents mentioned below.

Minimum Qualification:

Good Python programming skills
Basic knowledge of the PyTorch library
Basic knowledge of Reinforcement Learning
Good knowledge of the PyTorch library
Basic knowledge of the MuJoCo simulator

References: [1] Cheng, Xuxin, et al. "Extreme Parkour with Legged Robots."

Damage-aware Reinforcement Learning for Deformable and Fragile Objects

The goal of this thesis is the development and application of a model-based reinforcement learning method on real robots. Your tasks will include: 1. Setting up a simulation environment for deformable object manipulation 2. Utilizing existing models for stress and deformability prediction [1] 3. Implementing a reinforcement learning method that works in simulation and, if possible, on the real robot.

If you are interested in this thesis topic and believe you possess the necessary skills and qualifications, please submit your application, including a resume and a brief motivation letter explaining your interest and relevant experience. Please send your application to [email protected].

Required Qualification :

Enthusiasm for and experience in robotics, machine learning, and simulation
Strong programming skills in Python

Desired Qualification :

Attendance of the lectures "Statistical Machine Learning", "Computational Engineering and Robotics" and (optionally) "Robot Learning"

References: [1] Huang, I., Narang, Y., Bajcsy, R., Ramos, F., Hermans, T., & Fox, D. (2023). DefGraspNets: Grasp Planning on 3D Fields with Graph Neural Nets. arXiv preprint arXiv:2303.16138.

Imitation Learning meets Diffusion Models for Robotics

The objective of this thesis is to build upon prior research [2, 3] to establish a connection between Diffusion Models and Imitation Learning. We aim to explore how to exploit Diffusion Models and improve the performance of Imitation learning algorithms that interact with the world.

We welcome highly motivated students to apply for this opportunity by sending an email expressing their interest to Firas Al-Hafez ( [email protected] ) and Julen Urain ( [email protected] ). Please attach your letter of motivation and CV, and clearly state why you are interested in this topic and why you are the ideal candidate for this position.

Required Qualification : 1. Strong Python programming skills 2. Basic Knowledge in Imitation Learning 3. Interest in Diffusion models, Reinforcement Learning

Desired Qualification : 1. Attendance of the lectures "Statistical Machine Learning", "Computational Engineering and Robotics" and/or "Reinforcement Learning: From Fundamentals to the Deep Approaches"

References: [1] Song, Yang, and Stefano Ermon. "Generative modeling by estimating gradients of the data distribution." Advances in neural information processing systems 32 (2019). [2] Ho, Jonathan, and Stefano Ermon. "Generative adversarial imitation learning." Advances in neural information processing systems 29 (2016). [3] Garg, D., Chakraborty, S., Cundy, C., Song, J., & Ermon, S. (2021). Iq-learn: Inverse soft-q learning for imitation. Advances in Neural Information Processing Systems, 34, 4028-4039. [4] Chen, R. T., & Lipman, Y. (2023). Riemannian flow matching on general geometries. arXiv preprint arXiv:2302.03660.

Be extremely motivated -- we will support you a lot, but we expect you to contribute a lot too

Scaling Behavior Cloning to Humanoid Locomotion

Scope: Bachelor / Master thesis Advisor: Joe Watson Added: 2023-10-07 Start: ASAP Topic: In a previous project [1], I found that behavior cloning (BC) was a surprisingly poor baseline for imitating humanoid locomotion. I suspect the issue may lie in the challenges of regularizing high-dimensional regression.

The goal of this project is to investigate BC for humanoid imitation, understand the scaling issues present, and evaluate possible solutions, e.g. regularization strategies from the regression literature.

The project will build on Google DeepMind's Acme library [2], which already implements BC algorithms and humanoid demonstration datasets [3] and will serve as the foundation of the project.

To apply, email [email protected] , ideally with a CV and transcript so I can assess your suitability.

Experience, interest and enthusiasm for the intersection of robot learning and machine learning
Experience with Acme and JAX would be a benefit, but not necessary

References: [1] https://arxiv.org/abs/2305.16498 [2] https://github.com/google-deepmind/acme [3] https://arxiv.org/abs/2106.00672

Robot Gaze for Communicating Collision Avoidance Intent in Shared Workspaces

Scope: Bachelor/Master thesis Advisor: Alap Kshirsagar , Dorothea Koert Added: 2023-09-27 Start: ASAP

Topic: In order to operate close to non-experts, future robots require both an intuitive form of instruction accessible to lay users and the ability to react appropriately to a human co-worker. Instruction by imitation learning with probabilistic movement primitives (ProMPs) [1] allows capturing tasks by learning robot trajectories from demonstrations including the motion variability. However, appropriate responses to human co-workers during the execution of the learned movements are crucial for fluent task execution, perceived safety, and subjective comfort. To facilitate such appropriate responsive behaviors in human-robot interaction, the robot needs to be able to react to its human workspace co-inhabitant online during the execution. Also, the robot needs to communicate its motion intent to the human through non-verbal gestures such as eye and head gazes [2][3]. In particular for humanoid robots, combining motions of arms with expressive head and gaze directions is a promising approach that has not yet been extensively studied in related work.

Goals of the thesis:

Develop a method to combine robot head/gaze motion with ProMPs for online collision avoidance
Implement the method on a Franka-Emika Panda Robot
Evaluate and compare the implemented behaviors in a study with human participants

Highly motivated students can apply by sending an email to [email protected]. Please attach your CV and transcript, and clearly state your prior experiences and why you are interested in this topic.

Strong programming skills in Python
Prior experience with Robot Operating System (ROS) and user studies would be beneficial
Strong motivation for human-centered robotics including design and implementation of a user study

References : [1] Koert, Dorothea, et al. "Learning intention aware online adaptation of movement primitives." IEEE Robotics and Automation Letters 4.4 (2019): 3719-3726. [2] Admoni, Henny, and Brian Scassellati. "Social eye gaze in human-robot interaction: a review." Journal of Human-Robot Interaction 6.1 (2017): 25-63. [3] Lemasurier, Gregory, et al. "Methods for expressing robot intent for human–robot collaboration in shared workspaces." ACM Transactions on Human-Robot Interaction (THRI) 10.4 (2021): 1-27.

Tactile Sensing for the Real World

Topic: Tactile sensing is a crucial sensing modality that allows humans to perform dexterous manipulation[1]. In recent years, the development of artificial tactile sensors has made substantial progress, with current models relying on cameras inside the fingertips to extract information about the points of contact [2]. However, robotic tactile sensing is still a largely unsolved topic despite these developments. A central challenge of tactile sensing is the extraction of usable representations of sensor readings, especially since these generally contain an incomplete view of the environment.

Recent model-based reinforcement learning methods like Dreamer [3] leverage latent state-space models to reason about the environment from partial and noisy observations. However, more work has yet to be done to apply such methods to real-world manipulation tasks. Hence, this thesis will explore whether Dreamer can solve challenging real-world manipulation tasks by leveraging tactile information. Initial results suggest that tasks like peg-in-a-hole can indeed be solved with Dreamer in simulation (see figure above), but the applicability of this method in the real world has yet to be shown.

In this work, you will work with state-of-the-art hardware and compute resources on a hot research topic with the option of publishing your work at a scientific conference.

Highly motivated students can apply by sending an email to [email protected]. Please attach a transcript of records and clearly state your prior experiences and why you are interested in this topic.

Ideally experience with deep learning libraries like JAX or PyTorch
Experience with reinforcement learning is a plus
Experience with Linux

References [1] 2S Match Anest2, Roland Johansson Lab (2005), https://www.youtube.com/watch?v=HH6QD0MgqDQ [2] Gelsight Inc., Gelsight Mini, https://www.gelsight.com/gelsightmini/ [3] Hafner, D., Lillicrap, T., Ba, J., & Norouzi, M. (2019). Dream to control: Learning behaviors by latent imagination. arXiv preprint arXiv:1912.01603.

Large Vision-Language Neural Networks for Open-Vocabulary Robotic Manipulation

Robots are expected to soon leave their factory/laboratory enclosures and operate autonomously in everyday unstructured environments such as households. Semantic information is especially important when considering real-world robotic applications where the robot needs to re-arrange objects as per a set of language instructions or human inputs (as shown in the figure). Many sophisticated semantic segmentation networks exist [1]. However, a challenge when using such methods in the real world is that the semantic classes rarely align perfectly with the language input received by the robot. For instance, a human language instruction might request a ‘glass’ or ‘water’, but the semantic classes detected might be ‘cup’ or ‘drink’.

Nevertheless, with the rise of large language and vision-language models, we now have capable segmentation models that do not directly predict semantic classes but use learned associations between language queries and classes to give us ’open-vocabulary’ segmentation [2]. Some models are especially powerful since they can be used with arbitrary language queries.

In this thesis, we aim to build on advances in 3D vision-based robot manipulation and large open-vocabulary vision models [2] to build a full pick-and-place pipeline for real-world manipulation. We also aim to find synergies between scene reconstruction and semantic segmentation to determine if knowing the object semantics can aid the reconstruction of the objects and, in turn, aid manipulation.

Highly motivated students can apply by sending an e-mail expressing their interest to Snehal Jauhri (email: [email protected]) or Ali Younes (email: [email protected]), attaching your letter of motivation and possibly your CV.

Topic in detail : Thesis_Doc.pdf

Requirements: Enthusiasm, ambition, and a curious mind go a long way. There will be ample supervision provided to help the student understand basic as well as advanced concepts. However, prior knowledge of computer vision, robotics, and Python programming would be a plus.

References: [1] Y. Wu, A. Kirillov, F. Massa, W.-Y. Lo, and R. Girshick, “Detectron2”, https://github.com/facebookresearch/detectron2 , 2019. [2] F. Liang, B. Wu, X. Dai, K. Li, Y. Zhao, H. Zhang, P. Zhang, P. Vajda, and D. Marculescu, “Open-vocabulary semantic segmentation with mask-adapted clip,” in CVPR, 2023, pp. 7061–7070, https://github.com/facebookresearch/ov-seg

Dynamic Tiles for Deep Reinforcement Learning

Linear approximators in Reinforcement Learning are well-studied and come with an in-depth theoretical analysis. However, linear methods require defining a set of features of the state to be used by the linear approximation. Unfortunately, the feature construction process is a particularly problematic and challenging task. Deep Reinforcement learning methods have been introduced to mitigate the feature construction problem: these methods do not require handcrafted features, as features are extracted automatically by the network during learning, using gradient descent techniques.

In simple reinforcement learning tasks, however, it is possible to use tile coding as features: tiles are simply a convenient discretization of the state space that allows us to easily control the generalization capabilities of the linear approximator. The objective of this thesis is to design a novel algorithm for automatic feature extraction that generates a set of features similar to tile coding, but that can partition the state space arbitrarily and deal with arbitrarily complex state spaces, such as images. The idea is to combine the feature extraction problem directly with Linear Reinforcement Learning methods, defining an algorithm that has both the theoretical guarantees and good convergence properties of these methods and the flexibility of Deep Learning approaches.
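As background, here is a minimal sketch of classic, fixed tile coding with a linear value estimate. It illustrates the handcrafted baseline that the thesis aims to generalize, not the proposed learned tiles. The function names, the uniform per-tiling offset and the example bounds are assumptions made for illustration.

```python
def tile_indices(state, lows, highs, tiles_per_dim, n_tilings):
    """Return the index of the active tile in each tiling for a continuous state.

    Each tiling is a uniform grid over the state space, shifted by a
    fraction of one tile width so that different tilings overlap.
    """
    coords_per_dim = tiles_per_dim + 1  # one extra tile covers the shifted edge
    tiles_per_tiling = coords_per_dim ** len(state)
    active = []
    for tiling in range(n_tilings):
        index = 0
        for x, lo, hi in zip(state, lows, highs):
            scaled = (x - lo) / (hi - lo) * tiles_per_dim
            shifted = scaled + tiling / n_tilings
            coord = max(0, min(int(shifted), tiles_per_dim))  # clip to the grid
            index = index * coords_per_dim + coord
        active.append(tiling * tiles_per_tiling + index)
    return active


def linear_value(weights, active):
    """Linear value estimate: the sum of the weights of the active tiles."""
    return sum(weights[i] for i in active)


# One-dimensional example on [0, 1] with 4 tiles per tiling and 2 tilings.
near_a = tile_indices([0.0], [0.0], [1.0], tiles_per_dim=4, n_tilings=2)
near_b = tile_indices([0.2], [0.0], [1.0], tiles_per_dim=4, n_tilings=2)
weights = [0.0] * (2 * 5)
value = linear_value(weights, near_a)
```

The two nearby states share a tile in the first tiling but not in the second, which is how the number and offset of tilings control generalization.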

Curriculum Vitae (CV);
A motivation letter explaining the reason for applying for this thesis and academic/career objectives.

Minimum knowledge

Good Python programming skills;
Basic knowledge of Reinforcement Learning.

Preferred knowledge

Knowledge of the PyTorch library;
Knowledge of the Atari environments (ale-py library).
Knowledge of the MushroomRL library.

Accepted candidate will

Define a generalization of tile coding working with an arbitrary input set (including images);
Design a learning algorithm to adapt the tiles using data of interaction with the environment;
Combine feature learning with standard linear methods for Reinforcement Learning;
Verify the novel methodology in simple continuous state and discrete actions environments;
(Optionally) Extend the experimental analysis to the Atari environment setting.

Deep Learning Meets Teleoperation: Constructing Learnable and Stable Inductive Guidance for Shared Control

This work considers policies as learnable inductive guidance for shared control. In particular, we use the class of Riemannian motion policies [3] and consider them as differentiable optimization layers [4]. We analyze (i) if RMPs can be pre-trained by learning from demonstrations [5] or reinforcement learning [6] given a specific context; (ii) and subsequently employed seamlessly for human-guided teleoperation thanks to their physically consistent properties, such as stability [3]. We believe this step eliminates the laborious process of constructing complex policies and leads to improved and generalizable shared control architectures.

Highly motivated students can apply by sending an e-mail expressing your interest to [email protected] and [email protected] , attaching your letter of motivation and possibly your CV.

Experience with deep learning libraries (in particular PyTorch)
Knowledge in reinforcement learning and/or machine learning

References: [1] Niemeyer, Günter, et al. "Telerobotics." Springer handbook of robotics (2016); [2] Selvaggio, Mario, et al. "Autonomy in physical human-robot interaction: A brief survey." IEEE RAL (2021); [3] Cheng, Ching-An, et al. "RMP flow: A Computational Graph for Automatic Motion Policy Generation." Springer (2020); [4] Jaquier, Noémie, et al. "Learning to sequence and blend robot skills via differentiable optimization." IEEE RAL (2022); [5] Mukadam, Mustafa, et al. "Riemannian motion policy fusion through learnable lyapunov function reshaping." CoRL (2020); [6] Xie, Mandy, et al. "Neural geometric fabrics: Efficiently learning high-dimensional policies from demonstration." CoRL (2023).

Dynamic symphony: Seamless human-robot collaboration through hierarchical policy blending

This work focuses on arbitration between the user and assistive policy, i.e., shared autonomy. Various works allow the user to influence the dynamic behavior explicitly and therefore cannot satisfy stability guarantees [3]. We pursue the idea of formulating arbitration as a trajectory-tracking problem that implicitly considers the user's desired behavior as an objective [4]. To this end, we extend the work of Hansel et al. [5], who employed probabilistic inference for policy blending in robot motion control. The proposed method corresponds to a sampling-based online planner that superposes reactive policies given a predefined objective. This method enables the user to implicitly influence the behavior without injecting energy into the system, thus satisfying stability properties. We believe this step leads to an alternative view of shared autonomy with an improved and generalizable framework.

Highly motivated students can apply by sending an e-mail expressing your interest to [email protected] or [email protected] , attaching your letter of motivation and possibly your CV.

References: [1] Niemeyer, Günter, et al. "Telerobotics." Springer handbook of robotics (2016); [2] Selvaggio, Mario, et al. "Autonomy in physical human-robot interaction: A brief survey." IEEE RAL (2021); [3] Dragan, Anca D., and Siddhartha S. Srinivasa. "A policy-blending formalism for shared control." IJRR (2013); [4] Javdani, Shervin, et al. "Shared autonomy via hindsight optimization for teleoperation and teaming." IJRR (2018); [5] Hansel, Kay, et al. "Hierarchical Policy Blending as Inference for Reactive Robot Control." IEEE ICRA (2023).

Feeling the Heat: Igniting Matches via Tactile Sensing and Human Demonstrations

In this thesis, we want to investigate the effectiveness of vision-based tactile sensors for solving dynamic tasks (igniting matches). Since the whole task is difficult to simulate, we directly collect real-world data to learn policies from the human demonstrations [2,3]. We believe that this work is an important step towards more advanced tactile skills.

Highly motivated students can apply by sending an e-mail expressing your interest to [email protected] and [email protected] , attaching your letter of motivation and possibly your CV.

Good knowledge of Python
Prior experience with real robots and Linux is a plus

References: [1] https://www.youtube.com/watch?v=HH6QD0MgqDQ [2] Learning Compliant Manipulation through Kinesthetic and Tactile Human-Robot Interaction; Klas Kronander and Aude Billard. [3] https://www.youtube.com/watch?v=jAtNvfPrKH8

Inverse Reinforcement Learning for Neuromuscular Control of Humanoids

Within this thesis, the problems of learning from observations and efficient exploration in overactuated systems should be addressed. Regarding the former, novel methods incorporating inverse dynamics models into the inverse reinforcement learning problem [1] should be adapted and applied. To address the problem of efficient exploration in overactuated systems, two approaches should be implemented and compared. The first approach uses a handcrafted action space, which disables and modulates actions in different phases of the gait based on biomechanics knowledge [2]. The second approach uses a stateful policy to incorporate an inductive bias into the policy [3]. The thesis will be supervised in conjunction with Guoping Zhao ( [email protected] ) from the locomotion lab.

Highly motivated students can apply by sending an e-mail expressing their interest to Firas Al-Hafez ( [email protected] ), attaching their letter of motivation and possibly their CV. Try to make clear why you would like to work on this topic and why you would be the perfect candidate for it.

Required Qualification : 1. Strong Python programming skills 2. Knowledge in Reinforcement Learning 3. Interest in understanding human locomotion

Desired Qualification : 1. Hands-on experience on robotics-related RL projects 2. Prior experience with different simulators 3. Attendance of the lectures "Statistical Machine Learning", "Computational Engineering and Robotics" and/or "Reinforcement Learning: From Fundamentals to the Deep Approaches"

References: [1] Al-Hafez, F.; Tateo, D.; Arenz, O.; Zhao, G.; Peters, J. (2023). LS-IQ: Implicit Reward Regularization for Inverse Reinforcement Learning, International Conference on Learning Representations (ICLR). [2] Ong, C.F.; Geijtenbeek, T.; Hicks, J.L.; Delp, S.L. (2019). Predicting gait adaptations due to ankle plantarflexor muscle weakness and contracture using physics-based musculoskeletal simulations, PLoS Computational Biology. [3] Srouji, M.; Zhang, J.; Salakhutdinov, R. (2018). Structured Control Nets for Deep Reinforcement Learning, International Conference on Machine Learning (ICML).

Robotic Tactile Exploratory Procedures for Identifying Object Properties

Goals of the thesis

Literature review of robotic EPs for identifying object properties [2,3,4]
Develop and implement robotic EPs for a Digit tactile sensor
Compare performance of robotic EPs with human EPs

Desired Qualifications

Interested in working with real robotic systems
Python programming skills

Literature [1] Lederman and Klatzky, “Haptic perception: a tutorial” [2] Seminara et al., “Active Haptic Perception in Robots: A Review” [3] Chu et al., “Using robotic exploratory procedures to learn the meaning of haptic adjectives” [4] Kerzel et al., “Neuro-Robotic Haptic Object Classification by Active Exploration on a Novel Dataset”

Scaling learned, graph-based assembly policies

scaling our previous methods to incorporate mobile manipulators or the Kobo bi-manual manipulation platform. The increased workspace of both would allow for handling a wider range of objects
the approach in [2] has been shown to be more powerful, yet it requires solving a MILP (mixed-integer linear program) for every desired structure. Another idea is therefore to investigate approaches that approximate this solution
adapting the methods to handle more irregular-shaped objects / investigate curriculum learning

Highly motivated students can apply by sending an e-mail expressing your interest to [email protected] , attaching your letter of motivation and possibly your CV.

Experience with deep learning libraries (in particular PyTorch) is a plus
Experience with reinforcement learning / having taken Robot Learning is also a plus

References: [1] Learn2Assemble with Structured Representations and Search for Robotic Architectural Construction; Niklas Funk et al. [2] Graph-based Reinforcement Learning meets Mixed Integer Programs: An application to 3D robot assembly discovery; Niklas Funk et al. [3] Structured agents for physical construction; Victor Bapst et al.

Long-Horizon Manipulation Tasks from Visual Imitation Learning (LHMT-VIL): Algorithm

The proposed architecture can be broken down into the following sub-tasks: 1. Multi-object 6D pose estimation from video: Identify the object 6D poses in each video frame to generate the object trajectories 2. Action segmentation from video: Classify the action being performed in each video frame 3. High-level task representation learning: Learn the sequence of robotic movement primitives with the associated object poses such that the robot completes the demonstrated task 4. Low-level movement primitives: Create a database of low-level robotic movement primitives which can be sequenced to solve the long-horizon task

Desired Qualification: 1. Strong Python programming skills 2. Prior experience in Computer Vision and/or Robotics is preferred

Long-Horizon Manipulation Tasks from Visual Imitation Learning (LHMT-VIL): Dataset

During the project, we will create a large-scale dataset of videos of humans demonstrating industrial assembly sequences. The dataset will contain information on the 6D poses of the objects, the hand and body poses of the human, and the action sequences, among numerous other features. The dataset will be open-sourced to encourage further research on VIL.

[1] F. Sener, et al. "Assembly101: A Large-Scale Multi-View Video Dataset for Understanding Procedural Activities". CVPR 2022. [2] P. Sharma, et al. "Multiple Interactions Made Easy (MIME) : Large Scale Demonstrations Data for Imitation." CoRL, 2018.

Adaptive Human-Robot Interactions with Human Trust Maximization

Good knowledge of Python and/or C++;
Good knowledge in Robotics and Machine Learning;
Good knowledge of Deep Learning frameworks, e.g., PyTorch;

References: [1] Xu, Anqi, and Gregory Dudek. "Optimo: Online probabilistic trust inference model for asymmetric human-robot collaborations." ACM/IEEE HRI, IEEE, 2015; [2] Kwon, Minae, et al. "When humans aren’t optimal: Robots that collaborate with risk-aware humans." ACM/IEEE HRI, IEEE, 2020; [3] Chen, Min, et al. "Planning with trust for human-robot collaboration." ACM/IEEE HRI, IEEE, 2018; [4] Poole, Ben et al. “On variational bounds of mutual information”. ICML, PMLR, 2019.

Causal inference of human behavior dynamics for physical Human-Robot Interactions

Highly motivated students can apply by sending an e-mail expressing their interest to [email protected] , attaching a letter of motivation and possibly a CV.

Good knowledge of Robotics;
Good knowledge of Deep Learning frameworks, e.g., PyTorch
Li, Q., Chalvatzaki, G., Peters, J., Wang, Y., Directed Acyclic Graph Neural Network for Human Motion Prediction, 2021 IEEE International Conference on Robotics and Automation (ICRA).
Löwe, S., Madras, D., Zemel, R. and Welling, M., 2020. Amortized causal discovery: Learning to infer causal graphs from time-series data. arXiv preprint arXiv:2006.10833.
Yang, W., Paxton, C., Mousavian, A., Chao, Y.W., Cakmak, M. and Fox, D., 2020. Reactive human-to-robot handovers of arbitrary objects. arXiv preprint arXiv:2011.08961.

Incorporating First and Second Order Mental Models for Human-Robot Cooperative Manipulation Under Partial Observability

Scope: Master Thesis Advisor: Dorothea Koert , Joni Pajarinen Added: 2021-06-08 Start: ASAP

The ability to model the beliefs and goals of a partner is an essential part of cooperative tasks. While humans develop theory of mind models for this purpose at a very early age [1], it is still an open question how to implement and make use of such models for cooperative robots [2,3,4]. In particular, in shared workspaces human-robot collaboration could potentially profit from the use of such models, e.g. if the robot can detect and react to planned human goals or a human's false beliefs during task execution. To make such robots a reality, the goal of this thesis is to investigate the use of first and second order mental models in a cooperative manipulation task under partial observability. Partially observable Markov decision processes (POMDPs) and interactive POMDPs (I-POMDPs) [5] define an optimal solution to the mental modeling task and may provide a solid theoretical basis for modelling. The thesis may also compare related approaches from the literature and set up an experimental design for evaluation with the bi-manual robot platform Kobo.
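To make the POMDP framing concrete, the following sketch shows the standard discrete belief update (a Bayes filter) that such models rest on. The two-state example, in which the hidden state stands for which of two goals the human is pursuing, and all probabilities are hypothetical. The sketch does not cover the nested beliefs of an I-POMDP.

```python
def belief_update(belief, transition, observation, action, obs):
    """Discrete POMDP belief update.

    belief[s]                 : current probability of hidden state s
    transition[a][s][s2]      : P(s2 | s, a)
    observation[a][s2][o]     : P(o | s2, a)
    Returns the posterior belief after taking `action` and observing `obs`.
    """
    n = len(belief)
    # Prediction step: push the belief through the transition model.
    predicted = [
        sum(transition[action][s][s2] * belief[s] for s in range(n))
        for s2 in range(n)
    ]
    # Correction step: weight by the observation likelihood and normalize.
    unnormalized = [observation[action][s2][obs] * predicted[s2] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [u / total for u in unnormalized]


# Hypothetical example: the human's goal (state 0 or 1) does not change,
# and the robot's sensor reports the correct goal 80% of the time.
T = [[[1.0, 0.0], [0.0, 1.0]]]  # single action, identity transition
O = [[[0.8, 0.2], [0.2, 0.8]]]  # single action, noisy goal observation
posterior = belief_update([0.5, 0.5], T, O, action=0, obs=0)
```

A first-order mental model in this picture is the robot's belief over the human's goal. A second-order model would additionally track what the human believes about the robot, which is the part I-POMDPs formalize.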

Highly motivated students can apply by sending an e-mail expressing your interest to [email protected] attaching your CV and transcripts.

References:

Wimmer, H., & Perner, J. Beliefs about beliefs: Representation and constraining function of wrong beliefs in young children's understanding of deception (1983)
Sandra Devin and Rachid Alami. An implemented theory of mind to improve human-robot shared plans execution (2016)
Neil Rabinowitz, Frank Perbet, Francis Song, Chiyuan Zhang, SM Ali Eslami,and Matthew Botvinick. Machine theory of mind (2018)
Connor Brooks and Daniel Szafir. Building second-order mental models for human-robot interaction. (2019)
Prashant Doshi, Xia Qu, Adam Goodie, and Diana Young. Modeling recursive reasoning by humans using empirically informed interactive pomdps. (2010)

[P] Suggestion for Master's Thesis Project in Computer Vision

Hello, I'm currently a university student entering into their Master's in Computer Science. I am heavily interested in the field of Deep Learning and in particular, Computer Vision.

For my Bachelor's dissertation I worked on the classification of Alzheimer's Disease using MRI scans, and I heavily enjoyed the process. I find the field of computer vision interesting, not just due to the applications, but due to interest in the methods and algorithms behind it. However, the main issue that I had with my Bachelor's project is that ultimately it felt quite amateurish and not as cutting edge as I would want - I want to work on something that is somewhat new, with the possibility of creating a truly great paper.

I've had many ideas of different routes to go down, however I'm not fully sure what would be the best route for me, and if there's areas in the field of computer vision I am neglecting. In particular I have been interested in applications of Visual Transformers as I've recently learnt about them, however I am interested in any project or ideas within the field of computer vision, regardless of if they employ Visual Transformers or not.

Some ideas I was considering:

- 3D Object Generation using Visual Transformers: Would use ShapeNet datasets to generate 3D objects, employing the visual transformer architecture. This could be difficult (dataset size constraints); however, my supervisor has experience with her PhD students working on this.

- Brain MRI Upscaling using Visual Transformers: Self-explanatory. I was inspired to do this project while working on my bachelor's thesis. My main concern is that it could be very out of my depth and too much for me to do in one year.

- Bone fracture classification & segmentation (custom dataset): As part of my supervisor's research, she has commissioned a custom bone fracture dataset, which would have a segmentation and classification task. This could be implemented with a number of different algorithms. This is very interesting to me, as working with such an exclusive dataset is very cool. However, I'm not sure this is the part of the field I'm most interested in, though it is likely a more doable project.

This list is limited by what I know, so if there are any fields on the cutting edge that could be interesting, please do leave a comment. Thanks!

