Research Themes

My research is in the area of Human-Centered Machine Learning, using Machine Learning, Computer Vision and Signal Processing methodologies to learn, from multiple sources, concepts that enable Intelligent Systems to understand, communicate and collaborate with humans. It currently revolves around three themes:

  • Learning to recognise the behaviour, emotions and cognitive states of people by analysing their images, video and neuro-physiological signals
  • Learning across modalities, and in particular at the intersection of language and vision, using large, pre-trained language and audio-visual models
  • Learning from generative models, and learning to control generation for privacy, interpretability and control purposes

Multimodal Machine Learning (Vision and Language)

This line of work is concerned with learning across modalities, and in particular at the intersection of language and vision, by utilising, fine-tuning and adapting large, pre-trained Language and Vision-Language models.

Key references:
  1. AIM-Fair: Advancing Algorithmic Fairness via Selectively Fine-Tuning Biased Models with Contextual Synthetic Data
    Zengqun Zhao, Ziquan Liu, Yu Cao, Shaogang Gong, and Ioannis Patras
    In IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR), 2025
  2. Get Confused Cautiously: Textual Sequence Memorization Erasure with Selective Entropy Maximization
    Zhaohan Zhang, Ziquan Liu, and Ioannis Patras
    In International Conference on Computational Linguistics (COLING), 2025
  3. Enhancing Zero-Shot Facial Expression Recognition by LLM Knowledge Transfer
    Zengqun Zhao, Yu Cao, Shaogang Gong, and Ioannis Patras
    In IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2025
  4. CemiFace: Center-based Semi-hard Synthetic Face Generation for Face Recognition
    Zhonglin Sun, Siyang Song, Ioannis Patras, and Georgios Tzimiropoulos
    In Advances in Neural Information Processing Systems (NeurIPS), 2024
  5. CLIPCleaner: Cleaning Noisy Labels with CLIP
    Chen Feng, Georgios Tzimiropoulos, and Ioannis Patras
    In ACM International Conference on Multimedia (ACM MM), 2024
  6. A Simple Baseline for Knowledge-Based Visual Question Answering
    Alexandros Xenos, Themos Stafylakis, Ioannis Patras, and Georgios Tzimiropoulos
    In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
  7. Improving Fairness using Vision-Language Driven Image Augmentation
    Moreno D’Incà, Christos Tzelepis, Ioannis Patras, and Nicu Sebe
    In IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024
  8. EmoCLIP: A Vision-Language Method for Zero-Shot Video Facial Expression Recognition
    Niki Maria Foteinopoulou and Ioannis Patras
    In IEEE International Conference on Automatic Face and Gesture Recognition (FG), 2024
  9. Parts of Speech-Grounded Subspaces in Vision-Language Models
    James Oldfield, Christos Tzelepis, Yannis Panagakis, Mihalis A. Nicolaou, and Ioannis Patras
    In Advances in Neural Information Processing Systems (NeurIPS), 2023
  10. Prompting Visual-Language Models for Dynamic Facial Expression Recognition
    Zengqun Zhao and Ioannis Patras
    In British Machine Vision Conference (BMVC), 2023
  11. ContraCLIP: Interpretable GAN generation driven by pairs of contrasting sentences
    Christos Tzelepis, James Oldfield, Georgios Tzimiropoulos, and Ioannis Patras
    arXiv, 2022

Affective Computing

This line of research is concerned with the recognition of the behaviour, emotions and cognitive states of people by analysing their images, video and neuro-physiological signals. A recent line of work extends this to the analysis of mental health conditions, such as schizophrenia and depression.

Key references:
  1. Enhancing Zero-Shot Facial Expression Recognition by LLM Knowledge Transfer
    Zengqun Zhao, Yu Cao, Shaogang Gong, and Ioannis Patras
    In IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2025
  2. EmoCLIP: A Vision-Language Method for Zero-Shot Video Facial Expression Recognition
    Niki Maria Foteinopoulou and Ioannis Patras
    In IEEE International Conference on Automatic Face and Gesture Recognition (FG), 2024
  3. Prompting Visual-Language Models for Dynamic Facial Expression Recognition
    Zengqun Zhao and Ioannis Patras
    In British Machine Vision Conference (BMVC), 2023
  4. AMIGOS: A Dataset for Affect, Personality and Mood Research on Individuals and Groups
    Juan Abdon Miranda Correa, Mojtaba Khomami Abadi, Nicu Sebe, and Ioannis Patras
    IEEE Transactions on Affective Computing, 2021
  5. SchiNet: Automatic Estimation of Symptoms of Schizophrenia from Facial Behaviour Analysis
    Mina Bishay, Petar Palasek, Stefan Priebe, and Ioannis Patras
    IEEE Transactions on Affective Computing, 2021
  6. Pairwise Ranking Network for Affect Recognition
    Georgios Zoumpourlis and Ioannis Patras
    In International Conference on Affective Computing and Intelligent Interaction (ACII), 2021
  7. DEAP: A Database for Emotion Analysis Using Physiological Signals
    Sander Koelstra, Christian Mühl, Mohammad Soleymani, Jong-Seok Lee, Ashkan Yazdani, Touradj Ebrahimi, Thierry Pun, Anton Nijholt, and Ioannis Patras
    IEEE Transactions on Affective Computing, 2012
  8. Learning from Label Relationships in Human Affect
    Niki Maria Foteinopoulou and Ioannis Patras
    In ACM International Conference on Multimedia (ACM MM), 2022

Generation and Learning

This line of research is concerned with learning from generative models and with learning to control generation for privacy, interpretability and control purposes. This includes learning representations in the latent space of generative models so as to control local changes, controlling image generation with natural language, and controlling generation so as to anonymise datasets so that they can be used for training machine learning models in a privacy-preserving manner.

Key references:
  1. AIM-Fair: Advancing Algorithmic Fairness via Selectively Fine-Tuning Biased Models with Contextual Synthetic Data
    Zengqun Zhao, Ziquan Liu, Yu Cao, Shaogang Gong, and Ioannis Patras
    In IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR), 2025
  2. Get Confused Cautiously: Textual Sequence Memorization Erasure with Selective Entropy Maximization
    Zhaohan Zhang, Ziquan Liu, and Ioannis Patras
    In International Conference on Computational Linguistics (COLING), 2025
  3. CemiFace: Center-based Semi-hard Synthetic Face Generation for Face Recognition
    Zhonglin Sun, Siyang Song, Ioannis Patras, and Georgios Tzimiropoulos
    In Advances in Neural Information Processing Systems (NeurIPS), 2024
  4. Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization
    James Oldfield, Markos Georgopoulos, Grigorios G. Chrysos, Christos Tzelepis, Yannis Panagakis, Mihalis A. Nicolaou, Jiankang Deng, and Ioannis Patras
    In Advances in Neural Information Processing Systems (NeurIPS), 2024
  5. LAFS: Landmark-based Facial Self-supervised Learning for Face Recognition
    Zhonglin Sun, Chen Feng, Ioannis Patras, and Georgios Tzimiropoulos
    In IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR), 2024
  6. Self-Supervised Facial Representation Learning with Facial Region Awareness
    Zheng Gao and Ioannis Patras
    In IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR), 2024
  7. Improving Fairness using Vision-Language Driven Image Augmentation
    Moreno D’Incà, Christos Tzelepis, Ioannis Patras, and Nicu Sebe
    In IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024
  8. Parts of Speech-Grounded Subspaces in Vision-Language Models
    James Oldfield, Christos Tzelepis, Yannis Panagakis, Mihalis A. Nicolaou, and Ioannis Patras
    In Advances in Neural Information Processing Systems (NeurIPS), 2023
  9. HyperReenact: One-Shot Reenactment via Jointly Learning to Refine and Retarget Faces
    Stella Bounareli, Christos Tzelepis, Vasileios Argyriou, Ioannis Patras, and Georgios Tzimiropoulos
    In International Conference on Computer Vision (ICCV), 2023
  10. Attribute-preserving Face Dataset Anonymization via Latent Code Optimization
    Simone Barattin, Christos Tzelepis, Ioannis Patras, and Nicu Sebe
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
  11. PandA: Unsupervised Learning of Parts and Appearances in the Feature Maps of GANs
    James Oldfield, Christos Tzelepis, Yannis Panagakis, Mihalis Nicolaou, and Ioannis Patras
    In International Conference on Learning Representations (ICLR), 2023
  12. ContraCLIP: Interpretable GAN generation driven by pairs of contrasting sentences
    Christos Tzelepis, James Oldfield, Georgios Tzimiropoulos, and Ioannis Patras
    arXiv, 2022

Learning with few/no/noisy/uncertain/imprecise annotations

This line of research is concerned with learning in the absence of reliable annotations. This includes self-supervised representation learning, unsupervised learning with clustering objectives, and learning with labels of a different granularity from that of the downstream task.

Key references:
  1. CLIPCleaner: Cleaning Noisy Labels with CLIP
    Chen Feng, Georgios Tzimiropoulos, and Ioannis Patras
    In ACM International Conference on Multimedia (ACM MM), 2024
  2. Efficient Unsupervised Visual Representation Learning with Explicit Cluster Balancing
    Ioannis Maniadis Metaxas, Georgios Tzimiropoulos, and Ioannis Patras
    In European Conference on Computer Vision (ECCV), 2024
  3. NoiseBox: Towards More Efficient and Effective Learning with Noisy Labels
    Chen Feng, Georgios Tzimiropoulos, and Ioannis Patras
    IEEE Transactions on Circuits and Systems for Video Technology, 2024
  4. Self-Supervised Representation Learning with Cross-Context Learning between Global and Hypercolumn Features
    Zheng Gao, Chen Feng, and Ioannis Patras
    In IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024
  5. SimDETR: Simplifying self-supervised pretraining for DETR
    Ioannis Maniadis Metaxas, Adrian Bulat, Ioannis Patras, Brais Martinez, and Georgios Tzimiropoulos
    arXiv, 2023
  6. MaskCon: Masked Contrastive Learning for Coarse-Labelled Dataset
    Chen Feng and Ioannis Patras
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
  7. DivClust: Controlling Diversity in Deep Clustering
    Ioannis Maniadis Metaxas, Georgios Tzimiropoulos, and Ioannis Patras
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
  8. SSR: An Efficient and Robust Framework for Learning with Unknown Label Noise
    Chen Feng, Georgios Tzimiropoulos, and Ioannis Patras
    In British Machine Vision Conference (BMVC), 2022
  9. Adaptive Soft Contrastive Learning
    Chen Feng and Ioannis Patras
    In International Conference on Pattern Recognition (ICPR), 2022
  10. Linear Maximum Margin Classifier for Learning from Uncertain Data
    Christos Tzelepis, Vasileios Mezaris, and Ioannis Patras
    IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018
  11. Unsupervised convolutional neural networks for motion estimation
    Aria Ahmadi and Ioannis Patras
    In IEEE International Conference on Image Processing (ICIP), 2016

Video Understanding

This line of research is concerned with the analysis of video for retrieval, summarisation and activity/action recognition.

Key references:
  1. Video Summarization Using Deep Neural Networks: A Survey
    Evlampios E. Apostolidis, Eleni Adamantidou, Alexandros I. Metsai, Vasileios Mezaris, and Ioannis Patras
    Proceedings of the IEEE, 2021
  2. AC-SUM-GAN: Connecting Actor-Critic and Generative Adversarial Networks for Unsupervised Video Summarization
    Evlampios E. Apostolidis, Eleni Adamantidou, Alexandros I. Metsai, Vasileios Mezaris, and Ioannis Patras
    IEEE Transactions on Circuits and Systems for Video Technology, 2021
  3. Unsupervised Video Summarization via Attention-Driven Adversarial Learning
    Evlampios E. Apostolidis, Eleni Adamantidou, Alexandros I. Metsai, Vasileios Mezaris, and Ioannis Patras
    In International Conference on MultiMedia Modeling (MMM), 2020
  4. FIVR: Fine-Grained Incident Video Retrieval
    Giorgos Kordopatis-Zilos, Symeon Papadopoulos, Ioannis Patras, and Ioannis Kompatsiaris
    IEEE Transactions on Multimedia, 2019
  5. TARN: Temporal Attentive Relation Network for Few-Shot and Zero-Shot Action Recognition
    Mina Bishay, Georgios Zoumpourlis, and Ioannis Patras
    In British Machine Vision Conference (BMVC), 2019
  6. ViSiL: Fine-Grained Spatio-Temporal Video Similarity Learning
    Giorgos Kordopatis-Zilos, Symeon Papadopoulos, Ioannis Patras, and Yiannis Kompatsiaris
    In IEEE/CVF International Conference on Computer Vision (ICCV), 2019