Recent publications
(Also see my Google Scholar and my Queen Mary University publications webpage)
2024
-
NoiseBox: Towards More Efficient and Effective Learning with Noisy Labels. IEEE Transactions on Circuits and Systems for Video Technology, 2024
2021
-
Few-Shot Action Localization without Knowing Boundaries. In ICMR ’21: International Conference on Multimedia Retrieval, Taipei, Taiwan, August 21-24, 2021, 2021
-
MultiMedia Modeling - 27th International Conference, MMM 2021, Prague, Czech Republic, June 22-24, 2021, Proceedings, Part I, 2021
-
MultiMedia Modeling - 27th International Conference, MMM 2021, Prague, Czech Republic, June 22-24, 2021, Proceedings, Part II, 2021
-
S3: Supervised Self-supervised Learning under Label Noise. CoRR, 2021
Multimodal Machine Learning (Vision and Language)
Affective Computing
Generation and Learning
Learning from few samples
2024
-
NoiseBox: Towards More Efficient and Effective Learning with Noisy Labels. IEEE Transactions on Circuits and Systems for Video Technology, 2024
2021
-
S3: Supervised Self-supervised Learning under Label Noise. CoRR, 2021
Video understanding
Older publications
2020
-
Cycle-Consistent Adversarial Networks and Fast Adaptive Bi-dimensional Empirical Mode Decomposition for Style Transfer. In 25th International Conference on Pattern Recognition, ICPR 2020, Virtual Event / Milan, Italy, January 10-15, 2021, 2020
2019
-
Universal Foreground Segmentation Based on Deep Feature Fusion Network for Multi-Scene Videos. IEEE Access, 2019
-
Registration-free Face-SSD: Single shot analysis of smiles, facial attributes, and affect in the wild. Comput. Vis. Image Underst., 2019
-
Implicit and Explicit Concept Relations in Deep Neural Networks for Multi-Label Video/Image Annotation. IEEE Trans. Circuits Syst. Video Technol., 2019
-
FIVR: Fine-Grained Incident Video Retrieval. IEEE Trans. Multim., 2019
-
TARN: Temporal Attentive Relation Network for Few-Shot and Zero-Shot Action Recognition. In 30th British Machine Vision Conference 2019, BMVC 2019, Cardiff, UK, September 9-12, 2019, 2019
-
ViSiL: Fine-Grained Spatio-Temporal Video Similarity Learning. In 2019 IEEE/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Korea (South), October 27 - November 2, 2019, 2019
-
Exploring Feature Representation and Training Strategies in Temporal Action Localization. In 2019 IEEE International Conference on Image Processing, ICIP 2019, Taipei, Taiwan, September 22-25, 2019, 2019
-
Multimodal Video Annotation for Retrieval and Discovery of Newsworthy Video in a News Verification Scenario. In MultiMedia Modeling - 25th International Conference, MMM 2019, Thessaloniki, Greece, January 8-11, 2019, Proceedings, Part I, 2019
-
Detecting Tampered Videos with Multimedia Forensics and Deep Learning. In MultiMedia Modeling - 25th International Conference, MMM 2019, Thessaloniki, Greece, January 8-11, 2019, Proceedings, Part I, 2019
-
VERGE in VBS 2019. In MultiMedia Modeling - 25th International Conference, MMM 2019, Thessaloniki, Greece, January 8-11, 2019, Proceedings, Part II, 2019
-
Video Fragmentation and Reverse Search on the Web. In Video Verification in the Fake News Era, 2019
-
Finding Near-Duplicate Videos in Large-Scale Collections. In Video Verification in the Fake News Era, 2019
-
Finding Semantically Related Videos in Closed Collections. In Video Verification in the Fake News Era, 2019
2018
-
Linear Maximum Margin Classifier for Learning from Uncertain Data. IEEE Trans. Pattern Anal. Mach. Intell., 2018
-
Deep Mixture of MRFs for Human Pose Estimation. In Computer Vision - ACCV 2018 - 14th Asian Conference on Computer Vision, Perth, Australia, December 2-6, 2018, Revised Selected Papers, Part III, 2018
-
LikeNet: A Siamese Motion Estimation Network Trained in an Unsupervised Way. In British Machine Vision Conference 2018, BMVC 2018, Newcastle, UK, September 3-6, 2018, 2018
-
A Multi-Task Cascaded Network for Prediction of Affect, Personality, Mood and Social Context Using EEG Signals. In 13th IEEE International Conference on Automatic Face & Gesture Recognition, FG 2018, Xi’an, China, May 15-19, 2018, 2018
-
Visual and Audio Analysis of Movies Video for Emotion Detection @ Emotional Impact of Movies Task MediaEval 2018. In Working Notes Proceedings of the MediaEval 2018 Workshop, Sophia Antipolis, France, 29-31 October 2018, 2018
-
VERGE in VBS 2018. In MultiMedia Modeling - 24th International Conference, MMM 2018, Bangkok, Thailand, February 5-7, 2018, Proceedings, Part II, 2018
-
Multimedia Processing Essentials. In Personal Multimedia Preservation - Remembering or Forgetting Images and Video, 2018
2017
-
Gaze movement-driven random forests for query clustering in automatic video annotation. Multim. Tools Appl., 2017
-
Background modelling based on generative unet. In 14th IEEE International Conference on Advanced Video and Signal Based Surveillance, AVSS 2017, Lecce, Italy, August 29 - September 1, 2017, 2017
-
Deep Refinement Convolutional Networks for Human Pose Estimation. In 12th IEEE International Conference on Automatic Face & Gesture Recognition, FG 2017, Washington, DC, USA, May 30 - June 3, 2017, 2017
-
Generic to Specific Recognition Models for Membership Analysis in Group Videos. In 12th IEEE International Conference on Automatic Face & Gesture Recognition, FG 2017, Washington, DC, USA, May 30 - June 3, 2017, 2017
-
Fusing Multilabel Deep Networks for Facial Action Unit Detection. In 12th IEEE International Conference on Automatic Face & Gesture Recognition, FG 2017, Washington, DC, USA, May 30 - June 3, 2017, 2017
-
Deep Globally Constrained MRFs for Human Pose Estimation. In IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy, October 22-29, 2017, 2017
-
Near-Duplicate Video Retrieval with Deep Metric Learning. In 2017 IEEE International Conference on Computer Vision Workshops, ICCV Workshops 2017, Venice, Italy, October 22-29, 2017, 2017
-
SmileNet: Registration-Free Smiling Face Detection In The Wild. In 2017 IEEE International Conference on Computer Vision Workshops, ICCV Workshops 2017, Venice, Italy, October 22-29, 2017, 2017
-
Concept Language Models and Event-based Concept Number Selection for Zero-example Event Detection. In Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, ICMR 2017, Bucharest, Romania, June 6-9, 2017, 2017
-
Query and Keyframe Representations for Ad-hoc Video Search. In Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, ICMR 2017, Bucharest, Romania, June 6-9, 2017, 2017
-
VideoAnalysis4ALL: An On-line Tool for the Automatic Fragmentation and Concept-based Annotation, and the Interactive Exploration of Videos. In Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, ICMR 2017, Bucharest, Romania, June 6-9, 2017, 2017
-
Comparison of Fine-Tuning and Extension Strategies for Deep Convolutional Neural Networks. In MultiMedia Modeling - 23rd International Conference, MMM 2017, Reykjavik, Iceland, January 4-6, 2017, Proceedings, Part I, 2017
-
Near-Duplicate Video Retrieval by Aggregating Intermediate CNN Layers. In MultiMedia Modeling - 23rd International Conference, MMM 2017, Reykjavik, Iceland, January 4-6, 2017, Proceedings, Part I, 2017
-
VERGE in VBS 2017. In MultiMedia Modeling - 23rd International Conference, MMM 2017, Reykjavik, Iceland, January 4-6, 2017, Proceedings, Part II, 2017
-
ITI-CERTH participation in TRECVID 2017. In 2017 TREC Video Retrieval Evaluation, TRECVID 2017, Gaithersburg, MD, USA, November 13-15, 2017, 2017
2016
-
Special Issue on Individual and Group Activities in Video Event Analysis. Comput. Vis. Image Underst., 2016
-
Action recognition using saliency learned from recorded human gaze. Image Vis. Comput., 2016
-
Learning to detect video events from zero or very few video examples. Image Vis. Comput., 2016
-
Automatic Recognition of Emotions and Membership in Group Videos. In 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPR Workshops 2016, Las Vegas, NV, USA, June 26 - July 1, 2016, 2016
-
Online multi-task learning for semantic concept detection in video. In 2016 IEEE International Conference on Image Processing, ICIP 2016, Phoenix, AZ, USA, September 25-28, 2016, 2016
-
Unsupervised convolutional neural networks for motion estimation. In 2016 IEEE International Conference on Image Processing, ICIP 2016, Phoenix, AZ, USA, September 25-28, 2016, 2016
-
Video aesthetic quality assessment using kernel Support Vector Machine with isotropic Gaussian sample uncertainty (KSVM-IGSU). In 2016 IEEE International Conference on Image Processing, ICIP 2016, Phoenix, AZ, USA, September 25-28, 2016, 2016
-
Minimal filtered channel features for pedestrian detection. In 23rd International Conference on Pattern Recognition, ICPR 2016, Cancún, Mexico, December 4-8, 2016, 2016
-
Action Recognition Using Convolutional Restricted Boltzmann Machines. In Proceedings of the 1st International Workshop on Multimedia Analysis and Retrieval for Multimodal Interaction, MARMI@ICMR 2016, New York, New York, USA, June 6, 2016, 2016
-
Deep Multi-task Learning with Label Correlation Constraint for Video Concept Detection. In Proceedings of the 2016 ACM Conference on Multimedia Conference, MM 2016, Amsterdam, The Netherlands, October 15-19, 2016, 2016
-
Alone versus In-a-group: A Comparative Analysis of Facial Affect Recognition. In Proceedings of the 2016 ACM Conference on Multimedia Conference, MM 2016, Amsterdam, The Netherlands, October 15-19, 2016, 2016
-
Video Event Detection Using Kernel Support Vector Machine with Isotropic Gaussian Sample Uncertainty (KSVM-iGSU). In MultiMedia Modeling - 22nd International Conference, MMM 2016, Miami, FL, USA, January 4-6, 2016, Proceedings, Part I, 2016
-
VERGE: A Multimodal Interactive Search Engine for Video Browsing and Retrieval. In MultiMedia Modeling - 22nd International Conference, MMM 2016, Miami, FL, USA, January 4-6, 2016, Proceedings, Part II, 2016
-
Ordering of Visual Descriptors in a Classifier Cascade Towards Improved Video Concept Detection. In MultiMedia Modeling - 22nd International Conference, MMM 2016, Miami, FL, USA, January 4-6, 2016, Proceedings, Part I, 2016
-
ITI-CERTH participation to TRECVID 2016. In 2016 TREC Video Retrieval Evaluation, TRECVID 2016, Gaithersburg, MD, USA, November 14-16, 2016, 2016
2015
-
Cascade of forests for face alignment. IET Comput. Vis., 2015
-
Random Subspace Supervised Descent Method for Regression Problems in Computer Vision. IEEE Signal Process. Lett., 2015
-
DECAF: MEG-Based Multimodal Database for Decoding Affective Physiological Responses. IEEE Trans. Affect. Comput., 2015
-
Privileged Information-Based Conditional Structured Output Regression Forest for Facial Point Detection. IEEE Trans. Circuits Syst. Video Technol., 2015
-
Local Features and a Two-Layer Stacking Architecture for Semantic Concept Detection in Video. IEEE Trans. Emerg. Top. Comput., 2015
-
Fine-Tuning Regression Forests Votes for Object Alignment in the Wild. IEEE Trans. Image Process., 2015
-
Robust Face Alignment Under Occlusion via Regional Predictive Power Estimation. IEEE Trans. Image Process., 2015
-
Concept Detection in Multimedia Web Resources About Home Made Explosives. In 10th International Conference on Availability, Reliability and Security, ARES 2015, Toulouse, France, August 24-27, 2015, 2015
-
Identifying valence and arousal levels via connectivity between EEG channels. In 2015 International Conference on Affective Computing and Intelligent Interaction, ACII 2015, Xi’an, China, September 21-24, 2015, 2015
-
Face Alignment Assisted by Head Pose Estimation. In Proceedings of the British Machine Vision Conference 2015, BMVC 2015, Swansea, UK, September 7-10, 2015, 2015
-
Mirror, mirror on the wall, tell me, is the error small? In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, Boston, MA, USA, June 7-12, 2015, 2015
-
Inference of personality traits and affect schedule by analysis of spontaneous reactions to affective videos. In 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition, FG 2015, Ljubljana, Slovenia, May 4-8, 2015, 2015
-
Cascade of classifiers based on binary, non-binary and deep convolutional network descriptors for video concept detection. In 2015 IEEE International Conference on Image Processing, ICIP 2015, Quebec City, QC, Canada, September 27-30, 2015, 2015
-
A flexible calibration method of multiple Kinects for 3D human reconstruction. In 2015 IEEE International Conference on Multimedia & Expo Workshops, ICME Workshops 2015, Turin, Italy, June 29 - July 3, 2015, 2015
-
VERGE: A Multimodal Interactive Video Search Engine. In MultiMedia Modeling - 21st International Conference, MMM 2015, Sydney, NSW, Australia, January 5-7, 2015, Proceedings, Part II, 2015
-
A Study on the Use of a Binary Local Descriptor and Color Extensions of Local Descriptors for Video Concept Detection. In MultiMedia Modeling - 21st International Conference, MMM 2015, Sydney, NSW, Australia, January 5-7, 2015, Proceedings, Part I, 2015
-
ITI-CERTH participation to TRECVID 2015. In 2015 TREC Video Retrieval Evaluation, TRECVID 2015, Gaithersburg, MD, USA, November 16-18, 2015, 2015
-
Face Pose Analysis. In Encyclopedia of Biometrics, Second Edition, 2015
2014
-
Multimodal random forest based tensor regression. IET Comput. Vis., 2014
-
Face Sketch Landmarks Localization in the Wild. IEEE Signal Process. Lett., 2014
-
Structured Semi-supervised Forest for Facial Landmarks Localization with Face Mask Reasoning. In British Machine Vision Conference, BMVC 2014, Nottingham, UK, September 1-5, 2014, 2014
-
Non-invasive player experience estimation from body motion and game context. In 2014 IEEE Conference on Computational Intelligence and Games, CIG 2014, Dortmund, Germany, August 26-29, 2014, 2014
-
Learning visual saliency using topographic independent component analysis. In 2014 IEEE International Conference on Image Processing, ICIP 2014, Paris, France, October 27-30, 2014, 2014
-
ITI-CERTH participation to TRECVID 2014. In 2014 TREC Video Retrieval Evaluation, TRECVID 2014, Orlando, FL, USA, November 10-12, 2014, 2014
2013
-
Fusion of facial expressions and EEG for implicit affective tagging. Image Vis. Comput., 2013
-
Coupled Gaussian Processes for Pose-Invariant Facial Expression Recognition. IEEE Trans. Pattern Anal. Mach. Intell., 2013
-
High order pLSA for indexing tagged images. Signal Process., 2013
-
Supervised dictionary learning for action localization. In 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition, FG 2013, Shanghai, China, 22-26 April, 2013, 2013
-
Privileged information-based conditional regression forest for facial feature detection. In 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition, FG 2013, Shanghai, China, 22-26 April, 2013, 2013
-
Sieving Regression Forest Votes for Facial Feature Detection in the Wild. In IEEE International Conference on Computer Vision, ICCV 2013, Sydney, Australia, December 1-8, 2013, 2013
-
Semi-supervised visual recognition with constrained graph regularized non negative matrix factorization. In IEEE International Conference on Image Processing, ICIP 2013, Melbourne, Australia, September 15-18, 2013, 2013
2012
-
Max-margin Non-negative Matrix Factorization. Image Vis. Comput., 2012
-
Leveraging social media for scalable object detection. Pattern Recognit., 2012
-
Higher rank Support Tensor Machines for visual recognition. Pattern Recognit., 2012
-
Tensor Learning for Regression. IEEE Trans. Image Process., 2012
-
Tree-Structured Feature Extraction Using Mutual Information. IEEE Trans. Neural Networks Learn. Syst., 2012
-
Exploiting Depth and Intensity Information for Head Pose Estimation with Random Forests and Tensor Models. In Computer Vision - ACCV 2012 Workshops, ACCV 2012 International Workshops, Daejeon, Korea, November 5-6, 2012, Revised Selected Papers, Part II, 2012
-
Exploring the Similarities of Neighboring Spatiotemporal Points for Action Pair Matching. In Computer Vision - ACCV 2012 - 11th Asian Conference on Computer Vision, Daejeon, Korea, November 5-9, 2012, Revised Selected Papers, Part III, 2012
-
Face Parts Localization Using Structured-Output Regression Forests. In Computer Vision - ACCV 2012, 11th Asian Conference on Computer Vision, Daejeon, Korea, November 5-9, 2012, Revised Selected Papers, Part II, 2012
-
Learning codebook weights for action detection. In 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, Providence, RI, USA, June 16-21, 2012, 2012
-
Support tensor action spotting. In 19th IEEE International Conference on Image Processing, ICIP 2012, Lake Buena Vista, Orlando, FL, USA, September 30 - October 3, 2012, 2012
-
A simple and effective extrinsic calibration method of a camera and a single line scanning lidar. In Proceedings of the 21st International Conference on Pattern Recognition, ICPR 2012, Tsukuba, Japan, November 11-15, 2012, 2012
-
Coupled 3D tracking and pose optimization of rigid objects using particle filter. In Proceedings of the 21st International Conference on Pattern Recognition, ICPR 2012, Tsukuba, Japan, November 11-15, 2012, 2012
-
Affective gaming: Beyond using sensors. In 5th International Symposium on Communications, Control and Signal Processing, ISCCSP 2012, Roma, Italy, May 2-4, 2012, 2012
-
Higher Rank Support Tensor Machines. In Advances in Visual Computing - 8th International Symposium, ISVC 2012, Rethymnon, Crete, Greece, July 16-18, 2012, Revised Selected Papers, Part II, 2012
-
Image Interpretation by Combining Ontologies and Bayesian Networks. In Artificial Intelligence: Theories and Applications - 7th Hellenic Conference on AI, SETN 2012, Lamia, Greece, May 28-31, 2012. Proceedings, 2012
-
Exploiting gaze movements for automatic video annotation. In 13th International Workshop on Image Analysis for Multimedia Interactive Services, WIAMIS 2012, Dublin, Ireland, May 23-25, 2012, 2012