Research Article
An Improved Adaptive Angular Margin Loss Function for Deep Face Recognition
Issue:
Volume 11, Issue 1, June 2025
Pages:
1-10
Received:
10 February 2025
Accepted:
19 March 2025
Published:
17 April 2025
DOI:
10.11648/j.ajnna.20251101.11
Abstract: In recent years, there has been growing interest in deep learning based face recognition, which currently sets the state of the art in face detection, recognition, and verification tasks. As is well known, the loss function used to extract face features plays a crucial role in a deep face model. In this regard, margin-based loss functions, which apply a fixed margin between the feature and the weight, have attracted much interest. However, such margin-based losses are limited in enhancing the discriminative power and generalizability of the face model, since the intra-class and inter-class variations in real face training sets are often imbalanced. In particular, an embedding feature whose angle to the class weight lies around 90° or 180° on the hypersphere corresponds to a hard sample during classification; this typically arises in classes that contain only a few embedding samples. To address this problem, we propose an improved adaptive angular margin loss that imposes an adaptive and robust angular margin on the angle between the embedding feature and the corresponding class weight, instead of a constant margin. The proposed loss function improves feature discrimination by simultaneously minimizing intra-class variation and maximizing inter-class variation. We present experimental results on the LFW, CALFW, CPLFW, AgeDB and MegaFace benchmarks, which demonstrate the effectiveness of the proposed approach.
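The general idea of an adaptive angular margin can be sketched as follows. This is a minimal NumPy illustration, not the paper's exact formulation: the adaptive rule here (margin growing linearly with the target angle, so that hard samples near 90° or 180° receive a larger penalty) and the hyperparameters `s` and `m_base` are assumptions for the sketch.

```python
import numpy as np

def adaptive_margin_logits(features, weights, labels, s=64.0, m_base=0.5):
    """Sketch of an adaptive angular margin (hypothetical rule: the margin
    grows with the target angle, penalizing hard samples more).
    Not the paper's exact formulation."""
    # L2-normalize embeddings (rows) and class weights (columns)
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=0, keepdims=True)
    cos = f @ w                                   # (N, C) cosine similarities
    theta = np.arccos(np.clip(cos, -1.0, 1.0))    # angles in [0, pi]
    logits = s * cos
    rows = np.arange(len(labels))
    t = theta[rows, labels]                       # angle to the true class
    m = m_base * (1.0 + t / np.pi)                # adaptive margin (assumption)
    # apply the margin only to the target logit, clamping the angle at pi
    logits[rows, labels] = s * np.cos(np.minimum(t + m, np.pi))
    return logits
```

The margin shrinks the target-class logit relative to a plain scaled cosine, forcing the optimizer to pull features closer to their class weight than a margin-free softmax would.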
Research Article
AutoMalariaNet: A VGG16-Based Deep Learning Model for High-Performance Automated Malaria Parasite Detection in Blood Smear Images
Emmanuel Osaze Oshoiribhor,
Adetokunbo MacGregor John-Otumu*
Issue:
Volume 11, Issue 1, June 2025
Pages:
11-27
Received:
20 March 2025
Accepted:
27 March 2025
Published:
17 April 2025
DOI:
10.11648/j.ajnna.20251101.12
Abstract: This research paper presents an automated malaria detection system using deep learning techniques to enhance diagnostic accuracy and efficiency, addressing the critical challenge of early and precise malaria diagnosis, especially in resource-constrained regions. Malaria remains a significant global health burden, particularly in tropical and subtropical regions where timely and accurate diagnosis is crucial for effective treatment and control. Traditional diagnostic methods, such as microscopic examination of blood smears, require skilled parasitologists and are often labor-intensive and time-consuming, making rapid detection difficult. To overcome these limitations, this study develops a deep learning-based malaria detection system integrating a Custom Convolutional Neural Network (CNN) and a pre-trained VGG16 model, trained on a publicly available malaria blood smear image dataset from Kaggle. Several data preprocessing techniques, including normalization and augmentation (rotation, flipping, scaling, and brightness adjustment), were applied to improve model generalization and robustness. The system is deployed through a web-based interface developed using Python, Flask, and HTML, allowing users to upload blood smear images and obtain real-time diagnostic results. Experimental evaluations demonstrate that the VGG16 model outperforms the Custom CNN, achieving an accuracy of 97%, precision of 96%, recall of 96.56%, and an F1-score of 97%, whereas the Custom CNN attained an accuracy of 87%, precision of 86%, recall of 85%, and an F1-score of 84.45%. These findings validate the effectiveness of deep learning in automating malaria detection and reducing reliance on manual microscopic examination, offering a scalable and accessible diagnostic tool for healthcare facilities with limited resources. Despite the success of the proposed system, further research is necessary to enhance model interpretability and trustworthiness. 
Future work should explore the integration of Vision Transformers (ViTs), Large Language Models (LLMs), and Ensemble Deep Learning techniques to improve malaria detection performance. Additionally, Explainable AI (XAI) methods, such as Grad-CAM, should be incorporated to provide visual explanations of model predictions, ensuring transparency and aiding medical professionals in understanding the decision-making process. By integrating these advancements, future systems can enhance both diagnostic accuracy and interpretability, making AI-driven malaria detection more reliable and widely applicable.
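The preprocessing pipeline described above (normalization plus rotation, flipping, and brightness augmentation) can be sketched in NumPy. This is a hypothetical illustration of such a pipeline, not the authors' code; the function name, the 50% flip probability, and the brightness range are assumptions.

```python
import numpy as np

def augment_and_normalize(img, rng, brightness=0.2):
    """Hypothetical sketch of blood-smear image preprocessing:
    random flip, random 90-degree rotation, random brightness jitter,
    then scaling of 8-bit pixel values to [0, 1].
    Assumes a square single-channel image array."""
    out = img.astype(np.float64)
    if rng.random() < 0.5:                        # horizontal flip half the time
        out = np.fliplr(out)
    out = np.rot90(out, k=int(rng.integers(0, 4)))  # rotate by 0/90/180/270 deg
    out = out * (1.0 + rng.uniform(-brightness, brightness))  # brightness jitter
    return np.clip(out, 0.0, 255.0) / 255.0       # normalize to [0, 1]
```

In practice each training image would pass through this transform once per epoch, so the model sees a slightly different view of every smear each time, which is what improves generalization and robustness.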
Research Article
A Fast Acoustic Model Based on Multi-Scale Feature Fusion Module for Text-To-Speech Synthesis
Jin-Hyok Song,
Song-Chol Jong,
Thae-Myong Kim,
Guk-Chol Kim,
Hakho Hong*
Issue:
Volume 11, Issue 1, June 2025
Pages:
28-34
Received:
17 February 2025
Accepted:
26 March 2025
Published:
22 April 2025
DOI:
10.11648/j.ajnna.20251101.13
Abstract: In end-to-end text-to-speech (TTS) synthesis, the acoustic model has a strong effect on the quality of the generated speech. Within the acoustic model, the encoder and decoder are critical components, and a Transformer is commonly used for both. Previous works lack the ability to model the essential features of the speech signal, since they model features of a fixed length. Inference is also slow, owing to Transformer components such as the computationally expensive multi-head self-attention layer; this limits the application of TTS on low-performance devices such as embedded systems and mobile phones. In this paper, we propose a novel acoustic model that captures features of different lengths and improves both synthesis speed and naturalness compared with the conventional Transformer structure. Experiments confirm that the proposed method improves the naturalness of synthetic speech and the inference speed on low-performance devices.
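The core idea of a multi-scale feature fusion module, capturing features of different lengths with parallel branches of different receptive fields, can be sketched as follows. This is a minimal NumPy illustration under assumed details: the kernel sizes, the random weights, and fusion by stacking are placeholders, not the paper's architecture.

```python
import numpy as np

def conv1d_same(x, kernel):
    """'Same'-padded 1-D convolution over a single-channel sequence
    (odd kernel length assumed)."""
    pad = len(kernel) // 2
    xp = np.pad(x, pad)
    return np.array([xp[i:i + len(kernel)] @ kernel for i in range(len(x))])

def multi_scale_fusion(x, kernel_sizes=(3, 5, 7), rng=None):
    """Hypothetical multi-scale feature fusion sketch: parallel convolutions
    with different receptive fields over the same input, fused by stacking.
    Weights are random placeholders for illustration only."""
    if rng is None:
        rng = np.random.default_rng(0)
    branches = [conv1d_same(x, rng.standard_normal(k)) for k in kernel_sizes]
    return np.stack(branches, axis=0)   # (n_scales, sequence_length)
```

Because each branch sees a different window of the input sequence, the fused output carries both short-range detail and longer-range context, which is the property a fixed-length model lacks; it also avoids the quadratic cost of self-attention.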