-
Research Article
Hybrid CNN-Transformer Model for Early Detection and Diagnosis of Breast Cancer: A Multi-Regional Dataset Study with Implications for African Healthcare Settings
Issue:
Volume 10, Issue 1, June 2026
Pages:
1-13
Received:
9 November 2025
Accepted:
3 December 2025
Published:
29 December 2025
Abstract: Breast cancer remains the most common cancer among African women, contributing to high mortality largely due to late-stage diagnosis. More than 70% of cases in many African countries are detected at advanced stages, when treatment options are limited and survival rates are significantly reduced. Barriers such as limited access to screening, a shortage of radiologists, and socio-economic constraints further delay early detection. Artificial intelligence (AI)-based diagnostic tools offer an opportunity to strengthen screening systems and improve timely diagnosis; however, the lack of publicly available African mammography datasets remains a major challenge. This study introduces MAMMOAI, an AI-driven framework designed to enhance early breast cancer detection with potential applicability to African contexts. Due to the scarcity of large-scale annotated African mammography datasets, a critical barrier identified in the current literature, we adopted a pragmatic approach: a multi-task deep learning model integrating convolutional neural networks (CNNs) with Transformer layers to perform simultaneous risk assessment, cancer detection, staging, risk factor analysis, and differential diagnosis from mammograms. The model was trained on the King AbdulAziz University Breast Cancer Mammogram Dataset (KAU-BCMD) from Saudi Arabia, supplemented with preliminary data collected from Kenyan healthcare facilities. This multi-regional approach was necessitated by insufficient African data volume for robust deep learning model training. The framework employs a multi-task learning approach that integrates a ResNet50–Transformer backbone for spatial feature extraction, feeding five specialized branches for risk assessment, cancer detection, staging, risk factor analysis, and differential diagnosis. All images underwent standardized preprocessing, including resizing to 224×224 pixels, normalization, contrast enhancement, and extensive data augmentation. Weighted categorical cross-entropy losses supported joint optimization across tasks. Model interpretability was ensured using Grad-CAM–based heatmaps and uncertainty estimation, and predictions were compiled into automated, clinician-friendly HTML reports. The model achieved an overall accuracy of 98% on the majority class (BI-RADS 1), with macro-averaged F1-scores of 0.61-0.62 across all branches, reflecting challenges in detecting minority classes. Critically, the model failed to identify any BI-RADS 5 (highly malignant) cases, misclassifying all of them as BI-RADS 4. Grad-CAM visualizations provided interpretable insights, supporting clinical decision-making. While these results demonstrate technical feasibility and the potential of hybrid architectures, they also underscore critical limitations: severe class imbalance, inadequate minority-class performance, and unproven generalizability to diverse African populations. Future work must prioritize the collection of larger, balanced, multi-center African datasets and external validation across diverse Sub-Saharan African healthcare settings before clinical deployment can be considered.
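As a rough illustration of the architecture described in this abstract, the sketch below wires a ResNet50 feature extractor into a small Transformer encoder and five classification heads. The head names, class counts, and unweighted losses are assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch of a multi-task CNN-Transformer classifier: a ResNet50 backbone
# keeps spatial feature maps, a Transformer encoder mixes the 7x7 patch tokens,
# and five task-specific heads share the pooled representation.
import torch
import torch.nn as nn
from torchvision.models import resnet50

class MammoMultiTask(nn.Module):
    def __init__(self, d_model=2048, n_heads=8, n_layers=2):
        super().__init__()
        backbone = resnet50(weights=None)
        # Drop average pooling and the fc layer to keep spatial feature maps.
        self.cnn = nn.Sequential(*list(backbone.children())[:-2])
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        # Five illustrative branches (class counts assumed, not from the paper).
        self.heads = nn.ModuleDict({
            "risk": nn.Linear(d_model, 5),
            "detection": nn.Linear(d_model, 2),
            "staging": nn.Linear(d_model, 4),
            "risk_factors": nn.Linear(d_model, 3),
            "differential": nn.Linear(d_model, 6),
        })

    def forward(self, x):                           # x: (B, 3, 224, 224)
        feats = self.cnn(x)                         # (B, 2048, 7, 7)
        tokens = feats.flatten(2).transpose(1, 2)   # (B, 49, 2048)
        tokens = self.transformer(tokens)
        pooled = tokens.mean(dim=1)                 # global average over patches
        return {name: head(pooled) for name, head in self.heads.items()}

# Joint optimization with per-task cross-entropy (class weights omitted here).
model = MammoMultiTask()
logits = model(torch.randn(2, 3, 224, 224))
losses = {name: nn.CrossEntropyLoss()(out, torch.zeros(2, dtype=torch.long))
          for name, out in logits.items()}
total_loss = sum(losses.values())
```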
-
Research Article
The Cekirge σ-Method in AI: Analysis and Broad Applications
Huseyin Murat Cekirge*
Issue:
Volume 10, Issue 1, June 2026
Pages:
14-33
Received:
19 December 2025
Accepted:
4 January 2026
Published:
19 January 2026
Abstract: Modern machine learning relies predominantly on iterative optimization, assuming that learning must progress by following gradients across a probabilistic loss landscape shaped by Gaussian noise. Such methods depend on learning rates, random initialization, batching strategies, and repeated parameter updates, and their outcomes vary with numerical precision, hardware behavior, and floating-point drift. As dimensionality increases or data matrices become ill-conditioned, these iterative procedures often require extensive tuning and may converge inconsistently or fail altogether. This work presents the Cekirge Global Deterministic σ-Method, a non-iterative learning framework in which model parameters are obtained through a single closed-form computation rather than through iterative descent. Learning is formulated as a deterministic equilibrium problem governed by a σ-regularized equilibrium functional. A behavioral analogy, Cekirge’s Dog, illustrates the operational distinction: a trained working dog runs directly to its target without scanning or correcting its path step by step, whereas iterative algorithms resemble a dog that repeatedly stops, senses, and adjusts direction. Throughout this work, the term equilibrium functional is used instead of energy, emphasizing deterministic balance rather than physical or variational interpretations. The proposed framework yields a unique, reproducible solution independent of initialization or hardware, remains stable for ill-conditioned systems, and scales deterministically to large problems through σ-regularized partitioning. A minimal overdetermined example demonstrates that the inefficiency of gradient-based learning is structural rather than dimensional, arising from trajectory-based optimization, while the proposed σ-regularized formulation computes the same equilibrium directly in a single closed-form step.
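The closed-form, non-iterative idea can be illustrated with a Tikhonov-style solve on a small overdetermined system. The exact σ-regularized equilibrium functional of the Cekirge method is not reproduced here, so treat this as a minimal sketch under that assumption.

```python
# Minimal sketch of a sigma-regularized closed-form fit for an overdetermined
# linear system, illustrating single-step, initialization-free learning.
# Assumed form: w = (A^T A + sigma I)^{-1} A^T b.
import numpy as np

def sigma_equilibrium(A, b, sigma=1e-3):
    """Single closed-form computation: no initialization, no iterations."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + sigma * np.eye(n), A.T @ b)

# Overdetermined toy example: 6 observations, 2 parameters.
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 2))
b = A @ np.array([1.5, -0.7]) + 0.01 * rng.normal(size=6)
w = sigma_equilibrium(A, b)
print(w)  # unique, reproducible solution independent of any starting point
```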
-
Review Article
Thermal Management Optimization for Robotic Surgical Systems in Clinical Applications in Mongolia
Issue:
Volume 10, Issue 1, June 2026
Pages:
34-41
Received:
26 November 2025
Accepted:
31 December 2025
Published:
23 January 2026
Abstract: In the clinical application of robotic surgical systems, optimizing thermal management is crucial for improving surgical efficiency and ensuring system reliability. The rapid development of flexible robotics has introduced enhanced flexibility and adaptability in surgical procedures, opening new clinical possibilities while simultaneously imposing more stringent thermal management requirements. Furthermore, thermal management in implantable medical devices has become increasingly critical, demanding advanced optimization strategies to guarantee both safety and operational stability. This study conducted a systematic review and analysis of research indexed in major databases, including Web of Science (WoS), Scopus, and CNKI. The investigation focused on three key areas: (1) thermal management optimization in robotic surgical systems, (2) the design and clinical applications of flexible robotic technologies, and (3) thermal management strategies for implantable devices. By synthesizing findings from these domains, the study aimed to identify effective approaches to enhance thermal performance in surgical robotics. Thermal Management: Increasing heat dissipation surface area and optimizing thermal conduction pathways, such as using copper or aluminum materials under cryogenic conditions, significantly improved cooling efficiency. However, while iron/nickel-based alloys and ceramics demonstrated superior thermal stability in high-temperature environments, challenges related to corrosion resistance and long-term durability remained unresolved. Flexible Robotics: Magnetic actuation and smart material-based actuators enhanced reconfigurability, enabling more adaptable surgical interventions. Additionally, probability model-based online learning algorithms facilitated control optimization independent of specific robotic designs. Thermal Performance: Finite-state machine control combined with proportional-integral (PI) strategies effectively minimized temperature gradients to below 0.5°C. Liquid cooling systems proved highly efficient in battery thermal management, though integration complexities and control challenges persisted. The integration of multidisciplinary approaches, spanning materials science, thermodynamic modeling, and intelligent control, has significantly advanced thermal management in robotic surgical systems. Flexible robotics technologies offer safer and more precise surgical solutions, while thermal management strategies for implantable devices can be rigorously validated through computational simulations and experimental studies. Future research should explore novel nanomaterials, dynamic thermal management algorithms, and cross-disciplinary collaborations to further optimize the performance and reliability of robotic surgical systems. These advancements will be pivotal in meeting the growing demands of next-generation surgical robotics and implantable medical technologies.
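As a loose illustration of the control strategy summarized above (PI regulation supervised by a finite-state machine), the toy sketch below uses assumed gains, setpoints, and mode thresholds; it is not drawn from any of the reviewed systems.

```python
# Illustrative sketch only: a proportional-integral (PI) temperature controller
# with a small finite-state supervisor that switches cooling modes based on the
# measured temperature gradient. All numeric values are assumptions.
class PIController:
    def __init__(self, kp, ki, dt):
        self.kp, self.ki, self.dt = kp, ki, dt
        self.integral = 0.0

    def update(self, setpoint, measured):
        error = setpoint - measured
        self.integral += error * self.dt
        return self.kp * error + self.ki * self.integral

def select_mode(gradient_c):
    """Finite-state supervisor keyed to the temperature gradient (deg C)."""
    if gradient_c < 0.5:          # target band cited in the review (< 0.5 deg C)
        return "passive"
    elif gradient_c < 2.0:
        return "air"
    return "liquid"

pi = PIController(kp=2.0, ki=0.1, dt=0.1)
drive = pi.update(setpoint=37.0, measured=38.2)   # cooling drive signal (a.u.)
mode = select_mode(gradient_c=1.3)
print(mode, round(drive, 3))
```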
-
Research Article
A Dual-Pathway AI Architecture for Tourism Logistics and Support in Low-Resource Environments
Aashish Dhakal*, Bibek Ropakheti
Issue:
Volume 10, Issue 1, June 2026
Pages:
42-47
Received:
19 December 2025
Accepted:
4 January 2026
Published:
23 January 2026
Abstract: The tourism industry in high-altitude regions, specifically the Himalayas, faces two critical and distinct challenges: ensuring operational safety amidst volatile weather and traffic conditions, and overcoming commercial inefficiency caused by a lack of 24/7 customer support. Traditional solutions have largely failed to address these issues simultaneously, often relying on fragmented manual processes or static chatbots that lack real-time capabilities. This paper presents a unified Artificial Intelligence (AI) platform designed to address these distinct problems using a novel "Hybrid AI" architecture. A "Two-Brain" system is proposed that integrates Retrieval-Augmented Generation (RAG) for static, knowledge-intensive customer queries and Tool-Using Large Language Model (LLM) Agents for dynamic, real-time logistical support. By leveraging open-source technologies, specifically Django for the backend framework and PostgreSQL with pgvector for high-dimensional vector storage, and implementing semantic caching, a cost-effective, maintainable solution is demonstrated for Small and Medium Enterprises (SMEs) in developing economies. The design mitigates hallucination risks through strict context-faithfulness protocols and ensures data sovereignty via a self-hosted infrastructure. Performance metrics regarding average latency, token cost efficiency, and data freshness are analyzed, showing that the dual-pathway approach significantly optimizes resource usage compared to traditional methods. Specifically, the semantic caching mechanism reduces API costs by approximately 60 percent for repetitive queries, while the real-time agent ensures critical safety data is retrieved with a freshness of under 10 seconds. This study concludes that such a hybrid architecture provides a scalable, safe, and economically viable model for modernizing tourism operations in low-resource environments.
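The semantic-caching and dual-pathway routing ideas can be sketched in a few lines. The embedder, similarity threshold, and in-memory store below are stand-ins for the paper's PostgreSQL/pgvector implementation.

```python
# Minimal sketch of semantic caching: if a new query's embedding is close enough
# to a cached one, the stored answer is reused instead of calling the LLM.
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

class SemanticCache:
    def __init__(self, embed, threshold=0.9):
        self.embed = embed            # callable: text -> np.ndarray
        self.threshold = threshold
        self.entries = []             # list of (embedding, answer)

    def lookup(self, query):
        q = self.embed(query)
        for emb, answer in self.entries:
            if cosine(q, emb) >= self.threshold:
                return answer         # cache hit: no LLM call needed
        return None

    def store(self, query, answer):
        self.entries.append((self.embed(query), answer))

def route(query, is_time_sensitive):
    """Dual-pathway routing: static knowledge -> RAG, real-time logistics -> agent."""
    return "tool_agent" if is_time_sensitive else "rag"

# Example wiring (the embedder is a hash-seeded toy, not a real embedding model):
def toy_embed(text):
    seed = abs(hash(text)) % (2**32)
    return np.random.default_rng(seed).normal(size=64)

cache = SemanticCache(embed=toy_embed)
cache.store("What permits do I need for Everest Base Camp?", "Cached answer ...")
print(cache.lookup("What permits do I need for Everest Base Camp?"))  # cache hit
```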
-
Research Article
Equilibrium-based Deterministic Learning in AI via σ-Regularization
Huseyin Murat Cekirge*
Issue:
Volume 10, Issue 1, June 2026
Pages:
48-60
Received:
9 January 2026
Accepted:
19 January 2026
Published:
30 January 2026
Abstract: Gradient-based learning methods such as Gradient Descent (GD), Stochastic Gradient Descent (SGD), and Conjugate Gradient Descent (CGD) are widely used in supervised learning and inverse problems. However, when the underlying system is underdetermined, these iterative approaches do not converge to a unique solution; instead, their outcomes depend strongly on initialization, learning rates, numerical precision, and stopping criteria. This study presents a deterministic σ-regularized equilibrium framework, referred to as the Cekirge Method, in which model parameters are obtained through a single closed-form computation rather than iterative optimization. Using a controlled time-indexed dataset, the deterministic equilibrium solution is compared directly with GD, SGD, and CGD under identical experimental conditions. While gradient-based methods follow distinct optimization trajectories and require substantially longer runtimes, the σ-regularized formulation consistently yields a unique and numerically stable solution with minimal computational cost. The results demonstrate that the inability of gradient-based methods to reproduce the deterministic equilibrium in underdetermined systems is not an algorithmic shortcoming, but a structural consequence of trajectory-based optimization in a non-unique solution space. The analysis focuses on formulation-level properties rather than predictive accuracy, emphasizing equilibrium existence, numerical conditioning, parameter stability, and reproducibility. By prioritizing equilibrium recognition over iterative search, the proposed framework highlights deterministic algebraic learning as a complementary paradigm to conventional gradient-based methods, particularly for time-indexed systems where stability and repeatability are critical.
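A minimal numerical sketch of the contrast described here: on an underdetermined system, plain gradient descent converges to an initialization-dependent minimizer, while a single σ-regularized solve returns one reproducible solution. The regularized normal-equation form is an assumption standing in for the paper's equilibrium formulation.

```python
# Contrast sketch on an underdetermined system (more unknowns than equations).
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 6))                    # 3 equations, 6 unknowns
b = rng.normal(size=3)
sigma = 1e-3

# Single closed-form sigma-regularized solve: unique and initialization-free.
w_eq = np.linalg.solve(A.T @ A + sigma * np.eye(6), A.T @ b)

# Plain least-squares gradient descent: the limit depends on the starting point,
# because an entire affine subspace minimizes the unregularized loss.
def gradient_descent(w0, lr=0.02, steps=20000):
    w = w0.copy()
    for _ in range(steps):
        w -= lr * A.T @ (A @ w - b)
    return w

w_from_zero = gradient_descent(np.zeros(6))
w_from_random = gradient_descent(rng.normal(size=6))
print(np.linalg.norm(w_from_zero - w_eq))      # close to the equilibrium solution
print(np.linalg.norm(w_from_random - w_eq))    # retains an init-dependent offset
```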
-
Research Article
Understanding Model Hallucinations: Causes, Mitigation Strategies, and Evaluation Metrics for Detection
Issue:
Volume 10, Issue 1, June 2026
Pages:
61-70
Received:
21 October 2025
Accepted:
3 November 2025
Published:
2 February 2026
DOI:
10.11648/j.ajai.20261001.16
Abstract: Foundation models (FMs) have the potential to revolutionize various fields, but their reliability is often compromised by hallucinations. This paper delves into the intricate nature of model hallucinations, exploring their root causes, mitigation strategies, and evaluation metrics. We provide a comprehensive overview of the challenges posed by hallucinations, including factual inaccuracies, logical inconsistencies, and the generation of fabricated content. To address these issues, we discuss a range of techniques, such as improving data quality, refining model architectures, and employing advanced prompting techniques. We also highlight the importance of developing robust evaluation metrics to detect and quantify hallucinations. By understanding the underlying mechanisms and implementing effective mitigation strategies, we can unlock the full potential of FMs and ensure their reliable and trustworthy operation. Foundation Models (FMs), such as large language models and multimodal transformers, have demonstrated transformative capabilities across a wide range of applications in artificial intelligence, including natural language processing, computer vision, and decision support systems. Despite their remarkable success, the reliability and trustworthiness of these models are frequently undermined by a phenomenon known as hallucination, the generation of outputs that are factually incorrect, logically inconsistent, or entirely fabricated. This study presents a comprehensive examination of model hallucinations, focusing on their underlying causes, mitigation approaches, and evaluation metrics for systematic detection. We begin by analyzing the root causes of hallucination, which span data-related factors such as bias, noise, and imbalance, as well as architectural and training issues like over-parameterization, poor generalization, and the lack of grounded reasoning. The paper categorizes hallucinations into factual, logical, and contextual types, illustrating how each arises in different stages of model inference and decision-making. We further discuss how prompt engineering, attention misalignment, and inadequate fine-tuning contribute to the persistence of erroneous model outputs. To mitigate these challenges, we explore a range of strategies, including improving data curation and preprocessing pipelines, integrating factual verification and retrieval-augmented mechanisms, and refining model architectures to enhance interpretability and context awareness. Techniques such as reinforcement learning with human feedback (RLHF), chain-of-thought prompting, and hybrid symbolic-neural approaches are highlighted for their potential in reducing hallucination rates while maintaining model fluency and adaptability. Furthermore, this work emphasizes the critical need for rigorous and standardized evaluation metrics capable of quantifying the severity, frequency, and impact of hallucinations. Metrics such as factual consistency scores, semantic similarity indices, and hallucination detection benchmarks are discussed as essential tools for assessing model reliability. Ultimately, this paper provides a structured understanding of model hallucinations as both a technical and ethical challenge in the deployment of Foundation Models. By elucidating their origins and presenting practical mitigation frameworks, we aim to advance the development of more transparent, accountable, and trustworthy AI systems. The insights presented herein contribute to ongoing efforts to ensure that Foundation Models not only achieve high performance but also uphold factual integrity and user trust across real-world applications.
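As a toy example of the consistency metrics discussed above, the sketch below scores how much of a generated answer is supported by its source context via simple word overlap; it is an illustrative proxy, not one of the published benchmarks.

```python
# Crude support-based consistency score: the fraction of content words in a
# model answer that also appear in its source context. Low scores flag possible
# hallucination; real metrics use entailment models or semantic embeddings.
import re

def _content_terms(text: str) -> set:
    """Lowercase words of length >= 3 (a rough content-word filter)."""
    return set(re.findall(r"[a-z]{3,}", text.lower()))

def support_score(answer: str, context: str) -> float:
    answer_terms = _content_terms(answer)
    if not answer_terms:
        return 1.0
    return len(answer_terms & _content_terms(context)) / len(answer_terms)

context = "The Eiffel Tower was completed in 1889 and stands in Paris."
faithful = "The tower was completed in 1889 in Paris."
hallucinated = "The tower was completed in 1925 in Berlin by Gustave Dali."
print(support_score(faithful, context), support_score(hallucinated, context))
```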
-
Research Article
Deterministic σ-Regularized Equilibrium Inference Method for Artificial Intelligence
Huseyin Murat Cekirge*
Issue:
Volume 10, Issue 1, June 2026
Pages:
71-82
Received:
11 January 2026
Accepted:
21 January 2026
Published:
4 February 2026
DOI:
10.11648/j.ajai.20261001.17
Abstract: Convolutional encoders are widely used in modern artificial intelligence systems to transform structured inputs into compact representations that are subsequently processed by pooling, flattening, and training-based classification layers. Despite their empirical success, this pipeline implicitly assumes that learning is intrinsic to convolutional processing. In this work, we show that convolution itself is a deterministic linear measurement operation and does not inherently require training; learning becomes necessary only after architectural choices discard geometric structure and invertibility. By reformulating convolutional encoding as a known forward operator, inference is cast as an inverse problem governed by algebraic consistency rather than optimization trajectories. When spatial structure is preserved and pooling and flattening are avoided, the encoded representation admits a σ-regularized equilibrium solution obtained via the adjoint convolution operator. This formulation yields a unique closed-form reconstruction in a single computational step, eliminating gradient descent, backpropagation, learning rates, and iterative updates, and resulting in deterministic, reproducible inference independent of initialization or stochastic effects. From an AI perspective, the proposed framework clarifies the distinction between structure-preserving encoders, which admit equilibrium-based inference, and structure-discarding architectures, which require training-based approximation. The approach aligns convolutional encoding with classical inverse-problem methodologies, such as those used in tomography and radar, while remaining compatible with modern AI representations. Training is shown not to be a fundamental requirement of convolutional encoders, but rather a consequence of design choices that prioritize classification over structural recovery. As a result, the proposed framework offers a time- and energy-efficient alternative for inference in structured domains.
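The inverse-problem reading of convolutional encoding can be sketched by writing a 1-D convolution as an explicit matrix K, so that the adjoint is simply K.T and reconstruction is a single σ-regularized solve. The kernel, sizes, and σ below are illustrative choices, not values from the paper.

```python
# Sketch: structure-preserving convolutional encoding as a known forward
# operator, inverted in closed form via the adjoint with sigma regularization.
import numpy as np

def conv_matrix(kernel, n):
    """Dense matrix form of 'same'-size 1-D convolution with the given kernel."""
    k = len(kernel)
    pad = k // 2
    K = np.zeros((n, n))
    for i in range(n):
        for j, w in enumerate(kernel):
            col = i + j - pad
            if 0 <= col < n:
                K[i, col] = w
    return K

n, sigma = 32, 1e-4
kernel = np.array([0.2, 0.6, 0.2])             # simple symmetric smoothing kernel
K = conv_matrix(kernel, n)

rng = np.random.default_rng(2)
x_true = rng.normal(size=n)
y = K @ x_true                                  # deterministic forward encoding

# Closed-form reconstruction via the adjoint operator: no training, no iterations.
x_rec = np.linalg.solve(K.T @ K + sigma * np.eye(n), K.T @ y)
print(np.max(np.abs(x_rec - x_true)))           # near-exact in this noise-free case
```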
-
Research Article
Artificial Intelligence Without Iterative Learning: The Cekirge Deterministic Equilibrium Framework
Huseyin Murat Cekirge*
Issue:
Volume 10, Issue 1, June 2026
Pages:
83-96
Received:
15 January 2026
Accepted:
23 January 2026
Published:
4 February 2026
DOI:
10.11648/j.ajai.20261001.18
Abstract: Modern language models predominantly rely on probabilistic attention mechanisms and iterative training procedures to resolve next-token prediction. In these approaches, query–key (Q–K) interactions are normalized via softmax to produce probability distributions, followed by stochastic sampling or expectation-based selection. While effective in large-scale settings, such formulations inherently depend on training trajectories, random initialization, and repeated parameter updates, leading to variability in outcomes and significant computational cost. This study presents a unified framework that contrasts probabilistic attention with a deterministic allocation methodology, referred to as the Cekirge method, under the same Q–K representation and identical vocabulary. Instead of interpreting Q–K interactions as probabilistic scores, the proposed approach treats them as deterministic constraints and computes model output through a single σ-regularized equilibrium solution of a linear allocation system. No training, softmax normalization, sampling, or initial guess is required. Using an explicit 8-token numerical example, the paper demonstrates that both methodologies operate on the same semantic information yet diverge fundamentally in how constraints are resolved: probabilistic optimization versus deterministic equilibrium recognition. The comparison highlights differences in reproducibility, energy consumption, and interpretability, showing that deterministic allocation yields a unique and stable solution while preserving semantic consistency. The results suggest that probabilistic attention and deterministic equilibrium allocation represent two mathematically coherent but structurally distinct resolutions of the same Q–K framework, opening a path toward energy-efficient, reproducible, and fully interpretable language inference without iterative training. In this work, the term probability distribution is reserved exclusively for softmax-normalized attention outputs, whereas the equilibrium vectors produced by the proposed method are unconstrained allocations with no probabilistic interpretation. No linguistic analysis is intended; the terminology is used strictly in a structural and mathematical sense.
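The two resolutions of the same Q–K scores can be contrasted in a few lines: softmax normalization versus one deterministic linear solve. The specific allocation system used below is an illustrative stand-in, not the paper's exact construction.

```python
# Contrast sketch: (i) softmax attention turns Q-K scores into a probability
# distribution; (ii) a deterministic allocation solves a sigma-regularized
# linear system expressing the query in terms of the keys.
import numpy as np

rng = np.random.default_rng(3)
d, n_tokens, sigma = 16, 8, 1e-3
K = rng.normal(size=(n_tokens, d))            # one key vector per token (8 tokens)
q = rng.normal(size=d)                        # current query

# (i) Probabilistic attention: softmax-normalized scores (non-negative, sum to 1).
scores = K @ q / np.sqrt(d)
attn = np.exp(scores - scores.max())
attn /= attn.sum()

# (ii) Deterministic allocation: unique closed-form solution, no normalization,
# no sampling, identical on every run and on every machine.
alloc = np.linalg.solve(K @ K.T + sigma * np.eye(n_tokens), K @ q)

print(np.argmax(attn), np.argmax(alloc))      # selected token under each scheme
```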