Accepted Papers
Vector Embeddings for Images Beyond Neural Networks: An Exhaustive Study on Compact Composite Descriptors

Arpad Kiss, GreenEyes Artificial Intelligence Services, LLC, Lewes, Delaware, USA

ABSTRACT

This research report provides a comprehensive analysis of Compact Composite Descriptors (CCDs) as a highly efficient alternative to deep learning embeddings for Content-Based Image Retrieval (CBIR) in resource-constrained environments. While Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) offer superior semantic performance, their computational overhead and storage requirements—often exceeding 8KB per image—limit their applicability in Edge AI and IoT scenarios. In contrast, engineered descriptors such as the Color and Edge Directivity Descriptor (CEDD), Fuzzy Color and Texture Histogram (FCTH), and Joint Composite Descriptor (JCD) utilize fuzzy inference systems to encode visual features into ultra-compact vectors ranging from 54 to 72 bytes. The study explores the algorithmic foundations of these descriptors, their implementation within the LIRE (Lucene Image Retrieval) framework, and benchmarks demonstrating their competitive retrieval accuracy against MPEG-7 standards. Finally, the report highlights the strategic utility of CCDs for privacy-preserving, low-bandwidth visual search on edge devices, proposing hybrid architectures that leverage the speed of fuzzy composites with the semantic power of neural re-ranking.

KEYWORDS

Computer Vision, Cloud Computing, Embedded Systems, Content-based Image Retrieval Systems
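For a sense of how such byte-scale descriptor vectors are compared at query time, here is a minimal Python sketch of the Tanimoto-coefficient dissimilarity commonly used to match CEDD/FCTH-style quantized histograms (a simplified illustration, not LIRE's exact implementation):

```python
def tanimoto_distance(a, b):
    """Tanimoto-style dissimilarity between two quantized descriptor vectors.

    Returns 0.0 for identical vectors and 1.0 for vectors with no overlap,
    which makes it convenient for ranking compact histogram descriptors.
    """
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a)
    nb = sum(x * x for x in b)
    if na == 0 and nb == 0:
        return 0.0  # two empty descriptors are trivially identical
    return 1.0 - dot / (na + nb - dot)
```

Because each CCD bin is a small integer, the whole comparison stays in integer arithmetic until the final division, which is part of what makes these descriptors cheap to match on edge hardware.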


Cloud-Based Decision Support Systems for Analysing Student Trends in Educational Institutions

Awatef Balobaid and R.Y. Aburasain, Jazan University, KSA

ABSTRACT

This research suggests a new technique to detect and categorize student performance that will assist schools in improving outcomes. A regression-based technique estimates student performance, and a classification model classifies students by performance. It begins with a regression model that predicts student performance. It then utilizes gradient descent to iteratively refine the model and generate better predictions. The model is then cross-validated and retrained on the complete set of data to make it more accurate and helpful in different circumstances. The system organizes students by predicted performance using the regression model. To increase classification accuracy, further optimization is utilized to determine the optimal threshold for splitting performance groups. We assess the method's efficiency in terms of accuracy, response time, scalability, and resource utilization. The findings demonstrate that the new procedure is superior to the old ones. This strategy is robust, versatile, and cost-effective for educational organizations since it can generate correct predictions 95% of the time, react more rapidly, utilize resources economically, and be employed at a large scale. It helps instructors know how their pupils are doing so they may intervene early and make better decisions to support them. Data-based analysis can enhance educational results by utilizing the system's power and ability to adapt to new data.

KEYWORDS

Adaptability, Classification, Data-driven, Educational institutions, Optimization, Performance prediction, Regression, Resource utilization, Scalability, Student outcomes
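As a rough illustration of the pipeline this abstract describes — gradient-descent regression followed by threshold-based grouping — here is a minimal Python sketch (the toy data, learning rate, and 60-point threshold are illustrative assumptions, not values from the paper):

```python
def train_linear_regression(xs, ys, lr=0.01, epochs=5000):
    """Fit y ~ w*x + b by iteratively refining w, b with gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum((w * x + b - y) * x for x, y in zip(xs, ys)) * 2 / n
        grad_b = sum((w * x + b - y) for x, y in zip(xs, ys)) * 2 / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def classify(predicted_score, threshold=60.0):
    """Split students into performance groups at a tuned threshold."""
    return "high" if predicted_score >= threshold else "at-risk"

# Toy data (hypothetical): weekly study hours -> exam score.
hours = [1, 2, 3, 4, 5, 6]
scores = [35, 45, 55, 65, 75, 85]
w, b = train_linear_regression(hours, scores)
pred = w * 5.5 + b  # predicted score for a student studying 5.5 h/week
group = classify(pred)
```

In the paper's terms, the classification threshold itself would be tuned as a further optimization step rather than fixed in advance as it is here.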


Bridging Misuse Case Modelling and MITRE ATT&CK: A Unified Framework for Threat-Informed Design

Jean-Marie Kabasele Tenday, University ND Kasayi (UKA), Belgium

ABSTRACT

Traditional threat modelling techniques often focus on theoretical or system-specific threats without grounding them in empirical adversarial behaviour. Conversely, frameworks such as MITRE ATT&CK provide rich, intelligence-based taxonomies of real-world attacker tactics, techniques, and procedures (TTPs), but are rarely integrated into early software design phases. This paper proposes a methodology for linking misuse cases—UML-based representations of malicious system interactions—with MITRE ATT&CK techniques, enabling traceability between system-level threats and empirically observed attacks. The proposed framework enhances the relevance, completeness, and operational value of misuse case–based threat modelling. A structured mapping template and example implementation demonstrate how software architects can enrich their security design processes using ATT&CK-informed misuse cases.

KEYWORDS

Misuse case, Mitre ATT&CK, Threat Analysis, Threat Modelling, Cybersecurity, Secure Design.


Vulnerability Analysis of Containerized Web Applications using SAST and DAST Tools

Burak Enes Beygog1 and Ahmet Burak Can2, 1Aselsan Inc., Ankara, Türkiye, 2Hacettepe University, Ankara, Türkiye

ABSTRACT

While containerization has significantly simplified web application deployment, it has simultaneously introduced security blind spots that traditional testing methodologies often fail to address. This study examines the effectiveness of Static Application Security Testing (SAST), Dynamic Application Security Testing (DAST), and container scanning tools in identifying vulnerabilities within containerized environments through empirical testing of five open-source tools against three vulnerable applications (DVWA, Juice Shop, and VulnerableApp). Results demonstrate that reliance on any single tool presents substantial risk, with individual tools failing to detect up to 91% of existing vulnerabilities, while each tool category exhibited distinct limitations. Trivy uniquely identified critical infrastructure and supply chain risks, whereas DAST tools including Nikto and OWASP ZAP proved essential for detecting runtime misconfigurations. Notably, authenticated scanning emerged as particularly impactful, enhancing vulnerability detection rates by over 1,400%, thereby underscoring the necessity of implementing a Defense-in-Depth security strategy. Through strategic orchestration of Trivy for infrastructure assessment, authenticated DAST for runtime analysis, and SonarQube for static code analysis, security teams can substantially reduce their vulnerability miss rate to approximately 32%, achieving comprehensive coverage across code, infrastructure, and runtime configuration layers.

KEYWORDS

Container Security, DevSecOps, Static Application Security Testing (SAST), Dynamic Application Security Testing (DAST), Software Composition Analysis (SCA), Supply Chain Security


Evaluating Sentiment Models for Cybershield Abusive Language Detection System

Binisa Giri, Hashmath Fathima, Kelechi Nwachukwu and Kofi Nyarko, Department of Electrical and Computer Engineering, Morgan State University, Baltimore, USA

ABSTRACT

CyberShield is an automated, graph-augmented abusive language and interaction detection system designed to identify harmful content, including toxic interactions, hate speech, and the general negative sentiment prevalent on social media platforms. As part of integrating a robust sentiment component into the system, we evaluated four widely used sentiment analysis models — BERT, RoBERTa, VADER, and TextBlob — chosen for their complementary strengths and methodological diversity. BERT and RoBERTa represent transformer architectures capable of capturing contextual meaning in noisy social media texts. VADER provides a lexicon-based model optimized for informal online communication, offering a lightweight alternative to transformers. TextBlob is a traditional NLP baseline to benchmark improvements offered by more contemporary models. Together, this combination allows for a comprehensive comparison across model families, ensuring evidence-based model selection for the CyberShield project. These models were evaluated on a Kaggle dataset containing social media comments labeled with three sentiment classes (negative, positive, neutral) serving as the ground truth. Each model's performance was measured using confusion matrices, accuracy, macro F1, weighted F1, and per-class F1 scores. Our findings show that with an initial sample of 3,000 texts, the classical lexicon-based model (i.e., VADER) and the traditional NLP baseline model (i.e., TextBlob) significantly outperformed the transformer-based models. TextBlob achieved the strongest performance in this phase, underscoring the challenges of applying general pre-trained transformers to real-world sentiment classification without domain-specific fine-tuning. However, after expanding the dataset to 18,318 samples per sentiment class and rerunning the evaluation with the updated RoBERTa sentiment model, the performance trend shifted: the updated RoBERTa model demonstrated substantial improvement and outperformed the earlier transformer results.

KEYWORDS

Abusive Language Detection, Sentiment Analysis, Transformer Models, Lexicon-based models, Social Media Moderation, Performance Metrics


Split-Brain RAG: Why Large Language Models are Not Enough for Scientific Question Answering

Jodi Moselle Alcantara1 and Armielyn Obinguar2, 1Independent Researcher, Pampanga, Philippines, 2 Independent Researcher, Makati, Philippines

ABSTRACT

Large Language Models (LLMs) show promise for information retrieval but face trade-offs between reasoning depth, latency, and cost in scientific question answering (QA). This work evaluates monolithic LLM deployments in Ricerca Paperchat, a Retrieval-Augmented Generation (RAG) system for academic inquiry. We analyze seven models, including Claude Sonnet 4.5, GPT-4o, Gemini Flash, and hosted variants of Qwen and Llama, across accuracy, hallucination resistance, formatting, long-context stability, consistency, and cost. Results show that no single model meets real-time scientific QA requirements: high-reasoning models are slow, while faster models often fail safety checks. We conclude that monolithic LLM architectures are insufficient and propose Split-Brain RAG, a complexity-aware routing approach that reduces latency and cost while maintaining scientific accuracy.

KEYWORDS

Large Language Models, Retrieval-Augmented Generation


Semantic Topology Reasoning Architecture (STRA): From Parameter-Centric Models to Structure-Centric Reasoning

Marcelo Emanuel Paradela Teixeira, Independent Researcher, France

ABSTRACT

Large language models fuse knowledge and reasoning into billions of inscrutable parameters, trading interpretability for performance. We propose Semantic Topology Reasoning Architecture (STRA), which cleanly separates: (1) knowledge as explicit, inspectable semantic topology; (2) reasoning as meta-operations by smaller models (1-7B parameters) trained on topology navigation; (3) language as output interface, not cognitive substrate. This separation enables transparency (visible reasoning paths), efficiency (targeted computation), correctability (edit knowledge without retraining), and genuine cross-domain reasoning through semantic similarity. STRA integrates five primitives: Activation Arrays (working memory), Causal Signatures (cross-domain analogy), Selection Pressure (reasoning stability), Transform Learning (procedural compression), and Semantic Abacus (skill acquisition). These form a complete architecture for transparent, evolvable reasoning that operates on concepts, not tokens.

KEYWORDS

Semantic reasoning, transparent AI, knowledge representation, activation dynamics, explainable AI

Privacy-By-Default: An Industry-Aware Framework for Automated Data Retention at Scale

Sandhya Vinjam, Principal Software Engineer, Texas, USA

ABSTRACT

Data privacy regulations such as GDPR, CCPA, and LGPD impose strict requirements on organizations to automatically delete personally identifiable information (PII) after specified retention periods. However, implementing compliant data retention at scale presents significant architectural and operational challenges, particularly for platforms processing millions of records daily across distributed microservices. This paper presents Privacy-by-Default, an industry-aware framework that automates data retention enforcement without requiring per-merchant configuration. Our framework processes 50,000 daily redaction requests across 5 million user records spanning 12 microservices, achieving 99.7% deletion success rates with sub-3-hour latency. Through industry-specific retention policies and multi-service orchestration, we demonstrate how privacy compliance can be achieved by design rather than by configuration. Evaluation across pharmaceutical, healthcare, retail, and restaurant sectors shows our framework reduces compliance violations by 94%, eliminates manual intervention overhead, and provides audit-ready verification. We estimate our deployment has avoided approximately $4 million in potential regulatory fines while enabling market expansion into regulated jurisdictions.

KEYWORDS

Privacy engineering; GDPR compliance; automated data retention; privacy-by-design; PII redaction; distributed systems; microservices architecture


Economic Impact of Security Failures in Cloud Infrastructure

Sandhya Vinjam, Principal Software Engineer, Texas, USA

ABSTRACT

Security failures in cloud infrastructure result in significant economic losses that extend far beyond immediate breach costs. This paper presents a comprehensive analysis of the economic impact of security failures across cloud service providers, examining direct costs (incident response, system recovery, regulatory fines) and indirect costs (customer churn, reputational damage, market valuation impact). Through analysis of 127 publicly disclosed security incidents affecting cloud infrastructure providers between 2019 and 2024, we quantify the total economic impact at $47.3B, with individual incidents ranging from $2.1M to $4.2B. We develop a predictive model correlating security architecture decisions with economic risk, demonstrating that proactive security investments of $1M-5M can prevent potential losses of $50M-500M. Our findings show that the mean time to detect (MTTD) security incidents has the strongest correlation with total economic impact (r=0.82, p<0.001), suggesting that investment in detection capabilities provides the highest ROI for mitigating financial risk. We present evidence that organizations implementing comprehensive security frameworks achieve 73% lower total cost of incidents and 89% faster recovery times. This work provides quantitative evidence for prioritizing security investments in cloud infrastructure and establishes benchmarks for measuring the economic effectiveness of security programs.

KEYWORDS

Privacy engineering; Economics; Security; Mean time to detect; Cloud Infrastructure.


Real-time Smile Synchronization as a Mechanism for Emotional Contagion in Public Interactive Displays

He-lin Luo and Meng-fan Huang, Graduate Institute of Animation and Film Art, Tainan National University of the Arts, Tainan City, Taiwan

ABSTRACT

Emotional contagion refers to the psychological and behavioral phenomenon in which individuals unconsciously mimic the facial expressions, vocal patterns, postures, and movements of others during social interactions, resulting in corresponding changes in their own emotional states. With the rapid development of digital media and networked communication platforms, emotional transmission is no longer limited to face-to-face interaction, but increasingly mediated through multimodal digital signals such as symbolic icons, animated feedback, visual imagery, and auditory cues. This transformation has positioned emotional contagion as a critical research topic in the fields of Human-Computer Interaction (HCI) and Affective Computing. This study focuses on the transmission of positive emotions, specifically investigating the contagion effect of happiness through an interactive installation titled Quartic Smile. The system was designed to construct a real-time emotional feedback environment in which users can perceive and respond to the emotional expressions of others within a shared interactive space. By integrating real-time facial expression recognition, the system captures smiling behaviors as emotional triggers and translates them into visualized interactive responses, thereby facilitating emotional resonance and collective engagement among participants. To quantitatively evaluate the effectiveness of emotional contagion, two core metrics were defined in this study. The first is the contagion level, which is calculated based on the frequency of smiles and reflects the intensity and distribution of emotional transmission among users. The second is the contagion speed, measured by the cumulative duration of smiling behaviors between participants, representing the temporal dynamics and responsiveness of emotional propagation. 
The experimental results indicate that Quartic Smile effectively enhances positive emotional interaction and demonstrates the potential of real-time interactive systems to shape collective emotional atmospheres and social engagement patterns.

KEYWORDS

Smile Detection, Emotion Recognition, Real-Time System, Public Interactive Installation.


Evaluating Chunking Strategies for Retrieval-augmented Generation in Oil and Gas Enterprise Documents

Samuel Taiwo and Mohd Amaluddin Yusoff, Digital and Innovation Department, Nigeria LNG Limited, Port-Harcourt, Nigeria

ABSTRACT

Retrieval-Augmented Generation (RAG) has emerged as a framework to address the constraints of Large Language Models (LLMs), yet its effectiveness fundamentally hinges on document chunking—an often-overlooked determinant of its quality. This paper presents an empirical study quantifying performance differences across four chunking strategies: fixed-size sliding window, recursive, breakpoint-based semantic, and structure-aware. We evaluated these methods using a proprietary corpus of oil and gas enterprise documents, including text-heavy manuals, table-heavy specifications, and piping and instrumentation diagrams (P&IDs). Our findings show that structure-aware chunking yields higher overall retrieval effectiveness, particularly in top-K metrics, and incurs significantly lower computational costs than semantic or baseline strategies. Crucially, all four methods demonstrated limited effectiveness on P&IDs, underscoring a core limitation of purely text-based RAG within visually and spatially encoded documents. We conclude that while explicit structure preservation is essential for specialised domains, future work must integrate multimodal models to overcome current limitations.

KEYWORDS

RAG, AI, Oil and Gas, Information Retrieval
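To make two of the compared strategies concrete, here is a minimal Python sketch of fixed-size sliding-window and recursive chunking (the character-based sizes and separator list are illustrative assumptions; production RAG pipelines typically count tokens rather than characters):

```python
def sliding_window_chunks(text, size=200, overlap=50):
    """Fixed-size sliding window: chunks of `size` chars, sharing `overlap` chars."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def recursive_chunks(text, max_len=200, seps=("\n\n", "\n", ". ", " ")):
    """Recursive splitting: try coarser separators first, recurse on oversized pieces.

    Note this sketch drops the separators themselves; real implementations
    usually reattach or preserve them.
    """
    if len(text) <= max_len:
        return [text]
    for sep in seps:
        parts = text.split(sep)
        if len(parts) > 1:
            chunks = []
            for part in parts:
                chunks.extend(recursive_chunks(part, max_len, seps))
            return chunks
    # No separator present: fall back to a hard character cut.
    return [text[i:i + max_len] for i in range(0, len(text), max_len)]
```

Structure-aware chunking, the strategy the paper favours, would instead split on document elements (headings, tables, list items) rather than on generic separators.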


Prolonging Anti-Deepfake Signatures Lifetime with Blockchain-Based Timestamps

Sohaib Saleem and Pericle Perazzo, University of Pisa, Italy

ABSTRACT

As AI-generated synthetic media, such as deepfake images, proliferate, verifying the authenticity of digital images has become a significant challenge. Traditional digital signature techniques become invalid if images are cropped; therefore, special croppable signatures have been proposed in the literature. However, both traditional and croppable signatures remain valid only as long as their associated public key certificate remains valid. This could be problematic for authenticated images, as they often circulate over the Internet for long periods of time, beyond the expiration of their public key certificates. Re-signing each image with a new key requires redistributing all affected images, and this may be impractical for large-scale systems. To address this issue, we propose an image authentication system with croppability and post-expiration validity features, using BLS (Boneh–Lynn–Shacham) short signatures, the Ethereum blockchain as a decentralized trusted timestamping service, and IPFS (InterPlanetary File System) as a decentralized storage solution. Additionally, we employ two methods: a baseline method, in which the web server hosting the images does not pay any transaction fees, and an optimized method, which produces very little traffic on the web browser. Experimental evaluations are conducted in Pakistan and Italy under real Wi-Fi and simulated 4G cellular connections using Linux traffic control (tc) to demonstrate the system’s performance. Results showed that, in the baseline method, the network traffic overhead and communication delay increase linearly with the image size. Meanwhile, the optimized method achieves constant-time performance for retrieval and verification.

KEYWORDS

Image authentication, BLS signatures, blockchain, decentralized timestamping, IPFS.


AI-based Classification of Meat Freshness using Cantilever Sensor Data

Sebastian Hauschild, Jan-Philipp Schreiter and Horst Hellbruck, Luebeck University of Applied Sciences, Center of Excellence CoSA, Germany

ABSTRACT

A novel approach for determining the freshness of fish and meat involves the use of cantilever sensors, which analyse the concentration of cadaverine on the surface. The cantilever sensor is excited with a voltage sweep around its resonance frequency, and the frequency shift due to deposits on the sensor is measured. In this work, we present a draft of a distributed system and compare AI-based analysis of the stored cantilever sensor data with raw sweep data without preprocessing. We defined a meat quality index (mqi) range for the measurements, which depends on the frequency shift between a reference and a cadaverine measurement. We found that the best practice to predict the mqi value is to use classical machine learning models such as Random Forest, LightGBM, and XGBoost, where Random Forest performs best with a val. / test accuracy of up to 72.01% / 71.67%, precision of 72.37% / 72.53%, recall of 72.01% / 71.67%, and F1-score of 72.06% / 71.72%.

KEYWORDS

Cantilever, Machine Learning, Database, Distributed Systems, Sustainability


AI-Driven Climate Adaptation Models for Predictive Crop Yield Optimization

Venkateshwara Reddy Mudiyala1 and Sai Manvith Reddy Buchi Reddy2, 1Department of Computer Science, New England College, Henniker, New Hampshire, 2Department of Computer Science, University of Bridgeport, Bridgeport, Connecticut

ABSTRACT

Agriculture is increasingly threatened by climate variability and change, necessitating innovative solutions for sustainable crop yield optimization. This paper presents an advanced AI-driven framework for climate adaptation that predicts crop yields by integrating novel machine learning methodologies and applied sciences. The proposed system leverages hybrid machine learning models, incorporating meteorological, soil, and satellite data to deliver precise and actionable insights for farmers and agricultural stakeholders. Our principled solution focuses on novel data fusion architecture, robust feature engineering, and adaptive modeling techniques, demonstrating significant advancements over conventional methods. An in-depth evaluation reveals the framework's ability to enhance decision-making and mitigate the adverse effects of climatic uncertainties.

Spatio-temporal Prediction of Crimes using Predictive Justice Algorithms

Fahil Abdulbasit A. Abdulkareem, Legal Administrative Department, Duhok Polytechnic University, Duhok, Iraqi Kurdistan Region

ABSTRACT

Risk Terrain Modelling (RTM) is a statistical/machine learning technique currently deployed as a software solution to diagnose the socio-environmental conditions that lead to crime in a specific geographic area (the Study Area). It does so by analysing crimes geospatially and temporally, linking them to hotspots, and mining them as big (criminal) data. As a result, new patterns predicting future risk in the surveyed area emerge, enabling a prompt and efficient response by the Predictive Police. Prioritising the use of precautionary resources is necessary in two ways: first, to prevent crime and lessen potential dangers in the event that one does occur; second, to determine what must be done as soon as possible as a preventive measure to control the crime with the least amount of harm to the police forces. Leslie W. Kennedy and Joel M. Caplan founded Risk Terrain Modelling at Rutgers University, and it has been systematically applied in the field of criminal investigation for over ten years. Currently, the model is being tested in more than 45 nations worldwide.

KEYWORDS

Predictive Policing, Near-Repeat Phenomenon, Crime Prediction, Risk Terrain Modelling, Agent-Based Modelling.


Enhancing Financial Report Question-Answering: A Retrieval-Augmented Generation System with Reranking Analysis

Zhiyuan Cheng1, Longying Lai2, Yue Liu3, Kai Cheng4 and Xiaoxi Qi5, 1School of Engineering, Stanford University, Stanford, CA, USA, 2Simon Business School, University of Rochester, Rochester, NY, USA, 3Accounting & Information Systems, Rutgers University, Newark, NJ, USA, 4Institute for Social and Economic Research and Policy, Columbia University, New York, NY, USA, 5Department of Economics, Northeastern University, Boston, MA, USA

ABSTRACT

Financial analysts face significant challenges extracting information from lengthy 10-K reports, which often exceed 100 pages. This paper presents a Retrieval-Augmented Generation (RAG) system designed to answer questions about S&P 500 financial reports and evaluates the impact of neural reranking on system performance. Our pipeline employs hybrid search combining full-text and semantic retrieval, followed by an optional reranking stage using a cross-encoder model. We conduct systematic evaluation using the FinDER benchmark dataset, comprising 1,500 queries across five experimental groups. Results demonstrate that reranking significantly improves answer quality, achieving 49.0 percent correctness compared to 33.5 percent without reranking (15.5 percentage point improvement). The error rate decreases from 35.3 percent to 22.5 percent (12.8 percentage point reduction), and average scores improve from 4.95 to 6.02 (21.6 percent relative improvement). Our findings emphasize the critical role of reranking in financial RAG systems and demonstrate performance improvements over baseline methods through modern language models and refined retrieval strategies.

KEYWORDS

Retrieval-Augmented Generation, Financial Document Analysis, Question Answering, Neural Reranking, 10-K Reports.
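A schematic Python sketch of the two-stage pipeline this abstract describes — hybrid first-stage retrieval followed by reranking — where `embed` and `cross_scorer` are hypothetical placeholders standing in for the semantic retriever and cross-encoder models, not the paper's actual components:

```python
def keyword_score(query, doc):
    """Full-text signal: fraction of query terms present in the document."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms)

def hybrid_retrieve(query, docs, embed, k=5):
    """First stage: blend keyword overlap with a semantic score, keep top-k."""
    scored = [(0.5 * keyword_score(query, d) + 0.5 * embed(query, d), d)
              for d in docs]
    return [d for _, d in sorted(scored, reverse=True)[:k]]

def rerank(query, candidates, cross_scorer):
    """Second stage: re-order the shortlist with a cross-encoder-style score
    computed jointly over (query, document) pairs."""
    return sorted(candidates, key=lambda d: cross_scorer(query, d), reverse=True)
```

The key design point is that the expensive pairwise scorer only sees the short candidate list from the cheap first stage, which is what makes cross-encoder reranking affordable over 100-plus-page filings.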


Reach Us

sigml@ccseit2026.org


sigmlconfe@yahoo.com
