SOTA
MORIYAMA

Conference Papers

2025.12

Graph-Based Attention for Differentiable MaxSAT Solving (Spotlight)

Sota Moriyama, Katsumi Inoue

The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025).

The use of deep learning to solve fundamental AI problems such as Boolean Satisfiability (SAT) has been explored recently to develop robust and scalable reasoning systems. This work advances such neural-based reasoning approaches by developing a new Graph Neural Network (GNN) to differentiably solve (weighted) Maximum Satisfiability (MaxSAT). To this end, we propose SAT-based Graph Attention Networks (SGATs) as novel GNNs that are built on t-norm based attention and message passing mechanisms, and structurally designed to approximate greedy distributed local search. To demonstrate the effectiveness of our model, we develop a local search solver that uses SGATs to continuously solve any given MaxSAT problem. Experiments on (weighted) MaxSAT benchmark datasets demonstrate that SGATs significantly outperform existing neural-based architectures, and achieve state-of-the-art performance among continuous approaches, highlighting the strength of the proposed model.

2025.12

T-norm Selection for Object Detection in Autonomous Driving with Logical Constraints

Thomas Eiter, Nelson Higuera Ruiz, Katsumi Inoue, Sota Moriyama

The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025).

Integrating logical constraints into object detection models for autonomous driving (AD) is a promising way to enhance their compliance with rules and thereby increase the safety of the system. T-norms have been utilized to calculate the constrained loss, i.e., the violations of logical constraints as losses. While prior works have statically selected a few t-norms, we conduct an extensive experimental study to identify the most effective choices, as suboptimal t-norms can lead to undesired model behavior. To this end, we present MOD-ECL, a neurosymbolic framework that implements a wide range of t-norms and applies them in an adaptive manner. It includes an algorithm that selects well-performing t-norms during training and a scheduler that regulates the impact of the constrained loss. We evaluate its effectiveness on the ROAD-R and ROAD-Waymo-R datasets for object detection in AD, using attached common-sense constraints. Our results show that careful selection of parameters is crucial for effective constrained loss behavior. Moreover, our framework not only reduces constraint violations but also, in some cases, improves detection performance. Additionally, our methods offer fine-grained control over the trade-off between accuracy and constraint violation.

2023.11

GNN Based Extraction of Minimal Unsatisfiable Subsets

Sota Moriyama, Koji Watanabe, Katsumi Inoue

Inductive Logic Programming 2023. Lecture Notes in Computer Science, vol 14363. Springer, Cham.

In Boolean Satisfiability (SAT), Minimal Unsatisfiable Subsets (MUSes) are unsatisfiable subsets of constraints that serve as explanations for the unsatisfiability which, as a result, have been used in various applications. Although various systematic algorithms for the extraction of MUSes have been proposed, few heuristic methods have been studied, as the process of designing efficient heuristics requires extensive experience and expertise. In this research, we propose the first trainable heuristic based on Graph Neural Networks (GNNs). We design a new network structure along with loss functions and learning strategies specifically tuned to learn the process of MUS extraction, which we implement in a model called GNN-MUS. Furthermore, we introduce a new algorithm called NeuroMUSX that uses GNN-MUS as a heuristic and combines it with other systematic search methods to make the extraction process more efficient. We conduct experiments to compare our proposed method with existing methods on the MUS Track of the 2011 SAT Competition. From the results, NeuroMUSX is shown to achieve significantly better performance across a wide range of problem instances. In addition, training NeuroMUSX on specific instances of a class improves the algorithm’s performance against other problems in the same class, highlighting the advantages of the learnable approach. Overall, these results underscore the potential of using simple GNN architectures to drastically improve the procedures for extracting minimal subsets.

Conference Presentations

2024.11

A Constraint-Based Visual Dataset Generator

Thomas Eiter, Nelson Higuera, Katsumi Inoue, Sota Moriyama

The 7th Workshop on Trends and Applications of Answer Set Programming.

2023.12

MOD-CL: Multi-label Object Detection with Constrained Loss

Sota Moriyama, Koji Watanabe, Katsumi Inoue, Akihiro Takemura

ROAD-R Challenge for NeurIPS 2023.

2023.11

GNN Based Extraction of Minimal Unsatisfiable Subsets

Sota Moriyama, Koji Watanabe, Katsumi Inoue

Inductive Logic Programming 2023.

Domestic Conferences

2025.08

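Differentiable MaxSAT Solving Based on Graph Attention

Sota Moriyama, Katsumi Inoue

JSAI Technical Report, SIG-FPAI (Special Interest Group on Fundamental Problems in Artificial Intelligence), 2025, 133rd meeting (2025/08), pp. 01-06.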

The use of deep learning to solve fundamental AI problems such as Boolean Satisfiability (SAT) has been explored recently to develop robust and scalable reasoning systems. This work advances such neural-based reasoning approaches by developing a new Graph Neural Network (GNN) to differentiably solve (weighted) Maximum Satisfiability (MaxSAT). To this end, we propose SAT-based Graph Attention Networks (SGATs) as novel GNNs that are built on t-norm based attention and message passing mechanisms, and structurally designed to approximate greedy distributed local search. To demonstrate the effectiveness of our model, we develop a local search solver that uses SGATs to continuously solve any given MaxSAT problem. Experiments on (weighted) MaxSAT benchmark datasets demonstrate that SGATs significantly outperform existing neural-based architectures, and achieve state-of-the-art performance among continuous approaches, highlighting the strength of the proposed model.

2024.05

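Introducing Constraint Knowledge into Multi-label Object Recognition and Its Application to ROAD-R

Sota Moriyama, Koji Watanabe, Katsumi Inoue, Akihiro Takemura

Proceedings of the Annual Conference of JSAI, 38th (2024/05), Session ID 2M1-OS-11a-03.

In autonomous driving, recognizing the action each object is performing makes a model more useful, but the number of possible combinations of fine-grained actions is very large, which raises the risk of misrecognition. In this work, we write down the properties that each combination must satisfy as constraints, and propose a framework that uses this constraint information during training and inference to improve model performance and reduce the frequency of misrecognition. Specifically, we develop MODYOLO, an extension of the state-of-the-art object detection model YOLOv8 that supports multi-label recognition, and examine the results of applying it to the ROAD-R Challenge for NeurIPS 2023 competition. For Task 1, we propose two new models, the Corrector Model and the Blender Model, as mechanisms for controlling the inference output of the object detection model. For Task 2, we train MODYOLO with a constraint term based on fuzzy logic added to the loss. With this approach we won first place in Task 2 and third place in Task 1, which suggests that the framework is effective on real data.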

2023.08

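Extraction of Minimal Unsatisfiable Subsets Based on Graph Neural Networks

Sota Moriyama, Koji Watanabe, Katsumi Inoue

JSAI Technical Report, SIG-FPAI (Special Interest Group on Fundamental Problems in Artificial Intelligence), 2023, 125th meeting (2023/08), pp. 03-08.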

In Boolean Satisfiability (SAT), Minimal Unsatisfiable Subsets (MUSes) are unsatisfiable subsets of constraints that serve as explanations for the unsatisfiability which, as a result, have been used in various applications. Although various systematic algorithms for the extraction of MUSes have been proposed, few heuristic methods have been studied, as the process of designing efficient heuristics requires extensive experience and expertise. In this research, we propose a trainable heuristic based on Graph Neural Networks (GNNs). We design a new network structure along with loss functions and learning strategies specifically tuned to learn the process of MUS extraction, which we implement in a model called GNN-MUS. Furthermore, we introduce a new algorithm called NeuroMUSX that uses GNN-MUS as a heuristic and combines it with other systematic search methods to make the extraction process more efficient. We conduct experiments to compare our proposed method with existing methods on the MUS Track of the 2011 SAT Competition. From the results, NeuroMUSX is shown to achieve significantly better performance across a wide range of problem instances. In addition, training NeuroMUSX on specific instances of a class improves the algorithm’s performance against other problems in the same class, highlighting the advantages of the learnable approach. Overall, these results underscore the potential of using simple GNN architectures to drastically improve the procedures for extracting minimal subsets.

2023.02

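Multi-source Transfer Learning by Integrating Multiple CNN Models Based on an Attention Mechanism

Sota Moriyama, Kazuaki Nakamura

Proceedings of the 85th National Convention of IPSJ, 2023, No. 1 (2023/02), pp. 115-116.

One way to prevent "negative transfer" in transfer learning is multi-source transfer learning, which integrates and uses multiple models trained on different source tasks. This integration is generally performed at the decision level with methods such as boosting. In this work, we propose a feature-level integration method, mainly targeting image recognition, with the aim of improving recognition accuracy. The proposed method introduces an attention mechanism to diversify the regions each model attends to and to strengthen cooperation among the models. We also experimentally investigate how the value range and the direction (spatial or channel) of the attention maps affect the performance of the integrated model, with a view to further improving the method.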

Preprints

2024.01

MOD-CL: Multi-label Object Detection with Constrained Loss

Sota Moriyama, Koji Watanabe, Katsumi Inoue, Akihiro Takemura

ArXiv, abs/2403.07885

We introduce MOD-CL, a multi-label object detection framework that uses a constrained loss during training to produce outputs that better satisfy the given requirements. In this paper, we use MODYOLO, a multi-label object detection model built upon the recently published state-of-the-art object detection model YOLOv8. For Task 1, we introduce the Corrector Model and the Blender Model, two new models that follow the object detection process and aim to generate a more constrained output. For Task 2, constrained losses are incorporated into the MODYOLO architecture using the Product T-Norm. The results show that these implementations are instrumental in improving the scores for both Task 1 and Task 2.

Internship

2025.11

Mirror in the Model: Ad Banner Image Generation via Reflective Multi-LLM and Multi-modal Agents

Zhao Wang, Bowen Chen, Yotaro Shimose, Sota Moriyama, Heng Wang, Shingo Takamatsu

EMNLP 2025 Industry Track.

Recent generative models such as GPT-4o have shown strong capabilities in producing high-quality images with accurate text rendering. However, commercial design tasks like advertising banners demand more than visual fidelity: they require structured layouts, precise typography, consistent branding, and more. In this paper, we introduce MIMO (Mirror In-the-Model), an agentic refinement framework for automatic ad banner generation. MIMO combines a hierarchical multimodal agent system (MIMO-Core) with a coordination loop (MIMO-Loop) that explores multiple stylistic directions and iteratively improves design quality. Requiring only a simple natural-language prompt and a logo image as input, MIMO automatically detects and corrects multiple types of errors during generation. Experiments show that MIMO significantly outperforms existing diffusion and LLM-based baselines in real-world banner design scenarios.