MoKGR

Mixture of Length and Pruning Experts for Knowledge Graphs Reasoning

EMNLP 2025 Main Oral
Enjun Du1,2, Siyi Liu1, Yongqi Zhang1
1HKUST(GZ)    2Beijing Institute of Technology
logologo
← Back to Homepage Framework overview

Figure 1: Motivation — Two queries on the same KG require different reasoning depths: (JACK, followed, ?) resolves in 3 hops, while (JACK, watched, ?) needs deeper exploration. MoKGR adapts path length and pruning per query.

Abstract

Knowledge Graph (KG) reasoning critically depends on constructing informative reasoning paths. Existing GNNs adopt rigid, query-agnostic strategies. We propose MoKGR, a mixture-of-experts framework with two innovations: (1) a mixture of length experts that adaptively weights path lengths based on query complexity, and (2) a mixture of pruning experts that evaluates paths from complementary perspectives. MoKGR achieves state-of-the-art in both transductive and inductive settings.


Motivation

Motivation

Consider two queries on the same graph: (JACK, followed, ?) can be resolved within 3 hops, while (JACK, watched, ?) requires deeper exploration. Existing methods use fixed reasoning depth for all queries and uniform pruning criteria — both are suboptimal. MoKGR personalizes both the depth and the pruning strategy per query.


Method

MoKGR introduces two complementary Mixture-of-Experts modules:

  • Mixture of Length Experts — Multiple experts specialized for different reasoning depths. A learned gating network computes query-specific weights over path lengths: simple queries route to short paths, complex queries activate deeper experts. The final output is a soft weighted combination.
  • Mixture of Pruning Experts — At each GNN layer, multiple pruning experts evaluate candidate paths from complementary perspectives (structural, semantic, diversity). A learned aggregation selects the top-k most informative paths per query.
  • End-to-End Training — Both modules are trained jointly with the answer prediction objective, ensuring optimal synergy between depth selection and path pruning.

Experimental Results

Transductive Setting (6 Benchmarks)

Main Results

MoKGR achieves the best results across all 6 benchmarks (Family, UMLS, WN18RR, FB15k237, NELL-995, YAGO3-10), outperforming both non-GNN baselines and state-of-the-art GNN methods including NBFNet, RED-GNN, A*Net, and AdaProp.

Efficiency & Convergence

Efficiency

MoKGR converges significantly faster than competing methods while maintaining lower inference time, demonstrating that personalized path exploration is both more accurate and more efficient.

Expert Selection Analysis

Expert Analysis

The learned gating weights show interpretable patterns: the model prefers medium-length paths overall, but adapts dynamically — shorter paths for simple relational queries, deeper paths for complex multi-hop reasoning.

Ablation Study

Ablation

Removing the length experts causes the largest drop, confirming that adaptive depth is the most critical innovation. Removing pruning experts or using a single expert also degrades performance.

Presentation Video

Citation

@inproceedings{du2025mokgr, title = {Mixture of Length and Pruning Experts for Knowledge Graphs Reasoning}, author = {Du, Enjun and Liu, Siyi and Zhang, Yongqi}, booktitle = {Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP)}, year = {2025}, note = {Oral Presentation} }