<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Blog | Enjun Du（杜恩俊）</title><link>https://enjundu.com/blog/</link><atom:link href="https://enjundu.com/blog/index.xml" rel="self" type="application/rss+xml"/><description>Blog</description><generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Tue, 25 Mar 2025 00:00:00 +0000</lastBuildDate><image><url>https://enjundu.com/media/icon_hu7729264130191091259.png</url><title>Blog</title><link>https://enjundu.com/blog/</link></image><item><title>GraphOracle: Efficient Fully-Inductive Knowledge Graph Reasoning via Relation-Dependency Graphs</title><link>https://enjundu.com/blog/graphoracle/</link><pubDate>Tue, 25 Mar 2025 00:00:00 +0000</pubDate><guid>https://enjundu.com/blog/graphoracle/</guid><description>&lt;h2 id="abstract">Abstract&lt;/h2>
&lt;p>Knowledge graph reasoning in the &lt;strong>fully-inductive setting&lt;/strong> — where both entities and relations at test time are unseen during training — remains an open challenge. We introduce &lt;strong>GraphOracle&lt;/strong>, a novel framework that transforms each knowledge graph into a &lt;strong>Relation-Dependency Graph (RDG)&lt;/strong> encoding directed precedence links between relations. A &lt;strong>multi-head attention&lt;/strong> mechanism produces context-aware relation embeddings that guide inductive message passing over the original KG. Experiments on &lt;strong>60 benchmarks&lt;/strong> show &lt;strong>up to 25% improvement&lt;/strong> in fully-inductive and &lt;strong>28% in cross-domain&lt;/strong> scenarios.&lt;/p>
&lt;hr>
&lt;h2 id="motivation">Motivation&lt;/h2>
&lt;p>Existing fully-inductive methods (INGRAM, ULTRA) construct &lt;strong>undirected&lt;/strong> relation graphs that are dense and fail to capture &lt;strong>directed compositional patterns&lt;/strong> — e.g., &amp;ldquo;born_in → located_in&amp;rdquo; is a directional dependency that undirected graphs cannot distinguish. Furthermore, they are limited to single-domain scenarios and cannot transfer across entirely different knowledge graphs.&lt;/p>
&lt;hr>
&lt;h2 id="method">Method&lt;/h2>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img alt="Framework" srcset="
/blog/graphoracle/framework_hu17442796894096862171.webp 400w,
/blog/graphoracle/framework_hu16880348516718552876.webp 760w,
/blog/graphoracle/framework_hu17354898415690024261.webp 1200w"
src="https://enjundu.com/blog/graphoracle/framework_hu17442796894096862171.webp"
width="760"
height="303"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>GraphOracle operates in three stages:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>RDG Construction&lt;/strong> — Transform the KG into a directed Relation-Dependency Graph where edge &lt;em>(r_i, r_j)&lt;/em> indicates that relation &lt;em>r_i&lt;/em> precedes &lt;em>r_j&lt;/em> in observed triple chains. This is significantly sparser than prior relation graphs while capturing compositional patterns.&lt;/li>
&lt;li>&lt;strong>Query-Dependent Multi-Head Attention&lt;/strong> — For each query relation, a multi-head attention GNN propagates messages over the RDG to produce context-aware relation embeddings. The same relation gets different representations depending on the query context.&lt;/li>
&lt;li>&lt;strong>Entity-Level Answer Prediction&lt;/strong> — The learned relation embeddings parameterize a second GNN on the original KG entity graph, performing inductive message passing from the query entity to score candidates.&lt;/li>
&lt;/ul>
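&lt;p>The RDG construction step can be sketched in a few lines: an edge &lt;em>(r_i, r_j)&lt;/em> is emitted whenever a triple using &lt;em>r_i&lt;/em> chains into a triple using &lt;em>r_j&lt;/em> through a shared entity. This is an illustrative reimplementation under assumed data shapes, not the authors' code; names like &lt;code>build_rdg&lt;/code> are ours.&lt;/p>

```python
from collections import defaultdict

def build_rdg(triples):
    """Derive directed relation-dependency edges from KG triples.

    An edge (r1, r2) is added when some triple (h, r1, t) is followed
    by a triple (t, r2, u), i.e. r1 precedes r2 in a two-hop chain.
    """
    out_by_head = defaultdict(list)          # head entity -> [(relation, tail)]
    for h, r, t in triples:
        out_by_head[h].append((r, t))

    rdg_edges = set()
    for h, r1, t in triples:                 # chain: h --r1--> t --r2--> u
        for r2, _u in out_by_head.get(t, []):
            rdg_edges.add((r1, r2))
    return rdg_edges

triples = [
    ("alice", "born_in", "paris"),
    ("paris", "located_in", "france"),
]
print(build_rdg(triples))                    # {('born_in', 'located_in')}
```

&lt;p>Because only observed two-hop chains generate edges, the resulting graph is far sparser than a fully connected relation graph, which is the sparsity property the paper highlights.&lt;/p>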
&lt;hr>
&lt;h2 id="experimental-results">Experimental Results&lt;/h2>
&lt;p>Evaluated across &lt;strong>60 benchmarks&lt;/strong> spanning transductive, entity-inductive, fully-inductive, and cross-domain settings.&lt;/p>
&lt;h3 id="mrr-comparison-on-60-datasets">MRR Comparison on 60 Datasets&lt;/h3>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img alt="MRR Comparison" srcset="
/blog/graphoracle/mrr_comparison_hu8110470002604910957.webp 400w,
/blog/graphoracle/mrr_comparison_hu4321970991457367304.webp 760w,
/blog/graphoracle/mrr_comparison_hu128044742679316233.webp 1200w"
src="https://enjundu.com/blog/graphoracle/mrr_comparison_hu8110470002604910957.webp"
width="760"
height="272"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>GraphOracle (red) consistently matches or outperforms supervised SOTA (green) across all 60 datasets, with particularly strong gains on cross-domain and biomedical KGs.&lt;/p>
&lt;h3 id="average-performance-4-settings">Average Performance (4 Settings)&lt;/h3>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img alt="Main Results" srcset="
/blog/graphoracle/main_results_hu16923454802355985960.webp 400w,
/blog/graphoracle/main_results_hu7398350365299240702.webp 760w,
/blog/graphoracle/main_results_hu2897405029583734145.webp 1200w"
src="https://enjundu.com/blog/graphoracle/main_results_hu16923454802355985960.webp"
width="760"
height="377"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>GraphOracle achieves &lt;strong>+7.19%&lt;/strong> MRR improvement in transductive, &lt;strong>+10.86%&lt;/strong> in entity-inductive, &lt;strong>+13.28%&lt;/strong> in fully-inductive, and &lt;strong>+26.82%&lt;/strong> in cross-domain settings over supervised SOTA.&lt;/p>
&lt;h3 id="ablation-study">Ablation Study&lt;/h3>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img alt="Ablation" srcset="
/blog/graphoracle/ablation_hu6660703363372269825.webp 400w,
/blog/graphoracle/ablation_hu3398867998203931436.webp 760w,
/blog/graphoracle/ablation_hu14131753493942245603.webp 1200w"
src="https://enjundu.com/blog/graphoracle/ablation_hu6660703363372269825.webp"
width="760"
height="220"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>Removing the RDG structure or multi-head attention causes significant performance drops, confirming both components are essential. The directed precedence encoding is the single most important design choice.&lt;/p></description></item><item><title>GraphMaster: Automated Graph Synthesis via LLM Agents in Data-Limited Environments</title><link>https://enjundu.com/blog/graphmaster/</link><pubDate>Thu, 06 Mar 2025 00:00:00 +0000</pubDate><guid>https://enjundu.com/blog/graphmaster/</guid><description>&lt;h2 id="abstract">Abstract&lt;/h2>
&lt;p>The era of foundation models has revolutionized AI research, yet &lt;strong>Graph Foundation Models (GFMs)&lt;/strong> remain constrained by the scarcity of large-scale graph corpora. We introduce &lt;strong>GraphMaster&lt;/strong> — the first multi-agent framework specifically designed for graph data synthesis in data-limited environments. GraphMaster orchestrates &lt;strong>four specialized LLM agents&lt;/strong> (Manager, Perception, Enhancement, and Evaluation) that collaboratively optimize the synthesis process through iterative refinement, ensuring both semantic coherence and structural integrity.&lt;/p>
&lt;hr>
&lt;h2 id="motivation">Motivation&lt;/h2>
&lt;p>Graph Foundation Models face a critical data bottleneck. Existing synthesis methods fail for three reasons: &lt;strong>(1)&lt;/strong> classical augmentation (GraphSmote, G-Mixup) only manipulates structure, producing semantically empty nodes; &lt;strong>(2)&lt;/strong> LLMs cannot process entire graphs within context windows; &lt;strong>(3)&lt;/strong> uncoordinated LLM generation introduces hallucinations that violate graph topology.&lt;/p>
&lt;hr>
&lt;h2 id="method">Method&lt;/h2>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img alt="Framework" srcset="
/blog/graphmaster/framework_hu1132089422936647416.webp 400w,
/blog/graphmaster/framework_hu5928412425996290846.webp 760w,
/blog/graphmaster/framework_hu2872404166475514373.webp 1200w"
src="https://enjundu.com/blog/graphmaster/framework_hu1132089422936647416.webp"
width="760"
height="257"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>GraphMaster decomposes graph synthesis into four specialized agents:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Manager Agent&lt;/strong> — Selects between semantic and topological enhancement modes based on environmental analysis, and coordinates the entire synthesis workflow.&lt;/li>
&lt;li>&lt;strong>Perception Agent&lt;/strong> — Overcomes context-window limitations via semantic-aware community detection, mode-adaptive seed selection, and hierarchical Personalized-PageRank (PPR) diffusion sampling to extract representative subgraphs.&lt;/li>
&lt;li>&lt;strong>Enhancement Agent&lt;/strong> — Generates new nodes and edges conditioned on extracted knowledge, with dual-mode generation for semantic coherence and structural fidelity.&lt;/li>
&lt;li>&lt;strong>Evaluation Agent&lt;/strong> — Assesses quality through multi-dimensional scoring (semantic + structural), with adaptive threshold and temporal convergence detection for iterative refinement.&lt;/li>
&lt;/ul>
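&lt;p>The four-agent workflow above amounts to an iterative propose-and-score loop. The sketch below shows one plausible control flow under assumed interfaces; the agent callables, the acceptance threshold, and &lt;code>max_rounds&lt;/code> are illustrative, not the paper's actual API.&lt;/p>

```python
def synthesize(graph, agents, max_rounds=5, threshold=0.8):
    """Illustrative control loop for the four-agent pipeline.

    `agents` is assumed to expose manager/perception/enhancement/
    evaluation callables; the real framework's interfaces may differ.
    """
    for _ in range(max_rounds):
        mode = agents["manager"](graph)                # semantic vs. topological
        subgraph = agents["perception"](graph, mode)   # context-window-sized sample
        candidate = agents["enhancement"](subgraph, mode)
        score = agents["evaluation"](candidate)        # semantic + structural score
        if score >= threshold:
            return candidate                           # accept synthesis
        graph = candidate                              # refine in the next round
    return graph

# toy stand-ins so the loop runs end-to-end
agents = {
    "manager": lambda g: "semantic",
    "perception": lambda g, m: g,
    "enhancement": lambda g, m: g + ["synthetic_node"],
    "evaluation": lambda g: min(1.0, len(g) / 4),
}
result = synthesize(["n1", "n2"], agents)
print(len(result))                                     # 4
```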
&lt;hr>
&lt;h2 id="experimental-results">Experimental Results&lt;/h2>
&lt;p>Evaluated on &lt;strong>6 data-limited benchmarks&lt;/strong> with &lt;strong>4 GNN architectures&lt;/strong> (GCN, JKNet, GraphSAGE, GAT), using QwQ-32B as the base LLM on 8x A100 GPUs.&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img alt="Main Results" srcset="
/blog/graphmaster/main_results_hu15599672716564817252.webp 400w,
/blog/graphmaster/main_results_hu17387086646069390212.webp 760w,
/blog/graphmaster/main_results_hu8138420897292343079.webp 1200w"
src="https://enjundu.com/blog/graphmaster/main_results_hu15599672716564817252.webp"
width="760"
height="269"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>GraphMaster consistently outperforms all baselines across all datasets. The bottom row (blue) shows GraphMaster achieving the highest accuracy and F1 scores on every benchmark.&lt;/p>
&lt;h3 id="graph-feature-preservation">Graph Feature Preservation&lt;/h3>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img alt="Feature Analysis" srcset="
/blog/graphmaster/feature_analysis_hu9285242773121029112.webp 400w,
/blog/graphmaster/feature_analysis_hu7005581667582390789.webp 760w,
/blog/graphmaster/feature_analysis_hu10889657734607540595.webp 1200w"
src="https://enjundu.com/blog/graphmaster/feature_analysis_hu9285242773121029112.webp"
width="760"
height="269"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>The synthesized graphs maintain high fidelity: a &lt;strong>KS statistic of 0.357&lt;/strong> (p=0.059, i.e. no statistically significant deviation at the 0.05 level) for the degree distribution, &lt;strong>0.835&lt;/strong> clustering-coefficient similarity, and &lt;strong>0.988&lt;/strong> label homogeneity, indicating strong structural preservation.&lt;/p>
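&lt;p>For readers unfamiliar with the metric: the two-sample KS statistic is the largest gap between the empirical CDFs of the original and synthesized degree distributions (0 means identical, 1 means fully separated). A minimal pure-Python computation for reference (this is not the paper's evaluation code):&lt;/p>

```python
import bisect

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic:
    the maximum absolute gap between the two empirical CDFs."""
    sa, sb = sorted(sample_a), sorted(sample_b)
    na, nb = len(sa), len(sb)
    d = 0.0
    for v in set(sa) | set(sb):
        ca = bisect.bisect_right(sa, v) / na   # ECDF of sample_a at v
        cb = bisect.bisect_right(sb, v) / nb   # ECDF of sample_b at v
        d = max(d, abs(ca - cb))
    return d

# identical degree samples give 0, fully separated samples give 1
print(ks_statistic([1, 2, 3], [1, 2, 3]))   # 0.0
print(ks_statistic([1, 1, 1], [5, 5, 5]))   # 1.0
```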
&lt;h3 id="ablation-study">Ablation Study&lt;/h3>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img alt="Ablation" srcset="
/blog/graphmaster/ablation_hu2146630035767259003.webp 400w,
/blog/graphmaster/ablation_hu7840005300923044651.webp 760w,
/blog/graphmaster/ablation_hu17319609184545855075.webp 1200w"
src="https://enjundu.com/blog/graphmaster/ablation_hu2146630035767259003.webp"
width="760"
height="257"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>Removing the Evaluation Agent causes the largest performance drop, confirming the critical role of iterative quality control. Each agent contributes uniquely to the final synthesis quality.&lt;/p></description></item><item><title>MoKGR: Mixture of Length and Pruning Experts for Knowledge Graphs Reasoning</title><link>https://enjundu.com/blog/mokgr/</link><pubDate>Sun, 19 Jan 2025 00:00:00 +0000</pubDate><guid>https://enjundu.com/blog/mokgr/</guid><description>&lt;h2 id="abstract">Abstract&lt;/h2>
&lt;p>Knowledge Graph (KG) reasoning critically depends on constructing informative reasoning paths. Existing GNN-based methods adopt rigid, query-agnostic path-exploration strategies. We propose &lt;strong>MoKGR&lt;/strong>, a mixture-of-experts framework with two innovations: &lt;strong>(1)&lt;/strong> a &lt;strong>mixture of length experts&lt;/strong> that adaptively weights path lengths based on query complexity, and &lt;strong>(2)&lt;/strong> a &lt;strong>mixture of pruning experts&lt;/strong> that evaluates candidate paths from complementary perspectives. MoKGR achieves &lt;strong>state-of-the-art&lt;/strong> results in both transductive and inductive settings.&lt;/p>
&lt;hr>
&lt;h2 id="motivation">Motivation&lt;/h2>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img alt="Motivation" srcset="
/blog/mokgr/motivation_hu18331973836230186814.webp 400w,
/blog/mokgr/motivation_hu8395380476757186774.webp 760w,
/blog/mokgr/motivation_hu1535173140742592088.webp 1200w"
src="https://enjundu.com/blog/mokgr/motivation_hu18331973836230186814.webp"
width="760"
height="263"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>Consider two queries on the same graph: &lt;em>(JACK, followed, ?)&lt;/em> can be resolved within 3 hops, while &lt;em>(JACK, watched, ?)&lt;/em> requires deeper exploration. Existing methods use &lt;strong>fixed reasoning depth&lt;/strong> for all queries and &lt;strong>uniform pruning criteria&lt;/strong> — both are suboptimal. MoKGR personalizes both the depth and the pruning strategy per query.&lt;/p>
&lt;hr>
&lt;h2 id="method">Method&lt;/h2>
&lt;p>MoKGR introduces two complementary Mixture-of-Experts modules:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Mixture of Length Experts&lt;/strong> — Multiple experts specialized for different reasoning depths. A learned gating network computes query-specific weights over path lengths: simple queries route to short paths, complex queries activate deeper experts. The final output is a soft weighted combination.&lt;/li>
&lt;li>&lt;strong>Mixture of Pruning Experts&lt;/strong> — At each GNN layer, multiple pruning experts evaluate candidate paths from complementary perspectives (structural, semantic, diversity). A learned aggregation selects the top-k most informative paths per query.&lt;/li>
&lt;li>&lt;strong>End-to-End Training&lt;/strong> — Both modules are trained jointly with the answer prediction objective, ensuring optimal synergy between depth selection and path pruning.&lt;/li>
&lt;/ul>
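&lt;p>The length-expert mixture reduces to a softmax gate whose weights blend per-depth score vectors into one prediction. A minimal sketch under assumed shapes (in the paper the gating network is learned from the query; here the logits are supplied directly for illustration):&lt;/p>

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def mixture_of_length_experts(gate_logits, expert_outputs):
    """Soft combination over depth-specialised experts.

    gate_logits: one gating logit per path length (from a learned gate).
    expert_outputs: one score vector per expert, same candidate ordering.
    """
    weights = softmax(gate_logits)
    combined = [0.0] * len(expert_outputs[0])
    for w, out in zip(weights, expert_outputs):
        for i, v in enumerate(out):
            combined[i] += w * v
    return combined

# two depth experts scoring the same two candidate entities
gate_logits = [0.0, 0.0]                  # equal preference over depths
experts = [[1.0, 0.0], [0.0, 1.0]]
print(mixture_of_length_experts(gate_logits, experts))   # [0.5, 0.5]
```

&lt;p>Skewing the logits toward the first expert routes the query to short paths; skewing them the other way activates deeper experts, which is the adaptive-depth behaviour described above.&lt;/p>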
&lt;hr>
&lt;h2 id="experimental-results">Experimental Results&lt;/h2>
&lt;h3 id="transductive-setting-6-benchmarks">Transductive Setting (6 Benchmarks)&lt;/h3>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img alt="Main Results" srcset="
/blog/mokgr/main_results_hu4027277364006651311.webp 400w,
/blog/mokgr/main_results_hu18083655024233074811.webp 760w,
/blog/mokgr/main_results_hu9320919145034146317.webp 1200w"
src="https://enjundu.com/blog/mokgr/main_results_hu4027277364006651311.webp"
width="760"
height="377"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>MoKGR achieves the best results across all 6 benchmarks (Family, UMLS, WN18RR, FB15k-237, NELL-995, YAGO3-10), outperforming both non-GNN baselines and state-of-the-art GNN methods including NBFNet, RED-GNN, A*Net, and AdaProp.&lt;/p>
&lt;h3 id="efficiency--convergence">Efficiency &amp;amp; Convergence&lt;/h3>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img alt="Efficiency" srcset="
/blog/mokgr/efficiency_hu456109630554495659.webp 400w,
/blog/mokgr/efficiency_hu15649622221185549166.webp 760w,
/blog/mokgr/efficiency_hu12605689778133533305.webp 1200w"
src="https://enjundu.com/blog/mokgr/efficiency_hu456109630554495659.webp"
width="760"
height="343"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>MoKGR converges significantly faster than competing methods while maintaining lower inference time, demonstrating that personalized path exploration is both more accurate and more efficient.&lt;/p>
&lt;h3 id="expert-selection-analysis">Expert Selection Analysis&lt;/h3>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img alt="Expert Analysis" srcset="
/blog/mokgr/expert_analysis_hu3306128304441321640.webp 400w,
/blog/mokgr/expert_analysis_hu15479781568752159853.webp 760w,
/blog/mokgr/expert_analysis_hu2064557752867909043.webp 1200w"
src="https://enjundu.com/blog/mokgr/expert_analysis_hu3306128304441321640.webp"
width="631"
height="760"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>The learned gating weights show interpretable patterns: the model prefers medium-length paths overall, but adapts dynamically — shorter paths for simple relational queries, deeper paths for complex multi-hop reasoning.&lt;/p>
&lt;h3 id="ablation-study">Ablation Study&lt;/h3>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img alt="Ablation" srcset="
/blog/mokgr/ablation_hu12758446644151472906.webp 400w,
/blog/mokgr/ablation_hu6592786270217265840.webp 760w,
/blog/mokgr/ablation_hu10597442056394429424.webp 1200w"
src="https://enjundu.com/blog/mokgr/ablation_hu12758446644151472906.webp"
width="760"
height="640"
loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>Removing the length experts causes the largest drop, confirming that adaptive depth is the most critical innovation. Removing pruning experts or using a single expert also degrades performance.&lt;/p></description></item></channel></rss>