Publications
This work presents a novel graph neural network that leverages both semantic and structural information to predict which research publications will lead to clinical trials. The model analyses a dataset of 19 million publication nodes, using transformer-based sentence embeddings of titles and abstracts within their citation network context. The graph-based architecture, which employs attention mechanisms over local citation neighbourhoods, outperforms traditional convolutional approaches by capturing knowledge flow patterns more effectively. Metadata is selected to exclude researcher-specific information, reducing potential bias, while network structural features maintain predictive power.
Emily Muller, Justin Boylan-Toomey, Jack Ekinsmyth, Arne Robben, María De La Paz Cardona, Antonia Langfelder
2025 | In Proceedings of the Fifth Workshop on Scholarly Document Processing (SDP 2025), Association for Computational Linguistics.

