| Title | Mining multimodal fatigue data using reasoning foundation models and formalized domain knowledge |
| Publication Type | Unpublished |
| Year of Publication | 2025 |
| Authors | Mohanty JPrakash, Thomas A, Pollock TM, Durmaz ARiza |
| Series Title | ChemRxiv |
| Abstract | The scarcity and expense of fatigue data limit the optimal design of components and constrain companies to a few well-qualified materials when safety-critical applications are concerned. This research investigates different strategies to improve extraction of structured information from unstructured scientific literature, to date the largest corpus of fatigue information. Successful generative extraction is within reach considering the latest developments in foundation vision and reasoning language models (VLM/RLM). In this work, a schema-based extraction is attempted, for which an object-oriented fatigue data schema is designed. The schema provides labels, definitions and type constraints for the target entities as contextual domain knowledge to the VLM/RLM. The importance of nuanced target field definitions within the schema and of constrained decoding is explored. Furthermore, the schema-based approach is gradually extended to form two agentic language model systems: one that uses a step-wise, human-inspired approach to first determine discriminative cues from fatigue S-N diagrams, and one that additionally applies dynamic knowledge augmentation. The latter dynamic workflow exploits the synergy of reasoning language models and ontologies by performing logical reasoning and web search for dynamic knowledge augmentation and hallucination detection. On this rather complex fatigue data extraction task, which requires hierarchical pattern recognition and multimodal extraction, an overall F1-score of 0.82 is achieved, while fields contained in the narrative text modality are extracted with an F1-score of 0.92. The strengths and weaknesses of all models and methodologies are thoroughly discussed, and extensions to our workflows are proposed. |
