This project aims to explore approaches for constructing, querying and visualizing a Knowledge Graph (KG) for industrial maintenance applications. Our initial work focuses on developing the upper-level ontology and the fault-diagnosis framework. The results of this first phase can be found in the following publications:
An extended “README” that accompanies this work is available here: https://github.com/kai-vu/zorro/blob/main/READMEextention.md
The project constitutes a proof-of-concept implementation in the domain of aircraft engine maintenance, using data from MaintNet (Akhbardeh et al.) maintenance records (logbooks / log sheets). Observing that this data primarily concerned Lycoming engines from the University of North Dakota Aviation Program, we constructed a KG with a logical, physical, and functional view of the Lycoming O-320 engine, as well as associated troubleshooting information.
This way, we can integrate extractions from the historical records with maintenance knowledge to gain insights into failure frequencies, causes, and patterns.
The KG structure can be seen as 3 layers: the schema, domain knowledge, and historical records, which are described below.
We developed an upper-level ontology for fault diagnosis, called ZORRO, through: (i) systematic analysis of public and private fault diagnosis datasets, (ii) consultations with domain experts from two industrial partners, and (iii) a comprehensive review of maintenance ontologies, including ROMAIN, IDO, and IOF-Maint. ZORRO integrates both design and operation perspectives and captures components, functions, problems, causes, and actions. Critically, it also captures the dependency structure between functions, which is central for fault propagation reasoning. These choices allow ZORRO to serve as the semantic backbone for constructing our knowledge graph.
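To illustrate why the dependency structure between functions matters for fault propagation, here is a minimal sketch in plain Python. The function names and the `DEPENDS_ON` mapping are invented for illustration and are not taken from ZORRO or the knowledge graph. The sketch only shows the general idea: when one function fails, every function that directly or transitively depends on it may be affected.

```python
from collections import deque

# Hypothetical function dependencies (illustrative only, not from ZORRO):
# each key depends on the functions listed as its value.
DEPENDS_ON = {
    "deliver_shaft_power": ["combust_mixture"],
    "combust_mixture": ["deliver_fuel", "ignite_mixture"],
    "ignite_mixture": ["generate_spark"],
    "deliver_fuel": [],
    "generate_spark": [],
}


def affected_functions(failed, depends_on=DEPENDS_ON):
    """Return all functions that transitively depend on the failed function."""
    # Invert the edges so we can walk from a function to its dependents.
    dependents = {}
    for function, requirements in depends_on.items():
        for requirement in requirements:
            dependents.setdefault(requirement, []).append(function)

    # Breadth-first traversal over the inverted dependency edges.
    seen = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for function in dependents.get(current, []):
            if function not in seen:
                seen.add(function)
                queue.append(function)
    return seen


print(sorted(affected_functions("generate_spark")))
```

In a KG setting the same traversal would more likely be written as a graph query over the dependency relation. The in-memory dictionary is just a stand-in.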
We extracted two tables from documents about the Lycoming O-320 engine:
Parts Catalog (from PDF)
Provides a full list of parts, organized in a physical and logical hierarchy. The catalog is structured into sections that correspond to physical (sub)systems of the engine; each section contains figures that describe assemblies. Each part description contains the logical type of the part and a unique part number.
Troubleshooting (from operator manual PDF)
A table describing frequent observable troubles, possible causes, and remedies.
Then, we enriched this data by treating GPT4 as a Proxy Expert, asking it for knowledge that would further integrate this information. The prompt first gives some background information on the engine (from the operator manual), and then asks for structured output about:
We extract information from the free-text fields of MaintNet (Akhbardeh et al.) records, which describe problems and actions. We are interested in the mentioned parts, their location (engine and cylinder), and the failure type. To do this, we use three approaches: regular expressions, NER, and GPT4.
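As a rough illustration of the regular-expression approach, the sketch below pulls a part, an engine side, a cylinder number, and a failure type out of a single free-text record. The patterns, the vocabularies, and the example record are assumptions made up for this example. They are not the project's actual patterns and would need to be adapted to the real data.

```python
import re

# Hypothetical vocabularies (illustrative only). Longer part names come first
# so that the alternation prefers them over the bare head noun.
PART_PATTERN = re.compile(
    r"\b(rocker cover gasket|intake gasket|spark plug|baffle|gasket)\b", re.I
)
ENGINE_PATTERN = re.compile(r"\b(left|right|l/h|r/h)\s+eng(?:ine)?\b", re.I)
CYLINDER_PATTERN = re.compile(r"#\s*(\d)\s+cyl(?:inder)?\b", re.I)
FAILURE_PATTERN = re.compile(r"\b(leak(?:ing|s)?|crack(?:ed)?|loose|worn|missing)\b", re.I)

# Normalize abbreviated engine sides to full words.
ENGINE_ALIASES = {"l/h": "left", "r/h": "right"}


def extract(text):
    """Extract part, engine side, cylinder number and failure type from one record."""
    part = PART_PATTERN.search(text)
    engine = ENGINE_PATTERN.search(text)
    cylinder = CYLINDER_PATTERN.search(text)
    failure = FAILURE_PATTERN.search(text)

    engine_side = engine.group(1).lower() if engine else None
    return {
        "part": part.group(1).lower() if part else None,
        "engine": ENGINE_ALIASES.get(engine_side, engine_side),
        "cylinder": int(cylinder.group(1)) if cylinder else None,
        "failure": failure.group(1).lower() if failure else None,
    }


# Invented example record in the style of a maintenance logbook entry.
print(extract("L/H ENG #3 CYL ROCKER COVER GASKET LEAKING"))
```

Fields that a pattern does not find are returned as `None`, which makes it easy to see where the NER or GPT4 approaches would have to fill in.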
We link extracted part names to parts from the parts catalog.
One of the methods is called Filtered Contextual Bag-of-Words matching. For every unique extracted part name, we try to find the most likely matching candidates from the parts catalog. First, we construct a set of candidates by exactly matching the lemmatized last word of the mentioned part name (the head of the noun phrase, such as “gasket”) to a type from the parts catalog; this is the filter.
The text similarity function we use is the cosine similarity of TF-IDF bag-of-words vectors.
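The filter and the similarity function can be sketched together as follows. This is a minimal, standard-library-only illustration under several assumptions: the catalog rows and part numbers are invented, `lemma` is a crude stand-in for a real lemmatizer, the "context" is taken to be the catalog description text, and the smoothed IDF formula is one common variant rather than necessarily the one used in the project.

```python
import math
import re
from collections import Counter

# Hypothetical catalog rows: (part number, logical type, description text).
CATALOG = [
    ("P-001", "gasket", "gasket rocker box cover cylinder"),
    ("P-002", "gasket", "gasket intake pipe flange"),
    ("P-003", "plug", "plug spark cylinder"),
]


def lemma(word):
    # Crude stand-in for a lemmatizer: lowercase and strip a trailing plural "s".
    word = word.lower()
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def tokens(text):
    return [lemma(word) for word in re.findall(r"[a-zA-Z]+", text)]


def tfidf(bag, idf):
    # Term frequency times inverse document frequency; unknown terms get weight 0.
    return {term: count * idf.get(term, 0.0) for term, count in Counter(bag).items()}


def cosine(u, v):
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def match(mention, catalog=CATALOG):
    """Return (score, part number) pairs for candidates, best match first."""
    words = tokens(mention)
    if not words:
        return []
    # Filter: the head noun (last word) must equal the catalog type exactly.
    head = words[-1]
    candidates = [row for row in catalog if lemma(row[1]) == head]
    if not candidates:
        return []
    # Smoothed IDF computed over all catalog descriptions.
    documents = [tokens(row[2]) for row in catalog]
    doc_freq = Counter(term for doc in documents for term in set(doc))
    n_docs = len(documents)
    idf = {term: math.log((1 + n_docs) / (1 + df)) + 1.0 for term, df in doc_freq.items()}
    # Rank the filtered candidates by cosine similarity to the mention.
    query = tfidf(words, idf)
    scored = [(cosine(query, tfidf(tokens(row[2]), idf)), row[0]) for row in candidates]
    return sorted(scored, reverse=True)


print(match("rocker cover gaskets"))
```

With this toy catalog, the mention "rocker cover gaskets" passes the filter for both gasket entries, and the rocker box cover gasket ranks first because it shares more weighted terms with the mention. A mention whose head noun matches no catalog type, such as "baffle" here, yields no candidates.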
NaghdiPour, A., Kruit, B., Chen, J., and Schlobach, S. (2024). Knowledge Representation and Engineering for Smart Diagnosis of Cyber Physical Systems. SOFLIM2KG-SemIIM@ISWC.