Document 3
In this paper, AI algorithms are trained on P&IDs in the DEXPI standard to learn
recurring patterns, which can serve as a basis for intelligent and efficient process
development. To achieve this, DEXPI P&IDs are converted to graphs and are thereby
made accessible to graph-theoretical methods. Based on these graphs, two different
AI-supported use cases for assisted P&ID synthesis are developed and explained in
the following section. The first use case uses sequential processing by recurrent
neural networks (RNN) to predict P&ID equipment. The second use case uses pattern
recognition in P&IDs with the aid of graph neural networks (GNN) for consistency
checks.
Fig. 2. Structure of the Python DEXPI-2-graph converter
Fig. 4. GUI of the DEXPI-2-graph converter
PHASuite: an automated HAZOP Analysis tool for chemical processes
A. Gulli, S. Pal
Proceedings of the ICLR (2019), p. 2019
R. Batres, M.L. Lu, Y. Naka
As a last investigated variant, the graph attention approach (8) is applied, where
additionally for each
W.L. Hamilton
In addition, there is the possibility of adding a multilayer perceptron (MLP) over
the states to be aggregated, as shown in Eq. (6) for the sum-MLP (Xu et al., 2019).
The idea is that an MLP can transform each state
AI-based processing of P&IDs
Fig. 8. Workflow for learning P&ID components using GNNs.
J. Zhou, G. Cui, S. Hu, Z. Zhang, C. Yang, Z. Liu, L. Wang, C. Li, M. Sun
Node class Quantity
B. Weisfeiler, A. Leman
embedding/feature of a node u
Neural Comput., 9 (1997), pp. 1735-1780, 10.1162/neco.1997.9.8.1735
into a one-hot vector, i.e., a vector consisting of zeros except for a single entry
that is set to one (Gulli and Pal, 2017). The sum of such vectors makes it possible
to determine exactly which one-hot vectors were used to generate it.
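The property described above can be illustrated with a minimal sketch. The class names and helper functions are hypothetical and chosen only for illustration. Each entry of the summed vector counts how often the corresponding one-hot vector occurred, so the multiset of inputs can be read back from the sum.

```python
CLASSES = ["valve", "pump", "vessel"]  # illustrative class list, not from the paper


def one_hot(class_name):
    """Return a vector of zeros with a single one at the class position."""
    return [1 if name == class_name else 0 for name in CLASSES]


def sum_vectors(vectors):
    """Element-wise sum of equally sized vectors."""
    return [sum(column) for column in zip(*vectors)]


def recover_counts(summed):
    """Read back how often each one-hot vector was part of the sum."""
    return {CLASSES[i]: count for i, count in enumerate(summed) if count > 0}


neighborhood = ["valve", "valve", "pump"]
summed = sum_vectors([one_hot(name) for name in neighborhood])
counts = recover_counts(summed)
print(summed, counts)
```

This only holds for one-hot inputs. For arbitrary real-valued states, different sets of vectors can produce the same sum.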
D.P. Kingma, J. Ba
Proteus XML, 2017
Algorithmische Graphentheorie, De Gruyter Studium
Thermal control of coke furnace by data-driven approach
Oeing, 2022
k
Data augmentation for machine learning of chemical process flowsheets
layer of a GNN / iteration step in message passing
Valves 1034
Apress, Berkeley (2018), 10.1007/978-1-4842-3516-4
Question answering system for chemistry—A semantic agent extension
S. Hochreiter, J. Schmidhuber
2023, arXiv
Batres et al., 1997
An agent-based environment for operational design
open access
Workflow - GNN node classification
Proceedings of the NIPS (2017), p. 2017
In the following, several P&IDs in the standardized DEXPI format are used as
training data. They were exported with the program PlantEngineer from the software
vendor X-Visual Technologies GmbH and converted to graphs in GraphML format (GraphML
Project Group, 2017) according to chapter 2.1. In total, 35 P&ID graphs from third
parties (laboratory and industrial plants) with 1641 nodes and 1410 edges are used.
The data set contains 92 different equipment classes (valves, pumps, vessels,
instrumentation, etc.) based on the DEXPI specification (Theißen and Wiedau, 2021)
and three different classes of edges (pipes, signal lines, process connection
lines). The ratio of edges to nodes (about 0.86) shows that, as expected for P&IDs,
these are very linear graphs with low connectivity. On closer inspection, there are
usually many single nodes along a pipeline (e.g. valves, vessels, pumps, heat
exchangers, measuring points, etc.), which results in dead-end-like structures.
Additionally, some P&IDs show inconsistencies in their drawn structures, which in
some cases lead to isolated nodes or to several smaller, disconnected graphs. These
inconsistencies were deliberately kept in the data set, since the data is intended
to represent the current state of machine-readable P&IDs in the process industry and
thus to yield representative results. The influence of the inconsistencies on the
results is examined in more detail in chapter 4.
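The connectivity argument can be checked with a short calculation on the reported node and edge counts. This is a sketch, and the function name is an assumption. It uses the general fact that each edge can merge at most two connected components, so a graph with n nodes and m edges has at least n - m components.

```python
def connectivity_stats(num_nodes, num_edges):
    """Return the average node degree and a lower bound on connected components."""
    average_degree = 2 * num_edges / num_nodes
    # Each edge reduces the number of components by at most one.
    min_components = max(num_nodes - num_edges, 1)
    return average_degree, min_components


# Totals over all 35 P&ID graphs as reported in the text.
average_degree, min_components = connectivity_stats(1641, 1410)
print(f"average degree: {average_degree:.2f}")
print(f"lower bound on connected components: {min_components}")
```

Under these numbers, the 35 P&IDs split into at least 231 connected fragments in total, which is consistent with the isolated nodes and smaller subgraphs mentioned above.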
The classes are unevenly distributed, which is unavoidable since this data set is a
representative cross-section of all components in a process plant. In the context
of this work, it is therefore important to investigate to what extent the unequal
distribution of the training data affects the results of the classification.
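To make the imbalance tangible, the following sketch computes class shares from the eight class counts that can be recovered from Table 1 in this text (the ninth class is not recoverable here and is left out). It assumes that these counts refer to the nodes of the GNN training data. The majority-class share serves as a naive reference: a classifier that always predicts the most frequent class would already reach this accuracy on the listed components.

```python
# Counts as listed for Table 1; the ninth superior class is missing in this text.
class_counts = {
    "Valves": 1034,
    "Process control equipment (PCE)": 547,
    "Safety valves": 93,
    "Heat exchangers": 86,
    "Vessels": 75,
    "Pumps": 69,
    "Check valves": 60,
    "Piping equipment": 41,
}

total = sum(class_counts.values())
shares = {name: count / total for name, count in class_counts.items()}
majority_baseline = max(shares.values())
imbalance_ratio = max(class_counts.values()) / min(class_counts.values())

for name, share in shares.items():
    print(f"{name:35s} {share:6.1%}")
print(f"majority-class baseline: {majority_baseline:.1%}")
```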
bias
Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining, ACM, New York, NY, USA (2016), pp. 855-864,
10.1145/2939672.2939754
Introduction
Digital Chemical Engineering, Volume 2, 2022, Article 100010
number of edges
Results - GNN node classification
2022, arXiv
International Electrotechnical Commission 2016
input
Chem. Ing. Tech., 93 (2021), pp. 2105-2115, 10.1002/cite.202100203
Deep Learning with Keras: Implementing Deep Learning Models and Neural Networks
With the Power of Python
Graph representation learning
arbitrary node u of a graph
Keywords
(5)
Oeing, J., 2022. DEXPI2graph converter application [WWW Document]. URL
https://github.com/TUDoAD/DEXPI2graphML (accessed 5.18.22).
Jonas Oeing a, Wolfgang Welscher b, Niclas Krink b, Lars Jansen a, Fabian Henke a,
Norbert Kockmann a
output
List of symbols
International Organization for Standardization, 2013. ISO 15926-2 - industrial
automation systems and integration – integration of life-cycle data for process
plants including oil and gas production facilities – part 2: data model. Beuth
Verlag, Geneva.
How powerful are graph neural networks?
Vessels 75
E
Original Article
Zaheer et al., 2017
M. Wiedau, G. Tolksdorf, J. Oeing, N. Kockmann
Abbreviations
Node classification using graph neural networks
Tandoh Henry, Yi Cao
StellarGraph, 2020. StellarGraph machine learning library - documentation [WWW
Document]. URL https://stellargraph.readthedocs.io/en/stable/README.html (accessed
3.1.22).
The results differ between the models. The training accuracy lies between 75.4%
(sum-MLP) and 87.3% (sum), while the test accuracy varies between 74.5% (sum-MLP)
and 81.5% (sum). It is striking that the gap between training and test accuracy is
larger for the sum aggregation and the attention aggregation, at about 6 percentage
points, than for the remaining aggregation functions. Additionally, the results show
that simpler aggregation functions such as sum and arithmetic mean achieve higher
training accuracies than the more complex aggregations using attention, set pooling
or sum-MLP. It is hypothesized that this is because all neighborhood information is
equally important for predicting the component class. Learning the individual P&ID
components therefore works particularly well when the neighborhood information is
aggregated with equal weights, as is the case for the sum and the mean.
The first step is sampling, during which the graph with its networked structure of
nodes and edges is transformed into linear, sequential input data. These input data
consist of lists of contiguous nodes, which represent the interconnected graph as
linear sequences. In the sampling process, all possible turns based on the number of
output edges are made at branches to obtain a reliable representation of all node
interconnections via random walks (Grover and Leskovec, 2016). The sampling is
performed with the class BiasedRandomWalk, which is part of the Python library
StellarGraph (package: stellargraph.data.BiasedRandomWalk / version: v1.0.0rc1)
(StellarGraph, 2020). The biased random walk requires four input parameters. The
number of walks defines how many walks are generated from each node in the graph.
The walk length specifies how many nodes are considered per walk. The special
features of the biased random walk are the return hyperparameter p and the in-out
hyperparameter q, which guide the walk. Here, 1/p is the unnormalized probability of
returning to the previously visited node, i.e., of reversing the sampling direction,
while 1/q is the unnormalized probability of moving on to nodes further away from
the previous node, i.e., of discovering new nodes in the graph. In this way, the
depth of the search can be controlled specifically (Grover and Leskovec, 2016).
Since the generated samples should represent a clean and linear section of the plant
topology, the parameters must be chosen such that the random walk jumps back as
rarely as possible and continuously explores new paths. Previous investigations have
shown that convincing results can be achieved with p = 1000 and q = 1. Smaller
values of p lead to an undesirably high probability of sampling against the flow
direction. The sequential samples represent the actual training data for AI modeling
and have a previously defined length l. They are divided such that the first l-1
entries represent the input sequence x, while the entry at position l is the
corresponding output y. The data set used in this work is composed of a total of
4923 sequences, each consisting of six nodes. For validation, 20% of the data set is
randomly held out as a test set. The remaining 80% is used to train the RNN.
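The sampling step can be sketched without StellarGraph as a plain second-order biased random walk in the style of Grover and Leskovec (2016). This is an illustrative re-implementation, not the library code used in the paper. The toy pipeline, the function names and the random seed are assumptions.

```python
import random


def biased_random_walk(adj, start, length, p, q, rng):
    """Sample one walk; p penalizes returning, q controls moving outward."""
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbors = adj[current]
        if not neighbors:
            break  # isolated node: the walk cannot continue
        if len(walk) == 1:
            walk.append(rng.choice(neighbors))
            continue
        previous = walk[-2]
        weights = []
        for candidate in neighbors:
            if candidate == previous:
                weights.append(1.0 / p)  # step back to the previous node
            elif candidate in adj[previous]:
                weights.append(1.0)      # stay close to the previous node
            else:
                weights.append(1.0 / q)  # explore a node further away
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


def split_sequence(walk):
    """First l-1 nodes form the input x, the last node is the output y."""
    return walk[:-1], walk[-1]


# Hypothetical linear plant section, stored as an undirected adjacency list.
edges = [("tank", "pump"), ("pump", "check_valve"), ("check_valve", "heat_exchanger"),
         ("heat_exchanger", "valve"), ("valve", "vessel")]
adj = {}
for a, b in edges:
    adj.setdefault(a, []).append(b)
    adj.setdefault(b, []).append(a)

rng = random.Random(0)
walk = biased_random_walk(adj, "tank", length=6, p=1000, q=1, rng=rng)
x, y = split_sequence(walk)
print(x, "->", y)
```

With p = 1000 and q = 1, a step back to the previous node is weighted 1000 times lower than a step forward, so the walk follows the pipeline in one direction with high probability.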
Preprocessing – DEXPI-2-graph
In the following, the different RNN models are trained on the P&ID graphs generated
in chapter 2.2 according to the presented workflow. The implementation is done in
Python using the Keras library (Chollet, 2020). The "Adam" optimizer (Kingma and Ba,
2014) is used for all training runs, and the loss is calculated with the
"categorical cross entropy" (Murphy, 2012). The prediction accuracy is used as an
evaluation metric and is defined as follows.
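Written as a formula, and following the verbal definition given with the results (true and false positives and negatives), the accuracy can be expressed as shown here. The typeset form is a reconstruction from that description, not a copy of the original equation.

```latex
\mathrm{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
```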
Proceedings of the NIPS (2017), p. 17
Results – node prediction
X. Hu, P. Balasubramaniam
Long short-term memory
Fig. 7. Example of the neighborhood aggregation of a GNN using a P&ID.
Recurrent Neural Networks
Datasets
AI: artificial intelligence
BRNN: bidirectional recurrent neural network
GNN: graph neural network
GRU: gated recurrent unit
LSTM: long short-term memory
MLP: multilayer perceptron
P&ID: piping and instrumentation diagram
PCE: process control equipment
RNN: recurrent neural network
Hu, 2008
GraphML Project Group 2017
N.K. Manaswi
InTech (2008), 10.5772/68
The recursive GNN is based on the GraphSAGE algorithm presented in the previous
chapter 4. The number of layers is k = 3. The ReLU function (Manaswi, 2018) is
applied as the activation function, followed by a normalization. To achieve the
highest prediction accuracy, different state-of-the-art aggregation functions are
used and compared against each other. Basic variants in this respect are the
calculation of a sum (4) or an arithmetic mean (5) (Grabisch et al., 2009).
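The difference between the two basic variants can be illustrated with a small sketch on one-hot neighbor states. The function names and the example neighborhoods are assumptions. Both functions are permutation-invariant, but the mean discards how many neighbors there are, while the sum keeps this information.

```python
def aggregate_sum(neighbor_states):
    """Element-wise sum over all neighbor states."""
    return [sum(column) for column in zip(*neighbor_states)]


def aggregate_mean(neighbor_states):
    """Element-wise arithmetic mean over all neighbor states."""
    count = len(neighbor_states)
    return [sum(column) / count for column in zip(*neighbor_states)]


VALVE = [1, 0]  # illustrative one-hot encodings
PUMP = [0, 1]

one_valve = [VALVE]
two_valves = [VALVE, VALVE]
mixed = [VALVE, PUMP]

# Order of the neighbors does not matter for either function.
order_independent = aggregate_sum(mixed) == aggregate_sum(list(reversed(mixed)))

# The mean cannot tell one valve from two valves, the sum can.
mean_collides = aggregate_mean(one_valve) == aggregate_mean(two_valves)
sum_differs = aggregate_sum(one_valve) != aggregate_sum(two_valves)
print(order_independent, mean_collides, sum_differs)
```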
Fig. 3. P&ID topology representing GraphML structure used for further training.
S. Fillinger, H. Bonart, W. Welscher, E. Esche, J.-U. Repke
MIT Press, Cambridge (2012)
Packt Publishing, Birmingham (2017)
Springer, Singapore, Singapore (2021), 10.1007/978-981-16-2233-5
Learning from flowsheets: A generative transformer model for autocompletion of
flowsheets
(7)
https://doi.org/10.1016/j.dche.2022.100038
Pumps 69
Sequential node prediction using recurrent neural networks
The design and engineering of piping and instrumentation diagrams (P&ID) is a very
time-consuming and labor-intensive process. Although P&IDs show common patterns
that could be reused during development, the drawing is usually created manually
and built up from scratch for each process. The aim of this paper is to recognize
these patterns with the help of artificial intelligence (AI) and to make them
available for the development and the drawing process of P&IDs. In order to achieve
this, P&ID data is made accessible for AI applications through the DEXPI format,
which is a machine-readable, manufacturer-independent exchange standard for P&IDs.
It is demonstrated how deep learning models trained with DEXPI P&ID data can
support the engineering as well as drawing of P&IDs and therefore decrease labor
time and costs. This is achieved by assisted prediction of equipment in P&IDs based
on recurrent neural networks as well as consistency checks based on graph neural
networks.
StellarGraph 2020
Under a Creative Commons license
Hochreiter and Schmidhuber, 1997
Cited By (4)
IEEE Trans. Knowl. Data Eng., 33 (2020), pp. 1807-1818, 10.1109/TKDE.2019.2951398
The computational graph shown in Fig. 7 visualizes how it is possible to obtain
information of a component based on its neighborhood through the neural networks
AGGREGATE(k=1) and AGGREGATE(k=2). If the neighborhood components of the vessel are
known, it is possible to predict the class of the vessel or to analyze in a
consistency check how probable it is that a vessel is located at exactly this
position in the graph.
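A consistency check of this kind could look like the following sketch. It assumes that a trained classifier returns class probabilities for a node based on its neighborhood. The probabilities, the threshold of 0.1 and the function name are hypothetical and only illustrate the idea.

```python
def check_node(drawn_class, class_probabilities, threshold=0.1):
    """Compare the drawn class of a node with the predicted class probabilities."""
    probability = class_probabilities.get(drawn_class, 0.0)
    most_likely = max(class_probabilities, key=class_probabilities.get)
    return {
        "plausible": probability >= threshold,  # threshold is an assumption
        "probability": probability,
        "most_likely": most_likely,
    }


# Hypothetical model output for one position in the P&ID graph.
predicted = {"vessel": 0.82, "heat exchanger": 0.11, "pump": 0.07}

confirmed = check_node("vessel", predicted)
flagged = check_node("pump", predicted)
print(confirmed)
print(flagged)
```

A flagged node is not necessarily wrong. It only marks a position where the drawn component is unusual compared with the learned plant topologies.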
Energy and policy considerations for deep learning in NLP
International Organization for Standardization (2012)
Fig. 9. Results of the node classification in a P&ID graph via recursive GNN
grouped by the applied aggregation functions.
International Organization for Standardization 2012a
Chollet, F., 2020. Keras API - documentation vers. 2.4.0 [WWW Document]. URL
https://keras.io (accessed 2.20.22).
Another option is to form the message via a set pooling approach (7), which is
similar to the aggregation of a Graph Isomorphism Network (GIN) (Xu et al., 2019).
According to Zaheer et al. (2017), this uses an
Conclusion & outlook
v
M. Khosla, V. Setty, A. Anand
Check valves 60
ArXiv ID 1412.6980
Process control equipment (PCE) 547
Towards a systematic data harmonization to enable AI application in the process
industry
Piping equipment 41
Xiaochi Zhou, …, Markus Kraft
Improving interoperability of engineering tools - data exchange in plant design
The accuracy is calculated by dividing the sum of true positives (TP) and true
negatives (TN) by the sum of true positives (TP), true negatives (TN), false
positives (FP) and false negatives (FN). The computations were done on an Intel®
Xeon® W-2155 (3.31 GHz) CPU in combination with 128 GB RAM. The results are shown
below in Fig. 6, which reports the training accuracy and the validation accuracy. In
addition, the top-5 accuracy (accuracy5) indicates how often the real output of the
validation data set is among the five most probable outputs returned by the model.
This score is of particular interest for investigating whether the trained models
are suitable for a suggestion system that can be used, for example, in a drop-down
menu to speed up the drawing process of P&IDs. Furthermore, the loss and the
training and validation accuracy over 60 epochs as well as the training time needed
for the 60 epochs are given.
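The top-k score described above can be sketched in a few lines. This is a minimal illustration with made-up probabilities, not the evaluation code of the paper. The function name and the example values are assumptions.

```python
def top_k_accuracy(probabilities, targets, k=5):
    """Share of samples whose true class is among the k most probable outputs."""
    hits = 0
    for probs, target in zip(probabilities, targets):
        ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        if target in ranked[:k]:
            hits += 1
    return hits / len(targets)


# Three samples, three classes; targets are class indices (illustrative values).
probabilities = [
    [0.1, 0.6, 0.3],
    [0.5, 0.2, 0.3],
    [0.2, 0.3, 0.5],
]
targets = [1, 2, 0]

top1 = top_k_accuracy(probabilities, targets, k=1)
top2 = top_k_accuracy(probabilities, targets, k=2)
print(top1, top2)
```

For a drop-down suggestion list, k corresponds to the number of entries shown to the user, so a high top-5 score can be useful even when the top-1 accuracy is moderate.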
Process. Saf. Environ. Prot., 83 (2005), pp. 509-532, 10.1205/psep.04055
u
International Organization for Standardization 2012b
(3)
2023, Computers and Chemical Engineering
When parsing the DEXPI files, it becomes apparent that the level of detail in the
P&ID description varies greatly depending on the user, so that different numbers of
attributes in the XML files are filled in. At the same time, the use of the
attributes leaves some room for interpretation, such that synonymous information
was mapped to different attributes. This requires a certain degree of robustness,
which has been considered in the DEXPI-2-graph implementation: several candidate
attributes (e.g. design temperature, pressure, material, …) are searched in turn
until the desired information for the respective node is found.
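Such a fallback search could be sketched as follows with the Python standard library. The XML snippet, the element and attribute names and the list of synonyms are hypothetical and do not reproduce the actual DEXPI schema or the DEXPI-2-graph source code.

```python
import xml.etree.ElementTree as ET

# Hypothetical synonyms under which a design temperature might be stored.
DESIGN_TEMPERATURE_NAMES = ("DesignTemperature", "TemperatureDesign", "MaxDesignTemp")


def find_first_attribute(element, candidate_names):
    """Return the first non-empty value found under any of the candidate names."""
    values = {
        attribute.get("Name"): attribute.get("Value")
        for attribute in element.iter("GenericAttribute")
    }
    for name in candidate_names:
        value = values.get(name)
        if value:
            return value
    return None  # information not available for this node


# Illustrative fragment only; real exports are larger and may be structured differently.
SAMPLE = """
<Equipment ID="V-101" ComponentClass="Vessel">
  <GenericAttributes>
    <GenericAttribute Name="TemperatureDesign" Value="120"/>
  </GenericAttributes>
</Equipment>
""".strip()

vessel = ET.fromstring(SAMPLE)
design_temperature = find_first_attribute(vessel, DESIGN_TEMPERATURE_NAMES)
design_pressure = find_first_attribute(vessel, ("DesignPressure",))
print(design_temperature, design_pressure)
```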
Mathematically, the message passing of a GNN (Eq. (2)) can be described as follows
(Hamilton, 2020):
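In the general form given by Hamilton (2020), the update can be written as shown here. The notation is reconstructed from the symbols used in this text (embedding h_u, neighborhood N(u), message m, iteration step k), and the exact indexing in the original equation may differ.

```latex
h_u^{(k+1)} = \mathrm{UPDATE}^{(k)}\left( h_u^{(k)},\ \mathrm{AGGREGATE}^{(k)}\left( \left\{ h_v^{(k)},\ \forall v \in N(u) \right\} \right) \right)
            = \mathrm{UPDATE}^{(k)}\left( h_u^{(k)},\ m_{N(u)}^{(k)} \right)
```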
y
Proteus XML, 2017. Proteus schema for P&ID exchange [WWW Document]. URL
https://github.com/ProteusXML/proteusxml (accessed 5.18.22).
Proceedings of the ACL (2019), p. 19
Khosla et al., 2020
Deep Learning with Applications Using Python
Safety valves 93
Modified smith predictor for slug control with large valve stroke time in unstable
systems
C. Zhao, M. Bhushan, V. Venkatasubramanian
Schuster and Paliwal, 1997
Fig. 6. Training results for the prediction of subsequent P&ID equipment with
different RNN models
Digital Chemical Engineering, Volume 3, 2022, Article 100032
Fig. 1 shows the two use cases identified in this paper for AI-assisted P&ID
synthesis and their respective modeling approaches. In the first use case, a node
prediction generates suggestions about subsequent components based on a recurrent
neural network (RNN). These suggestions can support the user and decrease the time
of the drawing process. The second approach uses graph neural networks (GNN), which
are neural networks especially developed for the modeling of graphs. They can learn
the topologies of process plants, which are stored in the form of a graph, and
enable a consistency check during drawing by comparing the models with drawn P&IDs.
The use of a neural network offers the advantage that patterns and rules for
generating P&IDs can be learned from existing plant topologies. Therefore, no
explicit heuristics need to be stored. Furthermore, the models can be adapted to
the user's requirements and preferences by re-training them with data from the
user. GNNs can be used to perform various classifications and predictions, which
then can be used for consistency checking. The node classification allows for
predicting information, such as the equipment classes of individual nodes to check
whether components within a P&ID are present at meaningful positions. In contrast,
edge classification focuses on the prediction of edge information. In relation to
the P&ID, these are for example the connection types (piping or signal lines) or
the information, whether a pipeline is insulated or heated. In parallel, a link
prediction can be used, which provides information about the probability of a
possible link between two components. This enables the validation of connections
within a P&ID as well as the suggestion of connections during the drawing process.
In the following, the preprocessing for converting the P&IDs into graphs as well as
the two modeling concepts are introduced in more detail. The results for a node
prediction via RNNs as well as a node classification based on a GNN are presented.
1
References
The reduction of a graph to canonical form and the algebra which appears therein
(8)
Artificial intelligence; Process synthesis; P&ID; Process engineering; Detail
engineering
Theißen, M., Wiedau, M., 2021. DEXPI - P&ID Specification [WWW Document]. Version
1.3. URL https://dexpi.org/specifications/ (accessed 3.1.22).
Hamilton et al., 2017
As mentioned before, the information from the P&ID is interpreted in the form of a
graph. This makes it possible to store the relationships between components and the
topology in an unambiguous and machine-interpretable way. However, to learn the
graph structure as a whole and to solve tasks such as node classification, edge
classification or link predictions, machine learning methods of graph analysis are
required that can deal with non-Euclidean data structures such as graphs. The
modeling of graph structures is particularly interesting in the field of P&ID
engineering. By learning connections (e.g. piping, signal lines, …) or components
(e.g. valves, equipment, …) based on their neighborhood with the help of AI, it
will be possible in the future to perform consistency checks and detect errors in
P&IDs. This could reduce the time needed for drawing P&IDs and thus shorten the time
for developing a plant's documentation. To achieve this goal, graph neural networks
(GNN), which have become increasingly important in recent years (Zhou et al., 2020),
can be used for modeling. A GNN is based on a message passing algorithm that
aggregates arbitrary information from the neighborhood of a node and thereby
convolves the graph (Hamilton, 2020). In general, the message passing of a GNN is
analogous to the Weisfeiler-Lehman algorithm for testing the isomorphism of two
graphs (Weisfeiler and Leman, 1968), in which information is likewise aggregated
from the neighborhood of each node.
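The analogy can be made concrete with a minimal sketch of Weisfeiler-Lehman color refinement on small labeled graphs. The graphs, labels and function name are illustrative assumptions. In each iteration, a node's new color is built from its own color and the sorted colors of its neighbors, which mirrors the aggregation step of message passing.

```python
from collections import Counter


def wl_color_histogram(adj, labels, iterations):
    """Refine node colors by neighborhood aggregation and count the final colors."""
    colors = dict(labels)
    for _ in range(iterations):
        colors = {
            node: (colors[node], tuple(sorted(colors[neighbor] for neighbor in adj[node])))
            for node in adj
        }
    return Counter(colors.values())


# Two drawings of the same topology (pump - valve - vessel) with different node ids.
graph_a = {"P1": ["V1"], "V1": ["P1", "B1"], "B1": ["V1"]}
labels_a = {"P1": "pump", "V1": "valve", "B1": "vessel"}
graph_b = {"n1": ["n2"], "n2": ["n1", "n3"], "n3": ["n2"]}
labels_b = {"n3": "pump", "n2": "valve", "n1": "vessel"}

# Same components, but the pump sits in the middle (valve - pump - vessel).
graph_c = {"V1": ["P1"], "P1": ["V1", "B1"], "B1": ["P1"]}
labels_c = {"P1": "pump", "V1": "valve", "B1": "vessel"}

same = wl_color_histogram(graph_a, labels_a, 2) == wl_color_histogram(graph_b, labels_b, 2)
different = wl_color_histogram(graph_a, labels_a, 2) != wl_color_histogram(graph_c, labels_c, 2)
print(same, different)
```

Equal histograms are a necessary but not a sufficient condition for isomorphism, so the test can rule out isomorphism but cannot prove it in every case.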
Chollet, 2020
Heat exchangers 86
Synth. Lect. Artif. Intell. Mach. Learn., 14 (2020), pp. 1-159,
10.2200/S01045ED1V01Y202009AIM046
message of an aggregated neighborhood of a node in a graph
Grabisch et al., 2009
Xu et al., 2019
Chem. Ing. Tech., 91 (2019), pp. 240-255, 10.1002/cite.201800112
AI Open, 1 (2020), pp. 57-81, 10.1016/j.aiopen.2021.01.001
hu
M. Schuster, K.K. Paliwal
(9)
Here, h_u^(k) stands for the embedding of a node u at iteration step k. UPDATE and
AGGREGATE are arbitrary differentiable functions, where the aggregation of the
neighborhood N(u) of node u represents the actual "message" m. The parameter k
counts the iterations of the message passing and thus corresponds to the number of
hidden layers of the GNN. Since the aggregation of the neighborhood information must
be independent of the order of the neighbors, it is important that AGGREGATE is a
permutation-invariant function. Based on the embeddings for each iteration step k, a
final embedding for each node u can subsequently be determined using a final layer
(Hamilton, 2020).
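The iteration can be sketched on a toy graph as follows. This is a simplified illustration without learned weights: AGGREGATE is an element-wise sum and UPDATE is an addition followed by ReLU, both chosen here as assumptions. A trained GNN would use parameterized functions instead.

```python
def aggregate(states):
    """Permutation-invariant AGGREGATE: element-wise sum of neighbor embeddings."""
    return [sum(column) for column in zip(*states)]


def update(h_u, m_u):
    """Placeholder UPDATE without learned weights: addition followed by ReLU."""
    return [max(0.0, own + message) for own, message in zip(h_u, m_u)]


def message_passing(adj, features, k):
    """Run k rounds of message passing and return the node embeddings."""
    h = dict(features)
    for _ in range(k):
        new_h = {}
        for u, neighbors in adj.items():
            if neighbors:
                m_u = aggregate([h[v] for v in neighbors])
            else:
                m_u = [0.0] * len(h[u])  # isolated node receives no message
            new_h[u] = update(h[u], m_u)
        h = new_h  # all nodes are updated synchronously
    return h


# Toy P&ID section: pump - valve - vessel, with one-hot input features.
adj = {"pump": ["valve"], "valve": ["pump", "vessel"], "vessel": ["valve"]}
features = {"pump": [1.0, 0.0, 0.0], "valve": [0.0, 1.0, 0.0], "vessel": [0.0, 0.0, 1.0]}

h1 = message_passing(adj, features, k=1)
h2 = message_passing(adj, features, k=2)
print(h1["pump"], h2["pump"])
```

After one iteration the pump embedding only contains information from the adjacent valve. After two iterations it also contains information from the vessel, which shows how k determines the size of the neighborhood that influences a node.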
M. Grabisch, J.-L. Marichal, R. Mesiar, E. Pap
weight by which the features of nodes influence each other
Murphy, 2012
b
Yoshinari Hashimoto, Hiroto Kase
This research work was supported within the KEEN project (grant number: 01MK20014S)
and has been funded by the Federal Ministry of Economic Affairs and Climate Action
(BMWK).
TOWARDS AUTOMATIC GENERATION OF PIPING AND INSTRUMENTATION DIAGRAMS (P&IDS)
WITH ARTIFICIAL INTELLIGENCE
4
Since GNNs learn the structure of the graphs by aggregating the neighbors, it is
important that the graphs have no missing links. As mentioned before, this is not
always the case in the database from chapter 2.2 due to inconsistencies in real P&ID
drawings and their DEXPI exports. For this reason, a new data set of laboratory
plants as well as industrial distillation plants is used. The data set contains 13
P&ID graphs that have a sufficiently high density of cross-links to represent the
original P&IDs well. These 13 P&ID graphs with 2020 nodes and 2283 edges are used to
train the GNNs in the following. Within the 13 P&IDs, there are a total of 47
different equipment classes. In this context, a feasibility analysis must be
performed to investigate the potential of classifying P&ID equipment by GNNs and to
identify further challenges. Since different components in a P&ID can fulfill the
same process engineering function, and since the consistency check is meant to
examine whether meaningful functionalities are used at the existing positions, it is
recommended to divide the components within the P&ID into meaningful classes. This
has the additional advantage that even components that occur only once can be used
as training data influencing the classification. Hence, the 47 different components
in the training data set are sorted into 9 superior classes, which are shown in
Table 1 below. The number of components per class is also shown in relation to the
training data.
Aggregation Functions (Encyclopedia of Mathematics and its Applications)
Abstract
Fillinger et al., 2017
4
International Organization for Standardization, 2012b. ISO 10209, technical product
documentation – vocabulary – terms relating to technical drawings, product
definition and related documentation. Geneva.
The need for data-driven modeling and optimization is growing in the process
industry due to the increasing digitalization of processes and tools. Previous work
already shows an acceleration of process development by agent-based environments,
which can be understood as the first steps toward intelligent process development
(Batres et al., 1997). Further work shows the potential of intelligent process
information models that describe chemical plants and processes in a machine-readable
format (e.g. colored Petri nets) and thereby make them accessible to deterministic
algorithms (Zhao et al., 2005). This paper aims to develop applications that
accelerate the development of P&IDs using artificial intelligence (AI). This
requires an accelerated development and application of standardized and
machine-readable file exchange formats to ensure a sufficiently large and highly
available database for the application of AI in P&ID development. In the field of
P&IDs, the DEXPI (Data Exchange in Process Industry) standard is becoming
increasingly popular, as it enables the uniform description of P&IDs and ensures the
vendor-independent exchange of information (Theißen and Wiedau, 2021). At the same
time, DEXPI can be used as a platform for digital plant data in the process industry
(Wiedau et al., 2019), which can significantly reduce the development time of
chemical and biotechnological production plants. Additionally, interoperability
increases due to the continuous integration of DEXPI into existing engineering
software (Fillinger et al., 2017).
m
Cho, K., van Merrienboer, B., Bahdanau, D., Bengio, Y., 2014. On the properties of
neural machine translation: encoder-decoder approaches. arXiv Prepr.
arXiv1409.1259.