Deep Learning with Keras: Implementing Deep Learning Models and Neural Networks
With the Power of Python
Digital Chemical Engineering, Volume 3, 2022, Article 100032
layer of a GNN / iteration step in message passing
(5)
Deep Learning and Practice with MindSpore, Cognitive Intelligence and Robotics
Synth. Lect. Artif. Intell. Mach. Learn., 14 (2020), pp. 1-159,
10.2200/S01045ED1V01Y202009AIM046
DEXPI is a machine-readable P&ID exchange format under development by the DEXPI
Initiative, which consists of owner operators, engineering, procurement and
construction companies, software vendors and research institutions. The latest data
model and the associated DEXPI specification 1.3 (Theißen and Wiedau, 2021) were
published in 2021. The specification combines several international standards for
describing engineering-relevant P&ID data, e.g. ISO 15926 (International
Organization for Standardization, 2013), ISO 10628 (International Organization for
Standardization, 2012a), IEC 62424 (International Electrotechnical Commission,
2016) and ISO 10209 (International Organization for Standardization, 2012b). In
particular, these cover plant breakout structures, instrumentation, properties of
equipment and components, and piping topology. The DEXPI information model is
already offered by some manufacturers and can be exchanged via the Proteus XML
schema (Proteus XML, 2017). At the same time, DEXPI can serve as a platform for
digital plant data in the process industry (Wiedau et al., 2019), which can
significantly reduce the development time of chemical and biotechnological
production plants. In addition, interoperability increases as DEXPI is continuously
integrated into existing engineering software (Fillinger et al., 2017). The
uniform, machine-readable format and its growing acceptance in the process industry
make DEXPI data well suited to data science and enable the application of
artificial intelligence (Wiedau et al., 2021).
Node class Quantity
This research work was supported within the KEEN project (grant number: 01MK20014S)
and has been funded by the Federal Ministry of Economic Affairs and Climate Action
(BMWK).
(1)
Digital Chemical Engineering, Volume 2, 2022, Article 100010
Tandoh Henry, Yi Cao
Fig. 4. GUI of the DEXPI-2-graph converter.
Fig. 2. Structure of the Python DEXPI-2-graph converter
Gulli and Pal, 2017
Pumps 69
N
Energy and policy considerations for deep learning in NLP
Chen, 2021
For ease of use, the DEXPI-2-graph converter has a graphical user interface (GUI),
shown in Fig. 4 together with a visualization of an extracted P&ID graph. The user
selects the folder containing the DEXPI P&IDs to be converted and starts the
conversion via buttons. A console window reports the progress, any errors and the
generated GraphML files. In addition, a plot window allows the generated P&ID
graphs to be checked directly. The converter, including the GUI, is published as an
open-source Python application on GitHub at https://github.com/TUDoAD/DEXPI2graphML
(Oeing, 2022).
Proceedings of the ICLR (2014), p. 2015
AI-based processing of P&IDs
Deep sets
4
Neural Comput., 9 (1997), pp. 1735-1780, 10.1162/neco.1997.9.8.1735
Fig. 5. Workflow of an RNN-based model for predicting subsequent equipment in
P&IDs.
Wiedau et al., 2021
Algorithmische Graphentheorie, De Gruyter Studium
Valves 1034
Chem. Ing. Tech., 91 (2019), pp. 240-255, 10.1002/cite.201800112
Workflow - GNN node classification
LEARNING FROM FLOWSHEETS: A GENERATIVE TRANSFORMER MODEL FOR AUTOCOMPLETION OF
FLOWSHEETS
The basis is an XML parser (Python package: xml.etree.ElementTree / version 3.3)
that reads all relevant information from the DEXPI file into Python. In the first
step, the parser searches for all Equipment and PipingComponents and creates a
separate graph node for each component. At the same time, an equipment list is
created that documents the respective EquipmentClass and ComponentClass of each
node. In the second step, the PipingNetworkSystem is searched for connections,
which are implemented as directed edges between the previously created nodes.
Similarly, the InstrumentationFunction of the DEXPI P&ID is scanned: PCE components
are added as additional nodes and the associated signal lines as edges. This makes
it possible to represent control loops and signal streams in the graph information
model. Subsequently, the graph and the associated equipment list are written to a
data store for further use. The structure of the graph representing the P&ID is
shown in Fig. 3.
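The two parsing steps can be illustrated with a minimal sketch. It is not the converter's actual code: the XML snippet is a heavily simplified, hypothetical stand-in for a DEXPI export, the `Connection` element with `FromID`/`ToID` attributes is an assumption made for illustration, and the graph is kept as a plain dictionary and edge list instead of being written to GraphML.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified snippet; NOT a valid DEXPI/Proteus XML file.
SAMPLE = """
<PlantModel>
  <Equipment ID="T1" ComponentClass="Tank"/>
  <Equipment ID="P1" ComponentClass="CentrifugalPump"/>
  <PipingNetworkSystem>
    <PipingNetworkSegment>
      <PipingComponent ID="V1" ComponentClass="BallValve"/>
      <Connection FromID="T1" ToID="V1"/>
      <Connection FromID="V1" ToID="P1"/>
    </PipingNetworkSegment>
  </PipingNetworkSystem>
</PlantModel>
"""

def dexpi_to_graph(xml_text):
    """Return (nodes, edges): nodes maps ID -> class, edges is a list of (from, to)."""
    root = ET.fromstring(xml_text)

    # Step 1: one node per Equipment / PipingComponent; the dict doubles
    # as the equipment list (ID -> ComponentClass).
    nodes = {}
    for tag in ("Equipment", "PipingComponent"):
        for elem in root.iter(tag):
            nodes[elem.get("ID")] = elem.get("ComponentClass")

    # Step 2: connections become directed edges between existing nodes.
    edges = []
    for conn in root.iter("Connection"):
        src, dst = conn.get("FromID"), conn.get("ToID")
        if src in nodes and dst in nodes:
            edges.append((src, dst))
    return nodes, edges

nodes, edges = dexpi_to_graph(SAMPLE)
print(nodes)
print(edges)
```

Instrumentation nodes and signal lines would be handled by a third pass of the same shape over the instrumentation elements.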
References
Batres et al., 1997
Khosla et al., 2020
Proceedings of the NIPS (2017), p. 2017
4
International Organization for Standardization, 2012b. ISO 10209, technical product
documentation – vocabulary – terms relating to technical drawings, product
definition and related documentation. Geneva.
AI Open, 1 (2020), pp. 57-81, 10.1016/j.aiopen.2021.01.001
Parsing the DEXPI files reveals that the level of detail in the P&ID description
varies greatly from user to user, so the XML files have different numbers of
attributes filled in. In addition, the attributes leave some room for
interpretation, so that equivalent information was mapped to different attributes.
This requires a certain degree of robustness, which has been taken into account in
the DEXPI-2-graph implementation: several attributes (e.g. design temperature,
pressure, material, …) are deliberately searched one after another until the
desired information for the respective node is found.
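A fallback search of this kind might look like the following sketch. The attribute names in `DESIGN_TEMPERATURE_KEYS` and the example values are hypothetical and only illustrate the idea; the real converter may use different names and a different search order.

```python
def first_available(attributes, candidates):
    """Return the first non-empty value found under any candidate attribute name."""
    for name in candidates:
        value = attributes.get(name)
        if value is not None and str(value).strip():
            return value
    return None  # nothing usable was filled in for this node

# Hypothetical attribute names under which the same information may appear.
DESIGN_TEMPERATURE_KEYS = [
    "DesignTemperature",
    "UpperLimitDesignTemperature",
    "Temperature",
]

# Two invented nodes with differently filled attributes.
vessel = {"Temperature": "", "UpperLimitDesignTemperature": "120 degC"}
pump = {"Material": "1.4571"}

print(first_available(vessel, DESIGN_TEMPERATURE_KEYS))
print(first_available(pump, DESIGN_TEMPERATURE_KEYS))
```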
2023, Computers and Chemical Engineering
and CONCAT concatenates the individually calculated aggregations of each node.
(7)
Proceedings of the ICLR (2014), p. 2015
Fig. 1. Use cases of artificial intelligence to accelerate and improve the
synthesis of…
2022, arXiv
The authors declare that they have no known competing financial interests or
personal relationships that could have appeared to influence the work reported in
this paper.
Introduction
ISO 10628-2 - Diagrams for the Chemical and Petrochemical Industry – Part 2:
Graphical Symbols
Original Article
The results differ between the models. The training accuracy ranges from 75.4%
(sum-MLP) to 87.3% (sum), while the test accuracy ranges from 74.5% (sum-MLP) to
81.5% (sum). It is striking that the gap between training and test accuracy, at
about 6 percentage points, is larger for the sum aggregation and the attention
aggregation than for the remaining aggregation functions. Additionally, the results
show that simpler aggregation functions such as sum and arithmetic mean achieve
higher training accuracies than the more complex aggregations using attention, set
pooling or sum-MLP. It is hypothesized that this is because all neighborhood
information is equally important for predicting the component class. For this
reason, learning the individual P&ID components works particularly well when the
neighborhood information is aggregated with equal weights, as is the case for the
sum and the mean.
A. Gulli, S. Pal
y
Grabisch et al., 2009
A. Grover, J. Leskovec
W.L. Hamilton, R. Ying, J. Leskovec
International Organization for Standardization
2022, arXiv
number of nodes
ENPRO data integration: extending DEXPI towards the asset lifecycle
Fig. 7. Example of the neighborhood aggregation of a GNN using a P&ID.
Fig. 6. Results of the training of following P&ID equipment with different RNN
models
Declaration of Competing Interest
As a last investigated variant, the graph attention approach (8) is applied, where
additionally for each
Kingma and Ba, 2014
B. Weisfeiler, A. Leman
To check how well the classification can be performed for the different classes,
the confusion matrix of the model with sum aggregation using the test data set is
also considered, see Fig. 10. The columns in the matrix describe the predicted
classes, while the rows represent the real classes. Consequently, the main diagonal
displays the number of correctly classified components (TP).
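As an illustration of this layout (rows for the real classes, columns for the predicted classes), a confusion matrix and the share of correctly classified components per class can be computed as sketched below. The class names and labels are invented toy data, not the results of Fig. 10.

```python
def confusion_matrix(real, predicted, classes):
    """Rows: real classes, columns: predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for r, p in zip(real, predicted):
        matrix[index[r]][index[p]] += 1
    return matrix

def per_class_share_correct(matrix):
    """Diagonal entry divided by the row sum, i.e. the share of each real class hit."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]

# Invented toy labels for three classes.
classes = ["valve", "pump", "vessel"]
real = ["valve", "valve", "valve", "pump", "pump", "vessel"]
predicted = ["valve", "valve", "pump", "pump", "valve", "vessel"]

matrix = confusion_matrix(real, predicted, classes)
shares = per_class_share_correct(matrix)
for name, row in zip(classes, matrix):
    print(name, row)
print(shares)
```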
Fig. 9. Results of the node classification in a P&ID graph via recursive GNN
grouped by the applied aggregation functions.
Hamilton et al., 2017
Process. Saf. Environ. Prot., 83 (2005), pp. 509-532, 10.1205/psep.04055
Fig 10
embedding/feature of a node u
k
AI: artificial intelligence
BRNN: bidirectional recurrent neural network
GNN: graph neural network
GRU: gated recurrent unit
LSTM: long short-term memory
MLP: multilayer perceptron
P&ID: piping and instrumentation diagram
PCE: process control equipment
RNN: recurrent neural network
S. Fillinger, H. Bonart, W. Welscher, E. Esche, J.-U. Repke
(3)
The need for data-driven modeling and optimization is growing in the process
industry due to the increasing digitalization of processes and tools. Previous work
already shows that agent-based environments can accelerate process development,
which can be understood as a first step toward intelligent process development
(Batres et al., 1997). Further work shows the potential of intelligent process
information models that describe chemical plants and processes in a
machine-readable format (e.g. colored Petri nets), which makes them accessible to
deterministic algorithms (Zhao et al., 2005). This paper aims to develop
applications that accelerate the development of P&IDs using artificial intelligence
(AI). This requires the accelerated development and adoption of standardized,
machine-readable file exchange formats to ensure a sufficiently large and highly
available database for the application of AI in P&ID development. In the field of
P&IDs, the DEXPI (Data Exchange in Process Industry) standard is becoming
increasingly popular, as it enables the uniform description of P&IDs and ensures
the vendor-independent exchange of information (Theißen and Wiedau, 2021). At the
same time, DEXPI can serve as a platform for digital plant data in the process
industry (Wiedau et al., 2019), which can significantly reduce the development time
of chemical and biotechnological production plants. In addition, interoperability
increases as DEXPI is continuously integrated into existing engineering software
(Fillinger et al., 2017).
Separation units 15
Conclusion & outlook
Long short-term memory
into a one-hot vector, i.e., a vector consisting of zeros except for a single entry
that is set to one (Gulli and Pal, 2017). The sum of such vectors makes it possible
to determine exactly which one-hot vectors were used to generate it.
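A small sketch makes this property concrete: summing one-hot vectors yields a count per class, from which the set of summed vectors can be read back. The three class names are placeholders chosen for illustration.

```python
def one_hot(label, classes):
    """Vector of zeros with a single one at the position of the label."""
    return [1 if c == label else 0 for c in classes]

def sum_vectors(vectors):
    """Element-wise sum of equally long vectors."""
    return [sum(column) for column in zip(*vectors)]

# Placeholder classes and an invented neighborhood of three components.
classes = ["valve", "pump", "vessel"]
neighbors = ["valve", "pump", "valve"]

total = sum_vectors([one_hot(n, classes) for n in neighbors])
counts = dict(zip(classes, total))  # the sum reveals how often each class occurred
print(total)
print(counts)
```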
Using artificial intelligence to support the drawing of piping and instrumentation
diagrams using DEXPI standard
Sequential node prediction using recurrent neural networks
How powerful are graph neural networks?
Apress, Berkeley (2018), 10.1007/978-1-4842-3516-4
K. Xu, W. Hu, J. Leskovec, S. Jegelka
International Electrotechnical Commission, 2016. IEC 62424, Representation of
process control engineering – requests in P&I diagrams and data exchange between
P&ID tools and PCE-CAE tools. International Electrotechnical Commission, Geneva.
International Electrotechnical Commission 2016
StellarGraph 2020
Safety valves 93
Artificial intelligence; Process synthesis; P&ID; Process engineering; Detail engineering
Fig. 8. Workflow of learning P&ID components using GNNs.
a weight factor
2023, arXiv
arbitrary node u of a graph
The recursive GNN is based on the GraphSAGE algorithm presented previously in
chapter 4. The number of layers is k = 3. The ReLU function (Manaswi, 2018) is used
as the activation function, followed by a normalization. To achieve the highest
possible prediction accuracy, different state-of-the-art aggregation functions are
applied and compared with each other. The basic variants are the sum (4) and the
arithmetic mean (5) (Grabisch et al., 2009).
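The sum and mean variants can be illustrated with one GraphSAGE-style layer in plain Python. This is a sketch under assumptions, not the model used here: the weight matrix is fixed instead of learned, the node's own features are concatenated with the aggregated neighbor features, and the "subsequent normalization" is taken to be an L2 normalization.

```python
import math

def aggregate(neighbor_feats, mode="sum"):
    """Sum or arithmetic mean over the feature vectors of the neighbors."""
    dim = len(neighbor_feats[0])
    total = [sum(f[i] for f in neighbor_feats) for i in range(dim)]
    if mode == "mean":
        return [t / len(neighbor_feats) for t in total]
    return total

def sage_layer(features, neighbors, weight, mode="sum"):
    """One layer: CONCAT(own, aggregated) -> linear map -> ReLU -> L2 normalization."""
    updated = {}
    for u, h_u in features.items():
        neigh = [features[v] for v in neighbors[u]]
        agg = aggregate(neigh, mode) if neigh else [0.0] * len(h_u)
        z = h_u + agg  # list concatenation, i.e. CONCAT
        out = [max(0.0, sum(w * x for w, x in zip(row, z))) for row in weight]
        norm = math.sqrt(sum(o * o for o in out))
        updated[u] = [o / norm for o in out] if norm > 0 else out
    return updated

# Toy graph: node "a" has the neighbors "b" and "c".
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.0, 1.0]}
neighbors = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

h_sum = sage_layer(features, neighbors, identity, mode="sum")
h_mean = sage_layer(features, neighbors, identity, mode="mean")
print(h_sum["a"])
print(h_mean["a"])
```

Stacking three such layers (k = 3) lets each node see information from neighbors up to three edges away.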
hu
Wiedau et al., 2019
Since GNNs learn the structure of the graphs by aggregating over neighbors, it is
important that the graphs have no missing links. As mentioned before, this is not
always the case in the database from chapter 2.2 due to inconsistencies in real
P&ID drawings and their DEXPI exports. For this reason, a new dataset of laboratory
plants and industrial distillation plants is used. The dataset contains 13 P&ID
graphs with a sufficiently high density of cross-links to represent the original
P&IDs well. These 13 P&ID graphs, with 2020 nodes and 2283 edges in total, are used
to train the GNNs in the following. Within the 13 P&IDs, there are 47 different
equipment classes. In this context, a feasibility analysis is performed to
investigate the potential of classifying P&ID equipment with GNNs and to identify
further challenges. Different components in a P&ID can fulfill the same process
engineering function, and a consistency check is meant to examine whether a
meaningful functionality is used at a given position. It is therefore advisable to
group the components within the P&ID into meaningful classes. This has the
additional advantage that even components that occur only once can contribute as
training data to the classification. Hence, the 47 different components in the
training dataset are sorted into 9 superordinate classes, which are shown in Table 1
together with the number of components per class in the training data.
Keywords
X. Hu, P. Balasubramaniam
open access
In the following, several P&IDs in the standardized DEXPI format are used as
training data. They were exported using the program PlantEngineer from the software
vendor X-Visual Technologies GmbH and converted to graphs in GraphML format
(GraphML Project Group, 2017) according to chapter 2.1. In total, 35 P&ID graphs
from third parties (laboratory and industrial plants) with 1641 nodes and 1410
edges are used. The dataset contains 92 different equipment classes (valves, pumps,
vessels, instrumentation, etc.) based on the DEXPI specification (Theißen and
Wiedau, 2021) and three different classes of edges (pipes, signal lines, process
connection lines). The ratio of nodes to edges shows that, as expected for P&IDs,
these are very linear graphs with rather low connectivity. On closer inspection,
there are usually many single nodes along a pipeline (e.g. valves, vessels, pumps,
heat exchangers, measuring points, etc.), which results in a kind of dead end.
Additionally, some P&IDs show inconsistencies in their drawn structures, which in
some cases lead to isolated nodes or several smaller graphs. However, these
inconsistencies were deliberately kept in the dataset, since the data is intended
to represent the current state of machine-readable P&IDs in the process industry
and thus yield representative results. The influence of the inconsistencies on the
results is examined in more detail in chapter 4.
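Isolated nodes and split graphs of this kind can be detected by counting weakly connected components. The sketch below uses a breadth-first search on an invented toy graph; the node names are placeholders and the function is not part of the published converter.

```python
from collections import deque

def weakly_connected_components(nodes, edges):
    """Group nodes into components, ignoring the direction of the edges."""
    adjacency = {n: set() for n in nodes}
    for src, dst in edges:
        adjacency[src].add(dst)
        adjacency[dst].add(src)

    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components

# Invented toy P&ID graph: one main line, one detached line, one isolated node.
nodes = ["T1", "V1", "P1", "V2", "HX1", "PI1"]
edges = [("T1", "V1"), ("V1", "P1"), ("V2", "HX1")]

components = weakly_connected_components(nodes, edges)
isolated = [c for c in components if len(c) == 1]
print(len(components), isolated)
```

A graph has at least (nodes minus edges) components, so 1641 nodes with 1410 edges imply at least 231 components across the 35 graphs, which is consistent with the isolated nodes and split graphs described above.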
Fig. 3. P&ID topology representing GraphML structure used for further training.
Results - GNN node classification
Abbreviations
M. Grabisch, J.-L. Marichal, R. Mesiar, E. Pap
Fillinger et al., 2017
E
Zhao et al., 2005
The accuracy is calculated by dividing the sum of true positives (TP) and true
negatives (TN) by the sum of true positives (TP), true negatives (TN), false
positives (FP) and false negatives (FN). The computations were done on an Intel®
Xeon® W-2155 (3.31 GHz) CPU in combination with 128 GB RAM. The results, shown in
Fig. 6, comprise the training accuracy and the validation accuracy. In addition,
accuracy5 indicates how often the real output of the validation dataset is among
the five most probable outputs returned by the model (top-5 accuracy). This score
is of particular interest for investigating whether the trained models are suitable
for a suggestion system that can be used, for example, in a drop-down menu to speed
up the drawing of P&IDs. Furthermore, the loss and the training and validation
accuracy over 60 epochs are given, as well as the training time needed for the 60
epochs.
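The top-5 score can be illustrated with a small helper that checks whether the real class is among the k most probable outputs. This is a sketch with invented probabilities; the function name `top_k_accuracy` and the data layout (one dictionary of class probabilities per sample) are assumptions and not the Keras implementation.

```python
def top_k_accuracy(probabilities, true_labels, k=5):
    """Share of samples whose real class is among the k most probable outputs."""
    hits = 0
    for probs, label in zip(probabilities, true_labels):
        ranked = sorted(probs, key=probs.get, reverse=True)[:k]
        hits += label in ranked
    return hits / len(true_labels)

# Invented predictions for two samples over three classes.
predictions = [
    {"valve": 0.6, "pump": 0.3, "vessel": 0.1},
    {"valve": 0.5, "pump": 0.2, "vessel": 0.3},
]
real = ["pump", "pump"]

for k in (1, 2, 3):
    print(k, top_k_accuracy(predictions, real, k=k))
```

With k = 1 the score reduces to the ordinary accuracy, so the top-k score can never be lower than the accuracy on the same data.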
N.K. Manaswi
Proteus XML, 2017. Proteus schema for P&ID exchange [WWW Document]. URL
https://github.com/ProteusXML/proteusxml (accessed 5.18.22).
C. Zhao, M. Bhushan, V. Venkatasubramanian
Xiaochi Zhou, …, Markus Kraft
Aggregation Functions (Encyclopedia of Mathematics and its Applications)
D.P. Kingma, J. Ba
The main diagonal shows that most components of each class are correctly
classified. All separation units are classified correctly. Process control
equipment (PCE, 96%) and piping components (93%) are also almost completely
correctly assigned. With a prediction accuracy of over 87%, valves and check valves
are also classified sufficiently well, although there is noticeable confusion
between valves and piping equipment as well as between valves and safety valves. At
less than 10%, however, this is still within tolerable limits. The classification
of the remaining components is much more difficult for the GNN: 39% of safety
valves are identified as piping equipment and 16% as standard valves. This is not
surprising, since these classes are usually found in similar positions of a P&ID.
The classification of pumps (67%), vessels (55%) and heat exchangers (75%) is only
moderately satisfactory. It is noticeable that all three classes are mainly
misclassified as valves or PCEs, i.e. classes that are particularly strongly
represented in the dataset according to Table 1. The GNN models should therefore be
further optimized in the future. One conceivable option is to give the
comparatively underrepresented classes more influence in the training by
introducing weighting factors. Another is to use a larger k, which would aggregate
more information, at the price of a larger computational effort.
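One common way to obtain such weighting factors is inverse-frequency weighting, sketched below. The scheme is an assumption, since no specific weighting is prescribed here. The class counts are the Table 1 quantities that are recoverable in this copy of the article (six of the nine classes), so the resulting numbers are illustrative only.

```python
# Table 1 quantities recoverable in this copy (six of the nine classes).
class_counts = {
    "Valves": 1034,
    "Safety valves": 93,
    "Heat exchangers": 86,
    "Pumps": 69,
    "Piping equipment": 41,
    "Separation units": 15,
}

def inverse_frequency_weights(counts):
    """Weight each class by total / (number of classes * class count)."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {name: total / (n_classes * n) for name, n in counts.items()}

weights = inverse_frequency_weights(class_counts)
for name, weight in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{name}: {weight:.2f}")
```

With this choice every class contributes the same total weight to the loss, so rare classes such as separation units are no longer dominated by the large valve class.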
In the following, the different RNN models are trained with the P&ID graphs
generated in chapter 2.2 according to the presented workflow. The implementation is
done in Python using the Keras library (Chollet, 2020). The "Adam" optimizer
(Kingma and Ba, 2014) is used for all training runs, and the loss is calculated
using the "categorical cross entropy" (Murphy, 2012). The prediction accuracy is
used as the evaluation metric and is defined as follows.
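The formula itself did not survive in this copy of the text. Based on the verbal description given with the results (true positives TP, true negatives TN, false positives FP, false negatives FN), it can be reconstructed as follows; the typesetting is a reconstruction, not the original:

```latex
\mathrm{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
```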
b
Xu et al., 2019
Jonas Oeing a, Wolfgang Welscher b, Niclas Krink b, Lars Jansen a, Fabian Henke a,
Norbert Kockmann a
Digital Chemical Engineering
Data augmentation for machine learning of chemical process flowsheets
is defined (Bahdanau et al., 2014; Hamilton, 2020). In the context of this work, a
softmax approach (9) is used to calculate the weight factor
Piping equipment 41
InTech (2008), 10.5772/68
(6)
Packt Publishing, Birmingham (2017)
Yoshinari Hashimoto, Hiroto Kase
Improving interoperability of engineering tools - data exchange in plant design
Proceedings of the ACL (2019), p. 19
Zaheer et al., 2017
1
Hu, 2008
Heat exchangers 86
v
Digital Chemical Engineering, Volume 3, 2022, Article 100028
Adam: a method for stochastic optimization
Node classification using graph neural networks
Table 1
Table 1. Higher level node classes and their quantity in the training dataset.
The workflow of the node classification is shown in Fig. 8. First, all nodes of all
P&ID graphs in the dataset are divided into a training dataset (80%) and a test
dataset (20%) using a mask. The neural network is then provided with information
about the topology of the graph as well as attributes of the nodes and edges, e.g.
equipment class, connection type, etc. From this information, the network generates
an embedding for each node and predicts the node class. The prediction is compared
with the real node class and the error is reduced via backpropagation. After the
training is finished, the trained network can be used for node classification of
unseen data (nodes).
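The 80/20 split via a mask can be sketched as follows. The random assignment with a fixed seed and the function name `train_test_mask` are assumptions made for illustration; the node count of 2020 is the one reported for the GNN dataset.

```python
import random

def train_test_mask(num_nodes, train_fraction=0.8, seed=0):
    """Boolean mask over all nodes: True = training node, False = test node."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    indices = list(range(num_nodes))
    rng.shuffle(indices)
    n_train = round(train_fraction * num_nodes)
    train = set(indices[:n_train])
    return [i in train for i in range(num_nodes)]

mask = train_test_mask(2020)
n_train = sum(mask)
n_test = len(mask) - n_train
print(n_train, n_test)
```

During training, the loss is evaluated only on nodes where the mask is True, while the full graph topology remains available for the neighborhood aggregation.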
GraphML Project Group 2017
https://doi.org/10.1016/j.dche.2022.100038
A comparative study for unsupervised network representation learning
Datasets
output