03.08.2020 | Features and technology
PERSEUS is a platform developed by Aingura IIoT for dumping, cleaning, filtering and analyzing offline data. Its main objective is to take advantage of the power of our HPC (high performance computing) system to pre-process and process data from different acquisition domains, helping us to design and develop appropriate algorithms that will be deployed at our Edge computing node.
It is also a tool that provides exploratory and diagnostic analysis services for our customers. In a completely unattended process, the system starts by taking all available data sources and storing them in Apache Parquet format.
Then, it launches a preliminary sanity check that computes basic statistics to provide data quality information and raises alerts as soon as data degradation is detected, for example erratic sampling times, static variables or anomalous dispersions. Users can configure these alerts to suit the nature of the data.
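As an illustration of the kind of check described above, the following minimal sketch flags the three example conditions for a single variable. It is not PERSEUS code: the function name `sanity_check`, the alert names and the thresholds `jitter_tol` and `max_cv` are assumptions chosen for this example only.

```python
from statistics import mean, median, pstdev


def sanity_check(timestamps, values, jitter_tol=0.5, max_cv=1.0):
    """Return a list of alert names for one variable.

    timestamps: sample times in seconds, in ascending order.
    values: readings taken at those times.
    jitter_tol, max_cv: illustrative thresholds (assumptions, not PERSEUS defaults).
    """
    alerts = []

    # Erratic sampling: some interval deviates from the median interval
    # by more than jitter_tol, expressed as a fraction of the median.
    intervals = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if intervals:
        typical = median(intervals)
        if any(abs(dt - typical) > jitter_tol * typical for dt in intervals):
            alerts.append("erratic_sampling")

    # Static variable: the signal never changes.
    if len(set(values)) <= 1:
        alerts.append("static_variable")
    else:
        # Anomalous dispersion: coefficient of variation above max_cv.
        mu = mean(values)
        if mu != 0 and pstdev(values) / abs(mu) > max_cv:
            alerts.append("anomalous_dispersion")

    return alerts
```

In practice the thresholds would be set per variable, which is one way to read "specified by the user to suit the nature of the data".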
Figure 1. Data Storage
For algorithm design and development, PERSEUS can extract user-specified subsets of data according to the analysis needs in terms of time frame and variables. To create these subsets, the PERSEUS sensor fusion features join variables coming from different domains, even when they were acquired with different sampling times. To reconcile those differences, a wide variety of imputation techniques is available to maintain the statistical quality of the data. These features also help to repair datasets with acquisition issues, such as missing data or errors caused by out-of-range sensors.
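The idea of joining variables with different sampling times can be sketched as follows. This example uses linear interpolation, which is only one of many possible imputation techniques and is not necessarily the one PERSEUS applies. The helper names `interpolate_at`, `drop_out_of_range` and `fuse`, as well as the sample data, are hypothetical.

```python
from bisect import bisect_left


def interpolate_at(times, values, t):
    """Linearly interpolate a series at time t; clamp outside its range.

    Assumes times is ascending and contains at least one sample.
    """
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect_left(times, t)  # first index with times[i] >= t
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def drop_out_of_range(times, values, low, high):
    """Treat out-of-range readings as missing by removing them."""
    kept = [(t, v) for t, v in zip(times, values) if low <= v <= high]
    return [t for t, _ in kept], [v for _, v in kept]


def fuse(base_times, series):
    """Resample every named series onto base_times and join them row by row."""
    return [
        {"t": t, **{name: interpolate_at(ts, vs, t) for name, (ts, vs) in series.items()}}
        for t in base_times
    ]


# Hypothetical data: temperature every 2 s, load every 1 s with one sensor spike.
temp_t, temp_v = [0.0, 2.0, 4.0], [20.0, 22.0, 24.0]
load_t, load_v = [0.0, 1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 9.0, 1.0, 1.0]

# The spike at t = 2 s is outside the assumed valid range and is dropped,
# so its value is later re-imputed from the neighbouring samples.
load_t, load_v = drop_out_of_range(load_t, load_v, low=0.0, high=2.0)

rows = fuse(
    [0.0, 1.0, 2.0, 3.0, 4.0],
    {"temp": (temp_t, temp_v), "load": (load_t, load_v)},
)
```

Each element of `rows` is one fused record on the common 1 s time base, with the temperature upsampled and the removed load reading filled in by interpolation.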
Figure 2. Data Filtering
Regarding offline services, PERSEUS can apply different artificial intelligence algorithms depending on the needs. These can be standard algorithms, such as K-means, GMM (Gaussian mixture models), agglomerative clustering or MDS (multidimensional scaling), among others, or algorithms developed by Aingura IIoT, such as GDPC and GDPC+.
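To give a concrete picture of the simplest of the standard algorithms mentioned above, here is a minimal K-means sketch (Lloyd's algorithm) using only the Python standard library. It is a generic textbook version, not the PERSEUS implementation, and it does not represent GDPC or GDPC+. The function name `kmeans`, its parameters and the sample points are assumptions for illustration.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct points as initial centroids
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Hypothetical two-variable data with two well-separated operating regimes.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(points, k=2)
```

The cluster indices themselves are arbitrary; what matters is that the first three points share one label and the last three share the other.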
Figure 3. Offline exploratory algorithms results.
The benefit of running on HPC is the large capacity for fully unattended offline analysis, which allows actionable insights to be delivered as visualizations, as data files such as CSV, PRV, etc., or as automated reports.
For more information, please contact us at
J. Diaz-Rozo, C. Bielza, and P. Larrañaga, "Clustering of data streams with dynamic Gaussian mixture models: An IoT application in industrial processes", IEEE Internet of Things Journal, vol. 5, no. 5, pp. 3533-3547, 2018, doi: 10.1109/JIOT.2018.2840129.
J. Diaz-Rozo, C. Bielza, and P. Larrañaga, "Machine-tool condition monitoring with Gaussian mixture models-based dynamic probabilistic clustering", Engineering Applications of Artificial Intelligence, vol. 89, p. 103434, Mar. 2020, doi: 10.1016/j.engappai.2019.103434.