Acoustic anomaly detection of oil and gas processing equipment using spectrogram-based machine learning

Spectrogram-based machine learning with acoustic sensors effectively detects anomalies in oil and gas processing equipment, enhancing maintenance efficiency and reducing downtime by automating fault detection.

WO2026128284A2 · PCT designated stage · Publication Date: 2026-06-18 · SCHLUMBERGER TECH CORP +3

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
SCHLUMBERGER TECH CORP
Filing Date
2025-12-04
Publication Date
2026-06-18


Abstract

A method for detecting acoustic anomalies in oil and gas processing equipment includes receiving first audio data that is generated by the oil and gas processing equipment. The method also includes segmenting the first audio data into a plurality of time frames. The method also includes extracting a spectrum from the time frames. The method also includes generating a spectrogram based upon the spectrum. The method also includes training a machine learning (ML) audio model based upon the spectrogram. The method also includes receiving second audio data that is generated by the oil and gas processing equipment or different oil and gas processing equipment. The method also includes determining an output using the trained ML audio model based upon the second audio data.

Description

ACOUSTIC ANOMALY DETECTION OF OIL AND GAS PROCESSING EQUIPMENT USING SPECTROGRAM-BASED MACHINE LEARNING

Cross-Reference to Related Applications

[0001] This application claims priority to U.S. Provisional Patent Application No. 63/730,070, filed on December 10, 2024, which is incorporated by reference.

Background

[0002] Oil and gas equipment involves regular maintenance and tuning to both minimize unforeseen breakdowns or failures during operation and to extend the overall lifespan of the equipment. The equipment may be appropriately maintained and inspected to assure sustainable, efficient, and safe production. High accuracy in fault detection is helpful, as failure to detect faults in the process equipment may result in costly consequences such as higher repair costs, production downtime, and nonproductive time, health, and safety implications for workers. For various reasons, it is desirable to limit a human physical presence for intervention in industrial process plants. Inspection and maintenance operations should be converted to condition-based predictive actions to reduce the overall in-operation failure rate, and autonomous robotics may be used to handle as many tasks as possible.

[0003] Process equipment in the oil and gas industry may produce a wide range of sounds. A functional piece of equipment creates a standard sound, while malfunctioning equipment creates an anomalous sound. Anomalous sound detection aims to identify whether the sound emitted from equipment is normal or anomalous. An acoustic sensor may be fully integrated in the payload of a robot, which may be used to collect audio data for anomaly detection of the process equipment in an autonomous manner. The emerging technologies of artificial intelligence (AI) and machine learning (ML) have opened new opportunities for automatically detecting process equipment malfunctions.

Summary

[0004] A method for detecting acoustic anomalies in oil and gas processing equipment is disclosed. The method includes receiving first audio data that is generated by the oil and gas processing equipment. The method also includes segmenting the first audio data into a plurality of time frames. The method also includes extracting a spectrum from the time frames. The method also includes generating a spectrogram based upon the spectrum. The method also includes training a machine learning (ML) audio model based upon the spectrogram. The method also includes receiving second audio data that is generated by the oil and gas processing equipment or different oil and gas processing equipment. The method also includes determining an output using the trained ML audio model based upon the second audio data.

[0005] A computing system is also disclosed. The computing system includes one or more processors and a memory system. The memory system includes one or more non-transitory computer-readable media storing instructions that, when executed by at least one of the one or more processors, cause the computing system to perform operations. The operations include receiving first audio data. The first audio data is measured by an acoustic sensor or microphone. The first audio data is generated by the oil and gas processing equipment. The operations also include denoising the first audio data. The operations also include segmenting the first audio data into a plurality of time frames. The operations also include extracting a spectrum from the time frames. The operations also include generating a spectrogram based upon the spectrum. The operations also include training a machine learning (ML) audio model based upon the spectrogram. The operations also include receiving second audio data. The second audio data is generated by the oil and gas processing equipment or different oil and gas processing equipment. The operations also include determining an output using the trained ML audio model based upon the second audio data.

[0006] A non-transitory computer-readable medium is also disclosed. The medium includes instructions that, when executed by one or more processors of a computing system, cause the computing system to perform operations. The operations include receiving first audio data. The first audio data is measured by an acoustic sensor or microphone. The first audio data is generated by the oil and gas processing equipment. The operations also include denoising the first audio data. The operations also include segmenting the first audio data into a plurality of first time frames. The operations also include extracting a first spectrum from the first time frames. The operations also include generating a first spectrogram based upon the first spectrum. The operations also include training a machine learning (ML) audio model based upon the first spectrogram. The operations also include receiving second audio data. The second audio data is generated by the oil and gas processing equipment or different oil and gas processing equipment. The operations also include denoising the second audio data. The operations also include segmenting the second audio data into a plurality of second time frames. The operations also include extracting a second spectrum from the second time frames. The operations also include generating a second spectrogram based upon the second spectrum. The operations also include determining an output using the trained ML audio model based upon the second spectrogram.

[0007] It will be appreciated that this summary is intended merely to introduce some aspects of the present methods, systems, and media, which are more fully described and/or claimed below. Accordingly, this summary is not intended to be limiting.

Brief Description of the Drawings

[0008] The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the present teachings and together with the description, serve to explain the principles of the present teachings. In the figures:

[0009] Figure 1 illustrates an example of a system that includes various management components to manage various aspects of a geologic environment, according to an embodiment.

[0010] Figure 2 illustrates a workflow for anomalous sound detection, and how the input audio signal is pre-processed prior to becoming the input for our model, according to an embodiment.

[0011] Figure 3 illustrates an autoencoder with a spectrogram for anomaly detection, according to an embodiment.

[0012] Figure 4A illustrates an aerial view of a portion of a production facility, where audio data may be recorded for training and validation of a model, according to an embodiment.

[0013] Figure 4B illustrates an inspection robot collecting audio data by staying close to the two centrifugal hot oil pumps of interest, according to an embodiment.

[0014] Figure 5A illustrates audio signals for hot oil pump A (normal), Figure 5B illustrates audio signals for hot oil pump B (abnormal), Figure 5C illustrates a spectrogram for hot oil pump A (normal), and Figure 5D illustrates a spectrogram for hot oil pump B (abnormal), according to an embodiment.

[0015] Figure 6 illustrates a histogram of reconstruction errors of audio clips evaluated on the test dataset of hot oil pumps, according to an embodiment.

[0016] Figure 7 illustrates the reconstruction errors of audio clips as a function of samples evaluated on test dataset, according to an embodiment.

[0017] Figure 8 illustrates a confusion matrix for anomalous sound detection with the autoencoder evaluated on the test dataset of hot oil pumps, according to an embodiment.

[0018] Figure 9 illustrates a flowchart of a method for detecting acoustic anomalies in oil and gas processing equipment using spectrogram-based machine learning, according to an embodiment.

[0019] Figure 10 illustrates a schematic view of a computing system for performing at least a portion of the method(s) described herein, according to an embodiment.

Detailed Description

[0020] Reference will now be made in detail to embodiments, examples of which are illustrated in the accompanying drawings and figures. In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure. However, it will be apparent to one of ordinary skill in the art that the present disclosure may be practiced without these specific details. In other instances, well-known methods, procedures, components, circuits, and networks have not been described in detail so as not to unnecessarily obscure aspects of the embodiments.

[0021] It will also be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first object or step could be termed a second object or step, and, similarly, a second object or step could be termed a first object or step, without departing from the scope of the present disclosure. The first object or step, and the second object or step, are both, objects or steps, respectively, but they are not to be considered the same object or step.

[0022] The terminology used in the description herein is for the purpose of describing particular embodiments and is not intended to be limiting. As used in this description and the appended claims, the singular forms “a,” “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term “and/or” as used herein refers to and encompasses any possible combinations of one or more of the associated listed items. It will be further understood that the terms “includes,” “including,” “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. Further, as used herein, the term “if” may be construed to mean “when” or “upon” or “in response to determining” or “in response to detecting,” depending on the context.

[0023] Attention is now directed to processing procedures, methods, techniques, and workflows that are in accordance with some embodiments. Some operations in the processing procedures, methods, techniques, and workflows disclosed herein may be combined and/or the order of some operations may be changed.

System Overview

[0024] Figure 1 illustrates an example of a system 100 that includes various management components 110 to manage various aspects of a geologic environment 150 (e.g., an environment that includes a sedimentary basin, a reservoir 151, one or more faults 153-1, one or more geobodies 153-2, etc.). For example, the management components 110 may allow for direct or indirect management of sensing, drilling, injecting, extracting, etc., with respect to the geologic environment 150. In turn, further information about the geologic environment 150 may become available as feedback 160 (e.g., optionally as input to one or more of the management components 110).

[0025] In the example of Figure 1, the management components 110 include a seismic data component 112, an additional information component 114 (e.g., well / logging data), a processing component 116, a simulation component 120, an attribute component 130, an analysis / visualization component 142 and a workflow component 144. In operation, seismic data and other information provided per the components 112 and 114 may be input to the simulation component 120.

[0026] In an example embodiment, the simulation component 120 may rely on entities 122. Entities 122 may include earth entities or geological objects such as wells, surfaces, bodies, reservoirs, etc. In the system 100, the entities 122 may include virtual representations of actual physical entities that are reconstructed for purposes of simulation. The entities 122 may include entities based on data acquired via sensing, observation, etc. (e.g., the seismic data 112 and other information 114). An entity may be characterized by one or more properties (e.g., a geometrical pillar grid entity of an earth model may be characterized by a porosity property). Such properties may represent one or more measurements (e.g., acquired data), calculations, etc.

[0027] In an example embodiment, the simulation component 120 may operate in conjunction with a software framework such as an object-based framework. In such a framework, entities may include entities based on pre-defined classes to facilitate modeling and simulation. A commercially available example of an object-based framework is the MICROSOFT® .NET® framework (Redmond, Washington), which provides a set of extensible object classes. In the .NET® framework, an object class encapsulates a module of reusable code and associated data structures. Object classes may be used to instantiate object instances for use by a program, script, etc. For example, borehole classes may define objects for representing boreholes based on well data.

[0028] In the example of Figure 1, the simulation component 120 may process information to conform to one or more attributes specified by the attribute component 130, which may include a library of attributes. Such processing may occur prior to input to the simulation component 120 (e.g., consider the processing component 116). As an example, the simulation component 120 may perform operations on input information based on one or more attributes specified by the attribute component 130. In an example embodiment, the simulation component 120 may construct one or more models of the geologic environment 150, which may be relied on to simulate behavior of the geologic environment 150 (e.g., responsive to one or more acts, whether natural or artificial). In the example of Figure 1, the analysis / visualization component 142 may allow for interaction with a model or model-based results (e.g., simulation results, etc.). As an example, output from the simulation component 120 may be input to one or more other workflows, as indicated by a workflow component 144.

[0029] As an example, the simulation component 120 may include one or more features of a simulator such as the ECLIPSE™ reservoir simulator (SLB, Houston, Texas), the INTERSECT™ reservoir simulator (SLB, Houston, Texas), etc. As an example, a simulation component, a simulator, etc. may include features to implement one or more meshless techniques (e.g., to solve one or more equations, etc.). As an example, a reservoir or reservoirs may be simulated with respect to one or more enhanced recovery techniques (e.g., consider a thermal process such as SAGD, etc.).

[0030] As an example, the simulation component 120 may include one or more features of a simulator such as SYMMETRY™ software (SLB, Houston, Texas). More particularly, SYMMETRY™ may process workflows in a single integrated environment with accurate thermodynamic fluid representation and consistent modeling across multiple disciplines including process, production, and HSE. The simulator integrates steady-state and transient (e.g., dynamic) analyses that may be tailored for each domain. This approach enables users to optimize processes in upstream, midstream, and downstream sectors while maximizing profits and minimizing capital expenditures. It may also help reduce emissions, energy consumption, and waste.

[0031] As an example, the simulation component 120 may include one or more features of a simulator such as PIPESIM™ (SLB, Houston, Texas). More particularly, PIPESIM™ is a steady-state multiphase flow simulator that incorporates the three areas of flow modeling: multiphase flow, heat transfer and fluid behavior.

[0032] As an example, the simulation component 120 may include one or more features of a simulator such as OLGA™ (SLB, Houston, Texas). More particularly, OLGA™ is a dynamic multiphase flow simulator that models transient flow (e.g., time-dependent behaviors) to maximize production potential. Transient modeling is a component for feasibility studies and field development design. Dynamic simulation is useful in deep water and is used in both offshore and onshore developments to investigate transient behavior in pipelines and wellbores. Transient simulation with the OLGA™ simulator provides an added dimension to steady-state analysis by predicting system dynamics, such as time-varying changes in flow rates, fluid compositions, temperature, solids deposition, and operational changes.

[0033] In an example embodiment, the management components 110 may include features of a commercially available framework such as the PETREL® seismic to simulation software framework (SLB, Houston, Texas). The PETREL® framework provides components that allow for optimization of exploration and development operations. The PETREL® framework includes seismic to simulation software components that may output information for use in increasing reservoir performance, for example, by improving asset team productivity. Through use of such a framework, various professionals (e.g., geophysicists, geologists, and reservoir engineers) may develop collaborative workflows and integrate operations to streamline processes. Such a framework may be considered an application and may be considered a data-driven application (e.g., where data is input for purposes of modeling, simulating, etc.).

[0034] In an example embodiment, various aspects of the management components 110 may include add-ons or plug-ins that operate according to specifications of a framework environment. For example, a commercially available framework environment marketed as the OCEAN® framework environment (SLB, Houston, Texas) allows for integration of add-ons (or plug-ins) into a PETREL® framework workflow. The OCEAN® framework environment leverages .NET® tools (Microsoft Corporation, Redmond, Washington) and offers stable, user-friendly interfaces for efficient development. In an example embodiment, various components may be implemented as add-ons (or plug-ins) that conform to and operate according to specifications of a framework environment (e.g., according to application programming interface (API) specifications, etc.).

[0035] Figure 1 also shows an example of a framework 170 that includes a model simulation layer 180 along with a framework services layer 190, a framework core layer 195 and a modules layer 175. The framework 170 may include the commercially available OCEAN® framework where the model simulation layer 180 is the commercially available PETREL® model-centric software package that hosts OCEAN® framework applications. In an example embodiment, the PETREL® software may be considered a data-driven application. The PETREL® software may include a framework for model building and visualization.

[0036] As an example, a framework may include features for implementing one or more mesh generation techniques. For example, a framework may include an input component for receipt of information from interpretation of seismic data, one or more attributes based at least in part on seismic data, log data, image data, etc. Such a framework may include a mesh generation component that processes input information, optionally in conjunction with other information, to generate a mesh.

[0037] In the example of Figure 1, the model simulation layer 180 may provide domain objects 182, act as a data source 184, provide for rendering 186 and provide for various user interfaces 188. Rendering 186 may provide a graphical environment in which applications may display their data while the user interfaces 188 may provide a common look and feel for application user interface components.

[0038] As an example, the domain objects 182 may include entity objects, property objects and optionally other objects. Entity objects may be used to geometrically represent wells, surfaces, bodies, reservoirs, etc., while property objects may be used to provide property values as well as data versions and display parameters. For example, an entity object may represent a well where a property object provides log information as well as version information and display information (e.g., to display the well as part of a model).

[0039] In the example of Figure 1, data may be stored in one or more data sources (or data stores, generally physical data storage devices), which may be at the same or different physical sites and accessible via one or more networks. The model simulation layer 180 may be configured to model projects. As such, a particular project may be stored where stored project information may include inputs, models, results and cases. Thus, upon completion of a modeling session, a user may store a project. At a later time, the project may be accessed and restored using the model simulation layer 180, which may recreate instances of the relevant domain objects.

[0040] In the example of Figure 1, the geologic environment 150 may include layers (e.g., stratification) that include a reservoir 151 and one or more other features such as the fault 153-1, the geobody 153-2, etc. As an example, the geologic environment 150 may be outfitted with any of a variety of sensors, detectors, actuators, etc. For example, equipment 152 may include communication circuitry to receive and to transmit information with respect to one or more networks 155. Such information may include information associated with downhole equipment 154, which may be equipment to acquire information, to assist with resource recovery, etc. Other equipment 156 may be located remote from a well site and include sensing, detecting, emitting or other circuitry. Such equipment may include storage and communication circuitry to store and to communicate data, instructions, etc. As an example, one or more satellites may be provided for purposes of communications, data acquisition, etc. For example, Figure 1 shows a satellite in communication with the network 155 that may be configured for communications, noting that the satellite may additionally or instead include circuitry for imagery (e.g., spatial, spectral, temporal, radiometric, etc.).

[0041] Figure 1 also shows the geologic environment 150 as optionally including equipment 157 and 158 associated with a well that includes a substantially horizontal portion that may intersect with one or more fractures 159. For example, consider a well in a shale formation that may include natural fractures, artificial fractures (e.g., hydraulic fractures) or a combination of natural and artificial fractures. As an example, a well may be drilled for a reservoir that is laterally extensive. In such an example, lateral variations in properties, stresses, etc. may exist where an assessment of such variations may assist with planning, operations, etc. to develop a laterally extensive reservoir (e.g., via fracturing, injecting, extracting, etc.). As an example, the equipment 157 and/or 158 may include components, a system, systems, etc. for fracturing, seismic sensing, analysis of seismic data, assessment of one or more fractures, etc.

[0042] As mentioned, the system 100 may be used to perform one or more workflows. A workflow may be a process that includes a number of worksteps. A workstep may operate on data, for example, to create new data, to update existing data, etc. As an example, a workstep may operate on one or more inputs and create one or more results, for example, based on one or more algorithms. As an example, a system may include a workflow editor for creation, editing, executing, etc. of a workflow. In such an example, the workflow editor may provide for selection of one or more predefined worksteps, one or more customized worksteps, etc. As an example, a workflow may be a workflow implementable in the PETREL® software, for example, that operates on seismic data, seismic attribute(s), etc. As an example, a workflow may be a process implementable in the OCEAN® framework. As an example, a workflow may include one or more worksteps that access a module such as a plug-in (e.g., external executable code, etc.).

Acoustic Anomaly Detection of Oil and Gas Processing Equipment Using Spectrogram-Based Machine Learning

[0043] The present disclosure provides acoustic anomaly detection of oil and gas processing equipment using spectrogram-based machine learning. While hot oil pumps are used as an example to show the effectiveness of the method, the method may have broader applications and may be readily extended and applied for acoustic anomaly detection of other oil and gas processing equipment, such as cooling fans, motors, membrane separators, valves, compressors, heat exchange, pipeline gas leakage that could generate abnormal noise, and so on.

[0044] The method has been shown to be effective for oil and gas processing equipment (e.g., pumps, cooling fans, motors, membrane separators, valves, compressors, heat exchange, and pipeline gas leakage that could generate abnormal noise, and so on) under various operational conditions, including different pressure, temperature, flow rate, different gas composition and oil viscosity, and different rotational speed of relevant equipment.

[0045] Figure 2 illustrates a workflow for anomalous sound detection, and how the input audio signal is pre-processed prior to becoming the input for our model, according to an embodiment. The oil and gas processing equipment (e.g., pumps, cooling fans, motors, membrane separators, valves, compressors, heat exchange, and pipeline gas leakage) 200 may generate abnormal noise, which may be collected by an inspection robot that carries an acoustic sensor or microphone 210.

[0046] Based upon the signal-to-noise ratio in the vicinity of the equipment under inspection, the raw audio signal may optionally be subjected to denoising. The first step is to apply a sliding window procedure to segment the raw audio signal 220 into suitable time frames of fixed size. From individual time frames, a short-time Fourier transform (STFT) module extracts a spectrum that is represented by a matrix, the dimension of which is defined by the number of frequency bins and the number of time ticks. Each cell contains the amount of energy related to a specific frequency at a certain tick within the spectrogram. Afterwards, a further transformation of the spectrogram may be conducted, which exploits a bank of triangular filters to produce a spectrogram 230.
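The pre-processing chain described in this paragraph (sliding-window framing, a per-frame Fourier spectrum, and a bank of triangular filters) may be sketched as follows. This is a minimal illustration and not the implementation of the disclosure. The Hann window, the frame length of 1024 samples, the hop of 512 samples, the 64 filter bands, the mel-scale formula, and all function names are assumptions chosen for the example, and the optional denoising step is omitted.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Sliding window: slice a 1-D signal into overlapping fixed-size frames."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]  # shape: (n_frames, frame_len)

def power_spectrogram(x, frame_len=1024, hop=512):
    """Windowed FFT per frame, returned as (frequency bins, time ticks)."""
    frames = frame_signal(x, frame_len, hop) * np.hanning(frame_len)
    spectrum = np.fft.rfft(frames, axis=1)
    return (np.abs(spectrum) ** 2).T  # each cell: energy at one bin and tick

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_mels, frame_len, sr):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    n_bins = frame_len // 2 + 1
    freqs = np.linspace(0.0, sr / 2.0, n_bins)
    edges = mel_to_hz(np.linspace(0.0, float(hz_to_mel(sr / 2.0)), n_mels + 2))
    bank = np.zeros((n_mels, n_bins))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def log_mel_spectrogram(x, sr, frame_len=1024, hop=512, n_mels=64):
    """Log-scaled mel spectrogram in decibels, shape (n_mels, time ticks)."""
    power = power_spectrogram(x, frame_len, hop)
    mel = mel_filterbank(n_mels, frame_len, sr) @ power
    return 10.0 * np.log10(mel + 1e-10)  # small offset avoids log(0)

# Toy input (assumption): one second of a 1 kHz tone sampled at 48 kHz.
sr = 48_000
t = np.arange(sr) / sr
clip = np.sin(2 * np.pi * 1000.0 * t)
S = log_mel_spectrogram(clip, sr)
print(S.shape)  # (number of mel bands, number of time ticks)
```

The matrix `S` plays the role of the spectrogram 230 in this sketch: rows are filter bands, columns are time ticks, and each cell holds log-scaled energy.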

[0047] The spectrogram 230 may be a log-Mel spectrogram, a CQT spectrogram, a Gammatonegram, or a Mel-frequency cepstral coefficients (MFCC) spectrogram. These spectrograms 230 may then be used as processed input to feed and train machine learning models 240. The machine learning models 240 may be based on unsupervised deep neural networks such as an autoencoder, or supervised machine learning classifiers such as a convolutional neural network (CNN), support vector machine (SVM), and so on. Eventually, the method may use the output from the audio anomaly detector to distinguish between normal and abnormal conditions for the equipment 200 over time, and display the results and insights on a software platform, as shown in Figure 2. The audio signal processing, spectrogram generation and analysis, and/or machine learning training/inference may be performed on a cloud for improved scalability.

[0048] Autoencoders may be based on unsupervised machine learning that applies the backpropagation technique and sets the target values equal to the inputs. An autoencoder has the same number of neurons in the input and the output layers. This kind of architecture learns to generate the “identity” transformation between inputs and outputs. Simply put, autoencoders may be used to learn the compressed representation of raw data.

[0049] Figure 3 illustrates an autoencoder with a spectrogram 230 for anomaly detection, according to an embodiment. More particularly, Figure 3 illustrates the use of an autoencoder with a spectrogram 230 as an input for audio-based anomaly detection. An autoencoder includes two parts: the encoder 310 and the decoder 320. The encoder 310 of the network compresses the input into a latent-space representation. It encodes the input data as a compressed representation in a reduced dimension. The decoder 320 aims to reconstruct the input data from the encoded representation. It tries to generate an output that is as close as possible to the original input. The idea is that autoencoders are trained to minimize reconstruction errors, which makes them efficient in learning the distribution of the input data.
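The encoder/decoder structure and the training objective described above may be illustrated with a deliberately small autoencoder written in plain NumPy. This is a sketch under stated assumptions, not the network of the disclosure: the single tanh hidden layer, the layer sizes (32 inputs, 4 latent units), the learning rate, the number of steps, and the synthetic input data are all hypothetical, and a practical model would typically be deeper and built with a deep learning framework.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for flattened spectrogram patches: 512 samples with
# 32 features that actually vary along only 4 hidden directions.
n_features, n_latent = 32, 4
X = rng.normal(size=(512, n_latent)) @ rng.normal(size=(n_latent, n_features))
X = (X - X.mean(axis=0)) / X.std(axis=0)

# Encoder maps 32 -> 4 (latent space); decoder maps 4 -> 32.
W_enc = rng.normal(scale=0.1, size=(n_features, n_latent))
W_dec = rng.normal(scale=0.1, size=(n_latent, n_features))

def forward(x):
    z = np.tanh(x @ W_enc)   # compressed latent representation
    return z @ W_dec, z      # reconstruction has the same width as the input

def reconstruction_mse(x):
    x_hat, _ = forward(x)
    return float(np.mean((x_hat - x) ** 2))

initial_mse = reconstruction_mse(X)

learning_rate = 0.01
for _ in range(2000):
    x_hat, z = forward(X)
    # Backpropagation with the target set equal to the input.
    d_out = (x_hat - X) / len(X)
    grad_dec = z.T @ d_out
    grad_enc = X.T @ ((d_out @ W_dec.T) * (1.0 - z ** 2))
    W_dec -= learning_rate * grad_dec
    W_enc -= learning_rate * grad_enc

final_mse = reconstruction_mse(X)
print(f"reconstruction MSE before training: {initial_mse:.3f}")
print(f"reconstruction MSE after training:  {final_mse:.3f}")
```

The input and output layers have the same width, and the only training signal is the difference between the reconstruction and the input itself, which is what makes the approach unsupervised.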

[0050] In the context of anomaly detection, autoencoders may be useful. They may be trained on normal data to learn the representation of the normal state. During inference, if an input deviates from this learned representation by more than a predetermined threshold, the autoencoder is expected to reconstruct it poorly. This poor reconstruction is a signal of an anomaly. The autoencoder should be trained exclusively on normal data. The training process involves adjusting the weights to minimize the reconstruction error. During inference, the new data may be input into the autoencoder. If the data is normal, the autoencoder may successfully reconstruct it with minimal error. However, if the data is anomalous, the reconstruction error becomes higher. A threshold may be set for the residual error. If the error surpasses this threshold, the data point is flagged as an anomaly, as illustrated in Figure 3. The threshold may depend on specific equipment, gas composition, oil viscosity, operational condition such as rotational speed, pressure, temperature, and flow rate.

Case Study: Motor Fault in Hot Oil Pumps

[0051] Hot oil pumps are components in the process train and many industrial applications, playing a role in transferring hot liquids from one location to another. With the right hot oil pump, businesses may improve the efficiency of their operations and reduce downtime caused by equipment failure. Hot oil pumps work by using mechanical energy to transfer hot liquids from one location to another. They include several components, including a pump casing, an impeller, a shaft, a bearing, and seals. The basic principle of hot oil pump operation is relatively straightforward. The impeller rotates within the pump casing, generating fluid flow and pressure. The fluid is then forced through the discharge port and into the piping system, where it is delivered to its destination.

[0052] Figure 4A illustrates an aerial view of a portion of a production facility, where audio data may be recorded for training and validation of a model, according to an embodiment. This is a facility in operation that may separate acid gas such as carbon dioxide from natural gas. The hot oil pumps are in the top left corner. Noise levels may often reach tens of decibels during normal operation, and the personnel are instructed to wear earplugs when working in the plant. Figure 4B illustrates an inspection robot collecting audio data by staying close to the two centrifugal hot oil pumps of interest, according to an embodiment. The robot carries an acoustic sensor that may record high-frequency audio data with a sampling rate of 48 kHz. There are two hot oil pumps: Pump A is under normal operational condition, whereas Pump B operates abnormally with a faulty motor bearing. In total, 773 time-series audio clips with a duration of 10 seconds each are collected on this configuration, encompassing the full range of operational conditions for the hot oil pump system. Among the samples, 516 normal audio clips are used for model training, and 257 audio clips are used for model testing. In the test dataset, there are 175 normal and 82 abnormal audio clips.

[0053] Figure 5A illustrates audio signals for hot oil pump A (normal), Figure 5B illustrates audio signals for hot oil pump B (abnormal), Figure 5C illustrates a spectrogram for hot oil pump A (normal), and Figure 5D illustrates a spectrogram for hot oil pump B (abnormal), according to an embodiment. In this example, the audio data was recorded for about 20 minutes with Pump A powered on, whereas Pump B was powered on for about 14 minutes and afterwards shut down, with a sudden decrease in amplitude during data recording. Apart from the larger amplitude of the abnormal signal and some patterns that are more irregular, it appears difficult to distinguish between these two signals in Figures 5A and 5B.

[0054] A Fourier transform is a mathematical operator that decomposes a function of time (or a signal) into its underlying frequencies. The Fourier transform is a function of frequency, and its amplitude represents how much of a given frequency is present in the original signal. However, a sound signal may be highly non-stationary (i.e., its statistics change over time) in general. For a given time period, the frequency decomposition may be different from that of another time period. Consequently, it may be rather meaningless to compute a single Fourier transform over the entire signal. This is the scenario where the short-time Fourier transform (STFT) may help.

[0055] The STFT may be obtained by computing the Fourier transform for successive frames in a signal. The method slices the signal into successive time frames, computes a Fourier transform for each time frame, and extracts the amplitude of each frequency as a function of time. Most sounds that humans may hear are concentrated in a very small range (e.g., both in frequency and amplitude range). The method thus takes a log scale for both the frequency and the amplitude. The amplitude may be determined by converting the color axis to decibels, which is the equivalent of applying a log scale to the sound amplitudes. As an example, the spectrograms for Pump A and Pump B that are obtained using an STFT window length of 2,048, a hop length of 512, a sampling rate of 48 kHz, and 128 mel-filterbanks are shown in Figures 5C and 5D. These images have interesting features that may be readily uncovered. Specifically, stronger noises at higher frequencies may be seen on Pump B with the faulty motor bearing, which are not observed on Pump A with the healthy motor bearing. These are the types of features that a neural network may try to uncover and structure in the next step. These log-Mel-scale images are then segmented into overlapping frames as part of the feature extraction process.
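To make the quoted STFT settings more concrete, the short calculation below works out what they imply for one of the 10-second clips of the case study. It assumes that only full windows are kept (no padding at the clip edges) and that a one-sided spectrum is used; both are assumptions for illustration, since the text does not state them, and the variable names are hypothetical.

```python
# Parameters quoted in the text.
SAMPLE_RATE_HZ = 48_000
WINDOW_LENGTH = 2_048
HOP_LENGTH = 512
CLIP_SECONDS = 10
NUM_MEL_FILTERS = 128

samples_per_clip = SAMPLE_RATE_HZ * CLIP_SECONDS

# Assumption: only full windows are kept, so the last partial window is dropped.
frames_per_clip = 1 + (samples_per_clip - WINDOW_LENGTH) // HOP_LENGTH

# Assumption: one-sided spectrum of a real-valued signal.
frequency_bins = WINDOW_LENGTH // 2 + 1

bin_spacing_hz = SAMPLE_RATE_HZ / WINDOW_LENGTH       # spacing between STFT bins
window_ms = 1000 * WINDOW_LENGTH / SAMPLE_RATE_HZ     # duration of one window
hop_ms = 1000 * HOP_LENGTH / SAMPLE_RATE_HZ           # time step between frames
overlap_fraction = 1 - HOP_LENGTH / WINDOW_LENGTH     # overlap of adjacent windows

# After the mel projection, each frame is reduced to NUM_MEL_FILTERS values.
mel_spectrogram_shape = (frames_per_clip, NUM_MEL_FILTERS)

print(frames_per_clip, frequency_bins, bin_spacing_hz, overlap_fraction)
```

Under these assumptions, each window spans roughly 43 ms, adjacent windows overlap by 75%, and the mel projection reduces the 1,025 STFT bins of each frame to 128 values.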

[0056] An autoencoder model has been implemented with TensorFlow®, which is an open source deep learning framework that has been widely used in large-scale applications and distributed computing. An autoencoder architecture with 3 hidden layers may be selected with the rectified linear unit (i.e., ReLU) as the activation function with hyperparameter tuning. The autoencoder may be trained with a learning rate of 1e-3 and a batch size of 512 (e.g., on an NVIDIA RTX 6000 Ada GPU). After training the model for 100 epochs, the model loss becomes small. Afterwards, the weights of the neural network may be frozen, such that the trained model may be used for inference.

[0057] Figure 6 illustrates a histogram of reconstruction errors of audio clips evaluated on the test dataset of hot oil pumps, according to an embodiment. More particularly, the histogram of the reconstruction errors of the deep autoencoder may be evaluated with the test dataset. In Figure 6, the distributions of reconstruction errors for normal and abnormal (e.g., faulty motor bearing) audio clips differ without overlap between the two groups, which shows the effectiveness of using the developed autoencoder model to detect anomalous sounds from the hot oil pumps.
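As a rough illustration of how the separation seen in the histogram may be turned into counts, the sketch below compares per-clip reconstruction errors against a cut-off and tallies the four confusion-matrix cells. The function name `confusion_counts` and the error values are hypothetical and are not data from the case study; the cut-off of 25 is the value discussed for Figure 8.

```python
def confusion_counts(errors, is_abnormal, threshold):
    """Tally confusion-matrix cells when clips with error > threshold are flagged."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for error, abnormal in zip(errors, is_abnormal):
        flagged = error > threshold
        if flagged and abnormal:
            counts["tp"] += 1      # abnormal clip correctly flagged
        elif flagged and not abnormal:
            counts["fp"] += 1      # normal clip wrongly flagged
        elif not flagged and abnormal:
            counts["fn"] += 1      # abnormal clip missed
        else:
            counts["tn"] += 1      # normal clip correctly passed
    return counts


# Hypothetical reconstruction errors, made up for this illustration.
errors = [12.0, 18.5, 21.0, 31.0, 40.2, 27.5]
is_abnormal = [False, False, False, True, True, True]

counts = confusion_counts(errors, is_abnormal, threshold=25.0)
print(counts)
```

When the two error distributions do not overlap and the cut-off falls in the gap between them, the off-diagonal cells (`fp` and `fn`) stay at zero.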

[0058] Figure 7 illustrates the reconstruction errors of audio clips as a function of samples evaluated on the test dataset, according to an embodiment. Figure 8 illustrates a confusion matrix for anomalous sound detection with the autoencoder evaluated on the test dataset of hot oil pumps, according to an embodiment. It appears reasonable to select 25 as the threshold of the reconstruction error to flag malfunction of the motor bearing for the hot oil pumps in this example. Based on this threshold (25), the obtained confusion matrix may provide visual feedback on the model’s classification ability to predict true positives and true negatives along the diagonal, as shown in Figure 8. It may be found that the developed model correctly detects the abnormal audio clips with the faulty motor bearing, which make up 32% of the test dataset (82 of 257 clips), and the normal audio clips, which make up 68% of the test dataset (175 of 257 clips).

Exemplary Method

[0059] Figure 9 illustrates a flowchart of a method 900 for detecting acoustic anomalies in oil and gas processing equipment using spectrogram-based machine learning, according to an embodiment. An illustrative order of the method 900 is provided below; however, one or more portions of the method 900 may be performed in a different order, simultaneously, repeated, or omitted. At least a portion of the method 900 may be performed with a computing system (described below).

[0060] The method 900 may include receiving first audio data 220, as at 905. The first audio data 220 may be measured by an acoustic sensor or microphone 210. The first audio data 220 may be generated by the oil and gas processing equipment 200. The oil and gas processing equipment 200 may include a pump, a cooling fan, a motor, a membrane separator, a valve, a compressor, a heat exchanger, a pipeline with leakage, or a combination thereof. In an embodiment, the first audio data 220 may optionally be de-noised.

[0061] The method 900 may also include segmenting the first audio data 220 into a plurality of first time frames of a predetermined size, as at 910. The first audio data 220 may be segmented using a sliding window technique.

[0062] The method 900 may also include extracting a first spectrum from the first time frames, as at 915. The first spectrum may be extracted using a transform. The transform may be a short time Fourier transform (STFT). The spectrum may be represented by a matrix. Dimensions of the matrix may be defined by a number of frequency bins and a number of time ticks. Each cell in the matrix may include an amount of energy related to a specific frequency at a predetermined one of the time ticks.

[0063] The method 900 may also include generating a first spectrogram 230 based upon the first spectrum, as at 920. The first spectrogram 230 may be generated using a plurality of triangular filters. The first spectrogram may be a log-Mel spectrogram, a CQT spectrogram, a Gammatonegram, or a Mel-frequency cepstral coefficient (MFCC) spectrogram.

[0064] The method 900 may also include training a machine learning (ML) audio model 240 based upon the first spectrogram, as at 925. The ML audio model 240 may include an unsupervised deep neural network or a supervised classifier. The unsupervised deep neural network may include an autoencoder or a variational autoencoder. The supervised classifier includes a convolutional neural network (CNN), a support vector machine (SVM), or both. For the autoencoder or the variational autoencoder, the training may include determining an anomaly threshold based upon a type of the oil and gas processing equipment, a failure mode of the oil and gas processing equipment, a gas composition, an oil viscosity, a rotational speed of the oil and gas processing equipment, a pressure in the oil and gas processing equipment, a temperature in the oil and gas processing equipment, a flow rate through the oil and gas processing equipment, or a combination thereof.
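The text lists the factors on which the anomaly threshold may depend but does not prescribe how the threshold is computed. One common heuristic, shown here purely as an assumption and not as the disclosed method, is to take a high percentile of the reconstruction errors observed on normal clips, with a separate threshold per equipment type and operating condition. The function name, the dictionary keys, and all error values below are hypothetical.

```python
import math


def percentile_threshold(normal_errors, percentile=99.0):
    """Nearest-rank percentile of reconstruction errors measured on normal clips."""
    if not normal_errors:
        raise ValueError("at least one reconstruction error is required")
    ordered = sorted(normal_errors)
    rank = math.ceil(percentile / 100.0 * len(ordered))   # 1-based nearest rank
    return ordered[max(rank, 1) - 1]


# Hypothetical errors on normal clips, grouped by (equipment type, operating condition).
normal_errors_by_condition = {
    ("hot_oil_pump", "full_speed"): [10.0, 12.0, 11.0, 14.0, 13.0],
    ("hot_oil_pump", "half_speed"): [6.0, 7.5, 7.0, 8.0, 6.5],
}

thresholds = {
    condition: percentile_threshold(errors)
    for condition, errors in normal_errors_by_condition.items()
}
print(thresholds)
```

Keeping one threshold per condition reflects the point made above that a single cut-off may not suit every rotational speed, pressure, temperature, or flow rate.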

[0065] The method 900 may also include receiving second audio data, as at 930. The second audio data may be generated by the oil and gas processing equipment or other / different oil and gas processing equipment. In an embodiment, the second audio data may optionally be de-noised.

[0066] The method 900 may also include segmenting the second audio data into a plurality of second time frames of a predetermined size, as at 935. The second audio data may be segmented using a sliding window technique.

[0067] The method 900 may also include extracting a second spectrum from the second time frames, as at 940. The second spectrum may be extracted using a transform. The transform may be a short time Fourier transform (STFT). The second spectrum may be represented by a matrix. Dimensions of the matrix may be defined by a number of frequency bins and a number of time ticks. Each cell in the matrix may include an amount of energy related to a specific frequency at a predetermined one of the time ticks.

[0068] The method 900 may also include generating a second spectrogram based upon the second spectrum, as at 945. The second spectrogram may be generated using a plurality of triangular filters. The second spectrogram may be a log-Mel spectrogram, a CQT spectrogram, a Gammatonegram, or a Mel-frequency cepstral coefficient (MFCC) spectrogram.

[0069] The method 900 may also include determining an output 250, 350 using the trained ML audio model 240, as at 950. The output may be based upon the second audio data. More particularly, the output may be based upon the second time frames, the second spectrum, the second spectrogram, or a combination thereof. The output 250, 350 may indicate whether the oil and gas processing equipment 200 is operating normally or abnormally. For example, the output may indicate whether there is leakage, contamination, or clogging for the pump. In another example, the output may indicate whether the motor is subjected to a bearing malfunction or a lubrication or debris issue. In another example, the output may indicate whether the industrial cooling fan experiences imbalance or clogging issues. In another example, the output may indicate whether the valve is subjected to contamination, malfunction, or jamming. In another example, the output may indicate whether the reciprocating compressor is emitting a knocking noise. In another example, the output may indicate whether there is a gas leak.

[0070] The method 900 may also include displaying the output 250, 350, as at 955. Examples of this are shown in Figures 5C, 5D, and 7.

[0071] The method 900 may also include performing an action in response to the output 250, 350, as at 960. The action may be or include generating and / or transmitting a signal (e.g., using a computing system) that recommends, instructs, or causes a physical action to occur. The action may also or instead include performing the physical action on / to the oil and gas processing equipment 200 or the other / different oil and gas processing equipment. The physical action may include repairing or replacing the oil and gas processing equipment or the other / different oil and gas processing equipment. For example, the physical action may be or include modifying the gas composition, the oil viscosity, the rotational speed, the pressure, the temperature, the flow rate, or a combination thereof. In another example, the physical action may be or include fixing a leak.

Example

[0072] Let the discrete-time audio signal be:

$$x[n] \in \mathbb{R}, \quad n = 0, 1, \ldots, N - 1$$

where $n$ is the discrete-time sample index, $N$ is the total number of samples, and the values of $x[n]$ are real numbers.

[0073] The signal may be divided into overlapping frames using a window function $w[n]$ as:

$$x_m[n] = x[n + mH] \cdot w[n], \quad n = 0, \ldots, L - 1, \quad m = 0, \ldots, M - 1$$

where $H$ is the hop size, $m$ is the frame index of the signal, $L$ is the frame length, $M$ is the total number of frames, and $w[n]$ is the window function (e.g., Hamming window).
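The framing step may be sketched in a few lines of plain Python. The sketch assumes the common Hamming definition $w[n] = 0.54 - 0.46\cos(2\pi n / (L - 1))$ and keeps only full frames, so the frame count is $1 + \lfloor (N - L) / H \rfloor$; the function names and the constant test signal are hypothetical.

```python
import math


def hamming_window(length):
    """Hamming window w[n] = 0.54 - 0.46 * cos(2 * pi * n / (length - 1))."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]


def frame_signal(x, frame_length, hop):
    """Return the frames x_m[n] = x[n + m * hop] * w[n], keeping only full frames."""
    if len(x) < frame_length:
        return []
    w = hamming_window(frame_length)
    num_frames = 1 + (len(x) - frame_length) // hop
    return [[x[n + m * hop] * w[n] for n in range(frame_length)]
            for m in range(num_frames)]


# A constant signal makes every frame equal to the window itself.
x = [1.0] * 10
frames = frame_signal(x, frame_length=4, hop=2)
print(len(frames), frames[0])
```

With $N = 10$, $L = 4$, and $H = 2$, four overlapping frames are produced, and each one is tapered toward its edges by the window.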

[0074] A short-time Fourier transform for each frame may be computed as:

$$X[m, k] = \sum_{n=0}^{L-1} x_m[n] \, e^{-j 2 \pi k n / L}, \quad k = 0, \ldots, K - 1$$

where $X[m, k]$ represents the complex STFT coefficient at frame $m$, frequency bin $k$, and $K$ is the number of frequency bins.

[0075] The power spectrogram may be computed as:

$$S[m, k] = |X[m, k]|^2$$

where $S[m, k]$ denotes the power at frame $m$, frequency bin $k$.
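The transform and the power computation may be sketched with a direct evaluation of the sum, which is slow but mirrors the formula term by term (a practical implementation would normally use a fast Fourier transform). The sketch assumes the frames have already been windowed and takes $K = L / 2 + 1$ bins, i.e., the one-sided spectrum of a real signal; that choice of $K$, the function name, and the test frame are assumptions for illustration.

```python
import cmath
import math


def stft_power(frames):
    """Return S[m][k] = |sum_n x_m[n] * exp(-j * 2 * pi * k * n / L)|^2."""
    power = []
    for frame in frames:
        length = len(frame)
        row = []
        for k in range(length // 2 + 1):          # assumed K = L / 2 + 1 bins
            coeff = sum(sample * cmath.exp(-2j * math.pi * k * n / length)
                        for n, sample in enumerate(frame))
            row.append(abs(coeff) ** 2)
        power.append(row)
    return power


# One frame holding exactly two cycles of a cosine over L = 8 samples.
L = 8
frame = [math.cos(2 * math.pi * 2 * n / L) for n in range(L)]

S = stft_power([frame])
peak_bin = max(range(len(S[0])), key=lambda k: S[0][k])
print(peak_bin, S[0][peak_bin])
```

Because the test frame contains two full cycles, its energy concentrates in bin $k = 2$, which is how a tonal component of a machine sound would appear as a bright row in the spectrogram.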

[0076] Using the mel spectrogram as an example, a mel filterbank projection may be applied to convert the power spectrogram to mel scale as:

$$\mathrm{Mel}[m, i] = \sum_{k=0}^{K-1} H_i[k] \, S[m, k], \quad i = 1, \ldots, I$$

where $\mathrm{Mel}[m, i]$ is the mel-scaled energy at frame $m$, filter $i$, $H_i[k]$ is the weight of mel filter $i$ at frequency bin $k$, and $I$ is the number of mel filters (also called mel bands).
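A minimal sketch of the triangular filter weights $H_i[k]$ and of the projection is given below. It assumes the widely used conversion $\text{mel} = 2595 \log_{10}(1 + f / 700)$, filter edges equally spaced on the mel scale between 0 Hz and the Nyquist frequency, and no area normalization of the filters; none of these choices is stated in the text, and the function names are hypothetical.

```python
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(num_filters, num_bins, sample_rate):
    """Return H[i][k], the triangular weight of filter i at frequency bin k."""
    nyquist = sample_rate / 2.0
    max_mel = hz_to_mel(nyquist)
    # num_filters + 2 edge frequencies, equally spaced on the mel scale.
    edges_hz = [mel_to_hz(max_mel * j / (num_filters + 1))
                for j in range(num_filters + 2)]
    bin_hz = [nyquist * k / (num_bins - 1) for k in range(num_bins)]
    bank = []
    for i in range(num_filters):
        lo, centre, hi = edges_hz[i], edges_hz[i + 1], edges_hz[i + 2]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (centre - lo)     # left slope of the triangle
            falling = (hi - f) / (hi - centre)    # right slope of the triangle
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def apply_filterbank(bank, power_frame):
    """Mel[m, i] = sum over k of H_i[k] * S[m, k], for a single frame m."""
    return [sum(h * s for h, s in zip(row, power_frame)) for row in bank]


bank = mel_filterbank(num_filters=4, num_bins=65, sample_rate=48_000)
flat_power = [1.0] * 65                 # hypothetical flat power spectrum
mel_energies = apply_filterbank(bank, flat_power)
print(mel_energies)
```

For a flat spectrum the higher filters collect more energy because they span wider frequency ranges, which reflects the coarser resolution of the mel scale at high frequencies. For the case-study settings, the same function would be called with 128 filters and, under the one-sided assumption, 1,025 bins.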

[0077] The logarithm may be taken to compress dynamic range as follows:

$$\widetilde{\mathrm{Mel}}[m, i] = \log(\mathrm{Mel}[m, i] + \epsilon)$$

where $\widetilde{\mathrm{Mel}}[m, i]$ is the log-scaled mel energy, and $\epsilon$ represents a small constant for avoiding numerical instability.

[0078] The feature vector may be constructed as:

$$z_m = \left[\widetilde{\mathrm{Mel}}[m, 1], \ldots, \widetilde{\mathrm{Mel}}[m, I]\right] \in \mathbb{R}^{I}$$

where $z_m$ is the feature vector for frame $m$, and $I$ is the number of mel bands.
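The log compression and the assembly of per-frame feature vectors may be sketched as follows. The natural logarithm and the value of the small constant are assumptions (a decibel scale would use a base-10 logarithm instead), and the function name and input values are hypothetical.

```python
import math


def log_mel_features(mel_frames, eps=1e-10):
    """Return one feature vector per frame: log(Mel[m, i] + eps) for each band i."""
    return [[math.log(energy + eps) for energy in frame] for frame in mel_frames]


# Hypothetical mel energies for two frames and three bands, including silent bands.
mel_frames = [[0.0, 1.0, 100.0], [2.0, 0.5, 0.0]]

features = log_mel_features(mel_frames)
print(features[0])
```

The small constant keeps the logarithm finite for a band with zero energy, which would otherwise raise an error or produce negative infinity.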

[0079] A model may be used to predict the anomaly score or class. The autoencoder may be taken as an example:

$$h_m = f_\theta(z_m), \qquad \hat{z}_m = g_\phi(h_m)$$

where $h_m$ is the latent representation of frame $m$, $\hat{z}_m$ is the reconstructed feature vector for $z_m$, $f_\theta$ is the encoder function, and $g_\phi$ is the decoder function.

[0080] For autoencoder-based anomaly detection, the anomaly score function may be computed by tracking the reconstruction error as follows:

$$a(z_m) = \lVert z_m - \hat{z}_m \rVert^2$$

where $a(z_m)$ is the anomaly score for frame $m$, and $\hat{z}_m$ is the reconstruction of $z_m$ produced by the autoencoder. Note that $a(z_m) > T$ indicates an anomaly, in which $T$ is the decision threshold.

Exemplary Computing System

[0081] In some embodiments, the methods of the present disclosure may be executed by a computing system. Figure 10 illustrates an example of such a computing system 1000, in accordance with some embodiments. The computing system 1000 may include a computer or computer system 1001A, which may be an individual computer system 1001A or an arrangement of distributed computer systems. The computer system 1001A includes one or more analysis modules 1002 that are configured to perform various tasks according to some embodiments, such as one or more methods disclosed herein. To perform these various tasks, the analysis module 1002 executes independently, or in coordination with, one or more processors 1004, which is (or are) connected to one or more storage media 1006. The processor(s) 1004 is (or are) also connected to a network interface 1007 to allow the computer system 1001A to communicate over a data network 1009 with one or more additional computer systems and / or computing systems, such as 1001B, 1001C, and / or 1001D (note that computer systems 1001B, 1001C and / or 1001D may or may not share the same architecture as computer system 1001A, and may be located in different physical locations, e.g., computer systems 1001A and 1001B may be located in a processing facility, while in communication with one or more computer systems such as 1001C and / or 1001D that are located in one or more data centers, and / or located in varying countries on different continents).

[0082] A processor may include a microprocessor, microcontroller, processor module or subsystem, programmable integrated circuit, programmable gate array, or another control or computing device.

[0083] The storage media 1006 may be implemented as one or more computer-readable or machine-readable storage media. Note that while in the example embodiment of Figure 10 storage media 1006 is depicted as within computer system 1001A, in some embodiments, storage media 1006 may be distributed within and / or across multiple internal and / or external enclosures of computing system 1001A and / or additional computing systems. Storage media 1006 may include one or more different forms of memory including semiconductor memory devices such as dynamic or static random access memories (DRAMs or SRAMs), erasable and programmable read-only memories (EPROMs), electrically erasable and programmable read-only memories (EEPROMs) and flash memories, magnetic disks such as fixed, floppy and removable disks, other magnetic media including tape, optical media such as compact disks (CDs) or digital video disks (DVDs), BLURAY® disks, or other types of optical storage, or other types of storage devices. Note that the instructions discussed above may be provided on one computer-readable or machine-readable storage medium, or may be provided on multiple computer-readable or machine-readable storage media distributed in a large system having possibly plural nodes. Such computer-readable or machine-readable storage medium or media is (are) considered to be part of an article (or article of manufacture). An article or article of manufacture may refer to any manufactured single component or multiple components. The storage medium or media may be located either in the machine running the machine-readable instructions, or located at a remote site from which machine-readable instructions may be downloaded over a network for execution.

[0084] In some embodiments, computing system 1000 contains one or more method execution module(s) 1008. In the example of computing system 1000, computer system 1001A includes the method execution module 1008. In some embodiments, a single method execution module may be used to perform some aspects of one or more embodiments of the methods disclosed herein. In other embodiments, a plurality of method execution modules may be used to perform some aspects of methods herein.

[0085] It should be appreciated that computing system 1000 is merely one example of a computing system, and that computing system 1000 may have more or fewer components than shown, may combine additional components not depicted in the example embodiment of Figure 10, and / or computing system 1000 may have a different configuration or arrangement of the components depicted in Figure 10. The various components shown in Figure 10 may be implemented in hardware, software, or a combination of both hardware and software, including one or more signal processing and / or application specific integrated circuits.

[0086] Further, the steps in the processing methods described herein may be implemented by running one or more functional modules in information processing apparatus such as general purpose processors or application specific chips, such as ASICs, FPGAs, PLDs, or other appropriate devices. These modules, combinations of these modules, and / or their combination with general hardware are included within the scope of the present disclosure.

[0087] Computational interpretations, models, and / or other interpretation aids may be refined in an iterative fashion; this concept is applicable to the methods discussed herein. This may include use of feedback loops executed on an algorithmic basis, such as at a computing device (e.g., computing system 1000, Figure 10), and / or through manual control by a user who may make determinations regarding whether a given step, action, template, model, or set of curves has become sufficiently accurate for the evaluation of the subsurface three-dimensional geologic formation under consideration.

[0088] The foregoing description, for purpose of explanation, has been described with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or limiting to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. Moreover, the order in which the elements of the methods described herein are illustrated and described may be re-arranged, and / or two or more elements may occur simultaneously. The embodiments were chosen and described in order to best explain the principles of the disclosure and its practical applications, to thereby enable others skilled in the art to best utilize the disclosed embodiments and various embodiments with various modifications as are suited to the particular use contemplated.

Claims

What is claimed is:

1. A method for detecting acoustic anomalies in oil and gas processing equipment, the method comprising:
receiving first audio data that is generated by the oil and gas processing equipment;
segmenting the first audio data into a plurality of time frames;
extracting a spectrum from the time frames;
generating a spectrogram based upon the spectrum;
training a machine learning (ML) audio model based upon the spectrogram;
receiving second audio data that is generated by the oil and gas processing equipment or different oil and gas processing equipment; and
determining an output using the trained ML audio model based upon the second audio data.

2. The method of claim 1, wherein the oil and gas processing equipment comprises a pump, a cooling fan, a motor, a membrane separator, a valve, a compressor, a heat exchanger, a pipeline with leakage, or a combination thereof.

3. The method of claim 1, wherein the first audio data is segmented using a sliding window technique.

4. The method of claim 1, wherein the spectrum is extracted using a transform, and wherein the transform comprises a short time Fourier transform (STFT).

5. The method of claim 1, wherein the spectrum is represented by a matrix, and wherein dimensions of the matrix are defined by a number of frequency bins and a number of time ticks.

6. The method of claim 5, wherein each cell in the matrix comprises an amount of energy related to a specific frequency at a predetermined one of the time ticks.

7. The method of claim 1, wherein the spectrogram is generated using a plurality of triangular filters, and wherein the spectrogram comprises a log-Mel spectrogram, a CQT spectrogram, a Gammatonegram, or a Mel-frequency cepstral coefficient (MFCC) spectrogram.

8. The method of claim 1, wherein the output indicates whether the oil and gas processing equipment is operating normally or abnormally.

9. The method of claim 1, further comprising displaying the output as a histogram of reconstruction errors or as a confusion matrix.

10. The method of claim 1, further comprising performing a physical action to the oil and gas processing equipment or the different oil and gas processing equipment in response to the output.

11. A computing system, comprising:
one or more processors; and
a memory system comprising one or more non-transitory computer-readable media storing instructions that, when executed by at least one of the one or more processors, cause the computing system to perform operations, the operations comprising:
receiving first audio data, wherein the first audio data is measured by an acoustic sensor or microphone, and wherein the first audio data is generated by the oil and gas processing equipment;
denoising the first audio data;
segmenting the first audio data into a plurality of time frames;
extracting a spectrum from the time frames;
generating a spectrogram based upon the spectrum;
training a machine learning (ML) audio model based upon the spectrogram;
receiving second audio data, wherein the second audio data is generated by the oil and gas processing equipment or different oil and gas processing equipment; and
determining an output using the trained ML audio model based upon the second audio data.

12. The computing system of claim 11, wherein the ML audio model comprises a supervised classifier or an unsupervised deep neural network.

13. The computing system of claim 12, wherein the ML audio model comprises the supervised classifier, and wherein the supervised classifier comprises a convolutional neural network (CNN), a support vector machine (SVM), or both.

14. The computing system of claim 12, wherein the ML audio model comprises the unsupervised deep neural network, and wherein the unsupervised deep neural network comprises an autoencoder or a variational autoencoder.

15. The computing system of claim 14, wherein, for the autoencoder or the variational autoencoder, the training comprises determining an anomaly threshold based upon a type of the oil and gas processing equipment, a failure mode of the oil and gas processing equipment, a gas composition in the oil and gas processing equipment, an oil viscosity in the oil and gas processing equipment, a rotational speed of the oil and gas processing equipment, a pressure in the oil and gas processing equipment, a temperature in the oil and gas processing equipment, a flow rate through the oil and gas processing equipment, or a combination thereof.

16. A non-transitory computer-readable medium storing instructions that, when executed by one or more processors of a computing system, cause the computing system to perform operations, the operations comprising:
receiving first audio data, wherein the first audio data is measured by an acoustic sensor or microphone, and wherein the first audio data is generated by the oil and gas processing equipment;
denoising the first audio data;
segmenting the first audio data into a plurality of first time frames;
extracting a first spectrum from the first time frames;
generating a first spectrogram based upon the first spectrum;
training a machine learning (ML) audio model based upon the first spectrogram;
receiving second audio data, wherein the second audio data is generated by the oil and gas processing equipment or different oil and gas processing equipment;
denoising the second audio data;
segmenting the second audio data into a plurality of second time frames;
extracting a second spectrum from the second time frames;
generating a second spectrogram based upon the second spectrum; and
determining an output using the trained ML audio model based upon the second spectrogram.

17. The non-transitory computer-readable medium of claim 16, wherein the first spectrum is extracted using a transform, wherein the transform comprises a short time Fourier transform (STFT), wherein the first spectrum is represented by a matrix, wherein dimensions of the matrix are defined by a number of frequency bins and a number of time ticks, and wherein each cell in the matrix comprises an amount of energy related to a specific frequency at a predetermined one of the time ticks.

18. The non-transitory computer-readable medium of claim 17, wherein the first spectrogram is generated using a plurality of triangular filters, and wherein the first spectrogram comprises a log-Mel spectrogram, a CQT spectrogram, a Gammatonegram, or a Mel-frequency cepstral coefficient (MFCC) spectrogram.

19. The non-transitory computer-readable medium of claim 18, wherein the ML audio model comprises an unsupervised deep neural network or a supervised classifier, wherein the unsupervised deep neural network comprises an autoencoder or a variational autoencoder, wherein the supervised classifier comprises a convolutional neural network (CNN), a support vector machine (SVM), or both, and wherein, for the autoencoder or the variational autoencoder, the training comprises determining an anomaly threshold based upon a type of the oil and gas processing equipment, a failure mode of the oil and gas processing equipment, a gas composition in the oil and gas processing equipment, an oil viscosity in the oil and gas processing equipment, a rotational speed of the oil and gas processing equipment, a pressure in the oil and gas processing equipment, a temperature in the oil and gas processing equipment, a flow rate through the oil and gas processing equipment, or a combination thereof.

20. The non-transitory computer-readable medium of claim 19, wherein the operations further comprise:
displaying the output as a histogram of reconstruction errors or as a confusion matrix; and
performing an action in response to the output, wherein the action comprises generating and / or transmitting a signal that recommends, instructs, or causes physical action to occur to the oil and gas processing equipment or the different oil and gas processing equipment, and wherein the physical action comprises modifying the gas composition, the oil viscosity, the rotational speed, the pressure, the temperature, the flow rate, or a combination thereof.