Rock facies identification plays a key role in the exploration and development of oil and gas (O&G) reservoirs. In geological terms, a facies is a body of rock with specified characteristics, which can be any observable attribute such as appearance, composition, or depositional environment. It is the sum of a rock's characteristics, including its chemical, physical, and biological features, that distinguishes it from adjacent rock.

Classification of rock facies is important for reservoir modelling because different rock types have different permeability and fluid saturation for a given porosity. The distribution of rock types throughout the reservoir remains a major source of uncertainty in reservoir modelling.

Conventionally, geologists identify rock facies manually by examining core samples. Because coring is costly, core samples cannot always be obtained. Facies are therefore also classified from indirect measurements such as wireline logs, often with proprietary and expensive software. These traditional methods are tedious, time-consuming, and expensive, and they are error-prone wherever human interpretation is involved.

Machine learning techniques can now be used to find patterns in known data and use them to predict unknowns. In this case, data from the wireline log of a single well was used, in which some of the rocks already had facies classes assigned. The algorithm was trained to learn the relationship between the wireline log measurements (the input features) and the facies class (the target output). Once trained, the algorithm was used to assign facies classes to the rocks that had not been classified, thereby predicting the rock type distribution throughout the reservoir.

The advantages are:

  • Accuracy tends to improve as the amount of data increases
  • The AI can identify patterns and relationships that might be invisible to the human eye
  • Valuable information can be extracted in just a few clicks
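The idea described above (train on the rocks that already have a facies class, then predict the rest) can be sketched in a few lines of Python. This is a minimal illustration, not the platform's implementation: it assumes scikit-learn and pandas, the data is synthetic, the facies names "sand" and "shale" are invented for the example, and only three of the log curves are used.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for a well log: two made-up facies with different responses.
n = 200
facies = rng.integers(0, 2, size=n)
logs = pd.DataFrame({
    "GR": 60 + 40 * facies + rng.normal(0, 5, n),
    "RHOP": 2.65 - 0.25 * facies + rng.normal(0, 0.03, n),
    "NPHI": 0.10 + 0.15 * facies + rng.normal(0, 0.02, n),
})
logs["classification"] = np.where(facies == 1, "shale", "sand")

# Pretend that only the first 120 samples were classified by a geologist.
logs.loc[120:, "classification"] = np.nan

features = ["GR", "RHOP", "NPHI"]
labelled = logs[logs["classification"].notna()]
unlabelled = logs[logs["classification"].isna()]

# Learn the mapping from log measurements (inputs) to facies class (target).
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(labelled[features], labelled["classification"])

# Assign a facies class to every sample that had none.
predicted = pd.Series(model.predict(unlabelled[features]), index=unlabelled.index)
logs.loc[unlabelled.index, "classification"] = predicted
print(logs["classification"].value_counts())
```

On real logs the classes overlap far more than in this toy data, so accuracy should be checked on held-out labelled samples before the predictions are trusted.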

The following steps illustrate how the user ingests, cleans, and transforms data to train a machine-learning algorithm and then deploys a model to production.

Step 0 - Create a project

  • The user logs on to our AI platform.
  • The user creates a project and names it facies_model.
  • The project can be shared with team members for collaboration.

Step 1 - Ingest, clean, and transform data for machine learning

Data can come from several sources: a file, a database, cloud storage, a machine controller, and so on. Our AI platform supports ingestion from sources such as CSV files, GCS, and SQL databases, in formats including text, numeric, image, audio, and video.

The user uploads “welldata.csv” from their computer.

Once the file is uploaded, the user can view the raw data and, by clicking on the Stat icon, its statistics, which the system computes automatically.
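For readers who want to see what such automatic statistics amount to, the sketch below does something comparable with pandas. It is only an analogy to the Stat view, not the platform's code; the inline CSV is a tiny made-up stand-in for welldata.csv, and the column subset is an assumption.

```python
import io
import pandas as pd

# Tiny inline stand-in for welldata.csv; the values are made up for illustration.
csv_text = """GR,RHOP,NPHI,classification
55.2,2.61,0.12,sand
98.7,2.42,0.27,shale
61.0,2.63,,sand
102.3,2.40,0.25,shale
"""
raw = pd.read_csv(io.StringIO(csv_text))  # for a real file: pd.read_csv("welldata.csv")

print(raw.head())                      # view of the raw data
stats = raw.describe(include="all")    # count, mean, std, min, max, unique, top, ...
print(stats)
```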

The user starts defining a dataset.

The dataset already has a column named “classification” that contains the facies classification. The user selects this column as the Target (Output).

The user selects the remaining variables as Features (Input) and proceeds to the next step.

The data may have missing values, which may need to be treated, for example by imputing or dropping them.

The user checks for missing values in the Missing Value Handler feature.

The system has already analysed the data and prescribes treatment steps.

The user follows the system’s recommendation. 
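What the Missing Value Handler recommends depends on the data and is not specified here. As one common possibility, the sketch below drops rows that have no target label and fills gaps in the features with the column median. The data frame and its values are invented for the example.

```python
import numpy as np
import pandas as pd

logs = pd.DataFrame({
    "GR":   [55.2, 98.7, np.nan, 102.3, 60.1],
    "NPHI": [0.12, 0.27, 0.11, np.nan, 0.13],
    "classification": ["sand", "shale", "sand", "shale", None],
})

missing_share = logs.isna().mean()  # fraction of missing values per column
print(missing_share)

# Rows without a target label cannot be used for training, so drop them.
train_rows = logs.dropna(subset=["classification"])

# Fill remaining gaps in the features with each column's median.
features = ["GR", "NPHI"]
medians = train_rows[features].median()
cleaned = train_rows.copy()
cleaned[features] = cleaned[features].fillna(medians)
print(cleaned)
```

Median imputation is a simple default; whether it suits a given log curve is a judgment call for the interpreter.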

The data has to be transformed into a form that algorithms can work with, typically numeric values, so that they can compute relationships between the variables.

The user clicks on Feature Pre-processor.

The system has detected the data type of each column and recommends transformations.

The user follows the system's recommendation and proceeds to the next step.

The user gives the transformed dataset a name and proceeds to the next step.
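The exact transformations the Feature Pre-processor recommends are not listed here. Two typical ones for this kind of data, shown below as an assumption, are standardising numeric features and encoding text class labels as integers, using scikit-learn. The four sample rows are made up.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Two illustrative feature columns (GR, RHOP) and a text target.
X = np.array([[55.2, 2.61], [98.7, 2.42], [61.0, 2.63], [102.3, 2.40]])
y = np.array(["sand", "shale", "sand", "shale"])

# Standardise each feature to zero mean and unit variance so that
# distance-based algorithms are not dominated by large-valued curves.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Encode class names as integers; classes are sorted alphabetically.
encoder = LabelEncoder()
y_encoded = encoder.fit_transform(y)
print(X_scaled)
print(y_encoded, list(encoder.classes_))
```

In a real workflow the scaler should be fitted on the training sample only and then applied to the test sample, so that no information leaks from test to training.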

The transformed data has to be split into training and test samples. The training dataset trains the machine learning algorithm(s), and the test dataset checks what the algorithms have learnt from the training sample. The platform allows the user to split the data in any proportion. The system recommends using 80% for training and 20% for testing.

The user clicks on Generate Dataset.

The user can view the automatically generated statistics and charts of the dataset, or its configuration, at any time.

This completes the data cleaning and transformation step. The dataset is ready for machine learning.
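The 80/20 split mentioned above corresponds to something like the following scikit-learn call. This is a sketch on synthetic data; stratifying by class and the fixed random seed are assumptions added for the example, not settings taken from the platform.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))          # 100 synthetic samples, 3 features
y = np.repeat(["sand", "shale"], 50)   # balanced, made-up labels

# 80% training, 20% test; stratify keeps the class proportions in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
print(X_train.shape, X_test.shape)  # (80, 3) (20, 3)
```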

Step 2 - Build models

There are many algorithms available for modelling: some solve classification problems, some solve regression problems, and some are deep learning methods. The mlOS ships with many algorithms and allows you to add or create your own.

The user clicks “Add Base Model” and selects the dataset created in the previous step.

At the same time, the user selects a machine learning algorithm.

The system provides default parameters. The user can experiment with different parameters.

The selected model, in this case a KNeighbors classifier, is created instantly.

The user can see the performance metrics. The documentation is generated automatically.
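Outside the platform, a comparable first model can be built with scikit-learn's KNeighborsClassifier and its default parameters. The sketch below is illustrative only: the data comes from make_classification (nine synthetic features, three classes) rather than real logs, and the scaling step is an assumption, added because nearest-neighbour methods are sensitive to feature scale.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: 9 features (like the nine log curves), 3 classes.
X, y = make_classification(n_samples=600, n_features=9, n_informative=5,
                           n_classes=3, class_sep=2.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Default parameters: n_neighbors=5, uniform weights.
knn = make_pipeline(StandardScaler(), KNeighborsClassifier())
knn.fit(X_train, y_train)

# Performance metrics on the held-out 20%.
y_pred = knn.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"accuracy: {accuracy:.3f}")
print(classification_report(y_test, y_pred))
```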

The user can create as many models as needed by selecting other algorithms. Let’s create one more version of this model, this time using a RandomForest classifier.

The system has analysed the data and recommends parameters for the chosen algorithm. They can also be tuned here.

The user goes with the system's recommendations.

Version 2 of the model, now using the RandomForest classifier, is created instantly.

  • The user can compare the performance metrics of these two models.
  • The model accuracy statistics are displayed instantly, along with the actual vs. predicted results.
  • The user compares the models to pick the one that offers the best results.
  • Once satisfied, the user selects a model to be deployed.
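The comparison of the two model versions can be approximated in code as follows. This sketch uses 5-fold cross-validated accuracy on synthetic data; the metric, the fold count, the parameter values, and the version names are all assumptions for illustration, and the platform may compare models differently.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the prepared dataset.
X, y = make_classification(n_samples=600, n_features=9, n_informative=5,
                           n_classes=3, class_sep=1.0, random_state=0)

# Hypothetical version names for the two candidate models.
candidates = {
    "v1_kneighbors": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "v2_random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Mean accuracy over 5 cross-validation folds for each candidate.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in candidates.items()
}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")

best_name = max(scores, key=scores.get)
print("selected for deployment:", best_name)
```

Accuracy alone can mislead when some facies are rare, so per-class metrics are worth checking before a model is selected.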

Step 3 - Check for biases, validate and approve model for deployment.

It’s good practice to always have someone review your work, and building AI is no different. A model has to be checked for bias, interpretability, and other properties that make AI safe and ethical to use.

The user adds comments for the deployment, attaches any extra information, and submits the review request to a peer, usually called the model risk manager.

The model risk manager has access to the automatically generated documentation and to all statistics and performance metrics, covering everything from the raw data to the point at which the model was created.

Once satisfied, the model risk manager approves the model for deployment, and the model creator deploys it to the production environment.

Step 4 - Deploy the model as an API for real-time prediction.

The model is now approved to be deployed. It’s time to deploy it so that new and legacy applications can start using it as a prediction service.

After deployment, the platform also lets the user edit the model's code in its built-in advanced code editor.

Once the model is deployed, the platform lets the user test it with sample data and inspect the API response.

The model can now be used to score data from a similar geologic environment, provided similar well-log data is available. This example used features such as GR, SHLV, DT, VP, VS, RHOP, NPHI, DPHI, and POIS to predict the facies classification.

If you want to classify facies in another data file for the same region and environment, just upload it and the deployed model will score it automatically.
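Scoring a new file only works if it carries the same feature columns the model was trained on. The sketch below shows that check together with the scoring step. The function name score_logs, the synthetic training data, and the labels "facies_A" and "facies_B" are hypothetical; only the nine feature names come from the example above.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

FEATURES = ["GR", "SHLV", "DT", "VP", "VS", "RHOP", "NPHI", "DPHI", "POIS"]

def score_logs(model, new_logs: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of new_logs with a predicted 'classification' column."""
    missing = [c for c in FEATURES if c not in new_logs.columns]
    if missing:
        raise ValueError(f"missing feature columns: {missing}")
    scored = new_logs.copy()
    scored["classification"] = model.predict(new_logs[FEATURES])
    return scored

# Synthetic training data standing in for the original well (made-up labels).
rng = np.random.default_rng(0)
train = pd.DataFrame(rng.normal(size=(200, len(FEATURES))), columns=FEATURES)
train_labels = np.where(train["GR"] > 0, "facies_A", "facies_B")
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(train[FEATURES], train_labels)

# A new well from the same (synthetic) environment, without labels.
new_well = pd.DataFrame(rng.normal(size=(5, len(FEATURES))), columns=FEATURES)
scored = score_logs(model, new_well)
print(scored[["GR", "classification"]])
```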