The first step in mining projects is the exploration of new mining sites. This involves evaluating a potential mining area in terms of its mineral composition and properties, the mining method it would require, and its expected profitability.
For this purpose, geologists evaluate various sources, such as mineral maps, geological reports from authorities and research institutes, property information and local word of mouth. The resulting assessment is of high economic importance because, due to the high cost of drilling, good prioritization of drill sites can significantly increase the efficiency of a mining project.
However, a thorough analysis of these sources requires well-trained geologists with many years of experience and is very time-consuming. Automated source analysis can assist geologists and surveyors and can raise profitability by increasing the probability of identifying suitable, economically viable exploration sites.
The automatic analysis of geological maps and reports can be achieved by developing Machine Learning (ML) algorithms, using Natural Language Processing (NLP) methods for the textual data and Computer Vision (CV) methods for the image data.
In NLP projects that start from scanned documents, a central challenge is usually choosing an Optical Character Recognition (OCR) tool that can satisfactorily process the different input sources, e.g., handwritten versus machine-generated reports, or information in tables versus information summarized as continuous text.
Since the input data, both text and image, come from different sources, they may have different formats and terminologies that can make it difficult to develop a unified model. Therefore, the first step is to develop a concept of how to convert the data into a format that can be used for the model.
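The conversion step described above can be sketched as follows. This is a minimal illustration, not a prescribed design: the `Observation` schema, the field names of the raw records and the synonym table are all hypothetical and would have to be derived from the actual sources.

```python
from dataclasses import dataclass

# Hypothetical synonym table: maps source-specific terms to one canonical name.
CANONICAL_MINERALS = {
    "au": "gold",
    "gold": "gold",
    "cu": "copper",
    "copper": "copper",
}


@dataclass(frozen=True)
class Observation:
    """One normalized finding, regardless of which source it came from."""
    area_id: str
    mineral: str
    source_type: str  # "report" or "map"
    source_name: str


def normalize_mineral(term):
    """Map a raw term to its canonical name; unknown terms are kept lowercased."""
    key = term.strip().lower()
    return CANONICAL_MINERALS.get(key, key)


def to_observation(raw, source_type):
    """Convert one raw record (hypothetical field names) into the common schema."""
    return Observation(
        area_id=str(raw["area"]).strip().upper(),
        mineral=normalize_mineral(raw["mineral"]),
        source_type=source_type,
        source_name=raw.get("source", "unknown"),
    )


# Two made-up records that describe the same area with different conventions.
report_row = {"area": " a-17 ", "mineral": "Au", "source": "survey_1987.pdf"}
map_row = {"area": "A-17", "mineral": "Gold", "source": "sheet_42.tif"}

observations = [
    to_observation(report_row, "report"),
    to_observation(map_row, "map"),
]
```

After normalization, both records carry the same area identifier and mineral name, so later steps can compare them directly.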
For the text data, this begins with choosing an OCR tool, such as Tesseract, Google Cloud Vision API, ABBYY FineReader or Amazon Textract; each performs differently depending on the type of input. NLP techniques for analyzing the extracted text range from TF-IDF weighting and Naive Bayes classifiers, which work on word frequencies, to LSTM networks, which model the relationships between words and their respective contexts. For simpler information extraction tasks, rule-based approaches may suffice.
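As an illustration of the rule-based end of this spectrum, the sketch below pulls grade mentions out of text that has already been through OCR. The pattern, the list of element symbols and the sample sentence are assumptions made for the example; real reports would need a broader and carefully validated rule set.

```python
import re

# Hypothetical rule: a number, a unit (g/t, ppm or %), then an element symbol.
GRADE_PATTERN = re.compile(
    r"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>g/t|ppm|%)\s+(?P<element>Au|Ag|Cu|Zn|Pb)\b"
)


def extract_grades(text):
    """Return every grade mention found in the text as a small dict."""
    return [
        {
            "element": match.group("element"),
            "value": float(match.group("value")),
            "unit": match.group("unit"),
        }
        for match in GRADE_PATTERN.finditer(text)
    ]


# Made-up sentence in the style of a drilling report.
sample = "Drill hole DH-3 intersected 2.5 g/t Au over 12 m and 0.8 % Cu at depth."
grades = extract_grades(sample)
```

Rules of this kind are transparent and easy to audit, but they only find what the pattern anticipates, which is why they suit the simpler extraction tasks.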
The geological maps can be analyzed with CV algorithms, especially Convolutional Neural Networks (CNNs), which are able to detect and classify objects in image data. As a result, relevant information can be identified automatically in the geological maps.
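To give a feel for the operation that CNN layers are built on, the following sketch applies a single hand-set filter to a tiny grid standing in for a map with two units. It is only an illustration under simplifying assumptions: the grid and the kernel are invented, and in an actual CNN the kernel weights are learned from labeled data and many filters are stacked in layers.

```python
def convolve2d(image, kernel):
    """Valid-mode 2D cross-correlation, the sliding-window operation in CNN layers."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            # Weighted sum of the image patch under the kernel.
            for di in range(kernel_h):
                for dj in range(kernel_w):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Toy 4x4 "map": the left half is one unit (0), the right half another (1).
toy_map = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-set kernel that responds to a left-to-right increase in value.
edge_kernel = [
    [-1, 1],
    [-1, 1],
]

response = convolve2d(toy_map, edge_kernel)
```

The response is non-zero only in the middle column, where the window straddles the boundary between the two units; a trained network combines many such responses to locate and classify map features.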
To capture and model any dependencies, the different input data (the text data from the geological reports and the image data from the maps) should be integrated into a common data set. For example, if reports and maps are both available for a given area and their information agrees, the result for that area is correspondingly more reliable. The extracted information can then be combined in a database and made available to the user via a Graphical User Interface (GUI).
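One possible way to combine the two kinds of extracted information is sketched below with an in-memory SQLite database. The table layout, the column names and the sample rows are hypothetical; the point is only to show how findings confirmed by both reports and maps can be ranked ahead of findings from a single source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE findings (area_id TEXT, mineral TEXT, source_type TEXT)"
)

# Made-up findings, as they might come out of the text and image pipelines.
rows = [
    ("A-17", "gold", "report"),
    ("A-17", "gold", "map"),
    ("B-02", "copper", "report"),
    ("C-09", "zinc", "map"),
]
conn.executemany("INSERT INTO findings VALUES (?, ?, ?)", rows)

# Count how many distinct source types support each (area, mineral) pair.
query = """
    SELECT area_id, mineral, COUNT(DISTINCT source_type) AS n_sources
    FROM findings
    GROUP BY area_id, mineral
    ORDER BY n_sources DESC, area_id
"""
ranked = conn.execute(query).fetchall()

# Findings supported by both a report and a map.
corroborated = [(area, mineral) for area, mineral, n in ranked if n == 2]
```

A GUI could then query the same table, for example to list the corroborated areas first.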