Digital Commons @ Michigan Tech
Rapid visual presentation to support geospatial big data processing

Given the limited number of human GIS/image analysts at any organization, efficient use of their time and of organizational resources is important, especially in Big Data application scenarios where organizations may be overwhelmed with vast amounts of geospatial data. This manuscript describes experimental research outlining the concept of Human-Computer Symbiosis, in which computers perform tasks such as classification on a large image dataset and, in sequence, humans use Brain-Computer Interfaces (BCIs) to classify those images that machine learning had difficulty with. The BCI analysis is added to exploit the brain’s ability to better answer questions like: “Is the object in this image the object being sought?” To determine the feasibility of such a system, a supervised multi-layer convolutional neural network (CNN) was trained to distinguish ‘ships’ from ‘no ships’ in satellite imagery data. A prediction layer was then added to the trained model to output the probability that a given image belongs to each of those two classes. If the probabilities were within one standard deviation of the mean of a Gaussian distribution centered at 0.5, the images would be stored in a separate dataset for Rapid Serial Visual Presentation (RSVP), implemented with PsychoPy, to a human analyst wearing a low-cost EMOTIV “Insight” EEG BCI headset. During the RSVP phase, hundreds of images per minute can be shown sequentially. At such a pace, human analysts are not capable of making conscious decisions about what is in each image; however, the subliminal “aha-moment” can still be detected by the headset. These moments are identified by extracting Event Related Potentials (ERPs), specifically the P300 ERPs.
If a P300 ERP is generated for detection of a ship, the relevant image is moved to its rightful dataset; otherwise, if the image classification is still unclear, it is set aside for another RSVP iteration in which the time afforded to the analyst for observing each image is increased. If classification is still uncertain after a reasonable number of RSVP iterations, the images in question are located within the grid matrix of their larger image scene. The images adjacent to those of interest on the grid are then added to the presentation to give the analyst more contextual information via the expanded field of view. If classification is still uncertain, one final expansion of the field of view is afforded. Lastly, if the classification of the image remains indeterminable, the image is stored in an archive dataset.


Geospatial Big Data and Applications scenarios
The rapid development of information technology has led to tremendous growth of data from various geospatial sensors, ushering in what can be defined as the era of big data (Li et al., 2016). Typical application scenarios of geospatial big data include, but are not limited to, land use mapping (Joshi et al., 2016), change detection (Wang et al., 2016), and monitoring of natural and manmade disasters (Cernove et al., 2016). Geospatial Big Data comprise data from terrestrial geosensors (Reis, 2005), (Nittel et al., 2005), social media data (Esmaili et al., 2013), terrestrial and airborne LIDAR (Debie et al., 2020), aerial imagery from manned and unmanned (UAS) platforms, and satellite Earth Observation Systems imagery with various spatial (Li et al., 2008), temporal (Stellmes et al., 2013), and spectral resolutions (Asadzadeh et al., 2016). Most current geospatial toolsets can be termed "human-in-the-loop" in spite of the increasing number of operations that are automated by computer algorithms; therefore, research on optimizing the geospatial analysts' workflow can be important for increasing the overall productivity of geospatial systems and toolsets.

Why human-computer symbiosis?
Most automated detection and identification algorithms for object/phenomenon recognition in big geospatial data domains compare radiometric and geometric parameters of the objects/phenomena against parameters obtained from supervised or unsupervised training classification and match them against a set of predefined decision rules. Unfortunately, the practical usefulness of automated detection and identification based on "press-a-button" results is limited (Feferman et al., 2004), because geospatial imaging data are burdened with residual errors and artifacts that have to be manually inspected, cleaned, and corrected. These tasks complicate large projects that require real-time processing of immense amounts of geospatial big data and require human analysts' involvement in manual post-processing and visual inspection of the automatically derived object detection and identification results. This prompted us to consider developing a human-in-the-loop, semi-automated technology to enable the most efficient processing of visual geospatial data. As humans, we perceive and process vast amounts of information visually at extremely high speed; therefore, it seems reasonable to combine this human ability with the speed of computers to build a Human-Computer Symbiosis (HCS) platform for processing geospatial data. This platform can be based on registering the analyst's cognitive activity by means of the brain electroencephalogram (EEG) and the analyst's visual attention by real-time eye tracking. While the human brain searches and analyses visual data, the analyst's eyes intuitively scan the scene. Such eye movements are driven by, and indirectly represent the results of, the internal processes of visual searching and matching performed by the whole human visual system.
By tracking and analysing EEG and eye-movement data (Eenwyk et al., 2008), we can arrange a "smart" loop in which the computer performs the rest of the tasks, namely those where computation and data storage predominate. Table 1 shows the relative strengths of brains and computers for geospatial image analysis. As can be seen, combining the two would lead to a more complete analysis, with each supplementing the other where it is lacking.

Stage                               Agent
General observed scenes matching    brain
Tuned area matching                 brain, computer
Disparity evaluation                brain, computer
Finding spot correspondence         brain
Object recognition                  brain
Measuring (un)matched objects       brain, computer
Measurement registration            computer
Statistics                          computer
Analysis                            brain, computer

Table 1. Strengths of humans and computers in geospatial image analysis (Gienko and Levin, 2007)

State-of-the-art: integrating automated and interactive geospatial data processing
An interesting experiment on integrating computer vision and brain-computer interfaces is described in (Pohlmeyer et al., 2011). It led to the development of the concept of cortically coupled computer vision (C3Vision), which refers to a particular class of brain-computer interface (BCI) meant to combine the complementary strengths of computer vision and human vision to provide robust image search and retrieval in high-throughput tasks (Gerson et al., 2006), (Parra et al., 2008), (Sajda et al., 2010). C3Vision is aimed at assisting users in searching large imagery databases. Specifically, the Caltech-101 (Fei-Fei et al., 2004) testing imagery dataset was deployed. Samples of images were randomly selected and presented in what is known as a rapid serial visual presentation (RSVP). The computer vision method deployed for image annotation was a transductive annotation graph (TAG), described in (Wang et al., 2008) and (Wang et al., 2009). This is a semi-supervised learning technique that uses a small subset of labelled images. An ActiveTwo Biosemi EEG with 64 electrodes was used for recording and for determining the level of human interest in the RSVP data. The C3Vision system enabled interaction between human and computer vision by means of a cortical interface. Its architecture was useful in reorganizing imagery in large, diverse data sets. However, the challenge of practical use of a cortical interface-based system in geoinformatics is associated with:
1. Research on how efficient RSVP will be for labelling complex (aerial, satellite) imagery.
2. How easy-to-use EEG devices that do not require soaking the analyst's scalp can be deployed with a significantly reduced number of EEG electrodes.

Experimental Scenarios
To implement cortically coupled geospatial big data application scenarios, we performed research experiments to demonstrate the feasibility of the iterative ERP solution. The major stages of that solution are:
1. Train a supervised machine learning model to classify images and predict the probability that an image belongs to a class.
2. Move images that have been misclassified, or whose classification is uncertain (within the first standard deviation of a Gaussian distribution centred at 0.5), to a separate dataset and present them to an analyst via RSVP, where t0=0; dt=5Hz; tmax=0.5; tmin=0.25; te=t0.
3. Move images to the proper dataset when ERP analysis detects P300 indicators.
4. Iterate over the remaining images with te = te + dt.
5. Mark and remove images with positive P300 indicators from the presentation dataset.
6. Repeat the iteration until te = tmax.
7. Widen the field of view to add contextual information and present the images to the analyst. Repeat the field of view expansion once more if necessary.
8. Archive any unclassifiable images.
As a result, we intend to check the feasibility of the Geospatial Imaging ERP engine (GI-ERP) as depicted in Figure 1.
Figure 1. Geospatial Imaging - Event Related Potential Engine Functional Workflow
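The iterative part of the workflow (stages 3 to 6) can be sketched roughly as follows. This is only an illustration under stated assumptions: the exposure times are expressed in milliseconds, the start, step, and maximum values are placeholders rather than the parameters of the experiment, and `detect_p300` is a hypothetical callback standing in for the RSVP presentation plus ERP analysis.

```python
from typing import Callable, Iterable, List, Tuple


def iterate_rsvp(
    images: Iterable[str],
    detect_p300: Callable[[str, int], bool],
    start_ms: int = 250,   # assumed initial exposure per image
    step_ms: int = 50,     # assumed exposure increment per iteration
    max_ms: int = 500,     # assumed maximum exposure per image
) -> Tuple[List[str], List[str]]:
    """Return (detected, unresolved) image lists after repeated RSVP passes."""
    remaining = list(images)
    detected: List[str] = []
    exposure_ms = start_ms
    while remaining and exposure_ms <= max_ms:
        still_unclear = []
        for image in remaining:
            # A positive P300 indicator removes the image from the presentation set.
            if detect_p300(image, exposure_ms):
                detected.append(image)
            else:
                still_unclear.append(image)
        remaining = still_unclear
        exposure_ms += step_ms  # give the analyst more time on the next pass
    return detected, remaining


def fake_detector(image: str, exposure_ms: int) -> bool:
    """Hypothetical stand-in: each image needs a minimum exposure to elicit a P300."""
    thresholds = {"a.png": 250, "b.png": 400}
    return image in thresholds and exposure_ms >= thresholds[image]


detected, unresolved = iterate_rsvp(["a.png", "b.png", "c.png"], fake_detector)
print(detected, unresolved)
```

Images left in `unresolved` would be the ones passed on to the field of view expansion in stage 7.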

Experimental Data
The experiment was conducted using a Kaggle dataset composed of labelled satellite data (Planet Team, 2017). The dataset included 1000 images of ships and 3000 images without ships. Figure 2 shows a subset of the training data with labels.

Figure 2. Sample set of training images and labels
As can be seen from Figure 2, there are errors in the labelled data as gathered directly from Kaggle. For our purposes, the dataset was too large to relabel manually in the time given; see the future work section for a potential solution that takes advantage of BCI. As for the results, the labelling errors exacerbate type I and type II errors; however, the foundation of the model fits within supervised learning norms.

Creating the Model and Results Obtained
The model was created using well-developed convolutional neural network training methods. The activation function used during training was an unmodified ReLU, which sets all negative values to zero and keeps any positive values, and which performs well on image data (Glorot et al., 2010) (Nair and Hinton, 2010). The model also used "Adam" optimization for its computational efficiency and low tuning requirements (Kingma et al., 2014). For the prediction layer, a modified ReLU activation function was added to scale the positive values to fit within the range of 0 to 1. The resulting values were interpreted as the confidence the model had in its prediction. An example of the resulting predictions and confidence can be seen in Figure 3. The logic that renders the "misclassified" data in red text is the same logic that would create the separate dataset presented in RSVP in a fully connected system. To determine whether an image should be forwarded to an analyst for RSVP, the confidence value would need to fall within the first standard deviation of the normal distribution shown in Figure 4.
Figure 4. ... (where 0 ≦ x ≦ 1) and a normal distribution centered at 0.5
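The routing rule described above can be illustrated with a short sketch. The standard deviation of the distribution is not stated in the text, so `sigma` is left as a required parameter, and the function names and the example value 0.15 are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def needs_human_review(ship_probability: float, sigma: float, center: float = 0.5) -> bool:
    """True when the confidence lies within one standard deviation of the center."""
    return abs(ship_probability - center) <= sigma


def route_predictions(
    predictions: Iterable[Tuple[str, float]], sigma: float
) -> Dict[str, List[str]]:
    """Split (image name, ship probability) pairs into ship / no_ship / rsvp sets."""
    routed: Dict[str, List[str]] = {"ship": [], "no_ship": [], "rsvp": []}
    for name, probability in predictions:
        if needs_human_review(probability, sigma):
            routed["rsvp"].append(name)      # uncertain: forward to the analyst
        elif probability > 0.5:
            routed["ship"].append(name)      # confident ship
        else:
            routed["no_ship"].append(name)   # confident no ship
    return routed


# sigma=0.15 is an assumed value chosen only for this example.
routed = route_predictions(
    [("t1.png", 0.97), ("t2.png", 0.55), ("t3.png", 0.08), ("t4.png", 0.40)],
    sigma=0.15,
)
print(routed)
```

A wider `sigma` sends more images to the analyst, so its choice trades analyst workload against the risk of accepting a wrong machine classification.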

Rapid Serial Visual Presentation and ERP Analysis
The cost-efficient EMOTIV Insight (Emotiv) EEG device was used for the GI ERP recordings. A custom PsychoPy script was developed to perform rapid serial visual presentations of the imagery to the subjects involved in the experiment. Figure 5 below shows the PsychoPy user interface of the main RSVP loop. The resulting Python script that executed the presentation was automatically generated from this interface. The analyst was first presented with an initial image, "Target Acquisition", to solidify what they were to search for. They were then presented with a message preparing them for the RSVP. The images were then looped through during "Test Trial", and when all of the images had been presented, an exit message was displayed prompting them to seek out the researchers to end the task.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B4-2020, 2020 XXIV ISPRS Congress (2020 edition)
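The order of screens in such a presentation can be laid out as a simple timing schedule. The sketch below is plain Python rather than the generated PsychoPy script; the routine names "Ready Message" and "Exit Message", the message texts, and all durations are assumptions made for illustration.

```python
from typing import List, NamedTuple


class Screen(NamedTuple):
    routine: str
    stimulus: str
    onset_ms: int
    duration_ms: int


def build_rsvp_schedule(
    target_image: str,
    trial_images: List[str],
    image_ms: int = 250,      # assumed per-image exposure
    target_ms: int = 3000,    # assumed target display time
    message_ms: int = 2000,   # assumed message display time
) -> List[Screen]:
    """Lay out the presentation order with cumulative onset times."""
    plan = [
        ("Target Acquisition", target_image, target_ms),
        ("Ready Message", "prepare for RSVP", message_ms),
    ]
    plan += [("Test Trial", image, image_ms) for image in trial_images]
    plan.append(("Exit Message", "please find the researchers", message_ms))

    screens: List[Screen] = []
    onset = 0
    for routine, stimulus, duration in plan:
        screens.append(Screen(routine, stimulus, onset, duration))
        onset += duration
    return screens


schedule = build_rsvp_schedule("ship_example.png", ["a.png", "b.png", "c.png"])
for screen in schedule:
    print(screen)
```

The onset times are what an ERP analysis would need in order to align EEG epochs with individual images.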

Probabilistic mathematical approach for the estimation of GI ERP extracted objects
Any information extracted from geospatial big data may have inaccuracies associated with the nature of these data. This inaccuracy may follow from errors in machine learning algorithms and in human interpretation due to the complexity of sensor models and the variety of conditions. In the case of imaging information, such inaccuracy may follow from the known errors in segmenting image features (which is known as an "ill-posed problem" (Marroquin et al., 1987)) and in matching those features against geospatial features present in a pre-existing database of the area. For the mathematical abstraction, let us assume that we compare the probability of an object-of-interest-found event or image feature against a simplistic binary model of "truth" and "false". In this binary simplification, the GI ERP operates on two vectors representing a current geospatial object (GO) and a database vector (DBV), respectively. When the DBV represents a full data stream and the GO vector represents only a particular view/state of a particular object, the number of GO-vector permutations with N symbols (vector components) and M "ones" is $\binom{N}{M} = \frac{N!}{M!\,(N-M)!}$, where $m! = m(m-1)(m-2)\cdots 1$.
It could be shown for equal probabilities for "1" and "0" (equal to 0.5) that: The False Negative Rate (FNR) can be calculated from the binomial distribution approximated by Gaussian (normal) distribution in the following asymptotic form (Mesgeneu et al., 1976): , which is illustrated in Figure 6.
Figure 6. Illustration of the FNR for Gaussian distribution.
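As a numerical aside, the counting of binary vectors and the Gaussian approximation of a binomial distribution with equal probabilities for "1" and "0" can be checked with a few lines of Python. This sketch uses the standard de Moivre-Laplace approximation (mean N/2, variance N/4 for p = 0.5); it is not the asymptotic FNR expression referred to above, and the function names are hypothetical.

```python
import math


def count_binary_vectors(n: int, m: int) -> int:
    """Number of N-component binary vectors containing exactly M ones."""
    return math.comb(n, m)


def binomial_pmf(n: int, m: int, p: float = 0.5) -> float:
    """Exact probability of exactly M ones among N independent symbols."""
    return math.comb(n, m) * p**m * (1 - p) ** (n - m)


def gaussian_pdf(x: float, mean: float, variance: float) -> float:
    """Normal density used as the large-N approximation of the binomial."""
    return math.exp(-((x - mean) ** 2) / (2 * variance)) / math.sqrt(2 * math.pi * variance)


n = 100
mean, variance = n * 0.5, n * 0.5 * 0.5   # N/2 and N/4 for p = 0.5
exact = binomial_pmf(n, 50)
approx = gaussian_pdf(50, mean, variance)
print(count_binary_vectors(4, 2), exact, approx)
```

For N = 100 the exact and approximate values at the center of the distribution already agree to about three decimal places.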
The FNR determines with what probability we can guess the right distribution, with variance: and mean value:
In the GI ERP context, we assume the use of the well-known automated target recognition (ATR) Figures of Merit (FOMs), which are based on Signal Detection Theory (SDT) (Macmillan, 2002), where conditional probabilities are applied (see the table of Figure 6): p(S|S), p(S|N), p(N|S), and p(N|N), translated as the probability of hit, False-Alarm-Rate (FAR), miss, and correct rejection, respectively. In this formulation, the conditional probability p(A|B) means: if event (B) is present, then event (A) is detected. For example, p(S|N) is the probability of signal/target detection assuming that only noise is present. Thus, p(S|N) also means the probability of false alarm, or False-Alarm-Rate (FAR). We should observe that p(d/I) is similar to p(S|S). Yet p(S|S) is closer to our ERP scenario, in which we need to evaluate the probability of target recognition (hit) on the basis of a single image or a few images. Based on this modelling, we assume the establishment of Classes of Equivalence (COEs), as shown in Figure 6. In a more general case than in (Thom, 1969), in addition to the probabilities of hit, miss, FAR, and correct rejection, we may also compute cross-talk probabilities (CTP). During GI ERP development we investigate and evaluate the COE concept for specific classes of equivalence related to emergency response application scenarios, for example: Ship (A), No Ship (B).
Table 3. Conditional probabilities, according to the GI ERP SDT model, generalized into two classes of equivalence of geospatial objects: ships and no ships
In addition to a traditional simplified signal/noise model (Macmillan, 2002), where the conditional probabilities mean hit, miss, FAR, and correct rejection, we will also estimate the Cross-Talk Probability (CTP).
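The four conditional probabilities of the signal/noise model can be estimated from trial counts as in the following sketch. The function name and the example counts are hypothetical and are not results of the experiment.

```python
from typing import Dict


def sdt_rates(hits: int, misses: int, false_alarms: int, correct_rejections: int) -> Dict[str, float]:
    """Estimate the SDT conditional probabilities from trial counts."""
    signal_trials = hits + misses                      # trials where a ship was present
    noise_trials = false_alarms + correct_rejections   # trials where no ship was present
    if signal_trials == 0 or noise_trials == 0:
        raise ValueError("need at least one signal trial and one noise trial")
    return {
        "hit": hits / signal_trials,                             # p(S|S)
        "miss": misses / signal_trials,                          # p(N|S)
        "false_alarm": false_alarms / noise_trials,              # p(S|N), the FAR
        "correct_rejection": correct_rejections / noise_trials,  # p(N|N)
    }


# Made-up counts for illustration only.
rates = sdt_rates(hits=40, misses=10, false_alarms=5, correct_rejections=45)
print(rates)
```

By construction, the hit and miss rates sum to one, as do the false alarm and correct rejection rates.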

EEG EXPERIMENT RESULTS
Imagery datasets as described in section 2.5 were shown to one subject (due to the COVID-19 quarantine we were not able to involve more human subjects). We recorded experiments on both the Emotiv Epoc (14 electrodes) and Insight (6 electrodes) EEG headsets. Grand average ERPs to the aha-moment on all sensors are depicted in Figure 7 for both devices.
Figure 7. Grand average signals for Emotiv Epoc (A) and Insight (B) recordings in the ERP experiment.
Figure 7 shows that electrodes O1 and O2 on the Epoc device and Pz on the Insight are responsive to the signal (aha-moment). This agrees with the neuroscience theory that the subliminal aha-moment can be detected in the hypothalamus area of the brain.
The graphical representation of the experimental object ERP recordings for the ships/no ships trial is given in Figure 8. We defined bin number 0 for the no-ship images and bin number 1 for the ship images. Based on the results of ERP processing, we captured more activity in the left lower back of the brain than in other parts of the brain in the 300 ms after the presentation of ships during the RSVP experiment. EEG data were also analysed based on ERP "Aha" minus "Noaha" difference waves (Mai et al., 2004). P300 (van Dinteren et al., 2014) was elicited for both data subsets: images with objects (Aha) and images without objects (non-Aha). The results of computing P300 differences and potentials are listed below in Tables 4 and 5. The experimental results support the feasibility of our concept.
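The "Aha" minus "Noaha" differencing can be illustrated with a minimal sketch on toy data. The epochs below are invented single-channel values sampled every 100 ms, and the 250 to 500 ms search window for the P300 peak is an assumption, not a setting taken from the experiment.

```python
from typing import List, Sequence, Tuple


def average_epochs(epochs: Sequence[Sequence[float]]) -> List[float]:
    """Sample-wise mean over epochs of equal length (one ERP per bin)."""
    count = len(epochs)
    return [sum(samples) / count for samples in zip(*epochs)]


def difference_wave(
    aha_epochs: Sequence[Sequence[float]], noaha_epochs: Sequence[Sequence[float]]
) -> List[float]:
    """ERP of the ship bin minus ERP of the no-ship bin."""
    aha = average_epochs(aha_epochs)
    noaha = average_epochs(noaha_epochs)
    return [a - b for a, b in zip(aha, noaha)]


def peak_in_window(
    wave: Sequence[float], sample_rate_hz: float, start_ms: float, end_ms: float
) -> Tuple[float, float]:
    """Return (latency in ms, amplitude) of the largest sample inside the window."""
    first = int(start_ms * sample_rate_hz / 1000)
    last = min(int(end_ms * sample_rate_hz / 1000), len(wave) - 1)
    index = max(range(first, last + 1), key=lambda i: wave[i])
    return index * 1000.0 / sample_rate_hz, wave[index]


# Toy epochs: 6 samples at 10 Hz, i.e. 0 to 500 ms after stimulus onset.
aha_epochs = [[0, 1, 2, 6, 2, 0], [0, 1, 2, 8, 2, 0]]
noaha_epochs = [[0, 1, 2, 1, 2, 0], [0, 1, 2, 3, 2, 0]]

wave = difference_wave(aha_epochs, noaha_epochs)
latency_ms, amplitude = peak_in_window(wave, sample_rate_hz=10, start_ms=250, end_ms=500)
print(wave, latency_ms, amplitude)
```

With real recordings the same steps would be applied per electrode after filtering and baseline correction, which this sketch omits.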

CONCLUSIONS AND FUTURE RESEARCH
Preliminary experimental results indicate the feasibility of human-computer symbiosis within the big geospatial data domain. Our future research will be devoted to increasing accuracy and minimizing the processing and analysis time needed to obtain results, including further software and algorithm optimizations.
To improve the labelled dataset more quickly than with traditional methods, a form of Interactively Procedural Image Serial Presentation can be used, in which interaction with the presentation via button presses causes the next image to be shown. This methodology can be used to show an analyst each image/label pair. When an analyst sees the image in Figure 8, for instance, they can press a button to inform the system that the image was mislabelled; the system can then programmatically correct the label by moving the training image to the proper directory. This method would be faster than an analyst opening every image in a poorly labelled dataset and manually moving the image files to the proper classification directory.
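The relabelling step itself could be as small as the sketch below, which would be called when the analyst presses the "mislabelled" button. It assumes, hypothetically, that the dataset is organized as one directory per class named `ship` and `no_ship`; the function names are illustrative.

```python
import shutil
import tempfile
from pathlib import Path

LABELS = ("ship", "no_ship")  # assumed directory names, one per class


def other_label(label: str) -> str:
    """Return the opposite class in the two-class ships / no ships setting."""
    return LABELS[1] if label == LABELS[0] else LABELS[0]


def relabel_image(image_path: Path, dataset_root: Path) -> Path:
    """Move a mislabelled image into the directory of the other class."""
    current_label = image_path.parent.name
    target_dir = dataset_root / other_label(current_label)
    target_dir.mkdir(parents=True, exist_ok=True)
    destination = target_dir / image_path.name
    shutil.move(str(image_path), str(destination))
    return destination


# Demonstration on a temporary directory with an empty placeholder file.
root = Path(tempfile.mkdtemp())
(root / "ship").mkdir()
wrong = root / "ship" / "tile_001.png"
wrong.write_bytes(b"")
moved = relabel_image(wrong, root)
print(moved)
```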
If there is uncertainty from the analyst during the RSVP, as automatically detected by EEG, the image that caused the uncertainty will be stored in a separate dataset. This new dataset will then be used to locate the image within a larger context, such as the scene in Figure 10; however, the entire scene will not be loaded, as that is computationally inefficient. Instead, the images in question will be located within the larger scene via their coordinates. Once the position is determined, the field of view is increased around the original image by gathering only the eight images adjacent to it (Figure 11 B). If certainty still does not increase beyond a chosen threshold, the field of view will be increased once more (Figure 11 C). Finally, if the analyst still cannot be certain of what is in the original image, it will be stored in a dataset with other undeterminable images for archiving. To further enhance outcomes, we are planning to integrate eye tracking technology to analyse the zones of the imagery that attract the visual attention of the human analyst, and to improve the machine learning technology by adding rules in the form of decision trees. Specifically, the anticipated workflow, ERP Extract, is depicted in Figure 12.
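Gathering the neighbouring tiles without loading the whole scene reduces to index arithmetic on the grid, as in this sketch. The (row, column) indexing and the choice of a two-tile radius for the second expansion are assumptions; the text does not state the size of the final field of view.

```python
from typing import List, Tuple


def tiles_in_view(
    row: int, col: int, n_rows: int, n_cols: int, radius: int = 1
) -> List[Tuple[int, int]]:
    """Grid positions of the tile and its neighbours, clipped at the scene border."""
    tiles = []
    for r in range(row - radius, row + radius + 1):
        for c in range(col - radius, col + radius + 1):
            if 0 <= r < n_rows and 0 <= c < n_cols:
                tiles.append((r, c))
    return tiles


first_expansion = tiles_in_view(5, 5, n_rows=10, n_cols=10, radius=1)   # tile plus 8 neighbours
second_expansion = tiles_in_view(5, 5, n_rows=10, n_cols=10, radius=2)  # assumed 5 x 5 block
corner_case = tiles_in_view(0, 0, n_rows=10, n_cols=10, radius=1)       # fewer tiles at the border
print(len(first_expansion), len(second_expansion), len(corner_case))
```

Only the tiles returned by the function need to be read from storage and stitched together for the next presentation.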

ACKNOWLEDGEMENTS
The authors would like to express gratitude to the Civil Engineering Department at Michigan Technological University, and personally to Jefferey Hollingsworth, for the financial support of this research.