HIGH THROUGHPUT SYSTEM FOR PLANT HEIGHT AND HYPERSPECTRAL MEASUREMENT

Hyperspectral and three-dimensional (3D) measurements capture the intrinsic physicochemical properties and the external geometrical characteristics of objects, respectively. In agriculture, multiple sensors are now commonly integrated into one system to collect spectral and morphological information. However, previous experiments were usually performed with several commercial devices mounted on a single platform, and inadequate registration and synchronization among the instruments often caused a mismatch between the spectral and 3D information of the same target. In addition, a narrow field of view (FOV) prolongs working hours in the field. We therefore propose a high throughput prototype that combines stereo vision and grating dispersion to acquire hyperspectral and 3D information simultaneously.


INTRODUCTION
The constantly increasing global population presents a tremendous challenge for agricultural production (Mulla, 2013). Improving crop varieties and developing precision agriculture have become key steps toward increasing yield (Furbank et al., 2011; Busemeyer et al., 2013), and both are inseparably linked to the ability to assess plant phenotypes (White et al., 2012). Currently, measuring thousands of plants is laborious and time consuming. Thus, there is an urgent need for high throughput systems that allow plot-level measurements within seconds. In addition, high throughput phenotyping in greenhouses could relieve the bottleneck in gene discovery and crop improvement (Pandey et al., 2017). Among the various phenotypic traits, plant height and spectral reflectance are two essential ones, which play an important role in breeding, growth monitoring, and yield estimation.
In the past decades, numerous optical sensors have been developed to obtain spectral and 3D information. These sensors can be classified into passive and active types. Active sensors are typically equipped with an energy source and obtain spectral or depth information by projecting a signal onto objects and measuring the response, whereas passive sensors rely on ambient illumination.
For spectral measurement, both active sensors (Tec5 AgroSpec, Trimble GreenSeeker, etc.) and passive sensors (ASD FieldSpec3, Specim ImSpector, Cubert UHD185, etc.) are often used. They can quantify light intensity across hundreds or thousands of distinct spectral bands (Pandey et al., 2017). Prism, grating, Fabry-Perot, and Fourier transform spectrometers are the typical conventional designs. Grating spectroscopic systems can cover spectra from X-rays to microwaves, and blazed gratings can concentrate energy in the spectral region of interest, which is why grating systems are widely used. For depth measurement, active sensors based on time-of-flight (TOF) or laser triangulation and passive sensors based on stereo vision or structure from motion (SFM) are the common choices. Many sensor technologies, such as depth cameras (Azzari et al., 2013), lidar (Zhang et al., 2012), structured light approaches (Bellasio et al., 2012), ultrasonic transducers (Gil et al., 2007), and stereo camera systems (Biskup et al., 2007), have been used to obtain the 3D structure of plants. Point-based sensors (lidar, ultrasonic transducers) have a narrow FOV, which usually means that the highest point of the crop is missed (Jiang et al., 2016). Depth cameras such as RGB-D cameras offer a low-cost way to acquire 3D information (Bellasio et al., 2012), but they perform poorly on sunny days and therefore require a shaded environment (Jiang et al., 2016). Close-up laser triangulation can provide high-precision 3D data, but it needs a measuring arm or an auxiliary motion mechanism. Stereo vision and SFM, by contrast, can obtain a dense point cloud through image processing at lower cost, but the algorithms are complex and the accuracy is limited. Thus, accuracy, time efficiency, application field, and cost should all be considered when choosing a 3D sensor.
Most integrated systems combine the above-mentioned techniques for depth and spectral measurement. Integrated system designs fall into three main types: point-, line-, and image-based. Point-based systems obtain one 3D point and one spectral curve at a time and acquire the full data set by whiskbroom scanning. Zhao et al. (2017) designed an integrated system for auto-registered hyperspectral and 3D measurement based on point laser triangulation and prism dispersion. Line-based systems usually combine a line laser with a slit-based spectrometer, so that the entire data set is obtained by pushbroom scanning. Behmann et al. (2015) developed an integrated system that generates 3D plant models with hyperspectral texture by combining several pushbroom cameras and laser scanners. Image-based systems can extract 3D information directly from spectral images through SFM (Aasen et al., 2015); these systems need only camera calibration and no registration.
In this study, we aim to develop an integrated prototype that combines triangulation-based stereo vision for depth acquisition with grating dispersion for spectral acquisition. Because the system acquires data frame by frame, it can be applied to high throughput phenotyping of plants, with depth and spectral data acquired simultaneously.
Figure 1a illustrates the structure of the concave grating spectrometer. The fore lens images the incident light onto the primary imaging plane, where the slit acts as a field diaphragm. The light leaving the slit is then dispersed by the grating and focused on the detector. In contrast to a plane grating, a concave grating combines the functions of light dispersion and focusing, which keeps the spectrometer compact and portable (Zhou et al., 2016). Moreover, the flat-field design and aberration correction enable a planar detector to capture hyperspectral images. As shown in Figure 1b, the slit is imaged on the sensor with the spectral information dispersed horizontally.
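As a reminder of why a grating separates wavelengths on the detector, the standard grating equation can be written as below. It is not stated in the original text, and the symbols are introduced here only for illustration: $d$ is the groove spacing, $m$ the diffraction order, $\alpha$ the incidence angle and $\beta$ the diffraction angle, both measured from the grating normal and taken as positive on the same side.

```latex
% Standard grating equation (illustrative symbols, not taken from the paper)
m\lambda = d\,(\sin\alpha + \sin\beta)
```

For a fixed incidence angle and order, each wavelength leaves the grating at its own angle $\beta$, which is why the slit image is spread along one detector axis.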

Principle of Binocular Stereo Vision
Binocular stereo vision infers depth from two cameras by triangulation. Figure 2 illustrates the geometry of a binocular stereo vision system. The object point $P_w(x_c, y_c, z_c)$ is projected through the optical centres onto the two image planes at $P_L(X_L, Y_L, 1)$ and $P_R(X_R, Y_R, 1)$. In other words, the two half-lines defined by the lens centres and the projected points in the two images intersect at one point in space. Their relationship can be described by the following equations:
$$s_l \begin{pmatrix} X_L \\ Y_L \\ 1 \end{pmatrix} = A_L \begin{pmatrix} x_c \\ y_c \\ z_c \end{pmatrix}, \qquad s_r \begin{pmatrix} X_R \\ Y_R \\ 1 \end{pmatrix} = A_R \left( R \begin{pmatrix} x_c \\ y_c \\ z_c \end{pmatrix} + T \right) \tag{1}$$
where $R$ and $T$ are the rotation matrix and translation vector between the left and right cameras, respectively, $A_L$ and $A_R$ are the intrinsic parameter matrices of the two cameras, and $s_l$ and $s_r$ are nonzero scale factors. When the parameters of the two cameras are known, the spatial equations of the two half-lines are determined, and the position of the object point in the left camera coordinate system can be obtained. Figure 2 shows the common structure of a binocular stereo vision system, which can be rectified into a standard model (Fusiello et al., 2000). In this model, the two cameras are parallel, so the homologous points $P_L$ and $P_R$ are constrained to the same horizontal line of the rectified images (Li et al., 2011). The coordinates of the object point are then given by the following equations:
$$x_c = \frac{B\,x_l}{d}, \qquad y_c = \frac{B\,y_l}{d}, \qquad z_c = \frac{B\,f}{d} \tag{2}$$
where $B$ is the baseline, $f$ is the focal length, $d = x_l - x_r$ is the disparity, and $(x_l, y_l)$ are the coordinates of the point in the rectified left image, measured from the principal point. Stereo matching is central to stereo vision: it uncovers pixel-wise correspondences between the left and right images and generates the optimal disparity map $d(x, y)$ for all pixels $(x, y)$ in the left image (Lati et al., 2011). In this study, the Semi-Global Block Matching (SGBM) algorithm provided by OpenCV is adopted to generate the disparity maps.
Figure 3 shows the structure of the integrated system. The left portion depicts the 3D structure measurement scheme, which consists of two cameras.
The right portion illustrates the spectral detection component, in which the lens of the spectrometer is placed midway between the stereo cameras. An optical fiber bundle, with one end arranged in a square and the other in a line, converts the two-dimensional scene into a one-dimensional line. The optical axes of the three lenses intersect at a distance of 1.2 m, which ensures that the images captured by the two parts overlap at the centre. One part of the light reflected from the target is captured by the stereo cameras, from which a 3D point cloud is generated. The other part is transmitted by the fiber bundle and then dispersed by the concave grating onto the detector, from which the hyperspectra are obtained.
Figure 4 shows a picture of the prototype. The upper dashed box marks the 3D measurement component, which includes two Basler dart daA2500-14uc cameras. The horizontal and vertical FOV of the stereo cameras were both 28°. The lower dashed box marks the hyperspectral detection element, in which a CMOS camera (HK-A5100-GM, Microview, Beijing) and the grating were used. The numerical aperture (NA) of the fiber was 0.24, corresponding to a FOV of approximately 27.7°, close to that of the stereo cameras, and the fiber diameter was 125 μm. The software ran on a 3.2 GHz Core i5 PC without graphics processing unit (GPU) acceleration. Point cloud acquisition ran at five frames per second, and the stereo cameras and the spectral detector captured the scene at the same time. An enlarged picture of the concave grating is shown on the left side of the figure.
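The quoted FOV of about 27.7° can be checked from the numerical aperture with the usual relation NA = n·sin(θ), where θ is the half-angle of the acceptance cone. The sketch below assumes the fiber works in air (n = 1) and treats the acceptance cone as the FOV, which is an idealisation; the function names are hypothetical, and the footprint calculation is added only to compare the two channels at the 1.2 m working distance.

```python
import math


def full_acceptance_angle_deg(numerical_aperture: float, n_medium: float = 1.0) -> float:
    """Full cone angle in degrees, from NA = n * sin(half_angle)."""
    return math.degrees(2.0 * math.asin(numerical_aperture / n_medium))


def footprint_width_m(full_angle_deg: float, distance_m: float) -> float:
    """Width covered at a given distance by a cone of the given full angle."""
    return 2.0 * distance_m * math.tan(math.radians(full_angle_deg) / 2.0)


# NA = 0.24, FOV = 28 deg and the 1.2 m distance come from the text;
# n = 1 (air) is an assumption.
fiber_fov_deg = full_acceptance_angle_deg(0.24)
fiber_width_m = footprint_width_m(fiber_fov_deg, 1.2)
stereo_width_m = footprint_width_m(28.0, 1.2)

print(f"fiber FOV: {fiber_fov_deg:.2f} deg")
print(f"footprint at 1.2 m: fiber {fiber_width_m:.3f} m, stereo {stereo_width_m:.3f} m")
```

Under these assumptions the two footprints differ by only a few millimetres at 1.2 m, which is consistent with the statement that the two FOVs are approximately equal.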

Spectral Calibration
Spectral calibration determines the relationship between pixel position (N) and wavelength (λ), which can be described by the following equation:
$$N = a\lambda^2 + b\lambda + c \tag{3}$$
A monochromator (SP2500, Princeton, USA) equipped with a tungsten-halogen lamp served as the standard light source. Its mechanical range was 0-1400 nm, with 0.2 nm accuracy and 0.05 nm repeatability. During calibration, the drive step size was set to 5 nm, and two items were recorded at each step (Cho et al., 1995): first, the pixel position (N) of the peak of each band together with the current wavelength (λ); second, the full width at half maximum. Figure 5 shows the fitting result. The spectral range of the prototype is 450-790 nm, and the typical spectral resolution is 3.1 nm@600 nm.
Figure 5. Relationship between pixel position N and wavelength λ
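The quadratic calibration model can be fitted by ordinary least squares and then inverted to assign a wavelength to each detector column. The following is a minimal sketch using only the standard library; it is not the authors' code. The coefficient values and the function names are hypothetical, and the calibration points are synthetic (5 nm steps over 450-790 nm, mirroring the procedure described above).

```python
import math


def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for k in range(col, 4):
                a[r][k] -= factor * a[col][k]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        x[i] = (a[i][3] - sum(a[i][j] * x[j] for j in range(i + 1, 3))) / a[i][i]
    return x


def fit_pixel_vs_wavelength(wavelengths_nm, pixels):
    """Least-squares fit of N = a*lambda^2 + b*lambda + c; returns (a, b, c)."""
    mean = sum(wavelengths_nm) / len(wavelengths_nm)
    scale = max(abs(w - mean) for w in wavelengths_nm) or 1.0
    # Fit in a centred, scaled variable t to keep the normal equations well conditioned.
    t = [(w - mean) / scale for w in wavelengths_nm]
    s = [sum(ti ** k for ti in t) for k in range(5)]
    m = [[s[4], s[3], s[2]], [s[3], s[2], s[1]], [s[2], s[1], s[0]]]
    v = [sum(n * ti ** 2 for n, ti in zip(pixels, t)),
         sum(n * ti for n, ti in zip(pixels, t)),
         sum(pixels)]
    p, q, r = solve3(m, v)
    # Expand N = p*t^2 + q*t + r back into powers of lambda.
    a = p / scale ** 2
    b = q / scale - 2.0 * p * mean / scale ** 2
    c = r - q * mean / scale + p * mean ** 2 / scale ** 2
    return a, b, c


def pixel_to_wavelength(pixel, a, b, c, lo_nm=450.0, hi_nm=790.0):
    """Invert the quadratic and return the root inside [lo_nm, hi_nm]."""
    if a == 0.0:
        return (pixel - c) / b
    disc = b * b - 4.0 * a * (c - pixel)
    if disc < 0.0:
        raise ValueError("pixel position is outside the fitted model")
    for sign in (1.0, -1.0):
        root = (-b + sign * math.sqrt(disc)) / (2.0 * a)
        if lo_nm <= root <= hi_nm:
            return root
    raise ValueError("no wavelength in the calibrated range maps to this pixel")


# Synthetic calibration data; these coefficients are hypothetical, not from the paper.
true_a, true_b, true_c = 1e-4, 3.0, -1300.0
wavelengths = [450.0 + 5.0 * i for i in range(69)]  # 450 ... 790 nm in 5 nm steps
peak_pixels = [true_a * w ** 2 + true_b * w + true_c for w in wavelengths]

fit_a, fit_b, fit_c = fit_pixel_vs_wavelength(wavelengths, peak_pixels)
wavelength_at_536 = pixel_to_wavelength(536.0, fit_a, fit_b, fit_c)
print(fit_a, fit_b, fit_c, wavelength_at_536)
```

With real data, the peak pixel positions would come from the recorded monochromator bands, and the residuals of the fit would indicate whether a quadratic is sufficient.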

Stereo Camera Calibration
In this step, we obtained the camera parameters, namely the intrinsic and extrinsic parameters and the distortion coefficients, using the method described by Zhang (2000). Figure 6 shows the calibration images of the two cameras. During the process, the calibration board was placed at various positions and orientations. Finally, we calculated the re-projection errors to evaluate the accuracy of the calibration.
The RMS re-projection errors were 0.116 and 0.139 pixels for the left and right cameras, respectively. The intrinsic parameters are listed in Table 1, and the extrinsic parameters are given in Equation (4).
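For readers unfamiliar with the metric, the sketch below shows one common way to compute an RMS re-projection error: project known board points with the estimated parameters and compare them with the detected corners. It assumes an ideal pinhole model with no lens distortion and points already expressed in the camera frame. All numbers and function names are hypothetical, and the paper does not state which exact definition was used.

```python
import math


def project(point_cam, fx, fy, cx, cy):
    """Pinhole projection of a 3D point (camera frame) to pixel coordinates."""
    x, y, z = point_cam
    return fx * x / z + cx, fy * y / z + cy


def rms_reprojection_error(points_cam, observed_px, fx, fy, cx, cy):
    """Root mean square of the Euclidean pixel distance over all points."""
    total = 0.0
    for point, (u_obs, v_obs) in zip(points_cam, observed_px):
        u, v = project(point, fx, fy, cx, cy)
        total += (u - u_obs) ** 2 + (v - v_obs) ** 2
    return math.sqrt(total / len(points_cam))


# Hypothetical intrinsics (pixels) and two board corners 1 m from the camera (mm).
fx = fy = 2000.0
cx, cy = 1000.0, 800.0
corners_cam = [(0.0, 0.0, 1000.0), (100.0, 0.0, 1000.0)]
detected_px = [(1000.3, 800.4), (1200.0, 800.0)]

rms_px = rms_reprojection_error(corners_cam, detected_px, fx, fy, cx, cy)
print(f"RMS re-projection error: {rms_px:.3f} px")
```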

Table 1. Intrinsic parameters of the two cameras (columns: physical meaning, parameter, camera, values)

EXPERIMENTAL TESTS
To evaluate the depth accuracy, a standard plate and a standard column were measured. The root mean squared errors (RMSE) at 1200 mm were 0.82 and 1.05 mm for the plate and the column, respectively. We used three times the RMSE to describe the accuracy; taking the larger (column) value, 3 × 1.05 mm, gives ±3.15 mm at 1.2 m.
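The depth values evaluated above come from triangulating the disparity map of a rectified stereo pair, where depth is z = f·B/d. The sketch below illustrates that relation and its first-order sensitivity to a disparity error, |Δz| ≈ z²/(f·B)·|Δd|, obtained by differentiating z with respect to d. The focal length (2400 px), baseline (100 mm) and disparity error (0.25 px) are hypothetical values chosen for round numbers, not the parameters of the prototype, and the function names are illustrative.

```python
from typing import Tuple


def triangulate_rectified(x_l: float, y_l: float, x_r: float,
                          focal_px: float, baseline_mm: float,
                          cx: float = 0.0, cy: float = 0.0) -> Tuple[float, float, float]:
    """Return (x, y, z) in mm in the left-camera frame for a rectified pair.

    x_l, y_l and x_r are pixel coordinates; cx, cy is the principal point.
    """
    disparity = x_l - x_r
    if disparity <= 0.0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_mm / disparity
    x = (x_l - cx) * z / focal_px
    y = (y_l - cy) * z / focal_px
    return x, y, z


def depth_error_mm(z_mm: float, focal_px: float, baseline_mm: float,
                   disparity_error_px: float) -> float:
    """First-order depth error caused by a given disparity error."""
    return z_mm ** 2 / (focal_px * baseline_mm) * disparity_error_px


# Hypothetical camera parameters, not taken from the paper.
FOCAL_PX = 2400.0
BASELINE_MM = 100.0

point_mm = triangulate_rectified(300.0, 120.0, 100.0, FOCAL_PX, BASELINE_MM)
error_mm = depth_error_mm(point_mm[2], FOCAL_PX, BASELINE_MM, 0.25)
print(point_mm, error_mm)
```

Because the error grows with the square of the distance, an accuracy figure such as the one reported here applies only near the stated working distance. If the disparities come from OpenCV's StereoSGBM, note that its `compute` method returns 16-bit fixed-point values scaled by 16, so they would be divided by 16.0 before being used in this way.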
An Epipremnum aureum plant was used as the experimental sample to demonstrate the feasibility of the integrated system. The experiment was carried out under laboratory conditions with a tungsten lamp. First, the spectrum of a standard diffuse reflector was acquired as the reference spectrum. Then, the average spectra of the plant were obtained with the prototype and with the ASD spectrometer at the same position. Finally, the reflectance was calculated as the ratio of the plant spectrum to the reference spectrum. Figure 7 illustrates the results of the experiment. Figure 7b shows the disparity map, from which the depth information is generated. For height measurement, several points on the plant were selected as measuring targets. As shown in Figure 7c, the prototype measurements are strongly correlated with the manual measurements (adjusted R² = 0.998). For spectral measurement, because the prototype and the ASD spectrometer had similar FOV and spectral resolution in the 450-790 nm range, the two measured spectra almost overlap, with a root mean squared error (RMSE) of 1.34% over this range.
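The reflectance and RMSE computations described above can be sketched as follows. This is a minimal illustration with made-up three-band spectra, not the measured data. It assumes that both instruments have already been resampled to the same wavelengths and that no dark-signal correction is needed; the function names are hypothetical.

```python
import math
from typing import List, Sequence


def reflectance(sample: Sequence[float], reference: Sequence[float]) -> List[float]:
    """Band-wise ratio of the sample spectrum to the reference spectrum."""
    if len(sample) != len(reference):
        raise ValueError("spectra must have the same number of bands")
    return [s / r for s, r in zip(sample, reference)]


def rmse(first: Sequence[float], second: Sequence[float]) -> float:
    """Root mean squared difference between two equally sampled spectra."""
    if len(first) != len(second):
        raise ValueError("spectra must have the same number of bands")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(first, second)) / len(first))


# Made-up raw counts for three bands (illustrative only).
plant_counts = [120.0, 300.0, 450.0]
reference_counts = [1000.0, 1000.0, 1000.0]
prototype_reflectance = reflectance(plant_counts, reference_counts)

# Made-up reflectance from a second instrument at the same bands.
other_reflectance = [0.13, 0.29, 0.45]
spectral_rmse = rmse(prototype_reflectance, other_reflectance)
print(prototype_reflectance, f"{100.0 * spectral_rmse:.2f}%")
```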

CONCLUSION
In this study, we proposed a high throughput prototype that can simultaneously obtain plant height and hyperspectral information. The spectral range is 450-790 nm with a resolution of 3.1 nm@600 nm, and the depth accuracy is ±3.15 mm at 1.2 m. The hyperspectral and depth measurements are based on grating dispersion and binocular stereo vision, respectively. However, the prototype still has several problems to be solved. Because the quantum efficiency and the width of the spectral detector are limited, only a narrow spectral range can be obtained. In addition, because the 3D point cloud is recovered from only two perspectives, some plant structures are lost to partial occlusion.
Combining different types of information can offer multiple traits and open up new possibilities in crop monitoring. Developing systems that integrate hardware and software is therefore an emerging trend, because integration ensures that the data from each sensor are matched to the same target at the area or plant scale, and even at the point scale. As sensing technologies develop, agriculture urgently needs systems that deliver information in a timely manner, cover large areas, have sufficient spectral resolution, carry multiple types of data, and have reasonable costs. Hence, the integrated system proposed in this study is promising.