AUTOMATIC TEXTURE MAPPING METHOD FOR 3D MODELS TO CIRCUMVENT OCCLUSION THROUGH 3D SPATIAL THROUGH-VIEW RELATIONSHIPS BETWEEN IMAGES, MODELS, AND POINT CLOUDS

Photorealistic 3D models play an important role in various applications related to smart cities. Texture mapping is the last step of 3D reconstruction, and the texture of a model directly determines its visual quality and realism. In complex city scenes, objects often occlude one another to varying degrees, and traditional methods that circumvent occlusion using only image or model data are not very effective. In this paper, we propose an automatic texture mapping method for 3D models that circumvents occlusion through the 3D spatial through-view relationships between images, models, and point clouds. First, the density and voxel resolution of the point clouds are calculated; the point clouds can be either dense matching point clouds or laser point clouds. The point clouds are then voxelized, and the resulting voxels reflect the spatial location of the point clouds. Second, the spatial through-view relationship between the image, the model, and the voxels is used to determine whether a voxel occludes the line of sight between the image and the model vertices, and the unobstructed images are retained. Finally, the data term of the Markov random field is optimized by adding a shading (occlusion) function to the original data term to evaluate the occlusion condition, and on this basis the most suitable image is selected for texture mapping of each triangular face of the model. This computation is repeated until the unobstructed texture mapping of the entire model is complete. The public test datasets released by the International Society for Photogrammetry and Remote Sensing (ISPRS) and several urban area datasets collected by ourselves are used to test the performance of the proposed method. Both qualitative and quantitative comparisons with existing texture mapping algorithms are conducted and presented. Experimental results indicate that our method can effectively circumvent occlusion in automated texture mapping and generate 3D models with fewer obstructed textures, which largely improves the visual quality and realism of the models.


INTRODUCTION
A highly realistic 3D city model is the key foundation for the construction of a digital twin city (Li et al., 2022). Building 3D models in particular play an important role in many applications, such as urban space simulation and refined urban management (Haala and Kada, 2010; Wu et al., 2018). Compared with traditional two-dimensional data, a realistic 3D model has incomparable advantages in expressing the actual geometry and spatial location of objects. As cities develop, their overall spatial layout is constantly changing from a two-dimensional layout to a three-dimensional one, which leads to varying degrees of spatial occlusion between objects in the city. In complex urban scenes, due to the natural limitations of non-contact remote acquisition of aerial or ground images, or the influence of the complex environment around a building, the texture images of the building 3D model are often obscured by surrounding trees or other objects. Such images cannot accurately portray the real surface texture of the building and lead to texture mapping errors.

As the last part of 3D reconstruction, high-definition texture mapping is an important means of improving the quality of building 3D models and directly affects their visualization effect and realism. Point clouds are widely used in 3D reconstruction and are easy to obtain. The acquisition and production methods are relatively mature for both 3D laser scanning point clouds based on laser scanners (Niemeyer et al., 2014) and dense matching point clouds generated from images (Rhee and Kim, 2016). Point clouds record the 3D coordinates of all points on the surface of an object in a unified coordinate system and therefore express the spatial geometry of the object more completely, so the texture mapping process can be optimized using point clouds.
To address the occlusion problem in texture mapping of building 3D models, this paper proposes a method that uses 3D point clouds to determine the occlusion relationship between building models and candidate texture images. First, a resolution-adaptive voxel occlusion generation method is proposed, which voxelizes the point cloud after calculating the voxel resolution from the number of points and the bounding box information. Then, an occlusion judgment and image filtering method based on the spatial through-view relationship is proposed, which filters the images based on the three-dimensional through-view relationship among the image projection center, the model, and the voxels. Finally, the data term of the Markov random field is optimized based on the occlusion function, and the most suitable image is selected for each face of the model for texture mapping. This results in a more realistic 3D model.

The rest of this paper is organized as follows. Section 2 briefly reviews existing texture mapping methods and their approaches to circumventing occlusion. The proposed method and its key steps are described in detail in Section 3. Section 4 presents experiments using images from different platforms or fused from multiple platforms, laser point clouds, and dense image matching point clouds, and performs qualitative and quantitative comparisons with existing texture mapping algorithms. Conclusions and discussion are given in Section 5.

RELATED WORK
With the progress of modern science and technology, photogrammetry and computer vision have developed rapidly, and real-world 3D models are applied more and more widely. Image-based 3D reconstruction can be understood as four algorithmic steps: Structure from Motion, Multi-View Stereo, surface mesh reconstruction (Snavely et al., 2006), and texture mapping. Numerous scholars have designed 3D reconstruction tools that cover different steps, such as VisualSfM (Wu and Ieee, 2013), Bundler (Snavely, 2010), and MVE (Fuhrmann et al., 2015).

Texture mapping, as the final step of 3D reconstruction, is one of the most widely used key techniques in computer vision and photogrammetry. The idea of texture mapping was first introduced by Catmull in 1974 (Catmull, 1974). Later, Sinha designed an interactive system to texture large 3D models with flat surfaces (Sinha, 2008), and Tan et al. also proposed an interactive approach focused on texture mapping of building surfaces (Tan et al., 2008). By applying pairwise Markov random fields to texture mapping, a view can be selected for each model face, with the data term used to determine the quality of the texture view (Lempitsky et al., 2007). Different formulations of the data term lead to different image selection behavior (Gal et al., 2010), for example using normal vector angles or projected gradient information (Waechter et al., 2014).

Early texture mapping methods for circumventing occlusion relied on user interaction to mark the occluding objects (Sinha, 2008). As research on automated methods progressed, two main approaches emerged: one based on image restoration and filtering, and one based on judging the 3D relationship between images and the model. Among the image-based methods, some iteratively screen images by color information (Grammatikopoulos et al., 2012), some detect repeated regions for substitution (Zhou et al., 2016), and some use deep learning to detect occluded regions (Ronneberger et al., 2015) and then repair them (Yu et al., 2019; He et al., 2021). Substitution and repair methods do not fundamentally solve the occlusion problem, and their results differ from the actual texture. Judgment methods based on the 3D relationship between images and the model often map images of geometrically missing objects onto the surrounding model, because the reconstructed model geometry is not complete enough.

To verify the effectiveness of the segmentation method proposed in this paper, an RF classifier is used to distinguish the classes of the clusters utilizing five kinds of features.

METHODOLOGY
The method in this paper is divided into three main steps. The first step is the resolution-adaptive point clouds voxel generation method. The second step is the occlusion judgment and image filtering method based on the spatial through-view relationship. The final step is the texture preference method for mixing texture quality and shading relationships. The specific process is shown in Figure 1.

3.1.1 Point clouds preprocessing. Both laser point clouds and dense image matching point clouds often contain redundant data from the acquisition process, so the point clouds need to be pre-processed. Point clouds with large coverage must first be segmented to obtain the points within the experimental range. The point clouds are then aligned with the 3D model that needs texture mapping so that both are in the same coordinate system, which is required to judge the spatial through-view relationships. Dense point clouds additionally require noise reduction and downsampling. Noise reduction eliminates isolated points so that the object geometry is better represented, and downsampling improves the efficiency of the algorithm by reducing the number of points without changing the geometry.

3.1.2 Point clouds voxelization. In this paper, we design a resolution-adaptive voxel occlusion generation method. The bounding box of the pre-processed point clouds is computed and combined with the point cloud density to calculate the adaptive voxel resolution. The point clouds are then spatially divided according to this resolution, all voxels are traversed to check whether they contain points, and all valid voxels are collected into a set that effectively expresses the morphology of the point clouds. The bounding box corresponding to the original point clouds data is calculated from the 3D spatial coordinates of each point, where $N = 0, 1, 2, \ldots, n$. (3)

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B2-2022 XXIV ISPRS Congress (2022 edition), 6-11 June 2022, Nice, France

This method can effectively calculate the adaptive voxel resolution, and the spatial partitioning of the point clouds data can be performed according to the adaptive resolution.
The number of voxels along each axis can be calculated from the adaptive voxel resolution and the extreme points along the three axes, where $X_{size}$, $Y_{size}$, and $Z_{size}$ are the numbers of voxels along the X, Y, and Z axes, respectively.
The total number of voxels in the bounding box of the point clouds, $BBX_{size}$, is then obtained from the numbers of voxels in the X, Y, and Z directions:
$BBX_{size} = X_{size} \times Y_{size} \times Z_{size}$ (5)
The bounding box is divided along the X, Y, and Z axes according to the voxel resolution to obtain all voxels of the point clouds, and all voxels are stored in the same set. By traversing all voxels to check whether they contain points, the voxels containing points are filtered out. The resulting voxel set is the result of point clouds voxelization, which expresses the geometry of the original large number of points with far fewer voxels.
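The voxelization step can be illustrated with a minimal sketch. It assumes the voxel resolution is already known (here it is simply passed in as the hypothetical parameter `resolution`, since the adaptive formula is not reproduced above) and assumes that the number of voxels per axis is the axis extent divided by the resolution, rounded up. The function name `voxelize` and the sample points are illustrative only.

```python
import math

def voxelize(points, resolution):
    """Return (occupied voxel indices, voxels per axis, bounding-box minimum)."""
    xs, ys, zs = zip(*points)
    mins = (min(xs), min(ys), min(zs))
    maxs = (max(xs), max(ys), max(zs))
    # Assumed rule: extent / resolution, rounded up, at least one voxel per axis.
    size = tuple(max(1, math.ceil((hi - lo) / resolution))
                 for lo, hi in zip(mins, maxs))
    occupied = set()
    for p in points:
        # Integer voxel index per axis; points on the maximum face are clamped
        # into the last voxel.
        idx = tuple(min(int((c - lo) / resolution), n - 1)
                    for c, lo, n in zip(p, mins, size))
        occupied.add(idx)
    return occupied, size, mins

points = [(0.0, 0.0, 0.0), (0.4, 0.1, 0.2), (1.9, 0.9, 0.5), (2.0, 1.0, 1.0)]
occupied, (x_size, y_size, z_size), origin = voxelize(points, resolution=0.5)
bbx_size = x_size * y_size * z_size  # Equation (5)
print(bbx_size, sorted(occupied))
```

In this toy example the bounding box holds 16 voxels, but only 2 of them contain points, which is the sense in which the valid voxel set summarizes the point clouds with far fewer elements.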

Occlusion judgment and image filtering method based on the spatial through-view relationship
Image-based 3D reconstruction recovers the interior and exterior orientation elements of each image during the SFM solution, mainly including the focal length, projection center coordinates, rotation matrix parameters, and other related information, which can be used to determine the position and pose of the image in the spatial rectangular coordinate system. By parsing the data generated in SFM, the projection center $I_O(X_o, Y_o, Z_o)$ corresponding to each image $I$ is extracted. The mesh model records the 3D coordinates of all points in the model, and the indices of every three neighboring points constitute a triangular face, based on which the position data of the points and faces in the mesh model can be extracted. The three vertices corresponding to a face $F$ are $F_1(X_{F1}, Y_{F1}, Z_{F1})$, $F_2(X_{F2}, Y_{F2}, Z_{F2})$, and $F_3(X_{F3}, Y_{F3}, Z_{F3})$. Three line segments $L_{OF_1}$, $L_{OF_2}$, and $L_{OF_3}$ are generated in three-dimensional space by connecting the three vertices $F_1$, $F_2$, and $F_3$ of a single face to the projection center $I_O$ corresponding to each image $I$.
If a line segment intersects a voxel face, it is recorded as a collision, and the position of the intersection in three-dimensional space is calculated as $S_1(X_{s1}, Y_{s1}, Z_{s1})$.
When all voxels have been traversed, the total number of collisions is recorded as $S$, and the intersection points are $S_n$ ($n = 0, 1, 2, \ldots$). The overall process of determining the spatial through-view relationship is shown in Figure 2.
After all point cloud voxels have been traversed, a threshold is used to determine the spatial relationship between the image projection center, the vertices of the model face, and the point cloud voxels. In practice, the threshold is set to 2, for the following reason. Because the point clouds and the model have been aligned to the same coordinate system, the voxels generated from the point clouds contain all points, and the 3D model is therefore also contained inside the voxels. The line segment connecting a model vertex to the image projection center always passes through the voxel in which the model vertex is located and produces an intersection, so the collision count is at least 1. When there is an occluder between the model vertex and the image projection center, the voxels generated from the point clouds of the occluder also collide with the line segment, which leads to a collision count greater than or equal to 2. Therefore, when the collision count is less than the threshold, the image projection center and the model vertex maintain a through-view relationship and there is no occlusion; when the collision count is greater than or equal to the threshold, the through-view relationship cannot be maintained and there is an occlusion between them.
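The collision counting and threshold test can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each voxel is an axis-aligned cube addressed by an integer index, uses the standard slab test for segment-box intersection, and treats a face as unobstructed only if all three of its vertices pass the test (the combination rule is an assumption). All function names are hypothetical.

```python
def segment_hits_box(p0, p1, box_min, box_max):
    """Slab test: does the segment p0 -> p1 intersect the axis-aligned box?"""
    t_enter, t_exit = 0.0, 1.0
    for a in range(3):
        d = p1[a] - p0[a]
        if abs(d) < 1e-12:
            # Segment is parallel to this slab: it must already lie inside it.
            if p0[a] < box_min[a] or p0[a] > box_max[a]:
                return False
            continue
        t0 = (box_min[a] - p0[a]) / d
        t1 = (box_max[a] - p0[a]) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_enter, t_exit = max(t_enter, t0), min(t_exit, t1)
        if t_enter > t_exit:
            return False
    return True

def count_collisions(center, vertex, voxels, origin, resolution):
    """Count occupied voxels crossed by the segment from the projection center to a vertex."""
    count = 0
    for index in voxels:
        lo = tuple(o + n * resolution for o, n in zip(origin, index))
        hi = tuple(c + resolution for c in lo)
        if segment_hits_box(center, vertex, lo, hi):
            count += 1
    return count

def is_unobstructed(center, face, voxels, origin, resolution, threshold=2):
    """Assumed rule: a face is visible if every vertex has fewer collisions than the threshold."""
    return all(count_collisions(center, v, voxels, origin, resolution) < threshold
               for v in face)

# One voxel holds the face itself; a second voxel two units above it is an occluder.
voxels = {(0, 0, 0), (0, 0, 2)}
origin, resolution = (0.0, 0.0, 0.0), 1.0
face = [(0.2, 0.2, 0.5), (0.8, 0.2, 0.5), (0.5, 0.8, 0.5)]
camera_above = (0.5, 0.5, 5.0)  # looks down through the occluder voxel
camera_side = (5.0, 0.5, 0.5)   # looks in from the side, clear line of sight

print(is_unobstructed(camera_above, face, voxels, origin, resolution))
print(is_unobstructed(camera_side, face, voxels, origin, resolution))
```

Looping over every occupied voxel keeps the sketch short; a practical implementation would more likely walk only the voxels along the segment or use a spatial index.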
3.3 Texture preference method for mixing texture quality and shading relationships

3.3.1 Occlusion function. From the perspective of texture mapping in 3D reconstruction, occlusion means that another object lies between the viewpoint and the observed model along the viewing direction, so that the texture of the object in front is wrongly mapped onto the model behind it. A depth conflict check based on spatial location is performed among the image, the model, and the point cloud voxels, and the result of the spatial through-view test is used to judge whether there is occlusion between the image and the triangular face. However, this is only a qualitative description, and a quantitative evaluation of each image is needed. Here, let the quality value of each image be $V_{I_i}$ and the quality value of each model triangular face be $V_{F_j}$. Based on the result of the spatial through-view test, if the number of collisions $S < 2$, then let
$V_{I_i} = V_{F_j}$ (10)
Equation (10) expresses that when the number of collisions between the point cloud voxels and the line segments connecting the image projection center to each vertex of the model triangular face is less than 2, the quality value of the image equals the quality value of the triangular face, which is mathematically a one-to-one mapping between the two. In this paper, we evaluate the occlusion status of each image relative to the model face by introducing a new function. Because the actual occlusion status has only two cases, with and without occlusion, and any degree or proportion of occlusion is regarded as occlusion, we define the occlusion function $E_{Cover}$ as follows.
$E_{Cover} = [V_{I_i} = V_{F_j}]$ (11)
Here $[\cdot]$ is the Iverson bracket, which evaluates to 1 if the condition inside the bracket is satisfied and 0 otherwise, thereby converting a Boolean value to an integer. The expression can be interpreted as follows: $E_{Cover}$ is 0 if there is no occlusion between image $I_i$ and face $F_j$, and $E_{Cover}$ is 1 if there is an occlusion between them. The value of $E_{Cover}$ thus quantitatively evaluates the occlusion relationship.
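As a small illustration of the binary occlusion function, the sketch below maps a collision count to an integer indicator. It follows the interpretation stated in the text (0 without occlusion, 1 with occlusion) and assumes the threshold of 2 from Section 3.2; the function name `e_cover` and the sample counts are hypothetical.

```python
def e_cover(collision_count, threshold=2):
    """Binary occlusion indicator: 1 if the image is occluded for this face, else 0.

    Assumption: occlusion is declared when the collision count reaches the threshold.
    """
    return int(collision_count >= threshold)

# Hypothetical collision counts of one face against three candidate images.
collision_counts = {"image_a": 1, "image_b": 2, "image_c": 4}
indicators = {name: e_cover(s) for name, s in collision_counts.items()}
print(indicators)
```

Any count of 2 or more yields the same value, which reflects the statement that every degree of occlusion is treated simply as occlusion.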

3.3.2 Markov Random Field Optimization. A Markov random field describes a set of objects in which each object depends only on the properties of its neighboring objects and not on the properties of objects in other regions. Building on this concept, Lempitsky and Ivanov applied the Markov random field to texture mapping by selecting a view for each triangular face, using the data term to determine the quality of the texture view and the smoothness term to model the severity of the seams between textured faces. The expressions are as follows.
This method uses the Sobel operator to compute the gradient magnitude of the image and evaluates each triangular face by summing all pixels of the gradient magnitude image within the projection of the face onto the image. In this paper, the data term is further optimized by adding the occlusion function $E_{Cover}$, which evaluates the occlusion status of each image determined above, to the original data term. The optimized data term is:
$E_Q(M) = E_{Data} + E_{Cover}$ (14)
In the optimized data term, the original term $E_{Data}$ filters images by normal vector angle and projected gradient information, and $E_{Cover}$ filters them by the occlusion condition of the image, so that the most suitable image is selected for texture mapping of the triangular face. The other triangular faces are processed in the same way, so that each face has a corresponding texture image. Neighboring faces that use the same image compute the texture coordinates of their vertices in that image, so their texture patches are merged, which reduces the time needed for subsequent texture seam processing. The texture is then leveled and color-adjusted to complete the whole texture mapping process, and a highly realistic 3D model is output.

Experiment 3 uses the public Dortmund experiment dataset provided by ISPRS, from which 20 images of the ground platform acquired by a Sony Nex-7 and 21 images of the air platform acquired by the multi-rotor aircraft DJI S800 with a Sony Nex-7 were selected, for a total of 41 images. The point clouds are dense matching point clouds generated from these 41 images, and the 3D model was generated by an existing 3D reconstruction algorithm. The specific experimental data are shown in Table 1.

Results for the Jiaxing dataset
In Experiment 1, the experiment is first conducted using six images captured by a DJI Phantom 4 RTK UAV. Based on the six images, the camera poses and the interior and exterior orientation elements are recovered by SFM, and the dense point clouds of the scene, containing 95,850,883 points, are then produced by MVS dense reconstruction. The occlusion voxels are generated from the pre-processed dense matching point clouds, and the most suitable image is selected to complete texture mapping based on the spatial through-view and mapping relationships among the image, model, and voxels. The results of the method in this paper are compared with those of the open-source algorithm mvs-texturing in Fig. 3. The red boxes mark the parts of the mvs-texturing result where occlusion remains, with textures such as tree branches mapped onto the tile model of the gazebo, and the green boxes mark the result at the same locations using the method of this paper. Comparing the visualizations of the textured 3D models shows that the method of this paper performs better than mvs-texturing. For the quantitative comparison, two-dimensional images of the models generated by the two methods were acquired from the same camera angle, with an image size of 466*219 pixels. Two regions containing a certain degree of occluded texture, region one and region two in Figure 3, were selected; region one is 130*70 pixels and region two is 75*43 pixels. The details are shown in Table 2.

Table 2 Occlusion status statistics after texture mapping using dense point clouds in Experiment 1

According to the counted number of occluded pixels, the mvs-texturing method leaves 3864 and 893 occluded pixels in the two regions after mapping, while the method of this paper leaves 461 and 56. The percentage of occluded pixels is reduced from 42.46% and 27.69% to 5.07% and 1.74%, so the effect of circumventing occlusion is greatly improved.

The above experiments are conducted using image-based dense matching point clouds. They are followed by experiments using laser point clouds captured by a GeoSlam handheld laser scanner. These experiments use 32 images captured by a DJI Phantom 4 RTK and their corresponding image information, the laser point clouds data, and the model to be texture mapped.

Figure 4 Comparison of texture mapping results using laser point clouds in Experiment 1

The mvs-texturing method still cannot circumvent the occlusion, and the texture of the tree branch is still incorrectly mapped onto the model. In the experiments based on the method of this paper, voxelizing the laser point clouds data allows the spatial through-view relationship judgment and depth conflict check to circumvent the occlusion effectively. Two-dimensional images of the models generated by the two methods were acquired from the same camera angle, with an image size of 486*314 pixels, and all occluded pixels in the images were counted. The details are shown in Table 3.

                                      mvs-texturing   our method
Number of occluded pixels             12764           169
Total percentage of occluded pixels   8.36%           0.11%

Table 3 Statistics of occlusion condition after texture mapping using laser point clouds in Experiment 1

Over the whole image, the number of occluded pixels is 12,764 after mapping with the mvs-texturing method and 169 after mapping with the method of this paper, which is a large reduction. Correspondingly, the percentage of occluded pixels is 8.36% for the mvs-texturing method and 0.11% for the method of this paper.
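The percentages in Table 3 follow directly from the pixel counts and the image size. The short check below is only an illustration of that arithmetic; the helper name `occlusion_percentage` is hypothetical.

```python
def occlusion_percentage(occluded_pixels, width, height):
    """Share of occluded pixels in a width x height image, in percent."""
    return 100.0 * occluded_pixels / (width * height)

width, height = 486, 314  # image size reported for the laser point cloud test
baseline = occlusion_percentage(12764, width, height)
ours = occlusion_percentage(169, width, height)
print(f"mvs-texturing: {baseline:.2f}%  our method: {ours:.2f}%")
# prints "mvs-texturing: 8.36%  our method: 0.11%"
```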

Results for the Shenzhen University dataset
In Experiment 2, 14 ground-platform images of the Science and Technology Building of Shenzhen University were acquired using a Canon EOS80D DSLR camera, and laser point clouds data within the survey area were acquired using a GeoSlam handheld laser scanner. The images and their related information, the laser point clouds data, and the model to be texture mapped are used as input for the experiment.

(a) Results of mvs-texturing (b) Results of our method
Figure 5 Comparison of the results of the mvs-texturing and our method in Experiment 2

With the mvs-texturing method, a large number of tree foliage textures are incorrectly mapped onto the side of the building behind the trees, while the method in this paper effectively circumvents the tree foliage occlusion in texture mapping. Two-dimensional images of the models generated by both methods were acquired from the same camera angle, with an image size of 445*446 pixels, and all occluded pixels in the images were counted, as shown in Table 4.

                                      mvs-texturing   our method
Number of occluded pixels             82469           112
Total percentage of occluded pixels   41.55%          0.06%

Table 4 Statistics of occlusion condition after texture mapping using laser point clouds in Experiment 2

Over the whole image, the number of occluded pixels is 82,469 after mapping with the mvs-texturing method and 112 after mapping with the method of this paper, which is a large reduction. Correspondingly, the percentage of occluded pixels is 41.55% for the mvs-texturing method and 0.06% for the method of this paper; the canopy texture on the building façade is well eliminated in the screening process.

Results for the Dortmund dataset
In Experiment 3, 20 images of the ground platform and 21 images of the air platform from the public Dortmund experiment dataset provided by ISPRS were used, together with the dense matching point clouds generated from these 41 images. The results of the experiment are shown in Figure 6. The mvs-texturing method retains a large number of erroneous tree textures on the mapped building surface, while the method in this paper effectively removes them. Two-dimensional images of the models generated by both methods were acquired from the same camera angle, with an image size of 490*554 pixels, and all occluded pixels in the images were counted, as shown in Table 5. Over the whole image, the number of occluded pixels is 81,464 after mapping with the mvs-texturing method and 511 after mapping with the method of this paper. The number of occluded pixels is greatly reduced, although some wrong textures remain in the upper right of the building. The percentage of occluded pixels is 30.01% for the mvs-texturing method and 0.19% for the method of this paper, so the effect of circumventing occlusion is greatly improved.
(a) Results of mvs-texturing (b) Results of our method Figure 6 Comparison of the results of the mvs-texturing and our method in Experiment 3

                                      mvs-texturing   our method
Number of occluded pixels             81464           511
Total percentage of occluded pixels   30.01%          0.19%

Table 5 Occlusion status statistics after texture mapping using dense matching point clouds in Experiment 3

4.5 Experimental analysis

4.5.1 Validity of the method. Based on the visualizations of the experiments shown in Figs. 3 to 6, the method in this paper achieves automated texture mapping that circumvents occlusion, with a clear improvement in results. For both the independently collected images and the public Dortmund experiment dataset provided by ISPRS, occluders ranging from the smallest treetop branches to the trunk and canopy of a whole tree are effectively removed, which makes the model more realistic and improves its visualization.

4.5.2 Effect of image platform sources. In terms of the image data used in the three sets of experiments, Experiment 1 used 32 images collected by the UAV on the air platform, Experiment 2 used 14 images collected by the Canon EOS80D DSLR camera on the ground platform, and Experiment 3 selected 20 ground platform images and 21 air platform images from the public Dortmund experiment dataset provided by ISPRS to fuse the air and ground platforms. The results of the comparison experiments show that images acquired by the air platform alone, images acquired by the ground platform alone, and images fused from the air and ground platforms can all be used to complete automatic occlusion-avoiding texture mapping of buildings with the method in this paper. The method places no limitation on the source of the images and achieves good occlusion avoidance in each case.

4.5.3 Effect of different point clouds sources. Two types of point clouds are used in the three groups of experiments in this paper: laser point clouds and dense matching point clouds. Laser point clouds are used in two experiments. In Experiment 2, the percentage of occluded texture decreased from 41.55% to 0.06%, and in the laser point cloud test of Experiment 1, it decreased from 8.36% to 0.11%, so the effect of avoiding occlusion was significant in both cases. Dense matching point clouds are also used in two experiments. In Experiment 3, which uses the public Dortmund experiment dataset provided by ISPRS, the percentage of occluded texture decreased from 30.01% to 0.19%. In Experiment 1, the dense matching point clouds were generated from the independently acquired image data, and the percentages in the two statistical regions decreased from 42.46% and 27.69% to 5.07% and 1.74%. Although the percentage of occlusion decreases, the decrease is smaller than in the two experiments conducted with laser point clouds and in the Dortmund experiment with dense matching point clouds, and some obvious occluded textures remain visible in the result. Comparing the experimental data shows that, because only 6 images were used for dense reconstruction in Experiment 1, the generated dense point clouds did not represent the object geometry well. The morphology of the dense matching point clouds used in Experiment 1 is compared with that of the laser point clouds in Figure 7. When more images are involved in the dense reconstruction, as in Experiment 3, where 41 images from the air and ground platforms are fused to generate the dense matching point clouds, occlusion is avoided much more effectively.

(a) Dense matching point clouds (b) Laser point clouds
Figure 7 Comparison of dense point clouds and laser point clouds morphology in Experiment 1

CONCLUSION
To address the occlusion problem in the texture mapping of building 3D models, this paper proposes a method that uses 3D point clouds to judge the occlusion relationship between building models and candidate texture images. First, a resolution-adaptive voxel occlusion generation method is proposed, which voxelizes the point clouds based on a calculated adaptive voxel resolution. Then, an occlusion judgment and image filtering method based on the spatial through-view relationship is proposed, which screens the images based on the spatial through-view relationship among the image projection center, the model vertices, and the voxels. Finally, the data term of the Markov random field is optimized based on the occlusion function, and the most suitable image is selected for each face of the model for texture mapping. The main innovation of this method is that it overcomes the inherent limitation of current approaches, which detect and repair occluders directly in 2D images, by shifting to a 3D spatial perspective. Experiments are conducted using three sets of data for quantitative and qualitative evaluation of the proposed method. The results show that good occlusion avoidance is achieved with different types of point clouds and with images from different platform sources, so the method can effectively support cross-platform, multi-source image and point clouds data. The related technologies can be widely used in smart cities, digital twins, and other fields to support the construction of realistic 3D models.