doi: 10.17586/2226-1494-2024-24-5-843-848


Single images 3D reconstruction by a binary classifier

R. A. Sallama


Article in English

For citation:
Resen S.A. Single images 3D reconstruction by a binary classifier. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2024, vol. 24, no. 5, pp. 843–848. doi: 10.17586/2226-1494-2024-24-5-843-848


Abstract
Intelligent systems must interact with a variety of complex environments; for example, a robot may need to navigate complicated geometric structures. Accurate geometric reasoning is required to identify the objects in a scene properly. 3D reconstruction is a complex problem that typically requires massive numbers of images. This paper proposes an intelligent system for 3D reconstruction from a single 2D image. It introduces a learnable reconstruction context that uses image features to drive the synthesis. The proposed method encodes feature labels as input to a classifier, drawing on that information to make better decisions. A Binary Classifier Neural Network (BCNN) classifies whether a query point lies inside or outside the object. The reconstruction system models an object's 3D structure and learns the feature filter parameters; the geometry and the corresponding features are updated implicitly through the loss function. Training does not require compressed supervision to visualize the reconstructed shapes and texture transfer. A point-set network flow gives the BCNN a comparably low memory footprint, and the method is not restricted to specific classes for which templates are available. Accuracy measurements show that the model can extend the occupancy encoder with a generative model, which does not require an image condition and can be trained unconditionally. Increasing training time adds more neurons and weight parameters, which risks overfitting.
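The core idea, a binary classifier that takes a 3D query point together with image-derived features and predicts whether the point lies inside the object, can be sketched as a small untrained MLP. This is an illustrative sketch only; the layer sizes, names, and activations here are assumptions, not the paper's actual BCNN architecture:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class OccupancyClassifier:
    """Tiny MLP mapping a 3D query point plus an image feature
    vector to the probability that the point is inside the object."""

    def __init__(self, feat_dim=16, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        in_dim = 3 + feat_dim                       # [x, y, z] + image features
        self.W1 = rng.normal(0.0, 0.1, (in_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.1, (hidden, 1))
        self.b2 = np.zeros(1)

    def predict(self, points, feats):
        # points: (N, 3) query coordinates, feats: (N, feat_dim)
        x = np.concatenate([points, feats], axis=1)
        h = np.tanh(x @ self.W1 + self.b1)
        return sigmoid(h @ self.W2 + self.b2).ravel()   # (N,) probabilities

# Query a batch of random points, all conditioned on one image feature vector
clf = OccupancyClassifier()
pts = np.random.default_rng(1).uniform(-1.0, 1.0, (5, 3))
feat = np.tile(np.full(16, 0.1), (5, 1))
probs = clf.predict(pts, feat)
inside = probs > 0.5            # binary inside/outside decision per point
```

In a real system the weights would be trained with a binary cross-entropy loss against ground-truth occupancy labels, and evaluating the classifier on a dense grid of query points followed by an isosurface extraction (e.g. marching cubes) would yield the reconstructed mesh.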

Keywords: intelligent systems, 3D reconstruction, feature filter, convolutional neural networks, Binary Classifier Neural Network (BCNN)




This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
Copyright 2001-2024 ©
Scientific and Technical Journal
of Information Technologies, Mechanics and Optics.
All rights reserved.
