doi: 10.17586/2226-1494-2024-24-5-843-848


Single images 3D reconstruction by a binary classifier

R. A. Sallama


Article in English

For citation:
Resen S.A. Single images 3D reconstruction by a binary classifier. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2024, vol. 24, no. 5, pp. 843–848. doi: 10.17586/2226-1494-2024-24-5-843-848


Abstract
Intelligent systems must interact with a variety of complex environments; for example, a robot may need to navigate complicated geometric structures. Accurate geometric reasoning is required to identify the objects in a scene properly. 3D reconstruction is a complex problem that typically requires massive numbers of images. This paper proposes an intelligent system for 3D reconstruction from a single 2D image. It introduces a learnable reconstruction context that uses image features to drive the synthesis. The proposed method encodes feature labels as input to a classifier, drawing on that information to make better decisions. A Binary Classifier Neural Network (BCNN) classifies whether a query point lies inside or outside the object. The reconstruction system models an object's 3D structure and learns the feature filter parameters; the geometry and the corresponding features are updated implicitly through the loss function. Training does not require compressed supervision to visualize the reconstructed shapes and texture transfer. A point-set network flow gives the BCNN a comparably low memory footprint, and the method is not restricted to specific classes for which templates are available. Accuracy measurements show that the model can extend the occupancy encoder with a generative model, which does not require an image condition and can be trained unconditionally. Increasing training time adds more neurons and weight parameters, which risks overfitting.
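The core idea, a binary classifier that takes a 3D query point together with image-derived features and predicts whether the point lies inside the object, can be sketched as a small untrained MLP. This is an illustrative sketch only; the layer sizes, names, and activations here are assumptions, not the paper's actual BCNN architecture:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class OccupancyClassifier:
    """Tiny MLP mapping a 3D query point plus an image feature
    vector to the probability that the point is inside the object."""

    def __init__(self, feat_dim=16, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        in_dim = 3 + feat_dim                       # [x, y, z] + image features
        self.W1 = rng.normal(0.0, 0.1, (in_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.1, (hidden, 1))
        self.b2 = np.zeros(1)

    def predict(self, points, feats):
        # points: (N, 3) query coordinates, feats: (N, feat_dim)
        x = np.concatenate([points, feats], axis=1)
        h = np.tanh(x @ self.W1 + self.b1)
        return sigmoid(h @ self.W2 + self.b2).ravel()   # (N,) probabilities

# Query a batch of random points, all conditioned on one image feature vector
clf = OccupancyClassifier()
pts = np.random.default_rng(1).uniform(-1.0, 1.0, (5, 3))
feat = np.tile(np.full(16, 0.1), (5, 1))
probs = clf.predict(pts, feat)
inside = probs > 0.5            # binary inside/outside decision per point
```

In a real system the weights would be trained with a binary cross-entropy loss against ground-truth occupancy labels, and evaluating the classifier on a dense grid of query points followed by an isosurface extraction (e.g. marching cubes) would yield the reconstructed mesh.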

Keywords: intelligent systems, 3D reconstruction, feature filter, convolutional neural networks, Binary Classifier Neural Network (BCNN)




This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
Copyright 2001-2024 ©
Scientific and Technical Journal
of Information Technologies, Mechanics and Optics.
All rights reserved.
