We consider the problem of estimating the position and orientation of an active vision sensor that moves with all six degrees of freedom. The proposed approach is based on point features extracted from RGB-D data. This work focuses on efficient point feature extraction algorithms and on methods for managing the set of features in a single RGB-D data frame. The fast RGB-D-based visual odometry system described in this paper retains the general architecture of our previous work, whereas the novel elements introduced here aim to improve the precision and robustness of the motion estimate computed from the matching point features of two RGB-D frames. Moreover, we demonstrate that the visual odometry system can serve as the front-end for a pose-based simultaneous localization and mapping solution. The proposed solutions are tested on publicly available data sets so that the results can be independently verified. The experimental results demonstrate gains due to the improved feature extraction and management mechanisms, and the performance of the whole navigation system compares favorably with results known from the literature.