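During development I found that ray intersection accounts for too much of the project's computation time: a run with ray intersection takes roughly three times as long as a run without it.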

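Previously, to keep development simple, the intersection test reused a single $1 \times 6$ tensor and cast the rays one call at a time. Sample code: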

    // Baseline: one CastRays call per ray, reusing a single 1 x 6 tensor.
    const auto start = (double)clock();

    open3d::t::geometry::RaycastingScene scene;
    const auto& tensor = open3d::t::geometry::TriangleMesh::FromLegacy(part, open3d::core::Float32, open3d::core::Int64);
    scene.AddTriangles(tensor);

    const auto numRays = (int)part.vertices_.size() * 10;
    auto rays = open3d::core::Tensor::Zeros({1, 6}, open3d::core::Float32);

    // Every ray starts at this point and aims at one mesh vertex.
    Eigen::Vector3d position(0.05, 0.5, -0.25);

    for (int idx = 0; idx < numRays; idx++) {
        const auto& vertex = part.vertices_[idx % part.vertices_.size()];
        Eigen::Vector3d rayDir = (vertex - position).normalized();

        // Row layout: origin (x, y, z) followed by direction (x, y, z).
        rays.SetItem({open3d::core::TensorKey::Index(0), open3d::core::TensorKey::Slice(0, 6, 1)},
                     open3d::core::Tensor::Init<float>({float(position.x()), float(position.y()), float(position.z()),
                                                        float(rayDir.x()), float(rayDir.y()), float(rayDir.z())}));
        // Cast a single ray on every iteration.
        auto castResults = scene.CastRays(rays);
    }
    cout << ((double)clock() - start) / CLOCKS_PER_SEC << endl;

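However, Open3D's ray casting actually uses multiple threads under the hood to process the ray tensor, so passing all rays in one call should in theory be faster. The following code was used to test this: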

    // Batched: fill an N x 6 tensor first, then cast all rays in one call.
    const auto start = (double)clock();
    open3d::t::geometry::RaycastingScene scene;
    const auto& tensor = open3d::t::geometry::TriangleMesh::FromLegacy(part, open3d::core::Float32, open3d::core::Int64);
    scene.AddTriangles(tensor);

    const auto numRays = (int)part.vertices_.size() * 10;
    auto rays = open3d::core::Tensor::Zeros({numRays, 6}, open3d::core::Float32);

    // Every ray starts at this point and aims at one mesh vertex.
    Eigen::Vector3d position(0.05, 0.5, -0.25);

    for (int idx = 0; idx < numRays; idx++) {
        const auto& vertex = part.vertices_[idx % part.vertices_.size()];
        Eigen::Vector3d rayDir = (vertex - position).normalized();

        // Write row idx: origin (x, y, z) followed by direction (x, y, z).
        rays.SetItem({open3d::core::TensorKey::Index(idx), open3d::core::TensorKey::Slice(0, 6, 1)},
                     open3d::core::Tensor::Init<float>({float(position.x()), float(position.y()), float(position.z()),
                                                        float(rayDir.x()), float(rayDir.y()), float(rayDir.z())}));
    }
    // Single call; Open3D distributes the rays across threads internally.
    auto castResults = scene.CastRays(rays);

    cout << ((double)clock() - start) / CLOCKS_PER_SEC << endl;
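
In the batched version above, each row is still written through its own `SetItem` call with a temporary tensor. One possible alternative, not measured in this post, is to pack all rays into a plain contiguous buffer first and wrap it in a tensor once. The sketch below shows only the packing step and does not depend on Open3D; `Vec3` and `BuildRayBuffer` are hypothetical names introduced for illustration, and the zero-length handling is my own choice.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Eigen::Vector3d so the sketch compiles on its own.
using Vec3 = std::array<double, 3>;

// Packs numRays rays into a row-major N x 6 float buffer.
// Each row is [ox, oy, oz, dx, dy, dz], the same layout as the code above.
std::vector<float> BuildRayBuffer(const std::vector<Vec3>& vertices,
                                  const Vec3& origin,
                                  std::size_t numRays) {
    assert(!vertices.empty());
    std::vector<float> buffer;
    buffer.reserve(numRays * 6);
    for (std::size_t idx = 0; idx < numRays; ++idx) {
        const Vec3& v = vertices[idx % vertices.size()];
        const double dx = v[0] - origin[0];
        const double dy = v[1] - origin[1];
        const double dz = v[2] - origin[2];
        const double len = std::sqrt(dx * dx + dy * dy + dz * dz);
        // Assumption: a vertex that coincides with the origin gets a zero direction.
        const double inv = len > 0.0 ? 1.0 / len : 0.0;
        buffer.push_back(static_cast<float>(origin[0]));
        buffer.push_back(static_cast<float>(origin[1]));
        buffer.push_back(static_cast<float>(origin[2]));
        buffer.push_back(static_cast<float>(dx * inv));
        buffer.push_back(static_cast<float>(dy * inv));
        buffer.push_back(static_cast<float>(dz * inv));
    }
    return buffer;
}
```

The resulting buffer could then be handed to Open3D in a single step, for example through a `Tensor` constructor that takes a `std::vector` and a shape of `{numRays, 6}`; treat that call as an assumption and check it against the Open3D version in use.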

The final test results (with about 70,000 rays) are shown in the figure below:

As can be seen, the two approaches do differ somewhat in performance.

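Note, however, that initializing the tensor takes some time of its own, so when the number of rays is small the gap between the two approaches is not large.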
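
To see how much of the total goes to tensor setup and how much to the cast itself, the two phases can be timed separately. The helper below is a minimal sketch using `std::chrono::steady_clock`; `TimeSeconds` is a hypothetical name. It measures wall-clock time, which is worth having next to `clock()` because `clock()` is specified to report processor time, and on platforms such as Linux that figure adds up the time of all threads in a multithreaded call.

```cpp
#include <chrono>
#include <utility>

// Runs fn once and returns the elapsed wall-clock time in seconds.
template <typename Fn>
double TimeSeconds(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Fn>(fn)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Intended use with the code from this post (not compiled here):
//   double setupSec = TimeSeconds([&] { /* fill the rays tensor */ });
//   double castSec  = TimeSeconds([&] { scene.CastRays(rays); });
```

Reporting the setup and cast times side by side makes it easier to tell whether a small gap at low ray counts really comes from tensor initialization.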

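Update: Further testing showed that the batched tensor approach gains even more in real scenes. The modest gap above is probably because the test data was too ideal, whereas real scenes contain extreme cases such as distant hits or rays with no intersection at all.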
