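During development I found that ray intersection was taking too long in the project: running with ray intersection took roughly 3 times as long as running without it.
Earlier, for convenience during development, intersections were computed with a single $1 \times 6$ tensor, issuing one cast call per ray. Sample code: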
    const auto start = (double)clock();
    open3d::t::geometry::RaycastingScene scene;
    const auto& tensor = open3d::t::geometry::TriangleMesh::FromLegacy(part, open3d::core::Float32, open3d::core::Int64);
    scene.AddTriangles(tensor);
    const auto numRays = (int)part.vertices_.size() * 10;
    // A single 1 x 6 ray (origin xyz, direction xyz) that is overwritten on every iteration.
    auto rays = open3d::core::Tensor::Zeros({1, 6}, open3d::core::Float32);
    Eigen::Vector3d position(0.05, 0.5, -0.25);
    for (int idx = 0; idx < numRays; idx++) {
        const auto& vertex = part.vertices_[idx % part.vertices_.size()];
        Eigen::Vector3d rayDir = (vertex - position).normalized();
        rays.SetItem({open3d::core::TensorKey::Index(0), open3d::core::TensorKey::Slice(0, 6, 1)},
                     open3d::core::Tensor::Init<float>({float(position.x()), float(position.y()), float(position.z()),
                                                        float(rayDir.x()), float(rayDir.y()), float(rayDir.z())}));
        // Cast one ray per call, numRays calls in total.
        auto castResults = scene.CastRays(rays);
    }
    cout << ((double)clock() - start) / CLOCKS_PER_SEC << endl;
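However, Open3D's `CastRays` processes the ray tensor with multiple threads under the hood, so in theory passing all rays in one call should be faster. The following code was used to test this: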
    const auto start = (double)clock();
    open3d::t::geometry::RaycastingScene scene;
    const auto& tensor = open3d::t::geometry::TriangleMesh::FromLegacy(part, open3d::core::Float32, open3d::core::Int64);
    scene.AddTriangles(tensor);
    const auto numRays = (int)part.vertices_.size() * 10;
    // One numRays x 6 tensor holding every ray (origin xyz, direction xyz), one ray per row.
    auto rays = open3d::core::Tensor::Zeros({numRays, 6}, open3d::core::Float32);
    Eigen::Vector3d position(0.05, 0.5, -0.25);
    for (int idx = 0; idx < numRays; idx++) {
        const auto& vertex = part.vertices_[idx % part.vertices_.size()];
        Eigen::Vector3d rayDir = (vertex - position).normalized();
        rays.SetItem({open3d::core::TensorKey::Index(idx), open3d::core::TensorKey::Slice(0, 6, 1)},
                     open3d::core::Tensor::Init<float>({float(position.x()), float(position.y()), float(position.z()),
                                                        float(rayDir.x()), float(rayDir.y()), float(rayDir.z())}));
    }
    // Cast all rays in a single call.
    auto castResults = scene.CastRays(rays);
    cout << ((double)clock() - start) / CLOCKS_PER_SEC << endl;
The final test results (about 70,000 rays) are shown in the figure below:
As can be seen, the two approaches do differ somewhat in performance.
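One caveat about numbers obtained this way: `clock()` reports processor time rather than elapsed time, and on some platforms (Linux with glibc, for example) that figure is summed over all threads. A multithreaded `CastRays` call may therefore look slower under `clock()` than it is on the wall clock. The following is a minimal sketch of measuring both side by side; `Timing` and `MeasureOnce` are hypothetical names introduced here for illustration, not part of Open3D or of the original test code.

```cpp
#include <cassert>
#include <chrono>
#include <ctime>
#include <utility>

// Elapsed (wall-clock) time and processor time, both in seconds.
struct Timing {
    double wallSeconds;
    double cpuSeconds;
};

// Hypothetical helper: runs `work` once and measures it with both clocks.
template <typename Fn>
Timing MeasureOnce(Fn&& work) {
    const auto wallStart = std::chrono::steady_clock::now();
    const std::clock_t cpuStart = std::clock();

    std::forward<Fn>(work)();

    const std::clock_t cpuEnd = std::clock();
    const auto wallEnd = std::chrono::steady_clock::now();

    Timing t;
    t.wallSeconds = std::chrono::duration<double>(wallEnd - wallStart).count();
    t.cpuSeconds = static_cast<double>(cpuEnd - cpuStart) / CLOCKS_PER_SEC;
    // steady_clock is monotonic, so the elapsed time can never be negative.
    assert(t.wallSeconds >= 0.0);
    return t;
}
```

Wrapping each variant in a lambda, for instance `MeasureOnce([&] { scene.CastRays(rays); })`, would give both figures. If `cpuSeconds` comes out well above `wallSeconds` for the batched call, that suggests the work was spread across threads.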
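Note, however, that initializing the tensor takes some time, so when the number of rays is small the gap between the two approaches is not large.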
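One way to reduce that setup cost, sketched here as an assumption rather than something measured in this post, is to pack all rays into a plain row-major float buffer first and create the tensor from it once, instead of writing each row with `SetItem`. `Vec3` and `PackRays` below are hypothetical names; the layout `[ox, oy, oz, dx, dy, dz]` follows the ray layout used in the code above.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec3 = std::array<double, 3>;

// Hypothetical helper: builds one ray per target point, all sharing `origin`,
// and returns them as a row-major N x 6 buffer [ox, oy, oz, dx, dy, dz].
std::vector<float> PackRays(const Vec3& origin, const std::vector<Vec3>& targets) {
    std::vector<float> flat;
    flat.reserve(targets.size() * 6);
    for (const Vec3& target : targets) {
        const double dx = target[0] - origin[0];
        const double dy = target[1] - origin[1];
        const double dz = target[2] - origin[2];
        const double len = std::sqrt(dx * dx + dy * dy + dz * dz);
        assert(len > 0.0);  // a target equal to the origin has no direction

        // Ray origin.
        flat.push_back(static_cast<float>(origin[0]));
        flat.push_back(static_cast<float>(origin[1]));
        flat.push_back(static_cast<float>(origin[2]));
        // Normalized ray direction.
        flat.push_back(static_cast<float>(dx / len));
        flat.push_back(static_cast<float>(dy / len));
        flat.push_back(static_cast<float>(dz / len));
    }
    return flat;
}
```

The resulting buffer could then be turned into an N x 6 tensor in one step. As far as I know, `open3d::core::Tensor` offers a constructor that takes a `std::vector<T>`, a shape and a dtype, but this should be checked against the Open3D version in use.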
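Update: further testing showed that the batched tensor comes out faster in real scenes. The earlier result was probably caused by test data that was too ideal; real scenes contain extreme cases such as distant hits or rays with no intersection at all.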
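To check how "ideal" a given test set is, it can help to count how many rays miss and how far the hits are. Open3D documents the `t_hit` entry of the `CastRays` result as the distance to the first hit, with `inf` for rays that hit nothing. The sketch below assumes those distances have already been copied into a `std::vector<float>` (for example via the tensor's `ToFlatVector<float>()`); `HitSummary` and `SummarizeHits` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Counts of rays that hit or missed, plus the largest finite hit distance.
struct HitSummary {
    std::size_t hits = 0;
    std::size_t misses = 0;
    float farthestHit = 0.0f;
};

// Hypothetical helper: treats every non-finite distance as a miss.
HitSummary SummarizeHits(const std::vector<float>& tHit) {
    HitSummary summary;
    for (const float t : tHit) {
        if (std::isfinite(t)) {
            ++summary.hits;
            if (t > summary.farthestHit) summary.farthestHit = t;
        } else {
            ++summary.misses;
        }
    }
    assert(summary.hits + summary.misses == tHit.size());
    return summary;
}
```

A test set where every ray is aimed at a mesh vertex, as in the listings above, would be expected to report almost no misses, which is one way to see how it differs from real scenes.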