Revoscan Flayer (Frame Player)

Just as a comment, it appears that you are projecting the point clouds into a box in 3D space. But since the camera has a FOV angle and perspective distortion, you’d probably need to project them into a cone in 3D instead.

In the math literature I’ve read, those projection matrices were usually called P or Q, so my guess would be that either PI.bin or Q.bin is a float32 matrix which encodes the perspective distortion of the camera. (They both look like matrix encodings to me; I’m just not sure which one is the correct matrix.) Similarly, the .inf files appear to be float32 matrices encoding the rotation and translation of a given frame. That means the proper way to generate a point cloud out of the RevoScan cache is probably:

  1. read the depth map like you did
  2. apply the perspective distortion matrix
  3. apply the affine transformation matrix

If my guess is correct, then that should also make sure that the point clouds from multiple frames end up being aligned in 3D, because step #3 applies the position estimated by the IMU / marker tracker inside the 3D scanner.

2 Likes

I now figured it out :smiley: With the depth scale from property.rvproj you can use the 3x4 matrix in Q.bin to transform the depth maps into 3D. And then using the 4x4 matrix from the frame’s *.inf file will perfectly align multiple keyframes: revoscan_5_parse_cache.py · GitHub
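
In condensed form, the recipe looks roughly like this. Treat it as a sketch: the file names, image size, depth scale value, and the exact pixel-to-ray convention of Q are assumptions on my part; the gist linked above is the authoritative version.

import numpy as np

# hypothetical cache layout: a WxH uint16 depth image plus the depth
# scale from property.rvproj (both values here are placeholders)
H, W, DEPTH_SCALE = 400, 640, 0.25
depth = np.fromfile('frame_000_0001.dph', dtype=np.uint16).reshape((H, W))
z = depth.astype(np.float32) * DEPTH_SCALE

# unproject with the 3x4 float32 matrix from Q.bin; one plausible
# reading is [X, Y, Z]^T = Q @ [u*z, v*z, z, 1]^T for every valid pixel
Q = np.fromfile('Q.bin', count=3*4, dtype=np.float32).reshape((3, 4))
v, u = np.mgrid[0:H, 0:W]
m = depth > 0
h = np.stack([u[m] * z[m], v[m] * z[m], z[m], np.ones(m.sum(), np.float32)], -1)
pts = h @ Q.T

# align into the global frame with the 4x4 float64 matrix at offset 0x10
T = np.fromfile('frame_000_0001.inf', offset=0x10, count=16, dtype=np.float64).reshape((4, 4))
ph = np.concatenate([pts, np.ones_like(pts[:, :1])], -1) @ T.T
pts_world = ph[:, :3] / ph[:, 3:4]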

2 Likes

Wow, awesome findings, Master :smiley:

How did you figure out it was a float32 matrix?
Here I was thinking it was somehow encrypted, as I was expecting a plaintext distortion matrix like the ones you get from calibrating a camera in OpenCV.

So, on the .inf file, have you figured out what it means? Because if so, that also means we can finally cook up a way to do a matrix realign of the model to make it orthogonal to the views :smiley:

I’ve worked a lot with the numpy Python package in the past, so it just looked familiar. And, indeed, np.fromfile('Q.bin', count=4*4, dtype=np.float32).reshape((4,4,)) read the binary data directly and produced a plausible matrix.

Not entirely, but bytes 16 to 144 are the 4x4 float64 matrix that RevoScan 5 uses to align the point cloud into its reference coordinate system. So yes, you should be able to patch that to fix alignment. Here’s my Python for reading and applying it:

import numpy as np

# load the 4x4 transformation matrix for this frame (float64 at offset 0x10)
T = np.fromfile('frame_000_0001.inf', offset=0x10, count=4*4, dtype=np.float64).reshape((4, 4,))
print('T', T)
# data is the Nx3 point cloud from the depth map;
# convert it to [x,y,z,1], multiply, and reduce to [x/w,y/w,z/w]
data = np.concatenate([data, np.ones_like(data[:, 0:1])], -1)
data = np.matmul(T.reshape((1, 4, 4,)), data.reshape((-1, 4, 1,)))
data = data[:, 0:3, 0] / data[:, 3:4, 0]
1 Like

Hello, interesting topic. Like you, I am also playing with the files a bit. I am quite impressed with these scanners, but I feel like the software is currently holding back the true potential of the hardware. Probably some potential is sacrificed to make it more user-friendly and better suited for the average user, which is unfortunate for some of us.

The main thing I would like to improve is the compounding-error problem, as this limits the scanner in some cases. I feel like some post-processing, for example an extra alignment step after completing the scan, could offer great improvements to the end result. I would gladly run a “second alignment” algorithm on my laptop for a day if that yields better scan results…

As far as I can tell, there is no way to export scan frames without some kind of post-processing being done. So I decided to have a look at the files. I wasn’t certain at what point the distortion correction was applied (this could have been done on the scanner, before the initial alignment) and I wasn’t quite sure where to start with that, so I decided to skip the param folder for now and tackle the .dph and .inf side first. Perhaps that would even be good enough to realign the scans and adjust the .inf files.

I am far from done with this, but I think I can add some info, especially for the .inf-files. In every sentence that follows you can add “appears to be” or “I believe”, since most of it still needs to be verified on a larger scale. :wink:

The .dph files contain the depth view. This is like a pyramid, with its tip offset from the center. The lowest value Revo Scan accepts in a .dph is 2, the highest is 6000. Looking at the raw_preview.ply files, a simple linear interpolation is enough to recreate point clouds from a .dph file. Point clouds that I have created in this way appear to be correct when I put them into CloudCompare, for example, but I have not checked this very thoroughly: just a quick visual inspection, and I couldn’t see any obvious flaws.
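
As a hedged illustration of what I mean by “linear”, this is the kind of mapping that reproduces the pyramid shape. The constants kx, ky, cx, cy, the image size, and the uint16 sample format are hypothetical, not values I pulled from the cache:

import numpy as np

# pinhole-style "pyramid" model: x and y grow linearly with depth,
# with the apex (cx, cy) offset from the image centre
kx, ky, cx, cy = 0.0016, 0.0016, 310.0, 205.0   # hypothetical calibration
depth = np.fromfile('frame_000_0001.dph', dtype=np.uint16).reshape((400, 640))
v, u = np.nonzero((depth >= 2) & (depth <= 6000))  # the accepted value range
z = depth[v, u].astype(np.float32)
pts = np.stack([(u - cx) * z * kx, (v - cy) * z * ky, z], -1)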

After that I shifted my focus to the .inf files, since I suspected these to contain the key to the 6 DOF of the individual frames. I have not fully figured out the .inf file, but here is what I currently have:

I parse these files into a hex-view. A typical file looks something like this:

00 00 00 00 01 02 00 02 00 00 00 00 00 00 00 00
3D F2 4F DD 69 88 EF 3F 42 96 45 6F B7 55 B4 BF
12 7D A2 53 16 46 C3 BF C0 2B A4 43 1C 11 78 C0
67 4E E0 66 A6 00 A7 3F 38 05 21 94 BD 2E EF 3F
EB BA FD A2 48 2A CC BF B0 8B 57 11 2C 9C B3 40
80 0B C6 5C C8 04 C5 3F DE 37 29 F5 5A E3 CA 3F
9A AF 3C 8A 71 D7 EE 3F D9 EC AE 24 5B D3 99 C0
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 F0 3F
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 F0 3F 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 F0 3F
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 F0 3F 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 F0 3F

The first row contains some information about the frame number, but it looks like it can be ignored for what I want. The next six rows are the interesting bit, and I am focusing on those now. The rows after that are empty/unused.

The interesting six rows are, as far as I can tell, copied directly (without changes) into the global_register_pose.pose by Revo Scan. There appears to be a lot of random-looking noise in these six rows, which obscures what is going on. After looking at a lot of files, some patterns started to become clear. In the 8th and 16th byte of each row there are a lot of 3F, BF, 3E, BE, C0 and 40. These act like multiplication factors: they give us a direction (positive/negative) and an order of magnitude (small/large). The bytes directly before them are the values that should be multiplied by the factors. How many significant bytes there are, I don’t know. I am fairly certain that it is not the full 7, and possibly only 2.

The values & factors contain information about scaling, skewing and translating. Skew in two directions and you have rotated a mesh. Add scale and translate and you are the master of all 6 DOF. The table below shows where the translations, skewing and scaling are found (the columns are the byte positions within a row).

Row | Bytes 1-8            | Bytes 9-16
 1  | Scaling X-direction  | Skew X-dir., Z-dependent
 2  | Skew X-dir., Y-dep.  | Translation X-dir.
 3  | Skew Z-dir., X-dep.  | Scale Z-dir.
 4  | Skew Z-dir., Y-dep.  | Translation Z-dir.
 5  | Skew Y-dir., X-dep.  | Skew Y-dir., Z-dep.
 6  | Scale Y-dir.         | Translation Y-dir.

For the factors: 3F & BF are coupled; they do the same thing, but in opposite directions. The same goes for 3E & BE and for 40 & C0. BE & 3E I have only seen used for skewing, C0 & 40 only for translations. You could add 00 to the list of factors, but I have only seen it in the first frame, since that frame is placed at the origin, so all its values are zero.

For the values: 00 = smallest/zero, FF = largest.

This is about as far as I have gotten right now. I am currently looking at ways to verify that this information is indeed correct. Next steps for me will be to create point clouds based on .dph and .inf files.

3 Likes

In the helpers folder of the RevoScan frame player GitHub project you’ll find a player that decodes the frames and builds point clouds as 3D video frames. With @fxtentacle’s intel on the shape deformation being a 3x4 matrix, and how to read it, one can create undistorted and properly scaled point clouds that we can align.

If we figure out how to extract and write each frame’s position, we can create a preprocessor tool that tries to realign the frames and writes the new poses back.
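
Something like this could serve as the write-back half. Untested guesswork: it assumes the pose really is the 4x4 float64 at offset 0x10 and that nothing else in the file needs updating, so back up the cache first:

import numpy as np

def write_pose(inf_path, T_new):
    # overwrite the 4x4 float64 pose at byte offset 0x10 of an .inf file
    # (row-major; assumes a little-endian machine, matching how the
    # matrix reads back with np.fromfile)
    with open(inf_path, 'r+b') as f:
        f.seek(0x10)
        f.write(np.asarray(T_new, dtype=np.float64).reshape(16).tobytes())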

My guess would be that rows 8 onwards are stored as 64-bit little-endian floating point, like a double in C/C++. Then “00 00 00 00 00 00 F0 3F” would mean exactly 1.0 and the data stored would be:

0.0, 0.0, 0.0, 1.0,
0.0, 0.0, 1.0, 0.0,
0.0, 0.0, 0.0, 1.0,
0.0, 0.0, 0.0, 0.0,
1.0, 0.0, 0.0, 0.0,
0.0, 1.0
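
A quick way to check the decoding (the second value here is the first 8 bytes of row 2 in the hex dump above, which lands near 1.0, plausible for a rotation-matrix entry):

import struct

# little-endian IEEE 754 double: the sign/exponent byte comes last
print(struct.unpack('<d', bytes.fromhex('000000000000F03F'))[0])  # 1.0
print(struct.unpack('<d', bytes.fromhex('3DF24FDD6988EF3F'))[0])  # ~0.985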

But since that data is the same for all .inf files of my depth-only scan, I’m guessing that it is only relevant for color scans.

I’ve tried precisely that by using ICP to re-align each newly loaded depth camera frame against the currently loaded point cloud. But the result is a lot worse than not doing any correction. The reason is that ICP will typically try to match as closely as possible, but when you 3D scan you only get a partial model in each frame, meaning that your partial scan will never cover the entire point cloud, and that causes stock ICP to fail.

CloudCompare can somewhat limit this effect with their “Final overlap” parameter: ICP - CloudCompareWiki
but my completely unscientific feeling is that one might be better off finding shape features (similar to SIFT in 2D) and using those for a global bundle alignment (like what COLMAP does).
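
If someone wants to experiment with that, Open3D’s FPFH features plus RANSAC registration are the easiest off-the-shelf approximation. A generic sketch with placeholder file names and voxel size; this is the standard Open3D recipe, not COLMAP and not what RevoScan does internally:

import open3d as o3d

def preprocess(pcd, voxel):
    # downsample, estimate normals, compute FPFH shape descriptors
    down = pcd.voxel_down_sample(voxel)
    down.estimate_normals(o3d.geometry.KDTreeSearchParamHybrid(radius=voxel * 2, max_nn=30))
    fpfh = o3d.pipelines.registration.compute_fpfh_feature(
        down, o3d.geometry.KDTreeSearchParamHybrid(radius=voxel * 5, max_nn=100))
    return down, fpfh

voxel = 1.0  # in the unit of your point clouds (assumed: mm)
src, src_f = preprocess(o3d.io.read_point_cloud('frame_a.ply'), voxel)
dst, dst_f = preprocess(o3d.io.read_point_cloud('frame_b.ply'), voxel)
result = o3d.pipelines.registration.registration_ransac_based_on_feature_matching(
    src, dst, src_f, dst_f, True, voxel * 1.5,
    o3d.pipelines.registration.TransformationEstimationPointToPoint(False), 3,
    [o3d.pipelines.registration.CorrespondenceCheckerBasedOnDistance(voxel * 1.5)],
    o3d.pipelines.registration.RANSACConvergenceCriteria(100000, 0.999))
print(result.transformation)  # candidate 4x4 pose; refine with ICP afterwards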

But where ICP of the keyframes might really help is for noise reduction. Here’s the result of aligning multiple depth frames of a static scene:

One can see that in some areas of the image, the points align onto a line. That’s what should happen if nothing has moved between frames. But in other areas, the camera had some noise in the depth estimates, so the points are spread out over an area around the line.

In this example, the transformation matrix created by the RevoScan 5 feature alignment had an error of about 30% of the width of one depth camera pixel, so the ICP really helps here :slight_smile: My guess would be that they didn’t add it to the software because it took 0.5s per frame to do the ICP, so it will probably NOT work in real-time on current-gen hardware.
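
For anyone who wants to reproduce that kind of refinement, point-to-plane ICP with a tight correspondence threshold is the standard tool. A hedged Open3D sketch (file names, units and the 0.5 threshold are placeholders, and it is not the exact code behind the screenshot):

import numpy as np
import open3d as o3d

src = o3d.io.read_point_cloud('keyframe_a.ply')
dst = o3d.io.read_point_cloud('keyframe_b.ply')
# point-to-plane ICP needs normals on the target cloud
dst.estimate_normals(o3d.geometry.KDTreeSearchParamHybrid(radius=2.0, max_nn=30))
result = o3d.pipelines.registration.registration_icp(
    src, dst, 0.5, np.eye(4),
    o3d.pipelines.registration.TransformationEstimationPointToPlane())
src.transform(result.transformation)  # snap the frame onto the reference
print(result.fitness, result.inlier_rmse)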

1 Like