Osiyo. Dohiju? Hey, welcome back.
I’ve narrowed down my big-picture steps for getting up and running. I have already done a lot of programming work on the filters and on putting it all together, and I have something working that I’m pretty happy with. Now is the time to bring a more structured approach to the process. Which parts are necessary, and what priority does each one have?
Here’s the list in order. I’ll go through each item in more depth below.
- Display
- OpenCV
- 3D Visualization
- Gestures
- TTS/STT
Display – Both full screen and split stereo screen. Some HMDs use a full screen, whereas others, like HoloKit, Vufine AR, and North Star variants, split the screen for the 3D view. This needs to be configurable so the user can choose the mode on startup. This also includes what to use to serve up the display. Is it a standard web browser, Firefox Reality, something else?
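A rough sketch of what that startup switch might look like in Python. The names here (`Viewport`, `viewports_for`, and the `"full"` / `"split"` mode strings) are made up for illustration and are not taken from SERINDA. It only works out which screen rectangles to render into. It assumes a simple side-by-side split with no lens distortion correction.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Viewport:
    """A rectangle on the screen, in pixels (hypothetical helper type)."""
    x: int
    y: int
    width: int
    height: int


def viewports_for(mode: str, screen_w: int, screen_h: int) -> List[Viewport]:
    """Return the rectangles to render into for the chosen display mode.

    "full"  -> one viewport covering the whole screen
    "split" -> two side-by-side viewports (left eye, right eye)
    """
    if mode == "full":
        return [Viewport(0, 0, screen_w, screen_h)]
    if mode == "split":
        half = screen_w // 2
        # The right eye takes whatever is left, so odd widths still add up.
        return [
            Viewport(0, 0, half, screen_h),
            Viewport(half, 0, screen_w - half, screen_h),
        ]
    raise ValueError(f"unknown display mode: {mode!r}")


if __name__ == "__main__":
    # The mode would come from a config file or command-line flag at startup.
    for vp in viewports_for("split", 1920, 1080):
        print(vp)
```

Keeping the mode as a single value read at startup means the rest of the rendering code only has to loop over whatever viewports it is given.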
OpenCV – Visual Odometry (VSLAM, PTAM, DTAM, whatever works best). Everything in view needs to be mapped in some way. The number of points need not be extensive; I only need enough that I can place virtual objects in the real world and interact with them by walking around them, moving them, etc. OpenCV powers almost everything visual for SERINDA.
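To make "place virtual objects in the real world and walk around them" a bit more concrete, here is a minimal sketch in plain Python. It is not OpenCV code and not SERINDA's implementation. It assumes some tracker (VSLAM, VIO, whatever ends up working) already supplies a headset pose, and it simplifies that pose to a position plus a yaw angle about the vertical Y axis. `HeadsetPose` and `world_to_headset` are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class HeadsetPose:
    """Simplified pose: position in world space plus yaw about the Y axis.

    Assumption: with yaw = 0 the headset axes line up with the world axes.
    A real tracker would give a full 3D rotation, not just yaw.
    """
    position: Vec3
    yaw: float  # radians


def world_to_headset(point: Vec3, pose: HeadsetPose) -> Vec3:
    """Express a world-anchored point in the headset's own coordinate frame."""
    # Translate so the headset sits at the origin.
    dx = point[0] - pose.position[0]
    dy = point[1] - pose.position[1]
    dz = point[2] - pose.position[2]
    # Undo the headset's yaw (rotate by -yaw about Y).
    c, s = math.cos(pose.yaw), math.sin(pose.yaw)
    return (dx * c - dz * s, dy, dx * s + dz * c)


if __name__ == "__main__":
    anchor = (1.0, 0.0, -2.0)  # virtual object fixed in the world
    pose = HeadsetPose(position=(1.0, 0.0, 0.0), yaw=0.0)
    print(world_to_headset(anchor, pose))
```

The anchor never changes; only the pose does. That is what lets an object appear to stay put while the wearer moves around it.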
3D Visualization – Along with the previous item come additional HUD GUIs: elements for user interaction and informational views. This could be native OpenGL or WebGL, maybe with Three.js. I am avoiding Unity as much as possible.
Gestures – This is two-fold. I want to improve my OpenCV gestures eventually. To start with, however, I want to improve the LeapMotion interaction.
TTS/STT – This is essential for hands-free work: the user needs to be able to talk to the system, and the system needs to be able to talk back.
For now, the best place to start seems to be getting pose data from the headset, either from my 6DOF or 9DOF breakout board or from Visual Inertial Odometry (VIO), and then adding hand tracking by camera or LeapMotion. I have a 3D scene already. Now I need to move objects in it and move within it. Fortunately, VR has done a lot of this, so I’m sure there are libraries that will handle what I’m looking for.
Once those are done, manipulating the display will be the larger goal.
I got a bit distracted so my thoughts aren’t complete here. I’ll revisit this once I’ve had a chance to sleep.
Until next time. Dodadagohvi.