SERINDA Graceland Update (video display)

ᎣᏏᏲ. ᏙᎯᏧ? Hey, welcome back!

From my last post here, I had this list of items to work on:

  • Get the voice commands working.
  • Wire the OAK-D into the camera pool as an option, keeping auto-detection intact, plus properties so a developer can make the OAK-D the primary camera (I want that, so of course it's an option).
  • Set up object detection via the commands (e.g., recognize a lamp rather than a QR code) and display an object in 3D space; this is one of the examples modified to work with my base.
  • Set up hand recognition to determine my hand's location in 3D space; another example modified to work with my base.
  • Extend the hand recognition so it can determine that my hands are interacting with the 3D object; lots of intersection work (see the sketch after this list).
  • Use my hand to move the 3D object in 3D space.
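
That intersection work is mostly geometry. Here's a minimal sketch of the kind of test I mean, treating a fingertip from the hand tracker as a small sphere and the object's bounds as an axis-aligned box; every name here is a placeholder, not SERINDA's actual code.

```python
# Hypothetical hand/object intersection test: a fingertip (from the hand
# tracker) as a small sphere vs. the 3D object's axis-aligned bounding box.

def sphere_intersects_aabb(center, radius, box_min, box_max):
    """True if a sphere (fingertip) overlaps an axis-aligned box (object bounds)."""
    # Clamp the sphere center onto the box to find the closest point on the box.
    closest = [max(lo, min(c, hi)) for c, lo, hi in zip(center, box_min, box_max)]
    # Squared distance from the sphere center to that closest point.
    dist_sq = sum((c - p) ** 2 for c, p in zip(center, closest))
    return dist_sq <= radius ** 2

# Example: a fingertip 5 cm in front of the camera origin, object is a 10 cm cube.
touching = sphere_intersects_aabb(
    center=(0.0, 0.0, 0.05), radius=0.01,
    box_min=(-0.05, -0.05, 0.0), box_max=(0.05, 0.05, 0.10),
)
print("hand is touching the object:", touching)
```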

I haven't done a lot of work on the voice commands. At this point I can still type commands and they work, and the dynamic menu of commands works fine as well, so I'm pushing that out for a bit. I'm considering using Rust to capture voice commands and send them to the Flask server, mainly because I'd like to offload any heavy processing to keep Flask and the front end from lagging too much; after all, they'll be processing and showing video. Along the same lines, I've considered moving the OpenCV portion of the video processing to Rust. That's for another time. Right now I have goals to accomplish, and I don't need to distract myself.
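
For what it's worth, the Flask side of that idea would be small. Here's a minimal sketch of a JSON endpoint a separate capture process could POST recognized text to; the /voice/command route and handle_command() are placeholders I made up, not SERINDA's real routes.

```python
# Minimal sketch: a Flask endpoint that accepts recognized voice text as JSON.
from flask import Flask, jsonify, request

app = Flask(__name__)

def handle_command(text: str) -> str:
    # Placeholder: in SERINDA this would hand off to the same dispatcher
    # the typed commands and the dynamic menu already use.
    return f"received: {text}"

@app.route("/voice/command", methods=["POST"])
def voice_command():
    payload = request.get_json(force=True)
    result = handle_command(payload.get("text", ""))
    return jsonify({"status": "ok", "result": result})

if __name__ == "__main__":
    app.run(port=5000)
```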

I have not added the OAK-D to the camera pool yet. I have created a new object for it using depthai. It works fine but is a little slow; I need to track that down. It does, however, process all of the filters as expected, and I can't complain about that. I still need to wire the pool so it detects whether an OAK-D exists and, if so, adds it. I also need to add a property or two to set the OAK-D as the primary or a secondary camera. That way the user has a choice: they can have four cameras (or however many), with the OAK-D as the one driving the primary view.
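
Roughly, the auto-detect piece looks like this. The only real API below is depthai's device enumeration (depthai 2.x); the CameraPool class and the oakd_primary property are stand-ins for my own code, not anything that exists yet.

```python
# Sketch of OAK-D auto-detection feeding a camera pool (depthai 2.x).
import depthai as dai

class CameraPool:
    """Placeholder stand-in for SERINDA's camera pool."""
    def __init__(self):
        self.cameras = []

    def add_camera(self, kind, device_id, primary=False):
        self.cameras.append({"kind": kind, "id": device_id, "primary": primary})

def add_detected_oakds(pool, properties):
    # Real depthai call: enumerates every OAK device currently visible.
    for info in dai.Device.getAllAvailableDevices():
        pool.add_camera(
            kind="oakd",
            device_id=info.getMxId(),
            primary=properties.get("oakd_primary", False),  # hypothetical property
        )

pool = CameraPool()
add_detected_oakds(pool, {"oakd_primary": True})
print(pool.cameras)
```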

I did fix the SNIPS-NLU startup so it works correctly.
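
For reference, the usual Snips NLU startup pattern is to load a previously trained engine from disk once and parse against it; the model path here is a placeholder for wherever SERINDA persists its engine.

```python
# Load a persisted Snips NLU engine at startup and parse a text command.
from snips_nlu import SnipsNLUEngine

engine = SnipsNLUEngine.from_path("models/serinda_nlu")  # placeholder path
result = engine.parse("turn on the hud")
print(result["intent"], result["slots"])
```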

I also run the server in a virtual environment now. I spent a lot of time curating the libraries so everything worked together on Python 3.11. I hadn't updated the libraries in 4+ years, so there were a few growing pains.

I'm currently working with a marker tag (ArUco, AprilTag, etc.) to test my 3D scene. With this, I'm going to add the hand tracking so I can move an object in the 3D space. I have a few elements to work on, mainly getting the IMU data to line up correctly with the camera data in the 3D engine (BabylonJS, or whatever I settle on), and translating the hand-tracking data into the coordinate space of the 3D scene I'm in.
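
The marker side of that test is well-trodden OpenCV territory. Here's a sketch assuming OpenCV 4.7+ with the aruco module; the camera matrix, distortion coefficients, marker size, and frame source are placeholders, since in practice those come from calibrating whichever camera is primary and from the camera pool.

```python
# Detect an ArUco marker and recover its pose relative to the camera.
import cv2
import numpy as np

MARKER_SIZE = 0.05  # marker side length in meters (assumption)

dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
detector = cv2.aruco.ArucoDetector(dictionary, cv2.aruco.DetectorParameters())

camera_matrix = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])  # placeholder intrinsics
dist_coeffs = np.zeros(5)  # placeholder: assume negligible lens distortion

# 3D corners of the marker in its own frame (top-left, top-right, bottom-right, bottom-left).
half = MARKER_SIZE / 2
object_points = np.array([
    [-half,  half, 0], [ half,  half, 0],
    [ half, -half, 0], [-half, -half, 0],
], dtype=np.float32)

frame = cv2.imread("frame.png")  # placeholder; SERINDA pulls frames from the camera pool
corners, ids, _ = detector.detectMarkers(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))
if ids is not None:
    # Pose of the first detected marker: rvec/tvec map marker space to camera space,
    # which is what the 3D scene needs to place an object on the tag.
    ok, rvec, tvec = cv2.solvePnP(object_points, corners[0].reshape(4, 2),
                                  camera_matrix, dist_coeffs)
    print("marker", ids[0][0], "at", tvec.ravel())
```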

If I could get the display for the OAK-D sped up (not the OAK-D's fault – PEBKAC), and get the IMU data working with a camera and the hand-tracking data inside the 3D environment, I think I'd be very happy. Of course, the next step after that is mounting the camera to the helmet and viewing all of this in MR.

Until next time. Dodadagohvi. ᏙᏓᏓᎪᎲᎢ.
