Osiyo. Dohiju? Hey, welcome back.
Since finding the Triton project I’ve been working on a side project with it. The idea is to take the SERINDA work I’ve done and integrate some of it with Triton: the webspeech work, the OpenCV work, dynamic layouts, etc. In order to do that, I can either take the Triton work and merge it into my own, or take my work and merge it into Triton.
I’ve decided to start with the working Triton code. I know my own code inside and out, so starting with something that already works is the better plan since I’m not familiar with the Triton codebase. My OpenCV code for AR and MR is based on capturing the camera feed and then displaying elements like grids, tracking squares, text, etc. What I’m going to do is take my SERINDA work from 3+ years ago, where it would recognize faces, and instead of writing the tracking square onto the same image, I’ll write it onto a new transparent PNG. The idea is that if I can take the video feed and overlay the tracking data, then I should be able to put that tracking data on something transparent to use as a HUD with Triton. I hope, anyway.
Here are the steps that I think need to be done:
- create a transparent image the same size as the frame
- detect motion (or whatever we want in the overlay)
- take those coordinates and apply them to the transparent image
- render the transparent image as a HUD element over the Three.js scene
If my hypothesis is correct then I’ll get the parts of the OpenCV work I’ve done married with the Three.js portion.
To test my hypothesis, and since the SERINDA code has been abstracted so I can write quick plugins, I decided I needed code that’s new and whole. I chose this project, minus the audio portion. You can find the GitHub code link in the article. Here is another project that is similar, and I use the code from that project in my video. Using ideas from this SO post, I create a transparent image the same size as the frame.
transparent_img = np.zeros((frame.shape[0], frame.shape[1], 4), dtype=np.uint8)
Then I take the motion detect x, y, w, h coords and apply them to the transparent image.
(x, y, w, h) = cv2.boundingRect(contour)
cv2.rectangle(transparent_img, (x, y), (x + w, y + h), (0, 255, 0, 255), 2)
Instead of returning the frame, which has the complete video, I just return the transparent image.
That was really all there was to it. The change was to the BGRA information (0, 255, 0, 255), where the final 255 is the alpha. Without that value it doesn’t work the way it should. That’s not to say the tracking data won’t be applied to the image; it will. It means the image will be black with the tracking data drawn over it, and that’s not what we wanted.
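Putting those pieces together, here’s a minimal sketch of the kind of frame-processing function I’m describing, using simple frame differencing for the motion detection. The name process_frame and the prev_gray/min_area parameters are just placeholders for illustration, not the actual SERINDA code.

```python
import cv2
import numpy as np

def process_frame(frame, prev_gray, min_area=500):
    """Return a transparent BGRA image with motion rectangles drawn on it."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (21, 21), 0)

    # Transparent canvas the same size as the frame (the 4th channel is alpha)
    transparent_img = np.zeros((frame.shape[0], frame.shape[1], 4), dtype=np.uint8)

    # First frame: nothing to compare against yet
    if prev_gray is None:
        return transparent_img, gray

    # Simple frame differencing to find motion
    delta = cv2.absdiff(prev_gray, gray)
    thresh = cv2.threshold(delta, 25, 255, cv2.THRESH_BINARY)[1]
    thresh = cv2.dilate(thresh, None, iterations=2)

    # OpenCV 4.x returns (contours, hierarchy)
    contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for contour in contours:
        if cv2.contourArea(contour) < min_area:
            continue
        (x, y, w, h) = cv2.boundingRect(contour)
        # Green rectangle with alpha 255 so it shows over the transparent background
        cv2.rectangle(transparent_img, (x, y), (x + w, y + h), (0, 255, 0, 255), 2)

    return transparent_img, gray
```

Encoding that BGRA image as a PNG keeps the alpha channel, which is what makes the overlay actually transparent instead of black.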
Now, to the next part. Eventually, I’m going to put OpenCV.js into the project, I think. For now, though, I can do what I want with Flask. This is the rapid prototyping portion, and I want to test my hypothesis.
The steps for this part are:
- Create a new basic Flask setup
- Add a video_feed call like the current SERINDA has, but this one will just return the OpenCV video capture with the motion detection code and not contain all of the other filters (see the sketch after this list)
- Display that feed as orthographic 2d HUD element of Three.js
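Here’s a minimal sketch of what that video_feed route could look like, reusing the process_frame() sketch from earlier. The route name matches what SERINDA already uses; the camera index and the PNG encoding are assumptions for this prototype.

```python
from flask import Flask, Response
import cv2

app = Flask(__name__)
camera = cv2.VideoCapture(0)  # assumes the first attached webcam

def generate_frames():
    prev_gray = None
    while True:
        ok, frame = camera.read()
        if not ok:
            break
        # process_frame() is the earlier sketch: a transparent BGRA image
        # with only the motion rectangles drawn on it
        overlay, prev_gray = process_frame(frame, prev_gray)
        # PNG keeps the alpha channel; JPEG would throw it away
        ok, buf = cv2.imencode('.png', overlay)
        if not ok:
            continue
        yield (b'--frame\r\n'
               b'Content-Type: image/png\r\n\r\n' + buf.tobytes() + b'\r\n')

@app.route('/video_feed')
def video_feed():
    return Response(generate_frames(),
                    mimetype='multipart/x-mixed-replace; boundary=frame')

if __name__ == '__main__':
    app.run(debug=True)
```

Whether the browser keeps honoring the transparency in a multipart stream like this is part of what the prototype has to prove.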
I hope this part works. There are more details I need to consider, like lining the video camera up with what the display actually shows, so that the motion detection is mostly calibrated and not like having a “lazy eye,” so to speak. Then I think the final portion of this would be to use WebRTC and OpenCV.js and incorporate this directly into the Triton code. I’ll only need one camera mounted to the front of the headset, shown stereoscopically across both eyes. Eventually, it’d be nice to have depth view, but we’ll see.
I went back to the original Flask startup that I had. I had found two that were similar, so I’m going to post the four links: here, code here, here, and code here. The one I stayed with is the last one. I had already coded the portions that are in it, but I wanted to start over so I could include the code I just wrote for the transparent image. I got this part to work just the way I thought it would.
So, now to marry the two. It may not be pretty, but the display should show the Three.js scene with the OpenCV transparent overlay on top of it. I’ll do this by including the img src in a div tag that sits on top of the Three.js scene. I’d really like to put this in the orthographic projection, but let’s get this first part to work.
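As a rough sketch of what I mean by stacking the two, assuming the /video_feed route from the Flask sketch above, the page could look something like this. The element ids and styling are placeholders, and the actual Three.js setup script is left out.

```python
from flask import render_template_string

# Served by the same Flask app as the video_feed sketch above. The layout
# stacks two full-window divs: the Three.js canvas underneath and the
# transparent OpenCV overlay <img> on top of it.
PAGE = """
<!doctype html>
<html>
  <head>
    <style>
      body { margin: 0; overflow: hidden; }
      #scene, #hud { position: absolute; top: 0; left: 0; width: 100%; height: 100%; }
      #hud img { width: 100%; height: 100%; }
      #hud { pointer-events: none; } /* clicks pass through to the scene */
    </style>
  </head>
  <body>
    <div id="scene"><!-- the Three.js renderer's canvas gets appended here --></div>
    <div id="hud"><img src="/video_feed"></div>
  </body>
</html>
"""

@app.route('/')
def index():
    return render_template_string(PAGE)
```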
Alright, with that complete, I now have a HUD that shows OpenCV information over Three.js. I need to get that display to work with the Three.js projection, and I’m not sure how to do that yet; I’ll need to do more research. I’m happy with the results so far. My next two steps will be integrating fully with the Three.js stereoscopic view and then making it work with OpenCV.js. Another option is to make a “viewscreen” that sits at a distance in the Three.js scene but has a VideoTexture and lives in the ortho 2D realm. There are options; I don’t know yet which one is the better option.
If you’d like to watch the video about making a transparent display you can watch it below.
Until next time. Dodadagohvi.