Osiyo. Dohiju? Hey, welcome back.
One element of SERINDA that I know better than all of the others, including OpenCV and OpenGL, is Natural Language Understanding (NLU). I’ve used CMU Sphinx since 2009, I think, and maybe earlier. Before that, I know I had my own NLU parser written in Java. If you’ve never used Sphinx, it’s amazing. I used it in conjunction with Sikuli to perform tasks on my Mac, so I could work at my desk and interact with iTunes and a few other items. Nothing amazing by today’s standards. Back then, though, it was awesome to be able to say “play next song” and have everything happen the way it was supposed to.
One of the issues with SERINDA, which I’ve mentioned before, is that I tried to shoehorn my experience into something that wasn’t as efficient as I would’ve liked. For example, the first version was an all-Java version. Then there was a Java version running on a Raspberry Pi (with Jasper and several other programs along the way). I would attempt to take one aspect of my code and make it work with another framework. That is, until I realized I’d written all of these filters for OpenCV in Python. Why should I rewrite all of them? Why not find a way to use Python as a server? And then came Flask.
I had several requirements. I didn’t want to go out to a service like Google, Wit.ai, etc. They’re good, but I want to handle everything on the SBC and not have to rely on Wi-Fi. This restriction caused a lot of issues over the years, but it’s one requirement I’m very happy I stuck with.
With Flask came the need to use a Python-only framework for NLU. I looked at Rasa. Not free to use. I looked at Snips and even have some code that ran with it. I create utterances with intents, so that when I say a phrase I get back what the intention was. Snips was the closest thing to the Java Speech Grammar Format, or JSGF, that I found. I like Snips, and if I hadn’t found the solution I’m going with I would’ve kept it. The issue I have is the YAML format, which I convert to JSON so Snips can read it. And because the documentation isn’t what I’d like, making entities and understanding them isn’t natural for me. It’s not bad; I’m just used to JSGF from all the years working with Sphinx.
I found Rhasspy-NLU. https://github.com/rhasspy/rhasspy-nlu I get a template language like the JSGF that I’m used to, and I get the Python-y goodness I wanted for this project.
Here’s how it works. First, install Rhasspy-NLU with:
pip3 install rhasspy-nlu
Create your config file. You can use whatever extension you want. Config files tend to be .ini, which has the benefit that your IDE can parse them. The general layout is:
[intentName]
(command1 | command2 | command3) item [(optional1 | optional2)] item2 (one:1 | two:2)
Here’s an example of how that looks.
[showGrid]
(show | display | draw) grid [(on | for)] camera (one:1 | two:2)
Parentheses () group alternative words, separated by |. Square brackets [] mark optional words. A colon substitutes the output, so saying “one” comes back as 1. For example: “show grid on camera one” will return the intent “showGrid”. As will “display grid for camera two” or “draw grid camera one”.
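To make the template rules a little more concrete, here’s a rough sketch in plain Python that lists every phrase the showGrid line accepts. It doesn’t use Rhasspy-NLU at all. The `slots` list and the `expand` helper are my own names for this illustration, and I’ve written the slots out by hand rather than parsing the template.

```python
from itertools import product

# Each slot of the showGrid template, written out by hand.
# An empty string stands for "this optional word was left out".
slots = [
    ["show", "display", "draw"],  # (show | display | draw)
    ["grid"],
    ["", "on", "for"],            # [(on | for)]
    ["camera"],
    ["one", "two"],               # (one:1 | two:2), spoken side only
]


def expand(slots):
    """Return every sentence the hand-written slots can produce."""
    sentences = []
    for combo in product(*slots):
        # Drop the empty strings left by skipped optional words.
        sentences.append(" ".join(word for word in combo if word))
    return sentences


sentences = expand(slots)
print(len(sentences))
for sentence in sentences:
    print(sentence)
```

One short template line covers a good number of spoken variations, which is the part I like about this style of grammar.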
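Setup and use are very easy. Here’s an example adapted from the docs:
from pathlib import Path
import rhasspynlu

# Load and parse the template file, then build the intent graph
intents = rhasspynlu.parse_ini(Path("sentences.ini"))
graph = rhasspynlu.intents_to_graph(intents)

# recognize() returns a list of recognition results
recognitions = rhasspynlu.recognize("show grid on camera one", graph)
print(recognitions)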
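Once you have a result back, you still need to do something with it. Here’s a minimal sketch of one way to route an intent name to a function. Everything in it is hypothetical: `show_grid`, `HANDLERS`, and `dispatch` are names I made up for this example, and I’m assuming you’ve already pulled the intent name and the substituted text (for example “show grid on camera 1”) out of the recognition result. Check the Rhasspy docs for the exact fields.

```python
def show_grid(camera):
    """Hypothetical handler; a real one would draw the grid overlay."""
    return f"grid shown on camera {camera}"


# Intent name -> handler. The keys match the [section] names in the template file.
HANDLERS = {"showGrid": show_grid}


def dispatch(intent_name, text):
    """Route a recognized intent to its handler.

    `text` is assumed to be the sentence after substitution,
    e.g. "show grid on camera 1", so the camera number is the last word.
    """
    handler = HANDLERS.get(intent_name)
    if handler is None:
        # Unknown intent: nothing to run.
        return None
    camera = int(text.split()[-1])
    return handler(camera)


print(dispatch("showGrid", "show grid on camera 1"))
```

Keeping the handlers in a dictionary means adding a new intent is just a new template section plus one more entry.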
I’ve checked in some code to SERINDA. You can look at the config file here and the parser here.
If you have any questions just ask.
Other links to check out
https://rhasspy.readthedocs.io/en/latest/
https://pypi.org/project/rhasspy-nlu/0.1/
Until next time. Dodadagohvi.