Over the last two weeks I worked on improving the database and a voice recognition app.
Our database layer made the code hard to read, and any change to it meant changing code all over the project. This is why I decided to take a second look at it and make some changes.
I looked at SQLAlchemy, but it seemed too complex for our needs. I did like a couple of its ideas, though, and tried to incorporate them into my code. For example, queries return an object instead of the raw tuple we had before. In addition, the functions take objects rather than lists of individual parameters.
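A minimal sketch of these two ideas, not the project's actual code: it assumes an SQLite database, and the `Room` class, the `RoomTable` wrapper and the `rooms` table are all made up for illustration.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Room:
    """Hypothetical record class; the real project's classes may differ."""
    id: Optional[int]
    name: str


class RoomTable:
    """Hypothetical wrapper that takes and returns Room objects."""

    def __init__(self, connection):
        self.connection = connection
        self.connection.execute(
            "CREATE TABLE IF NOT EXISTS rooms (id INTEGER PRIMARY KEY, name TEXT)"
        )

    def add(self, room):
        # The function takes an object, not a list of separate parameters.
        cursor = self.connection.execute(
            "INSERT INTO rooms (name) VALUES (?)", (room.name,)
        )
        return Room(id=cursor.lastrowid, name=room.name)

    def get(self, room_id):
        row = self.connection.execute(
            "SELECT id, name FROM rooms WHERE id = ?", (room_id,)
        ).fetchone()
        # Wrap the raw tuple in an object before handing it to the caller.
        return Room(*row) if row else None


table = RoomTable(sqlite3.connect(":memory:"))
corridor = table.add(Room(id=None, name="Corridor"))
print(table.get(corridor.id))  # Room(id=1, name='Corridor')
```

The calling code can then use `room.name` instead of remembering which position in the tuple holds the name.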
These were the two main changes I made to the code. They meant, however, that I also had to import the relevant classes into DatabaseTables.py, which unfortunately resulted in circular imports. While trying to fix that, I found that importing the entire module, rather than a particular class from it, helps in some cases.
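The difference between the two import styles can be reproduced with a small experiment. This is only a sketch: the module names (`bad_a`, `good_a` and so on) are invented, and the files are written to a temporary directory so that the example is self-contained.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module contents; the names are illustrative only.
SOURCES = {
    # Fails: each module asks for a class the other has not defined yet.
    "bad_a.py": "from bad_b import Item\nclass Room: pass\n",
    "bad_b.py": "from bad_a import Room\nclass Item: pass\n",
    # Works: only the module name is bound at import time; the class is
    # looked up later, when the method is actually called.
    "good_a.py": (
        "import good_b\n"
        "class Room:\n"
        "    def make_item(self):\n"
        "        return good_b.Item()\n"
    ),
    "good_b.py": (
        "import good_a\n"
        "class Item:\n"
        "    def make_room(self):\n"
        "        return good_a.Room()\n"
    ),
}


def try_import(module_name):
    """Return True if the module imports cleanly, False on ImportError."""
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False


workdir = Path(tempfile.mkdtemp())
for filename, source in SOURCES.items():
    (workdir / filename).write_text(source)
sys.path.insert(0, str(workdir))
importlib.invalidate_caches()

results = {name: try_import(name) for name in ("bad_a", "good_a")}
print(results)  # {'bad_a': False, 'good_a': True}
```

This only helps when the other module's names are used inside functions or methods. If a class is needed while the module is still loading (as a base class, for example), the circular import still fails.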
PocketSphinx works on many platforms and is in general a very good speech recognition package. There are even Python bindings for it, which makes it ideal for our project. However, it is very hard to install and use. I would need to write my own grammar for it and train it. I would also need another package to record the voice, since that is not included in PocketSphinx.
Because of that I looked at my alternative, the Android API. It is not perfect, since it only works on Android and needs an internet connection. It is, however, much simpler to use, which I welcomed. I did not manage to get continuous voice recognition working with it, so every time you want to send a command you need to press a button. That is not ideal, but it has to suffice for now.
The next problem I encountered was how to manage the commands. For one, the voice recognition is not perfect, so the recognised words are sometimes not exact. The other problem is that people usually refer to things by name, whereas our API uses ids (which makes more sense). Each room has a name, but now we have to ensure that the names are unique, otherwise my system won’t be able to differentiate between them. The same goes for the items in a room.
So a sample command would be:
Open Front Door in Corridor
Assuming Corridor has id 1 and Front Door has id 2, the RESTful command sent would be:
To translate between names and ids I needed to get the
and get the ids from there.
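The name-to-id lookup could look roughly like the sketch below. The `ROOMS` structure and the `resolve_command` function are assumptions made for illustration. They do not show the real format returned by our API, and the step of building the actual request is left out.

```python
# Hypothetical data, matching the example above: Corridor is id 1, Front Door is id 2.
ROOMS = [
    {"id": 1, "name": "Corridor", "items": [{"id": 2, "name": "Front Door"}]},
]


def resolve_command(text, rooms):
    """Split '<action> <item> in <room>' and map both names to ids.

    Returns (action, room_id, item_id), or None if anything is not recognised.
    """
    head, separator, room_name = text.strip().lower().rpartition(" in ")
    action, _, item_name = head.partition(" ")
    if not separator or not item_name:
        return None
    for room in rooms:
        if room["name"].lower() != room_name:
            continue
        for item in room["items"]:
            if item["name"].lower() == item_name:
                return (action, room["id"], item["id"])
    return None


print(resolve_command("Open Front Door in Corridor", ROOMS))  # ('open', 1, 2)
```

Because the lookup returns the first room whose name matches, two rooms with the same name could not be told apart. This is why the names have to be unique.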
To tackle the problem of inexact recognition, instead of taking only the first result, I take several of them (the API, if asked, gives several “best guesses”) and check each one against the list of names, starting with the best guess. That way it has a better chance of succeeding.
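A rough sketch of this fallback, assuming the recogniser's guesses arrive as a list of strings ordered from best to worst. `KNOWN_IDS`, `resolve` and `first_resolvable` are hypothetical names, not part of the Android API or of our code.

```python
# Hypothetical lookup table: (room name, item name) -> (room id, item id).
KNOWN_IDS = {("corridor", "front door"): (1, 2)}


def resolve(text):
    """Return (action, room_id, item_id) for '<action> <item> in <room>', else None."""
    head, separator, room_name = text.strip().lower().rpartition(" in ")
    action, _, item_name = head.partition(" ")
    ids = KNOWN_IDS.get((room_name, item_name))
    if not separator or ids is None:
        return None
    return (action,) + ids


def first_resolvable(candidates):
    """Try each guess in ranked order and return the first one that resolves."""
    for text in candidates:
        result = resolve(text)
        if result is not None:
            return result
    return None


guesses = [
    "open front or in corridor",    # best guess, but the item name is wrong
    "open front door in corridor",  # second guess matches known names
    "open font door",
]
print(first_resolvable(guesses))  # ('open', 1, 2)
```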