It has been four years since I adopted Home Assistant, and in that time the number of devices on my network has expanded dramatically. At the time of writing, Home Assistant recognizes, tracks, and controls 338 devices and 1623 entities (properties of a device) in my home. Although much of my programming consisted of simple if/then commands to control the home, eventually the burden of tracking and coordinating so many devices became too much for my Raspberry Pi to handle, and the automations and scripts in my house began to suffer noticeable latency. Combined with my desire to embark on an ambitious new project, I decided it was time to upgrade to a more performant and permanent solution: a dedicated x86 PC.
The full scope of automations in my house could fill an entire book, and I plan to create a series of YouTube videos showcasing some of my favorites, but some highlights include (and may be the topic of future write-ups):
1. Lights that turn on with motion, and change brightness depending on ambient lighting conditions
2. Shades that move to block out glare from the sun, while keeping as much sunlight as possible
3. Security systems that announce guests and unlock doors for them at parties
4. Wall-mounted dashboards in every room, plus dashboards on mobile phones that change depending on which room a user is in
5. Building controls that change based on who in the house is awake.
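To give a sense of what these look like under the hood, here is a minimal sketch of the first automation in Home Assistant's YAML format. The entity IDs and lux thresholds are hypothetical, not my actual configuration:

```yaml
# Hypothetical sketch: motion-activated light whose brightness
# depends on the ambient light level reported by a lux sensor.
automation:
  - alias: "Hallway motion light"
    trigger:
      - platform: state
        entity_id: binary_sensor.hallway_motion
        to: "on"
    action:
      - service: light.turn_on
        target:
          entity_id: light.hallway
        data:
          # Dim when the room is dark, brighter when ambient light is high
          brightness_pct: >
            {{ 30 if states('sensor.hallway_illuminance') | float(0) < 10 else 80 }}
```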
However, the project that prompted this big upgrade and rewrite was something more ambitious than any of these: building my own fully self-hosted, fully offline Amazon Echo (or "Alexa"). Building this service required a complex chain of interlinked open-source software that can all run locally. I have Wi-Fi-connected ESP32 microphones scattered around the house, all streaming audio to the Home Assistant PC, where openWakeWord listens for the command "hey Jarvis" to begin processing (similar to how Amazon Echos listen for the word "Alexa" before activating). Once the wake word is detected, audio is piped into Whisper, an open-source speech-to-text program that turns that audio into text. That text is then fed to Ollama on one of my dedicated AI computers, running either Llama 3 or Phi-3. The response text is then sent back to the Home Assistant server, where it is converted to an audio file by Piper, an open-source TTS neural network, and streamed to the relevant connected speaker. The dataflow involves many steps, but each one is simple, and thankfully Home Assistant supports a protocol, called Wyoming, that ties these services together.
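For anyone reproducing this, each stage of the chain can run as its own Wyoming service that Home Assistant connects to over the network. A rough docker-compose sketch follows; the image names and default ports come from the rhasspy project, but the models, flags, and port mappings are illustrative and may need adjusting for your hardware:

```yaml
# Sketch of the local voice pipeline services; Home Assistant's
# Wyoming integration is then pointed at each host:port.
services:
  whisper:              # speech-to-text
    image: rhasspy/wyoming-whisper
    command: --model tiny-int8 --language en
    ports: ["10300:10300"]
  piper:                # text-to-speech
    image: rhasspy/wyoming-piper
    command: --voice en_US-lessac-medium
    ports: ["10200:10200"]
  openwakeword:         # wake word detection ("hey Jarvis")
    image: rhasspy/wyoming-openwakeword
    command: --preload-model hey_jarvis
    ports: ["10400:10400"]
```

Ollama runs separately on the AI machine and is added through its own Home Assistant integration rather than Wyoming.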
By default, the Wyoming Protocol sends voice responses back to the same device that heard the wake word and recorded the prompt. This is not ideal, because the built-in speaker is very quiet, but fortunately this guide shows how you can redirect the audio to a different speaker. In my case, I program the lights to flash when a wake word is detected (so that it is glaringly obvious that the system is now listening) and then play responses on the nearest smart speaker (usually one of my Google Home Minis, on mute).
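The light flash is itself just another automation: ESPHome-based voice satellites expose an "assist in progress" binary sensor that can serve as the trigger. A hypothetical sketch (entity names are made up for illustration):

```yaml
# Hypothetical sketch: flash the room's lights when a voice
# satellite starts listening, so it is obvious the mic is live.
automation:
  - alias: "Flash lights on wake word"
    trigger:
      - platform: state
        entity_id: binary_sensor.kitchen_satellite_assist_in_progress
        to: "on"
    action:
      - service: light.turn_on
        target:
          entity_id: light.kitchen
        data:
          flash: short
```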
This project has been a fantastic improvement over using Google Home or Alexa, especially since I rarely use voice commands to control the home (the house should already know what to do), but instead use voice primarily for question answering, and Llama3 is much more capable than whatever Google or Amazon uses.
I also used this complete rewrite as an opportunity to change how tasks in the house are run. In the past, small errors from sensors or commands lost in transmission would accumulate and cause the house to run amok. Rather than the simple trigger -> action structure I had used before, I changed to a structure that continuously self-checks that devices are in their proper state. A self-check routine runs every 5 minutes (usually resulting in no action) and whenever a relevant trigger fires (such as a motion sensor detecting motion). This new structure has led to a substantial decrease in unexpected behavior, and I have even watched it self-correct: for example, all but one shade moves to the proper position, and 5 minutes later the final shade is re-sent the command and moves into place.
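This pattern is straightforward to express in Home Assistant: the same automation fires on both its event trigger and a timer, and the action simply re-asserts the desired state (idempotent services like cover.set_cover_position make the periodic retry harmless). A hypothetical sketch using a helper to hold the target position:

```yaml
# Hypothetical sketch of a self-checking automation: runs on its
# trigger AND every 5 minutes, re-asserting the desired state.
automation:
  - alias: "Keep office shade at its target position"
    trigger:
      - platform: time_pattern
        minutes: "/5"
      - platform: state
        entity_id: input_number.office_shade_target
    condition:
      # Only act when the shade has drifted from its target
      - condition: template
        value_template: >
          {{ (state_attr('cover.office_shade', 'current_position') | int(0))
             != (states('input_number.office_shade_target') | int(0)) }}
    action:
      - service: cover.set_cover_position
        target:
          entity_id: cover.office_shade
        data:
          position: "{{ states('input_number.office_shade_target') | int(0) }}"
```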
As almost every imaginable device in my house has been integrated into my smart home, my focus now has been to make systems that are more reliable and more intelligent. As AI becomes more accessible, my house will become simultaneously more functional and more private.