Analysis, on some occasions, complicates things rather than simplifying them. Luckily there is another side to this, where you see results instead of simple confusion. If I sit alone, given that I'm comfortable with my surroundings, a lot of things just flip through my thoughts. Thinking about many things time and again without a constructive output makes me sad, so I decided to use my blog to consolidate, accumulate and build ideas.
One of the things that comes to mind is smart surroundings. I'm fascinated by the idea of building a central command for my home/room. Let me talk further about it.
My objective is to build a smart home. My design is fairly simple: a processing unit, an interface between me and the processing unit, and possibly different interfaces between the processing unit and devices like a television, a refrigerator, a cooking device (possibly a microwave) and HVAC (heating, ventilation and air conditioning). The list is long but feasible, and there are several ways to automate your life :)
Let me talk about the human-command interface. The easiest option is a simple terminal with a keyboard, mouse and monitor. But I wanted to use speech recognition. My thought was to use temporal and spectral (frequency-space) Fourier decomposition to identify/characterise sound, and then apply pattern matching. I know simple deterministic approaches to do this, but most probably they would not yield good results. The other way to do pattern recognition is to use ANNs (artificial neural networks).
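To make the "Fourier decomposition plus pattern matching" idea concrete, here is a minimal sketch of the simple deterministic approach. It is only an illustration under loud assumptions: the sounds are synthetic sine tones rather than recorded speech, the DFT is a naive pure-Python one, and every name in it (`spectral_signature`, `best_match`, `tone`, the 64-sample frame size) is made up for this example.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitude spectrum of one frame via a naive O(n^2) DFT (bins 0..n/2)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def spectral_signature(signal, frame_size=64):
    """Cut the signal into frames (time) and average their spectra (frequency)."""
    frames = [signal[i:i + frame_size]
              for i in range(0, len(signal) - frame_size + 1, frame_size)]
    spectra = [dft_magnitudes(f) for f in frames]
    return [sum(column) / len(spectra) for column in zip(*spectra)]

def cosine_similarity(a, b):
    """Similarity of two feature vectors, 1.0 meaning the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def best_match(signal, templates):
    """Return the name of the stored template closest to the signal."""
    signature = spectral_signature(signal)
    return max(templates, key=lambda name: cosine_similarity(signature, templates[name]))

def tone(freq_bin, n=256, frame_size=64, phase=0.0):
    """Hypothetical stand-in for a recording: a sine centred on one DFT bin."""
    return [math.sin(2 * math.pi * freq_bin * t / frame_size + phase) for t in range(n)]

# Two made-up "commands", each stored as a spectral template.
templates = {
    "low": spectral_signature(tone(4)),
    "high": spectral_signature(tone(12)),
}
```

Calling `best_match(tone(4, phase=0.7), templates)` picks the "low" template even though the phase differs, because only magnitudes are compared. Real speech varies in speed, pitch and loudness, which is exactly why I expect an approach this simple to struggle.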
I examined a couple of the speech recognition programs that are available. The first one I tried was Microsoft Speech, which tries to write a document from dictation. It is poor software: the success rate (word identification) is far too low. I can understand the problem, as it has to correlate a voice signal with a vast set of outputs (the whole English language) and it is designed to work with all voices. After this I read a couple of articles about voice commands (as opposed to whole speech-to-text conversion).
The articles mention one nice thing: you can improve command recognition accuracy to around 95% by using a special syntax of word combinations. This gives me hope, as my command pool would not be as vast as the whole English language. One more advantage in my case is that I only care about my own voice being recognised.
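A small sketch of why a restricted command syntax helps: if every command must have the form "<device> <action>" and both words come from a tiny vocabulary, a slightly misrecognised word can be snapped to the nearest allowed word and everything else rejected. The vocabulary, the two-word syntax and the function names below are my own assumptions for illustration, not something taken from the articles. The fuzzy matching uses `difflib.get_close_matches` from the Python standard library.

```python
import difflib

# Hypothetical command pool: a handful of devices and actions.
DEVICES = ["television", "refrigerator", "microwave", "heating"]
ACTIONS = ["on", "off", "up", "down"]

def snap(word, vocabulary, cutoff=0.6):
    """Map a possibly misrecognised word to the closest vocabulary word, or None."""
    matches = difflib.get_close_matches(word.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None

def parse_command(text):
    """Accept only '<device> <action>'; return (device, action) or None."""
    words = text.split()
    if len(words) != 2:
        return None  # outside the command syntax
    device = snap(words[0], DEVICES)
    action = snap(words[1], ACTIONS)
    if device is None or action is None:
        return None  # word not close to anything in the pool
    return device, action
```

For example, `parse_command("televisien off")` still resolves to the television being switched off, while a free-form sentence like "open the door" is simply refused instead of being guessed at.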
Next, the command unit: processing can be done on a typical Linux box with the needed hardware interfaces. I need to learn a lot of stuff here, but it is possible.
Next is the command processor and device interface. I will explain a typical device, the TV. This is a very simple case, as every TV comes with an IR receiver (for the remote). You can attach an IR source to a Linux box and design a software remote that generates the same signals as the TV remote controller. This must have been done already; I need to search for the resources. And for all typical on/off electrical devices you can use actuators along with a controller.
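To give a feel for what "generating the same signals as the remote" could mean, here is a rough sketch that builds the pulse timings for one button press. It assumes the TV uses the widely used NEC infrared protocol, which is purely an assumption on my part: different manufacturers use different protocols, and the address and command values below are hypothetical. The sketch only computes (mark, space) durations in microseconds; actually driving an IR LED with a 38 kHz carrier is left out.

```python
# Nominal NEC protocol timings, in microseconds.
LEADER_MARK_US = 9000.0
LEADER_SPACE_US = 4500.0
BIT_MARK_US = 562.5
ZERO_SPACE_US = 562.5
ONE_SPACE_US = 1687.5

def nec_frame(address, command):
    """Return a list of (mark_us, space_us) pairs for one NEC frame.

    The frame is: leader, then address, inverted address, command and
    inverted command (each 8 bits, least significant bit first), then a
    final mark that terminates the last space.
    """
    address &= 0xFF
    command &= 0xFF
    payload = [address, address ^ 0xFF, command, command ^ 0xFF]

    pulses = [(LEADER_MARK_US, LEADER_SPACE_US)]
    for byte in payload:
        for bit in range(8):  # LSB first
            is_one = (byte >> bit) & 1
            pulses.append((BIT_MARK_US, ONE_SPACE_US if is_one else ZERO_SPACE_US))
    pulses.append((BIT_MARK_US, 0.0))  # trailing mark, no space after it
    return pulses

# Hypothetical codes for a "power" button on an imaginary TV.
power_frame = nec_frame(address=0x04, command=0x08)
```

Because each byte is sent together with its inverse, every frame contains exactly 16 ones and 16 zeros, so all frames have the same length. That makes the output easy to sanity-check before pointing an IR source at a real device.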
So, as a whole, it can be done. And it has been done commercially too (it must have been!!). Anyway, it is not the end result that fascinates me but the process that makes it possible. So from here on I will keep updating my findings about the individual topics related to this. It may take years, if I do it at all, but I will enjoy building on my thoughts :)