This first article in the series gives a brief overview of the Kinect
sensor, explains terms such as natural user interface and machine
learning, and describes what is required to make these work.
[Copyright Microsoft.]
How wonderful life would be if you could control:
- A television without a remote control
- A computer without a keyboard, mouse or touch screen
- Games without any controller in your hand
Well, all this is possible with the Kinect. Though it was initially
invented for gaming, people have begun using it for other purposes
(more on this later).
The Kinect is a motion-sensing device developed by Microsoft for the
Xbox 360 video game console. The main idea was to be able to use a
gaming console without any kind of controller. The Kinect is packed
with an array of sensors and specialised hardware that preprocesses
the information it receives. Communication between the Kinect and the
game console, or a PC running Linux, is through a single USB cable.
Its main features include:
- Gesture recognition: It can recognise gestures like hand movements, based on inputs from an RGB camera and depth sensor.
- Speech recognition: It can recognise spoken words and convert them into text, although accuracy depends heavily on the dictionary used. Input is from a microphone array.
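To give a feel for how depth data can feed gesture recognition, here is a toy sketch. It is not the Kinect's own algorithm, and every name in it (nearest_blob_centroid, classify_swipe, make_frame, the 100 mm band) is a hypothetical choice made for illustration. It assumes a depth frame is simply a grid of distances in millimetres, treats the pixels closest to the sensor as the "hand", and calls the gesture a swipe if that blob moves far enough sideways across a few frames.

```python
# Toy illustration only: this is NOT the Kinect's actual gesture pipeline.
# A depth frame is modelled as a list of rows of distances in millimetres,
# with 0 meaning "no reading". All names and thresholds are hypothetical.


def nearest_blob_centroid(depth_frame, band_mm=100):
    """Return the (x, y) centroid of pixels within band_mm of the closest pixel."""
    valid = [d for row in depth_frame for d in row if d > 0]
    if not valid:
        return None
    nearest = min(valid)
    xs, ys = [], []
    for y, row in enumerate(depth_frame):
        for x, d in enumerate(row):
            if 0 < d <= nearest + band_mm:
                xs.append(x)
                ys.append(y)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def classify_swipe(centroids, min_travel_px=3):
    """Label a sequence of centroids as 'swipe_right', 'swipe_left' or 'none'."""
    points = [c for c in centroids if c is not None]
    if len(points) < 2:
        return "none"
    travel = points[-1][0] - points[0][0]  # horizontal movement in pixels
    if travel >= min_travel_px:
        return "swipe_right"
    if travel <= -min_travel_px:
        return "swipe_left"
    return "none"


def make_frame(hand_x, width=8, height=4, background_mm=2000, hand_mm=800):
    """Build a synthetic frame: flat background plus a one-pixel-wide 'hand' column."""
    frame = [[background_mm] * width for _ in range(height)]
    for y in range(height):
        frame[y][hand_x] = hand_mm
    return frame


# Three synthetic frames in which the "hand" moves from column 1 to column 5.
frames = [make_frame(x) for x in (1, 3, 5)]
track = [nearest_blob_centroid(f) for f in frames]
print(classify_swipe(track))
```

A real system would combine this kind of depth segmentation with the RGB image and a trained model, but the basic idea of tracking the nearest object over time is the same.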
The main components are the RGB camera, depth sensor and microphone
array. The depth sensor combines an IR laser projector with a
monochrome CMOS sensor to get 3D video data. Besides these, there is
a motor to tilt the sensor array up and down for the best view of the
scene, and an accelerometer to sense the device's orientation.
Although the Kinect was developed by Microsoft, within a week a FOSS
enthusiast had developed an open source driver for it. Microsoft denies
that the Kinect was “hacked”. After watching the video
developed by the FOSS enthusiast, Microsoft’s Alex Kipman, speaking
formally on NPR’s Science Friday, said: “The first thing to talk
about is, Kinect was not actually hacked. Hacking would mean that
someone got to our algorithms that sit inside of the Xbox and was
able to actually use them, which hasn’t happened. Or, it means that
you put a device between the sensor and the Xbox for means of
cheating, which also has not happened. That’s what we call hacking,
and that’s what we have put a ton of work and effort to make sure
doesn’t actually occur. What has happened is that someone wrote an
open source driver for PCs that essentially opens the USB connection,
which we didn’t protect by design, and reads the inputs from the
sensor. The sensor, again, as I said earlier, has eyes and ears, and
that’s a whole bunch of noise that someone needs to take and turn
into signals.”
Well, when we open source users use the word “hack”, we mean something
completely different. But, anyway, that’s another story.
To get the best out of the Kinect, you need to be clear about two
terms: “natural user interface” and “machine learning”. Natural user
interface (NUI) refers to close, natural interaction between the user
and the computer. It includes controlling the computer by gestures,
or the computer recognising the user’s voice or face. Microsoft
Surface, multi-touch and Kinect are a few examples of NUI.
Machine learning, “a branch of artificial intelligence, is a
scientific discipline concerned with the design and development of
algorithms that allow computers to evolve behavior based on empirical
data, such as from sensor data.” (Wikipedia).
Open platforms supporting the Kinect: The OpenKinect community
was founded by the developer of the Kinect open source driver.
Another organisation, OpenNI, was founded by PrimeSense,
Willow Garage, Side-Kick and ASUS. OpenKinect publishes its code
under the Apache 2.0 or GPL 2 licenses, while OpenNI publishes its
work under different licenses. There are two main open platforms or
libraries, namely libfreenect and OpenNI. These have been developed
for almost the same purpose, and both support various languages like
Python, C++, C#, JavaScript, Java (JNI and JNA) and ActionScript.
Some amazing things/projects that can be done with Kinect:
- ROS (Robot Operating System) Kinect is an open source project focused on the integration of the Kinect sensor with ROS. ROS uses both drivers — OpenNI and libfreenect.
- Kinect-controlled computers: Based on user gestures and/or speech recognition, Kinect can control the computer — although no one has done it for Linux yet.
- Scanning of 3D objects: Kinect enables robots to map objects in 3D, and can also be used to build detailed models of people.
- Medical applications: Gesture-based control of surgical tools.
- Education: Writing and calculator applications have already been developed using it.
- And there are many more…
In subsequent articles, I plan to cover the installation of
libfreenect and some interesting Kinect applications (of course, with
source code).