Salons-ii
Project card
- Project Name: People 3D tracking using volumetric primitives
- Authors: Sara Marugán Alonso (smarugan [at] gsyc [point] es) and Jose María Cañas Plaza (jmplaza [at] gsyc [point] es)
- Academic Years: 2009-2010
- Degree: Grad
- Jde Version: jderobot-4.3.0
- SVN Repository: source code (restricted)
- Tags: computer vision, fall detection, 3D tracking
- Technology: c, jderobot
- State: Finished
- Source License: not defined
- Abstract:
The aim of this project is to collaborate with the Eldercare project.
Project evolution
SIFT matching schema
This is a schema that uses SIFT features to match image pixels [11/08]
SURF matching schema
This is a schema that uses SURF features to match image pixels [7/05/09]
Motion segmentation schema
This video shows the regions generated by a motion segmentation schema [12/08]
Tracker2D schema
I have developed a schema that implements an evolutionary algorithm for tracking people. It relies on different features extracted from an image sequence. At the moment, these features are motion segmentation and automatic color learning.
The algorithm uses a population of races, and each race tracks one person. Motion detection generates new races. A race is made up of a group of rectangles, which change their position and size at every iteration according to the feature analysis. The videos below show the representative rectangle of each race.
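The race idea can be illustrated with a minimal Python sketch (the project itself is written in C). Everything here is an assumption made for illustration: the `Rect` type, the mutation step sizes, and the "keep the fitter half" selection rule are hypothetical and are not taken from the Tracker2D source. The fitness function is passed in as a parameter.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: float  # center x, in pixels
    y: float  # center y, in pixels
    w: float  # width, in pixels
    h: float  # height, in pixels


def mutate(rect, rng, pos_step=5.0, size_step=3.0):
    """Return a randomly perturbed copy (step sizes are assumptions)."""
    return Rect(
        rect.x + rng.uniform(-pos_step, pos_step),
        rect.y + rng.uniform(-pos_step, pos_step),
        max(1.0, rect.w + rng.uniform(-size_step, size_step)),
        max(1.0, rect.h + rng.uniform(-size_step, size_step)),
    )


def evolve_race(race, fitness, rng):
    """One iteration: keep the fitter half, refill with mutated copies."""
    ranked = sorted(race, key=fitness, reverse=True)
    survivors = ranked[: max(1, len(ranked) // 2)]
    children = [
        mutate(rng.choice(survivors), rng)
        for _ in range(len(race) - len(survivors))
    ]
    return survivors + children


def representative(race, fitness):
    """The representative rectangle is taken here as the fittest individual."""
    return max(race, key=fitness)
```

Because the fittest rectangle always survives in this sketch, the fitness of the representative never decreases from one iteration to the next for a fixed fitness function.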
Tracking demos
Tracker2D for 3 people [24/03/09]
This video shows the best color- and motion-based tracking so far [13/05/09]
This video shows the best color- and motion-based tracking so far [9/06/09]
Automatic color learning [24/03/09]
In these videos we can see the color filter that results from color learning. Pixels that do not pass the filter are painted grey. The color model used is HSV.
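A color filter of this kind can be sketched in a few lines of Python. This is only an illustration: the hue range, the saturation and value thresholds, and the function names are assumptions, and the learning step that would produce the hue range is not shown.

```python
import colorsys

GREY = (128, 128, 128)


def passes_filter(rgb, hue_range, min_sat=0.3, min_val=0.2):
    """True if an 8-bit RGB pixel falls inside the learned HSV range."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    lo, hi = hue_range
    if lo <= hi:
        in_hue = lo <= h <= hi
    else:
        # The range wraps around hue 0 (for example, reds).
        in_hue = h >= lo or h <= hi
    return in_hue and s >= min_sat and v >= min_val


def apply_filter(image, hue_range):
    """Paint every pixel that does not pass the filter grey."""
    return [
        [px if passes_filter(px, hue_range) else GREY for px in row]
        for row in image
    ]
```

Here `image` is a list of rows of `(r, g, b)` tuples, and hue is expressed in the 0 to 1 range used by `colorsys`.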
Improving the algorithm with new kinds of image features [11/04/09]
Color and motion information make it possible to track people quite well. Nevertheless, in order to make the application more robust, I have been testing the algorithm with other kinds of image features. Specifically, I use SIFT and SURF for pixel matching, and Lucas-Kanade point tracking. These techniques require considerable computational resources, so the algorithm runs too slowly. To fix this, I have split the fitness calculation into two steps. The first step computes the representative rectangle of each race using color and motion information, and the second step computes the fitness of the new features using that representative rectangle. As a result, the algorithm only has to match pixels inside the representative rectangle of each race, where the person will normally be. This change lets the application run an acceptable number of iterations per second.
SIFT features
This is the same video, but tracker2D is using SIFT features. Correctly matched points are drawn in green.
SURF features
This is the same video but tracker2D is using SURF features [7/05/09]
Lucas-Kanade features
This is the same video but tracker2D is using Lucas-Kanade point tracking [7/05/09]
Kalman filter over evolutionary algorithm [2/06/09]
New fitness function based on four features [20/07/09][16/09/09]
I have changed the fitness function from two steps to one step. It uses color, motion, optical flow (Lucas-Kanade) and the Kalman filter position estimate to calculate the fitness value of the individuals of a race.
For each race, optical flow is now analyzed over an influence window, which is drawn in blue in the following video.
In this video the individuals of each race are drawn in red, blue or green, depending on their fitness value. Green represents high fitness and red represents low fitness [16/09/09]
Fitness function analysis [14/10/09]
I have been checking the fitness function. It is composed of four kinds of feature measurements: color, motion, Lucas-Kanade matching and Kalman position estimation. For each individual (Ri) of a race, which is represented as a rectangle, I obtain a partial fitness value for each kind of feature. The total fitness (Hi) is a linear combination of these partial fitness values.
Color fitness
Hc_i = K / (Ri.width * Ri.height)
K = number of pixels inside the rectangle (Ri) that pass the color filter
Motion fitness
Hmo_i = K / (Ri.width * Ri.height)
K = number of pixels inside the rectangle (Ri) that pass the motion filter
Matching fitness
Hma_i = MIN(density * 10 + K / Ntotal_feature_points, 1.0)
density = K / (Ri.width * Ri.height)
K = number of feature points inside the rectangle
Ntotal_feature_points = total number of feature points of the race that individual Ri belongs to
* 1.0 is a saturation threshold
Position fitness
Hpos_i = MAX(1.0 - (D / Ri.height), 0.0)
D = distance between the center of Ri and the position estimated by the Kalman filter
* Ri.height is a threshold
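The four partial fitness values above translate directly into code. The following Python sketch mirrors the formulas; the function names are hypothetical, and the equal weights in `total_fitness` are an assumption, since the coefficients of the linear combination are not given here. Returning 0.0 when a race has no feature points is also an assumption.

```python
def color_fitness(k_color, width, height):
    """Hc_i: fraction of the rectangle's pixels that pass the color filter."""
    return k_color / (width * height)


def motion_fitness(k_motion, width, height):
    """Hmo_i: fraction of the rectangle's pixels that pass the motion filter."""
    return k_motion / (width * height)


def matching_fitness(k_points, n_total_points, width, height):
    """Hma_i: feature point density plus coverage, saturated at 1.0."""
    if n_total_points == 0:
        return 0.0  # assumption: no feature points means no matching evidence
    density = k_points / (width * height)
    return min(density * 10 + k_points / n_total_points, 1.0)


def position_fitness(distance, height):
    """Hpos_i: decreases linearly with the distance to the Kalman estimate."""
    return max(1.0 - distance / height, 0.0)


def total_fitness(partials, weights=(0.25, 0.25, 0.25, 0.25)):
    """Hi: linear combination of the partial values (weights are assumed)."""
    return sum(w * p for w, p in zip(weights, partials))
```

For example, a 10x10 rectangle with 50 pixels passing the color filter has a color fitness of 0.5, and a rectangle whose center lies a quarter of its height away from the Kalman estimate has a position fitness of 0.75.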
The first analysis experiment looked at position. For a race at a particular instant, I took its representative rectangle and moved it around its neighborhood to see how the fitness values change. I analyzed both the partial fitness results and the total fitness. The following image shows the results.
The second experiment looked at size. As in the first analysis, I took the representative rectangle of a race and explored all the possible sizes (width and height) in its neighborhood. The image shows the best option in green, and the 3D plot shows the fitness values for the whole exploration.
The third experiment consists of exploring the four dimensions at the same time. I have written a small program to do that. It analyzes several samples of frame and race state that were saved during a normal execution.
For each sample, the program takes the stored race state and explores different positions and sizes, as a local search. The purpose is to select the rectangle with the best fitness through the local search so that it can be compared with the result of the algorithm.
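A local search of this kind can be sketched as an exhaustive scan over the four dimensions. This is a minimal Python illustration, not the program described above: the rectangle layout `(x, y, width, height)`, the `radius` and `step` parameters, and the function name are all assumptions.

```python
from itertools import product


def local_search(rect, fitness, radius=2, step=1):
    """Scan position and size around `rect` and return the fittest candidate.

    `rect` is (x, y, width, height); `fitness` maps such a tuple to a number.
    """
    x, y, w, h = rect
    offsets = range(-radius, radius + 1, step)
    best, best_fitness = rect, fitness(rect)
    for dx, dy, dw, dh in product(offsets, repeat=4):
        candidate = (x + dx, y + dy, w + dw, h + dh)
        if candidate[2] <= 0 or candidate[3] <= 0:
            continue  # skip degenerate rectangles
        value = fitness(candidate)
        if value > best_fitness:
            best, best_fitness = candidate, value
    return best, best_fitness
```

The cost grows with the fourth power of the number of offsets per dimension, which is why this kind of search is expensive compared with a single fitness evaluation.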
The program also lets the user mark the real position and size of the person in each sample.
I stored 23 samples from one execution with one person and analyzed them with the program. This gave me a comparison between the evolutionary algorithm and the local search. The first plot shows the estimation errors of the evolutionary algorithm with respect to the real data, and the second plot shows the same for the local search.
The results show that the position and width estimation errors are 15 pixels on average, while the height estimation error is 60 pixels on average. They also show that local search does not improve the results enough to justify including it in the evolutionary algorithm, given its computational cost.
Kalman filter
This is a Kalman filter implementation for tracking one person. The green cross represents the real state and the blue cross represents the predicted state [19/05/09]
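For reference, a Kalman filter for one image axis can be written in pure Python as follows. This is a sketch under stated assumptions, not the filter used in the videos: it assumes a constant-velocity motion model with a position-only measurement, and the class name and the noise values `p0`, `q` and `r` are hypothetical. Tracking a 2D position would use one such filter per axis.

```python
class AxisKalman:
    """Constant-velocity Kalman filter for one image axis."""

    def __init__(self, pos=0.0, vel=0.0, p0=100.0, q=0.01, r=1.0):
        self.pos, self.vel = pos, vel
        self.p = [[p0, 0.0], [0.0, p0]]  # state covariance
        self.q, self.r = q, r            # process and measurement noise

    def predict(self, dt=1.0):
        """Advance the state by dt and return the predicted position."""
        self.pos += self.vel * dt
        (p00, p01), (p10, p11) = self.p
        # P = F P F^T + Q, with F = [[1, dt], [0, 1]] and Q = q * I
        self.p = [
            [p00 + dt * (p10 + p01) + dt * dt * p11 + self.q, p01 + dt * p11],
            [p10 + dt * p11, p11 + self.q],
        ]
        return self.pos

    def update(self, z):
        """Correct the state with a measured position z."""
        (p00, p01), (p10, p11) = self.p
        s = p00 + self.r              # innovation covariance
        k0, k1 = p00 / s, p10 / s     # Kalman gain
        y = z - self.pos              # innovation
        self.pos += k0 * y
        self.vel += k1 * y
        self.p = [
            [(1 - k0) * p00, (1 - k0) * p01],
            [p10 - k1 * p00, p11 - k1 * p01],
        ]
        return self.pos
```

In terms of the video, the value returned by `predict` would correspond to the blue cross and the measurement passed to `update` to the observed position.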
This is a first Kalman filter implementation for tracking several people [20/05/09]
Several Kalman filters run at the same time. Motion ROIs from the motion segmentation schema are assigned to each Kalman estimator according to their position.
This is the second Kalman filter implementation for tracking several people [2/06/09]
I have improved the selection of the motion ROI that is fed back to each Kalman filter. Now I also compare ROI sizes. In addition, if a Kalman filter receives no feedback, it checks its prediction against the motion information and, if the prediction is valid, takes it as feedback.
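The ROI selection described above can be sketched as a cost that combines position and size, with a fallback to the prediction itself. This Python sketch is illustrative only: the cost formula, `size_weight`, `max_cost`, and the `has_motion` callback are assumptions and do not come from the schema's source.

```python
import math


def roi_cost(predicted, roi, size_weight=0.5):
    """Dissimilarity between a predicted box and a motion ROI.

    Both are (center_x, center_y, width, height) tuples in pixels.
    """
    distance = math.hypot(predicted[0] - roi[0], predicted[1] - roi[1])
    size_diff = abs(predicted[2] - roi[2]) + abs(predicted[3] - roi[3])
    return distance + size_weight * size_diff


def select_feedback(predicted, rois, has_motion, max_cost=50.0):
    """Choose the measurement to feed back to one Kalman filter.

    Returns the closest ROI if it is similar enough, otherwise the
    prediction itself when `has_motion(predicted)` confirms it,
    otherwise None.
    """
    best = min(rois, key=lambda roi: roi_cost(predicted, roi), default=None)
    if best is not None and roi_cost(predicted, best) <= max_cost:
        return best
    if has_motion(predicted):
        return predicted
    return None
```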
Kalman filter over optical flow [23/06/09]
This 2D tracker implementation takes the Lucas-Kanade optical flow calculation as the measurement for the Kalman filter. Red points are Lucas-Kanade measurements and green points are Kalman filter estimates.
Feature points are calculated with the Good Features to Track algorithm (OpenCV implementation).
First experiment: chess table
Second experiment: non motion masked video
Third experiment: motion masked video
Tracker3D schema over Tracker2D schema [20/10/09]
I have implemented a 3D tracker schema that uses information from two calibrated cameras. Each camera image is analyzed by a tracker2D schema, and the results are examined by the tracker3D schema.
The following image shows the global design diagram of Tracker3D.
It takes the region of interest (ROI) from camera A and back-projects it to obtain several samples along the back-projection line. It then projects all the samples into image B and compares them with the ROI from camera B. This makes it possible to select the rectangle most similar to the camera B ROI and obtain the 3D position estimate. The process is shown in the image below.
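The back-projection step can be sketched with a simple pinhole camera model. This Python example is a simplification made for illustration: it compares only the ROI centers rather than whole rectangles, and the `Camera` class, its parameters and `locate_3d` are hypothetical names that do not correspond to the Progeo API.

```python
import math


class Camera:
    """Pinhole camera: world point X maps to f * (R (X - C)) / z + pp."""

    def __init__(self, center, rotation, focal, principal=(160.0, 120.0)):
        self.c = center      # camera position in world coordinates
        self.r = rotation    # 3x3 world-to-camera rotation (list of rows)
        self.f = focal       # focal length in pixels
        self.pp = principal  # principal point in pixels

    def project(self, point):
        """Project a 3D point; returns None if it is behind the camera."""
        d = [point[i] - self.c[i] for i in range(3)]
        xc, yc, zc = (sum(self.r[row][i] * d[i] for i in range(3))
                      for row in range(3))
        if zc <= 0:
            return None
        return (self.f * xc / zc + self.pp[0], self.f * yc / zc + self.pp[1])

    def backproject(self, pixel, depth):
        """3D point on the ray through `pixel` at the given camera depth."""
        dc = ((pixel[0] - self.pp[0]) / self.f,
              (pixel[1] - self.pp[1]) / self.f,
              1.0)
        # World direction is R^T applied to the camera-frame direction.
        dw = [sum(self.r[row][i] * dc[row] for row in range(3))
              for i in range(3)]
        return tuple(self.c[i] + depth * dw[i] for i in range(3))


def locate_3d(cam_a, pixel_a, cam_b, pixel_b, depths):
    """Sample the back-projection ray of camera A and keep the sample
    whose projection into camera B lies closest to the ROI center in B."""
    best, best_error = None, math.inf
    for depth in depths:
        sample = cam_a.backproject(pixel_a, depth)
        projected = cam_b.project(sample)
        if projected is None:
            continue
        error = math.hypot(projected[0] - pixel_b[0],
                           projected[1] - pixel_b[1])
        if error < best_error:
            best, best_error = sample, error
    return best, best_error
```

The accuracy of the estimate depends on how densely the ray is sampled, so the list of depths trades precision against the number of projections per iteration.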
This video shows tracker3D working for one person.
This video shows tracker3D working for two people [27/10/09]
Fitness analysis [10/11/09]
The purpose of this fitness analysis is to find out how good the location solutions of tracker3D are.
I took 100 solution samples from one video and, for each one, analyzed a 2x2 meter neighborhood with a step of 5 centimeters. The mean error was 45 millimeters in the X coordinate and 25 millimeters in the Y coordinate. We assume that people are standing on the floor.
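The size of this analysis is easy to check: a 2x2 meter neighborhood sampled every 5 centimeters gives 41 positions per axis, or 1681 floor points per sample. The Python sketch below builds that grid in millimeters and computes a mean per-axis error. It assumes that the error is the per-axis distance between the tracker3D solution and the best point found in the neighborhood; the function names are hypothetical.

```python
def floor_grid(center_mm, half_size_mm=1000, step_mm=50):
    """Floor positions (x, y) in millimeters around `center_mm`."""
    cx, cy = center_mm
    offsets = range(-half_size_mm, half_size_mm + 1, step_mm)
    return [(cx + dx, cy + dy) for dx in offsets for dy in offsets]


def best_on_grid(center_mm, fitness):
    """Grid position with the highest fitness."""
    return max(floor_grid(center_mm), key=fitness)


def mean_axis_error(solutions, references):
    """Mean absolute error per axis between paired (x, y) positions."""
    n = len(solutions)
    error_x = sum(abs(s[0] - r[0]) for s, r in zip(solutions, references)) / n
    error_y = sum(abs(s[1] - r[1]) for s, r in zip(solutions, references)) / n
    return error_x, error_y
```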
Here is one of these samples. In the images from the two cameras, the original solution from tracker3D is drawn in blue and the solution of the local search is drawn in red. The plot shows a heat map of the sample analysis.
Drawing the solution in the two images also lets us judge whether the solution is good in practice.
TrackerEvolutive3D schema [23/11/09]
I have implemented a schema based on a multimodal evolutionary algorithm with prisms as individuals. These prisms have five degrees of freedom: 3D position (center), width and height.
The purpose of this alternative implementation is to compare both 3D tracking methods and find out which one is better. I will analyze them in terms of performance, race dynamics and precision (position, size and color).
These videos show how this schema works.
Using four cameras [10/12/09]
This video shows the system working with four cameras for two people.
Experiments [10/12/09][15/01/10]
I have carried out several experiments with the TrackerEvolutive3D schema in order to validate it and tune its parameters.
Color learning
The system can learn all kinds of colors.
In these images we can see how the race color is redefined. Colors change depending on the area of the room because of the lighting.
Number of cameras influence
This experiment measures whether the 3D position estimation error is lower with four cameras than with fewer cameras.
The results show that the estimation error with three or four cameras is much lower than with only two.
Fitness function
To calculate the fitness of a prism, the first step is to project the prism into the four images. The second step is to obtain the minimum enclosing rectangle of each projection, and the third step is to calculate the color and motion density inside those rectangles. With both kinds of density the 3D localization works well, but the estimation of people's size could still be improved. The improvement consisted of enlarging the original instantaneous prism of the 3D tracking by 30% and calculating the density difference between the enclosing rectangle of the enlarged prism and that of the original prism.
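These steps can be sketched for a single camera as follows. The Python code below is an illustration under several assumptions: the prism is taken to have a square base of side `width`, the projection function is passed in as a parameter, the filter output is a binary mask stored as a list of rows, and all function names are hypothetical. How the ring density is combined with the inner density in the real fitness function is not shown.

```python
def prism_corners(center, width, height):
    """Eight corners of a prism centered at `center` (square base assumed)."""
    x, y, z = center
    hw, hh = width / 2, height / 2
    return [(x + sx * hw, y + sy * hw, z + sz * hh)
            for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)]


def prism_rect(project, center, width, height, scale=1.0):
    """Minimum enclosing rectangle (u0, v0, u1, v1) of the projected prism."""
    points = [project(p)
              for p in prism_corners(center, width * scale, height * scale)]
    us = [p[0] for p in points]
    vs = [p[1] for p in points]
    return (min(us), min(vs), max(us), max(vs))


def count_in_rect(mask, rect):
    """Number of set pixels and total pixels of `mask` inside `rect`."""
    u0, v0, u1, v1 = rect
    rows = range(max(0, int(v0)), min(len(mask), int(v1) + 1))
    cols = range(max(0, int(u0)), min(len(mask[0]), int(u1) + 1))
    hits = sum(mask[v][u] for v in rows for u in cols)
    return hits, len(rows) * len(cols)


def density(mask, rect):
    """Fraction of pixels inside `rect` that pass the filter."""
    hits, area = count_in_rect(mask, rect)
    return hits / area if area else 0.0


def ring_density(mask, inner, outer):
    """Density in the band between the original and the enlarged rectangle."""
    hits_in, area_in = count_in_rect(mask, inner)
    hits_out, area_out = count_in_rect(mask, outer)
    ring_area = area_out - area_in
    return (hits_out - hits_in) / ring_area if ring_area > 0 else 0.0
```

With `scale=1.3` for the enlarged prism, a well-sized prism should show a high density inside its own rectangle and a low density in the surrounding band, whereas a prism that is too small leaves part of the person in the band.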
These images show the improvement in people size estimation. Image (a) is before the improvement and image (b) is after it.
Another experiment with the fitness function consisted of running a local search over 3D position around the instantaneous representative prism of the 3D tracking for one person. Local search results are drawn in green and the original prism in blue.
Results for the size local search are shown in these images.
These experiments show that the fitness function is discriminative and that the results of the evolutionary algorithm are very similar to those of the local search algorithm.
Fall detection
This video shows the fall detection application of the system.
Temporary visual obstruction
In this experiment the system only uses two cameras.
Final demo [12/01/10]
3D tracking comparison [20/12/09]
I have compared both algorithms in terms of performance, race dynamics and estimation error.
Performance
Performance has been measured in iterations per second for one, two and three people.
The results show that tracker3D is faster than trackerEvolutive3D.
Race dynamics
Race dynamics consist of races appearing and disappearing when someone enters or leaves the covered room. The quality of tracking has also been compared. In general, both algorithms work well in terms of race dynamics (as can be seen in the videos). However, we have observed that tracker3D sometimes makes mistakes in prism abduction. The following image shows an example of a tracking failure.
Estimation error
I have measured the 3D position estimation error at several points of the room.
On average, tracker3D has an error of 10 centimeters and trackerEvolutive3D an error of 5 centimeters, so trackerEvolutive3D is better than tracker3D at 3D position estimation.
Jderobot collaboration grant
Improving manual
Configuration examples
Improving FAQ
Install Gazebo-0.8 in Ubuntu-8.04
Improving libraries, schemas, core
I have added small improvements to the progeo library, some schemas and the jderobot core.
OpenCV driver
I have developed a driver that provides images. It serves images from static files, video files and cameras. This driver uses the OpenCV library.
You can find source code here.
Simulated3D driver
This driver provides virtual images. It reads camera and world configuration files and uses the Progeo library to render images.
You can find source code here.
Gazebo08 driver
This driver exposes the sensors and motors of the Gazebo-0.8 simulator. It uses the libgazebo C++ API and integrates it into the Jderobot platform through a C API.
Source code is not available yet.
X10 driver and schemas
Improving x10 driver
I have extended the x10 driver. It can now do more than execute switch on/switch off commands on devices: it can also read device state and monitor events.
Improving x10_controller schema
I have extended this schema as a consequence of the x10 driver extension. Now you can query device states or view events.
New schema: domoticdemo
I have developed the first schema in Jderobot that uses x10 devices for a security application. Through the x10 driver the schema receives events, such as a motion sensor detecting presence. When this event occurs, the application starts grabbing video from the camera.
Source code is not available yet.
Firewire22 driver
I have reimplemented the firewire driver to use version 22 of the libdc1394 library, whose API has changed.
Source code is not available yet.