Julio.guillen-tfm
- Project Name: Vision-based AUV
- Authors: Julio Guillén García (julio [dot] guillen [at] gmail [dot] com)
- Academic Year: 2011-2012
- Degree: Master
- Jde Version: jde-5.0
- SVN Repository: Repository
- Tags: opencv, jde, gazebo, slam, auv
- Technology: JDErobot 5.0, Ubuntu 10.04, C/C++, OpenCV, OpenGL, GTK+, Ice, Gazebo, Octave
- State: Developing
- Document License:
This wiki aims to be a place to describe periodically all the steps followed in this project to accomplish the final goals. There will be discussions about implementation and design details within every stage of the project. I will therefore include multimedia content to better illustrate those explanations.
- Documentation:
Master Thesis: Presentation: Computer vision applied to Autonomous Underwater Vehicles PDF
- Press releases:
Infodefensa.com: 2011-11-14 2012-01-20 2012-02-24 2012-04-10 Muy Interesante:Feb2012 Amazings: 2012-04-03 Revista General de Marina: 2012-03 Cover 2012-03 Article Madrid i+d: 2012-03-26 agencia SINC: 2012-03-21 FayerWayer: 2012-03-28 actuaupm: 2012-01-10 Instituto de la Ingenieria de España: 2012-01-18 2012-04-20 ateneadigtal: 2012-01-20 laverdad.es: 2012-01-28 UNED.es: 2012-03-01 2012-03-20 Catedra ISDEFE: 2012-02-08 2012-04-02 El Economista: 2012-04-18 La Vanguardia: 2012-04-17 InGGenio: 2012-03-26 2012-05-13 2012-05-16 Organización de Estados Iberoamericanos: 2012-03-28 Hoy robotica: 2012-03-28 robotikka: 2012-03-28 Science Knowledge: 2012-03-25 Eccus.net: 2012-04-17 bajoelagua: 2012-04-18 Noticias999: 2012-04-07 La pizarra sin borrar: 2012-04-07
- Conferences and exhibits:
E.T.S.I. Navales presentation: Video jpg UNVEX'12: Brochure jpg jpg jpg jpg IX Businness Start-Up Contest UPM: info Airbus Innovation Cell's "Innovation Symposium" (Toulouse, FRA): invitation jpg jpg
- Broadcasting:
Geekvibrations: 2012-04-12 Cadena SER: Hora 14 Radio Nacional: España Directo Onda Madrid: Aquí no hay playa Radio San Vicente: El cinturón de Orión

Contents |
Greatest hits
Project scope

The main goal of this project is to create a vision system capable of carrying out visually guided tasks on an AUV (Autonomous Underwater Vehicle). The primary role of this vision system, is to process and extract information from an image, to provide the AUV with navigational decisions. Before this can be achieved, the core tasks, such as image capture and processing must first be implemented.
Mission tasks
The AUV will first get trough a dock and later identify and release some color buoys, pass over an obstacle course (PVC pipe to pass over), recognize and drop markers, shoot torpedoes through a cutout), manipulate a cylinder), and finally find a pinger, grab an object and move/release the object.
SLAM
We will implement SLAM within the AUV. Simultaneous localization and mapping (SLAM) is a technique used by robots and autonomous vehicles to build up a map within an unknown environment (without a priori knowledge), or to update a map within a known environment (with a priori knowledge from a given map), while at the same time keeping track of their current location.
To establish the position and heading of the vehicle, we will use a D-MARG (Depth - Magnetic Angle Rate Gravity) navigation system combined with EKF (Extended Kalman Filter).
Stereo vision
The AUV will be provided with two digital cameras at the front, and another two at the bottom, for extraction of information about the relative position of 3D objects in the vicinity of the autonomous system, and for object recognition.
3D reconstruction
Finally, we will develop an algorithm capable of generating a 3D representation of the environment, with all the recognized objects, and the path followed by the AUV, in an OpenGL graphics window.
Mission tasks
The fundamental goal of the mission is for an AUV to demonstrate its autonomy by completing an underwater Ides of TRANSDEC mission. They will be able to commence in training (dock/release buoys), pass over an obstacle course (PVC pipe to pass over), enter the gladiator ring (drop markers), kill Caesar (shoot torpedoes through a cutout), feed grapes to the emperor (manipulate a cylinder), and finally collect the Laurel wreath and crown the new emperor (find a pinger, grab an object and move/release the object). Starting, we must begin the run by passing under a validation gate. At any time during the run, if a vehicle breaches the surface, the run is terminated.
Validation gate

Path Strip

There will be six sections of the path which are 4 feet (1.2 m) long by 6 inches (15 cm) wide PVC sheet. The path will be covered in BLAZE ORANGE colored Duck Tape. Each path segment will be directly after the current task, and point to the next task or tasks. There will be one following the gate that points to the training (buoy) task. After the training task, one will point to one of the two obstacle courses (structure to pass over). There will be two after the first obstacle course, one which points to the feed the Emperor grapes task (cylinder manipulation), and one which points to the Gladiator Ring (bins)/kill Caesar (cutouts) tasks. Note that there will be no directional markers from the cylinder manipulation task. Following the bins and cutouts, there will also be two paths. One which points to the cylinder manipulation task, and one that passes over the second obstacle course, and towards the laurel wreaths (object pickup / octagons).
Training

The task consists of a green, red , and yellow 9” float. If all goes well, there will be two methods for scoring. The first it to bump the buoy. The second will be to touch a similarly colored square plate (colored Duck Tape: Neon Green, Red, Yellow), located below the buoy, to release the buoy.
Obstacle course

A “U” shaped section of 2” PVC pipe will be moored to the floor and consist of one long horizontal section, with two smaller vertical posts secured to the horizontal section and suspended above it. To get full points, a vehicle must pass inside the vertical segments and ½ or more of it's height below the plane created by the top of the vertical segments.
Gladiator ring

Each black bin will be surrounded by a 6” white border. A total of two markers can be dropped from the vehicle. Inside the bins will be silhouettes. There will be 2 types of gladiator weapons and two types of shields (total of 4 bins). Maximum points awarded for dropping a marker in the correct weapon and shield, some points awarded for dropping a marker in any bin (or landing on the white border).
Kill Caesar

A 24” x 24” (61 x 61 cm) window, red on one side, blue on the other (Question: different size, colors?). It will be moored to the floor, and will have two different size circular cutouts on the face. One torpedo must be marked as blue, and one as red. Maximum points for firing the red torpedo through the small circle on the red side and the blue torpedo through the small circle on the blue side. Other points will be awarded for firing any torpedo through any cutout.
Feed Emperor's grapes

A single 4ft x 4ft (1.2 x 1.2 m) PVC square is place vertically in the water column. It will be suspended from the water surface. Two 1” (2.5 cm) PVC cylinders are placed within the cutout in the square. The cylinders must be removed from the square. They are held in place by a tab and must be moved either vertically (for the vertically orientated cylinder), or horizontally (for the horizontally orientated cylinder), and then released (they will be tied to the square so they can't be lost).
Laurel wreath
This task consists of an acoustic pinger located off the floor of the pool. Placed directly above the pinger is the laurel wreath for the vehicle to retrieve. Floating above the pinger on the surface will be an octagon representing the emperors palace. In order to obtain full points for the zone, the vehicle must surface fully inside the octagon. There will be two different octagons on the competition side, and a team will get points for surfacing within either area. However, only one pinger will be on during a particular run (active), and a vehicle surfacing within the octagon with the active pinger will receiver more points. Points are awarded if the vehicle retrieves (maintaining control) the object. When the vehicle surfaces, more points will be awarded if the object is released. (NEW) Additional points will be awarded if, after the vehicle surfaces, it submerges to replace the object (a moot point if you knock over the stand, you know who you are!). A team may elect, before the vehicle surfaces, to switch the active pinger and traverse over and surface in the second octagon for extra points.
Submarine construction
Despite not being directly related to the development of the Master's project, in this section we will publish the progress constructing the submarine with videos and photographs.

Clam construction
The clam will be used in the "Feed emperor's grapes" mission. Its built in aluminium, using a water based cuttinf fluid on a CNC milling cutter machine.
Welding the hull frame
The hull frame is built in aluminum, which is the most difficult alloy to weld. Here are the photos of our first attempt.

Testing the valves
In our submarine, we've designed a propulsion system based on valves. The next videos show our first test using this new system.
Components
Here we will show images and videos from some of the components used in the submarine:


Tools
LDC algorithm
LDC stands for Line detection Algorithm using Contours. It follows 3E principles in its design: easy (to understand), effective (return accurate results) and efficient (fast). Most of the line detection solutions follow the Hough transforms algorithms and its variants. It has been designed by Andres Solis Montero, Amiya Nayak, Milos Stojmenovic and Nejib Zaguia.
In LDC we pre-process the input image with normalization, Gaussian smooth, Laplace edge detection, and thresholding. This steps produce a binary image representing the borders from the image. We then extract the consecutive boundary pixels from each connected component that create the contours of the image. Contours are divided into short segments, which are classified by their orientation into 9 discrete categories. Finally, line segments are detected by finding consecutive sequences of segments of similar slope.
LCD is somewhat faster than the Hough Line algorithm. Extracting edges from input image takes similar time, but LCD is much faster in extracting line segments from edges than Hough lines solution.
Edge detection from color images
The first step is to convert the color image to gray using cvCvtColor. Then, the normalization (cvNormalize) procedure is optional. Edge detection may be difficult if the image is affected by illumination problems, and normalization creates higher intensity contrasts. However sometimes it may be somewhat detrimental, introducing noise in the image. Later, Gaussian smooth and Laplacian edge detector. First, cvSmooth using a GKS (Gaussian Kernel size) odd number between 1 and 19. Then cvLaplace, with LAS (Laplacian aperture size) of 3,5 or 7, that defines kxk kernel with K = LAS. This method will make the image zones brighter where edges are found. Finally, we apply a binary threshold cvThreshold.
Extracting contours
We use the find contours function from OpenCv cvFindContours. As part of our algorithm, we introduce a threshold parameter MLC, to consider only contours with at least MLC pixels. It will filter out too short lines.
Repeated Segment Directions on Image Contours
The final step is to analyze the list of points on each contour in order to find straight lines under a discrete linearity criterion. Diagonal lines are discretized to produce a division of an 11x11 matrix into 9 areas, labeled 0-8. The central 3x3 region is labeled 0 and represents class 0. Classes 1-8 correspond to eight approximate directions respectively: east, south-east, south, south-west, west, north-west, north, and north-east. Each fragment is classified in the following way. The beginning of the fragment is placed in the center of the 11x11 matrix. The fragment is classified according to the location of its endpoint in this matrix, and receives the label associated with it. Class 0 is a ‘neutral’ direction, or a lack of direction. It represents fragments that do not make sufficient progress in any direction, and can be treated as loops. This class was introduced due to the irregularity and noise in the edge detection steps. Progress with distances of at least 2 from the center in one of the directions suffices to classify slopes into one of the 8 classes. The location of intermediate pixels does not impact the classification.
Finally we merge fragments into segments. The exact criteria to detect line segments from fragments are:
1.- A segment has at least X fragments of the same slope classification.
2.- Two segments with the same slope classification, separated by less than X fragments with different classification, will create a single segment with all the fragments from and between both segments.
3.- The 0 slope classification is a neutral position for the algorithm. It will not interfere in the classification of any segments; it will be treated as being same class as the previous class in the sequence.
4.- If the starting fragment and final fragment of a contour have the same classification then they join in a same segment with length equal to the sum of both.
Improving LDC
LDC algorithm can be further improved. Eliminating duplicate lines is easy. There are some pathological cases; for example, stair like structure may be declared a single line. The rules for accepting a line could be fine-tuned, including the change of variable X for minimal number of fragments to make a segment. The fragment size may be increased and more directions could be considered. The insertion of neutral fragments into a line may be reconsidered.
Performance videos
Kalman filter
The Kalman filter is a set of mathematical equations that provides an efficient computational (recursive) means to estimate the state of a process, in a way that minimizes the mean of the squared error. The filter is very powerful in several aspects: it supports estimations of past, present, and even future states, and it can do so even when the precise nature of the modeled system is unknown. We will be using Kalman filters in this project for data fusion and tracking objects. In order to reduce the computation and hence latency, video images are pre-processed, using a segmentation technique.
We are using the OpenCv routines for the Kalman filter. There aren't many examples on the Internet and most of the coding has wrong concepts.
The Kalman Filter (also known as Linear Quadratic Estimation) is an algorithm which uses some noisy measurements over time and produces estimations that are more precise than the measurements alone.
The Kalman Filter in OpenCv is conceptualized as two distinct phases: "Predict" and "Correct". The predict state uses the estimate state of the previous timestep to produce an estimate of the current timestep. This is the a priori state estimate. In the correct phase, the current a priori estimation is combined with the current observation information to refine the state estimate. This is known as the a posteriori state estimate.
If we don't have any current observation information, the Kalman Filter uses only the a priori state estimate to refine the state estimate and obtain the a posteriori state estimate. This concept is essential in order to build a correct Kalman Filter.
cvKalmanPredict
Predicted (a priori) state estimate: x(k|k-1) = A(k)*x(k-1|k-1) + B(k)*u(k)
x(k|k-1) = predicted state estimate at time k. x(k-1|k-1) = uptaded state estimate at time k-1. A(k) = state transition model matrix at time k. B(k) = control input model matrix at time k. u(k) = control input at time k.
Predicted (a priori) estimate covariance: P(k|k-1) = H(k)*P(k-1|k-1)*H(k)t + Q(k)
P(k|k-1) = a priori error covariance matrix at time k. H(k) = obersvation model matrix at time k. H(k-1) = transposed observation model matrix at time k. P(k-1|k-1) = a posteriori error covariance matrix at time k-1.
cvKalmanCorrect
Error Ellipses
We are having trouble calculating the error ellipse...