Information based adaptive robotic exploration

Information Based Adaptive Robotic Exploration

Frederic Bourgault, Alexei A. Makarenko, Stefan B. Williams,Ben Grocholsky, Hugh F. Durrant-Whyte

Australian Centre for Field RoboticsThe University of Sydney, Sydney NSW 2006, Australia

{f.bourgault, a.makarenko, stefanw, ben, hugh}@acfr.usyd.edu.au

Abstract

Exploration involving mapping and concurrent local-ization in an unknown environment is a pervasive taskin mobile robotics. In general, the accuracy of themapping process depends directly on the accuracy ofthe localization process. This paper address the prob-lem of maximizing the accuracy of the map buildingprocess during exploration by adaptively selecting con-trol actions that maximize localisation accuracy. Themap building and exploration task is modeled usingan Occupancy Grid (OG) with concurrent localisationperformed using a feature-based Simultaneous Local-isation And Mapping (SLAM) algorithm . Adaptivesensing aims at maximizing the map information bysimultaneously maximizing the expected Shannon in-formation gain (Mutual Information) on the OG mapand minimizing the uncertainty of the vehicle pose andmap feature uncertainty in the SLAM process. Theresulting map building system is demonstrated in anindoor environment using data from a laser scannermounted on a mobile platform.

1 Introduction

The main objective of exploration is to create an ac-curate map of an unknown area. To perform thismapping task a robot must localize itself accuratelywithin the environment as the fidelity of the resultingmap depends directly on the accuracy of the locali-sation process. Exploration must therefore satisfy thedual, and sometimes conflicting objectives, of localisa-tion accuracy, for high fidelity mapping, and exploringunknown territory.

The problem of adaptive exploration and mapping ofan unknown environment can be cast as an informa-tion, or an information gain, maximization problem.Shannon and Fisher information measures provide anatural way of quantifying localisation accuracy andmap fidelity. Formal information measures provide anatural mean for describing uncertainty and for eval-uating the effectiveness of future control actions inreducing this uncertainty [9]. Practically, information

methods also provide a natural way of mixing mea-sures of continuous information gain (in localisationfor example) with discrete information gain (througharea exploration for example). The adaptive explo-ration policies presented in this paper are based onthe use of information as a measure of utility for tak-ing exploration control actions. Such techniques arealso referred to as “Information-Theoretic” in the lit-erature (see [9, 7]).

The exploration framework proposed in this paperuses an Occupancy Grid (OG) environment model [4].This provides a simple straight-forward spatial rep-resentation of the environment within which sensorreadings are easily fused. The representation is par-ticularly helpful in indoor exploration problems. How-ever, the OG method requires an accurate estimate ofthe location of the robot from which data is being ac-quired. Failure to have an accurate location estimateresults in new sensor data, taken relative to the robot,being used to update the wrong OG cells. In turn, thisresults in a low fidelity map.

The localisation framework employed in this paperuses a feature-based Simultaneous Localisation AndMapping (SLAM) algorithm [2, 8]. The SLAMmethod is most appropriate for exploring unknownenvironments in the absence of an absolute positionsensor such as GPS. In its simplest form, the SLAMalgorithm is an Extended Kalman Filter (EKF) thatgenerates estimates of vehicle location and estimatesof the location of landmark features in the environ-ment.

While SLAM can provide accurate localisation infor-mation, the maps built usually consist of only a sparseset of easily seen features. This contrasts with thespatially dense and rich maps composed in the OGmethod and argues that both SLAM features and OGrepresentations should be used as part of an explo-ration strategy.

Information metrics have previously been proposed forOG techniques [5], for SLAM methods [6], and for

https://www.researchgate.net/publication/2822953_Adaptive_Mobile_Robot_Navigation_and_Mapping?el=1_x_8&enrichId=rgreq-4b8335fc-de7c-4e88-9136-cc373174bca0&enrichSource=Y292ZXJQYWdlOzM5NzIwNDc7QVM6MTA0NzQzNzI4ODQ0ODE0QDE0MDE5ODQyNTk5NzA=

https://www.researchgate.net/publication/238983085_Occupancy_Grid_a_Stochastic_Spatial_Representation_for_Active_Robot_Perception?el=1_x_8&enrichId=rgreq-4b8335fc-de7c-4e88-9136-cc373174bca0&enrichSource=Y292ZXJQYWdlOzM5NzIwNDc7QVM6MTA0NzQzNzI4ODQ0ODE0QDE0MDE5ODQyNTk5NzA=

https://www.researchgate.net/publication/239666676_Information-Theoretic_Control_of_Multiple_Sensor_Platforms?el=1_x_8&enrichId=rgreq-4b8335fc-de7c-4e88-9136-cc373174bca0&enrichSource=Y292ZXJQYWdlOzM5NzIwNDc7QVM6MTA0NzQzNzI4ODQ0ODE0QDE0MDE5ODQyNTk5NzA=

https://www.researchgate.net/publication/243676947_Simultaneous_Map_Building_and_Localisation_for_an_Autonomous_Robot?el=1_x_8&enrichId=rgreq-4b8335fc-de7c-4e88-9136-cc373174bca0&enrichSource=Y292ZXJQYWdlOzM5NzIwNDc7QVM6MTA0NzQzNzI4ODQ0ODE0QDE0MDE5ODQyNTk5NzA=

https://www.researchgate.net/publication/3298739_A_Solution_to_the_Simultaneous_Localization_and_Map_Building_SLAM_Problem?el=1_x_8&enrichId=rgreq-4b8335fc-de7c-4e88-9136-cc373174bca0&enrichSource=Y292ZXJQYWdlOzM5NzIwNDc7QVM6MTA0NzQzNzI4ODQ0ODE0QDE0MDE5ODQyNTk5NzA=

https://www.researchgate.net/publication/221539977_Robot_Navigation_Integrating_Perception_Environmental_Constraints_and_Task_Execution_Within_a_Probabilistic_Framework?el=1_x_8&enrichId=rgreq-4b8335fc-de7c-4e88-9136-cc373174bca0&enrichSource=Y292ZXJQYWdlOzM5NzIwNDc7QVM6MTA0NzQzNzI4ODQ0ODE0QDE0MDE5ODQyNTk5NzA=

hybrid sensor control tasks [9]. The contribution ofthis paper is to fuse these ideas in the competing re-quirements of adaptive exploration. Section 2 providessome background to both the SLAM and OG meth-ods. Section 3 develops information metrics for eachrepresentation and shows how these are combined in asingle measure of exploration performance. Section 5shows how these metrics are then used to determinethe future motion or trajectory of the sensor platform.Section 6 describes the implementation and evaluationof these techniques on an indoor vehicle equipped witha laser ranging system.

2 Background

This section briefly describes the two main algorithmsemployed in this work: the SLAM algorithm for local-isation and the OG method for map building.

2.1 The SLAM Algorithm

The SLAM algorithm employs an EKF to generateand maintain a combined estimate of the vehicle andmap feature locations together with a measure of un-certainty in these estimates [2, 10, 8]. The SLAMalgorithm requires a vehicle model, f , and an observa-tion (sensor) model, h.

Vehicle and Landmark Models. The augmentedstate vector at a time tk contains both the state of thevehicle, xv(k), and the state of all the landmarks, xm,

x(k) = [xTv (k),xT

1 , · · · ,xTnf

]T = [xTv (k),xT

m]T .

The state transition model for the vehicle and land-marks is described by

xv(k + 1) = f(xv(k),u(k)) + v(k) (1)xm(k + 1) = xm(k) (2)

where v(k) is taken to be a zero mean uncorrelatedrandom variable with covariance Q(k) describing plat-form motion uncertainty.

Sensor Model. The sensor observation is describedby a model in the form

z(k) = h(xv(k),xm) + w(k) (3)

where w(k) is taken to be a zero mean uncorrelatedrandom variable with covariance R(k) describing ob-servation uncertainty.

Estimation Process. The estimation process is arecursive three step algorithm consisting of prediction,observation and update steps.

Prediction: An estimate x(k | k) of the state x(k)at time tk is assumed. A prediction x(k + 1 | k) isformed by taking expectations of the state transition

model conditioned on the observation sequence up totime tk as follows

[xv(k + 1 | k)xm(k + 1 | k)

]=

[f(xv(k | k),u(k))

xm(k | k)

](4)

A covariance P(k | k) in the estimate x(k | k) is as-sumed and the prediction covariance is then computedas

P(k + 1 | k) = FP(k | k)FT + Q(k) (5)

where F = ∇vf(k) is the Jacobian of f with respectto the vehicle states evaluated at x(k | k).

Observation: An observation is made according toEquation 3. A predicted observation is computed bytaking expected values Equation 3 as

z(k + 1 | k) = h(xv(k + 1 | k), xm(k + 1 | k)). (6)

The difference between the actual and the predictedobservation, (the innovation) is then computed from

ν(k + 1) = z(k + 1) − z(k + 1 | k) (7)

along with the innovation covariance

S(k + 1) = HP(k + 1 | k)HT + R(k + 1) (8)

where H = ∇xh(k + 1) is the Jacobian of the obser-vation function with respect to all the states.

Update: The state estimate and the correspondingstate estimate covariance matrix are now updated inthe normal manner:

x(k + 1 | k + 1) = x(k + 1 | k) + Wν(k + 1) (9)

P(k + 1 | k + 1) = P(k + 1 | k) − WS(k + 1)WT

(10)where the optimal Kalman gain is

W = P(k + 1 | k)HT S−1(k + 1) (11)

2.2 Occupancy Grid

An OG is a discretised random field in which the prob-ability of occupancy of each independent cell in a spa-tial lattice is maintain [4]. The state xi of the ith cellin the grid may be occupied or unoccupied

xi = {OCC,EMP}, ∀i = 1, · · · , N.

The OG method aims to compute a probability of oc-cupancy Pi(xi|Zk), for each cell, given sequence ofobservations Zk = {z1, ..., zk}. Given a sensor modelp(zk|xi), this probability may be computed recursivelyfrom [4, 5]

Pi(xi|Zk) =p(zk|xi)Pi(xi|Zk−1)∑

xi∈X p(zk|xi)Pi(xi|Zk−1)(12)

https://www.researchgate.net/publication/243676947_Simultaneous_Map_Building_and_Localisation_for_an_Autonomous_Robot?el=1_x_8&enrichId=rgreq-4b8335fc-de7c-4e88-9136-cc373174bca0&enrichSource=Y292ZXJQYWdlOzM5NzIwNDc7QVM6MTA0NzQzNzI4ODQ0ODE0QDE0MDE5ODQyNTk5NzA=

https://www.researchgate.net/publication/3298739_A_Solution_to_the_Simultaneous_Localization_and_Map_Building_SLAM_Problem?el=1_x_8&enrichId=rgreq-4b8335fc-de7c-4e88-9136-cc373174bca0&enrichSource=Y292ZXJQYWdlOzM5NzIwNDc7QVM6MTA0NzQzNzI4ODQ0ODE0QDE0MDE5ODQyNTk5NzA=

https://www.researchgate.net/publication/230642952_Efficient_Solutions_to_Autonomous_Mapping_and_Navigation_Problems?el=1_x_8&enrichId=rgreq-4b8335fc-de7c-4e88-9136-cc373174bca0&enrichSource=Y292ZXJQYWdlOzM5NzIwNDc7QVM6MTA0NzQzNzI4ODQ0ODE0QDE0MDE5ODQyNTk5NzA=

The sensor model for this problem p(zk|xi) can bequite complex as it must be defined over all all pos-sible grid configurations (see [3, 4, 5] for derivations).The grid is initialised be setting all cells to having aprobability of 0.5 of being occupied and being unoc-cupied. This initial uniform distribution for all statesof each cell corresponds to the least informative, ormaximum entropy distribution.

3 Information Metrics

3.1 Info Metrics for Localizing (SLAM)

A multivariate Gaussian distribution has entropy pro-portional to the logarithm of the determinant of itscovariance matrix P ([1]):

12ln[(2πe)n detP]. (13)

Since the determinant of a matrix is a measure of itsvolume (product of its eigenvalues), the entropy mea-sures the compactness of the corresponding Gaussiandistribution. Maximizing information about a stateestimate is equivalent to minimizing the product ofthe eigenvalues of the corresponding covariance ma-trix.

A suitable information-based cost function C(·) forevaluating the utility of a candidate localisation con-trol action proposed in ([6]) is given by

C(P) = π∏j

√λj(Pvv) + π

nf∑i=1

∏j

√λj(Pii)

= π√

det(Pvv) + π

nf∑i=1

√det(Pii) (14)

where λj(.) is the jth eigenvalue of its argument. Thecost C(P) represents the sum of the areas of the fea-tures and vehicle uncertainty ellipses for the predictedcovariance matrix P after the expected observationfrom the predicted state.

3.2 Info Metrics for Mapping (OG)

For each cell ci of the map, it is possible to computethe entropy (Shannon information) [1] as a metric ofthe knowledge acquired so far. The a priori entropy attime tk for the ith grid cell is defined as the expectationof the logarithm of the probability distribution Pi(xi)

Hk,i ≡ −E[ln Pi(xi)] = −∑

xi∈Xi

Pi(xi) ln Pi(xi) (15)

Since there are two possible states (occupied andempty), with p = Pi(OCC) the probability of occu-pancy, and q = Pi(EMP ) = 1− p its complement, wecan write

Hk,i[Xi] = −p ln p − q ln q (16)

Given an observation zk at time tk, the conditionalentropy [1] for cell i is defined as

Hi(zk) ≡ −E[lnPi(xi|zk)]

= −∑

xi∈Xi

Pi(xi|zk) ln Pi(xi|zk).

The posterior probability Pi(xi|zk) can be obtainedfrom Bayes rule as

Pi(xi|zk) =Pi(zk|xi)Pi(xi)

Pi(zk). (17)

Note that the conditional entropy is a function of theobservation. The mean conditional entropy (over allpossible observations) is an a priori measure of thevalue of an observation. It is the expectation of en-tropy left after observation and is given by [1]

Hi ≡ E[Hi(zk)] =∫

Hi(zk)Pi(zk)dzk (18)

In practice, this integral is evaluated numerically overthe sensor range. Also, since the observations to bemade zk depend directly on the vehicle pose fromwhich the observations are made, the mean condi-tional entropy becomes a function of the estimatedvehicle pose prior to observation.

Using Equations 15) and (18), it is possible to com-pute the expected mutual information gain followingan observation ([1]):

Ii(xi) ≡ −E

[ln

Pi(xi|zk)Pi(xi)

]= Hi − Hi(xi|zk) (19)

The actual information gain estimate can only be eval-uated after the control action and the observation havebeen made.

Ii = Hk−1,i(xi) − Hk,i(xi) (20)

The total mutual information (total expected infor-mation gain) for a scan from a scanning sensor can bedefined as

ISj(xi|zk) =

∑i∈Sj

Ii(xi|zk) (21)

where Sj is the region covered by the scan, and NSj

is the number of grid cells included in Sj . If Sj istaken as the scanning region associated with predictedvehicle pose xv(k + 1 | k, j) = f(xv(k | k),uj(k) fora given control action uj(k) ∈ U(k), where U(k) ={u1(k), · · · ,un(k)} is the discretized set of possiblecontrol action at time tk, then ISj

can be used directlyas a utility or value measure.

U(x,uj(k)) = ISj(22)

3.3 Combined Information Utilities: Integra-tion

As described in [9], for any multi-objective optimiza-tion problem, for which the individual utilities arebased on information metrics, a composite utility canbe constructed simply by linear combination. Hence,for the dual objective of mapping (OG) and localizing(SLAM) the composite utility can be defined as

Uk = Icomposite(x,xc,uj(k)) (23)= w1ISLAM (x,uj(k)) + w2IOG(xc,uj(k))

where the weights w1(k) = α/ISLAMMAX(k) and w2 =

(1−α)/IOGMAXare the respective normalization fac-

tors multiplied by a constant. ISLAMMAX(k) is ob-

tained from the upper bound on the SLAM covariancematrix for a given number of landmarks (see [2]), andIOGMAX

is simply the total information of a perfectlyknown OG map (probability of 0 or 1 for every cell).The default value of the tuning parameter α is 0.5,hence making the upper bound of the combined util-ity to sum to 1. Increasing α would create a moreaccurate OG map by putting the emphasis on the lo-calization, and reducing it would make the robot ex-plore more deliberately to the detriment of accuracy.

4 Optimal Trajectory

4.1 Local Optimization

The optimal control action u∗(k) at time tk, for a onestep look-ahead is easily selected as the action whichmaximizes the total expected utility at time tk

u∗(k) = arg maxuj

Uk(x,xc,uj(k)) (24)

Viewing the information space as a potential field, theoptimal control action at any given time tk is the ac-tion that constantly directs the vehicle towards themaximum increase in global information. The controlstrategy is to “surf” towards the highest rate of mu-tual information. Notice though that unlike controlactions driven by potential fields, information surf-ing strategies do not get trapped in local minima asthe mutual information gain decreases exponentiallyas observations are made. Thus the local optimum inthe field reduces in value as observations are made.

While this control strategy is not necessarily globallyoptimal, it exhibits an interesting reactive property byconstantly adapting the control decisions to the cur-rent information state. In an exploration scenario, theutility for mapping IOG will attract the robot towardsunknown areas (areas of high entropy, see Section 5),while the utility for localization ISLAM competes toensures that the robot stays well localized relative toknown features in the map. The relative importance oflocalization will change with localization uncertainty.

The robot will adapt by taking control action that willincrease or decrease sensitivity to localization uncer-tainty. This is a useful result as there is little value ina robot exploring and mapping new areas when it hasno idea of how accurately it knows it’s own location.

4.2 Global Optimization

The generalization of the integrated utility functionover a finite horizon of length ∆t = Nk can be definedin the form

UN (πj) =N∑

k=1

Uk(x,xc,uj(k)) (25)

where the πj = {uj(1), · · · ,uj(N)} is a policy vectorof N control decisions. The optimal control strategyis given by

π∗ = {u∗(1), · · · ,u∗(N)} = arg maxπj

UN (πj) (26)

Multi-step solutions are appropriate in problems suchas path planning. However, the computational costwill grow very rapidly with the depth of look-aheadunless the information metrics are very simple.

5 Implementation

Figure 1 shows the information-based exploration al-gorithm implemented, while Figure 2 shows an illus-tration of what the algorithm does.

The control action space is first discretised,{u1(k),u2(k),u3(k)}. Then, following Figure 1, thevehicle model (Equations 4 and 5) is used to predictthe states and covariance matrix for each of these con-trol inputs (a). From each projected state, the OGMutual Information in (b), (Equations 19 and 21),and the expected covariance matrix in (c), (Equation14) are evaluated, taking into account the uncertaintyin the predicted state. Then, in (d), the compositeutility is computed (Equation 23), and in (e) the “op-timal” control action (Equation 24) is performed. In(f), the observation is made (Equation 3), and theKalman filter updates the state estimate and covari-ance matrix (Equations 9 and 10). Finally in (g), theOccupancy Grid is updated (Equation 12).

6 Experiment and Results

The mobile sensing platform (see Figure 3) consists ofa Pioneer2 robot equipped with a laser ranging device.Figure 4a shows the Occupancy Grid and Figure 4bits corresponding entropy map. The robot position ismarked by the small circle. On Figure 5 is shown amore intuitive 3-D image of the same entropy map.The low parts corresponding to the very well knownregion (maximum information). The mapping utilitywill attract the robot towards region of high entropy

Figure 1: Integrated Adaptive Information-based Ex-ploration Algorithm.

Figure 2: Projected vehicle state estimates,xvj

(k | k), for three discrete control inputs,{u1(k),u2(k),u3(k)} and their corresponding sensorcoverage. The * are the SLAM feature estimates withtheir actual and predicted uncertainty ellipses.

(unexplored) that are actually in the line of sight of

Figure 3: The Pioneer2 mobile robot equipped with alaser.

(a) (b)

Figure 4: Occupancy Grid (a), and correspondingentropy map (b). The square symbols represents thelandmarks, and the circle represent the vehicle with apose of about 80 degrees. The ellipses are the actual 3σuncertainty bounds of the landmarks and the vehicleamplified five times for better visualization.

the sensor. Figure 6 shows the Mutual Informationgain for three potential control actions. In this case,the third action (c) is the most appealing from themapping point of view (highest mutual information)while action (a) is the most appealing from the local-ization point of view as it is the one that is expectedto shrink the most the uncertainty of the landmarkand vehicle estimates (ellipses).

7 Conclusion

This paper has shown how to combined OG andSLAM information metrics in exploration of an un-known environment. Good preliminary results indi-cate the validity of the approach. While this paper isconcerned with the control and trajectory optimiza-

020

4060

80100

120

0

5

10

15

20

25

30

35

0

0.05

0.1

x (m)

y (m)

020

4060

80100

120

0

5

10

15

20

25

30

35

0

0.05

0.1

x (m)

y (m)

020

4060

80100

120

0

5

10

15

20

25

30

35

0

0.05

0.1

x (m)

y (m)

(a) (b) (c)

Figure 6: Mutual Information for three discrete control inputs {u1(k),u2(k),u3(k)}. The small circles onthe top figures represent the vehicle initial position and projected state estimate xvj [k|k] with their respectiveuncertainty. The semicircles are the corresponding sensor coverage (range = 5 m), and the squares are theSLAM features. Again the uncertainty bounds have been amplified five times for visualization. The lower figureare the three-dimensional representation of the same mutual information as the grey traces in the upper figures.

020

4060

80100

120140

160

0

20

40

60

80

100

120

140

160

180

00.5

1

x (m)

y (m)

Figure 5: 3-D representation of the entropy map.

tion problem of a single mobile platform for explo-ration and mapping tasks, it constitutes a buildingblock of the much broader problem of management inDecentralized systems. The various solutions to infor-mation maximization problems presented here, whilebeing suboptimal, give practical and quantifiable al-gorithms with the potential to scale to multi-robotsystems.

References

[1] T.M. Cover and J.A. Thomas. Elements of informationtheory. Wiley series in telecommunications. Wiley, NewYork, 1991.

[2] M.W.M.G. Dissanayake, P. Newman, S. Clark, H.F.Durrant-Whyte, and M. Csobra. A solution to the si-multaneous localization and map building (slam) prob-lem. Robotics and Automation, 17(3):229–241, 2001.

[3] A. Elfes. Occupancy Grids : A Probabilistic Frameworkfor Robot Perception and Navigation. Phd, CarnegieMellon University, 1989.

[4] A. Elfes. Occupancy grids: A stochastic spatial rep-resentation for active robot perception. In the SixthConference on Uncertainty in AI, pages 60–70, 1990.

[5] A. Elfes. Robot navigation: Integrating perception,environmental constraints and task execution withina probabilistic framework. In Reasoning with Uncer-tainty in Robotics. International Workshop (RUR ’95),pages 93–129, Amsterdam, Netherlands, 1995.

[6] H.J.S. Feder, J.J. Leonard, and C.M. Smith. Adaptivemobile robot navigation and mapping. Int. Journal ofRobotics Research, 18(7):650–668, 1999.

[7] B. Grocholsky. Information-Theoretic Control of Mul-tiple Sensor Platforms. Phd, The University of Sydney,2002.

[8] J.J. Leonard and H.F. Durrant-Whyte. Simultane-ous map building and localisation for an autonomousrobot. In IEEE/RSJ International Workshop on In-telligent Robots and Systems (IROS ’91), pages 1442–1447, Osaka, Japan, 1991.

[9] J. Manyika and H.F. Durrant-Whyte. Data Fusionand Sensor Management: a decentralized information-theoretic approach. Ellis Horwood Series in Electricaland Electronic Engineering. Ellis Horwood, 1994.

[10] S.B. Williams. Efficient Solutions to AutonomousMapping and Navigation Problems. Phd, The Univer-sity of Sydney, 2001.

Date post:	14-May-2023
Category:	Documents
Upload:	sydney
View:	1 times
Download:	0 times

Information based adaptive robotic exploration

Documents