Geophys. J. Int. (2007) 169, 301–314 doi: 10.1111/j.1365-246X.2006.03314.x
Bayesian inference of stress release models applied to some Italian seismogenic zones
R. Rotondi and E. Varini
C.N.R. - Institute of Applied Mathematics and Information Technology, Via Bassini 15, 20133 Milano, Italy. E-mail: [email protected]
Accepted 2006 November 28. Received 2006 September 21; in original form 2006 February 13
SUMMARY
In this paper, we evaluate the seismic hazard of a region in southern Italy by analysing stress
release models from the Bayesian viewpoint; the data are drawn from the most recent version
of the parametric catalogue of Italian earthquakes. For estimation we use only the events up to 1992; then we forecast the date of the next event through a stochastic simulation method and compare the result with the shocks that actually occurred in the span 1993–2002. The original version
of the stress release model, proposed by Vere-Jones in 1978, transposes Reid’s elastic rebound
theory in the framework of stochastic point processes. Since the 1990s, enriched versions
of this model have appeared in the literature, applied to historical catalogues from China,
Iran, Japan; they envisage the identification of independent or interacting tectonic subunits
constituting the region under examination. It follows that stress release models, designed for
regional analyses, are evolving towards studies on fault segments, realizing some degree of
convergence to those models that start from an individual fault and, considering the interactions
with nearby segments, are driven to studies on regional scale. The optimal performance of the
models we consider depends on a set of choices among which: the seismogenic region and
possible subzones, the threshold magnitude, the length of the time period. In this paper, we focus
our attention on the influence of the subdivision of the region under examination into tectonic units; in the light of recent studies on the fault segmentation model of Italy we propose a partition of Sannio-Matese-Ofanto-Irpinia, one of the most seismically active regions in southern Italy.
The results show that the performance of the stress release models improves in terms of both
fitting and forecasting when the region is split up into parts including new information about
potential seismogenic sources.
Key words: Bayesian inference, Markov chain Monte Carlo methods, seismogenic sources,
stochastic simulation, stress release models.
1 INTRODUCTION
In the community of multidisciplinary experts involved in studies on seismic risk, the awareness is growing that earthquake prediction requires probabilistic methods and that a sufficiently extensive and flexible theoretical framework is provided by the theory of point processes. The large number of earthquake models proposed in the literature may be classified according to different criteria. As there is as yet
no comprehensive model of earthquake occurrence, the formulation of models can be based on phenomenological analyses or on physical
hypotheses; this subdivision has generated two classes of models with contrasting features and opposite prospects of application. Examples of
the first class are the models which describe the temporal and spatial clustering of earthquakes (Ogata 1988; 1999; Kagan 1991, and references
therein). Offspring of the generalized Omori’s law, these models belong to the class of the self-exciting models from the probabilistic viewpoint;
their properties are studied in Hawkes (1971) and Hawkes & Oakes (1974). Clustering models, born to describe the decay of secondary shocks
following a strong earthquake, have then been applied to detect anomalies in seismic activity such as precursory relative quiescence (Matsu'ura 1986; Ogata 1997). The strong time asymmetry in the foreshock–main shock–aftershock sequences makes the prediction provided by these models accurate only in the immediate aftermath of the main shock (Kagan 1991); hence, as their prediction lead time is very small,
they are useful when the damage mitigation policy is based on short-term strategies with low alarm-initiation cost (Kagan 1997). On the
other hand, every catalogue shows earthquake clustering; hence we can either avoid the issue by declustering the catalogue or allow for this
feature by inserting in the forecasting procedure a component which models this trend of the seismic activity. A big effort in this direction is
represented by the long- and short-term forecasting model currently tested on the northwest and southwest Pacific regions (Kagan & Jackson
2000) and in southern California (Kagan et al. 2003). The long-term, time-independent component of the hazard is obtained by smoothing
© 2007 The Authors 301
Journal compilation © 2007 RAS
catalogue data through kernel functions, whereas the short-term forecast is modelled by a branching process where the seeds of the clusters
follow a Poisson point process with a constant rate. Another approach is presented by Faenza et al. (2003, 2004) and Cinti et al. (2004); they
use a semi-parametric model, the Cox regression model, assuming that the hazard function at time t, the time elapsed since the last event, depends also on tectonic/physical factors besides the usual catalogue data. The application to the Italian territory indicates a time clustering of the large earthquakes for a few years, followed by a constant trend of the hazard. We recall that the Cox model, as defined in these papers, implies
two assumptions: the first that the hazard ratio for any two locations is constant in time, because it does not include the baseline function;
the second that independent covariates affect the hazard in a multiplicative way because of the use of the exponential function for linking the
covariates to the hazard.
Most of the models built on the recurrence physical hypothesis go back to the elastic rebound theory of Reid, among them the seismic
gap and the characteristic earthquake models (Nishenko 1991). According to these models large earthquakes are quasi-periodic; after a strong
shock the probability of a further large event is low, and it increases as time elapses. Moreover, the earthquake size distribution on ‘individual’
faults follows the ‘characteristic earthquake’ model, that is on each fault segment a large earthquake is exclusively characteristic. According
to Kagan (1991, p. 513) one of the reasons for the unsuccessful performance of these models is that they assume that all the slip is released
in characteristic earthquakes of the same size, while in fact considerable slip is released by both smaller and larger earthquakes. Due to
the stochastic nature of seismicity the relationship between strain accumulation and its release must be expressed in probabilistic terms as
is done, for instance, by the stress release (SR) models, translation of Reid’s theory in the framework of stochastic point processes. These
models were introduced by Vere-Jones in 1978 and then applied to historical catalogues of Japan, New Zealand, Iran and China, in both their
original form, which we call simple, and subsequent versions enriched by geophysical information (Vere-Jones & Yonglu 1988; Zheng & Vere-Jones 1991, 1994). In these studies the models appear to fit, better than Poisson models, sets of large earthquakes that occurred in large areas on long
timescales. Despite the intuitive physical assumptions supporting these recurrence models, we have to recognize that the catalogues testify
to the presence of clusters even of large events in contrast with the expected quiescence after a strong earthquake. A partial answer to this
issue comes from the version of the SR model named linked; it explains clusters of large events in terms of stress transfer among different, possibly closely related faults (Bebbington & Harte 2003, and references therein). An intermediate version between the simple and the linked SR models is the one called independent, where the region under examination is decomposed into different, non-interacting seismogenic zones, homogeneous from the seismotectonic viewpoint. Clusters of the original data set could be formed by events recorded in the same period in such zones.
The aim of this paper is to analyse a particularly active region of southern Italy enriching the statistical modelling with recent geological
achievements in the understanding and characterization of Italian seismicity (Section 2), collected in the Database of Individual Seismogenic
Sources (DISS) (Valensise & Pantosti 2001b). By seismogenic source the authors mean a simplified and georeferenced 3-D representation of a fault plane; each source either generated large-magnitude earthquakes in the past or is likely to experience a significant earthquake in the future. We have then treated the sources located in the southern Apennines as subunits of that seismogenic region. As data set we have used
the catalogue CPTI04 (CPTI Working Group 2004), the most recent version of the parametric catalogue of damaging earthquakes in the Italian
area. It is the result of the revision and extension of the previous catalogue CPTI99, in its turn a fusion of different catalogues, mainly NT4.1.1 (Camassi & Stucchi 1997) and CFTI 1–2 (Boschi et al. 1997), in the perspective of seismic hazard assessment. CPTI04 covers the period
from the Ancient World to 2002 and contains 2550 records of main shocks with epicentral intensity I_0 ≥ V–VI on the Mercalli-Cancani-Sieberg scale, or magnitude M_S ≥ 4.0. The most remarkable developments concern the part related to the interval 1981–2002: (i) the gaps in the time window 1981–1992 have been filled; (ii) empirical relationships have been estimated to obtain M_S from the instrumentally evaluated magnitude M_L. The compilers have not included in the catalogue the events falling into time–space windows of 30 km and ±90 days centred on main shocks. From the catalogue CPTI04 we have drawn the earthquakes that occurred in the Sannio-Matese-Ofanto-Irpinia region in the time period 1650–1992 with magnitude M_S ≥ 4.8. This magnitude threshold makes the events considered consistent with Reid's
theory, that is, appropriate to be described by stress release models. In Section 3, we recall some versions of the SR model and apply the simple SR model to the entire data set, whereas the independent SR model is fitted to the set partitioned according to geological information.
Following the Bayesian paradigm we insert in the analysis information obtained from previous studies (Rotondi & Varini 2003a) by assigning the prior distributions of the model parameters. In such a way we take into account the uncertainty in the evaluation of the conditional
intensity function characterizing the particular point process model. A further advantage of the Bayesian approach consists in providing, besides
an estimator, the whole posterior distribution of each quantity to be estimated, a tool for quantifying the uncertainty of the estimator. The greater
flexibility of the modelling requires more powerful inferential methodologies; in our case as it is not possible to give the posterior distributions
explicitly, we have estimated them by generating sufficiently large samples from them through Markov chain Monte Carlo (MCMC) methods.
In Appendix A, we give a concise description of these methods referring to Gilks et al. (1995) for more information. The goodness of fit
of the estimated models is evaluated through residual analysis, and the models are compared with each other on the basis of the Bayes factor. Finally, we forecast the next step in the time evolution of the process, that is, the date of the next earthquake. In fact, the stress release models give time-dependent hazard assessments conditionally on the entire seismic history of the region under examination. From this viewpoint, forecasting the next event means simulating the time evolution of the process; we have done this through a stochastic simulation method, the timescale
transformation method, both using the plug-in estimate of the conditional intensity and the same intensity with parameter values sampled
from their posterior distributions. The forecasts, given in terms of mean and quantiles of the distribution of the date of the next event, are
compared with the events that actually occurred in the period 1993–2002 (Section 4).
2 SEISMOGENIC ZONES AND CATALOGUE
One of the crucial factors affecting the performance of the stress release models is the identification of suitable seismogenic regions. The southern Apennines are one of the portions of the Italian peninsula that experience the most frequent and destructive seismic activity; in the most
recent zonation ZS9 of the Italian territory (MPS04 Working Group 2004) this area corresponds to the zone 927, depicted in Fig. 1, which
coincides largely with the region, homogeneous from the viewpoint of the kinematic context and the expected rupture mechanism, formed
by the four zones 58, 62, 63 and 64 of the previous zonation ZS4 (Meletti et al. 2000). Changes to ZS4 have been done to make the zonation
compatible with the new catalogue CPTI04 and to acknowledge the innovative contribution of the database of the potential seismogenic
sources DISS (Valensise & Pantosti 2001b). This database is the result of a project developed by the scientists of the Istituto Nazionale di
Geofisica e Vulcanologia (INGV) starting from 1996; its goal is a fault segmentation model of Italy based on the assumption that seismicity may be approximated by a finite number of potential seismogenic sources, surface projections of inferred faults which are likely to experience a significant earthquake in the future. According to the recent research achievements of this project, several large historical earthquakes that occurred in the relatively narrow belt of seismicity characterizing the southern Apennines are associated with inferred faults, either blind or hidden (see
Fig. 1 in Valensise & Pantosti 2001a). In particular, for the earthquakes of 1688, 1732 and 1857, of magnitude M_S = 6.72, 6.61 and 6.96 respectively, we refer to Valensise et al. (1993), and for the shock of 1805, M_S = 6.57, to Cucci et al. (1996).
The average quality of buildings and the enormous cultural heritage make Italian cities especially vulnerable to earthquakes that would
not be as dangerous in other earthquake-prone countries; therefore we generally consider events with magnitude M_S 5+ as strong. In the new catalogue CPTI04 the magnitude of several events has been re-evaluated, and this process has led on average to a reduction of the magnitude value by 0.2. Hence, for homogeneity with previous studies, we have used M_S = 4.8 as the threshold magnitude in the present analysis. The data set
consists of 35 events; the analysis of completeness, performed through the identification of the change point (Rotondi & Garavaglia 2002), indicates that the data set can be considered complete from the second half of the seventeenth century. Table 1 provides the summaries
of the posterior distributions of the change-point s, date from which the catalogue is considered complete, and of the rate h2 of the complete
part of the catalogue. The graphical representation of these results is given in Fig. 2.
We use the 24 events recorded from 1650 to 1992 to make inference, forecast the time of the next event and compare it with the actual occurrences in the interval 1993–2002. In the light of all available historical and geological evidence we have subdivided the zone 927 into
Figure 1. Seismogenic zone 927 of the Italian zonation ZS9 and epicentres of the shocks with magnitude M_S ≥ 4.8 in the time interval 1650–2002. The symbols mark the events belonging to the different subunits denoted by 1, 2, 3 and 4 in the text: circle = 1, inverse triangle = 2, square = 3, triangle = 4; the star indicates the Calabrian-Lucano Apennines earthquake of 1998/9/9. The symbol size scales with magnitude.
Table 1. Zone 927. Results on the completeness of the data set constituted by events of magnitude M_S ≥ 4.8. Summaries of the posterior distributions of the change point s and of the rate h2 of the part of the catalogue considered complete.
            Mean      Mode      Standard deviation
s           1665.00   1688.42   55.03
h2          0.0751    0.070     0.0151

Quantiles   q0.025   q0.05    q0.25    q0.5     q0.75    q0.95    q0.975
s           1446.2   1557.8   1666.3   1678.6   1685.1   1691.5   1697.6
h2          0.05     0.05     0.06     0.07     0.09     0.10     0.11

0.95 HPD interval   q_l    q_h
h2                  0.04   0.10
Figure 2. Zone 927—Graphic representation of the results on completeness. Above: posterior mean rate (solid line), cumulative number of events (dotted line),
and seismic history. Below left: estimated probability density of the position of the change point; below right: estimated probability density of the rate h2.
four subunits, each characterized by the presence of a fault segment which generated at least one earthquake with M_S ≥ 6 in the past and containing a number of events at least equal to the number of model parameters (in our case n ≥ 3). Table 2 contains the data subdivided
by subunit; we are uncertain about the classification of the 1826 historical earthquake which could be assigned to either the subunit 3 or 4.
Its causative fault is unknown, but if we consider the transverse lineaments taken from the Synthetic Structural-Kinematic Map of Italy then
it should be separated from the fault segment that generated the 1694 and 1980 strong (M_S > 6) earthquakes. Hence our preferred allocation is to subunit 4; we have also examined the alternative classification, and the results obtained are not significantly different, as shown in
Section 4.
Table 2. From the catalogue CPTI04, list of the earthquakes with M_S ≥ 4.8 that occurred in the zone 927 of the zonation ZS9 in the interval 1650–1992. The number in the right-hand column denotes the membership subzone, numbered from north to south; the correspondence with the symbols in Fig. 1 is: 1 = circle, 2 = inverse triangle, 3 = square and 4 = triangle.
Date Latitude Longitude M_S Zone Date Latitude Longitude M_S Zone
1688/06/05 41.280 14.570 6.72 1 1857/12/16 40.350 15.850 6.96 4
1694/09/08 40.880 15.350 6.87 3 1858/08/06 40.750 15.550 4.80 3
1702/03/14 41.120 14.980 6.32 2 1882/06/06 41.550 14.200 4.96 1
1708/01/26 39.922 16.126 5.46 4 1885/09/17 41.133 14.800 4.80 2
1732/11/29 41.080 15.050 6.61 2 1893/01/25 40.583 15.417 4.80 3
1794/06/12 41.000 15.000 4.80 2 1905/11/26 41.134 15.028 5.02 2
1805/07/26 41.500 14.470 6.57 1 1910/06/07 40.900 15.420 5.84 3
1807/11/11 40.297 15.845 4.80 4 1914/12/19 41.583 14.250 4.80 1
1826/02/01 40.520 15.730 5.56 3/4 1941/09/07 41.200 15.000 4.80 2
1831/01/02 40.082 15.785 5.23 4 1962/08/21 41.130 14.970 6.19 2
1836/11/20 40.150 15.780 5.79 4 1980/11/23 40.850 15.280 6.89 3
1853/04/09 40.820 15.220 5.89 3 1982/03/21 40.008 15.766 5.05 4
3 STRESS RELEASE MODELS: ESTIMATION AND SIMULATION
Let us consider the two data sets D = {t_i, m_i}_{i=1,...,n} and U = {t_i, m_i, r_i}_{i=1,...,n} where, for each i = 1, ..., n, t_i = T_i − T_0, T_i is the date and m_i the magnitude of the ith event, T_i ∈ (T_0, T_f) (in our case T_0 = 1650 and T_f = 1992), m_i > M_0 = 4.79, the magnitude threshold, and r_i is the membership subunit of the ith event as defined in Section 2. Let us assume that at time t the stress present in the region is given by X(t) = X(0) + ρt − S(t), where S(t) = Σ_{i: t_i < t} 10^{0.75 (m_i − M_0)} is the sum of the stress released, according to Benioff's formula, by all the events over the period (0, t), ρ > 0 is the constant unknown loading rate and X(0) is the initial stress level. Following the proposal of
Zheng & Vere-Jones (1994), we factor the conditional intensity function characterizing the model into the product of a risk function ψ(X(t)) – the probability of occurrence in (t, t + dt), given the stress level at time t – and of the density function f(m) for the magnitude, and we obtain λ(t, m | H_t) = ψ(X(t)) f(m | X(t)), where H_t is the history of the process up to t. In Rotondi & Varini (2003a) we have studied
the case of magnitude distributions depending on the history of the process up to time t; here, for the sake of simplicity, we assume that
f(m) is independent of time and therefore does not affect the estimate of the time component of the hazard. The risk function is to be chosen among the convex functions increasing to infinity; the forms used in the literature are the positive part ψ(X(t)) = [µ + ν X(t)]^+ in Vere-Jones (1978), Rotondi & Varini (2003b), and the exponential ψ(X(t)) = exp{µ + ν X(t)} in all the other works. The parameter ν can be
interpreted as the response of the region to the stress accumulated and is related to the strength and the heterogeneity of the crust (Bebbington
& Harte 2003). Here we have followed the exponential choice because, on the one hand, there are no physical reasons to discriminate between
the two functional forms and, on the other hand, the results obtained in the above-mentioned studies are not very sensitive to this option. We
examine two possibilities: (a) a single stress regime X(t) acts over the whole region and (b) different, independent stress levels X_j(t), j = 1, ..., 4, exist in each of the four subunits constituting the region, each responding separately to the stress. Case (a) is modelled by a simple SR model on the data set D:
λ_S(t | H_t) = exp{α + ν (ρt − S(t))},   (1)

where α = µ + ν X(0), and case (b) by an independent SR model on the data set U:
λ_I(t | H′_t) = Σ_{j=1}^{4} ψ_j(X_j(t)) = Σ_{j=1}^{4} exp{α_j + ν_j (ρ_j t − S_j(t))},   (2)

where α_j = µ_j + ν_j X_j(0) and S_j(t) = Σ_{i: t_i < t, r_i = j} 10^{0.75 (m_i − M_0)}, that is, only the events belonging to the jth subunit enter into the sum. A
variant of this model with interesting physical interpretations is obtained by assuming that the tectonic input is the same in all the subunits,
that is ρ j = ρ, j = 1, . . . , 4. This means that the subunits are subjected to the same tectonic regime, but their geological structure, and hence
their sensitivity, is different. To distinguish the conditional intensity function of this particular independent model we denote it by λ_{Iρ}. The
time component of the log-likelihood is given by:
log(L) = Σ_{i=1}^{n} log λ(t_i | H_{t_i}) − ∫_{t_0}^{t_{n+1}} λ(t | H_t) dt
in the simple model, whereas H′_t replaces H_t in the independent model and in its variant with common tectonic loading rate. If we set all the parameters except the α's to zero in (1) and (2), then we obtain homogeneous Poisson processes with constant rate in the region and in each subunit, respectively. In this case α represents the logarithm of the rate. We will use this formulation to make inference on the Poisson models
by means of the same procedure used for the SR models.
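To make the likelihood computation concrete, the time component of the log-likelihood of the simple SR model (1) can be sketched as follows. Since S(t) is a step function, the compensator integral is analytic on each inter-event interval; this is an illustrative sketch of ours, not the authors' code, and the function names and test values are invented.

```python
import math

def benioff_release(m, M0=4.79):
    # Benioff stress drop 10^{0.75 (m - M0)} for one event of magnitude m
    return 10.0 ** (0.75 * (m - M0))

def simple_sr_loglik(alpha, nu, rho, times, mags, t0=0.0, tf=None, M0=4.79):
    """Time component of the log-likelihood of the simple SR model,
    lambda_S(t) = exp{alpha + nu*(rho*t - S(t))}.  S(t) is piecewise
    constant, so on each inter-event interval
      int_{a}^{b} e^{alpha + nu(rho t - S)} dt
        = e^{alpha - nu S} (e^{nu rho b} - e^{nu rho a}) / (nu rho)."""
    if tf is None:
        tf = times[-1]
    drops = [benioff_release(m, M0) for m in mags]
    # sum of log-intensities at the event times (S evaluated just before each event)
    log_sum, S = 0.0, 0.0
    for t, d in zip(times, drops):
        log_sum += alpha + nu * (rho * t - S)
        S += d
    # compensator: piecewise-analytic integral over (t0, tf)
    knots = [t0] + list(times) + [tf]
    levels = [0.0]
    for d in drops:
        levels.append(levels[-1] + d)
    integral = sum(math.exp(alpha - nu * S_lev)
                   * (math.exp(nu * rho * b) - math.exp(nu * rho * a)) / (nu * rho)
                   for a, b, S_lev in zip(knots[:-1], knots[1:], levels))
    return log_sum - integral
```

Setting nu and rho to recover the Poisson case, or summing such terms over the four subunits, reproduces the other likelihoods discussed above.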
Inference on the parameters θ = (α, ν, ρ), {θ_j = (α_j, ν_j, ρ_j)}_{j=1}^{4}, and {θ_j = (α_j, ν_j, ρ)}_{j=1}^{4} has been performed following the Bayesian
paradigm. First, to take into account the uncertainty on the value of the parameters we have assigned their prior distributions on the basis
of estimates found in the literature and of information drawn from previous studies (Rotondi & Varini 2003a). Table 3 summarizes the prior
distributions and their parameters. The assignments do not change in the different SR models; on the contrary, in the Poisson models the value
Table 3. Prior distributions for the parameters of the stress release and Poisson models.
Model Parameter Prior Mean Variance
Stress release α Normal −5. 9.
(simple and independent) ν Gamma 0.1 0.01
ρ Gamma 0.5 0.25
Poisson (in the region) α Normal −3. 1.
Poisson (in each subunit) α Normal −4.5 1.
Table 4. Proposal distributions used in the Metropolis–Hastings algorithm.
Candidate value Proposal Mean Variance
of the chain
α(i) Normal α(i−1) 0.5
ν(i) Lognormal ν(i−1) 0.0001
ρ(i) Lognormal ρ(i−1) 0.2
of the α parameter decreases by about log 4 when moving from the entire region to the four subunits, as α is approximately the logarithm of the expected occurrence rate and the rate of each subunit is taken equal to one fourth of the regional rate.
In principle, the posterior distributions are obtained through Bayes' theorem, but in our case it is impossible to evaluate the required integrals analytically; therefore, we resort to Markov chain Monte Carlo techniques, in particular to the Metropolis–Hastings algorithm, to obtain
samples from the posterior distributions of the parameters. The procedure mainly consists in designing for each parameter a Markov chain
whose long-time equilibrium is the corresponding posterior distribution. At each iteration a new candidate element of the chain is drawn
from a suitable distribution, called proposal; with a certain probability it is accepted and the chain moves, otherwise it is rejected and the
chain remains at the current value. The ergodic averages of suitable functions of the generated sample converge to the desired summaries of
the posterior distribution like mean, variance, and quantiles. Moreover, kernel smoothing techniques applied to the sampled chains provide
estimates of the posterior densities. In Table 4, we have listed the proposals used in the Metropolis–Hastings algorithm; the indices mean that, for instance, for the parameter α the candidate value α^(i) at the ith iteration is generated from a normal distribution centred on the previous element of the chain, α^(i−1), and with the variance reported in Table 4. The main ideas of the MCMC methods are outlined in Appendix A; for more details we refer to Gilks et al. (1995) and references therein. After evaluating the posterior means of the parameters we have substituted them into the likelihoods of the different models and obtained their plug-in estimates, which has allowed us to compare the fit of the models to the data set.
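The Metropolis–Hastings mechanics described above can be sketched in a few lines. This is a minimal illustration of ours on a toy one-dimensional target (a standard normal log-density), not the actual SR posterior; the function names and tuning values are invented.

```python
import math
import random

def metropolis_hastings(log_target, x0, proposal_sd=0.7, n_iter=20000, seed=42):
    """Random-walk Metropolis: propose x' ~ N(x, sd^2) and accept with
    probability min(1, target(x')/target(x)); the proposal is symmetric,
    so no Hastings correction appears in the ratio."""
    rng = random.Random(seed)
    x = x0
    chain = []
    for _ in range(n_iter):
        cand = x + rng.gauss(0.0, proposal_sd)
        log_ratio = log_target(cand) - log_target(x)
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x = cand          # accepted: the chain moves
        chain.append(x)       # otherwise the chain stays at the current value
    return chain

# toy target: standard normal log-density; ergodic averages of the sample
# approximate the posterior summaries (mean, variance, quantiles)
chain = metropolis_hastings(lambda x: -0.5 * x * x, x0=0.0)
burned = chain[5000:]                       # discard a burn-in period
post_mean = sum(burned) / len(burned)
```

For the positive parameters ν and ρ the paper uses lognormal proposals centred on the current value; such proposals are not symmetric, so the acceptance ratio must include the proposal-density correction q(x | x′)/q(x′ | x), which for a lognormal proposal of this kind equals x′/x.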
The goodness of fit of the different models can also be evaluated through another statistical tool: the residual analysis (Ogata 1988). The
1–1 transformation of the time from t to τ = Λ(t), Λ being the cumulative intensity function of any of the considered models, changes the occurrence dates t_1, t_2, ..., t_n into τ_1, τ_2, ..., τ_n, which are distributed as a standard stationary Poisson process if λ(·) is the intensity function of the process that really generated the data. Standard tests for stationarity, independence, and exponentially distributed interevent times
can be used to determine whether there are systematic deviations of the data from the fitted model.
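The time-rescaling step behind the residual analysis can be sketched as follows; the helper names are ours, and the closing check uses an invented toy intensity rather than an SR model.

```python
import math

def residual_process(event_times, Lambda):
    """Time rescaling: tau_i = Lambda(t_i).  Under the true model the tau_i
    form a unit-rate Poisson process, i.e. the gaps are i.i.d. Exp(1)."""
    taus = [Lambda(t) for t in event_times]
    return [b - a for a, b in zip([0.0] + taus[:-1], taus)]

def ks_exp1(gaps):
    """Kolmogorov-Smirnov distance between the empirical cdf of the rescaled
    interevent times and the Exp(1) cdf 1 - exp(-x); large values signal
    systematic deviations of the data from the fitted model."""
    xs = sorted(gaps)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        F = 1.0 - math.exp(-x)
        d = max(d, abs(F - i / n), abs(F - (i + 1) / n))
    return d

# toy check with data generated exactly from a model: lambda(t) = 2t, so
# Lambda(t) = t^2, and events at t_i = sqrt(i) rescale to perfectly regular gaps
gaps = residual_process([math.sqrt(i) for i in range(1, 6)], lambda t: t * t)
```

In practice Λ would be the estimated cumulative intensity of the simple or independent SR model, evaluated by the same piecewise integration used for the likelihood.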
3.1 Comparison of models through the Bayes factor
Consistently with the Bayesian approach we are following, we adopt the Bayes factor as a quantitative measure of the evidence in favour of a model (Kass & Raftery 1995). Given the models M1, M2 and the data set D, the Bayes factor is the ratio of the posterior odds of M1 to its prior odds, that is:

B_12 = pr(D | M1) / pr(D | M2) = [pr(M1 | D) / pr(M2 | D)] ÷ [pr(M1) / pr(M2)].   (3)
When the prior probabilities of the two competing hypotheses are equal, the Bayes factor coincides with the posterior odds. The densities pr(D | M_k), k = 1, 2, are obtained by integrating over the parameter space with respect to the prior distributions:

pr(D | M_k) = ∫ pr(D | θ_k, M_k) π(θ_k | M_k) dθ_k,   (4)
where π(θ_k | M_k) is the prior density of the parameter θ_k under M_k, and pr(D | θ_k, M_k) is the likelihood function of θ_k. The quantity pr(D | M_k) is a marginal (or integrated) likelihood; it is also referred to as the evidence for M_k. Integration with respect to the prior distributions allows one to take into account the different numbers of parameters in the models, besides the uncertainty on their values, as is done in frequentist statistics by model selection criteria like AIC. To clarify how the evidence penalizes highly complex models we follow the interpretation given by Bishop (1995) of the expression (4). Let us consider a single parameter θ. The integrand of (4) is proportional to the posterior density of θ; if it is sharply peaked around the most probable value, the mode θ_MP, then we can approximate the integral (4) by

pr(D | M_k) ≈ pr(D | θ_MP, M_k) π(θ_MP | M_k) Δθ_posterior,   (5)
where Δθ_posterior is the width of the peak. If we take the prior π(θ) to be uniform over some large interval Δθ_prior and we approximate its value by 1/Δθ_prior, then (5) becomes

pr(D | M_k) ≈ pr(D | θ_MP, M_k) (Δθ_posterior / Δθ_prior).
The expression on the right-hand side is the product of the likelihood evaluated at the most probable parameter value and of a ratio, called the Occam factor, with value < 1. For a model with many parameters, each will generate a similar Occam factor, which reduces the evidence. Hence the best model according to the Bayes factor emerges from the balance between fitting the data well (large likelihood) and not being too complex (relatively large Occam factor).
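A toy numerical illustration of the Occam factor in (5), under assumptions of our own choosing (a one-parameter model y_i ~ N(θ, 1) with a flat prior of given width): widening the prior leaves the likelihood untouched but shrinks the Occam factor, and hence the evidence, proportionally.

```python
import math

def occam_approx(ys, prior_width):
    """Evidence via (5): pr(D|M) ~= L(theta_MP) * Occam factor, for the toy
    model y_i ~ N(theta, 1) with a flat prior of width `prior_width`.
    theta_MP is the sample mean, and the Gaussian posterior peak has width
    Delta_post = sqrt(2*pi/n), so the Occam factor is Delta_post/prior_width."""
    n = len(ys)
    theta_mp = sum(ys) / n
    log_lik = sum(-0.5 * (y - theta_mp) ** 2 - 0.5 * math.log(2 * math.pi)
                  for y in ys)
    delta_post = math.sqrt(2.0 * math.pi / n)   # width of the likelihood peak
    return math.exp(log_lik) * delta_post / prior_width
```

Doubling the prior width halves the evidence even though the fit is unchanged, which is exactly the complexity penalty discussed above.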
The computational difficulties concerning the evaluation of the multidimensional integral in (4) can be solved by numerical methods or, better, by Monte Carlo methods. Simple Monte Carlo integration is based on sampling from the priors π(θ_k | M_k); its precision can be improved by importance sampling, which consists in generating a sample {θ_k^{(i)}; i = 1, ..., M} from another density π*(θ_k | M_k), called the importance sampling function, and in approximating the integral (4) by

Σ_{i=1}^{M} w_i pr(D | θ_k^{(i)}, M_k) / Σ_{i=1}^{M} w_i,   (6)

where w_i = π(θ_k^{(i)} | M_k) / π*(θ_k^{(i)} | M_k). In our case it is particularly convenient to exploit the sequences already generated by the Metropolis–Hastings algorithm for estimation, and to adopt the posterior density as importance function. Substituting in (6) yields the harmonic mean of the likelihood values as an estimate of pr(D | M_k):
p̂r(D | M_k) = {(1/M) Σ_{i=1}^{M} pr(D | θ_k^{(i)}, M_k)^{−1}}^{−1}.   (7)
This converges almost surely to the correct value, pr(D | M_k), as M → ∞, but it may have infinite variance because of the occasional occurrence of a value of θ_k^{(i)} with a small likelihood and hence a large effect on the final result (Raftery et al. 2006). If the values θ_k^{(i)} were drawn from the prior density, the estimator (7) would satisfy the Gaussian central limit theorem; a solution could then consist in using as importance sampling function a mixture of the prior and posterior densities. To avoid a double sampling, we add to the M samples from the posterior a further δM/(1 − δ) values of θ drawn from the prior and associate with each of them the expected value of the likelihood, pr(D | M_k) (Newton & Raftery 1994). This leads to approximating the evidence pr(D | M_k) by
p̂r(D | M_k) = [δM/(1 − δ) + Σ_{i=1}^{M} pr(D | θ_k^{(i)}, M_k) / {δ p̂r(D | M_k) + (1 − δ) pr(D | θ_k^{(i)}, M_k)}] / [δM/(1 − δ) p̂r(D | M_k)^{−1} + Σ_{i=1}^{M} {δ p̂r(D | M_k) + (1 − δ) pr(D | θ_k^{(i)}, M_k)}^{−1}].
The estimator p̂r(D | M_k) is evaluated by using a simple iterative scheme which generally converges quickly. Following the hints given by Newton & Raftery (1994), we adopted δ as small as 0.01 and did not observe any of the instability affecting the estimator (7).
Among the properties of the Bayes factor we recall that it does not require the alternative models to be nested (Kass & Raftery 1995); moreover, the Bayes factor can be used to calculate the posterior probabilities of competing models. When one suspects that none of the considered models is correct, any approach that selects one of them and then draws inference conditionally on that model ignores the uncertainty involved in model selection. This difficulty can be avoided by performing composite estimation and prediction through the posterior probabilities of the models provided by the corresponding Bayes factors (Hoeting et al. 1999; Wasserman 2000).
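Given the log10 marginal likelihoods, the posterior model probabilities follow by simple normalization; a minimal sketch, here fed with the values that appear in Table 7:

```python
import numpy as np

def posterior_model_probs(log10_ev, prior=None):
    # Posterior probabilities of competing models from their log10 marginal
    # likelihoods, assuming equal prior model probabilities by default;
    # the max-shift keeps the exponentiation well scaled.
    log10_ev = np.asarray(log10_ev, dtype=float)
    if prior is None:
        prior = np.full(log10_ev.size, 1.0 / log10_ev.size)
    w = prior * 10.0 ** (log10_ev - log10_ev.max())
    return w / w.sum()

# log10 pr(D | M) for MP, MS, MP4, MI, MIrho as reported in Table 7
p = posterior_model_probs([-52.78, -52.96, -53.71, -51.26, -50.59])
```

With equal priors, the model MIρ receives roughly 0.8 of the posterior probability, in line with the Bayes factors of Table 7.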
3.2 Forecasting the time-to-the-next event
We have then tackled the problem of forecasting the date of the next event by simulating the distribution of the waiting time. The thinning method (Ogata 1981; Bebbington & Harte 2003) provides a general solution to the problem of simulating a point process. In our case it is also possible to apply an alternative and more efficient simulation method, the timescale transformation method, which is particularly appropriate when it is easy to invert the cumulative intensity function Λ(t | ·). The algorithm is composed of the following steps:
(i) generate u ∼ Unif(0, 1);
(ii) the time-to-next-event is given by

$$t' = \frac{1}{\nu\rho}\,\log\left\{1 - \frac{1}{A}\log(1-u)\right\}, \quad \text{where} \quad A = \frac{1}{\nu\rho}\exp\{\alpha + \nu\,(\rho T - S(T))\}.$$
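Steps (i)–(ii) amount to inverse-transform sampling of the waiting time. A minimal Python sketch follows; the parameter values α, ν, ρ, T and S(T) used in the call are illustrative placeholders, not the estimates of Table 5:

```python
import numpy as np

rng = np.random.default_rng(0)

def waiting_time(alpha, nu, rho, T, S_T, size=1):
    # Timescale-transformation simulation of the waiting time t' after the
    # end T of the observation period for a simple stress release model;
    # S_T is the accumulated stress release S(T).
    u = rng.uniform(size=size)                       # (i) u ~ Unif(0, 1)
    A = np.exp(alpha + nu * (rho * T - S_T)) / (nu * rho)
    return np.log1p(-np.log1p(-u) / A) / (nu * rho)  # (ii) invert Lambda

# illustrative parameter values only
t_next = waiting_time(alpha=-2.8, nu=0.008, rho=0.8, T=340.0, S_T=260.0,
                      size=10_000)
```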
Details of these and other simulation procedures are given in Wang et al. (1991). As regards the independent SR models, we simulate the distribution of the time t′_j to the next earthquake in each subunit by applying the above-mentioned procedure to each ψ_j(·) separately; then the minimum t′ = min_{j=1,…,4} t′_j provides the time-to-next-event in the overall region. In this way we have estimated the distribution of the time of the next event in each subunit as well as in the global region. Suppose the simulation is repeated M times. The frequency

$$\hat{\pi}_j = \frac{1}{M}\sum_{i=1}^{M} I\left(t'_i = t'_{ij}\right)$$

of the sweeps in which the simulated next event comes from the jth subunit provides a rough estimate of the probability that the next earthquake will occur in that subunit.
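The sweep over subunits can be sketched as follows; the per-subunit parameters are again illustrative placeholders and the waiting-time formula is the one of steps (i)–(ii) above:

```python
import numpy as np

rng = np.random.default_rng(1)

# One (alpha, nu, rho, S_T) tuple per subunit -- illustrative values only.
params = [(-4.6, 0.06, 0.20, 60.0), (-4.4, 0.07, 0.20, 55.0),
          (-4.4, 0.10, 0.20, 70.0), (-5.8, 0.10, 0.20, 65.0)]
T, M = 340.0, 100_000

# Draw a waiting time t'_j for every subunit in each sweep, keep the minimum
# as the regional time-to-next-event, and estimate pi_j as the frequency of
# sweeps in which subunit j produced that minimum.
t = np.empty((M, len(params)))
for j, (alpha, nu, rho, S_T) in enumerate(params):
    u = rng.uniform(size=M)
    A = np.exp(alpha + nu * (rho * T - S_T)) / (nu * rho)
    t[:, j] = np.log1p(-np.log1p(-u) / A) / (nu * rho)

t_region = t.min(axis=1)
pi_hat = np.bincount(t.argmin(axis=1), minlength=len(params)) / M
```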
C© 2007 The Authors, GJI, 169, 301–314
Journal compilation C© 2007 RAS
Lu & Vere-Jones (2001) observe that ‘additional uncertainty arises from the fact that the parameters are fitted from limited data runs and so are not known exactly. To capture this additional uncertainty, each simulation should start by first selecting parameter values from a posterior distribution for such values. This will somewhat increase the spread of the simulations and so decrease the predictability . . . .’ The Bayesian setting is the most suitable to deal with the uncertainty about the parameters, because the estimation procedure provides the probability distribution of each parameter besides its point estimate. In the simulation of the time-to-next-event we have applied the timescale transformation method first to the plug-in estimates of the conditional intensities λ_S(· | H_T, θ) (1) and λ_I(· | H_T, θ), λ_{Iρ}(· | H_T, θ) (2), and then to the same intensities evaluated at each parameter vector of a sample of size M drawn from the posterior distribution.
4 R E S U L T S
Summarizing, the models examined are: Poisson (M_P) and simple SR (M_S) on the entire region; Poisson (M_{P4}) and independent SR with (M_{Iρ}) and without (M_I) common tectonic loading rate on the 4 subunits. For each of them the posterior mean and standard deviation of the parameters have been estimated using the ergodic averages of suitable functions of the Markov chains produced by the Metropolis–Hastings algorithm; these estimates are reported in Table 5. The MCMC sampling scheme followed in estimation consists in generating a chain of 10^7 elements for each parameter, recording the output every 250th iteration to reduce the autocorrelation within the chain and discarding the initial 20 per cent of scans to avoid starting biases. The plug-in estimates of the log-likelihood functions are reported in Table 5 as well; in particular, for the independent SR models the log-likelihood related to each subunit is given in the last column. Supposing that the prior probabilities of the models are equal, the Bayes factor (3), expressed in Log10, of the simple SR and Poisson models is Log10 pr(D | M_S) − Log10 pr(D | M_P) = −38.75 + 38.56 = −0.19, that is, slight evidence in favour of M_P according to Jeffreys’ scale recalled in Table 6.
Let us examine what happens when geophysical information is added, that is, when the region is subdivided into seismogenic sources. We point out that the models M_P and M_S are applied to the smaller data set D ⊂ U; in order to compare the values of their likelihood functions with those of the other models we have to add the membership probability of each event to a subunit; these probabilities are roughly estimated through the frequencies n_j/n, where n_j, j = 1, …, 4, is the number of events in the jth subunit, that is, the number of r_i = j, i = 1, …, n. With this addition the log-likelihood of the model M_P becomes −120.58 versus −120.76 of the model M_{P4}, showing a negligible gain. On the contrary, the log-likelihood of the simple model M_S becomes −120.10, versus −113.20 for the independent SR model M_I and −110.41
Table 5. Posterior means, standard deviations (in parentheses) of the parameters, and plug-in estimate of log-likelihood for the Poisson, simple
and independent SR models with and without common loading rate. The 1826 earthquake is included into zone 4.
Model α (s.d.) ν (s.d.) ρ (s.d.) loge L(θ )
Poisson −2.69 (0.20) −87.85
simple −2.79 (0.39) 0.0078 (0.0069) 0.79 (0.41) −87.37
Model Subunit α (s.d.) ν (s.d.) ρ (s.d.) loge L(θ )
Poisson 1 −4.54 (0.46) −21.82
 2 −4.03 (0.37) −34.31
 3 −4.18 (0.39) −30.32
 4 −4.03 (0.37) −34.31
 −120.76
Independent (ρ_j = ρ) 1 −4.61 (0.64) 0.0585 (0.0436) 0.20 (0.03) −21.19
 2 −4.39 (0.59) 0.0678 (0.0441) ” −32.66
 3 −4.40 (0.73) 0.1037 (0.0584) ” −27.19
 4 −5.84 (1.20) 0.1047 (0.0427) ” −29.36
−110.40
Independent 1 −4.44 (0.85) 0.0441 (0.0397) 0.22 (0.21) −21.37
2 −4.22 (0.75) 0.0542 (0.0454) 0.24 (0.21) −33.89
3 −4.57 (0.92) 0.0781 (0.0576) 0.26 (0.17) −29.23
4 −6.46 (1.52) 0.0994 (0.0431) 0.25 (0.08) −28.72
−113.20
Table 6. Scale suggested by Jeffreys (1961) for interpreting the Bayes factor.

Evidence against M1   B12              Log10 B12   |  Evidence in favour of M1   B12            Log10 B12
Slight                (0.31623, 1)     (−0.5, 0)   |  Slight                     (1, 3.1623)    (0, 0.5)
Moderate              (0.1, 0.31623)   (−1, −0.5)  |  Moderate                   (3.1623, 10)   (0.5, 1)
Strong                (0.01, 0.1)      (−2, −1)    |  Strong                     (10, 100)      (1, 2)
Decisive              (0, 0.01)        (−∞, −2)    |  Decisive                   (100, ∞)       (2, ∞)
Table 7. Marginal likelihood and pairwise Bayes factors of the models considered.

                  MP       MS       MP4      MI       MIρ
Log10 pr(D | M)   −52.78   −52.96   −53.71   −51.26   −50.59

Log10(B_{1,2}), with model M1 in the columns and model M2 in the rows:

M2 \ M1   MP       MS       MP4      MI       MIρ
MP        0        −0.19    −0.93    1.52     2.19
MS        0.19     0        −0.75    1.70     2.37
MP4       0.93     0.75     0        2.45     3.12
MI        −1.52    −1.70    −2.45    0        0.67
MIρ       −2.19    −2.37    −3.12    −0.67    0
Table 8. Posterior means, standard deviations (in parentheses) of the parameters, and plug-in estimate of log-likelihood for the Poisson,
simple and independent SR models with and without common loading rate. The 1826 earthquake is included into zone 3.
Model Subunit α (s.d.) ν (s.d.) ρ (s.d.) loge L(θ )
Poisson 1 −4.54 (0.46) −21.82
2 −4.03 (0.37) −34.31
3 −4.03 (0.37) −34.31
4 −4.18 (0.39) −30.32
−120.76
Independent (ρ j = ρ) 1 −4.61 (0.64) 0.0585 (0.0436) 0.20 (0.03) −21.19
2 −4.39 (0.59) 0.0678 (0.0441) ” −32.67
3 −3.99 (0.62) 0.0867 (0.0488) ” −31.67
4 −5.99 (1.20) 0.0989 (0.0432) ” −26.23
−111.76
Independent 1 −4.44 (0.85) 0.0441 (0.0397) 0.22 (0.21) −21.37
2 −4.22 (0.75) 0.0542 (0.0454) 0.24 (0.21) −33.89
3 −4.33 (0.83) 0.0633 (0.0472) 0.28 (0.19) −33.25
4 −6.19 (1.46) 0.0880 (0.0442) 0.24 (0.10) −26.02
−114.53
for the independent model M_{Iρ} with common loading rate. In this case the increase in the likelihood justifies the subdivision into zones. The marginal likelihoods and the pairwise Bayes factors of the five models are summarized in Table 7; the highest values correspond to the Bayes factors of the independent SR models against the Poisson model M_{P4}:

$$\mathrm{Log}_{10}\left(B_{M_{I\rho}, M_{P4}}\right) = 3.12, \qquad \mathrm{Log}_{10}\left(B_{M_{I}, M_{P4}}\right) = 2.45.$$

In both cases there is decisive evidence against the Poisson model M_{P4}; moreover, we note that the data provide significant evidence for the model M_{Iρ} against any other.
The same analysis could be repeated for the results in Table 8, corresponding to the case in which the 1826 earthquake is assigned to subunit 3; the conclusions are clearly the same. We note that the residual analysis has not turned out to be a discriminating factor among the different models; in fact all the models have passed the Kolmogorov–Smirnov test for the uniform distribution of the transformed occurrence times; the p-values obtained for the different models are given in Table 9.
Fig. 3 represents the plug-in estimate of the conditional intensity of the simple (a) and of the independent (b) SR models. Since the addends ψ_j(X_j) of (2) are independent, the conditional intensity function of the independent models can be decomposed into the sum of the conditional intensities related to each subunit; these are represented in Fig. 4(a) for the independent model with different loading rates and in Fig. 4(b) for the independent model with ρ_j = ρ. In Fig. 5 we gather most of the information obtained from the iterative estimation procedure on the conditional intensity function. For the best model according to the Bayes factor, M_{Iρ}, Fig. 5 depicts the plug-in estimate and the ergodic mean of λ_{Iρ}; moreover, if we consider the set of values λ_{Iρ}(t | θ^{(i)}), i = 1, …, 32 000, where the θ^{(i)}’s are drawn from the chain generated by the inferential MCMC method, then we can compute, at each t, the median and the extremes of the shortest interval including 90 per cent of those values.
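The shortest interval containing 90 per cent of the sampled intensity values can be found by sliding a fixed-size window over the order statistics; a small sketch of this pointwise computation:

```python
import numpy as np

def shortest_interval(x, coverage=0.9):
    # Shortest interval containing `coverage` of the values: sort, slide a
    # window of fixed size k over the order statistics, keep the narrowest.
    # Applied pointwise, at each t, to the MCMC draws lambda(t | theta^(i)).
    x = np.sort(np.asarray(x, dtype=float))
    k = int(np.ceil(coverage * x.size))
    widths = x[k - 1:] - x[: x.size - k + 1]
    i = int(widths.argmin())
    return x[i], x[i + k - 1]
```

For a right-skewed sample, as intensity values typically are, this interval sits lower than the equal-tail one.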
The simulation procedure performed for forecasting was repeated M = 5 × 105 times. The histograms of the times-to-the-next event
simulated through the plug-in estimates λS(θ), λI (θ) and λIρ(θ) are represented in Figs 6(a–c), respectively. Fig. 7 depicts the histogram of
Table 9. p-values of the Kolmogorov–Smirnov test for the uniform distribution of
the transformed times of occurrence.
Model MP MS MP4 MI MIρ
p-value 0.15 0.31 0.21 0.26 0.08
Figure 3. Estimated conditional intensity function of the Poisson (dotted line) and simple (solid line) models (a); and of the Poisson (dotted line), independent
SR (solid line), independent SR with common loading rate (dashed line) models (b); each vertical dotted line corresponds to an event and is proportional to its
magnitude.
Figure 4. Estimated conditional intensity function of each subunit in the independent SR model with different loading rate (a) and with common rate (b) (the
1826 earthquake is included into the subunit 4).
Figure 5. Plug-in estimate (short dashed line) and ergodic mean (dotted line) of the conditional intensity of the independent SR model with common loading
rate. At each t, varying the parameters simulated by the MCMC method, a set of values λ(t | θ(i)), i = 1, . . . , 32 000, is obtained; the median value and the
shortest interval including 90 per cent of the values of this set are represented by the solid and the dashed lines, respectively (the 1826 earthquake is included
into the subunit 4).
Figure 6. Histograms of the simulated distribution of the time of the next event for the simple (a) and independent SR model with different loading rate (b)
and independent SR model with common loading rate (c).
Figure 7. Histograms of the simulated distributions of the time of the next event in each subunit obtained by applying the timescale transformation method to
each ψ j (·) of the independent MI model, separately.
the values sampled through each ψ_j(·) of the model M_I for every subunit. As suggested by Lu & Vere-Jones (2001), we have simulated by using, in the conditional intensity, both the point estimates of the parameters and values drawn from the posterior distributions. Summaries of the samples obtained in the two ways are given in Tables 10 and 11, respectively: the mean, the standard deviation, the (1 − α) quantiles for α = 0.1, 0.5, 0.9, and, in the case of the independent SR models, also the probabilities that the next event in the region occurs in a specific subunit.
We have also evaluated the marginal likelihood functions over the time interval 1993–2002 on the basis of the information coming from the calibration period 1650–1992, that is, by integrating (4) with respect to the posterior densities. This is a way to test the ability of the models to predict future data (Forster 2000). The results, in terms of log-likelihood, are: −4.6785 for the Poisson model M_{P4}, −4.5286 for the independent model M_I and −5.2408 for the independent model M_{Iρ}; hence in the extrapolation the assumption that the tectonic loading rate is different for each subunit provides the best fit. However, we have to recognize that this conclusion cannot be significant, as ten years, the length of the period available for this test, is a short span. All the same, we think this example is useful in highlighting the potential of the Bayesian methodologies in a predictive context.
Table 10. Estimated quantiles and mean of the simulated distribution of the time of the first event after 1992, obtained by using the plug-in estimates of the parameters for the considered models; for the independent SR models π_j, j = 1, …, 4, are the estimates of the probabilities that the next event occurs in subunit j.
Model Quantiles tnext (s.d.) π 1 π 2 π 3 π 4
10 per cent 50 per cent 90 per cent
Poisson 1994.52 2003.01 2026.25 2007.44 (14.44)
Simple 1994.48 2002.51 2022.67 2006.07 (12.18)
Poisson (4 subunits) 1994.59 2003.47 2027.78 2008.11 (15.11) 0.18 0.28 0.25 0.29
Independent (ρ j = ρ) 1994.32 2001.25 2017.31 2003.89 (9.61) 0.35 0.50 0.03 0.12
Independent 1993.71 1997.56 2007.16 1999.24 (5.76) 0.24 0.51 0.08 0.17
Table 11. Analogously to Table 10, summaries of the simulated distributions of the time of the first event after 1992 obtained by sampling from the posterior
distribution of each parameter.
Model Quantiles tnext (s.d.) π 1 π 2 π 3 π 4
10 per cent 50 per cent 90 per cent
Poisson 1994.52 2003.16 2027.90 2008.06 (15.73)
Simple 1994.33 2001.82 2023.57 2006.25 (14.25)
Poisson (4 subunits) 1994.59 2003.57 2029.28 2008.66 (16.27) 0.18 0.28 0.25 0.29
Independent (ρ j = ρ) 1993.70 1999.74 2021.01 2004.41 (13.75) 0.34 0.45 0.05 0.16
Independent 1994.11 2000.39 2018.12 2003.89 (11.30) 0.23 0.35 0.09 0.33
5 C O N C L U S I O N S
In this paper, we have shown how a more careful appraisal of the data set can greatly improve the performance of a stochastic model; in fact we have exploited information on the seismogenic sources to split up the region under exam and to fit different versions of the stress release model to the enriched data set. We can conclude that in our case a substantially better fit is obtained by treating the parts of the region as independent units than by treating them as parts of a common unit. Moreover, between the two possibilities, common or different tectonic loading rate, the first has given the best result in terms of Bayes factor. These results match those provided in the literature by Zheng & Vere-Jones (1991, 1994), where large-scale geographical regions of China, Iran and Japan give better fits to the stress release models when broken down into subunits. The class of stress release models seems capable of considerable further ramification; future research could concern possible fault interactions, for which the model to be used is the linked SR model (Bebbington & Harte 2003).
The forecast of the time of the next event is also better when one uses the independent SR models; in fact, in Table 10 the time predicted by these models is 2003.89 or 1999.24 when the loading rate is common or different, respectively, versus 2006.07 for the simple model. We recall that the date of the unique earthquake with M_S ≥ 4.8 that actually occurred after 1992 is 1998.69; it hit subunit 4 (see Fig. 1). Again, considering the quantiles, the probability that the next event occurs within the interval (1993–1999) is clearly larger for the independent SR models than for the simple one. As regards the probabilities π_j, j = 1, …, 4, we have to resort to the simulation based on the parameter values sampled from the posterior distributions at every iteration (see Table 11) to obtain, in agreement with the real occurrence of the last earthquake in subunit 4, that π_4 is very similar to π_2 = max_{j=1,…,4} π_j for the models M_{P4}, M_I.
In conclusion, the winning combination seems to be that of the stress release model with a partition of the region into subunits related to seismogenic sources; while it is clear that the model M_{Iρ} fits the observations better in the backward analysis, the model M_I seems preferable in the predictive analysis, but the number of occurrences and the length of the time window make this latter result inconclusive.
A C K N O W L E D G M E N T S
The authors wish to express their sincere thanks to the two anonymous reviewers for their precise and thoughtful comments. The map was
produced by using the GMT software (Wessel & Smith 1998). This work was funded by the Italian Dipartimento della Protezione Civile in
the frame of the 2004–2006 Agreement with Istituto Nazionale di Geofisica e Vulcanologia - INGV.
R E F E R E N C E S
Bebbington, M. & Harte, D.S., 2003. The linked stress release model for
spatio-temporal seismicity: formulations, procedures and applications,
Geophys. J. Int., 154(3), 925–946.
Bishop, C.M., 1995. Neural Networks for Pattern Recognition, Clarendon
Press, Oxford
Boschi, E., Guidoboni, E., Ferrari, G., Valensise, G. & Gasperini, P., 1997.
Catalogo dei Forti Terremoti in Italia dal 461 a.C. al 1990, ING and SGA,
Bologna, pp. 644.
Camassi, R. & Stucchi, M., 1997. NT4.1.1 un catalogo paramet-
rico di terremoti di area italiana al di sopra del danno, Gruppo
Nazionale per la Difesa dai Terremoti, Milano, 95 pp., available from
http://emidius.itim.mi.cnr.it/NT
Cinti, F.R., Faenza, L., Marzocchi, W. & Montone, P., 2004. Probability map
of the next M ≥ 5.5 earthquakes in Italy, Geochem. Geophys. Geosyst.,
5, Q11003. doi:10.1029/2004GC000724.
CPTI Working Group, 2004. Catalogo Parametrico dei Terremoti Ital-
iani - Version 2004 (CPTI04). INGV, Bologna (I); available on
http://emidius.mi.ingv.it/CPTI/
Cucci, L., D’Addezio, G., Valensise, G. & Burrato, P., 1996. Investigating
seismogenic faults in central and southern Apennines (Italy): modelling
of fault related landscape features, Ann. Geophys., 94, 603–618.
Faenza, L., Marzocchi, W. & Boschi, E., 2003. A non-parametric haz-
ard model to characterize the spatio-temporal occurrence of large earth-
quakes; an application to the Italian catalogue, Geophys. J. Int., 155, 521–
531.
Faenza, L., Marzocchi, W., Lombardi, A.M. & Console, R., 2004. Some in-
sights into the time clustering of large earthquakes in Italy, Ann. Geophys.,
47(5), 1635–1640.
Forster, M.R., 2000. Key concepts in model selection: performance and gen-
eralizability, J. Math. Psychol., 44, 205–231.
Gilks, W.R., Richardson, S. & Spiegelhalter, D.J., eds. 1995. Markov chain
Monte Carlo in Practice, Chapman & Hall, London.
Hawkes, A.G., 1971. Point spectra of some mutually exciting point processes,
J. Roy. Statist. Soc., B33, 438–443.
Hawkes, A.G. & Oakes, D.A., 1974. A cluster process representation of
self-exciting process, J. Appl. Probab., 11, 493–503.
Hoeting, J.A., Madigan, D., Raftery, A.E. & Volinsky, C.T., 1999. Bayesian
model averaging: a tutorial, Statist. Sci., 14(4), 382–417.
Jeffreys, H., 1961. Theory of Probability, Oxford, Clarendon Press.
Kagan, Y.Y., 1991. Likelihood analysis of earthquake catalogue, Geophys. J.
Int., 106, 135–148.
Kagan, Y.Y., 1997. Are earthquakes predictable?, Geophys. J. Int., 131, 505–
525.
Kagan, Y.Y. & Jackson, D.D., 2000. Probabilistic forecasting of earthquakes,
Geophys. J. Int., 143, 438–453.
Kagan, Y.Y., Rong, Y.F. & Jackson, D.D., 2003. Probabilistic forecasting of
seismicity, Chapter 5.2 in Earthquake Science and Seismic Risk Reduc-
tion, pp. 185–200, eds Mulargia, F. & Geller, R.J., Kluwer, Dordrecht.
Kass, R.E. & Raftery, A.E., 1995. Bayes factors, J. Amer. Statist. Ass., 90,
430, 773–795.
Lu, C. & Vere-Jones, D., 2001. Statistical analysis of synthetic earthquake
catalogs generated by models with various levels of faults zone disorder,
J. geophys. Res., 106(B6), 11 115–11 125.
Matsu’ura, R.S., 1986. Precursory quiescence and recovery of aftershock
activities before some large aftershocks, Bull. Earthquake Res. Instit.,
University of Tokyo, 61, 1–65.
Meletti, C., Patacca, E. & Scandone, P., 2000. Construction of a seismotec-
tonic model: the case of Italy, Pure. appl. Geophys., 157, 11–35.
MPS04 Working Group, 2004. Redazione della mappa di pericolosita sis-
mica prevista dall’Ordinanza PCM 20.03.03 n. 3274 - App. 2. INGV,
Milano-Roma (I), pp. 38.
Newton, M.A. & Raftery, A.E., 1994. Approximate Bayesian inference with
the weighted likelihood bootstrap, J.R. Statist. Soc., B, 56(1), 3–48.
Nishenko, S.P., 1991. Circum-Pacific seismic potential: 1989–1999, Pure.
appl. Geophys., 135, 169–259.
Ogata, Y., 1981. On Lewis’ simulation method for point processes, IEEE
Trans. Inform. Theory IT-30, 23–31.
Ogata, Y., 1988. Statistical models for earthquake occurrences and residual
analysis for point processes, J. Amer. Statist. Assoc., 83, 401, 9–27.
Ogata, Y., 1997. Detection of precursory relative quiescence before great
earthquakes through a statistical model, J. geophys. Res., 97, 19 845–
19 871.
Ogata, Y., 1999. Seismicity analysis through point-process modeling: a re-
view, In Seismicity Patterns, Their Statistical Significance and Physical
Meaning (eds. M. Wyss, K. Shimazaki and A. Ito) (Birkhauser, Basel),
Pure. appl. Geophys., 155, 471–507.
Raftery, A.E., Newton, M.A., Satagopan, J.M. & Krivitsky, P.N., 2006. Es-
timating the integrated likelihood via posterior simulation using the har-
monic mean identity, Proc. Valencia/ISBA 8th World Meeting on Bayesian
Statistics, Benidorm, (Alicante, Spain), June 1–6, 2006.
Rotondi, R. & Garavaglia, E., 2002. Statistical analysis of the completeness
of a seismic catalogue, Natural Hazards, 25(3), 245–258.
Rotondi, R. & Varini, E., 2003a. Bayesian analysis of a marked point pro-
cess: application in seismic hazard assessment, Stat. Methods Appl., 12,
79–92.
Rotondi, R. & Varini, E., 2003b. Bayesian analysis of linked stress release
models, Proceedings of the meeting on ‘Complex models and computer-
intensive methods for estimation and prediction’, Treviso (I), 4–6
September 2003, 356–361.
Valensise, G. & Pantosti, D., 2001a. The investigation of potential earthquake
sources in peninsular Italy: a review. J. Seism., 5, 287–306.
Valensise, G. & Pantosti, D. (eds.), 2001b. Database of Potential Sources
for Earthquakes Larger than M 5.5 in Italy. Ann. Geophys., 44, 4, Suppl.,
797–964, with CD-ROM - version DISS 2.0, version DISS3.0 available
on http://www.ingv.it/%7ewwwpaleo/DISS3/
Valensise, G., Pantosti, D., D’Addezio, G., Cinti, F.R. & Cucci, L.,
1993. L’identificazione e la caratterizzazione di faglie sismogenetiche
nell’Appennino centro-meridionale e nell’arco calabro: nuovi risultati e
ipotesi interpretative, Proc. XII Meeting Gruppo Nazionale di Geofisica
della Terra Solida, Rome, 331–342.
Vere-Jones, D., 1978. Earthquake prediction: a statistician’s view, J. Phys.
Earth, 26, 129–146.
Vere-Jones, D. & Yonglu, D., 1988. A point process analysis of histori-
cal earthquakes from North China. Earthquake Res. China, 2(2), 165–
181.
Wang, A.L., Vere-Jones, D. & Zheng, X., 1991. Simulation and estimation
procedures for stress release models, in Stochastic Processes and Their
Applications, Vol. 370, pp. 11–27, eds Beckmann, M.J., Gopalan, M.N. &
Subramanian, R., Springer Lecture Notes in Economics and Mathematical
Systems.
Wasserman, L., 2000. Bayesian model selection and model averaging, J.
Math. Psychology, 44, 92–107.
Wessel, P. & Smith, W.H.F., 1998. New, improved version of Generic Map-
ping Tools released, EOS Trans. Am. Geophys. U., 79 (47), pp. 579.
Zheng, X. & Vere-Jones, D., 1991. Application of stress release models to
historical earthquakes from North China. Pure. appl. Geophys., 135, 4,
559–576.
Zheng, X. & Vere-Jones, D., 1994. Further applications of the stochastic
stress release model to historical earthquake data, Tectonophysics, 229,
101–121.
A P P E N D I X A : M A R K O V C H A I N M O N T E C A R L O M E T H O D S
All relevant information on the unknown parameters θ given the observed data y can be inferred from the posterior distribution p(θ | y) by evaluating integrals of the form

$$J = \int f(\theta)\, p(\theta \mid y)\, d\theta$$

of some function f(θ) with respect to the posterior distribution p(θ | y). For example, point estimates of the unknown parameters are given by the posterior means, i.e. f(θ) = θ, and measures of their uncertainty by the variances, i.e. f(θ) = (θ − E(θ))². When it is not possible to evaluate these integrals analytically, we can resort to simulation if we can construct a Markov chain whose equilibrium distribution is p(θ | y). Under suitable regularity conditions, if θ_1, θ_2, …, θ_t, … is a realization from an appropriate chain, the following results hold:

$$\theta_t \to \theta \sim p(\theta \mid y) \quad \text{in distribution}$$
and

$$\hat{J} = \frac{1}{t}\sum_{i=1}^{t} f(\theta_i) \to E_{\theta \mid y}\{f(\theta)\} \quad \text{almost surely}.$$
The most commonly used algorithms to set up such chains are the Gibbs sampler and the Metropolis–Hastings algorithm. For a complete review of this topic we refer to Gilks et al. (1995). The Metropolis–Hastings algorithm constructs a Markov chain converging to p(θ | y) by defining the transition probability from θ_t = θ to the next realised state θ_{t+1} as follows. Let q(θ, θ′) denote a transition probability function such that, if θ_t = θ, the vector θ′ drawn from q(θ, θ′) is considered as a proposed value for θ_{t+1}. It is accepted with probability

$$\alpha(\theta, \theta') = \min\left\{\frac{p(\theta' \mid y)\, q(\theta', \theta)}{p(\theta \mid y)\, q(\theta, \theta')},\; 1\right\}; \qquad \text{(B1)}$$
otherwise it is rejected and we set θ_{t+1} = θ. In this way the transition probabilities of the Markov chain are given by

$$p(\theta, \theta') = \begin{cases} q(\theta, \theta')\, \alpha(\theta, \theta') & \text{if } \theta' \neq \theta \\ 1 - \sum_{\theta'' \neq \theta} q(\theta, \theta'')\, \alpha(\theta, \theta'') & \text{if } \theta' = \theta. \end{cases}$$

It is easy to check that p(θ | y) p(θ, θ′) = p(θ′ | y) p(θ′, θ), which, together with the irreducibility and aperiodicity of q(θ, θ′) on a suitable state space, is a sufficient condition for p(θ | y) to be the equilibrium distribution of the constructed chain. It is important to observe that it is sufficient to be able to evaluate the distribution p(θ | y) up to proportionality (given by the likelihood multiplied by the prior), as it enters (B1) only through the ratio p(θ′ | y)/p(θ | y).
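As an illustration of the algorithm (not of the samplers actually used in this paper), here is a random-walk Metropolis–Hastings sketch for a one-dimensional unnormalized target:

```python
import numpy as np

rng = np.random.default_rng(2)

def random_walk_mh(log_post, theta0, n_iter, scale=1.0):
    # Random-walk Metropolis-Hastings: with a symmetric Gaussian proposal,
    # q(theta', theta) = q(theta, theta'), so the acceptance probability (B1)
    # reduces to min{ p(theta'|y)/p(theta|y), 1 }, evaluated on the log scale
    # and only up to the normalizing constant.
    theta, lp = theta0, log_post(theta0)
    chain = np.empty(n_iter)
    for t in range(n_iter):
        prop = theta + scale * rng.standard_normal()
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:  # accept the proposal
            theta, lp = prop, lp_prop
        chain[t] = theta                          # on rejection keep theta
    return chain

# target: standard normal density, known only up to proportionality
chain = random_walk_mh(lambda th: -0.5 * th * th, 0.0, 20_000, scale=2.0)
```

Note that `log_post` never needs the normalizing constant, which is precisely the point made above.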
In the Gibbs sampler the transition probabilities are given by the full conditional posterior distributions of the parameters. In general, if we denote by θ = (θ_1, θ_2, …, θ_p) the parameter vector, the Gibbs sampler proceeds iteratively, generating from the conditional posterior distribution

$$\theta_j^{(t+1)} \sim p\left(\theta_j \,\middle|\, \theta_1^{(t+1)}, \ldots, \theta_{j-1}^{(t+1)}, \theta_{j+1}^{(t)}, \ldots, \theta_p^{(t)}, y\right)$$

for j = 1, …, p; we note that in the Gibbs sampler the newly generated value is always accepted.
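A minimal illustration of the Gibbs sampler, on a toy bivariate normal target rather than one of the models of this paper:

```python
import numpy as np

rng = np.random.default_rng(3)

def gibbs_bivariate_normal(r, n_iter):
    # Gibbs sampler for a standard bivariate normal with correlation r:
    # each component is drawn in turn from its full conditional given the
    # current value of the other, and every draw is accepted.
    x = y = 0.0
    s = np.sqrt(1.0 - r * r)        # conditional standard deviation
    out = np.empty((n_iter, 2))
    for t in range(n_iter):
        x = rng.normal(r * y, s)    # p(x | y)
        y = rng.normal(r * x, s)    # p(y | x)
        out[t] = x, y
    return out

sample = gibbs_bivariate_normal(0.8, 20_000)
```

The empirical correlation of the chain recovers the target correlation, confirming convergence to the joint distribution.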