"R" and Paleodemography

        This page contains some computer "stuff" for use in paleodemography, all of it implemented using the statistical package "R."  The main reason for using "R" is that the package is a free download, and the "code" from the page you are looking out now reads more or less like English.  Further, you can easily modify the code to do things the way you want (as opposed to the way I want).  And finally, the documentation for "R" is excellent, and free (again!)(available as *.pdf and *.html).  There's a bit of a learning curve, but when you consider the utility of "R" for doing both statistics and publication quality graphics, it is well worth the adventure.

 

        "R" executables can be downloaded from the following address:

http://temper.stat.cmu.edu/R/CRAN/

In all of the examples from this page you start the "RGui" going, then mark from this page and paste into the "R" window and hit an "enter" ("return").  If you want to edit code do something like:

mystuff<-edit(yourstuff)

to edit an old file and place it in a new name, or save over by using mystuff<-edit(mystuff).  Be sure to save your workspace when you exit "R" if you want to keep what you've done.



Kimura and Chikuni (1987), Bocquet-Appel and Masset (1996), and Hoenig and Heisy (1987)

        This section compares the algorithms from Kimura and Chikuni (1987), Bocquet-Appel and Masset (1996), and Hoenig and Heisy (1987).  You will get functional variants of the algorithms, though there are "bells and whistles" that you may well want to add.  To get started, mark and copy the block below, then paste it into the "RGui" window:

nij1<-matrix(c(42,12,8,2,0,0,0,0,0,0,
            1,13,20,56,24,12,2,0,0,0,
            0,0,8,16,29,38,27,15,7,3,
            0,0,0,0,2,5,11,27,13,4),ncol=10,byrow=T)
nj2<-c(143,71,122,214,161,172,118,133,59,21)

         This will give you a "reference" data set (nij1) and "target" data (nj2).  The reference and target come from Hoenig and Heisy (1987:242), so you can compare your answer.

 

        Now mark the following blocks and paste them into "R."  To run just type KF(), KF2(), BAM(), or HH().  If you want to change defaults you can always do something like BAM(niter=1000), which will run 5 times as long.  As far as "bells and whistles" there are a couple of things one might want to do:

  1. Add a convergence check, which would slow things down a bit.
  2. Add calculation of the log-likelihood, which would slow things down a bit.
  3. Add calculation of standard errors.  This can be done using analytical forms for the information matrix, then invert (chol2inv(chol(I))) and take the square root of the diagonal.  See Kimura and Chikuni and Hoenig and Heisey for analytical forms.
  4. Check that the number of age categories is equal to or less than the number of indicator states.

Note (August 23, 2002) - We're still seeing situations where people have persisted in trying to fit models with more age classes than stages.  To try to prevent this we've added some of the "bells and whistles" above to the script "KF."  The new script is "KF2."  The new script does a Pearson chi-square goodness-of-fit test (returned as "X2," with "df" degrees of freedom, and a probability value of "p").  If you try to run a model with more ages than stages you'll get an error message.


KF<-
function (n1=nij1,n2=nj2,niter=200) 							   # Konigsberg and Frankenberg
{
	nr<-NROW(n1)
	nc<-NCOL(n1) 
	pia<-n1/apply(n1,1,sum) 							   # pia from reference
	N<-sum(n2)
	da<-rep(N*1/nr,nr) 								   # Start from uniform da

	for(iter in 1:niter) {
		pai<-(da%o%rep(1,nc)*pia)/(rep(1,nr)%o%(apply(da%o%rep(1,nc)*pia,2,sum)))  # K&F eqn. 9
		da<-as.vector(t(n2)%*%t(pai)) 						   # K&F eqn. 10 
			     }
	return(da)
}

KF2<-
function (n1=nij1,n2=nj2,tol=1e-5,niter=500)    # Konigsberg and Frankenberg
{
old.lnlk<-1000
deltalnlk<-tol+1
nr<-NROW(n1)		   # number of age classes		
nc<-NCOL(n1) 		   # number of stages

if(nr>nc){cat("YOU HAVE MORE AGE CLASSES THAN STAGES!\n")
	  return()}

pia<-n1/apply(n1,1,sum)    # pia from reference
N<-sum(n2)
da<-rep(N*1/nr,nr)    	   # Start from uniform da

last.iter<-0

for(i in 1:niter){
if(abs(deltalnlk)>tol){
pai<-(da%o%rep(1,nc)*pia)/(rep(1,nr)%o%(apply(da%o%rep(1,nc)*pia,2,sum)))  # K&F eqn. 9
da<-as.vector(t(n2)%*%t(pai))    # K&F eqn. 10
lnlk<-as.vector(log(as.vector((da/sum(da))%*%pia))%*%n2)
deltalnlk<-lnlk-old.lnlk
old.lnlk<-lnlk
last.iter<-last.iter+1} 
     }

e<-da%*%pia
l<-e/sum(e)
X2<-sum(as.vector((n2-e)^2/e))
df<-nc-nr
p<-1-pchisq(X2,df)

Info<-matrix(rep(0,(nr-1)^2),ncol=nr-1)
for(j in 1:nr-1){
	for(jp in 1:nr-1){
		for(i in 1:nc){
	Info[j,jp]<-Info[j,jp]+(pia[j,i]-pia[nr,i])*(pia[jp,i]-pia[nr,i])/l[i]
			      }
			}
		}
Info<-sum(n2)*Info
se<-diag(chol2inv(chol(Info)))
se<-sqrt(c(se,sum(se)))
da<-round(da,2)
return(da,lnlk,deltalnlk,last.iter,X2,df,p,se)
}


 
BAM<-
 
function (n1=nij1,n2=nj2,niter=200)                 # Bocquet-Appel and Masset
{
      nr<-NROW(n1)
      nc<-NCOL(n1)    
      ma<-rep(sum(nj2)/nr,nr)
      fia<-(n1)/((n1%*%rep(1,nc))%*%rep(1,nc))      # fia from reference
      fi<-apply(fia,2,sum)                          # fi from reference
      for (i in 1:niter) {
      fai<-fia/(rep(1,nr)%*%t(fi))                  # B-A&P eqn 1.
      ma2<-fai%*%n2                                 # B-A&P eqn 2.
      fia<-fia*(ma2/ma)%*%rep(1,nc)                 # B-A&P eqn 3.
      fi<-apply(fia,2,sum)                          # B-A&P eqn 4.
      ma<-ma2
      }
      return(as.vector(ma))
}

HH<-
 
function (n1=nij1,n2=nj2,niter=200) 
{	
	nr<-NROW(n1)
	nc<-NCOL(n1)
	nij2<-rep(1,nr)%*%t(n2)*(n1/(rep(1,nr)%*%(rep(1,nr)%*%n1)))

	for(iter in 1:niter) {
	Pji<-(n1+nij2)/((n1%*%rep(1,nc))%*%rep(1,nc)+(nij2%*%rep(1,nc))%*%rep(1,nc))
	ai2<-rep(1,nc)%*%t(nij2)/sum(nij2)
	nj2<-rep(1,nr)%*%nij2
	ni2<-rep(1,nc)%*%t(nij2)
	nij2<-Pji*(t(ni2)%*%nj2)/(rep(1,nr)%*%(ni2%*%Pji))
	iter<-iter+1
	}
	return(apply(nij2,1,sum))
}