Sunday, November 22, 2009

More on the CRU Data Dump...

The blogosphere is chock full of stuff this morning, mainly analyzing the emails and documents (more on that in another post).  So this morning I thought I'd take a gander at some of the code that was included in the hacker's dump.  What follows is the contents of an IDL file named briffa_sep98_e.pro.  This code appears to be used to pre-process raw tree-ring density (MXD) data.  I've highlighted areas of interest in bold red and added comments with footnote numbers; otherwise the file's contents are unmodified:
;
; PLOTS 'ALL' REGION MXD timeseries from age banded and from hugershoff
; standardised datasets.
; Reads Harry's regional timeseries and outputs the 1600-1992 portion
; with missing values set appropriately.  Uses mxd, and just the
; "all band" timeseries
;****** APPLIES A VERY ARTIFICIAL CORRECTION FOR DECLINE*********
;
yrloc=[1400,findgen(19)*5.+1904]  ;note 1
valadj=[0.,0.,0.,0.,0.,-0.1,-0.25,-0.3,0.,-0.1,0.3,0.8,1.2,1.7,2.5,2.6,2.6,$
  2.6,2.6,2.6]*0.75         ; fudge factor
if n_elements(yrloc) ne n_elements(valadj) then message,'Oooops!'
;
loadct,39
def_1color,20,color='red'
plot,[0,1]
multi_plot,nrow=4,layout='large'
if !d.name eq 'X' then begin
  window, ysize=800
  !p.font=-1
endif else begin
  !p.font=0
  device,/helvetica,/bold,font_size=18
endelse
;
; Get regional tree lists and rbar
;
restore,filename='reglists.idlsave'
harryfn=['nwcan','wnam','cecan','nweur','sweur','nsib','csib','tib',$
  'esib','allsites']
;
rawdat=fltarr(4,2000)
for i = nreg-1 , nreg-1 do begin
  fn='mxd.'+harryfn(i)+'.pa.mean.dat'
  print,fn
  openr,1,fn
  readf,1,rawdat
  close,1
  ;
  densadj=reform(rawdat(2:3,*))
  ml=where(densadj eq -99.999,nmiss)
  densadj(ml)=!values.f_nan
  ;
  x=reform(rawdat(0,*))
  kl=where((x ge 1400) and (x le 1992)) ;note 3
  x=x(kl)
  densall=densadj(1,kl)     ; all bands
  densadj=densadj(0,kl)     ; 2-6 bands
  ;
  ; Now normalise w.r.t. 1881-1960
  ;
  mknormal,densadj,x,refperiod=[1881,1960],refmean=refmean,refsd=refsd
  mknormal,densall,x,refperiod=[1881,1960],refmean=refmean,refsd=refsd
;
; APPLY ARTIFICIAL CORRECTION
;
yearlyadj=interpol(valadj,yrloc,x) ;note 2
densall=densall+yearlyadj
  ;
  ; Now plot them
  ;
  filter_cru,20,tsin=densall,tslow=tslow,/nan
  cpl_barts,x,densall,title='Age-banded MXD from all sites',$
    xrange=[1399.5,1994.5],xtitle='Year',/xstyle,$
    zeroline=tslow,yrange=[-7,3]
  oplot,x,tslow,thick=3
  oplot,!x.crange,[0.,0.],linestyle=1
  ;
endfor
;
; Restore the Hugershoff NHD1 (see Nature paper 2)
;
xband=x
restore,filename='../tree5/densadj_MEAN.idlsave'
; gets: x,densadj,n,neff
;
; Extract the post 1600 part
;
kl=where(x ge 1400)
x=x(kl)
densadj=densadj(kl)
;
; APPLY ARTIFICIAL CORRECTION
;
yearlyadj=interpol(valadj,yrloc,x) ;note 2
densadj=densadj+yearlyadj
;
; Now plot it too
;
filter_cru,20,tsin=densadj,tslow=tshug,/nan
cpl_barts,x,densadj,title='Hugershoff-standardised MXD from all sites',$
  xrange=[1399.5,1994.5],xtitle='Year',/xstyle,$
  zeroline=tshug,yrange=[-7,3],bar_color=20
oplot,x,tshug,thick=3,color=20
oplot,!x.crange,[0.,0.],linestyle=1
;
; Now overplot their bidecadal components
;
plot,xband,tslow,$
  xrange=[1399.5,1994.5],xtitle='Year',/xstyle,$
  yrange=[-6,2],thick=3,title='Low-pass (20-yr) filtered comparison'
oplot,x,tshug,thick=3,color=20
oplot,!x.crange,[0.,0.],linestyle=1
;
; Now overplot their 50-yr components
;
filter_cru,50,tsin=densadj,tslow=tshug,/nan
filter_cru,50,tsin=densall,tslow=tslow,/nan
plot,xband,tslow,$
  xrange=[1399.5,1994.5],xtitle='Year',/xstyle,$
  yrange=[-6,2],thick=3,title='Low-pass (50-yr) filtered comparison'
oplot,x,tshug,thick=3,color=20
oplot,!x.crange,[0.,0.],linestyle=1
;
; Now compute the full, high and low pass correlations between the two
; series
;
perst=1400.
peren=1992.
;
openw,1,'corr_age2hug.out'
thalf=[10.,30.,50.,100.]
ntry=n_elements(thalf)
printf,1,'Correlations between timeseries'
printf,1,'Age-banded vs. Hugershoff-standardised'
printf,1,'     Region    Full   <10   >10   >30   >50  >100'
;
kla=where((xband ge perst) and (xband le peren))
klh=where((x ge perst) and (x le peren))
ts1=densadj(klh)
ts2=densall(kla)
;
r1=correlate(ts1,ts2)
rall=fltarr(ntry)
for i = 0 , ntry-1 do begin
  filter_cru,thalf(i),tsin=ts1,tslow=tslow1,tshigh=tshi1,/nan
  filter_cru,thalf(i),tsin=ts2,tslow=tslow2,tshigh=tshi2,/nan
  if i eq 0 then r2=correlate(tshi1,tshi2)
  rall(i)=correlate(tslow1,tslow2)
endfor
;
printf,1,'ALL SITES',r1,r2,rall,$
  format='(A11,2X,6F6.2)'
;
printf,1,' '
printf,1,'Correlations carried out over the period ',perst,peren
;
close,1
;
end
Well.  Where to start?  I guess I'll just dive right in...

1.  This code constructs a pair of 20-element arrays that are used later as inputs to an interpolation routine.  As the comments "fudge factor" and "very artificial correction for decline" would seem to indicate, these values appear to be completely arbitrary, unsupported by any underlying theory about why the raw data should be manipulated this way.  I'll tell you what it looks like to this ancient, gray-bearded software engineer who has constructed many software models in his career: it looks like someone manipulating the input to a model to get the desired results.  Publicly such a modeler would proclaim: “Lookee here!  I crunched the tree-ring data and got this result!” – when the reality is that the modeler “fudged” and “corrected” the data until the results matched his preconceived and desired result.  Smoking code gun number one.
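To make those two arrays concrete, here is a minimal Python/NumPy sketch of what the IDL lines build (Python stands in for IDL here, and the use of numpy is my choice, not anything in the dump):

```python
import numpy as np

# IDL: yrloc=[1400,findgen(19)*5.+1904] -- 1400, then 1904..1994 in 5-year steps
yrloc = np.concatenate(([1400.0], np.arange(19) * 5.0 + 1904.0))

# IDL: valadj=[...]*0.75 -- the hand-entered "fudge factor", scaled by 0.75
valadj = 0.75 * np.array([0., 0., 0., 0., 0., -0.1, -0.25, -0.3, 0., -0.1,
                          0.3, 0.8, 1.2, 1.7, 2.5, 2.6, 2.6, 2.6, 2.6, 2.6])

# IDL: if n_elements(yrloc) ne n_elements(valadj) then message,'Oooops!'
assert len(yrloc) == len(valadj)

# The adjustment is zero or slightly negative through the early 20th century,
# then climbs to +1.95 units from the 1970s onward.
print(dict(zip(yrloc[-4:], valadj[-4:])))
```

Note that the scaled correction peaks at 2.6 × 0.75 = 1.95 standard-deviation units, a very large shift for a normalised series.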

2.  These are the two places where the “very artificial correction” is actually applied: the interpolated adjustment is simply added to the density series, once for the age-banded series and again for the Hugershoff-standardised one.
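For illustration, here is a hedged NumPy sketch of that step; np.interp plays the role of IDL's interpol (linear interpolation), and the density series itself is omitted since we don't have the data files:

```python
import numpy as np

# Same two arrays as in note 1
yrloc = np.concatenate(([1400.0], np.arange(19) * 5.0 + 1904.0))
valadj = 0.75 * np.array([0., 0., 0., 0., 0., -0.1, -0.25, -0.3, 0., -0.1,
                          0.3, 0.8, 1.2, 1.7, 2.5, 2.6, 2.6, 2.6, 2.6, 2.6])

x = np.arange(1400.0, 1993.0)  # the 1400-1992 span the code keeps

# IDL: yearlyadj=interpol(valadj,yrloc,x) -- one adjustment value per year
yearlyadj = np.interp(x, yrloc, valadj)

# IDL: densall=densall+yearlyadj -- the correction is a straight addition,
# flat before 1904 and roughly +1.95 by 1990
print(yearlyadj[x == 1900][0], yearlyadj[x == 1990][0])
```

The effect is to leave the early centuries alone, depress mid-century values slightly, and lift the late 20th century substantially.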

3.  This code limits the data in the results to the years 1400 to 1992.  It would be interesting to know why those particular years were chosen.  I'm especially intrigued by the 1992 cutoff, as I've read elsewhere that late 20th-century tree-ring density data diverges from the instrumental temperature record, undercutting the notion that tree-ring density is a good proxy for temperature.  Could this be the reason for the early cutoff?
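For reference, the year filter in note 3 amounts to the following (a NumPy sketch; the year axis here is hypothetical, standing in for the first column of the raw data file we don't have):

```python
import numpy as np

x = np.arange(1400.0, 2001.0)  # hypothetical year axis from the raw file

# IDL: kl=where((x ge 1400) and (x le 1992))
kl = np.where((x >= 1400) & (x <= 1992))[0]
x = x[kl]

print(x.min(), x.max())  # 1400.0 1992.0 -- anything after 1992 is discarded
```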

1 comment:

  1. Question: where are the data files this code processes?  The first one loaded is mxd.nwcan.pa.mean.dat, etc., but these files are apparently not in the FOI2009 archive.  (They could be there under another name.)
