Redating the Radiocarbon Dating
of the Dead Sea Scrolls

By Joseph Atwill and Steve Braunheim

Introduction

The first request for the application of up-to-date AMS carbon dating to Qumran documents was made by Professors Robert Eisenman of California State University, Long Beach, and Philip Davies of the University of Sheffield, England, in a letter to Amir Drori, then Head of the Israeli Antiquities Authority, on May 2, 1989. The letter stemmed from their frustration at the denial of an earlier request of March 16, 1989, addressed to John Strugnell and copied to Mr. Drori, to gain access to the Qumran parallels to the famous Damascus Document (CD), and at the general situation in which access to unpublished Qumran materials was denied to scholars who were not part of the “International Team” or favored for some reason by it.

In their letter describing and protesting this situation, Eisenman and Davies suggested that if Mr. Drori could not force the International Team to open access to the unpublished Scrolls, he could at least apply the recently developed methods of AMS carbon testing to them. The Scrolls had early on been dated by older carbon testing techniques that consumed too much Scroll material to be applied in any general fashion. The new AMS C14 techniques did not consume so much material and could therefore be used to test the claims of paleographic analysis that members of “the International Team” were at the time citing as gospel regarding the chronology of the Scrolls.

In their letter, however, aside from sending an attachment detailing these new methods, they cited two caveats. One was that the new methods of dating materials should be applied to determine relative, not absolute, chronology, that is, earlier versus later in the same test run -- absolute chronology in their view being virtually impossible to determine because of the multiple imprecisions to which C14 testing was subject.

To put this another way, they framed their request in this manner because they did not believe that anything conclusive regarding the absolute dating of the Scrolls could be achieved with a technique as subject to multiple imprecisions as carbon testing was. As a second caveat, they insisted that ‘opposition scholars’ be included in the process because they were the ones most likely to understand which were the key documents that should be tested and “they were the ones who felt the most need for it.” In the event, neither of these caveats was respected.

Four months later, in September of 1989, a spokesman for the Antiquities Authority announced that a run of carbon testing of samples taken from the Scrolls was to be undertaken. Not only did he not state where the idea to do such testing had originally come from, but he put forth no statistical or historical methodology for determining which Scrolls should be tested and by whom. Thus, while John Strugnell, then chief editor of the Scrolls Project, and Israeli scholars Magen Broshi, then Head of the Shrine of the Book, and ultimately Emanuel Tov, who succeeded Strugnell, were among those named to oversee or be included in the process, no ‘opposition scholar’ was included or mentioned -- not even in an advisory capacity, though they were the ones who had originally called for the tests and presumably felt the most need for them.

The C14 tests that were done were conducted in two separate runs, one in 1989-91 by laboratories in Oxford and Zurich and a second in 1994 at the University of Arizona (though general gossip has it that some earlier, seemingly inconclusive tests were undertaken at the Weizmann Institute of Science in Israel). Not incuriously, these were the same laboratories that had previously been selected for the C14 testing of the Holy Shroud of Turin.

However these things may be, following the tests, the group controlling the process was governed by the belief that the C14 results -- which were on the whole inconclusive or, to use the words of BAR’s reportage, “skewed” -- in some manner confirmed the accuracy of the results arrived at by those basing their chronological determinations on paleography. This was clear not only from the two articles drafted after the two runs, but also from the press releases and interviews accompanying the announcements of the results, in which the personal bitterness that has characterized the debate from the beginning was so evident.

In particular, these attacks focused upon Eisenman, possibly because, though he was the scholar who had initially called for the tests, they did not wish to follow his caveats, or possibly because it had since become clear that he was one of the prime movers in the campaign to gain access to and free the Scrolls. Not only did the framers of these articles directly attack his theories, but Magen Broshi took this attack to a new level of personalized invective in the press releases and news stories accompanying these reports, calling Eisenman “ignorant,” “vain,” even worse, and describing his theories as “cranky.” In this drumbeat of attacks on his person and theories, Professor Lawrence Schiffman of N.Y.U. (who, while not part of the team appointed to test the Scrolls, was generally representative of network theorizing regarding them) was quoted as referring to Eisenman’s position that there was a connection between the Scrolls and the movement we call “Christianity” as “a wholesale theft from the Jewish people.”

Despite this lack of scholarly collegiality and respect for opposing and dissenting theorizing, these reports and the press releases accompanying the announcement of the results came close to being taken as “official.” So influential were they that in academic papers around the world these two runs of AMS C14 testing were looked at as conclusively demonstrating when the Scrolls had been written -- and this was generally taken to be before 40 BC, even though no such results were warranted. Indeed, so widespread was their effect that many scholars began to argue that no “sectarian” documents were put in the caves later than approximately 50 BC -- “sectarian texts” being generally considered to be the most important documents and, in particular, those documents representing the unique ideas of the sect or movement itself.

Since, with the publication of a large body of previously unpublished texts, the “Messianic” character of the sect or movement -- as it may be called -- was being more clearly acknowledged, and since characters within the extant literature, such as the widely renowned “Righteous Teacher” or the author of the Qumran Hymns (1QH), had distinct characteristics paralleling the New Testament’s “Saviour” or “Christ,” other scholars like Alvar Ellegard and Michael Wise took the ‘results’ as indicating that they had to look for “a Messiah earlier than Christ” -- as the quest then started to be called.

Review

The authors have now undertaken an independent review of the results of the two rounds of carbon testing, in particular the second, for which the raw data underlying the analyses that culminated in these results is more fully available, and we have determined that:

1) In both the 1989-91 and 1994-95 AMS C14 dating runs an inaccurate dating curve was utilized or, more precisely, a dating curve that has since come to be considered inexact. This inaccurate dating curve for the 200 BC-200 CE period made the absolute dating indications for some samples appear older than they actually were -- perhaps by some fifty years or more. Surprisingly, even though a majority of Qumran specialists worldwide have now been relying uncritically upon the interpretation of these results, no retractions or press releases have come forth from the group that issued the original reports based on this erroneous model.

2) The methods used in interpreting the meaning of the AMS carbon testing were also inaccurate from a purely statistical point of view.

3) The results did not rule out the various opposition theories of the kind put forth by scholars like Robert Eisenman, Norman Golb, Cecil Roth, G. R. Driver, Joel Teicher, Barbara Thiering, and John Allegro, but actually supported such theories in that they carried the dates of many of the “sectarian” or “extra-biblical” scrolls well into the first century CE, contemporaneous with movements such as those called “Zealot” or “Sicarii” and the rise of early or at least proto-Christianity in Palestine.

Despite the heavy public relations blitz claiming the opposite, the theories of these “opposition scholars” were in fact in better alignment with the actual results of the tests than those of establishment scholars such as Roland de Vaux, John Strugnell, Josef Milik, F. M. Cross, Geza Vermes, Lawrence Schiffman, Emanuel Tov, James VanderKam, Emile Puech, F. Garcia-Martinez, and others. In our judgment the group that drew the conclusions given in the several press releases above was simply biased ab initio and was confirming its own theories with its interpretations of the results.

4) Finally -- and this is a general statement -- carbon testing (and, as a result, to some extent the findings of paleography) is too imprecise a tool to provide conclusive evidence for a time span as short as the one at issue in the debate concerning when the sectarian Scrolls were written.

Analysis

Let us go into these conclusions separately and more fully. In the first place, radiocarbon dating is only able to give approximate dates, and its results are therefore given as a mean and standard deviations -- known as sigmas -- that represent the statistical ‘range’ in which the true date may fall. The first sigma is the time span that radiocarbon dating theory posits would contain the actual date 68% of the time; the second sigma is a wider time span that would theoretically include the date about 95% of the time.
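The percentages attached to one and two sigma follow from the normal distribution that the quoted standard deviations assume. As a quick check (a minimal sketch; the function name `coverage` is ours, and a Gaussian error model is assumed), the fraction of a normal distribution lying within k standard deviations of its mean is erf(k/sqrt(2)):

```python
import math

def coverage(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))

one_sigma = coverage(1.0)   # about 0.683
two_sigma = coverage(2.0)   # about 0.954

print(f"1 sigma: {one_sigma:.1%}, 2 sigma: {two_sigma:.1%}")
```

Once a measurement has been passed through the calibration curve, the resulting calendar ranges are generally neither symmetric nor Gaussian, so these percentages describe the nominal coverage only.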

When it comes to analyzing the results of the carbon testing of Qumran documents, it should be observed that these time spans or “sigmas” are not narrow. Where the first sigma is concerned, the time span can range to over a hundred years. When the second sigma is taken into consideration, this time span can extend to well over two hundred years. Right from the start, this is well beyond the margin of error required to date individual Scrolls with the accuracy necessary to affect the present chronological debate or to arrive, for instance, at absolute dates.

The groups that oversaw the two recent rounds of AMS C14 used the inexact pre-1998 dating curve in calculating these sigmas. In addition, they presented the time range arrived at by prior paleographic analysis -- analysis that was begun primarily by Frank Moore Cross and Josef Milik, but was now being carried forward by their inheritors, like Emanuel Tov, Geza Vermes, James VanderKam, Lawrence Schiffman, Emile Puech, and others. The relationship between these two sets of time spans was, presumably, the basis for the assertion by many of these persons that the carbon dates ‘confirmed’ the reliability of paleography where the scrolls were concerned.

The chart below presents this data together with the distance in years between the mean of the carbon dating age and the median of the paleographic age, which we have labeled “Amount of Error.” We are using here mainly the results arrived at in the second, 1994 run of carbon dating, for which the raw data is more complete.

Sample     Carbon age        Paleographic age    Amount of Error
4Q266      45 BC-AD 120      100-50 BC           +117 years
(The last column of the Damascus Document -- not paralleled in CD from the Genizah)
1QpHab     88-2 BC           30-1 BC             -30 years
(The Habakkuk Pesher from Cave 1, which mentions the Righteous Teacher, the Liar or Spouter of Lying, and the Wicked Priest)
1QS        206 BC-AD 111     100-75 BC           +18 years
(The Community Rule from Cave 1)
4Q258      95 BC-AD 122      100 BC              +113 years
(Material from the Community Rule from Cave 4)
4Q521      93 BC-AD 80       100-80 BC           +83 years
(A Messianic text known as “The Messiah of Heaven and Earth” or “A Messianic Apocalypse”)
4Q267      94-45 BC          50-0 BC             -94 years
(Another, supposedly “early” copy of the Damascus Document, supposedly including its opening column)
4Q208      186-92 BC         200 BC              +61 years
(The Astronomical Enoch -- this in theory should be an early document)
4Q22       207 BC-AD 63      100-25 BC           -8 years
(An Exodus manuscript in what is known as “paleo-Hebrew” script -- this in theory should also be early, if it is not simply a late copy)
4Q2        120 BC-AD 63      50 BC-AD 50         -23 years
(A biblical patch from Genesis found in Cave 4)
1QIsa      356-103 BC        150-125 BC          -97 years
(One of the Isaiah scrolls from Cave 1)

The 1995 report also included the radiocarbon date range for 4Q171. This is the Psalm 37 Pesher, a document that likewise mentions not only the Liar, but the Righteous Teacher and the Wicked Priest as well. The report did not, however, include its paleographic range, though it had previously been determined by such analysis to be of the same “Herodian semiformal script” as 1QpHab, which was dated to 30-1 BC. Had its paleographic age been included, its “Amount of Error” would have been close to the average of those given above, which is 66 years.

Sample          Calibrated age    Paleographic age
DSS-7 4Q171     AD 5-111          none given

What is the weakness of these test results, even as they were originally announced as per the above table? As already alluded to, using the paleographers’ own data, the average amount of error between the radiocarbon and the paleographic medians for the samples is virtually the same length of time as the parameters of the debate over when the Scrolls were written. Generally speaking, a measuring instrument that has an average degree of error as large as the phenomenon it is designed to measure is largely without use. It follows, therefore, that the results of the radiocarbon tests cannot be seen as “confirmation” of the accuracy of paleographic analysis with respect to the debate over when the Scrolls were written.
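The average discrepancy can be recomputed directly from the “Amount of Error” column of the chart. The sketch below uses the ten values as printed; the variable names are ours, and it assumes the average is taken over absolute values, which the text does not state. On that assumption the printed values average to roughly 64 years, in the same range as the 66 years quoted above; the source does not say how the quoted figure was derived or rounded.

```python
# "Amount of Error" values in years, copied as printed in the chart.
# Positive: carbon date later than the paleographic date; negative: earlier.
errors = {
    "4Q266": 117, "1QpHab": -30, "1QS": 18, "4Q258": 113, "4Q521": 83,
    "4Q267": -94, "4Q208": 61, "4Q22": -8, "4Q2": -23, "1QIsa": -97,
}

# Assumption: the quoted average ignores the sign of each discrepancy.
mean_abs_error = sum(abs(e) for e in errors.values()) / len(errors)

# The signed mean shows how far the discrepancies cancel one another.
mean_signed_error = sum(errors.values()) / len(errors)

print(f"mean absolute error: {mean_abs_error:.1f} years")
print(f"mean signed error:   {mean_signed_error:.1f} years")
```

The gap between the absolute and the signed averages indicates that the discrepancies run in both directions rather than reflecting a single consistent offset.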

On the contrary, if the radiocarbon dates are accurate, they indicate that the technique does not have the sensitivity to distinguish between the time frames at issue. In the paper presenting the results of the first run of carbon testing in 1989-91, published in Radiocarbon, the authors, Bonani, Broshi, Strugnell, Wölfli, et al., stated that “the radiocarbon dates are in good agreement with the estimates based on paleography” and that “the results confirm the reliability of paleography”, but this is misleading, and those making such assertions were simply using a different definition of “reliability” than the one required for the technique to be useful in this debate. We shall say more about this later.

One scroll in particular, the Habakkuk Pesher (1QpHab), is often cited as proving that the sectarian texts were written before 40 BC. It was dated in the initial run of carbon testing to a first sigma of 104-43 BC, though according to the newer calibration this first sigma should rather read 88-2 BC. As noted above, this Scroll mentions the three central figures of Scroll polemics, the Righteous Teacher, the Spouter of Lying, and the Wicked Priest. Scholars like Geza Vermes take the last two of these as equivalent, but this is nowhere proven in the extant texts. Vermes, too, summarized the general position of consensus scholars concerning the Habakkuk Pesher quite well: “If the carbon dating (of 1QpHab) establishes a terminus ad quem prior to 30 CE, this will damage almost beyond repair the hypothesis proposing a Christian connection”, i.e., he means here the connection between the principal sectarian scrolls and Christianity, whatever he purports to mean by “Christian”.

This statement is inaccurate. In the first place, not only did the 1998 recalibration, as we saw, substantially move the Scroll’s first sigma forward in time, but Vermes also neglects to mention that Habakkuk is paleographically equivalent to the Psalm 37 Pesher (4Q171 -- it is also parallel in terms of content). But the first sigma of the Psalm 37 Pesher, even according to the 1994-95 calibrations, was 22-81 CE -- 22-78 CE using the 1998 calibration curve.

In our view, not only are these last-named dates chronologically in sync with theories like Eisenman’s, Roth-Driver’s, or others, they represent the more likely date of composition of Habakkuk as well, given the numerous cleansings and undoubted impurities that seeped into the process to skew the results of a document as worried over as Habakkuk, until recently on display in the Shrine of the Book.

Disparities such as these illustrate the vagaries of applying C14 results to confirm paleographic attempts to date a document’s terminus ad quem absolutely. In this instance, for two documents as similar in paleographic typology and in content as 1QpHab and 4Q171, one ends up with a C14 first-sigma dating range of between 88 BC and 81 CE. When one takes the second sigmas into account, these results diverge by yet another 100 years. This creates a potential range, in radiocarbon terms, of almost four hundred years for two documents that according to the estimates of paleography were written at approximately the same time!

None of these constraints were even signaled by those who hurried to proclaim the results of paleography proven by the recent run of C14 dating and, in particular, that the sectarian Scrolls were all written before 50 BC. As already noted too, many such persons derived a second result from all this -- that the Scrolls were put into the caves before 40 BC as well. Unfortunately for these assumptions, a fragment recently identified by Hanan Eshel as being from Cave 4 gives dated evidence of a contract carrying the name of a High Priest and a date of approximately 46-47 CE, thus giving vivid internal evidence that negates any idea that the documents were deposited in this cave prior to this time. This is an instructive example of what can be meant by relying on the internal evidence -- what the documents themselves state -- rather than the external in debates of this kind.

Of the Scrolls that have been radiocarbon dated, only nine can be seen as in any way relevant to the question of whether the sect was active during the first century CE or not. These are: 11QT, 1QH, 4Q266, 1QpHab, 1QS, 4Q258, 4Q171, 4Q521 and 4Q267.

The following table gives the carbon dating one-sigma time range for the death of the animal whose skin was eventually used to produce the Scroll in question. It should be remarked that these dates are termini a quo, since they only measure when the animal whose skin was eventually used died, not when the Scroll was actually written. The table gives two calibrated ranges, one based on the 1986 calibration curve then in effect and the other on the 1998 curve we have already referred to above. As noted, this new calibration curve produced a significant change in the range of documents like 1QpHab and 4Q267 (paleographically speaking, considered the earliest fragment relating to the Damascus Document).

Scroll                   1998 Calibration    1986 Calibration
11QT (Temple Scroll)     53 BCE-21 CE        97 BCE-1 CE
1QH                      37 BCE-68 CE        21 BCE-61 CE
4Q266                    4-82 CE             5-80 CE
1QpHab                   88-2 BCE            104-43 BCE
1QS                      116 BCE-50 CE       159 BCE-20 CE
4Q258                    36 BCE-81 CE        11 BCE-78 CE
4Q171                    29-81 CE            22-78 CE
4Q521                    39 BCE-66 CE        35 BCE-59 CE
4Q267                    168-51 BCE          172-98 BCE

As explained, these one-sigma distributions represent a 68% probability that the actual date lies within the specified range. For claims based on results that may include measurement or other types of error, however, it is more instructive to use the 95% (two-sigma) confidence intervals. The plot given below illustrates with a rectangle the one-sigma distributions for the more recent calibration (given above), along with the wider 95% confidence interval of the calibrated data, represented by the line extending from the rectangle.

Where 1QpHab and 4Q267 were concerned, the 1998 recalibration was particularly significant, as it brought both of those Scrolls’ two-sigma ranges well into the first century CE. In every case, the 95% (two-sigma) confidence interval, including not only that of the Damascus Document but also those of the Habakkuk and Psalm 37 Peshers, encompasses a date when even normative Christians believe the figure known as “Jesus Christ” was alive. This underscores the point that even the AMS C14 test results from the two previous runs cannot be used to decouple a theoretical relationship between the Qumran Community and Christian origins in Palestine, which was the thrust of the general presentation of both the 1991 and 1995 papers. As can be seen, the raw data provided by the 1998 recalibration in and of itself provides support for the premise that the community producing the literature at Qumran was active in the first century CE.

There is, however, reason to believe that the reported standard deviations in the C14 measurements of the Scrolls do not represent the true variation within these measurements. This is because only a single sample from each scroll was used in a majority of the work. This includes both 1QpHab and 4QpPs37. As argued by N. L. Caldararo, when only a single sample is used, any variation that would exist between different samples from the same host is lost, and the imprecision of the measurement technique becomes the predominant contributor to the reported variance.

It has been shown that repeated measurements on different samples from the same host are required in assessing the true sample variance. As demonstrated by R.E.M. Hedges on well-controlled samples, the sample-to-sample variation was found to be a substantial portion of the overall variance in multi-sample tests. In the case he cites, the best overall standard deviation achieved was +/- 45 years, although it can be significantly larger. This was for a measurement precision originally established as +/- 20 years.

In other words, there can be great differences between samples taken from the same host.  These must be included when calculating the range of the sigma.  If only one sample is taken, this variance is not accounted for and the resulting sigma is less accurate than one obtained from multiple samples.

This brings us back to our fourth overall point. The uncertainty surrounding C14 dating generally comprises several variance components. These include the precision of the test on a single sample, variation from sample to sample from a given source, and a variety of other unknowns, such as possible calibration error and the uncertainty, also alluded to above, regarding the period of time between the death of the animal whose skin was used for the parchment being tested and the time when that parchment was written upon.

In general, the different contributions to uncertainty add up according to the equation:

 s_{total}^2 = s_{precision}^2 + s_{reproducibility}^2 + s_{other}^2

Therefore, if sample-to-sample variations and other unknowns are left out of the analysis of C14 dates -- as they were in both the 1991 and 1995 reports and in the reported comments to the press by scholars such as Tov, Broshi, Schiffman, and Vermes -- the conclusions are rendered inaccurate in proportion to the degree described by the above equation.
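The effect of omitting a variance component can be illustrated with the figures cited above from Hedges: a measurement precision of +/- 20 years against a best overall standard deviation of +/- 45 years. The sketch below is illustrative only; the function names are ours, and it assumes that the components are independent and that everything beyond measurement precision can be lumped into a single sample-to-sample term.

```python
import math

def combined_sigma(*components: float) -> float:
    """Combine independent standard deviations in quadrature."""
    return math.sqrt(sum(c * c for c in components))

def implied_component(total: float, known: float) -> float:
    """Standard deviation of the component left over once `known` is removed from `total`."""
    return math.sqrt(total * total - known * known)

precision = 20.0   # years: measurement precision cited from Hedges
overall = 45.0     # years: best overall standard deviation cited from Hedges

# Component not captured by single-sample precision (about 40 years).
sample_to_sample = implied_component(overall, precision)

print(f"implied sample-to-sample sigma: {sample_to_sample:.1f} years")
print(f"recombined total: {combined_sigma(precision, sample_to_sample):.1f} years")
```

On these figures the unreported component is about twice the size of the reported precision, which is why a sigma based on a single sample understates the overall uncertainty.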

For example, suppose multiple samples from the appropriate scrolls had been run and the reported variance had increased only by a factor of two (the fact that the Scrolls have been contaminated by various cleaning solutions and that samples were taken largely only from frayed edges makes it reasonable to assume that the actual overall variance would be even larger than this); then the calibrated standard deviations and confidence intervals given above would need to be recalculated based on the corrected standard deviation of the new measurements. The graph below gives a picture of what the theoretical results emerging from such a process might look like.
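Because the standard deviation is the square root of the variance, doubling the variance widens the standard deviation by a factor of the square root of two, not by two. As an illustration only (the 30-year starting value is an assumed round number, not a measurement taken from the reports):

```latex
s'^2 = 2\,s^2 \quad\Longrightarrow\quad s' = \sqrt{2}\,s \approx 1.41\,s,
\qquad \text{e.g. } s = 30 \text{ years} \;\Rightarrow\; s' \approx 42 \text{ years}.
```

The calibrated calendar ranges need not scale by exactly this factor, since they also depend on the shape of the calibration curve over the interval in question.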

Here even the one-sigma ranges for 1QpHab and 4Q267 move into the first century.

Work done by M. Baillie suggests that the complex structure of the C14 calibration curve creates more error in the calibrated ranges than suggested even in the aforementioned calibrations. As a general rule of thumb, Baillie states that the calendar age range is typically 100 years for high-precision dates (+/- 20 years). This agrees more closely with the multi-sample correction provided above.

In a recent issue of Science it was pointed out, for example, how a “recalibration of Carbon 14 dates…indicates the Uruk period lasted a minimum of 700-800 years.” The latter period had appeared formerly to have been “relatively short-lived”. Two key words emerge here: “recalibration” and “appear”. Most scientists in the physical fields think of “calibration” as something hard, that is, units of measurement that can be traced back to the National Bureau of Standards or its equivalent, with accuracies expressed in fractions of, say, angstroms for linear units and equally minuscule units for measures of weight, volume, etc. In these cases calibration is a ‘real’ process involving a real statistical concept of accuracy.

In sciences like the historical field, calibration takes on another meaning, having to do with placing an event in context or attempting to create a chronology that makes sense given all the facts. Radiocarbon dating fits into the second class, not the first, and as such typifies the difference in what can be meant by words like ‘accuracy’ and ‘precision’. ‘Precision’ is a statistical concept that allows one to make inferences between two measurements, one, say, 5.987 units and another, say, 6.012 units. Given the ‘precision’ of his work in such a context, an analyst may be able to accept or reject a hypothesis with some specified degree of probability and confidence.

On the other hand, ‘accuracy’ is the statistical concept that enables an analyst to tell the difference between, say, 1 unit and 10 units. Carbon 14 testing falls within the latter category. If the analyst is not careful, each successive wave of recalibration can stand history on its head, as for instance in the above example from Science. Carbon 14 testing can certainly tell one from which epoch a given fragment of carbon-rich scrap may have come, but not much more. Claims like the ones that are sometimes heard in the field of Dead Sea Scrolls studies, that for instance Carbon 14 testing may have a ‘precision’ of 30 years, belie the fact that it may not have an ‘accuracy’ of even 200 years!

Numerous analysts have attempted to calibrate a proper Carbon 14 dating curve for the period 200 BCE to 200 CE. Some have suggested using a “spline” (a drafting instrument similar to a ship’s or French curve) to fit the disparate data. The more analytical suggest using fourth-order polynomials, each propounding the merits of his or her own approach and the lack thereof in someone else’s.

Regardless of which method is used, however, a glance at their curves and the ‘fit’ to the data they achieve quickly reveals that none is particularly accurate and certainly none sufficiently precise to support the conclusions drawn by those disparaging works such as those by Eisenman or others of an ‘opposition’ mindset. In the timeframe represented by the documents from Qumran, it is simply ‘dealer’s choice’, that is to say, pick the one that supports your own arguments and toss the others aside. Surely there has to be a better way of making such determinations.

Our purpose here has not been to support any given theory, but simply to demonstrate that the radiocarbon testing that was done vis-à-vis the Dead Sea Scrolls did not rule out any given theory, and that tendentious claims to the contrary, based upon amateur evaluations of inaccurate data, are simply that -- tendentious claims to the contrary. In particular, the claim by one well-known Qumran scholar that a single such C14 result would “damage almost beyond repair the hypothesis proposing a Christian connection” is not only inaccurate in fact; it is wrong in principle. This is because, when dealing with an array of items expressed in units of probability, the results of the entire sample must be considered. A statistical outlier -- that is, a single result that is outside a pattern determined by multiple other results -- is always possible, and no single data point can ever produce information that has greater meaning than that provided by the array that contains it.

Therefore, the language that specialists such as Professor Vermes might choose to employ notwithstanding, it is self-evident that the complete array of C14 dates shown above does not in the slightest preclude the possibility that the Qumran Sect had a “Christian connection”. Two samples were included in the collection that were taken from scrolls known to have been produced at the time of Simon Bar Kochba’s revolt against Rome. They were used as a control to determine the accuracy of the radiocarbon dating. Both of these samples produced errors of over one hundred years.

Sample                   Calibrated age    Paleographic age    Amount of Error
DSS-52 Kefar Bebayou     AD 144-370        AD 135 (dated)      +122 years
DSS-53 5/6 Hev 21        AD 86-314         AD 128 (dated)      +100 years

It should be clear that these results too make it questionable how useful C14 dating can be in attempting to fix absolute dates within a time span as narrow as the one at issue where the Dead Sea Scrolls are concerned.

In closing, it should be observed that C14 dating of a range of objects can only be meaningful if the samples that are tested are selected in an objective and scientific manner.  This was obviously not the case for the samples used to produce the results reported in 1995 in Radiocarbon.  As the authors stated in a press release announcing these results, the method used to select the Scroll samples was as follows:  “Most of the (samples) had been suggested to us by colleagues who had special interests in C14 analysis of particular texts.”  It is unacceptable that the Scroll samples were selected on the basis of the special interests of colleagues and not on a methodologically sound basis open to the general community of scholars.  Such an approach is not only unscientific, but inevitably leads to speculation about the interests involved.

However this may be, the C14 test results did not demonstrate the reliability of paleography. On the contrary, when taken as a whole the C14 dates showed that neither paleography nor C14 dating is a sufficiently precise tool to contribute conclusively to the debate over the accurate dating of the Scrolls. Moreover, C14 dates generally support and do not preclude the premise that some of the Scrolls were produced well into the First Century CE.

The End