Multiversion Concurrency Control Before InterBase

I’ve been doing some reading in the area of concurrency theory, and, interestingly, have found some citations on the use of multiple record versions for concurrency control and recovery which predate InterBase.

The Birth of InterBase

First, some InterBase history. According to this timeline, Jim Starkey started writing JRD (a personal project which would later grow into InterBase) between 1981 and 1984 while at DEC. Ann Harrison states that:

[Jim] began playing with shadowing, which he saw as a way to provide a repeatable read without blocking updates. Then, one morning in the shower, he realized that the shadows could also prevent update conflicts and undo failed transactions.

Groton Database Systems was founded in 1984, and InterBase as a product went into beta about a year after that.

So InterBase’s implementation of multiversion concurrency control and recovery was conceived sometime between 1981 and 1984, and implemented by 1985.
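
To make the idea in Ann's description concrete, here is a minimal sketch of how one chain of record versions can serve all three purposes: a stable read, update-conflict detection, and undoing a failed transaction. It is only an illustration under my own assumptions, not how JRD or InterBase was actually written. The names `Store`, `Txn`, `Version`, and `WriteConflict` are hypothetical, everything lives in memory, and old versions are never garbage collected.

```python
from typing import Any, Dict, FrozenSet, List, Set


class WriteConflict(Exception):
    """Raised when a writer meets a version its snapshot cannot see."""


class Version:
    def __init__(self, value: Any, txn_id: int) -> None:
        self.value = value
        self.txn_id = txn_id  # transaction that wrote this version


class Store:
    def __init__(self) -> None:
        self.chains: Dict[str, List[Version]] = {}  # newest version last
        self.committed: Set[int] = set()
        self.aborted: Set[int] = set()
        self.next_txn = 1

    def begin(self) -> "Txn":
        # A transaction's snapshot is the set of transactions that had
        # committed at the moment it started.
        txn = Txn(self, self.next_txn, frozenset(self.committed))
        self.next_txn += 1
        return txn


class Txn:
    def __init__(self, store: Store, txn_id: int, snapshot: FrozenSet[int]) -> None:
        self.store, self.id, self.snapshot = store, txn_id, snapshot

    def _visible(self, v: Version) -> bool:
        return v.txn_id == self.id or v.txn_id in self.snapshot

    def read(self, key: str) -> Any:
        # Repeatable read: walk back to the newest version we may see.
        for v in reversed(self.store.chains.get(key, [])):
            if self._visible(v):
                return v.value
        return None

    def write(self, key: str, value: Any) -> None:
        chain = self.store.chains.setdefault(key, [])
        for v in reversed(chain):
            if v.txn_id in self.store.aborted:
                continue  # versions left by failed transactions are ignored
            if not self._visible(v):
                # Written by a concurrent or later transaction.
                raise WriteConflict(key)
            break
        chain.append(Version(value, self.id))

    def commit(self) -> None:
        self.store.committed.add(self.id)

    def rollback(self) -> None:
        # Undo needs no log: marking the transaction aborted hides its versions.
        self.store.aborted.add(self.id)
```

In this sketch a reader that started before a writer keeps seeing the old value, a second writer touching the same record gets a `WriteConflict`, and rolling the first writer back is just a matter of marking its transaction as aborted.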

Multiversion Concurrency Control In Computing Literature Before InterBase

Multiversion concurrency control is described in some detail in sections 4.3 and 5.5 of this 1981 paper by Philip Bernstein and Nathan Goodman, then employed by the Computer Corporation of America. [Note: An ACM Portal subscription is required to read the full text of the paper.] Bernstein and Goodman’s paper cites a 1978 dissertation by D.P. Reed which quite clearly describes MVCC and claims it as an original work. According to the Guide to Computing Literature, Reed’s dissertation is cited by 66 other authors, and Bernstein and Goodman’s paper 180 times.

Conclusions

It seems that the idea of multiversion concurrency control and recovery is somewhat older than InterBase, at least insofar as discussion in technical papers is concerned. Working implementations were somewhat harder to come by. DEC’s Rdb/ELN was a commercial database using MVCC and was released just prior to InterBase, but it was also designed by Jim Starkey in the same time period. Other than that, the only earlier working example I can find is a noncommercial application discussed in Reed’s paper.

Jim Starkey notes in the comments section that he was unaware of Reed’s paper, despite having discussed the subject with Bernstein, and came up with the same idea independently, later on.

{ 11 } Comments

  1. Kjell Rilbe | February 21, 2005 at 10:31 am | Permalink

    So, any risks of patent infringements?

    Kjell

  2. Craig Stuntz | February 21, 2005 at 10:34 am | Permalink

    If anyone has patented the technique, it’s news to me. Of course, with the way the USPTO is run, someone could probably patent it tomorrow….

  3. Ann Harrison | February 21, 2005 at 11:46 am | Permalink

    The patent issue is moot. The USPTO has many flaws, but it does require that an invention be patented within a reasonable amount of time (1 year?) after its first commercial use. Rdb/ELN was released in 1984, using exactly the same MVCC that InterBase uses, so that window is well past.

  4. Craig Stuntz | February 21, 2005 at 12:00 pm | Permalink

    Ann is correctly describing U.S. patent law; my comment about what the USPTO actually does was (mostly) intended as humor.

  5. Jim Starkey | February 21, 2005 at 12:18 pm | Permalink

    Nat and I overlapped at CCA in the mid-1970s before I went to DEC. The SDD-1 project with Dave Shipman and Jim Rothnie came much later. Bernstein collaborated, but was then an assistant professor at Harvard, not a CCA employee.

    Bernstein was later hired by DEC, who had funded much of his earlier research. He had published a paper claiming there were only two or three or four ways to do concurrency control, only so many ways to do recovery, etc., so there were only a small number of ways to write a database system, and people were wasting their time thinking about it any further. We argued quite a bit, since Rdb/ELN essentially broke his argument. He eventually conceded the point (this was around 1983/84).

    He never mentioned an earlier paper on multi-generational concurrency control, and I was certainly unaware of it. It would have been very interesting reading.

    As for pre-InterBase commercial use of multi-generational concurrency control, of course there was one: DEC’s Rdb/ELN. Not only did it use much the same technology as InterBase, it was also called JRD.

    The inspiration for multi-generational concurrency control was a database system done by Prime that supported page-level snapshots. The intention of the feature was to give a reader a consistent view of the database without blocking writers. The idea intrigued me as a very useful characteristic of a database system. Sometime later, and I remember the exact instant, it occurred to me that a single mechanism could be used to provide a static snapshot, provide a transaction recovery mechanism, and handle concurrency control. The DEC JRD started a few days later as an advanced development project, and was later picked up by DEC’s storage systems as the software core for a database machine based on the HSC disk controller, code named "Hawk". The first product in the family was Jayhawk, a dedicated server implementation on the MicroVax-II using Dave Cutler’s VAX Eln operating system. Cutler took a fancy to JRD and decided he’d rather have a full-blown relational database than an ISAM system, so we agreed to do a first release on ELN and ELN’s development system, VMS.

  6. Craig Stuntz | February 21, 2005 at 12:33 pm | Permalink

    Jim, thanks for your comments. I’ll update the post to remove some of the ambiguity / speculation. In case you’re still interested in reading Reed’s paper, note that I have linked the full text.

  7. Craig Stuntz | February 21, 2005 at 12:48 pm | Permalink

    I have updated the post based on Jim’s comments. The last two paragraphs originally read:

        It seems that the idea of multiversion concurrency control and recovery is somewhat older than InterBase. I haven’t found any commercial implementations of MVCC prior to InterBase, but Reed seems to have written a noncommercial application.

        It is not clear whether or not any of this previous work influenced InterBase, or whether Jim Starkey came up with the same ideas independently, later on.

  8. Duke Ganote | October 10, 2005 at 1:10 pm | Permalink

    Ken Jacobs’s history of the Oracle DBMS claims that Oracle Version 4, dated Oct 1984, had "read consistency" / MVCC. See http://www.oracle.com/technology/oramag/oracle/03-may/o33drdba.html

  9. Craig Stuntz | October 10, 2005 at 1:56 pm | Permalink

    Duke, read consistency isn’t *necessarily* the same thing as MVCC; SQL Server does it today without multi-versioning, and from the sound of Jim Starkey’s comments Prime used a limited form of multi-versioning for reading only. I don’t know what technology Oracle used in 1984, though.

    Most people, I think, interpret MVCC as an alternative to two-phase locking for both reading and writing data.

  10. Duke Ganote | October 10, 2005 at 3:52 pm | Permalink

    Concur: "read consistency" is the goal; MVCC is one means of achieving it, and locking is another. However, I’m guessing Oracle was using MVCC in Version 4. Note the latter part of Jacobs’s description of the *prior* version: "Oracle version 3 also introduced nonblocking queries, using data saved in a "before image file" for both queries and transaction rollback, thus avoiding the use of read locks (even though its throughput was limited by use of table-level locking)."

  11. Duke Ganote | October 10, 2005 at 5:34 pm | Permalink

    Jacobs’s additional comments on Oracle v3 can be found at

    http://asktom.oracle.com/pls/ask/f?p=4950:8:::::F4950_P8_DISPLAYID:49361026453314#49411709820488

{ 1 } Trackback

  1. [...] http://blogs.teamb.com/craigstuntz/2005/02/18/2699/ – Multiversion Concurrency Control Before InterBase [...]
