
Comparing C#, C++, and Delphi (Win32) Generics

C#, C++, and Delphi all have a generic type and method language feature. Although all three languages are statically typed, they implement generics in very different ways. I’m going to give a brief overview of the differences, both in terms of language features and implementation. I presume that Delphi Prism generics work essentially the same as C# generics, which, as you’ll see, is different from Delphi/Win32 generics.

Let me say at the outset that although all three systems work somewhat differently, I don’t see an overwhelming advantage to any one design. Generally, you can do what you need to do in all three environments. I’m writing this article not to claim that any one system is better than the others, but to point out some of the subtleties in the implementations.

Before I get started, I’d like to thank Barry Kelly for his useful feedback on my first draft of this article.

Compiling Instantiations

Every implementation of generic types works via a two-step process. First, you define a generic type or method with a "placeholder" for a specific type, which will be substituted later on. Later (exactly when depends upon the language), the type is "instantiated." Note that instantiating a generic type is very different from instantiating an object. The former happens within the compiler, whereas the latter happens at runtime.

Instantiation is triggered when some code uses a generic type or method with a specific type parameter. Based upon the generic definition and the types or values passed when the generic is used, a specific implementation is substituted so that machine code can be generated. Instantiation is one of the most important differences between real generic types and using non-generic types with casts. In the end, different machine code is generated for instantiations with different type parameters.
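To make the two steps concrete, here is a minimal C++ sketch. The names larger and instantiation_demo are my own, chosen only for illustration. The template is the generic definition, and each use with a new type argument asks the compiler for a separate instantiation.

```cpp
#include <cassert>
#include <string>

// Step 1: the generic definition. T is the placeholder; no code is
// generated for this template on its own.
template <typename T>
T larger(const T& a, const T& b) {
    return (a < b) ? b : a;
}

// Step 2: each use with a specific type argument triggers an instantiation.
inline void instantiation_demo() {
    assert(larger<int>(3, 7) == 7);                   // instantiates larger<int>
    assert(larger<std::string>("ab", "cd") == "cd");  // instantiates larger<std::string>
}
```

The two calls share one definition in the source, but the compiler produces two distinct functions from it.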

In C# and Delphi, there is a language feature which is solely dedicated to implementing generic types and methods. In C++, on the other hand, the "templates" language feature can be used to implement generic types and methods, among many, many other things. It is even possible to do general-purpose programming using templates, which C++ programmers call "metaprogramming."
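As a small illustration of what "metaprogramming" means here, the following sketch computes a factorial entirely through template instantiation. Factorial is a hypothetical name of my own, and static_assert assumes a C++11 or later compiler.

```cpp
#include <cassert>

// Recursive case: instantiating Factorial<N> forces the compiler to
// instantiate Factorial<N - 1>, and so on.
template <unsigned N>
struct Factorial {
    static const unsigned long value = N * Factorial<N - 1>::value;
};

// Base case: a specialization for the value 0 ends the recursion.
template <>
struct Factorial<0> {
    static const unsigned long value = 1;
};

// The result is a compile-time constant, so it can be checked by the compiler.
static_assert(Factorial<5>::value == 120, "computed during compilation");

inline void factorial_demo() {
    assert(Factorial<5>::value == 120);  // same constant, checked at runtime
}
```

No loop or function call runs at runtime; the compiler does the arithmetic while instantiating the templates.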

C++ templates require the template source code to be available when the code using the template is compiled. This is because the compiler does not actually compile the template as a separate entity, but rather instantiates it "in-place" and only compiles the instantiation. The C++ compiler is effectively doing code generation, substituting the type parameter (or value) for the placeholder for the type, and generating new code for the instantiation. Update: Moritz Beutel elaborates on this in his excellent comment on this post. You should read the full comment, but the short version is that the manner in which templates are compiled can result in errors in the code which uses the template appearing (from compiler error messages), incorrectly, to be errors in the template itself. Moreover, the implementation of most C++ compilers makes this problem even worse than what is necessary in order to implement the C++ standard.

In Delphi and C#, on the other hand, the generic type or method and the code which uses it can be compiled separately. Therefore, you can compile a library which contains a generic type, and later on compile an executable which uses an instantiation of that type and has a reference to the binary library, rather than to the source code for the library.

Another way to think of this difference is that in C++, a template will not be compiled at all until it is used. In Delphi and C#, on the other hand, a generic type or method must be compiled before it can be used.
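One visible consequence on the C++ side can be sketched as follows. Holder, call_foo, and lazy_demo are hypothetical names of my own. The body of call_foo would be an error for int, yet the code compiles, because a member function of a class template is only instantiated when it is actually used.

```cpp
#include <cassert>
#include <cstddef>

template <typename T>
struct Holder {
    T value;

    // Valid only if T has a member function Foo(). This is checked when
    // call_foo is used with a concrete T, not when the template is defined.
    int call_foo() { return value.Foo(); }

    std::size_t size() const { return sizeof(T); }
};

inline std::size_t lazy_demo() {
    Holder<int> h = {42};  // instantiates Holder<int>, but not the body of call_foo
    return h.size();       // compiles, because call_foo is never used for int
}
```

Adding a call to h.call_foo() inside lazy_demo would turn this into a compile error, reported at the line inside the template.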

In Delphi, the compiler uses a feature closely related to the method inlining feature. This causes the compiler to store the relevant bits of the abstract syntax tree for the generic type in the compiled DCU. When the code which uses the generic type is compiled, this bit of the abstract syntax tree is read and included in the abstract syntax tree for the code which uses the generic type, so that when machine code is produced, based on the new, “compound” abstract syntax tree, it looks, to the code emitter, like the type was defined with the type parameter "hard coded." Instead of linking to compiled code in the DCU, the code which uses the generic type emits new code for the instantiation into its own DCU.

Because generic instantiation is performed in the same area of the Delphi compiler which does method inlining, there are some limitations on what you can do in a generic method, or a method of a generic type. Like inlined methods, these methods cannot contain ASM. Also, calls to these methods cannot be inlined. These restrictions are limitations of the implementation, not of the language design, and could theoretically be removed in a future version of the compiler.

C# generics rely on the .NET Framework 2.0 or later, whose runtime has native support for generic types. The C# compiler emits IL which specifies that a generic type should be used, with certain type parameters. The .NET runtime implements these types using one shared instantiation for all reference types, and a custom instantiation for each value type. (Don’t confuse “instantiation” with “instance” in the preceding sentence; they mean entirely different things in this context. There are usually many instances of one instantiation.) This is because a reference to a reference type is always the same size, whereas value types can be many different sizes. Later, the IL will be JITted into machine code, and, as with compiled C++ or Delphi code, types don’t really exist at the machine code level. In .NET, generic type instantiation and JITting are two distinct operations.

So one important difference in generics implementations is when the instantiation occurs. It occurs very early in C++ compilation, somewhat later for Delphi compilation, and as late as possible for .NET compilation.

Custom Specializations

Another very important difference is that C++ allows custom instantiations, called specializations, including specializations by value. With C# and Delphi, on the other hand, the only way to instantiate a generic type is to use that type with an explicit type parameter. The implementation will always be the same, with the exception of the type of the type parameter. Because C++ allows custom instantiations, it is easy for a programmer to write different implementations of a method, for example, for different integer values. Like operator overloading, this is a powerful feature which requires considerable self-restraint to avoid abuse.
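A short C++ sketch of both kinds of specialization follows. Describe and Label are hypothetical names of my own. The first pair specializes on a type, the second on an integer value.

```cpp
#include <cassert>
#include <string>

// Primary template: the implementation used for most types.
template <typename T>
struct Describe {
    static std::string name() { return "some type"; }
};

// Custom instantiation (specialization) for bool, with its own implementation.
template <>
struct Describe<bool> {
    static std::string name() { return "bool"; }
};

// Primary template parameterized by a value rather than a type.
template <int N>
struct Label {
    static const char* text() { return "many"; }
};

// Specialization by value: a different implementation when N is 1.
template <>
struct Label<1> {
    static const char* text() { return "one"; }
};

inline void specialization_demo() {
    assert(Describe<int>::name() == "some type");
    assert(Describe<bool>::name() == "bool");
    assert(std::string(Label<3>::text()) == "many");
    assert(std::string(Label<1>::text()) == "one");
}
```

Nothing forces the specialized versions to resemble the primary template, which is where both the power and the potential for abuse come from.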

Constraints

Delphi and C# both have a generic constraint feature, which allows/requires the developer of a generic type or method to limit which types can be passed as the type parameter. For example, a generic type which needs to iterate over some list of data could require that the type parameter support IEnumerable, in either language. This allows the developer of the generic type to make her intentions for the use of the type very clear. It also allows the IDE to provide code completion/IntelliSense on the type parameter, within the definition of the generic type. Also, it allows a user of the generic type to be confident that they are passing a legal value for the type parameter without having to compile their code to find out.

In C++, on the other hand, there is not presently any such feature. A more powerful/complex feature called "concepts" was considered for, but ultimately removed from, C++0x.

An implication of the lack of constraints is that C++ templates are duck typed. If a generic method calls some method, Foo, on a type passed as the generic type parameter, then the template is going to compile just fine so long as the type parameter passed contains some method called Foo with the appropriate signature, no matter where or how it is defined.
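Here is a minimal sketch of that duck typing. Duck, Robot, and call_foo are hypothetical names of my own. The two structs share no base class or interface; all that matters is that each has a suitable Foo.

```cpp
#include <cassert>

struct Duck  { int Foo() const { return 1; } };
struct Robot { int Foo() const { return 2; } };  // unrelated to Duck

// No constraint is declared anywhere. The only requirement on T is that
// item.Foo() compiles and yields something convertible to int.
template <typename T>
int call_foo(const T& item) {
    return item.Foo();
}

inline void duck_typing_demo() {
    assert(call_foo(Duck()) == 1);
    assert(call_foo(Robot()) == 2);
}
```

Passing a type without a Foo method, such as int, would fail to compile, and the error would point at the line inside call_foo.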

Covariance and Contravariance

Let’s say I have a function which takes an argument of type IEnumerable<TParent>. Can I pass an argument of type IEnumerable<TChild> to that function? What if the argument type were List<TParent> instead of IEnumerable<TParent>? Or what if the generic type was the function result rather than the function argument? The formal names for these problems are covariance and contravariance. The precise details are too complicated to explain in this article, but the examples above summarize the most common times you run into the problem.

Delphi generics and C++ templates do not support covariance and contravariance. So the answers to the questions above are no, no, and no, although there are, of course, workarounds, like copying the data into a new list. In C# 4.0, the type parameters of generic interfaces and delegates can be declared covariant or contravariant, so the examples above can be made to work where appropriate. "Where appropriate" involves non-trivial subtleties hinted at above, and exemplified by the fact that arrays in .NET have (intentionally) broken covariance. However, the BCL routines in the .NET Framework 4.0 have been annotated to support covariance and contravariance when appropriate, so developers will benefit from the feature without having to fully understand it.
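For the C++ case, the "no" and the copying workaround can be sketched like this. Parent, Child, sum_ids, and as_parents are hypothetical names of my own, and the static_assert and type_traits usage assumes C++11 or later.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

struct Parent {
    virtual ~Parent() {}
    virtual int id() const { return 0; }
};
struct Child : Parent {
    int id() const { return 1; }
};

// A Child* converts to a Parent*, but the two vector instantiations are
// unrelated types, so no conversion exists between them.
static_assert(std::is_convertible<Child*, Parent*>::value,
              "pointer conversion works");
static_assert(!std::is_convertible<std::vector<Child*>,
                                   std::vector<Parent*> >::value,
              "container conversion does not");

inline int sum_ids(const std::vector<Parent*>& items) {
    int total = 0;
    for (std::size_t i = 0; i < items.size(); ++i) total += items[i]->id();
    return total;
}

// Workaround: copy the pointers into a new vector of the expected type.
inline std::vector<Parent*> as_parents(const std::vector<Child*>& children) {
    return std::vector<Parent*>(children.begin(), children.end());
}
```

Calling sum_ids directly with a std::vector<Child*> does not compile; calling sum_ids(as_parents(children)) does, at the cost of a copy.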

{ 4 } Comments

  1. Jolyon Smith | October 1, 2009 at 2:30 pm | Permalink

    "so developers will benefit from the feature without having to fully understand it"

    A path paved with folly and edged with razors on one side and a precipitous cliff on the other, imho and ime. ymmv. :)

    Response: I view it as very much like ASM. My wild guess is that the majority of Delphi developers don’t understand ASM. Yet all Delphi developers use routines written in ASM, every day, and in every program they write. Will you be a better Delphi developer if you do understand ASM? Yes, certainly; even if you never write an ASM routine yourself, you will regularly debug through them. But can you get started and be a productive developer without learning ASM first? The evidence I’ve seen suggests yes.

  2. Moritz Beutel | October 1, 2009 at 3:32 pm | Permalink

    Great summary, thanks.

    On the C++ part, I think it should be noted that, while most compilers simply reparse the substituted template definition when instantiating it, the standard actually demands "two-phase name lookup", which requires that the template be parsed as far as possible, and all expressions which don’t depend on the template arguments be checked syntactically and semantically when the template definition is first encountered. I think only EDG’s frontend (which is being used for Intel’s and Comeau’s C++ compiler) implements this correctly.

    I understand that you’re concentrating on the technical implementation bits, but especially when mentioning concepts, I think that one major flaw of C++ templates is worth mentioning: constraint errors being reported as errors in the actual template. This was one of the major issues which Concepts were meant to address. As an example, imagine calling the std::sort algorithm, which requires you to pass random-access iterators:

    // -----
    std::vector <int> myInts (42);
    std::sort (myInts.begin (), myInts.end (), std::less <int> ());
    // -----

    This works fine - the code instantiates a vector of ints, sets the size to 42 and then sorts them; to access a specific element in the array, it uses direct index access via the first iterator, i.e. myInts.begin()[n]. Now, the cool thing about iterators is that their handling is independent of the type of the container they belong to, similar to IComparable/IEnumerable, but with all the polymorphism being resolved at compile time. So for most algorithms that only require uni- or bidirectional iterators, changing the std::vector to a std::list would just work fine. But as I said, std::sort() requires random access iterators (which provide an overload for the [] operator), so calling std::sort() with iterators pointing to a linked list will fail. Of course the error is in the user’s code: calling std::sort() on iterators without random access isn’t supposed to work. But the sad thing is that the compiler won’t tell you: e.g., all you get from BCC is two errors in system headers (translated here from German):

    // -----
    Error E2093 E:\Programme\Weaver\\Include\dinkumware\algorithm 2220: ‘operator-’ not implemented in type ‘list<int,allocator >::iterator’ for arguments of the same type in function std::void sort<list<int,allocator >::iterator,less >(list<int,allocator >::iterator,list<int,allocator >::iterator,less)
    Error E2285 E:\Programme\Weaver\\Include\dinkumware\algorithm 2220: Could not find a match for ‘_Sort(list<int,allocator >::iterator,list<int,allocator >::iterator,undefined,less)’ in function void sort<list<int,allocator >::iterator,less >(list<int,allocator >::iterator,list<int,allocator >::iterator,less)
    // -----

    This is because std::sort() is a function template, and as such it’s just reparsed for this instantiation, but with the above type substitution, the reparse causes an error - which of course occurs in the actual template code. Note that BCC is one of the well-behaving compilers WRT template error messages - with a little experience, you can see what happened in the above messages, and with Extended Error Info turned on the compiler will even tell you about the file which the instantiation occurred in. But not all compilers are that easy to work with. Just look at the output EDG generates for this code (you can try it online on Dinkumware’s site at http://www.dinkumware.com/exam/default.aspx):

    // -----
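    #include <algorithm>
    #include <functional>
    #include <list>
    #include <vector>

    int main (void)
    {
        // std::vector <int> myInts (42);   // random access iterators: this compiles
        std::list <int> myInts;              // bidirectional iterators only
        // Intentionally ill-formed: std::sort() requires random access iterators.
        std::sort (myInts.begin (), myInts.end (), std::less <int> ());
    }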
    // -----

    I won’t quote the output here, for it’s 37 error messages spread over a total of 863 lines, all of which point to the system header std::sort() is defined in.

    Also, while reading the bit on covariance and contravariance, I thought you should have mentioned Java. While I think that Java’s generics are fundamentally crippled due to their use of type erasure (for the sake of JVM backward compatibility), they still handle covariance and contravariance nicely, something which I often miss in C++ and Delphi.

    Response: Thank you very much, Moritz, this is a great comment! I’ve taken the liberty of formatting your source code and replacing the angle brackets; please let me know if I’ve missed anything. Great point about the error messages; I will update the article to include that (with the clarification in your second paragraph). I really appreciate all the thought you put into the comment! Regarding Java, I didn’t include it because, unlike the other languages which I did include, I consider the implementation to be essentially broken by design (for precisely the reason you mention). It would’ve been very interesting to include Scala, but I would need to understand it better to say anything intelligent on the subject.

  3. John Moshakis | October 2, 2009 at 11:31 am | Permalink

    Delphi Prism has had covariance and contravariance for a while, this is the wiki page

    http://prismwiki.codegear.com/en/Generic_Variance

  4. 30km | November 16, 2009 at 6:18 pm | Permalink

    Very helpful resource for me to get started with Delphi generics. It is very different from C++ templates, and sometimes seems buggy:P
