Saturday, January 27, 2007

Parameter passing in C#

So there I was, merrily browsing the Internet when I came across this fantastic site on topics such as:

  • Implementing a singleton pattern in C#,
  • Type initializers,
  • Static constructors,
  • Delegates and events,
  • And as my title suggests: Parameter passing in C#.

In short, all those things you rarely need to know to get your job done, but that separate mediocre developers from good ones. The site is by Jon Skeet and the articles are informative, well researched, well explained, and well written. Here's the C# part of his site:

http://www.yoda.arachsys.com/csharp/

The article that caught my attention made sense to me, but being a very visual person I couldn't help but think that some pictures could really help illustrate the points. So without further ado, I illustrated the article. You probably don't need to read the article to understand this post - but you should:

http://www.yoda.arachsys.com/csharp/parameters.html

Note: you can click the images to get a clearer view.

1. Value Types

Notice that the values live inside the box, which will not be the case for reference types. Also, the assignment operation copies the value inside the box; keep this in mind when comparing with reference types.
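The diagram's code isn't reproduced in this text, so here is a minimal sketch of the kind of code it illustrates. The variable names i and j and the values are assumptions (they match the names used in #6), and it is written with top-level statements, which need C# 9 or later.

```csharp
using System;

int i = 5;   // the value 5 lives directly inside i's box
int j = i;   // assignment copies the value: j gets its own, separate 5
j = 10;      // overwrites the contents of j's box only

Console.WriteLine(i);
```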

Quiz: What is the result of the WriteLine statement?

Answer: 5

2. Reference Types

Variables that hold reference types actually hold a reference to a location in memory (on the heap), so assignment operations copy the address. Notice this is still consistent with diagram #1: the copy operation copies the value inside the box.
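As a sketch of what the diagram shows (the original code isn't reproduced in this text, so the names sb1 and sb2 are borrowed from #7 and the rest is assumed), using top-level statements from C# 9 or later:

```csharp
using System;
using System.Text;

StringBuilder sb1 = new StringBuilder("hello"); // sb1's box holds a reference to a heap object
StringBuilder sb2 = sb1;                        // copies the reference, not the object
sb2.Append(" world");                           // changes the one object both variables refer to

Console.WriteLine(sb1);
```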

Quiz: What is the result of the WriteLine statement?

Answer: hello world

3. Immutable Reference Types

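Immutable reference types like strings behave just like regular reference types, except they don't provide a way to change their value. Any operation that appears to modify a string actually creates a new string and leaves the original untouched.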
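A rough sketch of the idea, with assumptions stated up front: the diagram's code isn't reproduced in this text, the names s1 and s2 come from the discussion, and the += concatenation is my stand-in for whatever operation the diagram used (string has no Append method). It uses top-level statements from C# 9 or later.

```csharp
using System;

string s1 = "hello";
string s2 = s1;      // copies the reference: both variables refer to the same string
s2 += " world";      // builds a NEW string and stores its reference in s2; s1 is untouched

Console.WriteLine(s1);
```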

Quiz: What is the result of the WriteLine statement?

Answer: hello

4. Value Types Passed by Value

Passing a variable to a function by value is equivalent to instantiating a new variable and assigning it to the first (well, ignoring scope issues and such). Notice that the diagram below is nearly identical to diagram #1.
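Here is a sketch along the lines of the diagram, whose code isn't reproduced in this text. The method name Change and the parameter name j match #6; the values are assumed. Change is a local function in a program written with top-level statements (C# 9 or later).

```csharp
using System;

int i = 5;
Change(i);            // the value of i is copied into the parameter j

Console.WriteLine(i);

static void Change(int j)
{
    j = 10;           // changes only the parameter's own copy
}
```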

Quiz: What is the result of the WriteLine statement?

Answer: 5, same as #1

5. Reference Types Passed by Value

In #4 I said passing a variable to a function by value is equivalent to instantiating a new variable and assigning it to the first. Is that still true of reference types? Yup. And did you notice there's an implicit assignment statement when passing by value? As you'll see shortly there won't be when passing by reference.
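The following sketch approximates the diagram's code, which isn't reproduced in this text. The names sb1, sb2 and Change and the final null assignment are assumptions based on #7. Nullable annotations are switched off so the null assignment compiles without warnings, and the program uses top-level statements (C# 9 or later).

```csharp
#nullable disable
using System;
using System.Text;

StringBuilder sb1 = new StringBuilder("hello");
Change(sb1);               // the reference stored in sb1 is copied into sb2

Console.WriteLine(sb1);

static void Change(StringBuilder sb2)
{
    sb2.Append(" world");  // follows the copied reference to the shared object
    sb2 = null;            // clears only the local copy; sb1 still refers to the object
}
```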

Quiz: What is the result of the WriteLine statement?

Answer: hello world

6. Value Types Passed by Reference

Passing by reference doesn't involve an implicit copy. Instead, the inner variable is initialized with the address in memory of the outer variable. Every use of the inner variable is then implicitly dereferenced for you (that is, followed through to the variable it points to) and voila, magically you're changing the value of the outer variable.
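A minimal sketch of this case, since the diagram's code isn't reproduced in this text. The names i, j and Change and the statement j = 10 come from the discussion of this diagram; the starting value 5 is assumed from #1 and #4. It uses top-level statements (C# 9 or later).

```csharp
using System;

int i = 5;
Change(ref i);        // no copy is made: j becomes an alias for i's storage

Console.WriteLine(i);

static void Change(ref int j)
{
    j = 10;           // writes through the alias into i
}
```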

Quiz: What is the result of the WriteLine statement?

Answer: 10, and notice how much the diagram and the result differ from #1 and #4.

7. Reference Types Passed by Reference

Really, this is no different from value types passed by reference (#6), except that a call to sb2.Append() on the inner variable is dereferenced twice: once to get to the outer variable, and again because the outer variable itself holds a reference.

By the way, when you get to the section in Jon's article called:

"Sidenote: what is the difference between passing a value object by reference and a reference object by value?"

Please read it carefully; it's an extremely good point. It can be summed up by comparing the final assignment statement above (Reference Types Passed by Reference) to the final assignment statement in diagram #5 (Reference Types Passed by Value). It's a subtle, but important difference.

Oh and the quiz, what is the value of the Console.WriteLine in #7?

Answer: NullReferenceException: Object reference not set to an instance of an object
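For anyone who wants to try #7, here is a sketch of code that behaves this way. The diagram's code isn't reproduced in this text, so the exact statements are assumed: sb1, sb2, the Append call and the null assignment come from the discussion, while the ToString() call, the try/catch and the threw flag are my additions so the program reports the exception instead of crashing. Nullable annotations are switched off, and it uses top-level statements (C# 9 or later).

```csharp
#nullable disable
using System;
using System.Text;

StringBuilder sb1 = new StringBuilder("hello");
Change(ref sb1);           // sb2 is an alias for the variable sb1 itself

bool threw = false;        // hypothetical flag, only used to record the outcome
try
{
    Console.WriteLine(sb1.ToString());   // sb1 is null by now
}
catch (NullReferenceException)
{
    threw = true;
    Console.WriteLine("NullReferenceException");
}

static void Change(ref StringBuilder sb2)
{
    sb2.Append(" world");  // reaches the object through sb1
    sb2 = null;            // writes null into sb1, not into a local copy
}
```

Removing the two ref keywords turns this into case #5: the null assignment then touches only the local copy, and sb1 still refers to "hello world".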

Still confused? Then I didn't do my job right, since this is the point in the article when I thought pictures would help. So please post your thoughts whether it makes sense or not.

- Lee

23 comments:

Nandu said...

The Answer for Quiz in section 3 ("3. Immutable Reference Types") should be just "Hello" NOT "Hello World".

Lee Richardson said...

Good catch nandu. Got the diagram right, but the text wrong. I've fixed it now.

Zytan said...

You just taught me that strings are immutable references, cool. Ok, but for point #5 you said "did you notice there's an implicit assignment statement when passing by value? As you'll see shortly there won't be for reference types". I assumed the implicit assignment was the blue arrow, which I see for both value and reference passing. So, I don't follow this part of the article. Great pictures, though, they help a lot!

Zytan said...

Ah, wait, the text in #5 "And did you notice there's an implicit assignment statement when passing by value? As you'll see shortly there won't be for reference types" is referring to #5 itself. "did" means in the past, so I thought you meant in #4, that should be fixed. And "you'll see shortly" doesn't mean the image coming up in #5, it means in #7 (Reference Types Passed by Reference). That's ambiguous. And "reference types" should be "Reference Types Passed by Reference" since both #5 and #7 are about reference types. Ok, was that correct?

L Kujonewicz said...

I think your diagram in #3 is incorrect, from what I understand. An immutable reference type such as a string works differently, as you say. But your diagram doesn't explain why/how s2.append ends up not affecting the string that s1 points to.

I think in the second step, rather than showing s2 and s1 both pointing to the same string, you should show that there is an implicit creation of a new string, and s2 is given a reference to it. That way, when you mess around with s2, you are affecting another instance of a string. The #1 and #3 steps of your diagram still make sense, though.

Please correct me if I'm wrong.

L Kujonewicz said...

And actually, I'm a bit confused about the last step in #7. There is a lot of implicit de-referencing going on, so I want to be clear.

When you set sb2 = null, what does that really do? Does it set the thing that sb2 points to, to null? Or does it set sb2 itself to null (and is there a difference?)

At the very end, does sb2 still point to sb1 (which now contains a null pointer?). Or does it point to null itself?

I'm guessing that because these reference types were passed BY REFERENCE, that whenever you mess with them, everything is automatically de-referenced all the way up the chain. So perhaps the final diagram should show both sb1 and sb2 containing null in their boxes?

Lee Richardson said...

zytan,

Your second post is absolutely correct. I changed the text to read "there won't be when passing by reference." Thanks for pointing it out.

Lee Richardson said...

L Kujonewicz,

Regarding your first post, I had to research your point to double check whether:

string s1 = "hello";
string s2 = s1;

Results in one "hello" on the heap or two. It turns out that if you compare the two addresses via:

Console.WriteLine((object)s1 == (object)s2);

The result is actually true (you have to cast to object to avoid the overloaded string == operator, which compares values). So there is only one "hello" on the heap.

It turns out, however, that even:

string s1 = "hello";
string s2 = "hello";

Will return true if you compare the two addresses, because of String Interning. Here's a great article on the details:
http://www.codeproject.com/books/0735616485.asp

Lee Richardson said...

By the way, if you liked this article you may also like my Boxing and Unboxing article.

Daniel said...

Dear Mr. Richardson.
I'm currently learning the basics of C# and I must say that your article has clarified many issues that have confused me in the past regarding reference types, and the pictures certainly helped in the process. So thank you so much for this article!

Regarding #6, I believe that the code in the image is missing a line of code. After the declaration of the method Change, there are two operations, and there's no code for the second one. I believe that the line missing is "j = 10;".
Please check that segment of the article and change it (or respond and tell me why I'm mistaken).

Lee Richardson said...

Wow, I can't believe no one spotted that before now. Nice job Daniel, you're absolutely correct. I'll update that shortly.

Vipin said...

First of all, THANKS for providing such a useful and beautifully designed post. With this I have got all my confusions regarding "value" and "ref" cleared.

However, it took me some time to understand #6 [Value Types Passed by Reference] and #7 [Reference Types Passed by Reference], because in the existing pictures j [in #6] looks like a NEW ref variable pointing to i. Likewise, sb2 [in #7] looks like a NEW ref variable pointing to sb1, and that makes it confusing how
sb2.Append(" world");
a ref [sb2] to a ref [sb1], changes the value of the object pointed to by the first ref [sb1].

Since no new memory is allocated when passing by "ref", there should not be a separate box for j [in #6] and sb2 [in #7]; instead, the BOX for i and sb1 should be shown as pointed to by j and sb2 respectively. To show the difference, i and sb1 should be in a light [gray] color while j and sb2 should be in a dark [black] color, indicating that the same location is accessible in the function through these darker variables.

JaneClark said...

Hi,
It's a great article that you have. It explains the concept step-by-step, reaching in the end the most difficult part (Reference Types Passed by Reference), which I was actually looking for.

Shall bookmark the page right away..for future reference! :-D

Am highly thankful to Jon & U for your efforts. Thanks.

Sergey said...

Thank you for a good article. It was a question for me whether a string parameter is passed into a function as a value type (which would be very inefficient). Now I guess a string is duplicated within a function only if it is changed there.

Lee Richardson said...

Sergey,

Yup, strings are definitely reference types, just immutable ones. So it sounds like there might be some duplication, but there really isn't: passing a string copies only the reference, and identical string literals are shared through string interning. Check out my post to L Kujonewicz (8 posts down) which references the following article: http://www.codeproject.com/books/0735616485.asp.

Kala said...

Hi Mr. Richardson,
I really enjoyed reading your article. I always love to learn by drawing pictures - for every math problem i do i draw pictures and make it very visual. i am a C++ programmer and i have always drawn pictures like this to understand pointers and passing parameters. So i really really appreciated what you have done for people like us. one picture is equivalent to 1000 words! thanks again for your wonderful article.

I have a question. this is in combination with #3 and #5. What happens when the immutable string is passed by value?

string s1 = "hello";
Change(s1);
Console.WriteLine(s1.ToString());

void Change(string s2)
{
s2 += " world";
}

From what i understand from all your explanations, when the Change method is called, s2 is allocated on the stack and gets the address of "hello" (it will have the same value as s1, i.e. s2 will be 'pointing' to "hello"). But when the statement s2 += " world"; is executed, a new string "hello world" is allocated on the heap and s2 now points to that. When we leave the Change method, s2 is deallocated from the stack. Is this what really happens?
So will the WriteLine put out the answer hello?
Am i correct in understanding it this way?

Regards,
Kala

Kye said...

Pictures do help immensely with understanding value objects and reference objects, thanks for the great article.

I was playing around with Visual C++ recently, and noticed something interesting. Reference objects in C# are managed pointers (or whatever they're called) in C++. You can tell because they have ^ after them. Suddenly, passing reference objects by value makes sense if you think about it from C++.

Here's what I think is going on. Let's say a reference object is a pointer to the object in memory. You pass this pointer by value. C# automatically dereferences the pointer for you when you work with it. someString.replace() might be something like someString->replace(). You know that if you assign this pointer to something else (another object, null, new object), it doesn't affect the original pointer. So you can modify the contents of the reference object because of automatic dereferencing, you just can't change what it points to.

So, basically, reference objects are pointers and dereferencing is automatic. Did I get it right?

Dale said...

One thing that is missing in your discussion and Jon's is the effect of new() on reference parameters.

I know it has unique behaviors - I just can't remember the details - which is why I came to your article and Jon's on this topic - so I'll have to keep looking. :)

Thanks for an otherwise very good article.

Anonymous said...

Thanks Richardson, very useful article. One quick question: when would one choose between #5 and #7? Since the end result is the same, is there an advantage of one over the other?

Anonymous said...

Anonymous above^:

#5 and #7 are not the same. You must read the rest of the article.

Mike Durthaler said...

Lee, I'm confused about a semantics point (in the English sense)

Passing by reference doesn't involve an implicity should be implicit :) copy, instead it instantiates the inner variable to the address in memory of the outer variable. Then all references to the inner variable are implicitly dereferenced for you and voila, magically you're changing the value of the outer variable.

I don't get what "dereferenced" means:

Then all references to the inner variable are implicitly dereferenced for you ...

It would seem my understanding of what is happening is correct if the terms val and ref are reversed :)

If something is DE referenced, then there is NO connection between the 2 items. So changes in j can't affect i but they do.

The change in i in diagram 6 could not happen if the reference location of i and j were gone ... or what did I miss?

j would HAVE to be referring (referencing) the same address because changes to j affect i.

Please clarify

Mike Durthaler said...

I don't see what you mean by dereferenced because this would mean dis-relating things that were previously related.

Then all references to the inner variable are implicitly dereferenced for you ...

well, if changes to j are reflected in i, then they can't be DE referenced.

What did I miss?
