2018-12-09

3: "Lvalue", "Rvalue" in C++? They Are Not Values!


The names are misnomers, and the prevalent explanations of them do not make sense, because they explain the concepts as though they were values.

Topics


About: C++


Starting Context



Target Context


  • The reader will have a reasonable explanation of what so-called "lvalue" and "rvalue" are in C++.

Orientation


Hypothesizer 7
The following code causes a compile error like "lvalue required as unary ‘&’ operand".

@C++ Source Code
void pointerArgumentFunction (int * a_integerPointer) {
	*a_integerPointer = 3 * 4;
}

int main (int a_argumentsNumber, char const * a_arguments []) {
	pointerArgumentFunction (&(2 * 3));
}

"lvalue"? . . . What the hell is an "lvalue"? . . . Whatever exactly it is, one would naturally guess that it is a kind of value, as the name is l"value".

However, the following code, which compiles successfully, proves that it is not so.

@C++ Source Code
#include <iostream>

void referenceArgumentFunction (int const & a_integerReference) {
	::std::cout << "### The address of the temporary object is " << &a_integerReference << ::std::endl << ::std::flush;
}

int main (int a_argumentsNumber, char const * a_arguments []) {
	referenceArgumentFunction ((2 * 3));
}

That code compiles because "a_integerReference" is not an "rvalue" but an "lvalue". However, the value concerned in both of the two pieces of code above is the datum that is created by the expression "2 * 3", which is a so-called temporary object. . . . Why is a single datum an "rvalue" and also an "lvalue"? . . . Strange.

It turns out that an "lvalue" or an "rvalue" is not a value, but an expression. . . . Quite confusing names. I think that they should be called 'lexpression' and 'rexpression'.

I have seen an explanation like "An rvalue is an xvalue, a temporary object or subobject thereof, or a value that is not associated with an object.", but that is obviously wrong: that explanation reads as though an "rvalue" were a value, which is not true. In fact, although "a_integerReference" in the second code above represents a temporary object, that expression is not an "rvalue".

Note that although there are also the terms "glvalue", "xvalue", and "prvalue", any expression is still exclusively either an "lvalue" or an "rvalue", so restricting the topic to only "lvalue" and "rvalue" makes sense. I will not delve into those three types of expressions in this article, because I would first have to explain the so-called "move semantics" (which is another confusing term). In fact, without "move semantics", "xvalue" does not exist, "glvalue" equals "lvalue", and "prvalue" equals "rvalue", so only "lvalue" and "rvalue" matter.

By the way, is the distinction between "lvalue" and "rvalue" really necessary? . . . I will think about that in a later section.


Main Body


1: What Is "Rvalue"?


Hypothesizer 7
First, a so-called "rvalue" is not a value, but an expression. In other words, being an "rvalue" is an attribute of an expression, not of any datum. In fact, I use the term 'rexpression' instead.

Let me look again at the code cited above in 'Orientation'.

@C++ Source Code
#include <iostream>

void referenceArgumentFunction (int const & a_integerReference) {
	::std::cout << "### The address of the temporary object is " << &a_integerReference << ::std::endl << ::std::flush;
}

int main (int a_argumentsNumber, char const * a_arguments []) {
	referenceArgumentFunction ((2 * 3));
}

The expressions, "2 * 3" and "a_integerReference", are an rexpression and an lexpression, respectively, although they represent the same temporary object.

Although there are prevalent explanations like "A temporary object is an rvalue.", the above code clearly proves that they are wrong. In fact, the correct statement is "Any expression that creates a temporary object is an rexpression".
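The claim that the two expressions differ in kind although they denote the same temporary object can be checked at compile time. The following is only a minimal sketch: the macro 'IS_LEXPRESSION' and the function 'parameterNameIsLexpression' are hypothetical names of mine, and the sketch relies on the rule that 'decltype ((expression))' is an lvalue reference type exactly when the expression is an lvalue.

@C++ Source Code
```cpp
#include <type_traits>

// A hypothetical helper macro: 'decltype ((expression))' is 'T &' exactly when the expression is an lvalue.
#define IS_LEXPRESSION(a_expression) (::std::is_lvalue_reference <decltype ((a_expression))>::value)

// The expression "2 * 3", which creates the temporary object, is an rexpression.
static_assert (! IS_LEXPRESSION (2 * 3), "'2 * 3' is expected to be an rexpression");

// The name of the reference parameter is an lexpression, whatever object it is bound to.
constexpr bool parameterNameIsLexpression (int const & a_integerReference) {
	return IS_LEXPRESSION (a_integerReference);
}

// The parameter is bound to the temporary object created by "2 * 3", but its name is still an lexpression.
static_assert (parameterNameIsLexpression (2 * 3), "'a_integerReference' is expected to be an lexpression");
```

If either assertion did not hold, the compilation would fail, so a successful compilation is the whole result of this sketch.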

What I think is a correct definition of 'rexpression' is 'any expression that creates a temporary object or any expression that is explicitly made an rexpression'.

"explicitly made an rexpression"? . . . Yes, with the introduction of the so-called "move semantics", a way to make any expression an rexpression was introduced. In fact, "move semantics" aside, the first half of the definition is enough.
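Both halves of the definition can be observed with a pair of overloaded functions, as a rough sketch. The names 'isRexpression' and 'checkExplicitlyMadeRexpressions' are hypothetical, and the "int &&" parameter really belongs to the topic of "move semantics"; it is used here only as a detector, on the assumption that overload resolution prefers it for rexpression arguments.

@C++ Source Code
```cpp
#include <cassert>
#include <utility>

// Hypothetical detectors: overload resolution chooses one of them according to the kind of the argument expression.
bool isRexpression (int &) { return false; } // chosen for any lexpression
bool isRexpression (int &&) { return true; } // chosen for any rexpression

void checkExplicitlyMadeRexpressions () {
	int l_integer = 1;
	assert (! isRexpression (l_integer)); // a plain name: an lexpression
	assert (isRexpression (l_integer * 2)); // creates a temporary object: an rexpression
	assert (isRexpression (::std::move (l_integer))); // explicitly made an rexpression
	assert (isRexpression (static_cast <int &&> (l_integer))); // the same, without the library function
	int && l_movedReference = ::std::move (l_integer);
	assert (&l_movedReference == &l_integer); // no new datum is created: only the kind of the expression has changed
	assert (! isRexpression (l_movedReference)); // a name again: an lexpression
}
```

The last two assertions are the point: the datum stays the same one throughout, while the expressions that denote it are sometimes lexpressions and sometimes rexpressions.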


2: What Is "Lvalue"?


Hypothesizer 7
Again, a so-called "lvalue" is not a value, but an expression. In fact, I use the term 'lexpression' instead.

The definition of 'lexpression' is 'any expression that is not any rexpression'.


3: Some Examples of Rexpressions and Lexpressions


Hypothesizer 7
Let me look at some examples of rexpressions and lexpressions. Note that whether the address of the value of an expression can be taken is a litmus test for the expression's being an lexpression or an rexpression: any statement that tries to take the address of the value of an rexpression causes a compile error.

@C++ Source Code
#include <iostream>
#include <utility> // for '::std::move'

			double const & lexpressionReturnFunction (double const & a_doubleReference) {
				return a_doubleReference; // The return is really the value of "a_doubleReference".
			}
			
			float prexpressionReturnFunction () {
				float l_float = 1.0;
				return l_float; // The return is not really the value of "l_float", but a copy of the value of "l_float".
			}
			
			double && xexpressionReturnFunction (double & a_doubleReference) {
				return ::std::move (a_doubleReference); // The return is really the value of "a_doubleReference", but any call of this function is explicitly turned into an rexpression by virtue of the '::std::move'.
			}
			
			void checkExpressionTypes () {
				int l_integer = 1;
				int * l_integerPointer = &l_integer;
				int const & l_integerReference = l_integer;
				double l_double = 2.0 * 3.0;
				::std::cout << "### The address of 'l_integer' is '" << &l_integer << "'." << ::std::endl << ::std::flush; // an lexpression
				::std::cout << "### The address of 'l_integerPointer' is '" << &l_integerPointer << "'." << ::std::endl << ::std::flush; // an lexpression
				::std::cout << "### The address of '*l_integerPointer' is '" << &(*l_integerPointer) << "'." << ::std::endl << ::std::flush; // an lexpression
				::std::cout << "### The address of 'l_integerReference' is '" << &l_integerReference << "'." << ::std::endl << ::std::flush; // an lexpression
				//::std::cout << "### The address of 'l_integer * 2' is '" << &(l_integer * 2) << "'." << ::std::endl << ::std::flush; // an rexpression
				::std::cout << "### The address of '*((int *) 1)' is '" << &*((int *) 1) << "'." << ::std::endl << ::std::flush; // an lexpression, although bad (although "*((int *) 1)" does not represent any valid datum, it is still an lexpression: an explanation like "a value that is not associated with an object is an rvalue" is wrong)
				::std::cout << "### The address of 'checkExpressionTypes' is '" << (void *) &checkExpressionTypes << "'." << ::std::endl << ::std::flush; // an lexpression
				::std::cout << "### The address of 'lexpressionReturnFunction (l_double)' is '" << &(lexpressionReturnFunction (l_double)) << "'." << ::std::endl << ::std::flush; // an lexpression
				::std::cout << "### The address of 'lexpressionReturnFunction (2.0 * 3.0)' is '" << &(lexpressionReturnFunction (2.0 * 3.0)) << "'." << ::std::endl << ::std::flush; // an lexpression, although an expression that represents a temporary object
				//::std::cout << "### The address of 'prexpressionReturnFunction ()' is '" << &(prexpressionReturnFunction ()) << "'." << ::std::endl << ::std::flush; // an rexpression
				//::std::cout << "### The address of 'xexpressionReturnFunction (l_double)' is '" << &(xexpressionReturnFunction (l_double)) << "'." << ::std::endl << ::std::flush; // an rexpression
				::std::cout << "### The address of 'const_cast <int &> (l_integer)' is '" << &(const_cast <int &> (l_integer)) << "'." << ::std::endl << ::std::flush; // an lexpression
				//::std::cout << "### The address of '::std::move (l_integer)' is '" << &(::std::move (l_integer)) << "'." << ::std::endl << ::std::flush; // an rexpression
			}

"::std::move" above has explicitly made the lexpressions rexpressions (which will be discussed in the next article).
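The litmus test itself (whether '&' can be applied) can also be asked of the compiler without causing a compile error, as a sketch under some assumptions. 'MakeVoid', 'AddressCanBeTaken', 'ADDRESS_CAN_BE_TAKEN', and 'g_integer' are hypothetical names of mine; the two functions are shortened copies of the ones above. The test is applied not to the expression itself but to a stand-in expression of the same type and kind, and only built-in types are used, as a class can overload the unary '&' operator.

@C++ Source Code
```cpp
#include <type_traits>
#include <utility>

// A stand-in for '::std::void_t' that also works before C++17 (a hypothetical name).
template <typename ...> struct MakeVoid { typedef void type; };

// The primary template: chosen when '&' cannot be applied.
template <typename T, typename = void> struct AddressCanBeTaken : ::std::false_type {};

// The specialization: usable only when '&' applied to an expression of the kind encoded in 'T' is well-formed.
template <typename T> struct AddressCanBeTaken <T, typename MakeVoid <decltype (& ::std::declval <T> ())>::type> : ::std::true_type {};

// 'decltype ((expression))' encodes the kind of the expression: 'T &' for any lexpression.
#define ADDRESS_CAN_BE_TAKEN(a_expression) (AddressCanBeTaken <decltype ((a_expression))>::value)

double const & lexpressionReturnFunction (double const & a_doubleReference) { return a_doubleReference; }
float prexpressionReturnFunction () { return 1.0f; }

int g_integer = 1;

static_assert (ADDRESS_CAN_BE_TAKEN (g_integer), "an lexpression");
static_assert (ADDRESS_CAN_BE_TAKEN (lexpressionReturnFunction (2.0 * 3.0)), "an lexpression, although it represents a temporary object");
static_assert (! ADDRESS_CAN_BE_TAKEN (g_integer * 2), "an rexpression");
static_assert (! ADDRESS_CAN_BE_TAKEN (prexpressionReturnFunction ()), "an rexpression");
static_assert (! ADDRESS_CAN_BE_TAKEN (::std::move (g_integer)), "an rexpression");
```

The results agree with the comments in the code above: the address can be taken for exactly the expressions marked as lexpressions.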


4: Why Do 'Lexpression' and 'Rexpression' Have to Be Distinguished?


Hypothesizer 7
But why do 'lexpression' and 'rexpression' have to be distinguished? . . . Well, they do not, I think.

Historically, they have been distinguished in order to prohibit some expressions from being used as the left-hand side of an assignment.

Let me look at some examples.

@C++ Source Code
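#include <string>

void checkAssignments () {
	int l_integer1 = 1;
	int l_integer2 = 2;
	//l_integer1 * l_integer2 = 3; // a compile error: the left-hand side is an rexpression of a built-in type
	::std::string ("aaa") = ::std::string ("bbb"); // allowed, although the left-hand side is an rexpression
}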

The statement, "l_integer1 * l_integer2 = 3;", is not allowed by the compiler, but does that statement really make no sense? . . . I do not think so. That statement means that a temporary object is created somewhere (it does not matter even if the location is in a CPU register) and '3' is put into the location. That makes sense.

Certainly, that statement may not be useful, but many other useless statements (like "l_integer1 = l_integer1;") are allowed in C++, and I do not see any necessity for prohibiting only certain useless statements. In fact, why not just let the compiler silently eliminate useless statements as a part of the optimization, without troubling programmers with error messages about "lvalue" or "rvalue"?
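The prohibition is not even uniform: for a class type, an assignment is a call of a member function, and such a call is accepted on an rexpression unless the class declares otherwise. The following sketch shows the assignment to a temporary '::std::string' taking effect; the function names are hypothetical ones of mine.

@C++ Source Code
```cpp
#include <cassert>
#include <string>

// Assigns to a temporary object and returns a copy of the temporary object taken after the assignment.
::std::string assignToTemporaryString () {
	// The left-hand side is an rexpression, but the assignment is a member function call and so compiles.
	return ::std::string ("aaa") = ::std::string ("bbb");
}

void checkAssignmentToTemporaryObject () {
	// '"bbb"' has really been put into the temporary object that initially held '"aaa"'.
	assert (assignToTemporaryString () == "bbb");
}
```

So, putting a value into a temporary object is an operation that the language already performs for class types; only the built-in assignment demands an lexpression on the left-hand side.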

Although there is also a restriction that any address-taking operation on an rexpression is forbidden, that restriction is also unnecessary in my opinion. Logically speaking, being an rexpression does not necessitate forbidding address-taking operations on the expression at all. If the purpose is to let the compiler store a temporary object only in a CPU register, just have the compiler judge, as a part of the optimization, which temporary objects can safely be exempted from memory allocation: if the address of a temporary object is taken, the compiler stores the object in memory; if not, the compiler is free to store it only in a CPU register. I do not think that draconianly forbidding any address-taking operation on any rexpression is reasonable, as the prohibition does not accomplish anything very meaningful (the address can be gotten anyway in any function that accesses the temporary object via a reference argument).

Someone might say that passing the address of a temporary object into a function (for the purpose of making the argument modifiable) is useless because the changes cannot be seen after the function call. I say that there are functions that have pointer arguments, and I sometimes have to pass the address of a temporary object into one of them, whether or not the change is visible for that specific call (the argument's being a pointer is meaningful because the function is not meant only to be called with a temporary object: sometimes, it is called with a non-temporary object).

Certainly, as I am not a compiler developer, there might be some reasons why compiler developers want such restrictions, but I claim that a convincing explanation is owed to programmers if there are such reasons (I have not found any so far).
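As the address can be gotten via a reference argument anyway, the prohibition can be worked around with a small helper function, as a sketch. 'pointerArgumentFunction' is the function from 'Orientation'; 'addressOfArgument', 'writeAndReadBack', and 'checkPassingAddressOfTemporaryObject' are hypothetical names of mine. The sketch assumes that the returned pointer is used only within the statement that created the temporary object, because the temporary object is destroyed at the end of that statement.

@C++ Source Code
```cpp
#include <cassert>
#include <type_traits>

// The function from 'Orientation', which has a pointer argument.
void pointerArgumentFunction (int * a_integerPointer) {
	*a_integerPointer = 3 * 4;
}

// A hypothetical helper: the name of the parameter is an lexpression, so its address can be taken.
template <typename T> typename ::std::remove_reference <T>::type * addressOfArgument (T && a_argument) {
	return &a_argument;
}

// A hypothetical function that makes the change observable while the temporary object is still alive.
int writeAndReadBack (int * a_integerPointer) {
	pointerArgumentFunction (a_integerPointer);
	return *a_integerPointer;
}

void checkPassingAddressOfTemporaryObject () {
	pointerArgumentFunction (addressOfArgument (2 * 3)); // compiles, unlike "&(2 * 3)"
	assert (writeAndReadBack (addressOfArgument (2 * 3)) == 12); // the temporary object has really been modified
	int l_integer = 6;
	pointerArgumentFunction (addressOfArgument (l_integer)); // the same helper works for a non-temporary object
	assert (l_integer == 12);
}
```

That the helper is that trivial seems to support the view that forbidding "&(2 * 3)" does not accomplish much.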


5: The Conclusion and Beyond


Hypothesizer 7
Now, I seem to understand what "lvalue" and "rvalue" are.

Although I do not understand the necessity of distinguishing the two, I have to face the reality that some code is not allowed because of restrictions based on the distinction. As compilers give me errors about "lvalue" and "rvalue", I have to understand what they are.

In fact, being an rexpression has assumed another consequence because of the introduction of the so-called "move semantics" (another confusing term). I will study what "move semantics" is and what "glvalue", "xvalue", and "prvalue" are in the next article.

