Thursday, December 31, 2009

Friday, December 4, 2009

returning by reference faster than returning by value?

Is returning by reference faster than returning by value? It's always been my assumption that returning by value can in many cases be optimized away. However to test this assumption I wrote a couple of small tests:


By value:
#include <math.h>
#include <iostream>
#include <vector>

using namespace std;

vector<size_t> test() {
vector<size_t> t;

for(size_t n=0;n<100;n++) {
t.push_back(rand());
}

return t;
}

int main() {

size_t sum=0;
for(size_t n=0;n<10000000;n++){
vector<size_t> t = test();
for(size_t n=0;n<t.size();n++) {
sum = sum + t[n];
}
}
cout << sum << endl;

}



By reference:

#include <math.h>
#include <iostream>
#include <vector>

using namespace std;

void test(vector<size_t> &t) {
for(size_t n=0;n<100;n++) {
t.push_back(rand());
}
}

int main() {

size_t sum=0;
for(size_t n=0;n<10000000;n++){
vector<size_t> t;
test(t);
for(size_t n=0;n<t.size();n++) {
sum = sum + t[n];
}
}
cout << sum << endl;

}



Here are my timings on an Intel Core2Duo:

By value:
real 0m34.122s
user 0m33.219s
sys 0m0.436s

By reference:
real 0m34.147s
user 0m33.308s
sys 0m0.512s


So in this case the user times seem to be more or less the same. What's happening is the return value being optimized out as expected? Is there something else I should take in to account in my test?

i686-apple-darwin10-g++-4.2.1 was used.

http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/

Also interesting comment on gcc mailing list:

Your subject isn't quite right.  It should be "Returning by reference faster than returning by value".                                                                                                                                      

C++ has RVO ("return value optimization").

RVO is part of C++, just as the elide constructors is a part of C++.

Also note: you get RVO even if you are -O0. It really is part of the C++ language, even for non-optimized compiling.

So your two tests as given should have the about the same performance, since the two tests are basically doing the same thing. And, indeed, they do have about the same performance. As expected.

Read carefully the constraints upon when RVO can be applied. (I don't have a URL handy. You'll have to use your Google-fu, or read the ISO 14882 standard.)

In situations where RVO cannot be applied, then for non-trivial data types, an output reference parameter will (likely) be faster.

A lot of STL and Boost rely upon RVO for their template magic to be highly performant.

Wednesday, December 2, 2009

Reverting an svn commit

To revert an svn commit do something like:

svn merge -r 4552:4551 svn+ssh://svn/path_to_svn_location
svn commit

Where 4552 is the last commit and 4551 is the one before.

Tuesday, December 1, 2009

Simple quicksort implementation

A simple C++ quicksort implementation. This version is not in place and therefore requires O(n) extra space...

#include "math.h"
#include <vector>
#include <iostream>

using namespace std;

template<class value_type>
void quick_sort(vector<value_type> &values,size_t start,size_t end,size_t pivot) {

vector<value_type> less;
vector<value_type> greater;

for(size_t n=start;n<end;n++) {
if(n != pivot)
if(values[n] < values[pivot]) {
less.push_back(values[n]);
} else {
greater.push_back(values[n]);
}
}

value_type pivot_temp = values[pivot];
for(size_t n=0;n<less.size();n++) {
values[n+start] = less[n];
}

values[start+less.size()] = pivot_temp;

for(size_t n=0;n<greater.size();n++) {
values[n+start+less.size()+1] = greater[n];
}

if(less.size() > 1) { quick_sort(values,start ,start+less.size(),start+(less.size()/2)); }
if(greater.size() > 1) { quick_sort(values,less.size(),end ,less.size()+(end/2) ); }
}

int main() {

vector<size_t> values;

for(size_t n=0;n<10;n++) {
values.push_back(rand());
}

cout << "start values..." << endl;
for(size_t n=0;n<values.size();n++) {
cout << "val[" << n << "]: " << values[n] << endl;
}

quick_sort(values,0,values.size(),5);
cout << "sorted values..." << endl;
for(size_t n=0;n<values.size();n++) {
cout << values[n] << endl;
}
}