Friday, February 15, 2008

Configuring port forwarding in config for ssh

I often forward ports over ssh, and I make frequent use of ssh config files for the job.

On the command line I would run the following:
ssh -L 3128:localhost:3128 www.example.com -l username



However, the syntax in the ssh config file is not quite what you would expect, and it differs depending on whether it's an inbound or outbound forwarding. The following is the config for local forwarding:

Host example_proxy
User username
Hostname www.example.com
IdentityFile /home/user/.ssh/pwdlogin
Compression yes
CompressionLevel 7
Cipher blowfish
LocalForward 3128 localhost:3128



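Notice the awkwardly placed space: in the config file the local port and the destination are separated by a space rather than the colon used on the command line. Nowhere in the man page is this explained. Admittedly it's not much of a leap to work out that -L is LocalForward and -R is RemoteForward, but seriously, this could be in the man page.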

Friday, February 8, 2008

is not a template type

When compiling some code I'd been writing I received the error:

../bla.h:10: error: 'MyStruct' is not a template type

I had no idea what this meant: MyStruct is not a template type, but there was no reason it should be. It turns out that I'd forgotten the closing ; on a template class, and the header file that defines MyStruct was included directly below it. No idea why this should cause that problem, but there you go.
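A minimal sketch of the layout involved, with the semicolon in place. The names Holder and roundtrip are hypothetical, and the MyStruct body is only a stand-in for whatever bla.h really contains. One plausible explanation, which is an assumption rather than something checked against the compiler, is that without the ; the parser is still inside the template declaration when it reaches MyStruct, so it treats MyStruct as the thing being declared as a template.

```cpp
#include <cassert>

// A template class standing in for the one that was missing its ';'.
// Holder is a hypothetical name used only for illustration.
template <typename T>
class Holder {
public:
    explicit Holder(const T& v) : value_(v) {}
    const T& get() const { return value_; }

private:
    T value_;
};  // <- this is the ';' that had been forgotten

// Stand-in for the contents of bla.h, which was #included at this point.
struct MyStruct {
    int id;
};

// Small sanity check that both types are usable once the ';' is present.
inline int roundtrip(int id) {
    MyStruct s;
    s.id = id;
    Holder<MyStruct> h(s);
    assert(h.get().id == id);
    return h.get().id;
}
```

If the error shows up in a header you didn't touch, it is worth checking the end of whatever was declared or included just before it.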

Wednesday, February 6, 2008

C++ is slow?

I'm constantly confronted with the following two claims. Acting on them often produces less readable, C-style code, but I am told the result is faster and therefore better.

1. The STL is slow.

More specifically, vector. The argument goes like this:

Multidimensional arrays should be allocated as large contiguous blocks. This is so that when you are accessing the array and reach the end of a row, the next row will already be in the cache. You also don't need to spend time navigating pointers when accessing the array. So a 2-dimensional array of size 100x100 should be created like this:

const int xdim=100;
const int ydim=100;

int *myarray = (int *)malloc(xdim*ydim*sizeof(int)); /* the cast is required when compiling as C++ */

and accessed like this:

myarray[xdim*ypos+xpos] = avalue;

Is this argument reasonable? (Sounds reasonable to me, though the small tests I've performed don't usually show any significant difference).

To me this syntax looks horrible, especially compared to the C++ style vector.


2. iostream is slow.

I've encountered this at work recently. I'd not considered it before: I like the syntax and don't generally do much IO. I'm just now starting to process terabytes of data, so it'll become an issue. Is iostream slow? My own small tests showed it was approximately 8 times slower than stdio (printf) on a simple example outputting a bunch of 'A's to the standard output (which was then piped to /dev/null).
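The kind of comparison described above can be sketched roughly as follows. This is not the original test program: the function names write_with_stdio, write_with_iostream and seconds_taken are made up for illustration, and the timing uses std::chrono (C++11), which is an assumption about the toolchain.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <iostream>

// Write `count` copies of 'A' through the C stdio interface.
void write_with_stdio(std::FILE* out, std::size_t count) {
    assert(out != nullptr);
    for (std::size_t i = 0; i < count; ++i) {
        std::fprintf(out, "%c", 'A');
    }
}

// Write `count` copies of 'A' through a C++ stream.
void write_with_iostream(std::ostream& out, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        out << 'A';
    }
}

// Run `fn` once and return the elapsed wall-clock time in seconds.
template <typename Fn>
double seconds_taken(Fn fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

To repeat the experiment, call seconds_taken once around write_with_stdio(stdout, n) and once around write_with_iostream(std::cout, n) for some large n, report the times on stderr, and pipe standard output to /dev/null. The ratio will depend on the compiler, library and platform.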

I posted a thread on comp.lang.c++ to get some input and ammo for discussing these issues with my colleagues. You can read the full thread here:

http://groups.google.com/group/comp.lang.c++/browse_thread/thread/8b626ef9c2c312aa

Based on this discussion, the consensus seems to be that yes, iostream is slower than stdio, but that this is largely down to poor implementations. I guess if I want to know exactly why this is so, I'd need to dig around in the implementations. Probably the best way to go is to use iostreams for small amounts of IO and write my own custom, possibly platform-specific, IO code when speed is critical.

The array versus vector discussion seemed more problematic. A couple of people made the point that the difference isn't down to using vectors as such but to how you use them: what matters is allocating one large block versus lots of small ones. It's a good point to keep in mind, and it means there's still no advantage to C-style arrays here.
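To make the "one large block versus lots of small ones" distinction concrete, here is a small sketch using only std::vector. The helper names make_flat, make_nested and flat_index are hypothetical; the index formula is the same xdim*ypos+xpos used in the malloc version.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const std::size_t xdim = 100;
const std::size_t ydim = 100;

// One large block: every row lives in a single contiguous allocation.
std::vector<int> make_flat() {
    return std::vector<int>(xdim * ydim, 0);
}

// Lots of small blocks: each row is a separate vector with its own allocation.
std::vector<std::vector<int>> make_nested() {
    return std::vector<std::vector<int>>(ydim, std::vector<int>(xdim, 0));
}

// Map a 2-D position onto the flat block (row-major order).
inline std::size_t flat_index(std::size_t xpos, std::size_t ypos) {
    assert(xpos < xdim && ypos < ydim);
    return xdim * ypos + xpos;
}
```

With the flat version, flat[flat_index(xpos, ypos)] = avalue; touches the same memory layout as the C-style array, while the vector still takes care of freeing the block.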

The point was also made that if I decide to allocate large blocks and calculate indexes then I should wrap this functionality in a class. However, this class shouldn't use operator[], it should use operator(). I find this problematic because it makes it incompatible with vector... I'm still searching for the right solution. That the STL doesn't provide a good container for this seems odd. My conclusion so far is that using a Matrix class with an operator() is the way to go; it should be possible to extend this to n-dimensional matrices.
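A minimal sketch of what such a class might look like for the 2-dimensional case. This is one possible design under my own assumptions, not the one proposed in the thread: the member names, the row-major layout and the assert-based bounds check are all illustrative choices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A 2-D matrix stored as one contiguous block and indexed with operator().
template <typename T>
class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols, const T& init = T())
        : rows_(rows), cols_(cols), data_(rows * cols, init) {}

    // Element access; the index calculation is hidden from the caller.
    T& operator()(std::size_t row, std::size_t col) {
        assert(row < rows_ && col < cols_);
        return data_[row * cols_ + col];
    }

    const T& operator()(std::size_t row, std::size_t col) const {
        assert(row < rows_ && col < cols_);
        return data_[row * cols_ + col];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<T> data_;  // single allocation holding all rows back to back
};
```

Usage then reads Matrix<int> m(100, 100); m(ypos, xpos) = avalue; which keeps the contiguous layout without scattering index arithmetic through the calling code.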

Whether this is valuable or not seems to be an open question, and the answer is probably platform dependent. Any comments welcome!

Tuesday, February 5, 2008

do something a given number of times (bash)

For example, to run ./intensity2cycle ./s_1_0001_sig2.txt.filtered ./cycleNUMBER NUMBER for every NUMBER from 1 to 34:

for i in $(seq 1 34); do ./intensity2cycle ./s_1_0001_sig2.txt.filtered ./cycle$i $i; done