Tuesday, December 9, 2008

SCP fail

I was trying to copy a bunch of PDFs from my desktop to a server using scp, running the command on the server. This command failed:

/usr/bin/scp user@example.com:~/foo/bar/*.pdf Training/


It returned an error saying "/usr/bin/scp: No match.". This is because the wildcard (a shell glob, not a regex) is expanded by the shell on the local machine (in this case the server), where the files are not present and therefore not found.

Whereas this one worked:
/usr/bin/scp user@example.com:~/foo/bar/\*.pdf Training/

Notice that the wildcard is now escaped, so it is ignored by the shell on the server and only expanded when scp gets to the remote machine.

Sunday, November 23, 2008

Thursday, November 13, 2008

Join lines of two files

I want to join the lines of two files. So I have file1:

A
B
C
D
E
F


and file 2:

G
H
I
J
K
L


and I want to get:

AG
BH
CI
DJ
EK
FL


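Use awk like this (robbed and modified from various places on the internet). Here file1 is run1717_1731_1.end1.pf.fastq, file2 is run1717_1731_1.end2.pf.fastq and the result goes to run1717_1731_1_joined.fastq: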

awk '{str = $0 ; getline < "run1717_1731_1.end2.pf.fastq" ; print str $0 > "run1717_1731_1_joined.fastq"}' run1717_1731_1.end1.pf.fastq
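The same pairing can be sketched in C++ if awk isn't to hand. This is only a minimal sketch, not what was actually run: join_lines and join_example are names made up here, and it assumes you want to stop as soon as either input runs out of lines.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Writes line i of `first` followed directly by line i of `second`.
// Stops as soon as either input has no more lines.
void join_lines(std::istream& first, std::istream& second, std::ostream& out) {
    std::string a, b;
    while (std::getline(first, a) && std::getline(second, b)) {
        out << a << b << '\n';
    }
}

// Hypothetical demo: joins the first three lines of file1 and file2
// from the post, held in memory instead of on disk.
std::string join_example() {
    std::istringstream file1("A\nB\nC\n");
    std::istringstream file2("G\nH\nI\n");
    std::ostringstream joined;
    join_lines(file1, file2, joined);
    return joined.str();
}
```

With real files you would pass two std::ifstream objects and std::cout (or a std::ofstream) instead of the string streams.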

Wednesday, November 12, 2008

Replace a string in every other line starting with the first line:

sed '1~2s/foo/bar/g' filename


In my case, it was to reformat a Swift output file for use with MAQ (the 1~2 address, meaning every second line starting at line 1, is a GNU sed extension):

sed '1~2s/:end1:/:end:/g' run1717_1731_1.end1.pf.fastq > run1717_1731_1.end1.pf.fastq.c
sed '1~2s/:end2:/:end:/g' run1717_1731_1.end2.pf.fastq > run1717_1731_1.end2.pf.fastq.c
sed '1~2s/:end1:/:end:/g' run1717_1731_2.end1.pf.fastq > run1717_1731_2.end1.pf.fastq.c
sed '1~2s/:end2:/:end:/g' run1717_1731_2.end2.pf.fastq > run1717_1731_2.end2.pf.fastq.c
sed '1~2s/:end1:/:end:/g' run1717_1731_3.end1.pf.fastq > run1717_1731_3.end1.pf.fastq.c
sed '1~2s/:end2:/:end:/g' run1717_1731_3.end2.pf.fastq > run1717_1731_3.end2.pf.fastq.c
sed '1~2s/:end1:/:end:/g' run1717_1731_4.end1.pf.fastq > run1717_1731_4.end1.pf.fastq.c
sed '1~2s/:end2:/:end:/g' run1717_1731_4.end2.pf.fastq > run1717_1731_4.end2.pf.fastq.c
sed '1~2s/:end1:/:end:/g' run1717_1731_5.end1.pf.fastq > run1717_1731_5.end1.pf.fastq.c
sed '1~2s/:end2:/:end:/g' run1717_1731_5.end2.pf.fastq > run1717_1731_5.end2.pf.fastq.c
sed '1~2s/:end1:/:end:/g' run1717_1731_6.end1.pf.fastq > run1717_1731_6.end1.pf.fastq.c
sed '1~2s/:end2:/:end:/g' run1717_1731_6.end2.pf.fastq > run1717_1731_6.end2.pf.fastq.c
sed '1~2s/:end1:/:end:/g' run1717_1731_7.end1.pf.fastq > run1717_1731_7.end1.pf.fastq.c
sed '1~2s/:end2:/:end:/g' run1717_1731_7.end2.pf.fastq > run1717_1731_7.end2.pf.fastq.c
sed '1~2s/:end1:/:end:/g' run1717_1731_8.end1.pf.fastq > run1717_1731_8.end1.pf.fastq.c
sed '1~2s/:end2:/:end:/g' run1717_1731_8.end2.pf.fastq > run1717_1731_8.end2.pf.fastq.c

maq fastq2bfq run1717_1731_1.end1.pf.fastq.c run1717_1731_1.end1.pf.bfq &> make_lane1_end1_bfq&
maq fastq2bfq run1717_1731_1.end2.pf.fastq.c run1717_1731_1.end2.pf.bfq &> make_lane1_end2_bfq&

maq fastq2bfq run1717_1731_2.end1.pf.fastq.c run1717_1731_2.end1.pf.bfq &> make_lane2_end1_bfq&
maq fastq2bfq run1717_1731_2.end2.pf.fastq.c run1717_1731_2.end2.pf.bfq &> make_lane2_end2_bfq&

maq fastq2bfq run1717_1731_3.end1.pf.fastq.c run1717_1731_3.end1.pf.bfq &> make_lane3_end1_bfq&
maq fastq2bfq run1717_1731_3.end2.pf.fastq.c run1717_1731_3.end2.pf.bfq &> make_lane3_end2_bfq&

maq fastq2bfq run1717_1731_4.end1.pf.fastq.c run1717_1731_4.end1.pf.bfq &> make_lane4_end1_bfq&
maq fastq2bfq run1717_1731_4.end2.pf.fastq.c run1717_1731_4.end2.pf.bfq &> make_lane4_end2_bfq&

maq fastq2bfq run1717_1731_5.end1.pf.fastq.c run1717_1731_5.end1.pf.bfq &> make_lane5_end1_bfq&
maq fastq2bfq run1717_1731_5.end2.pf.fastq.c run1717_1731_5.end2.pf.bfq &> make_lane5_end2_bfq&

maq fastq2bfq run1717_1731_6.end1.pf.fastq.c run1717_1731_6.end1.pf.bfq &> make_lane6_end1_bfq&
maq fastq2bfq run1717_1731_6.end2.pf.fastq.c run1717_1731_6.end2.pf.bfq &> make_lane6_end2_bfq&

maq fastq2bfq run1717_1731_7.end1.pf.fastq.c run1717_1731_7.end1.pf.bfq &> make_lane7_end1_bfq&
maq fastq2bfq run1717_1731_7.end2.pf.fastq.c run1717_1731_7.end2.pf.bfq &> make_lane7_end2_bfq&

maq fastq2bfq run1717_1731_8.end1.pf.fastq.c run1717_1731_8.end1.pf.bfq &> make_lane8_end1_bfq&
maq fastq2bfq run1717_1731_8.end2.pf.fastq.c run1717_1731_8.end2.pf.bfq &> make_lane8_end2_bfq&

Monday, November 10, 2008

xmodmap pipe key

I wanted that key in the top left corner on the Mac keyboard (US using a UK layout) to be pipe. In X this command does it:

 xmodmap -e "keycode 49 = bar bar bar"


It remaps that key to pipe under all modifiers.

irssi and screen resize correctly

Do what Jimmy says:

[10:23] <@jtang> ctrl-a then shift f
[10:23] <@jtang> ctrl-a, shift-f
[10:23] <@jtang> that will do the trick

Friday, November 7, 2008

Reverse endianness in C++

Here you go kids:

#include <cstddef>

///\brief reverses the byte order (endianness) of any plain value, in place
template <typename T> inline
void reverse_endian(T& t){
  unsigned char* res = reinterpret_cast<unsigned char*>(&t);
  unsigned char temp[sizeof(T)]; // small stack buffer, no new/delete needed
  for(std::size_t n=0;n<sizeof(T);n++) temp[sizeof(T)-1-n] = res[n];
  for(std::size_t n=0;n<sizeof(T);n++) res[n] = temp[n];
}



Have fun yall!

Update: now with added working!
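For what it's worth, the standard library can do the byte shuffling itself. The following is a minimal sketch using std::reverse; byte_swapped is a name made up here, it returns a swapped copy rather than modifying its argument, and it only makes sense for plain arithmetic types such as fixed-width integers.

```cpp
#include <algorithm>
#include <cstdint>

// Returns a copy of value with its bytes in reverse order.
// Only meaningful for plain arithmetic types such as fixed-width integers.
template <typename T>
T byte_swapped(T value) {
    unsigned char* bytes = reinterpret_cast<unsigned char*>(&value);
    std::reverse(bytes, bytes + sizeof(T));  // swap first/last, second/second-last, ...
    return value;
}
```

For example, byte_swapped<std::uint32_t>(0x11223344) gives 0x44332211, whatever the endianness of the machine it runs on.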

Friday, October 31, 2008

Log memory used by a process over time

Messy script. Every 5 seconds this logs the memory usage of the process called "swift" to the file called "log":

while true;do ps -euf | grep "./swift --align" | grep -v grep | awk '{print $6}' >> log;sleep 5; done

Tuesday, October 28, 2008

Beagleboard HandheldsMojo no serial console

On first boot of Handhelds Mojo (installed using the instructions from: http://elinux.org/BeagleBoardHandheldsMojo) the serial console doesn't appear to be set up. To set it up, mount the SD card on your PC and add a file at /etc/event.d/ttyS2 with the following contents:

# ttyS2 - getty
#
# This service maintains a getty on ttyS2 from the point the system is
# started until it is shut down again.

start on runlevel 2
start on runlevel 3

stop on runlevel 0
stop on runlevel 1
stop on runlevel 4
stop on runlevel 5
stop on runlevel 6

respawn
exec /sbin/getty -L 115200 ttyS2



Also add ttyS2 to /etc/securetty.

mmcinit causes reboot on beagleboard

mmcinit causes a beagleboard to reboot if it can't draw enough power. I had this problem when powering the beagleboard over USB from an unpowered hub. Plugging directly into the PC solved the problem.

Thursday, October 9, 2008

last.fm-ripper: Undefined argument in option spec

I couldn't get last.fm-ripper to work; it kept giving the error "Undefined argument in option spec". I commented out a bunch of stuff and then it worked again, watevar! Here are the random changes I made:

GetOptions(
#'help|?' => \$help,
# 'debug|d' => \$debug,
# 'no_covers|n' => $no_covers,
'artist|a=s' => \$artist,
'username|u=s' => \$username,
'password|p=s' => \$password,
'output_dir|o=s' => \$output_directory#,
#'aws_token|w=s' => \$aws_token
);




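I don't know perl and I don't want to know perl. That said, the likely culprit is the no_covers line: it passes $no_covers rather than a reference (\$no_covers) like the other options do, so GetOptions is probably handed an undefined value where it expects the next option spec.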

Saturday, September 20, 2008

Grab a set of tile images from an Illumina/Solexa run folder

This script copies the set of images for a single tile in an Illumina run folder and creates another run folder containing just that image set. It's useful if you want to grab an image set to process later, or to save a representative sample.

#!/bin/bash

SOURCE=$1
LANE=$2
TILE=$3
DESTINATION=$4   # use an absolute path, the script changes directory below

echo "Source     : $SOURCE"
echo "Lane       : $LANE"
echo "Tile       : $TILE"
echo "Destination: $DESTINATION"

mkdir -p "$DESTINATION/Images/L00$LANE"

cd "$SOURCE/Images/L00$LANE" || exit 1

# recreate the cycle directories in the destination
find . -type d -exec mkdir -p "$DESTINATION/Images/L00$LANE/{}" \;

# copy the images for this lane and tile, cycle by cycle
for ((CYCLE=1; CYCLE<=100; CYCLE++))
do
  find "./C$CYCLE.1/" -name "s_${LANE}_${TILE}_*" -exec cp {} "$DESTINATION/Images/L00$LANE/{}" \;
done



Example:

./grabtile /staging/IL18/outgoing/080910_IL18_1380 4 100 $HOME/1380_4_100

Thursday, September 18, 2008

How do you get subversion to post directly to a blog?

I'm being proactive here.
[09:47] < nobodycares> can you create a hook into svn so that it tracks commits and
publishes the comments somewhere like a blog?
[09:47] < new> yes
[09:47] < new> easy way would be to get the svn to email in to the blog
[09:47] < nobodycares> I'm thinking that instead of a lab notebook you could have a blog of the svn
[09:47] < nobodycares> commit messages
[09:48] < nobodycares> how easy is it?
[09:48] < new> you can publish straight from an email address with, for example, blogspot
[09:48] < new> so you'd just tell the svn hook to mail that address
[09:48] < new> and it should all work
[09:48] < new> setting up a svn hook to send an email is easy
[09:48] < new> you need to have the MTA or whatever on your computer setup correctly mind
[09:49] < new> on the svn server that is
[09:51] < nobodycares> okay
[09:51] < nobodycares> I'm gonna forget about this. And ask you again in six months or so.
[09:52] < new> ok cool
[09:52] < new> so long as I know.


Anybody care to weigh in? The simplest way, to me, would seem to be to set up an email hook and get it to email your blog directly (I guess it will depend on your blog software). Blogspot, for example, supports sending to a special email address which gets automagically posted to the blog. It's not very secure but it would work.

Sorting by a different comparison method in C++ (STL)

You have a list of numbers in C++ that you want to sort, but you don't want to sort them in simple numerical order. You might, for example, want to sort them by their deviation from 0. Here's how you do it using the STL:

#include <algorithm>
#include <cstdlib>
#include <iostream>
#include <vector>

using namespace std;

class compare_deviation
{
public:
compare_deviation(int d) : deviation(d) {
}

bool operator ()(const int p1,const int p2) const
{

int v1 = abs(deviation - p1);
int v2 = abs(deviation - p2);
return(v1 < v2);
}

int deviation;
};


int main() {


vector<int> numbers;

numbers.push_back(1);
numbers.push_back(-10);
numbers.push_back(-9);
numbers.push_back(-5);
numbers.push_back(0);
numbers.push_back(5);
numbers.push_back(6);
numbers.push_back(7);
numbers.push_back(10);
numbers.push_back(11);

compare_deviation comp(0);

cout << "Unsorted numbers: ";

for(vector<int>::iterator i=numbers.begin();i != numbers.end();i++) cout << " " << (*i);
sort(numbers.begin(),numbers.end(),comp);
cout << endl;

cout << "Sorted numbers: ";
for(vector<int>::iterator i=numbers.begin();i != numbers.end();i++) cout << " " << (*i);
cout << endl;

return 0;
}



Bang On!
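With a newer compiler the comparison class can be replaced by a lambda. This is a minimal sketch assuming C++11 or later, which post-dates this post; sorted_by_deviation is a name made up here, and it uses std::stable_sort (rather than the std::sort used above) so that values equally far from the centre keep their original order.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// Returns the numbers ordered by their distance from centre (closest first).
// std::stable_sort keeps equally distant values in their original order.
std::vector<int> sorted_by_deviation(std::vector<int> numbers, int centre) {
    std::stable_sort(numbers.begin(), numbers.end(),
                     [centre](int a, int b) {
                         return std::abs(centre - a) < std::abs(centre - b);
                     });
    return numbers;
}
```

Called on the numbers from the program above with a centre of 0, this returns 0 1 -5 5 6 7 -9 -10 10 11.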

Thursday, September 11, 2008

Urxvt and other terminal fun

Here's my .Xdefaults for urxvt, it's very simple:

! urxvt*font: xft:Terminus:pixelsize=12
! urxvt*boldFont: xft:Terminus:pixelsize=12
urxvt*font: -*-profont-*-*-*-*-11-*-*-*-*-*-iso8859-*
urxvt*boldFont: -*-profont-*-*-*-*-11-*-*-*-*-*-iso8859-*


URxvt.keysym.M-y: perl:mark-and-yank:activate_mark_mode
URxvt.keysym.M-u: perl:mark-and-yank:activate_mark_url_mode
URxvt.perl-lib: /home/new/.urxvt/
URxvt.perl-ext-common: tabbed,mark-and-yank
URxvt.urlLauncher: firefox

urxvt*borderLess: false
urxvt*externalBorder: 0
urxvt*internalBorder: 2
urxvt*scrollBar: false
urxvt*depth: 32
urxvt*background: rgba:0000/0000/0000/0000
urxvt*foreground: #ffffff
urxvt*cursorColor: #ffff00
urxvt*cursorColor2: #00ffff
urxvt*cursorBlink: true
urxvt*inheritPixmap: false
urxvt*termName: rxvt-unicode
urxvt*saveLines: 65535
urxvt*scrollTtyOutput: true
urxvt*scrollTtyKeypress: true
urxvt*mouseWheelScrollPage: true
urxvt*cutchars: `'",;@&*=|?()<>[]{}



I've also added an alias to my .bashrc to clear the scrollback, which is useful:

alias cls='tput reset'



BBC Micro FTW!

Thursday, August 14, 2008

Adding docs to external jars in eclipse

If you have some external jars (in my case the GWT libraries) and you want to add the handy javadoc to them, it's not exactly obvious how. In the Package Explorer view, navigate to "Referenced Libraries". Unfold this tree, then right click on the relevant library. Click on properties and then javadoc. Add the directory and click validate just to check. Bingo - now the javadoc will appear in the tooltip in the main editor.

Friday, July 25, 2008

Count chars

We might have had this one before but I couldn't find it.

cat commafile.txt | tr -dc ',' | wc -c

The ',' could be replaced by any character, and commafile.txt by any file. So, basically, count the number of occurrences of the character ',' in the file and print it out. Why? you ask - cos I gotta.

Wednesday, July 2, 2008

Remote SVN access for Sanger Institute

As usual this config is ripped from various parts of the web, but it allows me to remotely access the SVN at the Sanger Institute. Place this configuration in ~/.ssh/config:

Host sangerTunnel
HostName ssh.sanger.ac.uk
Port 22

# Subversion Server
LocalForward localhost:2222 svn:22

###Hostname alias directives###
#These allow you to mimic hostnames as they appear at work.
#We just take the localhost names from the above section and add alias names.
#Note that you don't need to use a FQDN; you can use a short name, such as smtp instead of smtp.pretendco.com.

Host svn.internal.sanger.ac.uk
HostName localhost
User YOURUSERNAME
Port 2222
#End Config File




Then type:

ssh -v sangerTunnel -lYOURUSERNAME


to set up the tunnel. You will need to log in (and connect to a host). You can then check out your files as if you were connected to the local network.

Here is the same for EMBL (I couldn't get correctly formatted code into a comment):
I wanted to do the same to access my machine at EMBL. Here is how.

Host emblTunnel
HostName ssh-proxy.embl.de
Port 22

# Subversion Server
LocalForward localhost:2222 svn:22

###Hostname alias directives###
#These allow you to mimic hostnames as they appear at work.
#We just take the localhost names from the above section and add alias names.
#Note that you don't need to use a FQDN; you can use a short name, such as smtp instead of smtp.pretendco.com.
Host mymachine
HostName localhost
Port 2222
HostKeyAlias mymachine
User nobodycares


You then type:
ssh emblTunnel

And then:
ssh mymachine

And then it should be possible to scp stuff using scp mymachine:/path/to/file

Monday, June 23, 2008

base64 decode

I was playing around with launching the taverna workflow engine programmatically. The answers it spat back out all looked slightly weird.

IyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIwojIFByb2dyYW06IGFudGln
ZW5pYwojIFJ1bmRhdGU6IFRodSAxOSBKdW4gMjAwOCAxMjoyNDo0MgojIENvbW1hbmRsaW5lOiBh
bnRpZ2VuaWMKIyAgICAtc2VxdWVuY2Ugc3dpc3Nwcm90OlAwNDYzNwojICAgIC1vdXRmaWxlICIv
ZWJpL2V4dHNlcnYvc29hcGxhYi13b3JrL3NvYXBsYWIyX2VtYm9zczQvU0FOREJPWC9bcHJvdGVp
bl9tb3RpZnMuYW50aWdlbmljXS0yNTc4NTMyNi


According to the taverna user group this is cos the answer is encoded in base64. To see how it should look, stick it through mmencode:
mmencode -u name.txt

A somewhat more comprehensible answer should emerge:
########################################
# Program: antigenic
# Rundate: Thu 19 Jun 2008 12:24:42
# Commandline: antigenic
# -sequence swissprot:P04637
# -outfile "/ebi/extserv/soaplab-work/soaplab2_emboss4/SANDBOX/[protein_motifs.antigenic]-25785326.11a9e381fc5.67f8/o_outfile"
# -auto
# Report_format: motif
# Report_file: /ebi/extserv/soaplab-work/soaplab2_emboss4/SANDBOX/[protein_motifs.antigenic]-25785326.11a9e381fc5.67f8/o_outfile
########################################

#=======================================
#
# Sequence: P53_HUMAN from: 1 to: 393
# HitCount: 15
#=======================================



Now to find the java libraries to do this in code.
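The post goes looking for Java libraries; purely as an illustration of what the decoding involves, here is a sketch in C++ (to match the other code on this blog). base64_decode is a name made up here, and it is deliberately forgiving: anything outside the base64 alphabet, such as newlines and '=' padding, is skipped rather than reported as an error, which is an assumption made for brevity.

```cpp
#include <string>

// Decodes standard base64 text. Each input character carries 6 bits;
// a byte is emitted as soon as 8 bits have been collected.
// Characters outside the alphabet (newlines, '=' padding) are skipped.
std::string base64_decode(const std::string& in) {
    static const std::string alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    unsigned int buffer = 0;  // bits collected so far (only the low bits matter)
    int bits = 0;             // how many of those bits are not yet emitted
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        std::string::size_type value = alphabet.find(in[i]);
        if (value == std::string::npos) continue;  // not a base64 character
        buffer = (buffer << 6) | static_cast<unsigned int>(value);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out += static_cast<char>((buffer >> bits) & 0xFF);
        }
    }
    return out;
}
```

The first four characters of the blob above, IyMj, decode to "###", which matches the start of the mmencode output.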

Saturday, June 21, 2008

Vim C++ Autocompletion

Download and install vimcppomnicomplete: http://www.vim.org/scripts/script.php?script_id=1520


Install ctags (not the Emacs version): http://ctags.sourceforge.net/

In /usr/include type:

ctags -f ~/.vim/stdtags -R --c++-kinds=+p --fields=+iaS --extra=+q .



Add the following to ~/.vimrc:


set nocp
filetype plugin on
map <C-L> :!ctags -R --c++-kinds=+p --fields=+iaS --extra=+q .<CR><CR>

set tags=~/.vim/stdtags,tags,.tags,../tags

autocmd InsertLeave * if pumvisible() == 0|pclose|endif


The final line closes the completion box when you leave insert mode. Ctrl-L will update the tags for files in the current directory (and so let you autocomplete based on new files). Thanks go to the various places on the interwebs I robbed this info from.

Thursday, June 12, 2008

Truecrypt commands

Given that I will probably forget all this I thought it best to write it down.

Create the volume:
truecrypt --create test.tc 


Mount that volume:
truecrypt test.tc /mnt/data/


Unmount it:
truecrypt --dismount test.tc


All stolen from here:
http://www.movingtofreedom.org/2007/02/10/truecrypt-in-ubuntu-and-fedora-gnu-linux/

Monday, April 28, 2008

Average a bunch of lines (C++)

A short C++ program to average a bunch of entries on a bunch of lines, i.e.:




1 2 3 4 5 6 END OF LINE

7 8 9 10 11 12 END OF LINE

13 14 15 16 17 18 END OF LINE



would average 1, 7 and 13; 2, 8 and 14; etc...

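#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

using namespace std;

int main(int argc,char **argv) {
  if(argc < 2) { cerr << "usage: " << argv[0] << " filename" << endl; return 1; }

  ifstream infile(argv[1]);

  int count=0;
  vector<double> sums;   // one running total per column, grown as needed
  string instring;
  while(getline(infile,instring)) {   // stops cleanly at end of file
    stringstream ss(instring);

    double cur;
    size_t n=0;
    for(;ss >> cur;n++) {   // stops at the first non-numeric entry
      if(n >= sums.size()) sums.push_back(0);
      sums[n] += cur;
    }
    if(n > 0) count++;   // blank lines don't count
  }

  for(vector<double>::iterator i = sums.begin();i != sums.end();i++) {
    cout << (*i)/count << " ";
  }
  cout << endl;

  return 0;
}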

print 5th through 40th entries on a line using awk

awk '{for(n=5;n<=40;n++) printf "%s ", $n; printf "\n"}'

Friday, April 25, 2008

grep for a tab character

Did I do this already?

You need to get a tab onto the console. On many consoles, press ctrl-V then tab and a literal tab will be inserted, which you can then quote in the grep pattern as usual. In bash you can also write the pattern as $'\t'.

Wednesday, April 23, 2008

Pull files out of a runfolder and stick them here named by cycle

Won't mean anything to anyone but me probably...

for ((i=1; i<=36; i++)); do for j in ./C$i.1/*; do cp $j ./`basename $j .tif`_$i.tif; done; done


Make a GA style folder:

for ((i=1; i<=36; i++)); do mkdir ./lane1tile1/C$i.1; for j in ./C$i.1/s_1_1_*; do cp $j ./lane1tile1/C$i.1/`basename $j`; done; done

map const accessor

I wanted to access a map in a const method. However, the default [] accessor doesn't allow you to do that, because it will create a new entry in the map if the thing you are looking for doesn't exist. This program demonstrates the problem (it won't compile):

#include <iostream>
#include <vector>
#include <map>

using namespace std;

class MyMapEncap {
public:

map<string,vector<int> > mymap;

const vector<int> &getvec(string id) const {
return mymap[id];
}
};

int main() {
MyMapEncap e;

e.getvec("RAW");

}



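The solution is to use the find method, which returns an iterator to the thing you're looking for. Note that find returns end() when the key is missing, and dereferencing that is undefined, so this only works if the key is there. In the above example it looks like this: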

#include <iostream>
#include <vector>
#include <map>

using namespace std;

class MyMapEncap {
public:

map<string,vector<int> > mymap;

const vector<int> &getvec(string id) const {
return (*mymap.find(id)).second;
}
};

int main() {
MyMapEncap e;

e.mymap["RAW"].push_back(1); // make sure the key exists before looking it up
e.getvec("RAW");

}
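The find-based accessor assumes the key is always there. A minimal sketch of a variant that copes with a missing key is below; findvec is a name made up here, it returns a pointer instead of a reference so that "not found" can be signalled, and nullptr assumes a C++11 or later compiler.

```cpp
#include <map>
#include <string>
#include <vector>

class MyMapEncap {
public:
    std::map<std::string, std::vector<int> > mymap;

    // Returns a pointer to the stored vector, or nullptr if id is not in the map.
    // find never inserts, so the method can stay const.
    const std::vector<int>* findvec(const std::string& id) const {
        std::map<std::string, std::vector<int> >::const_iterator it = mymap.find(id);
        if (it == mymap.end()) return nullptr;
        return &it->second;
    }
};
```

The caller then checks the pointer before using it, e.g. if (const std::vector<int>* v = e.findvec("RAW")) { ... }.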



Sunday, April 20, 2008

Stripping HTML

I wanted to download this and convert it to a flat text file and a basic html file. Here's the script to do it: it basically runs it through html2text, skips everything up to "ACKNOWLEDGEMENTS", then removes any line with "continue..." at the start. Then it strips off the first and last lines (which are blank).

html2text -nobs ch1-a.html | awk '{if(p==1) if($1 != "continue...") print $0; if($1 == "ACKNOWLEDGEMENTS") p=1; n++;}' | sed '1d;$d' >> ch1-a.txt


Here's the complete version which iterates over all the chNUMBER-* files in a directory, creates a single textfile for each chapter then converts that to html:

#!/bin/bash

for ((chapnum=1; chapnum<=24; chapnum++)) do
for i in ./ch$chapnum-*
do
html2text -nobs $i | awk '{if(p==1) if($1 != "continue...") print $0; if($1 == "ACKNOWLEDGEMENTS") p=1; n++;}' | sed '1d;$d' >> $chapnum.txt
done
txt2html $chapnum.txt > $chapnum.html
done

Saturday, April 19, 2008

OM digital back


I've been wanting to play with the idea of a digital conversion for old manual SLRs for a while. I don't know if it's a practical proposition but it's a fun project. The biggest difficulty will probably be getting and interfacing a large enough CCD.



As a first step I've been playing with cheap Labtec webcams. I bought a bunch of these for something like 3 quid each from ebuyer. They are only 320x240 and 25fps but they'll do for a proof of concept. They also have good Linux support which is a plus.



Here's an image from the webcam:




I'm only interested in the CCD, not the optics so I strip them down:




and wrap them in a bunch of insulating tape so nothing shorts out:






OK, now we need to mod the OM itself. I'm using an OM2 here, it doesn't really matter which OM you use, for our purposes it's really just a mount for the lens. So to start with we need to jam the shutter open. OM cameras have "B" mode. This keeps the shutter open as long as the trigger is depressed. To constantly depress the trigger without hacking the camera I put a screw in the trigger and used a bunch of cable ties to pull it down.







Next mod the camera back.







Remove the back, and the sprung plate that lies against the film.











Next I drilled out the back. I used my Dremel. Dremels are really cool for this kind of thing. I used the pillar attachment, they call it the "workstation":





When you're cutting out a hole for the CCD you better use those goggles!





Final cut out back should look something like this (but better):







Now put it all together:











That's it! It works OK. Obviously, because the CCD is so small compared to 35mm film, you end up imaging only a small part of the full frame. It basically looks like it's zoomed in all the time. Here's an example image:









Next up I'll try a better CCD. I've bought a Philips SPC900NC for this purpose, it's a really nice webcam (1.3 megapixel, and says it can do 60fps). The images look really nice...



Ideally I'd like to interface all this to an embedded processor like a gumstix, so rather than being tethered to a PC it's more like a real digital camera.

Monday, April 14, 2008

Dependent names

Using a template argument as a template argument for something else sometimes doesn't work, for example the following code:

#include <iostream>

using namespace std;

template<class _prec=double>
class myclass {
public:

typedef int atype;
};

template<class _prec=double>
class myclass2 {
public:

_prec method() {
myclass<_prec>::atype t = 0;
return t;
}
};

int main() {

myclass2<> m;

}


Gives this compilation error under g++:

templateargasarg.cpp: In member function '_prec myclass2<_prec>::method()':
templateargasarg.cpp:17: error: expected `;' before 't'
templateargasarg.cpp:18: error: 't' was not declared in this scope


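It's something to do with dependent names, google it. I don't fully understand it, but the gist is that the compiler can't know myclass<_prec>::atype names a type until _prec is known, and so assumes it doesn't. As usual a typename fixes it, i.e.: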

    myclass<_prec>::atype t = 0;


becomes:

   typename myclass<_prec>::atype t = 0;


and all is well. My comp.lang.c++ thread is here.
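The reason the compiler won't just assume a type is that a specialisation is free to give the same name a different meaning. A minimal sketch of that is below; holder, use_as_type and use_as_value are names made up for illustration and are not part of the program above.

```cpp
// Primary template: atype is a type.
template <class T>
struct holder {
    typedef int atype;
};

// Specialisation for char: the same name is a value, not a type.
template <>
struct holder<char> {
    static const int atype = 7;
};

// typename tells the compiler to parse holder<T>::atype as a type.
template <class T>
int use_as_type() {
    typename holder<T>::atype t = 3;
    return t;
}

// Without typename the dependent name is parsed as a value,
// which is exactly what is wanted here.
template <class T>
int use_as_value() {
    return holder<T>::atype * 2;
}
```

use_as_type<double>() compiles because of the typename, and use_as_value<char>() compiles without it because there atype really is a value.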

Tuesday, April 1, 2008

grep with OR

You need to quote the search string and escape | for it to be used as an OR operator, e.g.:

grep "Offset\|Stretch\|Alignment" superamazingfile

Count number of A,T,G and Cs in first base of each sequence in a fastq file

awked!

awk '{if(n%4 == 1) print $0;n++;}' s_4_sequence.txt | sort | awk '{first = substr($0,4,1); if(first=="A") as++; if(first=="T") ts++; if(first=="C") cs++; if(first=="G") gs++;}END{print as; print ts; print gs; print cs;}'


You don't need the sort, and the fastq file is called s_4_sequence.txt here.

Monday, March 31, 2008

Formatting text for reading on small devices

I sometimes want to download files, say html, and convert them to txt format to read on a small portable device (an n770 or cough cough iPhone cough cough sellout cough).

For html files this is the command I use on Mac OS X (textutil is an OS X tool, so on Linux you'd need something like html2text instead).

textutil -convert txt -strip printableArticle.jhtml.html 


The -strip should remove most of the html tags and preserve the formatting. I suggest using the printable version of files from the interwebs, as this usually has the complete text and usually has no ads. There should also be fewer links, markup, pictures etc. I guess a more unix approach should appear in the comments.

For pdfs I use the pdf tools.

pdftotext -layout filename.pdf


The -layout ensures that you should have most of the formatting intact. FBreader is an excellent program for reading on the Nokia tablets.

Template static member brace-enclosed initializer

Or something...

#include <iostream>

using namespace std;

template<class _prec=double>
class myclass {
public:
static const string mystuff[];

_prec avalue;
};

template<class _prec>
const string myclass<_prec>::mystuff[] = {"Athing", "Another","Things"};

int main() {

myclass<> m;

cout << m.mystuff[0] << endl;
}

Friday, February 15, 2008

Configuring port forwarding in config for ssh

I often run ssh forwarding ports. I also make use of config files quite often for this job.

I often run the following command to do this:
ssh -L 3128:localhost:3128 www.example.com -l username



However, the syntax in the ssh config file is not quite what you would expect, and differs depending on whether it's an inbound or outbound forwarding. The following is the config for local forwarding:

Host example_proxy
User username
Hostname www.example.com
IdentityFile /home/user/.ssh/pwdlogin
Compression yes
CompressionLevel 7
Cipher blowfish
LocalForward 3128 localhost:3128



Notice the awkwardly placed space. Nowhere in the man page is this explained. Admittedly it's not much of a leap of faith to work out that -L is LocalForward and -R is RemoteForward, but seriously, this could be in the man page.

Friday, February 8, 2008

is not a template type

When compiling some code I'd been writing I received the error:

../bla.h:10: error: 'MyStruct' is not a template type

I have no idea what this means, it's not a template type (but there was no reason it should be). Turns out that I'd forgotten the closing ; on a template class below which the header file which defined MyStruct was included. No idea why this should cause that problem but there you go.

Wednesday, February 6, 2008

C++ is slow?

I'm constantly confronted with the following two claims, which I believe often lead to less readable, C style code, but which I am told make things faster and therefore better.

1. The STL is slow.

More specifically vector. The argument goes like this:

"Multidimensional arrays should be allocated as large contiguous blocks. This is so that when you are accessing the array and reach the end of a row, the next row will already be in the cache. You also don't need to spend time navigating pointers when accessing the array. So a 2 dimensional array of size 100x100 should be created like this:

const int xdim=100;
const int ydim=100;

int *myarray = (int*)malloc(xdim*ydim*sizeof(int)); // the cast is required in C++

and accessed like this:

myarray[xdim*ypos+xpos] = avalue;"

Is this argument reasonable? (Sounds reasonable to me, though the small tests I've performed don't usually show any significant difference).

To me this syntax looks horrible, especially compared to the C++ style vector.


2. iostream is slow.

I've encountered this at work recently. I'd not considered it before, I like the syntax and don't do so much IO generally... I'm just now starting to process terabytes of data, so it'll become an issue. Is iostream slow? My own small tests showed it was approximately 8 times slower than stdio (printf) on a simple example outputting a bunch of 'A's to the standard output (which was then piped to /dev/null).

I posted a thread on comp.lang.c++ to get some input and ammo for discussing these issues with my colleagues. You can read the full thread here:

http://groups.google.com/group/comp.lang.c++/browse_thread/thread/8b626ef9c2c312aa

Based on this discussion, the consensus seems to be that yes, iostream is slower than stdio but that it's largely down to poor implementations. I guess if I want to know exactly why this is so, I'd need to dig around in the implementations. Probably the best way to go is use iostreams for small amounts of IO and write my own custom, possibly platform specific, IO code when speed is critical.

The array versus vector discussion seemed more problematic. A couple of people made the point that actually it's not down to using vectors, it's how you use them, it's basically down to allocating large blocks versus lots of small ones. It's a good point to keep in mind, there's still no advantage to C style arrays here.

The point was also made that if I decide to allocate large blocks and calculate indexes then I should wrap this functionality in a class. However, this class shouldn't use the operator[], it should use operator(). I find this problematic because it makes it incompatible with vector... I'm still searching for the right solution. That the STL doesn't provide a good container for this seems odd. My conclusion so far is that using a Matrix with an operator() is the way to go; it should be possible to make this extensible to n-dimensional matrices.

Whether this is valuable or not seems to be an open question, and probably platform dependent. Any comments welcome!
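To make the Matrix idea concrete, here is a minimal sketch of what such a class might look like. It is an assumption about the shape of the thing rather than the code from the thread: the class and member names are made up here, there is no bounds checking, and it only covers the 2-dimensional case.

```cpp
#include <cstddef>
#include <vector>

// A 2D array stored as one contiguous block, indexed with operator()(x, y).
template <class T>
class Matrix {
public:
    Matrix(std::size_t xdim, std::size_t ydim)
        : xdim_(xdim), ydim_(ydim), data_(xdim * ydim) {}

    // Same index calculation as myarray[xdim*ypos+xpos], hidden in one place.
    T& operator()(std::size_t x, std::size_t y) { return data_[xdim_ * y + x]; }
    const T& operator()(std::size_t x, std::size_t y) const { return data_[xdim_ * y + x]; }

    std::size_t xdim() const { return xdim_; }
    std::size_t ydim() const { return ydim_; }

private:
    std::size_t xdim_;
    std::size_t ydim_;
    std::vector<T> data_;  // one allocation, rows laid out one after another
};
```

Usage then reads Matrix<int> m(100, 100); m(xpos, ypos) = avalue; and the storage is still a single vector, so the large-block allocation argument and the C++ syntax both survive.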

Tuesday, February 5, 2008

do something a given number of times (bash)

Like this (for NUMBER from 1 to 34, run ./intensity2cycle ./s_1_0001_sig2.txt.filtered ./cycleNUMBER NUMBER):

for i in $(seq 1 34); do ./intensity2cycle ./s_1_0001_sig2.txt.filtered ./cycle$i $i; done

Wednesday, January 23, 2008

do some command on all files matching pattern

I keep forgetting how to do this:

for i in *sequence; do ~/code/myprog $i $i.out; done


This runs myprog on all files ending in sequence; myprog's second argument is the filename followed by .out.