Wednesday, March 21, 2007

Counting the number of instances of a letter in a file

To explain why this might be necessary:

First, I have an XML file with some sequences in it. I grep the XML file for the entries I am interested in, save them to a file, and then strip out the XML tags using vim.

grep -A 5 \<type\>Dis filename.xml | grep \<sequence\> > file_raw_seq.txt
Alternatively, I could use the FASTA file and strip the header lines using grep -v \>.

I am now left with the raw text of the sequences from the xml file that I am interested in.
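
The tag stripping I did by hand in vim could probably be folded into the pipeline as well. This is only a rough sketch, assuming each sequence sits on one line wrapped in <sequence> tags; the sample data and file names below are made up for illustration.

```shell
# Scratch directory and made-up sample standing in for the grep output.
workdir=$(mktemp -d)
printf '%s\n' '  <sequence>MKVLAM</sequence>' '  <sequence>GGMTA</sequence>' > "$workdir/tagged_seq.txt"

# Delete anything that looks like a tag, then strip leading whitespace.
sed -e 's/<[^>]*>//g' -e 's/^[[:space:]]*//' "$workdir/tagged_seq.txt" > "$workdir/stripped_seq.txt"

cat "$workdir/stripped_seq.txt"
```

This would not cope with tags that span several lines, so check the result by eye on real data.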

I can now run the following command to count the number of times, say, M appears in the file (input.fa below is whatever file holds the raw sequences):

tr -dc M < input.fa | wc -c
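
To see why this works: tr -d deletes characters and -c complements the set, so tr -dc M deletes everything that is not an M (newlines included), and wc -c counts the bytes left over. Here is a small check on made-up data; the sample file and the choice of letters are purely illustrative.

```shell
# Made-up sample sequences in a temporary file.
sample=$(mktemp)
printf 'MKVLAM\nGGMTA\n' > "$sample"

# Count a single letter.
count=$(tr -dc M < "$sample" | wc -c)
echo "M: $((count))"   # arithmetic expansion trims any padding from wc

# The same idea looped over a few letters.
for aa in M K G; do
    n=$(tr -dc "$aa" < "$sample" | wc -c)
    echo "$aa: $((n))"
done
```

Note that this counts every occurrence of the character, so run it on header-free sequence text only.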


Easy. The alternative was to write a C++ program to do all this, which would have taken considerably longer, especially given that some people haven't committed working versions of their code. I'm looking at you, Mr. TreeCreate and Mr. IntVector.

Thursday, March 1, 2007

Sticky

Further to the backup stuff I have been documenting recently, I was told of this useful (?) little titbit. I have a spare disk, mounted locally on my workstation as /dataStore. Other people log in to my computer to compile stuff, maybe run some stuff, etc., so I am letting them back their stuff up locally on my computer. To give them the appropriate permissions on the data disk, I have made a directory on this disk called, imaginatively enough, backup.

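I have made this directory group writable so that everyone can use this disk. However, in order to ensure that no one fucks up anyone else's data, I have set the sticky bit on the directory. With the sticky bit set, files in the directory can only be deleted or renamed by their owner, the directory's owner, or root. To do this, type: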

chmod +t dir
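
Putting the two steps together, here is a rough sketch that uses a scratch directory in place of /dataStore/backup so it is safe to run anywhere. The scratch path is hypothetical, and the resulting mode depends on your umask.

```shell
# Scratch directory standing in for /dataStore/backup (hypothetical path).
dir=$(mktemp -d)/backup
mkdir "$dir"

chmod g+w "$dir"   # make the directory group writable
chmod +t "$dir"    # set the sticky bit

# The last permission character should now be t (or T if others lack execute).
ls -ld "$dir"
```

For the group-write part to be useful, the directory also needs to belong to a group that the other users are members of, which chgrp can set.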

This is as close to an explanation as I could find:
http://www.uwsg.iu.edu/UAU/files/sticky.html