I had a file where the lines look like this:
all_initstring0010010100111000/1dca.rule118.iter100.score all_initstring0010010100111000/1dca.rule140.iter100.score: 0
all_initstring0010010100111000/1dca.rule128.iter100.score all_initstring0010010100111000/1dca.rule122.iter100.score: 122312
all_initstring0010010100111000/1dca.rule113.iter100.score all_initstring0010010100111000/1dca.rule143.iter100.score: 3213
I wanted to extract the number that comes after "rule" and before the next "." in the first and second fields, and also print the third field. I used awk's gsub substitution function with regular expressions to strip everything except the required value from each field. Here's the code:
gawk '{gsub(/^.*rule/,"",$1); gsub(/[^0-9].*/,"",$1); gsub(/^.*rule/,"",$2); gsub(/[^0-9].*/,"",$2); print $1 " " $2 " " $3}' myfile
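To see what the two substitutions do, here is a minimal sketch that applies them to a single field. The helper name rule_number is made up for illustration, and it uses sub instead of gsub on the assumption that each pattern only needs to match once per field:

```shell
# rule_number is a hypothetical helper: it prints the digits that follow "rule" in its argument.
rule_number() {
  printf '%s\n' "$1" | awk '{
    sub(/^.*rule/, "", $1)    # drop everything up to and including "rule"
    sub(/[^0-9].*/, "", $1)   # drop everything from the first non-digit onward
    print $1
  }'
}

rule_number 'all_initstring0010010100111000/1dca.rule118.iter100.score'
# prints: 118
```

The second substitution is also what removes the trailing ":" from the second field, since it cuts at the first non-digit character.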
There is actually another solution which is probably nicer: split each field on "." and skip the first four characters ("rule") of the second piece:
awk '{ split($1,a1,/\./) ; split($2,a2,/\./); print substr(a1[2],5), substr(a2[2],5), $NF; }' myfile
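As a quick check, the split-based version can be run on the three sample lines from the top of this post. The wrapper name extract_rules is hypothetical and reads from standard input instead of myfile:

```shell
# extract_rules is a hypothetical wrapper around the split-based one-liner; it reads from stdin.
extract_rules() {
  awk '{
    split($1, a1, /\./)   # a1[2] is e.g. "rule118"
    split($2, a2, /\./)   # a2[2] is e.g. "rule140"
    # substr(s, 5) skips the four characters of "rule"
    print substr(a1[2], 5), substr(a2[2], 5), $NF
  }'
}

extract_rules <<'EOF'
all_initstring0010010100111000/1dca.rule118.iter100.score all_initstring0010010100111000/1dca.rule140.iter100.score: 0
all_initstring0010010100111000/1dca.rule128.iter100.score all_initstring0010010100111000/1dca.rule122.iter100.score: 122312
all_initstring0010010100111000/1dca.rule113.iter100.score all_initstring0010010100111000/1dca.rule143.iter100.score: 3213
EOF
# prints:
# 118 140 0
# 128 122 122312
# 113 143 3213
```

This version relies on the "rule" part always being the second "."-separated piece of the field, so it would need adjusting if the directory name itself contained a ".".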