This is a fast, multithreaded implementation of the nearest neighbor algorithm.
To get help, run: ./NNA -i h
For jackknife testing: ./NNA -i infilename -o outfilename
For prediction: ./NNA -i infilename -t testingfilename -o outfilename
For special use cases such as predicting protein-protein interactions, where the reverse vector should be used, extra parameters must be added.
For jackknife testing: ./NNA -i infilename -r reversefilename -o outfilename
For prediction: ./NNA -i infilename -x reverse testingfilename -o outfilename
More detailed output can be obtained by adding the -l option.
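Input files use the same format as SVMlight input: one sample per line, a class label followed by DIMENSION:VALUE pairs. For example (the first line shows the layout):
| CLASS_LABEL DIMENSION:VALUE ... |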
| 1 1:0.4 2:0.5 3:0.3 |
| -1 1:0.3 2:0.2 3:0.7 |
| 3 2:0.9 3:0.7 |
| 1 1:0.2 3:0.4 |
| 2 2:0.9 3:0.7 |
| -1 1:0.2 3:0.4 |
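The format above can be checked before a file is passed to NNA. The following is a minimal sketch and not part of NNA itself: the function name `check_svmlight` is hypothetical, and it assumes integer class labels, positive integer dimensions in increasing order (as in the sample lines), and numeric values. NNA may accept more than this.

```shell
# Hypothetical helper: report lines that do not match "LABEL DIM:VALUE ...".
# Assumptions (not confirmed by NNA): integer labels, positive integer
# dimensions in increasing order, numeric values.
check_svmlight() {
    awk '
    NF == 0 { next }                          # ignore blank lines
    {
        ok = ($1 ~ /^[-+]?[0-9]+$/)           # field 1: integer class label
        prev = 0
        for (i = 2; i <= NF && ok; i++) {
            n = split($i, kv, ":")
            if (n != 2 || kv[1] !~ /^[0-9]+$/ || (kv[1] + 0 <= prev) ||
                kv[2] !~ /^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$/)
                ok = 0
            else
                prev = kv[1] + 0
        }
        if (!ok) { printf "line %d: %s\n", NR, $0; bad++ }
    }
    END { exit (bad > 0) }                    # non-zero status if any line failed
    ' "$1"
}

# Example with sample lines from this README (the temporary file is arbitrary).
tmp=$(mktemp)
printf '%s\n' '1 1:0.4 2:0.5 3:0.3' '-1 1:0.3 2:0.2 3:0.7' '3 2:0.9 3:0.7' > "$tmp"
check_svmlight "$tmp" && echo "format OK"
rm -f "$tmp"
```

The function prints each offending line with its line number and returns a non-zero status, so it can be used as a guard in a script before calling NNA.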
To summarize the results per class, assume your output file is named "out". Replace both occurrences of "out" when you copy this script.
A bash script could look like this:
| for i in $(awk '{print $2}' out | sort -u); do printf 'class-%s ' "$i"; awk -v cls="$i" '$2 == cls {t++; if ($2 == $3) c++} END {print t "\t" (c+0) "\t" (t-c) "\t" c/t}' out; done |
The output has one row per class, for example:
| Class | Total samples | Successes | Failures | Success rate |
|---|---|---|---|---|
| class-1 | 100 | 92 | 8 | 0.92 |
| class-2 | 100 | 97 | 3 | 0.97 |
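The per-class loop above rescans the output file once for every class. The same summary can be sketched as a single pass with awk arrays. This is an illustration under assumptions: the function name `class_summary` is hypothetical, and, as in the script above, column 2 is taken to be the true class and column 3 the predicted class (column 1 is ignored).

```shell
# Hypothetical helper: per-class totals in one pass over the output file.
# Assumption (taken from the script above): column 2 is the true class,
# column 3 is the predicted class.
class_summary() {
    awk '
    NF < 3 { next }                           # skip blank or short lines
    {
        total[$2]++
        if ($2 == $3) hit[$2]++
    }
    END {
        for (c in total)
            printf "class-%s\t%d\t%d\t%d\t%.2f\n",
                   c, total[c], hit[c], total[c] - hit[c], hit[c] / total[c]
    }
    ' "$1" | sort                             # awk array order is unspecified
}

# Usage: class_summary out
```

Each output row holds the class, total samples, successes, failures, and success rate, separated by tabs. Unlike the loop above, the rate is rounded to two decimals here.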
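To convert a FASTA-like input file to the format above, assume the file is named "test.fa" and looks like this in text format:
| >sample1 2 |
| 0 |
| 1 |
| 2 |
| >sample2 2 |
| 3 |
| 1 |
| 2 |
The following awk command converts it:
| awk '/^>/ {if (NR > 1) print ""; c = 1; printf "%s", $2; next} {printf " %d:%s", c++, $1} END {print ""}' test.fa |
The result is:
| 2 1:0 2:1 3:2 |
| 2 1:3 2:1 3:2 |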
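The awk command above reads exactly one value per line. If a FASTA-like file wraps several values onto one line, a variant along the following lines may help. It is a sketch under assumptions: the function name `fasta_to_svmlight` is hypothetical, the class label is the second field of each header line as in the example, and values are separated by whitespace.

```shell
# Hypothetical helper: convert a FASTA-like file to "LABEL DIM:VALUE ..." lines.
# Assumptions: header lines look like ">name LABEL"; values are separated
# by whitespace and may be spread over any number of lines.
fasta_to_svmlight() {
    awk '
    /^>/ {
        if (started) print ""       # finish the previous sample
        started = 1
        idx = 1
        printf "%s", $2             # class label is the second header field
        next
    }
    started {
        for (i = 1; i <= NF; i++)   # number the values in reading order
            printf " %d:%s", idx++, $i
    }
    END { if (started) print "" }
    ' "$1"
}

# Usage: fasta_to_svmlight test.fa > infilename
```

Lines that appear before the first header are ignored, and an empty input produces no output.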
Bioinformatics Center, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences.
All rights reserved.