Multi-threaded NNA

Usage

This is a fast multi-threaded implementation of the nearest neighbor algorithm.

To get help: ./NNA -i h

For jackknife testing: ./NNA -i infilename -o outfilename
For predicting: ./NNA -i infilename -t testingfilename -o outfilename

For special use cases such as predicting protein-protein interactions, where the reverse vector should also be used, add the extra parameters:
For jackknife testing: ./NNA -i infilename -r reversefilename -o outfilename
For predicting: ./NNA -i infilename -x reverse testingfilename -o outfilename
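
The layout of the reverse file is not described here. As a rough sketch only, assume each sample is a pair encoded as two concatenated halves of equal length (dimensions 1..D for the first partner, D+1..2D for the second), and assume the reverse file uses the same "label dimension:value" line format with the two halves swapped. Under those assumptions it could be generated with awk. The function name reverse_pairs and its HALF_DIM argument are hypothetical and not part of NNA.

```shell
# reverse_pairs HALF_DIM FILE
# Swap the two halves of every sparse vector (assumed layout, see above).
reverse_pairs() {
    awk -v d="$1" '
    {
        split("", val)                      # clear per-line storage
        for (i = 2; i <= NF; i++) {
            split($i, kv, ":")
            # indices 1..d move up by d, indices d+1..2d move down by d
            j = (kv[1] + 0 <= d + 0) ? kv[1] + d : kv[1] - d
            val[j] = kv[2]
        }
        line = $1                           # keep the class label
        for (j = 1; j <= 2 * d; j++)        # emit in increasing index order
            if (j in val) line = line " " j ":" val[j]
        print line
    }' "$2"
}
```

For example, "reverse_pairs 2 infilename > reversefilename" would write a candidate reverse file for 4-dimensional pair vectors. Check on your own data that this matches what NNA expects before relying on it.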

More detailed output can be obtained by adding the -l option.

Example input file (same format as SVMlight input). Each line has the form
CLASS_LABEL DIMENSION:VALUE DIMENSION:VALUE ...
The class label can be any integer when NNA is used as a multi-class classifier.
1 1:0.4 2:0.5 3:0.3
-1 1:0.3 2:0.2 3:0.7
3 2:0.9 3:0.7
1 1:0.2 3:0.4
2 2:0.9 3:0.7
-1 1:0.2 3:0.4
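
Malformed lines are easier to catch before a long run than after it. Below is a minimal awk check for the format above. It assumes integer labels, positive integer dimension indices, and plain decimal or exponent-notation values. These rules are inferred from the example and are not NNA's actual parser; check_svmlight is a hypothetical helper name.

```shell
# check_svmlight FILE
# Print every line that does not look like "LABEL INDEX:VALUE ..." and
# return non-zero if any such line was found (assumed rules, see above).
check_svmlight() {
    awk '
    NF == 0 { next }                        # ignore empty lines
    {
        ok = ($1 ~ /^[-+]?[0-9]+$/)         # label must be an integer
        for (i = 2; i <= NF && ok; i++) {
            n = split($i, kv, ":")
            if (n != 2 || kv[1] !~ /^[0-9]+$/ || kv[1] + 0 < 1 ||
                kv[2] !~ /^[-+]?([0-9]+\.?[0-9]*|\.[0-9]+)([eE][-+]?[0-9]+)?$/)
                ok = 0
        }
        if (!ok) { printf "line %d: %s\n", NR, $0; bad++ }
    }
    END { exit (bad > 0) }
    ' "$1"
}
```

Running "check_svmlight infilename" prints nothing and returns 0 when every line passes.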

Output file format
First column is the sample id
Second column is the original class
Third column is the leave-one-out predicted class
Fourth column is the number of nearest neighbors found


Computing success-rate statistics

Assume your output file is named "out". Replace both occurrences of "out" when you copy this script.
A bash script for this could look like:
for i in $(awk '{print $2}' out | sort -u); do printf 'class-%s ' "$i"; awk -v cls="$i" '$2 == cls { t++; if ($2 == $3) c++ } END { print t "\t" (c+0) "\t" (t-c) "\t" c/t }' out; done

The output looks like this (columns are class, total samples, successes, failures, success rate):
Class     Total samples   Success   Failed   Success rate
class-1   100             92        8        0.92
class-2   100             97        3        0.97
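
The per-class success rate does not show which classes are mistaken for which. Here is a small sketch that counts every (original class, predicted class) pair in the same output file, assuming whitespace-separated columns as described under "Output file format". The name confusion_counts is a hypothetical helper, not part of NNA.

```shell
# confusion_counts OUTFILE
# Print "original<TAB>predicted<TAB>count" for every pair that occurs.
confusion_counts() {
    awk '{ n[$2 "\t" $3]++ }
         END { for (k in n) print k "\t" n[k] }' "$1" | LC_ALL=C sort
}
```

Rows where the first two fields are equal are correct predictions; all other rows are misclassifications.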

Converting FASTA-like files to SVMlight format

Assume your FASTA-like input file is "test.fa", with the following plain-text content (each header line gives the sample name and class label, followed by one value per line):
>sample1 2
0
1
2
>sample2 2
3
1
2

A bash script for the conversion could be:
awk '/^>/ { if (NR > 1) printf "\n"; printf "%s", $2; c = 1; next } { printf " %d:%s", c++, $1 } END { print "" }' test.fa
The transformed file would be:
2 1:0 2:1 3:2
2 1:3 2:1 3:2


Software Package

New memory-saving version

  1. Download the precompiled Linux X86_32 version from NNA.less.mem.X86_32.lin.exe

  2. Download the precompiled Linux X86_64 version from NNA.less.mem.X86_64.lin.exe

Old version

  1. Download the precompiled Linux X86_32 version from NNA_X86_32-2007-07-23.

  2. Download the precompiled Linux X86_64 version from NNA_X86_64-2007-07-23.


Bioinformatics Center, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences.
All rights reserved.