The Protocol Informatics Project

Copyright © 2004 Marshall Beddoe

[Image: multiple alignment]


The Protocol Informatics (PI) project is a software framework for advanced sequence and protocol stream analysis using bioinformatics algorithms. Its sole purpose is to identify protocol fields in unknown or poorly documented network protocol formats. The algorithms perform comparative analysis on a series of samples to expose the underlying structure of otherwise random-looking data. The PI framework was designed for experimentation and is built from a widget-based component set.
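To make the idea of comparative analysis concrete, here is a minimal sketch of global pairwise alignment (Needleman-Wunsch) applied to two captured messages. It is not taken from the PI prototype: the function name `needleman_wunsch`, the scoring values, and the sample messages are assumptions chosen for illustration. Positions where the aligned bytes agree hint at fixed fields, while gaps and mismatches hint at variable-length or variable-content fields.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two byte strings.

    Returns (aligned_a, aligned_b, score). The aligned sequences are
    lists of byte values, with None marking an inserted gap.
    Scoring values are illustrative assumptions, not PI's weights.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per byte.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Fill the dynamic-programming table.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append(None)
            i -= 1
        else:
            out_a.append(None); out_b.append(b[j - 1])
            j -= 1
    return out_a[::-1], out_b[::-1], score[-1][-1]


if __name__ == "__main__":
    # Two hypothetical samples of the same message type.
    x, y, s = needleman_wunsch(b"GET /index.html", b"GET /a.html")
    show = lambda seq: "".join("_" if v is None else chr(v) for v in seq)
    print(show(x))
    print(show(y))
    print("score:", s)
```

Aligning many samples at once (multiple alignment) builds on this pairwise step; the sketch covers only the two-sequence case.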


Date          Information
Oct-5-2004    Faster than expected. PI Prototype released. See example for usage.
Oct-4-2004    Danny O'Brien wrote an article in Wired about PI. Code debuting in one week.
Sept-25-2004  As promised, Toorcon presentation posted here.
Sept-16-2004  Join the PI mailing list here.
Sept-15-2004  Finished slides for Toorcon presentation. Link available after conference.
Sept-14-2004  Project site established.


Filename          Description                                                 Size
PI-v0.01beta.tgz  Protocol Informatics prototype, version 0.01 beta (Python)  48k

Core Algorithms

Filename              Description                                          Size
needleman-wunsch.tgz  Needleman-Wunsch algorithm in C. For reference only.  4k
smith-waterman.tgz    Smith-Waterman algorithm in C. For reference only.    4k
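The reference archives above are written in C. For readers who only want the shape of the second algorithm, the following is a rough Python sketch of Smith-Waterman local alignment, which finds the best-matching region shared by two samples rather than aligning them end to end. The function name `smith_waterman`, the scoring values, and the sample data are assumptions for illustration and are not taken from the archives.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Find the best local alignment between two byte strings.

    Returns (score, segment_of_a, segment_of_b): the best local score
    and the slices of each input covered by that alignment.
    Scoring values are illustrative assumptions.
    """
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Unlike global alignment, a cell never drops below zero,
            # so a poor prefix cannot penalise a good local region.
            h[i][j] = max(0,
                          h[i - 1][j - 1] + sub,
                          h[i - 1][j] + gap,
                          h[i][j - 1] + gap)
            if h[i][j] > best:
                best, best_pos = h[i][j], (i, j)
    # Trace back from the best cell until the score reaches zero.
    end_i, end_j = best_pos
    i, j = best_pos
    while i > 0 and j > 0 and h[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if h[i][j] == h[i - 1][j - 1] + sub:
            i, j = i - 1, j - 1
        elif h[i][j] == h[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return best, a[i:end_i], b[j:end_j]


if __name__ == "__main__":
    # Hypothetical samples sharing an embedded keyword.
    print(smith_waterman(b"xxHELLOyy", b"zzzHELLOw"))
```

Local alignment is the more natural fit when samples share a common field surrounded by unrelated data; global alignment suits samples that are similar along their whole length.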


Title           Description
PI-Toorcon.pdf  Presentation slides from Toorcon 2004
FRAMEWORK       Description of PI framework modules
README          General project information
PI.pdf          Paper: Network Protocol Analysis using Bioinformatics Algorithms

Suggested Reading

Weight Matrices for Sequence Similarity Scoring
Construction of distance tree using UPGMA
Needleman-Wunsch Algorithm
Algorithms for Multiple Sequence Alignment
Methods of Phylogenetic Tree Reconstruction
Consensus Sequence Zen
A Glossary for Molecular Information Theory


Many thanks to the following:

Christopher Abad
Terry Gaasterland
David Hulton
Danny O'Brien
Barclay Osborn
Tom Schneider


Author: Marshall Beddoe