PHP Spam detection project

While I was googling to find out how LibTextCat works internally, I found a paper that changed my life: N-Gram-Based Text Categorization. The paper is about n-grams (an n-gram is a sub-sequence of n items from a given sequence) and how they can help build language-independent algorithms for categorizing text.
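
To make the definition concrete, here is a minimal sketch of character n-gram extraction and counting. It is written in Python for brevity, and the same logic ports directly to PHP. The function names `char_ngrams` and `ngram_profile` are made up for illustration and are not part of LibTextCat.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return every contiguous run of n characters in text, in order."""
    # A string shorter than n yields an empty list.
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_profile(text, n):
    """Count how often each character n-gram occurs in text."""
    return Counter(char_ngrams(text, n))


print(char_ngrams("spam", 2))            # ['sp', 'pa', 'am']
print(ngram_profile("banana", 2)["an"])  # 2
```

Nothing here depends on a dictionary or on knowing where words begin and end, which is why counts like these can be compared across texts in any language.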