I got the idea for this project while reading the paper N-Gram-Based Text Categorization, written by William B. Cavnar and John M. Trenkle. The idea is to suggest categories¹ for a new post while it is being written, based on similarities with the categories of previous posts.
Using n-grams (an n-gram is a sequence of n […]
tags: google summer of code, wordpress author: Cesar D. Rodas comments: No Comments
Hello dude! Welcome back to ThyPHP, your PHP blog! Today I will write about how to compile a PHP extension on Unix, and about an interesting tip I found on the blog of PHP's creator.
Yesterday I saw a post on Rasmus Lerdorf's “toys page”. It was a nice post about the MVC model… but the part that I […]
tags: php4, php5 author: Cesar D. Rodas comments: 7 Comments
While I was googling to learn how LibTextCat works internally, I found the paper that changed my life, N-Gram-Based Text Categorization. The paper talks about n-grams (an n-gram is a sub-sequence of n items from a given sequence) and how they can help build language-independent algorithms for categorizing texts.
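To make the n-gram definition concrete, here is a minimal Python sketch of character n-grams and of a frequency-ranked profile compared with a rank-difference distance, in the spirit of the paper. It is not LibTextCat's code: the function names, the `top` cutoff and the choice of n from 1 to 3 are assumptions made for illustration, and the sketch skips details such as padding words with blanks.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return every contiguous run of n characters in text, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_profile(text, n_values=(1, 2, 3), top=300):
    """Rank the most frequent n-grams of a text, most common first.

    n_values and top are illustrative defaults, not values taken from
    any particular library.
    """
    counts = Counter()
    for n in n_values:
        counts.update(char_ngrams(text.lower(), n))
    return [gram for gram, _ in counts.most_common(top)]


def out_of_place(doc_profile, category_profile):
    """Sum how far each n-gram's rank differs between two profiles.

    An n-gram missing from the category profile gets a fixed maximum
    penalty equal to the length of that profile.
    """
    rank = {gram: i for i, gram in enumerate(category_profile)}
    max_penalty = len(category_profile)
    return sum(
        abs(i - rank[gram]) if gram in rank else max_penalty
        for i, gram in enumerate(doc_profile)
    )


print(char_ngrams("text", 2))
```

A new text would then be assigned to whichever category profile gives the smallest `out_of_place` distance. Nothing in the sketch depends on a dictionary or a particular language, which is what makes the approach language independent.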
tags: artificial intelligence, php classes, spam author: Cesar D. Rodas comments: No Comments