GSoC - WP - Category Suggester - [Proposal]

This project is an idea I got while reading the paper N-Gram-Based Text Categorization, written by William B. Cavnar and John M. Trenkle. The idea is to suggest categories¹ for a new post while it is being written, based on its similarity to the categories of previous posts.
Using n-grams (an n-gram is a sequence of n […]

Knowing a little more about PHP

Hello dude! Welcome back to ThyPHP, your PHP blog! Today I will write about how to compile a PHP extension on Unix and about an interesting tip I found on the blog of PHP's creator.
Yesterday I saw a post on Rasmus Lerdorf's “toys page”. It was a nice post about the MVC model… but the part that I […]

PHP Spam detection project

While I was googling to learn how LibTextCat works internally, I found the paper that changed my life, N-Gram-Based Text Categorization. The paper talks about n-grams (an n-gram is a sub-sequence of n items from a given sequence) and how they can help build language-independent algorithms for categorizing texts.
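
To make the n-gram idea concrete, here is a minimal sketch in Python (chosen only for brevity; it is not code from LibTextCat or from this project). It builds a ranked profile of the most frequent character n-grams of a text and compares two profiles by how far each n-gram's rank moves, in the spirit of the "out-of-place" measure the paper describes. The function names, the `max_n=3` limit, the profile size of 300 and the sample texts are all assumptions made for illustration, and the paper's padding and tokenization details are left out.

```python
from collections import Counter


def ngrams(text, n):
    """Return every contiguous sub-sequence of n characters in text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def profile(text, max_n=3, size=300):
    """Rank the most frequent 1..max_n character n-grams of text."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(text.lower(), n))
    # Sort by descending count, then alphabetically so ties are stable.
    ranked = sorted(counts, key=lambda g: (-counts[g], g))
    return ranked[:size]


def out_of_place(doc_profile, cat_profile):
    """Sum how far each n-gram's rank differs between two profiles."""
    rank = {g: i for i, g in enumerate(cat_profile)}
    penalty = len(cat_profile)  # cost for an n-gram the category lacks
    return sum(abs(i - rank[g]) if g in rank else penalty
               for i, g in enumerate(doc_profile))


def closest_category(text, samples):
    """Pick the sample category whose profile is nearest to text."""
    doc = profile(text)
    return min(samples,
               key=lambda name: out_of_place(doc, profile(samples[name])))


if __name__ == "__main__":
    # Hypothetical sample texts, one per category.
    samples = {
        "php": "compile a php extension on unix",
        "spam": "buy cheap pills now now now",
    }
    print(closest_category("how to compile an extension", samples))
```

Nothing here depends on the language of the text, because the profiles are built from raw characters rather than from a dictionary. That is why the same approach could, in principle, serve for language guessing, category suggestion, or spam detection.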