| Author: | Wojciech Muła, wojciech_mula@poczta.onet.pl |
|---|---|
| Last update: | 2011-04-14 |
| Added on: | 2011-03-2x |
pyahocorasick is a Python module that implements two kinds of data structures: a trie and an Aho-Corasick string matching automaton.
A trie is a dictionary indexed by strings, which retrieves associated items in time proportional to the string length. An Aho-Corasick automaton finds all occurrences of strings from a given set in a single pass over the text.
(BTW, an Aho-Corasick automaton can only be built on top of a trie; this is why these two distinct entities live in a single module.)
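
To make the relationship between the two structures concrete, here is a minimal pure-Python sketch. It is an illustration only and not the module's actual API: the names `Node`, `Trie`, `add`, `get`, `make_automaton` and `find_all` are hypothetical and were chosen for this example. It assumes keys are non-empty strings.

```python
from collections import deque


class Node:
    """One trie node (hypothetical helper, not part of pyahocorasick)."""

    def __init__(self):
        self.children = {}   # character -> child Node
        self.word = None     # the complete key, if a key ends here
        self.value = None    # item associated with that key
        self.fail = None     # failure link, set by make_automaton()
        self.matches = []    # keys that end at this position in the text


class Trie:
    def __init__(self):
        self.root = Node()

    def add(self, key, value):
        # Walk or create one node per character: O(len(key)).
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, Node())
        node.word, node.value = key, value

    def get(self, key, default=None):
        # Lookup cost depends on the key length, not on the number of keys.
        node = self.root
        for ch in key:
            node = node.children.get(ch)
            if node is None:
                return default
        return node.value if node.word is not None else default

    def make_automaton(self):
        # Breadth-first pass that turns the trie into an Aho-Corasick
        # automaton by computing a failure link for every node.
        root = self.root
        root.fail = root
        queue = deque()
        for child in root.children.values():
            child.fail = root
            child.matches = [child.word] if child.word is not None else []
            queue.append(child)
        while queue:
            node = queue.popleft()
            for ch, child in node.children.items():
                f = node.fail
                while f is not root and ch not in f.children:
                    f = f.fail
                child.fail = f.children.get(ch, root)
                own = [child.word] if child.word is not None else []
                # Keys reachable through the failure link also end here.
                child.matches = own + child.fail.matches
                queue.append(child)

    def find_all(self, text):
        # Single pass over the text; returns (start_index, key) pairs.
        root = self.root
        node = root
        found = []
        for i, ch in enumerate(text):
            while node is not root and ch not in node.children:
                node = node.fail
            node = node.children.get(ch, root)
            for word in node.matches:
                found.append((i - len(word) + 1, word))
        return found


trie = Trie()
for index, key in enumerate(["he", "she", "his", "hers"]):
    trie.add(key, index)

print(trie.get("his"))           # 2
trie.make_automaton()
print(trie.find_all("ushers"))   # [(1, 'she'), (2, 'he'), (2, 'hers')]
```

Note that `find_all` only works after `make_automaton` has been called, while `get` works on the plain trie. This mirrors the point above: the automaton is the trie plus failure links.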
There are two versions:
Documentation of the C extension API is available on a separate page.
The Python module API is similar, but not exactly the same.
The library is licensed under the very liberal two-clause BSD license. Some portions have been released into the public domain.
The full text of the license is available in the LICENSE file.
The following files are available: