"Fossies" - the Fresh Open Source Software Archive

Member "texstudio-3.0.1/src/hunspell/NEWS" (31 Aug 2020, 28266 Bytes) of package /linux/misc/texstudio-3.0.1.tar.gz:



    1 2018-11-12: Hunspell 1.7.0 release:
    2 
    3   New features and bug fixes by László Németh, supported by FSF.hu Foundation:
    4 
    5   - No more annoyingly long suggestion times, especially in languages with
    6     compound word handling and complex morphology. With balanced
    7     multi-level time limits, suggestions are now guaranteed to arrive
    8     within half a second, not seconds (or dozens of seconds or more
    9     in extreme cases), even for longer misspellings.
   10 
   11   - add SPELLML support for run-time dictionary extension with optional
   12     affixation of user words. See new "Grammar By" feature of
   13     language-specific user dictionaries of LibreOffice 6.0:
   14 
   15     News: https://wiki.documentfoundation.org/ReleaseNotes/6.0#.E2.80.9CGrammar_By.E2.80.9D_spell_checking
   16 
   17     Screencast with English example: https://www.youtube.com/watch?v=EsS3gaBTfOo
   18 
   19     Screencast with German example: https://www.youtube.com/watch?v=aYVFDqCUb6I
   20 
   21   - Improved, highly customizable suggestions at the level of dictionary words:
   22     Pronunciations and typical misspellings defined by optional "ph:" fields of
   23     the dictionary words are used not only in n-gram suggestions, but also as
   24     elements of the REP replacement list, which gets the highest priority in normal
   25     suggestions. This also gives the best suggestions for short words.
   26     More information: see "ph:" in man 5 hunspell.
   27 
   28   - Handling multiple-word suggestions is much easier. As in a
   29     traditional spelling dictionary, for example, to get the correct suggestion
   30     "a lot" in first place for the typical misspelling "alot", it is now
   31     enough to add the following line to the dic(tionary) file:
   32 
   33     a lot
   34 
   35   - Limit compound overgeneration by dictionary based word pairs:
   36     Now it's possible to filter bad compound words by listing
   37     the correct word pairs with space in the dictionary, as in a traditional
   38     spelling dictionary.
   39 
   40   - clean-up suggestion:
   41 
   42     - no n-gram and compound word suggestions if a "good" suggestion
   43       exists, i.e. uppercase, REP, ph: or dictionary word pair suggestions
   44 
   45     - word pairs are always suggested, if they exist in the dic file
   46 
   47     - word pairs have top priority in suggestions, and
   48       these are the only suggestions if there is no other good suggestion.
   49 
   50     - dictionary word pairs separated by a dash instead of a space
   51       are also handled specially in two-word suggestions (depending on the
   52       language)
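The suggestion clean-up rules above can be illustrated with a short Python sketch. It is not Hunspell's code: the function names and the `good` and `ngram` input lists are hypothetical, and the dictionary is modelled as a plain set of entries, some of which contain a space.

```python
def word_pair_suggestions(misspelling, dictionary):
    """Split the misspelling at every position; keep splits listed as a pair."""
    pairs = []
    for i in range(1, len(misspelling)):
        pair = misspelling[:i] + " " + misspelling[i:]
        if pair in dictionary:
            pairs.append(pair)
    return pairs


def merge_suggestions(misspelling, dictionary, good, ngram):
    """Order suggestions: word pairs first, then other "good" suggestions.

    `good` stands for uppercase, REP or ph: suggestions and `ngram` for
    n-gram or compound suggestions; both are assumed to be computed elsewhere.
    """
    pairs = word_pair_suggestions(misspelling, dictionary)
    if pairs or good:
        # A good suggestion exists, so n-gram suggestions are dropped.
        return pairs + [s for s in good if s not in pairs]
    return list(ngram)
```

With the entry "a lot" in the set, `merge_suggestions("alot", ...)` puts "a lot" first and ignores the n-gram list.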
   53 
   54    - limit bad suggestions by improved n-gram suggestion rules:
   55 
   56      don't suggest capitalized dictionary words for lower
   57      case misspellings in n-gram suggestions, except
   58 
   59      - PHONE usage, or
   60      - in the case of German, where not only proper
   61        nouns are capitalized, or
   62      - the capitalized word has special pronunciation
   63 
   64      and don't suggest if the difference of lengths of misspellings and
   65      suggestions is 5 or more characters.
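As a rough illustration of the two n-gram filters above, here is a Python sketch. The function name and the three keyword flags are hypothetical stand-ins for "PHONE is in use", "the language capitalizes common nouns" and "the word has a ph: field"; Hunspell's real checks are more involved.

```python
def keep_ngram_suggestion(misspelling, suggestion, *, uses_phone=False,
                          capitalized_nouns=False, has_ph_field=False):
    """Return True if an n-gram suggestion passes both filters."""
    # Filter 1: drop suggestions whose length differs by 5 or more characters.
    if abs(len(misspelling) - len(suggestion)) >= 5:
        return False
    # Filter 2: no capitalized suggestions for lower-case misspellings,
    # unless one of the listed exceptions applies.
    lowercase_input = misspelling == misspelling.lower()
    capitalized = suggestion[:1].isupper()
    if lowercase_input and capitalized:
        return uses_phone or capitalized_nouns or has_ph_field
    return True
```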
   66 
   67   - Extend dotless i and dotted I rules to Crimean Tatar language
   68     Allow dotted I in dictionary, and disable bad capitalization of i.
   69 
   70   - BREAK: extended recursive word breaking algorithm to handle words or
   71     words with suffixes when they already contain word break characters,
   72     for example, "e-mail" is a dictionary word with a word break character, and
   73     it wasn't accepted before in compounds in some languages.
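The idea of recursive word breaking can be sketched in Python as follows. This is a simplified illustration, not Hunspell's implementation: the dictionary is a plain set, only a literal "-" break character is used, and affixes are ignored.

```python
def accepts(word, dictionary, break_chars=("-",)):
    """Accept a word if it is listed, or if it splits into accepted parts."""
    if word in dictionary:
        # Entries such as "e-mail" are accepted whole, before any breaking.
        return True
    for i, ch in enumerate(word):
        # Try every inner break character as a split point.
        if ch in break_chars and 0 < i < len(word) - 1:
            left, right = word[:i], word[i + 1:]
            if accepts(left, dictionary, break_chars) and \
                    accepts(right, dictionary, break_chars):
                return True
    return False
```

Because the whole-word lookup comes first at every recursion level, "e-mail-address" is accepted through the split "e-mail" + "address", even though "e" alone is not a word.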
   74 
   75   - FORBIDDENWORD precedes BREAK: Now it's possible to forbid compound
   76     forms recognized by BREAK word breaking by adding the bad compounds to
   77     the dictionary with FORBIDDENWORD flags.
   78 
   79   - lower limit for "doubletwochars" suggestion algorithm:
   80     one of the typical misspellings recognized by the Hunspell suggestion
   81     mechanism is syllable duplication. In addition to the old pattern
   82     ABABA -> ABA, for example nutrITITIon -> nutrITIon, now also the
   83     simpler ABAB -> AB pattern is recognized in non-starting position,
   84     for example, regretTETEd -> regretTEd.
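A minimal Python sketch of the ABAB -> AB pattern described above. The function name is hypothetical, and the real algorithm also checks each candidate against the dictionary, which is omitted here.

```python
def double_two_chars_candidates(word):
    """Return candidates made by deleting one copy of a doubled 2-char unit."""
    candidates = []
    # Start at index 1: the ABAB -> AB pattern is only tried in
    # non-starting position.
    for i in range(1, len(word) - 3):
        if word[i:i + 2] == word[i + 2:i + 4]:
            candidate = word[:i] + word[i + 2:]
            if candidate not in candidates:
                candidates.append(candidate)
    return candidates
```

For "regretteted" the only doubled unit is "te", so the single candidate is "regretted". The older ABABA case ("nutritition") is covered as well, because it contains an ABAB run.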
   85 
   86   - lower limit for longswapchar and movechar: only distances of at most
   87     4 characters are recognized, to avoid slow and bad suggestions.
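The effect of the distance limit can be shown with a small Python sketch of a long-swap candidate generator. The function name and the `max_distance` parameter are illustrative assumptions; dictionary lookup of the candidates is left out.

```python
def long_swap_candidates(word, max_distance=4):
    """Swap non-adjacent characters that are at most max_distance apart."""
    chars = list(word)
    candidates = []
    for i in range(len(chars)):
        last = min(i + max_distance, len(chars) - 1)
        # Distance 1 (adjacent swap) is assumed to be handled elsewhere.
        for j in range(i + 2, last + 1):
            if chars[i] == chars[j]:
                continue  # swapping equal characters changes nothing
            swapped = chars[:]
            swapped[i], swapped[j] = swapped[j], swapped[i]
            candidates.append("".join(swapped))
    return candidates
```

The limit keeps the number of candidates roughly linear in the word length instead of quadratic.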
   88 
   89   - fix compound handling for new Hungarian orthography reform
   90 
   91   - Allow suggestion search for prefix + *two suffixes*:
   92     Remove artificial performance limit to get correct
   93     suggestions for relatively simple misspellings in
   94     Hungarian, etc., when the word form contains prefix
   95     and both derivative and inflectional suffixes, too:
   96 
   97     lefikszálása -> lefixálása
   98 
   99   Improvements for command-line Hunspell:
  100 
  101   - Remove false alarms when checking OpenDocument (ODF)
  102     documents by ignoring <text:span> elements. (LibreOffice
  103     creates a lot of <text:span> elements, even within words,
  104     during text re-editing, which often resulted in a huge number
  105     of broken words before this fix.)
  106 
  107   - List filenames during filtering multiple files in command-line:
  108 
  109     Examples:
  110 
  111     $ hunspell -l *.odt
  112     a.odt: mispelling
  113     b.odt: egzample
  114 
  115     $ hunspell -l -G *.odt
  116     a.odt: good
  117     b.odt: words
  118 
  119   - Dictionary search by option -D doesn't wait for the standard input
  120     (fixed by Siva Mahadevan)
  121 
  122   Other improvements:
  123 
  124   - makealias dictionary compression: add option --minimize-diff
  125     to reuse free positions of alias lists to create minimal and
  126     readable diffs for alias compressed dictionaries stored in
  127     revision control systems, such as the dictionaries of LibreOffice.
  128 
  129   - Brazilian-Portuguese translation by Rafael Fontenelle
  130 
  131   - Catalan translation by robert dot buj at gmail
  132 
  133   - Minor bug fixes by several contributors, see git log
  134 
  135 2017-09-03: Hunspell 1.6.2 release:
  136   - Library changes: no. Same as 1.6.1.
  137   - Command line tool:
  138       - Added German translation
  139       - Fixed bug with wrong output encoding, not respecting system locale.
  140 
  141 2017-03-25: Hunspell 1.6.1 release:
  142   - Library changes:
  143       - Performance improvements in suggest()
  144       - Fixes regressions for Hungarian related to compounding.
  145       - Fixes regressions for Korean related to ICONV.
  146   - Command line tool:
  147       - Added Tajik translation 
  148       - Fix regarding searching of OOo dicts installed in user folder
  149   - Manpages:
  150       - Fix microsoft-cp1251 to cp1251. Dicts should not use the former.
  151       - Typos.
  152   
  153 2016-12-22: Hunspell 1.6.0 release:
  154   - Library changes:
  155       - Performance improvement in ngsuggest(), suggestions should be faster.
  156       - Revert MAXWORDLEN to 100 as in 1.3.3 for performance reasons.
  157       - MAXWORDLEN can be set during build time with -D defines.
  158       - Fix crash when word with 102 consecutive X is spelled.
  159   - Command line tool:
  160       - -D shows all loaded dictionaries instead of only the first.
  161       - -D properly lists all available dictionaries on Windows.
  162 
  163 2016-11-30: Hunspell 1.5.4 release:
  164   - Fixes the command COMPOUNDSYLLABLE used in Hungarian dictionary.
  165 
  166 2016-11-28: Hunspell 1.5.3 release:
  167   - Removed a #include from hunspell.hxx that was creating trouble
  168 
  169 2016-11-27: Hunspell 1.5.2 release:
  170   - Reverted full backward compatibility with 1.4 public API, again
  171 
  172 2016-11-27: Hunspell 1.5.1 release:
  173   - Reverted full backward compatibility with 1.4 public API
  174 
  175 2016-11-18: Hunspell 1.5.0 release:
  176   - Lots of stability fixes
  177   - Fixed compilation errors on various systems (Windows, FreeBSD)
  178   - Small performance improvement compared to 1.4.0
  179   - The C++ API is updated to use modern C++ types (string, vector).
  180     Backward compatibility is kept for most of the functions except for
  181     the following:
  182       - get_wordchars();
  183       - get_version();
  184       - input_conv(string, string);
  185       - removed get_csconv();
  186 
  187 2016-04-15: Hunspell 1.4.0 release:
  188   - various abi changes due to moving away from char* to std::string
  189 
  190 2014-06-02: Hunspell 1.3.3 release:
  191   - OpenDocument (ODF and Flat ODF) support (ODF needs unzip program)
  192   - various bug fixes
  193 
  194 2011-02-02: Hunspell 1.3.2 release:
  195   - fix library versioning
  196   - improved manual 
  197 
  198 2011-02-02: Hunspell 1.3.1 release:
  199   - bug fixes
  200 
  201 2011-01-26: Hunspell 1.2.15/1.3 release:
  202   - new features: MAXDIFF, ONLYMAXDIFF, MAXCPDSUGS, FORBIDWARN, see manual
  203   - bug fixes
  204 
  205 2011-01-21:
  206   - new features: FORCEUCASE and WARN, see manual
  207   - new options: -r to filter potential mistakes (rare words
  208     marked with the WARN flag in the dictionary)
  209   - limited and optimized suggestions
  210 
  211 2011-01-06: Hunspell 1.2.14 release:
  212   - bug fix
  213 2011-01-03: Hunspell 1.2.13 release:
  214   - bug fixes
  215   - improved compound handling and
  216     other improvements supported by OpenTaal Foundation, Netherlands
  217 2010-07-15: Hunspell 1.2.12 release
  218 2010-05-06: Hunspell 1.2.11 release:
  219   - Maintenance release bug fixes
  220 2010-04-30: Hunspell 1.2.10 release:
  221   - Maintenance release bug fixes
  222 2010-03-03: Hunspell 1.2.9 release:
  223   - Maintenance release bug fixes and warnings
  224   - MAP support for composed characters or character sequences
  225 2008-11-01: Hunspell 1.2.8 release:
  226   - Default BREAK feature and better hyphenated word suggestion to accept
  227     and fix (compound) words with hyphen characters by spell checker
  228     instead of by word breaking code of OpenOffice.org. With this feature
  229     it's possible to accept hyphenated compound words, such as "scot-free",
  230     where "scot" is not a correct English word.
  231 
  232   - ICONV & OCONV: input and output conversion tables for optional character
  233     handling or using special inner format. Example:
  234 
  235   # Accepting de facto replacements of the Romanian comma acuted letters
  236   SET UTF-8
  237   ICONV 4
  238   ICONV ş ș
  239   ICONV ţ ț
  240   ICONV Ş Ș
  241   ICONV Ţ Ț
  242 
  243     Typical usage of ICONV/OCONV is to manage an inner format for a segmental
  244     writing system, like the Ethiopic script of the Amharic language.
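How such a conversion table acts on input can be sketched in Python. This is an illustrative stand-in, not the library's code: the table mirrors the four ICONV lines of the Romanian example (written as Unicode escapes to avoid encoding ambiguity), and the longest-key-first matching is an assumption made so that multi-character keys would also work.

```python
# Cedilla forms (U+015F, U+0163, U+015E, U+0162) mapped to the
# comma-below forms (U+0219, U+021B, U+0218, U+021A).
ICONV = {
    "\u015f": "\u0219",
    "\u0163": "\u021b",
    "\u015e": "\u0218",
    "\u0162": "\u021a",
}


def input_conv(word, table):
    """Rewrite the input word with the conversion table before checking."""
    keys = sorted(table, key=len, reverse=True)  # try longer keys first
    out = []
    i = 0
    while i < len(word):
        for key in keys:
            if word.startswith(key, i):
                out.append(table[key])
                i += len(key)
                break
        else:
            # No table entry starts here: copy the character unchanged.
            out.append(word[i])
            i += 1
    return "".join(out)
```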
  245 
  246   - Extended CHECKCOMPOUNDPATTERN to handle compound word alternations, like
  247     sandhi feature of Telugu and other writing systems.
  248 
  249   - SIMPLIFIEDTRIPLE compound word feature: allow simplified Swedish and
  250     Norwegian compound word forms, like tillåta (till|låta) and
  251     bussjåfør (buss|sjåfør)
  252 
  253   - wordforms: word generator script for dictionary developers (Hunspell
  254     version of unmunch).
  255 
  256   - bug fixes
  257 
  258 2008-08-15: Hunspell 1.2.7 release:
  259   - FULLSTRIP: new option for affix handling. With FULLSTRIP, affix rules can
  260     strip whole words, not just all but one of their characters.
  261   - COMPOUNDRULE works with all flag types. (COMPOUNDRULE is for pattern
  262     matching. For example, en_US dictionary of OpenOffice.org uses COMPOUNDRULE
  263     for ordinal number recognition: 1st, 2nd, 11th, 12th, 22nd, 112th, 1000122nd
  264     etc.).
  265   - optimized suggestions:
  266     - modified 1-character distance suggestion algorithms: search for one TRY character
  267       in all positions instead of all TRY characters in one character position
  268       (it can give more readable suggestion order, also better suggestions
  269       in the first positions, when TRY characters are sorted by frequency.)
  270       For example, suggestions for "moze":
  271       ooze, doze, Roze, maze, more etc. (Hunspell 1.2.6),
  272       maze, more, mote, ooze, mole etc. (Hunspell 1.2.7).
  273     - extended compound word checking for better COMPOUNDRULE related
  274       suggestions, for example English ordinal numbers: 121323th -> 121323rd
  275       (it needs also a th->rd REP definition).
  276   - bug fixes
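The change in suggestion order for the "moze" example in the 1.2.7 notes can be reproduced with a Python sketch. The two function names, the six-word dictionary and the short TRY string "artold" are all made up for illustration; only the loop order differs between the two functions.

```python
def replace_by_position(word, try_chars, dictionary):
    """Older order: for each position, try every TRY character."""
    out = []
    for i in range(len(word)):
        for c in try_chars:
            cand = word[:i] + c + word[i + 1:]
            if c != word[i] and cand in dictionary:
                out.append(cand)
    return out


def replace_by_try_char(word, try_chars, dictionary):
    """Newer order: for each TRY character, try every position."""
    out = []
    for c in try_chars:
        for i in range(len(word)):
            cand = word[:i] + c + word[i + 1:]
            if c != word[i] and cand in dictionary:
                out.append(cand)
    return out
```

When the TRY characters are sorted by frequency, the second loop order lets the most frequent replacement characters win regardless of where in the word they occur.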
  277 
  278 2008-07-15: Hunspell 1.2.6 release:
  279   - bug fix release (fix affix rule condition checking of sk_SK dictionary,
  280     iconv support in stemming and morphological analysis of the Hunspell
  281     utility, see also Changelog)
  282 
  283 2008-07-09: Hunspell 1.2.5 release:
  284   - bug fix release (fix affix rule condition checking of en_GB dictionary,
  285     also morphological analysis by dictionaries with two-level suffixes)
  286 
  287 2008-06-18: Hunspell 1.2.4-2 release:
  288   - fix GCC compiler warnings
  289 
  290 2008-06-17: Hunspell 1.2.4 release:
  291   - add free_list() for C, C++ interfaces to deallocate suggestion lists
  292   
  293   - bug fixes
  294 
  295 2008-06-17: Hunspell 1.2.3 release:
  296   - extended XML interface to use morphological functions by standard
  297     spell checking interface, spell() and suggest(). See hunspell.3 manual page.
  298 
  299   - default dash suggestions for compound words: newword-> new word and new-word
  300 
  301   - new manual pages: hunspell.3, hzip.1, hunzip.1.
  302   
  303   - bug fixes
  304 
  305 2008-04-12: Hunspell 1.2.2 release:
  306   - extended dictionary (dic file) support to use multiple base and
  307     special dictionaries.
  308     
  309   - new and improved options of command line hunspell:
  310     -m: morphological analysis or flag debug mode (without affix
  311         rule data it shows the flags of the affix rules)
  312     -s: stemming mode
  313     -D: list available dictionaries and search path
  314     -d: support extra dictionaries by comma separated list. Example:
  315     
  316     hunspell -d en_US,en_med,de_DE,de_med,de_geo UNESCO.txt
  317 
  318     - forbidding in personal dictionary (with asterisk; / marks affixation)
  319 
  320   - optional compressed dictionary format "hzip" for aff and dic files
  321     usage:
  322     hzip example.aff example.dic
  323     mv example.aff example.dic /tmp
  324     hunspell -d example
  325     hunzip example.aff.hz >example.aff
  326     hunzip example.dic.hz >example.dic
  327 
  328   - new affix compression tool "affixcompress": compression tool for
  329     large (millions of words) dictionaries.
  330 
  331   - support encrypted dictionaries for closed OpenOffice.org extensions or
  332     other commercial programs
  333 
  334   - improved manual
  335 
  336   - bug fixes
  337 
  338 2007-11-01: Hunspell 1.2.1 release:
  339   - new memory efficient condition checking algorithm for affix rules
  340   
  341   - new morphological functions:
  342     - stem() for stemming
  343     - analyze() for morphological analysis
  344     - generate() for morphological generation
  345 
  346   - new demos:
  347     - analyze: stemming, morphological analysis and generation
  348     - chmorph: morphological conversion of texts
  349 
  350 2007-09-05: Hunspell 1.1.12 release:
  351   - dictionary based phonetic suggestion for words with
  352     special or foreign pronunciation or alternative (bad) transliteration
  353     (see Changelog, tests/phone.* and manual).
  354 
  355   - improved data structure and memory optimization for dictionaries
  356     with variable count fields
  357 
  358   - bug fixes for Unicode encoding dictionaries and ngram suggestions
  359   
  360   - improved REP suggestions with space: it works without dictionary
  361     modification
  362 
  363   - updated and new project files for Windows API
  364 
  365 2007-08-27: Hunspell 1.1.11 release:
  366   - portability fixes
  367 
  368 2007-08-23: Hunspell 1.1.10 release:
  369   - pronunciation based suggestion using Björn Jacke's original Aspell
  370     phonetic transcription algorithm (http://aspell.net), relicensed under
  371     GPL/LGPL/MPL tri-license with the permission of the author
  372 
  373   - keyboard based suggestion by KEY (see manual)
  374 
  375   - better time limits for suggestion search
  376 
  377   - test environment for suggestion based on Wikipedia data
  378 
  379   - bug fixes for non standard Mozilla platforms etc.
  380 
  381 2007-07-25: Hunspell 1.1.9 release:
  382   - better tokenization:
  383     - for URLs, mail addresses and directory paths (default: skip these tokens)
  384     - for colons in words (for Finnish and Swedish)
  385   
  386   - new examples:
  387     - affixation of personal dictionary words
  388     - digits in words
  389 
  390   - bug fixes (see ChangeLog)
  391 
  392 2007-07-16: Hunspell 1.1.8 release:
  393   - better Mac OS X/Cygwin and Windows compatibility
  394 
  395   - fix Hunspell's Valgrind environment and memory handling errors
  396     detected by Valgrind
  397 
  398   - other bug fixes (see ChangeLog)
  399 
  400 2007-07-06: Hunspell 1.1.7 release:
  401   - fix warning messages of OpenOffice.org build
  402 
  403 2007-06-29: Hunspell 1.1.6 release:
  404   - check capitalization of the following word forms
  405     - words with mixed capitalisation: OpenOffice.org - OPENOFFICE.ORG
  406     - allcap words and suffixes: UNICEF's - UNICEF'S
  407     - prefixes with apostrophe and proper names: Sant'Elia - SANT'ELIA
  408 
  409   - suggestion for missing sentence spacing: something.The -> something. The
  410 
  411   - Hunspell executable: improved locale support
  412     - -i option: custom input encoding
  413     - use locale data for default dictionary names. 
  414     - tools/hunspell.cxx: fix 8-bit tokenization (letters without
  415       casing, like ß or Hebrew characters now are handled well) 
  416     - dictionary search path (automatic detection of OpenOffice.org directories)
  417     - DICPATH environmental variable
  418     - -D option: show directory path of loaded dictionary
  419 
  420   - patches and bug fixes for Mozilla, OpenOffice.org.
  421 
  422 2007-03-19: Hunspell 1.1.5 release:
  423   - optimizations: 10-100% speed up, smaller code size and memory footprint
  424     (conditional experimental code and warning messages)
  425 
  426   - extended Unicode support:
  427     - non BMP Unicode characters in dictionary words and affixes (except
  428       affix rules and conditions)
  429     - support BOM sequence in aff and dic files
  430 
  431   - IGNORE feature for Arabic diacritics and other optional characters
  432 
  433   - New edit distance suggestion methods:
  434     - capitalisation: nasa -> NASA
  435     - long swap: permenant -> permanent
  436     - long move: Ghandi -> Gandhi, greatful -> grateful
  437     - double two characters: vacacation -> vacation
  438     - spaces in REP sug.: REP alot a_lot (NOTE: "a lot" must be a dictionary word)
  439 
  440   - patches and bug fixes for Mozilla, OpenOffice.org, Emacs, MinGW, Aqua,
  441     German and Arabic language, etc.
  442 
  443 2006-02-01: Hunspell 1.1.4 release:
  444   - Improved suggestion for typical OCR bugs (missing spaces between
  445     capitalized words). For example: "aNew" -> "a New".
  446     http://qa.openoffice.org/issues/show_bug.cgi?id=58202
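A minimal Python sketch of this kind of suggestion: split the token at a lower-case to upper-case boundary and keep the split when both halves are known words. The function name and the lower-casing of the right half are assumptions for illustration only.

```python
def split_at_capital(word, dictionary):
    """Suggest "a New" for "aNew" when both halves are dictionary words."""
    suggestions = []
    for i in range(1, len(word)):
        if word[i - 1].islower() and word[i].isupper():
            left, right = word[:i], word[i:]
            right_known = right in dictionary or right.lower() in dictionary
            if left in dictionary and right_known:
                suggestions.append(left + " " + right)
    return suggestions
```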
  447 
  448   - tokenization fixes (fix incomplete tokenization of input texts on big-endian
  449     platforms, and locale-dependent tokenization of dictionary entries)
  450 
  451 2006-01-06: Hunspell 1.1.3.2 release:
  452   - fix Visual C++ compiling errors
  453 
  454 2006-01-05: Hunspell 1.1.3 release:
  455   - GPL/LGPL/MPL tri-license for Mozilla integration
  456   
  457   - Alias compression of flag sets and morphological descriptions.
  458     (For example, 16 MB Arabic dic file can be compressed to 1 MB.)
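The principle of alias compression for flag sets can be sketched in Python: every distinct flag set is stored once in a numbered AF table, and dictionary entries refer to it by number. This is a simplified illustration with a hypothetical function name; it is not the makealias tool and does not handle morphological descriptions.

```python
def alias_compress(entries):
    """Replace repeated flag sets in 'word/FLAGS' entries by 1-based numbers."""
    aliases = {}  # flag set -> alias number, in order of first use
    compressed = []
    for entry in entries:
        word, sep, flags = entry.partition("/")
        if not sep:
            compressed.append(word)  # entry without flags stays as it is
            continue
        index = aliases.setdefault(flags, len(aliases) + 1)
        compressed.append(f"{word}/{index}")
    # Header line with the alias count, then one AF line per flag set.
    table = [f"AF {len(aliases)}"] + [f"AF {flags}" for flags in aliases]
    return table, compressed
```

The saving grows with the dictionary size, because long flag sets shared by many words are written only once.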
  459   
  460   - Improved suggestion.
  461   
  462   - Improved, language independent German sharp s casing with CHECKSHARPS
  463     declaration.
  464 
  465   - Unicode tokenization in Hunspell program.
  466   
  467   - Bug fixes (at new and old compound word handling methods), etc.
  468 
  469 2005-11-11: Hunspell 1.1.2 release:
  470 
  471   - Bug fixes (MAP Unicode, COMPOUND pattern matching, ONLYINCOMPOUND
  472     suggestions)
  473 
  474   - Checked with 51 regression tests in Valgrind debugging environment,
  475     and tested with 52 OOo dictionaries on i686-pc-linux platform.
  476 
  477 2005-11-09: Hunspell 1.1.1 release:
  478 
  479   - Compound word patterns for complex compound word handling and
  480     simple word-level lexical scanning. Ideal for checking
  481     Arabic and Roman numbers, ordinal numbers in English, affixed
  482     numbers in agglutinative languages, etc.
  483     http://qa.openoffice.org/issues/show_bug.cgi?id=53643
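To show what "checking ordinal numbers in English" means in practice, here is a Python sketch that validates tokens such as 22nd or 112th. It uses a regular expression and arithmetic rather than Hunspell's compound patterns, so it only illustrates the target behaviour; the function names are hypothetical.

```python
import re


def ordinal_suffix(n):
    """Return the English ordinal suffix for a non-negative integer."""
    if n % 100 in (11, 12, 13):
        return "th"  # 11th, 12th, 13th, 112th, ...
    return {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")


def is_valid_ordinal(token):
    """Check that digits are followed by the matching ordinal suffix."""
    match = re.fullmatch(r"(\d+)(st|nd|rd|th)", token)
    if match is None:
        return False
    return ordinal_suffix(int(match.group(1))) == match.group(2)
```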
  484 
  485   - Support ISO-8859-15 encoding for French (French oe ligatures are
  486     missing from the latin-1 encoding).
  487     http://qa.openoffice.org/issues/show_bug.cgi?id=54980
  488     
  489   - Implemented a flag to forbid obscene word suggestion:
  490     http://qa.openoffice.org/issues/show_bug.cgi?id=55498
  491 
  492   - Checked with 50 regression tests in Valgrind debugging environment,
  493     and tested with 52 OOo dictionaries.
  494 
  495   - other improvements and bug fixes (see ChangeLog)
  496 
  497 2005-09-19: Hunspell 1.1.0 release
  498 
  499 * complete comparison with MySpell 3.2 (from OpenOffice.org 2 beta)
  500 
  501 * improved ngram suggestion with swap character detection and
  502   case insensitivity
  503 
  504 ------ examples for ngram improvement (input word and suggestions) -----
  505 
  506 1. pernament (instead of permanent)
  507 
  508 MySpell 3.2: tournaments, tournament, ornaments, ornament's, ornamenting, ornamented,
  509         ornament, ornamentals, ornamental, ornamentally
  510 
  511 Hunspell 1.0.9: ornamental, ornament, tournament
  512 
  513 Hunspell 1.1.0: permanent
  514 
  515 Note: swap character detection
  516 
  517 
  518 2. PERNAMENT (instead of PERMANENT)
  519 
  520 MySpell 3.2: -
  521 
  522 Hunspell 1.0.9: -
  523 
  524 Hunspell 1.1.0: PERMANENT
  525 
  526 
  527 3. Unesco (instead of UNESCO)
  528 
  529 MySpell 3.2: Genesco, Ionesco, Genesco's, Ionesco's, Frescoing, Fresco's,
  530              Frescoed, Fresco, Escorts, Escorting
  531 
  532 Hunspell 1.0.9: Genesco, Ionesco, Fresco
  533 
  534 Hunspell 1.1.0: UNESCO
  535 
  536 
  537 4. siggraph's (instead of SIGGRAPH's)
  538 
  539 MySpell 3.2: serigraph's, photograph's, serigraphs, physiography's,
  540              physiography, digraphs, serigraph, stratigraphy's, stratigraphy
  541              epigraphs
  542 
  543 Hunspell 1.0.9: serigraph's, epigraph's, digraph's
  544 
  545 Hunspell 1.1.0: SIGGRAPH's
  546 
  547 --------------- end of examples --------------------
  548 
  549 * improved testing environment with suggestion checking and memory debugging
  550 
  551   memory debugging of all tests with a simple command:
  552   
  553   VALGRIND=memcheck make check
  554 
  555 * lots of other improvements and bug fixes (see ChangeLog)
  556 
  557 
  558 2005-08-26: Hunspell 1.0.9 release
  559 
  560 * improved related character map suggestion
  561 
  562 * improved ngram suggestion
  563 
  564 ------ examples for ngram improvement (O=old, N = new ngram suggestions) --
  565 
  566 1. Permenant (instead of Permanent)
  567 
  568 O: Endangerment, Ferment, Fermented, Deferment's, Empowerment,
  569         Ferment's, Ferments, Fermenting, Countermen, Weathermen
  570 
  571 N: Permanent, Supermen, Preferment
  572 
  573 Note: Ngram suggestions were case sensitive.
  574 
  575 2. permenant (instead of permanent) 
  576 
  577 O: supermen, newspapermen, empowerment, endangerment, preferments,
  578         preferment, permanent, preferment's, permanently, impermanent
  579 
  580 N: permanent, supermen, preferment
  581 
  582 Note: new suggestions are also weighted with longest common subsequence,
  583 first letter and common character positions
  584 
  585 3. pernemant (instead of permanent) 
  586 
  587 O: pimpernel's, pimpernel, pimpernels, permanently, permanents, permanent,
  588         supernatant, impermanent, semipermanent, impermanently
  589 
  590 N: permanent, supernatant, pimpernel
  591 
  592 Note: the new method also prefers root words over
  593 irrelevant affixes ('s, s and ly)
  594 
  595 
  596 4. pernament (instead of permanent)
  597 
  598 O: tournaments, tournament, ornaments, ornament's, ornamenting, ornamented,
  599         ornament, ornamentals, ornamental, ornamentally
  600 
  601 N: ornamental, ornament, tournament
  602 
  603 Note: Both ngram methods miss here.
  604 
  605 
  606 5. obvus (instead of obvious):
  607 
  608 O: obvious, Corvus, obverse, obviously, Jacobus, obtuser, obtuse,
  609         obviates, obviate, Travus
  610 
  611 N: obvious, obtuse, obverse
  612 
  613 Note: new method also prefers common first letters.
  614 
  615 
  616 6. unambigus (instead of unambiguous) 
  617 
  618 O: unambiguous, unambiguity, unambiguously, ambiguously, ambiguous,
  619         unambitious, ambiguities, ambiguousness
  620 
  621 N: unambiguous, unambiguity, unambitious
  622 
  623 
  624 
  625 7. consecvence (instead of consequence)
  626 
  627 O: consecutive, consecutively, consecutiveness, nonconsecutive, consequence,
  628         consecutiveness's, convenience's, consistences, consistence
  629 
  630 N: consequence, consecutive, consecrates
  631 
  632 
  633 An example in a language with rich morphology:
  634 
  635 8. Misisipiben (instead of Mississippiben [`in Mississippi' in Hungarian]):
  636 
  637 O: Misikdiben, Pisisediben, Misikiiben, Pisisekiben, Misikiben,
  638         Misikidiben, Misikkiben, Misikikiben, Misikimiben, Mississippiiben
  639 
  640 N: Mississippiben, Mississippiiben, Misiiben
  641 
  642 Note: Suggesting irrelevant affixes was the biggest fault in ngram
  643    suggestion for languages with a lot of affixes.
  644 
  645 --------------- end of examples --------------------
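The weighting mentioned in the notes above (longest common subsequence, first letter and common character positions) can be illustrated with a Python sketch. The equal weights and the function names are assumptions; Hunspell's actual scoring differs in detail, so the resulting order need not match the lists shown in the examples.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def rerank_score(misspelling, candidate):
    """Combine the three signals with equal (assumed) weights."""
    same_position = sum(x == y for x, y in zip(misspelling, candidate))
    first_letter = 1 if misspelling[:1] == candidate[:1] else 0
    return lcs_length(misspelling, candidate) + same_position + first_letter
```

Sorting candidates by `rerank_score` in descending order moves close matches such as "permanent" ahead of longer words that merely share many n-grams with the misspelling.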
  646 
  647 * support twofold prefix cutting
  648 
  649 * lots of other improvements and bug fixes (see ChangeLog)
  650 
  651 * test Hunspell with 54 OpenOffice.org dictionaries:
  652 
  653 source: ftp://ftp.services.openoffice.org/pub/OpenOffice.org/contrib/dictionaries
  654 
  655 testing shell script:
  656 -------------------------------------------------------
  657 for i in `ls *zip | grep '^[a-z]*_[A-Z]*[.]'`
  658 do
  659 	dic=`basename $i .zip`
  660 	mkdir $dic
  661 	echo unzip $dic
  662 	unzip -d $dic $i 2>/dev/null
  663 	cd $dic
  664 	echo unmunch and test $dic
  665 	unmunch $dic.dic $dic.aff 2>/dev/null | awk '{print$0"\t"}' |
  666 	hunspell -d $dic -l -1 >$dic.result 2>$dic.err || rm -f $dic.result
  667 	cd ..
  668 done
  669 --------------------------------------------------------
  670 
  671 test result (0 size is o.k.):
  672 
  673 $ for i in *_*/*.result; do wc -c $i; done 
  674 0 af_ZA/af_ZA.result
  675 0 bg_BG/bg_BG.result
  676 0 ca_ES/ca_ES.result
  677 0 cy_GB/cy_GB.result
  678 0 cs_CZ/cs_CZ.result
  679 0 da_DK/da_DK.result
  680 0 de_AT/de_AT.result
  681 0 de_CH/de_CH.result
  682 0 de_DE/de_DE.result
  683 0 el_GR/el_GR.result
  684 6 en_AU/en_AU.result
  685 0 en_CA/en_CA.result
  686 0 en_GB/en_GB.result
  687 0 en_NZ/en_NZ.result
  688 0 en_US/en_US.result
  689 0 eo_EO/eo_EO.result
  690 0 es_ES/es_ES.result
  691 0 es_MX/es_MX.result
  692 0 es_NEW/es_NEW.result
  693 0 fo_FO/fo_FO.result
  694 0 fr_FR/fr_FR.result
  695 0 ga_IE/ga_IE.result
  696 0 gd_GB/gd_GB.result
  697 0 gl_ES/gl_ES.result
  698 0 he_IL/he_IL.result
  699 0 hr_HR/hr_HR.result
  700 200694989 hu_HU/hu_HU.result
  701 0 id_ID/id_ID.result
  702 0 it_IT/it_IT.result
  703 0 ku_TR/ku_TR.result
  704 0 lt_LT/lt_LT.result
  705 0 lv_LV/lv_LV.result
  706 0 mg_MG/mg_MG.result
  707 0 mi_NZ/mi_NZ.result
  708 0 ms_MY/ms_MY.result
  709 0 nb_NO/nb_NO.result
  710 0 nl_NL/nl_NL.result
  711 0 nn_NO/nn_NO.result
  712 0 ny_MW/ny_MW.result
  713 0 pl_PL/pl_PL.result
  714 0 pt_BR/pt_BR.result
  715 0 pt_PT/pt_PT.result
  716 0 ro_RO/ro_RO.result
  717 0 ru_RU/ru_RU.result
  718 0 rw_RW/rw_RW.result
  719 0 sk_SK/sk_SK.result
  720 0 sl_SI/sl_SI.result
  721 0 sv_SE/sv_SE.result
  722 0 sw_KE/sw_KE.result
  723 0 tet_ID/tet_ID.result
  724 0 tl_PH/tl_PH.result
  725 0 tn_ZA/tn_ZA.result
  726 0 uk_UA/uk_UA.result
  727 0 zu_ZA/zu_ZA.result
  728 
  729 In the en_AU dictionary, there is an abbreviation with two dots (`eqn..'), but
  730 `eqn.' is missing. Presumably it is a dictionary bug. MySpell also
  731 did not accept it.
  732 
  733 Hungarian dictionary contains pseudoroots and forbidden words.
  734 Unmunch does not support these features yet, and generates bad words, too.
  735 
  736 * check affix rules and OOo dictionaries (detected bugs in the cs_CZ,
  737 es_ES, es_NEW, es_MX, lt_LT, nn_NO, pt_PT, ro_RO, sk_SK and sv_SE dictionaries).
  738 
  739 Details:
  740 --------------------------------------------------------
  741 cs_CZ
  742 warning - incompatible stripping characters and condition:
  743 SFX D   us          ech        [^ighk]os
  744 SFX D   us          y          [^i]os
  745 SFX Q   os          ech        [^ghk]es
  746 SFX M   o           ech        [^ghkei]a
  747 SFX J   m          ej         m
  748 SFX J   m          ejme       m
  749 SFX J   m          ejte       m
  750 SFX A   ouit       up         oupit
  751 SFX A   ouit       upme       oupit
  752 SFX A   ouit       upte       oupit
  753 SFX A   nout        l          [aeiouyr][^aeiouyrl][^aeiouy
  754 SFX A   nout        l          [aeiouyr][^aeiouyrl][^aeiouy
  755 
  756 es_ES
  757 warning - incompatible stripping characters and condition:
  758 SFX W umar se [ae]husar
  759 SFX W emir iis eir
  760 
  761 es_NEW
  762 warning - incompatible stripping characters and condition:
  763 SFX I unan nen unar
  764 
  765 es_MX
  766 warning - incompatible stripping characters and condition:
  767 SFX A a ote e
  768 SFX W umar se [ae]husar
  769 SFX W emir iis eir
  770 
  771 lt_LT
  772 warning - incompatible stripping characters and condition:
  773 SFX U ti      siuosi          tis       
  774 SFX U ti      siuosi          tis       
  775 SFX U ti      siesi           tis       
  776 SFX U ti      siesi           tis       
  777 SFX U ti      sis             tis       
  778 SFX U ti      sis             tis       
  779 SFX U ti      sims           tis       
  780 SFX U ti      sims           tis       
  781 SFX U ti      sits           tis       
  782 SFX U ti      sits           tis       
  783 
  784 nn_NO
  785 warning - incompatible stripping characters and condition:
  786 SFX D   ar  rar  [^fmk]er
  787 SFX U   re  orde  ere
  788 SFX U   re  ort  ere
  789 
  790 pt_PT
  791 warning - incompatible stripping characters and condition:
  792 SFX g   os        oas        o
  793 SFX g   os        oas        o
  794 
  795 ro_RO
  796 warning - bad field number:
  797 SFX L   0          le         [^cg] i
  798 SFX L   0          i          [cg] i
  799 SFX U   0          i          [^i] ii
  800 warning - incompatible stripping characters and condition:
  801 SFX P   l          i          l	[<- there is an unnecessary tabulator here)
  802 SFX I   a          ii         [gc] a
  803 warning - bad field number:
  804 SFX I   a          ii         [gc] a
  805 SFX I   a          ei         [^cg] a
  806 
  807 sk_SK
  808 warning - incompatible stripping characters and condition:
  809 SFX T   a         ol        kla
  810 SFX T   a         olc       kla
  811 SFX T   sa        l        sla
  812 SFX T   sa        lc       sla
  813 SFX R   c         liem      c
  814 SFX R   is        tie       mias
  815 SFX R   iez        iem        [^i]ez
  816 SFX R   iez        ie        [^i]ez
  817 SFX R   iez        ie         [^i]ez
  818 SFX R   iez        eme        [^i]ez
  819 SFX R   iez        ete        [^i]ez
  820 SFX R   iez                  [^i]ez
  821 SFX R   iez        c         [^i]ez
  822 SFX R   iez        z          [^i]ez
  823 SFX R   iez        me         [^i]ez
  824 SFX R   iez        te         [^i]ez
  825 
  826 sv_SE
  827 warning - bad field number:
  828 SFX  C  0  net  nets [^e]n
  829 --------------------------------------------------------
  830 
  831 2005-08-01: Hunspell 1.0.8 release
  832 
  833 - improved compound word support
  834 - fix German S handling
  835 - port MySpell files and MAP feature
  836 
  837 2005-07-22: Hunspell 1.0.7 release
  838 
  839 2005-07-21: new home page: http://hunspell.sourceforge.net