"Fossies" - the Fresh Open Source Software Archive 
Member "texstudio-3.1.1/src/hunspell/NEWS" (21 Feb 2021, 28266 Bytes) of package /linux/misc/texstudio-3.1.1.tar.gz:
As a special service "Fossies" has tried to format the requested text file into HTML format (style:
standard) with prefixed line numbers.
1 2018-11-12: Hunspell 1.7.0 release:
2
3 New features and bug fixes by László Németh, supported by FSF.hu Foundation:
4
5 - No more annoying suggestion times, especially in languages with
6 compound word handling and complex morphology. Thanks to balanced
7 multi-level time limits, suggestions are now guaranteed to arrive
8 within half a second, not seconds (or dozens of seconds or more
9 in extreme cases), for longer misspellings, too.
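
The multi-level time limits can be pictured roughly as follows. This is
only a sketch in Python, not Hunspell's C++ code: the names TimeBudget,
sub_budget and run_stages are hypothetical, and splitting the remaining
time evenly between the stages is an assumption. Only the half-second
overall budget comes from the notes above.

```python
import time


class TimeBudget:
    """A deadline that can hand out smaller, nested deadlines."""

    def __init__(self, seconds, clock=time.monotonic):
        self._clock = clock
        self._deadline = clock() + seconds

    def expired(self):
        return self._clock() >= self._deadline

    def remaining(self):
        return max(0.0, self._deadline - self._clock())

    def sub_budget(self, fraction):
        # A nested limit can never outlive the budget it was cut from.
        return TimeBudget(self.remaining() * fraction, self._clock)


def run_stages(stages, total_seconds=0.5, clock=time.monotonic):
    """Run suggestion stages in order, each under its own slice of time.

    Every stage is a callable that receives a TimeBudget and returns a
    list of suggestions; it is expected to poll budget.expired().
    """
    overall = TimeBudget(total_seconds, clock)
    results = []
    for index, stage in enumerate(stages):
        if overall.expired():
            break
        # Assumption: share what is left evenly among the remaining stages.
        share = 1.0 / (len(stages) - index)
        results.extend(stage(overall.sub_budget(share)))
    return results
```

A stage that finishes early leaves its unused time to the later stages,
which is one way to read "balanced" in the note above.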
10
11 - add SPELLML support for run-time dictionary extension with optional
12 affixation of user words. See new "Grammar By" feature of
13 language-specific user dictionaries of LibreOffice 6.0:
14
15 News: https://wiki.documentfoundation.org/ReleaseNotes/6.0#.E2.80.9CGrammar_By.E2.80.9D_spell_checking
16
17 Screencast with English example: https://www.youtube.com/watch?v=EsS3gaBTfOo
18
19 Screencast with German example: https://www.youtube.com/watch?v=aYVFDqCUb6I
20
21 - Improved, highly customizable suggestions at the level of dictionary words:
22 Pronunciations and typical misspellings defined by optional "ph:" fields of
23 the dictionary words are used not only in n-gram suggestions, but also as
24 elements of the REP replacement list, getting the highest priority in normal
25 suggestions and giving the best suggestions for short words, too.
26 More information: see "ph:" in man 5 hunspell.
27
28 - Handling multiple word suggestions is much easier. As in a
29 traditional spelling dictionary, for example, to get the correct suggestion
30 "a lot" for the typical misspelling "alot" in first place, it is now
31 enough to put the following line into the dic(tionary) file:
32
33 a lot
34
35 - Limit compound overgeneration by dictionary based word pairs:
36 Now it's possible to filter bad compound words by listing
37 the correct word pairs with space in the dictionary, as in a traditional
38 spelling dictionary.
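
The dictionary word pair idea can be illustrated with a small Python
sketch. It is not Hunspell's implementation: the function name
word_pair_suggestions and the use of a plain set as "dictionary" are
assumptions made for the example. It tries every split point of the
misspelling and keeps only the pairs that are listed literally in the
dictionary, joined by a space or (as in some languages) by a dash.

```python
def word_pair_suggestions(word, dictionary):
    """Return the dictionary word pairs that `word` could be a run-together of."""
    found = []
    for split in range(1, len(word)):
        for separator in (" ", "-"):
            candidate = word[:split] + separator + word[split:]
            # Only pairs listed as such in the dictionary count; the two
            # halves being valid words on their own is not enough.
            if candidate in dictionary:
                found.append(candidate)
    return found


dictionary = {"a", "lot", "a lot", "in", "to"}
print(word_pair_suggestions("alot", dictionary))   # ['a lot']
print(word_pair_suggestions("into", dictionary))   # []
```

Because "in to" is not listed as a pair, it is not offered, which is how
such a list can also be used to filter bad two-word or compound forms.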
39
40 - clean-up suggestion:
41
42 - no n-gram and compound word suggestions if a "good" suggestion
43 exists, i.e. uppercase, REP, ph: or dictionary word pair suggestions
44
45 - word pairs are always suggested, if they exist in the dic file
46
47 - word pairs have top priority in suggestions, and
48 these are the only suggestions if there is no other good suggestion.
49
50 - also dictionary word pairs separated by dash instead of space
51 are handled specially in two-word suggestion (depending on the
52 language)
53
54 - limit bad suggestions by improved n-gram suggestion rules:
55
56 don't suggest capitalized dictionary words for lower
57 case misspellings in n-gram suggestions, except
58
59 - PHONE usage, or
60 - in the case of German, where not only proper
61 nouns are capitalized, or
62 - the capitalized word has special pronunciation
63
64 and don't suggest if the difference of lengths of misspellings and
65 suggestions is 5 or more characters.
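
The n-gram filtering rules above can be summarized as a single predicate.
The following Python sketch is only an illustration: the function name and
its keyword parameters (uses_phone, language, special_pronunciation) are
hypothetical stand-ins for information Hunspell takes from the affix and
dictionary files.

```python
def keep_ngram_suggestion(misspelling, candidate, *, uses_phone=False,
                          language="en", special_pronunciation=False):
    """Decide whether an n-gram candidate should be offered at all."""
    # Rule 2: drop candidates whose length differs by 5 or more characters.
    if abs(len(misspelling) - len(candidate)) >= 5:
        return False
    # Rule 1: no capitalized dictionary words for lower case misspellings,
    # unless one of the listed exceptions applies.
    if misspelling.islower() and candidate[:1].isupper():
        return uses_phone or language == "de" or special_pronunciation
    return True


print(keep_ngram_suggestion("permenant", "Permanent"))                 # False
print(keep_ngram_suggestion("permenant", "Permanent", language="de"))  # True
print(keep_ngram_suggestion("obvus", "obvious"))                       # True
```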
66
67 - Extend dotless i and dotted I rules to Crimean Tatar language
68 Allow dotted I in dictionary, and disable bad capitalization of i.
69
70 - BREAK: extended recursive word breaking algorithm to handle words or
71 words with suffixes when they already contain word break characters.
72 For example, "e-mail" is a dictionary word with a word break character, and
73 it wasn't accepted before in compounds in some languages.
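
A rough Python sketch of recursive word breaking follows. It is not the
Hunspell algorithm itself: the function name accepts, the set used as
dictionary and the single break character are assumptions, and affixes
are not modelled. It shows why a dictionary word that itself contains a
break character has to be tried as a whole before it is split.

```python
def accepts(word, dictionary, break_chars=("-",)):
    """Accept a word directly, or as parts joined by break characters."""
    if word in dictionary:
        return True
    for pos, char in enumerate(word):
        # Break only at inner positions so that neither part is empty.
        if char in break_chars and 0 < pos < len(word) - 1:
            left, right = word[:pos], word[pos + 1:]
            if (accepts(left, dictionary, break_chars)
                    and accepts(right, dictionary, break_chars)):
                return True
    return False


dictionary = {"e-mail", "address", "list"}
print(accepts("e-mail-address", dictionary))  # True: "e-mail" + "address"
print(accepts("e-list", dictionary))          # False: "e" is not a word
```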
74
75 - FORBIDDENWORD precedes BREAK: Now it's possible to forbid compound
76 forms recognized by BREAK word breaking by adding the bad compounds to
77 the dictionary with FORBIDDENWORD flags.
78
79 - lower limit for "doubletwochars" suggestion algorithm:
80 one of the typical misspellings recognized by the Hunspell suggestion
81 mechanism is syllable duplication. Besides the old pattern
82 ABABA -> ABA, for example nutrITITIon -> nutrITIon, now also the
83 simpler ABAB -> AB pattern is recognized in non-starting position,
84 for example, regretTETEd -> regretTEd.
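
The two patterns can be sketched in Python as shown here. The function
below is a simplified illustration, not the code of Hunspell's
doubletwochars: it removes one copy of a doubled character pair, accepting
ABABA anywhere and the shorter ABAB only when it does not start the word.

```python
def doubletwochars(word):
    """Suggest words with one copy of a doubled character pair removed."""
    suggestions = []
    for i in range(len(word) - 3):
        pair = word[i:i + 2]
        if word[i + 2:i + 4] != pair:
            continue
        # ABABA: the doubled pair is followed by its first character again.
        is_ababa = word[i + 4:i + 5] == pair[0]
        # ABAB alone is accepted only in non-starting position.
        if is_ababa or i > 0:
            candidate = word[:i] + word[i + 2:]
            if candidate not in suggestions:
                suggestions.append(candidate)
    return suggestions


print(doubletwochars("nutritition"))  # ['nutrition']
print(doubletwochars("regretteted"))  # ['regretted']
print(doubletwochars("tete"))         # []
```

In a real checker every candidate would still be looked up in the
dictionary before it is offered.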
85
86 - lower limit for longswapchar and movechar: only distances of max.
87 4 characters are recognized, to avoid slow and bad suggestions.
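
The effect of the distance limit can be seen in a small Python sketch of a
long swap generator. It is an illustration only: the parameter name
max_distance is hypothetical, and the assumption that adjacent swaps are
left to a separate, simpler method is not stated in the notes.

```python
def longswapchar(word, max_distance=4):
    """Generate candidates by swapping two non-adjacent characters."""
    chars = list(word)
    candidates = []
    for i in range(len(chars)):
        last = min(i + max_distance, len(chars) - 1)
        # Distance 1 (neighbouring characters) is assumed to be handled
        # elsewhere, so the search starts two positions away.
        for j in range(i + 2, last + 1):
            if chars[i] == chars[j]:
                continue
            chars[i], chars[j] = chars[j], chars[i]
            candidates.append("".join(chars))
            chars[i], chars[j] = chars[j], chars[i]  # undo the swap
    return candidates


print("permanent" in longswapchar("permenant"))  # True (distance 2)
print("fbcdea" in longswapchar("abcdef"))        # False (distance 5)
```

With the limit in place, the number of candidates grows linearly with the
word length instead of quadratically.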
88
89 - fix compound handling for new Hungarian orthography reform
90
91 - Allow suggestion search for prefix + *two suffixes*:
92 Remove artificial performance limit to get correct
93 suggestions for relatively simple misspellings in
94 Hungarian, etc., when the word form contains prefix
95 and both derivative and inflectional suffixes, too:
96
97 lefikszálása -> lefixálása
98
99 Improvements for command-line Hunspell:
100
101 - Remove false alarms during checking OpenDocument (ODF)
102 documents by ignoring <text:span> elements. (LibreOffice
103 creates a lot of <text:span> elements, also within words,
104 during text re-editing, which often resulted in a huge number
105 of broken words before this fix.)
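
Why ignoring <text:span> helps can be shown with a few lines of Python.
This is a sketch of the idea only, using a regular expression on a made-up
ODF fragment; the function name strip_text_spans is hypothetical and the
command-line tool's own parser is not reproduced here.

```python
import re

# Matches opening and closing text:span tags, with any attributes.
_SPAN_TAG = re.compile(r"</?text:span\b[^>]*>")


def strip_text_spans(xml):
    """Drop text:span tags but keep the text they enclose."""
    return _SPAN_TAG.sub("", xml)


fragment = ('<text:p>mis<text:span text:style-name="T1">spel'
            '</text:span>ling</text:p>')
print(strip_text_spans(fragment))  # <text:p>misspelling</text:p>
```

If the span tags were treated as word boundaries, "mis", "spel" and "ling"
would be checked as three separate (and wrong) words.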
106
107 - List filenames during filtering multiple files in command-line:
108
109 Examples:
110
$ hunspell -l *.odt
a.odt: mispelling
b.odt: egzample

$ hunspell -l -G *.odt
a.odt: good
b.odt: words
118
119 - Dictionary search by option -D doesn't wait for the standard input
120 (fixed by Siva Mahadevan)
121
122 Other improvements:
123
124 - makealias dictionary compression: add option --minimize-diff
125 to reuse free positions of alias lists to create minimal and
126 readable diffs for alias compressed dictionaries stored in
127 revision control systems, such as the dictionaries of LibreOffice.
128
129 - Brazilian-Portuguese translation by Rafael Fontenelle
130
131 - Catalan translation by robert dot buj at gmail
132
133 - Minor bug fixes by several contributors, see git log
134
135 2017-09-03: Hunspell 1.6.2 release:
136 - Library changes: no. Same as 1.6.1.
137 - Command line tool:
138 - Added German translation
139 - Fixed bug with wrong output encoding, not respecting system locale.
140
141 2017-03-25: Hunspell 1.6.1 release:
142 - Library changes:
143 - Performance improvements in suggest()
144 - Fixes regressions for Hungarian related to compounding.
145 - Fixes regressions for Korean related to ICONV.
146 - Command line tool:
147 - Added Tajik translation
148 - Fix regarding searching of OOo dicts installed in user folder
149 - Manpages:
150 - Fix microsoft-cp1251 to cp1251. Dicts should not use the first.
151 - Typos.
152
153 2016-12-22: Hunspell 1.6.0 release:
154 - Library changes:
155 - Performance improvement in ngsuggest(), suggestions should be faster.
156 - Revert MAXWORDLEN to 100 as in 1.3.3 for performance reasons.
157 - MAXWORDLEN can be set during build time with -D defines.
158 - Fix crash when word with 102 consecutive X is spelled.
159 - Command line tool:
160 - -D shows all loaded dictionaries instead of only the first.
161 - -D properly lists all available dictionaries on Windows.
162
163 2016-11-30: Hunspell 1.5.4 release:
164 - Fixes the command COMPOUNDSYLLABLE used in Hungarian dictionary.
165
166 2016-11-28: Hunspell 1.5.3 release:
167 - Removed a #include from hunspell.hxx that was creating trouble
168
169 2016-11-27: Hunspell 1.5.2 release:
170 - Reverted full backward compatibility with 1.4 public API, again
171
172 2016-11-27: Hunspell 1.5.1 release:
173 - Reverted full backward compatibility with 1.4 public API
174
175 2016-11-18: Hunspell 1.5.0 release:
176 - Lots of stability fixes
177 - Fixed compilation errors on various systems (Windows, FreeBSD)
178 - Small performance improvement compared to 1.4.0
179 - The C++ API is updated to use modern C++ types (string, vector).
180 Backward compatibility is kept for most of the functions except for
181 the following:
182 - get_wordchars();
183 - get_version();
184 - input_conv(string, string);
185 - removed get_csconv();
186
187 2016-04-15: Hunspell 1.4.0 release:
188 - various ABI changes due to moving away from char* to std::string
189
190 2014-06-02: Hunspell 1.3.3 release:
191 - OpenDocument (ODF and Flat ODF) support (ODF needs unzip program)
192 - various bug fixes
193
194 2011-02-02: Hunspell 1.3.2 release:
195 - fix library versioning
196 - improved manual
197
198 2011-02-02: Hunspell 1.3.1 release:
199 - bug fixes
200
201 2011-01-26: Hunspell 1.2.15/1.3 release:
202 - new features: MAXDIFF, ONLYMAXDIFF, MAXCPDSUGS, FORBIDWARN, see manual
203 - bug fixes
204
205 2011-01-21:
206 - new features: FORCEUCASE and WARN, see manual
207 - new options: -r to filter potential mistakes (rare words
208 marked by flag WARN in the dictionary)
209 - limited and optimized suggestions
210
211 2011-01-06: Hunspell 1.2.14 release:
212 - bug fix
213 2011-01-03: Hunspell 1.2.13 release:
214 - bug fixes
215 - improved compound handling and
216 other improvements supported by OpenTaal Foundation, Netherlands
217 2010-07-15: Hunspell 1.2.12 release
218 2010-05-06: Hunspell 1.2.11 release:
219 - Maintenance release bug fixes
220 2010-04-30: Hunspell 1.2.10 release:
221 - Maintenance release bug fixes
222 2010-03-03: Hunspell 1.2.9 release:
223 - Maintenance release bug fixes and warnings
224 - MAP support for composed characters or character sequences
225 2008-11-01: Hunspell 1.2.8 release:
226 - Default BREAK feature and better hyphenated word suggestion to accept
227 and fix (compound) words with hyphen characters by the spell checker
228 instead of by the word breaking code of OpenOffice.org. With this feature
229 it's possible to accept hyphenated compound words, such as "scot-free",
230 where "scot" is not a correct English word.
231
232 - ICONV & OCONV: input and output conversion tables for optional character
233 handling or using special inner format. Example:
234
235 # Accepting de facto replacements of the Romanian comma-below letters
236 SET UTF-8
237 ICONV 4
238 ICONV ş ș
239 ICONV ţ ț
240 ICONV Ş Ș
241 ICONV Ţ Ț
242
243 Typical usage of ICONV/OCONV is to manage an inner format for a segmental
244 writing system, like the Ethiopic script of the Amharic language.
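
How such a conversion table acts on the input can be sketched in Python.
The code is an illustration under simplifying assumptions, not Hunspell's
implementation: the helper names parse_conv_table and apply_conv are
hypothetical, lines are split on whitespace, and the longest pattern is
tried first at each position.

```python
def parse_conv_table(lines, keyword="ICONV"):
    """Collect 'KEYWORD pattern replacement' lines into a dict."""
    table = {}
    for line in lines:
        fields = line.split()
        # The count line ("ICONV 4") has only two fields and is skipped.
        if len(fields) == 3 and fields[0] == keyword:
            table[fields[1]] = fields[2]
    return table


def apply_conv(text, table):
    """Rewrite text left to right, preferring the longest matching pattern."""
    patterns = sorted(table, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for pattern in patterns:
            if text.startswith(pattern, i):
                out.append(table[pattern])
                i += len(pattern)
                break
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


aff_lines = ["SET UTF-8", "ICONV 4", "ICONV ş ș", "ICONV ţ ț",
             "ICONV Ş Ș", "ICONV Ţ Ț"]
table = parse_conv_table(aff_lines)
print(apply_conv("şi ţară", table))  # și țară
```

The cedilla forms typed by the user are mapped to the comma-below forms
stored in the dictionary before the word is checked.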
245
246 - Extended CHECKCOMPOUNDPATTERN to handle compound word alternations, like
247 the sandhi feature of Telugu and other writing systems.
248
249 - SIMPLIFIEDTRIPLE compound word feature: allow simplified Swedish and
250 Norwegian compound word forms, like tillåta (till|låta) and
251 bussjåfør (buss|sjåfør)
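
The simplification rule can be written down in a few lines of Python. The
sketch shows the generating direction only (joining two words), while a
spell checker has to recognize the result; the function name join_compound
and its parameter are hypothetical.

```python
def join_compound(first, second, simplified_triple=True):
    """Join two compound parts, reducing a triple letter to a double one."""
    makes_triple = (len(first) >= 2
                    and first[-1] == first[-2]
                    and second[:1] == first[-1])
    if simplified_triple and makes_triple:
        # Drop the first letter of the second part: till + låta -> tillåta
        return first + second[1:]
    return first + second


print(join_compound("till", "låta"))    # tillåta
print(join_compound("buss", "sjåfør"))  # bussjåfør
```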
252
253 - wordforms: word generator script for dictionary developers (Hunspell
254 version of unmunch).
255
256 - bug fixes
257
258 2008-08-15: Hunspell 1.2.7 release:
259 - FULLSTRIP: new option for affix handling. With FULLSTRIP, affix rules can
260 strip full words, instead of having to leave at least one character.
261 - COMPOUNDRULE works with all flag types. (COMPOUNDRULE is for pattern
262 matching. For example, en_US dictionary of OpenOffice.org uses COMPOUNDRULE
263 for ordinal number recognition: 1st, 2nd, 11th, 12th, 22nd, 112th, 1000122nd
264 etc.).
265 - optimized suggestions:
266 - modified 1-character distance suggestion algorithms: search a TRY character
267 in all positions instead of all TRY characters in a character position
268 (it can give a more readable suggestion order, also better suggestions
269 in the first positions, when TRY characters are sorted by frequency.)
270 For example, suggestions for "moze":
271 ooze, doze, Roze, maze, more etc. (Hunspell 1.2.6),
272 maze, more, mote, ooze, mole etc. (Hunspell 1.2.7).
273 - extended compound word checking for better COMPOUNDRULE related
274 suggestions, for example English ordinal numbers: 121323th -> 121323rd
275 (it needs also a th->rd REP definition).
276 - bug fixes
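
The change of loop order in the 1-character replacement search of the
1.2.7 notes above can be reproduced with a short Python sketch. It is an
illustration, not Hunspell's code: the function names, the tiny word list
and the frequency-ordered TRY string are assumptions chosen for the
example.

```python
def position_first(word, try_chars):
    """Old order: for each position, try every TRY character."""
    for i in range(len(word)):
        for char in try_chars:
            if char != word[i]:
                yield word[:i] + char + word[i + 1:]


def try_char_first(word, try_chars):
    """New order: for each TRY character, try every position."""
    for char in try_chars:
        for i in range(len(word)):
            if char != word[i]:
                yield word[:i] + char + word[i + 1:]


def suggestions(candidates, dictionary):
    """Keep dictionary words, in the order they were generated."""
    found = []
    for candidate in candidates:
        if candidate in dictionary and candidate not in found:
            found.append(candidate)
    return found


TRY = "esianrtolcdugmphbyfvkwz"  # assumed, sorted by letter frequency
WORDS = {"ooze", "doze", "maze", "more", "mote", "mole"}

print(suggestions(position_first("moze", TRY), WORDS))
# ['ooze', 'doze', 'maze', 'more', 'mote', 'mole']
print(suggestions(try_char_first("moze", TRY), WORDS))
# ['maze', 'more', 'mote', 'ooze', 'mole', 'doze']
```

With the TRY characters as the outer loop, replacements by frequent
letters come first wherever they occur in the word.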
277
278 2008-07-15: Hunspell 1.2.6 release:
279 - bug fix release (fix affix rule condition checking of sk_SK dictionary,
280 iconv support in stemming and morphological analysis of the Hunspell
281 utility, see also Changelog)
282
283 2008-07-09: Hunspell 1.2.5 release:
284 - bug fix release (fix affix rule condition checking of en_GB dictionary,
285 also morphological analysis by dictionaries with two-level suffixes)
286
287 2008-06-18: Hunspell 1.2.4-2 release:
288 - fix GCC compiler warnings
289
290 2008-06-17: Hunspell 1.2.4 release:
291 - add free_list() for C, C++ interfaces to deallocate suggestion lists
292
293 - bug fixes
294
295 2008-06-17: Hunspell 1.2.3 release:
296 - extended XML interface to use morphological functions by standard
297 spell checking interface, spell() and suggest(). See hunspell.3 manual page.
298
299 - default dash suggestions for compound words: newword-> new word and new-word
300
301 - new manual pages: hunspell.3, hzip.1, hunzip.1.
302
303 - bug fixes
304
305 2008-04-12: Hunspell 1.2.2 release:
306 - extended dictionary (dic file) support to use multiple base and
307 special dictionaries.
308
309 - new and improved options of command line hunspell:
310 -m: morphological analysis or flag debug mode (without affix
311 rule data it shows the flags of the affix rules)
312 -s: stemming mode
313 -D: list available dictionaries and search path
314 -d: support extra dictionaries by comma separated list. Example:
315
316 hunspell -d en_US,en_med,de_DE,de_med,de_geo UNESCO.txt
317
318 - forbidding in personal dictionary (with asterisk; / marks affixation)
319
320 - optional compressed dictionary format "hzip" for aff and dic files
321 usage:
hzip example.aff example.dic
mv example.aff example.dic /tmp
hunspell -d example
hunzip example.aff.hz >example.aff
hunzip example.dic.hz >example.dic
327
328 - new affix compression tool "affixcompress": compression tool for
329 large (millions of words) dictionaries.
330
331 - support encrypted dictionaries for closed OpenOffice.org extensions or
332 other commercial programs
333
334 - improved manual
335
336 - bug fixes
337
338 2007-11-01: Hunspell 1.2.1 release:
339 - new memory efficient condition checking algorithm for affix rules
340
341 - new morphological functions:
342 - stem() for stemming
343 - analyze() for morphological analysis
344 - generate() for morphological generation
345
346 - new demos:
347 - analyze: stemming, morphological analysis and generation
348 - chmorph: morphological conversion of texts
349
350 2007-09-05: Hunspell 1.1.12 release:
351 - dictionary based phonetic suggestion for words with
352 special or foreign pronunciation or alternative (bad) transliteration
353 (see Changelog, tests/phone.* and manual).
354
355 - improved data structure and memory optimization for dictionaries
356 with variable count fields
357
358 - bug fixes for Unicode encoding dictionaries and ngram suggestions
359
360 - improved REP suggestions with space: it works without dictionary
361 modification
362
363 - updated and new project files for Windows API
364
365 2007-08-27: Hunspell 1.1.11 release:
366 - portability fixes
367
368 2007-08-23: Hunspell 1.1.10 release:
369 - pronunciation based suggestion using Björn Jacke's original Aspell
370 phonetic transcription algorithm (http://aspell.net), relicensed under
371 GPL/LGPL/MPL tri-license with the permission of the author
372
373 - keyboard based suggestion by KEY (see manual)
374
375 - better time limits for suggestion search
376
377 - test environment for suggestion based on Wikipedia data
378
379 - bug fixes for non standard Mozilla platforms etc.
380
381 2007-07-25: Hunspell 1.1.9 release:
382 - better tokenization:
383 - for URLs, mail addresses and directory paths (default: skip these tokens)
384 - for colons in words (for Finnish and Swedish)
385
386 - new examples:
387 - affixation of personal dictionary words
388 - digits in words
389
390 - bug fixes (see ChangeLog)
391
392 2007-07-16: Hunspell 1.1.8 release:
393 - better Mac OS X/Cygwin and Windows compatibility
394
395 - fix Hunspell's Valgrind environment and memory handling errors
396 detected by Valgrind
397
398 - other bug fixes (see ChangeLog)
399
400 2007-07-06: Hunspell 1.1.7 release:
401 - fix warning messages of OpenOffice.org build
402
403 2007-06-29: Hunspell 1.1.6 release:
404 - check capitalization of the following word forms
405 - words with mixed capitalisation: OpenOffice.org - OPENOFFICE.ORG
406 - allcap words and suffixes: UNICEF's - UNICEF'S
407 - prefixes with apostrophe and proper names: Sant'Elia - SANT'ELIA
408
409 - suggestion for missing sentence spacing: something.The -> something. The
410
411 - Hunspell executable: improved locale support
412 - -i option: custom input encoding
413 - use locale data for default dictionary names.
414 - tools/hunspell.cxx: fix 8-bit tokenization (letters without
415 casing, like ß or Hebrew characters now are handled well)
416 - dictionary search path (automatic detection of OpenOffice.org directories)
417 - DICPATH environment variable
418 - -D option: show directory path of loaded dictionary
419
420 - patches and bug fixes for Mozilla, OpenOffice.org.
421
422 2007-03-19: Hunspell 1.1.5 release:
423 - optimizations: 10-100% speed up, smaller code size and memory footprint
424 (conditional experimental code and warning messages)
425
426 - extended Unicode support:
427 - non BMP Unicode characters in dictionary words and affixes (except
428 affix rules and conditions)
429 - support BOM sequence in aff and dic files
430
431 - IGNORE feature for Arabic diacritics and other optional characters
432
433 - New edit distance suggestion methods:
434 - capitalisation: nasa -> NASA
435 - long swap: permenant -> permanent
436 - long move: Ghandi -> Gandhi, greatful -> grateful
437 - double two characters: vacacation -> vacation
438 - spaces in REP sug.: REP alot a_lot (NOTE: "a lot" must be a dictionary word)
439
440 - patches and bug fixes for Mozilla, OpenOffice.org, Emacs, MinGW, Aqua,
441 German and Arabic language, etc.
442
443 2006-02-01: Hunspell 1.1.4 release:
444 - Improved suggestion for typical OCR bugs (missing spaces between
445 capitalized words). For example: "aNew" -> "a New".
446 http://qa.openoffice.org/issues/show_bug.cgi?id=58202
447
448 - tokenization fixes (fix incomplete tokenization of input texts on big-endian
449 platforms, and locale-dependent tokenization of dictionary entries)
450
451 2006-01-06: Hunspell 1.1.3.2 release:
452 - fix Visual C++ compiling errors
453
454 2006-01-05: Hunspell 1.1.3 release:
455 - GPL/LGPL/MPL tri-license for Mozilla integration
456
457 - Alias compression of flag sets and morphological descriptions.
458 (For example, 16 MB Arabic dic file can be compressed to 1 MB.)
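
The idea behind alias compression of flag sets can be sketched in Python.
This is not the makealias tool: the function name alias_compress and its
input format are hypothetical. Each distinct flag set is stored once as a
numbered AF line, and the dictionary entries refer to it by number.

```python
def alias_compress(entries):
    """Turn (word, flags) pairs into AF alias lines and dic lines."""
    aliases = {}  # flag set -> 1-based alias number, in first-seen order
    dic_lines = []
    for word, flags in entries:
        if not flags:
            dic_lines.append(word)
            continue
        number = aliases.setdefault(flags, len(aliases) + 1)
        dic_lines.append(f"{word}/{number}")
    # The first AF line gives the number of aliases that follow.
    af_lines = [f"AF {len(aliases)}"] + [f"AF {flags}" for flags in aliases]
    return af_lines, dic_lines


af, dic = alias_compress([("work", "DGS"), ("walk", "DGS"),
                          ("cat", "S"), ("the", "")])
print(af)   # ['AF 2', 'AF DGS', 'AF S']
print(dic)  # ['work/1', 'walk/1', 'cat/2', 'the']
```

The saving grows with the length of the flag sets and with how often they
repeat, which is why dictionaries with long repeated flag sets or
morphological descriptions benefit most.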
459
460 - Improved suggestion.
461
462 - Improved, language independent German sharp s casing with CHECKSHARPS
463 declaration.
464
465 - Unicode tokenization in Hunspell program.
466
467 - Bug fixes (at new and old compound word handling methods), etc.
468
469 2005-11-11: Hunspell 1.1.2 release:
470
471 - Bug fixes (MAP Unicode, COMPOUND pattern matching, ONLYINCOMPOUND
472 suggestions)
473
474 - Checked with 51 regression tests in Valgrind debugging environment,
475 and tested with 52 OOo dictionaries on i686-pc-linux platform.
476
477 2005-11-09: Hunspell 1.1.1 release:
478
479 - Compound word patterns for complex compound word handling and
480 simple word-level lexical scanning. Ideal for checking
481 Arabic and Roman numbers, ordinal numbers in English, affixed
482 numbers in agglutinative languages, etc.
483 http://qa.openoffice.org/issues/show_bug.cgi?id=53643
484
485 - Support ISO-8859-15 encoding for French (French oe ligatures are
486 missing from the latin-1 encoding).
487 http://qa.openoffice.org/issues/show_bug.cgi?id=54980
488
489 - Implemented a flag to forbid obscene word suggestion:
490 http://qa.openoffice.org/issues/show_bug.cgi?id=55498
491
492 - Checked with 50 regression tests in Valgrind debugging environment,
493 and tested with 52 OOo dictionaries.
494
495 - other improvements and bug fixes (see ChangeLog)
496
497 2005-09-19: Hunspell 1.1.0 release
498
499 * complete comparison with MySpell 3.2 (from OpenOffice.org 2 beta)
500
501 * improved ngram suggestion with swap character detection and
502 case insensitivity
503
504 ------ examples for ngram improvement (input word and suggestions) -----
505
506 1. pernament (instead of permanent)
507
508 MySpell 3.2: tournaments, tournament, ornaments, ornament's, ornamenting, ornamented,
509 ornament, ornamentals, ornamental, ornamentally
510
511 Hunspell 1.0.9: ornamental, ornament, tournament
512
513 Hunspell 1.1.0: permanent
514
515 Note: swap character detection
516
517
518 2. PERNAMENT (instead of PERMANENT)
519
520 MySpell 3.2: -
521
522 Hunspell 1.0.9: -
523
524 Hunspell 1.1.0: PERMANENT
525
526
527 3. Unesco (instead of UNESCO)
528
529 MySpell 3.2: Genesco, Ionesco, Genesco's, Ionesco's, Frescoing, Fresco's,
530 Frescoed, Fresco, Escorts, Escorting
531
532 Hunspell 1.0.9: Genesco, Ionesco, Fresco
533
534 Hunspell 1.1.0: UNESCO
535
536
537 4. siggraph's (instead of SIGGRAPH's)
538
539 MySpell 3.2: serigraph's, photograph's, serigraphs, physiography's,
540 physiography, digraphs, serigraph, stratigraphy's, stratigraphy
541 epigraphs
542
543 Hunspell 1.0.9: serigraph's, epigraph's, digraph's
544
545 Hunspell 1.1.0: SIGGRAPH's
546
547 --------------- end of examples --------------------
548
549 * improved testing environment with suggestion checking and memory debugging
550
551 memory debugging of all tests with a simple command:
552
553 VALGRIND=memcheck make check
554
555 * lots of other improvements and bug fixes (see ChangeLog)
556
557
558 2005-08-26: Hunspell 1.0.9 release
559
560 * improved related character map suggestion
561
562 * improved ngram suggestion
563
564 ------ examples for ngram improvement (O=old, N = new ngram suggestions) --
565
566 1. Permenant (instead of Permanent)
567
568 O: Endangerment, Ferment, Fermented, Deferment's, Empowerment,
569 Ferment's, Ferments, Fermenting, Countermen, Weathermen
570
571 N: Permanent, Supermen, Preferment
572
573 Note: Ngram suggestions were case sensitive.
574
575 2. permenant (instead of permanent)
576
577 O: supermen, newspapermen, empowerment, endangerment, preferments,
578 preferment, permanent, preferment's, permanently, impermanent
579
580 N: permanent, supermen, preferment
581
582 Note: new suggestions are also weighted with longest common subsequence,
583 first letter and common character positions
584
585 3. pernemant (instead of permanent)
586
587 O: pimpernel's, pimpernel, pimpernels, permanently, permanents, permanent,
588 supernatant, impermanent, semipermanent, impermanently
589
590 N: permanent, supernatant, pimpernel
591
592 Note: new method also prefers root words over irrelevant
593 affixes ('s, s and ly)
594
595
596 4. pernament (instead of permanent)
597
598 O: tournaments, tournament, ornaments, ornament's, ornamenting, ornamented,
599 ornament, ornamentals, ornamental, ornamentally
600
601 N: ornamental, ornament, tournament
602
603 Note: Both ngram methods miss here.
604
605
606 5. obvus (instead of obvious):
607
608 O: obvious, Corvus, obverse, obviously, Jacobus, obtuser, obtuse,
609 obviates, obviate, Travus
610
611 N: obvious, obtuse, obverse
612
613 Note: new method also prefers common first letters.
614
615
616 6. unambigus (instead of unambiguous)
617
618 O: unambiguous, unambiguity, unambiguously, ambiguously, ambiguous,
619 unambitious, ambiguities, ambiguousness
620
621 N: unambiguous, unambiguity, unambitious
622
623
624
625 7. consecvence (instead of consequence)
626
627 O: consecutive, consecutively, consecutiveness, nonconsecutive, consequence,
628 consecutiveness's, convenience's, consistences, consistence
629
630 N: consequence, consecutive, consecrates
631
632
633 An example in a language with rich morphology:
634
635 8. Misisipiben (instead of Mississippiben [`in Mississippi' in Hungarian]):
636
637 O: Misikédéiben, Pisisedéiben, Misikéiéiben, Pisisekéiben, Misikéiben,
638 Misikéidéiben, Misikékéiben, Misikéikéiben, Misikéiméiben, Mississippiiben
639
640 N: Mississippiben, Mississippiiben, Misiiben
641
642 Note: Suggesting irrelevant affixes was the biggest fault in ngram
643 suggestion for languages with a lot of affixes.
644
645 --------------- end of examples --------------------
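
The extra weighting mentioned in the note to example 2 (longest common
subsequence, first letter and common character positions) can be
illustrated with a Python sketch. It is not Hunspell's scoring function:
the function names are hypothetical and all weights are simply set to 1,
which is an arbitrary choice for the example.

```python
def ngram_overlap(a, b, n):
    """Count the n-grams of `a` that also occur somewhere in `b`."""
    return sum(1 for i in range(len(a) - n + 1) if a[i:i + n] in b)


def lcs_length(a, b):
    """Length of the longest common subsequence (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for char in a:
        cur = [0]
        for j, other in enumerate(b, 1):
            if char == other:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def score(misspelling, candidate):
    """Plain n-gram score plus the three extra weights (all weights 1)."""
    total = sum(ngram_overlap(misspelling, candidate, n) for n in (1, 2, 3))
    total += lcs_length(misspelling, candidate)
    total += 1 if misspelling[:1] == candidate[:1] else 0
    total += sum(1 for x, y in zip(misspelling, candidate) if x == y)
    return total


candidates = ["supermen", "preferment", "permanent"]
print(max(candidates, key=lambda word: score("permenant", word)))
# permanent
```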
646
647 * support twofold prefix cutting
648
649 * lots of other improvements and bug fixes (see ChangeLog)
650
651 * test Hunspell with 54 OpenOffice.org dictionaries:
652
653 source: ftp://ftp.services.openoffice.org/pub/OpenOffice.org/contrib/dictionaries
654
655 testing shell script:
656 -------------------------------------------------------
# Unpack every dictionary archive named like xx_YY.zip and test it.
for i in `ls *zip | grep '^[a-z]*_[A-Z]*[.]'`
do
    dic=`basename $i .zip`
    mkdir $dic
    echo unzip $dic
    unzip -d $dic $i 2>/dev/null
    cd $dic
    echo unmunch and test $dic
    # Expand all word forms, append a tab to each line, and list the
    # words of the first column (-1) that Hunspell rejects (-l).
    unmunch $dic.dic $dic.aff 2>/dev/null | awk '{print$0"\t"}' |
    hunspell -d $dic -l -1 >$dic.result 2>$dic.err || rm -f $dic.result
    cd ..
done
669 --------------------------------------------------------
670
671 test result (0 size is o.k.):
672
673 $ for i in *_*/*.result; do wc -c $i; done
674 0 af_ZA/af_ZA.result
675 0 bg_BG/bg_BG.result
676 0 ca_ES/ca_ES.result
677 0 cy_GB/cy_GB.result
678 0 cs_CZ/cs_CZ.result
679 0 da_DK/da_DK.result
680 0 de_AT/de_AT.result
681 0 de_CH/de_CH.result
682 0 de_DE/de_DE.result
683 0 el_GR/el_GR.result
684 6 en_AU/en_AU.result
685 0 en_CA/en_CA.result
686 0 en_GB/en_GB.result
687 0 en_NZ/en_NZ.result
688 0 en_US/en_US.result
689 0 eo_EO/eo_EO.result
690 0 es_ES/es_ES.result
691 0 es_MX/es_MX.result
692 0 es_NEW/es_NEW.result
693 0 fo_FO/fo_FO.result
694 0 fr_FR/fr_FR.result
695 0 ga_IE/ga_IE.result
696 0 gd_GB/gd_GB.result
697 0 gl_ES/gl_ES.result
698 0 he_IL/he_IL.result
699 0 hr_HR/hr_HR.result
700 200694989 hu_HU/hu_HU.result
701 0 id_ID/id_ID.result
702 0 it_IT/it_IT.result
703 0 ku_TR/ku_TR.result
704 0 lt_LT/lt_LT.result
705 0 lv_LV/lv_LV.result
706 0 mg_MG/mg_MG.result
707 0 mi_NZ/mi_NZ.result
708 0 ms_MY/ms_MY.result
709 0 nb_NO/nb_NO.result
710 0 nl_NL/nl_NL.result
711 0 nn_NO/nn_NO.result
712 0 ny_MW/ny_MW.result
713 0 pl_PL/pl_PL.result
714 0 pt_BR/pt_BR.result
715 0 pt_PT/pt_PT.result
716 0 ro_RO/ro_RO.result
717 0 ru_RU/ru_RU.result
718 0 rw_RW/rw_RW.result
719 0 sk_SK/sk_SK.result
720 0 sl_SI/sl_SI.result
721 0 sv_SE/sv_SE.result
722 0 sw_KE/sw_KE.result
723 0 tet_ID/tet_ID.result
724 0 tl_PH/tl_PH.result
725 0 tn_ZA/tn_ZA.result
726 0 uk_UA/uk_UA.result
727 0 zu_ZA/zu_ZA.result
728
729 In the en_AU dictionary, there is an abbreviation with two dots (`eqn..'), but
730 `eqn.' is missing. Presumably it is a dictionary bug. MySpell also
731 doesn't accept it.
732
733 The Hungarian dictionary contains pseudoroots and forbidden words.
734 Unmunch doesn't support these features yet, and generates bad words, too.
735
736 * check affix rules and OOo dictionaries. Detected bugs in the cs_CZ,
737 es_ES, es_NEW, es_MX, lt_LT, nn_NO, pt_PT, ro_RO, sk_SK and sv_SE dictionaries.
738
739 Details:
740 --------------------------------------------------------
741 cs_CZ
742 warning - incompatible stripping characters and condition:
743 SFX D us ech [^ighk]os
744 SFX D us y [^i]os
745 SFX Q os ech [^ghk]es
746 SFX M o ech [^ghkei]a
747 SFX J ém ej ám
748 SFX J ém ejme ám
749 SFX J ém ejte ám
750 SFX A oužit up oupit
751 SFX A oužit upme oupit
752 SFX A oužit upte oupit
753 SFX A nout l [aeiouyáéíóúýůěr][^aeiouyáéíóúýůěrl][^aeiouy
754 SFX A nout l [aeiouyáéíóúýůěr][^aeiouyáéíóúýůěrl][^aeiouy
755
756 es_ES
757 warning - incompatible stripping characters and condition:
758 SFX W umar úse [ae]husar
759 SFX W emir iñáis eñir
760
761 es_NEW
762 warning - incompatible stripping characters and condition:
763 SFX I unan únen unar
764
765 es_MX
766 warning - incompatible stripping characters and condition:
767 SFX A a ote e
768 SFX W umar úse [ae]husar
769 SFX W emir iñáis eñir
770
771 lt_LT
772 warning - incompatible stripping characters and condition:
773 SFX U ti siuosi tis
774 SFX U ti siuosi tis
775 SFX U ti siesi tis
776 SFX U ti siesi tis
777 SFX U ti sis tis
778 SFX U ti sis tis
779 SFX U ti simės tis
780 SFX U ti simės tis
781 SFX U ti sitės tis
782 SFX U ti sitės tis
783
784 nn_NO
785 warning - incompatible stripping characters and condition:
786 SFX D ar rar [^fmk]er
787 SFX U Øre orde ere
788 SFX U Øre ort ere
789
790 pt_PT
791 warning - incompatible stripping characters and condition:
792 SFX g ãos oas ão
793 SFX g ãos oas ão
794
795 ro_RO
796 warning - bad field number:
797 SFX L 0 le [^cg] i
798 SFX L 0 i [cg] i
799 SFX U 0 i [^i] ii
800 warning - incompatible stripping characters and condition:
801 SFX P l i l (<- there is an unnecessary tabulator here)
802 SFX I a ii [gc] a
803 warning - bad field number:
804 SFX I a ii [gc] a
805 SFX I a ei [^cg] a
806
807 sk_SK
808 warning - incompatible stripping characters and condition:
809 SFX T ľať olú klať
810 SFX T ľať olúc klať
811 SFX T sľať šlú slať
812 SFX T sľať šlúc slať
813 SFX R ľcť lčiem ĺcť
814 SFX R iásť ätie miasť
815 SFX R iezť iem [^i]ezť
816 SFX R iezť ieš [^i]ezť
817 SFX R iezť ie [^i]ezť
818 SFX R iezť eme [^i]ezť
819 SFX R iezť ete [^i]ezť
820 SFX R iezť ú [^i]ezť
821 SFX R iezť úc [^i]ezť
822 SFX R iezť z [^i]ezť
823 SFX R iezť me [^i]ezť
824 SFX R iezť te [^i]ezť
825
826 sv_SE
827 warning - bad field number:
828 SFX C 0 net nets [^e]n
829 --------------------------------------------------------
830
831 2005-08-01: Hunspell 1.0.8 release
832
833 - improved compound word support
834 - fix German S handling
835 - port MySpell files and MAP feature
836
837 2005-07-22: Hunspell 1.0.7 release
838
839 2005-07-21: new home page: http://hunspell.sourceforge.net