About: Apache Nutch is web-search software. It builds on Lucene and Solr, adding web-specific components such as a crawler, a link-graph database, and parsers for HTML and other document formats. Source code, 1.x series.
Fossies downloads: / linux / www / apache-nutch-1.19-src.tar.gz (tar.bz2|tar.xz|zip)
No. of package member files: 3656 (2424 files within 1232 directories)
The corresponding CLOC output data:

    2314 text files.
    2243 unique files.
     141 files ignored.

github.com/AlDanial/cloc v 1.94  T=7.77 s (288.6 files/s, 72431.2 lines/s)
-----------------------------------------------------------------------------------
Language               files     blank   comment      code     scale  3rd gen. equiv
-----------------------------------------------------------------------------------
HTML                    1294      2758     69151    335489  x   1.90 =     637429.10
Java                     605     11743     25353     57881  x   1.36 =      78718.16
JavaScript                11      3881      5635     16061  x   1.48 =      23770.28
XML                      159      2165      3294     10509  x   1.90 =      19967.10
Text                      64      4682         0      6424  x   0.50 =       3212.00
Ant                       79       542      1385      1947  x   1.90 =       3699.30
CSS                        5        44       134      1478  x   1.00 =       1478.00
Markdown                  13       197         0       547  x   1.00 =        547.00
Bourne Again Shell         3        78       166       546  x   3.81 =       2080.26
XSD                        3        18        50       295  x   1.90 =        560.50
Properties                 2        25        56       141  x   1.36 =        191.76
XSLT                       2         5        31        68  x   1.90 =        129.20
DTD                        2        42       138        38  x   1.90 =         72.20
Bourne Shell               1         6         1        15  x   3.81 =         57.15
-----------------------------------------------------------------------------------
SUM:                    2243     26186    105394    431439  x   1.79 =     771912.01
-----------------------------------------------------------------------------------
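The "scale" and "3rd gen. equiv" columns in the CLOC output appear to follow simple arithmetic: each language's code line count is multiplied by its scale factor, and the SUM row's scale is the total equivalent divided by the total code lines. The following Python sketch reproduces that arithmetic from the figures listed above. The names `ROWS`, `third_gen_equiv`, and `summarize` are hypothetical helpers for illustration and are not part of cloc.

```python
from decimal import Decimal, ROUND_HALF_UP

# (language, code lines, scale factor) copied from the CLOC output above.
# Scale factors are kept as strings so Decimal can represent them exactly.
ROWS = [
    ("HTML", 335489, "1.90"),
    ("Java", 57881, "1.36"),
    ("JavaScript", 16061, "1.48"),
    ("XML", 10509, "1.90"),
    ("Text", 6424, "0.50"),
    ("Ant", 1947, "1.90"),
    ("CSS", 1478, "1.00"),
    ("Markdown", 547, "1.00"),
    ("Bourne Again Shell", 546, "3.81"),
    ("XSD", 295, "1.90"),
    ("Properties", 141, "1.36"),
    ("XSLT", 68, "1.90"),
    ("DTD", 38, "1.90"),
    ("Bourne Shell", 15, "3.81"),
]


def third_gen_equiv(code_lines: int, scale: str) -> Decimal:
    """Return code_lines * scale as an exact decimal value."""
    return Decimal(code_lines) * Decimal(scale)


def summarize(rows):
    """Return (total code lines, total equivalent, overall scale to 2 places)."""
    total_code = sum(code for _, code, _ in rows)
    total_equiv = sum(
        (third_gen_equiv(code, scale) for _, code, scale in rows), Decimal(0)
    )
    # Assumption: the SUM row's scale is the ratio of the two totals, rounded.
    overall = (total_equiv / total_code).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )
    return total_code, total_equiv, overall


if __name__ == "__main__":
    for language, code, scale in ROWS:
        print(f"{language:<20}{code:>8} x {scale} = {third_gen_equiv(code, scale):.2f}")
    total_code, total_equiv, overall = summarize(ROWS)
    print(f"{'SUM:':<20}{total_code:>8} x {overall} = {total_equiv:.2f}")
```

Decimal arithmetic is used instead of floats so that the two-decimal figures can be compared exactly against the listed values.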
Hint: This alternative CLOC analysis tries to exclude third-party code and other content unsuited to a codespell analysis (e.g. files containing fonts, codepage or character-set definitions, dictionaries, names, SVG, or non-English text). A "standard" CLOC analysis is also available; it covers all package content files, except those generated by code-production systems such as GNU autotools.