About: Apache Nutch is web-search software. It builds on Lucene and Solr and adds web-specific components such as a crawler, a link-graph database, and parsers for HTML and other document formats. Source code. 1.x series.
Fossies downloads: / linux / www / apache-nutch-1.19-src.tar.gz (tar.bz2|tar.xz|zip)
No. of package member files: 3656 (2424 regular files in 1232 directories)
The corresponding CLOC output data:

2353 text files.
2268 unique files.
156 files ignored.

github.com/AlDanial/cloc v 1.94  T=1.85 s (1222.7 files/s, 304900.9 lines/s)
--------------------------------------------------------------------------------
Language             files    blank   comment      code  scale 3rd gen. equiv
--------------------------------------------------------------------------------
HTML                  1304     2765     69523    336767 x 1.90 =    639857.30
Java                   609    11831     25484     58271 x 1.36 =     79248.56
JavaScript              15     3892      5659     16065 x 1.48 =     23776.20
XML                    159     2165      3294     10509 x 1.90 =     19967.10
Text                    70     4690         0      6487 x 0.50 =      3243.50
Ant                     79      542      1385      1947 x 1.90 =      3699.30
CSS                      5       44       134      1478 x 1.00 =      1478.00
Markdown                13      197         0       547 x 1.00 =       547.00
Bourne Again Shell       3       78       166       546 x 3.81 =      2080.26
Properties               3       25        58       327 x 1.36 =       444.72
XSD                      3       18        50       295 x 1.90 =       560.50
XSLT                     2        5        31        68 x 1.90 =       129.20
DTD                      2       42       138        38 x 1.90 =        72.20
Bourne Shell             1        6         1        15 x 3.81 =        57.15
--------------------------------------------------------------------------------
SUM:                  2268    26300    105923    433360 x 1.79 =    775160.99
--------------------------------------------------------------------------------
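How the last two columns relate may not be obvious at first glance: each "3rd gen. equiv" value is the language's code line count multiplied by its scale factor, and the scale shown in the SUM row appears to be the total equivalent divided by the total code lines, rounded to two decimals. The following is only a minimal Python sketch of that arithmetic, using the numbers copied from the table above; the names `ROWS` and `third_gen_equiv` are illustrative and not part of cloc.

```python
from decimal import Decimal

# (language, code lines, scale factor) copied from the cloc table above.
# Scale factors are kept as strings so Decimal parses them exactly.
ROWS = [
    ("HTML", 336767, "1.90"),
    ("Java", 58271, "1.36"),
    ("JavaScript", 16065, "1.48"),
    ("XML", 10509, "1.90"),
    ("Text", 6487, "0.50"),
    ("Ant", 1947, "1.90"),
    ("CSS", 1478, "1.00"),
    ("Markdown", 547, "1.00"),
    ("Bourne Again Shell", 546, "3.81"),
    ("Properties", 327, "1.36"),
    ("XSD", 295, "1.90"),
    ("XSLT", 68, "1.90"),
    ("DTD", 38, "1.90"),
    ("Bourne Shell", 15, "3.81"),
]


def third_gen_equiv(code_lines: int, scale: str) -> Decimal:
    """Hypothetical helper: code lines weighted by the language scale factor."""
    return Decimal(code_lines) * Decimal(scale)


# Per-language equivalents, then the totals shown in the SUM row.
equivs = {lang: third_gen_equiv(code, scale) for lang, code, scale in ROWS}
total_code = sum(code for _, code, _ in ROWS)
total_equiv = sum(equivs.values())

# Assumption: the SUM-row scale is the code-weighted average of the factors.
effective_scale = (total_equiv / total_code).quantize(Decimal("0.01"))

for lang, value in equivs.items():
    print(f"{lang:<20}{value:>12.2f}")
print(f"{'SUM:':<20}{total_equiv:>12.2f}  (scale {effective_scale})")
```

Decimal arithmetic is used here instead of floats so that the products match the two-decimal figures in the table exactly rather than approximately.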
A hint: This "standard" CLOC analysis includes all files in the package (with the exception of files generated by code-production systems such as GNU autotools). A perhaps more "realistic" alternative CLOC analysis also exists. It additionally tries to exclude third-party code as well as files containing fonts, codepage or character-set definitions, dictionaries, names, SVG, or non-English languages, which among other things makes it better suited for an optional codespell check rating.