Monday, 23 July 2018

shell script to count lines of code

At work, I was asked to count lines of code for a web development project under specific requirements: include only certain directories and files, exclude files in .git directories, skip blank (empty) lines and commented lines, and include files which aren't always clearly identified as belonging to a particular language. Strictly speaking, it doesn't count LOC as such, but lines that are neither blank nor commented, which in this case gives a fairly accurate LOC count.
UPDATE: Since writing my little script, I've come across other tools which are much, much more sophisticated, and would likely be a better fit if your use case is anything but very simple:
At the moment, this script supports single-line comments using the following comment characters and strings:
#
//
/* ... */

As well as multi-line comments using the following comment strings:
/* 
 ...
*/

This should cover most shell scripts, Perl, PHP, C, CSS, JS and HTML. Adding more should be trivial.
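As a rough illustration of what adding another style might look like, here is a minimal sketch that extends the single-line pattern with `--` (used by SQL and Lua, for example). The function name strip_single_line_comments and the choice of `--` are assumptions made for this example only; they are not part of the script further down.

```shell
#!/bin/bash

# Hypothetical helper: print the file with blank lines and single-line
# comments removed. The "--" alternative is the illustrative addition;
# "#" and "//" mirror what the script already handles.
strip_single_line_comments()
{
  sed -E '/^[[:space:]]*$/d; /^[[:space:]]*(#|\/\/|--)/d' "$1"
}

# Small demo on a throwaway file: only the two code lines should be printed.
sample=$(mktemp)
printf '%s\n' '-- an SQL comment' 'SELECT 1;' '' '  # shell comment' '// JS comment' 'x = 1' > "$sample"
strip_single_line_comments "$sample"
rm -f "$sample"
```

Supporting yet another prefix would then be a matter of adding one more alternative inside the parentheses.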
One issue with this script is that if a multi-line comment starts at the end of a line containing code, that line of code won't get counted. This is shown in the "test.script.bad" example below.
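One possible way around this limitation, sketched here under some assumptions, is to walk each line character range by character range in awk, remember whether we are inside a `/* ... */` comment, and keep only the text outside of it. The function name count_code_lines is hypothetical, and the sketch is deliberately naive: it does not understand string literals, so a `/*` inside quotes would still be treated as a comment start.

```shell
#!/bin/bash

# Hypothetical alternative counter: prints the number of lines that still
# contain something after /* ... */ comments, single-line comments and
# blank lines are removed.
count_code_lines()
{
  awk '
    {
      rest = $0      # part of the line not yet examined
      code = ""      # text found outside /* ... */ comments
      while (length(rest) > 0) {
        if (in_comment) {
          close_at = index(rest, "*/")
          if (close_at == 0) {
            rest = ""                          # comment continues on next line
          } else {
            rest = substr(rest, close_at + 2)  # resume after the closing marker
            in_comment = 0
          }
        } else {
          open_at = index(rest, "/*")
          if (open_at == 0) {
            code = code rest                   # no comment start: all code
            rest = ""
          } else {
            code = code substr(rest, 1, open_at - 1)
            rest = substr(rest, open_at + 2)
            in_comment = 1
          }
        }
      }
      if (code ~ /^[ \t]*(#|\/\/)/) next       # single-line comment
      if (code ~ /^[ \t]*$/) next              # nothing but whitespace left
      n++
    }
    END { print n + 0 }
  ' "$1"
}

# Usage (file name taken from the examples in this post):
#   count_code_lines test.script.bad
```

With this approach the line "a bit of code /* mixed with" is counted, because the text before the comment start is kept as code.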
Script in action:
$ get.line.counts.sh 
=====================================
Counting lines in files
=====================================
0 empty file, ugly name
0 test.script.bad
3 test.script.good
-------------------------------------
Total lines: 3

=====================================
Counting lines in dirs
=====================================
3 test.dir/test.script.good
-------------------------------------
Total lines: 3
The above example used the following test files:
$ find -type f
./empty file, ugly name
./test.script.bad
./test.script.good
./test.dir/test.script.good
And here's what the test files contained:
empty file, ugly name:

test.script.bad:
a bit of code /* mixed with
multi line
comment */
test.script.good:
# this is a single line comment
    # single line comment with leading spaces
//another single line comment

 this is a
   bit of amazing
 code

 /* this is
 a multi line
 comment */

/* this is a single line comment */

   

  
test.dir/test.script.good:
# this is a single line comment
    # single line comment with leading spaces
//another single line comment

 this is a
   bit of amazing
 code

 /* this is
 a multi line
 comment */

/* this is a single line comment */

   

  
And here's the script:
#!/bin/bash

files="
empty file, ugly name
test.script.bad
test.script.good
"

dirs="
test.dir/
"

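# Print "<count><TAB><file>" for the file named in the global $f.
Count()
{
  # drop empty lines and single-line comments starting with // or #
  lcc=$(sed -r '/^$/d;/^([ ]+)?\/\//d;/^([ ]+)?#/d' "$f")
  lcn=$(echo "$lcc" | wc -l)
  # lines inside /* ... */ comments, including one-line /* ... */
  ml1=$(echo "$lcc" | awk '/\/\*/,/\*\// {++ml1} END {print ml1+0}')
  # whitespace-only lines (pattern inferred from the sample output above);
  # it also matches the one empty line echo emits for an empty file
  ml2=$(echo "$lcc" | awk '/^[ \t]*$/ {++ml2} END {print ml2+0}')
  tot=$(( lcn - ml1 - ml2 ))
  printf '%s\t%s\n' "$tot" "$f"
}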

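# Count each file listed in $files.
countF()
{
 echo "$files" | sed '/^$/d' |
 while IFS= read -r f; do
   Count
 done
}

# Count every file under $dirs, skipping anything inside a .git directory.
countD()
{
 # $dirs is left unquoted so each listed directory becomes its own argument
 find $dirs -type f | sed '/\/\.git\//d' |
 while IFS= read -r f; do
   Count
 done
}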

totalF=$(countF | awk '{sum+=$1} END {print "Total lines: " sum}')
totalD=$(countD | awk '{sum+=$1} END {print "Total lines: " sum}')

if [ -n "$files" ]; then
  echo =====================================
  echo Counting lines in files
  echo =====================================
  countF
  echo -------------------------------------
  echo "$totalF"
  echo 
fi

if [ -n "$dirs" ]; then
  echo =====================================
  echo Counting lines in dirs
  echo =====================================
  countD
  echo -------------------------------------
  echo "$totalD"
fi
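
The script above runs each counting pass twice (once for the per-file listing and once for the total) and filters out .git paths after find has already walked them. A minimal sketch of a single-pass variant follows. The function name count_tree is hypothetical, and to keep the example short it only counts non-blank lines instead of calling Count. It reads file names line by line, so names containing newlines are not handled.

```shell
#!/bin/bash

# Hypothetical single-pass variant: prints "<count><TAB><file>" for every
# file under the given directories, then a total, skipping .git directories.
count_tree()
{
  # -prune stops find from descending into .git at all
  find "$@" -type d -name .git -prune -o -type f -print |
  while IFS= read -r f; do
    # stand-in for Count: number of lines containing a non-blank character
    n=$(awk '/[^ \t]/ {n++} END {print n+0}' "$f")
    printf '%s\t%s\n' "$n" "$f"
  done |
  # pass each line through unchanged while summing the first column
  awk '{print; sum += $1} END {print "Total lines: " sum + 0}'
}

# Usage (directory name taken from the examples in this post):
#   count_tree test.dir/
```

Because the directories are passed as quoted arguments, directory names containing spaces work without relying on word splitting.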
