There are, of course, many ways to do it. The most straightforward is to open each file in your preferred text editor (Notepad, Notepad++, UltraEdit or whatever), press CTRL-END to jump to the end of the text, and read the current line number.
This option is fine for a small set of files, say ten or fifteen. With a larger set, fifty or more, it will probably take too much time. So let's look at the alternatives:
I. Using a DOS Batch
This option is simple and very quick to build. Experienced VBScript writers will prefer the next one, since it's more flexible, but most users should default to this one. The two are very similar and produce about the same basic result.
The Basics: A DOS batch file is a script for DOS (which is now called the Windows Command Prompt). Old-timers will remember dealing with .bat files in the old DOS days, usually to pass parameters when calling a program. In fact, .bat files could do much more even back then, and today they are a very handy tool for any Windows user who doesn't want to deal with the intricacies of VBScript programming.
Restrictions: Can only be applied to non-formatted files.
Downsides: It can only count lines, and because FOR /F skips blank lines, those are left out of the count. The VBScript solution can also apply filters, etc.
Recommended: If all you need is the total number of lines for each file.
How to:
A DOS batch file is a text file with the .bat extension (for instance, example.bat) that contains native DOS commands. It is purely interpreted: it is as if the user typed each command separately at the command prompt. To create one, open a new file in your favorite text editor (Notepad, UltraEdit, Notepad++, etc.) and save it with the .bat extension (countLOC.bat). Saving the file in c:\ makes it easy to reach from the command prompt. The first example counts the lines of a single text file only:
countLOC.bat (1)
@echo off
REM The following line sets delayed expansion, which is used to
REM make sure variables have dynamic value.

SETLOCAL ENABLEDELAYEDEXPANSION

REM The following FOR loop reads the file line by line
FOR /F "tokens=*" %%j in (data\text.txt) do (
    set /a numLines=!numLines!+1
)
echo data\text.txt !numLines!
To run the script:
- Open the DOS prompt (Start → Run, "cmd").
- Switch to the disk partition where the script was saved by typing the letter of the partition followed by a colon, like "c:", then Enter.
- If the prompt shows that you are in a different folder, type "cd \", which brings you to the root.
- Run the program by typing its name at the prompt:
c:\>countLOC
The two lines that begin with REM are comments: REM is short for REMARK, and anything following it is ignored. The SETLOCAL ENABLEDELAYEDEXPANSION line does what the comment above it says. In a batch file, a variable written as %numLines% is expanded once, when the whole FOR block is parsed, so it behaves as if it were static: its value does not change while the loop runs. With delayed expansion enabled, !numLines! is expanded each time the line executes, so it reflects the current value.
Finally, it's necessary to understand the FOR command. FOR is a very handy tool that lets script writers repeat a task a certain number of times. Its help is very comprehensive (type c:\>for /?) if you want a description of its uses. The most interesting form is /f, which uses files (or text, actually) as input. The command scans the file, "data\text.txt", line by line. Usually this would be combined with another command to extract some information from the file. In our case, though, the body of the loop simply increments the numLines variable each time it executes.
The /a option in SET tells the command to interpret its input as an arithmetic expression, so the set /a line inside the loop adds one to numLines. If you run the above script, you will see output like this:
C:\>countLOC.bat
data\text.txt 9
The next goal is to do this with all the files inside a certain folder. Take a look at the next version:
countLOC.bat (2)
@echo off
REM The following line sets delayed expansion, which is used to make sure variables
REM have real dynamic value.
SETLOCAL ENABLEDELAYEDEXPANSION
REM Get DIR from input
SET DIR=%1
for /f "tokens=*" %%i in ('dir /b /a-d %DIR%') do (
    REM Reset the counter so that each file is counted on its own
    set numLines=0
    for /f "tokens=*" %%j in (%DIR%\%%i) do (
        set /a numLines=!numLines!+1
    )
    echo %%i !numLines!
)
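The outer FOR loop runs dir /b /a-d on the folder, which lists only the file names (no subfolders), and the inner loop counts the lines of each file as before. Finally, I've "parameterized" the folder name. Batch files receive arguments from the command line through %1, %2, etc., so the user must pass the folder name as an argument, which is c:\data in our example: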
C:\>countLOC data
asd.txt 7
cassi.txt 10
DBD_All.txt 8926
SourceSafe.txt 8927
The script is ready. All you have to do is copy the solution from countLOC.bat (2) and save it under that name. If you want to learn more about DOS scripting, point your browser to Rob van der Woude's Scripting Pages.
II. Using a VBScript
Using a scripting language is probably the best way to go, since you can customize the solution however you want. This solution is based on an article by Hey Scripting Guy, from MSDN. I've used it more than once, to count LOC in source files and records in data files.
The basics: You will need to write a very simple VBScript to do it. VBScript isn't the best scripting language out there, sure, but it has a great advantage: any Windows installation comes with it. To 'program', just open your preferred text editor (we love text editors, don't we?) and save the file as something.vbs (as usual, "something" is anything you want). Then double-click this file in Windows Explorer and it will be compiled and run.
"Compiled?", you ask. Yes, oddly enough. VBScript is usually described as an interpreted language, but the script engine first feeds your whole file into a compiler of sorts, generating an intermediate 'binary' (not native machine code) in memory each time the script starts, and only then runs it. This means that a syntax error anywhere in the file stops the script before its first line executes.
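A quick way to check this for yourself is a script with a deliberate syntax error at the very end. The sketch below uses a hypothetical file name, compileDemo.vbs, and is only an illustration. If the engine executed statements one at a time as it read them, the first Echo would appear before the error. Instead, the script host should refuse to start and report something along the lines of a "compilation error".

```vbscript
' compileDemo.vbs (hypothetical file name, for illustration only)

' This statement is perfectly valid on its own.
WScript.Echo "If you can read this, the script started running."

' Deliberate mistake: this If block is never closed with End If.
' The whole file is checked before any of it is executed, so the
' error is reported up front and the Echo above is never shown.
If True Then
    WScript.Echo "Unreachable"
```

Adding the missing End If on a last line should make both messages appear.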
Restrictions: Can only be applied to non-formatted files.
Downsides: VBScript programming is a bit complex for newbies.
Recommended: If you want to apply some filters to the file, like checking whether a line is a comment, etc.
How to:
So, let's write this program. First you must understand that, to access files and other operating system functions, VBScript uses a "component" from Windows. It doesn't really matter what that object is, but if you are interested, just google it. To create this object, write this in the file:
countLOC.vbs (1):
Set fso = CreateObject("Scripting.FileSystemObject")
This creates an object named fso (short for file system object), which we'll use to access files and such. Let's say now that you want to open a file named "text.txt", which resides in "c:\data". All you have to do is:
countLOC.vbs (2):
Set fso = CreateObject("Scripting.FileSystemObject")
Set objTextFile = fso.OpenTextFile("c:\data\text.txt", 1)
This gives you access to objTextFile, which you will use to manipulate the text file itself. Looking at the arguments (the stuff enclosed in parentheses), you will notice a '1' being passed. This tells fso that the file is to be opened as read-only, preventing the script from messing with your text. Now we need to count the lines in text.txt. We do that by reading until the end of the file and then checking how many lines were read:
countLOC.vbs (3):
Const ForReading = 1
Set fso = CreateObject("Scripting.FileSystemObject")
Set objTextFile = fso.OpenTextFile("c:\data\text.txt", ForReading)
objTextFile.ReadAll
Wscript.Echo(objTextFile.Line)
In the code above, we made two modifications. First, the number that tells fso to open the file for reading only was replaced by a constant, which makes the code more readable. Second, we read the whole file using the ReadAll method and then printed, through Wscript.Echo, the number of the current line of the file. Since we read the whole file, the line printed is the last one.
The code above is the skeleton of what we will use to create the report containing the number of lines of every file. The first difference is that we'll get a list of all the files in the folder and then analyze each one of them. To do so, we will use iteration loops in our script. The second difference is that we will write the result to a file instead of showing pop-ups. It's easy but, if you are not really interested in learning the hows and whys, skip to the last code listing.
The way to retrieve the file listing from the folder is to call the method fso.GetFolder("folderPath"), which returns a folder object, and then use this object to retrieve the listing. We iterate through the listing with a For Each loop:
countLOC.vbs (4)
On Error Resume Next
Const ForReading = 1
Set fso = CreateObject("Scripting.FileSystemObject")
' Open an output file to write results. Existing files will be overwritten.
Set reportFile = fso.CreateTextFile("c:\FileList.txt", True)
' Get the file listing for the given folder
Set folder = fso.GetFolder("c:\data")
' Iterate through the files
For Each fileIdx In folder.Files
    ' Open the file using the name from fileIdx (the current file)
    Set objTextFile = fso.OpenTextFile("c:\data\" & fileIdx.Name, ForReading)
    objTextFile.ReadAll

    ' Output to the report file
    reportFile.WriteLine(fileIdx.Name & ";" & objTextFile.Line)
    ' Close the file before moving on to the next one
    objTextFile.Close

Next
' Close the report file
reportFile.Close
This is the gist of it. The script accesses a folder, "c:\data", iterates through all the files in this folder and writes each file's LOC to "c:\FileList.txt". But we want to make this into a real tool, right? So what's wrong? Well, the script needs to be edited every time you want to change the folder that contains the files or the name of the report. So let's parameterize those:
countLOC.vbs (5)
On Error Resume Next
Const ForReading = 1

' Both parameters are required: the folder name and the report name
If WScript.Arguments.Count < 2 Then
    Wscript.Echo "Invalid Syntax. Expected: countLOC.vbs folderName reportName"
    Wscript.Quit
End If

' Get first parameter as the folder name
sFolder = WScript.Arguments.Item(0)
' Second parameter is the report name
sReport = WScript.Arguments.Item(1)

' Create a file system object that will help us access files
Set fso = CreateObject("Scripting.FileSystemObject")

' Open an output file to write results. Existing files will be overwritten.
Set reportFile = fso.CreateTextFile(sReport, True)
' Get the file listing for the given folder
Set folder = fso.GetFolder(sFolder)

' Iterate through the files
For Each fileIdx In folder.Files
    ' Open the file using the name from fileIdx (the current file)
    Set objTextFile = fso.OpenTextFile(sFolder & "\" & fileIdx.Name, ForReading)
    objTextFile.ReadAll

    ' Output to the report file
    reportFile.WriteLine(fileIdx.Name & ";" & objTextFile.Line)
    ' Close the file before moving on to the next one
    objTextFile.Close

Next

' Close the report file
reportFile.Close
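To run it, pass both parameters on the command line. Type the .vbs extension explicitly: if countLOC.bat sits in the same folder, plain "countLOC" would start the batch file instead.
c:\>cscript countLOC.vbs folder output
Here, folder is the path of the folder that contains the files you want to analyze and output is the name of the report file.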
Now, a seasoned VBScript writer can adapt the code above to read each file line by line instead of using ReadAll, and check whether each line is empty, whether it's a comment, etc.
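A minimal sketch of that adaptation follows. It is not part of the original scripts: the function name CountCodeLines and the file name countCodeLines.vbs are made up for illustration, and it assumes that a "comment" is any line whose first non-blank character is an apostrophe, as in VBScript itself. Adjust the test to whatever comment marker your files use.

```vbscript
' countCodeLines.vbs (hypothetical name): count only the "real" lines of a file
Option Explicit
Const ForReading = 1

Dim fso
Set fso = CreateObject("Scripting.FileSystemObject")

' Returns the number of lines in sPath that are neither blank nor comments.
' Assumption: a comment line starts with an apostrophe (').
Function CountCodeLines(sPath)
    Dim objTextFile, sLine, nLines
    nLines = 0
    Set objTextFile = fso.OpenTextFile(sPath, ForReading)
    ' AtEndOfStream becomes True once every line has been read
    Do Until objTextFile.AtEndOfStream
        ' Trim strips leading and trailing spaces (but not tabs)
        sLine = Trim(objTextFile.ReadLine)
        If sLine <> "" And Left(sLine, 1) <> "'" Then
            nLines = nLines + 1
        End If
    Loop
    objTextFile.Close
    CountCodeLines = nLines
End Function

' Example call, using the sample file from earlier in the article
WScript.Echo CountCodeLines("c:\data\text.txt")
```

Run it with cscript so the count is printed to the console instead of a pop-up. To use it in the report script, the call to CountCodeLines would take the place of the ReadAll and Line pair inside the For Each loop.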