How Can I Eliminate Duplicate Lines in a Text File?

From the Microsoft TechNet article "Hey, Scripting Guy!"

1. Run cscript with no arguments at a command prompt to see its usage and options:

C:\>cscript
Microsoft (R) Windows Script Host Version 5.6
Copyright (C) Microsoft Corporation 1996-2001. All rights reserved.

Usage: CScript scriptname.extension [option...] [arguments...]

Options:
//B Batch mode: Suppresses script errors and prompts from displaying
//D Enable Active Debugging
//E:engine Use engine for executing script
//H:CScript Changes the default script host to CScript.exe
//H:WScript Changes the default script host to WScript.exe (default)
//I Interactive mode (default, opposite of //B)
//Job:xxxx Execute a WSF job
//Logo Display logo (default)
//Nologo Prevent logo display: No banner will be shown at execution time
//S Save current command line options for this user
//T:nn Time out in seconds: Maximum time a script is permitted to run
//X Execute script in debugger
//U Use Unicode for redirected I/O from the console

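cscript is the command we use to run Windows scripts from the command line.

2. Here is the script. It reads c:\scripts\namelist.txt line by line and uses a Dictionary to keep only the first occurrence of each line: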


Const ForReading = 1

' The Dictionary stores each distinct line once, keyed by the line text.
Set objDictionary = CreateObject("Scripting.Dictionary")
Set objFSO = CreateObject("Scripting.FileSystemObject")

Set objFile = objFSO.OpenTextFile _
    ("c:\scripts\namelist.txt", ForReading)

' Read the file line by line; add a line only if it has not been seen yet.
Do Until objFile.AtEndOfStream
    strName = objFile.ReadLine
    If Not objDictionary.Exists(strName) Then
        objDictionary.Add strName, strName
    End If
Loop

objFile.Close

' Print each unique line to standard output.
For Each strKey in objDictionary.Keys
    Wscript.Echo strKey
Next

3. Copy this script and save it as a file named test.vbs.

4. At the command prompt, type:

C:\>cscript //Nologo test.vbs > unique.lines.txt

5. The file unique.lines.txt will contain the lines of namelist.txt with all the duplicates removed.

(I’ve tested this code on Windows 2000)
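
A variation on the script above, as a minimal sketch only: instead of hard-coding the input path and redirecting the output, it takes both file names as command-line arguments and writes the unique lines directly to the output file. The script name dedupe.vbs and the argument order are assumptions made for illustration and are not part of the original article. This variant has not been tested on Windows 2000.

```vbscript
' Sketch: remove duplicate lines and write the result straight to a file.
' Hypothetical usage: cscript //Nologo dedupe.vbs input.txt output.txt
Option Explicit

Const ForReading = 1

Dim objDictionary, objFSO, objInFile, objOutFile, strLine, strKey

' Both file names are required (argument order is an assumption of this sketch).
If WScript.Arguments.Count <> 2 Then
    WScript.Echo "Usage: cscript //Nologo dedupe.vbs <input> <output>"
    WScript.Quit 1
End If

Set objDictionary = CreateObject("Scripting.Dictionary")
Set objFSO = CreateObject("Scripting.FileSystemObject")

' Optional: uncomment to treat "Ken" and "KEN" as the same line.
' CompareMode can only be changed while the dictionary is still empty.
' objDictionary.CompareMode = vbTextCompare

' Same technique as the original script: keep the first occurrence of each line.
Set objInFile = objFSO.OpenTextFile(WScript.Arguments(0), ForReading)
Do Until objInFile.AtEndOfStream
    strLine = objInFile.ReadLine
    If Not objDictionary.Exists(strLine) Then
        objDictionary.Add strLine, strLine
    End If
Loop
objInFile.Close

' True = overwrite the output file if it already exists.
Set objOutFile = objFSO.CreateTextFile(WScript.Arguments(1), True)
For Each strKey In objDictionary.Keys
    objOutFile.WriteLine strKey
Next
objOutFile.Close
```

With this version the redirection in step 4 is no longer needed; something like C:\>cscript //Nologo dedupe.vbs c:\scripts\namelist.txt unique.lines.txt should produce the same file.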
