Please note, this is a STATIC archive of website www.tutorialspoint.com from 11 May 2019, cach3.com does not collect or store any user information, there is no "phishing" involved.
Tutorialspoint

How to eliminate repeated lines in a text file with a Python function?

Suppose I have a text file with repeated lines, as follows:

A cow is an animal.
A cow is an animal.
A buffalo too is an animal.
Lion is the king of jungle.

How do I use a Python function to eliminate the repeated lines in the text file?


1 Answer
Rajendra Dharmkar

Let us name the given text file bar.txt.

We can use Python's file handling methods to remove duplicate lines from a text file. The following code is one way of removing the duplicates in bar.txt; the output is stored in foo.txt. The file names are relative paths, so Python resolves them against the current working directory. The simplest setup is to keep bar.txt in the same directory as the Python script and run the script from that directory; otherwise open() will not find bar.txt.
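
To make the "same directory" requirement explicit, one option is to build the paths from the script's own location instead of relying on the working directory. The following is a minimal sketch, assuming Python 3; the helper name path_next_to is hypothetical and not part of the original answer.

```python
from pathlib import Path


def path_next_to(script_file, name):
    """Return the absolute path of `name` in the directory that holds script_file.

    Hypothetical helper for illustration only.
    """
    # resolve() makes the script path absolute before taking its parent folder
    return Path(script_file).resolve().parent / name


# Inside a script this would typically be used as:
#     infile = open(path_next_to(__file__, "bar.txt"), "r")
```

With paths built this way, the script finds bar.txt regardless of the directory from which it is launched.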

The file bar.txt is as follows:

A cow is an animal.
A cow is an animal.
A buffalo too is an animal.
Lion is the king of jungle.

The code below removes the duplicate lines in bar.txt and stores the result in foo.txt:

# This program opens the file bar.txt, removes duplicate lines, and writes
# the result to foo.txt (Python 3).
lines_seen = set()  # holds lines already seen
print("The file bar.txt is as follows")
# The with statement closes both files, even if an error occurs.
with open('bar.txt', "r") as infile, open('foo.txt', "w") as outfile:
    for line in infile:
        print(line, end="")  # line already ends with a newline
        if line not in lines_seen:  # not a duplicate
            outfile.write(line)
            lines_seen.add(line)
print("The file foo.txt is as follows")
with open('foo.txt', "r") as result:
    for line in result:
        print(line, end="")

OUTPUT (the part that shows foo.txt)

The file foo.txt is as follows

A cow is an animal.
A buffalo too is an animal.
Lion is the king of jungle.
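
Since the question asks for a function, the same approach can be wrapped in one. This is a sketch under stated assumptions rather than the answer's own code: it assumes Python 3, the name remove_duplicate_lines and its return value are illustrative, and, unlike the script above, it compares lines without their trailing newline so that a final line with no newline still matches an earlier copy.

```python
def remove_duplicate_lines(src, dst):
    """Copy src to dst, keeping only the first occurrence of each line.

    Returns the number of lines written. The name and signature are
    illustrative and do not come from the original answer.
    """
    lines_seen = set()  # lines already written to dst
    written = 0
    with open(src, "r") as infile, open(dst, "w") as outfile:
        for line in infile:
            # Compare without the trailing newline so that a last line
            # lacking one is still recognised as a duplicate.
            key = line.rstrip("\n")
            if key not in lines_seen:
                outfile.write(key + "\n")
                lines_seen.add(key)
                written += 1
    return written
```

Calling remove_duplicate_lines('bar.txt', 'foo.txt') on the file shown earlier would write the three distinct lines to foo.txt in their original order. The set keeps every distinct line in memory, which is fine for small files but may matter for very large ones.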