Trying to merge files after removing duplicate content

Here is my problem.

I have n files that all contain overlapping and common text. I want to create a new file from these n files such that the new file contains only the unique lines that exist across all of the n files.

I am looking for a bash command or Python API that can do it for me. If there is an algorithm, I can also attempt to code it myself.

Answers


If the order of the lines is not important, you could do this:

sort -u file1 file2 ...

This will (a) sort all the lines in all the files, and then (b) remove duplicates. This will give you the lines that are unique among all the files.
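If you need to keep the lines in their order of first appearance rather than sorted order, here is a minimal Python sketch of the same idea (the function name and command-line interface are just for illustration):

```python
import sys

def merge_unique(paths):
    """Return the lines that appear in any of the given files,
    keeping only the first occurrence of each line and preserving
    the order in which lines are first seen."""
    seen = set()
    result = []
    for path in paths:
        with open(path) as f:
            for line in f:
                if line not in seen:
                    seen.add(line)
                    result.append(line)
    return result

if __name__ == "__main__":
    # Usage: python merge_unique.py file1 file2 ...
    sys.stdout.writelines(merge_unique(sys.argv[1:]))
```

For the shell, the classic order-preserving equivalent is `awk '!seen[$0]++' file1 file2 ...`, which prints each line only the first time it is encountered.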
