Right way to scan all files / directories
I have an application that needs to scan every file on the computer, looking for some particular content. But I really doubt that this is the best way to walk all drives, directories, and files. Here's the code. To check whether a drive is a fixed volume, I do this:
    procedure TForm1.MapUnits;
    var
      Drive: char;
    begin
      for Drive:= 'A' to 'Z' do
      begin
        case GetDriveType(PChar(Drive + ':/')) of
          DRIVE_FIXED:
            MapFiles(Drive + ':\');
        end;
      end;
    end;
MapFiles is:
    procedure TForm1.MapFiles(DriveUnit: string);
    var
      SR: TSearchRec;
      DirList: TStringList;
      IsFound: Boolean;
      i: integer;
    begin
      DirList := TStringList.Create;
      IsFound:= FindFirst(DriveUnit + '*.*', faAnyFile, SR) = 0;
      while IsFound do
      begin
        if ((SR.Attr and faArchive) <> 0) and (SR.Name <> '.') then
        begin
          ScanFile(DriveUnit + SR.Name);
        end;
        if ((SR.Attr and faDirectory) <> 0) and (SR.Name <> '.') then
        begin
          DirList.Add(DriveUnit + SR.Name);
        end;
        IsFound := FindNext(SR) = 0;
      end;
      FindClose(SR);
      // Scan the list of subdirectories
      for i := 0 to DirList.Count - 1 do
        MapFiles(DirList[i] + '\');
      DirList.Free;
    end;
Please note the method I'm using: I add each subdirectory to a TStringList, and after finishing the entries of the current directory I call MapFiles again, this time passing each subdirectory. Is this OK? To open the files found (ScanFile), I do this:
    procedure TForm1.ScanFile(FileName: string);
    var
      i, aux: integer;
      MyFile: TFileStream;
      AnsiValue, Target: AnsiString;
    begin
      if (POS('.exe', FileName) = 0) and (POS('.dll', FileName) = 0) and (POS('.sys', FileName) = 0) then
      begin
        try
          MyFile:= TFileStream.Create(FileName, fmOpenRead);
        except
          on E: EFOpenError do
            MyFile:= NIL;
        end;
        if MyFile <> NIL then
        try
          SetLength(AnsiValue, MyFile.Size);
          if MyFile.Size>0 then
            MyFile.ReadBuffer(AnsiValue, MyFile.Size);
          for i := 1 to Length(AnsiValue) do
          begin
            //Begin the search..
            //here I search my particular stuff in each file...
          end;
        finally
          MyFile.Free;
        end;
      end;
    end;
So, am I doing this the correct way? Thank you!
- You test twice for SR.Name <> '.'. You should be able to do that once.
- Your test for SR.Name <> '.' is flawed. Yes it will find '.' and '..', but it will also find '.svn' and '.....' and so on. You need to test for equality with both '.' and '..'.
- Rather than *.*, it is idiomatic to use *.
- You ought to protect DirList with a try/finally.
- Your program will take forever to run.
- The program will almost certainly suffer from memory fragmentation, and will likely fail outright if it is a 32-bit program and you encounter any large files. You should avoid loading entire files into memory. Just read small pieces at a time.
- ReadBuffer may throw exceptions. Are you ready for them? You might be better off putting your try/except around the entire read operation, and not just the file open.
- Your attempt to find files with particular extensions is flawed. For instance, would you regard this file as being an executable file: MyFile.exe.blahblah.txt? No, I did not think so. The right way to test for extension is like so: SameText(ExtractFileExt(FileName), Extension).
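Several of the points above can be combined into one rough sketch. This is only an illustration under stated assumptions, not a drop-in replacement: the procedures are written standalone rather than as TForm1 methods, the helper names IsSkippedExtension and IsDotDir and the 64 KB ChunkSize are made up for the example, every non-directory entry is treated as a file (instead of testing faArchive), and catching EStreamError is one possible way to cover both open and read failures.

```pascal
uses
  SysUtils, Classes;

// Hypothetical helper: True when the real extension is one we want to skip.
function IsSkippedExtension(const FileName: string): Boolean;
var
  Ext: string;
begin
  Ext := ExtractFileExt(FileName); // includes the leading dot, e.g. '.exe'
  Result := SameText(Ext, '.exe') or SameText(Ext, '.dll') or SameText(Ext, '.sys');
end;

// Reads the file in fixed-size pieces instead of loading it whole.
procedure ScanFile(const FileName: string);
const
  ChunkSize = 64 * 1024; // arbitrary size chosen for this sketch
var
  Stream: TFileStream;
  Buffer: array of Byte;
  BytesRead: Integer;
begin
  if IsSkippedExtension(FileName) then
    Exit;
  try
    Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
    try
      SetLength(Buffer, ChunkSize);
      repeat
        BytesRead := Stream.Read(Buffer[0], ChunkSize);
        // search Buffer[0 .. BytesRead - 1] here
      until BytesRead <= 0;
    finally
      Stream.Free;
    end;
  except
    on EStreamError do
      Exit; // the file could not be opened or read: skip it
  end;
end;

// Hypothetical helper: True only for the two special directory entries.
function IsDotDir(const Name: string): Boolean;
begin
  Result := (Name = '.') or (Name = '..');
end;

// Dir is expected to end with a path delimiter, e.g. 'C:\'.
procedure MapFiles(const Dir: string);
var
  SR: TSearchRec;
  DirList: TStringList;
  i: Integer;
begin
  DirList := TStringList.Create;
  try
    if FindFirst(Dir + '*', faAnyFile, SR) = 0 then
    try
      repeat
        if (SR.Attr and faDirectory) <> 0 then
        begin
          if not IsDotDir(SR.Name) then
            DirList.Add(Dir + SR.Name);
        end
        else
          ScanFile(Dir + SR.Name); // assumption: any non-directory is a file
      until FindNext(SR) <> 0;
    finally
      FindClose(SR);
    end;
    // Recurse only after the search handle for this directory is closed.
    for i := 0 to DirList.Count - 1 do
      MapFiles(IncludeTrailingPathDelimiter(DirList[i]));
  finally
    DirList.Free;
  end;
end;
```

Note that a pattern straddling two chunks would be missed by a naive per-chunk search. The sketch does not handle that; one option is to carry the last few bytes of each chunk over into the next search.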