Fast Compare Two Files for Equality

Metamug

Guard Clauses before file reading

Checking the conditions below lets us settle many cases without reading the contents of either file. This matters most when the files are large.

  1. Check whether both arguments refer to the same file (same object or same canonical path)
  2. Check whether both files exist
  3. Check that neither file is a directory
  4. Check whether the two files have different lengths
//1
if(file1 == file2){
    return true; //same file
}

if (file1.getCanonicalFile().equals(file2.getCanonicalFile())) {
    // same file
    return true;
}

final boolean file1Exists = file1.exists();

//2
if (file1Exists != file2.exists()) {
    return false;
}

if (!file1Exists) {
    // two non-existent files are treated as equal
    return true;
}

//3
if (file1.isDirectory() || file2.isDirectory()) {
    // don't want to compare directory contents
    throw new IOException("Can't compare directories, only files");
}

//4
if (file1.length() != file2.length()) {
    // lengths differ, cannot be equal
    return false;
}
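
The first guard uses two checks because two distinct File objects can still point at the same file on disk. The following is a minimal sketch of that situation. The class name CanonicalCheckDemo, the helper sameCanonicalFile and the path data.txt are made up for illustration, and the file does not need to exist for the comparison to work.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical demo class; "data.txt" is an illustrative name and need not exist.
final class CanonicalCheckDemo {

    // True when both File objects resolve to the same canonical path.
    static boolean sameCanonicalFile(File a, File b) throws IOException {
        return a.getCanonicalFile().equals(b.getCanonicalFile());
    }

    public static void main(String[] args) throws IOException {
        File plain = new File("data.txt");
        File dotted = new File("./data.txt");

        // Reference comparison: two separate objects.
        System.out.println(plain == dotted);
        // File.equals compares the path text, and the two paths are spelled differently.
        System.out.println(plain.equals(dotted));
        // Canonicalization resolves "./", so both objects name the same file.
        System.out.println(sameCanonicalFile(plain, dotted));
    }
}
```

Only the canonical comparison recognizes the two paths as the same file, which is why the reference check alone is not enough.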

Wrapping FileInputStream in BufferedInputStream

After opening each file as a FileInputStream, we wrap it in a BufferedInputStream. Calling read() directly on a FileInputStream is expensive, because every call goes to the operating system to fetch a single byte. The same read() call on a BufferedInputStream is served from an in-memory buffer, which is refilled from the file in larger blocks only when it runs empty.

So we neither read all the bytes into memory at once nor go to the disk for every single byte.

InputStream input1 = new FileInputStream(file1);
InputStream input2 = new FileInputStream(file2);

input1 = new BufferedInputStream(input1);
input2 = new BufferedInputStream(input2);
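
To make the effect of buffering visible, here is a rough sketch that counts how often the underlying stream is asked for data. CountingInputStream and BufferingDemo are hypothetical helpers written for this illustration, and a ByteArrayInputStream stands in for the file, so the counts show the call pattern only and not real system calls.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: counts every read request that reaches the wrapped stream.
final class CountingInputStream extends FilterInputStream {
    int calls = 0;

    CountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        calls++;
        return super.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        calls++;
        return super.read(b, off, len);
    }
}

final class BufferingDemo {

    // Reads one byte at a time until end of stream and returns the number of bytes seen.
    static int drain(InputStream in) throws IOException {
        int n = 0;
        while (in.read() != -1) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[20_000]; // stand-in for file contents

        CountingInputStream direct = new CountingInputStream(new ByteArrayInputStream(data));
        drain(direct);

        CountingInputStream source = new CountingInputStream(new ByteArrayInputStream(data));
        drain(new BufferedInputStream(source));

        System.out.println("calls without buffering: " + direct.calls);
        System.out.println("calls with buffering:    " + source.calls);
    }
}
```

Without the buffer, every byte costs one call on the source. With the buffer, the source is only asked for a few large blocks.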

Reading files until End of File (EOF)

Keep reading both files in step within a single loop. Each read() call returns the next byte from the buffer as an int between 0 and 255, or -1 once the end of the stream is reached. Here EOF is a constant set to -1.

//compare the files byte by byte

for (int ch = input1.read(); EOF != ch; ch = input1.read()) {
    final int ch2 = input2.read();
    if (ch != ch2) {
        return false;
    }
}

//input1 is exhausted, so input2 must be at EOF as well
final int ch2 = input2.read(); 
return ch2 == EOF; 
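
The loop can be tried on its own without touching the disk. The sketch below moves it into a helper that accepts any two streams and feeds it in-memory data. StreamCompareDemo, streamEquals and bytes are names invented for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class StreamCompareDemo {
    private static final int EOF = -1;

    // The same loop as above, extracted into a hypothetical helper.
    static boolean streamEquals(InputStream input1, InputStream input2) throws IOException {
        for (int ch = input1.read(); EOF != ch; ch = input1.read()) {
            if (ch != input2.read()) {
                return false;
            }
        }
        // input1 is exhausted, so input2 must be at EOF as well
        return input2.read() == EOF;
    }

    // Illustrative helper: an in-memory stream over the ASCII bytes of s.
    static InputStream bytes(String s) {
        return new ByteArrayInputStream(s.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) throws IOException {
        System.out.println(streamEquals(bytes("abc"), bytes("abc")));  // identical content
        System.out.println(streamEquals(bytes("abc"), bytes("abd")));  // differs at the last byte
        System.out.println(streamEquals(bytes("abc"), bytes("abcd"))); // second stream is longer
        System.out.println(streamEquals(bytes("abcd"), bytes("abc"))); // first stream is longer
    }
}
```

Only the first call returns true. The third case is the one caught by the final read on the second stream: the loop ends without finding a difference, but input2 still has a byte left.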

Full Code

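import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public static boolean fileEquals(final File file1, final File file2) throws IOException {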

    final int EOF = -1;

    //1
    if(file1 == file2){
        return true; //same file
    }

    if (file1.getCanonicalFile().equals(file2.getCanonicalFile())) {
        // same file
        return true;
    }

    final boolean file1Exists = file1.exists();

    //2
    if (file1Exists != file2.exists()) {
        return false;
    }

    if (!file1Exists) {
        // two non-existent files are treated as equal
        return true;
    }

    //3
    if (file1.isDirectory() || file2.isDirectory()) {
        // don't want to compare directory contents
        throw new IOException("Can't compare directories, only files");
    }

    //4
    if (file1.length() != file2.length()) {
        // lengths differ, cannot be equal
        return false;
    }

    //compare the files byte by byte
    //resources declared in try-with-resources are implicitly final and cannot
    //be reassigned, so each stream is wrapped in its buffer where it is declared
    try (InputStream input1 = new BufferedInputStream(new FileInputStream(file1));
         InputStream input2 = new BufferedInputStream(new FileInputStream(file2))) {

        for (int ch = input1.read(); EOF != ch; ch = input1.read()) {
            final int ch2 = input2.read();
            if (ch != ch2) {
                return false;
            }
        }

        //input1 is exhausted, so input2 must be at EOF as well
        final int ch2 = input2.read();
        return ch2 == EOF;

    }

}
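
As a point of comparison, and not part of the method above, Java 12 and later ship Files.mismatch in java.nio.file. It returns the position of the first differing byte, or -1 when the contents match. The following sketch assumes Java 12 or newer; MismatchSketch and sameContent are illustrative names. Unlike fileEquals, it throws an IOException when a file is missing instead of treating two missing files as equal.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class MismatchSketch {

    // Assumes Java 12+ and that both paths point to existing regular files.
    static boolean sameContent(Path first, Path second) throws IOException {
        return Files.mismatch(first, second) == -1L;
    }

    public static void main(String[] args) throws IOException {
        // Temporary files used only for this demonstration.
        Path a = Files.createTempFile("cmp-a", ".bin");
        Path b = Files.createTempFile("cmp-b", ".bin");
        try {
            Files.write(a, new byte[] {1, 2, 3});
            Files.write(b, new byte[] {1, 2, 3});
            System.out.println(sameContent(a, b)); // same bytes in both files

            Files.write(b, new byte[] {1, 2, 4});
            System.out.println(sameContent(a, b)); // last byte now differs
        } finally {
            Files.deleteIfExists(a);
            Files.deleteIfExists(b);
        }
    }
}
```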

