Comparing a string with the empty string (Java)

I have a question about comparing a string with the empty string in Java. Is there a difference between comparing a string with the empty string using == and using equals? For example:

String s1 = "hi";

if (s1 == "")

or

if (s1.equals("")) 

I know that one should compare strings (and objects in general) with equals, and not ==, but I am wondering whether it matters for the empty string.

Answers


s1 == ""

is not reliable, as it tests reference equality, not object equality (and String isn't strictly canonical).

s1.equals("")

is better, but throws a NullPointerException if s1 is null. Better yet is:

"".equals(s1)

No null pointer exceptions.
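
The difference between the two call orders can be sketched as follows. This is only an illustration: the class name NullSafeEmptyCheck and the helper isEmptyString are made up here and are not part of any library.

```java
// Hypothetical demo class; the names are illustrative only.
class NullSafeEmptyCheck {

    // Calls equals on the literal, so a null argument simply yields false.
    static boolean isEmptyString(String s) {
        return "".equals(s);
    }

    public static void main(String[] args) {
        String s1 = null;

        System.out.println(isEmptyString(s1));   // false
        System.out.println(isEmptyString(""));   // true
        System.out.println(isEmptyString("hi")); // false

        try {
            s1.equals("");                       // dereferences null
        } catch (NullPointerException e) {
            System.out.println("s1.equals(\"\") threw NullPointerException");
        }
    }
}
```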

EDIT: OK, someone asked about canonical form. This article defines it as:

Suppose we have some set S of objects, with an equivalence relation. A canonical form is given by designating some objects of S to be "in canonical form", such that every object under consideration is equivalent to exactly one object in canonical form.

To give you a practical example: take the set of rational numbers (or "fractions", as they're commonly called). A rational number consists of a numerator and a denominator (divisor), both of which are integers. These rational numbers are equivalent:

3/2, 6/4, 24/16

Rational numbers are typically written such that the gcd (greatest common divisor) of the numerator and denominator is 1. So all of them will be simplified to 3/2. 3/2 can be viewed as the canonical form of this set of rational numbers.

So what does it mean in programming when the term "canonical form" is used? It can mean a couple of things. Take for example this imaginary class:

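public class MyInt {
  private final int number;

  public MyInt(int number) { this.number = number; }
  // equals must be overridden as well, or the relation below fails for distinct instances
  @Override public boolean equals(Object o) { return o instanceof MyInt && ((MyInt) o).number == number; }
  @Override public int hashCode() { return number; }
}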

The hash code of the class MyInt is a canonical form of that class because for the set of all instances of MyInt, you can take any two elements m1 and m2 and they will obey the following relation:

m1.equals(m2) == (m1.hashCode() == m2.hashCode())

That relation is the essence of canonical form. A more common way this crops up is when you use factory methods on classes such as:

public class MyClass {
  private MyClass() { }

  public static MyClass getInstance(...) { ... }
}

The class cannot be instantiated directly because the constructor is private; getInstance is a factory method. A factory method lets you do things like:

  • Always return the same instance (abstracted singleton);
  • Just create a new instance with every call;
  • Return objects in canonical form (more on this in a second); or
  • whatever you like.

Basically, the factory method abstracts object creation. Personally, I think it would be an interesting language feature to force all constructors to be private to enforce the use of this pattern, but I digress.

With this factory method you can cache the instances you create, so that any two instances s1 and s2 satisfy the following test:

(s1 == s2) == s1.equals(s2)
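
A caching factory of this kind might look like the sketch below. The class CanonicalInt, its valueOf method and the map-based cache are assumptions made for illustration; the answer itself does not prescribe an implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical value class; the name and the cache are illustrative only.
final class CanonicalInt {
    // One cached instance per distinct value.
    private static final Map<Integer, CanonicalInt> CACHE = new ConcurrentHashMap<>();

    private final int number;

    private CanonicalInt(int number) { this.number = number; }

    // Factory method: returns the cached instance for this value, creating it once.
    static CanonicalInt valueOf(int number) {
        return CACHE.computeIfAbsent(number, CanonicalInt::new);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof CanonicalInt && ((CanonicalInt) o).number == number;
    }

    @Override
    public int hashCode() { return number; }

    public static void main(String[] args) {
        CanonicalInt a = valueOf(42);
        CanonicalInt b = valueOf(42);
        CanonicalInt c = valueOf(7);
        System.out.println((a == b) == a.equals(b)); // true
        System.out.println((a == c) == a.equals(c)); // true
    }
}
```

Because every instance goes through valueOf, equal values always map to the same object, so == and equals agree for this class (serialization aside).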

So when I say String isn't strictly canonical it means that:

String s1 = "blah";
String s2 = "blah";
System.out.println(s1 == s2); // true

But as others have pointed out, you can change this by using:

String s3 = new String("blah");

and you can get the pooled instance back with intern(), which is an instance method:

String s4 = s3.intern(); // s4 == s1 is true again

So reference equality can't be relied on completely, which means you shouldn't rely on it at all.
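
Putting the three cases side by side, a small sketch (the class name InternDemo is made up; the variable names follow the snippets above):

```java
// Hypothetical demo class; only the String behaviour is the point here.
class InternDemo {
    public static void main(String[] args) {
        String s1 = "blah";             // literal: comes from the string pool
        String s3 = new String("blah"); // new: always a separate object
        String s4 = s3.intern();        // intern(): the pooled object with this content

        System.out.println(s1 == s3);      // false
        System.out.println(s1 == s4);      // true
        System.out.println(s1.equals(s3)); // true
    }
}
```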

As a caveat to the above pattern, I should point out that controlling object creation with private constructors and factory methods doesn't guarantee that reference equality means object equality, because of serialization. Serialization bypasses the normal object creation mechanism. Josh Bloch covers this topic in Effective Java (originally in the first edition, when he discussed the typesafe enum pattern, which later became a language feature in Java 5), and you can get around it by providing a (private) readResolve() method. But it's tricky. Class loaders affect the issue too.
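
The readResolve() workaround can be sketched roughly like this. The singleton Registry and the roundTrip helper are hypothetical names used only for the demonstration; real classes with state need more care than this.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical singleton used only to illustrate readResolve().
final class Registry implements Serializable {
    private static final long serialVersionUID = 1L;
    private static final Registry INSTANCE = new Registry();

    private Registry() { }

    static Registry getInstance() { return INSTANCE; }

    // Invoked by the serialization machinery after an object has been read;
    // returning INSTANCE discards the freshly deserialized copy.
    private Object readResolve() { return INSTANCE; }

    // Serializes the object to a byte array and reads it back.
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object copy = roundTrip(getInstance());
        System.out.println(copy == getInstance()); // true; would be false without readResolve()
    }
}
```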

Anyway, that's canonical form.


It's going to depend on whether the string is a literal or not. If you create the string with

new String("")

then it will never match "" with the == operator, as shown below:

	String one = "";
	String two = new String("");
	System.out.println("one == \"\": " + (one == ""));
	System.out.println("one.equals(\"\"): " + one.equals(""));
	System.out.println("two == \"\": " + (two == ""));
	System.out.println("two.equals(\"\"): " + two.equals(""));

--

one == "": true
one.equals(""): true
two == "": false
two.equals(""): true

Basically, you always want to use equals().


It's a bit sideways from your original question, but there's always

if(s1.length() == 0)

I believe this is equivalent to the isEmpty() method added in Java 1.6.


"".equals(s)

seems to be the best option, but there is also StringUtils.isEmpty(s) in the Apache Commons Lang library.


A string, is a string, is a string, whether it's the empty string or not. Use equals().


Use String.isEmpty(), or StringUtils.isEmpty(String str) if you need a null check.
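
If pulling in Apache Commons Lang is not an option, a null-tolerant check is easy to write by hand. A minimal sketch, assuming you want null to count as empty (as StringUtils.isEmpty does); EmptyChecks and isNullOrEmpty are names invented for this example:

```java
// Hypothetical helper; the class and method names are illustrative only.
final class EmptyChecks {
    private EmptyChecks() { }

    // True for null and for the empty string, false for everything else
    // (including whitespace-only strings).
    static boolean isNullOrEmpty(String s) {
        return s == null || s.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isNullOrEmpty(null)); // true
        System.out.println(isNullOrEmpty(""));   // true
        System.out.println(isNullOrEmpty(" "));  // false
        System.out.println(isNullOrEmpty("hi")); // false
    }
}
```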


Short answer

s1 == ""         // No!
s1.equals("")    // Ok
s1.isEmpty()     // Ok: fast (from Java 1.6) 
"".equals(s1)    // Ok: null safe

I would ensure s1 is not null and use isEmpty().

Note: the empty string "" is not a special String; it counts as a value like any other.

A little longer answer

References to String objects depend on the way they are created:

String objects created using the operator new always refer to separate objects, even if they store the same sequence of characters, so:

String s1 = new String("");
String s2 = new String("");
s1 == s2 // false

String objects created from a literal, that is, by assigning a value enclosed within double quotes (= "value"), are stored in a pool of String objects: before a new object is created in the pool, the pool is searched for an object with the same value, and that object is referenced if found.

String s1 = ""; // "" added to the pool
String s2 = ""; // found "" in the pool, s2 will reference the same object as s1
s1 == s2        // true

The same is true for any string literal written as a value within double quotes ("value"), so:

String s1 = "";  
s1 == "";        //true

The String equals method compares the character contents (after first checking whether both references point to the same object), which is why it is safe to write:

s1.equals("");

This expression throws a NullPointerException if s1 == null, so if you don't check for null beforehand, it is safer to write:

"".equals(s1);

Please also read How do I compare strings in Java?

I hope this helps less experienced users, who may find the other answers a bit too complicated. :)


Given two strings:

String s1 = "abc";
String s2 = "abc";

- or -

String s1 = new String("abc");
String s2 = new String("abc");

The == operator performed on two Objects checks for object identity (it returns true if the two operands refer to the same object instance). The actual behavior of == applied to java.lang.Strings does not always appear to be consistent because of String interning.

In Java, string literals and compile-time constant strings are always interned; strings created at run time (for example with new, or by concatenating variables) are not, unless intern() is called on them. So at any point in time, s1 and s2 may or may not refer to the same object, even if they have the same value. Thus s1 == s2 may or may not return true, depending solely on how s1 and s2 were obtained.

Making s1 and s2 equal to empty Strings has no effect on this - they still may or may not have been interned.

In short, == may or may not return true if s1 and s2 have the same contents. s1.equals(s2) is guaranteed to return true if s1 and s2 have the same contents.
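
One way to see the "may or may not" behaviour is to compare a compile-time constant expression with a string concatenated at run time. A small sketch; the class ConstantVsRuntime and the helper appendC are hypothetical names:

```java
// Hypothetical demo class; the names are illustrative only.
class ConstantVsRuntime {
    // Concatenation involving a parameter happens at run time and yields a new String.
    static String appendC(String prefix) {
        return prefix + "c";
    }

    public static void main(String[] args) {
        final String constPart = "ab";   // constant variable
        String folded = constPart + "c"; // constant expression: interned like a literal
        String built = appendC("ab");    // built at run time: a separate object

        System.out.println(folded == "abc");     // true
        System.out.println(built == "abc");      // false
        System.out.println(built.equals("abc")); // true
    }
}
```

Both strings have the same contents, yet only equals gives the same answer for both.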

