Complete StringEscapeUtils Guide in Java

Introduction to StringEscapeUtils in Java

In this guide we will learn how to use StringEscapeUtils in Java.

We will go through examples for each API to make them easier to understand!

So let’s get started!

Handling text data often involves special characters that carry a different meaning depending on the context, such as HTML, XML, or JSON.

If they are not escaped correctly, they can introduce security vulnerabilities and corrupt data.

Apache Commons Text provides a utility class called StringEscapeUtils that simplifies escaping and unescaping strings for these formats.

As a developer, you do not have to write your own escaping code for each scenario, which makes life easier!

Table of Contents

  1. Introduction to StringEscapeUtils
  2. HTML Escaping and Unescaping
  3. XML Escaping and Unescaping
  4. JSON Escaping and Unescaping
  5. Java Properties Escaping
  6. Conclusion

1. Introduction to StringEscapeUtils

StringEscapeUtils is a part of the Apache Commons Text library and provides a set of static methods for escaping and unescaping strings.

To use the StringEscapeUtils APIs, we need to add the Maven dependency to our project.

Version 1.10.0 is the latest at the time of writing.

Please update to the latest version whenever a new release comes out, so that you pick up fixes for any known vulnerabilities.

Maven dependency for StringEscapeUtils:

<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-text</artifactId>
    <version>1.10.0</version>
</dependency>

2. HTML Escaping and Unescaping

HTML escaping converts special characters such as <, >, & and " into their corresponding HTML entities.

If user-supplied text is not escaped before it is rendered, it can pose security risks such as cross-site scripting (XSS).

Example Usage:

import org.apache.commons.text.StringEscapeUtils;

public class App {

    public static void main(String[] args) {
        // Untrusted input containing markup that a browser would execute
        String input = "<script>alert('This is an alert');</script>";
        // escapeHtml4 replaces characters such as < and > with HTML 4 entities
        String escaped = StringEscapeUtils.escapeHtml4(input);
        System.out.println(escaped);
    }
}

Output:

&lt;script&gt;alert('This is an alert');&lt;/script&gt;
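
To make the idea of "converting characters to entities" concrete, here is a minimal, JDK-only sketch of the mapping for the four basic characters. It is a simplified illustration, not how StringEscapeUtils is implemented: the class name SimpleHtmlEscaper and its escape method are hypothetical, and the real escapeHtml4 also handles many other HTML 4 entities. In real code, prefer the library call shown above.

```java
class SimpleHtmlEscaper {

    // Hypothetical helper: maps only <, >, & and " to their HTML entities
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '&': out.append("&amp;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c); // everything else is copied unchanged
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints: &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
        System.out.println(escape("<a href=\"x\">Tom & Jerry</a>"));
    }
}
```

Walking the input one character at a time avoids the classic mistake of escaping & after the other characters, which would double-escape the entities that were just produced.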

HTML unescaping is the reverse process of HTML escaping.

It converts HTML entities back into their corresponding characters.

Unescaping is necessary when you receive HTML-encoded data and need the original text back. Be careful when rendering unescaped content from untrusted sources as HTML, because the markup becomes active again.

Example Usage:

import org.apache.commons.text.StringEscapeUtils;

public class App {

    public static void main(String[] args) {
        String input = "&lt;script&gt;alert('This is an alert');&lt;/script&gt;";
        // unescapeHtml4 turns HTML 4 entities back into the characters they represent
        String unescapeHtml4 = StringEscapeUtils.unescapeHtml4(input);
        System.out.println(unescapeHtml4);
    }
}

Output:

<script>alert('This is an alert');</script>

3. XML Escaping and Unescaping

XML documents also give special meaning to characters like <, >, and &.

To include these characters in XML content as literal text, they must be escaped.

XML escaping converts each special character into its corresponding XML entity.

Example Usage:

import org.apache.commons.text.StringEscapeUtils;

public class App {

    public static void main(String[] args) {
        String input = "<message>this is important message</message>";
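        // escapeXml11 replaces <, >, &, " and ' with XML entities
        String escapeXml11 = StringEscapeUtils.escapeXml11(input);
        // Prints: &lt;message&gt;this is important message&lt;/message&gt;
        System.out.println(escapeXml11);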
    }
}

escapeXml11:

  • Escapes XML special characters like <, >, &, ", and ' for XML 1.1 (use escapeXml10 for XML 1.0 documents).
  • Converts < to &lt;, > to &gt;, & to &amp;, " to &quot;, and ' to &apos;.
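
To see why the escaped form matters, the sketch below embeds the escaped form of the message from the example above in a small document and parses it with the JDK's built-in DOM parser. The <note> wrapper element and the names EscapedXmlDemo and parseNoteText are made up for illustration. The parser treats the escaped content as plain text rather than as a nested <message> element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EscapedXmlDemo {

    // Wraps already-escaped text in a hypothetical <note> element,
    // parses the document, and returns the element's text content
    static String parseNoteText(String escapedText) {
        String xml = "<note>" + escapedText + "</note>";
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String escaped = "&lt;message&gt;this is important message&lt;/message&gt;";
        // Prints: <message>this is important message</message>
        System.out.println(parseNoteText(escaped));
    }
}
```

The <note> element ends up with a single piece of text and no child elements, which is exactly what escaping is meant to guarantee.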

XML unescaping is the reverse process of XML escaping.

It converts XML entities back into their corresponding characters, restoring the original XML.

Unescaping is required when you receive XML-encoded data and want to extract it as plain text.

import org.apache.commons.text.StringEscapeUtils;

public class App {

    public static void main(String[] args) {
        String input = "&lt;message&gt;this is important message&lt;/message&gt;";
        String unescapeXml = StringEscapeUtils.unescapeXml(input);
        System.out.println(unescapeXml);
    }
}

Output:

<message>this is important message</message>

unescapeXml:

  • Reverts the escaped XML back to its original form.

4. JSON Escaping and Unescaping

You do not need to manually escape special characters in JSON.

To escape special characters in a JSON string, we can use the StringEscapeUtils.escapeJson() method.

You can convert the result back using the StringEscapeUtils.unescapeJson() method.

Example Usage:

import org.apache.commons.text.StringEscapeUtils;

public class App {

    public static void main(String[] args) {
        // Contains a newline, a tab, double quotes and a backslash
        String input = "Hello \nWorld \t\"Foo Bar\" \\Baz";

        String escaped = StringEscapeUtils.escapeJson(input);
        System.out.println(escaped);
        // Hello \nWorld \t\"Foo Bar\" \\Baz

        String unescaped = StringEscapeUtils.unescapeJson(escaped);
        System.out.println(unescaped);
        /*
                Hello 
                World 	"Foo Bar" \Baz
         */
    }
}

5. Java Properties Escaping

StringEscapeUtils.escapeJava escapes a string using Java string-literal rules, which can be useful when preparing values for Java properties files.

Note that properties files also treat = and : as separators, and escapeJava does not escape those two characters.

Example Usage:

import org.apache.commons.text.StringEscapeUtils;

public class App {

    public static void main(String[] args) {
        String input = "This is a value with special characters: =, :, \\, \n, Hello\tworld";
        String escaped = StringEscapeUtils.escapeJava(input);

        System.out.println("Input: " + input);
        
        System.out.println("Escaped: " + escaped);
    }
}

Output:

Input: This is a value with special characters: =, :, \,
, Hello	world
Escaped: This is a value with special characters: =, :, \\, \n, Hello\tworld

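escapeJava:

  • Escapes characters using Java string rules: \ becomes \\, " becomes \", and control characters such as newline, tab, and carriage return become \n, \t, and \r.
  • Leaves = and : unchanged, as the output above shows.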
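
The printed output above shows = and : passing through escapeJava untouched. When those characters need to be escaped for a properties file, one option is to let the JDK's own java.util.Properties class write the file. This is a minimal sketch; the class name PropertiesEscapingDemo, the helper methods, and the db.url key are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Properties;

class PropertiesEscapingDemo {

    // Serializes one key/value pair in .properties format
    static String store(String key, String value) {
        Properties props = new Properties();
        props.setProperty(key, value);
        StringWriter out = new StringWriter();
        try {
            props.store(out, null); // also writes a leading comment line
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    // Parses .properties text and returns the value stored under the key
    static String load(String text, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        String value = "jdbc:mysql://localhost:3306/app?useSSL=false";
        String stored = store("db.url", value);
        // The stored line reads: db.url=jdbc\:mysql\://localhost\:3306/app?useSSL\=false
        System.out.println(stored);
        // Loading reverses the escaping and returns the original value
        System.out.println(load(stored, "db.url"));
    }
}
```

Properties.store adds the backslashes before = and : and Properties.load removes them again, so the value survives the round trip unchanged.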

6. Conclusion

In this guide, we have learnt how to use the StringEscapeUtils class provided by Apache Commons Text.

With StringEscapeUtils, one can handle text data in various scenarios, ensuring that special characters are properly escaped and unescaped.

Used correctly, StringEscapeUtils can enhance security and ensure smooth handling of data.

For more information and additional utilities provided by Apache Commons Text, be sure to check out the official StringEscapeUtils documentation and stay up-to-date with the latest releases.

Happy coding!