Problem

Run-length Encoded is a fast and simple method of encoding strings. The basic idea is to represent repeated successive characters as a single count and character.

Examples

Example 1:

Input: str = AAAABBBCCDAA
Output: 4A3B2C1D2A

Input Format

public class RLECodec {

    public String encode(String str) {
        
    }

    public String decode(String data) {
        
    }
}

Solution

Method 1 - Iterative

public class RLECodec {

	public static String encode(String str) {
		StringBuilder encoded = new StringBuilder();

		for (int i = 0; i < str.length(); i++) {
			// Count occurrences of the current character
			int count = 1;

			while (i + 1 < str.length() && str.charAt(i) == str.charAt(i + 1)) {
				count++;
				i++;
			}
			// Append the count and the character to the encoded StringBuffer
			encoded.append(count);
			encoded.append(str.charAt(i));
		}

		return encoded.toString();
	}

	public static String decode(String encodedStr) {
		StringBuilder decoded = new StringBuilder();
		int count = 0;

		for (char c: encodedStr.toCharArray()) {
			// If the character is a digit, build the count
			if (Character.isDigit(c)) {
				// accumulate count
				count = count * 10 + (c - '0');
			} else {
				// Append 'count' occurrences of the character to the decoded StringBuffer
                 while (count > 0) {
                     decoded.append(c);
                     count--;
                 }
			}
		}

		return decoded.toString();
	}

}

Applications

Run-length encoding (RLE) is a straightforward compression technique widely compatible with bitmap image formats like TIFF, BMP, and PCX. It is versatile, able to compress data of various types, though the degree of compression varies with the nature of the data. While not as compressive as advanced algorithms, RLE’s simplicity and speed offer a practical compromise for situations where complex compression is unnecessary or undesired.

RLE compresses data by condensing sequences of repeated characters—referred to as runs—into a concise format composed of two bytes. The first byte, known as the run count, indicates the sequence length, often subtracting one from the actual length to fit within a 0-127 or 0-255 range. The second byte, called the run value, specifies the repeated character, with a possible value range between 0 and 255.