Class Strings

java.lang.Object
com.pnfsoftware.jeb.util.format.Strings

public class Strings extends Object
Utility methods for Strings and CharSequences.
  • Field Details

    • LINESEP

      public static final String LINESEP
      Line-separator for *this* platform.
  • Constructor Details

    • Strings

      public Strings()
  • Method Details

    • hasLength

      public static boolean hasLength(CharSequence s)
      Determine if a string is non-null and non-empty.
      Parameters:
      s -
      Returns:
      true iff the string contains at least one character
    • replaceNewLines

      public static String replaceNewLines(String s, String repl)
      Replace newline sequences. This method accepts null strings as input.
      Parameters:
      s - a string or null; in the latter case, null will be returned
      repl - the non-null substitution string, which must not contain new-line characters
      Returns:
    • normalizeNewLines

      public static String normalizeNewLines(String s)
      Replace all newline sequences by the standard \n LF character.
      Parameters:
      s - a string
      Returns:
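      The effect described above can be sketched with the JDK alone. NewLineSketch is a hypothetical stand-in, not JEB's implementation, and treating a lone \r as a newline sequence is an assumption.

```java
final class NewLineSketch {
    // Replace CRLF first (as one unit), then any remaining lone CR, by LF.
    static String normalize(String s) {
        return s.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        // The three newline styles all collapse to a single LF.
        System.out.println(normalize("a\r\nb\rc\nd").equals("a\nb\nc\nd")); // prints "true"
    }
}
```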
    • safe

      public static String safe(Object s)
      Get the string representation of the parameter object, or the empty string if the object is null.
      Parameters:
      s - an object, possibly null
      Returns:
      the object's toString(java.lang.Object) representation, or the empty string
    • safe

      public static String safe(Object s, String def)
      Get the string representation of the parameter object, or the provided string if the object is null.
      Parameters:
      s - an object, possibly null
      def - a non-null string
      Returns:
      a non-null string, possibly empty
    • safe2

      public static String safe2(Object s, String def)
      Get the string representation of the parameter object, or the provided non-empty string if the object is null or its string representation is the empty string.
      Parameters:
      s - an object, possibly null
      def - a non-null, non-empty string
      Returns:
      a string guaranteed to be non-empty
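      The difference between safe and safe2 is easiest to see side by side. The sketch below re-creates the documented contracts with plain Java; SafeSketch is a hypothetical name and the code is not taken from JEB.

```java
final class SafeSketch {
    // Empty string for null, otherwise the object's string form.
    static String safe(Object o) {
        return o == null ? "" : o.toString();
    }

    // Fallback is used when the object is null OR renders as the empty string.
    static String safe2(Object o, String def) {
        String r = safe(o);
        return r.isEmpty() ? def : r;
    }

    public static void main(String[] args) {
        System.out.println(safe2(null, "n/a")); // prints "n/a"
        System.out.println(safe2("", "n/a"));   // prints "n/a"
        System.out.println(safe2(42, "n/a"));   // prints "42"
    }
}
```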
    • joinList

      public static String joinList(Iterable<?> objects)
      Join the elements of a list using "," as a separator and surround the resulting string with square brackets. Careful: this method does not follow the usual semantics of join.
      Parameters:
      objects - a list of objects
      Returns:
      the resulting string
    • joinv

      public static String joinv(String separator, Object... objects)
      Join the string representations of a sequence of objects using the provided separator. Null objects will be formatted as "null".
      Parameters:
      separator - a non-null separator
      objects - an array of objects
      Returns:
      the resulting string
    • joinv

      public static String joinv(String separator, String defaultValue, Object... objects)
      Join the string representations of a sequence of objects using the provided separator.
      Parameters:
      separator - a non-null separator
      defaultValue - String representation for null Objects
      objects - an array of objects
      Returns:
      the resulting string
    • join

      public static String join(String separator, Iterable<?> iterator)
      Join the string representations of a sequence of objects using the provided separator. Null objects will be formatted as "null".
      Parameters:
      separator - a non-null separator
      iterator - an iterator
      Returns:
      the resulting string
    • join

      public static <T> String join(String separator, Iterable<T> iterator, Function<T,CharSequence> f)
      Join a series of items. Format items using the function. For example, to display a list of longs in hexadecimal, separated by commas: Strings.join(", ", Arrays.asList(0x10L, 0x20L), l -> Long.toHexString(l))
      Type Parameters:
      T - Any Object
      Parameters:
      separator - a non-null separator
      iterator -
      f - toString() equivalent method to be applied to objects from list.
      Returns:
      the resulting string
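      A minimal JDK-only sketch of the same idea, run on the hexadecimal example from the description. JoinSketch is a hypothetical stand-in built on streams; it is not JEB's implementation.

```java
import java.util.Arrays;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

final class JoinSketch {
    // Map every item through f, then glue the results with the separator.
    static <T> String join(String separator, Iterable<T> items, Function<T, CharSequence> f) {
        return StreamSupport.stream(items.spliterator(), false)
                .map(f)
                .collect(Collectors.joining(separator));
    }

    public static void main(String[] args) {
        String s = join(", ", Arrays.asList(0x10L, 0x20L), l -> Long.toHexString(l));
        System.out.println(s); // prints "10, 20"
    }
}
```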
    • join

      public static String join(String separator, String[] elts, int begin, int end)
      Join a series of non-null strings.
      Parameters:
      separator -
      elts -
      begin - inclusive start index
      end - exclusive end index
      Returns:
    • splitLines

      public static String[] splitLines(String s, boolean doNotReturnFinalEmptyLine)
      Split a text into an array of lines. Empty lines are returned. The final new-line character(s) are trimmed off. Works for all new-line characters (\r, \n) and character sequences (\r\n).
      Parameters:
      s - mandatory input string
      doNotReturnFinalEmptyLine -
      Returns:
      the lines
    • splitLines

      public static String[] splitLines(String s)
      Split a text into an array of lines. Empty lines are returned. The final new-line character(s) are trimmed off. Works for all new-line characters (\r, \n) and character sequences (\r\n).
      Parameters:
      s - mandatory input string
      Returns:
      the lines
    • splitall

      public static String[] splitall(String s, String delim)
    • firstLine

      public static String firstLine(String s)
    • search

      public static int search(CharSequence data, int index, String pattern, boolean regex, boolean caseSensitive, boolean reverseSearch)
      Search for a sub-string.

      Note: on JDK 11+, this implementation for regex=false, caseSensitive=false, reverseSearch=false may be slower than doing data.toLowerCase().indexOf(pattern.toLowerCase()).

      Parameters:
      data - buffer to be searched (aka, the haystack)
      index - in the case of a regular (forward) search, the search range is [index,EOS); in the case of a reverse (backward) search, the search range is [0,index)
      pattern - text that is being searched (aka, the needle)
      regex - if true, the pattern will be treated as a regular expression; if the regex is invalid, it will be treated as a regular string and no error will be reported
      caseSensitive - search is case-sensitive
      reverseSearch - search is done in reverse
      Returns:
      index where the substring was found, or -1 if nothing was found
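      The silent fallback for invalid regular expressions can be illustrated as follows. This sketch covers only the forward case and is built on java.util.regex; SearchSketch and searchForward are hypothetical names, and JEB's actual implementation may differ.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

final class SearchSketch {
    // Forward search over [index, end of data); returns the match start or -1.
    static int searchForward(CharSequence data, int index, String pattern,
                             boolean regex, boolean caseSensitive) {
        int flags = caseSensitive ? 0 : Pattern.CASE_INSENSITIVE;
        Pattern p;
        try {
            p = regex ? Pattern.compile(pattern, flags)
                      : Pattern.compile(Pattern.quote(pattern), flags);
        } catch (PatternSyntaxException e) {
            // Invalid regex: fall back to a literal search, report no error.
            p = Pattern.compile(Pattern.quote(pattern), flags);
        }
        Matcher m = p.matcher(data);
        return m.find(index) ? m.start() : -1;
    }

    public static void main(String[] args) {
        System.out.println(searchForward("Hello World", 0, "world", false, false)); // prints "6"
        System.out.println(searchForward("a(b", 0, "(", true, true));               // prints "1"
    }
}
```

      In the second call "(" is not a valid regular expression, so the sketch searches for it as plain text.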
    • isContainedIn

      public static boolean isContainedIn(String s, String... elts)
      Determine if a string is contained in a var-arg list of provided strings.
      Parameters:
      s - string to be searched
      elts - the list of elements
      Returns:
      true iff the input string was not null and found in the list of elements
    • contains

      public static boolean contains(String s, String... elts)
      A many-element variant of String.contains.
      Parameters:
      s - the string
      elts - a list of string elements
      Returns:
      true if the string contains at least one of the provided elements
    • startsWith

      public static boolean startsWith(String s, String... elts)
      A many-element variant of String.startsWith.
      Parameters:
      s - the string
      elts - a list of string elements
      Returns:
      true if the string starts with one of the provided elements
    • containsAt

      public static boolean containsAt(String s, int index, String elt)
      Indicates if a String s contains a particular substring at a specified index. Semantically equivalent to s.substring(index).startsWith(elt) without intermediate substring creation.
      Parameters:
      s - the string
      index - String index to look at
      elt - element to identify
      Returns:
      true if s.substring(index).startsWith(elt) returns true
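      The JDK offers a comparable in-place check, String.startsWith(String, int). The sketch below uses it to illustrate the semantics; ContainsAtSketch is a hypothetical name, and how JEB handles an out-of-range index is not stated in the description.

```java
final class ContainsAtSketch {
    // Compares in place at the given offset; no intermediate substring is created.
    static boolean containsAt(String s, int index, String elt) {
        return s.startsWith(elt, index);
    }

    public static void main(String[] args) {
        System.out.println(containsAt("hello world", 6, "world")); // prints "true"
        System.out.println(containsAt("hello world", 5, "world")); // prints "false"
    }
}
```

      Note that String.startsWith(String, int) returns false for an out-of-range offset, whereas s.substring(index) would throw.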
    • endsWith

      public static boolean endsWith(String s, String... elts)
      A many-element variant of String.endsWith.
      Parameters:
      s - the string
      elts - a list of string elements
      Returns:
      true if the string ends with one of the provided elements
    • equals

      public static boolean equals(String a, String b)
      A safer version of String.equals(Object).
      Parameters:
      a - first string, may be null
      b - second string, may be null
      Returns:
      true iff both strings are non-null and equal
    • equalsIgnoreCase

      public static boolean equalsIgnoreCase(String a, String b)
      Parameters:
      a - first string, may be null
      b - second string, may be null
      Returns:
      true iff both strings are non-null and equal ignoring case
    • toString

      public static String toString(Object o)
      A safe version of String.toString.
      Parameters:
      o - an object, could be null
      Returns:
      the String representation of the provided object, or "null"
    • toString

      public static String toString(Object o, String defaultValue)
      A safe version of String.toString.
      Parameters:
      o - an object, could be null
      defaultValue - default String representation if o is null
      Returns:
      the String representation of the provided object, or the default value
    • generate

      public static String generate(char c, int count)
      Generate a repeated-character String. For CharSequence generation, use pad(char, int).
      Parameters:
      c - character to repeat
      count - repeat count (ie, string length)
      Returns:
      the string
    • generate

      public static String generate(CharSequence s, int count)
      Generate a repeated string.
      Parameters:
      s - string to repeat
      count - repeat count
      Returns:
      the resulting string
    • spaces

      public static String spaces(int count)
      Generate a repeated string of spaces.
      Parameters:
      count -
      Returns:
    • isBlank

      public static boolean isBlank(CharSequence s)
      Determine if a character sequence is null, empty, or contains WSP chars exclusively.
      Parameters:
      s - the character sequence
      Returns:
      true if the sequence is null or blank
    • countNonBlankCharacters

      public static int countNonBlankCharacters(CharSequence s)
      Count the number of non blank characters in the provided string.
      Parameters:
      s -
      Returns:
    • indexOf

      public static int indexOf(CharSequence text, char c)
      Implementation of indexOf for CharSequence. Same behavior as String.indexOf(int).
      Parameters:
      text - string
      c - char
      Returns:
      the index position, or -1 if not found
    • indexOf2

      public static int indexOf2(CharSequence text, char c0, char c1)
      Find the first one of two characters and return its position.

      This is a 2-element implementation of String.indexOf(int).

      Parameters:
      text - string
      c0 - first char
      c1 - second char
      Returns:
      the position of the first occurrence of c0 or c1 (whichever came first), -1 if not found
    • indexOf2

      public static int indexOf2(CharSequence text, int from, char c0, char c1)
      Find the first one of two characters and return its position.
      Parameters:
      text - string
      from - start index
      c0 - first char
      c1 - second char
      Returns:
      the position of the first occurrence of c0 or c1 (whichever came first), -1 if not found
    • indexOfAny

      public static int indexOfAny(CharSequence text, Set<Character> cset)
      Find the first one of any of the provided characters and return its position.

      This is a N-element implementation of String.indexOf(int).

      Parameters:
      text -
      cset - a set of characters
      Returns:
    • indexOfNotInGroup

      public static int indexOfNotInGroup(CharSequence text, char c, int fromIndex, char[]... ingoreInGroups)
      Find the index of a character, ignoring some groups.
      For example:
      • ignore some text in parentheses: indexOfNotInGroup("it is (almost) done", 'o', 0, ['(', ')']) will return 16
      • ignore generics: indexOfNotInGroup("std::myclass<a,b>::mymethod(type a, type b)", ',', 0, ['<', '>']) will return 34
      Parameters:
      text - string
      c - character to find
      fromIndex - start index, use 0 by default
      ingoreInGroups - list of character groups to be ignored {'(', ')'}, {'<', '>'}. Each character group must contain at least 2 elements (one for open element, one for close element)
      Returns:
      positive index if found, -1 when not found, -2 in case of malformed
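      The two examples above can be reproduced with a per-group depth counter. This is a sketch under assumptions, not JEB's code: GroupSearchSketch is a hypothetical name, and treating a closing character with no matching opener as the "malformed" case (-2) is a guess. In Java, each group is passed as a char array such as new char[] {'(', ')'}.

```java
final class GroupSearchSketch {
    static int indexOfNotInGroup(CharSequence text, char c, int fromIndex, char[]... groups) {
        int[] depth = new int[groups.length]; // current nesting depth of each group
        for (int i = fromIndex; i < text.length(); i++) {
            char ch = text.charAt(i);
            boolean handled = false;
            for (int g = 0; g < groups.length && !handled; g++) {
                if (ch == groups[g][0]) {            // opening character
                    depth[g]++;
                    handled = true;
                } else if (ch == groups[g][1]) {     // closing character
                    if (depth[g] == 0) {
                        return -2;                   // assumption: stray closer is malformed
                    }
                    depth[g]--;
                    handled = true;
                }
            }
            if (!handled && ch == c && outsideAllGroups(depth)) {
                return i;
            }
        }
        return -1;
    }

    private static boolean outsideAllGroups(int[] depth) {
        for (int d : depth) {
            if (d != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(indexOfNotInGroup("it is (almost) done", 'o', 0,
                new char[] {'(', ')'})); // prints "16"
        System.out.println(indexOfNotInGroup("std::myclass<a,b>::mymethod(type a, type b)", ',', 0,
                new char[] {'<', '>'})); // prints "34"
    }
}
```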
    • lastIndexOf2

      public static int lastIndexOf2(CharSequence text, char c0, char c1)
      Find the last one of two characters and return its position.

      This is a 2-element implementation of String.lastIndexOf(int).

      Parameters:
      text - string
      c0 - first char
      c1 - second char
      Returns:
      the position of the last occurrence of c0 or c1 (whichever comes last), -1 if not found
    • lastIndexOf2

      public static int lastIndexOf2(CharSequence text, int from, char c0, char c1)
      Find the last one of two characters and return its position.
      Parameters:
      text - string
      from - start index
      c0 - first char
      c1 - second char
      Returns:
      the position of the last occurrence of c0 or c1 (whichever comes last), -1 if not found
    • lastIndexOfAny

      public static int lastIndexOfAny(CharSequence text, Set<Character> cset)
      Find the last one of any of the provided characters and return its position.

      This is a N-element implementation of String.lastIndexOf(int).

      Parameters:
      text -
      cset - a set of characters
      Returns:
    • hasBlank

      public static boolean hasBlank(CharSequence s)
      Determine if a string contains one or more WSP characters.
      Parameters:
      s -
      Returns:
    • isWhitespace

      public static boolean isWhitespace(char c)
      Determine if a character is a white-space, per the Unicode standard. This method differs from Character.isWhitespace(char) (Java language definition of a WSP).
      Parameters:
      c -
      Returns:
    • isAsciiWhitespace

      public static boolean isAsciiWhitespace(int b, char... extraWhitespaceCharacters)
      Determine if a character is a white-space, per the ASCII standard. It only processes regular space, tab, CR and LF characters.
      Parameters:
      b - the int to test
      extraWhitespaceCharacters - additional ASCII characters considered as whitespace
      Returns:
    • replaceWhitespaces

      public static String replaceWhitespaces(String str, char repl)
      Efficiently replace all Unicode white-spaces by the provided char.
      Parameters:
      str -
      repl -
      Returns:
    • trimWhitespaces

      public static String trimWhitespaces(String s)
      Trim (left and right) all characters considered to be white-space by the Unicode standard.
      Parameters:
      s - the input string
      Returns:
      the trimmed string
    • trim

      public static String trim(String s)
      Trim (left and right) all chars less than or equal to ' '. Note that this method differs from String.trim() which, for instance, does not consider CR or LF to be WSP.
      Parameters:
      s - a string
      Returns:
      the trimmed string
    • trim

      public static String trim(String s, char c)
      Trim (left and right) all occurrences of the provided character.
      Parameters:
      s - a string
      c - the character to be removed
      Returns:
      the trimmed string
    • ltrim

      public static String ltrim(String s)
      Left trim all chars less than or equal to ' '. Note that this method differs from String.trim() which, for instance, does not consider CR or LF to be WSP.
      Parameters:
      s - a string
      Returns:
      the left-trimmed string
    • rtrim

      public static String rtrim(String s)
      Right trim all chars less than or equal to ' '. Note that this method differs from String.trim() which, for instance, does not consider CR or LF to be WSP.
      Parameters:
      s - a string
      Returns:
      the right-trimmed string
    • ltrim

      public static String ltrim(String s, char c)
      Left trim on a given character.
      Parameters:
      s -
      c -
      Returns:
    • rtrim

      public static String rtrim(String s, char c)
      Right trim on a given character.
      Parameters:
      s -
      c -
      Returns:
    • getAsciiLength

      public static int getAsciiLength(byte[] data, int maxlen)
      Retrieve the length of a potentially ASCII-encoded string. The allowed characters are CR, LF, TAB, and any character in the [0x20, 0x7E] range.
      Parameters:
      data - a byte array
      maxlen - maximum length
      Returns:
      the length of the string
    • getAsciiLength

      public static int getAsciiLength(byte[] data)
      Parameters:
      data - a byte array
      Returns:
      the length of the string
    • determinePotentialEncoding

      public static Charset determinePotentialEncoding(byte[] data, int offset, int size)
      Heuristically determine the encoding of a string.
      Parameters:
      data -
      offset -
      size -
      Returns:
      null if unknown, else one of ASCII, UTF-8, UTF-16, UTF-16LE, UTF-16BE, UTF-32LE or UTF-32BE
    • isNumber

      public static boolean isNumber(String text)
      Check that every character of the text parameter is a digit.
      Parameters:
      text -
      Returns:
      true if text is a valid decimal number
    • isHexNumber

      public static boolean isHexNumber(String text)
      Check that every character of the text parameter is a hexadecimal digit. Allow upper case as well as lower case characters (only lower or only upper).
      Parameters:
      text -
      Returns:
      true if text is a valid hexadecimal number
    • f

      public static String f(String format, Object... args)
      Format using the US locale.
      Parameters:
      format -
      args -
      Returns:
    • getFastFormatInvocationCount

      public static int getFastFormatInvocationCount()
    • getFastFormatFailureCount

      public static int getFastFormatFailureCount()
    • resetFastFormatCounts

      public static void resetFastFormatCounts()
    • ff

      public static Appendable ff(Locale l, Appendable sink, String format, Object... args)
      Parameters:
      sink - optional recipient (if null, a new builder will be created; the formatted string is appended to the sink)
      l - locale to be used
      format - format string
      args - format arguments
      Returns:
      the sink, never null
    • ff

      public static Appendable ff(Appendable sink, String format, Object... args)
      Parameters:
      sink - optional recipient (if null, a new builder will be created; the formatted string is appended to the sink)
      format - format string
      args - format arguments
      Returns:
      the sink, never null
    • ff

      public static String ff(Locale l, String format, Object... args)
      Parameters:
      l - locale to be used
      format - format string
      args - format arguments
      Returns:
      the formatted string
    • ff

      public static String ff(String format, Object... args)
      Parameters:
      format - format string
      args - format arguments
      Returns:
      the formatted string
    • replaceLast

      public static String replaceLast(String str, String target, String replacement)
      Replace the last occurrence of target in str by the replacement.
      Parameters:
      str - the string to search in
      target - the string to search for
      replacement - the replacement part
      Returns:
      the new string, with the replacement in place of the last occurrence of target, or the original string if target was not found
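      A minimal sketch of this behavior using String.lastIndexOf. ReplaceLastSketch is a hypothetical stand-in, not JEB's implementation; target is treated as plain text, not as a regular expression.

```java
final class ReplaceLastSketch {
    static String replaceLast(String str, String target, String replacement) {
        int pos = str.lastIndexOf(target);
        if (pos < 0) {
            return str; // target not found: return the input unchanged
        }
        return str.substring(0, pos) + replacement + str.substring(pos + target.length());
    }

    public static void main(String[] args) {
        System.out.println(replaceLast("a.b.c", ".", "-")); // prints "a.b-c"
    }
}
```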
    • substring

      public static String substring(String s, int begin, int end)
      Flexible version of String.substring(int, int). Allow Python-like negative indexes for convenience.
      Parameters:
      s - a string
      begin - index in the [-s_length, +s_length] range
      end - index in the [-s_length, +s_length] range
      Returns:
      the substring
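      Python-like negative indexes count from the end of the string. The sketch below shows one way to map them onto String.substring(int, int); SubstringSketch is a hypothetical name, and JEB's handling of out-of-range or inverted indexes is not covered here.

```java
final class SubstringSketch {
    static String substring(String s, int begin, int end) {
        int n = s.length();
        int b = begin < 0 ? n + begin : begin; // -1 maps to the last character
        int e = end < 0 ? n + end : end;
        return s.substring(b, e);
    }

    public static void main(String[] args) {
        System.out.println(substring("abcdef", 1, -1)); // prints "bcde"
        System.out.println(substring("abcdef", -3, 6)); // prints "def"
    }
}
```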
    • truncate

      public static String truncate(String s, int maxLength)
      Truncate a string.
      Parameters:
      s - a string
      maxLength - positive length
      Returns:
      the truncated string, which will contain at most `maxLength` characters
    • truncateWithSuffix

      public static String truncateWithSuffix(String s, int maxLength, String suffix)
      Truncate a string and append an optional suffix to it if it was actually truncated.
      Parameters:
      s - a string
      maxLength - positive length, which must be greater than or equal to the length of the suffix, if one was provided
      suffix - optional suffix appended to a string that is actually truncated
      Returns:
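      The constraint on maxLength suggests that the suffix counts toward the length limit. The sketch below assumes exactly that; it is a guess at the contract, not JEB's implementation, and TruncateSketch is a hypothetical name.

```java
final class TruncateSketch {
    static String truncateWithSuffix(String s, int maxLength, String suffix) {
        if (s.length() <= maxLength) {
            return s; // nothing to truncate, so no suffix either
        }
        String sfx = suffix == null ? "" : suffix;
        // Assumption: the result, suffix included, is at most maxLength characters.
        return s.substring(0, maxLength - sfx.length()) + sfx;
    }

    public static void main(String[] args) {
        System.out.println(truncateWithSuffix("hello world", 8, "...")); // prints "hello..."
        System.out.println(truncateWithSuffix("short", 8, "..."));       // prints "short"
    }
}
```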
    • indentBlock

      public static String indentBlock(String blk, String indent)
      Indent a buffer.
      Parameters:
      blk -
      indent -
      Returns:
    • indentBlock

      public static String indentBlock(String blk)
      Indent a buffer using a 4-space indentation.
      Parameters:
      blk -
      Returns:
    • urlencodeUTF8

      public static String urlencodeUTF8(String s)
      Urlencode a string. The resulting string will have the following characteristics:
      • a-z, A-Z, 0-9 remain the same
      • ., -, *, _ remain the same
      • space is converted to +
      • all other characters are UTF8 encoded using the "%xx" scheme
      Parameters:
      s - the string to be encoded
      Returns:
      the encoded string
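      The characteristics listed above match those of the JDK's java.net.URLEncoder with UTF-8, so the rules can be tried out with it. Whether JEB delegates to URLEncoder is an assumption; UrlEncodeSketch is a hypothetical name.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class UrlEncodeSketch {
    static String urlencode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(urlencode("a b&c"));    // prints "a+b%26c"
        System.out.println(urlencode("A-z_0.9*")); // prints "A-z_0.9*"
        System.out.println(urlencode("\u00e9"));   // prints "%C3%A9"
    }
}
```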
    • urldecodeUTF8

      public static String urldecodeUTF8(String s)
      Decode a URL-encoded string.
      Parameters:
      s - the encoded string
      Returns:
      the decoded string
    • parseUrlParameters

      public static String[] parseUrlParameters(String s, String... entries)
      Extract the parameters of a URL-like encoded string. No decoding is taking place. Example:
       - s: "type=home&subtype=house&[another_key]=[another_value]"
       - entries: "type", "subtype"
       - returns: ["home", "house"]
       
      Parameters:
      s - the string to be parsed
      entries - the entries, whose count must match the number of key-value pairs
      Returns:
      the list of parameters, as they were (ie, without any decoding applied)
    • parseUrlParameter

      public static String parseUrlParameter(String s, String entry)
      Parameters:
      s - the URL-like string to be parsed, containing a single key-value pair, eg hometype=house
      entry -
      Returns:
      the parameter (without decoding applied)
    • encodeArray

      public static String encodeArray(Object... array)
      Encode an array of objects.
      Parameters:
      array - the array of objects
      Returns:
      the encoded array as a string
    • decodeArray

      public static String[] decodeArray(String s)
      Decode an encoded array of objects.
      Parameters:
      s - the encoded array
      Returns:
      the array of decoded strings
    • encodeList

      public static String encodeList(List<?> list)
      Encode a list of objects.
      Parameters:
      list - the list of objects
      Returns:
      the encoded list as a string
    • decodeList

      public static List<String> decodeList(String s)
      Decode an encoded list of objects.
      Parameters:
      s - optional encoded list
      Returns:
      the list of decoded strings
    • encodeMap

      public static String encodeMap(Map<?,?> map)
      Encode a dictionary. The encoding scheme will produce strings like: encodedKey1=encodedValue1&encodedKey2=encodedValue2&...
      Parameters:
      map - the map of key/values
      Returns:
      the encoded map as a string
    • decodeMap

      public static Map<String,String> decodeMap(String s)
      Decode an encoded map.
      Parameters:
      s - optional encoded map
      Returns:
      the decoded map
    • encodeUTF8

      public static byte[] encodeUTF8(String s)
      Encode a string using a UTF-8 encoder. If the encoder is not available, the string is encoded using the system's default encoder. This should never happen.
      Parameters:
      s - mandatory string
      Returns:
      the encoded byte buffer
    • decodeUTF8

      public static String decodeUTF8(byte[] bytes, int offset, int length)
      Decode a byte buffer using a UTF-8 decoder. If the decoder is not available, the byte buffer is decoded using the system's default decoder.
      Parameters:
      bytes - byte buffer
      offset - start offset
      length - count of bytes to be decoded
      Returns:
      the decoded string
    • decodeUTF8

      public static String decodeUTF8(byte[] bytes)
      Decode a byte buffer using a UTF-8 decoder. If the decoder is not available, the byte buffer is decoded using the system's default decoder.
      Parameters:
      bytes - mandatory byte buffer
      Returns:
      the decoded string
    • encodeASCII

      public static byte[] encodeASCII(String s)
      Encode a string using an ASCII encoder. If the encoder is not available, the string is encoded using the system's default encoder. This should never happen.
      Parameters:
      s - mandatory string
      Returns:
      the encoded byte buffer
    • decodeASCII

      public static String decodeASCII(byte[] bytes, int offset, int length)
      Decode a byte buffer using an ASCII decoder. If the decoder is not available, the byte buffer is decoded using the system's default decoder.
      Parameters:
      bytes - byte buffer
      offset - start offset
      length - count of bytes to be decoded
      Returns:
      the decoded string
    • decodeASCII

      public static String decodeASCII(byte[] bytes)
      Decode a byte buffer using an ASCII decoder. If the decoder is not available, the byte buffer is decoded using the system's default decoder.
      Parameters:
      bytes - mandatory byte buffer
      Returns:
      the decoded string
    • encodeLocal

      public static byte[] encodeLocal(String s)
      Encode a string using the local platform's default charset. This method is potentially dangerous.
      Parameters:
      s - mandatory string
      Returns:
      the encoded byte buffer
    • decodeLocal

      public static String decodeLocal(byte[] bytes, int offset, int length)
      Decode a byte buffer using the local platform's default charset. This method is potentially dangerous.
      Parameters:
      bytes - byte buffer
      offset - start offset
      length - count of bytes to be decoded
      Returns:
      the decoded string
    • decodeLocal

      public static String decodeLocal(byte[] bytes)
      Decode a byte buffer using the local platform's default charset. This method is potentially dangerous.
      Parameters:
      bytes - mandatory byte buffer
      Returns:
      the decoded string
    • encodeBinary

      public static byte[] encodeBinary(String s)
      Generate a byte array consisting of the low-bytes of the input string characters.
      Parameters:
      s -
      Returns:
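      Keeping only the low byte of each character amounts to a narrowing cast. A minimal sketch, with BinarySketch as a hypothetical stand-in rather than JEB's code:

```java
import java.util.Arrays;

final class BinarySketch {
    static byte[] encodeBinary(String s) {
        byte[] out = new byte[s.length()];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) s.charAt(i); // the narrowing cast keeps the low 8 bits
        }
        return out;
    }

    public static void main(String[] args) {
        // U+0141 has low byte 0x41, so it collapses to the same byte as 'A'.
        System.out.println(Arrays.toString(encodeBinary("A\u0141"))); // prints "[65, 65]"
    }
}
```

      The conversion is lossy for any character above U+00FF.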
    • getComparator

      public static Comparator<String> getComparator()
      Get a case-sensitive string comparator that treats hexadecimal sequences as numbers, and orders them accordingly, rather than as simple strings.

      Refer to NumberComparator and AlphanumCharComparator for details.

      Returns:
      the comparator
    • getComparator

      public static Comparator<String> getComparator(boolean caseSensitive, boolean scanHexadecimal)
      Get a string comparator that can treat hexadecimal sequences as numbers (and order them accordingly) rather than as simple strings.

      Refer to NumberComparator and AlphanumCharComparator for details.

      Parameters:
      caseSensitive -
      scanHexadecimal -
      Returns:
      the comparator
    • makeNewLine

      public static void makeNewLine(StringBuilder sb)
      Append a new-line character to the provided buffer unless the buffer is empty or the last character in the buffer is a new-line.
      Parameters:
      sb - a string builder
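      The rule reads naturally as a single guard. The sketch below assumes the new-line character is \n (the description does not say whether LINESEP is used instead); NewLineAppendSketch is a hypothetical name.

```java
final class NewLineAppendSketch {
    static void makeNewLine(StringBuilder sb) {
        int n = sb.length();
        // Skip empty buffers and buffers that already end with a new-line.
        if (n > 0 && sb.charAt(n - 1) != '\n') {
            sb.append('\n');
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("x");
        makeNewLine(sb);
        makeNewLine(sb); // second call is a no-op
        System.out.println(sb.length()); // prints "2"
    }
}
```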
    • randomUniqueId

      public static String randomUniqueId()
      Generate a 32-character long random unique identifier. The UID returned consists of the digits 0 to 9 and letters a to f (lower-case).
      Returns:
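      An identifier of this shape (32 lower-case hexadecimal digits) can be produced from a random UUID with its dashes removed. Whether JEB generates its identifiers this way is an assumption; UidSketch is a hypothetical name.

```java
import java.util.UUID;

final class UidSketch {
    static String randomUniqueId() {
        // A UUID prints as 32 lower-case hex digits plus 4 dashes.
        return UUID.randomUUID().toString().replace("-", "");
    }

    public static void main(String[] args) {
        System.out.println(randomUniqueId().length()); // prints "32"
    }
}
```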
    • pad

      public static CharSequence pad(char c, int count)
      Repeat character c count times and build a CharSequence from it. For example pad('0', 4) will return "0000".
      Parameters:
      c - inner character
      count - times to repeat character.
      Returns:
      CharSequence
    • capitalizeFirst

      public static String capitalizeFirst(String s)
      Capitalize the first character of a string.
      Parameters:
      s -
      Returns:
    • camelCaseToString

      public static String camelCaseToString(String s, boolean breakOnDigits, boolean keepUppercaseAcronyms) throws ParseException
      Convert a camel-case string to a sentence. Example:
       ThisIsACamelCaseString    -> This is a camel case string
       ThisIsACamel44CaseString  -> This is a camel44 case string
       CountryUSA                -> Country u s a
       
       with breakOnDigits=true:
       ThisIsACamel44CaseString  -> This is a camel 44 case string
       
       with keepUppercaseAcronyms=true:
       CountryUSA                -> Country USA
       
      A legal camel-case string always starts with an upper-case letter, and does not contain whitespace characters.
      Parameters:
      s - the input camel-case string
      breakOnDigits - if true, base-10 numbers will also be used as breaks
      keepUppercaseAcronyms - keep 2+ upper-case letter acronyms intact, eg: CountryUSA would be converted to Country USA instead of Country u s a
      Returns:
      the result sentence
      Throws:
      ParseException - if the input string was not camel-case formatted
    • camelCaseToString

      public static String camelCaseToString(String s) throws ParseException
      Convert a camel-case string to a sentence. Example:
       ThisIsACamelCaseString -> This is a camel case string
       
      A legal camel-case string always starts with an upper-case letter, and does not contain whitespace characters.
      Parameters:
      s - the input camel-case string
      Returns:
      the result sentence
      Throws:
      ParseException - if the input string was not camel-case formatted
    • hasRtl

      public static boolean hasRtl(CharSequence s)
      Determine if a string contains right-to-left (RTL) characters, eg Arabic or Hebrew characters.
      Parameters:
      s -
      Returns:
    • parseCommandline

      public static String[] parseCommandline(String s)
      Parse a string as a command line. Source: ant.jar.
      Parameters:
      s - the command line to process.
      Returns:
      the command line broken into strings
    • isWellFormedUTF8

      public static boolean isWellFormedUTF8(byte[] bytes)
      Parameters:
      bytes -
      Returns:
    • isWellFormedUTF8

      public static boolean isWellFormedUTF8(byte[] bytes, int off, int len)
      Parameters:
      bytes -
      off -
      len -
      Returns:
    • isPrintableUTF8Header

      public static boolean isPrintableUTF8Header(byte[] headerBytes)
      Validate if some starting bytes may be considered as a UTF-8 printable character header.
      Parameters:
      headerBytes - starting bytes. May be truncated without consequence (though detection is more accurate with more bytes).
      Returns:
      true if the bytes appear to represent UTF-8.
    • isPrintableCharsetHeader

      public static boolean isPrintableCharsetHeader(byte[] headerBytes, Charset charset)
      Validate if some starting bytes may be encoded with a particular charset.
      Parameters:
      headerBytes - starting bytes. May be truncated without consequence (though detection is more accurate with more bytes).
      charset - Charset to detect. Use isPrintableUTF8Header(byte[]) for UTF-8.
      Returns:
      true if the bytes appear to represent the provided charset.
    • decodeUTF8Ex

      public static String decodeUTF8Ex(byte[] bytes, boolean useStandardDecoderFirst)
      Parameters:
      bytes -
      useStandardDecoderFirst -
      Returns:
    • decodeUTF8Ex

      public static String decodeUTF8Ex(byte[] bytes, int off, int len, boolean useStandardDecoderFirst)
      Parameters:
      bytes -
      off -
      len -
      useStandardDecoderFirst -
      Returns:
    • getBOMSize

      public static int getBOMSize(byte[] input)
      Retrieve the size taken by the BOM or equivalent encoding mark. Detect UTF-8, UTF-16 and UTF-32.
      Parameters:
      input - byte array. Provide at least 4 bytes so that all supported BOMs can be detected.
      Returns:
      the size taken by BOM or 0 if no BOM was detected
    • readBOM

      public static String readBOM(byte[] input)
      Retrieve the charset from the starting bytes. Detects UTF-8, UTF-16LE/BE and UTF-32LE/BE.
      Parameters:
      input - first bytes of a string
      Returns:
      the detected charset or null if no BOM was detected.
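      The BOM handling described for getBOMSize and readBOM can be sketched with the standard Unicode byte-order marks. BomSketch and its method names are hypothetical, and the charset names returned are an assumption; the strings actually returned by readBOM are not documented here. Note that the UTF-32LE mark (FF FE 00 00) begins with the UTF-16LE mark (FF FE), which is why four bytes are needed and why the longer marks are tested first.

```java
// Hypothetical BOM detector; not the JEB implementation.
final class BomSketch {
    // Returns an assumed charset name, or null if no BOM is present.
    static String readBom(byte[] b) {
        if (startsWith(b, 0x00, 0x00, 0xFE, 0xFF)) return "UTF-32BE";
        if (startsWith(b, 0xFF, 0xFE, 0x00, 0x00)) return "UTF-32LE";
        if (startsWith(b, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(b, 0xFE, 0xFF)) return "UTF-16BE";
        if (startsWith(b, 0xFF, 0xFE)) return "UTF-16LE";
        return null;
    }

    // Returns the BOM length in bytes, or 0 if no BOM is present.
    static int bomSize(byte[] b) {
        String cs = readBom(b);
        if (cs == null) return 0;
        if (cs.startsWith("UTF-32")) return 4;
        return cs.equals("UTF-8") ? 3 : 2;
    }

    private static boolean startsWith(byte[] b, int... prefix) {
        if (b == null || b.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((b[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }
}
```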
    • getInitialBlankSize

      public static int getInitialBlankSize(InputStream stream, boolean includeBOM, char... extraWhitespaceCharacters) throws IOException
      Retrieve the number of initial blank (non-data) bytes at the beginning of a stream.
      Parameters:
      stream - input stream to analyze
      includeBOM - if true, a BOM at the start of the stream is counted as initial blank bytes
      Returns:
      the number of bytes considered as blank
      Throws:
      IOException
    • count

      public static int count(String str, char ch)
      Count the number of occurrences of a character within a string.
      Parameters:
      str - haystack
      ch - needle
      Returns:
      the number of occurrences
    • count

      public static int count(String str, String sub, boolean countOverlaps)
      Count the number of occurrences of a sub-string within a string.

      Note: when countOverlaps is true, a search for 'aaa' inside 'aaaaaa' returns 4, not 2.

      Parameters:
      str - haystack
      sub - needle
      countOverlaps - if true, a search for 'aaa' inside 'aaaaaa' will return 4 instead of 2
      Returns:
      the number of occurrences
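      The effect of countOverlaps can be reproduced with a short indexOf loop. This is a sketch under assumptions, not the JEB implementation: CountSketch is a hypothetical name, and returning 0 for null or empty arguments is an illustrative choice.

```java
// Hypothetical stand-in for Strings.count(String, String, boolean).
final class CountSketch {
    static int count(String str, String sub, boolean countOverlaps) {
        if (str == null || sub == null || sub.isEmpty()) {
            return 0; // assumption: degenerate inputs yield no matches
        }
        int n = 0;
        int from = 0;
        while ((from = str.indexOf(sub, from)) >= 0) {
            n++;
            // Advance by one character to allow overlapping matches,
            // or past the whole match to forbid them.
            from += countOverlaps ? 1 : sub.length();
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(count("aaaaaa", "aaa", true));  // prints 4
        System.out.println(count("aaaaaa", "aaa", false)); // prints 2
    }
}
```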
    • like

      public static boolean like(String str, String pat)
      Check whether an input string matches a provided regex pattern. This method is case-sensitive.
      Parameters:
      str - a string
      pat - a regular expression
      Returns:
    • likei

      public static boolean likei(String str, String pat)
      Check whether an input string matches a provided regex pattern. This method is case-insensitive.
      Parameters:
      str - a string
      pat - a regular expression
      Returns:
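      The difference between like and likei can be illustrated with java.util.regex. The sketch assumes that "matches" means the pattern must match the whole input (Matcher.matches()); the documentation does not say whether the real methods do that or search for a partial match. LikeSketch is a hypothetical name.

```java
import java.util.regex.Pattern;

// Hypothetical stand-ins for Strings.like and Strings.likei.
final class LikeSketch {
    // Case-sensitive, whole-string match (assumption).
    static boolean like(String str, String pat) {
        return Pattern.compile(pat).matcher(str).matches();
    }

    // Case-insensitive variant of the same check.
    static boolean likei(String str, String pat) {
        return Pattern.compile(pat, Pattern.CASE_INSENSITIVE).matcher(str).matches();
    }
}
```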
    • starMatches

      public static boolean starMatches(String str, String pat)
      Check whether an input string matches a provided pattern using a StarMatcher.
      Parameters:
      str - a string
      pat - a wildcard pattern
      Returns:
    • findWordBoundaries

      public static int[] findWordBoundaries(String str, int offset)
      Find the word located at a given offset in a string.
      Parameters:
      str - a string
      offset - offset in the string, for which the underlying word should be found
      Returns:
      a tuple (start, end) in the string, specifying the word boundaries; if nothing is found, the tuple returned will be (provided_offset, provided_offset)
    • findWordBoundaries

      public static int[] findWordBoundaries(String str, int offset, Predicate<Character> boundaryTester)
      Find the word located at a given offset in a string.
      Parameters:
      str - a string
      offset - offset in the string, for which the underlying word should be found
      boundaryTester - optional custom boundary tester; leave null to use the default boundary tester (in that case, characters considered as boundaries are: white-space characters, punctuation characters except dash and underscore)
      Returns:
      a tuple (start, end) in the string, specifying the word boundaries; if nothing is found, the tuple returned will be (provided_offset, provided_offset)
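      The boundary search can be sketched as a scan left and right from the offset until the tester reports a boundary. Several points are assumptions rather than documented behavior: WordBoundsSketch is a hypothetical name, the end index is treated as exclusive, and the default tester only approximates "white-space and punctuation except dash and underscore" by treating every character that is not a letter, digit, dash or underscore as a boundary.

```java
import java.util.function.Predicate;

// Hypothetical stand-in for Strings.findWordBoundaries; not the JEB implementation.
final class WordBoundsSketch {
    // Approximate default: anything that is not a letter, digit, '-' or '_' is a boundary.
    static final Predicate<Character> DEFAULT =
            c -> c != '-' && c != '_' && !Character.isLetterOrDigit(c);

    static int[] find(String str, int offset, Predicate<Character> boundaryTester) {
        Predicate<Character> t = boundaryTester != null ? boundaryTester : DEFAULT;
        // Nothing found: out-of-range offset or offset sitting on a boundary character.
        if (str == null || offset < 0 || offset >= str.length()
                || t.test(str.charAt(offset))) {
            return new int[] { offset, offset };
        }
        int start = offset;
        while (start > 0 && !t.test(str.charAt(start - 1))) {
            start--;
        }
        int end = offset + 1; // exclusive end index (assumption)
        while (end < str.length() && !t.test(str.charAt(end))) {
            end++;
        }
        return new int[] { start, end };
    }
}
```

      With these assumptions, calling find("hello big-world!", 8, null) yields (6, 15), which covers "big-world" because the dash is not a boundary by default.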