Functions
Last updated
Last updated
There are at least* two types of functions - regular functions (they are just called “functions”) and aggregate functions. These are completely different concepts. Regular functions work as if they are applied to each row separately (for each row, the result of the function does not depend on the other rows). Aggregate functions accumulate a set of values from various rows (i.e. they depend on the entire set of rows).
In this section we discuss regular functions. For aggregate functions, see the section “Aggregate functions”.
* - There is a third type of function that the ‘arrayJoin’ function belongs to; table functions can also be mentioned separately.*
For all arithmetic functions, the result type is calculated as the smallest number type that the result fits in, if there is such a type. The minimum is taken simultaneously based on the number of bits, whether it is signed, and whether it floats. If there are not enough bits, the highest bit type is taken.
Example
Arithmetic functions work for any pair of types from UInt8, UInt16, UInt32, UInt64, Int8, Int16, Int32, Int64, Float32, or Float64.
Overflow is produced the same way as in C++.
Calculates the sum of the numbers. You can also add integer numbers with a date or date and time. In the case of a date, adding an integer means adding the corresponding number of days. For a date with time, it means adding the corresponding number of seconds.
Example
Calculates the difference. The result is always signed.
You can also calculate integer numbers from a date or date with time. The idea is the same – see above for ‘plus’.
Example
Calculates the product of the numbers.
Example
Calculates the quotient of the numbers. The result type is always a floating-point type. It is not integer division. For integer division, use the ‘intDiv’ function. When dividing by zero you get ‘inf’, ‘-inf’, or ‘nan’.
Example
Calculates the quotient of the numbers. Divides into integers, rounding down (by the absolute value). An exception is thrown when dividing by zero or when dividing a minimal negative number by minus one.
Example
Differs from ‘intDiv’ in that it returns zero when dividing by zero or when dividing a minimal negative number by minus one.
Example
Calculates the remainder when dividing a
by b
. The result type is an integer if both inputs are integers. If one of the inputs is a floating-point number, the result is a floating-point number. The remainder is computed like in C++. Truncated division is used for negative numbers. An exception is thrown when dividing by zero or when dividing a minimal negative number by minus one.
Example
Example
Calculates a number with the reverse sign. The result is always signed.
Example
Calculates the absolute value of the number (a). That is, if a \< 0, it returns -a. For unsigned types it does not do anything. For signed integer types, it returns an unsigned number.
Example
Returns the greatest common divisor of the numbers. An exception is thrown when dividing by zero or when dividing a minimal negative number by minus one.
Example
Returns the least common multiple of the numbers. An exception is thrown when dividing by zero or when dividing a minimal negative number by minus one.
Example
Checks whether the input array is empty.
Syntax
An array is considered empty if it does not contain any elements.
NOTE
Arguments
Returned value
Returns 1
for an empty array or 0
for a non-empty array.
Example
Query:
Result:
Checks whether the input array is non-empty.
Syntax
An array is considered non-empty if it contains at least one element.
NOTE
Arguments
Returned value
Returns 1
for a non-empty array or 0
for an empty array.
Example
Query:
Result:
Returns the number of items in the array. The result type is UInt64. The function also works for strings.
Syntax
Example
Accepts zero arguments and returns an empty array of the appropriate type.
Accepts an empty array and returns a one-element array that is equal to the default value.
Returns an array of UInt
numbers from start
to end - 1
by step
.
Syntax
Arguments
Returned value
Array of UInt
numbers from start
to end - 1
by step
.
Implementation details
All arguments must be positive values: start
, end
, step
are UInt
data types, as well as elements of the returned array.
Examples
Query:
Result:
Creates an array from the function arguments. The arguments must be constants and have types that have the smallest common type. At least one argument must be passed, because otherwise it isn’t clear which type of array to create. That is, you can’t use this function to create an empty array (to do that, use the ‘emptyArray*’ function described above). Returns an ‘Array(T)’ type result, where ‘T’ is the smallest common type out of the passed arguments.
Example
Combines arrays passed as arguments.
Arguments
Checks whether the ‘arr’ array has the ‘elem’ element. Returns 0 if the element is not in the array, or 1 if it is.
NULL
is processed as a value.
Checks whether one array is a subset of another.
Arguments
set
– Array of any type with a set of elements.
subset
– Array of any type with elements that should be tested to be a subset of set
.
Return values
1
, if set
contains all of the elements from subset
.
0
, otherwise.
Peculiar properties
An empty array is a subset of any array.
Null
processed as a value.
Order of values in both of arrays does not matter.
Examples
SELECT hasAll([], [])
returns 1.
SELECT hasAll([1, Null], [Null])
returns 1.
SELECT hasAll([1.0, 2, 3, 4], [1, 3])
returns 1.
SELECT hasAll(['a', 'b'], ['a'])
returns 1.
SELECT hasAll([1], ['a'])
returns 0.
SELECT hasAll([[1, 2], [3, 4]], [[1, 2], [3, 5]])
returns 0.
Checks whether two arrays have intersection by some elements.
Arguments
array1
– Array of any type with a set of elements.
array2
– Array of any type with a set of elements.
Return values
1
, if array1
and array2
have one similar element at least.
0
, otherwise.
Peculiar properties
Null
processed as a value.
Order of values in both of arrays does not matter.
Examples
SELECT hasAny([1], [])
returns 0
.
SELECT hasAny([Null], [Null, 1])
returns 1
.
SELECT hasAny([-128, 1., 512], [1])
returns 1
.
SELECT hasAny([[1, 2], [3, 4]], ['a', 'c'])
returns 0
.
SELECT hasAll([[1, 2], [3, 4]], [[1, 2], [1, 2]])
returns 1
.
Checks whether all the elements of array2 appear in array1 in the same exact order. Therefore, the function will return 1, if and only if array1 = prefix + array2 + suffix
.
In other words, the functions will check whether all the elements of array2
are contained in array1
like the hasAll
function. In addition, it will check that the elements are observed in the same order in both array1
and array2
.
For Example:
hasSubstr([1,2,3,4], [2,3])
returns 1. However, hasSubstr([1,2,3,4], [3,2])
will return 0
.
hasSubstr([1,2,3,4], [1,2,3])
returns 1. However, hasSubstr([1,2,3,4], [1,2,4])
will return 0
.
Arguments
array1
– Array of any type with a set of elements.
array2
– Array of any type with a set of elements.
Return values
1
, if array1
contains array2
.
0
, otherwise.
Peculiar properties
The function will return 1
if array2
is empty.
Null
processed as a value. In other words hasSubstr([1, 2, NULL, 3, 4], [2,3])
will return 0
. However, hasSubstr([1, 2, NULL, 3, 4], [2,NULL,3])
will return 1
Order of values in both of arrays does matter.
Examples
SELECT hasSubstr([], [])
returns 1.
SELECT hasSubstr([1, Null], [Null])
returns 1.
SELECT hasSubstr([1.0, 2, 3, 4], [1, 3])
returns 0.
SELECT hasSubstr(['a', 'b'], ['a'])
returns 1.
SELECT hasSubstr(['a', 'b' , 'c'], ['a', 'b'])
returns 1.
SELECT hasSubstr(['a', 'b' , 'c'], ['a', 'c'])
returns 0.
SELECT hasSubstr([[1, 2], [3, 4], [5, 6]], [[1, 2], [3, 4]])
returns 1.
Returns the index of the first ‘x’ element (starting from 1) if it is in the array, or 0 if it is not.
Example:
Elements set to NULL
are handled as normal values.
Returns the number of elements for which func(arr1[i], …, arrN[i])
returns something other than 0. If func
is not specified, it returns the number of non-zero elements in the array.
Example
Returns the number of elements in the array equal to x. Equivalent to arrayCount (elem -> elem = x, arr).
NULL
elements are handled as separate values.
Example
Returns the array [1, 2, 3, …, length (arr) ]
This function is normally used with ARRAY JOIN. It allows counting something just once for each array after applying ARRAY JOIN. Example:
In this example, Reaches is the number of conversions (the strings received after applying ARRAY JOIN), and Hits is the number of pageviews (strings before ARRAY JOIN). In this particular case, you can get the same result in an easier way:
This function can also be used in higher-order functions. For example, you can use it to get array indexes for elements that match a condition.
Returns an array the same size as the source array, indicating for each element what its position is among elements with the same value. For example: arrayEnumerateUniq([10, 20, 10, 30]) = [1, 1, 2, 1].
This function is useful when using ARRAY JOIN and aggregation of array elements. Example:
In this example, each goal ID has a calculation of the number of conversions (each element in the Goals nested data structure is a goal that was reached, which we refer to as a conversion) and the number of sessions. Without ARRAY JOIN, we would have counted the number of sessions as sum(Sign). But in this particular case, the rows were multiplied by the nested Goals structure, so in order to count each session one time after this, we apply a condition to the value of the arrayEnumerateUniq(Goals.ID) function.
The arrayEnumerateUniq function can take multiple arrays of the same size as arguments. In this case, uniqueness is considered for tuples of elements in the same positions in all the arrays.
This is necessary when using ARRAY JOIN with a nested data structure and further aggregation across multiple elements in this structure.
Removes the last item from the array.
Arguments
array
– Array.
Example
Removes the first item from the array.
Arguments
array
– Array.
Example
Adds one item to the end of the array.
Arguments
array
– Array.
Example
Adds one element to the beginning of the array.
Arguments
array
– Array.
Example
Changes the length of the array.
Arguments:
array
— Array.
size
— Required length of the array.
If size
is less than the original size of the array, the array is truncated from the right.
If size
is larger than the initial size of the array, the array is extended to the right with extender
values or default values for the data type of the array items.
extender
— Value for extending an array. Can be NULL
.
Returned value:
An array of length size
.
Examples of calls
Returns a slice of the array.
Arguments
array
– Array of data.
offset
– Indent from the edge of the array. A positive value indicates an offset on the left, and a negative value is an indent on the right. Numbering of the array items begins with 1.
length
– The length of the required slice. If you specify a negative value, the function returns an open slice [offset, array_length - length]
. If you omit the value, the function returns the slice [offset, the_end_of_array]
.
Example
Array elements set to NULL
are handled as normal values.
Sorts the elements of the arr
array in ascending order. If the func
function is specified, sorting order is determined by the result of the func
function applied to the elements of the array. If func
accepts multiple arguments, the arraySort
function is passed several arrays that the arguments of func
will correspond to. Detailed examples are shown at the end of arraySort
description.
Example of integer values sorting:
Example of string values sorting:
Consider the following sorting order for the NULL
, NaN
and Inf
values:
-Inf
values are first in the array.
NULL
values are last in the array.
NaN
values are right before NULL
.
Inf
values are right before NaN
.
Let’s consider the following example:
The lambda function can accept multiple arguments. In this case, you need to pass the arraySort
function several arrays of identical length that the arguments of lambda function will correspond to. The resulting array will consist of elements from the first input array; elements from the next input array(s) specify the sorting keys. For example:
Here, the elements that are passed in the second array ([2, 1]) define a sorting key for the corresponding element from the source array ([‘hello’, ‘world’]), that is, [‘hello’ –> 2, ‘world’ –> 1]. Since the lambda function does not use x
, actual values of the source array do not affect the order in the result. So, ‘hello’ will be the second element in the result, and ‘world’ will be the first.
Other examples are shown below.
NOTE
Sorts the elements of the arr
array in descending order. If the func
function is specified, arr
is sorted according to the result of the func
function applied to the elements of the array, and then the sorted array is reversed. If func
accepts multiple arguments, the arrayReverseSort
function is passed several arrays that the arguments of func
will correspond to. Detailed examples are shown at the end of arrayReverseSort
description.
Example of integer values sorting:
Example of string values sorting:
Consider the following sorting order for the NULL
, NaN
and Inf
values:
Inf
values are first in the array.
NULL
values are last in the array.
NaN
values are right before NULL
.
-Inf
values are right before NaN
.
The array is sorted in the following way:
At first, the source array ([1, 2, 3]) is sorted according to the result of the lambda function applied to the elements of the array. The result is an array [3, 2, 1].
Array that is obtained on the previous step, is reversed. So, the final result is [1, 2, 3].
The lambda function can accept multiple arguments. In this case, you need to pass the arrayReverseSort
function several arrays of identical length that the arguments of lambda function will correspond to. The resulting array will consist of elements from the first input array; elements from the next input array(s) specify the sorting keys. For example:
In this example, the array is sorted in the following way:
At first, the source array ([‘hello’, ‘world’]) is sorted according to the result of the lambda function applied to the elements of the arrays. The elements that are passed in the second array ([2, 1]), define the sorting keys for corresponding elements from the source array. The result is an array [‘world’, ‘hello’].
Array that was sorted on the previous step, is reversed. So, the final result is [‘hello’, ‘world’].
Other examples are shown below.
If one argument is passed, it counts the number of different elements in the array. If multiple arguments are passed, it counts the number of different tuples of elements at corresponding positions in multiple arrays.
If you want to get a list of unique items in an array, you can use arrayReduce(‘groupUniqArray’, arr).
Example
Calculates the difference between adjacent array elements. Returns an array where the first element will be 0, the second is the difference between a[1] - a[0]
, etc. The type of elements in the resulting array is determined by the type inference rules for subtraction (e.g. UInt8
- UInt8
= Int16
).
Syntax
Arguments
Returned values
Returns an array of differences between adjacent elements.
Example
Query:
Result:
Example of the overflow due to result type Int64:
Query:
Result:
Takes an array, returns an array containing the distinct elements only.
Syntax
Arguments
Returned values
Returns an array containing the distinct elements.
Example
Query:
Result:
Returns an array of the same size as the source array, indicating where each element first appears in the source array.
Example
Takes multiple arrays, returns an array with elements that are present in all source arrays.
Example
Applies an aggregate function to array elements and returns its result. The name of the aggregation function is passed as a string in single quotes 'max'
, 'sum'
. When using parametric aggregate functions, the parameter is indicated after the function name in parentheses 'uniqUpTo(6)'
.
Syntax
Arguments
Returned value
Example
Query:
Result:
If an aggregate function takes multiple arguments, then this function must be applied to multiple arrays of the same size.
Query:
Result:
Example with a parametric aggregate function:
Query:
Result:
Applies an aggregate function to array elements in given ranges and returns an array containing the result corresponding to each range. The function will return the same result as multiple arrayReduce(agg_func, arraySlice(arr1, index, length), ...)
.
Syntax
Arguments
Returned value
Array containing results of the aggregate function over specified ranges.
Example
Query:
Result:
Returns an array of the same size as the original array containing the elements in reverse order.
Example:
Converts an array of arrays to a flat array.
Function:
Applies to any depth of nested arrays.
Does not change arrays that are already flat.
The flattened array contains all the elements from all source arrays.
Syntax
Alias: flatten
.
Arguments
Examples
Removes consecutive duplicate elements from an array. The order of result values is determined by the order in the source array.
Syntax
Arguments
Returned value
The array without duplicate.
Type: Array
.
Example
Query:
Result:
Combines multiple arrays into a single array. The resulting array contains the corresponding elements of the source arrays grouped into tuples in the listed order of arguments.
Syntax
Arguments
The function can take any number of arrays of different types. All the input arrays must be of equal size.
Returned value
Example
Query:
Result:
Syntax
Arguments
arr_scores
— scores prediction model gives.
arr_labels
— labels of samples, usually 1 for positive sample and 0 for negtive sample.
Returned value
Returns AUC value with type Float64.
Example
Query:
Result:
Returns an array obtained from the original arrays by application of func(arr1[i], …, arrN[i])
for each element. Arrays arr1
… arrN
must have the same number of elements.
Examples
The following example shows how to create a tuple of elements from different arrays:
Returns an array containing only the elements in arr1
for which func(arr1[i], …, arrN[i])
returns something other than 0.
Examples
Scan through arr1
from the first element to the last element and replace arr1[i]
by arr1[i - 1]
if func(arr1[i], …, arrN[i])
returns 0. The first element of arr1
will not be replaced.
Examples
Scan through arr1
from the last element to the first element and replace arr1[i]
by arr1[i + 1]
if func(arr1[i], …, arrN[i])
returns 0. The last element of arr1
will not be replaced.
Examples:
Split arr1
into multiple arrays. When func(arr1[i], …, arrN[i])
returns something other than 0, the array will be split on the left hand side of the element. The array will not be split before the first element.
Examples:
Split arr1
into multiple arrays. When func(arr1[i], …, arrN[i])
returns something other than 0, the array will be split on the right hand side of the element. The array will not be split after the last element.
Examples:
Returns 1 if there is at least one element in arr
for which func(arr1[i], …, arrN[i])
returns something other than 0. Otherwise, it returns 0.
Example
Returns 1 if func(arr1[i], …, arrN[i])
returns something other than 0 for all the elements in arrays. Otherwise, it returns 0.
Example
Returns the first element in the arr1
array for which func(arr1[i], …, arrN[i])
returns something other than 0.
Example
Returns the index of the first element in the arr1
array for which func(arr1[i], …, arrN[i])
returns something other than 0.
Example
Returns the minimum of elements in the source array.
If the func
function is specified, returns the mininum of elements converted by this function.
Syntax
Arguments
Returned value
The minimum of function values (or the array minimum).
Type: if func
is specified, matches func
return value type, else matches the array elements type.
Examples
Query:
Result:
Query:
Result:
Returns the maximum of elements in the source array.
If the func
function is specified, returns the maximum of elements converted by this function.
Syntax
Arguments
Returned value
The maximum of function values (or the array maximum).
Type: if func
is specified, matches func
return value type, else matches the array elements type.
Examples
Query:
Result:
Query:
Result:
Returns the sum of elements in the source array.
If the func
function is specified, returns the sum of elements converted by this function.
Syntax
Arguments
Returned value
The sum of the function values (or the array sum).
Examples
Query:
Result:
Query:
Result:
Returns the average of elements in the source array.
If the func
function is specified, returns the average of elements converted by this function.
Syntax
Arguments
Returned value
The average of function values (or the array average).
Examples
Query:
Result:
Query:
Result:
Returns an array of partial sums of elements in the source array (a running sum). If the func
function is specified, then the values of the array elements are converted by func(arr1[i], …, arrN[i])
before summing.
Example:
Same as arrayCumSum
, returns an array of partial sums of elements in the source array (a running sum). Different arrayCumSum
, when then returned value contains a value less than zero, the value is replace with zero and the subsequent calculation is performed with zero parameters. For example:
Syntax
Arguments
Returned value
A product of array's elements.
Examples
Query:
Result:
Query:
Bit functions work for any pair of types from UInt8
, UInt16
, UInt32
, UInt64
, Int8
, Int16
, Int32
, Int64
, Float32
, or Float64
. Some functions support String
and FixedString
types.
The result type is an integer with bits equal to the maximum bits of its arguments. If at least one of the arguments is signed, the result is a signed number. If an argument is a floating-point number, it is cast to Int64.
Shifts the binary representation of a value to the left by a specified number of bit positions.
A FixedString
or a String
is treated as a single multibyte value.
Bits of a FixedString
value are lost as they are shifted out. On the contrary, a String
value is extended with additional bytes, so no bits are lost.
Syntax
Arguments
Returned value
Shifted value.
The type of the returned value is the same as the type of the input value.
Example
Result:
Shifts the binary representation of a value to the right by a specified number of bit positions.
A FixedString
or a String
is treated as a single multibyte value. Note that the length of a String
value is reduced as bits are shifted out.
Syntax
Arguments
Returned value
Shifted value.
The type of the returned value is the same as the type of the input value.
Example
Query:
Result:
Syntax
Arguments
number
– Integer number.
index
– Position of bit.
Returned values
Returns a value of bit at specified position.
Type: UInt8
.
Example
For example, the number 43 in base-2 (binary) numeral system is 101011.
Query:
Result:
Another example:
Query:
Result:
The conjuction for bitwise operations:
0 AND 0 = 0
0 AND 1 = 0
1 AND 0 = 0
1 AND 1 = 1
Syntax
Arguments
number
– Integer number.
index1
, index2
, index3
, index4
– Positions of bit. For example, for set of positions (index1
, index2
, index3
, index4
) is true if and only if all of its positions are true (index1
⋀ index2
, ⋀ index3
⋀ index4
).
Returned values
Returns result of logical conjuction.
Type: UInt8
.
Example
For example, the number 43 in base-2 (binary) numeral system is 101011.
Query:
Result:
Another example:
Query:
Result:
The disjunction for bitwise operations:
0 OR 0 = 0
0 OR 1 = 1
1 OR 0 = 1
1 OR 1 = 1
Syntax
Arguments
number
– Integer number.
index1
, index2
, index3
, index4
– Positions of bit.
Returned values
Returns result of logical disjuction.
Type: UInt8
.
Example
For example, the number 43 in base-2 (binary) numeral system is 101011.
Query:
Result:
Another example:
Query:
Result:
Calculates the number of bits set to one in the binary representation of a number.
Syntax
Arguments
Returned value
Number of bits set to one in the input number.
Type: UInt8
.
Example
Take for example the number 333. Its binary representation: 0000000101001101.
Query:
Result:
Syntax
Arguments
Returned value
The Hamming distance.
Examples
Query:
Result:
Result:
Bitmap functions work for two bitmaps Object value calculation, it is to return new bitmap or cardinality while using formula calculation, such as and, or, xor, and not, etc.
There are 2 kinds of construction methods for Bitmap Object. One is to be constructed by aggregation function groupBitmap with -State, the other is to be constructed by Array Object. It is also to convert Bitmap Object to Array Object.
RoaringBitmap is wrapped into a data structure while actual storage of Bitmap objects. When the cardinality is less than or equal to 32, it uses Set objet. When the cardinality is greater than 32, it uses RoaringBitmap object. That is why storage of low cardinality set is faster.
Build a bitmap from unsigned integer array.
Arguments
array
– Unsigned integer array.
Example
Convert bitmap to integer array.
Arguments
bitmap
– Bitmap object.
Example
Return subset in specified range (not include the range_end).
Arguments
Example
Creates a subset of bitmap with n elements taken between range_start
and cardinality_limit
.
Syntax
Arguments
Returned value
The subset.
Example
Query:
Result:
Syntax
Arguments
Returned value
The subset.
Example
Query:
Result:
Checks whether the bitmap contains an element.
Arguments
Returned values
0 — If haystack
does not contain needle
.
1 — If haystack
contains needle
.
Type: UInt8
.
Example
Checks whether two bitmaps have intersection by some elements.
Arguments
bitmap*
– Bitmap object.
Return values
1
, if bitmap1
and bitmap2
have one similar element at least.
0
, otherwise.
Example
Analogous to hasAll(array, array)
returns 1 if the first bitmap contains all the elements of the second one, 0 otherwise. If the second argument is an empty bitmap then returns 1.
Arguments
bitmap
– Bitmap object.
Example
Retrun bitmap cardinality of type UInt64.
Arguments
bitmap
– Bitmap object.
Example
Retrun the smallest value of type UInt64 in the set, UINT32_MAX if the set is empty.
Arguments
bitmap
– Bitmap object.
Example
Retrun the greatest value of type UInt64 in the set, 0 if the set is empty.
Arguments
bitmap
– Bitmap object.
Example
Transform an array of values in a bitmap to another array of values, the result is a new bitmap.
Arguments
bitmap
– Bitmap object.
from_array
– UInt32 array. For idx in range [0, from_array.size()), if bitmap contains from_array[idx], then replace it with to_array[idx]. Note that the result depends on array ordering if there are common elements between from_array and to_array.
to_array
– UInt32 array, its size shall be the same to from_array.
Example
Two bitmap and calculation, the result is a new bitmap.
Arguments
bitmap
– Bitmap object.
Example
Two bitmap or calculation, the result is a new bitmap.
Arguments
bitmap
– Bitmap object.
Example
Two bitmap xor calculation, the result is a new bitmap.
Arguments
bitmap
– Bitmap object.
Example
Two bitmap andnot calculation, the result is a new bitmap.
Arguments
bitmap
– Bitmap object.
Example
Two bitmap and calculation, return cardinality of type UInt64.
Arguments
bitmap
– Bitmap object.
Example
Two bitmap or calculation, return cardinality of type UInt64.
Arguments
bitmap
– Bitmap object.
Example
Two bitmap xor calculation, return cardinality of type UInt64.
Arguments
bitmap
– Bitmap object.
Example
Two bitmap andnot calculation, return cardinality of type UInt64.
Arguments
bitmap
– Bitmap object.
Example
Controls conditional branching. Unlike most systems, ClickHouse always evaluates both expressions then
and else
.
Syntax
If the condition cond
evaluates to a non-zero value, returns the result of the expression then
, and the result of the expression else
, if present, is skipped. If the cond
is zero or NULL
, then the result of the then
expression is skipped and the result of the else
expression, if present, is returned.
Arguments
cond
– The condition for evaluation that can be zero or not. The type is UInt8, Nullable(UInt8) or NULL.
then
– The expression to return if condition is met.
else
– The expression to return if condition is not met.
Returned values
The function executes then
and else
expressions and returns its result, depending on whether the condition cond
ended up being zero or not.
Example
Query:
Result:
Query:
Result:
then
and else
must have the lowest common type.
Example:
Take this LEFT_RIGHT
table:
The following query compares left
and right
values:
It works same as if
function.
Syntax: cond ? then : else
Returns then
if the cond
evaluates to be true (greater than zero), otherwise returns else
.
cond
must be of type of UInt8
, and then
and else
must have the lowest common type.
then
and else
can be NULL
See also
Syntax
Arguments
cond_N
— The condition for the function to return then_N
.
then_N
— The result of the function when executed.
else
— The result of the function if none of the conditions is met.
The function accepts 2N+1
parameters.
Returned values
The function returns one of the values then_N
or else
, depending on the conditions cond_N
.
Example
Again using LEFT_RIGHT
table.
Support for time zones.
All functions for working with the date and time that have a logical use for the time zone can accept a second optional time zone argument. Example: Asia/Yekaterinburg. In this case, they use the specified time zone instead of the local (default) one.
Returns the timezone of the server. If it is executed in the context of a distributed table, then it generates a normal column with values relevant to each shard. Otherwise it produces a constant value.
Syntax
Alias: timezone
.
Returned value
Timezone.
Converts time or date and time to the specified time zone. The time zone is an attribute of the Date
and DateTime
data types. The internal value (number of seconds) of the table field or of the resultset's column does not change, the column's type changes and its string representation changes accordingly.
Syntax
Alias: toTimezone
.
Arguments
Returned value
Date and time.
Example
Query:
Result:
toTimeZone(time_utc, 'Asia/Yekaterinburg')
changes the DateTime('UTC')
type to DateTime('Asia/Yekaterinburg')
. The value (Unixtimestamp) 1546300800 stays the same, but the string representation (the result of the toString() function) changes from time_utc: 2019-01-01 00:00:00
to time_yekat: 2019-01-01 05:00:00
.
Syntax
Alias: timezoneOf
.
Arguments
Returned value
Timezone name.
Example
Query:
Result:
Syntax
Alias: timezoneOffset
.
Arguments
Returned value
Offset from UTC in seconds.
Example
Query:
Result:
Converts a date or date with time to a UInt16 number containing the year number (AD).
Alias: YEAR
.
Converts a date or date with time to a UInt8 number containing the quarter number.
Alias: QUARTER
.
Converts a date or date with time to a UInt8 number containing the month number (1-12).
Alias: MONTH
.
Converts a date or date with time to a UInt16 number containing the number of the day of the year (1-366).
Alias: DAYOFYEAR
.
Converts a date or date with time to a UInt8 number containing the number of the day of the month (1-31).
Aliases: DAYOFMONTH
, DAY
.
Converts a date or date with time to a UInt8 number containing the number of the day of the week (Monday is 1, and Sunday is 7).
Alias: DAYOFWEEK
.
Converts a date with time to a UInt8 number containing the number of the hour in 24-hour time (0-23). This function assumes that if clocks are moved ahead, it is by one hour and occurs at 2 a.m., and if clocks are moved back, it is by one hour and occurs at 3 a.m. (which is not always true – even in Moscow the clocks were twice changed at a different time).
Alias: HOUR
.
Converts a date with time to a UInt8 number containing the number of the minute of the hour (0-59).
Alias: MINUTE
.
Converts a date with time to a UInt8 number containing the number of the second in the minute (0-59). Leap seconds are not accounted for.
Alias: SECOND
.
Syntax
Returned value
Returns the unix timestamp.
Type: UInt32
.
Example
Query:
Result:
NOTE
Behavior for
enable_extended_results_for_datetime_functions = 0
: Functions toStartOfYear
, toStartOfISOYear
, toStartOfQuarter
, toStartOfMonth
, toStartOfWeek
, toLastDayOfMonth
, toMonday
return Date
or DateTime
. Functions toStartOfDay
, toStartOfHour
, toStartOfFifteenMinutes
, toStartOfTenMinutes
, toStartOfFiveMinutes
, toStartOfMinute
, timeSlot
return DateTime
. Though these functions can take values of the extended types Date32
and DateTime64
as an argument, passing them a time outside the normal range (year 1970 to 2149 for Date
/ 2106 for DateTime
) will produce wrong results.
enable_extended_results_for_datetime_functions = 1
:
Functions toStartOfYear
, toStartOfISOYear
, toStartOfQuarter
, toStartOfMonth
, toStartOfWeek
, toLastDayOfMonth
, toMonday
return Date
or DateTime
if their argument is a Date
or DateTime
, and they return Date32
or DateTime64
if their argument is a Date32
or DateTime64
.
Functions toStartOfDay
, toStartOfHour
, toStartOfFifteenMinutes
, toStartOfTenMinutes
, toStartOfFiveMinutes
, toStartOfMinute
, timeSlot
return DateTime
if their argument is a Date
or DateTime
, and they return DateTime64
if their argument is a Date32
or DateTime64
.
Rounds down a date or date with time to the first day of the year. Returns the date.
Rounds down a date or date with time to the first day of ISO year. Returns the date.
Rounds down a date or date with time to the first day of the quarter. The first day of the quarter is either 1 January, 1 April, 1 July, or 1 October. Returns the date.
Rounds down a date or date with time to the first day of the month. Returns the date.
NOTE
The behavior of parsing incorrect dates is implementation specific. ClickHouse may return zero date, throw an exception or do “natural” overflow.
If toLastDayOfMonth
is called with an argument of type Date
greater then 2149-05-31, the result will be calculated from the argument 2149-05-31 instead.
Rounds down a date or date with time to the nearest Monday. Returns the date.
Rounds down a date or date with time to the nearest Sunday or Monday by mode. Returns the date. The mode argument works exactly like the mode argument to toWeek(). For the single-argument syntax, a mode value of 0 is used.
Rounds down a date with time to the start of the day.
Rounds down a date with time to the start of the hour.
Rounds down a date with time to the start of the minute.
Truncates sub-seconds.
Syntax
Arguments
Returned value
Input value without sub-seconds.
Examples
Query without timezone:
Result:
Query with timezone:
Result:
See also
Rounds down a date with time to the start of the five-minute interval.
Rounds down a date with time to the start of the ten-minute interval.
Rounds down the date with time to the start of the fifteen-minute interval.
This is a generalization of other functions named toStartOf*
. For example, toStartOfInterval(t, INTERVAL 1 year)
returns the same as toStartOfYear(t)
, toStartOfInterval(t, INTERVAL 1 month)
returns the same as toStartOfMonth(t)
, toStartOfInterval(t, INTERVAL 1 day)
returns the same as toStartOfDay(t)
, toStartOfInterval(t, INTERVAL 15 minute)
returns the same as toStartOfFifteenMinutes(t)
etc.
Converts a date with time to a certain fixed date, while preserving the time.
Converts a date with time or date to the number of the year, starting from a certain fixed point in the past.
Converts a date with time or date to the number of the quarter, starting from a certain fixed point in the past.
Converts a date with time or date to the number of the month, starting from a certain fixed point in the past.
Converts a date with time or date to the number of the week, starting from a certain fixed point in the past.
Converts a date with time or date to the number of the day, starting from a certain fixed point in the past.
Converts a date with time or date to the number of the hour, starting from a certain fixed point in the past.
Converts a date with time or date to the number of the minute, starting from a certain fixed point in the past.
Converts a date with time or date to the number of the second, starting from a certain fixed point in the past.
Converts a date or date with time to a UInt16 number containing the ISO Year number.
Converts a date or date with time to a UInt8 number containing the ISO Week number.
This function returns the week number for date or datetime. The two-argument form of toWeek() enables you to specify whether the week starts on Sunday or Monday and whether the return value should be in the range from 0 to 53 or from 1 to 53. If the mode argument is omitted, the default mode is 0. toISOWeek()
is a compatibility function that is equivalent to toWeek(date,3)
. The following table describes how the mode argument works.
0
Sunday
0-53
with a Sunday in this year
1
Monday
0-53
with 4 or more days this year
2
Sunday
1-53
with a Sunday in this year
3
Monday
1-53
with 4 or more days this year
4
Sunday
0-53
with 4 or more days this year
5
Monday
0-53
with a Monday in this year
6
Sunday
1-53
with 4 or more days this year
7
Monday
1-53
with a Monday in this year
8
Sunday
1-53
contains January 1
9
Monday
1-53
contains January 1
For mode values with a meaning of “with 4 or more days this year,” weeks are numbered according to ISO 8601:1988:
If the week containing January 1 has 4 or more days in the new year, it is week 1.
Otherwise, it is the last week of the previous year, and the next week is week 1.
For mode values with a meaning of “contains January 1”, the week contains January 1 is week 1. It does not matter how many days in the new year the week contained, even if it contained only one day.
Arguments
date
– Date or DateTime.
mode
– Optional parameter, Range of values is [0,9], default is 0.
Timezone
– Optional parameter, it behaves like any other conversion function.
Example
Returns year and week for a date. The year in the result may be different from the year in the date argument for the first and the last week of the year.
The mode argument works exactly like the mode argument to toWeek(). For the single-argument syntax, a mode value of 0 is used.
toISOYear()
is a compatibility function that is equivalent to intDiv(toYearWeek(date,3),100)
.
Example
Truncates date and time data to the specified part of date.
Syntax
Alias: dateTrunc
.
Arguments
second
minute
hour
day
week
month
quarter
year
Returned value
Value, truncated to the specified part of date.
Example
Query without timezone:
Result:
Query with the specified timezone:
Result:
See Also
Adds the time interval or date interval to the provided date or date with time.
Syntax
Aliases: dateAdd
, DATE_ADD
.
Arguments
second
minute
hour
day
week
month
quarter
year
Returned value
Date or date with time obtained by adding value
, expressed in unit
, to date
.
Example
Query:
Result:
Syntax
Aliases: dateDiff
, DATE_DIFF
.
Arguments
second
minute
hour
day
week
month
quarter
year
Returned value
Difference between enddate
and startdate
expressed in unit
.
Example
Query:
Result:
Query:
Result:
Subtracts the time interval or date interval from the provided date or date with time.
Syntax
Aliases: dateSub
, DATE_SUB
.
Arguments
second
minute
hour
day
week
month
quarter
year
Returned value
Date or date with time obtained by subtracting value
, expressed in unit
, from date
.
Example
Query:
Result:
Adds the specified time value with the provided date or date time value.
Syntax
Aliases: timeStampAdd
, TIMESTAMP_ADD
.
Arguments
second
minute
hour
day
week
month
quarter
year
Returned value
Date or date with time with the specified value
expressed in unit
added to date
.
Example
Query:
Result:
Subtracts the time interval from the provided date or date with time.
Syntax
Aliases: timeStampSub
, TIMESTAMP_SUB
.
Arguments
second
minute
hour
day
week
month
quarter
year
Returned value
Date or date with time obtained by subtracting value
, expressed in unit
, from date
.
Example
Query:
Result:
Returns the current date and time at the moment of query analysis. The function is a constant expression.
Syntax
Arguments
Returned value
Current date and time.
Example
Query without timezone:
Result:
Query with the specified timezone:
Result:
Returns the current date and time with sub-second precision at the moment of query analysis. The function is a constant expression.
Syntax
Arguments
scale
- Tick size (precision): 10-precision seconds. Valid range: [ 0 : 9 ]. Typically are used - 3 (default) (milliseconds), 6 (microseconds), 9 (nanoseconds).
Returned value
Current date and time with sub-second precision.
Example
Result:
Accepts zero arguments and returns the current date at one of the moments of query analysis. The same as ‘toDate(now())’.
Accepts zero arguments and returns yesterday’s date at one of the moments of query analysis. The same as ‘today() - 1’.
Rounds the time to the half hour.
Converts a date or date with time to a UInt32 number containing the year and month number (YYYY * 100 + MM).
Converts a date or date with time to a UInt32 number containing the year and month number (YYYY * 10000 + MM * 100 + DD).
Converts a date or date with time to a UInt64 number containing the year and month number (YYYY * 10000000000 + MM * 100000000 + DD * 1000000 + hh * 10000 + mm * 100 + ss).
Function adds a Date/DateTime interval to a Date/DateTime and then return the Date/DateTime. For example:
Function subtract a Date/DateTime interval to a Date/DateTime and then return the Date/DateTime. For example:
Formats a Time according to the given Format string. Format is a constant expression, so you cannot have multiple formats for a single result column.
Syntax
Returned value(s)
Returns time and date values according to the determined format.
Replacement fields Using replacement fields, you can define a pattern for the resulting string. “Example” column shows formatting result for 2018-01-02 22:33:44
.
%C
year divided by 100 and truncated to integer (00-99)
20
%d
day of the month, zero-padded (01-31)
02
%D
Short MM/DD/YY date, equivalent to %m/%d/%y
01/02/18
%e
day of the month, space-padded ( 1-31)
2
%f
fractional second from the fractional part of DateTime64
1234560
%F
short YYYY-MM-DD date, equivalent to %Y-%m-%d
2018-01-02
%G
2018
%g
two-digit year format, aligned to ISO 8601, abbreviated from four-digit notation
18
%H
hour in 24h format (00-23)
22
%I
hour in 12h format (01-12)
10
%j
day of the year (001-366)
002
%m
month as a decimal number (01-12)
01
%M
minute (00-59)
33
%n
new-line character (‘’)
%p
AM or PM designation
PM
%Q
Quarter (1-4)
1
%R
24-hour HH:MM time, equivalent to %H:%M
22:33
%S
second (00-59)
44
%t
horizontal-tab character (’)
%T
ISO 8601 time format (HH:MM:SS), equivalent to %H:%M:%S
22:33:44
%u
ISO 8601 weekday as number with Monday as 1 (1-7)
2
%V
ISO 8601 week number (01-53)
01
%w
weekday as a decimal number with Sunday as 0 (0-6)
2
%y
Year, last two digits (00-99)
18
%Y
Year
2018
%z
Time offset from UTC as +HHMM or -HHMM
-0500
%%
a % sign
%
Example
Query:
Result:
Query:
Result:
See Also
Replacement fields
Using replacement fields, you can define a pattern for the resulting string.
G
era
text
AD
C
century of era (>=0)
number
20
Y
year of era (>=0)
year
1996
x
weekyear(not supported yet)
year
1996
w
week of weekyear(not supported yet)
number
27
e
day of week
number
2
E
day of week
text
Tuesday; Tue
y
year
year
1996
D
day of year
number
189
M
month of year
month
July; Jul; 07
d
day of month
number
10
a
halfday of day
text
PM
K
hour of halfday (0~11)
number
0
h
clockhour of halfday (1~12)
number
12
H
hour of day (0~23)
number
0
k
clockhour of day (1~24)
number
24
m
minute of hour
number
30
s
second of minute
number
55
S
fraction of second(not supported yet)
number
978
z
time zone(short name not supported yet)
text
Pacific Standard Time; PST
Z
time zone offset/id(not supported yet)
zone
-0800; -08:00; America/Los_Angeles
'
escape for text
delimiter
''
single quote
literal
'
Example
Query:
Result:
Returns specified part of date.
Syntax
Arguments
Returned value
The specified part of date.
Example
Query:
Result:
Alias: fromUnixTimestamp
.
Example:
Query:
Result:
For example:
See Also
Example: Query:
Result:
Syntax
Arguments
Returned value
Modified Julian Day number.
Example
Query:
Result:
Syntax
Arguments
Returned value
Modified Julian Day number.
Example
Query:
Result:
Syntax
Arguments
Returned value
Date in text form.
Example
Query:
Result:
Syntax
Arguments
Returned value
Date in text form.
Example
Query:
Result:
Retrieves values from a dictionary.
Arguments
Returned value
If there is no the key, corresponding to id_expr
, in the dictionary, then:
ClickHouse throws an exception if it cannot parse the value of the attribute or the value does not match the attribute data type.
Example for simple key dictionary
Create a text file ext-dict-test.csv
containing the following:
The first column is id
, the second column is c1
.
Configure the dictionary:
Perform the query:
Example for complex key dictionary
Create a text file ext-dict-mult.csv
containing the following:
The first column is id
, the second is c1
, the third is c2
.
Configure the dictionary:
Perform the query:
Example for range key dictionary
Input table:
Create the dictionary:
Perform the query:
Result:
See Also
Checks whether a key is present in a dictionary.
Arguments
Returned value
0, if there is no key.
1, if there is a key.
Type: UInt8
.
Syntax
Arguments
Returned value
Parents for the key.
Checks the ancestor of a key through the whole hierarchical chain in the dictionary.
Arguments
Returned value
0, if child_id_expr
is not a child of ancestor_id_expr
.
1, if child_id_expr
is a child of ancestor_id_expr
or if child_id_expr
is an ancestor_id_expr
.
Type: UInt8
.
ClickHouse supports specialized functions that convert dictionary attribute values to a specific data type regardless of the dictionary configuration.
Functions:
dictGetInt8
, dictGetInt16
, dictGetInt32
, dictGetInt64
dictGetUInt8
, dictGetUInt16
, dictGetUInt32
, dictGetUInt64
dictGetFloat32
, dictGetFloat64
dictGetDate
dictGetDateTime
dictGetUUID
dictGetString
All these functions have the OrDefault
modification. For example, dictGetDateOrDefault
.
Syntax:
Arguments
Returned value
If there is no requested id_expr
in the dictionary then:
ClickHouse throws an exception if it cannot parse the value of the attribute or the value does not match the attribute data type.
Returns the string with the length as the number of passed arguments and each byte has the value of corresponding argument. Accepts multiple arguments of numeric types. If the value of argument is out of range of UInt8 data type, it is converted to UInt8 with possible rounding and overflow.
Syntax
Arguments
Returned value
a string of given bytes.
Type: String
.
Example
Query:
Result:
You can construct a string of arbitrary encoding by passing the corresponding bytes. Here is example for UTF-8:
Query:
Result:
Query:
Result:
Returns a string containing the argument’s hexadecimal representation.
Alias: HEX
.
Syntax
The function is using uppercase letters A-F
and not using any prefixes (like 0x
) or suffixes (like h
).
For integer arguments, it prints hex digits (“nibbles”) from the most significant to least significant (big-endian or “human-readable” order). It starts with the most significant non-zero byte (leading zero bytes are omitted) but always prints both digits of every byte even if the leading digit is zero.
Arguments
Returned value
A string with the hexadecimal representation of the argument.
Examples
Query:
Result:
Query:
Result:
Query:
Result:
Query:
Result:
NOTE
If unhex
is invoked from within the clickhouse-client
, binary strings display using UTF-8.
Alias: UNHEX
.
Syntax
Arguments
Supports both uppercase and lowercase letters A-F
. The number of hexadecimal digits does not have to be even. If it is odd, the last digit is interpreted as the least significant half of the 00-0F
byte. If the argument string contains anything other than hexadecimal digits, some implementation-defined result is returned (an exception isn’t thrown). For a numeric argument the inverse of hex(N) is not performed by unhex().
Returned value
A binary string (BLOB).
Example
Query:
Result:
Query:
Result:
Accepts an integer. Returns a string containing the list of powers of two that total the source number when summed. They are comma-separated without spaces in text format, in ascending order.
Accepts an integer. Returns an array of UInt64 numbers containing the list of powers of two that total the source number when summed. Numbers in the array are in ascending order.
These functions implement encryption and decryption of data with AES (Advanced Encryption Standard) algorithm.
Key length depends on encryption mode. It is 16, 24, and 32 bytes long for -128-
, -196-
, and -256-
modes respectively.
Initialization vector length is always 16 bytes (bytes in excess of 16 are ignored).
Note that these functions work slowly until ClickHouse 21.1.
This function encrypts data using these modes:
aes-128-ecb, aes-192-ecb, aes-256-ecb
aes-128-cbc, aes-192-cbc, aes-256-cbc
aes-128-ofb, aes-192-ofb, aes-256-ofb
aes-128-gcm, aes-192-gcm, aes-256-gcm
aes-128-ctr, aes-192-ctr, aes-256-ctr
Syntax
Arguments
Returned value
Examples
Create this table:
Query:
Insert some data (please avoid storing the keys/ivs in the database as this undermines the whole concept of encryption), also storing 'hints' is unsafe too and used only for illustrative purposes:
Query:
Query:
Result:
Example with -gcm
:
Query:
Result:
Will produce the same ciphertext as encrypt
on equal inputs. But when key
or iv
are longer than they should normally be, aes_encrypt_mysql
will stick to what MySQL's aes_encrypt
does: 'fold' key
and ignore excess bits of iv
.
Supported encryption modes:
aes-128-ecb, aes-192-ecb, aes-256-ecb
aes-128-cbc, aes-192-cbc, aes-256-cbc
aes-128-ofb, aes-192-ofb, aes-256-ofb
Syntax
Arguments
Returned value
Examples
Given equal input encrypt
and aes_encrypt_mysql
produce the same ciphertext:
Query:
Result:
But encrypt
fails when key
or iv
is longer than expected:
Query:
Result:
While aes_encrypt_mysql
produces MySQL-compatitalbe output:
Query:
Result:
Notice how supplying even longer IV
produces the same result
Query:
Result:
Which is binary equal to what MySQL produces on same inputs:
This function decrypts ciphertext into a plaintext using these modes:
aes-128-ecb, aes-192-ecb, aes-256-ecb
aes-128-cbc, aes-192-cbc, aes-256-cbc
aes-128-ofb, aes-192-ofb, aes-256-ofb
aes-128-gcm, aes-192-gcm, aes-256-gcm
aes-128-ctr, aes-192-ctr, aes-256-ctr
Syntax
Arguments
Returned value
Examples
Query:
Result:
Now let's try to decrypt all that data.
Query:
Result:
Notice how only a portion of the data was properly decrypted, and the rest is gibberish since either mode
, key
, or iv
were different upon encryption.
Will produce same plaintext as decrypt
on equal inputs. But when key
or iv
are longer than they should normally be, aes_decrypt_mysql
will stick to what MySQL's aes_decrypt
does: 'fold' key
and ignore excess bits of IV
.
Supported decryption modes:
aes-128-ecb, aes-192-ecb, aes-256-ecb
aes-128-cbc, aes-192-cbc, aes-256-cbc
aes-128-cfb128
aes-128-ofb, aes-192-ofb, aes-256-ofb
Syntax
Arguments
Returned value
Examples
Let's decrypt data we've previously encrypted with MySQL:
Query:
Result:
Reads file as a String. The file content is not parsed, so any information is read as one string and placed into the specified column.
Syntax
Arguments
Example
Inserting data from files a.txt and b.txt into a table as strings:
Query:
See Also
Input parameters
lon1Deg
— Longitude of the first point in degrees. Range: [-180°, 180°]
.
lat1Deg
— Latitude of the first point in degrees. Range: [-90°, 90°]
.
lon2Deg
— Longitude of the second point in degrees. Range: [-180°, 180°]
.
lat2Deg
— Latitude of the second point in degrees. Range: [-90°, 90°]
.
Positive values correspond to North latitude and East longitude, and negative values correspond to South latitude and West longitude.
Returned value
The distance between two points on the Earth’s surface, in meters.
Generates an exception when the input parameter values fall outside of the range.
Example
Similar to greatCircleDistance
but calculates the distance on WGS-84 ellipsoid instead of sphere. This is more precise approximation of the Earth Geoid. The performance is the same as for greatCircleDistance
(no performance drawback). It is recommended to use geoDistance
to calculate the distances on Earth.
Technical note: for close enough points we calculate the distance using planar approximation with the metric on the tangent plane at the midpoint of the coordinates.
Input parameters
lon1Deg
— Longitude of the first point in degrees.
lat1Deg
— Latitude of the first point in degrees.
lon2Deg
— Longitude of the second point in degrees.
lat2Deg
— Latitude of the second point in degrees.
Returned value
The central angle between two points in degrees.
Example
Checks whether the point belongs to at least one of the ellipses. Coordinates are geometric in the Cartesian coordinate system.
Input parameters
x, y
— Coordinates of a point on the plane.
xᵢ, yᵢ
— Coordinates of the center of the i
-th ellipsis.
aᵢ, bᵢ
— Axes of the i
-th ellipsis in units of x, y coordinates.
The input parameters must be 2+4⋅n
, where n
is the number of ellipses.
Returned values
1
if the point is inside at least one of the ellipses; 0
if it is not.
Example
Checks whether the point belongs to the polygon on the plane.
Input values
The function also supports polygons with holes (cut out sections). In this case, add polygons that define the cut out sections using additional arguments of the function. The function does not support non-simply-connected polygons.
Returned values
1
if the point is inside the polygon, 0
if it is not. If the point is on the polygon boundary, the function may return either 0 or 1.
Example
Input values
longitude - longitude part of the coordinate you want to encode. Floating in range[-180°, 180°]
latitude - latitude part of the coordinate you want to encode. Floating in range [-90°, 90°]
precision - Optional, length of the resulting encoded string, defaults to 12
. Integer in range [1, 12]
. Any value less than 1
or greater than 12
is silently converted to 12
.
Returned values
alphanumeric String
of encoded coordinate (modified version of the base32-encoding alphabet is used).
Example
Input values
encoded string - geohash-encoded string.
Returned values
(longitude, latitude) - 2-tuple of Float64
values of longitude and latitude.
Example
Syntax
Arguments
NOTE
All coordinate parameters must be of the same type: either Float32
or Float64
.
Returned values
Array of precision-long strings of geohash-boxes covering provided area, you should not rely on order of items.
[]
- Empty array if minimum latitude and longitude values aren’t less than corresponding maximum values.
NOTE
Function throws an exception if resulting array is over 10’000’000 items long.
Example
Query:
Result:
The level of the hierarchy is called resolution
and can receive a value from 0
till 15
, where 0
is the base
level with the largest and coarsest cells.
A latitude and longitude pair can be transformed to a 64-bit H3 index, identifying a grid cell.
The H3 index is used primarily for bucketing locations and other geospatial manipulations.
Syntax
Parameter
Returned values
1 — The number is a valid H3 index.
0 — The number is not a valid H3 index.
Example
Query:
Result:
Syntax
Parameter
Returned values
Index resolution. Range: [0, 15]
.
Example
Query:
Result:
Syntax
Parameter
Returned values
Example
Query:
Result:
Syntax
Parameter
Returned values
Example
Query:
Result:
Syntax
Arguments
Returned values
Hexagon index number.
0 in case of error.
Example
Query:
Result:
Syntax
Arguments
Returned values
Array of H3 indexes.
Example
Query:
Result:
Syntax
Parameter
Returned value
Hexagon base cell number.
Example
Query:
Result:
Returns average hexagon area in square meters at the given resolution.
Syntax
Parameter
Returned value
Area in square meters.
Example
Query:
Result:
Syntax
Arguments
Returned value
1
— Indexes are neighbours.
0
— Indexes are not neighbours.
Example
Query:
Result:
Syntax
Arguments
Returned values
Array of the child H3-indexes.
Example
Query:
Result:
Syntax
Arguments
Returned value
Parent H3 index.
Example
Query:
Result:
Converts the H3Index
representation of the index to the string representation.
Parameter
Returned value
String representation of the H3 index.
Example
Query:
Result:
Converts the string representation to the H3Index
(UInt64) representation.
Syntax
Parameter
Returned value
Example
Query:
Result:
Hash functions can be used for the deterministic pseudo-random shuffling of elements.
Simhash is a hash function, which returns close hash values for close (similar) arguments.
Arguments
Returned Value
Example
Calculates the MD5 from a string and returns the resulting set of bytes as FixedString(16). If you do not need MD5 in particular, but you need a decent cryptographic 128-bit hash, use the ‘sipHash128’ function instead. If you want to get the same result as output by the md5sum utility, use lower(hex(MD5(s))).
After hashing all the input parameters, the function gets the array of hashes.
Function takes the first and the second elements and calculates a hash for the array of them.
Then the function takes the hash value, calculated at the previous step, and the third element of the initial hash array, and calculates a hash for the array of them.
The previous step is repeated for all the remaining elements of the initial hash array.
Arguments
Returned Value
Example
Syntax
Arguments
Returned value
A 128-bit SipHash
hash value.
Example
Query:
Result:
This is a fast non-cryptographic hash function. It uses the CityHash algorithm for string parameters and implementation-specific fast non-cryptographic hash function for parameters with other data types. The function uses the CityHash combinator to get the final results.
Arguments
Returned Value
Examples
Call example:
The following example shows how to compute the checksum of the entire table with accuracy up to the row order:
Calculates a 32-bit hash code from any type of integer. This is a relatively fast non-cryptographic hash function of average quality for numbers.
Calculates a 64-bit hash code from any type of integer. It works faster than intHash32. Average quality.
Syntax
The function works fairly slowly (SHA-1 processes about 5 million short strings per second per processor core, while SHA-224 and SHA-256 process about 2.2 million). We recommend using this function only in cases when you need a specific hash function and you can’t select it. Even in these cases, we recommend applying the function offline and pre-calculating values when inserting them into the table, instead of applying it in SELECT
queries.
Arguments
Returned value
SHA hash as a hex-unencoded FixedString. SHA-1 returns as FixedString(20), SHA-224 as FixedString(28), SHA-256 — FixedString(32), SHA-512 — FixedString(64).
Example
Query:
Result:
A fast, decent-quality non-cryptographic hash function for a string obtained from a URL using some type of normalization. URLHash(s)
– Calculates a hash from a string without one of the trailing symbols /
,?
or #
at the end, if present. URLHash(s, N)
– Calculates a hash from a string up to the N level in the URL hierarchy, without one of the trailing symbols /
,?
or #
at the end, if present. Levels are the same as in URLHierarchy.
Arguments
Returned Value
Example
Note that Java only support calculating signed integers hash, so if you want to calculate unsigned integers hash you must cast it to proper signed ClickHouse types.
Syntax
Returned value
A Int32
data type hash value.
Example
Query:
Result:
Query:
Result:
Syntax
Arguments
stringUtf16le
— a string in UTF-16LE encoding.
Returned value
A Int32
data type hash value.
Example
Correct query with UTF-16LE encoded string.
Query:
Result:
Calculates HiveHash
from a string.
Returned value
A Int32
data type hash value.
Type: hiveHash
.
Example
Query:
Result:
Arguments
Returned Value
Example
Arguments
Returned Value
Example
Syntax
Arguments
Returned value
Calculated hash value.
Example
Query:
Result:
Arguments
Returned Value
Example
Syntax
Arguments
Returned value
A 128-bit MurmurHash3
hash value.
Example
Query:
Result:
Calculates xxHash
from a string. It is proposed in two flavors, 32 and 64 bits.
Returned value
A UInt32
or UInt64
data type hash value.
Type: UInt32
for xxHash32
and UInt64
for xxHash64
.
Example
Query:
Result:
See Also
Splits a ASCII string into n-grams of ngramsize
symbols and returns the n-gram simhash
. Is case sensitive.
Syntax
Arguments
Returned value
Hash value.
Example
Query:
Result:
Splits a ASCII string into n-grams of ngramsize
symbols and returns the n-gram simhash
. Is case insensitive.
Syntax
Arguments
Returned value
Hash value.
Example
Query:
Result:
Splits a UTF-8 string into n-grams of ngramsize
symbols and returns the n-gram simhash
. Is case sensitive.
Syntax
Arguments
Returned value
Hash value.
Example
Query:
Result:
Splits a UTF-8 string into n-grams of ngramsize
symbols and returns the n-gram simhash
. Is case insensitive.
Syntax
Arguments
Returned value
Hash value.
Example
Query:
Result:
Splits a ASCII string into parts (shingles) of shinglesize
words and returns the word shingle simhash
. Is case sensitive.
Syntax
Arguments
Returned value
Hash value.
Example
Query:
Result:
Splits a ASCII string into parts (shingles) of shinglesize
words and returns the word shingle simhash
. Is case insensitive.
Syntax
Arguments
Returned value
Hash value.
Example
Query:
Result:
Splits a UTF-8 string into parts (shingles) of shinglesize
words and returns the word shingle simhash
. Is case sensitive.
Syntax
Arguments
Returned value
Hash value.
Example
Query:
Result:
Splits a UTF-8 string into parts (shingles) of shinglesize
words and returns the word shingle simhash
. Is case insensitive.
Syntax
Arguments
Returned value
Hash value.
Example
Query:
Result:
Splits a ASCII string into n-grams of ngramsize
symbols and calculates hash values for each n-gram. Uses hashnum
minimum hashes to calculate the minimum hash and hashnum
maximum hashes to calculate the maximum hash. Returns a tuple with these hashes. Is case sensitive.
Syntax
Arguments
Returned value
Tuple with two hashes — the minimum and the maximum.
Example
Query:
Result:
Splits a ASCII string into n-grams of ngramsize
symbols and calculates hash values for each n-gram. Uses hashnum
minimum hashes to calculate the minimum hash and hashnum
maximum hashes to calculate the maximum hash. Returns a tuple with these hashes. Is case insensitive.
Syntax
Arguments
Returned value
Tuple with two hashes — the minimum and the maximum.
Example
Query:
Result:
Splits a UTF-8 string into n-grams of ngramsize
symbols and calculates hash values for each n-gram. Uses hashnum
minimum hashes to calculate the minimum hash and hashnum
maximum hashes to calculate the maximum hash. Returns a tuple with these hashes. Is case sensitive.
Syntax
Arguments
Returned value
Tuple with two hashes — the minimum and the maximum.
Example
Query:
Result:
Splits a UTF-8 string into n-grams of ngramsize
symbols and calculates hash values for each n-gram. Uses hashnum
minimum hashes to calculate the minimum hash and hashnum
maximum hashes to calculate the maximum hash. Returns a tuple with these hashes. Is case insensitive.
Syntax
Arguments
Returned value
Tuple with two hashes — the minimum and the maximum.
Example
Query:
Result:
Syntax
Arguments
Returned value
Tuple with two tuples with hashnum
n-grams each.
Example
Query:
Result:
Syntax
Arguments
Returned value
Tuple with two tuples with hashnum
n-grams each.
Example
Query:
Result:
Syntax
Arguments
Returned value
Tuple with two tuples with hashnum
n-grams each.
Example
Query:
Result:
Syntax
Arguments
Returned value
Tuple with two tuples with hashnum
n-grams each.
Example
Query:
Result:
Splits a ASCII string into parts (shingles) of shinglesize
words and calculates hash values for each word shingle. Uses hashnum
minimum hashes to calculate the minimum hash and hashnum
maximum hashes to calculate the maximum hash. Returns a tuple with these hashes. Is case sensitive.
Syntax
Arguments
Returned value
Tuple with two hashes — the minimum and the maximum.
Example
Query:
Result:
Splits a ASCII string into parts (shingles) of shinglesize
words and calculates hash values for each word shingle. Uses hashnum
minimum hashes to calculate the minimum hash and hashnum
maximum hashes to calculate the maximum hash. Returns a tuple with these hashes. Is case insensitive.
Syntax
Arguments
Returned value
Tuple with two hashes — the minimum and the maximum.
Example
Query:
Result:
Splits a UTF-8 string into parts (shingles) of shinglesize
words and calculates hash values for each word shingle. Uses hashnum
minimum hashes to calculate the minimum hash and hashnum
maximum hashes to calculate the maximum hash. Returns a tuple with these hashes. Is case sensitive.
Syntax
Arguments
Returned value
Tuple with two hashes — the minimum and the maximum.
Example
Query:
Result:
Splits a UTF-8 string into parts (shingles) of shinglesize
words and calculates hash values for each word shingle. Uses hashnum
minimum hashes to calculate the minimum hash and hashnum
maximum hashes to calculate the maximum hash. Returns a tuple with these hashes. Is case insensitive.
Syntax
Arguments
Returned value
Tuple with two hashes — the minimum and the maximum.
Example
Query:
Result:
Syntax
Arguments
Returned value
Tuple with two tuples with hashnum
word shingles each.
Example
Query:
Result:
Syntax
Arguments
Returned value
Tuple with two tuples with hashnum
word shingles each.
Example
Query:
Result:
Syntax
Arguments
Returned value
Tuple with two tuples with hashnum
word shingles each.
Example
Query:
Result:
Syntax
Arguments
Returned value
Tuple with two tuples with hashnum
word shingles each.
Example
Query:
Result:
WARNING
These functions are slow and may impose security considerations.
For proper operation of introspection functions:
Install the clickhouse-common-static-dbg
package.
Converts virtual memory address inside ClickHouse server process to the filename and the line number in ClickHouse source code.
If you use official ClickHouse packages, you need to install the clickhouse-common-static-dbg
package.
Syntax
Arguments
Returned value
Source code filename and the line number in this file delimited by colon.
Name of a binary, if the function couldn’t find the debug information.
Empty string, if the address is not valid.
Example
Enabling introspection functions:
Selecting the first string from the trace_log
system table:
The trace
field contains the stack trace at the moment of sampling.
Getting the source code filename and the line number for a single address:
Applying the function to the whole stack trace:
Similar to addressToLine
, but it will return an Array with all inline functions, and will be much slower as a price.
If you use official ClickHouse packages, you need to install the clickhouse-common-static-dbg
package.
Syntax
Arguments
Returned value
Array which first element is source code filename and the line number in this file delimited by colon. And from second element, inline functions' source code filename and line number and function name are listed.
Array with single element which is name of a binary, if the function couldn’t find the debug information.
Empty array, if the address is not valid.
Example
Enabling introspection functions:
Applying the function to address.
Applying the function to the whole stack trace:
Converts virtual memory address inside ClickHouse server process to the symbol from ClickHouse object files.
Syntax
Arguments
Returned value
Symbol from ClickHouse object files.
Empty string, if the address is not valid.
Example
Enabling introspection functions:
Selecting the first string from the trace_log
system table:
The trace
field contains the stack trace at the moment of sampling.
Getting a symbol for a single address:
Applying the function to the whole stack trace:
Syntax
Arguments
Returned value
Name of the C++ function.
Empty string if a symbol is not valid.
Example
Enabling introspection functions:
Selecting the first string from the trace_log
system table:
The trace
field contains the stack trace at the moment of sampling.
Getting a function name for a single address:
Applying the function to the whole stack trace:
Syntax
Returned value
Example
Query:
Result:
Syntax
Arguments
Returned value
Always returns 0.
Example
Query:
Result:
Takes a UInt32 number. Interprets it as an IPv4 address in big endian. Returns a string containing the corresponding IPv4 address in the format A.B.C.d (dot-separated numbers in decimal form).
Alias: INET_NTOA
.
The reverse function of IPv4NumToString. If the IPv4 address has an invalid format, it throws exception.
Alias: INET_ATON
.
Similar to IPv4NumToString, but using xxx instead of the last octet.
Example:
Since using ‘xxx’ is highly unusual, this may be changed in the future. We recommend that you do not rely on the exact format of this fragment.
Accepts a FixedString(16) value containing the IPv6 address in binary format. Returns a string containing this address in text format. IPv6-mapped IPv4 addresses are output in the format ::ffff:111.222.33.44.
Alias: INET6_NTOA
.
Examples:
If the input string contains a valid IPv4 address, returns its IPv6 equivalent. HEX can be uppercase or lowercase.
Alias: INET6_ATON
.
Syntax
Argument
Returned value
IPv6 address in binary format.
Example
Query:
Result:
See Also
Accepts a FixedString(16) value containing the IPv6 address in binary format. Returns a string containing the address of the specified number of bytes removed in text format. For example:
Accepts an IPv6 and an UInt8 value containing the CIDR. Return a tuple with two IPv6 containing the lower range and the higher range of the subnet.
If the input string contains a valid IPv4 address, then the IPv6 equivalent of the IPv4 address is returned.
Syntax
Argument
Returned value
IP address.
Examples
Query:
Result:
Query:
Result:
Determines whether the input string is an IPv4 address or not. If string
is IPv6 address returns 0
.
Syntax
Arguments
Returned value
1
if string
is IPv4 address, 0
otherwise.
Examples
Query:
Result:
Determines whether the input string is an IPv6 address or not. If string
is IPv4 address returns 0
.
Syntax
Arguments
Returned value
1
if string
is IPv6 address, 0
otherwise.
Examples
Query:
Result:
Syntax
This function accepts both IPv4 and IPv6 addresses (and networks) represented as strings. It returns 0
if the IP version of the address and the CIDR don't match.
Arguments
Returned value
1
or 0
.
Example
Query:
Result:
Query:
Result:
Query:
Result:
ClickHouse has special functions for working with this JSON. All the JSON functions are based on strong assumptions about what the JSON can be, but they try to do as little as possible to get the job done.
The following assumptions are made:
The field name (function argument) must be a constant.
The field name is somehow canonically encoded in JSON. For example: visitParamHas('{"abc":"def"}', 'abc') = 1
, but visitParamHas('{"\\u0061\\u0062\\u0063":"def"}', 'abc') = 0
Fields are searched for on any nesting level, indiscriminately. If there are multiple matching fields, the first occurrence is used.
The JSON does not have space characters outside of string literals.
Checks whether there is a field with the name
name.
Alias: simpleJSONHas
.
Parses UInt64 from the value of the field named name
. If this is a string field, it tries to parse a number from the beginning of the string. If the field does not exist, or it exists but does not contain a number, it returns 0.
Alias: simpleJSONExtractUInt
.
The same as for Int64.
Alias: simpleJSONExtractInt
.
The same as for Float64.
Alias: simpleJSONExtractFloat
.
Parses a true/false value. The result is UInt8.
Alias: simpleJSONExtractBool
.
Returns the value of a field, including separators.
Alias: simpleJSONExtractRaw
.
Examples:
Parses the string in double quotes. The value is unescaped. If unescaping failed, it returns an empty string.
Alias: simpleJSONExtractString
.
Examples:
There is currently no support for code points in the format \uXXXX\uYYYY
that are not from the basic multilingual plane (they are converted to CESU-8 instead of UTF-8).
Checks that passed string is a valid json.
Examples:
If the value exists in the JSON document, 1
will be returned.
If the value does not exist, 0
will be returned.
Examples:
indices_or_keys
is a list of zero or more arguments each of them can be either string or integer.
String = access object member by key.
Positive integer = access the n-th member/key from the beginning.
Negative integer = access the n-th member/key from the end.
Minimum index of the element is 1. Thus the element 0 does not exist.
You may use integers to access both JSON arrays and JSON objects.
So, for example:
Return the length of a JSON array or a JSON object.
If the value does not exist or has a wrong type, 0
will be returned.
Examples:
Return the type of a JSON value.
If the value does not exist, Null
will be returned.
Examples:
Parses a JSON and extract a value. These functions are similar to visitParam
functions.
If the value does not exist or has a wrong type, 0
will be returned.
Examples:
Parses a JSON and extract a string. This function is similar to visitParamExtractString
functions.
If the value does not exist or has a wrong type, an empty string will be returned.
The value is unescaped. If unescaping failed, it returns an empty string.
Examples:
Parses a JSON and extract a value of the given ClickHouse data type.
This is a generalization of the previous JSONExtract<type>
functions. This means JSONExtract(..., 'String')
returns exactly the same as JSONExtractString()
, JSONExtract(..., 'Float64')
returns exactly the same as JSONExtractFloat()
.
Examples:
Parses key-value pairs from a JSON where the values are of the given ClickHouse data type.
Example:
Parses a JSON string and extracts the keys.
Syntax
Arguments
Returned value
Array with the keys of the JSON.
Example
Query:
Result:
Returns a part of JSON as unparsed string.
If the part does not exist or has a wrong type, an empty string will be returned.
Example:
Returns an array with elements of JSON array, each represented as unparsed string.
If the part does not exist or isn’t array, an empty array will be returned.
Example:
Extracts raw data from a JSON object.
Syntax
Arguments
Returned values
Array with ('key', 'value')
tuples. Both tuple members are strings.
Empty array if the requested object does not exist, or input JSON is invalid.
Examples
Query:
Result:
Query:
Result:
Query:
Result:
If the value exists in the JSON document, 1
will be returned.
If the value does not exist, 0
will be returned.
Examples:
NOTE
Before version 21.11 the order of arguments was wrong, i.e. JSON_EXISTS(path, json)
Parses a JSON and extract a value as JSON array or JSON object.
If the value does not exist, an empty string will be returned.
Example:
Result:
NOTE
Before version 21.11 the order of arguments was wrong, i.e. JSON_QUERY(path, json)
Parses a JSON and extract a value as JSON scalar.
If the value does not exist, an empty string will be returned.
Example:
Result:
NOTE
Before version 21.11 the order of arguments was wrong, i.e. JSON_VALUE(path, json)
Syntax
Arguments
value
— Value to serialize. Value may be of any data type.
Returned value
JSON representation of the value.
Example
Query:
Result:
Prediction using fitted regression models uses evalMLMethod
function. See link in linearRegression
.
Syntax
Arguments
Returned value
Data structure as key:value
pairs.
Examples
Query:
Result:
Query:
Result:
See Also
Collect all the keys and sum corresponding values.
Syntax
Arguments
Returned value
Example
Query with a tuple:
Result:
Query with Map
type:
Result:
Collect all the keys and subtract corresponding values.
Syntax
Arguments
Returned value
Example
Query with a tuple map:
Result:
Query with Map
type:
Result:
Fills missing keys in the maps (key and value array pair), where keys are integers. Also, it supports specifying the max key, which is used to extend the keys array.
Syntax
Generates a map (a tuple with two arrays or a value of Map
type, depending on the arguments), where keys are a series of numbers, from minimum to maximum keys (or max
argument if it specified) taken from the map with a step size of one, and corresponding values. If the value is not specified for the key, then it uses the default value in the resulting map. For repeated keys, only the first value (in order of appearing) gets associated with the key.
For array arguments the number of elements in keys
and values
must be the same for each row.
Arguments
Mapped arrays:
or
Returned value
Example
Query with mapped arrays:
Result:
Query with Map
type:
Result:
Determines whether the map
contains the key
parameter.
Syntax
Parameters
key
— Key. Type matches the type of keys of map
parameter.
Returned value
1
if map
contains key
, 0
if not.
Example
Query:
Result:
Returns all keys from the map
parameter.
Syntax
Parameters
Returned value
Array containing all keys from the map
.
Example
Query:
Result:
Returns all values from the map
parameter.
Syntax
Parameters
Returned value
Array containing all the values from map
.
Example
Query:
Result:
All the functions return a Float64 number. The accuracy of the result is close to the maximum precision possible, but the result might not coincide with the machine representable number nearest to the corresponding real number.
Returns a Float64 number that is close to the number e.
Accepts a numeric argument and returns a Float64 number close to the exponent of the argument.
Accepts a numeric argument and returns a Float64 number close to the natural logarithm of the argument.
Accepts a numeric argument and returns a Float64 number close to 2 to the power of x.
Accepts a numeric argument and returns a Float64 number close to the binary logarithm of the argument.
Accepts a numeric argument and returns a Float64 number close to 10 to the power of x.
Accepts a numeric argument and returns a Float64 number close to the decimal logarithm of the argument.
Accepts a numeric argument and returns a Float64 number close to the square root of the argument.
Accepts a numeric argument and returns a Float64 number close to the cubic root of the argument.
If ‘x’ is non-negative, then erf(x / σ√2)
is the probability that a random variable having a normal distribution with standard deviation ‘σ’ takes the value that is separated from the expected value by more than ‘x’.
Example (three sigma rule):
Accepts a numeric argument and returns a Float64 number close to 1 - erf(x), but without loss of precision for large ‘x’ values.
The logarithm of the gamma function.
Gamma function.
The sine.
The cosine.
The tangent.
The arc sine.
The arc cosine.
The arc tangent.
Takes two numeric arguments x and y. Returns a Float64 number close to x to the power of y.
Accepts a numeric argument and returns a UInt64 number close to 2 to the power of x.
Accepts a numeric argument and returns a UInt64 number close to 10 to the power of x.
Syntax
Arguments
Returned value
Values from the interval: 1 <= cosh(x) < +∞
.
Example
Query:
Result:
Syntax
Arguments
Returned value
The angle, in radians. Values from the interval: 0 <= acosh(x) < +∞
.
Example
Query:
Result:
See Also
Syntax
Arguments
Returned value
Values from the interval: -∞ < sinh(x) < +∞
.
Example
Query:
Result:
Syntax
Arguments
Returned value
The angle, in radians. Values from the interval: -∞ < asinh(x) < +∞
.
Example
Query:
Result:
See Also
Syntax
Arguments
Returned value
The angle, in radians. Values from the interval: -∞ < atanh(x) < +∞
.
Example
Query:
Result:
Syntax
Arguments
Returned value
The angle θ
such that −π < θ ≤ π
, in radians.
Example
Query:
Result:
Syntax
Arguments
Returned value
The length of the hypotenuse of a right-angle triangle.
Example
Query:
Result:
Syntax
Arguments
Returned value
Values from the interval: -∞ < log1p(x) < +∞
.
Example
Query:
Result:
See Also
Returns the sign of a real number.
Syntax
Arguments
x
— Values from -∞
to +∞
. Support all numeric types in ClickHouse.
Returned value
-1 for x < 0
0 for x = 0
1 for x > 0
Examples
Sign for the zero value:
Result:
Sign for the positive value:
Result:
Sign for the negative value:
Result:
Alias: ISNULL
.
Arguments
x
— A value with a non-compound data type.
Returned value
1
if x
is NULL
.
0
if x
is not NULL
.
Example
Input table
Query
Arguments:
x
— A value with a non-compound data type.
Returned value
0
if x
is NULL
.
1
if x
is not NULL
.
Example
Input table
Query
Checks from left to right whether NULL
arguments were passed and returns the first non-NULL
argument.
Arguments:
Any number of parameters of a non-compound type. All parameters must be compatible by data type.
Returned values
The first non-NULL
argument.
NULL
, if all arguments are NULL
.
Example
Consider a list of contacts that may specify multiple ways to contact a customer.
The mail
and phone
fields are of type String, but the icq
field is UInt32
, so it needs to be converted to String
.
Get the first available contact method for the customer from the contact list:
Returns an alternative value if the main argument is NULL
.
Arguments:
x
— The value to check for NULL
.
alt
— The value that the function returns if x
is NULL
.
Returned values
The value x
, if x
is not NULL
.
The value alt
, if x
is NULL
.
Example
Returns NULL
if the arguments are equal.
Arguments:
x
, y
— Values for comparison. They must be compatible types, or ClickHouse will generate an exception.
Returned values
NULL
, if the arguments are equal.
The x
value, if the arguments are not equal.
Example
Arguments:
x
— The original value.
Returned values
The original value from the non-Nullable
type, if it is not NULL
.
Implementation specific result if the original value was NULL
.
Example
Consider the t_null
table.
Apply the assumeNotNull
function to the y
column.
Converts the argument type to Nullable
.
Arguments:
x
— The value of any non-compound type.
Returned value
The input value with a Nullable
type.
Example
Returns a string with the name of the host that this function was performed on. For distributed processing, this is the name of the remote server host, if the function is performed on a remote server. If it is executed in the context of a distributed table, then it generates a normal column with values relevant to each shard. Otherwise it produces a constant value.
Syntax
Arguments
Returned value
Value of the specified macro.
Example
The example macros
section in the server configuration file:
Query:
Result:
An alternative way to get the same value:
Returns the fully qualified domain name.
Syntax
This function is case-insensitive.
Returned value
String with the fully qualified domain name.
Type: String
.
Example
Query:
Result:
Extracts the trailing part of a string after the last slash or backslash. This function if often used to extract the filename from a path.
Arguments
Returned Value
A string that contains:
The trailing part of a string after the last slash or backslash.
The original string if there are no slashes or backslashes.
Example
Calculates the approximate width when outputting values to the console in text format (tab-separated). This function is used by the system for implementing Pretty formats.
NULL
is represented as a string corresponding to NULL
in Pretty
formats.
Returns a string containing the type name of the passed argument.
If NULL
is passed to the function as input, then it returns the Nullable(Nothing)
type, which corresponds to an internal NULL
representation in ClickHouse.
Gets the size of the block. In ClickHouse, queries are always run on blocks (sets of column parts). This function allows getting the size of the block that you called it for.
Returns estimation of uncompressed byte size of its arguments in memory.
Syntax
Arguments
argument
— Value.
Returned value
Estimation of byte size of the arguments in memory.
Examples
Query:
Result:
Query:
Result:
If the function takes multiple arguments, it returns their combined byte size.
Query:
Result:
Turns a constant into a full column containing just one value. In ClickHouse, full columns and constants are represented differently in memory. Functions work differently for constant arguments and normal arguments (different code is executed), although the result is almost always the same. This function is for debugging this behavior.
Accepts any arguments, including NULL
. Always returns 0. However, the argument is still evaluated. This can be used for benchmarks.
Sleeps ‘seconds’ seconds on each data block. You can specify an integer or a floating-point number.
Sleeps ‘seconds’ seconds on each row. You can specify an integer or a floating-point number.
Returns the name of the current database. You can use this function in table engine parameters in a CREATE TABLE query where you need to specify the database.
Returns the login of current user. Login of user, that initiated query, will be returned in case distibuted query.
Alias: user()
, USER()
.
Returned values
Login of current user.
Login of user that initiated query in case of disributed query.
Type: String
.
Example
Query:
Result:
Checks whether the argument is a constant expression.
The function is intended for development, debugging and demonstration.
Syntax
Arguments
x
— Expression to check.
Returned values
1
— x
is constant.
0
— x
is non-constant.
Examples
Query:
Result:
Query:
Result:
Query:
Result:
Accepts Float32 and Float64 and returns UInt8 equal to 1 if the argument is not infinite and not a NaN, otherwise 0.
Accepts Float32 and Float64 and returns UInt8 equal to 1 if the argument is infinite, otherwise 0. Note that 0 is returned for a NaN.
Checks whether floating point value is finite.
Syntax
Arguments
Returned value
x
if x
is finite.
y
if x
is not finite.
Example
Query:
Result:
Accepts Float32 and Float64 and returns UInt8 equal to 1 if the argument is a NaN, otherwise 0.
Accepts constant strings: database name, table name, and column name. Returns a UInt8 constant expression equal to 1 if there is a column, otherwise 0. If the hostname parameter is set, the test will run on a remote server. The function throws an exception if the table does not exist. For elements in a nested data structure, the function checks for the existence of a column. For the nested data structure itself, the function returns 0.
Allows building a unicode-art diagram.
bar(x, min, max, width)
draws a band with a width proportional to (x - min)
and equal to width
characters when x = max
.
Arguments
x
— Size to display.
min, max
— Integer constants. The value must fit in Int64
.
width
— Constant, positive integer, can be fractional.
The band is drawn with accuracy to one eighth of a symbol.
Example:
Transforms a value according to the explicitly defined mapping of some elements to other ones. There are two variations of this function:
x
– What to transform.
array_from
– Constant array of values for converting.
array_to
– Constant array of values to convert the values in ‘from’ to.
default
– Which value to use if ‘x’ is not equal to any of the values in ‘from’.
array_from
and array_to
– Arrays of the same size.
Types:
transform(T, Array(T), Array(U), U) -> U
T
and U
can be numeric, string, or Date or DateTime types. Where the same letter is indicated (T or U), for numeric types these might not be matching types, but types that have a common type. For example, the first argument can have the Int64 type, while the second has the Array(UInt16) type.
If the ‘x’ value is equal to one of the elements in the ‘array_from’ array, it returns the existing element (that is numbered the same) from the ‘array_to’ array. Otherwise, it returns ‘default’. If there are multiple matching elements in ‘array_from’, it returns one of the matches.
Example:
Differs from the first variation in that the ‘default’ argument is omitted. If the ‘x’ value is equal to one of the elements in the ‘array_from’ array, it returns the matching element (that is numbered the same) from the ‘array_to’ array. Otherwise, it returns ‘x’.
Types:
transform(T, Array(T), Array(T)) -> T
Example:
Accepts the size (number of bytes). Returns a rounded size with a suffix (KB, MB, etc.) as a string.
Example:
Accepts the size (number of bytes). Returns a rounded size with a suffix (KiB, MiB, etc.) as a string.
Example:
Accepts the number. Returns a rounded number with a suffix (thousand, million, billion, etc.) as a string.
It is useful for reading big numbers by human.
Example:
Accepts the time delta in seconds. Returns a time delta with (year, month, day, hour, minute, second) as a string.
Syntax
Arguments
column
— A column with numeric time delta.
maximum_unit
— Optional. Maximum unit to show. Acceptable values seconds, minutes, hours, days, months, years.
Example:
Returns the smallest value from a and b.
Returns the largest value of a and b.
Returns the server’s uptime in seconds. If it is executed in the context of a distributed table, then it generates a normal column with values relevant to each shard. Otherwise it produces a constant value.
Returns the version of the server as a string. If it is executed in the context of a distributed table, then it generates a normal column with values relevant to each shard. Otherwise it produces a constant value.
Returns the sequence number of the data block where the row is located.
Returns the ordinal number of the row in the data block. Different data blocks are always recalculated.
Returns the ordinal number of the row in the data block. This function only considers the affected data blocks.
The window function that provides access to a row at a specified offset which comes before or after the current row of a given column.
Syntax
The result of the function depends on the affected data blocks and the order of data in the block.
WARNING
It can reach the neighbor rows only inside the currently processed data block.
Arguments
column
— A column name or scalar expression.
default_value
— Optional. The value to be returned if offset goes beyond the scope of the block. Type of data blocks affected.
Returned values
Value for column
in offset
distance from current row if offset
value is not outside block bounds.
Default value for column
if offset
value is outside block bounds. If default_value
is given, then it will be used.
Type: type of data blocks affected or default value type.
Example
Query:
Result:
Query:
Result:
This function can be used to compute year-over-year metric value:
Query:
Result:
Calculates the difference between successive row values in the data block. Returns 0 for the first row and the difference from the previous row for each subsequent row.
WARNING
It can reach the previous row only inside the currently processed data block.
The result of the function depends on the affected data blocks and the order of data in the block.
Example:
Please note - block size affects the result. With each new block, the runningDifference
state is reset.
Calculates the number of concurrent events. Each event has a start time and an end time. The start time is included in the event, while the end time is excluded. Columns with a start time and an end time must be of the same data type. The function calculates the total number of active (concurrent) events for each event start time.
WARNING
Events must be ordered by the start time in ascending order. If this requirement is violated the function raises an exception. Every data block is processed separately. If events from different data blocks overlap then they can not be processed correctly.
Syntax
Arguments
Returned values
The number of concurrent events at each event start time.
Example
Consider the table:
Query:
Result:
Accepts a UInt64 number. Interprets it as a MAC address in big endian. Returns a string containing the corresponding MAC address in the format AA:BB:CC:DD:EE:FF (colon-separated numbers in hexadecimal form).
The inverse function of MACNumToString. If the MAC address has an invalid format, it returns 0.
Accepts a MAC address in the format AA:BB:CC:DD:EE:FF (colon-separated numbers in hexadecimal form). Returns the first three octets as a UInt64 number. If the MAC address has an invalid format, it returns 0.
Arguments:
value
— Value of type Enum
.
Returned values
The number of fields with Enum
input values.
An exception is thrown if the type is not Enum
.
Example
Returns size on disk (without taking into account compression).
Arguments
value
— Any value.
Returned values
The number of bytes that will be written to disk for block of values (without compression).
Example
Query:
Result:
Returns the name of the class that represents the data type of the column in RAM.
Arguments:
value
— Any type of value.
Returned values
A string with the name of the class that is used for representing the value
data type in RAM.
Example of the difference betweentoTypeName ' and ' toColumnTypeName
The example shows that the DateTime
data type is stored in memory as Const(UInt32)
.
Outputs a detailed description of data structures in RAM
Arguments:
value
— Any type of value.
Returned values
A string describing the structure that is used for representing the value
data type in RAM.
Example
Outputs the default value for the data type.
Does not include default values for custom columns set by the user.
Arguments:
expression
— Arbitrary type of value or an expression that results in a value of an arbitrary type.
Returned values
0
for numbers.
Empty string for strings.
Example
Outputs the default value for given type name.
Does not include default values for custom columns set by the user.
Arguments:
type
— A string representing a type name.
Returned values
0
for numbers.
Empty string for strings.
Example
The function is intended for debugging and introspection purposes. The function ignores it's argument and always returns 1. Arguments are not even evaluated.
But for the purpose of index analysis, the argument of this function is analyzed as if it was present directly without being wrapped inside indexHint
function. This allows to select data in index ranges by the corresponding condition but without further filtering by this condition. The index in ClickHouse is sparse and using indexHint
will yield more data than specifying the same condition directly.
Syntax
Returned value
Example
Input table:
The table has indexes on the fields (FlightDate, (Year, FlightDate))
.
Create a query, where the index is not used.
Query:
ClickHouse processed the entire table (Processed 4.28 million rows
).
Result:
To apply the index, select a specific date.
Query:
By using the index, ClickHouse processed a significantly smaller number of rows (Processed 32.74 thousand rows
).
Result:
Now wrap the expression k = '2017-09-15'
into indexHint
function.
Query:
ClickHouse used the index in the same way as the previous time (Processed 32.74 thousand rows
). The expression k = '2017-09-15'
was not used when generating the result. In examle the indexHint
function allows to see adjacent dates.
Result:
Creates an array with a single value.
Arguments:
arr
— Original array. ClickHouse creates a new array of the same length as the original and fills it with the value x
.
x
— The value that the resulting array will be filled with.
Returned value
An array filled with the value x
.
Type: Array
.
Example
Query:
Result:
Syntax
Returned value
The amount of remaining space available in bytes.
Example
Query:
Result:
Returns total amount of the free space on the filesystem where the files of the databases located. See also filesystemAvailable
Syntax
Returned value
Amount of free space in bytes.
Example
Query:
Result:
Syntax
Returned value
Capacity information of the filesystem in bytes.
Example
Query:
Result:
Syntax
Arguments
arg
— Arguments of aggregate function.
Returned value(s)
Result of aggregation for every row passed to the function.
The return type is the same as the return type of function, that initializeAgregation
takes as first argument.
Example
Query:
Result:
Query:
Result:
Example with AggregatingMergeTree
table engine and AggregateFunction
column:
See Also
Syntax
Arguments
Returned value(s)
Value/values that was aggregated.
Type: Value of any types that was aggregated.
Examples
Query:
Result:
Query:
Result:
Note that NULL
values are ignored.
Query:
Result:
Combined example:
Query:
Result:
See Also
Accumulates states of an aggregate function for each row of a data block.
WARNING
The state is reset for each new data block.
Syntax
Arguments
Returned value
Each resulting row contains a result of the aggregate function, accumulated for all the input rows from 0 to the current position. runningAccumulate
resets states for each new data block or when the grouping
value changes.
Type depends on the aggregate function used.
Examples
Consider how you can use runningAccumulate
to find the cumulative sum of numbers without and with grouping.
Query:
Result:
The whole query does the following:
For the first row, runningAccumulate
takes sumState(0)
and returns 0
.
For the second row, the function merges sumState(0)
and sumState(1)
resulting in sumState(0 + 1)
, and returns 1
as a result.
For the third row, the function merges sumState(0 + 1)
and sumState(2)
resulting in sumState(0 + 1 + 2)
, and returns 3
as a result.
The actions are repeated until the block ends.
The following example shows the groupping
parameter usage:
Query:
Result:
As you can see, runningAccumulate
merges states for each group of rows separately.
Only supports tables created with the ENGINE = Join(ANY, LEFT, <join_keys>)
statement.
Syntax
Arguments
value_column
— name of the column of the table that contains required data.
join_keys
— list of keys.
Returned value
Returns list of values corresponded to list of keys.
Example
Input table:
Query:
Result:
Prerequisites
Build the catboost evaluation library
Next, specify the path to libcatboostmodel.<so|dylib>
in the clickhouse configuration:
For security and isolation reasons, the model evaluation does not run in the server process but in the clickhouse-library-bridge process. At the first execution of catboostEvaluate()
, the server starts the library bridge process if it is not running already. Both processes communicate using a HTTP interface. By default, port 9012
is used. A different port can be specified as follows - this is useful if port 9012
is already assigned to a different service.
Train a catboost model using libcatboost
Throw an exception if the argument is non zero. message
- is an optional parameter: a constant string providing a custom error message error_code
- is an optional parameter: a constant integer providing a custom error code
To use the error_code
argument, configuration parameter allow_custom_error_code_in_throwif
must be enabled.
Returns the same value that was used as its argument. Used for debugging and testing, allows to cancel using index, and get the query performance of a full scan. When query is analyzed for possible use of index, the analyzer does not look inside identity
functions. Also constant folding is not applied too.
Syntax
Example
Query:
Result:
Syntax
Parameter
Returned value
The setting current value.
Example
Result
See Also
Syntax
Arguments
Returned values
1
— Decimal value has more digits then it's precision allow,
0
— Decimal value satisfies the specified precision.
Example
Query:
Result:
Returns number of decimal digits you need to represent the value.
Syntax
Arguments
Returned value
Number of digits.
NOTE
Example
Query:
Result:
Returned value
Variable name for the error code.
Syntax
Result:
Syntax
Arguments
None.
Returned value
The TCP port number.
Example
Query:
Result:
See Also
All the functions accept zero arguments or one argument. If an argument is passed, it can be any type, and its value is not used for anything. The only purpose of this argument is to prevent common subexpression elimination, so that two different instances of the same function return different columns with different random numbers.
NOTE
Non-cryptographic generators of pseudo-random numbers are used.
Returns a pseudo-random UInt32 number, evenly distributed among all UInt32-type numbers.
Uses a linear congruential generator.
Returns a pseudo-random UInt64 number, evenly distributed among all UInt64-type numbers.
Uses a linear congruential generator.
The function generates pseudo random results with independent and identically distributed uniformly distributed values in [0, 1).
Non-deterministic. Return type is Float64.
Produces a constant column with a random value.
Syntax
Arguments
Returned value
Pseudo-random number.
Example
Query:
Result:
Syntax
Inverts bits of s
, each with probability prob
.
Arguments
s
- String
or FixedString
prob
- constant Float32/64
Returned value Fuzzed string with same as s type.
Example
Result:
NOTE
Replaces the first occurrence of the substring ‘pattern’ (if it exists) in ‘haystack’ by the ‘replacement’ string. ‘pattern’ and ‘replacement’ must be constants.
Replaces all occurrences of the substring ‘pattern’ in ‘haystack’ by the ‘replacement’ string.
Example 1. Converting ISO dates to American format:
Example 2. Copying a string ten times:
Like ‘replaceRegexpOne‘, but replaces all occurrences of the pattern. Example:
As an exception, if a regular expression worked on an empty substring, the replacement is not made more than once. Example:
Returns the largest round number that is less than or equal to x
. A round number is a multiple of 1/10N, or the nearest number of the appropriate data type if 1 / 10N isn’t exact. ‘N’ is an integer constant, optional parameter. By default it is zero, which means to round to an integer. ‘N’ may be negative.
Examples: floor(123.45, 1) = 123.4, floor(123.45, -1) = 120.
x
is any numeric type. The result is a number of the same type. For integer arguments, it makes sense to round with a negative N
value (for non-negative N
, the function does not do anything). If rounding causes overflow (for example, floor(-128, -1)), an implementation-specific result is returned.
Returns the smallest round number that is greater than or equal to x
. In every other way, it is the same as the floor
function (see above).
Returns the round number with largest absolute value that has an absolute value less than or equal to x
‘s. In every other way, it is the same as the ’floor’ function (see above).
Rounds a value to a specified number of decimal places.
The function returns the nearest number of the specified order. In case when given number has equal distance to surrounding numbers, the function uses banker’s rounding for float number types and rounds away from zero for the other number types (Decimal).
Arguments
decimal-places
— An integer value.
If decimal-places > 0
then the function rounds the value to the right of the decimal point.
If decimal-places < 0
then the function rounds the value to the left of the decimal point.
If decimal-places = 0
then the function rounds the value to integer. In this case the argument can be omitted.
Returned value:
The rounded number of the same type as the input number.
Example of use with Float
Example of use with Decimal
Examples of rounding
Rounding to the nearest number.
Banker’s rounding.
See Also
Rounds a number to a specified decimal position.
If the rounding number is halfway between two numbers, the function uses banker’s rounding.
In other cases, the function rounds numbers to the nearest integer.
Using banker’s rounding, you can reduce the effect that rounding numbers has on the results of summing or subtracting these numbers.
For example, sum numbers 1.5, 2.5, 3.5, 4.5 with different rounding:
No rounding: 1.5 + 2.5 + 3.5 + 4.5 = 12.
Banker’s rounding: 2 + 2 + 4 + 4 = 12.
Rounding to the nearest integer: 2 + 3 + 4 + 5 = 14.
Syntax
Arguments
decimal-places
— Decimal places. An integer number.
decimal-places > 0
— The function rounds the number to the given position right of the decimal point. Example: roundBankers(3.55, 1) = 3.6
.
decimal-places < 0
— The function rounds the number to the given position left of the decimal point. Example: roundBankers(24.55, -1) = 20
.
decimal-places = 0
— The function rounds the number to an integer. In this case the argument can be omitted. Example: roundBankers(2.5) = 2
.
Returned value
A value rounded by the banker’s rounding method.
Example of use
Query:
Result:
Examples of Banker’s rounding
See Also
Accepts a number. If the number is less than one, it returns 0. Otherwise, it rounds the number down to the nearest (whole non-negative) degree of two.
Accepts a number. If the number is less than one, it returns 0. Otherwise, it rounds the number down to numbers from the set: 1, 10, 30, 60, 120, 180, 240, 300, 600, 1200, 1800, 3600, 7200, 18000, 36000.
Accepts a number. If the number is less than 18, it returns 0. Otherwise, it rounds the number down to a number from the set: 18, 25, 35, 45, 55.
Accepts a number and rounds it down to an element in the specified array. If the value is less than the lowest bound, the lowest bound is returned.
NOTE
Searches for the substring needle
in the string haystack
.
Returns the position (in bytes) of the found substring in the string, starting from 1.
Syntax
Alias: locate(haystack, needle[, start_pos])
.
NOTE
Syntax of position(needle IN haystack)
provides SQL-compatibility, the function works the same way as to position(haystack, needle)
.
Arguments
Returned values
Starting position in bytes (counting from 1), if substring was found.
0, if the substring was not found.
Type: Integer
.
Examples
The phrase “Hello, world!” contains a set of bytes representing a single-byte encoded text. The function returns some expected result:
Query:
Result:
Query:
Result:
Examples for POSITION(needle IN haystack) syntax
Query:
Result:
Query:
Result:
Works under the assumption that the string contains a set of bytes representing a single-byte encoded text. If this assumption is not met and a character can’t be represented using a single byte, the function does not throw an exception and returns some unexpected result. If character can be represented using two bytes, it will use two bytes and so on.
Syntax
Arguments
Returned values
Starting position in bytes (counting from 1), if substring was found.
0, if the substring was not found.
Type: Integer
.
Example
Query:
Result:
Returns the position (in Unicode points) of the found substring in the string, starting from 1.
Works under the assumption that the string contains a set of bytes representing a UTF-8 encoded text. If this assumption is not met, the function does not throw an exception and returns some unexpected result. If character can be represented using two Unicode points, it will use two and so on.
Syntax
Arguments
Returned values
Starting position in Unicode points (counting from 1), if substring was found.
0, if the substring was not found.
Type: Integer
.
Examples
The phrase “Hello, world!” in Russian contains a set of Unicode points representing a single-point encoded text. The function returns some expected result:
Query:
Result:
The phrase “Salut, étudiante!”, where character é
can be represented using a one point (U+00E9
) or two points (U+0065U+0301
) the function can be returned some unexpected result:
Query for the letter é
, which is represented one Unicode point U+00E9
:
Result:
Query for the letter é
, which is represented two Unicode points U+0065U+0301
:
Result:
Works under the assumption that the string contains a set of bytes representing a UTF-8 encoded text. If this assumption is not met, the function does not throw an exception and returns some unexpected result. If character can be represented using two Unicode points, it will use two and so on.
Syntax
Arguments
Returned value
Starting position in Unicode points (counting from 1), if substring was found.
0, if the substring was not found.
Type: Integer
.
Example
Query:
Result:
The search is performed on sequences of bytes without respect to string encoding and collation.
For case-insensitive ASCII search, use the function multiSearchAllPositionsCaseInsensitive
.
For case-insensitive UTF-8 search, use the function multiSearchAllPositionsCaseInsensitiveUTF8.
Syntax
Arguments
Returned values
Array of starting positions in bytes (counting from 1), if the corresponding substring was found and 0 if not found.
Example
Query:
Result:
See multiSearchAllPositions
.
The same as position
but returns the leftmost offset of the string haystack
that is matched to some of the needles.
For a case-insensitive search or/and in UTF-8 format use functions multiSearchFirstPositionCaseInsensitive, multiSearchFirstPositionUTF8, multiSearchFirstPositionCaseInsensitiveUTF8
.
Returns the index i
(starting from 1) of the leftmost found needlei in the string haystack
and 0 otherwise.
For a case-insensitive search or/and in UTF-8 format use functions multiSearchFirstIndexCaseInsensitive, multiSearchFirstIndexUTF8, multiSearchFirstIndexCaseInsensitiveUTF8
.
Returns 1, if at least one string needlei matches the string haystack
and 0 otherwise.
For a case-insensitive search or/and in UTF-8 format use functions multiSearchAnyCaseInsensitive, multiSearchAnyUTF8, multiSearchAnyCaseInsensitiveUTF8
.
NOTE
In all multiSearch*
functions the number of needles should be less than 28 because of implementation specification.
Returns 0 if it does not match, or 1 if it matches.
For patterns to search for substrings in a string, it is better to use LIKE or ‘position’, since they work much faster.
NOTE
The length of any of the haystack
string must be less than 232 bytes otherwise the exception is thrown. This restriction takes place because of hyperscan API.
The same as multiMatchAny
, but returns any index that matches the haystack.
The same as multiMatchAny
, but returns the array of all indices that match the haystack in any order.
The same as multiFuzzyMatchAny
, but returns any index that matches the haystack within a constant edit distance.
The same as multiFuzzyMatchAny
, but returns the array of all indices in any order that match the haystack within a constant edit distance.
NOTE
multiFuzzyMatch*
functions do not support UTF-8 regular expressions, and such expressions are treated as bytes because of hyperscan restriction.
NOTE
To turn off all functions that use hyperscan, use setting SET allow_hyperscan = 0;
.
Extracts a fragment of a string using a regular expression. If ‘haystack’ does not match the ‘pattern’ regex, an empty string is returned. If the regex does not contain subpatterns, it takes the fragment that matches the entire regex. Otherwise, it takes the fragment that matches the first subpattern.
Extracts all the fragments of a string using a regular expression. If ‘haystack’ does not match the ‘pattern’ regex, an empty string is returned. Returns an array of strings consisting of all matches to the regex. In general, the behavior is the same as the ‘extract’ function (it takes the first subpattern, or the entire expression if there isn’t a subpattern).
Matches all groups of the haystack
string using the pattern
regular expression. Returns an array of arrays, where the first array includes all fragments matching the first group, the second array - matching the second group, etc.
NOTE
Syntax
Arguments
Returned value
If haystack
does not match the pattern
regex, an array of empty arrays is returned.
Example
Query:
Result:
See Also
Matches all groups of the haystack
string using the pattern
regular expression. Returns an array of arrays, where each array includes matching fragments from every group. Fragments are grouped in order of appearance in the haystack
.
Syntax
Arguments
Returned value
If haystack
does not match the pattern
regex, an empty array is returned.
Example
Query:
Result:
See Also
Checks whether a string matches a simple regular expression. The regular expression can contain the metasymbols %
and _
.
%
indicates any quantity of any bytes (including zero characters).
_
indicates any one byte.
Use the backslash (\
) for escaping metasymbols. See the note on escaping in the description of the ‘match’ function.
For regular expressions like %needle%
, the code is more optimal and works as fast as the position
function. For other regular expressions, the code is the same as for the ‘match’ function.
The same thing as ‘like’, but negative.
The function ignores the language, e.g. for Turkish (i/İ), the result might be incorrect.
Syntax
Arguments
pattern
— If pattern
does not contain percent signs or underscores, then the pattern
only represents the string itself. An underscore (_
) in pattern
stands for (matches) any single character. A percent sign (%
) matches any sequence of zero or more characters.
Some pattern
examples:
Returned values
True, if the string matches pattern
.
False, if the string does not match pattern
.
Example
Input table:
Query:
Result:
See Also
Calculates the 4-gram distance between haystack
and needle
: counts the symmetric difference between two multisets of 4-grams and normalizes it by the sum of their cardinalities. Returns float number from 0 to 1 – the closer to zero, the more strings are similar to each other. If the constant needle
or haystack
is more than 32Kb, throws an exception. If some of the non-constant haystack
or needle
strings are more than 32Kb, the distance is always one.
For case-insensitive search or/and in UTF-8 format use functions ngramDistanceCaseInsensitive, ngramDistanceUTF8, ngramDistanceCaseInsensitiveUTF8
.
Same as ngramDistance
but calculates the non-symmetric difference between needle
and haystack
– the number of n-grams from needle minus the common number of n-grams normalized by the number of needle
n-grams. The closer to one, the more likely needle
is in the haystack
. Can be useful for fuzzy string search.
For case-insensitive search or/and in UTF-8 format use functions ngramSearchCaseInsensitive, ngramSearchUTF8, ngramSearchCaseInsensitiveUTF8
.
NOTE
For UTF-8 case we use 3-gram distance. All these are not perfectly fair n-gram distances. We use 2-byte hashes to hash n-grams and then calculate the (non-)symmetric difference between these hash tables – collisions may occur. With UTF-8 case-insensitive format we do not use fair tolower
function – we zero the 5-th bit (starting from zero) of each codepoint byte and first bit of zeroth byte if bytes more than one – this works for Latin and mostly for all Cyrillic letters.
Returns the number of substring occurrences.
Syntax
Arguments
Returned values
Number of occurrences.
Examples
Query:
Result:
Query:
Result:
Query:
Result:
Returns the number of substring occurrences case-insensitive.
Syntax
Arguments
Returned values
Number of occurrences.
Examples
Query:
Result:
Query:
Result:
Query:
Result:
Returns the number of substring occurrences in UTF-8
case-insensitive.
Syntax
Arguments
Returned values
Number of occurrences.
Examples
Query:
Result:
Query:
Result:
Returns the number of regular expression matches for a pattern
in a haystack
.
Syntax
Arguments
Returned value
The number of matches.
Examples
Query:
Result:
Query:
Result:
Splits a string into substrings separated by a specified character. It uses a constant string separator
which consists of exactly one character. Returns an array of selected substrings. Empty substrings may be selected if the separator occurs at the beginning or end of the string, or if there are multiple consecutive separators.
Syntax
Arguments
max_substrings
— An optional Int64
defaulting to 0. When max_substrings
> 0, the returned substrings will be no more than max_substrings
, otherwise the function will return as many substrings as possible.
Returned value(s)
Returns an array of selected substrings. Empty substrings may be selected when:
A separator occurs at the beginning or end of the string;
There are multiple consecutive separators;
The original string s
is empty.
Example
Splits a string into substrings separated by a string. It uses a constant string separator
of multiple characters as the separator. If the string separator
is empty, it will split the string s
into an array of single characters.
Syntax
Arguments
max_substrings
— An optional Int64
defaulting to 0. When max_substrings
> 0, the returned substrings will be no more than max_substrings
, otherwise the function will return as many substrings as possible.
Returned value(s)
Returns an array of selected substrings. Empty substrings may be selected when:
A non-empty separator occurs at the beginning or end of the string;
There are multiple consecutive non-empty separators;
The original string s
is empty while the separator is not empty.
Example
Concatenates string representations of values listed in the array with the separator. separator
is an optional parameter: a constant string, set to an empty string by default. Returns the string.
Selects substrings of consecutive bytes from the ranges a-z and A-Z.Returns an array of substrings.
Syntax
Arguments
max_substrings
— An optional Int64
defaulting to 0. When max_substrings
> 0, the returned substrings will be no more than max_substrings
, otherwise the function will return as many substrings as possible.
Returned value(s)
Returns an array of selected substrings.
Example
Extracts all groups from non-overlapping substrings matched by a regular expression.
Syntax
Arguments
Returned values
If the function finds at least one matching group, it returns Array(Array(String))
column, clustered by group_id (1 to N, where N is number of capturing groups in regexp
).
If there is no matching group, returns an empty array.
Example
Query:
Result:
NOTE
Checks whether the input string is empty.
Syntax
A string is considered non-empty if it contains at least one byte, even if this is a space or a null byte.
Arguments
Returned value
Returns 1
for an empty string or 0
for a non-empty string.
Example
Query:
Result:
Checks whether the input string is non-empty.
Syntax
A string is considered non-empty if it contains at least one byte, even if this is a space or a null byte.
Arguments
Returned value
Returns 1
for a non-empty string or 0
for an empty string string.
Example
Query:
Result:
Returns the length of a string in bytes (not in characters, and not in code points). The result type is UInt64. The function also works for arrays.
Returns the length of a string in Unicode code points (not in characters), assuming that the string contains a set of bytes that make up UTF-8 encoded text. If this assumption is not met, it returns some result (it does not throw an exception). The result type is UInt64.
Returns the length of a string in Unicode code points (not in characters), assuming that the string contains a set of bytes that make up UTF-8 encoded text. If this assumption is not met, it returns some result (it does not throw an exception). The result type is UInt64.
Returns the length of a string in Unicode code points (not in characters), assuming that the string contains a set of bytes that make up UTF-8 encoded text. If this assumption is not met, it returns some result (it does not throw an exception). The result type is UInt64.
Pads the current string from the left with spaces or a specified string (multiple times, if needed) until the resulting string reaches the given length. Similarly to the MySQL LPAD
function.
Syntax
Arguments
Returned value
The resulting string of the given length.
Example
Query:
Result:
Syntax
Arguments
Returned value
The resulting string of the given length.
Example
Query:
Result:
Pads the current string from the right with spaces or a specified string (multiple times, if needed) until the resulting string reaches the given length. Similarly to the MySQL RPAD
function.
Syntax
Arguments
Returned value
The resulting string of the given length.
Example
Query:
Result:
Syntax
Arguments
Returned value
The resulting string of the given length.
Example
Query:
Result:
Converts ASCII Latin symbols in a string to lowercase.
Converts ASCII Latin symbols in a string to uppercase.
Converts a string to lowercase, assuming the string contains a set of bytes that make up a UTF-8 encoded text. It does not detect the language. E.g. for Turkish the result might not be exactly correct (i/İ vs. i/I). If the length of the UTF-8 byte sequence is different for upper and lower case of a code point, the result may be incorrect for this code point. If the string contains a sequence of bytes that are not valid UTF-8, then the behavior is undefined.
Converts a string to uppercase, assuming the string contains a set of bytes that make up a UTF-8 encoded text. It does not detect the language. E.g. for Turkish the result might not be exactly correct (i/İ vs. i/I). If the length of the UTF-8 byte sequence is different for upper and lower case of a code point, the result may be incorrect for this code point. If the string contains a sequence of bytes that are not valid UTF-8, then the behavior is undefined.
Returns 1, if the set of bytes is valid UTF-8 encoded, otherwise 0.
Replaces invalid UTF-8 characters by the �
(U+FFFD) character. All running in a row invalid characters are collapsed into the one replacement character.
Arguments
Returned value: Valid UTF-8 string.
Example
Repeats a string as many times as specified and concatenates the replicated values as a single string.
Alias: REPEAT
.
Syntax
Arguments
Returned value
The single string, which contains the string s
repeated n
times. If n
\< 1, the function returns empty string.
Type: String
.
Example
Query:
Result:
Reverses the string (as a sequence of bytes).
Reverses a sequence of Unicode code points, assuming that the string contains a set of bytes representing a UTF-8 text. Otherwise, it does something else (it does not throw an exception).
Formatting constant pattern with the string listed in the arguments. pattern
is a simplified Python format pattern. Format string contains “replacement fields” surrounded by curly braces {}
. Anything that is not contained in braces is considered literal text, which is copied unchanged to the output. If you need to include a brace character in the literal text, it can be escaped by doubling: {{ '{{' }}
and {{ '}}' }}
. Field names can be numbers (starting from zero) or empty (then they are treated as consequence numbers).
Concatenates the strings listed in the arguments, without a separator.
Syntax
Arguments
Values of type String or FixedString.
Returned values
Returns the String that results from concatenating the arguments.
If any of argument values is NULL
, concat
returns NULL
.
Example
Query:
Result:
The function is named “injective” if it always returns different result for different values of arguments. In other words: different arguments never yield identical result.
Syntax
Arguments
Values of type String or FixedString.
Returned values
Returns the String that results from concatenating the arguments.
If any of argument values is NULL
, concatAssumeInjective
returns NULL
.
Example
Input table:
Query:
Result:
Returns a substring starting with the byte from the ‘offset’ index that is ‘length’ bytes long. Character indexing starts from one (as in standard SQL).
The same as ‘substring’, but for Unicode code points. Works under the assumption that the string contains a set of bytes representing a UTF-8 encoded text. If this assumption is not met, it returns some result (it does not throw an exception).
If the ‘s’ string is non-empty and does not contain the ‘c’ character at the end, it appends the ‘c’ character to the end.
Returns the string ‘s’ that was converted from the encoding in ‘from’ to the encoding in ‘to’.
Syntax
Arguments
Returned value
A string containing encoded value of 1st argument.
Example
Query:
Result:
Encodes ‘s’ FixedString or String into base64.
Alias: TO_BASE64
.
Decode base64-encoded FixedString or String ‘s’ into original string. In case of failure raises an exception.
Alias: FROM_BASE64
.
Similar to base64Decode, but returns an empty string in case of error.
Returns whether to end with the specified suffix. Returns 1 if the string ends with the specified suffix, otherwise it returns 0.
Returns 1 whether string starts with the specified prefix, otherwise it returns 0.
Returned values
1, if the string starts with the specified prefix.
0, if the string does not start with the specified prefix.
Example
Query:
Result:
Removes all specified characters from the start or end of a string. By default removes all consecutive occurrences of common whitespace (ASCII character 32) from both ends of a string.
Syntax
Arguments
Returned value
A string without leading and (or) trailing specified characters.
Type: String
.
Example
Query:
Result:
Removes all consecutive occurrences of common whitespace (ASCII character 32) from the beginning of a string. It does not remove other kinds of whitespace characters (tab, no-break space, etc.).
Syntax
Alias: ltrim(input_string)
.
Arguments
Returned value
A string without leading common whitespaces.
Type: String
.
Example
Query:
Result:
Removes all consecutive occurrences of common whitespace (ASCII character 32) from the end of a string. It does not remove other kinds of whitespace characters (tab, no-break space, etc.).
Syntax
Alias: rtrim(input_string)
.
Arguments
Returned value
A string without trailing common whitespaces.
Type: String
.
Example
Query:
Result:
Removes all consecutive occurrences of common whitespace (ASCII character 32) from both ends of a string. It does not remove other kinds of whitespace characters (tab, no-break space, etc.).
Syntax
Alias: trim(input_string)
.
Arguments
Returned value
A string without leading and trailing common whitespaces.
Type: String
.
Example
Query:
Result:
Returns the CRC32 checksum of a string, using CRC-32-IEEE 802.3 polynomial and initial value 0xffffffff
(zlib implementation).
The result type is UInt32.
Returns the CRC32 checksum of a string, using CRC-32-IEEE 802.3 polynomial.
The result type is UInt32.
Returns the CRC64 checksum of a string, using CRC-64-ECMA polynomial.
The result type is UInt64.
Replaces literals, sequences of literals and complex aliases with placeholders.
Syntax
Arguments
Returned value
Sequence of characters with placeholders.
Example
Query:
Result:
Returns identical 64bit hash values without the values of literals for similar queries. It helps to analyze query log.
Syntax
Arguments
Returned value
Hash value.
Example
Query:
Result:
Syntax
Arguments
Returned value
String transformed to NFC normalization form.
Example
Query:
Result:
Syntax
Arguments
Returned value
String transformed to NFD normalization form.
Example
Query:
Result:
Syntax
Arguments
Returned value
String transformed to NFKC normalization form.
Example
Query:
Result:
Syntax
Arguments
Returned value
String transformed to NFKD normalization form.
Example
Query:
Result:
Escapes characters to place string into XML text node or attribute.
The following five XML predefined entities will be replaced: <
, &
, >
, "
, '
.
Syntax
Arguments
Returned value
The sequence of characters with escape characters.
Example
Query:
Result:
Replaces XML predefined entities with characters. Predefined entities are "
&
'
>
<
This function also replaces numeric character references with Unicode characters. Both decimal (like ✓
) and hexadecimal (✓
) forms are supported.
Syntax
Arguments
Returned value
The sequence of characters after replacement.
Example
Query:
Result:
See Also
A function to extract text from HTML or XHTML. It does not necessarily 100% conform to any of the HTML, XML or XHTML standards, but the implementation is reasonably accurate and it is fast. The rules are the following:
Comments are skipped. Example: <!-- test -->
. Comment must end with -->
. Nested comments are not possible. Note: constructions like <!-->
and <!--->
are not valid comments in HTML but they are skipped by other rules.
CDATA is pasted verbatim. Note: CDATA is XML/XHTML specific. But it is processed for "best-effort" approach.
script
and style
elements are removed with all their content. Note: it is assumed that closing tag cannot appear inside content. For example, in JS string literal has to be escaped like "<\/script>"
. Note: comments and CDATA are possible inside script
or style
- then closing tags are not searched inside CDATA. Example: <script><![CDATA[</script>]]></script>
. But they are still searched inside comments. Sometimes it becomes complicated: <script>var x = "<!--"; </script> var y = "-->"; alert(x + y);</script>
Note: script
and style
can be the names of XML namespaces - then they are not treated like usual script
or style
elements. Example: <script:a>Hello</script:a>
. Note: whitespaces are possible after closing tag name: </script >
but not before: < / script>
.
Other tags or tag-like elements are skipped without inner content. Example: <a>.</a>
Note: it is expected that this HTML is illegal: <a test=">"></a>
Note: it also skips something like tags: <>
, <!>
, etc. Note: tag without end is skipped to the end of input: <hello
HTML and XML entities are not decoded. They must be processed by separate function.
Whitespaces in the text are collapsed or inserted by specific rules.
Whitespaces at the beginning and at the end are removed.
Consecutive whitespaces are collapsed.
But if the text is separated by other elements and there is no whitespace, it is inserted.
It may cause unnatural examples: Hello<b>world</b>
, Hello<!-- -->world
- there is no whitespace in HTML, but the function inserts it. Also consider: Hello<p>world</p>
, Hello<br>world
. This behavior is reasonable for data analysis, e.g. to convert HTML to a bag of words.
Also note that correct handling of whitespaces requires the support of <pre></pre>
and CSS display
and white-space
properties.
Syntax
Arguments
Returned value
Extracted text.
Example
Query:
Result:
A function that allows grouping multiple columns. For columns with the types T1, T2, …, it returns a Tuple(T1, T2, …) type tuple containing these columns. There is no cost to execute the function. Tuples are normally used as intermediate values for an argument of IN operators, or for creating a list of formal parameters of lambda functions. Tuples can’t be written to a table.
The function implements the operator (x, y, …)
.
Syntax
A function that allows getting a column from a tuple. ‘N’ is the column index, starting from 1. ‘N’ must be a constant. ‘N’ must be a strict postive integer no greater than the size of the tuple. There is no cost to execute the function.
The function implements the operator x.N
.
Syntax
Syntax
You can use the EXCEPT
expression to skip columns as a result of the query.
Arguments
Returned value
None.
Examples
Input table:
Example of using a Tuple
-type column as the untuple
function parameter:
Query:
Result:
Note: the names are implementation specific and are subject to change. You should not assume specific names of the columns after application of the untuple
.
Example of using an EXCEPT
expression:
Query:
Result:
See Also
Syntax
Arguments
Tuples should have the same type of the elements.
Returned value
The Hamming distance.
Examples
Query:
Result:
Result:
When you convert a value from one to another data type, you should remember that in common case, it is an unsafe operation that can lead to a data loss. A data loss can occur if you try to fit value from a larger data type to a smaller data type, or if you convert values between different data types.
toInt8(expr)
— Results in the Int8
data type.
toInt16(expr)
— Results in the Int16
data type.
toInt32(expr)
— Results in the Int32
data type.
toInt64(expr)
— Results in the Int64
data type.
toInt128(expr)
— Results in the Int128
data type.
toInt256(expr)
— Results in the Int256
data type.
Arguments
Returned value
Integer value in the Int8
, Int16
, Int32
, Int64
, Int128
or Int256
data type.
Example
Query:
Result:
It takes an argument of type String and tries to parse it into Int (8 | 16 | 32 | 64 | 128 | 256). If failed, returns 0.
Example
Query:
Result:
It takes an argument of type String and tries to parse it into Int (8 | 16 | 32 | 64 | 128 | 256). If failed, returns NULL.
Example
Query:
Result:
It takes an argument of type String and tries to parse it into Int (8 | 16 | 32 | 64 | 128 | 256). If failed, returns the default type value.
Example
Query:
Result:
toUInt8(expr)
— Results in the UInt8
data type.
toUInt16(expr)
— Results in the UInt16
data type.
toUInt32(expr)
— Results in the UInt32
data type.
toUInt64(expr)
— Results in the UInt64
data type.
toUInt256(expr)
— Results in the UInt256
data type.
Arguments
Returned value
Integer value in the UInt8
, UInt16
, UInt32
, UInt64
or UInt256
data type.
Example
Query:
Result:
Converts the argument to Date
data type.
If the argument is DateTime
or DateTime64
, it truncates it, leaving the date component of the DateTime:
If the argument is a string, it is parsed as Date or DateTime. If it was parsed as DateTime, the date component is being used:
If the argument is a number and it looks like a UNIX timestamp (is greater than 65535), it is interpreted as a DateTime, then truncated to Date in the current timezone. The timezone argument can be specified as a second argument of the function. The truncation to Date depends on the timezone:
The example above demonstrates how the same UNIX timestamp can be interpreted as different dates in different time zones.
If the argument is a number and it is smaller than 65536, it is interpreted as the number of days since 1970-01-01 (a UNIX day) and converted to Date. It corresponds to the internal numeric representation of the Date
data type. Example:
This conversion does not depend on timezones.
If the argument does not fit in the range of the Date type, it results in an implementation-defined behavior, that can saturate to the maximum supported date or overflow:
The function toDate
can be also written in alternative forms:
Have a nice day working with dates and times.
Syntax
Arguments
Returned value
A calendar date.
Example
The value is within the range:
The value is outside the range:
With Date
-type argument:
Example
Query:
Result:
Example
Query:
Result:
Example
Query:
Result:
Syntax
Arguments
scale
- Tick size (precision): 10-precision seconds. Valid range: [ 0 : 9 ].
timezone
- Time zone of the specified datetime64 object.
Returned value
A calendar date and time of day, with sub-second precision.
Example
The value is within the range:
As decimal with precision:
Without the decimal point the value is still treated as Unix Timestamp in seconds:
With timezone
:
toDecimal32(value, S)
toDecimal64(value, S)
toDecimal128(value, S)
toDecimal256(value, S)
toDecimal32OrNull(expr, S)
— Results in Nullable(Decimal32(S))
data type.
toDecimal64OrNull(expr, S)
— Results in Nullable(Decimal64(S))
data type.
toDecimal128OrNull(expr, S)
— Results in Nullable(Decimal128(S))
data type.
toDecimal256OrNull(expr, S)
— Results in Nullable(Decimal256(S))
data type.
These functions should be used instead of toDecimal*()
functions, if you prefer to get a NULL
value instead of an exception in the event of an input value parsing error.
Arguments
S
— Scale, the number of decimal places in the resulting value.
Returned value
A value in the Nullable(Decimal(P,S))
data type. The value contains:
Number with S
decimal places, if ClickHouse interprets the input string as a number.
NULL
, if ClickHouse can’t interpret the input string as a number or if the input number contains more than S
decimal places.
Examples
Query:
Result:
Query:
Result:
toDecimal32OrDefault(expr, S)
— Results in Decimal32(S)
data type.
toDecimal64OrDefault(expr, S)
— Results in Decimal64(S)
data type.
toDecimal128OrDefault(expr, S)
— Results in Decimal128(S)
data type.
toDecimal256OrDefault(expr, S)
— Results in Decimal256(S)
data type.
These functions should be used instead of toDecimal*()
functions, if you prefer to get a default value instead of an exception in the event of an input value parsing error.
Arguments
S
— Scale, the number of decimal places in the resulting value.
Returned value
A value in the Decimal(P,S)
data type. The value contains:
Number with S
decimal places, if ClickHouse interprets the input string as a number.
Default Decimal(P,S)
data type value, if ClickHouse can’t interpret the input string as a number or if the input number contains more than S
decimal places.
Examples
Query:
Result:
Query:
Result:
toDecimal32OrZero( expr, S)
— Results in Decimal32(S)
data type.
toDecimal64OrZero( expr, S)
— Results in Decimal64(S)
data type.
toDecimal128OrZero( expr, S)
— Results in Decimal128(S)
data type.
toDecimal256OrZero( expr, S)
— Results in Decimal256(S)
data type.
These functions should be used instead of toDecimal*()
functions, if you prefer to get a 0
value instead of an exception in the event of an input value parsing error.
Arguments
S
— Scale, the number of decimal places in the resulting value.
Returned value
A value in the Nullable(Decimal(P,S))
data type. The value contains:
Number with S
decimal places, if ClickHouse interprets the input string as a number.
0 with S
decimal places, if ClickHouse can’t interpret the input string as a number or if the input number contains more than S
decimal places.
Example
Query:
Result:
Query:
Result:
Functions for converting between numbers, strings (but not fixed strings), dates, and dates with times. All these functions accept one argument.
When converting to or from a string, the value is formatted or parsed using the same rules as for the TabSeparated format (and almost all other text formats). If the string can’t be parsed, an exception is thrown and the request is canceled.
When converting dates to numbers or vice versa, the date corresponds to the number of days since the beginning of the Unix epoch. When converting dates with times to numbers or vice versa, the date with time corresponds to the number of seconds since the beginning of the Unix epoch.
The date and date-with-time formats for the toDate/toDateTime functions are defined as follows:
As an exception, if converting from UInt32, Int32, UInt64, or Int64 numeric types to Date, and if the number is greater than or equal to 65536, the number is interpreted as a Unix timestamp (and not as the number of days) and is rounded to the date. This allows support for the common occurrence of writing ‘toDate(unix_timestamp)’, which otherwise would be an error and would require writing the more cumbersome ‘toDate(toDateTime(unix_timestamp))’.
Conversion between a date and a date with time is performed the natural way: by adding a null time or dropping the time.
Conversion between numeric types uses the same rules as assignments between different numeric types in C++.
Additionally, the toString function of the DateTime argument can take a second String argument containing the name of the time zone. Example: Asia/Yekaterinburg
In this case, the time is formatted according to the specified time zone.
Example
Query:
Result:
Also see the toUnixTimestamp
function.
Converts a String type argument to a FixedString(N) type (a string with fixed length N). N must be a constant. If the string has fewer bytes than N, it is padded with null bytes to the right. If the string has more bytes than N, an exception is thrown.
Accepts a String or FixedString argument. Returns the String with the content truncated at the first zero byte found.
Example
Query:
Result:
Query:
Result:
These functions accept a string and interpret the bytes placed at the beginning of the string as a number in host order (little endian). If the string isn’t long enough, the functions work as if the string is padded with the necessary number of null bytes. If the string is longer than needed, the extra bytes are ignored. A date is interpreted as the number of days since the beginning of the Unix Epoch, and a date with time is interpreted as the number of seconds since the beginning of the Unix Epoch.
This function accepts a number or date or date with time and returns a string containing bytes representing the corresponding value in host order (little endian). Null bytes are dropped from the end. For example, a UInt32 type value of 255 is a string that is one byte long.
This function accepts a number or date or date with time and returns a FixedString containing bytes representing the corresponding value in host order (little endian). Null bytes are dropped from the end. For example, a UInt32 type value of 255 is a FixedString that is one byte long.
Accepts 16 bytes string and returns UUID containing bytes representing the corresponding value in network byte order (big-endian). If the string isn't long enough, the function works as if the string is padded with the necessary number of null bytes to the end. If the string is longer than 16 bytes, the extra bytes at the end are ignored.
Syntax
Arguments
Returned value
Examples
String to UUID.
Query:
Result:
Going back and forth from String to UUID.
Query:
Result:
Uses the same source in-memory bytes sequence for x
value and reinterprets it to destination type.
Syntax
Arguments
x
— Any type.
Returned value
Destination type value.
Examples
Query:
Result:
Syntax
Arguments
x
— A value to convert. May be of any type.
t
— The target data type.
Returned value
Converted value.
NOTE
If the input value does not fit the bounds of the target type, the result overflows. For example, CAST(-1, 'UInt8')
returns 255
.
Examples
Query:
Result:
Query:
Result:
Example
Query:
Result:
Query:
Result:
See also
Converts x
to the T
data type.
Example
Query:
Result:
Query:
Result:
Syntax
Parameters
x
— Input value.
T
— The name of the returned data type.
Returned value
The value, converted to the specified data type T
.
Example
Query:
Result:
Query:
Result:
Converts input value x
to the specified data type T
. Returns default type value or default_value
if specified if the casted value is not representable in the target type.
Syntax
Parameters
x
— Input value.
T
— The name of the returned data type.
default_value
— Default value of returned data type.
Returned value
The value converted to the specified data type T
.
Example
Query:
Result:
Query:
Result:
Syntax
Arguments
number
— Duration of interval. Positive integer number.
Returned values
The value in Interval
data type.
Example
Query:
Result:
Syntax
Arguments
Supported non-standard formats
A string with a date and a time component: YYYYMMDDhhmmss
, DD/MM/YYYY hh:mm:ss
, DD-MM-YY hh:mm
, YYYY-MM-DD hh:mm:ss
, etc.
A string with a date, but no time component: YYYY
, YYYYMM
, YYYY*MM
, DD/MM/YYYY
, DD-MM-YY
etc.
A string with a day and time: DD
, DD hh
, DD hh:mm
. In this case YYYY-MM
are substituted as 2000-01
.
A string that includes the date and time along with time zone offset information: YYYY-MM-DD hh:mm:ss ±h:mm
, etc. For example, 2020-12-12 17:36:00 -5:00
.
For all of the formats with separator the function parses months names expressed by their full name or by the first three letters of a month name. Examples: 24/DEC/18
, 24-Dec-18
, 01-September-2018
.
Returned value
time_string
converted to the DateTime
data type.
Examples
Query:
Result:
Query:
Result:
Query:
Result:
Query:
Result:
Query:
Result:
See Also
Syntax
Parameters
Returned value
Examples
Query:
Result:
Syntax
Arguments
Returned values
Result of expr
.
Type: LowCardinality(expr_result_type)
Example
Query:
Result:
Converts a DateTime64
to a Int64
value with fixed sub-second precision. Input value is scaled up or down appropriately depending on it precision.
NOTE
The output value is a timestamp in UTC, not in the timezone of DateTime64
.
Syntax
Arguments
value
— DateTime64 value with any precision.
Returned value
value
converted to the Int64
data type.
Examples
Query:
Result:
Query:
Result:
Converts an Int64
to a DateTime64
value with fixed sub-second precision and optional timezone. Input value is scaled up or down appropriately depending on it’s precision. Please note that input value is treated as UTC timestamp, not timestamp at given (or implicit) timezone.
Syntax
Arguments
value
— Int64
value with any precision.
timezone
— String
(optional) timezone name of the result.
Returned value
value
converted to the DateTime64
data type.
Example
Query:
Result:
Converts arbitrary expressions into a string via given format.
Syntax
Arguments
x
,y
, ... — Expressions.
Returned value
A formatted string. (for text formats it's usually terminated with the new line character).
Example
Query:
Result:
Note: If format contains suffix/prefix, it will be written in each row.
Example
Query:
Result:
Note: Only row-based formats are supported in this function.
Converts arbitrary expressions into a string via given format. Differs from formatRow in that this function trims the last if any.
Syntax
Arguments
x
,y
, ... — Expressions.
Returned value
A formatted string.
Example
Query:
Result:
All these functions do not follow the RFC. They are maximally simplified for improved performance.
Extracts the protocol from a URL.
Examples of typical returned values: http, https, ftp, mailto, tel, magnet…
Extracts the hostname from a URL.
Arguments
The URL can be specified with or without a scheme. Examples:
For these examples, the domain
function returns the following results:
Returned values
Host name. If ClickHouse can parse the input string as a URL.
Empty string. If ClickHouse can’t parse the input string as a URL.
Type: String
.
Example
Returns the domain and removes no more than one ‘www.’ from the beginning of it, if present.
Extracts the the top-level domain from a URL.
Arguments
The URL can be specified with or without a scheme. Examples:
Returned values
Domain name. If ClickHouse can parse the input string as a URL.
Empty string. If ClickHouse cannot parse the input string as a URL.
Type: String
.
Example
Returns the “first significant subdomain”. The first significant subdomain is a second-level domain if it is ‘com’, ‘net’, ‘org’, or ‘co’. Otherwise, it is a third-level domain. For example, firstSignificantSubdomain (‘https://news.clickhouse.com/’) = ‘clickhouse’, firstSignificantSubdomain (‘https://news.clickhouse.com.tr/’) = ‘clickhouse’
. The list of “insignificant” second-level domains and other implementation details may change in the future.
Returns the part of the domain that includes top-level subdomains up to the “first significant subdomain” (see the explanation above).
For example:
cutToFirstSignificantSubdomain('https://news.clickhouse.com.tr/') = 'clickhouse.com.tr'
.
cutToFirstSignificantSubdomain('www.tr') = 'tr'
.
cutToFirstSignificantSubdomain('tr') = ''
.
Returns the part of the domain that includes top-level subdomains up to the “first significant subdomain”, without stripping "www".
For example:
cutToFirstSignificantSubdomain('https://news.clickhouse.com.tr/') = 'clickhouse.com.tr'
.
cutToFirstSignificantSubdomain('www.tr') = 'www.tr'
.
cutToFirstSignificantSubdomain('tr') = ''
.
Can be useful if you need fresh TLD list or you have custom.
Configuration example:
Syntax
Parameters
Returned value
Part of the domain that includes top-level subdomains up to the first significant subdomain.
Example
Query:
Result:
See Also
Returns the part of the domain that includes top-level subdomains up to the first significant subdomain without stripping www
. Accepts custom TLD list name.
Can be useful if you need fresh TLD list or you have custom.
Configuration example:
Syntax
Parameters
Returned value
Part of the domain that includes top-level subdomains up to the first significant subdomain without stripping www
.
Example
Query:
Result:
See Also
Returns the first significant subdomain. Accepts customs TLD list name.
Can be useful if you need fresh TLD list or you have custom.
Configuration example:
Syntax
Parameters
Returned value
First significant subdomain.
Example
Query:
Result:
See Also
Returns the port or default_port
if there is no port in the URL (or in case of validation error).
Returns the path. Example: /top/news.html
The path does not include the query string.
The same as above, but including query string and fragment. Example: /top/news.html?page=2#comments
Returns the query string. Example: page=1&lr=213. query-string does not include the initial question mark, as well as # and everything after #.
Returns the fragment identifier. fragment does not include the initial hash symbol.
Returns the query string and fragment identifier. Example: page=1#29390.
Returns the value of the ‘name’ parameter in the URL, if present. Otherwise, an empty string. If there are many parameters with this name, it returns the first occurrence. This function works under the assumption that the parameter name is encoded in the URL exactly the same way as in the passed argument.
Returns an array of name=value strings corresponding to the URL parameters. The values are not decoded in any way.
Returns an array of name strings corresponding to the names of URL parameters. The values are not decoded in any way.
Returns an array containing the URL, truncated at the end by the symbols /,? in the path and query-string. Consecutive separator characters are counted as one. The cut is made in the position after all the consecutive separator characters.
The same as above, but without the protocol and host in the result. The / element (root) is not included.
Returns the decoded URL. Example:
Extracts network locality (username:password@host:port
) from a URL.
Syntax
Arguments
Returned value
username:password@host:port
.
Type: String
.
Example
Query:
Result:
Removes no more than one ‘www.’ from the beginning of the URL’s domain, if present.
Removes query string. The question mark is also removed.
Removes the fragment identifier. The number sign is also removed.
Removes the query string and fragment identifier. The question mark and number sign are also removed.
Removes the name
parameter from URL, if present. This function does not encode or decode characters in parameter names, e.g. Client ID
and Client%20ID
are treated as different parameter names.
Syntax
Arguments
Returned value
URL with name
URL parameter removed.
Type: String
.
Example
Query:
Result:
The functions for working with UUID are listed below.
Syntax
Arguments
Returned value
The UUID type value.
Usage example
This example demonstrates creating a table with the UUID type column and inserting a value into the table.
Usage example if it is needed to generate multiple values in one row
Converts String type value to UUID type.
Returned value
The UUID type value.
Usage example
It takes an argument of type String and tries to parse it into UUID. If failed, returns NULL.
Returned value
The Nullable(UUID) type value.
Usage example
It takes an argument of type String and tries to parse it into UUID. If failed, returns zero UUID.
Returned value
The UUID type value.
Usage example
Syntax
Arguments
Returned value
FixedString(16)
Usage examples
Accepts binary
containing a binary representation of a UUID, with its format optionally specified by variant
(Big-endian
by default), and returns a string containing 36 characters in text format.
Syntax
Arguments
Returned value
String.
Usage example
Differs from in that it returns zero when the divisor is zero.
Can be optimized by enabling the setting. With optimize_functions_to_subcolumns = 1
the function reads only subcolumn instead of reading and processing the whole array column. The query SELECT empty(arr) FROM TABLE;
transforms to SELECT arr.size0 = 0 FROM TABLE;
.
The function also works for or .
[x]
— Input array. .
Type: .
Can be optimized by enabling the setting. With optimize_functions_to_subcolumns = 1
the function reads only subcolumn instead of reading and processing the whole array column. The query SELECT notEmpty(arr) FROM table
transforms to SELECT arr.size0 != 0 FROM TABLE
.
The function also works for or .
[x]
— Input array. .
Type: .
Can be optimized by enabling the setting. With optimize_functions_to_subcolumns = 1
the function reads only subcolumn instead of reading and processing the whole array column. The query SELECT length(arr) FROM table
transforms to SELECT arr.size0 FROM TABLE
.
start
— The first element of the array. Optional, required if step
is used. Default value: 0.
end
— The number before which the array is constructed. Required.
step
— Determines the incremental step between each element in the array. Optional. Default value: 1.
An exception is thrown if query results in arrays with a total length of more than number of elements specified by the setting.
arrays
– Arbitrary number of arguments of type. Example
Note that the arrayCount
is a . You can pass a lambda function to it as the first argument.
single_value
– A single value. Only numbers can be added to an array with numbers, and only strings can be added to an array of strings. When adding numbers, ClickHouse automatically sets the single_value
type for the data type of the array. For more information about the types of data in ClickHouse, see “”. Can be NULL
. The function adds a NULL
element to an array, and the type of array elements converts to Nullable
.
single_value
– A single value. Only numbers can be added to an array with numbers, and only strings can be added to an array of strings. When adding numbers, ClickHouse automatically sets the single_value
type for the data type of the array. For more information about the types of data in ClickHouse, see “”. Can be NULL
. The function adds a NULL
element to an array, and the type of array elements converts to Nullable
.
Note that arraySort
is a . You can pass a lambda function to it as the first argument. In this case, sorting order is determined by the result of the lambda function applied to the elements of the array.
For each element of the source array, the lambda function returns the sorting key, that is, [1 –> -1, 2 –> -2, 3 –> -3]. Since the arraySort
function sorts the keys in ascending order, the result is [3, 2, 1]. Thus, the (x) –> -x
lambda function sets the in a sorting.
To improve sorting efficiency, the is used.
Note that the arrayReverseSort
is a . You can pass a lambda function to it as the first argument. Example is shown below.
A special function. See the section .
array
– .
Type: , , .
array
– .
agg_func
— The name of an aggregate function which should be a constant .
arr
— Any number of type columns as the parameters of the aggregation function.
agg_func
— The name of an aggregate function which should be a constant .
ranges
— The ranges to aggretate which should be an of which containing the index and the length of each range.
arr
— Any number of type columns as the parameters of the aggregation function.
Type: .
Synonym for
array_of_arrays
— of arrays. For example, [[1,2,3], [4,5]]
.
arr
— The to inspect.
arrN
— .
Array with elements from the source arrays grouped into . Data types in the tuple are the same as types of the input arrays and in the same order as arrays are passed.
Type: .
Calculate AUC (Area Under the Curve, which is a concept in machine learning, see more details: ).
Note that the arrayMap
is a . You must pass a lambda function to it as the first argument, and it can’t be omitted.
Note that the arrayFilter
is a . You must pass a lambda function to it as the first argument, and it can’t be omitted.
Note that the arrayFill
is a . You must pass a lambda function to it as the first argument, and it can’t be omitted.
Note that the arrayReverseFill
is a . You must pass a lambda function to it as the first argument, and it can’t be omitted.
Note that the arraySplit
is a . You must pass a lambda function to it as the first argument, and it can’t be omitted.
Note that the arrayReverseSplit
is a . You must pass a lambda function to it as the first argument, and it can’t be omitted.
Note that the arrayExists
is a . You can pass a lambda function to it as the first argument.
Note that the arrayAll
is a . You can pass a lambda function to it as the first argument.
Note that the arrayFirst
is a . You must pass a lambda function to it as the first argument, and it can’t be omitted.
Note that the arrayFirstIndex
is a . You must pass a lambda function to it as the first argument, and it can’t be omitted.
Note that the arrayMin
is a . You can pass a lambda function to it as the first argument.
func
— Function. .
arr
— Array. .
Note that the arrayMax
is a . You can pass a lambda function to it as the first argument.
func
— Function. .
arr
— Array. .
Note that the arraySum
is a . You can pass a lambda function to it as the first argument.
func
— Function. .
arr
— Array. .
Type: for decimal numbers in source array (or for converted values, if func
is specified) — , for floating point numbers — , for numeric unsigned — , and for numeric signed — .
Note that the arrayAvg
is a . You can pass a lambda function to it as the first argument.
func
— Function. .
arr
— Array. .
Type: .
Note that the arrayCumSum
is a . You can pass a lambda function to it as the first argument.
Note that the arraySumNonNegative
is a . You can pass a lambda function to it as the first argument.
Multiplies elements of an .
arr
— of numeric values.
Type: .
Return value type is always . Result:
a
— A value to shift. , or .
b
— The number of shift positions. , 64 bit types or less are allowed.
In the following queries and functions are used to show bits of shifted values.
a
— A value to shift. , or .
b
— The number of shift positions. , 64 bit types or less are allowed.
Takes any integer and converts it into , returns the value of a bit at specified position. The countdown starts from 0 from the right to the left.
Returns result of (AND operator) of all bits at given positions. The countdown starts from 0 from the right to the left.
Returns result of (OR operator) of all bits at given positions. The countdown starts from 0 from the right to the left.
x
— or number. The function uses the value representation in memory. It allows supporting floating-point numbers.
The function does not convert input value to a larger type (). So, for example, bitCount(toUInt8(-1)) = 8
.
Returns the between the bit representations of two integer values. Can be used with functions for detection of semi-duplicate strings. The smaller is the distance, the more likely those strings are the same.
int1
— First integer value. .
int2
— Second integer value. .
Type: .
With :
For more information on RoaringBitmap, see: .
bitmap
– .
range_start
– Range start point. Type: .
range_end
– Range end point (excluded). Type: .
bitmap
– .
range_start
– The subset starting point. Type: .
cardinality_limit
– The subset cardinality upper limit. Type: .
Type: .
Returns the bitmap elements, starting from the offset
position. The number of returned elements is limited by the cardinality_limit
parameter. Analog of the ) string function, but for bitmap.
bitmap
– The bitmap. Type: .
offset
– The position of the first element of the subset. Type: .
cardinality_limit
– The maximum number of elements in the subset. Type: .
Type: .
haystack
– , where the function searches.
needle
– Value that the function searches. Type: .
If you are sure that bitmap2
contains strictly one element, consider using the function. It works more efficiently.
You can use the setting to calculate the if
function according to a short scheme. If this setting is enabled, then
expression is evaluated only on rows where cond
is true, else
expression – where cond
is false. For example, an exception about division by zero is not thrown when executing the query SELECT if(number = 0, 0, intDiv(42, number)) FROM numbers(10)
, because intDiv(42, number)
will be evaluated only for numbers that doesn't satisfy condition number = 0
.
Note: NULL
values are not used in this example, check section.
.
Allows you to write the operator more compactly in the query.
You can use the setting to calculate the multiIf
function according to a short scheme. If this setting is enabled, then_i
expression is evaluated only on rows where ((NOT cond_1) AND (NOT cond_2) AND ... AND (NOT cond_{i-1}) AND cond_i)
is true, cond_i
will be evaluated only on rows where ((NOT cond_1) AND (NOT cond_2) AND ... AND (NOT cond_{i-1}))
is true. For example, an exception about division by zero is not thrown when executing the query SELECT multiIf(number = 2, intDiv(1, number), number = 5) FROM numbers(10)
.
Type: .
value
— Time or date and time. .
timezone
— Timezone for the returned value. . This argument is a constant, because toTimezone
changes the timezone of a column (timezone is an attribute of DateTime*
types).
Type: .
Returns the timezone name of or data types.
value
— Date and time. or .
Type: .
Returns a timezone offset in seconds from . The function takes into account and historical timezone changes at the specified date and time. is used to calculate the offset.
value
— Date and time. or .
Type: .
For DateTime argument: converts value to the number with type UInt32 -- Unix Timestamp (). For String argument: converts the input string to the datetime according to the timezone (optional second argument, server timezone is used by default) and returns the corresponding unix timestamp.
The return type of toStartOf*
, toLastDayOfMonth
, toMonday
, timeSlot
functions described below is determined by the configuration parameter which is 0
by default.
value
— Date and time. .
timezone
— for the returned value (optional). If not specified, the function uses the timezone of the value
parameter. .
Type: .
server configuration parameter.
unit
— The type of interval to truncate the result. . Possible values:
value
— Date and time. or .
timezone
— for the returned value (optional). If not specified, the function uses the timezone of the value
parameter. .
Type: .
unit
— The type of interval to add. . Possible values:
value
— Value of interval to add. .
date
— The date or date with time to which value
is added. or .
Type: or .
Returns the difference between two dates or dates with time values. The difference is calculated using relative units, e.g. the difference between 2022-01-01
and 2021-12-29
is 3 days for day unit (see ), 1 month for month unit (see ), 1 year for year unit (see ).
unit
— The type of interval for result. . Possible values:
startdate
— The first time value to subtract (the subtrahend). , , or .
enddate
— The second time value to subtract from (the minuend). , , or .
timezone
— (optional). If specified, it is applied to both startdate
and enddate
. If not specified, timezones of startdate
and enddate
are used. If they are not the same, the result is unspecified. .
Type: .
unit
— The type of interval to subtract. . Possible values:
value
— Value of interval to subtract. .
date
— The date or date with time from which value
is subtracted. or .
Type: or .
date
— Date or date with time. or .
value
— Value of interval to add. .
unit
— The type of interval to add. . Possible values:
Type: or .
unit
— The type of interval to subtract. . Possible values:
value
— Value of interval to subtract. .
date
— Date or date with time. or .
Type: or .
timezone
— for the returned value (optional). .
Type: .
timezone
— for the returned value (optional). .
Type: .
formatDateTime uses MySQL datetime format style, refer to .
four-digit year format for ISO week number, calculated from the week-based year standard, normally useful only with %V
Similar to formatDateTime, except that it formats datetime in Joda style instead of MySQL style. Refer to .
date_part
— Date part. Possible values: 'year', 'quarter', 'month', 'week', 'dayofyear', 'day', 'weekday', 'hour', 'minute', 'second'. .
date
— Date. , , or .
timezone
— Timezone. Optional. .
Type:
Function converts Unix timestamp to a calendar date and a time of a day. When there is only a single argument of type, it acts in the same way as and return type.
FROM_UNIXTIME uses MySQL datetime format style, refer to .
When there are two or three arguments, the first an , , , or , the second a constant format string and the third an optional constant time zone string — it acts in the same way as and return type.
Similar to FROM_UNIXTIME, except that it formats time in Joda style instead of MySQL style. Refer to .
Converts a date in text form YYYY-MM-DD
to a number in Int32. This function supports date from 0000-01-01
to 9999-12-31
. It raises an exception if the argument cannot be parsed as a date, or the date is invalid.
date
— Date in text form. or .
Type: .
Similar to , but instead of raising exceptions it returns NULL
.
date
— Date in text form. or .
Type: .
Converts a number to a date in text form YYYY-MM-DD
. This function supports day number from -678941
to 2973119
(which represent 0000-01-01 and 9999-12-31 respectively). It raises an exception if the day number is outside of the supported range.
day
— Modified Julian Day number. .
Type:
Similar to , but instead of raising exceptions it returns NULL
.
day
— Modified Julian Day number. .
Type:
For information on connecting and configuring dictionaries, see .
dict_name
— Name of the dictionary. .
attr_names
— Name of the column of the dictionary, , or tuple of column names, ().
id_expr
— Key value. returning dictionary key-type value or -type value depending on the dictionary configuration.
default_value_expr
— Values returned if the dictionary does not contain a row with the id_expr
key. or (), returning the value (or values) in the data types configured for the attr_names
attribute.
If ClickHouse parses the attribute successfully in the , functions return the value of the dictionary attribute that corresponds to id_expr
.
dict_name
— Name of the dictionary. .
id_expr
— Key value. returning dictionary key-type value or -type value depending on the dictionary configuration.
Creates an array, containing all the parents of a key in the .
dict_name
— Name of the dictionary. .
key
— Key value. returning a -type value.
Type: .
dict_name
— Name of the dictionary. .
child_id_expr
— Key to be checked. returning a -type value.
ancestor_id_expr
— Alleged ancestor of the child_id_expr
key. returning a -type value.
dict_name
— Name of the dictionary. .
attr_name
— Name of the column of the dictionary. .
id_expr
— Key value. returning a or -type value depending on the dictionary configuration.
default_value_expr
— Value returned if the dictionary does not contain a row with the id_expr
key. returning the value in the data type configured for the attr_name
attribute.
If ClickHouse parses the attribute successfully in the , functions return the value of the dictionary attribute that corresponds to id_expr
.
number_1, number_2, ..., number_n
— Numerical arguments interpreted as integers. Types: , .
Values of type and are formatted as corresponding integers (the number of days since Epoch for Date and the value of Unix Timestamp for DateTime).
For and , all bytes are simply encoded as two hexadecimal numbers. Zero bytes are not omitted.
Values of and types are encoded as their representation in memory. As we support little-endian architecture, they are encoded in little-endian. Zero leading/trailing bytes are not omitted.
Values of type are encoded as big-endian order string.
arg
— A value to convert to hexadecimal. Types: , , , , or .
Type: .
Performs the opposite operation of . It interprets each pair of hexadecimal digits (in the argument) as a number and converts it to the byte represented by the number. The return value is a binary string (BLOB).
If you want to convert the result to a number, you can use the and functions.
arg
— A string containing any number of hexadecimal digits. Type: , .
Type: .
mode
— Encryption mode. .
plaintext
— Text thats need to be encrypted. .
key
— Encryption key. .
iv
— Initialization vector. Required for -gcm
modes, optinal for others. .
aad
— Additional authenticated data. It isn't encrypted, but it affects decryption. Works only in -gcm
modes, for others would throw an exception. .
Ciphertext binary string. .
Compatible with mysql encryption and resulting ciphertext can be decrypted with function.
mode
— Encryption mode. .
plaintext
— Text that needs to be encrypted. .
key
— Encryption key. If key is longer than required by mode, MySQL-specific key folding is performed. .
iv
— Initialization vector. Optional, only first 16 bytes are taken into account .
Ciphertext binary string. .
mode
— Decryption mode. .
ciphertext
— Encrypted text that needs to be decrypted. .
key
— Decryption key. .
iv
— Initialization vector. Required for -gcm
modes, optinal for others. .
aad
— Additional authenticated data. Won't decrypt if this value is incorrect. Works only in -gcm
modes, for others would throw an exception. .
Decrypted String. .
Re-using table from .
Compatible with mysql encryption and decrypts data encrypted with function.
mode
— Decryption mode. .
ciphertext
— Encrypted text that needs to be decrypted. .
key
— Decryption key. .
iv
— Initialization vector. Optinal. .
Decrypted String. .
path
— The relative path to the file from . Path to file support following wildcards: *
, ?
, {abc,def}
and {N..M}
where N
, M
— numbers, 'abc', 'def'
— strings.
default
— The value that will be returned in the case when a file does not exist or cannot be accessed. Data types supported: and .
Calculates the distance between two points on the Earth’s surface using .
Calculates the central angle between two points on the Earth’s surface using .
(x, y)
— Coordinates of a point on the plane. Data type — — A tuple of two numbers.
[(a, b), (c, d) ...]
— Polygon vertices. Data type — . Each vertex is represented by a pair of coordinates (a, b)
. Vertices should be specified in a clockwise or counterclockwise order. The minimum number of vertices is 3. The polygon must be constant.
is the geocode system, which subdivides Earth’s surface into buckets of grid shape and encodes each cell into a short string of letters and digits. It is a hierarchical data structure, so the longer is the geohash string, the more precise is the geographic location.
If you need to manually convert geographic coordinates to geohash strings, you can use .
Encodes latitude and longitude as a -string.
Decodes any -encoded string into longitude and latitude.
Returns an array of -encoded strings of given precision that fall inside and intersect boundaries of given box, basically a 2D grid flattened into array.
longitude_min
— Minimum longitude. Range: [-180°, 180°]
. Type: .
latitude_min
— Minimum latitude. Range: [-90°, 90°]
. Type: .
longitude_max
— Maximum longitude. Range: [-180°, 180°]
. Type: .
latitude_max
— Maximum latitude. Range: [-90°, 90°]
. Type: .
precision
— Geohash precision. Range: [1, 12]
. Type: .
Type: ().
is a geographical indexing system where Earth’s surface divided into a grid of even hexagonal cells. This system is hierarchical, i. e. each hexagon on the top level ("parent") can be split into seven even but smaller ones ("children"), and so on.
The full description of the H3 system is available at .
Verifies whether the number is a valid index.
h3index
— Hexagon index number. Type: .
Type: .
Defines the resolution of the given index.
h3index
— Hexagon index number. Type: .
If the index is not valid, the function returns a random value. Use to verify the index.
Type: .
Calculates the average length of the hexagon edge in grades.
resolution
— Index resolution. Type: . Range: [0, 15]
.
The average length of the hexagon edge in grades. Type: .
Calculates the average length of the hexagon edge in meters.
resolution
— Index resolution. Type: . Range: [0, 15]
.
The average length of the hexagon edge in meters. Type: .
Returns point index (lon, lat)
with specified resolution.
lon
— Longitude. Type: .
lat
— Latitude. Type: .
resolution
— Index resolution. Range: [0, 15]
. Type: .
Type: .
Lists all the hexagons in the raduis of k
from the given hexagon in random order.
h3index
— Hexagon index number. Type: .
k
— Radius. Type:
Type: ().
Returns the base cell number of the index.
index
— Hexagon index number. Type: .
Type: .
resolution
— Index resolution. Range: [0, 15]
. Type: .
Type: .
Returns whether or not the provided indexes are neighbors.
index1
— Hexagon index number. Type: .
index2
— Hexagon index number. Type: .
Type: .
Returns an array of child indexes for the given index.
index
— Hexagon index number. Type: .
resolution
— Index resolution. Range: [0, 15]
. Type: .
Type: ().
Returns the parent (coarser) index containing the given index.
index
— Hexagon index number. Type: .
resolution
— Index resolution. Range: [0, 15]
. Type: .
Type: .
index
— Hexagon index number. Type: .
Type: .
index_str
— String representation of the H3 index. Type: .
Hexagon index number. Returns 0 on error. Type: .
all the input parameters as strings and calculates the hash value for each of them. Then combines hashes, takes the first 8 bytes of the hash of the resulting string, and interprets them as UInt64
in big-endian byte order.
The function is relatively slow (5 million short strings per second per processor core). Consider using the function instead.
The function takes a variable number of input parameters. Arguments can be any of the . For some data types calculated value of hash function may be the same for the same values even if types of arguments differ (integers of different size, named and unnamed Tuple
with the same data, Map
and the corresponding Array(Tuple(key, value))
type with the same data).
A data type hash value.
Produces a 64-bit hash value.
This is a cryptographic hash function. It works at least three times faster than the function.
Function all the input parameters as strings and calculates the hash value for each of them. Then combines hashes by the following algorithm:
The function takes a variable number of input parameters. Arguments can be any of the . For some data types calculated value of hash function may be the same for the same values even if types of arguments differ (integers of different size, named and unnamed Tuple
with the same data, Map
and the corresponding Array(Tuple(key, value))
type with the same data).
A data type hash value.
Produces a 128-bit hash value. Differs from in that the final xor-folding state is done up to 128 bits.
The function takes a variable number of input parameters. Arguments can be any of the . For some data types calculated value of hash function may be the same for the same values even if types of arguments differ (integers of different size, named and unnamed Tuple
with the same data, Map
and the corresponding Array(Tuple(key, value))
type with the same data).
Type: .
Produces a 64-bit hash value.
The function takes a variable number of input parameters. Arguments can be any of the . For some data types calculated value of hash function may be the same for the same values even if types of arguments differ (integers of different size, named and unnamed Tuple
with the same data, Map
and the corresponding Array(Tuple(key, value))
type with the same data).
A data type hash value.
Calculates SHA-1, SHA-224, SHA-256, SHA-512 hash from a string and returns the resulting set of bytes as .
s
— Input string for SHA hash calculation. .
Type: .
Use the function to represent the result as a hex-encoded string.
Produces a 64-bit or Fingerprint value. farmFingerprint64
is preferred for a stable and portable value.
These functions use the Fingerprint64
and Hash64
methods respectively from all .
The function takes a variable number of input parameters. Arguments can be any of the . For some data types calculated value of hash function may be the same for the same values even if types of arguments differ (integers of different size, named and unnamed Tuple
with the same data, Map
and the corresponding Array(Tuple(key, value))
type with the same data)..
A data type hash value.
Calculates JavaHash from a , , , , . This hash function is neither fast nor having a good quality. The only reason to use it is when this algorithm is already used in another system and you have to calculate exactly the same result.
Calculates from a string, assuming it contains bytes representing a string in UTF-16LE encoding.
This is just with zeroed out sign bit. This function is used in for versions before 3.0. This hash function is neither fast nor having a good quality. The only reason to use it is when this algorithm is already used in another system and you have to calculate exactly the same result.
Produces a 64-bit hash value.
The function takes a variable number of input parameters. Arguments can be any of the . For some data types calculated value of hash function may be the same for the same values even if types of arguments differ (integers of different size, named and unnamed Tuple
with the same data, Map
and the corresponding Array(Tuple(key, value))
type with the same data).
A data type hash value.
Calculates JumpConsistentHash form a UInt64. Accepts two arguments: a UInt64-type key and the number of buckets. Returns Int32. For more information, see the link:
Produces a hash value.
Both functions take a variable number of input parameters. Arguments can be any of the . For some data types calculated value of hash function may be the same for the same values even if types of arguments differ (integers of different size, named and unnamed Tuple
with the same data, Map
and the corresponding Array(Tuple(key, value))
type with the same data).
The murmurHash2_32
function returns hash value having the data type.
The murmurHash2_64
function returns hash value having the data type.
Calculates a 64-bit hash value using the same hash seed as . It is portable between CLang and GCC builds.
par1, ...
— A variable number of parameters that can be any of the .
Type: .
Produces a hash value.
Both functions take a variable number of input parameters. Arguments can be any of the . For some data types calculated value of hash function may be the same for the same values even if types of arguments differ (integers of different size, named and unnamed Tuple
with the same data, Map
and the corresponding Array(Tuple(key, value))
type with the same data).
The murmurHash3_32
function returns a data type hash value.
The murmurHash3_64
function returns a data type hash value.
Produces a 128-bit hash value.
expr
— A list of . .
Type: .
.
Can be used for detection of semi-duplicate strings with . The smaller is the of the calculated simhashes
of two strings, the more likely these strings are the same.
string
— String. .
ngramsize
— The size of an n-gram. Optional. Possible values: any number from 1
to 25
. Default value: 3
. .
Type: .
Can be used for detection of semi-duplicate strings with . The smaller is the of the calculated simhashes
of two strings, the more likely these strings are the same.
string
— String. .
ngramsize
— The size of an n-gram. Optional. Possible values: any number from 1
to 25
. Default value: 3
. .
Type: .
Can be used for detection of semi-duplicate strings with . The smaller is the of the calculated simhashes
of two strings, the more likely these strings are the same.
string
— String. .
ngramsize
— The size of an n-gram. Optional. Possible values: any number from 1
to 25
. Default value: 3
. .
Type: .
Can be used for detection of semi-duplicate strings with . The smaller is the of the calculated simhashes
of two strings, the more likely these strings are the same.
string
— String. .
ngramsize
— The size of an n-gram. Optional. Possible values: any number from 1
to 25
. Default value: 3
. .
Type: .
Can be used for detection of semi-duplicate strings with . The smaller is the of the calculated simhashes
of two strings, the more likely these strings are the same.
string
— String. .
shinglesize
— The size of a word shingle. Optional. Possible values: any number from 1
to 25
. Default value: 3
. .
Type: .
Can be used for detection of semi-duplicate strings with . The smaller is the of the calculated simhashes
of two strings, the more likely these strings are the same.
string
— String. .
shinglesize
— The size of a word shingle. Optional. Possible values: any number from 1
to 25
. Default value: 3
. .
Type: .
Can be used for detection of semi-duplicate strings with . The smaller is the of the calculated simhashes
of two strings, the more likely these strings are the same.
string
— String. .
shinglesize
— The size of a word shingle. Optinal. Possible values: any number from 1
to 25
. Default value: 3
. .
Type: .
Can be used for detection of semi-duplicate strings with . The smaller is the of the calculated simhashes
of two strings, the more likely these strings are the same.
string
— String. .
shinglesize
— The size of a word shingle. Optional. Possible values: any number from 1
to 25
. Default value: 3
. .
Type: .
Can be used for detection of semi-duplicate strings with . For two strings: if one of the returned hashes is the same for both strings, we think that those strings are the same.
string
— String. .
ngramsize
— The size of an n-gram. Optional. Possible values: any number from 1
to 25
. Default value: 3
. .
hashnum
— The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1
to 25
. Default value: 6
. .
Type: (, ).
Can be used for detection of semi-duplicate strings with . For two strings: if one of the returned hashes is the same for both strings, we think that those strings are the same.
string
— String. .
ngramsize
— The size of an n-gram. Optional. Possible values: any number from 1
to 25
. Default value: 3
. .
hashnum
— The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1
to 25
. Default value: 6
. .
Type: (, ).
Can be used for detection of semi-duplicate strings with . For two strings: if one of the returned hashes is the same for both strings, we think that those strings are the same.
string
— String. .
ngramsize
— The size of an n-gram. Optional. Possible values: any number from 1
to 25
. Default value: 3
. .
hashnum
— The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1
to 25
. Default value: 6
. .
Type: (, ).
Can be used for detection of semi-duplicate strings with . For two strings: if one of the returned hashes is the same for both strings, we think that those strings are the same.
string
— String. .
ngramsize
— The size of an n-gram. Optional. Possible values: any number from 1
to 25
. Default value: 3
. .
hashnum
— The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1
to 25
. Default value: 6
. .
Type: (, ).
Splits a ASCII string into n-grams of ngramsize
symbols and returns the n-grams with minimum and maximum hashes, calculated by the function with the same input. Is case sensitive.
string
— String. .
ngramsize
— The size of an n-gram. Optional. Possible values: any number from 1
to 25
. Default value: 3
. .
hashnum
— The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1
to 25
. Default value: 6
. .
Type: ((), ()).
Splits a ASCII string into n-grams of ngramsize
symbols and returns the n-grams with minimum and maximum hashes, calculated by the function with the same input. Is case insensitive.
string
— String. .
ngramsize
— The size of an n-gram. Optional. Possible values: any number from 1
to 25
. Default value: 3
. .
hashnum
— The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1
to 25
. Default value: 6
. .
Type: ((), ()).
Splits a UTF-8 string into n-grams of ngramsize
symbols and returns the n-grams with minimum and maximum hashes, calculated by the function with the same input. Is case sensitive.
string
— String. .
ngramsize
— The size of an n-gram. Optional. Possible values: any number from 1
to 25
. Default value: 3
. .
hashnum
— The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1
to 25
. Default value: 6
. .
Type: ((), ()).
Splits a UTF-8 string into n-grams of ngramsize
symbols and returns the n-grams with minimum and maximum hashes, calculated by the function with the same input. Is case insensitive.
string
— String. .
ngramsize
— The size of an n-gram. Optional. Possible values: any number from 1
to 25
. Default value: 3
. .
hashnum
— The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1
to 25
. Default value: 6
. .
Type: ((), ()).
Can be used for detection of semi-duplicate strings with . For two strings: if one of the returned hashes is the same for both strings, we think that those strings are the same.
string
— String. .
shinglesize
— The size of a word shingle. Optional. Possible values: any number from 1
to 25
. Default value: 3
. .
hashnum
— The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1
to 25
. Default value: 6
. .
Type: (, ).
Can be used for detection of semi-duplicate strings with . For two strings: if one of the returned hashes is the same for both strings, we think that those strings are the same.
string
— String. .
shinglesize
— The size of a word shingle. Optional. Possible values: any number from 1
to 25
. Default value: 3
. .
hashnum
— The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1
to 25
. Default value: 6
. .
Type: (, ).
Can be used for detection of semi-duplicate strings with . For two strings: if one of the returned hashes is the same for both strings, we think that those strings are the same.
string
— String. .
shinglesize
— The size of a word shingle. Optional. Possible values: any number from 1
to 25
. Default value: 3
. .
hashnum
— The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1
to 25
. Default value: 6
. .
Type: (, ).
Can be used for detection of semi-duplicate strings with . For two strings: if one of the returned hashes is the same for both strings, we think that those strings are the same.
string
— String. .
shinglesize
— The size of a word shingle. Optional. Possible values: any number from 1
to 25
. Default value: 3
. .
hashnum
— The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1
to 25
. Default value: 6
. .
Type: (, ).
Splits a ASCII string into parts (shingles) of shinglesize
words each and returns the shingles with minimum and maximum word hashes, calculated by the function with the same input. Is case sensitive.
string
— String. .
shinglesize
— The size of a word shingle. Optional. Possible values: any number from 1
to 25
. Default value: 3
. .
hashnum
— The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1
to 25
. Default value: 6
. .
Type: ((), ()).
Splits a ASCII string into parts (shingles) of shinglesize
words each and returns the shingles with minimum and maximum word hashes, calculated by the function with the same input. Is case insensitive.
string
— String. .
shinglesize
— The size of a word shingle. Optional. Possible values: any number from 1
to 25
. Default value: 3
. .
hashnum
— The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1
to 25
. Default value: 6
. .
Type: ((), ()).
Splits a UTF-8 string into parts (shingles) of shinglesize
words each and returns the shingles with minimum and maximum word hashes, calculated by the function with the same input. Is case sensitive.
string
— String. .
shinglesize
— The size of a word shingle. Optional. Possible values: any number from 1
to 25
. Default value: 3
. .
hashnum
— The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1
to 25
. Default value: 6
. .
Type: ((), ()).
Splits a UTF-8 string into parts (shingles) of shinglesize
words each and returns the shingles with minimum and maximum word hashes, calculated by the function with the same input. Is case insensitive.
string
— String. .
shinglesize
— The size of a word shingle. Optional. Possible values: any number from 1
to 25
. Default value: 3
. .
hashnum
— The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1
to 25
. Default value: 6
. .
Type: ((), ()).
You can use functions described in this chapter to introspect and for query profiling.
Set the setting to 1.
ClickHouse saves profiler reports to the system table. Make sure the table and profiler are configured properly.
address_of_binary_instruction
() — Address of instruction in a running process.
Type: .
The function allows to process each individual element of the trace
array by the addressToLine
function. The result of this processing you see in the trace_source_code_lines
column of output.
address_of_binary_instruction
() — Address of instruction in a running process.
Type: .
The functions will split array to rows.
address_of_binary_instruction
() — Address of instruction in a running process.
Type: .
The function allows to process each individual element of the trace
array by the addressToSymbols
function. The result of this processing you see in the trace_symbols
column of output.
Converts a symbol that you can get using the function to the C++ function name.
symbol
() — Symbol from an object file.
Type: .
The function allows to process each individual element of the trace
array by the demangle
function. The result of this processing you see in the trace_functions
column of output.
Returns id of the thread, in which current is processed.
Current thread id. .
Emits trace log message to server log for each .
message
— Message that is emitted to server log. .
The reverse function of . If the IPv6 address has an invalid format, it throws exception.
string
— IP address. .
Type: .
.
Takes a UInt32
number. Interprets it as an IPv4 address in . Returns a FixedString(16)
value containing the IPv6 address in binary format. Examples:
Accepts an IPv4 and an UInt8 value containing the . Return a tuple with two IPv4 containing the lower range and the higher range of the subnet.
An alias to IPv4StringToNum()
that takes a string form of IPv4 address and returns value of type, which is binary equal to value returned by IPv4StringToNum()
.
Converts a string form of IPv6 address to type. If the IPv6 address has an invalid format, returns an empty value. Similar to function, which converts IPv6 address to binary format.
string
— IP address.
Type: .
string
— IP address. .
Type: .
string
— IP address. .
Type: .
Determines if an IP address is contained in a network represented in the notation. Returns 1
if true, or 0
otherwise.
address
— An IPv4 or IPv6 address. .
prefix
— An IPv4 or IPv6 network prefix in CIDR. .
Type: .
The following functions are based on designed for more complex JSON parsing requirements. The assumption 2 mentioned above still applies.
json
— with valid JSON.
a, b, c...
— Comma-separated indices or keys that specify the path to the inner field in a nested JSON object. Each argument can be either a to get the field by the key or an to get the N-th field (indexed from 1, negative integers count from the end). If not set, the whole JSON is parsed as the top-level object. Optional parameter.
Type: ().
json
— with valid JSON.
p, a, t, h
— Comma-separated indices or keys that specify the path to the inner field in a nested JSON object. Each argument can be either a to get the field by the key or an to get the N-th field (indexed from 1, negative integers count from the end). If not set, the whole JSON is parsed as the top-level object. Optional parameter.
Type: ((, ).
Serializes a value to its JSON representation. Various data types and nested structures are supported. 64-bit or bigger (like UInt64
or Int128
) are enclosed in quotes by default. controls this behavior. Special values NaN
and inf
are replaced with null
. Enable setting to show them. When serializing an value, the function outputs its name.
Type: .
The first example shows serialization of a . The second example shows some special values wrapped into a .
The aggregate function implements stochastic gradient descent method using linear model and MSE loss function. Uses evalMLMethod
to predict on new data.
The aggregate function implements stochastic gradient descent method for binary classification problem. Uses evalMLMethod
to predict on new data.
Arranges key:value
pairs into data type.
key
— The key part of the pair. , , , , , , , , .
value
— The value part of the pair. Arbitrary type, including and .
Type: .
data type
Arguments are or of two , where items in the first array represent keys, and the second array contains values for the each key. All key arrays should have same type, and all value arrays should contain items which are promoted to the one type (, or ). The common promoted type is used as a type for the result array.
Depending on the arguments returns one or , where the first array contains the sorted keys and the second array contains values.
Arguments are or of two , where items in the first array represent keys, and the second array contains values for the each key. All key arrays should have same type, and all value arrays should contain items which are promote to the one type (, or ). The common promoted type is used as a type for the result array.
Depending on the arguments returns one or , where the first array contains the sorted keys and the second array contains values.
Arguments are or two , where the first array represent keys, and the second array contains values for the each key.
keys
— Array of keys. ().
values
— Array of values. ().
max
— Maximum key value. Optional. .
map
— Map with integer keys. .
Depending on the arguments returns a or a of two : keys in sorted order, and values the corresponding keys.
map
— Map. .
Type: .
Can be optimized by enabling the setting. With optimize_functions_to_subcolumns = 1
the function reads only subcolumn instead of reading and processing the whole column data. The query SELECT mapKeys(m) FROM table
transforms to SELECT m.keys FROM table
.
map
— Map. .
Type: .
Can be optimized by enabling the setting. With optimize_functions_to_subcolumns = 1
the function reads only subcolumn instead of reading and processing the whole column data. The query SELECT mapValues(m) FROM table
transforms to SELECT m.values FROM table
.
map
— Map. .
Type: .
.
x
— The angle, in radians. Values from the interval: -∞ < x < +∞
. .
Type: .
.
x
— Hyperbolic cosine of angle. Values from the interval: 1 <= x < +∞
. .
Type: .
.
x
— The angle, in radians. Values from the interval: -∞ < x < +∞
. .
Type: .
.
x
— Hyperbolic sine of angle. Values from the interval: -∞ < x < +∞
. .
Type: .
.
x
— Hyperbolic tangent of angle. Values from the interval: –1 < x < 1
. .
Type: .
The calculates the angle in the Euclidean plane, given in radians, between the positive x axis and the ray to the point (x, y) ≠ (0, 0)
.
y
— y-coordinate of the point through which the ray passes. .
x
— x-coordinate of the point through which the ray passes. .
Type: .
Calculates the length of the hypotenuse of a right-angle triangle. The avoids problems that occur when squaring very large or very small numbers.
x
— The first cathetus of a right-angle triangle. .
y
— The second cathetus of a right-angle triangle. .
Type: .
Calculates log(1+x)
. The log1p(x)
is more accurate than log(1+x)
for small values of x.
x
— Values from the interval: -1 < x < +∞
. .
Type: .
Checks whether the argument is .
Checks whether the argument is .
Results in an equivalent non-Nullable
value for a type. In case the original value is NULL
the result is undetermined. See also ifNull
and coalesce
functions.
Gets a named value from the section of the server configuration.
name
— Name to retrieve from the macros
section. .
Type: .
expr
— Expression resulting in a type value. All the backslashes must be escaped in the resulting value.
Type: .
For arguments the funtion returns the string length + 9 (terminating zero + length).
A constant expression means an expression whose resulting value is known at the query analysis (i.e. before execution). For example, expressions over are constant expressions.
Type: .
x
— Value to be checked for infinity. Type: .
y
— Fallback value. Type: .
You can get similar result by using : isFinite(x) ? x : y
.
The rows order used during the calculation of neighbor
can differ from the order of rows returned to the user. To prevent that you can make a subquery with and call the function from outside the subquery.
offset
— The number of rows forwards or backwards from the current row of column
. .
The rows order used during the calculation of runningDifference
can differ from the order of rows returned to the user. To prevent that you can make a subquery with and call the function from outside the subquery.
Same as for , the difference is the value of the first row, returned the value of the first row, and each subsequent row returns the difference from the previous row.
start
— A column with the start time of events. , , or .
end
— A column with the end time of events. , , or .
Type:
Returns the number of fields in .
ᴺᵁᴸᴸ
for .
ᴺᵁᴸᴸ
for .
Type: .
Here is the example of test data from the table .
Used for internal implementation of .
Returns amount of remaining space on the filesystem where the files of the databases located. It is always smaller than total free space () because some space is reserved for OS.
Type: .
Type: .
Returns the capacity of the filesystem in bytes. For evaluation, the to the data directory must be configured.
Type: .
Calculates result of aggregate function based on single value. It is intended to use this function to initialize aggregate functions with combinator . You can create states of aggregate functions and insert them to columns of type or use initialized aggregates as default values.
aggregate_function
— Name of the aggregation function to initialize. .
Takes state of aggregate function. Returns result of aggregation (or finalized state when using combinator).
state
— State of aggregation. .
agg_state
— State of the aggregate function. .
grouping
— Grouping key. Optional. The state of the function is reset if the grouping
value is changed. It can be any of the for which the equality operator is defined.
The subquery generates sumState
for every number from 0
to 9
. sumState
returns the state of the function that contains the sum of a single number.
The function lets you extract data from the table the same way as from a .
Gets data from tables using the specified join key.
join_storage_table_name
— an indicates where search is performed. The identifier is searched in the default database (see parameter default_database
in the config file). To override the default database, use the USE db_name
or specify the database and the table through the separator db_name.db_table
, see the example.
If certain does not exist in source table then 0
or null
will be returned based on setting.
More info about join_use_nulls
in .
Evaluate external catboost model. is an open-source gradient boosting library developed by Yandex for machine learing. Accepts a path to a catboost model and model arguments (features). Returns Float64.
Before evaluating catboost models, the libcatboostmodel.<so|dylib>
library must be made available. See how to compile it.
See for how to train catboost models from a training data set.
Returns the current value of a .
custom_setting
— The setting name. .
Checks whether the value is out of its (or specified) precision.
d
— value. .
p
— precision. Optional. If omitted, the initial precision of the first argument is used. Using of this paratemer could be helpful for data extraction to another DBMS or file. .
x
— or value.
Type: .
For Decimal
values takes into account their scales: calculates result over underlying integer type which is (value * scale)
. For example: countDigits(42) = 2
, countDigits(42.000) = 5
, countDigits(0.04200) = 4
. I.e. you may check decimal overflow for Decimal64
with countDecimal(x) > 18
. It's a slow variant of .
Type: .
Returns TCP port number listened by this server. If it is executed in the context of a distributed table, then it generates a normal column, otherwise it produces a constant value.
Type: .
x
— resulting in any of the . The resulting value is discarded, but the expression itself if used for bypassing if the function is called multiple times in one query. Optional parameter.
Type: .
Functions for and are described separately.
Replaces the first occurrence of the substring matching the regular expression ‘pattern’ in ‘haystack‘ by the ‘replacement‘ string. ‘pattern‘ must be a constant . ‘replacement’ must be a plain constant string or a constant string containing substitutions \0-\9
. Substitutions \1-\9
correspond to the 1st to 9th capturing group (submatch), substitution \0
corresponds to the entire match. To use a verbatim \
character in the ‘pattern‘ or ‘replacement‘ string, escape it using \
. Also keep in mind that string literals require an extra escaping.
The function adds a backslash before some predefined characters in the string. Predefined characters: \0
, \\
, |
, (
, )
, ^
, $
, .
, [
, ]
, ?
, *
, +
, {
, :
, -
. This implementation slightly differs from re2::RE2::QuoteMeta. It escapes zero byte as \0
instead of \x00
and it escapes only required characters. For more information, see the link:
expression
— A number to be rounded. Can be any returning the numeric .
expression
— A number to be rounded. Can be any returning the numeric .
Functions for and are described separately.
For a case-insensitive search, use the function .
haystack
— String, in which substring will to be searched. .
needle
— Substring to be searched. .
start_pos
– Position of the first character in the string to start search. . Optional.
The same phrase in Russian contains characters which can’t be represented using a single byte. The function returns some unexpected result (use function for multi-byte encoded text):
The same as returns the position (in bytes) of the found substring in the string, starting from 1. Use the function for a case-insensitive search.
haystack
— String, in which substring will to be searched. .
needle
— Substring to be searched. .
start_pos
— Optional parameter, position of the first character in the string to start search. .
For a case-insensitive search, use the function .
haystack
— String, in which substring will to be searched. .
needle
— Substring to be searched. .
start_pos
— Optional parameter, position of the first character in the string to start search.
The same as , but is case-insensitive. Returns the position (in Unicode points) of the found substring in the string, starting from 1.
haystack
— String, in which substring will to be searched. .
needle
— Substring to be searched. .
start_pos
— Optional parameter, position of the first character in the string to start search.
The same as but returns Array
of positions (in bytes) of the found corresponding substrings in the string. Positions are indexed starting from 1.
For search in UTF-8, use the function .
haystack
— String, in which substring will to be searched. .
needle
— Substring to be searched. .
Checks whether the string matches the regular expression pattern
in re2
syntax. Re2
has a more limited than Perl regular expressions.
Matching is based on UTF-8, e.g. .
matches the Unicode code point ¥
which is represented in UTF-8 using two bytes. The regular expression must not contain null bytes. If the haystack or pattern contain a sequence of bytes that are not valid UTF-8, then the behavior is undefined. No automatic Unicode normalization is performed, if you need it you can use the functions for that.
The same as match
, but returns 0 if none of the regular expressions are matched and 1 if any of the patterns matches. It uses library. For patterns to search substrings in a string, it is better to use multiSearchAny
since it works much faster.
The same as multiMatchAny
, but returns 1 if any pattern matches the haystack within a constant . This function relies on the experimental feature of library, and can be slow for some corner cases. The performance depends on the edit distance value and patterns used, but it's always more expensive compared to a non-fuzzy variants.
extractAllGroupsHorizontal
function is slower than .
haystack
— Input string. Type: .
pattern
— Regular expression with . Must contain groups, each group enclosed in parentheses. If pattern
contains no groups, an exception is thrown. Type: .
Type: .
haystack
— Input string. Type: .
pattern
— Regular expression with . Must contain groups, each group enclosed in parentheses. If pattern
contains no groups, an exception is thrown. Type: .
Type: .
Matching is based on UTF-8, e.g. _
matches the Unicode code point ¥
which is represented in UTF-8 using two bytes. If the haystack or pattern contain a sequence of bytes that are not valid UTF-8, then the behavior is undefined. No automatic Unicode normalization is performed, if you need it you can use the functions for that.
Case insensitive variant of function. You can use ILIKE
operator instead of the ilike
function.
haystack
— Input string. .
For a case-insensitive search, use or functions.
haystack
— The string to search in. .
needle
— The substring to search for. .
start_pos
– Position of the first character in the string to start search. Optional. .
Type: .
haystack
— The string to search in. .
needle
— The substring to search for. .
start_pos
— Position of the first character in the string to start search. Optional. .
Type: .
haystack
— The string to search in. .
needle
— The substring to search for. .
start_pos
— Position of the first character in the string to start search. Optional. .
Type: .
haystack
— The string to search in. .
pattern
— The regular expression with . .
Type: .
separator
— The separator which should contain exactly one character. .
s
— The string to split. .
Type: ().
separator
— The separator. .
s
— The string to split. .
Type: ().
s
— The string to split. .
Type: ().
text
— or .
regexp
— Regular expression. Constant. or .
Type: .
Functions for and in strings are described separately.
The function also works for or .
x
— Input value. .
Type: .
The function also works for or .
x
— Input value. .
Type: .
string
— Input string that needs to be padded. .
length
— The length of the resulting string. . If the value is less than the input string length, then the input string is returned as-is.
pad_string
— The string to pad the input string with. . Optional. If not specified, then the input string is padded with spaces.
Type: .
Pads the current string from the left with spaces or a specified string (multiple times, if needed) until the resulting string reaches the given length. Similarly to the MySQL LPAD
function. While in the function the length is measured in bytes, here in the leftPadUTF8
function it is measured in code points.
string
— Input string that needs to be padded. .
length
— The length of the resulting string. . If the value is less than the input string length, then the input string is returned as-is.
pad_string
— The string to pad the input string with. . Optional. If not specified, then the input string is padded with spaces.
Type: .
string
— Input string that needs to be padded. .
length
— The length of the resulting string. . If the value is less than the input string length, then the input string is returned as-is.
pad_string
— The string to pad the input string with. . Optional. If not specified, then the input string is padded with spaces.
Type: .
Pads the current string from the right with spaces or a specified string (multiple times, if needed) until the resulting string reaches the given length. Similarly to the MySQL RPAD
function. While in the function the length is measured in bytes, here in the rightPadUTF8
function it is measured in code points.
string
— Input string that needs to be padded. .
length
— The length of the resulting string. . If the value is less than the input string length, then the input string is returned as-is.
pad_string
— The string to pad the input string with. . Optional. If not specified, then the input string is padded with spaces.
Type: .
input_string
— Any set of bytes represented as the data type object.
s
— The string to repeat. .
n
— The number of times to repeat the string. .
Same as , the difference is that you need to ensure that concat(s1, s2, ...) → sn
is injective, it will be used for optimization of GROUP BY.
Accepts a String and encodes it using encoding scheme using "Bitcoin" alphabet.
plaintext
— column or constant.
Type: .
trim_character
— Specified characters for trim. .
input_string
— String for trim. .
input_string
— string to trim. .
input_string
— string to trim. .
input_string
— string to trim. .
x
— Sequence of characters. .
Type: .
x
— Sequence of characters. .
Type: .
Converts a string to , assuming the string contains a set of bytes that make up a UTF-8 encoded text.
words
— Input string that contains UTF-8 encoded text. .
Type: .
Converts a string to , assuming the string contains a set of bytes that make up a UTF-8 encoded text.
words
— Input string that contains UTF-8 encoded text. .
Type: .
Converts a string to , assuming the string contains a set of bytes that make up a UTF-8 encoded text.
words
— Input string that contains UTF-8 encoded text. .
Type: .
Converts a string to , assuming the string contains a set of bytes that make up a UTF-8 encoded text.
words
— Input string that contains UTF-8 encoded text. .
Type: .
x
— The sequence of characters. .
Type: .
x
— A sequence of characters. .
Type: .
x
— input text. .
Type: .
The first example contains several tags and a comment and also shows whitespace processing. The second example shows CDATA
and script
tag processing. In the third example text is extracted from the full HTML response received by the function.
Performs syntactic substitution of elements in the call location.
x
— A tuple
function, column, or tuple of elements. .
Returns the between two tuples of the same size.
tuple1
— First tuple. .
tuple2
— Second tuple. .
Type: The result type is calculed the same way it is for , based on the number of elements in the input tuples.
Can be used with functions for detection of semi-duplicate strings:
ClickHouse has the .
Converts an input value to the data type. This function family includes:
expr
— returning a number or a string with the decimal representation of a number. Binary, octal, and hexadecimal representations of numbers are not supported. Leading zeroes are stripped.
Functions use , meaning they truncate fractional digits of numbers.
The behavior of functions for the arguments is undefined. Remember about , when using the functions.
Converts an input value to the data type. This function family includes:
expr
— returning a number or a string with the decimal representation of a number. Binary, octal, and hexadecimal representations of numbers are not supported. Leading zeroes are stripped.
Functions use , meaning they truncate fractional digits of numbers.
The behavior of functions for negative arguments and for the arguments is undefined. If you pass a string with a negative number, for example '-32'
, ClickHouse raises an exception. Remember about , when using the functions.
Converts the argument to the data type. If the value is outside the range, toDate32
returns the border values supported by Date32
. If the argument has type, borders of Date
are taken into account.
expr
— The value. , or .
Type: .
The same as but returns the min value of if an invalid argument is received.
The same as but returns NULL
if an invalid argument is received.
Converts the argument to the data type. If the value is outside the range, toDate32OrDefault
returns the lower border value supported by Date32
. If the argument has type, borders of Date
are taken into account. Returns default value if an invalid argument is received.
Converts the argument to the data type.
expr
— The value. , , or .
Type: .
Converts value
to the data type with precision of S
. The value
can be a number or a string. The S
(scale) parameter specifies the number of decimal places.
Converts an input string to a data type value. This family of functions includes:
expr
— , returns a value in the data type. ClickHouse expects the textual representation of the decimal number. For example, '1.111'
.
Converts an input string to a data type value. This family of functions includes:
expr
— , returns a value in the data type. ClickHouse expects the textual representation of the decimal number. For example, '1.111'
.
Converts an input value to the data type. This family of functions includes:
expr
— , returns a value in the data type. ClickHouse expects the textual representation of the decimal number. For example, '1.111'
.
fixed_string
— Big-endian byte string. .
The UUID type value. .
type
— Destination type. .
Converts an input value to the specified data type. Unlike the function, CAST
tries to present the same value using the new data type. If the conversion can not be done then an exception is raised. Several syntax variants are supported.
T
— The name of the target data type. .
Conversion to FixedString(N) only works for arguments of type or .
Type conversion to and back is supported.
setting
The difference from is that accurateCast
does not allow overflow of numeric types during cast if type value x
does not fit the bounds of type T
. For example, accurateCast(-1, 'UInt8')
throws an exception.
Converts input value x
to the specified data type T
. Always returns type and returns if the casted value is not representable in the target type.
Converts a Number type argument to an data type.
Converts a date and time in the representation to data type.
The function parses , , ClickHouse’s and some other date and time formats.
time_string
— String containing a date and time to convert. .
time_zone
— Time zone. The function parses time_string
according to the time zone. .
A string containing 9..10 digit .
This function behaves like for ISO date formats, e.g. YYYY-MM-DD hh:mm:ss
, and other date formats where the month and date components can be unambiguously extracted, e.g. YYYYMMDDhhmmss
, YYYY-MM
, DD hh
, or YYYY-MM-DD hh:mm:ss ±h:mm
. If the month and the date components cannot be unambiguously extracted, e.g. MM/DD/YYYY
, MM-DD-YYYY
, or MM-DD-YY
, it prefers the US date format instead of DD/MM/YYYY
, DD-MM-YYYY
, or DD-MM-YY
. As an exception from the latter, if the month is bigger than 12 and smaller or equal than 31, this function falls back to the behavior of , e.g. 15/08/2020
is parsed as 2020-08-15
.
Same as for except that it returns NULL
when it encounters a date format that cannot be processed.
Same as for except that it returns zero date or zero date time when it encounters a date format that cannot be processed.
Same as function except that it returns NULL
when it encounters a date format that cannot be processed.
Same as function except that it returns zero date (1970-01-01
) or zero date with time (1970-01-01 00:00:00
) when it encounters a date format that cannot be processed.
Same as function but also parse milliseconds and microseconds and returns data type.
time_string
— String containing a date or date with time to convert. .
precision
— Required precision. 3
— for milliseconds, 6
— for microseconds. Default — 3
. Optional. .
time_zone
— . The function parses time_string
according to the timezone. Optional. .
time_string
converted to the data type.
Same as for , except that this function prefers US date format (MM/DD/YYYY
etc.) in case of ambiguity.
Same as for except that it returns NULL
when it encounters a date format that cannot be processed.
Same as for except that it returns zero date or zero date time when it encounters a date format that cannot be processed.
Same as for , except that this function prefers US date format (MM/DD/YYYY
etc.) in case of ambiguity and returns NULL
when it encounters a date format that cannot be processed.
Same as for , except that this function prefers US date format (MM/DD/YYYY
etc.) in case of ambiguity and returns zero date or zero date time when it encounters a date format that cannot be processed.
Converts input parameter to the version of same data type.
To convert data from the LowCardinality
data type use the function. For example, CAST(x as String)
.
expr
— resulting in one of the .
format
— Text format. For example, , .
format
— Text format. For example, , .
url
— URL. Type: .
url
— URL. Type: .
Returns the part of the domain that includes top-level subdomains up to the first significant subdomain. Accepts custom name.
URL
— URL. .
TLD
— Custom TLD list name. .
Type: .
.
URL
— URL. .
TLD
— Custom TLD list name. .
Type: .
.
URL
— URL. .
TLD
— Custom TLD list name. .
Type: .
.
url
— URL. .
url
— URL. .
name
— name of URL parameter. or of Strings.
Generates the of .
x
— resulting in any of the . The resulting value is discarded, but the expression itself if used for bypassing if the function is called multiple times in one query. Optional parameter.
Accepts string
containing 36 characters in the format xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
, and returns a as its binary representation, with its format optionally specified by variant
(Big-endian
by default).
string
— String of 36 characters or FixedString(36). .
variant
— Integer, representing a variant as specified by . 1 = Big-endian
(default), 2 = Microsoft
.
binary
— as a binary representation of a UUID.
variant
— Integer, representing a variant as specified by . 1 = Big-endian
(default), 2 = Microsoft
.