
Functions


Last updated 2 years ago

There are at least* two types of functions: regular functions (simply called “functions”) and aggregate functions. These are completely different concepts. Regular functions work as if they are applied to each row separately: for each row, the result does not depend on the other rows. Aggregate functions accumulate a set of values from multiple rows, so their result depends on the entire set of rows.

In this section we discuss regular functions. For aggregate functions, see the section “Aggregate Functions”.

* There is a third type of function, to which the ‘arrayJoin’ function belongs; table functions can also be mentioned separately.

ARITHMETIC

For all arithmetic functions, the result type is calculated as the smallest number type that the result fits in, if there is such a type. The minimum is taken simultaneously based on the number of bits, whether the type is signed, and whether it is floating-point. If no type has enough bits, the type with the largest number of bits is used.

Example

SELECT toTypeName(0), toTypeName(0 + 0), toTypeName(0 + 0 + 0), toTypeName(0 + 0 + 0 + 0)
┌─toTypeName(0)─┬─toTypeName(plus(0, 0))─┬─toTypeName(plus(plus(0, 0), 0))─┬─toTypeName(plus(plus(plus(0, 0), 0), 0))─┐
│ UInt8         │ UInt16                 │ UInt32                          │ UInt64                                   │
└───────────────┴────────────────────────┴─────────────────────────────────┴──────────────────────────────────────────┘

Arithmetic functions work for any pair of types from UInt8, UInt16, UInt32, UInt64, Int8, Int16, Int32, Int64, Float32, or Float64.

Overflow is produced the same way as in C++.
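The promotion and overflow rules above can be checked with a short query. This is a minimal sketch that assumes a ClickHouse-compatible engine (the function names on this page match ClickHouse's); the column aliases are illustrative only.

```sql
-- Result types widen to fit the result.
SELECT
    toTypeName(toUInt8(1) + toUInt8(1)) AS u8_plus_u8,     -- UInt16
    toTypeName(toUInt8(1) + toInt8(1))  AS u8_plus_i8,     -- Int16 (signed because one input is signed)
    toTypeName(toUInt8(1) + 1.5)        AS u8_plus_float;  -- Float64

-- At 64 bits there is no wider integer type to promote to,
-- so the sum wraps around as unsigned arithmetic does in C++.
SELECT toUInt64(18446744073709551615) + toUInt64(1) AS wrapped;  -- 0
```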

plus(a, b), a + b operator

Calculates the sum of the numbers. You can also add integer numbers with a date or date and time. In the case of a date, adding an integer means adding the corresponding number of days. For a date with time, it means adding the corresponding number of seconds.

Example

"plus(1,2) = 3"
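Adding an integer to a date or a date with time can be sketched as follows, assuming a ClickHouse-compatible engine; the literal dates and aliases are illustrative.

```sql
SELECT
    toDate('2024-01-01') + 7               AS date_plus_days,      -- 2024-01-08 (7 days later)
    toDateTime('2024-01-01 00:00:00') + 90 AS datetime_plus_secs;  -- 2024-01-01 00:01:30 (90 seconds later)
```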

minus(a, b), a - b operator

Calculates the difference. The result is always signed.

You can also subtract integer numbers from a date or a date with time. The idea is the same – see above for ‘plus’.

Example

"minus(5,2) = 3"

multiply(a, b), a * b operator

Calculates the product of the numbers.

Example

"multiply(50,2) = 100"

divide(a, b), a / b operator

Calculates the quotient of the numbers. The result type is always a floating-point type. It is not integer division. For integer division, use the ‘intDiv’ function. When dividing by zero you get ‘inf’, ‘-inf’, or ‘nan’.

Example

"divide(50,2) = 2.5e+01"

intDiv(a, b)

Calculates the integer quotient of the numbers, rounding toward zero (that is, rounding down by absolute value). An exception is thrown when dividing by zero or when dividing a minimal negative number by minus one.

Example

"intDiv(10, -2) = -5"

intDivOrZero(a, b)

Differs from ‘intDiv’ in that it returns zero when dividing by zero or when dividing a minimal negative number by minus one.

Example

"intDivOrZero(10, -2) = -5"
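The difference between the three division functions is easiest to see side by side. The query below is a sketch under the assumption of a ClickHouse-compatible engine; the aliases are illustrative.

```sql
SELECT
    divide(7, 2)       AS float_quotient,  -- 3.5: always a floating-point result
    intDiv(7, 2)       AS int_quotient,    -- 3: the fractional part is dropped
    intDiv(-7, 2)      AS negative_case,   -- truncated toward zero, per the description above
    intDivOrZero(7, 0) AS guarded;         -- 0 instead of an exception
```

Prefer intDivOrZero when the divisor comes from data that may contain zeros and a failed query is worse than a zero in the result.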

modulo(a, b), a % b operator

Calculates the remainder when dividing a by b. The result type is an integer if both inputs are integers. If one of the inputs is a floating-point number, the result is a floating-point number. The remainder is computed as in C++: truncated division is used for negative numbers. An exception is thrown when dividing by zero or when dividing a minimal negative number by minus one.

Example

"modulo(10, 3) = 1"

moduloOrZero(a, b)

Differs from ‘modulo’ in that it returns zero when the divisor is zero.

Example

"moduloOrZero(10, 5) = 0"
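A short sketch of the remainder rules, assuming a ClickHouse-compatible engine; the aliases are illustrative.

```sql
SELECT
    modulo(10, 3)       AS int_remainder,    -- 1
    modulo(-10, 3)      AS negative_case,    -- -1: the sign follows the dividend, as in C++
    modulo(10.5, 3)     AS float_remainder,  -- floating-point result because one input is a float
    moduloOrZero(10, 0) AS guarded;          -- 0 instead of an exception
```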

negate(a), -a operator

Calculates a number with the reverse sign. The result is always signed.

Example

"negate(20) = -20"

abs(a)

Calculates the absolute value of the number (a). That is, if a < 0, it returns -a. For unsigned types it does nothing. For signed integer types, it returns an unsigned number.

Example

"abs(-2) = 2"
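The reason abs returns an unsigned type for signed integers shows up at the edge of the range. The query below is a sketch assuming a ClickHouse-compatible engine; the aliases are illustrative.

```sql
SELECT
    abs(toInt8(-128))              AS magnitude,       -- 128, which would not fit in Int8
    toTypeName(abs(toInt8(-128)))  AS magnitude_type,  -- UInt8
    toTypeName(negate(toUInt8(5))) AS negated_type;    -- a signed type, since negate always returns a signed result
```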

gcd(a, b)

Returns the greatest common divisor of the numbers. An exception is thrown when dividing by zero or when dividing a minimal negative number by minus one.

Example

"gcd(27,18) = 9"

lcm(a, b)

Returns the least common multiple of the numbers. An exception is thrown when dividing by zero or when dividing a minimal negative number by minus one.

Example

"lcm(27,18) = 54"

ARRAY

empty

Checks whether the input array is empty.

Syntax

empty([x])

An array is considered empty if it does not contain any elements.

Arguments

  • [x] – Input array.

Returned value

  • Returns 1 for an empty array or 0 for a non-empty array.

Example

Query:

SELECT empty([]);

Result:

┌─empty(array())─┐
│              1 │
└────────────────┘

notEmpty

Checks whether the input array is non-empty.

Syntax

notEmpty([x])

An array is considered non-empty if it contains at least one element.

Arguments

  • [x] – Input array.

Returned value

  • Returns 1 for a non-empty array or 0 for an empty array.

Example

Query:

SELECT notEmpty([1,2]);

Result:

┌─notEmpty([1, 2])─┐
│                1 │
└──────────────────┘

length

Returns the number of items in the array. The result type is UInt64. The function also works for strings.

Syntax

length(string)

Example

length('ABC Corporation')

emptyArrayUInt8, emptyArrayUInt16, emptyArrayUInt32, emptyArrayUInt64

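Accepts zero arguments and returns an empty array of the appropriate type. The same applies to emptyArrayInt8, emptyArrayInt16, emptyArrayInt32, emptyArrayInt64, emptyArrayFloat32, emptyArrayFloat64, emptyArrayDate, emptyArrayDateTime, and emptyArrayString.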

emptyArrayToSingle

Accepts an empty array and returns a one-element array whose single element is the default value of the array's element type.
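The emptyArray* constructors and emptyArrayToSingle can be combined as in this sketch, which assumes a ClickHouse-compatible engine; the aliases are illustrative.

```sql
SELECT
    emptyArrayUInt8()                      AS no_numbers,      -- []
    emptyArrayToSingle(emptyArrayUInt8())  AS default_number,  -- [0]
    emptyArrayToSingle(emptyArrayString()) AS default_string;  -- ['']
```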

range(end), range([start, ] end [, step])

Returns an array of UInt numbers from start to end - 1 by step.

Syntax

range([start, ] end [, step])

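Arguments

  • start – The first element of the array. Optional; defaults to 0.

  • end – The number before which the array is constructed. Required.

  • step – The increment between consecutive elements. Optional; defaults to 1.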

Returned value

  • Array of UInt numbers from start to end - 1 by step.

Implementation details

  • All arguments must be positive values: start, end, step are UInt data types, as well as elements of the returned array.

Examples

Query:

SELECT range(5), range(1, 5), range(1, 5, 2);

Result:

┌─range(5)────┬─range(1, 5)─┬─range(1, 5, 2)─┐
│ [0,1,2,3,4] │ [1,2,3,4]   │ [1,3]          │
└─────────────┴─────────────┴────────────────┘

array(x1, …), operator [x1, …]

Creates an array from the function arguments. The arguments must be constants and have types that share a smallest common type. At least one argument must be passed, because otherwise it isn’t clear which type of array to create. That is, you can’t use this function to create an empty array (to do that, use the ‘emptyArray*’ function described above). Returns an ‘Array(T)’ type result, where ‘T’ is the smallest common type of the passed arguments.

Example

SELECT array(1,2,3);
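The “smallest common type” rule can be observed with toTypeName. This is a sketch assuming a ClickHouse-compatible engine; the aliases are illustrative.

```sql
SELECT
    toTypeName(array(1, 2, 3)) AS small_ints,   -- Array(UInt8)
    toTypeName(array(1, 256))  AS wider_ints,   -- Array(UInt16): 256 does not fit in UInt8
    toTypeName([1, 2.5])       AS with_float;   -- Array(Float64)
```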

arrayConcat

Combines arrays passed as arguments.

arrayConcat(arrays)

Arguments

  • arrays – Arbitrary number of arguments of Array type.

Example

SELECT arrayConcat([1, 2], [3, 4], [5, 6]) AS res
┌─res───────────┐
│ [1,2,3,4,5,6] │
└───────────────┘

has(arr, elem)

Checks whether the ‘arr’ array has the ‘elem’ element. Returns 0 if the element is not in the array, or 1 if it is.

NULL is processed as a value.

SELECT has([1, 2, NULL], NULL)
┌─has([1, 2, NULL], NULL)─┐
│                       1 │
└─────────────────────────┘

hasAll

Checks whether one array is a subset of another.

hasAll(set, subset)

Arguments

  • set – Array of any type with a set of elements.

  • subset – Array of any type with elements that should be tested to be a subset of set.

Return values

  • 1, if set contains all of the elements from subset.

  • 0, otherwise.

Peculiar properties

  • An empty array is a subset of any array.

  • NULL is processed as a value.

  • The order of values in both arrays does not matter.

Examples

SELECT hasAll([], []) returns 1.

SELECT hasAll([1, Null], [Null]) returns 1.

SELECT hasAll([1.0, 2, 3, 4], [1, 3]) returns 1.

SELECT hasAll(['a', 'b'], ['a']) returns 1.

SELECT hasAll([1], ['a']) returns 0.

SELECT hasAll([[1, 2], [3, 4]], [[1, 2], [3, 5]]) returns 0.

hasAny

Checks whether two arrays have at least one element in common.

hasAny(array1, array2)

Arguments

  • array1 – Array of any type with a set of elements.

  • array2 – Array of any type with a set of elements.

Return values

  • 1, if array1 and array2 have at least one element in common.

  • 0, otherwise.

Peculiar properties

  • NULL is processed as a value.

  • The order of values in both arrays does not matter.

Examples

SELECT hasAny([1], []) returns 0.

SELECT hasAny([Null], [Null, 1]) returns 1.

SELECT hasAny([-128, 1., 512], [1]) returns 1.

SELECT hasAny([[1, 2], [3, 4]], ['a', 'c']) returns 0.

SELECT hasAll([[1, 2], [3, 4]], [[1, 2], [1, 2]]) returns 1.

hasSubstr

Checks whether all the elements of array2 appear in array1 in exactly the same order. The function therefore returns 1 if and only if array1 = prefix + array2 + suffix.

hasSubstr(array1, array2)

In other words, the function checks whether all the elements of array2 are contained in array1, like the hasAll function. In addition, it checks that the elements appear in the same order in both array1 and array2.

For example:

  • hasSubstr([1,2,3,4], [2,3]) returns 1. However, hasSubstr([1,2,3,4], [3,2]) will return 0.

  • hasSubstr([1,2,3,4], [1,2,3]) returns 1. However, hasSubstr([1,2,3,4], [1,2,4]) will return 0.

Arguments

  • array1 – Array of any type with a set of elements.

  • array2 – Array of any type with a set of elements.

Return values

  • 1, if array1 contains array2.

  • 0, otherwise.

Peculiar properties

  • The function will return 1 if array2 is empty.

  • NULL is processed as a value. In other words, hasSubstr([1, 2, NULL, 3, 4], [2,3]) returns 0, whereas hasSubstr([1, 2, NULL, 3, 4], [2,NULL,3]) returns 1.

  • The order of values in both arrays does matter.

Examples

SELECT hasSubstr([], []) returns 1.

SELECT hasSubstr([1, Null], [Null]) returns 1.

SELECT hasSubstr([1.0, 2, 3, 4], [1, 3]) returns 0.

SELECT hasSubstr(['a', 'b'], ['a']) returns 1.

SELECT hasSubstr(['a', 'b' , 'c'], ['a', 'b']) returns 1.

SELECT hasSubstr(['a', 'b' , 'c'], ['a', 'c']) returns 0.

SELECT hasSubstr([[1, 2], [3, 4], [5, 6]], [[1, 2], [3, 4]]) returns 1.

indexOf(arr, x)

Returns the index of the first ‘x’ element (starting from 1) if it is in the array, or 0 if it is not.

Example:

SELECT indexOf([1, 3, NULL, NULL], NULL)
┌─indexOf([1, 3, NULL, NULL], NULL)─┐
│                                 3 │
└───────────────────────────────────┘

Elements set to NULL are handled as normal values.
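The 1-based indexing and the 0 result for a missing element can be sketched as follows, assuming a ClickHouse-compatible engine; the aliases are illustrative.

```sql
SELECT
    indexOf(['a', 'b', 'c'], 'b') AS found,      -- 2: positions start at 1
    indexOf(['a', 'b', 'b'], 'b') AS first_hit,  -- 2: only the first match is reported
    indexOf(['a', 'b', 'c'], 'z') AS missing;    -- 0: not in the array
```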

arrayCount([func,] arr1, …)

Returns the number of elements for which func(arr1[i], …, arrN[i]) returns something other than 0. If func is not specified, it returns the number of non-zero elements in the array.

Example

arrayCount(lambda(tuple(x, y), equals(x, y)), [1, 2, 3], [1, 5, 3]) = 2
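The same calls can be written with the arrow form of a lambda, which is usually easier to read. This is a sketch assuming a ClickHouse-compatible engine; the aliases are illustrative.

```sql
SELECT
    arrayCount(x -> x > 1, [1, 2, 3])                  AS greater_than_one,  -- 2
    arrayCount((x, y) -> x = y, [1, 2, 3], [1, 5, 3])  AS equal_pairs,       -- 2: positions 1 and 3 match
    arrayCount([0, 1, 2, 0])                           AS non_zero;          -- 2: no func, counts non-zero elements
```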

countEqual(arr, x)

Returns the number of elements in the array equal to x. Equivalent to arrayCount(elem -> elem = x, arr).

NULL elements are handled as separate values.

Example

SELECT countEqual([1, 2, NULL, NULL], NULL)
┌─countEqual([1, 2, NULL, NULL], NULL)─┐
│                                    2 │
└──────────────────────────────────────┘

arrayEnumerate(arr)

Returns the array [1, 2, 3, …, length(arr)].

This function is normally used with ARRAY JOIN. It allows counting something just once for each array after applying ARRAY JOIN. Example:

SELECT
    count() AS Reaches,
    countIf(num = 1) AS Hits
FROM test.hits
ARRAY JOIN
    GoalsReached,
    arrayEnumerate(GoalsReached) AS num
WHERE CounterID = 160656
LIMIT 10
┌─Reaches─┬──Hits─┐
│   95606 │ 31406 │
└─────────┴───────┘

In this example, Reaches is the number of conversions (the rows produced by applying ARRAY JOIN), and Hits is the number of pageviews (the rows before ARRAY JOIN). In this particular case, you can get the same result in an easier way:

SELECT
    sum(length(GoalsReached)) AS Reaches,
    count() AS Hits
FROM test.hits
WHERE (CounterID = 160656) AND notEmpty(GoalsReached)
┌─Reaches─┬──Hits─┐
│   95606 │ 31406 │
└─────────┴───────┘

This function can also be used in higher-order functions. For example, you can use it to get array indexes for elements that match a condition.
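Getting the indexes of matching elements can be sketched by passing arrayEnumerate(arr) as the first array to arrayFilter, so that the filter returns positions rather than values. The query assumes a ClickHouse-compatible engine; the sample array and alias are illustrative.

```sql
WITH [5, 20, 7, 30] AS arr
SELECT
    -- i iterates over the positions [1, 2, 3, 4], x over the values of arr;
    -- arrayFilter keeps the positions whose value satisfies the condition.
    arrayFilter((i, x) -> x > 10, arrayEnumerate(arr), arr) AS matching_indexes;  -- [2,4]
```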

arrayEnumerateUniq(arr, …)

Returns an array the same size as the source array, indicating for each element what its position is among elements with the same value. For example: arrayEnumerateUniq([10, 20, 10, 30]) = [1, 1, 2, 1].

This function is useful when using ARRAY JOIN and aggregation of array elements. Example:

SELECT
    Goals.ID AS GoalID,
    sum(Sign) AS Reaches,
    sumIf(Sign, num = 1) AS Visits
FROM test.visits
ARRAY JOIN
    Goals,
    arrayEnumerateUniq(Goals.ID) AS num
WHERE CounterID = 160656
GROUP BY GoalID
ORDER BY Reaches DESC
LIMIT 10
┌──GoalID─┬─Reaches─┬─Visits─┐
│   53225 │    3214 │   1097 │
│ 2825062 │    3188 │   1097 │
│   56600 │    2803 │    488 │
│ 1989037 │    2401 │    365 │
│ 2830064 │    2396 │    910 │
│ 1113562 │    2372 │    373 │
│ 3270895 │    2262 │    812 │
│ 1084657 │    2262 │    345 │
│   56599 │    2260 │    799 │
│ 3271094 │    2256 │    812 │
└─────────┴─────────┴────────┘

In this example, the query calculates, for each goal ID, the number of conversions (each element in the Goals nested data structure is a goal that was reached, which we refer to as a conversion) and the number of sessions. Without ARRAY JOIN, we would have counted the number of sessions as sum(Sign). But in this particular case, the rows were multiplied by the nested Goals structure, so in order to count each session only once, we apply a condition to the value of the arrayEnumerateUniq(Goals.ID) function.

The arrayEnumerateUniq function can take multiple arrays of the same size as arguments. In this case, uniqueness is considered for tuples of elements in the same positions in all the arrays.

SELECT arrayEnumerateUniq([1, 1, 1, 2, 2, 2], [1, 1, 2, 1, 1, 2]) AS res
┌─res───────────┐
│ [1,2,1,1,2,1] │
└───────────────┘

This is necessary when using ARRAY JOIN with a nested data structure and further aggregation across multiple elements in this structure.
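
The counting rule described above can be illustrated outside the database. The following Python function is only a model of the documented behaviour (the name array_enumerate_uniq is hypothetical and is not part of any API): for each position it reports how many times the same value, or the same tuple of values across several arrays, has been seen so far.

```python
from collections import defaultdict


def array_enumerate_uniq(*arrays):
    """Model of arrayEnumerateUniq: running occurrence count per distinct tuple."""
    if len({len(a) for a in arrays}) != 1:
        raise ValueError("all arrays must have the same size")
    seen = defaultdict(int)
    result = []
    # With several arrays, elements at the same position form one tuple key.
    for key in zip(*arrays):
        seen[key] += 1
        result.append(seen[key])
    return result


print(array_enumerate_uniq([10, 20, 10, 30]))  # [1, 1, 2, 1]
print(array_enumerate_uniq([1, 1, 1, 2, 2, 2], [1, 1, 2, 1, 1, 2]))  # [1, 2, 1, 1, 2, 1]
```

A value of 1 marks the first occurrence, which is why the condition num = 1 in the query above counts each session once per goal.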

Removes the last item from the array.

arrayPopBack(array)

Arguments

  • array – Array.

Example

SELECT arrayPopBack([1, 2, 3]) AS res;
┌─res───┐
│ [1,2] │
└───────┘

Removes the first item from the array.

arrayPopFront(array)

Arguments

  • array – Array.

Example

SELECT arrayPopFront([1, 2, 3]) AS res;
┌─res───┐
│ [2,3] │
└───────┘

Adds one item to the end of the array.

arrayPushBack(array, single_value)

Arguments

  • array – Array.

  • single_value – The value to add to the end of the array.

Example

SELECT arrayPushBack(['a'], 'b') AS res;
┌─res───────┐
│ ['a','b'] │
└───────────┘

Adds one element to the beginning of the array.

arrayPushFront(array, single_value)

Arguments

  • array – Array.

  • single_value – The value to add to the beginning of the array.

Example

SELECT arrayPushFront(['b'], 'a') AS res;
┌─res───────┐
│ ['a','b'] │
└───────────┘

Changes the length of the array.

arrayResize(array, size[, extender])

Arguments:

  • array — Array.

  • size — Required length of the array.

    • If size is less than the original size of the array, the array is truncated from the right.

    • If size is larger than the initial size of the array, the array is extended to the right with extender values or default values for the data type of the array items.

  • extender — Value for extending an array. Can be NULL.

Returned value:

An array of length size.

Examples of calls

SELECT arrayResize([1], 3);
┌─arrayResize([1], 3)─┐
│ [1,0,0]             │
└─────────────────────┘
SELECT arrayResize([1], 3, NULL);
┌─arrayResize([1], 3, NULL)─┐
│ [1,NULL,NULL]             │
└───────────────────────────┘

Returns a slice of the array.

arraySlice(array, offset[, length])

Arguments

  • array – Array of data.

  • offset – Offset from the edge of the array. A positive value counts from the left, and a negative value counts from the right. Numbering of the array items begins with 1.

  • length – The length of the required slice. If you specify a negative value, the function returns an open slice [offset, array_length - length]. If you omit the value, the function returns the slice [offset, the_end_of_array].

Example

SELECT arraySlice([1, 2, NULL, 4, 5], 2, 3) AS res;
┌─res────────┐
│ [2,NULL,4] │
└────────────┘

Array elements set to NULL are handled as normal values.

Sorts the elements of the arr array in ascending order. If the func function is specified, sorting order is determined by the result of the func function applied to the elements of the array. If func accepts multiple arguments, the arraySort function is passed several arrays that the arguments of func will correspond to. Detailed examples are shown at the end of arraySort description.

Example of integer values sorting:

SELECT arraySort([1, 3, 3, 0]);
┌─arraySort([1, 3, 3, 0])─┐
│ [0,1,3,3]               │
└─────────────────────────┘

Example of string values sorting:

SELECT arraySort(['hello', 'world', '!']);
┌─arraySort(['hello', 'world', '!'])─┐
│ ['!','hello','world']              │
└────────────────────────────────────┘

Consider the following sorting order for the NULL, NaN and Inf values:

SELECT arraySort([1, nan, 2, NULL, 3, nan, -4, NULL, inf, -inf]);
┌─arraySort([1, nan, 2, NULL, 3, nan, -4, NULL, inf, -inf])─┐
│ [-inf,-4,1,2,3,inf,nan,nan,NULL,NULL]                     │
└───────────────────────────────────────────────────────────┘
  • -Inf values are first in the array.

  • NULL values are last in the array.

  • NaN values are right before NULL.

  • Inf values are right before NaN.

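A lambda function can be passed as the first argument; the sorting order is then determined by the result of the lambda function applied to each element. In the following example, the lambda returns the keys [-1, -2, -3], and sorting by these keys in ascending order yields [3, 2, 1]: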

SELECT arraySort((x) -> -x, [1, 2, 3]) as res;
┌─res─────┐
│ [3,2,1] │
└─────────┘

The lambda function can accept multiple arguments. In this case, you need to pass the arraySort function several arrays of identical length that the arguments of lambda function will correspond to. The resulting array will consist of elements from the first input array; elements from the next input array(s) specify the sorting keys. For example:

SELECT arraySort((x, y) -> y, ['hello', 'world'], [2, 1]) as res;
┌─res───────────────┐
│ ['world','hello'] │
└───────────────────┘

Here, the elements that are passed in the second array ([2, 1]) define a sorting key for the corresponding element from the source array (['hello', 'world']), that is, ['hello' -> 2, 'world' -> 1]. Since the lambda function does not use x, the actual values of the source array do not affect the order in the result. So 'hello' will be the second element in the result, and 'world' will be the first.

Other examples are shown below.

SELECT arraySort((x, y) -> y, [0, 1, 2], ['c', 'b', 'a']) as res;
┌─res─────┐
│ [2,1,0] │
└─────────┘
SELECT arraySort((x, y) -> -y, [0, 1, 2], [1, 2, 3]) as res;
┌─res─────┐
│ [2,1,0] │
└─────────┘
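
The key-based sorting described above can be modelled in a few lines of Python. This is a sketch, not the database implementation: the helper name array_sort is hypothetical, only plain comparable values are handled (the NULL, NaN and Inf ordering shown earlier is not modelled), and Python's stable sorted() is assumed for elements with equal keys.

```python
def array_sort(*args):
    """Model of arraySort([func,] arr1, ..., arrN) for plain comparable values."""
    if callable(args[0]):
        func, arrays = args[0], args[1:]
    else:
        func, arrays = None, args
    if func is None:
        return sorted(arrays[0])
    # Each row holds the elements found at one position of every input array.
    rows = zip(*arrays)
    # The lambda result is the sorting key; the output keeps first-array elements.
    return [row[0] for row in sorted(rows, key=lambda row: func(*row))]


print(array_sort(lambda x: -x, [1, 2, 3]))  # [3, 2, 1]
print(array_sort(lambda x, y: y, ["hello", "world"], [2, 1]))  # ['world', 'hello']
```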

Sorts the elements of the arr array in descending order. If the func function is specified, arr is sorted according to the result of the func function applied to the elements of the array, and then the sorted array is reversed. If func accepts multiple arguments, the arrayReverseSort function is passed several arrays that the arguments of func will correspond to. Detailed examples are shown at the end of arrayReverseSort description.

Example of integer values sorting:

SELECT arrayReverseSort([1, 3, 3, 0]);
┌─arrayReverseSort([1, 3, 3, 0])─┐
│ [3,3,1,0]                      │
└────────────────────────────────┘

Example of string values sorting:

SELECT arrayReverseSort(['hello', 'world', '!']);
┌─arrayReverseSort(['hello', 'world', '!'])─┐
│ ['world','hello','!']                     │
└───────────────────────────────────────────┘

Consider the following sorting order for the NULL, NaN and Inf values:

SELECT arrayReverseSort([1, nan, 2, NULL, 3, nan, -4, NULL, inf, -inf]) as res;
┌─res───────────────────────────────────┐
│ [inf,3,2,1,-4,-inf,nan,nan,NULL,NULL] │
└───────────────────────────────────────┘
  • Inf values are first in the array.

  • NULL values are last in the array.

  • NaN values are right before NULL.

  • -Inf values are right before NaN.

SELECT arrayReverseSort((x) -> -x, [1, 2, 3]) as res;
┌─res─────┐
│ [1,2,3] │
└─────────┘

The array is sorted in the following way:

  1. First, the source array ([1, 2, 3]) is sorted according to the result of the lambda function applied to its elements. The result is the array [3, 2, 1].

  2. The array obtained in the previous step is reversed, so the final result is [1, 2, 3].

The lambda function can accept multiple arguments. In this case, you need to pass the arrayReverseSort function several arrays of identical length that the arguments of lambda function will correspond to. The resulting array will consist of elements from the first input array; elements from the next input array(s) specify the sorting keys. For example:

SELECT arrayReverseSort((x, y) -> y, ['hello', 'world'], [2, 1]) as res;
┌─res───────────────┐
│ ['hello','world'] │
└───────────────────┘

In this example, the array is sorted in the following way:

  1. First, the source array (['hello', 'world']) is sorted according to the result of the lambda function applied to the elements of the arrays. The elements passed in the second array ([2, 1]) define the sorting keys for the corresponding elements of the source array. The result is the array ['world', 'hello'].

  2. The array sorted in the previous step is reversed, so the final result is ['hello', 'world'].

Other examples are shown below.

SELECT arrayReverseSort((x, y) -> y, [4, 3, 5], ['a', 'b', 'c']) AS res;
┌─res─────┐
│ [5,3,4] │
└─────────┘
SELECT arrayReverseSort((x, y) -> -y, [4, 3, 5], [1, 2, 3]) AS res;
┌─res─────┐
│ [4,3,5] │
└─────────┘
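
The two-step rule (sort by key, then reverse) explains why a negating lambda leaves the array in ascending order. A minimal Python model follows, assuming plain comparable values and a hypothetical helper name array_reverse_sort; it is a sketch of the documented behaviour, not the actual implementation.

```python
def array_reverse_sort(*args):
    """Model of arrayReverseSort: sort ascending by key, then reverse the result."""
    if callable(args[0]):
        func, arrays = args[0], args[1:]
    else:
        # Without a lambda, the element itself is the sorting key.
        func, arrays = (lambda x, *rest: x), args
    rows = sorted(zip(*arrays), key=lambda row: func(*row))  # step 1: sort by key
    return [row[0] for row in reversed(rows)]                # step 2: reverse


print(array_reverse_sort(lambda x: -x, [1, 2, 3]))  # [1, 2, 3]
print(array_reverse_sort(lambda x, y: -y, [4, 3, 5], [1, 2, 3]))  # [4, 3, 5]
```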

If one argument is passed, it counts the number of different elements in the array. If multiple arguments are passed, it counts the number of different tuples of elements at corresponding positions in multiple arrays.

If you want to get a list of unique items in an array, you can use arrayReduce('groupUniqArray', arr).

Example

SELECT arrayUniq([2, 3]) AS res;

Calculates the difference between adjacent array elements. Returns an array where the first element is 0, the second is a[1] - a[0], the third is a[2] - a[1], and so on. The type of elements in the resulting array is determined by the type inference rules for subtraction (e.g. UInt8 - UInt8 = Int16).

Syntax

arrayDifference(array)

Arguments

  • array – Array.

Returned values

Returns an array of differences between adjacent elements.

Example

Query:

SELECT arrayDifference([1, 2, 3, 4]);

Result:

┌─arrayDifference([1, 2, 3, 4])─┐
│ [0,1,1,1]                     │
└───────────────────────────────┘

Example of the overflow due to result type Int64:

Query:

SELECT arrayDifference([0, 10000000000000000000]);

Result:

┌─arrayDifference([0, 10000000000000000000])─┐
│ [0,-8446744073709551616]                   │
└────────────────────────────────────────────┘
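
The negative number in the overflow example is what a signed 64-bit result type produces when the true difference does not fit. The following Python sketch reproduces that arithmetic; the helper names are hypothetical, and the fixed Int64 result type is an assumption taken from the example heading above.

```python
def wrap_int64(value):
    """Wrap an arbitrary Python int into the signed 64-bit range."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value


def array_difference_int64(values):
    """Model of arrayDifference with an Int64 result type."""
    if not values:
        return []
    result = [0]  # the first element is always 0
    for prev, cur in zip(values, values[1:]):
        result.append(wrap_int64(cur - prev))
    return result


print(array_difference_int64([1, 2, 3, 4]))  # [0, 1, 1, 1]
print(array_difference_int64([0, 10000000000000000000]))  # [0, -8446744073709551616]
```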

Takes an array, returns an array containing the distinct elements only.

Syntax

arrayDistinct(array)

Arguments

  • array – Array.

Returned values

Returns an array containing the distinct elements.

Example

Query:

SELECT arrayDistinct([1, 2, 2, 3, 1]);

Result:

┌─arrayDistinct([1, 2, 2, 3, 1])─┐
│ [1,2,3]                        │
└────────────────────────────────┘

Returns an array of the same size as the source array, indicating where each element first appears in the source array.

Example

SELECT arrayEnumerateDense([10, 20, 10, 30])
┌─arrayEnumerateDense([10, 20, 10, 30])─┐
│ [1,2,1,3]                             │
└───────────────────────────────────────┘

Takes multiple arrays, returns an array with elements that are present in all source arrays.

Example

SELECT
    arrayIntersect([1, 2], [1, 3], [2, 3]) AS no_intersect,
    arrayIntersect([1, 2], [1, 3], [1, 4]) AS intersect
┌─no_intersect─┬─intersect─┐
│ []           │ [1]       │
└──────────────┴───────────┘

Applies an aggregate function to array elements and returns its result. The name of the aggregate function is passed as a string in single quotes, for example 'max' or 'sum'. When using parametric aggregate functions, the parameter is indicated after the function name in parentheses, for example 'uniqUpTo(6)'.

Syntax

arrayReduce(agg_func, arr1, arr2, ..., arrN)

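Arguments

  • agg_func – The name of an aggregate function, passed as a string in single quotes.

  • arr1, arr2, ..., arrN – Arrays whose elements are passed as arguments to the aggregate function.

Returned value

  • The result of the aggregate function.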

Example

Query:

SELECT arrayReduce('max', [1, 2, 3]);

Result:

┌─arrayReduce('max', [1, 2, 3])─┐
│                             3 │
└───────────────────────────────┘

If an aggregate function takes multiple arguments, then this function must be applied to multiple arrays of the same size.

Query:

SELECT arrayReduce('maxIf', [3, 5], [1, 0]);

Result:

┌─arrayReduce('maxIf', [3, 5], [1, 0])─┐
│                                    3 │
└──────────────────────────────────────┘

Example with a parametric aggregate function:

Query:

SELECT arrayReduce('uniqUpTo(3)', [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);

Result:

┌─arrayReduce('uniqUpTo(3)', [1, 2, 3, 4, 5, 6, 7, 8, 9, 10])─┐
│                                                           4 │
└─────────────────────────────────────────────────────────────┘

Applies an aggregate function to array elements in given ranges and returns an array containing the result corresponding to each range. The function will return the same result as multiple arrayReduce(agg_func, arraySlice(arr1, index, length), ...).

Syntax

arrayReduceInRanges(agg_func, ranges, arr1, arr2, ..., arrN)

Arguments

Returned value

  • Array containing results of the aggregate function over specified ranges.

Example

Query:

SELECT arrayReduceInRanges(
    'sum',
    [(1, 5), (2, 3), (3, 4), (4, 4)],
    [1000000, 200000, 30000, 4000, 500, 60, 7]
) AS res

Result:

┌─res─────────────────────────┐
│ [1234500,234000,34560,4567] │
└─────────────────────────────┘
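
Each range in the example is an (index, length) pair with 1-based indexing, as the equivalence with arraySlice(arr1, index, length) suggests. A short Python model makes the arithmetic explicit; the function name is hypothetical, Python's built-in sum stands in for the 'sum' aggregate, and only a single source array is handled.

```python
def array_reduce_in_ranges(agg, ranges, arr):
    """Apply agg to arr[index : index + length] for each 1-based (index, length)."""
    return [agg(arr[index - 1:index - 1 + length]) for index, length in ranges]


values = [1000000, 200000, 30000, 4000, 500, 60, 7]
ranges = [(1, 5), (2, 3), (3, 4), (4, 4)]
print(array_reduce_in_ranges(sum, ranges, values))  # [1234500, 234000, 34560, 4567]
```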

Returns an array of the same size as the original array containing the elements in reverse order.

Example:

SELECT arrayReverse([1, 2, 3])
┌─arrayReverse([1, 2, 3])─┐
│ [3,2,1]                 │
└─────────────────────────┘

Converts an array of arrays to a flat array.

The function:

  • Applies to any depth of nested arrays.

  • Does not change arrays that are already flat.

The flattened array contains all the elements from all source arrays.

Syntax

flatten(array_of_arrays)

Alias: arrayFlatten.

Arguments

Examples

SELECT flatten([[[1]], [[2], [3]]]);
┌─flatten(array(array([1]), array([2], [3])))─┐
│ [1,2,3]                                     │
└─────────────────────────────────────────────┘

Removes consecutive duplicate elements from an array. The order of result values is determined by the order in the source array.

Syntax

arrayCompact(arr)

Arguments

Returned value

The array without consecutive duplicates.

Type: Array.

Example

Query:

SELECT arrayCompact([1, 1, nan, nan, 2, 3, 3, 3]);

Result:

┌─arrayCompact([1, 1, nan, nan, 2, 3, 3, 3])─┐
│ [1,nan,nan,2,3]                            │
└────────────────────────────────────────────┘
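
Note that both nan values survive in the result above. This is consistent with NaN comparing unequal to itself, which the following Python sketch assumes (the helper name array_compact is hypothetical). Only neighbouring elements are compared, so equal values that are not adjacent are kept.

```python
def array_compact(arr):
    """Drop an element only when it equals the element kept just before it."""
    result = []
    for item in arr:
        # NaN != NaN is true, so consecutive NaN values are never merged.
        if not result or item != result[-1]:
            result.append(item)
    return result


print(array_compact([1, 1, 2, 3, 3, 3, 1]))  # [1, 2, 3, 1]
```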

Combines multiple arrays into a single array. The resulting array contains the corresponding elements of the source arrays grouped into tuples in the listed order of arguments.

Syntax

arrayZip(arr1, arr2, ..., arrN)

Arguments

The function can take any number of arrays of different types. All the input arrays must be of equal size.

Returned value

Example

Query:

SELECT arrayZip(['a', 'b', 'c'], [5, 2, 1]);

Result:

┌─arrayZip(['a', 'b', 'c'], [5, 2, 1])─┐
│ [('a',5),('b',2),('c',1)]            │
└──────────────────────────────────────┘

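Calculates the AUC (area under the curve) from the scores given by a prediction model and the labels of the samples.

Syntax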

arrayAUC(arr_scores, arr_labels)

Arguments

  • arr_scores – Scores that the prediction model gives.

  • arr_labels – Labels of the samples, usually 1 for a positive sample and 0 for a negative sample.

Returned value

Returns AUC value with type Float64.

Example

Query:

select arrayAUC([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]);

Result:

┌─arrayAUC([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])─┐
│                                          0.75 │
└───────────────────────────────────────────────┘
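
One way to check the value 0.75 by hand is the pairwise interpretation of AUC: the share of (positive, negative) sample pairs in which the positive sample has the higher score, with ties counted as one half. The Python sketch below uses that interpretation. It is not necessarily how arrayAUC is computed internally, and the function name auc is hypothetical.

```python
def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    positives = [s for s, label in zip(scores, labels) if label == 1]
    negatives = [s for s, label in zip(scores, labels) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correctly ranked pair
    return wins / (len(positives) * len(negatives))


print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```

Here three of the four pairs are ranked correctly (only 0.35 versus 0.4 is not), which gives 3/4.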

Returns an array obtained from the original arrays by application of func(arr1[i], …, arrN[i]) for each element. Arrays arr1 … arrN must have the same number of elements.

Examples

SELECT arrayMap(x -> (x + 2), [1, 2, 3]) as res;
┌─res─────┐
│ [3,4,5] │
└─────────┘

The following example shows how to create a tuple of elements from different arrays:

SELECT arrayMap((x, y) -> (x, y), [1, 2, 3], [4, 5, 6]) AS res
┌─res─────────────────┐
│ [(1,4),(2,5),(3,6)] │
└─────────────────────┘

Returns an array containing only the elements in arr1 for which func(arr1[i], …, arrN[i]) returns something other than 0.

Examples

SELECT arrayFilter(x -> x LIKE '%World%', ['Hello', 'abc World']) AS res
┌─res───────────┐
│ ['abc World'] │
└───────────────┘
SELECT
    arrayFilter(
        (i, x) -> x LIKE '%World%',
        arrayEnumerate(arr),
        ['Hello', 'abc World'] AS arr)
    AS res
┌─res─┐
│ [2] │
└─────┘

Scan through arr1 from the first element to the last element and replace arr1[i] by arr1[i - 1] if func(arr1[i], …, arrN[i]) returns 0. The first element of arr1 will not be replaced.

Examples

SELECT arrayFill(x -> not isNull(x), [1, null, 3, 11, 12, null, null, 5, 6, 14, null, null]) AS res
┌─res──────────────────────────────┐
│ [1,1,3,11,12,12,12,5,6,14,14,14] │
└──────────────────────────────────┘

Scan through arr1 from the last element to the first element and replace arr1[i] by arr1[i + 1] if func(arr1[i], …, arrN[i]) returns 0. The last element of arr1 will not be replaced.

Examples:

SELECT arrayReverseFill(x -> not isNull(x), [1, null, 3, 11, 12, null, null, 5, 6, 14, null, null]) AS res
┌─res────────────────────────────────┐
│ [1,3,3,11,12,5,5,5,6,14,NULL,NULL] │
└────────────────────────────────────┘
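
The two fill directions differ only in which neighbour supplies the replacement. The Python sketch below models both for a single-array lambda; the function names are hypothetical and None stands in for NULL. Because replaced values are reused, a run of rejected elements takes the nearest accepted value on the scanning side.

```python
def array_fill(func, arr):
    """Scan left to right; a rejected element takes the value to its left."""
    result = list(arr)
    for i in range(1, len(result)):  # the first element is never replaced
        if not func(arr[i]):
            result[i] = result[i - 1]
    return result


def array_reverse_fill(func, arr):
    """Scan right to left; a rejected element takes the value to its right."""
    result = list(arr)
    for i in range(len(result) - 2, -1, -1):  # the last element is never replaced
        if not func(arr[i]):
            result[i] = result[i + 1]
    return result


data = [1, None, 3, 11, 12, None, None, 5, 6, 14, None, None]
print(array_fill(lambda x: x is not None, data))
print(array_reverse_fill(lambda x: x is not None, data))
```

In the reverse direction the trailing None values remain, because there is no accepted element to their right.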

Split arr1 into multiple arrays. When func(arr1[i], …, arrN[i]) returns something other than 0, the array will be split on the left hand side of the element. The array will not be split before the first element.

Examples:

SELECT arraySplit((x, y) -> y, [1, 2, 3, 4, 5], [1, 0, 0, 1, 0]) AS res
┌─res─────────────┐
│ [[1,2,3],[4,5]] │
└─────────────────┘

Split arr1 into multiple arrays. When func(arr1[i], …, arrN[i]) returns something other than 0, the array will be split on the right hand side of the element. The array will not be split after the last element.

Examples:

SELECT arrayReverseSplit((x, y) -> y, [1, 2, 3, 4, 5], [1, 0, 0, 1, 0]) AS res
┌─res───────────────┐
│ [[1],[2,3,4],[5]] │
└───────────────────┘
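
The difference between splitting on the left and on the right of a flagged element is easier to see in code. The following Python functions are a model of the two descriptions above, with hypothetical names; the lambda receives the elements at one position of each array, as in the SQL examples.

```python
def array_split(func, *arrays):
    """Start a new chunk before every element for which func is non-zero."""
    chunks = []
    for i, row in enumerate(zip(*arrays)):
        if i == 0 or func(*row):  # no split before the first element
            chunks.append([])
        chunks[-1].append(row[0])
    return chunks


def array_reverse_split(func, *arrays):
    """Close the current chunk after every element for which func is non-zero."""
    chunks = []
    start_new = True
    for row in zip(*arrays):
        if start_new:
            chunks.append([])
        chunks[-1].append(row[0])
        start_new = bool(func(*row))  # a flag on the last element adds no empty chunk
    return chunks


print(array_split(lambda x, y: y, [1, 2, 3, 4, 5], [1, 0, 0, 1, 0]))  # [[1, 2, 3], [4, 5]]
print(array_reverse_split(lambda x, y: y, [1, 2, 3, 4, 5], [1, 0, 0, 1, 0]))  # [[1], [2, 3, 4], [5]]
```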

Returns 1 if there is at least one element in arr for which func(arr1[i], …, arrN[i]) returns something other than 0. Otherwise, it returns 0.

Example

SELECT arrayExists((x,y)->x==y,[1,2,3],[4,5,6]);

Returns 1 if func(arr1[i], …, arrN[i]) returns something other than 0 for all the elements in arrays. Otherwise, it returns 0.

Example

SELECT arrayAll((x,y)->x==y,[1,2,3],[4,5,6]);

Returns the first element in the arr1 array for which func(arr1[i], …, arrN[i]) returns something other than 0.

Example

SELECT arrayFirst(x -> x LIKE '%World%', ['Hello World', 'abc World']) AS res

Returns the index of the first element in the arr1 array for which func(arr1[i], …, arrN[i]) returns something other than 0.

Example

SELECT arrayFirstIndex(x -> x LIKE '%World%', ['Hello World', 'abc World']) AS res

Returns the minimum of elements in the source array.

If the func function is specified, returns the minimum of elements converted by this function.

Syntax

arrayMin([func,] arr)

Arguments

Returned value

  • The minimum of function values (or the array minimum).

Type: if func is specified, matches func return value type, else matches the array elements type.

Examples

Query:

SELECT arrayMin([1, 2, 4]) AS res;

Result:

┌─res─┐
│   1 │
└─────┘

Query:

SELECT arrayMin(x -> (-x), [1, 2, 4]) AS res;

Result:

┌─res─┐
│  -4 │
└─────┘

Returns the maximum of elements in the source array.

If the func function is specified, returns the maximum of elements converted by this function.

Syntax

arrayMax([func,] arr)

Arguments

Returned value

  • The maximum of function values (or the array maximum).

Type: if func is specified, matches func return value type, else matches the array elements type.

Examples

Query:

SELECT arrayMax([1, 2, 4]) AS res;

Result:

┌─res─┐
│   4 │
└─────┘

Query:

SELECT arrayMax(x -> (-x), [1, 2, 4]) AS res;

Result:

┌─res─┐
│  -1 │
└─────┘

Returns the sum of elements in the source array.

If the func function is specified, returns the sum of elements converted by this function.

Syntax

arraySum([func,] arr)

Arguments

Returned value

  • The sum of the function values (or the array sum).

Examples

Query:

SELECT arraySum([2, 3]) AS res;

Result:

┌─res─┐
│   5 │
└─────┘

Query:

SELECT arraySum(x -> x*x, [2, 3]) AS res;

Result:

┌─res─┐
│  13 │
└─────┘

Returns the average of elements in the source array.

If the func function is specified, returns the average of elements converted by this function.

Syntax

arrayAvg([func,] arr)

Arguments

Returned value

  • The average of function values (or the array average).

Examples

Query:

SELECT arrayAvg([1, 2, 4]) AS res;

Result:

┌────────────────res─┐
│ 2.3333333333333335 │
└────────────────────┘

Query:

SELECT arrayAvg(x -> (x * x), [2, 4]) AS res;

Result:

┌─res─┐
│  10 │
└─────┘

Returns an array of partial sums of elements in the source array (a running sum). If the func function is specified, then the values of the array elements are converted by func(arr1[i], …, arrN[i]) before summing.

Example:

SELECT arrayCumSum([1, 1, 1, 1]) AS res
┌─res───────┐
│ [1,2,3,4] │
└───────────┘

Same as arrayCumSum, returns an array of partial sums of elements in the source array (a running sum). Unlike arrayCumSum, when a partial sum becomes less than zero, it is replaced with zero, and the subsequent calculation continues from zero. For example:

SELECT arrayCumSumNonNegative([1, 1, -4, 1]) AS res
┌─res───────┐
│ [1,2,0,1] │
└───────────┘
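
The clamping rule can be written as a running sum that is reset to zero whenever it would go negative. A minimal Python model follows (the function name is hypothetical):

```python
def array_cum_sum_non_negative(arr):
    """Running sum in which every partial sum is clamped at zero."""
    result = []
    running = 0
    for value in arr:
        # The clamped value, not the true sum, feeds the next step.
        running = max(running + value, 0)
        result.append(running)
    return result


print(array_cum_sum_non_negative([1, 1, -4, 1]))  # [1, 2, 0, 1]
```

A plain running sum of the same input would be [1, 2, -2, -1], so the last element shows the effect of continuing from zero.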

Multiplies the elements of an array.

Syntax

arrayProduct(arr)

Arguments

Returned value

  • A product of array's elements.

Examples

Query:

SELECT arrayProduct([1,2,3,4,5,6]) as res;

Result:

┌─res───┐
│ 720   │
└───────┘

Query:

SELECT arrayProduct([toDecimal64(1,8), toDecimal64(2,8), toDecimal64(3,8)]) as res, toTypeName(res);
┌─res─┬─toTypeName(arrayProduct(array(toDecimal64(1, 8), toDecimal64(2, 8), toDecimal64(3, 8))))─┐
│ 6   │ Float64                                                                                  │
└─────┴──────────────────────────────────────────────────────────────────────────────────────────┘

BIT

Bit functions work for any pair of types from UInt8, UInt16, UInt32, UInt64, Int8, Int16, Int32, Int64, Float32, or Float64. Some functions support String and FixedString types.

The result type is an integer with bits equal to the maximum bits of its arguments. If at least one of the arguments is signed, the result is a signed number. If an argument is a floating-point number, it is cast to Int64.

bitAnd(a, b)

Returns a bitwise 'AND' of two numbers.

Shifts the binary representation of a value to the left by a specified number of bit positions.

A FixedString or a String is treated as a single multibyte value.

Bits of a FixedString value are lost as they are shifted out. On the contrary, a String value is extended with additional bytes, so no bits are lost.

Syntax

bitShiftLeft(a, b)

Arguments

Returned value

  • Shifted value.

The type of the returned value is the same as the type of the input value.

Example

SELECT 99 AS a, bin(a), bitShiftLeft(a, 2) AS a_shifted, bin(a_shifted);
SELECT 'abc' AS a, hex(a), bitShiftLeft(a, 4) AS a_shifted, hex(a_shifted);
SELECT toFixedString('abc', 3) AS a, hex(a), bitShiftLeft(a, 4) AS a_shifted, hex(a_shifted);

Result:

┌──a─┬─bin(99)──┬─a_shifted─┬─bin(bitShiftLeft(99, 2))─┐
│ 99 │ 01100011 │       140 │ 10001100                 │
└────┴──────────┴───────────┴──────────────────────────┘
┌─a───┬─hex('abc')─┬─a_shifted─┬─hex(bitShiftLeft('abc', 4))─┐
│ abc │ 616263     │ &0        │ 06162630                    │
└─────┴────────────┴───────────┴─────────────────────────────┘
┌─a───┬─hex(toFixedString('abc', 3))─┬─a_shifted─┬─hex(bitShiftLeft(toFixedString('abc', 3), 4))─┐
│ abc │ 616263                       │ &0        │ 162630                                        │
└─────┴──────────────────────────────┴───────────┴───────────────────────────────────────────────┘
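
The three results above differ because of how many bits each type can hold. The Python sketch below reproduces them under stated assumptions: the number 99 is treated as UInt8, strings are read as one big-endian multibyte value, and a String is assumed to grow by just enough whole bytes to hold the shifted bits. The helper names are hypothetical and the growth rule is inferred from the example output rather than stated in the text.

```python
def shift_left_uint8(value, bits):
    """Bits shifted past the 8th position are lost."""
    return (value << bits) & 0xFF


def shift_left_fixed_string(data, bits):
    """Fixed width: the value is truncated to the original number of bytes."""
    width = len(data)
    shifted = int.from_bytes(data, "big") << bits
    return (shifted & ((1 << (8 * width)) - 1)).to_bytes(width, "big")


def shift_left_string(data, bits):
    """Variable width (assumed rule): extend by ceil(bits / 8) bytes, losing nothing."""
    shifted = int.from_bytes(data, "big") << bits
    return shifted.to_bytes(len(data) + (bits + 7) // 8, "big")


print(shift_left_uint8(99, 2))                      # 140
print(shift_left_string(b"abc", 4).hex())           # 06162630
print(shift_left_fixed_string(b"abc", 4).hex())     # 162630
```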

Shifts the binary representation of a value to the right by a specified number of bit positions.

A FixedString or a String is treated as a single multibyte value. Note that the length of a String value is reduced as bits are shifted out.

Syntax

bitShiftRight(a, b)

Arguments

Returned value

  • Shifted value.

The type of the returned value is the same as the type of the input value.

Example

Query:

SELECT 101 AS a, bin(a), bitShiftRight(a, 2) AS a_shifted, bin(a_shifted);
SELECT 'abc' AS a, hex(a), bitShiftRight(a, 12) AS a_shifted, hex(a_shifted);
SELECT toFixedString('abc', 3) AS a, hex(a), bitShiftRight(a, 12) AS a_shifted, hex(a_shifted);

Result:

┌───a─┬─bin(101)─┬─a_shifted─┬─bin(bitShiftRight(101, 2))─┐
│ 101 │ 01100101 │        25 │ 00011001                   │
└─────┴──────────┴───────────┴────────────────────────────┘
┌─a───┬─hex('abc')─┬─a_shifted─┬─hex(bitShiftRight('abc', 12))─┐
│ abc │ 616263     │           │ 0616                          │
└─────┴────────────┴───────────┴───────────────────────────────┘
┌─a───┬─hex(toFixedString('abc', 3))─┬─a_shifted─┬─hex(bitShiftRight(toFixedString('abc', 3), 12))─┐
│ abc │ 616263                       │           │ 000616                                          │
└─────┴──────────────────────────────┴───────────┴─────────────────────────────────────────────────┘

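Returns the value of the bit at the specified position in the binary form of an integer. Positions are counted from 0, from right to left.

Syntax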

SELECT bitTest(number, index)

Arguments

  • number – Integer number.

  • index – Position of bit.

Returned values

Returns a value of bit at specified position.

Type: UInt8.

Example

For example, the number 43 in base-2 (binary) numeral system is 101011.

Query:

SELECT bitTest(43, 1);

Result:

┌─bitTest(43, 1)─┐
│              1 │
└────────────────┘

Another example:

Query:

SELECT bitTest(43, 2);

Result:

┌─bitTest(43, 2)─┐
│              0 │
└────────────────┘

The conjunction for bitwise operations:

0 AND 0 = 0

0 AND 1 = 0

1 AND 0 = 0

1 AND 1 = 1

Syntax

SELECT bitTestAll(number, index1, index2, index3, index4, ...)

Arguments

  • number – Integer number.

  • index1, index2, index3, index4 – Bit positions. For example, the result for the set of positions (index1, index2, index3, index4) is true if and only if the bits at all of these positions are set (index1 ⋀ index2 ⋀ index3 ⋀ index4).

Returned values

Returns the result of the logical conjunction.

Type: UInt8.

Example

For example, the number 43 in base-2 (binary) numeral system is 101011.

Query:

SELECT bitTestAll(43, 0, 1, 3, 5);

Result:

┌─bitTestAll(43, 0, 1, 3, 5)─┐
│                          1 │
└────────────────────────────┘

Another example:

Query:

SELECT bitTestAll(43, 0, 1, 3, 5, 2);

Result:

┌─bitTestAll(43, 0, 1, 3, 5, 2)─┐
│                             0 │
└───────────────────────────────┘

The disjunction for bitwise operations:

0 OR 0 = 0

0 OR 1 = 1

1 OR 0 = 1

1 OR 1 = 1

Syntax

SELECT bitTestAny(number, index1, index2, index3, index4, ...)

Arguments

  • number – Integer number.

  • index1, index2, index3, index4 – Positions of bit.

Returned values

Returns the result of the logical disjunction.

Type: UInt8.

Example

For example, the number 43 in base-2 (binary) numeral system is 101011.

Query:

SELECT bitTestAny(43, 0, 2);

Result:

┌─bitTestAny(43, 0, 2)─┐
│                    1 │
└──────────────────────┘

Another example:

Query:

SELECT bitTestAny(43, 4, 2);

Result:

┌─bitTestAny(43, 4, 2)─┐
│                    0 │
└──────────────────────┘
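
The examples for bitTest, bitTestAll and bitTestAny can be checked with a small Python model. It assumes, as the results for the number 43 (binary 101011) imply, that bit positions are counted from 0 starting at the rightmost bit. The function names are hypothetical.

```python
def bit_test(number, index):
    """Value (0 or 1) of the bit at a 0-based position counted from the right."""
    return (number >> index) & 1


def bit_test_all(number, *indexes):
    """1 only if every listed bit is set (logical conjunction)."""
    return int(all(bit_test(number, i) for i in indexes))


def bit_test_any(number, *indexes):
    """1 if at least one listed bit is set (logical disjunction)."""
    return int(any(bit_test(number, i) for i in indexes))


print(bit_test(43, 1), bit_test(43, 2))                            # 1 0
print(bit_test_all(43, 0, 1, 3, 5), bit_test_all(43, 0, 1, 3, 5, 2))  # 1 0
print(bit_test_any(43, 0, 2), bit_test_any(43, 4, 2))              # 1 0
```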

Calculates the number of bits set to one in the binary representation of a number.

Syntax

bitCount(x)

Arguments

Returned value

  • Number of bits set to one in the input number.

Type: UInt8.

Example

Take for example the number 333. Its binary representation: 0000000101001101.

Query:

SELECT bitCount(333);

Result:

┌─bitCount(333)─┐
│             5 │
└───────────────┘

Returns the Hamming distance between the bit representations of two integer values.

Syntax

bitHammingDistance(int1, int2)

Arguments

Returned value

  • The Hamming distance.

Examples

Query:

SELECT bitHammingDistance(111, 121);

Result:

┌─bitHammingDistance(111, 121)─┐
│                            3 │
└──────────────────────────────┘
SELECT bitHammingDistance(ngramSimHash('cat ate rat'), ngramSimHash('rat ate cat'));

Result:

┌─bitHammingDistance(ngramSimHash('cat ate rat'), ngramSimHash('rat ate cat'))─┐
│                                                                            5 │
└──────────────────────────────────────────────────────────────────────────────┘
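
The Hamming distance between two integers is the number of bit positions in which they differ, which is the bit count of their XOR. The following Python sketch covers non-negative integers only and uses hypothetical function names:

```python
def bit_count(x):
    """Number of bits set to one in a non-negative integer."""
    return bin(x).count("1")


def bit_hamming_distance(a, b):
    """Count the positions where the bits of a and b differ."""
    return bit_count(a ^ b)


print(bit_count(333))                   # 5
print(bit_hamming_distance(111, 121))   # 3
```

For 111 (1101111) and 121 (1111001) the XOR is 0010110, which has three bits set.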

BITMAP

Bitmap functions perform calculations on two bitmap objects, returning either a new bitmap or a cardinality, using operations such as and, or, xor, and not.

There are two ways to construct a Bitmap object. One is to use the aggregate function groupBitmap with -State; the other is to build it from an Array object. A Bitmap object can also be converted back to an Array object.

Bitmap objects are stored in a data structure that wraps RoaringBitmap. When the cardinality is less than or equal to 32, a Set object is used. When the cardinality is greater than 32, a RoaringBitmap object is used. That is why storage of low-cardinality sets is faster.

Build a bitmap from unsigned integer array.

bitmapBuild(array)

Arguments

  • array – Unsigned integer array.

Example

SELECT bitmapBuild([1, 2, 3, 4, 5]) AS res, toTypeName(res);
┌─res─┬─toTypeName(bitmapBuild([1, 2, 3, 4, 5]))─────┐
│     │ AggregateFunction(groupBitmap, UInt8)        │
└─────┴──────────────────────────────────────────────┘

Convert bitmap to integer array.

bitmapToArray(bitmap)

Arguments

  • bitmap – Bitmap object.

Example

SELECT bitmapToArray(bitmapBuild([1, 2, 3, 4, 5])) AS res;
┌─res─────────┐
│ [1,2,3,4,5] │
└─────────────┘

Returns the subset of the bitmap within the specified range (range_end is not included).

bitmapSubsetInRange(bitmap, range_start, range_end)

Arguments

Example

SELECT bitmapToArray(bitmapSubsetInRange(bitmapBuild([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,100,200,500]), toUInt32(30), toUInt32(200))) AS res;
┌─res───────────────┐
│ [30,31,32,33,100] │
└───────────────────┘

Creates a subset of the bitmap containing at most cardinality_limit elements, starting from range_start.

Syntax

bitmapSubsetLimit(bitmap, range_start, cardinality_limit)

Arguments

Returned value

The subset.

Example

Query:

SELECT bitmapToArray(bitmapSubsetLimit(bitmapBuild([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,100,200,500]), toUInt32(30), toUInt32(200))) AS res;

Result:

┌─res───────────────────────┐
│ [30,31,32,33,100,200,500] │
└───────────────────────────┘

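Returns the bitmap elements starting from the offset position. The number of returned elements is limited by cardinality_limit.

Syntax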

subBitmap(bitmap, offset, cardinality_limit)

Arguments

Returned value

The subset.

Example

Query:

SELECT bitmapToArray(subBitmap(bitmapBuild([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,100,200,500]), toUInt32(10), toUInt32(10))) AS res;

Result:

┌─res─────────────────────────────┐
│ [10,11,12,13,14,15,16,17,18,19] │
└─────────────────────────────────┘
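
The three subset functions take similar arguments but select elements differently. The Python sketch below models them on an ordinary set and returns sorted lists, mirroring the bitmapToArray output in the examples. The names are hypothetical, and the interpretation of each argument is inferred from the example results above.

```python
def bitmap_subset_in_range(bitmap, range_start, range_end):
    """Elements with range_start <= value < range_end."""
    return sorted(v for v in bitmap if range_start <= v < range_end)


def bitmap_subset_limit(bitmap, range_start, cardinality_limit):
    """At most cardinality_limit elements with value >= range_start."""
    return sorted(v for v in bitmap if v >= range_start)[:cardinality_limit]


def sub_bitmap(bitmap, offset, cardinality_limit):
    """Skip the first offset elements, then take at most cardinality_limit."""
    return sorted(bitmap)[offset:offset + cardinality_limit]


values = set(range(34)) | {100, 200, 500}
print(bitmap_subset_in_range(values, 30, 200))  # [30, 31, 32, 33, 100]
print(bitmap_subset_limit(values, 30, 200))     # [30, 31, 32, 33, 100, 200, 500]
print(sub_bitmap(values, 10, 10))               # [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
```

In bitmapSubsetInRange the second and third arguments are both values, in bitmapSubsetLimit the third is a count, and in subBitmap the second is a position rather than a value.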

Checks whether the bitmap contains an element.

bitmapContains(haystack, needle)

Arguments

Returned values

  • 0 — If haystack does not contain needle.

  • 1 — If haystack contains needle.

Type: UInt8.

Example

SELECT bitmapContains(bitmapBuild([1,5,7,9]), toUInt32(9)) AS res;
┌─res─┐
│  1  │
└─────┘

Checks whether two bitmaps have at least one element in common.

bitmapHasAny(bitmap1, bitmap2)

Arguments

  • bitmap* – Bitmap object.

Return values

  • 1, if bitmap1 and bitmap2 have at least one element in common.

  • 0, otherwise.

Example

SELECT bitmapHasAny(bitmapBuild([1,2,3]),bitmapBuild([3,4,5])) AS res;
┌─res─┐
│  1  │
└─────┘

Analogous to hasAll(array, array): returns 1 if the first bitmap contains all the elements of the second one, and 0 otherwise. If the second argument is an empty bitmap, returns 1.

bitmapHasAll(bitmap,bitmap)

Arguments

  • bitmap – Bitmap object.

Example

SELECT bitmapHasAll(bitmapBuild([1,2,3]),bitmapBuild([3,4,5])) AS res;
┌─res─┐
│  0  │
└─────┘

Returns the bitmap cardinality, of type UInt64.

bitmapCardinality(bitmap)

Arguments

  • bitmap – Bitmap object.

Example

SELECT bitmapCardinality(bitmapBuild([1, 2, 3, 4, 5])) AS res;
┌─res─┐
│   5 │
└─────┘

Returns the smallest value of type UInt64 in the set, or UINT32_MAX if the set is empty.

bitmapMin(bitmap)

Arguments

  • bitmap – Bitmap object.

Example

SELECT bitmapMin(bitmapBuild([1, 2, 3, 4, 5])) AS res;
┌─res─┐
│   1 │
└─────┘

Returns the greatest value of type UInt64 in the set, or 0 if the set is empty.

bitmapMax(bitmap)

Arguments

  • bitmap – Bitmap object.

Example

SELECT bitmapMax(bitmapBuild([1, 2, 3, 4, 5])) AS res;
┌─res─┐
│   5 │
└─────┘

Transform an array of values in a bitmap to another array of values, the result is a new bitmap.

bitmapTransform(bitmap, from_array, to_array)

Arguments

  • bitmap – Bitmap object.

  • from_array – UInt32 array. For idx in range [0, from_array.size()), if bitmap contains from_array[idx], then replace it with to_array[idx]. Note that the result depends on array ordering if there are common elements between from_array and to_array.

  • to_array – UInt32 array. It must be the same size as from_array.

Example

SELECT bitmapToArray(bitmapTransform(bitmapBuild([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]), cast([5,999,2] as Array(UInt32)), cast([2,888,20] as Array(UInt32)))) AS res;
 ┌─res───────────────────┐
 │ [1,3,4,6,7,8,9,10,20] │
 └───────────────────────┘
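
The ordering note in the from_array description is easier to see in a small model. The sketch below assumes the replacements are applied one index at a time against the evolving set, which is consistent with the documented result above; it is a Python illustration with a hypothetical helper name, not ClickHouse's implementation.

```python
# Illustrative model only: replacements are applied sequentially, so a value
# written by an earlier pair can be replaced again by a later pair.

def bitmap_transform(bitmap, from_array, to_array):
    if len(from_array) != len(to_array):
        raise ValueError("from_array and to_array must have the same size")
    result = set(bitmap)
    for src, dst in zip(from_array, to_array):
        if src in result:
            result.remove(src)
            result.add(dst)
    return result


print(sorted(bitmap_transform(range(1, 11), [5, 999, 2], [2, 888, 20])))
# [1, 3, 4, 6, 7, 8, 9, 10, 20]
```

Here 5 is first replaced with 2, and the later pair (2, 20) then replaces 2 with 20, which is why 2 is absent from the result.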

Computes the logical AND (intersection) of two bitmaps. The result is a new bitmap.

bitmapAnd(bitmap,bitmap)

Arguments

  • bitmap – Bitmap object.

Example

SELECT bitmapToArray(bitmapAnd(bitmapBuild([1,2,3]),bitmapBuild([3,4,5]))) AS res;
┌─res─┐
│ [3] │
└─────┘

Computes the logical OR (union) of two bitmaps. The result is a new bitmap.

bitmapOr(bitmap,bitmap)

Arguments

  • bitmap – Bitmap object.

Example

SELECT bitmapToArray(bitmapOr(bitmapBuild([1,2,3]),bitmapBuild([3,4,5]))) AS res;
┌─res─────────┐
│ [1,2,3,4,5] │
└─────────────┘

Computes the logical XOR (symmetric difference) of two bitmaps. The result is a new bitmap.

bitmapXor(bitmap,bitmap)

Arguments

  • bitmap – Bitmap object.

Example

SELECT bitmapToArray(bitmapXor(bitmapBuild([1,2,3]),bitmapBuild([3,4,5]))) AS res;
┌─res───────┐
│ [1,2,4,5] │
└───────────┘

Computes the logical AND NOT of two bitmaps (the elements of the first bitmap that are not in the second). The result is a new bitmap.

bitmapAndnot(bitmap,bitmap)

Arguments

  • bitmap – Bitmap object.

Example

SELECT bitmapToArray(bitmapAndnot(bitmapBuild([1,2,3]),bitmapBuild([3,4,5]))) AS res;
┌─res───┐
│ [1,2] │
└───────┘

Computes the logical AND of two bitmaps and returns the cardinality of the result as a value of type UInt64.

bitmapAndCardinality(bitmap,bitmap)

Arguments

  • bitmap – Bitmap object.

Example

SELECT bitmapAndCardinality(bitmapBuild([1,2,3]),bitmapBuild([3,4,5])) AS res;
┌─res─┐
│   1 │
└─────┘

Computes the logical OR of two bitmaps and returns the cardinality of the result as a value of type UInt64.

bitmapOrCardinality(bitmap,bitmap)

Arguments

  • bitmap – Bitmap object.

Example

SELECT bitmapOrCardinality(bitmapBuild([1,2,3]),bitmapBuild([3,4,5])) AS res;
┌─res─┐
│   5 │
└─────┘

Computes the logical XOR of two bitmaps and returns the cardinality of the result as a value of type UInt64.

bitmapXorCardinality(bitmap,bitmap)

Arguments

  • bitmap – Bitmap object.

Example

SELECT bitmapXorCardinality(bitmapBuild([1,2,3]),bitmapBuild([3,4,5])) AS res;
┌─res─┐
│   4 │
└─────┘

Computes the logical AND NOT of two bitmaps and returns the cardinality of the result as a value of type UInt64.

bitmapAndnotCardinality(bitmap,bitmap)

Arguments

  • bitmap – Bitmap object.

Example

SELECT bitmapAndnotCardinality(bitmapBuild([1,2,3]),bitmapBuild([3,4,5])) AS res;
┌─res─┐
│   2 │
└─────┘
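
For readers more familiar with set algebra, the four bitmap operations and their cardinality variants correspond to the usual set operators. The following Python sketch is only a model of that correspondence, using the same two inputs as the examples above.

```python
# Illustrative model only: bitmaps as Python sets.
a = {1, 2, 3}
b = {3, 4, 5}

ops = {
    "bitmapAnd": a & b,     # intersection
    "bitmapOr": a | b,      # union
    "bitmapXor": a ^ b,     # symmetric difference
    "bitmapAndnot": a - b,  # elements of a that are not in b
}

for name, result in ops.items():
    # The *Cardinality variants return only the size of the result.
    print(name, sorted(result), len(result))
```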

CONDITIONAL

Controls conditional branching. Unlike most systems, ClickHouse always evaluates both expressions then and else.

Syntax

if(cond, then, else)

If the condition cond evaluates to a non-zero value, returns the result of the expression then, and the result of the expression else, if present, is skipped. If the cond is zero or NULL, then the result of the then expression is skipped and the result of the else expression, if present, is returned.

Arguments

  • cond – The condition for evaluation that can be zero or not. The type is UInt8, Nullable(UInt8) or NULL.

  • then – The expression to return if condition is met.

  • else – The expression to return if condition is not met.

Returned values

The function evaluates the then and else expressions and returns the result of one of them, depending on whether the condition cond is zero or not.

Example

Query:

SELECT if(1, plus(2, 2), plus(2, 6));

Result:

┌─plus(2, 2)─┐
│          4 │
└────────────┘

Query:

SELECT if(0, plus(2, 2), plus(2, 6));

Result:

┌─plus(2, 6)─┐
│          8 │
└────────────┘
  • then and else must have the lowest common type.

Example:

Take this LEFT_RIGHT table:

SELECT *
FROM LEFT_RIGHT

┌─left─┬─right─┐
│ ᴺᵁᴸᴸ │     4 │
│    1 │     3 │
│    2 │     2 │
│    3 │     1 │
│    4 │  ᴺᵁᴸᴸ │
└──────┴───────┘

The following query compares left and right values:

SELECT
    left,
    right,
    if(left < right, 'left is smaller than right', 'right is greater or equal than left') AS is_smaller
FROM LEFT_RIGHT
WHERE isNotNull(left) AND isNotNull(right)

┌─left─┬─right─┬─is_smaller──────────────────────────┐
│    1 │     3 │ left is smaller than right          │
│    2 │     2 │ right is greater or equal than left │
│    3 │     1 │ right is greater or equal than left │
└──────┴───────┴─────────────────────────────────────┘

The ternary operator works the same as the if function.

Syntax: cond ? then : else

Returns then if the cond evaluates to be true (greater than zero), otherwise returns else.

  • cond must be of type UInt8, and then and else must have the lowest common type.

  • then and else can be NULL

Syntax

multiIf(cond_1, then_1, cond_2, then_2, ..., else)

Arguments

  • cond_N – The condition for the function to return then_N.

  • then_N – The result of the function when cond_N is met.

  • else – The result of the function if none of the conditions is met.

The function accepts 2N+1 parameters.

Returned values

The function returns one of the values then_N or else, depending on the conditions cond_N.

Example

Again using LEFT_RIGHT table.

SELECT
    left,
    right,
    multiIf(left < right, 'left is smaller', left > right, 'left is greater', left = right, 'Both equal', 'Null value') AS result
FROM LEFT_RIGHT

┌─left─┬─right─┬─result──────────┐
│ ᴺᵁᴸᴸ │     4 │ Null value      │
│    1 │     3 │ left is smaller │
│    2 │     2 │ Both equal      │
│    3 │     1 │ left is greater │
│    4 │  ᴺᵁᴸᴸ │ Null value      │
└──────┴───────┴─────────────────┘
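
The rows with NULL fall through to the else value because a comparison with NULL is neither true nor false. A minimal Python model of this behavior, assuming None stands in for NULL and using hypothetical helper names, could look like this:

```python
import operator

# Illustrative model only: None plays the role of SQL NULL.

def sql_compare(op, a, b):
    """Compare two values; the result is None (NULL) if either operand is None."""
    if a is None or b is None:
        return None
    return op(a, b)


def multi_if(*args):
    """multi_if(cond_1, then_1, ..., cond_N, then_N, else): 2N+1 arguments."""
    *pairs, default = args
    if len(pairs) % 2 != 0:
        raise ValueError("expected 2N+1 arguments")
    for cond, then in zip(pairs[0::2], pairs[1::2]):
        if cond:  # None (NULL) and False are both treated as "not met"
            return then
    return default


rows = [(None, 4), (1, 3), (2, 2), (3, 1), (4, None)]
results = [
    multi_if(
        sql_compare(operator.lt, left, right), "left is smaller",
        sql_compare(operator.gt, left, right), "left is greater",
        sql_compare(operator.eq, left, right), "Both equal",
        "Null value",
    )
    for left, right in rows
]
print(results)
```

Because Python evaluates all arguments before the call, every then value is computed even when it is not returned, which loosely mirrors the note that both branches are always evaluated.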

DATES AND TIMES

Support for time zones.

All functions for working with the date and time that have a logical use for the time zone can accept a second optional time zone argument. Example: Asia/Yekaterinburg. In this case, they use the specified time zone instead of the local (default) one.

SELECT
    toDateTime('2016-06-15 23:00:00') AS time,
    toDate(time) AS date_local,
    toDate(time, 'Asia/Yekaterinburg') AS date_yekat,
    toString(time, 'US/Samoa') AS time_samoa
┌────────────────time─┬─date_local─┬─date_yekat─┬─time_samoa──────────┐
│ 2016-06-15 23:00:00 │ 2016-06-15 │ 2016-06-16 │ 2016-06-15 09:00:00 │
└─────────────────────┴────────────┴────────────┴─────────────────────┘

Returns the timezone of the server. If it is executed in the context of a distributed table, then it generates a normal column with values relevant to each shard. Otherwise it produces a constant value.

Syntax

timeZone()

Alias: timezone.

Returned value

  • Timezone.

Converts a time or a date and time to the specified time zone. The time zone is an attribute of the Date and DateTime data types. The internal value (number of seconds) of the table field or of the result set's column does not change. Only the column's type changes, and its string representation changes accordingly.

Syntax

toTimeZone(value, timezone)

Alias: toTimezone.

Arguments

Returned value

  • Date and time.

Example

Query:

SELECT toDateTime('2019-01-01 00:00:00', 'UTC') AS time_utc,
    toTypeName(time_utc) AS type_utc,
    toInt32(time_utc) AS int32utc,
    toTimeZone(time_utc, 'Asia/Yekaterinburg') AS time_yekat,
    toTypeName(time_yekat) AS type_yekat,
    toInt32(time_yekat) AS int32yekat,
    toTimeZone(time_utc, 'US/Samoa') AS time_samoa,
    toTypeName(time_samoa) AS type_samoa,
    toInt32(time_samoa) AS int32samoa
FORMAT Vertical;

Result:

Row 1:
──────
time_utc:   2019-01-01 00:00:00
type_utc:   DateTime('UTC')
int32utc:   1546300800
time_yekat: 2019-01-01 05:00:00
type_yekat: DateTime('Asia/Yekaterinburg')
int32yekat: 1546300800
time_samoa: 2018-12-31 13:00:00
type_samoa: DateTime('US/Samoa')
int32samoa: 1546300800

toTimeZone(time_utc, 'Asia/Yekaterinburg') changes the DateTime('UTC') type to DateTime('Asia/Yekaterinburg'). The value (Unix timestamp) 1546300800 stays the same, but the string representation (the result of the toString() function) changes from time_utc: 2019-01-01 00:00:00 to time_yekat: 2019-01-01 05:00:00.
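
The same idea, one instant with several renderings, can be reproduced with Python's standard library. The sketch below uses fixed UTC offsets as stand-ins for the named zones (an assumption: UTC+5 for Asia/Yekaterinburg and UTC-11 for US/Samoa on that date) so that it does not depend on a time zone database being installed.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offsets standing in for the named zones on 2019-01-01.
UTC = timezone.utc
YEKATERINBURG = timezone(timedelta(hours=5))
SAMOA = timezone(timedelta(hours=-11))


def local_view(unix_ts, zone):
    """Return (string representation, Unix timestamp) of an instant in a zone."""
    local = datetime.fromtimestamp(unix_ts, tz=UTC).astimezone(zone)
    return local.strftime("%Y-%m-%d %H:%M:%S"), int(local.timestamp())


for zone in (UTC, YEKATERINBURG, SAMOA):
    print(local_view(1546300800, zone))
```

Each line prints a different wall-clock string next to the same timestamp, 1546300800.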

Syntax

timeZoneOf(value)

Alias: timezoneOf.

Arguments

Returned value

  • Timezone name.

Example

Query:

SELECT timezoneOf(now());

Result:

┌─timezoneOf(now())─┐
│ Etc/UTC           │
└───────────────────┘

Syntax

timeZoneOffset(value)

Alias: timezoneOffset.

Arguments

Returned value

  • Offset from UTC in seconds.

Example

Query:

SELECT toDateTime('2021-04-21 10:20:30', 'America/New_York') AS Time, toTypeName(Time) AS Type,
       timeZoneOffset(Time) AS Offset_in_seconds, (Offset_in_seconds / 3600) AS Offset_in_hours;

Result:

┌────────────────Time─┬─Type─────────────────────────┬─Offset_in_seconds─┬─Offset_in_hours─┐
│ 2021-04-21 10:20:30 │ DateTime('America/New_York') │            -14400 │              -4 │
└─────────────────────┴──────────────────────────────┴───────────────────┴─────────────────┘

Converts a date or date with time to a UInt16 number containing the year number (AD).

Alias: YEAR.

Converts a date or date with time to a UInt8 number containing the quarter number.

Alias: QUARTER.

Converts a date or date with time to a UInt8 number containing the month number (1-12).

Alias: MONTH.

Converts a date or date with time to a UInt16 number containing the number of the day of the year (1-366).

Alias: DAYOFYEAR.

Converts a date or date with time to a UInt8 number containing the number of the day of the month (1-31).

Aliases: DAYOFMONTH, DAY.

Converts a date or date with time to a UInt8 number containing the number of the day of the week (Monday is 1, and Sunday is 7).

Alias: DAYOFWEEK.

Converts a date with time to a UInt8 number containing the number of the hour in 24-hour time (0-23). This function assumes that if clocks are moved ahead, it is by one hour and occurs at 2 a.m., and if clocks are moved back, it is by one hour and occurs at 3 a.m. (which is not always true – even in Moscow the clocks were twice changed at a different time).

Alias: HOUR.

Converts a date with time to a UInt8 number containing the number of the minute of the hour (0-59).

Alias: MINUTE.

Converts a date with time to a UInt8 number containing the number of the second in the minute (0-59). Leap seconds are not accounted for.

Alias: SECOND.

Syntax

toUnixTimestamp(datetime)
toUnixTimestamp(str, [timezone])

Returned value

  • Returns the unix timestamp.

Type: UInt32.

Example

Query:

SELECT toUnixTimestamp('2017-11-05 08:07:47', 'Asia/Tokyo') AS unix_timestamp

Result:

┌─unix_timestamp─┐
│     1509836867 │
└────────────────┘
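
As a cross-check of the example above, the same conversion can be done in Python. This sketch assumes a fixed UTC+9 offset for Asia/Tokyo and uses a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

TOKYO = timezone(timedelta(hours=9))  # assumed fixed offset for Asia/Tokyo


def to_unix_timestamp(text, zone):
    """Interpret a 'YYYY-MM-DD hh:mm:ss' string in the given zone and return Unix seconds."""
    naive = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    return int(naive.replace(tzinfo=zone).timestamp())


print(to_unix_timestamp("2017-11-05 08:07:47", TOKYO))  # 1509836867
```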

NOTE

Behavior for

  • enable_extended_results_for_datetime_functions = 0: Functions toStartOfYear, toStartOfISOYear, toStartOfQuarter, toStartOfMonth, toStartOfWeek, toLastDayOfMonth, toMonday return Date or DateTime. Functions toStartOfDay, toStartOfHour, toStartOfFifteenMinutes, toStartOfTenMinutes, toStartOfFiveMinutes, toStartOfMinute, timeSlot return DateTime. Though these functions can take values of the extended types Date32 and DateTime64 as an argument, passing them a time outside the normal range (year 1970 to 2149 for Date / 2106 for DateTime) will produce wrong results.

  • enable_extended_results_for_datetime_functions = 1:

    • Functions toStartOfYear, toStartOfISOYear, toStartOfQuarter, toStartOfMonth, toStartOfWeek, toLastDayOfMonth, toMonday return Date or DateTime if their argument is a Date or DateTime, and they return Date32 or DateTime64 if their argument is a Date32 or DateTime64.

    • Functions toStartOfDay, toStartOfHour, toStartOfFifteenMinutes, toStartOfTenMinutes, toStartOfFiveMinutes, toStartOfMinute, timeSlot return DateTime if their argument is a Date or DateTime, and they return DateTime64 if their argument is a Date32 or DateTime64.

Rounds down a date or date with time to the first day of the year. Returns the date.

Rounds down a date or date with time to the first day of ISO year. Returns the date.

Rounds down a date or date with time to the first day of the quarter. The first day of the quarter is either 1 January, 1 April, 1 July, or 1 October. Returns the date.

Rounds down a date or date with time to the first day of the month. Returns the date.

NOTE

The behavior of parsing incorrect dates is implementation specific. ClickHouse may return zero date, throw an exception or do “natural” overflow.

If toLastDayOfMonth is called with an argument of type Date greater than 2149-05-31, the result will be calculated from the argument 2149-05-31 instead.

Rounds down a date or date with time to the nearest Monday. Returns the date.

Rounds down a date or date with time to the nearest Sunday or Monday by mode. Returns the date. The mode argument works exactly like the mode argument to toWeek(). For the single-argument syntax, a mode value of 0 is used.

Rounds down a date with time to the start of the day.

Rounds down a date with time to the start of the hour.

Rounds down a date with time to the start of the minute.

Truncates sub-seconds.

Syntax

toStartOfSecond(value, [timezone])

Arguments

Returned value

  • Input value without sub-seconds.

Examples

Query without timezone:

WITH toDateTime64('2020-01-01 10:20:30.999', 3) AS dt64
SELECT toStartOfSecond(dt64);

Result:

┌───toStartOfSecond(dt64)─┐
│ 2020-01-01 10:20:30.000 │
└─────────────────────────┘

Query with timezone:

WITH toDateTime64('2020-01-01 10:20:30.999', 3) AS dt64
SELECT toStartOfSecond(dt64, 'Asia/Istanbul');

Result:

┌─toStartOfSecond(dt64, 'Asia/Istanbul')─┐
│                2020-01-01 13:20:30.000 │
└────────────────────────────────────────┘

Rounds down a date with time to the start of the five-minute interval.

Rounds down a date with time to the start of the ten-minute interval.

Rounds down the date with time to the start of the fifteen-minute interval.

This is a generalization of other functions named toStartOf*. For example, toStartOfInterval(t, INTERVAL 1 year) returns the same as toStartOfYear(t), toStartOfInterval(t, INTERVAL 1 month) returns the same as toStartOfMonth(t), toStartOfInterval(t, INTERVAL 1 day) returns the same as toStartOfDay(t), toStartOfInterval(t, INTERVAL 15 minute) returns the same as toStartOfFifteenMinutes(t) etc.
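
The rounding performed by the minute-based variants can be pictured as integer division of the time elapsed since midnight. The Python sketch below is a simplified model under that assumption (it ignores time zones and daylight saving changes), with a hypothetical helper name.

```python
from datetime import datetime, timedelta


def to_start_of_minutes_interval(t, minutes):
    """Round t down to the start of its N-minute interval, counted from midnight."""
    midnight = t.replace(hour=0, minute=0, second=0, microsecond=0)
    step = timedelta(minutes=minutes)
    whole_steps = (t - midnight) // step  # number of complete intervals elapsed
    return midnight + whole_steps * step


t = datetime(2020, 1, 1, 10, 20, 30)
for minutes in (1, 5, 10, 15):
    print(minutes, to_start_of_minutes_interval(t, minutes))
```

For the sample value, the 5-minute and 10-minute intervals both start at 10:20:00, while the 15-minute interval starts at 10:15:00.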

Converts a date with time to a certain fixed date, while preserving the time.

Converts a date with time or date to the number of the year, starting from a certain fixed point in the past.

Converts a date with time or date to the number of the quarter, starting from a certain fixed point in the past.

Converts a date with time or date to the number of the month, starting from a certain fixed point in the past.

Converts a date with time or date to the number of the week, starting from a certain fixed point in the past.

Converts a date with time or date to the number of the day, starting from a certain fixed point in the past.

Converts a date with time or date to the number of the hour, starting from a certain fixed point in the past.

Converts a date with time or date to the number of the minute, starting from a certain fixed point in the past.

Converts a date with time or date to the number of the second, starting from a certain fixed point in the past.

Converts a date or date with time to a UInt16 number containing the ISO Year number.

Converts a date or date with time to a UInt8 number containing the ISO Week number.

This function returns the week number for date or datetime. The two-argument form of toWeek() enables you to specify whether the week starts on Sunday or Monday and whether the return value should be in the range from 0 to 53 or from 1 to 53. If the mode argument is omitted, the default mode is 0. toISOWeek() is a compatibility function that is equivalent to toWeek(date,3). The following table describes how the mode argument works.

┌─Mode─┬─First day of week─┬─Range─┬─Week 1 is the first week …────┐
│    0 │ Sunday            │ 0-53  │ with a Sunday in this year    │
│    1 │ Monday            │ 0-53  │ with 4 or more days this year │
│    2 │ Sunday            │ 1-53  │ with a Sunday in this year    │
│    3 │ Monday            │ 1-53  │ with 4 or more days this year │
│    4 │ Sunday            │ 0-53  │ with 4 or more days this year │
│    5 │ Monday            │ 0-53  │ with a Monday in this year    │
│    6 │ Sunday            │ 1-53  │ with 4 or more days this year │
│    7 │ Monday            │ 1-53  │ with a Monday in this year    │
│    8 │ Sunday            │ 1-53  │ contains January 1            │
│    9 │ Monday            │ 1-53  │ contains January 1            │
└──────┴───────────────────┴───────┴───────────────────────────────┘

For mode values with a meaning of “with 4 or more days this year,” weeks are numbered according to ISO 8601:1988:

  • If the week containing January 1 has 4 or more days in the new year, it is week 1.

  • Otherwise, it is the last week of the previous year, and the next week is week 1.

For mode values with a meaning of “contains January 1”, the week that contains January 1 is week 1. It does not matter how many days of the new year the week contains, even if it contains only one day.

toWeek(date[, mode][, Timezone])

Arguments

  • date – Date or DateTime.

  • mode – Optional parameter. The range of values is [0,9]; the default is 0.

  • Timezone – Optional parameter, it behaves like any other conversion function.

Example

SELECT toDate('2016-12-27') AS date, toWeek(date) AS week0, toWeek(date,1) AS week1, toWeek(date,9) AS week9;
┌───────date─┬─week0─┬─week1─┬─week9─┐
│ 2016-12-27 │    52 │    52 │     1 │
└────────────┴───────┴───────┴───────┘

Returns year and week for a date. The year in the result may be different from the year in the date argument for the first and the last week of the year.

The mode argument works exactly like the mode argument to toWeek(). For the single-argument syntax, a mode value of 0 is used.

toISOYear() is a compatibility function that is equivalent to intDiv(toYearWeek(date,3),100).

Example

SELECT toDate('2016-12-27') AS date, toYearWeek(date) AS yearWeek0, toYearWeek(date,1) AS yearWeek1, toYearWeek(date,9) AS yearWeek9;
┌───────date─┬─yearWeek0─┬─yearWeek1─┬─yearWeek9─┐
│ 2016-12-27 │    201652 │    201652 │    201701 │
└────────────┴───────────┴───────────┴───────────┘
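
Two of the modes are simple enough to model directly. The Python sketch below covers mode 3 (ISO 8601, via the standard library's isocalendar) and mode 9 (Monday start, week 1 contains January 1). It is an illustration of the rules in the table with hypothetical helper names, not the ClickHouse implementation, and the other modes are not covered.

```python
from datetime import date, timedelta


def year_week_mode3(d):
    """Year and week as YYYYWW under ISO 8601 rules (mode 3)."""
    iso = d.isocalendar()
    return iso[0] * 100 + iso[1]


def year_week_mode9(d):
    """Year and week as YYYYWW: Monday start, week 1 contains January 1 (mode 9)."""
    monday = d - timedelta(days=d.weekday())
    sunday = monday + timedelta(days=6)
    if sunday.year > monday.year:
        # The week spans the year boundary, so it contains January 1 of the new year.
        return sunday.year * 100 + 1
    jan1 = date(d.year, 1, 1)
    first_monday = jan1 - timedelta(days=jan1.weekday())
    return d.year * 100 + (monday - first_monday).days // 7 + 1


d = date(2016, 12, 27)
print(year_week_mode3(d))  # 201652
print(year_week_mode9(d))  # 201701
```

The mode 9 result matches yearWeek9 in the example above: the week of 2016-12-27 ends on Sunday 2017-01-01, so it already counts as week 1 of 2017.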

Truncates date and time data to the specified part of date.

Syntax

date_trunc(unit, value[, timezone])

Alias: dateTrunc.

Arguments

    • second

    • minute

    • hour

    • day

    • week

    • month

    • quarter

    • year

Returned value

  • Value, truncated to the specified part of date.

Example

Query without timezone:

SELECT now(), date_trunc('hour', now());

Result:

┌───────────────now()─┬─date_trunc('hour', now())─┐
│ 2020-09-28 10:40:45 │       2020-09-28 10:00:00 │
└─────────────────────┴───────────────────────────┘

Query with the specified timezone:

SELECT now(), date_trunc('hour', now(), 'Asia/Istanbul');

Result:

┌───────────────now()─┬─date_trunc('hour', now(), 'Asia/Istanbul')─┐
│ 2020-09-28 10:46:26 │                        2020-09-28 13:00:00 │
└─────────────────────┴────────────────────────────────────────────┘

Adds the time interval or date interval to the provided date or date with time.

Syntax

date_add(unit, value, date)

Aliases: dateAdd, DATE_ADD.

Arguments

    • second

    • minute

    • hour

    • day

    • week

    • month

    • quarter

    • year

Returned value

Date or date with time obtained by adding value, expressed in unit, to date.

Example

Query:

SELECT date_add(YEAR, 3, toDate('2018-01-01'));

Result:

┌─plus(toDate('2018-01-01'), toIntervalYear(3))─┐
│                                    2021-01-01 │
└───────────────────────────────────────────────┘

Syntax

date_diff('unit', startdate, enddate, [timezone])

Aliases: dateDiff, DATE_DIFF.

Arguments

    • second

    • minute

    • hour

    • day

    • week

    • month

    • quarter

    • year

Returned value

Difference between enddate and startdate expressed in unit.

Example

Query:

SELECT dateDiff('hour', toDateTime('2018-01-01 22:00:00'), toDateTime('2018-01-02 23:00:00'));

Result:

┌─dateDiff('hour', toDateTime('2018-01-01 22:00:00'), toDateTime('2018-01-02 23:00:00'))─┐
│                                                                                     25 │
└────────────────────────────────────────────────────────────────────────────────────────┘

Query:

SELECT
    toDate('2022-01-01') AS e,
    toDate('2021-12-29') AS s,
    dateDiff('day', s, e) AS day_diff,
    dateDiff('month', s, e) AS month__diff,
    dateDiff('year', s, e) AS year_diff;

Result:

┌──────────e─┬──────────s─┬─day_diff─┬─month__diff─┬─year_diff─┐
│ 2022-01-01 │ 2021-12-29 │        3 │           1 │         1 │
└────────────┴────────────┴──────────┴─────────────┴───────────┘
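
The second result may look surprising: three days apart, yet one month and one year of difference. This is what you get when the function counts unit boundaries crossed rather than complete elapsed units. The Python sketch below models that interpretation for three units only; it is an assumption consistent with the example above, not a statement about every ClickHouse version.

```python
from datetime import date


def date_diff(unit, start, end):
    """Count the unit boundaries crossed between start and end (illustrative model)."""
    if unit == "day":
        return (end - start).days
    if unit == "month":
        return (end.year - start.year) * 12 + (end.month - start.month)
    if unit == "year":
        return end.year - start.year
    raise ValueError(f"unit not covered by this sketch: {unit}")


s, e = date(2021, 12, 29), date(2022, 1, 1)
print(date_diff("day", s, e), date_diff("month", s, e), date_diff("year", s, e))  # 3 1 1
```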

Subtracts the time interval or date interval from the provided date or date with time.

Syntax

date_sub(unit, value, date)

Aliases: dateSub, DATE_SUB.

Arguments

    • second

    • minute

    • hour

    • day

    • week

    • month

    • quarter

    • year

Returned value

Date or date with time obtained by subtracting value, expressed in unit, from date.

Example

Query:

SELECT date_sub(YEAR, 3, toDate('2018-01-01'));

Result:

┌─minus(toDate('2018-01-01'), toIntervalYear(3))─┐
│                                     2015-01-01 │
└────────────────────────────────────────────────┘

Adds the specified time interval to the provided date or date with time.

Syntax

timestamp_add(date, INTERVAL value unit)

Aliases: timeStampAdd, TIMESTAMP_ADD.

Arguments

    • second

    • minute

    • hour

    • day

    • week

    • month

    • quarter

    • year

Returned value

Date or date with time with the specified value expressed in unit added to date.

Example

Query:

select timestamp_add(toDate('2018-01-01'), INTERVAL 3 MONTH);

Result:

┌─plus(toDate('2018-01-01'), toIntervalMonth(3))─┐
│                                     2018-04-01 │
└────────────────────────────────────────────────┘

Subtracts the time interval from the provided date or date with time.

Syntax

timestamp_sub(unit, value, date)

Aliases: timeStampSub, TIMESTAMP_SUB.

Arguments

    • second

    • minute

    • hour

    • day

    • week

    • month

    • quarter

    • year

Returned value

Date or date with time obtained by subtracting value, expressed in unit, from date.

Example

Query:

select timestamp_sub(MONTH, 5, toDateTime('2018-12-18 01:02:03'));

Result:

┌─minus(toDateTime('2018-12-18 01:02:03'), toIntervalMonth(5))─┐
│                                          2018-07-18 01:02:03 │
└──────────────────────────────────────────────────────────────┘
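
Month and year arithmetic is less obvious than adding seconds, because months have different lengths. The Python sketch below shows one way to model it for plain dates. It assumes that a day which does not exist in the target month is clamped to the last day of that month; the text above does not state how such cases are handled, so treat that part as an assumption.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Shift a date by a number of months (negative to subtract), clamping the day."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


print(add_months(date(2018, 1, 1), 3))     # 2018-04-01
print(add_months(date(2018, 12, 18), -5))  # 2018-07-18
print(add_months(date(2018, 1, 1), 36))    # 2021-01-01 (3 years as 36 months)
```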

Returns the current date and time at the moment of query analysis. The function is a constant expression.

Syntax

now([timezone])

Arguments

Returned value

  • Current date and time.

Example

Query without timezone:

SELECT now();

Result:

┌───────────────now()─┐
│ 2020-10-17 07:42:09 │
└─────────────────────┘

Query with the specified timezone:

SELECT now('Asia/Istanbul');

Result:

┌─now('Asia/Istanbul')─┐
│  2020-10-17 10:42:23 │
└──────────────────────┘

Returns the current date and time with sub-second precision at the moment of query analysis. The function is a constant expression.

Syntax

now64([scale], [timezone])

Arguments

  • scale - Tick size (precision): 10^(-precision) seconds. Valid range: [ 0 : 9 ]. Typical values are 3 (default, milliseconds), 6 (microseconds), and 9 (nanoseconds).

Returned value

  • Current date and time with sub-second precision.

Example

SELECT now64(), now64(9, 'Asia/Istanbul');

Result:

┌─────────────────now64()─┬─────now64(9, 'Asia/Istanbul')─┐
│ 2022-08-21 19:34:26.196 │ 2022-08-21 22:34:26.196542766 │
└─────────────────────────┴───────────────────────────────┘

Accepts zero arguments and returns the current date at one of the moments of query analysis. The same as ‘toDate(now())’.

Accepts zero arguments and returns yesterday’s date at one of the moments of query analysis. The same as ‘today() - 1’.

Rounds the time to the half hour.

Converts a date or date with time to a UInt32 number containing the year and month number (YYYY * 100 + MM).

Converts a date or date with time to a UInt32 number containing the year, month, and day number (YYYY * 10000 + MM * 100 + DD).

Converts a date or date with time to a UInt64 number containing the year, month, day, hour, minute, and second (YYYY * 10000000000 + MM * 100000000 + DD * 1000000 + hh * 10000 + mm * 100 + ss).
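
The three formulas above are plain positional arithmetic. A short Python check, with hypothetical helper names and the sample value 2018-01-02 22:33:44, makes the resulting numbers concrete.

```python
from datetime import datetime


def year_month(t):
    return t.year * 100 + t.month


def year_month_day(t):
    return t.year * 10000 + t.month * 100 + t.day


def year_to_second(t):
    return (t.year * 10000000000 + t.month * 100000000 + t.day * 1000000
            + t.hour * 10000 + t.minute * 100 + t.second)


t = datetime(2018, 1, 2, 22, 33, 44)
print(year_month(t))      # 201801
print(year_month_day(t))  # 20180102
print(year_to_second(t))  # 20180102223344
```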

Adds a Date/DateTime interval to a Date/DateTime and returns the resulting Date/DateTime. For example:

WITH
    toDate('2018-01-01') AS date,
    toDateTime('2018-01-01 00:00:00') AS date_time
SELECT
    addYears(date, 1) AS add_years_with_date,
    addYears(date_time, 1) AS add_years_with_date_time
┌─add_years_with_date─┬─add_years_with_date_time─┐
│          2019-01-01 │      2019-01-01 00:00:00 │
└─────────────────────┴──────────────────────────┘

Subtracts a Date/DateTime interval from a Date/DateTime and returns the resulting Date/DateTime. For example:

WITH
    toDate('2019-01-01') AS date,
    toDateTime('2019-01-01 00:00:00') AS date_time
SELECT
    subtractYears(date, 1) AS subtract_years_with_date,
    subtractYears(date_time, 1) AS subtract_years_with_date_time
┌─subtract_years_with_date─┬─subtract_years_with_date_time─┐
│               2018-01-01 │           2018-01-01 00:00:00 │
└──────────────────────────┴───────────────────────────────┘

Formats a Time according to the given Format string. Format is a constant expression, so you cannot have multiple formats for a single result column.

Syntax

formatDateTime(Time, Format[, Timezone])

Returned value(s)

Returns time and date values according to the determined format.

Replacement fields

Using replacement fields, you can define a pattern for the resulting string. The “Example” column shows the formatting result for 2018-01-02 22:33:44.

┌─Placeholder─┬─Description──────────────────────────────────────────────────────────────────────┬─Example────┐
│ %C          │ year divided by 100 and truncated to integer (00-99)                             │ 20         │
│ %d          │ day of the month, zero-padded (01-31)                                            │ 02         │
│ %D          │ Short MM/DD/YY date, equivalent to %m/%d/%y                                      │ 01/02/18   │
│ %e          │ day of the month, space-padded ( 1-31)                                           │ 2          │
│ %f          │ fractional second from the fractional part of DateTime64                         │ 1234560    │
│ %F          │ short YYYY-MM-DD date, equivalent to %Y-%m-%d                                    │ 2018-01-02 │
│ %G          │                                                                                  │ 2018       │
│ %g          │ two-digit year format, aligned to ISO 8601, abbreviated from four-digit notation │ 18         │
│ %H          │ hour in 24h format (00-23)                                                       │ 22         │
│ %I          │ hour in 12h format (01-12)                                                       │ 10         │
│ %j          │ day of the year (001-366)                                                        │ 002        │
│ %m          │ month as a decimal number (01-12)                                                │ 01         │
│ %M          │ minute (00-59)                                                                   │ 33         │
│ %n          │ new-line character ('\n')                                                        │            │
│ %p          │ AM or PM designation                                                             │ PM         │
│ %Q          │ Quarter (1-4)                                                                    │ 1          │
│ %R          │ 24-hour HH:MM time, equivalent to %H:%M                                          │ 22:33      │
│ %S          │ second (00-59)                                                                   │ 44         │
│ %t          │ horizontal-tab character ('\t')                                                  │            │
│ %T          │ ISO 8601 time format (HH:MM:SS), equivalent to %H:%M:%S                          │ 22:33:44   │
│ %u          │ ISO 8601 weekday as number with Monday as 1 (1-7)                                │ 2          │
│ %V          │ ISO 8601 week number (01-53)                                                     │ 01         │
│ %w          │ weekday as a decimal number with Sunday as 0 (0-6)                               │ 2          │
│ %y          │ Year, last two digits (00-99)                                                    │ 18         │
│ %Y          │ Year                                                                             │ 2018       │
│ %z          │ Time offset from UTC as +HHMM or -HHMM                                           │ -0500      │
│ %%          │ a % sign                                                                         │ %          │
└─────────────┴──────────────────────────────────────────────────────────────────────────────────┴────────────┘

Example

Query:

SELECT formatDateTime(toDate('2010-01-04'), '%g')

Result:

┌─formatDateTime(toDate('2010-01-04'), '%g')─┐
│ 10                                         │
└────────────────────────────────────────────┘

Query:

SELECT formatDateTime(toDateTime64('2010-01-04 12:34:56.123456', 7), '%f')

Result:

┌─formatDateTime(toDateTime64('2010-01-04 12:34:56.123456', 7), '%f')─┐
│ 1234560                                                             │
└─────────────────────────────────────────────────────────────────────┘
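
Several of these placeholders have the same meaning in the C-style strftime family, which makes it easy to experiment with a pattern outside the database. The Python sketch below is only a rough analogue: it uses a handful of placeholders whose meaning is the same in both, and Python's strftime is a separate implementation that should not be assumed to match formatDateTime for every placeholder in the table.

```python
from datetime import datetime

# Same sample value as the "Example" column of the placeholder table.
t = datetime(2018, 1, 2, 22, 33, 44)

patterns = ["%Y-%m-%d", "%H:%M:%S", "%d", "%j", "%y"]
formatted = {pattern: t.strftime(pattern) for pattern in patterns}

for pattern, text in formatted.items():
    print(pattern, "->", text)
```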

Replacement fields

Using replacement fields, you can define a pattern for the resulting string.

┌─Placeholder─┬─Description──────────────────────────────┬─Presentation─┬─Examples───────────────────────────┐
│ G           │ era                                      │ text         │ AD                                 │
│ C           │ century of era (>=0)                     │ number       │ 20                                 │
│ Y           │ year of era (>=0)                        │ year         │ 1996                               │
│ x           │ weekyear (not supported yet)             │ year         │ 1996                               │
│ w           │ week of weekyear (not supported yet)     │ number       │ 27                                 │
│ e           │ day of week                              │ number       │ 2                                  │
│ E           │ day of week                              │ text         │ Tuesday; Tue                       │
│ y           │ year                                     │ year         │ 1996                               │
│ D           │ day of year                              │ number       │ 189                                │
│ M           │ month of year                            │ month        │ July; Jul; 07                      │
│ d           │ day of month                             │ number       │ 10                                 │
│ a           │ halfday of day                           │ text         │ PM                                 │
│ K           │ hour of halfday (0~11)                   │ number       │ 0                                  │
│ h           │ clockhour of halfday (1~12)              │ number       │ 12                                 │
│ H           │ hour of day (0~23)                       │ number       │ 0                                  │
│ k           │ clockhour of day (1~24)                  │ number       │ 24                                 │
│ m           │ minute of hour                           │ number       │ 30                                 │
│ s           │ second of minute                         │ number       │ 55                                 │
│ S           │ fraction of second (not supported yet)   │ number       │ 978                                │
│ z           │ time zone (short name not supported yet) │ text         │ Pacific Standard Time; PST         │
│ Z           │ time zone offset/id (not supported yet)  │ zone         │ -0800; -08:00; America/Los_Angeles │
│ '           │ escape for text                          │ delimiter    │                                    │
│ ''          │ single quote                             │ literal      │ '                                  │
└─────────────┴──────────────────────────────────────────┴──────────────┴────────────────────────────────────┘

Example

Query:

SELECT formatDateTimeInJodaSyntax(toDateTime('2010-01-04 12:34:56'), 'yyyy-MM-dd HH:mm:ss')

Result:

┌─formatDateTimeInJodaSyntax(toDateTime('2010-01-04 12:34:56'), 'yyyy-MM-dd HH:mm:ss')─┐
│ 2010-01-04 12:34:56                                                                     │
└─────────────────────────────────────────────────────────────────────────────────────────┘

Returns specified part of date.

Syntax

dateName(date_part, date)

Arguments

Returned value

  • The specified part of date.

Example

Query:

WITH toDateTime('2021-04-14 11:22:33') AS date_value
SELECT
    dateName('year', date_value),
    dateName('month', date_value),
    dateName('day', date_value);

Result:

┌─dateName('year', date_value)─┬─dateName('month', date_value)─┬─dateName('day', date_value)─┐
│ 2021                         │ April                         │ 14                          │
└──────────────────────────────┴───────────────────────────────┴─────────────────────────────┘

Alias: fromUnixTimestamp.

Example:

Query:

SELECT FROM_UNIXTIME(423543535);

Result:

┌─FROM_UNIXTIME(423543535)─┐
│      1983-06-04 10:58:55 │
└──────────────────────────┘

For example:

SELECT FROM_UNIXTIME(1234334543, '%Y-%m-%d %R:%S') AS DateTime;
┌─DateTime────────────┐
│ 2009-02-11 14:42:23 │
└─────────────────────┘

Example:

Query:

SELECT fromUnixTimestampInJodaSyntax(1669804872, 'yyyy-MM-dd HH:mm:ss', 'UTC');

Result:

┌─fromUnixTimestampInJodaSyntax(1669804872, 'yyyy-MM-dd HH:mm:ss', 'UTC')─┐
│ 2022-11-30 10:41:12                                                        │
└────────────────────────────────────────────────────────────────────────────┘

Syntax

toModifiedJulianDay(date)

Arguments

Returned value

  • Modified Julian Day number.

Example

Query:

SELECT toModifiedJulianDay('2020-01-01');

Result:

┌─toModifiedJulianDay('2020-01-01')─┐
│                             58849 │
└───────────────────────────────────┘

Syntax

toModifiedJulianDayOrNull(date)

Arguments

Returned value

  • Modified Julian Day number.

Example

Query:

SELECT toModifiedJulianDayOrNull('2020-01-01');

Result:

┌─toModifiedJulianDayOrNull('2020-01-01')─┐
│                                   58849 │
└─────────────────────────────────────────┘

Syntax

fromModifiedJulianDay(day)

Arguments

Returned value

  • Date in text form.

Example

Query:

SELECT fromModifiedJulianDay(58849);

Result:

┌─fromModifiedJulianDay(58849)─┐
│ 2020-01-01                   │
└──────────────────────────────┘
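
The Modified Julian Day number is a day count from 1858-11-17. As a sanity check of the examples above, the conversion can be reproduced in Python. This is a sketch with hypothetical helper names; Python's date type uses the proleptic Gregorian calendar and only covers years 1 to 9999.

```python
from datetime import date, timedelta

MJD_EPOCH = date(1858, 11, 17)  # day 0 of the Modified Julian Day count


def to_modified_julian_day(text):
    """Convert a 'YYYY-MM-DD' string to its Modified Julian Day number."""
    return (date.fromisoformat(text) - MJD_EPOCH).days


def from_modified_julian_day(day):
    """Convert a Modified Julian Day number to a 'YYYY-MM-DD' string."""
    return (MJD_EPOCH + timedelta(days=day)).isoformat()


print(to_modified_julian_day("2020-01-01"))  # 58849
print(from_modified_julian_day(58849))       # 2020-01-01
```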

Syntax

fromModifiedJulianDayOrNull(day)

Arguments

Returned value

  • Date in text form.

Example

Query:

SELECT fromModifiedJulianDayOrNull(58849);

Result:

┌─fromModifiedJulianDayOrNull(58849)─┐
│ 2020-01-01                         │
└────────────────────────────────────┘

DICTIONARIES

Retrieves values from a dictionary.

dictGet('dict_name', attr_names, id_expr)
dictGetOrDefault('dict_name', attr_names, id_expr, default_value_expr)
dictGetOrNull('dict_name', attr_name, id_expr)

Arguments

Returned value

  • If the dictionary has no key corresponding to id_expr, then:

    • dictGet returns the content of the <null_value> element specified for the attribute in the dictionary configuration.

    • dictGetOrDefault returns the value passed as the default_value_expr parameter.

    • dictGetOrNull returns NULL.

ClickHouse throws an exception if it cannot parse the value of the attribute or the value does not match the attribute data type.

Example for simple key dictionary

Create a text file ext-dict-test.csv containing the following:

1,1
2,2

The first column is id, the second column is c1.

Configure the dictionary:

<clickhouse>
    <dictionary>
        <name>ext-dict-test</name>
        <source>
            <file>
                <path>/path-to/ext-dict-test.csv</path>
                <format>CSV</format>
            </file>
        </source>
        <layout>
            <flat />
        </layout>
        <structure>
            <id>
                <name>id</name>
            </id>
            <attribute>
                <name>c1</name>
                <type>UInt32</type>
                <null_value></null_value>
            </attribute>
        </structure>
        <lifetime>0</lifetime>
    </dictionary>
</clickhouse>

Perform the query:

SELECT
    dictGetOrDefault('ext-dict-test', 'c1', number + 1, toUInt32(number * 10)) AS val,
    toTypeName(val) AS type
FROM system.numbers
LIMIT 3;
┌─val─┬─type───┐
│   1 │ UInt32 │
│   2 │ UInt32 │
│  20 │ UInt32 │
└─────┴────────┘
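
The fallback rules for a missing key can be modelled in a few lines. This is only an illustrative Python sketch, not how ClickHouse stores dictionaries: the in-memory mapping, the helper names, and the choice of 0 as the value of an empty `<null_value>` for a UInt32 attribute are all assumptions.

```python
# Hypothetical in-memory stand-in for 'ext-dict-test': key -> {attribute: value}.
EXT_DICT_TEST = {1: {"c1": 1}, 2: {"c1": 2}}
# Assumed per-attribute <null_value>; an empty element is modelled as 0.
NULL_VALUES = {"c1": 0}

def dict_get(d, attr, key):
    """Missing key -> the attribute's configured null value."""
    row = d.get(key)
    return NULL_VALUES[attr] if row is None else row[attr]

def dict_get_or_default(d, attr, key, default):
    """Missing key -> the caller-supplied default."""
    row = d.get(key)
    return default if row is None else row[attr]

def dict_get_or_null(d, attr, key):
    """Missing key -> None (the analogue of NULL)."""
    row = d.get(key)
    return None if row is None else row[attr]

# Mirrors the query above: keys 1..3, default = number * 10.
print([dict_get_or_default(EXT_DICT_TEST, "c1", n + 1, n * 10) for n in range(3)])
# [1, 2, 20]
```

Key 3 is absent, so only the third row falls back to its default.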

Example for complex key dictionary

Create a text file ext-dict-mult.csv containing the following:

1,1,'1'
2,2,'2'
3,3,'3'

The first column is id, the second is c1, the third is c2.

Configure the dictionary:

<clickhouse>
    <dictionary>
        <name>ext-dict-mult</name>
        <source>
            <file>
                <path>/path-to/ext-dict-mult.csv</path>
                <format>CSV</format>
            </file>
        </source>
        <layout>
            <flat />
        </layout>
        <structure>
            <id>
                <name>id</name>
            </id>
            <attribute>
                <name>c1</name>
                <type>UInt32</type>
                <null_value></null_value>
            </attribute>
            <attribute>
                <name>c2</name>
                <type>String</type>
                <null_value></null_value>
            </attribute>
        </structure>
        <lifetime>0</lifetime>
    </dictionary>
</clickhouse>

Perform the query:

SELECT
    dictGet('ext-dict-mult', ('c1','c2'), number + 1) AS val,
    toTypeName(val) AS type
FROM system.numbers
LIMIT 3;
┌─val─────┬─type──────────────────┐
│ (1,'1') │ Tuple(UInt8, String)  │
│ (2,'2') │ Tuple(UInt8, String)  │
│ (3,'3') │ Tuple(UInt8, String)  │
└─────────┴───────────────────────┘

Example for range key dictionary

Input table:

CREATE TABLE range_key_dictionary_source_table
(
    key UInt64,
    start_date Date,
    end_date Date,
    value String,
    value_nullable Nullable(String)
)
ENGINE = TinyLog();

INSERT INTO range_key_dictionary_source_table VALUES(1, toDate('2019-05-20'), toDate('2019-05-20'), 'First', 'First');
INSERT INTO range_key_dictionary_source_table VALUES(2, toDate('2019-05-20'), toDate('2019-05-20'), 'Second', NULL);
INSERT INTO range_key_dictionary_source_table VALUES(3, toDate('2019-05-20'), toDate('2019-05-20'), 'Third', 'Third');

Create the dictionary:

CREATE DICTIONARY range_key_dictionary
(
    key UInt64,
    start_date Date,
    end_date Date,
    value String,
    value_nullable Nullable(String)
)
PRIMARY KEY key
SOURCE(CLICKHOUSE(HOST 'localhost' PORT tcpPort() TABLE 'range_key_dictionary_source_table'))
LIFETIME(MIN 1 MAX 1000)
LAYOUT(RANGE_HASHED())
RANGE(MIN start_date MAX end_date);

Perform the query:

SELECT
    (number, toDate('2019-05-20')),
    dictHas('range_key_dictionary', number, toDate('2019-05-20')),
    dictGetOrNull('range_key_dictionary', 'value', number, toDate('2019-05-20')),
    dictGetOrNull('range_key_dictionary', 'value_nullable', number, toDate('2019-05-20')),
    dictGetOrNull('range_key_dictionary', ('value', 'value_nullable'), number, toDate('2019-05-20'))
FROM system.numbers LIMIT 5 FORMAT TabSeparated;

Result:

(0,'2019-05-20')        0       \N      \N      (NULL,NULL)
(1,'2019-05-20')        1       First   First   ('First','First')
(2,'2019-05-20')        1       Second  \N      ('Second',NULL)
(3,'2019-05-20')        1       Third   Third   ('Third','Third')
(4,'2019-05-20')        0       \N      \N      (NULL,NULL)

Checks whether a key is present in a dictionary.

dictHas('dict_name', id_expr)

Arguments

Returned value

  • 0, if there is no key.

  • 1, if there is a key.

Type: UInt8.

Syntax

dictGetHierarchy('dict_name', key)

Arguments

Returned value

  • Parents for the key.

Checks the ancestor of a key through the whole hierarchical chain in the dictionary.

dictIsIn('dict_name', child_id_expr, ancestor_id_expr)

Arguments

Returned value

  • 0, if child_id_expr is not a child of ancestor_id_expr.

  • 1, if child_id_expr is a child of ancestor_id_expr or if child_id_expr equals ancestor_id_expr.

Type: UInt8.
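
The ancestor check amounts to walking the parent chain. Here is a small Python sketch under stated assumptions: the hierarchy is a hypothetical key-to-parent mapping, and the function name and cycle guard are illustrative rather than taken from ClickHouse.

```python
# Hypothetical hierarchy: key -> parent key. Key 1 is the root (no entry).
PARENT = {5: 3, 3: 1, 7: 1}

def dict_is_in(parent, child, ancestor):
    """Return 1 if `ancestor` equals `child` or lies on its parent chain, else 0."""
    key, seen = child, set()
    while key not in seen:          # guard against cycles in malformed data
        if key == ancestor:
            return 1
        seen.add(key)
        if key not in parent:       # reached the root without a match
            return 0
        key = parent[key]
    return 0

print(dict_is_in(PARENT, 5, 1))  # 1: 5 -> 3 -> 1
print(dict_is_in(PARENT, 5, 7))  # 0: 7 is not on the chain of 5
```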

ClickHouse supports specialized functions that convert dictionary attribute values to a specific data type regardless of the dictionary configuration.

Functions:

  • dictGetInt8, dictGetInt16, dictGetInt32, dictGetInt64

  • dictGetUInt8, dictGetUInt16, dictGetUInt32, dictGetUInt64

  • dictGetFloat32, dictGetFloat64

  • dictGetDate

  • dictGetDateTime

  • dictGetUUID

  • dictGetString

All these functions have an OrDefault variant, for example dictGetDateOrDefault.

Syntax:

dictGet[Type]('dict_name', 'attr_name', id_expr)
dictGet[Type]OrDefault('dict_name', 'attr_name', id_expr, default_value_expr)

Arguments

Returned value

  • If there is no requested id_expr in the dictionary then:

    - `dictGet[Type]` returns the content of the `<null_value>` element specified for the attribute in the dictionary configuration.
    - `dictGet[Type]OrDefault` returns the value passed as the `default_value_expr` parameter.

ClickHouse throws an exception if it cannot parse the value of the attribute or the value does not match the attribute data type.

ENCODING

Returns a string whose length equals the number of passed arguments and in which each byte has the value of the corresponding argument. Accepts multiple arguments of numeric types. If an argument's value is outside the range of the UInt8 data type, it is converted to UInt8 with possible rounding and overflow.

Syntax

char(number_1, [number_2, ..., number_n]);

Arguments

Returned value

  • a string of given bytes.

Type: String.

Example

Query:

SELECT char(104.1, 101, 108.9, 108.9, 111) AS hello;

Result:

┌─hello─┐
│ hello │
└───────┘

You can construct a string of arbitrary encoding by passing the corresponding bytes. Here is an example for UTF-8:

Query:

SELECT char(0xD0, 0xBF, 0xD1, 0x80, 0xD0, 0xB8, 0xD0, 0xB2, 0xD0, 0xB5, 0xD1, 0x82) AS hello;

Result:

┌─hello──┐
│ привет │
└────────┘

Query:

SELECT char(0xE4, 0xBD, 0xA0, 0xE5, 0xA5, 0xBD) AS hello;

Result:

┌─hello─┐
│ 你好  │
└───────┘
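
The byte-per-argument behaviour can be imitated in Python. This sketch assumes the conversion to UInt8 truncates the fractional part and wraps modulo 256, which is consistent with the `hello` example above but is not confirmed as the exact ClickHouse conversion rule; `ch_char` is a hypothetical name.

```python
def ch_char(*numbers):
    """Build a byte string with one byte per numeric argument."""
    # int() drops the fractional part; & 0xFF wraps into 0..255 (assumed rule).
    return bytes(int(n) & 0xFF for n in numbers)

print(ch_char(104.1, 101, 108.9, 108.9, 111))                  # b'hello'
print(ch_char(0xE4, 0xBD, 0xA0, 0xE5, 0xA5, 0xBD).decode())    # 你好
```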

Returns a string containing the argument’s hexadecimal representation.

Alias: HEX.

Syntax

hex(arg)

The function uses uppercase letters A-F and does not add any prefixes (like 0x) or suffixes (like h).

For integer arguments, it prints hex digits (“nibbles”) from the most significant to least significant (big-endian or “human-readable” order). It starts with the most significant non-zero byte (leading zero bytes are omitted) but always prints both digits of every byte even if the leading digit is zero.
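
The difference between integer and floating-point output is easiest to see side by side. The Python sketch below is an illustration only: it assumes that floats are printed as their little-endian in-memory bytes (which matches the Float32 and Float64 results in this section) and that zero prints as a single `00` byte; the helper names are hypothetical.

```python
import struct

def hex_uint(n: int) -> str:
    """Big-endian hex of a non-negative integer, leading zero bytes omitted."""
    length = max(1, (n.bit_length() + 7) // 8)  # assumption: 0 prints as one byte
    return n.to_bytes(length, "big").hex().upper()

def hex_float32(x: float) -> str:
    """Hex of the little-endian memory bytes of a 32-bit float."""
    return struct.pack("<f", x).hex().upper()

def hex_float64(x: float) -> str:
    """Hex of the little-endian memory bytes of a 64-bit float."""
    return struct.pack("<d", x).hex().upper()

print(hex_uint(1))        # 01
print(hex_float32(15.0))  # 00007041
print(hex_float64(15.0))  # 0000000000002E40
```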

Arguments

Returned value

  • A string with the hexadecimal representation of the argument.

Examples

Query:

SELECT hex(1);

Result:

01

Query:

SELECT hex(toFloat32(number)) AS hex_presentation FROM numbers(15, 2);

Result:

┌─hex_presentation─┐
│ 00007041         │
│ 00008041         │
└──────────────────┘

Query:

SELECT hex(toFloat64(number)) AS hex_presentation FROM numbers(15, 2);

Result:

┌─hex_presentation─┐
│ 0000000000002E40 │
│ 0000000000003040 │
└──────────────────┘

Query:

SELECT lower(hex(toUUID('61f0c404-5cb3-11e7-907b-a6006ad3dba0'))) as uuid_hex

Result:

┌─uuid_hex─────────────────────────┐
│ 61f0c4045cb311e7907ba6006ad3dba0 │
└──────────────────────────────────┘

NOTE

If unhex is invoked from within the clickhouse-client, binary strings display using UTF-8.

Alias: UNHEX.

Syntax

unhex(arg)

Arguments

Supports both uppercase and lowercase letters A-F. The number of hexadecimal digits does not have to be even. If it is odd, the last digit is interpreted as the least significant half of the 00-0F byte. If the argument string contains anything other than hexadecimal digits, some implementation-defined result is returned (an exception isn’t thrown). For a numeric argument the inverse of hex(N) is not performed by unhex().

Returned value

  • A binary string (BLOB).

Example

Query:

SELECT unhex('303132'), UNHEX('4D7953514C');

Result:

┌─unhex('303132')─┬─unhex('4D7953514C')─┐
│ 012             │ MySQL               │
└─────────────────┴─────────────────────┘

Query:

SELECT reinterpretAsUInt64(reverse(unhex('FFF'))) AS num;

Result:

┌──num─┐
│ 4095 │
└──────┘
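
Both examples can be reproduced with a short Python sketch. It assumes that an odd-length input is padded with a zero nibble on the left, which is what the `'FFF'` example above implies; input validation is omitted and `unhex` here is a hypothetical stand-in, not the ClickHouse function.

```python
def unhex(s: str) -> bytes:
    """Decode hex digits to bytes; odd-length input gets a leading zero nibble."""
    if len(s) % 2:
        s = "0" + s               # assumed padding rule, inferred from the example
    return bytes.fromhex(s)       # accepts both upper- and lowercase digits

print(unhex("303132"))            # b'012'
print(unhex("4D7953514C"))        # b'MySQL'
# reverse() + reinterpretAsUInt64 reads the reversed bytes as little-endian:
print(int.from_bytes(unhex("FFF")[::-1], "little"))  # 4095
```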

Accepts an integer. Returns a string containing the list of powers of two that total the source number when summed. They are comma-separated without spaces in text format, in ascending order.

Accepts an integer. Returns an array of UInt64 numbers containing the list of powers of two that total the source number when summed. Numbers in the array are in ascending order.
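
The two descriptions above differ only in the output container. As a rough Python sketch (hypothetical helper names, non-negative input assumed), the decomposition into powers of two looks like this:

```python
def bitmask_to_array(n: int) -> list:
    """Powers of two that sum to n, in ascending order."""
    return [1 << i for i in range(n.bit_length()) if (n >> i) & 1]

def bitmask_to_list(n: int) -> str:
    """Same powers of two, comma-separated without spaces."""
    return ",".join(str(p) for p in bitmask_to_array(n))

print(bitmask_to_array(50))  # [2, 16, 32]
print(bitmask_to_list(50))   # 2,16,32
```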

ENCRYPTION

These functions implement encryption and decryption of data with AES (Advanced Encryption Standard) algorithm.

Key length depends on the encryption mode: 16, 24, and 32 bytes for the -128-, -192-, and -256- modes respectively.

Initialization vector length is always 16 bytes (bytes in excess of 16 are ignored).

Note that these functions were slow until ClickHouse 21.1.

This function encrypts data using these modes:

  • aes-128-ecb, aes-192-ecb, aes-256-ecb

  • aes-128-cbc, aes-192-cbc, aes-256-cbc

  • aes-128-ofb, aes-192-ofb, aes-256-ofb

  • aes-128-gcm, aes-192-gcm, aes-256-gcm

  • aes-128-ctr, aes-192-ctr, aes-256-ctr

Syntax

encrypt('mode', 'plaintext', 'key' [, iv, aad])

Arguments

Returned value

Examples

Create this table:

Query:

CREATE TABLE encryption_test
(
    `comment` String,
    `secret` String
)
ENGINE = Memory;

Insert some data. Avoid storing keys or IVs in the database, as this undermines the whole concept of encryption; storing 'hints' is also unsafe and is done here only for illustrative purposes:

Query:

INSERT INTO encryption_test VALUES('aes-256-ofb no IV', encrypt('aes-256-ofb', 'Secret', '12345678910121314151617181920212')),\
('aes-256-ofb no IV, different key', encrypt('aes-256-ofb', 'Secret', 'keykeykeykeykeykeykeykeykeykeyke')),\
('aes-256-ofb with IV', encrypt('aes-256-ofb', 'Secret', '12345678910121314151617181920212', 'iviviviviviviviv')),\
('aes-256-cbc no IV', encrypt('aes-256-cbc', 'Secret', '12345678910121314151617181920212'));

Query:

SELECT comment, hex(secret) FROM encryption_test;

Result:

┌─comment──────────────────────────┬─hex(secret)──────────────────────┐
│ aes-256-ofb no IV                │ B4972BDC4459                     │
│ aes-256-ofb no IV, different key │ 2FF57C092DC9                     │
│ aes-256-ofb with IV              │ 5E6CB398F653                     │
│ aes-256-cbc no IV                │ 1BC0629A92450D9E73A00E7D02CF4142 │
└──────────────────────────────────┴──────────────────────────────────┘

Example with -gcm:

Query:

INSERT INTO encryption_test VALUES('aes-256-gcm', encrypt('aes-256-gcm', 'Secret', '12345678910121314151617181920212', 'iviviviviviviviv')), \
('aes-256-gcm with AAD', encrypt('aes-256-gcm', 'Secret', '12345678910121314151617181920212', 'iviviviviviviviv', 'aad'));

SELECT comment, hex(secret) FROM encryption_test WHERE comment LIKE '%gcm%';

Result:

┌─comment──────────────┬─hex(secret)──────────────────────────────────┐
│ aes-256-gcm          │ A8A3CCBC6426CFEEB60E4EAE03D3E94204C1B09E0254 │
│ aes-256-gcm with AAD │ A8A3CCBC6426D9A1017A0A932322F1852260A4AD6837 │
└──────────────────────┴──────────────────────────────────────────────┘

Produces the same ciphertext as encrypt on equal inputs. But when the key or iv is longer than it should normally be, aes_encrypt_mysql sticks to what MySQL's aes_encrypt does: it 'folds' the key and ignores the excess bits of the iv.
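
To make 'folding' concrete, here is a Python sketch of the key preparation only (no encryption is performed). It assumes the XOR-based scheme commonly attributed to MySQL, in which key bytes beyond the expected length wrap around and are XORed into the start of the key; treat that as an assumption, and the helper names as hypothetical.

```python
def fold_key(key: bytes, size: int) -> bytes:
    """XOR every key byte into a `size`-byte buffer, wrapping around (assumed scheme)."""
    folded = bytearray(size)
    for i, b in enumerate(key):
        folded[i % size] ^= b
    return bytes(folded)

def trim_iv(iv: bytes, size: int = 16) -> bytes:
    """Keep only the first `size` bytes of the IV; the rest is ignored."""
    return iv[:size]

key_33 = b"123456789101213141516171819202122"   # 33 bytes, one too many for aes-256
print(len(fold_key(key_33, 32)))                 # 32
print(trim_iv(b"iviviviviviviviv123456"))        # b'iviviviviviviviv'
```

Under this scheme a key of exactly the expected length is left unchanged, and any two IVs that share their first 16 bytes are equivalent.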

Supported encryption modes:

  • aes-128-ecb, aes-192-ecb, aes-256-ecb

  • aes-128-cbc, aes-192-cbc, aes-256-cbc

  • aes-128-ofb, aes-192-ofb, aes-256-ofb

Syntax

aes_encrypt_mysql('mode', 'plaintext', 'key' [, iv])

Arguments

Returned value

Examples

Given equal input, encrypt and aes_encrypt_mysql produce the same ciphertext:

Query:

SELECT encrypt('aes-256-ofb', 'Secret', '12345678910121314151617181920212', 'iviviviviviviviv') = aes_encrypt_mysql('aes-256-ofb', 'Secret', '12345678910121314151617181920212', 'iviviviviviviviv') AS ciphertexts_equal;

Result:

┌─ciphertexts_equal─┐
│                 1 │
└───────────────────┘

But encrypt fails when key or iv is longer than expected:

Query:

SELECT encrypt('aes-256-ofb', 'Secret', '123456789101213141516171819202122', 'iviviviviviviviv123');

Result:

Received exception from server (version 22.6.1):
Code: 36. DB::Exception: Received from localhost:9000. DB::Exception: Invalid key size: 33 expected 32: While processing encrypt('aes-256-ofb', 'Secret', '123456789101213141516171819202122', 'iviviviviviviviv123').

While aes_encrypt_mysql produces MySQL-compatible output:

Query:

SELECT hex(aes_encrypt_mysql('aes-256-ofb', 'Secret', '123456789101213141516171819202122', 'iviviviviviviviv123')) AS ciphertext;

Result:

┌─ciphertext───┐
│ 24E9E4966469 │
└──────────────┘

Notice how supplying an even longer IV produces the same result:

Query:

SELECT hex(aes_encrypt_mysql('aes-256-ofb', 'Secret', '123456789101213141516171819202122', 'iviviviviviviviv123456')) AS ciphertext

Result:

┌─ciphertext───┐
│ 24E9E4966469 │
└──────────────┘

This is binary-equal to what MySQL produces on the same inputs:

mysql> SET  block_encryption_mode='aes-256-ofb';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT aes_encrypt('Secret', '123456789101213141516171819202122', 'iviviviviviviviv123456') as ciphertext;
+------------------------+
| ciphertext             |
+------------------------+
| 0x24E9E4966469         |
+------------------------+
1 row in set (0.00 sec)

This function decrypts ciphertext into a plaintext using these modes:

  • aes-128-ecb, aes-192-ecb, aes-256-ecb

  • aes-128-cbc, aes-192-cbc, aes-256-cbc

  • aes-128-ofb, aes-192-ofb, aes-256-ofb

  • aes-128-gcm, aes-192-gcm, aes-256-gcm

  • aes-128-ctr, aes-192-ctr, aes-256-ctr

Syntax

decrypt('mode', 'ciphertext', 'key' [, iv, aad])

Arguments

Returned value

Examples

Query:

SELECT comment, hex(secret) FROM encryption_test;

Result:

┌─comment──────────────┬─hex(secret)──────────────────────────────────┐
│ aes-256-gcm          │ A8A3CCBC6426CFEEB60E4EAE03D3E94204C1B09E0254 │
│ aes-256-gcm with AAD │ A8A3CCBC6426D9A1017A0A932322F1852260A4AD6837 │
└──────────────────────┴──────────────────────────────────────────────┘
┌─comment──────────────────────────┬─hex(secret)──────────────────────┐
│ aes-256-ofb no IV                │ B4972BDC4459                     │
│ aes-256-ofb no IV, different key │ 2FF57C092DC9                     │
│ aes-256-ofb with IV              │ 5E6CB398F653                     │
│ aes-256-cbc no IV                │ 1BC0629A92450D9E73A00E7D02CF4142 │
└──────────────────────────────────┴──────────────────────────────────┘

Now let's try to decrypt all that data.

Query:

SELECT comment, decrypt('aes-256-cfb128', secret, '12345678910121314151617181920212') as plaintext FROM encryption_test

Result:

┌─comment──────────────┬─plaintext──┐
│ aes-256-gcm          │ OQ�E
                             �t�7T�\���\�   │
│ aes-256-gcm with AAD │ OQ�E
                             �\��si����;�o�� │
└──────────────────────┴────────────┘
┌─comment──────────────────────────┬─plaintext─┐
│ aes-256-ofb no IV                │ Secret    │
│ aes-256-ofb no IV, different key │ �4�
                                        �         │
│ aes-256-ofb with IV              │ ���6�~        │
 │aes-256-cbc no IV                │ �2*4�h3c�4w��@
└──────────────────────────────────┴───────────┘

Notice how only a portion of the data was properly decrypted; the rest is gibberish because the mode, key, or iv differed from the one used for encryption.

Produces the same plaintext as decrypt on equal inputs. But when the key or iv is longer than it should normally be, aes_decrypt_mysql sticks to what MySQL's aes_decrypt does: it 'folds' the key and ignores the excess bits of the IV.

Supported decryption modes:

  • aes-128-ecb, aes-192-ecb, aes-256-ecb

  • aes-128-cbc, aes-192-cbc, aes-256-cbc

  • aes-128-cfb128

  • aes-128-ofb, aes-192-ofb, aes-256-ofb

Syntax

aes_decrypt_mysql('mode', 'ciphertext', 'key' [, iv])

Arguments

Returned value

Examples

Let's decrypt data we've previously encrypted with MySQL:

mysql> SET  block_encryption_mode='aes-256-ofb';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT aes_encrypt('Secret', '123456789101213141516171819202122', 'iviviviviviviviv123456') as ciphertext;
+------------------------+
| ciphertext             |
+------------------------+
| 0x24E9E4966469         |
+------------------------+
1 row in set (0.00 sec)

Query:

SELECT aes_decrypt_mysql('aes-256-ofb', unhex('24E9E4966469'), '123456789101213141516171819202122', 'iviviviviviviviv123456') AS plaintext

Result:

┌─plaintext─┐
│ Secret    │
└───────────┘

FILE

Reads file as a String. The file content is not parsed, so any information is read as one string and placed into the specified column.

Syntax

file(path[, default])

Arguments

Example

Inserting data from files a.txt and b.txt into a table as strings:

Query:

INSERT INTO table SELECT file('a.txt'), file('b.txt');

GEOGRAPHICAL COORDINATES

greatCircleDistance(lon1Deg, lat1Deg, lon2Deg, lat2Deg)

Input parameters

  • lon1Deg — Longitude of the first point in degrees. Range: [-180°, 180°].

  • lat1Deg — Latitude of the first point in degrees. Range: [-90°, 90°].

  • lon2Deg — Longitude of the second point in degrees. Range: [-180°, 180°].

  • lat2Deg — Latitude of the second point in degrees. Range: [-90°, 90°].

Positive values correspond to North latitude and East longitude, and negative values correspond to South latitude and West longitude.

Returned value

The distance between two points on the Earth’s surface, in meters.

Generates an exception when the input parameter values fall outside of the range.

Example

SELECT greatCircleDistance(55.755831, 37.617673, -55.755831, -37.617673)
┌─greatCircleDistance(55.755831, 37.617673, -55.755831, -37.617673)─┐
│                                                14132374.194975413 │
└───────────────────────────────────────────────────────────────────┘
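
For comparison, a spherical distance can be computed with the haversine formula. The Python sketch below is not ClickHouse's implementation: the formula choice and the Earth radius constant are assumptions, so the last digits will differ from the result above.

```python
import math

EARTH_RADIUS_M = 6372797.560856  # assumed mean Earth radius, in meters

def great_circle_angle(lon1, lat1, lon2, lat2):
    """Central angle between two points, in degrees (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))

def great_circle_distance(lon1, lat1, lon2, lat2):
    """Distance along the sphere's surface, in meters."""
    return math.radians(great_circle_angle(lon1, lat1, lon2, lat2)) * EARTH_RADIUS_M

d = great_circle_distance(55.755831, 37.617673, -55.755831, -37.617673)
print(round(d / 1000), "km")
```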

Similar to greatCircleDistance but calculates the distance on the WGS-84 ellipsoid instead of a sphere. This is a more precise approximation of the Earth Geoid. The performance is the same as for greatCircleDistance (no performance drawback). It is recommended to use geoDistance to calculate distances on Earth.

Technical note: for close enough points we calculate the distance using planar approximation with the metric on the tangent plane at the midpoint of the coordinates.

greatCircleAngle(lon1Deg, lat1Deg, lon2Deg, lat2Deg)

Input parameters

  • lon1Deg — Longitude of the first point in degrees.

  • lat1Deg — Latitude of the first point in degrees.

  • lon2Deg — Longitude of the second point in degrees.

  • lat2Deg — Latitude of the second point in degrees.

Returned value

The central angle between two points in degrees.

Example

SELECT greatCircleAngle(0, 0, 45, 0) AS arc
┌─arc─┐
│  45 │
└─────┘

Checks whether the point belongs to at least one of the ellipses. Coordinates are geometric in the Cartesian coordinate system.

pointInEllipses(x, y, x₀, y₀, a₀, b₀,...,xₙ, yₙ, aₙ, bₙ)

Input parameters

  • x, y — Coordinates of a point on the plane.

  • xᵢ, yᵢ — Coordinates of the center of the i-th ellipse.

  • aᵢ, bᵢ — Axes of the i-th ellipse in units of x, y coordinates.

The number of input parameters must be 2+4⋅n, where n is the number of ellipses.

Returned values

1 if the point is inside at least one of the ellipses; 0 if it is not.

Example

SELECT pointInEllipses(10., 10., 10., 9.1, 1., 0.9999)
┌─pointInEllipses(10., 10., 10., 9.1, 1., 0.9999)─┐
│                                               1 │
└─────────────────────────────────────────────────┘
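
The test behind this function is the standard ellipse inequality applied to each ellipse in turn. A minimal Python sketch, assuming axis-aligned ellipses, aᵢ and bᵢ used as semi-axes, and boundary points counted as inside (these details are assumptions, as is the helper name):

```python
def point_in_ellipses(x, y, *params):
    """Return 1 if (x, y) lies in at least one ellipse given as x0, y0, a, b groups."""
    if len(params) % 4 != 0:
        raise ValueError("expected 2 + 4*n arguments")
    for i in range(0, len(params), 4):
        x0, y0, a, b = params[i:i + 4]
        # Standard form: ((x - x0) / a)^2 + ((y - y0) / b)^2 <= 1
        if ((x - x0) / a) ** 2 + ((y - y0) / b) ** 2 <= 1.0:
            return 1
    return 0

print(point_in_ellipses(10., 10., 10., 9.1, 1., 0.9999))  # 1
```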

Checks whether the point belongs to the polygon on the plane.

pointInPolygon((x, y), [(a, b), (c, d) ...], ...)

Input values

  • The function also supports polygons with holes (cut out sections). In this case, add polygons that define the cut out sections using additional arguments of the function. The function does not support non-simply-connected polygons.

Returned values

1 if the point is inside the polygon, 0 if it is not. If the point is on the polygon boundary, the function may return either 0 or 1.

Example

SELECT pointInPolygon((3., 3.), [(6, 0), (8, 4), (5, 8), (0, 2)]) AS res
┌─res─┐
│   1 │
└─────┘
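
One common way to implement such a test is ray casting: count how many polygon edges a horizontal ray from the point crosses. The Python sketch below uses that technique as an assumption (the source does not say which algorithm ClickHouse uses), with holes passed as extra rings; names are hypothetical.

```python
def _inside_ring(x, y, ring):
    """Even-odd ray casting against a single closed ring of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):                      # edge spans the ray's y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:                           # crossing lies to the right
                inside = not inside
    return inside

def point_in_polygon(point, outer, *holes):
    """Return 1 if point is inside `outer` and outside every hole, else 0."""
    x, y = point
    if not _inside_ring(x, y, outer):
        return 0
    return 0 if any(_inside_ring(x, y, h) for h in holes) else 1

print(point_in_polygon((3., 3.), [(6, 0), (8, 4), (5, 8), (0, 2)]))  # 1
```

As with the documented function, points exactly on the boundary may come out either way.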

GEOHASH

geohashEncode(longitude, latitude, [precision])

Input values

  • longitude - longitude part of the coordinate you want to encode. Floating-point value in range [-180°, 180°].

  • latitude - latitude part of the coordinate you want to encode. Floating-point value in range [-90°, 90°].

  • precision - Optional, length of the resulting encoded string, defaults to 12. Integer in range [1, 12]. Any value less than 1 or greater than 12 is silently converted to 12.

Returned values

  • alphanumeric String of encoded coordinate (modified version of the base32-encoding alphabet is used).

Example

SELECT geohashEncode(-5.60302734375, 42.593994140625, 0) AS res;
┌─res──────────┐
│ ezs42d000000 │
└──────────────┘

Input values

  • encoded string - geohash-encoded string.

Returned values

  • (longitude, latitude) - 2-tuple of Float64 values of longitude and latitude.

Example

SELECT geohashDecode('ezs42') AS res;
┌─res─────────────────────────────┐
│ (-5.60302734375,42.60498046875) │
└─────────────────────────────────┘
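
Both directions follow the usual geohash scheme: longitude and latitude ranges are bisected alternately, and every five bits become one base32 character. The Python sketch below assumes that standard scheme, returns the cell centre on decoding, and applies the precision fallback described above; it is an illustration, not the ClickHouse source.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet

def geohash_encode(lon, lat, precision=12):
    if not 1 <= precision <= 12:
        precision = 12                        # out-of-range precision falls back to 12
    lon_rng, lat_rng = [-180.0, 180.0], [-90.0, 90.0]
    chars, bits, value, use_lon = [], 0, 0, True
    while len(chars) < precision:
        rng, coord = (lon_rng, lon) if use_lon else (lat_rng, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:                      # upper half -> bit 1
            value = (value << 1) | 1
            rng[0] = mid
        else:                                 # lower half -> bit 0
            value <<= 1
            rng[1] = mid
        use_lon = not use_lon
        bits += 1
        if bits == 5:                         # five bits make one character
            chars.append(BASE32[value])
            bits, value = 0, 0
    return "".join(chars)

def geohash_decode(encoded):
    lon_rng, lat_rng = [-180.0, 180.0], [-90.0, 90.0]
    use_lon = True
    for ch in encoded:
        idx = BASE32.index(ch)
        for shift in range(4, -1, -1):
            rng = lon_rng if use_lon else lat_rng
            mid = (rng[0] + rng[1]) / 2
            if (idx >> shift) & 1:
                rng[0] = mid
            else:
                rng[1] = mid
            use_lon = not use_lon
    # Centre of the final cell as (longitude, latitude).
    return ((lon_rng[0] + lon_rng[1]) / 2, (lat_rng[0] + lat_rng[1]) / 2)

print(geohash_encode(-5.60302734375, 42.593994140625, 0))  # ezs42d000000
print(geohash_decode("ezs42"))  # (-5.60302734375, 42.60498046875)
```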

Syntax

geohashesInBox(longitude_min, latitude_min, longitude_max, latitude_max, precision)

Arguments

NOTE

All coordinate parameters must be of the same type: either Float32 or Float64.

Returned values

  • Array of precision-long strings of geohash boxes covering the provided area. Do not rely on the order of items.

  • [] - Empty array if minimum latitude and longitude values aren’t less than corresponding maximum values.

NOTE

The function throws an exception if the resulting array is over 10,000,000 items long.

Example

Query:

SELECT geohashesInBox(24.48, 40.56, 24.785, 40.81, 4) AS thasos;

Result:

┌─thasos──────────────────────────────────────┐
│ ['sx1q','sx1r','sx32','sx1w','sx1x','sx38'] │
└─────────────────────────────────────────────┘

H3 INDEXES

The level of the hierarchy is called resolution and takes a value from 0 to 15, where 0 is the base level with the largest and coarsest cells.

A latitude and longitude pair can be transformed to a 64-bit H3 index, identifying a grid cell.

The H3 index is used primarily for bucketing locations and other geospatial manipulations.

Syntax

h3IsValid(h3index)

Parameter

Returned values

  • 1 — The number is a valid H3 index.

  • 0 — The number is not a valid H3 index.

Example

Query:

SELECT h3IsValid(630814730351855103) AS h3IsValid;

Result:

┌─h3IsValid─┐
│         1 │
└───────────┘

Syntax

h3GetResolution(h3index)

Parameter

Returned values

  • Index resolution. Range: [0, 15].

Example

Query:

SELECT h3GetResolution(639821929606596015) AS resolution;

Result:

┌─resolution─┐
│         14 │
└────────────┘
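
The resolution is stored inside the 64-bit index itself. A small Python sketch, assuming the standard H3 index bit layout in which bits 52 to 55 hold the resolution (the helper name is hypothetical):

```python
def h3_get_resolution(h3index: int) -> int:
    """Extract the 4-bit resolution field (bits 52..55) from an H3 index."""
    return (h3index >> 52) & 0xF

print(h3_get_resolution(639821929606596015))  # 14
```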

Syntax

h3EdgeAngle(resolution)

Parameter

Returned values

Example

Query:

SELECT h3EdgeAngle(10) AS edgeAngle;

Result:

┌───────h3EdgeAngle(10)─┐
│ 0.0005927224846720883 │
└───────────────────────┘

Syntax

h3EdgeLengthM(resolution)

Parameter

Returned values

Example

Query:

SELECT h3EdgeLengthM(15) AS edgeLengthM;

Result:

┌─edgeLengthM─┐
│ 0.509713273 │
└─────────────┘

Syntax

geoToH3(lon, lat, resolution)

Arguments

Returned values

  • Hexagon index number.

  • 0 in case of error.

Example

Query:

SELECT geoToH3(37.79506683, 55.71290588, 15) AS h3Index;

Result:

┌────────────h3Index─┐
│ 644325524701193974 │
└────────────────────┘

Syntax

h3kRing(h3index, k)

Arguments

Returned values

  • Array of H3 indexes.

Example

Query:

SELECT arrayJoin(h3kRing(644325529233966508, 1)) AS h3index;

Result:

┌────────────h3index─┐
│ 644325529233966508 │
│ 644325529233966497 │
│ 644325529233966510 │
│ 644325529233966504 │
│ 644325529233966509 │
│ 644325529233966355 │
│ 644325529233966354 │
└────────────────────┘

Syntax

h3GetBaseCell(index)

Parameter

Returned value

  • Hexagon base cell number.

Example

Query:

SELECT h3GetBaseCell(612916788725809151) AS basecell;

Result:

┌─basecell─┐
│       12 │
└──────────┘

Returns average hexagon area in square meters at the given resolution.

Syntax

h3HexAreaM2(resolution)

Parameter

Returned value

  • Area in square meters.

Example

Query:

SELECT h3HexAreaM2(13) AS area;

Result:

┌─area─┐
│ 43.9 │
└──────┘

Syntax

h3IndexesAreNeighbors(index1, index2)

Arguments

Returned value

  • 1 — Indexes are neighbours.

  • 0 — Indexes are not neighbours.

Example

Query:

SELECT h3IndexesAreNeighbors(617420388351344639, 617420388352655359) AS n;

Result:

┌─n─┐
│ 1 │
└───┘

Syntax

h3ToChildren(index, resolution)

Arguments

Returned values

  • Array of the child H3-indexes.

Example

Query:

SELECT h3ToChildren(599405990164561919, 6) AS children;

Result:

┌─children───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ [603909588852408319,603909588986626047,603909589120843775,603909589255061503,603909589389279231,603909589523496959,603909589657714687] │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Syntax

h3ToParent(index, resolution)

Arguments

Returned value

  • Parent H3 index.

Example

Query:

SELECT h3ToParent(599405990164561919, 3) AS parent;

Result:

┌─────────────parent─┐
│ 590398848891879423 │
└────────────────────┘

Converts the H3Index representation of the index to the string representation.

h3ToString(index)

Parameter

Returned value

  • String representation of the H3 index.

Example

Query:

SELECT h3ToString(617420388352917503) AS h3_string;

Result:

┌─h3_string───────┐
│ 89184926cdbffff │
└─────────────────┘

Converts the string representation to the H3Index (UInt64) representation.

Syntax

stringToH3(index_str)

Parameter

Returned value

Example

Query:

SELECT stringToH3('89184926cc3ffff') AS index;

Result:

┌──────────────index─┐
│ 617420388351344639 │
└────────────────────┘
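
The string form of an H3 index is its lowercase hexadecimal notation, so both conversions can be checked with plain integer formatting. A minimal Python sketch with hypothetical helper names:

```python
def h3_to_string(index: int) -> str:
    """Format a 64-bit H3 index as lowercase hex without a prefix."""
    return format(index, "x")

def string_to_h3(index_str: str) -> int:
    """Parse the hex string form back into the integer index."""
    return int(index_str, 16)

print(h3_to_string(617420388352917503))   # 89184926cdbffff
print(string_to_h3("89184926cc3ffff"))    # 617420388351344639
```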

HASH

Hash functions can be used for the deterministic pseudo-random shuffling of elements.

Simhash is a hash function, which returns close hash values for close (similar) arguments.

halfMD5(par1, ...)

Arguments

Returned Value

Example

SELECT halfMD5(array('e','x','a'), 'mple', 10, toDateTime('2019-06-15 23:00:00')) AS halfMD5hash, toTypeName(halfMD5hash) AS type;
┌────────halfMD5hash─┬─type───┐
│ 186182704141653334 │ UInt64 │
└────────────────────┴────────┘

Calculates the MD5 from a string and returns the resulting set of bytes as FixedString(16). If you do not need MD5 in particular, but you need a decent cryptographic 128-bit hash, use the ‘sipHash128’ function instead. If you want to get the same result as output by the md5sum utility, use lower(hex(MD5(s))).
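
The distinction between the raw 16-byte digest and the `md5sum`-style text can be seen with Python's standard `hashlib`. This is an illustrative sketch; the helper names are hypothetical.

```python
import hashlib

def md5_raw(s: bytes) -> bytes:
    """Raw 16-byte digest, the analogue of a FixedString(16) value."""
    return hashlib.md5(s).digest()

def md5sum_style(s: bytes) -> str:
    """Lowercase hex text, the analogue of lower(hex(MD5(s)))."""
    return md5_raw(s).hex()

print(len(md5_raw(b"abc")))   # 16
print(md5sum_style(b""))      # d41d8cd98f00b204e9800998ecf8427e
```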

sipHash64(par1,...)

  1. After hashing all the input parameters, the function gets the array of hashes.

  2. Function takes the first and the second elements and calculates a hash for the array of them.

  3. Then the function takes the hash value, calculated at the previous step, and the third element of the initial hash array, and calculates a hash for the array of them.

  4. The previous step is repeated for all the remaining elements of the initial hash array.
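
The four steps describe a left fold over the per-parameter hashes. The Python sketch below shows only that combining pattern. It uses BLAKE2b truncated to 64 bits as a stand-in hash, not SipHash, and the serialization of parameters is an assumption, so its values will not match ClickHouse output.

```python
import hashlib

def h64(data: bytes) -> int:
    """Stand-in 64-bit hash (BLAKE2b, 8-byte digest); NOT SipHash."""
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "little")

def combined_hash(*params):
    # Step 1: hash every parameter on its own (repr() is an assumed serialization).
    hashes = [h64(repr(p).encode()) for p in params]
    # Steps 2-4: fold left, hashing the running value together with the next hash.
    acc = hashes[0]
    for h in hashes[1:]:
        acc = h64(acc.to_bytes(8, "little") + h.to_bytes(8, "little"))
    return acc

print(combined_hash("mple", 10) != combined_hash(10, "mple"))  # True: order matters
```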

Arguments

Returned Value

Example

SELECT sipHash64(array('e','x','a'), 'mple', 10, toDateTime('2019-06-15 23:00:00')) AS SipHash, toTypeName(SipHash) AS type;
┌──────────────SipHash─┬─type───┐
│ 13726873534472839665 │ UInt64 │
└──────────────────────┴────────┘

Syntax

sipHash128(par1,...)

Arguments

Returned value

A 128-bit SipHash hash value.

Example

Query:

SELECT hex(sipHash128('foo', '\x01', 3));

Result:

┌─hex(sipHash128('foo', '', 3))────┐
│ 9DE516A64A414D4B1B609415E4523F24 │
└──────────────────────────────────┘

cityHash64(par1,...)

This is a fast non-cryptographic hash function. It uses the CityHash algorithm for string parameters and implementation-specific fast non-cryptographic hash function for parameters with other data types. The function uses the CityHash combinator to get the final results.

Arguments

Returned Value

Examples

Call example:

SELECT cityHash64(array('e','x','a'), 'mple', 10, toDateTime('2019-06-15 23:00:00')) AS CityHash, toTypeName(CityHash) AS type;
┌─────────────CityHash─┬─type───┐
│ 12072650598913549138 │ UInt64 │
└──────────────────────┴────────┘

The following example shows how to compute the checksum of the entire table with accuracy up to the row order:

SELECT groupBitXor(cityHash64(*)) FROM table
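
XOR is commutative and associative, which is why this checksum does not depend on row order. A Python sketch of the idea, using BLAKE2b as a stand-in row hash (not CityHash) and hypothetical helper names:

```python
import hashlib
from functools import reduce

def row_hash(row) -> int:
    """Stand-in 64-bit row hash (BLAKE2b over repr); NOT CityHash."""
    return int.from_bytes(hashlib.blake2b(repr(row).encode(), digest_size=8).digest(), "little")

def table_checksum(rows) -> int:
    """XOR of all row hashes, the analogue of groupBitXor(cityHash64(*))."""
    return reduce(lambda a, b: a ^ b, (row_hash(r) for r in rows), 0)

rows = [(1, "a"), (2, "b"), (3, "c")]
print(table_checksum(rows) == table_checksum(rows[::-1]))  # True
```

One caveat of XOR-based checksums: a pair of identical rows cancels out, so duplicates are not detected.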

Calculates a 32-bit hash code from any type of integer. This is a relatively fast non-cryptographic hash function of average quality for numbers.

Calculates a 64-bit hash code from any type of integer. It works faster than intHash32. Average quality.

Syntax

SHA1('s')
...
SHA512('s')

The function works fairly slowly (SHA-1 processes about 5 million short strings per second per processor core, while SHA-224 and SHA-256 process about 2.2 million). We recommend using this function only in cases when you need a specific hash function and you can’t select it. Even in these cases, we recommend applying the function offline and pre-calculating values when inserting them into the table, instead of applying it in SELECT queries.

Arguments

Returned value

  • SHA hash as a hex-unencoded FixedString. SHA-1 returns as FixedString(20), SHA-224 as FixedString(28), SHA-256 — FixedString(32), SHA-512 — FixedString(64).

Example

Query:

SELECT hex(SHA1('abc'));

Result:

┌─hex(SHA1('abc'))─────────────────────────┐
│ A9993E364706816ABA3E25717850C26C9CD0D89D │
└──────────────────────────────────────────┘
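
The digest sizes listed above and the `abc` example can be verified with Python's standard `hashlib`; this is a cross-check sketch, not part of ClickHouse.

```python
import hashlib

# Raw digest lengths in bytes, matching the FixedString sizes listed above.
for name in ("sha1", "sha224", "sha256", "sha512"):
    print(name, hashlib.new(name, b"abc").digest_size)

# Uppercase hex of the raw SHA-1 digest, the analogue of hex(SHA1('abc')).
sha1_abc = hashlib.sha1(b"abc").hexdigest().upper()
print(sha1_abc)  # A9993E364706816ABA3E25717850C26C9CD0D89D
```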

A fast, decent-quality non-cryptographic hash function for a string obtained from a URL using some type of normalization.

  • URLHash(s) – Calculates a hash from a string without one of the trailing symbols /, ? or # at the end, if present.

  • URLHash(s, N) – Calculates a hash from a string up to the N level in the URL hierarchy, without one of the trailing symbols /, ? or # at the end, if present. Levels are the same as in URLHierarchy.

farmFingerprint64(par1, ...)
farmHash64(par1, ...)

Arguments

Returned Value

Example

SELECT farmHash64(array('e','x','a'), 'mple', 10, toDateTime('2019-06-15 23:00:00')) AS FarmHash, toTypeName(FarmHash) AS type;
┌─────────────FarmHash─┬─type───┐
│ 17790458267262532859 │ UInt64 │
└──────────────────────┴────────┘

Note that Java only supports calculating hashes of signed integers, so to calculate the hash of an unsigned integer you must cast it to the proper signed ClickHouse type.

Syntax

SELECT javaHash('')

Returned value

An Int32 data type hash value.

Example

Query:

SELECT javaHash(toInt32(123));

Result:

┌─javaHash(toInt32(123))─┐
│               123      │
└────────────────────────┘

Query:

SELECT javaHash('Hello, world!');

Result:

┌─javaHash('Hello, world!')─┐
│               -1880044555 │
└───────────────────────────┘

Syntax

javaHashUTF16LE(stringUtf16le)

Arguments

  • stringUtf16le — a string in UTF-16LE encoding.

Returned value

An Int32 data type hash value.

Example

Correct query with UTF-16LE encoded string.

Query:

SELECT javaHashUTF16LE(convertCharset('test', 'utf-8', 'utf-16le'));

Result:

┌─javaHashUTF16LE(convertCharset('test', 'utf-8', 'utf-16le'))─┐
│                                                      3556498 │
└──────────────────────────────────────────────────────────────┘
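
Both string results in this section match Java's `String.hashCode` recurrence, h = 31·h + c with 32-bit signed wrap-around. The Python sketch below assumes that recurrence, applied per byte for plain strings (ASCII input only) and per 16-bit unit for UTF-16LE input; the helper names are hypothetical.

```python
def _to_int32(h: int) -> int:
    """Reinterpret the low 32 bits as a signed Int32."""
    h &= 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h

def java_hash(data: bytes) -> int:
    """h = 31*h + byte over all bytes (matches Java for ASCII strings)."""
    h = 0
    for b in data:
        h = (31 * h + b) & 0xFFFFFFFF
    return _to_int32(h)

def java_hash_utf16le(data: bytes) -> int:
    """Same recurrence over 16-bit little-endian code units."""
    h = 0
    for i in range(0, len(data) - 1, 2):
        unit = data[i] | (data[i + 1] << 8)
        h = (31 * h + unit) & 0xFFFFFFFF
    return _to_int32(h)

print(java_hash(b"Hello, world!"))                    # -1880044555
print(java_hash_utf16le("test".encode("utf-16-le")))  # 3556498
```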

Calculates HiveHash from a string.

SELECT hiveHash('')

Returned value

An Int32 data type hash value.

Type: Int32.

Example

Query:

SELECT hiveHash('Hello, world!');

Result:

┌─hiveHash('Hello, world!')─┐
│                 267439093 │
└───────────────────────────┘
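
The value above equals the Java-style string hash of the same input with the sign bit cleared. A Python sketch under that assumption (ASCII input, hypothetical helper name):

```python
def hive_hash(data: bytes) -> int:
    """Java-style 31*h + byte recurrence, then clear the sign bit."""
    h = 0
    for b in data:
        h = (31 * h + b) & 0xFFFFFFFF   # keep 32 bits, as Java int arithmetic does
    return h & 0x7FFFFFFF               # zero the top bit so the result is non-negative

print(hive_hash(b"Hello, world!"))  # 267439093
```
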
metroHash64(par1, ...)

Arguments

Returned Value

Example

SELECT metroHash64(array('e','x','a'), 'mple', 10, toDateTime('2019-06-15 23:00:00')) AS MetroHash, toTypeName(MetroHash) AS type;
┌────────────MetroHash─┬─type───┐
│ 14235658766382344533 │ UInt64 │
└──────────────────────┴────────┘
murmurHash2_32(par1, ...)
murmurHash2_64(par1, ...)

Arguments

Returned Value

Example

SELECT murmurHash2_64(array('e','x','a'), 'mple', 10, toDateTime('2019-06-15 23:00:00')) AS MurmurHash2, toTypeName(MurmurHash2) AS type;
┌──────────MurmurHash2─┬─type───┐
│ 11832096901709403633 │ UInt64 │
└──────────────────────┴────────┘

Syntax

gccMurmurHash(par1, ...)

Arguments

Returned value

  • Calculated hash value.

Example

Query:

SELECT
    gccMurmurHash(1, 2, 3) AS res1,
    gccMurmurHash(('a', [1, 2, 3], 4, (4, ['foo', 'bar'], 1, (1, 2)))) AS res2

Result:

┌─────────────────res1─┬────────────────res2─┐
│ 12384823029245979431 │ 1188926775431157506 │
└──────────────────────┴─────────────────────┘
murmurHash3_32(par1, ...)
murmurHash3_64(par1, ...)

Arguments

Returned Value

Example

SELECT murmurHash3_32(array('e','x','a'), 'mple', 10, toDateTime('2019-06-15 23:00:00')) AS MurmurHash3, toTypeName(MurmurHash3) AS type;
┌─MurmurHash3─┬─type───┐
│     2152717 │ UInt32 │
└─────────────┴────────┘

Syntax

murmurHash3_128(expr)

Arguments

Returned value

A 128-bit MurmurHash3 hash value.

Example

Query:

SELECT hex(murmurHash3_128('foo', 'foo', 'foo'));

Result:

┌─hex(murmurHash3_128('foo', 'foo', 'foo'))─┐
│ F8F7AD9B6CD4CF117A71E277E2EC2931          │
└───────────────────────────────────────────┘

Calculates xxHash from a string. It is available in two flavors, 32 and 64 bits.

SELECT xxHash32('')

OR

SELECT xxHash64('')

Returned value

A UInt32 or UInt64 data type hash value.

Type: UInt32 for xxHash32 and UInt64 for xxHash64.

Example

Query:

SELECT xxHash32('Hello, world!');

Result:

┌─xxHash32('Hello, world!')─┐
│                 834093149 │
└───────────────────────────┘

Splits an ASCII string into n-grams of ngramsize symbols and returns the n-gram simhash. Is case sensitive.

Syntax

ngramSimHash(string[, ngramsize])

Arguments

Returned value

  • Hash value.

Example

Query:

SELECT ngramSimHash('ClickHouse') AS Hash;

Result:

┌───────Hash─┐
│ 1627567969 │
└────────────┘
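To make the idea of an n-gram simhash concrete, here is a generic Python sketch of the technique: every n-gram is hashed, each hash votes on every bit position, and the majority decides the final bit. It is an illustration only. The per-n-gram hash (zlib.crc32), the 32-bit width and the default ngramsize of 3 are assumptions, so the values will not match the function's output.

```python
import zlib


def ngram_simhash(text: str, ngramsize: int = 3) -> int:
    """Generic 32-bit simhash over byte n-grams (illustrative, not ClickHouse's algorithm)."""
    data = text.encode("ascii")
    votes = [0] * 32
    for i in range(len(data) - ngramsize + 1):
        h = zlib.crc32(data[i:i + ngramsize])  # assumed per-n-gram hash
        for bit in range(32):
            # A set bit votes +1 for that position, a clear bit votes -1.
            votes[bit] += 1 if (h >> bit) & 1 else -1
    result = 0
    for bit in range(32):
        if votes[bit] > 0:
            result |= 1 << bit
    return result


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


first = ngram_simhash("ClickHouse")
second = ngram_simhash("ClickHouse")
print(hamming_distance(first, second))  # 0
```

Because each bit is a majority vote over all n-grams, strings that share most of their n-grams tend to produce hashes that differ in few bits, which is what makes comparing simhashes by Hamming distance meaningful.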

Splits an ASCII string into n-grams of ngramsize symbols and returns the n-gram simhash. Is case insensitive.

Syntax

ngramSimHashCaseInsensitive(string[, ngramsize])

Arguments

Returned value

  • Hash value.

Example

Query:

SELECT ngramSimHashCaseInsensitive('ClickHouse') AS Hash;

Result:

┌──────Hash─┐
│ 562180645 │
└───────────┘

Splits a UTF-8 string into n-grams of ngramsize symbols and returns the n-gram simhash. Is case sensitive.

Syntax

ngramSimHashUTF8(string[, ngramsize])

Arguments

Returned value

  • Hash value.

Example

Query:

SELECT ngramSimHashUTF8('ClickHouse') AS Hash;

Result:

┌───────Hash─┐
│ 1628157797 │
└────────────┘

Splits a UTF-8 string into n-grams of ngramsize symbols and returns the n-gram simhash. Is case insensitive.

Syntax

ngramSimHashCaseInsensitiveUTF8(string[, ngramsize])

Arguments

Returned value

  • Hash value.

Example

Query:

SELECT ngramSimHashCaseInsensitiveUTF8('ClickHouse') AS Hash;

Result:

┌───────Hash─┐
│ 1636742693 │
└────────────┘

Splits an ASCII string into parts (shingles) of shinglesize words and returns the word shingle simhash. Is case sensitive.

Syntax

wordShingleSimHash(string[, shinglesize])

Arguments

Returned value

  • Hash value.

Example

Query:

SELECT wordShingleSimHash('ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).') AS Hash;

Result:

┌───────Hash─┐
│ 2328277067 │
└────────────┘
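The splitting step behind word shingles can be sketched in Python as follows. The sketch assumes that a word is a run of ASCII letters and digits and that the default shinglesize is 3; both are assumptions made for illustration, and the function name word_shingles is hypothetical.

```python
import re


def word_shingles(text: str, shinglesize: int = 3) -> list:
    """Return overlapping groups of `shinglesize` consecutive words (illustrative)."""
    # Assumption: a word is a run of ASCII letters and digits.
    words = re.findall(r"[A-Za-z0-9]+", text)
    return [
        tuple(words[i:i + shinglesize])
        for i in range(len(words) - shinglesize + 1)
    ]


print(word_shingles("column-oriented database management", 2))
# [('column', 'oriented'), ('oriented', 'database'), ('database', 'management')]
```

Each shingle overlaps the next one by shinglesize - 1 words, so a small edit to the text changes only the few shingles that contain the edited word.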

Splits an ASCII string into parts (shingles) of shinglesize words and returns the word shingle simhash. Is case insensitive.

Syntax

wordShingleSimHashCaseInsensitive(string[, shinglesize])

Arguments

Returned value

  • Hash value.

Example

Query:

SELECT wordShingleSimHashCaseInsensitive('ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).') AS Hash;

Result:

┌───────Hash─┐
│ 2194812424 │
└────────────┘

Splits a UTF-8 string into parts (shingles) of shinglesize words and returns the word shingle simhash. Is case sensitive.

Syntax

wordShingleSimHashUTF8(string[, shinglesize])

Arguments

Returned value

  • Hash value.

Example

Query:

SELECT wordShingleSimHashUTF8('ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).') AS Hash;

Result:

┌───────Hash─┐
│ 2328277067 │
└────────────┘

Splits a UTF-8 string into parts (shingles) of shinglesize words and returns the word shingle simhash. Is case insensitive.

Syntax

wordShingleSimHashCaseInsensitiveUTF8(string[, shinglesize])

Arguments

Returned value

  • Hash value.

Example

Query:

SELECT wordShingleSimHashCaseInsensitiveUTF8('ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).') AS Hash;

Result:

┌───────Hash─┐
│ 2194812424 │
└────────────┘

Splits an ASCII string into n-grams of ngramsize symbols and calculates hash values for each n-gram. Uses hashnum minimum hashes to calculate the minimum hash and hashnum maximum hashes to calculate the maximum hash. Returns a tuple with these hashes. Is case sensitive.

Syntax

ngramMinHash(string[, ngramsize, hashnum])

Arguments

Returned value

  • Tuple with two hashes — the minimum and the maximum.

Example

Query:

SELECT ngramMinHash('ClickHouse') AS Tuple;

Result:

┌─Tuple──────────────────────────────────────┐
│ (18333312859352735453,9054248444481805918) │
└────────────────────────────────────────────┘
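The description above can be sketched in Python as follows: hash every n-gram, keep the hashnum smallest and the hashnum largest hashes, and fold each group into a single value. This is a sketch of the general idea only. The per-n-gram hash (zlib.crc32), the way the groups are combined, and the defaults ngramsize=3 and hashnum=6 are assumptions, so the numbers will not match the function's output.

```python
import zlib


def ngram_min_hash(text: str, ngramsize: int = 3, hashnum: int = 6) -> tuple:
    """Illustrative min/max hash over byte n-grams; assumes hashnum >= 1."""
    data = text.encode("ascii")
    hashes = sorted({
        zlib.crc32(data[i:i + ngramsize])  # assumed per-n-gram hash
        for i in range(len(data) - ngramsize + 1)
    })

    def combine(values):
        # Fold the selected hashes into one value by chaining CRC32 (an assumption).
        acc = 0
        for v in values:
            acc = zlib.crc32(v.to_bytes(4, "little"), acc)
        return acc

    smallest = hashes[:hashnum]   # the hashnum minimum hashes
    largest = hashes[-hashnum:]   # the hashnum maximum hashes
    return combine(smallest), combine(largest)


minimum, maximum = ngram_min_hash("ClickHouse")
print(minimum, maximum)
```

Two strings with many n-grams in common are likely to share their smallest and largest n-gram hashes, so equal tuples hint at similar strings.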

Splits an ASCII string into n-grams of ngramsize symbols and calculates hash values for each n-gram. Uses hashnum minimum hashes to calculate the minimum hash and hashnum maximum hashes to calculate the maximum hash. Returns a tuple with these hashes. Is case insensitive.

Syntax

ngramMinHashCaseInsensitive(string[, ngramsize, hashnum])

Arguments

Returned value

  • Tuple with two hashes — the minimum and the maximum.

Example

Query:

SELECT ngramMinHashCaseInsensitive('ClickHouse') AS Tuple;

Result:

┌─Tuple──────────────────────────────────────┐
│ (2106263556442004574,13203602793651726206) │
└────────────────────────────────────────────┘

Splits a UTF-8 string into n-grams of ngramsize symbols and calculates hash values for each n-gram. Uses hashnum minimum hashes to calculate the minimum hash and hashnum maximum hashes to calculate the maximum hash. Returns a tuple with these hashes. Is case sensitive.

Syntax

ngramMinHashUTF8(string[, ngramsize, hashnum])

Arguments

Returned value

  • Tuple with two hashes — the minimum and the maximum.

Example

Query:

SELECT ngramMinHashUTF8('ClickHouse') AS Tuple;

Result:

┌─Tuple──────────────────────────────────────┐
│ (18333312859352735453,6742163577938632877) │
└────────────────────────────────────────────┘

Splits a UTF-8 string into n-grams of ngramsize symbols and calculates hash values for each n-gram. Uses hashnum minimum hashes to calculate the minimum hash and hashnum maximum hashes to calculate the maximum hash. Returns a tuple with these hashes. Is case insensitive.

Syntax

ngramMinHashCaseInsensitiveUTF8(string[, ngramsize, hashnum])

Arguments

Returned value

  • Tuple with two hashes — the minimum and the maximum.

Example

Query:

SELECT ngramMinHashCaseInsensitiveUTF8('ClickHouse') AS Tuple;

Result:

┌─Tuple───────────────────────────────────────┐
│ (12493625717655877135,13203602793651726206) │
└─────────────────────────────────────────────┘

Syntax

ngramMinHashArg(string[, ngramsize, hashnum])

Arguments

Returned value

  • Tuple with two tuples with hashnum n-grams each.

Example

Query:

SELECT ngramMinHashArg('ClickHouse') AS Tuple;

Result:

┌─Tuple─────────────────────────────────────────────────────────────────────────┐
│ (('ous','ick','lic','Hou','kHo','use'),('Hou','lic','ick','ous','ckH','Cli')) │
└───────────────────────────────────────────────────────────────────────────────┘
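The result above contains the n-grams themselves rather than their hashes. A Python sketch of that idea, under the same kind of assumptions as any illustrative minhash (zlib.crc32 as the per-n-gram hash, defaults ngramsize=3 and hashnum=6, hypothetical function name), might look like this; the selected n-grams and their order will generally differ from the function's output.

```python
import zlib


def ngram_min_hash_arg(text: str, ngramsize: int = 3, hashnum: int = 6) -> tuple:
    """Return the n-grams with the smallest and largest hashes (illustrative)."""
    data = text.encode("ascii")
    ngrams = {data[i:i + ngramsize] for i in range(len(data) - ngramsize + 1)}
    # Rank n-grams by their hash; the n-gram itself breaks ties deterministically.
    ranked = sorted(ngrams, key=lambda g: (zlib.crc32(g), g))
    as_text = [g.decode("ascii") for g in ranked]
    return tuple(as_text[:hashnum]), tuple(as_text[-hashnum:])


low, high = ngram_min_hash_arg("ClickHouse")
print(low)
print(high)
```

Inspecting the selected n-grams is mainly useful for understanding why two strings did or did not produce the same minimum and maximum hashes.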

Syntax

ngramMinHashArgCaseInsensitive(string[, ngramsize, hashnum])

Arguments

Returned value

  • Tuple with two tuples with hashnum n-grams each.

Example

Query:

SELECT ngramMinHashArgCaseInsensitive('ClickHouse') AS Tuple;

Result:

┌─Tuple─────────────────────────────────────────────────────────────────────────┐
│ (('ous','ick','lic','kHo','use','Cli'),('kHo','lic','ick','ous','ckH','Hou')) │
└───────────────────────────────────────────────────────────────────────────────┘

Syntax

ngramMinHashArgUTF8(string[, ngramsize, hashnum])

Arguments

Returned value

  • Tuple with two tuples with hashnum n-grams each.

Example

Query:

SELECT ngramMinHashArgUTF8('ClickHouse') AS Tuple;

Result:

┌─Tuple─────────────────────────────────────────────────────────────────────────┐
│ (('ous','ick','lic','Hou','kHo','use'),('kHo','Hou','lic','ick','ous','ckH')) │
└───────────────────────────────────────────────────────────────────────────────┘

Syntax

ngramMinHashArgCaseInsensitiveUTF8(string[, ngramsize, hashnum])

Arguments

Returned value

  • Tuple with two tuples with hashnum n-grams each.

Example

Query:

SELECT ngramMinHashArgCaseInsensitiveUTF8('ClickHouse') AS Tuple;

Result:

┌─Tuple─────────────────────────────────────────────────────────────────────────┐
│ (('ckH','ous','ick','lic','kHo','use'),('kHo','lic','ick','ous','ckH','Hou')) │
└───────────────────────────────────────────────────────────────────────────────┘

Splits an ASCII string into parts (shingles) of shinglesize words and calculates hash values for each word shingle. Uses hashnum minimum hashes to calculate the minimum hash and hashnum maximum hashes to calculate the maximum hash. Returns a tuple with these hashes. Is case sensitive.

Syntax

wordShingleMinHash(string[, shinglesize, hashnum])

Arguments

Returned value

  • Tuple with two hashes — the minimum and the maximum.

Example

Query:

SELECT wordShingleMinHash('ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).') AS Tuple;

Result:

┌─Tuple──────────────────────────────────────┐
│ (16452112859864147620,5844417301642981317) │
└────────────────────────────────────────────┘

Splits an ASCII string into parts (shingles) of shinglesize words and calculates hash values for each word shingle. Uses hashnum minimum hashes to calculate the minimum hash and hashnum maximum hashes to calculate the maximum hash. Returns a tuple with these hashes. Is case insensitive.

Syntax

wordShingleMinHashCaseInsensitive(string[, shinglesize, hashnum])

Arguments

Returned value

  • Tuple with two hashes — the minimum and the maximum.

Example

Query:

SELECT wordShingleMinHashCaseInsensitive('ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).') AS Tuple;

Result:

┌─Tuple─────────────────────────────────────┐
│ (3065874883688416519,1634050779997673240) │
└───────────────────────────────────────────┘

Splits a UTF-8 string into parts (shingles) of shinglesize words and calculates hash values for each word shingle. Uses hashnum minimum hashes to calculate the minimum hash and hashnum maximum hashes to calculate the maximum hash. Returns a tuple with these hashes. Is case sensitive.

Syntax

wordShingleMinHashUTF8(string[, shinglesize, hashnum])

Arguments

Returned value

  • Tuple with two hashes — the minimum and the maximum.

Example

Query:

SELECT wordShingleMinHashUTF8('ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).') AS Tuple;

Result:

┌─Tuple──────────────────────────────────────┐
│ (16452112859864147620,5844417301642981317) │
└────────────────────────────────────────────┘

Splits a UTF-8 string into parts (shingles) of shinglesize words and calculates hash values for each word shingle. Uses hashnum minimum hashes to calculate the minimum hash and hashnum maximum hashes to calculate the maximum hash. Returns a tuple with these hashes. Is case insensitive.

Syntax

wordShingleMinHashCaseInsensitiveUTF8(string[, shinglesize, hashnum])

Arguments

Returned value

  • Tuple with two hashes — the minimum and the maximum.

Example

Query:

SELECT wordShingleMinHashCaseInsensitiveUTF8('ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).') AS Tuple;

Result:

┌─Tuple─────────────────────────────────────┐
│ (3065874883688416519,1634050779997673240) │
└───────────────────────────────────────────┘

Syntax

wordShingleMinHashArg(string[, shinglesize, hashnum])

Arguments

Returned value

  • Tuple with two tuples with hashnum word shingles each.

Example

Query:

SELECT wordShingleMinHashArg('ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).', 1, 3) AS Tuple;

Result:

┌─Tuple─────────────────────────────────────────────────────────────────┐
│ (('OLAP','database','analytical'),('online','oriented','processing')) │
└───────────────────────────────────────────────────────────────────────┘

Syntax

wordShingleMinHashArgCaseInsensitive(string[, shinglesize, hashnum])

Arguments

Returned value

  • Tuple with two tuples with hashnum word shingles each.

Example

Query:

SELECT wordShingleMinHashArgCaseInsensitive('ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).', 1, 3) AS Tuple;

Result:

┌─Tuple──────────────────────────────────────────────────────────────────┐
│ (('queries','database','analytical'),('oriented','processing','DBMS')) │
└────────────────────────────────────────────────────────────────────────┘

Syntax

wordShingleMinHashArgUTF8(string[, shinglesize, hashnum])

Arguments

Returned value

  • Tuple with two tuples with hashnum word shingles each.

Example

Query:

SELECT wordShingleMinHashArgUTF8('ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).', 1, 3) AS Tuple;

Result:

┌─Tuple─────────────────────────────────────────────────────────────────┐
│ (('OLAP','database','analytical'),('online','oriented','processing')) │
└───────────────────────────────────────────────────────────────────────┘

Syntax

wordShingleMinHashArgCaseInsensitiveUTF8(string[, shinglesize, hashnum])

Arguments

Returned value

  • Tuple with two tuples with hashnum word shingles each.

Example

Query:

SELECT wordShingleMinHashArgCaseInsensitiveUTF8('ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).', 1, 3) AS Tuple;

Result:

┌─Tuple──────────────────────────────────────────────────────────────────┐
│ (('queries','database','analytical'),('oriented','processing','DBMS')) │
└────────────────────────────────────────────────────────────────────────┘

INTROSPECTION FUNCTIONS

WARNING

These functions are slow and may raise security concerns.

For proper operation of introspection functions:

  • Install the clickhouse-common-static-dbg package.

  • Set the allow_introspection_functions setting to 1. For security reasons, introspection functions are disabled by default.

Converts a virtual memory address inside the ClickHouse server process to the filename and the line number in the ClickHouse source code.

If you use official ClickHouse packages, you need to install the clickhouse-common-static-dbg package.

Syntax

addressToLine(address_of_binary_instruction)

Arguments

Returned value

  • Source code filename and the line number in this file, delimited by a colon.

    For example, /build/obj-x86_64-linux-gnu/../src/Common/ThreadPool.cpp:199, where 199 is a line number.

  • Name of a binary, if the function couldn’t find the debug information.

  • Empty string, if the address is not valid.
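When post-processing these results in client code, the three cases listed above can be told apart with a small helper. The Python sketch below is one possible approach; the function name parse_source_location is hypothetical, and the helper simply treats a trailing ":<digits>" as the line number.

```python
def parse_source_location(value: str) -> tuple:
    """Split 'path:line' into (path, line); return (value, None) when no line number is present."""
    path, sep, line = value.rpartition(":")
    if sep and line.isdigit():
        return path, int(line)
    # Either a bare binary name (no debug information) or an empty string (invalid address).
    return value, None


print(parse_source_location("/build/obj-x86_64-linux-gnu/../src/Common/ThreadPool.cpp:199"))
# ('/build/obj-x86_64-linux-gnu/../src/Common/ThreadPool.cpp', 199)
print(parse_source_location("/usr/lib/debug/usr/bin/clickhouse"))
# ('/usr/lib/debug/usr/bin/clickhouse', None)
```

Splitting from the right keeps any colons that might appear earlier in the path intact.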

Example

Enabling introspection functions:

SET allow_introspection_functions=1;

Selecting the first row from the trace_log system table:

SELECT * FROM system.trace_log LIMIT 1 \G;
Row 1:
──────
event_date:              2019-11-19
event_time:              2019-11-19 18:57:23
revision:                54429
timer_type:              Real
thread_number:           48
query_id:                421b6855-1858-45a5-8f37-f383409d6d72
trace:                   [140658411141617,94784174532828,94784076370703,94784076372094,94784076361020,94784175007680,140658411116251,140658403895439]

The trace field contains the stack trace at the moment of sampling.

Getting the source code filename and the line number for a single address:

SELECT addressToLine(94784076370703) \G;
Row 1:
──────
addressToLine(94784076370703): /build/obj-x86_64-linux-gnu/../src/Common/ThreadPool.cpp:199

Applying the function to the whole stack trace:

SELECT
    arrayStringConcat(arrayMap(x -> addressToLine(x), trace), '\n') AS trace_source_code_lines
FROM system.trace_log
LIMIT 1
\G
Row 1:
──────
trace_source_code_lines: /lib/x86_64-linux-gnu/libpthread-2.27.so
/usr/lib/debug/usr/bin/clickhouse
/build/obj-x86_64-linux-gnu/../src/Common/ThreadPool.cpp:199
/build/obj-x86_64-linux-gnu/../src/Common/ThreadPool.h:155
/usr/include/c++/9/bits/atomic_base.h:551
/usr/lib/debug/usr/bin/clickhouse
/lib/x86_64-linux-gnu/libpthread-2.27.so
/build/glibc-OTsEL5/glibc-2.27/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:97

Similar to addressToLine, but returns an Array with all inline functions. As a result, it is much slower.

If you use official ClickHouse packages, you need to install the clickhouse-common-static-dbg package.

Syntax

addressToLineWithInlines(address_of_binary_instruction)

Arguments

Returned value

  • Array whose first element is the source code filename and the line number in this file, delimited by a colon. Starting from the second element, the source code filename, line number and function name of each inline function are listed.

  • Array with a single element, the name of a binary, if the function couldn’t find the debug information.

  • Empty array, if the address is not valid.

Example

Enabling introspection functions:

SET allow_introspection_functions=1;

Applying the function to an address:

SELECT addressToLineWithInlines(531055181::UInt64);
┌─addressToLineWithInlines(CAST('531055181', 'UInt64'))────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ ['./src/Functions/addressToLineWithInlines.cpp:98','./build_normal_debug/./src/Functions/addressToLineWithInlines.cpp:176:DB::(anonymous namespace)::FunctionAddressToLineWithInlines::implCached(unsigned long) const'] │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Applying the function to the whole stack trace:

SELECT
    ta, addressToLineWithInlines(arrayJoin(trace) as ta)
FROM system.trace_log
WHERE
    query_id = '5e173544-2020-45de-b645-5deebe2aae54';
┌────────ta─┬─addressToLineWithInlines(arrayJoin(trace))───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ 365497529 │ ['./build_normal_debug/./contrib/libcxx/include/string_view:252']                                                                                                                                                        │
│ 365593602 │ ['./build_normal_debug/./src/Common/Dwarf.cpp:191']                                                                                                                                                                      │
│ 365593866 │ ['./build_normal_debug/./src/Common/Dwarf.cpp:0']                                                                                                                                                                        │
│ 365592528 │ ['./build_normal_debug/./src/Common/Dwarf.cpp:0']                                                                                                                                                                        │
│ 365591003 │ ['./build_normal_debug/./src/Common/Dwarf.cpp:477']                                                                                                                                                                      │
│ 365590479 │ ['./build_normal_debug/./src/Common/Dwarf.cpp:442']                                                                                                                                                                      │
│ 365590600 │ ['./build_normal_debug/./src/Common/Dwarf.cpp:457']                                                                                                                                                                      │
│ 365598941 │ ['./build_normal_debug/./src/Common/Dwarf.cpp:0']                                                                                                                                                                        │
│ 365607098 │ ['./build_normal_debug/./src/Common/Dwarf.cpp:0']                                                                                                                                                                        │
│ 365590571 │ ['./build_normal_debug/./src/Common/Dwarf.cpp:451']                                                                                                                                                                      │
│ 365598941 │ ['./build_normal_debug/./src/Common/Dwarf.cpp:0']                                                                                                                                                                        │
│ 365607098 │ ['./build_normal_debug/./src/Common/Dwarf.cpp:0']                                                                                                                                                                        │
│ 365590571 │ ['./build_normal_debug/./src/Common/Dwarf.cpp:451']                                                                                                                                                                      │
│ 365598941 │ ['./build_normal_debug/./src/Common/Dwarf.cpp:0']                                                                                                                                                                        │
│ 365607098 │ ['./build_normal_debug/./src/Common/Dwarf.cpp:0']                                                                                                                                                                        │
│ 365590571 │ ['./build_normal_debug/./src/Common/Dwarf.cpp:451']                                                                                                                                                                      │
│ 365598941 │ ['./build_normal_debug/./src/Common/Dwarf.cpp:0']                                                                                                                                                                        │
│ 365597289 │ ['./build_normal_debug/./src/Common/Dwarf.cpp:807']                                                                                                                                                                      │
│ 365599840 │ ['./build_normal_debug/./src/Common/Dwarf.cpp:1118']                                                                                                                                                                     │
│ 531058145 │ ['./build_normal_debug/./src/Functions/addressToLineWithInlines.cpp:152']                                                                                                                                                │
│ 531055181 │ ['./src/Functions/addressToLineWithInlines.cpp:98','./build_normal_debug/./src/Functions/addressToLineWithInlines.cpp:176:DB::(anonymous namespace)::FunctionAddressToLineWithInlines::implCached(unsigned long) const'] │
│ 422333613 │ ['./build_normal_debug/./src/Functions/IFunctionAdaptors.h:21']                                                                                                                                                          │
│ 586866022 │ ['./build_normal_debug/./src/Functions/IFunction.cpp:216']                                                                                                                                                               │
│ 586869053 │ ['./build_normal_debug/./src/Functions/IFunction.cpp:264']                                                                                                                                                               │
│ 586873237 │ ['./build_normal_debug/./src/Functions/IFunction.cpp:334']                                                                                                                                                               │
│ 597901620 │ ['./build_normal_debug/./src/Interpreters/ExpressionActions.cpp:601']                                                                                                                                                    │
│ 597898534 │ ['./build_normal_debug/./src/Interpreters/ExpressionActions.cpp:718']                                                                                                                                                    │
│ 630442912 │ ['./build_normal_debug/./src/Processors/Transforms/ExpressionTransform.cpp:23']                                                                                                                                          │
│ 546354050 │ ['./build_normal_debug/./src/Processors/ISimpleTransform.h:38']                                                                                                                                                          │
│ 626026993 │ ['./build_normal_debug/./src/Processors/ISimpleTransform.cpp:89']                                                                                                                                                        │
│ 626294022 │ ['./build_normal_debug/./src/Processors/Executors/ExecutionThreadContext.cpp:45']                                                                                                                                        │
│ 626293730 │ ['./build_normal_debug/./src/Processors/Executors/ExecutionThreadContext.cpp:63']                                                                                                                                        │
│ 626169525 │ ['./build_normal_debug/./src/Processors/Executors/PipelineExecutor.cpp:213']                                                                                                                                             │
│ 626170308 │ ['./build_normal_debug/./src/Processors/Executors/PipelineExecutor.cpp:178']                                                                                                                                             │
│ 626166348 │ ['./build_normal_debug/./src/Processors/Executors/PipelineExecutor.cpp:329']                                                                                                                                             │
│ 626163461 │ ['./build_normal_debug/./src/Processors/Executors/PipelineExecutor.cpp:84']                                                                                                                                              │
│ 626323536 │ ['./build_normal_debug/./src/Processors/Executors/PullingAsyncPipelineExecutor.cpp:85']                                                                                                                                  │
│ 626323277 │ ['./build_normal_debug/./src/Processors/Executors/PullingAsyncPipelineExecutor.cpp:112']                                                                                                                                 │
│ 626323133 │ ['./build_normal_debug/./contrib/libcxx/include/type_traits:3682']                                                                                                                                                       │
│ 626323041 │ ['./build_normal_debug/./contrib/libcxx/include/tuple:1415']                                                                                                                                                             │
└───────────┴──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Converts a virtual memory address inside the ClickHouse server process to the symbol from ClickHouse object files.

Syntax

addressToSymbol(address_of_binary_instruction)

Arguments

Returned value

  • Symbol from ClickHouse object files.

  • Empty string, if the address is not valid.

Example

Enabling introspection functions:

SET allow_introspection_functions=1;

Selecting the first row from the trace_log system table:

SELECT * FROM system.trace_log LIMIT 1 \G;
Row 1:
──────
event_date:    2019-11-20
event_time:    2019-11-20 16:57:59
revision:      54429
timer_type:    Real
thread_number: 48
query_id:      724028bf-f550-45aa-910d-2af6212b94ac
trace:         [94138803686098,94138815010911,94138815096522,94138815101224,94138815102091,94138814222988,94138806823642,94138814457211,94138806823642,94138814457211,94138806823642,94138806795179,94138806796144,94138753770094,94138753771646,94138753760572,94138852407232,140399185266395,140399178045583]

The trace field contains the stack trace at the moment of sampling.

Getting a symbol for a single address:

SELECT addressToSymbol(94138803686098) \G;
Row 1:
──────
addressToSymbol(94138803686098): _ZNK2DB24IAggregateFunctionHelperINS_20AggregateFunctionSumImmNS_24AggregateFunctionSumDataImEEEEE19addBatchSinglePlaceEmPcPPKNS_7IColumnEPNS_5ArenaE

Applying the function to the whole stack trace:

SELECT
    arrayStringConcat(arrayMap(x -> addressToSymbol(x), trace), '\n') AS trace_symbols
FROM system.trace_log
LIMIT 1
\G
Row 1:
──────
trace_symbols: _ZNK2DB24IAggregateFunctionHelperINS_20AggregateFunctionSumImmNS_24AggregateFunctionSumDataImEEEEE19addBatchSinglePlaceEmPcPPKNS_7IColumnEPNS_5ArenaE
_ZNK2DB10Aggregator21executeWithoutKeyImplERPcmPNS0_28AggregateFunctionInstructionEPNS_5ArenaE
_ZN2DB10Aggregator14executeOnBlockESt6vectorIN3COWINS_7IColumnEE13immutable_ptrIS3_EESaIS6_EEmRNS_22AggregatedDataVariantsERS1_IPKS3_SaISC_EERS1_ISE_SaISE_EERb
_ZN2DB10Aggregator14executeOnBlockERKNS_5BlockERNS_22AggregatedDataVariantsERSt6vectorIPKNS_7IColumnESaIS9_EERS6_ISB_SaISB_EERb
_ZN2DB10Aggregator7executeERKSt10shared_ptrINS_17IBlockInputStreamEERNS_22AggregatedDataVariantsE
_ZN2DB27AggregatingBlockInputStream8readImplEv
_ZN2DB17IBlockInputStream4readEv
_ZN2DB26ExpressionBlockInputStream8readImplEv
_ZN2DB17IBlockInputStream4readEv
_ZN2DB26ExpressionBlockInputStream8readImplEv
_ZN2DB17IBlockInputStream4readEv
_ZN2DB28AsynchronousBlockInputStream9calculateEv
_ZNSt17_Function_handlerIFvvEZN2DB28AsynchronousBlockInputStream4nextEvEUlvE_E9_M_invokeERKSt9_Any_data
_ZN14ThreadPoolImplI20ThreadFromGlobalPoolE6workerESt14_List_iteratorIS0_E
_ZZN20ThreadFromGlobalPoolC4IZN14ThreadPoolImplIS_E12scheduleImplIvEET_St8functionIFvvEEiSt8optionalImEEUlvE1_JEEEOS4_DpOT0_ENKUlvE_clEv
_ZN14ThreadPoolImplISt6threadE6workerESt14_List_iteratorIS0_E
execute_native_thread_routine
start_thread
clone

Syntax

demangle(symbol)

Arguments

Returned value

  • Name of the C++ function.

  • Empty string if a symbol is not valid.

Example

Enabling introspection functions:

SET allow_introspection_functions=1;

Selecting the first row from the trace_log system table:

SELECT * FROM system.trace_log LIMIT 1 \G;
Row 1:
──────
event_date:    2019-11-20
event_time:    2019-11-20 16:57:59
revision:      54429
timer_type:    Real
thread_number: 48
query_id:      724028bf-f550-45aa-910d-2af6212b94ac
trace:         [94138803686098,94138815010911,94138815096522,94138815101224,94138815102091,94138814222988,94138806823642,94138814457211,94138806823642,94138814457211,94138806823642,94138806795179,94138806796144,94138753770094,94138753771646,94138753760572,94138852407232,140399185266395,140399178045583]

The trace field contains the stack trace at the moment of sampling.

Getting a function name for a single address:

SELECT demangle(addressToSymbol(94138803686098)) \G;
Row 1:
──────
demangle(addressToSymbol(94138803686098)): DB::IAggregateFunctionHelper<DB::AggregateFunctionSum<unsigned long, unsigned long, DB::AggregateFunctionSumData<unsigned long> > >::addBatchSinglePlace(unsigned long, char*, DB::IColumn const**, DB::Arena*) const

Applying the function to the whole stack trace:

SELECT
    arrayStringConcat(arrayMap(x -> demangle(addressToSymbol(x)), trace), '\n') AS trace_functions
FROM system.trace_log
LIMIT 1
\G
Row 1:
──────
trace_functions: DB::IAggregateFunctionHelper<DB::AggregateFunctionSum<unsigned long, unsigned long, DB::AggregateFunctionSumData<unsigned long> > >::addBatchSinglePlace(unsigned long, char*, DB::IColumn const**, DB::Arena*) const
DB::Aggregator::executeWithoutKeyImpl(char*&, unsigned long, DB::Aggregator::AggregateFunctionInstruction*, DB::Arena*) const
DB::Aggregator::executeOnBlock(std::vector<COW<DB::IColumn>::immutable_ptr<DB::IColumn>, std::allocator<COW<DB::IColumn>::immutable_ptr<DB::IColumn> > >, unsigned long, DB::AggregatedDataVariants&, std::vector<DB::IColumn const*, std::allocator<DB::IColumn const*> >&, std::vector<std::vector<DB::IColumn const*, std::allocator<DB::IColumn const*> >, std::allocator<std::vector<DB::IColumn const*, std::allocator<DB::IColumn const*> > > >&, bool&)
DB::Aggregator::executeOnBlock(DB::Block const&, DB::AggregatedDataVariants&, std::vector<DB::IColumn const*, std::allocator<DB::IColumn const*> >&, std::vector<std::vector<DB::IColumn const*, std::allocator<DB::IColumn const*> >, std::allocator<std::vector<DB::IColumn const*, std::allocator<DB::IColumn const*> > > >&, bool&)
DB::Aggregator::execute(std::shared_ptr<DB::IBlockInputStream> const&, DB::AggregatedDataVariants&)
DB::AggregatingBlockInputStream::readImpl()
DB::IBlockInputStream::read()
DB::ExpressionBlockInputStream::readImpl()
DB::IBlockInputStream::read()
DB::ExpressionBlockInputStream::readImpl()
DB::IBlockInputStream::read()
DB::AsynchronousBlockInputStream::calculate()
std::_Function_handler<void (), DB::AsynchronousBlockInputStream::next()::{lambda()#1}>::_M_invoke(std::_Any_data const&)
ThreadPoolImpl<ThreadFromGlobalPool>::worker(std::_List_iterator<ThreadFromGlobalPool>)
ThreadFromGlobalPool::ThreadFromGlobalPool<ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::function<void ()>, int, std::optional<unsigned long>)::{lambda()#3}>(ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::function<void ()>, int, std::optional<unsigned long>)::{lambda()#3}&&)::{lambda()#1}::operator()() const
ThreadPoolImpl<std::thread>::worker(std::_List_iterator<std::thread>)
execute_native_thread_routine
start_thread
clone

Syntax

tid()

Returned value

Example

Query:

SELECT tid();

Result:

┌─tid()─┐
│  3878 │
└───────┘

Syntax

logTrace('message')

Arguments

Returned value

  • Always returns 0.

Example

Query:

SELECT logTrace('logTrace message');

Result:

┌─logTrace('logTrace message')─┐
│                            0 │
└──────────────────────────────┘

IP ADDRESSES

IPv4NumToString(num)

Takes a UInt32 number and interprets it as an IPv4 address in big-endian byte order. Returns a string containing the corresponding IPv4 address in the format A.B.C.D (dot-separated numbers in decimal form).

Alias: INET_NTOA.

IPv4StringToNum(s)

The reverse function of IPv4NumToString. If the IPv4 address has an invalid format, it throws an exception.

Alias: INET_ATON.
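
The two conversions above can be modeled outside the database. The following is a minimal Python sketch of the big-endian interpretation, not ClickHouse's implementation; the function names ipv4_num_to_string and ipv4_string_to_num are hypothetical, and Python's ipaddress module may accept or reject some edge-case inputs differently from ClickHouse.

```python
import ipaddress
import struct


def ipv4_num_to_string(num: int) -> str:
    """Format a 32-bit unsigned integer as A.B.C.D, most significant byte first."""
    if not 0 <= num <= 0xFFFFFFFF:
        raise ValueError("expected a value that fits in UInt32")
    # ">I" packs the number into 4 bytes in big-endian order: A, B, C, D.
    octets = struct.pack(">I", num)
    return ".".join(str(octet) for octet in octets)


def ipv4_string_to_num(text: str) -> int:
    """Parse dotted-decimal text; raises ValueError on an invalid address."""
    return int(ipaddress.IPv4Address(text))


print(ipv4_num_to_string(0xABE1822D))
print(hex(ipv4_string_to_num("171.225.130.45")))
```

The value 0xABE1822D is the hexadecimal form of 171.225.130.45 that appears in the toIPv4 example further down this section.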

IPv4NumToStringClassC(num)

Similar to IPv4NumToString, but uses xxx instead of the last octet.

Example:

SELECT
    IPv4NumToStringClassC(ClientIP) AS k,
    count() AS c
FROM test.hits
GROUP BY k
ORDER BY c DESC
LIMIT 10
┌─k──────────────┬─────c─┐
│ 83.149.9.xxx   │ 26238 │
│ 217.118.81.xxx │ 26074 │
│ 213.87.129.xxx │ 25481 │
│ 83.149.8.xxx   │ 24984 │
│ 217.118.83.xxx │ 22797 │
│ 78.25.120.xxx  │ 22354 │
│ 213.87.131.xxx │ 21285 │
│ 78.25.121.xxx  │ 20887 │
│ 188.162.65.xxx │ 19694 │
│ 83.149.48.xxx  │ 17406 │
└────────────────┴───────┘

Since using ‘xxx’ is highly unusual, this may be changed in the future. We recommend that you do not rely on the exact format of this fragment.
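
As an illustration of the masking described above, here is a small Python sketch that keeps the first three octets and replaces the last one with the literal xxx. The function name is hypothetical and the code only models the output format shown in the example table.

```python
def ipv4_num_to_string_class_c(num: int) -> str:
    """Return 'A.B.C.xxx' for a 32-bit IPv4 number (big-endian octet order)."""
    if not 0 <= num <= 0xFFFFFFFF:
        raise ValueError("expected a value that fits in UInt32")
    a = (num >> 24) & 0xFF
    b = (num >> 16) & 0xFF
    c = (num >> 8) & 0xFF
    # The lowest byte (the host part of a /24 network) is deliberately dropped.
    return f"{a}.{b}.{c}.xxx"


# 83.149.9.7 and 83.149.9.200 fall into the same group.
first = (83 << 24) | (149 << 16) | (9 << 8) | 7
second = (83 << 24) | (149 << 16) | (9 << 8) | 200
print(ipv4_num_to_string_class_c(first), ipv4_num_to_string_class_c(second))
```

Grouping by this string is equivalent to grouping by the /24 network, which is why the example query uses it as the GROUP BY key.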

IPv6NumToString(x)

Accepts a FixedString(16) value containing the IPv6 address in binary format. Returns a string containing this address in text format. IPv6-mapped IPv4 addresses are output in the format ::ffff:111.222.33.44.

Alias: INET6_NTOA.

Examples:

SELECT IPv6NumToString(toFixedString(unhex('2A0206B8000000000000000000000011'), 16)) AS addr;
┌─addr─────────┐
│ 2a02:6b8::11 │
└──────────────┘
SELECT
    IPv6NumToString(ClientIP6 AS k),
    count() AS c
FROM hits_all
WHERE EventDate = today() AND substring(ClientIP6, 1, 12) != unhex('00000000000000000000FFFF')
GROUP BY k
ORDER BY c DESC
LIMIT 10
┌─IPv6NumToString(ClientIP6)──────────────┬─────c─┐
│ 2a02:2168:aaa:bbbb::2                   │ 24695 │
│ 2a02:2698:abcd:abcd:abcd:abcd:8888:5555 │ 22408 │
│ 2a02:6b8:0:fff::ff                      │ 16389 │
│ 2a01:4f8:111:6666::2                    │ 16016 │
│ 2a02:2168:888:222::1                    │ 15896 │
│ 2a01:7e00::ffff:ffff:ffff:222           │ 14774 │
│ 2a02:8109:eee:ee:eeee:eeee:eeee:eeee    │ 14443 │
│ 2a02:810b:8888:888:8888:8888:8888:8888  │ 14345 │
│ 2a02:6b8:0:444:4444:4444:4444:4444      │ 14279 │
│ 2a01:7e00::ffff:ffff:ffff:ffff          │ 13880 │
└─────────────────────────────────────────┴───────┘
SELECT
    IPv6NumToString(ClientIP6 AS k),
    count() AS c
FROM hits_all
WHERE EventDate = today()
GROUP BY k
ORDER BY c DESC
LIMIT 10
┌─IPv6NumToString(ClientIP6)─┬──────c─┐
│ ::ffff:94.26.111.111       │ 747440 │
│ ::ffff:37.143.222.4        │ 529483 │
│ ::ffff:5.166.111.99        │ 317707 │
│ ::ffff:46.38.11.77         │ 263086 │
│ ::ffff:79.105.111.111      │ 186611 │
│ ::ffff:93.92.111.88        │ 176773 │
│ ::ffff:84.53.111.33        │ 158709 │
│ ::ffff:217.118.11.22       │ 154004 │
│ ::ffff:217.118.11.33       │ 148449 │
│ ::ffff:217.118.11.44       │ 148243 │
└────────────────────────────┴────────┘
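
The binary-to-text conversion can be approximated with Python's standard ipaddress module. This is a sketch under the assumption that the 16 input bytes are in network byte order; ipv6_num_to_string is a hypothetical name, and the IPv4-mapped case is formatted explicitly because Python's own text form for such addresses differs between versions.

```python
import ipaddress


def ipv6_num_to_string(raw: bytes) -> str:
    """Render 16 raw bytes as IPv6 text, using dotted form for ::ffff:a.b.c.d."""
    if len(raw) != 16:
        raise ValueError("expected exactly 16 bytes")
    address = ipaddress.IPv6Address(raw)
    mapped = address.ipv4_mapped  # IPv4Address for ::ffff:0:0/96, otherwise None
    if mapped is not None:
        return f"::ffff:{mapped}"
    return str(address)


print(ipv6_num_to_string(bytes.fromhex("2A0206B8000000000000000000000011")))
print(ipv6_num_to_string(bytes.fromhex("00000000000000000000FFFF") + bytes([94, 26, 111, 111])))
```

The 12-byte prefix 00000000000000000000FFFF is the same one the example query filters on to exclude IPv4-mapped clients.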

If the input string contains a valid IPv4 address, the function returns its IPv6 equivalent. Hexadecimal digits can be uppercase or lowercase.

Alias: INET6_ATON.

Syntax

IPv6StringToNum(string)

Argument

Returned value

  • IPv6 address in binary format.

Example

Query:

SELECT addr, cutIPv6(IPv6StringToNum(addr), 0, 0) FROM (SELECT ['notaddress', '127.0.0.1', '1111::ffff'] AS addr) ARRAY JOIN addr;

Result:

┌─addr───────┬─cutIPv6(IPv6StringToNum(addr), 0, 0)─┐
│ notaddress │ ::                                   │
│ 127.0.0.1  │ ::ffff:127.0.0.1                     │
│ 1111::ffff │ 1111::ffff                           │
└────────────┴──────────────────────────────────────┘

SELECT IPv6NumToString(IPv4ToIPv6(IPv4StringToNum('192.168.0.1'))) AS addr;
┌─addr───────────────┐
│ ::ffff:192.168.0.1 │
└────────────────────┘
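
The result above is an IPv4-mapped IPv6 address. A minimal Python sketch of that mapping, assuming the standard ::ffff:0:0/96 layout (ten zero bytes, two 0xFF bytes, then the four IPv4 bytes); the function name ipv4_to_ipv6 is hypothetical:

```python
import struct

# 10 zero bytes followed by FF FF: the IPv4-mapped prefix.
MAPPED_PREFIX = bytes(10) + b"\xff\xff"


def ipv4_to_ipv6(num: int) -> bytes:
    """Embed a 32-bit IPv4 number into a 16-byte IPv4-mapped IPv6 address."""
    if not 0 <= num <= 0xFFFFFFFF:
        raise ValueError("expected a value that fits in UInt32")
    return MAPPED_PREFIX + struct.pack(">I", num)


# 192.168.0.1 is 0xC0A80001.
print(ipv4_to_ipv6(0xC0A80001).hex())
```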

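cutIPv6 accepts a FixedString(16) value containing the IPv6 address in binary format. It returns the address in text format with the specified number of trailing bytes set to zero. The second argument is the number of bytes to cut from an IPv6 address, and the third is the number of bytes to cut from an IPv4-mapped address. For example: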

WITH
    IPv6StringToNum('2001:0DB8:AC10:FE01:FEED:BABE:CAFE:F00D') AS ipv6,
    IPv4ToIPv6(IPv4StringToNum('192.168.0.1')) AS ipv4
SELECT
    cutIPv6(ipv6, 2, 0),
    cutIPv6(ipv4, 0, 2)
┌─cutIPv6(ipv6, 2, 0)─────────────────┬─cutIPv6(ipv4, 0, 2)─┐
│ 2001:db8:ac10:fe01:feed:babe:cafe:0 │ ::ffff:192.168.0.0  │
└─────────────────────────────────────┴─────────────────────┘
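
The behaviour in this example can be sketched in Python as zeroing trailing bytes, with the byte count chosen by whether the address is IPv4-mapped. This is an illustrative model only: cut_ipv6 and its parameter names are hypothetical, and the sketch returns raw bytes rather than text.

```python
MAPPED_PREFIX = bytes(10) + b"\xff\xff"  # ::ffff:0:0/96


def cut_ipv6(raw: bytes, bytes_to_cut_for_ipv6: int, bytes_to_cut_for_ipv4: int) -> bytes:
    """Zero the last N bytes, where N depends on whether raw is IPv4-mapped."""
    if len(raw) != 16:
        raise ValueError("expected exactly 16 bytes")
    is_mapped = raw[:12] == MAPPED_PREFIX
    count = bytes_to_cut_for_ipv4 if is_mapped else bytes_to_cut_for_ipv6
    count = max(0, min(count, 16))  # keep the slice bounds valid
    return raw[:16 - count] + bytes(count)


ipv6 = bytes.fromhex("20010DB8AC10FE01FEEDBABECAFEF00D")
ipv4 = MAPPED_PREFIX + bytes([192, 168, 0, 1])
print(cut_ipv6(ipv6, 2, 0).hex())
print(cut_ipv6(ipv4, 0, 2).hex())
```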
SELECT IPv4CIDRToRange(toIPv4('192.168.5.2'), 16);
┌─IPv4CIDRToRange(toIPv4('192.168.5.2'), 16)─┐
│ ('192.168.0.0','192.168.255.255')          │
└────────────────────────────────────────────┘

Accepts an IPv6 value and a UInt8 value containing the CIDR prefix length. Returns a tuple of two IPv6 values: the lowest and the highest address of the subnet.

SELECT IPv6CIDRToRange(toIPv6('2001:0db8:0000:85a3:0000:0000:ac1f:8001'), 32);
┌─IPv6CIDRToRange(toIPv6('2001:0db8:0000:85a3:0000:0000:ac1f:8001'), 32)─┐
│ ('2001:db8::','2001:db8:ffff:ffff:ffff:ffff:ffff:ffff')                │
└────────────────────────────────────────────────────────────────────────┘
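
Both CIDR range examples above can be reproduced with Python's ipaddress module, which may help when checking expected bounds. This is a sketch, not the ClickHouse implementation; cidr_to_range is a hypothetical helper that accepts either address family.

```python
import ipaddress


def cidr_to_range(address: str, prefix_length: int) -> tuple:
    """Return (lowest, highest) addresses of the subnet containing `address`."""
    # strict=False lets the address carry host bits, as in '192.168.5.2/16'.
    network = ipaddress.ip_network(f"{address}/{prefix_length}", strict=False)
    return str(network.network_address), str(network.broadcast_address)


print(cidr_to_range("192.168.5.2", 16))
print(cidr_to_range("2001:0db8:0000:85a3:0000:0000:ac1f:8001", 32))
```

The lower bound is the address with all host bits cleared and the upper bound is the address with all host bits set.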
WITH
    '171.225.130.45' as IPv4_string
SELECT
    toTypeName(IPv4StringToNum(IPv4_string)),
    toTypeName(toIPv4(IPv4_string))
┌─toTypeName(IPv4StringToNum(IPv4_string))─┬─toTypeName(toIPv4(IPv4_string))─┐
│ UInt32                                   │ IPv4                            │
└──────────────────────────────────────────┴─────────────────────────────────┘
WITH
    '171.225.130.45' as IPv4_string
SELECT
    hex(IPv4StringToNum(IPv4_string)),
    hex(toIPv4(IPv4_string))
┌─hex(IPv4StringToNum(IPv4_string))─┬─hex(toIPv4(IPv4_string))─┐
│ ABE1822D                          │ ABE1822D                 │
└───────────────────────────────────┴──────────────────────────┘

If the input string contains a valid IPv4 address, then the IPv6 equivalent of the IPv4 address is returned.

Syntax

toIPv6(string)

Argument

Returned value

  • IP address.

Examples

Query:

WITH '2001:438:ffff::407d:1bc1' AS IPv6_string
SELECT
    hex(IPv6StringToNum(IPv6_string)),
    hex(toIPv6(IPv6_string));

Result:

┌─hex(IPv6StringToNum(IPv6_string))─┬─hex(toIPv6(IPv6_string))─────────┐
│ 20010438FFFF000000000000407D1BC1  │ 20010438FFFF000000000000407D1BC1 │
└───────────────────────────────────┴──────────────────────────────────┘

Query:

SELECT toIPv6('127.0.0.1');

Result:

┌─toIPv6('127.0.0.1')─┐
│ ::ffff:127.0.0.1    │
└─────────────────────┘

Determines whether the input string is an IPv4 address. If the string is an IPv6 address, returns 0.

Syntax

isIPv4String(string)

Arguments

Returned value

  • 1 if string is IPv4 address, 0 otherwise.

Examples

Query:

SELECT addr, isIPv4String(addr) FROM ( SELECT ['0.0.0.0', '127.0.0.1', '::ffff:127.0.0.1'] AS addr ) ARRAY JOIN addr;

Result:

┌─addr─────────────┬─isIPv4String(addr)─┐
│ 0.0.0.0          │                  1 │
│ 127.0.0.1        │                  1 │
│ ::ffff:127.0.0.1 │                  0 │
└──────────────────┴────────────────────┘

Determines whether the input string is an IPv6 address. If the string is an IPv4 address, returns 0.

Syntax

isIPv6String(string)

Arguments

Returned value

  • 1 if string is IPv6 address, 0 otherwise.

Examples

Query:

SELECT addr, isIPv6String(addr) FROM ( SELECT ['::', '1111::ffff', '::ffff:127.0.0.1', '127.0.0.1'] AS addr ) ARRAY JOIN addr;

Result:

┌─addr─────────────┬─isIPv6String(addr)─┐
│ ::               │                  1 │
│ 1111::ffff       │                  1 │
│ ::ffff:127.0.0.1 │                  1 │
│ 127.0.0.1        │                  0 │
└──────────────────┴────────────────────┘
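
The two checks above can be approximated in Python by attempting to parse the string as each address family. The helper names are hypothetical, and Python's parser may differ from ClickHouse on unusual inputs (for example, octets with leading zeros), so treat this as a sketch of the idea.

```python
import ipaddress


def is_ipv4_string(text: str) -> int:
    """Return 1 if text parses as an IPv4 address, 0 otherwise."""
    try:
        ipaddress.IPv4Address(text)
    except ValueError:
        return 0
    return 1


def is_ipv6_string(text: str) -> int:
    """Return 1 if text parses as an IPv6 address, 0 otherwise."""
    try:
        ipaddress.IPv6Address(text)
    except ValueError:
        return 0
    return 1


for candidate in ["0.0.0.0", "127.0.0.1", "::ffff:127.0.0.1", "::", "1111::ffff"]:
    print(candidate, is_ipv4_string(candidate), is_ipv6_string(candidate))
```

An IPv4-mapped string such as ::ffff:127.0.0.1 counts as IPv6 only, matching the result tables above.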

Syntax

isIPAddressInRange(address, prefix)

This function accepts both IPv4 and IPv6 addresses (and networks) represented as strings. It returns 0 if the IP version of the address and the CIDR don't match.

Arguments

Returned value

  • 1 or 0.

Example

Query:

SELECT isIPAddressInRange('127.0.0.1', '127.0.0.0/8');

Result:

┌─isIPAddressInRange('127.0.0.1', '127.0.0.0/8')─┐
│                                              1 │
└────────────────────────────────────────────────┘

Query:

SELECT isIPAddressInRange('127.0.0.1', 'ffff::/16');

Result:

┌─isIPAddressInRange('127.0.0.1', 'ffff::/16')─┐
│                                            0 │
└──────────────────────────────────────────────┘

Query:

SELECT isIPAddressInRange('::ffff:192.168.0.1', '::ffff:192.168.0.4/128');

Result:

┌─isIPAddressInRange('::ffff:192.168.0.1', '::ffff:192.168.0.4/128')─┐
│                                                                  0 │
└────────────────────────────────────────────────────────────────────┘
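
The three results above follow from two rules: the address and the network must belong to the same IP version, and the address must share the network's prefix bits. A minimal Python sketch of that logic, using a hypothetical helper name:

```python
import ipaddress


def is_ip_address_in_range(address: str, prefix: str) -> int:
    """Return 1 if `address` lies inside the CIDR network `prefix`, else 0."""
    parsed_address = ipaddress.ip_address(address)
    # strict=False tolerates host bits in the prefix, e.g. '192.168.5.2/16'.
    network = ipaddress.ip_network(prefix, strict=False)
    if parsed_address.version != network.version:
        return 0  # an IPv4 address is never inside an IPv6 network, and vice versa
    return int(parsed_address in network)


print(is_ip_address_in_range("127.0.0.1", "127.0.0.0/8"))
print(is_ip_address_in_range("127.0.0.1", "ffff::/16"))
print(is_ip_address_in_range("::ffff:192.168.0.1", "::ffff:192.168.0.4/128"))
```

The last case returns 0 because a /128 network contains exactly one address, and the two addresses differ in their final byte.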

JSON

ClickHouse has special functions for working with JSON. The visitParam functions (aliased as simpleJSON functions) are based on strong assumptions about what the JSON can be, and they try to do as little work as possible to get the job done.

The following assumptions are made:

  1. The field name (function argument) must be a constant.

  2. The field name is somehow canonically encoded in JSON. For example: visitParamHas('{"abc":"def"}', 'abc') = 1, but visitParamHas('{"\\u0061\\u0062\\u0063":"def"}', 'abc') = 0

  3. Fields are searched for on any nesting level, indiscriminately. If there are multiple matching fields, the first occurrence is used.

  4. The JSON does not have space characters outside of string literals.
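
To see why these assumptions matter, consider a deliberately naive Python model in which the field check is a plain substring search for "name": in the raw text. This is a hypothetical illustration of the assumptions, not how ClickHouse implements the function.

```python
def visit_param_has(params: str, name: str) -> bool:
    """Naive model: look for the literal text "name": anywhere in the document."""
    needle = '"' + name + '":'
    return needle in params


# Canonically encoded field name: found.
print(visit_param_has('{"abc":"def"}', "abc"))
# Same name written with \u escapes: the literal bytes differ, so not found.
print(visit_param_has('{"\\u0061\\u0062\\u0063":"def"}', "abc"))
# Nested field: found, because the search ignores nesting level.
print(visit_param_has('{"outer":{"abc":1}}', "abc"))
# Space before the colon: not found, because whitespace is assumed absent.
print(visit_param_has('{"abc" : 1}', "abc"))
```

Under this model each of the four assumptions corresponds to a case where a full JSON parser and the fast text search would otherwise disagree.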

visitParamHas(params, name)

Checks whether there is a field named name.

Alias: simpleJSONHas.

visitParamExtractUInt(params, name)

Parses UInt64 from the value of the field named name. If this is a string field, it tries to parse a number from the beginning of the string. If the field does not exist, or it exists but does not contain a number, it returns 0.

Alias: simpleJSONExtractUInt.

visitParamExtractInt(params, name)

The same as visitParamExtractUInt, but for Int64.

Alias: simpleJSONExtractInt.

visitParamExtractFloat(params, name)

The same as visitParamExtractUInt, but for Float64.

Alias: simpleJSONExtractFloat.

visitParamExtractBool(params, name)

Parses a true/false value. The result is UInt8.

Alias: simpleJSONExtractBool.

visitParamExtractRaw(params, name)

Returns the value of a field, including separators.

Alias: simpleJSONExtractRaw.

Examples:

visitParamExtractRaw('{"abc":"\\n\\u0000"}', 'abc') = '"\\n\\u0000"';
visitParamExtractRaw('{"abc":{"def":[1,2,3]}}', 'abc') = '{"def":[1,2,3]}';

visitParamExtractString(params, name)

Parses the string in double quotes. The value is unescaped. If unescaping fails, it returns an empty string.

Alias: simpleJSONExtractString.

Examples:

visitParamExtractString('{"abc":"\\n\\u0000"}', 'abc') = '\n\0';
visitParamExtractString('{"abc":"\\u263a"}', 'abc') = '☺';
visitParamExtractString('{"abc":"\\u263"}', 'abc') = '';
visitParamExtractString('{"abc":"hello}', 'abc') = '';

There is currently no support for code points in the format \uXXXX\uYYYY that are not from the basic multilingual plane (they are converted to CESU-8 instead of UTF-8).

isValidJSON(json)

Checks that the passed string is valid JSON.

Examples:

SELECT isValidJSON('{"a": "hello", "b": [-100, 200.0, 300]}') = 1
SELECT isValidJSON('not a json') = 0

JSONHas(json[, indices_or_keys]...)

If the value exists in the JSON document, 1 will be returned.

If the value does not exist, 0 will be returned.

Examples:

SELECT JSONHas('{"a": "hello", "b": [-100, 200.0, 300]}', 'b') = 1
SELECT JSONHas('{"a": "hello", "b": [-100, 200.0, 300]}', 'b', 4) = 0

indices_or_keys is a list of zero or more arguments, each of which can be either a string or an integer.

  • String = access object member by key.

  • Positive integer = access the n-th member/key from the beginning.

  • Negative integer = access the n-th member/key from the end.

Minimum index of the element is 1. Thus the element 0 does not exist.

You may use integers to access both JSON arrays and JSON objects.

So, for example:

SELECT JSONExtractKey('{"a": "hello", "b": [-100, 200.0, 300]}', 1) = 'a'
SELECT JSONExtractKey('{"a": "hello", "b": [-100, 200.0, 300]}', 2) = 'b'
SELECT JSONExtractKey('{"a": "hello", "b": [-100, 200.0, 300]}', -1) = 'b'
SELECT JSONExtractKey('{"a": "hello", "b": [-100, 200.0, 300]}', -2) = 'a'
SELECT JSONExtractString('{"a": "hello", "b": [-100, 200.0, 300]}', 1) = 'hello'
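
The path rules above (string keys, 1-based positive indices, negative indices counted from the end, and no element 0) can be sketched in Python. The helper json_resolve is hypothetical; it returns a (key, value) pair, or None when any step of the path does not exist.

```python
import json


def json_resolve(document: str, *indices_or_keys):
    """Walk a JSON document by keys and 1-based indices; None if a step is missing."""
    try:
        value = json.loads(document)
    except ValueError:
        return None
    key = None
    for step in indices_or_keys:
        if isinstance(step, str):
            # String: access an object member by key.
            if not isinstance(value, dict) or step not in value:
                return None
            key, value = step, value[step]
            continue
        # Integer: works on both objects (in document order) and arrays.
        if isinstance(value, dict):
            items = list(value.items())
        elif isinstance(value, list):
            items = [(None, element) for element in value]
        else:
            return None
        if step == 0 or abs(step) > len(items):
            return None  # index 0 never exists; out-of-range indices are missing
        key, value = items[step - 1] if step > 0 else items[step]
    return key, value


doc = '{"a": "hello", "b": [-100, 200.0, 300]}'
print(json_resolve(doc, 1), json_resolve(doc, -1), json_resolve(doc, "b", -1))
```

The same walk explains the earlier JSONHas example: the path 'b', 4 fails because the array has only three elements.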

JSONLength(json[, indices_or_keys]...)

Returns the length of a JSON array or a JSON object.

If the value does not exist or has a wrong type, 0 will be returned.

Examples:

SELECT JSONLength('{"a": "hello", "b": [-100, 200.0, 300]}', 'b') = 3
SELECT JSONLength('{"a": "hello", "b": [-100, 200.0, 300]}') = 2

JSONType(json[, indices_or_keys]...)

Returns the type of a JSON value.

If the value does not exist, Null will be returned.

Examples:

SELECT JSONType('{"a": "hello", "b": [-100, 200.0, 300]}') = 'Object'
SELECT JSONType('{"a": "hello", "b": [-100, 200.0, 300]}', 'a') = 'String'
SELECT JSONType('{"a": "hello", "b": [-100, 200.0, 300]}', 'b') = 'Array'

Parses JSON and extracts a value. These functions are similar to the visitParam functions.

If the value does not exist or has a wrong type, 0 will be returned.

Examples:

SELECT JSONExtractInt('{"a": "hello", "b": [-100, 200.0, 300]}', 'b', 1) = -100
SELECT JSONExtractFloat('{"a": "hello", "b": [-100, 200.0, 300]}', 'b', 2) = 200.0
SELECT JSONExtractUInt('{"a": "hello", "b": [-100, 200.0, 300]}', 'b', -1) = 300

Parses JSON and extracts a string. This function is similar to the visitParamExtractString function.

If the value does not exist or has a wrong type, an empty string will be returned.

The value is unescaped. If unescaping failed, it returns an empty string.

Examples:

SELECT JSONExtractString('{"a": "hello", "b": [-100, 200.0, 300]}', 'a') = 'hello'
SELECT JSONExtractString('{"abc":"\\n\\u0000"}', 'abc') = '\n\0'
SELECT JSONExtractString('{"abc":"\\u263a"}', 'abc') = '☺'
SELECT JSONExtractString('{"abc":"\\u263"}', 'abc') = ''
SELECT JSONExtractString('{"abc":"hello}', 'abc') = ''

Parses JSON and extracts a value of the given ClickHouse data type.

This is a generalization of the previous JSONExtract<type> functions. This means JSONExtract(..., 'String') returns exactly the same as JSONExtractString(), JSONExtract(..., 'Float64') returns exactly the same as JSONExtractFloat().

Examples:

SELECT JSONExtract('{"a": "hello", "b": [-100, 200.0, 300]}', 'Tuple(String, Array(Float64))') = ('hello',[-100,200,300])
SELECT JSONExtract('{"a": "hello", "b": [-100, 200.0, 300]}', 'Tuple(b Array(Float64), a String)') = ([-100,200,300],'hello')
SELECT JSONExtract('{"a": "hello", "b": [-100, 200.0, 300]}', 'b', 'Array(Nullable(Int8))') = [-100, NULL, NULL]
SELECT JSONExtract('{"a": "hello", "b": [-100, 200.0, 300]}', 'b', 4, 'Nullable(Int64)') = NULL
SELECT JSONExtract('{"passed": true}', 'passed', 'UInt8') = 1
SELECT JSONExtract('{"day": "Thursday"}', 'day', 'Enum8(\'Sunday\' = 0, \'Monday\' = 1, \'Tuesday\' = 2, \'Wednesday\' = 3, \'Thursday\' = 4, \'Friday\' = 5, \'Saturday\' = 6)') = 'Thursday'
SELECT JSONExtract('{"day": 5}', 'day', 'Enum8(\'Sunday\' = 0, \'Monday\' = 1, \'Tuesday\' = 2, \'Wednesday\' = 3, \'Thursday\' = 4, \'Friday\' = 5, \'Saturday\' = 6)') = 'Friday'

Parses key-value pairs from a JSON where the values are of the given ClickHouse data type.

Example:

SELECT JSONExtractKeysAndValues('{"x": {"a": 5, "b": 7, "c": 11}}', 'x', 'Int8') = [('a',5),('b',7),('c',11)];

Parses a JSON string and extracts the keys.

Syntax

JSONExtractKeys(json[, a, b, c...])

Arguments

Returned value

Array with the keys of the JSON.

Example

Query:

SELECT JSONExtractKeys('{"a": "hello", "b": [-100, 200.0, 300]}');

Result:

┌─JSONExtractKeys('{"a": "hello", "b": [-100, 200.0, 300]}')─┐
│ ['a','b']                                                  │
└────────────────────────────────────────────────────────────┘

Returns a part of the JSON as an unparsed string.

If the part does not exist or has a wrong type, an empty string will be returned.

Example:

SELECT JSONExtractRaw('{"a": "hello", "b": [-100, 200.0, 300]}', 'b') = '[-100, 200.0, 300]';

Returns an array with the elements of a JSON array, each represented as an unparsed string.

If the part does not exist or is not an array, an empty array will be returned.

Example:

SELECT JSONExtractArrayRaw('{"a": "hello", "b": [-100, 200.0, "hello"]}', 'b') = ['-100', '200.0', '"hello"'];

Extracts raw data from a JSON object.

Syntax

JSONExtractKeysAndValuesRaw(json[, p, a, t, h])

Arguments

Returned values

  • Array with ('key', 'value') tuples. Both tuple members are strings.

  • Empty array if the requested object does not exist, or input JSON is invalid.

Examples

Query:

SELECT JSONExtractKeysAndValuesRaw('{"a": [-100, 200.0], "b":{"c": {"d": "hello", "f": "world"}}}');

Result:

┌─JSONExtractKeysAndValuesRaw('{"a": [-100, 200.0], "b":{"c": {"d": "hello", "f": "world"}}}')─┐
│ [('a','[-100,200]'),('b','{"c":{"d":"hello","f":"world"}}')]                                 │
└──────────────────────────────────────────────────────────────────────────────────────────────┘

Query:

SELECT JSONExtractKeysAndValuesRaw('{"a": [-100, 200.0], "b":{"c": {"d": "hello", "f": "world"}}}', 'b');

Result:

┌─JSONExtractKeysAndValuesRaw('{"a": [-100, 200.0], "b":{"c": {"d": "hello", "f": "world"}}}', 'b')─┐
│ [('c','{"d":"hello","f":"world"}')]                                                               │
└───────────────────────────────────────────────────────────────────────────────────────────────────┘

Query:

SELECT JSONExtractKeysAndValuesRaw('{"a": [-100, 200.0], "b":{"c": {"d": "hello", "f": "world"}}}', -1, 'c');

Result:

┌─JSONExtractKeysAndValuesRaw('{"a": [-100, 200.0], "b":{"c": {"d": "hello", "f": "world"}}}', -1, 'c')─┐
│ [('d','"hello"'),('f','"world"')]                                                                     │
└───────────────────────────────────────────────────────────────────────────────────────────────────────┘

JSON_EXISTS(json, path)

If the value exists in the JSON document, 1 will be returned.

If the value does not exist, 0 will be returned.

Examples:

SELECT JSON_EXISTS('{"hello":1}', '$.hello');
SELECT JSON_EXISTS('{"hello":{"world":1}}', '$.hello.world');
SELECT JSON_EXISTS('{"hello":["world"]}', '$.hello[*]');
SELECT JSON_EXISTS('{"hello":["world"]}', '$.hello[0]');

NOTE

Before version 21.11 the order of arguments was wrong, i.e. JSON_EXISTS(path, json)

JSON_QUERY(json, path)

Parses JSON and extracts a value as a JSON array or JSON object.

If the value does not exist, an empty string will be returned.

Example:

SELECT JSON_QUERY('{"hello":"world"}', '$.hello');
SELECT JSON_QUERY('{"array":[[0, 1, 2, 3, 4, 5], [0, -1, -2, -3, -4, -5]]}', '$.array[*][0 to 2, 4]');
SELECT JSON_QUERY('{"hello":2}', '$.hello');
SELECT toTypeName(JSON_QUERY('{"hello":2}', '$.hello'));

Result:

["world"]
[0, 1, 4, 0, -1, -4]
[2]
String

NOTE

Before version 21.11 the order of arguments was wrong, i.e. JSON_QUERY(path, json)

JSON_VALUE(json, path)

Parses JSON and extracts a value as a JSON scalar.

If the value does not exist, an empty string will be returned.

Example:

SELECT JSON_VALUE('{"hello":"world"}', '$.hello');
SELECT JSON_VALUE('{"array":[[0, 1, 2, 3, 4, 5], [0, -1, -2, -3, -4, -5]]}', '$.array[*][0 to 2, 4]');
SELECT JSON_VALUE('{"hello":2}', '$.hello');
SELECT toTypeName(JSON_VALUE('{"hello":2}', '$.hello'));

Result:

world
0
2
String

NOTE

Before version 21.11 the order of arguments was wrong, i.e. JSON_VALUE(path, json)

Syntax

toJSONString(value)

Arguments

  • value — Value to serialize. Value may be of any data type.

Returned value

  • JSON representation of the value.

Example

Query:

SELECT toJSONString(map('key1', 1, 'key2', 2));
SELECT toJSONString(tuple(1.25, NULL, NaN, +inf, -inf, [])) SETTINGS output_format_json_quote_denormals = 1;

Result:

{"key1":1,"key2":2}
[1.25,null,"nan","inf","-inf",[]]

MACHINE LEARNING FUNCTIONS

Prediction with fitted regression models uses the evalMLMethod function. See the linearRegression aggregate function for details.

MAPS

Syntax

map(key1, value1[, key2, value2, ...])

Arguments

Returned value

  • Data structure as key:value pairs.

Examples

Query:

SELECT map('key1', number, 'key2', number * 2) FROM numbers(3);

Result:

┌─map('key1', number, 'key2', multiply(number, 2))─┐
│ {'key1':0,'key2':0}                              │
│ {'key1':1,'key2':2}                              │
│ {'key1':2,'key2':4}                              │
└──────────────────────────────────────────────────┘

Query:

CREATE TABLE table_map (a Map(String, UInt64)) ENGINE = MergeTree() ORDER BY a;
INSERT INTO table_map SELECT map('key1', number, 'key2', number * 2) FROM numbers(3);
SELECT a['key2'] FROM table_map;

Result:

┌─arrayElement(a, 'key2')─┐
│                       0 │
│                       2 │
│                       4 │
└─────────────────────────┘

Collects all the keys and sums the corresponding values.

Syntax

mapAdd(arg1, arg2 [, ...])

Arguments

Returned value

Example

Query with a tuple:

SELECT mapAdd(([toUInt8(1), 2], [1, 1]), ([toUInt8(1), 2], [1, 1])) as res, toTypeName(res) as type;

Result:

┌─res───────────┬─type───────────────────────────────┐
│ ([1,2],[2,2]) │ Tuple(Array(UInt8), Array(UInt64)) │
└───────────────┴────────────────────────────────────┘

Query with Map type:

SELECT mapAdd(map(1,1), map(1,1));

Result:

┌─mapAdd(map(1, 1), map(1, 1))─┐
│ {1:2}                        │
└──────────────────────────────┘

Collects all the keys and subtracts the corresponding values.

Syntax

mapSubtract(Tuple(Array, Array), Tuple(Array, Array) [, ...])

Arguments

Returned value

Example

Query with a tuple map:

SELECT mapSubtract(([toUInt8(1), 2], [toInt32(1), 1]), ([toUInt8(1), 2], [toInt32(2), 1])) as res, toTypeName(res) as type;

Result:

┌─res────────────┬─type──────────────────────────────┐
│ ([1,2],[-1,0]) │ Tuple(Array(UInt8), Array(Int64)) │
└────────────────┴───────────────────────────────────┘

Query with Map type:

SELECT mapSubtract(map(1,1), map(1,1));

Result:

┌─mapSubtract(map(1, 1), map(1, 1))─┐
│ {1:0}                             │
└───────────────────────────────────┘
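
The key-wise arithmetic performed by mapAdd and mapSubtract can be modeled with Python dictionaries. This is a sketch with hypothetical helper names. It assumes that result keys come out sorted, that a key missing from a map contributes 0, and that mapSubtract with more than two arguments subtracts every later map from the first; none of these is stated in the text above.

```python
def _combine(maps, sign_for_rest: int) -> dict:
    """Accumulate values per key; later maps are added or subtracted."""
    totals = {}
    for position, mapping in enumerate(maps):
        sign = 1 if position == 0 else sign_for_rest
        for key, value in mapping.items():
            totals[key] = totals.get(key, 0) + sign * value
    return dict(sorted(totals.items()))  # assumption: keys are returned sorted


def map_add(*maps) -> dict:
    return _combine(maps, +1)


def map_subtract(*maps) -> dict:
    return _combine(maps, -1)


def as_tuple(mapping: dict) -> tuple:
    """Convert a dict to the (keys array, values array) tuple form."""
    return list(mapping.keys()), list(mapping.values())


print(map_add({1: 1}, {1: 1}))
print(as_tuple(map_subtract({1: 1, 2: 1}, {1: 2, 2: 1})))
```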

Fills missing keys in the maps (key and value array pair), where keys are integers. Also, it supports specifying the max key, which is used to extend the keys array.

Syntax

mapPopulateSeries(keys, values[, max])
mapPopulateSeries(map[, max])

Generates a map (a tuple with two arrays or a value of Map type, depending on the arguments) whose keys are a series of numbers from the minimum key to the maximum key (or to the max argument, if it is specified), with a step size of one, together with the corresponding values. If no value is specified for a key, the default value is used in the resulting map. For repeated keys, only the first value (in order of appearance) is associated with the key.

For array arguments the number of elements in keys and values must be the same for each row.

Arguments

Mapped arrays:

or

Returned value

Example

Query with mapped arrays:

SELECT mapPopulateSeries([1,2,4], [11,22,44], 5) AS res, toTypeName(res) AS type;

Result:

┌─res──────────────────────────┬─type──────────────────────────────┐
│ ([1,2,3,4,5],[11,22,0,44,0]) │ Tuple(Array(UInt8), Array(UInt8)) │
└──────────────────────────────┴───────────────────────────────────┘

Query with Map type:

SELECT mapPopulateSeries(map(1, 10, 5, 20), 6);

Result:

┌─mapPopulateSeries(map(1, 10, 5, 20), 6)─┐
│ {1:10,2:0,3:0,4:0,5:20,6:0}             │
└─────────────────────────────────────────┘
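
The filling rule can be sketched in Python as follows. The helper map_populate_series is hypothetical and works on the (keys, values) array form; it assumes the default value is 0 and that, when max is given, the series ends at max.

```python
def map_populate_series(keys, values, max_key=None, default=0):
    """Fill gaps between min(keys) and the upper bound with `default`."""
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    if not keys:
        return [], []
    first_value = {}
    for key, value in zip(keys, values):
        first_value.setdefault(key, value)  # for repeated keys the first value wins
    upper = max(keys) if max_key is None else max_key
    series = list(range(min(keys), upper + 1))
    return series, [first_value.get(key, default) for key in series]


print(map_populate_series([1, 2, 4], [11, 22, 44], 5))
print(map_populate_series([1, 5], [10, 20], 6))
```

The second call corresponds to the Map example, with the map written as parallel key and value lists.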

Determines whether the map contains the key parameter.

Syntax

mapContains(map, key)

Parameters

  • key — Key. Type matches the type of keys of map parameter.

Returned value

  • 1 if map contains key, 0 if not.

Example

Query:

CREATE TABLE test (a Map(String,String)) ENGINE = Memory;

INSERT INTO test VALUES ({'name':'eleven','age':'11'}), ({'number':'twelve','position':'6.0'});

SELECT mapContains(a, 'name') FROM test;

Result:

┌─mapContains(a, 'name')─┐
│                      1 │
│                      0 │
└────────────────────────┘

Returns all keys from the map parameter.

Syntax

mapKeys(map)

Parameters

Returned value

  • Array containing all keys from the map.

Example

Query:

CREATE TABLE test (a Map(String,String)) ENGINE = Memory;

INSERT INTO test VALUES ({'name':'eleven','age':'11'}), ({'number':'twelve','position':'6.0'});

SELECT mapKeys(a) FROM test;

Result:

┌─mapKeys(a)────────────┐
│ ['name','age']        │
│ ['number','position'] │
└───────────────────────┘

Returns all values from the map parameter.

Syntax

mapValues(map)

Parameters

Returned value

  • Array containing all the values from map.

Example

Query:

CREATE TABLE test (a Map(String,String)) ENGINE = Memory;

INSERT INTO test VALUES ({'name':'eleven','age':'11'}), ({'number':'twelve','position':'6.0'});

SELECT mapValues(a) FROM test;

Result:

┌─mapValues(a)─────┐
│ ['eleven','11']  │
│ ['twelve','6.0'] │
└──────────────────┘

MATHEMATICAL

All the functions return a Float64 number. The accuracy of the result is close to the maximum precision possible, but the result might not coincide with the machine representable number nearest to the corresponding real number.

e()

Returns a Float64 number that is close to the number e.

exp(x)

Accepts a numeric argument and returns a Float64 number close to the exponential of the argument (e to the power of x).

log(x), ln(x)

Accepts a numeric argument and returns a Float64 number close to the natural logarithm of the argument.

exp2(x)

Accepts a numeric argument and returns a Float64 number close to 2 to the power of x.

log2(x)

Accepts a numeric argument and returns a Float64 number close to the binary logarithm of the argument.

exp10(x)

Accepts a numeric argument and returns a Float64 number close to 10 to the power of x.

log10(x)

Accepts a numeric argument and returns a Float64 number close to the decimal logarithm of the argument.

sqrt(x)

Accepts a numeric argument and returns a Float64 number close to the square root of the argument.

cbrt(x)

Accepts a numeric argument and returns a Float64 number close to the cube root of the argument.

erf(x)

If ‘x’ is non-negative, then erf(x / σ√2) is the probability that a random variable having a normal distribution with standard deviation ‘σ’ takes a value that is separated from the expected value by less than ‘x’.

Example (three sigma rule):

SELECT erf(3 / sqrt(2));
┌─erf(divide(3, sqrt(2)))─┐
│      0.9973002039367398 │
└─────────────────────────┘
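
The three-sigma value can be checked independently. The Python sketch below uses the standard library's math.erf to compute the probability that a normally distributed variable falls within a given distance of its mean; the helper name prob_within is hypothetical.

```python
import math


def prob_within(distance: float, sigma: float = 1.0) -> float:
    """P(|X - mean| < distance) for a normal variable with standard deviation sigma."""
    return math.erf(distance / (sigma * math.sqrt(2)))


# Roughly 68%, 95% and 99.7% of the mass lies within 1, 2 and 3 sigma.
for multiple in (1, 2, 3):
    print(multiple, prob_within(multiple))
```

Only the ratio of the distance to sigma matters, so prob_within(6, sigma=2) gives the same probability as prob_within(3).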

Accepts a numeric argument and returns a Float64 number close to 1 - erf(x), but without loss of precision for large ‘x’ values.
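
The loss of precision mentioned above is easy to reproduce with IEEE 754 doubles. In this Python sketch, which uses only the standard math module, erf(10) rounds to exactly 1.0, so the subtraction yields 0, while the complementary function still returns a tiny positive value.

```python
import math

x = 10.0
naive = 1.0 - math.erf(x)   # erf(10) is indistinguishable from 1.0 in Float64
direct = math.erfc(x)       # computed without the cancelling subtraction

print(naive)
print(direct)
```

This is why tail probabilities for large arguments are better computed with the complementary function than by subtracting from 1.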

lgamma(x)

The logarithm of the gamma function.

tgamma(x)

The gamma function.

sin(x)

The sine.

cos(x)

The cosine.

tan(x)

The tangent.

asin(x)

The arc sine.

acos(x)

The arc cosine.

atan(x)

The arc tangent.

pow(x, y), power(x, y)

Takes two numeric arguments x and y. Returns a Float64 number close to x to the power of y.

intExp2(x)

Accepts a numeric argument and returns a UInt64 number close to 2 to the power of x.

intExp10(x)

Accepts a numeric argument and returns a UInt64 number close to 10 to the power of x.

Syntax

cosh(x)

Arguments

Returned value

  • Values from the interval: 1 <= cosh(x) < +∞.

Example

Query:

SELECT cosh(0);

Result:

┌─cosh(0)──┐
│        1 │
└──────────┘

Syntax

acosh(x)

Arguments

Returned value

  • The angle, in radians. Values from the interval: 0 <= acosh(x) < +∞.

Example

Query:

SELECT acosh(1);

Result:

┌─acosh(1)─┐
│        0 │
└──────────┘

Syntax

sinh(x)

Arguments

Returned value

  • Values from the interval: -∞ < sinh(x) < +∞.

Example

Query:

SELECT sinh(0);

Result:

┌─sinh(0)──┐
│        0 │
└──────────┘

Syntax

asinh(x)

Arguments

Returned value

  • The angle, in radians. Values from the interval: -∞ < asinh(x) < +∞.

Example

Query:

SELECT asinh(0);

Result:

┌─asinh(0)─┐
│        0 │
└──────────┘

Syntax

atanh(x)

Arguments

Returned value

  • The angle, in radians. Values from the interval: -∞ < atanh(x) < +∞.

Example

Query:

SELECT atanh(0);

Result:

┌─atanh(0)─┐
│        0 │
└──────────┘

Syntax

atan2(y, x)

Arguments

Returned value

  • The angle θ such that −π < θ ≤ π, in radians.

Example

Query:

SELECT atan2(1, 1);

Result:

┌────────atan2(1, 1)─┐
│ 0.7853981633974483 │
└────────────────────┘
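
Assuming atan2 follows the usual two-argument arctangent convention, with y as the first argument, the Python sketch below shows how it differs from a plain arctangent of y / x: the signs of both arguments select the quadrant, which gives the full range from −π to π.

```python
import math

first_quadrant = math.atan2(1, 1)     # point (x=1, y=1): pi/4
second_quadrant = math.atan2(1, -1)   # point (x=-1, y=1): 3*pi/4
# atan(y / x) cannot tell (x=-1, y=1) apart from (x=1, y=-1).
ambiguous = math.atan(1 / -1)         # -pi/4

print(first_quadrant, second_quadrant, ambiguous)
```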

Syntax

hypot(x, y)

Arguments

Returned value

  • The length of the hypotenuse of a right-angle triangle.

Example

Query:

SELECT hypot(1, 1);

Result:

┌────────hypot(1, 1)─┐
│ 1.4142135623730951 │
└────────────────────┘

Syntax

log1p(x)

Arguments

Returned value

  • Values from the interval: -∞ < log1p(x) < +∞.

Example

Query:

SELECT log1p(0);

Result:

┌─log1p(0)─┐
│        0 │
└──────────┘
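
Assuming log1p follows the usual definition log(1 + x), its purpose is accuracy for very small x, where forming 1 + x first would round the sum to exactly 1. A short Python sketch with the standard math module illustrates the difference:

```python
import math

tiny = 1e-20
naive = math.log(1 + tiny)   # 1 + 1e-20 rounds to 1.0, so the result is 0.0
accurate = math.log1p(tiny)  # stays close to 1e-20

print(naive)
print(accurate)
```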

Returns the sign of a real number.

Syntax

sign(x)

Arguments

  • x — Values from -∞ to +∞. Supports all numeric types in ClickHouse.

Returned value

  • -1 for x < 0

  • 0 for x = 0

  • 1 for x > 0

Examples

Sign for the zero value:

SELECT sign(0);

Result:

┌─sign(0)─┐
│       0 │
└─────────┘

Sign for the positive value:

SELECT sign(1);

Result:

┌─sign(1)─┐
│       1 │
└─────────┘

Sign for the negative value:

SELECT sign(-1);

Result:

┌─sign(-1)─┐
│       -1 │
└──────────┘

NULLABLE

isNull(x)

Alias: ISNULL.

Arguments

  • x — A value with a non-compound data type.

Returned value

  • 1 if x is NULL.

  • 0 if x is not NULL.

Example

Input table

┌─x─┬────y─┐
│ 1 │ ᴺᵁᴸᴸ │
│ 2 │    3 │
└───┴──────┘

Query

SELECT x FROM t_null WHERE isNull(y);
┌─x─┐
│ 1 │
└───┘

isNotNull(x)

Arguments:

  • x — A value with a non-compound data type.

Returned value

  • 0 if x is NULL.

  • 1 if x is not NULL.

Example

Input table

┌─x─┬────y─┐
│ 1 │ ᴺᵁᴸᴸ │
│ 2 │    3 │
└───┴──────┘

Query

SELECT x FROM t_null WHERE isNotNull(y);
┌─x─┐
│ 2 │
└───┘

Checks from left to right whether NULL arguments were passed and returns the first non-NULL argument.

coalesce(x,...)

Arguments:

  • Any number of parameters of a non-compound type. All parameters must be compatible by data type.

Returned values

  • The first non-NULL argument.

  • NULL, if all arguments are NULL.

Example

Consider a list of contacts that may specify multiple ways to contact a customer.

┌─name─────┬─mail─┬─phone─────┬──icq─┐
│ client 1 │ ᴺᵁᴸᴸ │ 123-45-67 │  123 │
│ client 2 │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ      │ ᴺᵁᴸᴸ │
└──────────┴──────┴───────────┴──────┘

The mail and phone fields are of type String, but the icq field is UInt32, so it needs to be converted to String.

Get the first available contact method for the customer from the contact list:

SELECT name, coalesce(mail, phone, CAST(icq,'Nullable(String)')) FROM aBook;
┌─name─────┬─coalesce(mail, phone, CAST(icq, 'Nullable(String)'))─┐
│ client 1 │ 123-45-67                                            │
│ client 2 │ ᴺᵁᴸᴸ                                                 │
└──────────┴──────────────────────────────────────────────────────┘
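The left-to-right rule can be sketched in Python, with None standing in for NULL. This is an illustration of the semantics described above, not ClickHouse code; the contacts list simply mirrors the example table.

```python
def coalesce(*args):
    """Return the first argument that is not None, or None if there is none."""
    for value in args:
        if value is not None:
            return value
    return None

# Rows mirror the contact list above: (name, mail, phone, icq).
contacts = [
    ("client 1", None, "123-45-67", 123),
    ("client 2", None, None, None),
]

for name, mail, phone, icq in contacts:
    # icq is numeric, so it is converted to a string first, as the CAST does.
    icq_text = None if icq is None else str(icq)
    print(name, coalesce(mail, phone, icq_text))
```

For client 1 the phone number wins because mail is missing. For client 2 every argument is missing, so the result is None.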

Returns an alternative value if the main argument is NULL.

ifNull(x,alt)

Arguments:

  • x — The value to check for NULL.

  • alt — The value that the function returns if x is NULL.

Returned values

  • The value x, if x is not NULL.

  • The value alt, if x is NULL.

Example

SELECT ifNull('a', 'b');
┌─ifNull('a', 'b')─┐
│ a                │
└──────────────────┘
SELECT ifNull(NULL, 'b');
┌─ifNull(NULL, 'b')─┐
│ b                 │
└───────────────────┘

Returns NULL if the arguments are equal.

nullIf(x, y)

Arguments:

  • x, y — Values for comparison. They must be compatible types, or ClickHouse will generate an exception.

Returned values

  • NULL, if the arguments are equal.

  • The x value, if the arguments are not equal.

Example

SELECT nullIf(1, 1);
┌─nullIf(1, 1)─┐
│         ᴺᵁᴸᴸ │
└──────────────┘
SELECT nullIf(1, 2);
┌─nullIf(1, 2)─┐
│            1 │
└──────────────┘
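ifNull and nullIf are close to being opposites: one replaces NULL with a value, the other replaces a value with NULL. A minimal Python sketch of both rules, with None standing in for NULL (the function names are hypothetical stand-ins):

```python
def if_null(x, alt):
    """Return alt when x is None, otherwise x."""
    return alt if x is None else x

def null_if(x, y):
    """Return None when x equals y, otherwise x."""
    return None if x == y else x

print(if_null("a", "b"))   # a
print(if_null(None, "b"))  # b
print(null_if(1, 1))       # None
print(null_if(1, 2))       # 1
```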
assumeNotNull(x)

Arguments:

  • x — The original value.

Returned values

  • The original value as a non-Nullable type, if it is not NULL.

  • An implementation-specific result if the original value was NULL.

Example

Consider the t_null table.

SHOW CREATE TABLE t_null;
┌─statement─────────────────────────────────────────────────────────────────┐
│ CREATE TABLE default.t_null ( x Int8,  y Nullable(Int8)) ENGINE = TinyLog │
└───────────────────────────────────────────────────────────────────────────┘
┌─x─┬────y─┐
│ 1 │ ᴺᵁᴸᴸ │
│ 2 │    3 │
└───┴──────┘

Apply the assumeNotNull function to the y column.

SELECT assumeNotNull(y) FROM t_null;
┌─assumeNotNull(y)─┐
│                0 │
│                3 │
└──────────────────┘
SELECT toTypeName(assumeNotNull(y)) FROM t_null;
┌─toTypeName(assumeNotNull(y))─┐
│ Int8                         │
│ Int8                         │
└──────────────────────────────┘

Converts the argument type to Nullable.

toNullable(x)

Arguments:

  • x — The value of any non-compound type.

Returned value

  • The input value with a Nullable type.

Example

SELECT toTypeName(10);
┌─toTypeName(10)─┐
│ UInt8          │
└────────────────┘
SELECT toTypeName(toNullable(10));
┌─toTypeName(toNullable(10))─┐
│ Nullable(UInt8)            │
└────────────────────────────┘

OTHERS

Returns a string with the name of the host that this function was performed on. For distributed processing, this is the name of the remote server host, if the function is performed on a remote server. If it is executed in the context of a distributed table, then it generates a normal column with values relevant to each shard. Otherwise it produces a constant value.

Returns a named value from the macros section of the server configuration.

Syntax

getMacro(name);

Arguments

Returned value

  • Value of the specified macro.

Example

The example macros section in the server configuration file:

<macros>
    <test>Value</test>
</macros>

Query:

SELECT getMacro('test');

Result:

┌─getMacro('test')─┐
│ Value            │
└──────────────────┘

An alternative way to get the same value:

SELECT * FROM system.macros
WHERE macro = 'test';
┌─macro─┬─substitution─┐
│ test  │ Value        │
└───────┴──────────────┘

Returns the fully qualified domain name.

Syntax

fqdn();

This function is case-insensitive.

Returned value

  • String with the fully qualified domain name.

Type: String.

Example

Query:

SELECT FQDN();

Result:

┌─FQDN()──────────────────────────┐
│ clickhouse.ru-central1.internal │
└─────────────────────────────────┘

Extracts the trailing part of a string after the last slash or backslash. This function is often used to extract the filename from a path.

basename( expr )

Arguments

Returned Value

A string that contains:

  • The trailing part of a string after the last slash or backslash.

    If the input string contains a path ending with slash or backslash, for example, `/` or `c:\`, the function returns an empty string.
  • The original string if there are no slashes or backslashes.

Example

SELECT 'some/long/path/to/file' AS a, basename(a)
┌─a──────────────────────┬─basename('some/long/path/to/file')─┐
│ some/long/path/to/file │ file                               │
└────────────────────────┴────────────────────────────────────┘
SELECT 'some\\long\\path\\to\\file' AS a, basename(a)
┌─a──────────────────────┬─basename('some\\long\\path\\to\\file')─┐
│ some\long\path\to\file │ file                                   │
└────────────────────────┴────────────────────────────────────────┘
SELECT 'some-file-name' AS a, basename(a)
┌─a──────────────┬─basename('some-file-name')─┐
│ some-file-name │ some-file-name             │
└────────────────┴────────────────────────────┘
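The rule described above (take everything after the last slash or backslash) can be sketched in a few lines of Python. This is an illustration of the documented behaviour, not the ClickHouse implementation.

```python
import re

def basename(path):
    """Return the part of path after the last slash or backslash."""
    # Split on either separator; the last piece is the trailing part.
    return re.split(r"[/\\]", path)[-1]

print(basename("some/long/path/to/file"))      # file
print(basename("some\\long\\path\\to\\file"))  # file
print(basename("some-file-name"))              # some-file-name
print(repr(basename("c:\\")))                  # ''
```

A path that ends with a separator yields an empty string, and a string without separators is returned unchanged.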

Calculates the approximate width when outputting values to the console in text format (tab-separated). This function is used by the system for implementing Pretty formats.

NULL is represented as a string corresponding to NULL in Pretty formats.

SELECT visibleWidth(NULL)
┌─visibleWidth(NULL)─┐
│                  4 │
└────────────────────┘

Returns a string containing the type name of the passed argument.

If NULL is passed to the function as input, then it returns the Nullable(Nothing) type, which corresponds to an internal NULL representation in ClickHouse.

Gets the size of the block. In ClickHouse, queries are always run on blocks (sets of column parts). This function allows getting the size of the block that you called it for.

Returns estimation of uncompressed byte size of its arguments in memory.

Syntax

byteSize(argument [, ...])

Arguments

  • argument — Value.

Returned value

  • Estimation of byte size of the arguments in memory.

Examples

Query:

SELECT byteSize('string');

Result:

┌─byteSize('string')─┐
│                 15 │
└────────────────────┘

Query:

CREATE TABLE test
(
    `key` Int32,
    `u8` UInt8,
    `u16` UInt16,
    `u32` UInt32,
    `u64` UInt64,
    `i8` Int8,
    `i16` Int16,
    `i32` Int32,
    `i64` Int64,
    `f32` Float32,
    `f64` Float64
)
ENGINE = MergeTree
ORDER BY key;

INSERT INTO test VALUES(1, 8, 16, 32, 64,  -8, -16, -32, -64, 32.32, 64.64);

SELECT key, byteSize(u8) AS `byteSize(UInt8)`, byteSize(u16) AS `byteSize(UInt16)`, byteSize(u32) AS `byteSize(UInt32)`, byteSize(u64) AS `byteSize(UInt64)`, byteSize(i8) AS `byteSize(Int8)`, byteSize(i16) AS `byteSize(Int16)`, byteSize(i32) AS `byteSize(Int32)`, byteSize(i64) AS `byteSize(Int64)`, byteSize(f32) AS `byteSize(Float32)`, byteSize(f64) AS `byteSize(Float64)` FROM test ORDER BY key ASC FORMAT Vertical;

Result:

Row 1:
──────
key:               1
byteSize(UInt8):   1
byteSize(UInt16):  2
byteSize(UInt32):  4
byteSize(UInt64):  8
byteSize(Int8):    1
byteSize(Int16):   2
byteSize(Int32):   4
byteSize(Int64):   8
byteSize(Float32): 4
byteSize(Float64): 8

If the function takes multiple arguments, it returns their combined byte size.

Query:

SELECT byteSize(NULL, 1, 0.3, '');

Result:

┌─byteSize(NULL, 1, 0.3, '')─┐
│                         19 │
└────────────────────────────┘

Turns a constant into a full column containing just one value. In ClickHouse, full columns and constants are represented differently in memory. Functions work differently for constant arguments and normal arguments (different code is executed), although the result is almost always the same. This function is for debugging this behavior.

Accepts any arguments, including NULL. Always returns 0. However, the argument is still evaluated. This can be used for benchmarks.

Sleeps ‘seconds’ seconds on each data block. You can specify an integer or a floating-point number.

Sleeps ‘seconds’ seconds on each row. You can specify an integer or a floating-point number.

Returns the name of the current database. You can use this function in table engine parameters in a CREATE TABLE query where you need to specify the database.

Returns the login of the current user. For a distributed query, the login of the user that initiated the query is returned.

SELECT currentUser();

Alias: user(), USER().

Returned values

  • Login of the current user.

  • Login of the user that initiated the query, in the case of a distributed query.

Type: String.

Example

Query:

SELECT currentUser();

Result:

┌─currentUser()─┐
│ default       │
└───────────────┘

Checks whether the argument is a constant expression.

The function is intended for development, debugging and demonstration.

Syntax

isConstant(x)

Arguments

  • x — Expression to check.

Returned values

  • 1 — x is constant.

  • 0 — x is non-constant.

Examples

Query:

SELECT isConstant(x + 1) FROM (SELECT 43 AS x)

Result:

┌─isConstant(plus(x, 1))─┐
│                      1 │
└────────────────────────┘

Query:

WITH 3.14 AS pi SELECT isConstant(cos(pi))

Result:

┌─isConstant(cos(pi))─┐
│                   1 │
└─────────────────────┘

Query:

SELECT isConstant(number) FROM numbers(1)

Result:

┌─isConstant(number)─┐
│                  0 │
└────────────────────┘

Accepts Float32 and Float64 and returns UInt8 equal to 1 if the argument is not infinite and not a NaN, otherwise 0.

Accepts Float32 and Float64 and returns UInt8 equal to 1 if the argument is infinite, otherwise 0. Note that 0 is returned for a NaN.

Checks whether a floating-point value is finite.

Syntax

ifNotFinite(x,y)

Arguments

Returned value

  • x if x is finite.

  • y if x is not finite.

Example

Query:

SELECT 1/0 as infimum, ifNotFinite(infimum,42)

Result:

┌─infimum─┬─ifNotFinite(divide(1, 0), 42)─┐
│     inf │                            42 │
└─────────┴───────────────────────────────┘

Accepts Float32 and Float64 and returns UInt8 equal to 1 if the argument is a NaN, otherwise 0.
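The four floating-point checks described above differ mainly in how they treat NaN. The following Python sketch models them with the standard math module; the function names are hypothetical stand-ins for isFinite, isInfinite, isNaN and ifNotFinite.

```python
import math

def is_finite(x):
    return 1 if math.isfinite(x) else 0

def is_infinite(x):
    return 1 if math.isinf(x) else 0   # NaN is not infinite, so this is 0 for NaN

def is_nan(x):
    return 1 if math.isnan(x) else 0

def if_not_finite(x, y):
    """Return x when it is finite, otherwise the fallback y."""
    return x if math.isfinite(x) else y

inf = float("inf")
nan = float("nan")
for value in (1.5, inf, -inf, nan):
    print(value, is_finite(value), is_infinite(value), is_nan(value),
          if_not_finite(value, 42))
```

NaN is neither finite nor infinite, so it is reported as 0 by both of the first two checks and is replaced by the fallback in the last one.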

Accepts constant strings: database name, table name, and column name. Returns a UInt8 constant expression equal to 1 if there is a column, otherwise 0. If the hostname parameter is set, the test will run on a remote server. The function throws an exception if the table does not exist. For elements in a nested data structure, the function checks for the existence of a column. For the nested data structure itself, the function returns 0.

Allows building a unicode-art diagram.

bar(x, min, max, width) draws a band with a width proportional to (x - min) and equal to width characters when x = max.

Arguments

  • x — Size to display.

  • min, max — Integer constants. The value must fit in Int64.

  • width — Constant, positive integer, can be fractional.

The band is drawn with accuracy to one eighth of a symbol.

Example:

SELECT
    toHour(EventTime) AS h,
    count() AS c,
    bar(c, 0, 600000, 20) AS bar
FROM test.hits
GROUP BY h
ORDER BY h ASC
┌──h─┬──────c─┬─bar────────────────┐
│  0 │ 292907 │ █████████▋         │
│  1 │ 180563 │ ██████             │
│  2 │ 114861 │ ███▋               │
│  3 │  85069 │ ██▋                │
│  4 │  68543 │ ██▎                │
│  5 │  78116 │ ██▌                │
│  6 │ 113474 │ ███▋               │
│  7 │ 170678 │ █████▋             │
│  8 │ 278380 │ █████████▎         │
│  9 │ 391053 │ █████████████      │
│ 10 │ 457681 │ ███████████████▎   │
│ 11 │ 493667 │ ████████████████▍  │
│ 12 │ 509641 │ ████████████████▊  │
│ 13 │ 522947 │ █████████████████▍ │
│ 14 │ 539954 │ █████████████████▊ │
│ 15 │ 528460 │ █████████████████▌ │
│ 16 │ 539201 │ █████████████████▊ │
│ 17 │ 523539 │ █████████████████▍ │
│ 18 │ 506467 │ ████████████████▊  │
│ 19 │ 520915 │ █████████████████▎ │
│ 20 │ 521665 │ █████████████████▍ │
│ 21 │ 542078 │ ██████████████████ │
│ 22 │ 493642 │ ████████████████▍  │
│ 23 │ 400397 │ █████████████▎     │
└────┴────────┴────────────────────┘

Transforms a value according to the explicitly defined mapping of some elements to other ones. There are two variations of this function:

x – What to transform.

array_from – Constant array of values for converting.

array_to – Constant array of values to convert the values in ‘from’ to.

default – Which value to use if ‘x’ is not equal to any of the values in ‘from’.

array_from and array_to – Arrays of the same size.

Types:

transform(T, Array(T), Array(U), U) -> U

T and U can be numeric, string, or Date or DateTime types. Where the same letter is indicated (T or U), for numeric types these might not be matching types, but types that have a common type. For example, the first argument can have the Int64 type, while the second has the Array(UInt16) type.

If the ‘x’ value is equal to one of the elements in the ‘array_from’ array, it returns the existing element (that is numbered the same) from the ‘array_to’ array. Otherwise, it returns ‘default’. If there are multiple matching elements in ‘array_from’, it returns one of the matches.

Example:

SELECT
    transform(SearchEngineID, [2, 3], ['Yandex', 'Google'], 'Other') AS title,
    count() AS c
FROM test.hits
WHERE SearchEngineID != 0
GROUP BY title
ORDER BY c DESC
┌─title─────┬──────c─┐
│ Yandex    │ 498635 │
│ Google    │ 229872 │
│ Other     │ 104472 │
└───────────┴────────┘

Differs from the first variation in that the ‘default’ argument is omitted. If the ‘x’ value is equal to one of the elements in the ‘array_from’ array, it returns the matching element (that is numbered the same) from the ‘array_to’ array. Otherwise, it returns ‘x’.

Types:

transform(T, Array(T), Array(T)) -> T

Example:

SELECT
    transform(domain(Referer), ['yandex.ru', 'google.ru', 'vkontakte.ru'], ['www.yandex', 'example.com', 'vk.com']) AS s,
    count() AS c
FROM test.hits
GROUP BY domain(Referer)
ORDER BY count() DESC
LIMIT 10
┌─s──────────────┬───────c─┐
│                │ 2906259 │
│ www.yandex     │  867767 │
│ ███████.ru     │  313599 │
│ mail.yandex.ru │  107147 │
│ ██████.ru      │  100355 │
│ █████████.ru   │   65040 │
│ news.yandex.ru │   64515 │
│ ██████.net     │   59141 │
│ example.com    │   57316 │
└────────────────┴─────────┘
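Both variations of transform can be modelled with one small Python function. This is a sketch of the mapping rule only. It returns the first match when array_from contains duplicates, which is an assumption, since the text above only promises "one of the matches".

```python
_MISSING = object()  # marks "no default argument was passed"

def transform(x, array_from, array_to, default=_MISSING):
    """Map x through array_from -> array_to, with or without a default."""
    if len(array_from) != len(array_to):
        raise ValueError("array_from and array_to must have the same size")
    for source, target in zip(array_from, array_to):
        if x == source:
            return target
    # First variation: fall back to default. Second variation: return x itself.
    return x if default is _MISSING else default

print(transform(2, [2, 3], ["Yandex", "Google"], "Other"))  # Yandex
print(transform(7, [2, 3], ["Yandex", "Google"], "Other"))  # Other

domains = ["yandex.ru", "google.ru", "vkontakte.ru"]
renamed = ["www.yandex", "example.com", "vk.com"]
print(transform("google.ru", domains, renamed))       # example.com
print(transform("mail.yandex.ru", domains, renamed))  # mail.yandex.ru
```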

Accepts the size (number of bytes). Returns a rounded size with a suffix (KB, MB, etc.) as a string.

Example:

SELECT
    arrayJoin([1, 1024, 1024*1024, 192851925]) AS filesize_bytes,
    formatReadableDecimalSize(filesize_bytes) AS filesize
┌─filesize_bytes─┬─filesize──┐
│              1 │ 1.00 B    │
│           1024 │ 1.02 KB   │
│        1048576 │ 1.05 MB   │
│      192851925 │ 192.85 MB │
└────────────────┴───────────┘

Accepts the size (number of bytes). Returns a rounded size with a suffix (KiB, MiB, etc.) as a string.

Example:

SELECT
    arrayJoin([1, 1024, 1024*1024, 192851925]) AS filesize_bytes,
    formatReadableSize(filesize_bytes) AS filesize
┌─filesize_bytes─┬─filesize───┐
│              1 │ 1.00 B     │
│           1024 │ 1.00 KiB   │
│        1048576 │ 1.00 MiB   │
│      192851925 │ 183.92 MiB │
└────────────────┴────────────┘
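The only difference between the decimal and the binary variant is the divisor: 1000 for KB, MB and so on, 1024 for KiB, MiB and so on. A minimal Python sketch, assuming two decimal places and the unit lists shown (the helper names are hypothetical):

```python
def _format_size(size_bytes, base, units):
    """Divide by base until the value fits the largest suitable unit."""
    value = float(size_bytes)
    unit_index = 0
    while value >= base and unit_index < len(units) - 1:
        value /= base
        unit_index += 1
    return f"{value:.2f} {units[unit_index]}"

def format_readable_size(size_bytes):
    return _format_size(size_bytes, 1024, ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"])

def format_readable_decimal_size(size_bytes):
    return _format_size(size_bytes, 1000, ["B", "KB", "MB", "GB", "TB", "PB", "EB"])

for size in (1, 1024, 1024 * 1024, 192851925):
    print(size, format_readable_decimal_size(size), format_readable_size(size))
```

The same byte count therefore prints as 192.85 MB in decimal units and 183.92 MiB in binary units.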

Accepts the number. Returns a rounded number with a suffix (thousand, million, billion, etc.) as a string.

It is useful for making large numbers easier for people to read.

Example:

SELECT
    arrayJoin([1024, 1234 * 1000, (4567 * 1000) * 1000, 98765432101234]) AS number,
    formatReadableQuantity(number) AS number_for_humans
┌─────────number─┬─number_for_humans─┐
│           1024 │ 1.02 thousand     │
│        1234000 │ 1.23 million      │
│     4567000000 │ 4.57 billion      │
│ 98765432101234 │ 98.77 trillion    │
└────────────────┴───────────────────┘
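The rounding shown above amounts to dividing by 1000 until the value drops below 1000 and then appending the matching word. A Python sketch under that assumption (the list of suffixes beyond "trillion" is also an assumption):

```python
def format_readable_quantity(number):
    """Format a number with a word suffix such as 'thousand' or 'million'."""
    suffixes = ["", " thousand", " million", " billion", " trillion", " quadrillion"]
    value = float(number)
    index = 0
    while abs(value) >= 1000 and index < len(suffixes) - 1:
        value /= 1000
        index += 1
    return f"{value:.2f}{suffixes[index]}"

for n in (1024, 1234 * 1000, 4567 * 1000 * 1000, 98765432101234):
    print(n, format_readable_quantity(n))
```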

Accepts the time delta in seconds. Returns a time delta with (year, month, day, hour, minute, second) as a string.

Syntax

formatReadableTimeDelta(column[, maximum_unit])

Arguments

  • column — A column with numeric time delta.

  • maximum_unit — Optional. Maximum unit to show. Acceptable values: seconds, minutes, hours, days, months, years.

Example:

SELECT
    arrayJoin([100, 12345, 432546534]) AS elapsed,
    formatReadableTimeDelta(elapsed) AS time_delta
┌────elapsed─┬─time_delta ─────────────────────────────────────────────────────┐
│        100 │ 1 minute and 40 seconds                                         │
│      12345 │ 3 hours, 25 minutes and 45 seconds                              │
│  432546534 │ 13 years, 8 months, 17 days, 7 hours, 48 minutes and 54 seconds │
└────────────┴─────────────────────────────────────────────────────────────────┘
SELECT
    arrayJoin([100, 12345, 432546534]) AS elapsed,
    formatReadableTimeDelta(elapsed, 'minutes') AS time_delta
┌────elapsed─┬─time_delta ─────────────────────────────────────────────────────┐
│        100 │ 1 minute and 40 seconds                                         │
│      12345 │ 205 minutes and 45 seconds                                      │
│  432546534 │ 7209108 minutes and 54 seconds                                  │
└────────────┴─────────────────────────────────────────────────────────────────┘
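The decomposition in the two tables above can be reproduced with repeated integer division. The Python sketch below assumes a year of 365 days and a month of 30.5 days. These lengths are inferred from the example output, not taken from ClickHouse documentation, and the "0 seconds" result for a zero delta is likewise an assumption.

```python
# (plural, singular, length in seconds); year and month lengths are assumptions.
UNITS = [
    ("years", "year", 365 * 86400),
    ("months", "month", 2635200),   # 30.5 days
    ("days", "day", 86400),
    ("hours", "hour", 3600),
    ("minutes", "minute", 60),
    ("seconds", "second", 1),
]

def format_readable_time_delta(seconds, maximum_unit="years"):
    names = [plural for plural, _, _ in UNITS]
    start = names.index(maximum_unit)   # raises ValueError for an unknown unit
    remaining = int(seconds)
    parts = []
    for plural, singular, length in UNITS[start:]:
        count, remaining = divmod(remaining, length)
        if count:
            parts.append(f"{count} {singular if count == 1 else plural}")
    if not parts:
        return "0 seconds"
    if len(parts) == 1:
        return parts[0]
    return ", ".join(parts[:-1]) + " and " + parts[-1]

for elapsed in (100, 12345, 432546534):
    print(format_readable_time_delta(elapsed))
    print(format_readable_time_delta(elapsed, "minutes"))
```

Passing a smaller maximum_unit simply starts the division further down the list, so everything above that unit is folded into it.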

Returns the smaller of a and b.

Returns the larger of a and b.

Returns the server’s uptime in seconds. If it is executed in the context of a distributed table, then it generates a normal column with values relevant to each shard. Otherwise it produces a constant value.

Returns the version of the server as a string. If it is executed in the context of a distributed table, then it generates a normal column with values relevant to each shard. Otherwise it produces a constant value.

Returns the sequence number of the data block where the row is located.

Returns the ordinal number of the row in the data block. Different data blocks are always recalculated.

Returns the ordinal number of the row in the data block. This function only considers the affected data blocks.

The window function that provides access to a row at a specified offset which comes before or after the current row of a given column.

Syntax

neighbor(column, offset[, default_value])

The result of the function depends on the affected data blocks and the order of data in the block.

WARNING

It can reach the neighbor rows only inside the currently processed data block.

Arguments

  • column — A column name or scalar expression.

  • default_value — Optional. The value to be returned if offset goes beyond the scope of the block. Type of data blocks affected.

Returned values

  • The value of column at the given offset from the current row, if the offset does not go outside the block bounds.

  • The default value for column if the offset goes outside the block bounds. If default_value is given, it is used instead.

Type: type of data blocks affected or default value type.

Example

Query:

SELECT number, neighbor(number, 2) FROM system.numbers LIMIT 10;

Result:

┌─number─┬─neighbor(number, 2)─┐
│      0 │                   2 │
│      1 │                   3 │
│      2 │                   4 │
│      3 │                   5 │
│      4 │                   6 │
│      5 │                   7 │
│      6 │                   8 │
│      7 │                   9 │
│      8 │                   0 │
│      9 │                   0 │
└────────┴─────────────────────┘

Query:

SELECT number, neighbor(number, 2, 999) FROM system.numbers LIMIT 10;

Result:

┌─number─┬─neighbor(number, 2, 999)─┐
│      0 │                        2 │
│      1 │                        3 │
│      2 │                        4 │
│      3 │                        5 │
│      4 │                        6 │
│      5 │                        7 │
│      6 │                        8 │
│      7 │                        9 │
│      8 │                      999 │
│      9 │                      999 │
└────────┴──────────────────────────┘
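The behaviour in the two results above can be modelled on a single data block represented as a Python list. This is a sketch of the lookup rule only; using 0 as the fallback reflects the numeric column in the example and is an assumption for other types.

```python
def neighbor(block, offset, default_value=0):
    """For each row, return the value offset rows away within the same block."""
    result = []
    for i in range(len(block)):
        j = i + offset
        # Rows outside the block cannot be reached, so the default is used.
        result.append(block[j] if 0 <= j < len(block) else default_value)
    return result

numbers = list(range(10))
print(neighbor(numbers, 2))        # [2, 3, 4, 5, 6, 7, 8, 9, 0, 0]
print(neighbor(numbers, 2, 999))   # [2, 3, 4, 5, 6, 7, 8, 9, 999, 999]
print(neighbor(numbers, -1))       # [0, 0, 1, 2, 3, 4, 5, 6, 7, 8]
```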

This function can be used to compute a year-over-year metric value:

Query:

WITH toDate('2018-01-01') AS start_date
SELECT
    toStartOfMonth(start_date + (number * 32)) AS month,
    toInt32(month) % 100 AS money,
    neighbor(money, -12) AS prev_year,
    round(prev_year / money, 2) AS year_over_year
FROM numbers(16)

Result:

┌──────month─┬─money─┬─prev_year─┬─year_over_year─┐
│ 2018-01-01 │    32 │         0 │              0 │
│ 2018-02-01 │    63 │         0 │              0 │
│ 2018-03-01 │    91 │         0 │              0 │
│ 2018-04-01 │    22 │         0 │              0 │
│ 2018-05-01 │    52 │         0 │              0 │
│ 2018-06-01 │    83 │         0 │              0 │
│ 2018-07-01 │    13 │         0 │              0 │
│ 2018-08-01 │    44 │         0 │              0 │
│ 2018-09-01 │    75 │         0 │              0 │
│ 2018-10-01 │     5 │         0 │              0 │
│ 2018-11-01 │    36 │         0 │              0 │
│ 2018-12-01 │    66 │         0 │              0 │
│ 2019-01-01 │    97 │        32 │           0.33 │
│ 2019-02-01 │    28 │        63 │           2.25 │
│ 2019-03-01 │    56 │        91 │           1.62 │
│ 2019-04-01 │    87 │        22 │           0.25 │
└────────────┴───────┴───────────┴────────────────┘

Calculates the difference between successive row values in the data block. Returns 0 for the first row and the difference from the previous row for each subsequent row.

WARNING

It can reach the previous row only inside the currently processed data block.

The result of the function depends on the affected data blocks and the order of data in the block.

Example:

SELECT
    EventID,
    EventTime,
    runningDifference(EventTime) AS delta
FROM
(
    SELECT
        EventID,
        EventTime
    FROM events
    WHERE EventDate = '2016-11-24'
    ORDER BY EventTime ASC
    LIMIT 5
)
┌─EventID─┬───────────EventTime─┬─delta─┐
│    1106 │ 2016-11-24 00:00:04 │     0 │
│    1107 │ 2016-11-24 00:00:05 │     1 │
│    1108 │ 2016-11-24 00:00:05 │     0 │
│    1109 │ 2016-11-24 00:00:09 │     4 │
│    1110 │ 2016-11-24 00:00:10 │     1 │
└─────────┴─────────────────────┴───────┘

Note that the block size affects the result: the runningDifference state is reset with each new block.

SELECT
    number,
    runningDifference(number + 1) AS diff
FROM numbers(100000)
WHERE diff != 1
┌─number─┬─diff─┐
│      0 │    0 │
└────────┴──────┘
┌─number─┬─diff─┐
│  65536 │    0 │
└────────┴──────┘
set max_block_size=100000 -- default value is 65536!

SELECT
    number,
    runningDifference(number + 1) AS diff
FROM numbers(100000)
WHERE diff != 1
┌─number─┬─diff─┐
│      0 │    0 │
└────────┴──────┘
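The reset at block boundaries can be reproduced with a small Python model in which the input is cut into blocks of a given size. This is a sketch of the behaviour shown above, not ClickHouse code; the block_size parameter plays the role of max_block_size.

```python
def running_difference(values, block_size):
    """Difference to the previous row, restarting at 0 for each new block."""
    result = []
    for start in range(0, len(values), block_size):
        previous = None   # state is forgotten when a new block begins
        for value in values[start:start + block_size]:
            result.append(0 if previous is None else value - previous)
            previous = value
    return result

values = [n + 1 for n in range(100000)]

small_blocks = running_difference(values, 65536)
print([i for i, d in enumerate(small_blocks) if d != 1])   # [0, 65536]

one_block = running_difference(values, 100000)
print([i for i, d in enumerate(one_block) if d != 1])      # [0]
```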

Calculates the number of concurrent events. Each event has a start time and an end time. The start time is included in the event, while the end time is excluded. Columns with a start time and an end time must be of the same data type. The function calculates the total number of active (concurrent) events for each event start time.

WARNING

Events must be ordered by the start time in ascending order. If this requirement is violated, the function raises an exception. Every data block is processed separately. If events from different data blocks overlap, they cannot be processed correctly.

Syntax

runningConcurrency(start, end)

Arguments

Returned values

  • The number of concurrent events at each event start time.

Example

Consider the table:

┌──────start─┬────────end─┐
│ 2021-03-03 │ 2021-03-11 │
│ 2021-03-06 │ 2021-03-12 │
│ 2021-03-07 │ 2021-03-08 │
│ 2021-03-11 │ 2021-03-12 │
└────────────┴────────────┘

Query:

SELECT start, runningConcurrency(start, end) FROM example_table;

Result:

┌──────start─┬─runningConcurrency(start, end)─┐
│ 2021-03-03 │                              1 │
│ 2021-03-06 │                              2 │
│ 2021-03-07 │                              3 │
│ 2021-03-11 │                              2 │
└────────────┴────────────────────────────────┘
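The counts in the result above follow directly from the rule that the start time is included and the end time is excluded. A straightforward (quadratic) Python sketch of that rule for one data block, using the same four events:

```python
from datetime import date

def running_concurrency(events):
    """events: (start, end) pairs sorted by start; end is exclusive."""
    result = []
    for i, (start, _) in enumerate(events):
        if i > 0 and start < events[i - 1][0]:
            raise ValueError("events must be ordered by start time")
        # Count events that have started and have not yet ended at this start time.
        active = sum(1 for s, e in events[:i + 1] if s <= start < e)
        result.append(active)
    return result

events = [
    (date(2021, 3, 3), date(2021, 3, 11)),
    (date(2021, 3, 6), date(2021, 3, 12)),
    (date(2021, 3, 7), date(2021, 3, 8)),
    (date(2021, 3, 11), date(2021, 3, 12)),
]
print(running_concurrency(events))  # [1, 2, 3, 2]
```

The last event starts on 2021-03-11, exactly when the first one ends, so the first one no longer counts and the result is 2.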

Accepts a UInt64 number. Interprets it as a MAC address in big endian. Returns a string containing the corresponding MAC address in the format AA:BB:CC:DD:EE:FF (colon-separated numbers in hexadecimal form).

The inverse function of MACNumToString. If the MAC address has an invalid format, it returns 0.

Accepts a MAC address in the format AA:BB:CC:DD:EE:FF (colon-separated numbers in hexadecimal form). Returns the first three octets as a UInt64 number. If the MAC address has an invalid format, it returns 0.
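The three MAC address conversions described above are bit manipulations on a 48-bit number. The Python sketch below models them; the function names are hypothetical, and the strict two-hex-digits-per-octet validation is an assumption about what counts as an invalid format.

```python
import re

_MAC_PATTERN = re.compile(r"[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}")

def mac_num_to_string(num):
    """Format the low 48 bits of num as AA:BB:CC:DD:EE:FF (big endian)."""
    octets = [(num >> shift) & 0xFF for shift in range(40, -8, -8)]
    return ":".join(f"{octet:02X}" for octet in octets)

def mac_string_to_num(mac):
    """Parse AA:BB:CC:DD:EE:FF into a number; return 0 for an invalid format."""
    if not _MAC_PATTERN.fullmatch(mac):
        return 0
    num = 0
    for part in mac.split(":"):
        num = (num << 8) | int(part, 16)
    return num

def mac_string_to_oui(mac):
    """Return the first three octets as a number; 0 for an invalid format."""
    return mac_string_to_num(mac) >> 24

print(mac_string_to_num("AA:BB:CC:DD:EE:FF"))   # 187723572702975
print(mac_num_to_string(0xAABBCCDDEEFF))        # AA:BB:CC:DD:EE:FF
print(hex(mac_string_to_oui("AA:BB:CC:DD:EE:FF")))  # 0xaabbcc
```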

getSizeOfEnumType(value)

Arguments:

  • value — Value of type Enum.

Returned values

  • The number of fields with Enum input values.

  • An exception is thrown if the type is not Enum.

Example

SELECT getSizeOfEnumType( CAST('a' AS Enum8('a' = 1, 'b' = 2) ) ) AS x
┌─x─┐
│ 2 │
└───┘

Returns size on disk (without taking into account compression).

blockSerializedSize(value[, value[, ...]])

Arguments

  • value — Any value.

Returned values

  • The number of bytes that will be written to disk for block of values (without compression).

Example

Query:

SELECT blockSerializedSize(maxState(1)) as x

Result:

┌─x─┐
│ 2 │
└───┘

Returns the name of the class that represents the data type of the column in RAM.

toColumnTypeName(value)

Arguments:

  • value — Any type of value.

Returned values

  • A string with the name of the class that is used for representing the value data type in RAM.

Example of the difference between toTypeName and toColumnTypeName

SELECT toTypeName(CAST('2018-01-01 01:02:03' AS DateTime))
┌─toTypeName(CAST('2018-01-01 01:02:03', 'DateTime'))─┐
│ DateTime                                            │
└─────────────────────────────────────────────────────┘
SELECT toColumnTypeName(CAST('2018-01-01 01:02:03' AS DateTime))
┌─toColumnTypeName(CAST('2018-01-01 01:02:03', 'DateTime'))─┐
│ Const(UInt32)                                             │
└───────────────────────────────────────────────────────────┘

The example shows that the DateTime data type is stored in memory as Const(UInt32).

Outputs a detailed description of data structures in RAM.

dumpColumnStructure(value)

Arguments:

  • value — Any type of value.

Returned values

  • A string describing the structure that is used for representing the value data type in RAM.

Example

SELECT dumpColumnStructure(CAST('2018-01-01 01:02:03', 'DateTime'))
┌─dumpColumnStructure(CAST('2018-01-01 01:02:03', 'DateTime'))─┐
│ DateTime, Const(size = 1, UInt32(size = 1))                  │
└──────────────────────────────────────────────────────────────┘

Outputs the default value for the data type.

Does not include default values for custom columns set by the user.

defaultValueOfArgumentType(expression)

Arguments:

  • expression — Arbitrary type of value or an expression that results in a value of an arbitrary type.

Returned values

  • 0 for numbers.

  • Empty string for strings.

Example

SELECT defaultValueOfArgumentType( CAST(1 AS Int8) )
┌─defaultValueOfArgumentType(CAST(1, 'Int8'))─┐
│                                           0 │
└─────────────────────────────────────────────┘
SELECT defaultValueOfArgumentType( CAST(1 AS Nullable(Int8) ) )
┌─defaultValueOfArgumentType(CAST(1, 'Nullable(Int8)'))─┐
│                                                  ᴺᵁᴸᴸ │
└───────────────────────────────────────────────────────┘

Outputs the default value for given type name.

Does not include default values for custom columns set by the user.

defaultValueOfTypeName(type)

Arguments:

  • type — A string representing a type name.

Returned values

  • 0 for numbers.

  • Empty string for strings.

Example

SELECT defaultValueOfTypeName('Int8')
┌─defaultValueOfTypeName('Int8')─┐
│                              0 │
└────────────────────────────────┘
SELECT defaultValueOfTypeName('Nullable(Int8)')
┌─defaultValueOfTypeName('Nullable(Int8)')─┐
│                                     ᴺᵁᴸᴸ │
└──────────────────────────────────────────┘

The function is intended for debugging and introspection purposes. The function ignores its argument and always returns 1. Arguments are not even evaluated.

But for the purpose of index analysis, the argument of this function is analyzed as if it were present directly, without being wrapped inside the indexHint function. This allows selecting data in index ranges by the corresponding condition without further filtering by this condition. The index in ClickHouse is sparse, so using indexHint yields more data than specifying the same condition directly.

Syntax

SELECT * FROM table WHERE indexHint(<expression>)

Returned value

Example

Input table:

SELECT count() FROM ontime
┌─count()─┐
│ 4276457 │
└─────────┘

The table has indexes on the fields (FlightDate, (Year, FlightDate)).

Create a query, where the index is not used.

Query:

SELECT FlightDate AS k, count() FROM ontime GROUP BY k ORDER BY k

ClickHouse processed the entire table (Processed 4.28 million rows).

Result:

┌──────────k─┬─count()─┐
│ 2017-01-01 │   13970 │
│ 2017-01-02 │   15882 │
........................
│ 2017-09-28 │   16411 │
│ 2017-09-29 │   16384 │
│ 2017-09-30 │   12520 │
└────────────┴─────────┘

To apply the index, select a specific date.

Query:

SELECT FlightDate AS k, count() FROM ontime WHERE k = '2017-09-15' GROUP BY k ORDER BY k

By using the index, ClickHouse processed a significantly smaller number of rows (Processed 32.74 thousand rows).

Result:

┌──────────k─┬─count()─┐
│ 2017-09-15 │   16428 │
└────────────┴─────────┘

Now wrap the expression k = '2017-09-15' into indexHint function.

Query:

SELECT
    FlightDate AS k,
    count()
FROM ontime
WHERE indexHint(k = '2017-09-15')
GROUP BY k
ORDER BY k ASC

ClickHouse used the index in the same way as the previous time (Processed 32.74 thousand rows). The expression k = '2017-09-15' was not used when generating the result. In this example, the indexHint function makes it possible to see adjacent dates.

Result:

┌──────────k─┬─count()─┐
│ 2017-09-14 │    7071 │
│ 2017-09-15 │   16428 │
│ 2017-09-16 │    1077 │
│ 2017-09-30 │    8167 │
└────────────┴─────────┘
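Why adjacent dates appear can be illustrated with a toy model of a sparse index. In the Python sketch below the sorted keys are cut into fixed-size granules and the index stores only the first key of each granule. The granule size, the data and the function name are all made up for illustration, and the real ClickHouse index analysis is more involved.

```python
def rows_read_by_index(sorted_keys, granule_size, target):
    """Return all rows of every granule that may contain target."""
    starts = range(0, len(sorted_keys), granule_size)
    marks = [sorted_keys[s] for s in starts]   # first key of each granule
    rows = []
    for i, s in enumerate(starts):
        lower = marks[i]
        upper = marks[i + 1] if i + 1 < len(marks) else None
        # The index only knows the marks, so whole granules are kept or skipped.
        if lower <= target and (upper is None or target <= upper):
            rows.extend(sorted_keys[s:s + granule_size])
    return rows

keys = [14, 14, 15, 15, 15, 15, 16, 16, 17, 17, 17, 18]

hinted = rows_read_by_index(keys, 4, 15)    # roughly what indexHint(k = 15) lets through
filtered = [k for k in hinted if k == 15]   # what a plain WHERE k = 15 returns

print(sorted(set(hinted)))    # [14, 15, 16]
print(sorted(set(filtered)))  # [15]
```

The index narrows the scan to two granules, but those granules also hold neighbouring keys. A regular WHERE clause filters them out afterwards, whereas indexHint skips that second step.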

Creates an array with a single value.

SELECT replicate(x, arr);

Arguments:

  • arr — Original array. ClickHouse creates a new array of the same length as the original and fills it with the value x.

  • x — The value that the resulting array will be filled with.

Returned value

An array filled with the value x.

Type: Array.

Example

Query:

SELECT replicate(1, ['a', 'b', 'c'])

Result:

┌─replicate(1, ['a', 'b', 'c'])─┐
│ [1,1,1]                       │
└───────────────────────────────┘

Syntax

filesystemAvailable()

Returned value

  • The amount of remaining space available in bytes.

Example

Query:

SELECT formatReadableSize(filesystemAvailable()) AS "Available space", toTypeName(filesystemAvailable()) AS "Type";

Result:

┌─Available space─┬─Type───┐
│ 30.75 GiB       │ UInt64 │
└─────────────────┴────────┘

Returns the total amount of free space on the filesystem where the database files are located. See also filesystemAvailable.

Syntax

filesystemFree()

Returned value

  • Amount of free space in bytes.

Example

Query:

SELECT formatReadableSize(filesystemFree()) AS "Free space", toTypeName(filesystemFree()) AS "Type";

Result:

┌─Free space─┬─Type───┐
│ 32.39 GiB  │ UInt64 │
└────────────┴────────┘

Syntax

filesystemCapacity()

Returned value

  • Capacity information of the filesystem in bytes.

Example

Query:

SELECT formatReadableSize(filesystemCapacity()) AS "Capacity", toTypeName(filesystemCapacity()) AS "Type"

Result:

┌─Capacity──┬─Type───┐
│ 39.32 GiB │ UInt64 │
└───────────┴────────┘

Syntax

initializeAggregation(aggregate_function, arg1, arg2, ..., argN)

Arguments

  • arg — Arguments of aggregate function.

Returned value(s)

  • Result of aggregation for every row passed to the function.

The return type is the same as the return type of the function that initializeAggregation takes as its first argument.

Example

Query:

SELECT uniqMerge(state) FROM (SELECT initializeAggregation('uniqState', number % 3) AS state FROM numbers(10000));

Result:

┌─uniqMerge(state)─┐
│                3 │
└──────────────────┘

Query:

SELECT finalizeAggregation(state), toTypeName(state) FROM (SELECT initializeAggregation('sumState', number % 3) AS state FROM numbers(5));

Result:

┌─finalizeAggregation(state)─┬─toTypeName(state)─────────────┐
│                          0 │ AggregateFunction(sum, UInt8) │
│                          1 │ AggregateFunction(sum, UInt8) │
│                          2 │ AggregateFunction(sum, UInt8) │
│                          0 │ AggregateFunction(sum, UInt8) │
│                          1 │ AggregateFunction(sum, UInt8) │
└────────────────────────────┴───────────────────────────────┘

Example with AggregatingMergeTree table engine and AggregateFunction column:

CREATE TABLE metrics
(
    key UInt64,
    value AggregateFunction(sum, UInt64) DEFAULT initializeAggregation('sumState', toUInt64(0))
)
ENGINE = AggregatingMergeTree
ORDER BY key;

INSERT INTO metrics VALUES (0, initializeAggregation('sumState', toUInt64(42)));

Syntax

finalizeAggregation(state)

Arguments

Returned value(s)

  • The value or values that were aggregated.

Type: the type of the value that was aggregated.

Examples

Query:

SELECT finalizeAggregation(( SELECT countState(number) FROM numbers(10)));

Result:

┌─finalizeAggregation(_subquery16)─┐
│                               10 │
└──────────────────────────────────┘

Query:

SELECT finalizeAggregation(( SELECT sumState(number) FROM numbers(10)));

Result:

┌─finalizeAggregation(_subquery20)─┐
│                               45 │
└──────────────────────────────────┘

Note that NULL values are ignored.

Query:

SELECT finalizeAggregation(arrayReduce('anyState', [NULL, 2, 3]));

Result:

┌─finalizeAggregation(arrayReduce('anyState', [NULL, 2, 3]))─┐
│                                                          2 │
└────────────────────────────────────────────────────────────┘

Combined example:

Query:

WITH initializeAggregation('sumState', number) AS one_row_sum_state
SELECT
    number,
    finalizeAggregation(one_row_sum_state) AS one_row_sum,
    runningAccumulate(one_row_sum_state) AS cumulative_sum
FROM numbers(10);

Result:

┌─number─┬─one_row_sum─┬─cumulative_sum─┐
│      0 │           0 │              0 │
│      1 │           1 │              1 │
│      2 │           2 │              3 │
│      3 │           3 │              6 │
│      4 │           4 │             10 │
│      5 │           5 │             15 │
│      6 │           6 │             21 │
│      7 │           7 │             28 │
│      8 │           8 │             36 │
│      9 │           9 │             45 │
└────────┴─────────────┴────────────────┘

Accumulates states of an aggregate function for each row of a data block.

WARNING

The state is reset for each new data block.

Syntax

runningAccumulate(agg_state[, grouping]);

Arguments

Returned value

  • Each resulting row contains a result of the aggregate function, accumulated for all the input rows from 0 to the current position. runningAccumulate resets states for each new data block or when the grouping value changes.

Type depends on the aggregate function used.

Examples

Consider how you can use runningAccumulate to find the cumulative sum of numbers without and with grouping.

Query:

SELECT k, runningAccumulate(sum_k) AS res FROM (SELECT number as k, sumState(k) AS sum_k FROM numbers(10) GROUP BY k ORDER BY k);

Result:

┌─k─┬─res─┐
│ 0 │   0 │
│ 1 │   1 │
│ 2 │   3 │
│ 3 │   6 │
│ 4 │  10 │
│ 5 │  15 │
│ 6 │  21 │
│ 7 │  28 │
│ 8 │  36 │
│ 9 │  45 │
└───┴─────┘

The whole query does the following:

  1. For the first row, runningAccumulate takes sumState(0) and returns 0.

  2. For the second row, the function merges sumState(0) and sumState(1) resulting in sumState(0 + 1), and returns 1 as a result.

  3. For the third row, the function merges sumState(0 + 1) and sumState(2) resulting in sumState(0 + 1 + 2), and returns 3 as a result.

  4. The actions are repeated until the block ends.

The following example shows how the grouping parameter is used:

Query:

SELECT
    grouping,
    item,
    runningAccumulate(state, grouping) AS res
FROM
(
    SELECT
        toInt8(number / 4) AS grouping,
        number AS item,
        sumState(number) AS state
    FROM numbers(15)
    GROUP BY item
    ORDER BY item ASC
);

Result:

┌─grouping─┬─item─┬─res─┐
│        0 │    0 │   0 │
│        0 │    1 │   1 │
│        0 │    2 │   3 │
│        0 │    3 │   6 │
│        1 │    4 │   4 │
│        1 │    5 │   9 │
│        1 │    6 │  15 │
│        1 │    7 │  22 │
│        2 │    8 │   8 │
│        2 │    9 │  17 │
│        2 │   10 │  27 │
│        2 │   11 │  38 │
│        3 │   12 │  12 │
│        3 │   13 │  25 │
│        3 │   14 │  39 │
└──────────┴──────┴─────┘

As you can see, runningAccumulate merges states for each group of rows separately.
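The accumulation and reset behaviour shown above can be modelled outside ClickHouse. The following Python sketch only illustrates the semantics for a sum state within a single data block; the function name running_accumulate is hypothetical and is not part of ClickHouse.

```python
def running_accumulate(values, groups=None):
    """Model of runningAccumulate for a sum state within one data block.

    values -- per-row values (each row stands for sumState(value))
    groups -- optional per-row grouping keys; the running state is
              reset whenever the key differs from the previous row
    """
    result = []
    total = 0
    previous_group = None
    for index, value in enumerate(values):
        group = groups[index] if groups is not None else None
        if index > 0 and group != previous_group:
            total = 0  # grouping value changed: start a new accumulation
        total += value
        result.append(total)
        previous_group = group
    return result


numbers = list(range(15))
# Without grouping: a plain cumulative sum over the block.
print(running_accumulate(range(10)))
# With grouping: the key n // 4 mirrors toInt8(number / 4) in the query above.
print(running_accumulate(numbers, [n // 4 for n in numbers]))
```

The two printed lists correspond to the res columns of the two result tables above.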

Only supports tables created with the ENGINE = Join(ANY, LEFT, <join_keys>) statement.

Syntax

joinGet(join_storage_table_name, `value_column`, join_keys)

Arguments

  • value_column — name of the column of the table that contains required data.

  • join_keys — list of keys.

Returned value

Returns a list of values corresponding to the list of keys.

Example

Input table:

CREATE DATABASE db_test;
CREATE TABLE db_test.id_val(`id` UInt32, `val` UInt32) ENGINE = Join(ANY, LEFT, id) SETTINGS join_use_nulls = 1;
INSERT INTO db_test.id_val VALUES (1,11)(2,12)(4,13);

┌─id─┬─val─┐
│  4 │  13 │
│  2 │  12 │
│  1 │  11 │
└────┴─────┘

Query:

SELECT joinGet(db_test.id_val,'val',toUInt32(number)) from numbers(4) SETTINGS join_use_nulls = 1

Result:

┌─joinGet(db_test.id_val, 'val', toUInt32(number))─┐
│                                                0 │
│                                               11 │
│                                               12 │
│                                                0 │
└──────────────────────────────────────────────────┘

SELECT feat_1, ..., feat_n, catboostEvaluate('/path/to/model.bin', feat_1, ..., feat_n) AS prediction
FROM data_table

Prerequisites

  1. Build the catboost evaluation library

Next, specify the path to libcatboostmodel.<so|dylib> in the clickhouse configuration:

<clickhouse>
...
    <catboost_lib_path>/path/to/libcatboostmodel.so</catboost_lib_path>
...
</clickhouse>

For security and isolation reasons, the model evaluation does not run in the server process but in the clickhouse-library-bridge process. At the first execution of catboostEvaluate(), the server starts the library bridge process if it is not running already. Both processes communicate using a HTTP interface. By default, port 9012 is used. A different port can be specified as follows - this is useful if port 9012 is already assigned to a different service.

<library_bridge>
    <port>9019</port>
</library_bridge>

  2. Train a catboost model using libcatboost

Throws an exception if the argument is non-zero.

  • message - Optional. A constant string providing a custom error message.

  • error_code - Optional. A constant integer providing a custom error code.

To use the error_code argument, the configuration parameter allow_custom_error_code_in_throwif must be enabled.

SELECT throwIf(number = 3, 'Too many') FROM numbers(10);
↙ Progress: 0.00 rows, 0.00 B (0.00 rows/s., 0.00 B/s.) Received exception from server (version 19.14.1):
Code: 395. DB::Exception: Received from localhost:9000. DB::Exception: Too many.

Returns the same value that was passed as its argument. Used for debugging and testing: it prevents the use of an index, so you can measure the query performance of a full scan. When a query is analyzed for possible index use, the analyzer does not look inside identity functions. Constant folding is not applied either.

Syntax

identity(x)

Example

Query:

SELECT identity(42)

Result:

┌─identity(42)─┐
│           42 │
└──────────────┘

Syntax

getSetting('custom_setting');

Parameter

Returned value

  • The current value of the setting.

Example

SET custom_a = 123;
SELECT getSetting('custom_a');

Result

123

Syntax

isDecimalOverflow(d, [p])

Arguments

Returned values

  • 1 — Decimal value has more digits than its precision allows.

  • 0 — Decimal value satisfies the specified precision.

Example

Query:

SELECT isDecimalOverflow(toDecimal32(1000000000, 0), 9),
       isDecimalOverflow(toDecimal32(1000000000, 0)),
       isDecimalOverflow(toDecimal32(-1000000000, 0), 9),
       isDecimalOverflow(toDecimal32(-1000000000, 0));

Result:

1   1   1   1

Returns number of decimal digits you need to represent the value.

Syntax

countDigits(x)

Arguments

Returned value

Number of digits.

Example

Query:

SELECT countDigits(toDecimal32(1, 9)), countDigits(toDecimal32(-1, 9)),
       countDigits(toDecimal64(1, 18)), countDigits(toDecimal64(-1, 18)),
       countDigits(toDecimal128(1, 38)), countDigits(toDecimal128(-1, 38));

Result:

10  10  19  19  39  39
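The results above are consistent with countDigits counting the digits of the underlying integer that stores the Decimal, that is, the value multiplied by 10^scale. The following Python sketch models that reading. It is an assumption inferred from the example output, and count_digits is a hypothetical helper, not a ClickHouse API.

```python
def count_digits(value: int, scale: int = 0) -> int:
    """Digits in the integer that backs a Decimal with the given scale.

    Assumption: a Decimal(value, scale) is stored as value * 10**scale,
    and the sign does not count as a digit.
    """
    underlying = abs(value) * 10 ** scale
    return len(str(underlying))


# toDecimal32(1, 9), toDecimal64(1, 18), toDecimal128(1, 38)
print(count_digits(1, 9), count_digits(1, 18), count_digits(1, 38))
```

Under this reading, a value whose digit count exceeds the maximum precision of its Decimal type is exactly what isDecimalOverflow reports.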

Returned value

  • Variable name for the error code.

Syntax

errorCodeToName(1)

Result:

UNSUPPORTED_METHOD

Syntax

tcpPort()

Arguments

  • None.

Returned value

  • The TCP port number.

Example

Query:

SELECT tcpPort();

Result:

┌─tcpPort()─┐
│      9000 │
└───────────┘

RANDOM NUMBER AND STRING

All the functions accept zero arguments or one argument. If an argument is passed, it can be any type, and its value is not used for anything. The only purpose of this argument is to prevent common subexpression elimination, so that two different instances of the same function return different columns with different random numbers.

NOTE

Non-cryptographic generators of pseudo-random numbers are used.

Returns a pseudo-random UInt32 number, evenly distributed among all UInt32-type numbers.

Uses a linear congruential generator.

Returns a pseudo-random UInt64 number, evenly distributed among all UInt64-type numbers.

Uses a linear congruential generator.

The function generates pseudo-random values that are independent and uniformly distributed in [0, 1).

Non-deterministic. Return type is Float64.

Produces a constant column with a random value.

Syntax

randConstant([x])

Arguments

Returned value

  • Pseudo-random number.

Example

Query:

SELECT rand(), rand(1), rand(number), randConstant(), randConstant(1), randConstant(number)
FROM numbers(3)

Result:

┌─────rand()─┬────rand(1)─┬─rand(number)─┬─randConstant()─┬─randConstant(1)─┬─randConstant(number)─┐
│ 3047369878 │ 4132449925 │   4044508545 │     2740811946 │      4229401477 │           1924032898 │
│ 2938880146 │ 1267722397 │   4154983056 │     2740811946 │      4229401477 │           1924032898 │
│  956619638 │ 4238287282 │   1104342490 │     2740811946 │      4229401477 │           1924032898 │
└────────────┴────────────┴──────────────┴────────────────┴─────────────────┴──────────────────────┘

Syntax

fuzzBits([s], [prob])

Inverts bits of s, each with probability prob.

Arguments

  • s - String or FixedString

  • prob - constant Float32/64

Returned value

Fuzzed string of the same type as s.

Example

SELECT fuzzBits(materialize('abacaba'), 0.1)
FROM numbers(3)

Result:

┌─fuzzBits(materialize('abacaba'), 0.1)─┐
│ abaaaja                               │
│ a*cjab+                               │
│ aeca2A                                │
└───────────────────────────────────────┘

REPLACING IN STRINGS

Replaces the first occurrence of the substring ‘pattern’ (if it exists) in ‘haystack’ by the ‘replacement’ string. ‘pattern’ and ‘replacement’ must be constants.

Replaces all occurrences of the substring ‘pattern’ in ‘haystack’ by the ‘replacement’ string.

Example 1. Converting ISO dates to American format:

SELECT DISTINCT
    EventDate,
    replaceRegexpOne(toString(EventDate), '(\\d{4})-(\\d{2})-(\\d{2})', '\\2/\\3/\\1') AS res
FROM test.hits
LIMIT 7
FORMAT TabSeparated

2014-03-17      03/17/2014
2014-03-18      03/18/2014
2014-03-19      03/19/2014
2014-03-20      03/20/2014
2014-03-21      03/21/2014
2014-03-22      03/22/2014
2014-03-23      03/23/2014

Example 2. Copying a string ten times:

SELECT replaceRegexpOne('Hello, World!', '.*', '\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0') AS res
┌─res────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Hello, World!Hello, World!Hello, World!Hello, World!Hello, World!Hello, World!Hello, World!Hello, World!Hello, World!Hello, World! │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Like ‘replaceRegexpOne’, but replaces all occurrences of the pattern. Example:

SELECT replaceRegexpAll('Hello, World!', '.', '\\0\\0') AS res
┌─res────────────────────────┐
│ HHeelllloo,,  WWoorrlldd!! │
└────────────────────────────┘

As an exception, if a regular expression worked on an empty substring, the replacement is not made more than once. Example:

SELECT replaceRegexpAll('Hello, World!', '^', 'here: ') AS res
┌─res─────────────────┐
│ here: Hello, World! │
└─────────────────────┘
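The replaceRegexpOne and replaceRegexpAll examples above can be reproduced with a small Python model. This is a sketch only: Python's re module is not the regex engine ClickHouse uses, the helper names are hypothetical, and the whole-match reference is written \g<0> in Python rather than \0.

```python
import re


def replace_regexp_one(haystack, pattern, replacement):
    """Replace only the first match of pattern."""
    return re.sub(pattern, replacement, haystack, count=1)


def replace_regexp_all(haystack, pattern, replacement):
    """Replace every match of pattern."""
    return re.sub(pattern, replacement, haystack)


# ISO date to American format, using group references \1..\3.
print(replace_regexp_one("2014-03-17", r"(\d{4})-(\d{2})-(\d{2})", r"\2/\3/\1"))
# Every character doubled: the whole match is referenced twice.
print(replace_regexp_all("Hello, World!", ".", r"\g<0>\g<0>"))
# A pattern that matches the empty string at the start is applied once.
print(replace_regexp_all("Hello, World!", "^", "here: "))
```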

ROUNDING

Returns the largest round number that is less than or equal to x. A round number is a multiple of 1/10^N, or the nearest number of the appropriate data type if 1/10^N isn’t exact. ‘N’ is an optional integer constant. By default it is zero, which means rounding to an integer. ‘N’ may be negative.

Examples: floor(123.45, 1) = 123.4, floor(123.45, -1) = 120.

x is any numeric type. The result is a number of the same type. For integer arguments, it makes sense to round with a negative N value (for non-negative N, the function does not do anything). If rounding causes overflow (for example, floor(-128, -1)), an implementation-specific result is returned.

Returns the smallest round number that is greater than or equal to x. In every other way, it is the same as the floor function (see above).

Returns the round number with the largest absolute value whose absolute value is less than or equal to that of x. In every other way, it is the same as the ‘floor’ function (see above).
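The three descriptions above (rounding down, rounding up, and rounding towards zero to a multiple of 1/10^N) can be modelled with Python's decimal module. This is an illustrative sketch, not the ClickHouse implementation; round_to is a hypothetical helper, and the input is passed as a decimal literal to avoid binary floating point effects.

```python
from decimal import Decimal, ROUND_CEILING, ROUND_DOWN, ROUND_FLOOR


def round_to(x: str, n: int, mode: str) -> Decimal:
    """Round the decimal literal x to a multiple of 1/10^n using mode."""
    step = Decimal(1).scaleb(-n)  # 10^-n: 0.1 for n=1, 10 for n=-1
    return Decimal(x).quantize(step, rounding=mode)


print(round_to("123.45", 1, ROUND_FLOOR))     # floor: 123.4
print(round_to("123.45", 1, ROUND_CEILING))   # ceil: 123.5
print(round_to("-123.45", 1, ROUND_FLOOR))    # floor moves away from zero: -123.5
print(round_to("-123.45", 1, ROUND_DOWN))     # trunc moves towards zero: -123.4
# Negative N rounds to the left of the decimal point (a multiple of 10 here).
print(round_to("123.45", -1, ROUND_FLOOR) == 120)
```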

Rounds a value to a specified number of decimal places.

The function returns the nearest number of the specified order. If the given number is equidistant from the surrounding numbers, the function uses banker’s rounding for float number types and rounds away from zero for the other number types (Decimal).

round(expression [, decimal_places])

Arguments

  • decimal_places — An integer value.

    • If decimal_places > 0 then the function rounds the value to the right of the decimal point.

    • If decimal_places < 0 then the function rounds the value to the left of the decimal point.

    • If decimal_places = 0 then the function rounds the value to an integer. In this case the argument can be omitted.

Returned value:

The rounded number of the same type as the input number.

Example of use with Float

SELECT number / 2 AS x, round(x) FROM system.numbers LIMIT 3
┌───x─┬─round(divide(number, 2))─┐
│   0 │                        0 │
│ 0.5 │                        0 │
│   1 │                        1 │
└─────┴──────────────────────────┘

Example of use with Decimal

SELECT cast(number / 2 AS  Decimal(10,4)) AS x, round(x) FROM system.numbers LIMIT 3
┌──────x─┬─round(CAST(divide(number, 2), 'Decimal(10, 4)'))─┐
│ 0.0000 │                                           0.0000 │
│ 0.5000 │                                           1.0000 │
│ 1.0000 │                                           1.0000 │
└────────┴──────────────────────────────────────────────────┘

Examples of rounding

Rounding to the nearest number.

round(3.2, 0) = 3
round(4.1267, 2) = 4.13
round(22,-1) = 20
round(467,-2) = 500
round(-467,-2) = -500

Banker’s rounding.

round(3.5) = 4
round(4.5) = 4
round(3.55, 1) = 3.6
round(3.65, 1) = 3.6

Rounds a number to a specified decimal position.

  • If the rounding number is halfway between two numbers, the function uses banker’s rounding.

    Banker’s rounding is a method of rounding fractional numbers. When the rounding number is halfway between two numbers, it is rounded to the nearest even digit at the specified decimal position. For example: 3.5 rounds up to 4, 2.5 rounds down to 2.

    It is the default rounding method for floating point numbers defined in IEEE 754 (https://en.wikipedia.org/wiki/IEEE_754#Roundings_to_nearest). The round function performs the same rounding for floating point numbers. The roundBankers function also rounds integers the same way, for example, roundBankers(45, -1) = 40.

  • In other cases, the function rounds numbers to the nearest integer.

Using banker’s rounding, you can reduce the effect that rounding numbers has on the results of summing or subtracting these numbers.

For example, sum numbers 1.5, 2.5, 3.5, 4.5 with different rounding:

  • No rounding: 1.5 + 2.5 + 3.5 + 4.5 = 12.

  • Banker’s rounding: 2 + 2 + 4 + 4 = 12.

  • Rounding to the nearest integer: 2 + 3 + 4 + 5 = 14.
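The comparison above can be checked with a short Python sketch. It uses the decimal module so that the halfway values are exact; ROUND_HALF_EVEN stands for banker's rounding and ROUND_HALF_UP for the usual rounding of halves away from zero. The helper name rounded_sum is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

values = [Decimal("1.5"), Decimal("2.5"), Decimal("3.5"), Decimal("4.5")]


def rounded_sum(numbers, mode):
    """Round each number to an integer with the given mode, then sum."""
    return sum(n.quantize(Decimal("1"), rounding=mode) for n in numbers)


print(sum(values))                           # no rounding: 12.0
print(rounded_sum(values, ROUND_HALF_EVEN))  # banker's rounding: 12
print(rounded_sum(values, ROUND_HALF_UP))    # halves rounded up: 14
```

Because halves alternate between rounding down and rounding up, the errors cancel in the banker's rounding sum, while rounding every half up accumulates a bias.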

Syntax

roundBankers(expression [, decimal_places])

Arguments

  • decimal_places — Decimal places. An integer number.

    • decimal_places > 0 — The function rounds the number to the given position right of the decimal point. Example: roundBankers(3.55, 1) = 3.6.

    • decimal_places < 0 — The function rounds the number to the given position left of the decimal point. Example: roundBankers(24.55, -1) = 20.

    • decimal_places = 0 — The function rounds the number to an integer. In this case the argument can be omitted. Example: roundBankers(2.5) = 2.

Returned value

A value rounded by the banker’s rounding method.

Example of use

Query:

SELECT number / 2 AS x, roundBankers(x, 0) AS b FROM system.numbers LIMIT 10

Result:

┌───x─┬─b─┐
│   0 │ 0 │
│ 0.5 │ 0 │
│   1 │ 1 │
│ 1.5 │ 2 │
│   2 │ 2 │
│ 2.5 │ 2 │
│   3 │ 3 │
│ 3.5 │ 4 │
│   4 │ 4 │
│ 4.5 │ 4 │
└─────┴───┘

Examples of Banker’s rounding

roundBankers(0.4) = 0
roundBankers(-3.5) = -4
roundBankers(4.5) = 4
roundBankers(3.55, 1) = 3.6
roundBankers(3.65, 1) = 3.6
roundBankers(10.35, 1) = 10.4
roundBankers(10.755, 2) = 10.76

Accepts a number. If the number is less than one, it returns 0. Otherwise, it rounds the number down to the nearest (whole non-negative) power of two.

Accepts a number. If the number is less than one, it returns 0. Otherwise, it rounds the number down to numbers from the set: 1, 10, 30, 60, 120, 180, 240, 300, 600, 1200, 1800, 3600, 7200, 18000, 36000.

Accepts a number. If the number is less than 18, it returns 0. Otherwise, it rounds the number down to a number from the set: 18, 25, 35, 45, 55.

Accepts a number and rounds it down to an element in the specified array. If the value is less than the lowest bound, the lowest bound is returned.
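Rounding down to an element of an array, as described in the last paragraph, can be sketched in Python with a binary search. This is only a model of the stated behaviour; round_down is a hypothetical name, and sorting the bounds first is an assumption of this sketch.

```python
from bisect import bisect_right


def round_down(x, bounds):
    """Return the largest bound <= x, or the lowest bound if x is below all."""
    ordered = sorted(bounds)          # assumption: order of the array is irrelevant
    index = bisect_right(ordered, x)  # number of bounds that are <= x
    return ordered[max(index - 1, 0)]


ages = [18, 25, 35, 45, 55]
print(round_down(47, ages))  # 45
print(round_down(10, ages))  # below the lowest bound, so 18 is returned
```

Note the difference from the fixed-set variants described above, which return 0 rather than the lowest bound for values below the set.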

SEARCHING IN STRINGS

Searches for the substring needle in the string haystack.

Returns the position (in bytes) of the found substring in the string, starting from 1.

Syntax

position(haystack, needle[, start_pos])
position(needle IN haystack)

Alias: locate(haystack, needle[, start_pos]).

NOTE

The position(needle IN haystack) syntax provides SQL compatibility; the function works the same way as position(haystack, needle).

Arguments

Returned values

  • Starting position in bytes (counting from 1), if substring was found.

  • 0, if the substring was not found.

Type: Integer.

Examples

The phrase “Hello, world!” contains a set of bytes representing a single-byte encoded text. The function returns some expected result:

Query:

SELECT position('Hello, world!', '!');

Result:

┌─position('Hello, world!', '!')─┐
│                             13 │
└────────────────────────────────┘

Query:

SELECT
    position('Hello, world!', 'o', 1),
    position('Hello, world!', 'o', 7)

Result:

┌─position('Hello, world!', 'o', 1)─┬─position('Hello, world!', 'o', 7)─┐
│                                 5 │                                 9 │
└───────────────────────────────────┴───────────────────────────────────┘

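The same phrase in Russian contains characters that can’t be represented using a single byte, so the function returns a byte offset that does not match the character position:

Query: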

SELECT position('Привет, мир!', '!');

Result:

┌─position('Привет, мир!', '!')─┐
│                            21 │
└───────────────────────────────┘
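The results above follow from counting bytes rather than characters. The following Python sketch models both ways of counting on UTF-8 text; position and position_utf8 here are hypothetical Python helpers that mirror the documented behaviour, not ClickHouse code.

```python
def position(haystack: str, needle: str, start_pos: int = 1) -> int:
    """1-based byte position of needle in the UTF-8 bytes of haystack, 0 if absent."""
    data = haystack.encode("utf-8")
    return data.find(needle.encode("utf-8"), start_pos - 1) + 1


def position_utf8(haystack: str, needle: str, start_pos: int = 1) -> int:
    """1-based position counted in Unicode code points, 0 if absent."""
    return haystack.find(needle, start_pos - 1) + 1


print(position("Hello, world!", "!"))      # 13
print(position("Hello, world!", "o", 7))   # 9
print(position("Привет, мир!", "!"))       # 21: each Cyrillic letter takes 2 bytes
print(position_utf8("Привет, мир!", "!"))  # 12
```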

Examples for POSITION(needle IN haystack) syntax

Query:

SELECT 3 = position('c' IN 'abc');

Result:

┌─equals(3, position('abc', 'c'))─┐
│                               1 │
└─────────────────────────────────┘

Query:

SELECT 6 = position('/' IN s) FROM (SELECT 'Hello/World' AS s);

Result:

┌─equals(6, position(s, '/'))─┐
│                           1 │
└─────────────────────────────┘

Works under the assumption that the string contains a set of bytes representing a single-byte encoded text. If this assumption is not met and a character can’t be represented using a single byte, the function does not throw an exception and returns some unexpected result. If character can be represented using two bytes, it will use two bytes and so on.

Syntax

positionCaseInsensitive(haystack, needle[, start_pos])

Arguments

Returned values

  • Starting position in bytes (counting from 1), if substring was found.

  • 0, if the substring was not found.

Type: Integer.

Example

Query:

SELECT positionCaseInsensitive('Hello, world!', 'hello');

Result:

┌─positionCaseInsensitive('Hello, world!', 'hello')─┐
│                                                 1 │
└───────────────────────────────────────────────────┘

Returns the position (in Unicode points) of the found substring in the string, starting from 1.

Works under the assumption that the string contains a set of bytes representing a UTF-8 encoded text. If this assumption is not met, the function does not throw an exception and returns some unexpected result. If character can be represented using two Unicode points, it will use two and so on.

Syntax

positionUTF8(haystack, needle[, start_pos])

Arguments

Returned values

  • Starting position in Unicode points (counting from 1), if substring was found.

  • 0, if the substring was not found.

Type: Integer.

Examples

The phrase “Hello, world!” in Russian contains a set of Unicode points representing a single-point encoded text. The function returns some expected result:

Query:

SELECT positionUTF8('Привет, мир!', '!');

Result:

┌─positionUTF8('Привет, мир!', '!')─┐
│                                12 │
└───────────────────────────────────┘

In the phrase “Salut, étudiante!”, the character é can be represented using one code point (U+00E9) or two code points (U+0065 U+0301), so the function may return an unexpected result:

Query for the letter é represented as one Unicode point, U+00E9:

SELECT positionUTF8('Salut, étudiante!', '!');

Result:

┌─positionUTF8('Salut, étudiante!', '!')─┐
│                                     17 │
└────────────────────────────────────────┘

Query for the letter é represented as two Unicode points, U+0065 U+0301:

SELECT positionUTF8('Salut, étudiante!', '!');

Result:

┌─positionUTF8('Salut, étudiante!', '!')─┐
│                                     18 │
└────────────────────────────────────────┘
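The two visually identical queries above differ only in how é is encoded. A short Python sketch makes the difference explicit by normalizing the phrase to the composed (NFC) and decomposed (NFD) forms and counting code points up to the exclamation mark.

```python
import unicodedata

phrase = "Salut, étudiante!"
composed = unicodedata.normalize("NFC", phrase)    # é as U+00E9
decomposed = unicodedata.normalize("NFD", phrase)  # é as U+0065 U+0301

# 1-based position of "!" counted in code points
print(composed.find("!") + 1)    # 17
print(decomposed.find("!") + 1)  # 18
```

Normalizing both haystack and needle to the same form before searching is one way to avoid this kind of surprise.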

Works under the assumption that the string contains a set of bytes representing a UTF-8 encoded text. If this assumption is not met, the function does not throw an exception and returns some unexpected result. If character can be represented using two Unicode points, it will use two and so on.

Syntax

positionCaseInsensitiveUTF8(haystack, needle[, start_pos])

Arguments

Returned value

  • Starting position in Unicode points (counting from 1), if substring was found.

  • 0, if the substring was not found.

Type: Integer.

Example

Query:

SELECT positionCaseInsensitiveUTF8('Привет, мир!', 'Мир');

Result:

┌─positionCaseInsensitiveUTF8('Привет, мир!', 'Мир')─┐
│                                                  9 │
└────────────────────────────────────────────────────┘

The search is performed on sequences of bytes without respect to string encoding and collation.

  • For case-insensitive ASCII search, use the function multiSearchAllPositionsCaseInsensitive.

  • For case-insensitive UTF-8 search, use the function multiSearchAllPositionsCaseInsensitiveUTF8.

Syntax

multiSearchAllPositions(haystack, [needle1, needle2, ..., needlen])

Arguments

Returned values

  • Array of starting positions in bytes (counting from 1), if the corresponding substring was found and 0 if not found.

Example

Query:

SELECT multiSearchAllPositions('Hello, World!', ['hello', '!', 'world']);

Result:

┌─multiSearchAllPositions('Hello, World!', ['hello', '!', 'world'])─┐
│ [0,13,0]                                                          │
└───────────────────────────────────────────────────────────────────┘
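The result above can be modelled as one byte-wise, case-sensitive search per needle. The Python helper below is a hypothetical sketch of that behaviour, not the ClickHouse implementation.

```python
def multi_search_all_positions(haystack, needles):
    """1-based byte position of each needle in haystack, 0 where not found."""
    data = haystack.encode("utf-8")
    return [data.find(needle.encode("utf-8")) + 1 for needle in needles]


# 'hello' and 'world' are not found because the search is case-sensitive.
print(multi_search_all_positions("Hello, World!", ["hello", "!", "world"]))
```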

See multiSearchAllPositions.

The same as position but returns the leftmost offset of the string haystack that is matched to some of the needles.

For a case-insensitive search and/or a search in UTF-8 format, use the functions multiSearchFirstPositionCaseInsensitive, multiSearchFirstPositionUTF8, multiSearchFirstPositionCaseInsensitiveUTF8.

Returns the index i (starting from 1) of the leftmost found needle_i in the string haystack and 0 otherwise.

For a case-insensitive search and/or a search in UTF-8 format, use the functions multiSearchFirstIndexCaseInsensitive, multiSearchFirstIndexUTF8, multiSearchFirstIndexCaseInsensitiveUTF8.

Returns 1, if at least one string needle_i matches the string haystack and 0 otherwise.

For a case-insensitive search and/or a search in UTF-8 format, use the functions multiSearchAnyCaseInsensitive, multiSearchAnyUTF8, multiSearchAnyCaseInsensitiveUTF8.

NOTE

In all multiSearch* functions, the number of needles must be less than 2^8 because of the implementation.

Returns 0 if it does not match, or 1 if it matches.

For patterns to search for substrings in a string, it is better to use LIKE or ‘position’, since they work much faster.

NOTE

The length of any haystack string must be less than 2^32 bytes; otherwise an exception is thrown. This restriction comes from the hyperscan API.

The same as multiMatchAny, but returns any index that matches the haystack.

The same as multiMatchAny, but returns the array of all indices that match the haystack in any order.

The same as multiFuzzyMatchAny, but returns any index that matches the haystack within a constant edit distance.

The same as multiFuzzyMatchAny, but returns the array of all indices in any order that match the haystack within a constant edit distance.

NOTE

multiFuzzyMatch* functions do not support UTF-8 regular expressions, and such expressions are treated as bytes because of hyperscan restriction.

NOTE

To turn off all functions that use hyperscan, use setting SET allow_hyperscan = 0;.

Extracts a fragment of a string using a regular expression. If ‘haystack’ does not match the ‘pattern’ regex, an empty string is returned. If the regex does not contain subpatterns, it takes the fragment that matches the entire regex. Otherwise, it takes the fragment that matches the first subpattern.

Extracts all the fragments of a string using a regular expression. If ‘haystack’ does not match the ‘pattern’ regex, an empty string is returned. Returns an array of strings consisting of all matches to the regex. In general, the behavior is the same as the ‘extract’ function (it takes the first subpattern, or the entire expression if there isn’t a subpattern).

Matches all groups of the haystack string using the pattern regular expression. Returns an array of arrays, where the first array includes all fragments matching the first group, the second array - matching the second group, etc.

Syntax

extractAllGroupsHorizontal(haystack, pattern)

Arguments

Returned value

If haystack does not match the pattern regex, an array of empty arrays is returned.

Example

Query:

SELECT extractAllGroupsHorizontal('abc=111, def=222, ghi=333', '("[^"]+"|\\w+)=("[^"]+"|\\w+)');

Result:

┌─extractAllGroupsHorizontal('abc=111, def=222, ghi=333', '("[^"]+"|\\w+)=("[^"]+"|\\w+)')─┐
│ [['abc','def','ghi'],['111','222','333']]                                                │
└──────────────────────────────────────────────────────────────────────────────────────────┘

Matches all groups of the haystack string using the pattern regular expression. Returns an array of arrays, where each array includes matching fragments from every group. Fragments are grouped in order of appearance in the haystack.

Syntax

extractAllGroupsVertical(haystack, pattern)

Arguments

Returned value

If haystack does not match the pattern regex, an empty array is returned.

Example

Query:

SELECT extractAllGroupsVertical('abc=111, def=222, ghi=333', '("[^"]+"|\\w+)=("[^"]+"|\\w+)');

Result:

┌─extractAllGroupsVertical('abc=111, def=222, ghi=333', '("[^"]+"|\\w+)=("[^"]+"|\\w+)')─┐
│ [['abc','111'],['def','222'],['ghi','333']]                                            │
└────────────────────────────────────────────────────────────────────────────────────────┘
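The difference between the horizontal and vertical layouts can be sketched in Python. The helpers below are hypothetical models built on Python's re module, which is not the regex engine ClickHouse uses; they only illustrate how the same matches are arranged.

```python
import re


def extract_all_groups_vertical(haystack, pattern):
    """One inner list per match, holding that match's groups in order."""
    return [list(m.groups()) for m in re.finditer(pattern, haystack)]


def extract_all_groups_horizontal(haystack, pattern):
    """One inner list per group, holding that group's fragments from every match."""
    group_count = re.compile(pattern).groups
    matches = extract_all_groups_vertical(haystack, pattern)
    return [[match[i] for match in matches] for i in range(group_count)]


text = "abc=111, def=222, ghi=333"
pattern = r'("[^"]+"|\w+)=("[^"]+"|\w+)'
print(extract_all_groups_vertical(text, pattern))
print(extract_all_groups_horizontal(text, pattern))
```

With no match, the vertical form yields an empty list and the horizontal form yields one empty list per group, in line with the returned values described for the two functions.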

Checks whether a string matches a simple regular expression. The regular expression can contain the metasymbols % and _.

% indicates any quantity of any bytes (including zero characters).

_ indicates any one byte.

Use the backslash (\) for escaping metasymbols. See the note on escaping in the description of the ‘match’ function.

For regular expressions like %needle%, the code is more optimal and works as fast as the position function. For other regular expressions, the code is the same as for the ‘match’ function.

The same thing as ‘like’, but negative.

The function ignores the language, e.g. for Turkish (i/İ), the result might be incorrect.

Syntax

ilike(haystack, pattern)

Arguments

  • pattern — If pattern does not contain percent signs or underscores, then the pattern only represents the string itself. An underscore (_) in pattern stands for (matches) any single character. A percent sign (%) matches any sequence of zero or more characters.

Some pattern examples:

'abc' ILIKE 'abc'    true
'abc' ILIKE 'a%'     true
'abc' ILIKE '_b_'    true
'abc' ILIKE 'c'      false

Returned values

  • True, if the string matches pattern.

  • False, if the string does not match pattern.

Example

Input table:

┌─id─┬─name─────┬─days─┐
│  1 │ January  │   31 │
│  2 │ February │   29 │
│  3 │ March    │   31 │
│  4 │ April    │   30 │
└────┴──────────┴──────┘

Query:

SELECT * FROM Months WHERE ilike(name, '%j%');

Result:

┌─id─┬─name────┬─days─┐
│  1 │ January │   31 │
└────┴─────────┴──────┘
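The pattern rules above (% for any sequence, _ for any single character, everything else literal) can be modelled by translating the pattern to a regular expression. The Python sketch below is hypothetical and simplified: it ignores backslash escaping and uses Python's own case folding rather than ClickHouse's.

```python
import re


def like_to_regex(pattern):
    """Translate a LIKE pattern to a regex (escaping with \\ is not modelled)."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # any sequence, including empty
        elif ch == "_":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


def ilike(haystack, pattern):
    """Case-insensitive LIKE: the whole string must match the pattern."""
    flags = re.IGNORECASE | re.DOTALL
    return re.fullmatch(like_to_regex(pattern), haystack, flags) is not None


months = ["January", "February", "March", "April"]
print([name for name in months if ilike(name, "%j%")])
```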

Calculates the 4-gram distance between haystack and needle: counts the symmetric difference between two multisets of 4-grams and normalizes it by the sum of their cardinalities. Returns a float number from 0 to 1; the closer to zero, the more similar the strings are to each other. If the constant needle or haystack is more than 32Kb, an exception is thrown. If any of the non-constant haystack or needle strings is more than 32Kb, the distance is always one.

For a case-insensitive search and/or a search in UTF-8 format, use the functions ngramDistanceCaseInsensitive, ngramDistanceUTF8, ngramDistanceCaseInsensitiveUTF8.
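The distance formula described above can be written out directly. The Python sketch below computes an exact multiset distance over byte 4-grams; it is a hypothetical model, and ClickHouse's actual values may differ because it hashes n-grams. Returning 0.0 when neither string has any 4-gram is a choice made for this sketch only.

```python
from collections import Counter


def ngram_distance(haystack: str, needle: str, n: int = 4) -> float:
    """Symmetric difference of n-gram multisets, normalized by their total size."""
    def ngrams(text):
        data = text.encode("utf-8")
        return Counter(data[i:i + n] for i in range(len(data) - n + 1))

    a, b = ngrams(haystack), ngrams(needle)
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 0.0  # sketch-only choice: both strings are shorter than n bytes
    # Counter subtraction keeps only positive counts, so this is the
    # multiset symmetric difference.
    difference = sum(((a - b) + (b - a)).values())
    return difference / total


print(ngram_distance("ClickHouse", "ClickHouse"))  # identical strings: 0.0
print(ngram_distance("abcdefgh", "ijklmnop"))      # no shared 4-grams: 1.0
```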

Same as ngramDistance but calculates the non-symmetric difference between needle and haystack – the number of n-grams from needle minus the common number of n-grams normalized by the number of needle n-grams. The closer to one, the more likely needle is in the haystack. Can be useful for fuzzy string search.

For case-insensitive search or/and in UTF-8 format use functions ngramSearchCaseInsensitive, ngramSearchUTF8, ngramSearchCaseInsensitiveUTF8.

NOTE

For the UTF-8 case, a 3-gram distance is used. None of these are perfectly fair n-gram distances: n-grams are hashed with 2-byte hashes and the (non-)symmetric difference is calculated between these hash tables, so collisions may occur. The UTF-8 case-insensitive variants do not use a fair tolower function: the 5th bit (counting from zero) of each code point byte is zeroed, as is the first bit of the zeroth byte if the code point has more than one byte. This works for Latin and for most Cyrillic letters.
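The query below is a small sketch of how the two families of functions might be compared side by side. It assumes the functions are named ngramDistance and ngramSearch and take (haystack, needle), as suggested by the cross-references above. No result values are shown because they depend on the hashing described in the note.

```sql
-- Assumption: ngramDistance(haystack, needle) and ngramSearch(haystack, needle).
-- Lower ngramDistance means more similar; higher ngramSearch means a likelier hit.
SELECT
    ngramDistance('ClickHouse', 'ClickHouse') AS same_string,
    ngramDistance('ClickHouse', 'ClickHose')  AS one_typo,
    ngramDistance('ClickHouse', 'PostgreSQL') AS unrelated,
    ngramSearch('ClickHouse is a column-oriented DBMS', 'ClickHose') AS fuzzy_hit;
```

A typical use would be to filter or sort rows by one of these scores against a threshold chosen for the data at hand.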

Returns the number of substring occurrences.

Syntax

countSubstrings(haystack, needle[, start_pos])

Arguments

Returned values

  • Number of occurrences.

Examples

Query:

SELECT countSubstrings('foobar.com', '.');

Result:

┌─countSubstrings('foobar.com', '.')─┐
│                                  1 │
└────────────────────────────────────┘

Query:

SELECT countSubstrings('aaaa', 'aa');

Result:

┌─countSubstrings('aaaa', 'aa')─┐
│                             2 │
└───────────────────────────────┘

Query:

SELECT countSubstrings('abc___abc', 'abc', 4);

Result:

┌─countSubstrings('abc___abc', 'abc', 4)─┐
│                                      1 │
└────────────────────────────────────────┘

Returns the number of substring occurrences, ignoring case.

Syntax

countSubstringsCaseInsensitive(haystack, needle[, start_pos])

Arguments

Returned values

  • Number of occurrences.

Examples

Query:

SELECT countSubstringsCaseInsensitive('aba', 'B');

Result:

┌─countSubstringsCaseInsensitive('aba', 'B')─┐
│                                          1 │
└────────────────────────────────────────────┘

Query:

SELECT countSubstringsCaseInsensitive('foobar.com', 'CoM');

Result:

┌─countSubstringsCaseInsensitive('foobar.com', 'CoM')─┐
│                                                   1 │
└─────────────────────────────────────────────────────┘

Query:

SELECT countSubstringsCaseInsensitive('abC___abC', 'aBc', 2);

Result:

┌─countSubstringsCaseInsensitive('abC___abC', 'aBc', 2)─┐
│                                                     1 │
└───────────────────────────────────────────────────────┘

Returns the number of substring occurrences in a UTF-8 string, ignoring case.

Syntax

countSubstringsCaseInsensitiveUTF8(haystack, needle[, start_pos])

Arguments

Returned values

  • Number of occurrences.

Examples

Query:

SELECT countSubstringsCaseInsensitiveUTF8('абв', 'A');

Result:

┌─countSubstringsCaseInsensitiveUTF8('абв', 'A')─┐
│                                              1 │
└────────────────────────────────────────────────┘

Query:

SELECT countSubstringsCaseInsensitiveUTF8('аБв__АбВ__абв', 'Абв');

Result:

┌─countSubstringsCaseInsensitiveUTF8('аБв__АбВ__абв', 'Абв')─┐
│                                                          3 │
└────────────────────────────────────────────────────────────┘

Returns the number of regular expression matches for a pattern in a haystack.

Syntax

countMatches(haystack, pattern)

Arguments

Returned value

  • The number of matches.

Examples

Query:

SELECT countMatches('foobar.com', 'o+');

Result:

┌─countMatches('foobar.com', 'o+')─┐
│                                2 │
└──────────────────────────────────┘

Query:

SELECT countMatches('aaaa', 'aa');

Result:

┌─countMatches('aaaa', 'aa')────┐
│                             2 │
└───────────────────────────────┘

SPLITTING AND MERGING

Splits a string into substrings separated by a specified character. It uses a constant string separator which consists of exactly one character. Returns an array of selected substrings. Empty substrings may be selected if the separator occurs at the beginning or end of the string, or if there are multiple consecutive separators.

Syntax

splitByChar(separator, s[, max_substrings])

Arguments

  • max_substrings — An optional Int64 defaulting to 0. When max_substrings > 0, the returned substrings will be no more than max_substrings, otherwise the function will return as many substrings as possible.

Returned value(s)

Returns an array of selected substrings. Empty substrings may be selected when:

  • A separator occurs at the beginning or end of the string;

  • There are multiple consecutive separators;

  • The original string s is empty.

Example

SELECT splitByChar(',', '1,2,3,abcde');
┌─splitByChar(',', '1,2,3,abcde')─┐
│ ['1','2','3','abcde']           │
└─────────────────────────────────┘
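To make the empty-substring cases listed above concrete, the following sketch puts a separator at the start, at the end, and twice in a row:

```sql
-- Leading, trailing and consecutive separators each produce an empty substring.
SELECT splitByChar(',', ',a,,b,') AS parts;
-- ['','a','','b','']
```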

Splits a string into substrings separated by a string. It uses a constant string of multiple characters as the separator. If the separator is empty, it splits the string s into an array of single characters.

Syntax

splitByString(separator, s[, max_substrings])

Arguments

  • max_substrings — An optional Int64 defaulting to 0. When max_substrings > 0, the returned substrings will be no more than max_substrings, otherwise the function will return as many substrings as possible.

Returned value(s)

Returns an array of selected substrings. Empty substrings may be selected when:

  • A non-empty separator occurs at the beginning or end of the string;

  • There are multiple consecutive non-empty separators;

  • The original string s is empty while the separator is not empty.

Example

SELECT splitByString(', ', '1, 2 3, 4,5, abcde');
┌─splitByString(', ', '1, 2 3, 4,5, abcde')─┐
│ ['1','2 3','4,5','abcde']                 │
└───────────────────────────────────────────┘
SELECT splitByString('', 'abcde');
┌─splitByString('', 'abcde')─┐
│ ['a','b','c','d','e']      │
└────────────────────────────┘

Concatenates string representations of values listed in the array with the separator. separator is an optional parameter: a constant string, set to an empty string by default. Returns the string.
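A minimal sketch of the behaviour described above, assuming the function is named arrayStringConcat and takes the array first and the optional separator second:

```sql
-- Assumption: arrayStringConcat(arr[, separator]).
SELECT
    arrayStringConcat(['a', 'b', 'c'], '-') AS with_separator,     -- a-b-c
    arrayStringConcat(['a', 'b', 'c'])      AS without_separator;  -- abc
```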

Selects substrings of consecutive bytes from the ranges a-z and A-Z. Returns an array of substrings.

Syntax

alphaTokens(s[, max_substrings])
splitByAlpha(s[, max_substrings])

Arguments

  • max_substrings — An optional Int64 defaulting to 0. When max_substrings > 0, the returned substrings will be no more than max_substrings, otherwise the function will return as many substrings as possible.

Returned value(s)

Returns an array of selected substrings.

Example

SELECT alphaTokens('abca1abc');
┌─alphaTokens('abca1abc')─┐
│ ['abca','abc']          │
└─────────────────────────┘

Extracts all groups from non-overlapping substrings matched by a regular expression.

Syntax

extractAllGroups(text, regexp)

Arguments

Returned values

  • If the function finds at least one matching group, it returns an Array(Array(String)) column, clustered by group_id (1 to N, where N is the number of capturing groups in regexp).

  • If there is no matching group, returns an empty array.

Example

Query:

SELECT extractAllGroups('abc=123, 8="hkl"', '("[^"]+"|\\w+)=("[^"]+"|\\w+)');

Result:

┌─extractAllGroups('abc=123, 8="hkl"', '("[^"]+"|\\w+)=("[^"]+"|\\w+)')─┐
│ [['abc','123'],['8','"hkl"']]                                         │
└───────────────────────────────────────────────────────────────────────┘

STRINGS

Checks whether the input string is empty.

Syntax

empty(x)

A string is considered non-empty if it contains at least one byte, even if this is a space or a null byte.

Arguments

Returned value

  • Returns 1 for an empty string or 0 for a non-empty string.

Example

Query:

SELECT empty('');

Result:

┌─empty('')─┐
│         1 │
└───────────┘

Checks whether the input string is non-empty.

Syntax

notEmpty(x)

A string is considered non-empty if it contains at least one byte, even if this is a space or a null byte.

Arguments

Returned value

  • Returns 1 for a non-empty string or 0 for an empty string.

Example

Query:

SELECT notEmpty('text');

Result:

┌─notEmpty('text')─┐
│                1 │
└──────────────────┘

Returns the length of a string in bytes (not in characters, and not in code points). The result type is UInt64. The function also works for arrays.

Returns the length of a string in Unicode code points (not in characters), assuming that the string contains a set of bytes that make up UTF-8 encoded text. If this assumption is not met, it returns some result (it does not throw an exception). The result type is UInt64.
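The difference between counting bytes and counting code points is easiest to see on non-ASCII text. The sketch below assumes the two functions are named length and lengthUTF8:

```sql
-- Assumption: length() counts bytes, lengthUTF8() counts Unicode code points.
SELECT
    length('абв')     AS bytes,        -- 6: each of these Cyrillic letters takes 2 bytes in UTF-8
    lengthUTF8('абв') AS code_points;  -- 3
```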

Pads the current string from the left with spaces or a specified string (multiple times, if needed) until the resulting string reaches the given length. Similar to the MySQL LPAD function.

Syntax

leftPad(string, length[, pad_string])

Arguments

Returned value

  • The resulting string of the given length.

Example

Query:

SELECT leftPad('abc', 7, '*'), leftPad('def', 7);

Result:

┌─leftPad('abc', 7, '*')─┬─leftPad('def', 7)─┐
│ ****abc                │     def           │
└────────────────────────┴───────────────────┘

Same as leftPad, but the length is measured in Unicode code points rather than bytes.

Syntax

leftPadUTF8(string, length[, pad_string])

Arguments

Returned value

  • The resulting string of the given length.

Example

Query:

SELECT leftPadUTF8('абвг', 7, '*'), leftPadUTF8('дежз', 7);

Result:

┌─leftPadUTF8('абвг', 7, '*')─┬─leftPadUTF8('дежз', 7)─┐
│ ***абвг                     │    дежз                │
└─────────────────────────────┴────────────────────────┘

Pads the current string from the right with spaces or a specified string (multiple times, if needed) until the resulting string reaches the given length. Similar to the MySQL RPAD function.

Syntax

rightPad(string, length[, pad_string])

Arguments

Returned value

  • The resulting string of the given length.

Example

Query:

SELECT rightPad('abc', 7, '*'), rightPad('abc', 7);

Result:

┌─rightPad('abc', 7, '*')─┬─rightPad('abc', 7)─┐
│ abc****                 │ abc                │
└─────────────────────────┴────────────────────┘

Same as rightPad, but the length is measured in Unicode code points rather than bytes.

Syntax

rightPadUTF8(string, length[, pad_string])

Arguments

Returned value

  • The resulting string of the given length.

Example

Query:

SELECT rightPadUTF8('абвг', 7, '*'), rightPadUTF8('абвг', 7);

Result:

┌─rightPadUTF8('абвг', 7, '*')─┬─rightPadUTF8('абвг', 7)─┐
│ абвг***                      │ абвг                    │
└──────────────────────────────┴─────────────────────────┘

Converts ASCII Latin symbols in a string to lowercase.

Converts ASCII Latin symbols in a string to uppercase.

Converts a string to lowercase, assuming the string contains a set of bytes that make up a UTF-8 encoded text. It does not detect the language. E.g. for Turkish the result might not be exactly correct (i/İ vs. i/I). If the length of the UTF-8 byte sequence is different for upper and lower case of a code point, the result may be incorrect for this code point. If the string contains a sequence of bytes that are not valid UTF-8, then the behavior is undefined.

Converts a string to uppercase, assuming the string contains a set of bytes that make up a UTF-8 encoded text. It does not detect the language. E.g. for Turkish the result might not be exactly correct (i/İ vs. i/I). If the length of the UTF-8 byte sequence is different for upper and lower case of a code point, the result may be incorrect for this code point. If the string contains a sequence of bytes that are not valid UTF-8, then the behavior is undefined.

Returns 1, if the set of bytes is valid UTF-8 encoded, otherwise 0.
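The case-conversion and validation functions described above can be tried together. This sketch assumes they are named lower, upper, lowerUTF8, upperUTF8 and isValidUTF8, since the headings are not shown here:

```sql
-- Assumption: function names as listed in the lead-in.
SELECT
    lower('ClickHouse')   AS ascii_lower,  -- clickhouse
    upper('ClickHouse')   AS ascii_upper,  -- CLICKHOUSE
    lowerUTF8('МОСКВА')   AS utf8_lower,   -- москва
    upperUTF8('москва')   AS utf8_upper,   -- МОСКВА
    isValidUTF8('москва') AS is_valid;     -- 1
```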

Replaces invalid UTF-8 characters with the � (U+FFFD) character. Consecutive invalid characters are collapsed into a single replacement character.

toValidUTF8(input_string)

Arguments

Returned value: Valid UTF-8 string.

Example

SELECT toValidUTF8('\x61\xF0\x80\x80\x80b');
┌─toValidUTF8('a����b')─┐
│ a�b                   │
└───────────────────────┘

Repeats a string as many times as specified and concatenates the replicated values as a single string.

Alias: REPEAT.

Syntax

repeat(s, n)

Arguments

Returned value

A single string containing the string s repeated n times. If n < 1, the function returns an empty string.

Type: String.

Example

Query:

SELECT repeat('abc', 10);

Result:

┌─repeat('abc', 10)──────────────┐
│ abcabcabcabcabcabcabcabcabcabc │
└────────────────────────────────┘

Reverses the string (as a sequence of bytes).

Reverses a sequence of Unicode code points, assuming that the string contains a set of bytes representing a UTF-8 text. If this assumption is not met, it returns some result (it does not throw an exception).
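A short sketch contrasting the two reversal modes, assuming the functions are named reverse and reverseUTF8:

```sql
-- Assumption: reverse() works on bytes, reverseUTF8() on code points.
SELECT
    reverse('abc')     AS bytes_reversed,        -- cba
    reverseUTF8('абв') AS code_points_reversed;  -- вба
```

Applying the byte-wise variant to multi-byte text would split the UTF-8 sequences, which is why the code point variant is used for the Cyrillic input.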

Formats a constant pattern with the strings listed in the arguments. pattern is a simplified Python format pattern. The format string contains “replacement fields” surrounded by curly braces {}. Anything that is not contained in braces is considered literal text, which is copied unchanged to the output. If you need to include a brace character in the literal text, it can be escaped by doubling: {{ and }}. Field names can be numbers (starting from zero) or empty (then they are treated as consecutive numbers).

SELECT format('{1} {0} {1}', 'World', 'Hello')
┌─format('{1} {0} {1}', 'World', 'Hello')─┐
│ Hello World Hello                       │
└─────────────────────────────────────────┘
SELECT format('{} {}', 'Hello', 'World')
┌─format('{} {}', 'Hello', 'World')─┐
│ Hello World                       │
└───────────────────────────────────┘

Concatenates the strings listed in the arguments, without a separator.

Syntax

concat(s1, s2, ...)

Arguments

Values of type String or FixedString.

Returned values

Returns the String that results from concatenating the arguments.

If any of the argument values is NULL, concat returns NULL.

Example

Query:

SELECT concat('Hello, ', 'World!');

Result:

┌─concat('Hello, ', 'World!')─┐
│ Hello, World!               │
└─────────────────────────────┘

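Same as concat, but assumes that concat(s1, s2, ...) is injective, which can be used to optimize GROUP BY. A function is called injective if it always returns different results for different argument values; in other words, different arguments never yield an identical result.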

Syntax

concatAssumeInjective(s1, s2, ...)

Arguments

Values of type String or FixedString.

Returned values

Returns the String that results from concatenating the arguments.

If any of the argument values is NULL, concatAssumeInjective returns NULL.

Example

Input table:

CREATE TABLE key_val(`key1` String, `key2` String, `value` UInt32) ENGINE = TinyLog;
INSERT INTO key_val VALUES ('Hello, ','World',1), ('Hello, ','World',2), ('Hello, ','World!',3), ('Hello',', World!',2);
SELECT * from key_val;
┌─key1────┬─key2─────┬─value─┐
│ Hello,  │ World    │     1 │
│ Hello,  │ World    │     2 │
│ Hello,  │ World!   │     3 │
│ Hello   │ , World! │     2 │
└─────────┴──────────┴───────┘

Query:

SELECT concat(key1, key2), sum(value) FROM key_val GROUP BY concatAssumeInjective(key1, key2);

Result:

┌─concat(key1, key2)─┬─sum(value)─┐
│ Hello, World!      │          3 │
│ Hello, World!      │          2 │
│ Hello, World       │          3 │
└────────────────────┴────────────┘

Returns a substring that starts at the byte with index ‘offset’ and is ‘length’ bytes long. Indexing starts from one (as in standard SQL).

The same as ‘substring’, but for Unicode code points. Works under the assumption that the string contains a set of bytes representing a UTF-8 encoded text. If this assumption is not met, it returns some result (it does not throw an exception).

If the ‘s’ string is non-empty and does not contain the ‘c’ character at the end, it appends the ‘c’ character to the end.
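The three descriptions above can be illustrated with one query. It assumes the functions are named substring, substringUTF8 and appendTrailingCharIfAbsent, with the argument order (s, offset, length) and (s, c):

```sql
-- Assumption: function names and argument order as stated in the lead-in.
SELECT
    substring('Hello, world!', 8, 5)         AS sub_bytes,   -- world
    substringUTF8('Привет, мир', 1, 6)       AS sub_utf8,    -- Привет
    appendTrailingCharIfAbsent('path', '/')  AS with_slash,  -- path/
    appendTrailingCharIfAbsent('path/', '/') AS unchanged;   -- path/
```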

Returns the string ‘s’ that was converted from the encoding in ‘from’ to the encoding in ‘to’.

Encodes a string using the Base58 encoding scheme.

Syntax

base58Encode(plaintext)

Arguments

Returned value

  • A string containing encoded value of 1st argument.

Example

Query:

SELECT base58Encode('Encoded');

Result:

┌─base58Encode('Encoded')─┐
│ 3dc8KtHrwM              │
└─────────────────────────┘

Encodes the FixedString or String ‘s’ into base64.

Alias: TO_BASE64.

Decodes the base64-encoded FixedString or String ‘s’ into the original string. Raises an exception on failure.

Alias: FROM_BASE64.

Similar to base64Decode, but returns an empty string in case of error.
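A round trip through the base64 functions might look like the sketch below. It assumes the function names base64Encode, base64Decode and tryBase64Decode; only base64Decode is named explicitly in the text above.

```sql
-- Assumption: base64Encode(), base64Decode() and tryBase64Decode().
SELECT
    base64Encode('clickhouse')           AS encoded,   -- Y2xpY2tob3VzZQ==
    base64Decode('Y2xpY2tob3VzZQ==')     AS decoded,   -- clickhouse
    tryBase64Decode('not valid base64!') AS on_error;  -- empty string instead of an exception
```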

Checks whether a string ends with the specified suffix. Returns 1 if it does, otherwise 0.
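For example, assuming the function is named endsWith and takes (s, suffix), by analogy with the startsWith function used further down:

```sql
-- Assumption: endsWith(s, suffix).
SELECT
    endsWith('Hello, world!', 'world!') AS has_suffix,  -- 1
    endsWith('Hello, world!', 'Hello')  AS no_suffix;   -- 0
```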

Returns 1 if the string starts with the specified prefix, otherwise 0.

SELECT startsWith('Spider-Man', 'Spi');

Returned values

  • 1, if the string starts with the specified prefix.

  • 0, if the string does not start with the specified prefix.

Example

Query:

SELECT startsWith('Hello, world!', 'He');

Result:

┌─startsWith('Hello, world!', 'He')─┐
│                                 1 │
└───────────────────────────────────┘

Removes all specified characters from the start or end of a string. By default removes all consecutive occurrences of common whitespace (ASCII character 32) from both ends of a string.

Syntax

trim([[LEADING|TRAILING|BOTH] trim_character FROM] input_string)

Arguments

Returned value

A string without leading and (or) trailing specified characters.

Type: String.

Example

Query:

SELECT trim(BOTH ' ()' FROM '(   Hello, world!   )');

Result:

┌─trim(BOTH ' ()' FROM '(   Hello, world!   )')─┐
│ Hello, world!                                 │
└───────────────────────────────────────────────┘

Removes all consecutive occurrences of common whitespace (ASCII character 32) from the beginning of a string. It does not remove other kinds of whitespace characters (tab, no-break space, etc.).

Syntax

trimLeft(input_string)

Alias: ltrim(input_string).

Arguments

Returned value

A string without leading common whitespaces.

Type: String.

Example

Query:

SELECT trimLeft('     Hello, world!     ');

Result:

┌─trimLeft('     Hello, world!     ')─┐
│ Hello, world!                       │
└─────────────────────────────────────┘

Removes all consecutive occurrences of common whitespace (ASCII character 32) from the end of a string. It does not remove other kinds of whitespace characters (tab, no-break space, etc.).

Syntax

trimRight(input_string)

Alias: rtrim(input_string).

Arguments

Returned value

A string without trailing common whitespaces.

Type: String.

Example

Query:

SELECT trimRight('     Hello, world!     ');

Result:

┌─trimRight('     Hello, world!     ')─┐
│      Hello, world!                   │
└──────────────────────────────────────┘

Removes all consecutive occurrences of common whitespace (ASCII character 32) from both ends of a string. It does not remove other kinds of whitespace characters (tab, no-break space, etc.).

Syntax

trimBoth(input_string)

Alias: trim(input_string).

Arguments

Returned value

A string without leading and trailing common whitespaces.

Type: String.

Example

Query:

SELECT trimBoth('     Hello, world!     ');

Result:

┌─trimBoth('     Hello, world!     ')─┐
│ Hello, world!                       │
└─────────────────────────────────────┘

Returns the CRC32 checksum of a string, using CRC-32-IEEE 802.3 polynomial and initial value 0xffffffff (zlib implementation).

The result type is UInt32.

Returns the CRC32 checksum of a string, using CRC-32-IEEE 802.3 polynomial.

The result type is UInt32.

Returns the CRC64 checksum of a string, using CRC-64-ECMA polynomial.

The result type is UInt64.
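The three checksum variants above can be compared on the same input. This sketch assumes they are exposed as CRC32, CRC32IEEE and CRC64. The checksum values themselves are not shown; the toTypeName columns are there only to confirm the result types stated above.

```sql
-- Assumption: the functions are named CRC32(), CRC32IEEE() and CRC64().
SELECT
    CRC32('ClickHouse')     AS crc32_zlib,
    CRC32IEEE('ClickHouse') AS crc32_ieee,
    CRC64('ClickHouse')     AS crc64_ecma,
    toTypeName(crc32_zlib)  AS crc32_type,  -- UInt32
    toTypeName(crc64_ecma)  AS crc64_type;  -- UInt64
```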

Replaces literals, sequences of literals and complex aliases with placeholders.

Syntax

normalizeQuery(x)

Arguments

Returned value

  • Sequence of characters with placeholders.

Example

Query:

SELECT normalizeQuery('[1, 2, 3, x]') AS query;

Result:

┌─query────┐
│ [?.., x] │
└──────────┘

Returns identical 64-bit hash values for similar queries, ignoring the values of literals. This helps when analyzing the query log.

Syntax

normalizedQueryHash(x)

Arguments

Returned value

  • Hash value.

Example

Query:

SELECT normalizedQueryHash('SELECT 1 AS `xyz`') != normalizedQueryHash('SELECT 1 AS `abc`') AS res;

Result:

┌─res─┐
│   1 │
└─────┘

Syntax

normalizeUTF8NFC(words)

Arguments

Returned value

  • String transformed to NFC normalization form.

Example

Query:

SELECT length('â'), normalizeUTF8NFC('â') AS nfc, length(nfc) AS nfc_len;

Result:

┌─length('â')─┬─nfc─┬─nfc_len─┐
│           2 │ â   │       2 │
└─────────────┴─────┴─────────┘

Syntax

normalizeUTF8NFD(words)

Arguments

Returned value

  • String transformed to NFD normalization form.

Example

Query:

SELECT length('â'), normalizeUTF8NFD('â') AS nfd, length(nfd) AS nfd_len;

Result:

┌─length('â')─┬─nfd─┬─nfd_len─┐
│           2 │ â   │       3 │
└─────────────┴─────┴─────────┘

Syntax

normalizeUTF8NFKC(words)

Arguments

Returned value

  • String transformed to NFKC normalization form.

Example

Query:

SELECT length('â'), normalizeUTF8NFKC('â') AS nfkc, length(nfkc) AS nfkc_len;

Result:

┌─length('â')─┬─nfkc─┬─nfkc_len─┐
│           2 │ â    │        2 │
└─────────────┴──────┴──────────┘

Syntax

normalizeUTF8NFKD(words)

Arguments

Returned value

  • String transformed to NFKD normalization form.

Example

Query:

SELECT length('â'), normalizeUTF8NFKD('â') AS nfkd, length(nfkd) AS nfkd_len;

Result:

┌─length('â')─┬─nfkd─┬─nfkd_len─┐
│           2 │ â    │        3 │
└─────────────┴──────┴──────────┘

Escapes characters to place string into XML text node or attribute.

The following five characters are replaced with XML predefined entities: <, &, >, ", '.

Syntax

encodeXMLComponent(x)

Arguments

Returned value

  • The sequence of characters with escape characters.

Example

Query:

SELECT encodeXMLComponent('Hello, "world"!');
SELECT encodeXMLComponent('<123>');
SELECT encodeXMLComponent('&clickhouse');
SELECT encodeXMLComponent('\'foo\'');

Result:

Hello, &quot;world&quot;!
&lt;123&gt;
&amp;clickhouse
&apos;foo&apos;

Replaces XML predefined entities with characters. Predefined entities are &quot; &amp; &apos; &gt; &lt; This function also replaces numeric character references with Unicode characters. Both decimal (like &#10003;) and hexadecimal (&#x2713;) forms are supported.

Syntax

decodeXMLComponent(x)

Arguments

Returned value

  • The sequence of characters after replacement.

Example

Query:

SELECT decodeXMLComponent('&apos;foo&apos;');
SELECT decodeXMLComponent('&lt; &#x3A3; &gt;');

Result:

'foo'
< Σ >

A function to extract text from HTML or XHTML. It does not necessarily 100% conform to any of the HTML, XML or XHTML standards, but the implementation is reasonably accurate and it is fast. The rules are the following:

  1. Comments are skipped. Example: <!-- test -->. Comment must end with -->. Nested comments are not possible. Note: constructions like <!--> and <!---> are not valid comments in HTML but they are skipped by other rules.

  2. CDATA is pasted verbatim. Note: CDATA is XML/XHTML specific. But it is processed for "best-effort" approach.

  3. script and style elements are removed with all their content. Note: it is assumed that closing tag cannot appear inside content. For example, in JS string literal has to be escaped like "<\/script>". Note: comments and CDATA are possible inside script or style - then closing tags are not searched inside CDATA. Example: <script><![CDATA[</script>]]></script>. But they are still searched inside comments. Sometimes it becomes complicated: <script>var x = "<!--"; </script> var y = "-->"; alert(x + y);</script> Note: script and style can be the names of XML namespaces - then they are not treated like usual script or style elements. Example: <script:a>Hello</script:a>. Note: whitespaces are possible after closing tag name: </script > but not before: < / script>.

  4. Other tags or tag-like elements are skipped without inner content. Example: <a>.</a> Note: it is expected that this HTML is illegal: <a test=">"></a> Note: it also skips something like tags: <>, <!>, etc. Note: tag without end is skipped to the end of input: <hello

  5. HTML and XML entities are not decoded. They must be processed by separate function.

  6. Whitespaces in the text are collapsed or inserted by specific rules.

    • Whitespaces at the beginning and at the end are removed.

    • Consecutive whitespaces are collapsed.

    • But if the text is separated by other elements and there is no whitespace, it is inserted.

    • It may cause unnatural examples: Hello<b>world</b>, Hello<!-- -->world - there is no whitespace in HTML, but the function inserts it. Also consider: Hello<p>world</p>, Hello<br>world. This behavior is reasonable for data analysis, e.g. to convert HTML to a bag of words.

  7. Also note that correct handling of whitespaces requires the support of <pre></pre> and CSS display and white-space properties.

Syntax

extractTextFromHTML(x)

Arguments

Returned value

  • Extracted text.

Example

Query:

SELECT extractTextFromHTML(' <p> A text <i>with</i><b>tags</b>. <!-- comments --> </p> ');
SELECT extractTextFromHTML('<![CDATA[The content within <b>CDATA</b>]]> <script>alert("Script");</script>');
SELECT extractTextFromHTML(html) FROM url('http://www.donothingfor2minutes.com/', RawBLOB, 'html String');

Result:

A text with tags .
The content within <b>CDATA</b>
Do Nothing for 2 Minutes 2:00 &nbsp;

TUPLES

A function that allows grouping multiple columns. For columns with the types T1, T2, …, it returns a Tuple(T1, T2, …) type tuple containing these columns. There is no cost to execute the function. Tuples are normally used as intermediate values for an argument of IN operators, or for creating a list of formal parameters of lambda functions. Tuples can’t be written to a table.

The function implements the operator (x, y, …).

Syntax

tuple(x, y, …)

A function that allows getting a column from a tuple. ‘N’ is the column index, starting from 1. ‘N’ must be a constant, strictly positive integer no greater than the size of the tuple. There is no cost to execute the function.

The function implements the operator x.N.

Syntax

tupleElement(tuple, n)
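A minimal sketch of building a tuple and reading its elements, using both the function form and the x.N operator form described above:

```sql
SELECT
    tuple(1, 'a')      AS t,       -- (1,'a')
    tupleElement(t, 2) AS second,  -- a
    t.1                AS first;   -- 1
```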

Syntax

untuple(x)

You can use the EXCEPT expression to skip columns as a result of the query.

Arguments

Returned value

  • None.

Examples

Input table:

┌─key─┬─v1─┬─v2─┬─v3─┬─v4─┬─v5─┬─v6────────┐
│   1 │ 10 │ 20 │ 40 │ 30 │ 15 │ (33,'ab') │
│   2 │ 25 │ 65 │ 70 │ 40 │  6 │ (44,'cd') │
│   3 │ 57 │ 30 │ 20 │ 10 │  5 │ (55,'ef') │
│   4 │ 55 │ 12 │  7 │ 80 │ 90 │ (66,'gh') │
│   5 │ 30 │ 50 │ 70 │ 25 │ 55 │ (77,'kl') │
└─────┴────┴────┴────┴────┴────┴───────────┘

Example of using a Tuple-type column as the untuple function parameter:

Query:

SELECT untuple(v6) FROM kv;

Result:

┌─_ut_1─┬─_ut_2─┐
│    33 │ ab    │
│    44 │ cd    │
│    55 │ ef    │
│    66 │ gh    │
│    77 │ kl    │
└───────┴───────┘

Note: the names are implementation specific and are subject to change. You should not assume specific names of the columns after application of the untuple.

Example of using an EXCEPT expression:

Query:

SELECT untuple((* EXCEPT (v2, v3),)) FROM kv;

Result:

┌─key─┬─v1─┬─v4─┬─v5─┬─v6────────┐
│   1 │ 10 │ 30 │ 15 │ (33,'ab') │
│   2 │ 25 │ 40 │  6 │ (44,'cd') │
│   3 │ 57 │ 10 │  5 │ (55,'ef') │
│   4 │ 55 │ 80 │ 90 │ (66,'gh') │
│   5 │ 30 │ 25 │ 55 │ (77,'kl') │
└─────┴────┴────┴────┴───────────┘

Syntax

tupleHammingDistance(tuple1, tuple2)

Arguments

The tuples must have elements of the same type.

Returned value

  • The Hamming distance.

SELECT
    toTypeName(tupleHammingDistance(tuple(0), tuple(0))) AS t1,
    toTypeName(tupleHammingDistance((0, 0), (0, 0))) AS t2,
    toTypeName(tupleHammingDistance((0, 0, 0), (0, 0, 0))) AS t3,
    toTypeName(tupleHammingDistance((0, 0, 0, 0), (0, 0, 0, 0))) AS t4,
    toTypeName(tupleHammingDistance((0, 0, 0, 0, 0), (0, 0, 0, 0, 0))) AS t5
┌─t1────┬─t2─────┬─t3─────┬─t4─────┬─t5─────┐
│ UInt8 │ UInt16 │ UInt32 │ UInt64 │ UInt64 │
└───────┴────────┴────────┴────────┴────────┘

Examples

Query:

SELECT tupleHammingDistance((1, 2, 3), (3, 2, 1)) AS HammingDistance;

Result:

┌─HammingDistance─┐
│               2 │
└─────────────────┘
SELECT tupleHammingDistance(wordShingleMinHash(string), wordShingleMinHashCaseInsensitive(string)) as HammingDistance FROM (SELECT 'ClickHouse is a column-oriented database management system for online analytical processing of queries.' AS string);

Result:

┌─HammingDistance─┐
│               2 │
└─────────────────┘

TYPE CONVERSION

When you convert a value from one data type to another, remember that in the general case this is an unsafe operation that can lead to data loss. Data loss can occur if you try to fit a value from a larger data type into a smaller one, or if you convert values between different data types.

  • toInt8(expr) — Results in the Int8 data type.

  • toInt16(expr) — Results in the Int16 data type.

  • toInt32(expr) — Results in the Int32 data type.

  • toInt64(expr) — Results in the Int64 data type.

  • toInt128(expr) — Results in the Int128 data type.

  • toInt256(expr) — Results in the Int256 data type.

Arguments

Returned value

Integer value in the Int8, Int16, Int32, Int64, Int128 or Int256 data type.

Example

Query:

SELECT toInt64(nan), toInt32(32), toInt16('16'), toInt8(8.8);

Result:

┌─────────toInt64(nan)─┬─toInt32(32)─┬─toInt16('16')─┬─toInt8(8.8)─┐
│ -9223372036854775808 │          32 │            16 │           8 │
└──────────────────────┴─────────────┴───────────────┴─────────────┘

Takes an argument of type String and tries to parse it into Int (8 | 16 | 32 | 64 | 128 | 256). If parsing fails, returns 0.

Example

Query:

SELECT toInt64OrZero('123123'), toInt8OrZero('123qwe123');

Result:

┌─toInt64OrZero('123123')─┬─toInt8OrZero('123qwe123')─┐
│                  123123 │                         0 │
└─────────────────────────┴───────────────────────────┘

Takes an argument of type String and tries to parse it into Int (8 | 16 | 32 | 64 | 128 | 256). If parsing fails, returns NULL.

Example

Query:

SELECT toInt64OrNull('123123'), toInt8OrNull('123qwe123');

Result:

┌─toInt64OrNull('123123')─┬─toInt8OrNull('123qwe123')─┐
│                  123123 │                      ᴺᵁᴸᴸ │
└─────────────────────────┴───────────────────────────┘

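Takes an argument of type String and tries to parse it into Int (8 | 16 | 32 | 64 | 128 | 256). If parsing fails, returns the default value: the one passed as the second argument, as in the example below, or otherwise the default value of the type.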

Example

Query:

SELECT toInt64OrDefault('123123', cast('-1' as Int64)), toInt8OrDefault('123qwe123', cast('-1' as Int8));

Result:

┌─toInt64OrDefault('123123', CAST('-1', 'Int64'))─┬─toInt8OrDefault('123qwe123', CAST('-1', 'Int8'))─┐
│                                          123123 │                                               -1 │
└─────────────────────────────────────────────────┴──────────────────────────────────────────────────┘
  • toUInt8(expr) — Results in the UInt8 data type.

  • toUInt16(expr) — Results in the UInt16 data type.

  • toUInt32(expr) — Results in the UInt32 data type.

  • toUInt64(expr) — Results in the UInt64 data type.

  • toUInt256(expr) — Results in the UInt256 data type.

Arguments

Returned value

Integer value in the UInt8, UInt16, UInt32, UInt64 or UInt256 data type.

Example

Query:

SELECT toUInt64(nan), toUInt32(-32), toUInt16('16'), toUInt8(8.8);

Result:

┌───────toUInt64(nan)─┬─toUInt32(-32)─┬─toUInt16('16')─┬─toUInt8(8.8)─┐
│ 9223372036854775808 │    4294967264 │             16 │            8 │
└─────────────────────┴───────────────┴────────────────┴──────────────┘

Converts the argument to Date data type.

If the argument is DateTime or DateTime64, it truncates it, leaving the date component of the DateTime:

SELECT
    now() AS x,
    toDate(x)
┌───────────────────x─┬─toDate(now())─┐
│ 2022-12-30 13:44:17 │    2022-12-30 │
└─────────────────────┴───────────────┘

If the argument is a string, it is parsed as Date or DateTime. If it is parsed as DateTime, only the date component is used:

SELECT
    toDate('2022-12-30') AS x,
    toTypeName(x)
┌──────────x─┬─toTypeName(toDate('2022-12-30'))─┐
│ 2022-12-30 │ Date                             │
└────────────┴──────────────────────────────────┘

SELECT
    toDate('2022-12-30 01:02:03') AS x,
    toTypeName(x)
┌──────────x─┬─toTypeName(toDate('2022-12-30 01:02:03'))─┐
│ 2022-12-30 │ Date                                      │
└────────────┴───────────────────────────────────────────┘

If the argument is a number and it looks like a UNIX timestamp (is greater than 65535), it is interpreted as a DateTime, then truncated to Date in the current timezone. The timezone argument can be specified as a second argument of the function. The truncation to Date depends on the timezone:

SELECT
    now() AS current_time,
    toUnixTimestamp(current_time) AS ts,
    toDateTime(ts) AS time_Amsterdam,
    toDateTime(ts, 'Pacific/Apia') AS time_Samoa,
    toDate(time_Amsterdam) AS date_Amsterdam,
    toDate(time_Samoa) AS date_Samoa,
    toDate(ts) AS date_Amsterdam_2,
    toDate(ts, 'Pacific/Apia') AS date_Samoa_2
Row 1:
──────
current_time:     2022-12-30 13:51:54
ts:               1672404714
time_Amsterdam:   2022-12-30 13:51:54
time_Samoa:       2022-12-31 01:51:54
date_Amsterdam:   2022-12-30
date_Samoa:       2022-12-31
date_Amsterdam_2: 2022-12-30
date_Samoa_2:     2022-12-31

The example above demonstrates how the same UNIX timestamp can be interpreted as different dates in different time zones.

If the argument is a number and it is smaller than 65536, it is interpreted as the number of days since 1970-01-01 (a UNIX day) and converted to Date. It corresponds to the internal numeric representation of the Date data type. Example:

SELECT toDate(12345)
┌─toDate(12345)─┐
│    2003-10-20 │
└───────────────┘

This conversion does not depend on timezones.
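
The two numeric cases above can be modeled outside ClickHouse. The following Python sketch only illustrates the documented rule and is not ClickHouse code: to_date and utc_offset_hours are hypothetical names, a fixed UTC offset stands in for a named time zone, and out-of-range inputs are not modeled.

```python
from datetime import date, datetime, timedelta, timezone

EPOCH = date(1970, 1, 1)


def to_date(value: int, utc_offset_hours: int = 0) -> date:
    """Model of the numeric toDate rule described above."""
    if value < 65536:
        # Small numbers are day counts since 1970-01-01; no time zone is involved.
        return EPOCH + timedelta(days=value)
    # Larger numbers are Unix timestamps, truncated to a date in the given zone.
    zone = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(value, zone).date()


print(to_date(12345))            # day count
print(to_date(1672404714, 1))    # timestamp seen at UTC+1
print(to_date(1672404714, 13))   # same timestamp seen at UTC+13
```

The last two calls reuse the timestamp from the Amsterdam and Samoa example and land on different calendar dates, which is the time zone dependence described above.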

If the argument does not fit in the range of the Date type, the result is implementation-defined: it can saturate to the maximum supported date or overflow:

SELECT toDate(10000000000.)
┌─toDate(10000000000.)─┐
│           2106-02-07 │
└──────────────────────┘

The function toDate can also be written in alternative forms:

SELECT
    now() AS time,
    toDate(time),
    DATE(time),
    CAST(time, 'Date')
┌────────────────time─┬─toDate(now())─┬─DATE(now())─┬─CAST(now(), 'Date')─┐
│ 2022-12-30 13:54:58 │    2022-12-30 │  2022-12-30 │          2022-12-30 │
└─────────────────────┴───────────────┴─────────────┴─────────────────────┘

Have a nice day working with dates and times.

Syntax

toDate32(expr)

Arguments

Returned value

  • A calendar date.

Example

  1. The value is within the range:

SELECT toDate32('1955-01-01') AS value, toTypeName(value);
┌──────value─┬─toTypeName(toDate32('1955-01-01'))─┐
│ 1955-01-01 │ Date32                             │
└────────────┴────────────────────────────────────┘
  2. The value is outside the range:

SELECT toDate32('1899-01-01') AS value, toTypeName(value);
┌──────value─┬─toTypeName(toDate32('1899-01-01'))─┐
│ 1900-01-01 │ Date32                             │
└────────────┴────────────────────────────────────┘
  3. With Date-type argument:

SELECT toDate32(toDate('1899-01-01')) AS value, toTypeName(value);
┌──────value─┬─toTypeName(toDate32(toDate('1899-01-01')))─┐
│ 1970-01-01 │ Date32                                     │
└────────────┴────────────────────────────────────────────┘

Example

Query:

SELECT toDate32OrZero('1899-01-01'), toDate32OrZero('');

Result:

┌─toDate32OrZero('1899-01-01')─┬─toDate32OrZero('')─┐
│                   1900-01-01 │         1900-01-01 │
└──────────────────────────────┴────────────────────┘

Example

Query:

SELECT toDate32OrNull('1955-01-01'), toDate32OrNull('');

Result:

┌─toDate32OrNull('1955-01-01')─┬─toDate32OrNull('')─┐
│                   1955-01-01 │               ᴺᵁᴸᴸ │
└──────────────────────────────┴────────────────────┘

Example

Query:

SELECT
    toDate32OrDefault('1930-01-01', toDate32('2020-01-01')),
    toDate32OrDefault('xx1930-01-01', toDate32('2020-01-01'));

Result:

┌─toDate32OrDefault('1930-01-01', toDate32('2020-01-01'))─┬─toDate32OrDefault('xx1930-01-01', toDate32('2020-01-01'))─┐
│                                              1930-01-01 │                                                2020-01-01 │
└─────────────────────────────────────────────────────────┴───────────────────────────────────────────────────────────┘

Syntax

toDateTime64(expr, scale, [timezone])

Arguments

  • scale - Tick size (precision): 10^(-precision) seconds. Valid range: [ 0 : 9 ].

  • timezone - Time zone of the specified datetime64 object.

Returned value

  • A calendar date and time of day, with sub-second precision.

Example

  1. The value is within the range:

SELECT toDateTime64('1955-01-01 00:00:00.000', 3) AS value, toTypeName(value);
┌───────────────────value─┬─toTypeName(toDateTime64('1955-01-01 00:00:00.000', 3))─┐
│ 1955-01-01 00:00:00.000 │ DateTime64(3)                                          │
└─────────────────────────┴────────────────────────────────────────────────────────┘
  2. As decimal with precision:

SELECT toDateTime64(1546300800.000, 3) AS value, toTypeName(value);
┌───────────────────value─┬─toTypeName(toDateTime64(1546300800., 3))─┐
│ 2019-01-01 00:00:00.000 │ DateTime64(3)                            │
└─────────────────────────┴──────────────────────────────────────────┘

Without the decimal point the value is still treated as Unix Timestamp in seconds:

SELECT toDateTime64(1546300800000, 3) AS value, toTypeName(value);
┌───────────────────value─┬─toTypeName(toDateTime64(1546300800000, 3))─┐
│ 2282-12-31 00:00:00.000 │ DateTime64(3)                              │
└─────────────────────────┴────────────────────────────────────────────┘
  3. With timezone:

SELECT toDateTime64('2019-01-01 00:00:00', 3, 'Asia/Istanbul') AS value, toTypeName(value);
┌───────────────────value─┬─toTypeName(toDateTime64('2019-01-01 00:00:00', 3, 'Asia/Istanbul'))─┐
│ 2019-01-01 00:00:00.000 │ DateTime64(3, 'Asia/Istanbul')                                      │
└─────────────────────────┴─────────────────────────────────────────────────────────────────────┘
  • toDecimal32(value, S)

  • toDecimal64(value, S)

  • toDecimal128(value, S)

  • toDecimal256(value, S)

  • toDecimal32OrNull(expr, S) — Results in Nullable(Decimal32(S)) data type.

  • toDecimal64OrNull(expr, S) — Results in Nullable(Decimal64(S)) data type.

  • toDecimal128OrNull(expr, S) — Results in Nullable(Decimal128(S)) data type.

  • toDecimal256OrNull(expr, S) — Results in Nullable(Decimal256(S)) data type.

These functions should be used instead of toDecimal*() functions, if you prefer to get a NULL value instead of an exception in the event of an input value parsing error.

Arguments

  • S — Scale, the number of decimal places in the resulting value.

Returned value

A value in the Nullable(Decimal(P,S)) data type. The value contains:

  • Number with S decimal places, if ClickHouse interprets the input string as a number.

  • NULL, if ClickHouse can’t interpret the input string as a number or if the input number contains more than S decimal places.

Examples

Query:

SELECT toDecimal32OrNull(toString(-1.111), 5) AS val, toTypeName(val);

Result:

┌────val─┬─toTypeName(toDecimal32OrNull(toString(-1.111), 5))─┐
│ -1.111 │ Nullable(Decimal(9, 5))                            │
└────────┴────────────────────────────────────────────────────┘

Query:

SELECT toDecimal32OrNull(toString(-1.111), 2) AS val, toTypeName(val);

Result:

┌──val─┬─toTypeName(toDecimal32OrNull(toString(-1.111), 2))─┐
│ ᴺᵁᴸᴸ │ Nullable(Decimal(9, 2))                            │
└──────┴────────────────────────────────────────────────────┘
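
The NULL rule stated above (unparsable input, or more than S decimal places) can be sketched with Python's decimal module. This is a model under stated assumptions, not ClickHouse behavior in every detail: to_decimal_or_null is a hypothetical name and the precision limit P is not checked.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional


def to_decimal_or_null(text: str, scale: int) -> Optional[Decimal]:
    """Return a Decimal with `scale` decimal places, or None where the text above says NULL."""
    try:
        number = Decimal(text)
    except InvalidOperation:
        return None                      # not a number at all
    if not number.is_finite():
        return None                      # NaN / Infinity are treated as unparsable here
    if -number.as_tuple().exponent > scale:
        return None                      # more than `scale` decimal places
    return number.quantize(Decimal(1).scaleb(-scale))


print(to_decimal_or_null("-1.111", 5))
print(to_decimal_or_null("-1.111", 2))
```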
  • toDecimal32OrDefault(expr, S) — Results in Decimal32(S) data type.

  • toDecimal64OrDefault(expr, S) — Results in Decimal64(S) data type.

  • toDecimal128OrDefault(expr, S) — Results in Decimal128(S) data type.

  • toDecimal256OrDefault(expr, S) — Results in Decimal256(S) data type.

These functions should be used instead of toDecimal*() functions, if you prefer to get a default value instead of an exception in the event of an input value parsing error.

Arguments

  • S — Scale, the number of decimal places in the resulting value.

Returned value

A value in the Decimal(P,S) data type. The value contains:

  • Number with S decimal places, if ClickHouse interprets the input string as a number.

  • Default Decimal(P,S) data type value, if ClickHouse can’t interpret the input string as a number or if the input number contains more than S decimal places.

Examples

Query:

SELECT toDecimal32OrDefault(toString(-1.111), 5) AS val, toTypeName(val);

Result:

┌────val─┬─toTypeName(toDecimal32OrDefault(toString(-1.111), 5))─┐
│ -1.111 │ Decimal(9, 5)                                         │
└────────┴───────────────────────────────────────────────────────┘

Query:

SELECT toDecimal32OrDefault(toString(-1.111), 2) AS val, toTypeName(val);

Result:

┌─val─┬─toTypeName(toDecimal32OrDefault(toString(-1.111), 2))─┐
│   0 │ Decimal(9, 2)                                         │
└─────┴───────────────────────────────────────────────────────┘
  • toDecimal32OrZero(expr, S) — Results in Decimal32(S) data type.

  • toDecimal64OrZero(expr, S) — Results in Decimal64(S) data type.

  • toDecimal128OrZero(expr, S) — Results in Decimal128(S) data type.

  • toDecimal256OrZero(expr, S) — Results in Decimal256(S) data type.

These functions should be used instead of toDecimal*() functions, if you prefer to get a 0 value instead of an exception in the event of an input value parsing error.

Arguments

  • S — Scale, the number of decimal places in the resulting value.

Returned value

A value in the Decimal(P,S) data type. The value contains:

  • Number with S decimal places, if ClickHouse interprets the input string as a number.

  • 0 with S decimal places, if ClickHouse can’t interpret the input string as a number or if the input number contains more than S decimal places.

Example

Query:

SELECT toDecimal32OrZero(toString(-1.111), 5) AS val, toTypeName(val);

Result:

┌────val─┬─toTypeName(toDecimal32OrZero(toString(-1.111), 5))─┐
│ -1.111 │ Decimal(9, 5)                                      │
└────────┴────────────────────────────────────────────────────┘

Query:

SELECT toDecimal32OrZero(toString(-1.111), 2) AS val, toTypeName(val);

Result:

┌──val─┬─toTypeName(toDecimal32OrZero(toString(-1.111), 2))─┐
│ 0.00 │ Decimal(9, 2)                                      │
└──────┴────────────────────────────────────────────────────┘

Functions for converting between numbers, strings (but not fixed strings), dates, and dates with times. All these functions accept one argument.

When converting to or from a string, the value is formatted or parsed using the same rules as for the TabSeparated format (and almost all other text formats). If the string can’t be parsed, an exception is thrown and the request is canceled.

When converting dates to numbers or vice versa, the date corresponds to the number of days since the beginning of the Unix epoch. When converting dates with times to numbers or vice versa, the date with time corresponds to the number of seconds since the beginning of the Unix epoch.

The date and date-with-time formats for the toDate/toDateTime functions are defined as follows:

YYYY-MM-DD
YYYY-MM-DD hh:mm:ss

As an exception, if converting from UInt32, Int32, UInt64, or Int64 numeric types to Date, and if the number is greater than or equal to 65536, the number is interpreted as a Unix timestamp (and not as the number of days) and is rounded to the date. This allows support for the common occurrence of writing ‘toDate(unix_timestamp)’, which otherwise would be an error and would require writing the more cumbersome ‘toDate(toDateTime(unix_timestamp))’.

Conversion between a date and a date with time is performed the natural way: by adding a null time or dropping the time.

Conversion between numeric types uses the same rules as assignments between different numeric types in C++.

Additionally, when the toString function is applied to a DateTime argument, it can take a second String argument containing the name of the time zone, for example Asia/Yekaterinburg. In this case, the time is formatted according to the specified time zone.

Example

Query:

SELECT
    now() AS now_local,
    toString(now(), 'Asia/Yekaterinburg') AS now_yekat;

Result:

┌───────────now_local─┬─now_yekat───────────┐
│ 2016-06-15 00:11:21 │ 2016-06-15 02:11:21 │
└─────────────────────┴─────────────────────┘

Also see the toUnixTimestamp function.

Converts a String type argument to a FixedString(N) type (a string with fixed length N). N must be a constant. If the string has fewer bytes than N, it is padded with null bytes to the right. If the string has more bytes than N, an exception is thrown.

Accepts a String or FixedString argument. Returns the String with the content truncated at the first zero byte found.

Example

Query:

SELECT toFixedString('foo', 8) AS s, toStringCutToZero(s) AS s_cut;

Result:

┌─s─────────────┬─s_cut─┐
│ foo\0\0\0\0\0 │ foo   │
└───────────────┴───────┘

Query:

SELECT toFixedString('foo\0bar', 8) AS s, toStringCutToZero(s) AS s_cut;

Result:

┌─s──────────┬─s_cut─┐
│ foo\0bar\0 │ foo   │
└────────────┴───────┘
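
The padding and cut-at-zero behavior shown in these two examples can be modeled on raw bytes. A minimal Python sketch, assuming byte strings; the function names to_fixed_string and cut_to_zero are hypothetical stand-ins for toFixedString and toStringCutToZero.

```python
def to_fixed_string(data: bytes, n: int) -> bytes:
    """Pad with null bytes on the right up to n bytes; longer input is an error."""
    if len(data) > n:
        raise ValueError(f"string is longer than {n} bytes")
    return data.ljust(n, b"\0")


def cut_to_zero(data: bytes) -> bytes:
    """Keep everything before the first zero byte."""
    return data.split(b"\0", 1)[0]


padded = to_fixed_string(b"foo\0bar", 8)
print(padded, cut_to_zero(padded))
```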

These functions accept a string and interpret the bytes placed at the beginning of the string as a number in host order (little endian). If the string isn’t long enough, the functions work as if the string is padded with the necessary number of null bytes. If the string is longer than needed, the extra bytes are ignored. A date is interpreted as the number of days since the beginning of the Unix Epoch, and a date with time is interpreted as the number of seconds since the beginning of the Unix Epoch.

This function accepts a number or date or date with time and returns a string containing bytes representing the corresponding value in host order (little endian). Null bytes are dropped from the end. For example, a UInt32 type value of 255 is a string that is one byte long.

This function accepts a number or date or date with time and returns a FixedString containing bytes representing the corresponding value in host order (little endian). Null bytes are dropped from the end. For example, a UInt32 type value of 255 is a FixedString that is one byte long.

Accepts a 16-byte string and returns a UUID containing bytes representing the corresponding value in network byte order (big-endian). If the string isn't long enough, the function works as if the string is padded with the necessary number of null bytes at the end. If the string is longer than 16 bytes, the extra bytes at the end are ignored.

Syntax

reinterpretAsUUID(fixed_string)

Arguments

Returned value

Examples

String to UUID.

Query:

SELECT reinterpretAsUUID(reverse(unhex('000102030405060708090a0b0c0d0e0f')));

Result:

┌─reinterpretAsUUID(reverse(unhex('000102030405060708090a0b0c0d0e0f')))─┐
│                                  08090a0b-0c0d-0e0f-0001-020304050607 │
└───────────────────────────────────────────────────────────────────────┘

Going back and forth from String to UUID.

Query:

WITH
    generateUUIDv4() AS uuid,
    identity(lower(hex(reverse(reinterpretAsString(uuid))))) AS str,
    reinterpretAsUUID(reverse(unhex(str))) AS uuid2
SELECT uuid = uuid2;

Result:

┌─equals(uuid, uuid2)─┐
│                   1 │
└─────────────────────┘

Takes the in-memory byte sequence of the value x and reinterprets it as a value of the destination type, without converting it.

Syntax

reinterpret(x, type)

Arguments

  • x — Any type.

Returned value

  • Destination type value.

Examples

Query:

SELECT reinterpret(toInt8(-1), 'UInt8') as int_to_uint,
    reinterpret(toInt8(1), 'Float32') as int_to_float,
    reinterpret('1', 'UInt32') as string_to_int;

Result:

┌─int_to_uint─┬─int_to_float─┬─string_to_int─┐
│         255 │        1e-45 │            49 │
└─────────────┴──────────────┴───────────────┘
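
The three results above can be reproduced by reinterpreting raw bytes with Python's struct module. This is a sketch assuming little-endian host order and zero padding of short inputs, as described earlier for the reinterpretAs functions; the helper name reinterpret is only borrowed for illustration.

```python
import struct


def reinterpret(raw: bytes, fmt: str):
    """Read `raw` as the struct format `fmt`, zero-padding or cutting to the target size."""
    size = struct.calcsize(fmt)
    padded = raw[:size].ljust(size, b"\0")
    return struct.unpack(fmt, padded)[0]


int_to_uint = reinterpret(struct.pack("<b", -1), "<B")    # Int8 -1 read as UInt8
int_to_float = reinterpret(struct.pack("<b", 1), "<f")    # Int8 1 read as Float32
string_to_int = reinterpret(b"1", "<I")                   # the byte '1' read as UInt32
print(int_to_uint, int_to_float, string_to_int)
```

The Float32 case yields the smallest positive subnormal value, which the result table prints rounded as 1e-45.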

Syntax

CAST(x, T)
CAST(x AS t)
x::t

Arguments

  • x — A value to convert. May be of any type.

  • t — The target data type.

Returned value

  • Converted value.

NOTE

If the input value does not fit the bounds of the target type, the result overflows. For example, CAST(-1, 'UInt8') returns 255.

Examples

Query:

SELECT
    CAST(toInt8(-1), 'UInt8') AS cast_int_to_uint,
    CAST(1.5 AS Decimal(3,2)) AS cast_float_to_decimal,
    '1'::Int32 AS cast_string_to_int;

Result:

┌─cast_int_to_uint─┬─cast_float_to_decimal─┬─cast_string_to_int─┐
│              255 │                  1.50 │                  1 │
└──────────────────┴───────────────────────┴────────────────────┘

Query:

SELECT
    '2016-06-15 23:00:00' AS timestamp,
    CAST(timestamp AS DateTime) AS datetime,
    CAST(timestamp AS Date) AS date,
    CAST(timestamp, 'String') AS string,
    CAST(timestamp, 'FixedString(22)') AS fixed_string;

Result:

┌─timestamp───────────┬────────────datetime─┬───────date─┬─string──────────────┬─fixed_string──────────────┐
│ 2016-06-15 23:00:00 │ 2016-06-15 23:00:00 │ 2016-06-15 │ 2016-06-15 23:00:00 │ 2016-06-15 23:00:00\0\0\0 │
└─────────────────────┴─────────────────────┴────────────┴─────────────────────┴───────────────────────────┘

Example

Query:

SELECT toTypeName(x) FROM t_null;

Result:

┌─toTypeName(x)─┐
│ Int8          │
│ Int8          │
└───────────────┘

Query:

SELECT toTypeName(CAST(x, 'Nullable(UInt16)')) FROM t_null;

Result:

┌─toTypeName(CAST(x, 'Nullable(UInt16)'))─┐
│ Nullable(UInt16)                        │
│ Nullable(UInt16)                        │
└─────────────────────────────────────────┘

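accurateCast(x, T) converts x to the T data type. Unlike cast(x, T), it does not let the value overflow: if x does not fit the bounds of T, it throws an exception, as the second query below shows.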

Example

Query:

SELECT cast(-1, 'UInt8') as uint8;

Result:

┌─uint8─┐
│   255 │
└───────┘

Query:

SELECT accurateCast(-1, 'UInt8') as uint8;

Result:

Code: 70. DB::Exception: Received from localhost:9000. DB::Exception: Value in column Int8 cannot be safely converted into type UInt8: While processing accurateCast(-1, 'UInt8') AS uint8.

Syntax

accurateCastOrNull(x, T)

Parameters

  • x — Input value.

  • T — The name of the returned data type.

Returned value

  • The value, converted to the specified data type T.

Example

Query:

SELECT toTypeName(accurateCastOrNull(5, 'UInt8'));

Result:

┌─toTypeName(accurateCastOrNull(5, 'UInt8'))─┐
│ Nullable(UInt8)                            │
└────────────────────────────────────────────┘

Query:

SELECT
    accurateCastOrNull(-1, 'UInt8') as uint8,
    accurateCastOrNull(128, 'Int8') as int8,
    accurateCastOrNull('Test', 'FixedString(2)') as fixed_string;

Result:

┌─uint8─┬─int8─┬─fixed_string─┐
│  ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ         │
└───────┴──────┴──────────────┘

Converts the input value x to the specified data type T. If the cast value is not representable in the target type, returns default_value if it is specified, or the default value of the type otherwise.

Syntax

accurateCastOrDefault(x, T[, default_value])

Parameters

  • x — Input value.

  • T — The name of the returned data type.

  • default_value — Default value of returned data type.

Returned value

  • The value converted to the specified data type T.

Example

Query:

SELECT toTypeName(accurateCastOrDefault(5, 'UInt8'));

Result:

┌─toTypeName(accurateCastOrDefault(5, 'UInt8'))─┐
│ UInt8                                         │
└───────────────────────────────────────────────┘

Query:

SELECT
    accurateCastOrDefault(-1, 'UInt8') as uint8,
    accurateCastOrDefault(-1, 'UInt8', 5) as uint8_default,
    accurateCastOrDefault(128, 'Int8') as int8,
    accurateCastOrDefault(128, 'Int8', 5) as int8_default,
    accurateCastOrDefault('Test', 'FixedString(2)') as fixed_string,
    accurateCastOrDefault('Test', 'FixedString(2)', 'Te') as fixed_string_default;

Result:

┌─uint8─┬─uint8_default─┬─int8─┬─int8_default─┬─fixed_string─┬─fixed_string_default─┐
│     0 │             5 │    0 │            5 │              │ Te                   │
└───────┴───────────────┴──────┴──────────────┴──────────────┴──────────────────────┘
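
The difference between CAST, accurateCast, accurateCastOrNull and accurateCastOrDefault for integer targets can be summarized in a few lines of Python. This is a sketch of the documented outcomes only, assuming two's-complement wrap-around for CAST; all function names and the (bits, signed) parameters are hypothetical.

```python
from typing import Optional, Tuple


def int_bounds(bits: int, signed: bool) -> Tuple[int, int]:
    """Smallest and largest value of an integer type with the given width."""
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def cast(value: int, bits: int, signed: bool) -> int:
    """CAST: out-of-range values wrap around."""
    low, _ = int_bounds(bits, signed)
    return (value - low) % (1 << bits) + low


def accurate_cast(value: int, bits: int, signed: bool) -> int:
    """accurateCast: out-of-range values raise an error."""
    low, high = int_bounds(bits, signed)
    if not low <= value <= high:
        raise OverflowError(f"{value} cannot be safely converted")
    return value


def accurate_cast_or_null(value: int, bits: int, signed: bool) -> Optional[int]:
    """accurateCastOrNull: out-of-range values become None (NULL)."""
    try:
        return accurate_cast(value, bits, signed)
    except OverflowError:
        return None


def accurate_cast_or_default(value: int, bits: int, signed: bool, default: int = 0) -> int:
    """accurateCastOrDefault: out-of-range values become the default."""
    result = accurate_cast_or_null(value, bits, signed)
    return default if result is None else result


print(cast(-1, 8, False), accurate_cast_or_null(-1, 8, False),
      accurate_cast_or_default(-1, 8, False, 5))
```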

Syntax

toIntervalSecond(number)
toIntervalMinute(number)
toIntervalHour(number)
toIntervalDay(number)
toIntervalWeek(number)
toIntervalMonth(number)
toIntervalQuarter(number)
toIntervalYear(number)

Arguments

  • number — Duration of interval. Positive integer number.

Returned values

  • The value in Interval data type.

Example

Query:

WITH
    toDate('2019-01-01') AS date,
    INTERVAL 1 WEEK AS interval_week,
    toIntervalWeek(1) AS interval_to_week
SELECT
    date + interval_week,
    date + interval_to_week;

Result:

┌─plus(date, interval_week)─┬─plus(date, interval_to_week)─┐
│                2019-01-08 │                   2019-01-08 │
└───────────────────────────┴──────────────────────────────┘

Syntax

parseDateTimeBestEffort(time_string [, time_zone])

Arguments

Supported non-standard formats

  • A string with a date and a time component: YYYYMMDDhhmmss, DD/MM/YYYY hh:mm:ss, DD-MM-YY hh:mm, YYYY-MM-DD hh:mm:ss, etc.

  • A string with a date, but no time component: YYYY, YYYYMM, YYYY*MM, DD/MM/YYYY, DD-MM-YY etc.

  • A string with a day and time: DD, DD hh, DD hh:mm. In this case YYYY-MM are substituted as 2000-01.

  • A string that includes the date and time along with time zone offset information: YYYY-MM-DD hh:mm:ss ±h:mm, etc. For example, 2020-12-12 17:36:00 -5:00.

For all of the formats with a separator, the function also parses month names, given either in full or as the first three letters of the name. Examples: 24/DEC/18, 24-Dec-18, 01-September-2018.
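
As a rough illustration of trying several layouts in turn, the Python sketch below handles a handful of the numeric formats listed above. It is a simplified model with several assumptions (hypothetical parse_best_effort, a fixed format list, no time zone, offset or month-name support) and does not describe how ClickHouse implements the function.

```python
from datetime import datetime

# A few of the layouts listed above, written as strptime patterns.
FORMATS = [
    "%Y%m%d%H%M%S",        # YYYYMMDDhhmmss
    "%d/%m/%Y %H:%M:%S",   # DD/MM/YYYY hh:mm:ss
    "%d-%m-%y %H:%M",      # DD-MM-YY hh:mm
    "%Y-%m-%d %H:%M:%S",   # YYYY-MM-DD hh:mm:ss
    "%d/%m/%Y",            # DD/MM/YYYY
]


def parse_best_effort(text: str) -> datetime:
    """Return the first successful parse, trying each known layout in order."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"unsupported date-time string: {text!r}")


print(parse_best_effort("23/10/2020 12:12:57"))
```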

Returned value

  • time_string converted to the DateTime data type.

Examples

Query:

SELECT parseDateTimeBestEffort('23/10/2020 12:12:57')
AS parseDateTimeBestEffort;

Result:

┌─parseDateTimeBestEffort─┐
│     2020-10-23 12:12:57 │
└─────────────────────────┘

Query:

SELECT parseDateTimeBestEffort('Sat, 18 Aug 2018 07:22:16 GMT', 'Asia/Istanbul')
AS parseDateTimeBestEffort;

Result:

┌─parseDateTimeBestEffort─┐
│     2018-08-18 10:22:16 │
└─────────────────────────┘

Query:

SELECT parseDateTimeBestEffort('1284101485')
AS parseDateTimeBestEffort;

Result (assuming the server time zone is UTC):

┌─parseDateTimeBestEffort─┐
│     2010-09-10 06:51:25 │
└─────────────────────────┘

Query:

SELECT parseDateTimeBestEffort('2018-10-23 10:12:12')
AS parseDateTimeBestEffort;

Result:

┌─parseDateTimeBestEffort─┐
│     2018-10-23 10:12:12 │
└─────────────────────────┘

Query:

SELECT parseDateTimeBestEffort('10 20:19');

Result:

┌─parseDateTimeBestEffort('10 20:19')─┐
│                 2000-01-10 20:19:00 │
└─────────────────────────────────────┘

Syntax

parseDateTime64BestEffort(time_string [, precision [, time_zone]])

Parameters

Returned value

Examples

Query:

SELECT parseDateTime64BestEffort('2021-01-01') AS a, toTypeName(a) AS t
UNION ALL
SELECT parseDateTime64BestEffort('2021-01-01 01:01:00.12346') AS a, toTypeName(a) AS t
UNION ALL
SELECT parseDateTime64BestEffort('2021-01-01 01:01:00.12346',6) AS a, toTypeName(a) AS t
UNION ALL
SELECT parseDateTime64BestEffort('2021-01-01 01:01:00.12346',3,'Asia/Istanbul') AS a, toTypeName(a) AS t
FORMAT PrettyCompactMonoBlock;

Result:

┌──────────────────────────a─┬─t──────────────────────────────┐
│ 2021-01-01 01:01:00.123000 │ DateTime64(3)                  │
│ 2021-01-01 00:00:00.000000 │ DateTime64(3)                  │
│ 2021-01-01 01:01:00.123460 │ DateTime64(6)                  │
│ 2020-12-31 22:01:00.123000 │ DateTime64(3, 'Asia/Istanbul') │
└────────────────────────────┴────────────────────────────────┘

Syntax

toLowCardinality(expr)

Arguments

Returned values

  • Result of expr.

Type: LowCardinality(expr_result_type)

Example

Query:

SELECT toLowCardinality('1');

Result:

┌─toLowCardinality('1')─┐
│ 1                     │
└───────────────────────┘

Converts a DateTime64 to an Int64 value with fixed sub-second precision. The input value is scaled up or down appropriately depending on its precision.

NOTE

The output value is a timestamp in UTC, not in the timezone of DateTime64.

Syntax

toUnixTimestamp64Milli(value)

Arguments

  • value — DateTime64 value with any precision.

Returned value

  • value converted to the Int64 data type.

Examples

Query:

WITH toDateTime64('2019-09-16 19:20:12.345678910', 6) AS dt64
SELECT toUnixTimestamp64Milli(dt64);

Result:

┌─toUnixTimestamp64Milli(dt64)─┐
│                1568650812345 │
└──────────────────────────────┘

Query:

WITH toDateTime64('2019-09-16 19:20:12.345678910', 6) AS dt64
SELECT toUnixTimestamp64Nano(dt64);

Result:

┌─toUnixTimestamp64Nano(dt64)─┐
│         1568650812345678000 │
└─────────────────────────────┘
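
The scaling in both results is plain integer arithmetic on the stored tick count. A minimal Python sketch, assuming a DateTime64 with scale s holds an integer number of 10^(-s) second ticks and that scaling down truncates; rescale_ticks is a hypothetical helper.

```python
def rescale_ticks(ticks: int, from_scale: int, to_scale: int) -> int:
    """Convert a tick count from 10**-from_scale seconds to 10**-to_scale seconds."""
    if to_scale >= from_scale:
        return ticks * 10 ** (to_scale - from_scale)   # scale up: append zeros
    return ticks // 10 ** (from_scale - to_scale)      # scale down: drop digits


micro_ticks = 1568650812345678          # the example value at scale 6
print(rescale_ticks(micro_ticks, 6, 3))  # milliseconds
print(rescale_ticks(micro_ticks, 6, 9))  # nanoseconds
```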

Converts an Int64 to a DateTime64 value with fixed sub-second precision and an optional timezone. The input value is scaled up or down appropriately depending on its precision. Note that the input value is treated as a UTC timestamp, not as a timestamp in the given (or implicit) timezone.

Syntax

fromUnixTimestamp64Milli(value [, timezone])

Arguments

  • value — Int64 value with any precision.

  • timezone — String (optional) timezone name of the result.

Returned value

  • value converted to the DateTime64 data type.

Example

Query:

WITH CAST(1234567891011, 'Int64') AS i64
SELECT fromUnixTimestamp64Milli(i64, 'UTC');

Result:

┌─fromUnixTimestamp64Milli(i64, 'UTC')─┐
│              2009-02-13 23:31:31.011 │
└──────────────────────────────────────┘
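
The same conversion can be checked with Python's datetime module. This sketch assumes the input is a count of milliseconds since the Unix epoch in UTC, as stated above; from_unix_timestamp_milli is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_unix_timestamp_milli(millis: int) -> datetime:
    """Interpret an integer as milliseconds since the epoch, in UTC."""
    return UNIX_EPOCH + timedelta(milliseconds=millis)


result = from_unix_timestamp_milli(1234567891011)
print(result.isoformat(sep=" ", timespec="milliseconds"))
```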

Converts arbitrary expressions into a string via given format.

Syntax

formatRow(format, x, y, ...)

Arguments

  • x,y, ... — Expressions.

Returned value

  • A formatted string (for text formats, it is usually terminated with the newline character).

Example

Query:

SELECT formatRow('CSV', number, 'good')
FROM numbers(3);

Result:

┌─formatRow('CSV', number, 'good')─┐
│ 0,"good"
                         │
│ 1,"good"
                         │
│ 2,"good"
                         │
└──────────────────────────────────┘

Note: If format contains suffix/prefix, it will be written in each row.

Example

Query:

SELECT formatRow('CustomSeparated', number, 'good')
FROM numbers(3)
SETTINGS format_custom_result_before_delimiter='<prefix>\n', format_custom_result_after_delimiter='<suffix>'

Result:

┌─formatRow('CustomSeparated', number, 'good')─┐
│ <prefix>
0   good
<suffix>                   │
│ <prefix>
1   good
<suffix>                   │
│ <prefix>
2   good
<suffix>                   │
└──────────────────────────────────────────────┘

Note: Only row-based formats are supported in this function.

Converts arbitrary expressions into a string via the given format. Differs from formatRow in that this function trims the last newline character (\n), if any.

Syntax

formatRowNoNewline(format, x, y, ...)

Arguments

  • x,y, ... — Expressions.

Returned value

  • A formatted string.

Example

Query:

SELECT formatRowNoNewline('CSV', number, 'good')
FROM numbers(3);

Result:

┌─formatRowNoNewline('CSV', number, 'good')─┐
│ 0,"good"                                  │
│ 1,"good"                                  │
│ 2,"good"                                  │
└───────────────────────────────────────────┘
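
The difference between the two functions is only the trailing newline. The Python sketch below imitates the CSV case with the standard csv module; it is an approximation of the output shown above (quoted strings, bare numbers), and both function names are hypothetical.

```python
import csv
import io


def format_row_csv(*values) -> str:
    """One CSV row, terminated by a newline (the formatRow behavior)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    writer.writerow(values)
    return buffer.getvalue()


def format_row_csv_no_newline(*values) -> str:
    """Same row with the last newline removed (the formatRowNoNewline behavior)."""
    row = format_row_csv(*values)
    return row[:-1] if row.endswith("\n") else row


print(repr(format_row_csv(0, "good")))
print(repr(format_row_csv_no_newline(0, "good")))
```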

URLs

None of these functions follow the RFC. They are maximally simplified for improved performance.

Extracts the protocol from a URL.

Examples of typical returned values: http, https, ftp, mailto, tel, magnet…

Extracts the hostname from a URL.

domain(url)

Arguments

The URL can be specified with or without a scheme. Examples:

svn+ssh://some.svn-hosting.com:80/repo/trunk
some.svn-hosting.com:80/repo/trunk
https://clickhouse.com/time/

For these examples, the domain function returns the following results:

some.svn-hosting.com
some.svn-hosting.com
clickhouse.com

Returned values

  • Host name, if ClickHouse can parse the input string as a URL.

  • Empty string, if ClickHouse can’t parse the input string as a URL.

Type: String.

Example

SELECT domain('svn+ssh://some.svn-hosting.com:80/repo/trunk');
┌─domain('svn+ssh://some.svn-hosting.com:80/repo/trunk')─┐
│ some.svn-hosting.com                                   │
└────────────────────────────────────────────────────────┘

Returns the domain and removes no more than one ‘www.’ from the beginning of it, if present.

Extracts the top-level domain from a URL.

topLevelDomain(url)

Arguments

The URL can be specified with or without a scheme. Examples:

svn+ssh://some.svn-hosting.com:80/repo/trunk
some.svn-hosting.com:80/repo/trunk
https://clickhouse.com/time/

Returned values

  • Domain name, if ClickHouse can parse the input string as a URL.

  • Empty string, if ClickHouse cannot parse the input string as a URL.

Type: String.

Example

SELECT topLevelDomain('svn+ssh://www.some.svn-hosting.com:80/repo/trunk');
┌─topLevelDomain('svn+ssh://www.some.svn-hosting.com:80/repo/trunk')─┐
│ com                                                                │
└────────────────────────────────────────────────────────────────────┘
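
In the spirit of the simplified, non-RFC parsing described above, the host-related functions can be sketched with plain string splitting. This Python model is an assumption about the general approach, not the ClickHouse implementation: the function names mirror domain, domainWithoutWWW and topLevelDomain, user-info and IPv6 hosts are not handled, and returning an empty string for a host without a dot is a choice made for this sketch.

```python
def domain(url: str) -> str:
    """Host name of a URL given with or without a scheme."""
    rest = url.split("://", 1)[-1]        # drop the scheme if there is one
    for separator in "/?#":
        rest = rest.split(separator, 1)[0]  # drop path, query and fragment
    return rest.split(":", 1)[0]          # drop the port


def domain_without_www(url: str) -> str:
    """Host name with at most one leading 'www.' removed."""
    host = domain(url)
    return host[4:] if host.startswith("www.") else host


def top_level_domain(url: str) -> str:
    """Last dot-separated label of the host."""
    host = domain(url)
    return host.rsplit(".", 1)[1] if "." in host else ""


print(domain("svn+ssh://some.svn-hosting.com:80/repo/trunk"))
print(top_level_domain("svn+ssh://www.some.svn-hosting.com:80/repo/trunk"))
```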

Returns the “first significant subdomain”. The first significant subdomain is the second-level domain, unless the second-level domain is ‘com’, ‘net’, ‘org’, or ‘co’, in which case it is the third-level domain. For example, firstSignificantSubdomain(‘https://news.clickhouse.com/’) = ‘clickhouse’, firstSignificantSubdomain(‘https://news.clickhouse.com.tr/’) = ‘clickhouse’. The list of “insignificant” second-level domains and other implementation details may change in the future.

Returns the part of the domain that includes top-level subdomains up to the “first significant subdomain” (see the explanation above).

For example:

  • cutToFirstSignificantSubdomain('https://news.clickhouse.com.tr/') = 'clickhouse.com.tr'.

  • cutToFirstSignificantSubdomain('www.tr') = 'tr'.

  • cutToFirstSignificantSubdomain('tr') = ''.

Returns the part of the domain that includes top-level subdomains up to the “first significant subdomain”, without stripping "www".

For example:

  • cutToFirstSignificantSubdomainWithWWW('https://news.clickhouse.com.tr/') = 'clickhouse.com.tr'.

  • cutToFirstSignificantSubdomainWithWWW('www.tr') = 'www.tr'.

  • cutToFirstSignificantSubdomainWithWWW('tr') = ''.
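
The examples in this and the previous two sections are consistent with a simple label-based rule. The Python sketch below is one reading of that rule, assuming that ‘com’, ‘net’, ‘org’ and ‘co’ act as the insignificant second-level domains; it works on bare host names, uses hypothetical function names, and the real list and edge cases in ClickHouse may differ.

```python
INSIGNIFICANT_SECOND_LEVEL = {"com", "net", "org", "co"}


def significant_label_count(labels) -> int:
    """How many trailing labels are kept: 3 when the second-level one is insignificant."""
    if len(labels) >= 3 and labels[-2] in INSIGNIFICANT_SECOND_LEVEL:
        return 3
    return 2


def first_significant_subdomain(host: str) -> str:
    labels = host.split(".")
    if len(labels) < 2:
        return ""
    return labels[-significant_label_count(labels)]


def cut_with_www(host: str) -> str:
    """Domain up to the first significant subdomain, keeping a leading 'www'."""
    labels = host.split(".")
    if len(labels) < 2:
        return ""
    return ".".join(labels[-significant_label_count(labels):])


def cut(host: str) -> str:
    """Same as cut_with_www, but with one leading 'www.' removed."""
    result = cut_with_www(host)
    return result[4:] if result.startswith("www.") else result


print(first_significant_subdomain("news.clickhouse.com.tr"), cut("www.tr"), cut_with_www("www.tr"))
```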

Can be useful if you need a fresh TLD list or have a custom one.

Configuration example:

<!-- <top_level_domains_path>/var/lib/clickhouse/top_level_domains/</top_level_domains_path> -->
<top_level_domains_lists>
    <!-- https://publicsuffix.org/list/public_suffix_list.dat -->
    <public_suffix_list>public_suffix_list.dat</public_suffix_list>
    <!-- NOTE: path is under top_level_domains_path -->
</top_level_domains_lists>

Syntax

cutToFirstSignificantSubdomainCustom(URL, TLD)

Parameters

Returned value

  • Part of the domain that includes top-level subdomains up to the first significant subdomain.

Example

Query:

SELECT cutToFirstSignificantSubdomainCustom('bar.foo.there-is-no-such-domain', 'public_suffix_list');

Result:

┌─cutToFirstSignificantSubdomainCustom('bar.foo.there-is-no-such-domain', 'public_suffix_list')─┐
│ foo.there-is-no-such-domain                                                                   │
└───────────────────────────────────────────────────────────────────────────────────────────────┘

Returns the part of the domain that includes top-level subdomains up to the first significant subdomain, without stripping www. Accepts a custom TLD list name.

Can be useful if you need a fresh TLD list or have a custom one.

Configuration example:

<!-- <top_level_domains_path>/var/lib/clickhouse/top_level_domains/</top_level_domains_path> -->
<top_level_domains_lists>
    <!-- https://publicsuffix.org/list/public_suffix_list.dat -->
    <public_suffix_list>public_suffix_list.dat</public_suffix_list>
    <!-- NOTE: path is under top_level_domains_path -->
</top_level_domains_lists>

Syntax

cutToFirstSignificantSubdomainCustomWithWWW(URL, TLD)

Parameters

Returned value

  • Part of the domain that includes top-level subdomains up to the first significant subdomain without stripping www.

Example

Query:

SELECT cutToFirstSignificantSubdomainCustomWithWWW('www.foo', 'public_suffix_list');

Result:

┌─cutToFirstSignificantSubdomainCustomWithWWW('www.foo', 'public_suffix_list')─┐
│ www.foo                                                                      │
└──────────────────────────────────────────────────────────────────────────────┘

firstSignificantSubdomainCustom

Returns the first significant subdomain. Accepts a custom TLD list name.

Can be useful if you need a fresh TLD list or have a custom one.

Configuration example:

<!-- <top_level_domains_path>/var/lib/clickhouse/top_level_domains/</top_level_domains_path> -->
<top_level_domains_lists>
    <!-- https://publicsuffix.org/list/public_suffix_list.dat -->
    <public_suffix_list>public_suffix_list.dat</public_suffix_list>
    <!-- NOTE: path is under top_level_domains_path -->
</top_level_domains_lists>

Syntax

firstSignificantSubdomainCustom(URL, TLD)

Parameters

  • URL — URL. String.
  • TLD — Custom TLD list name. String.

Returned value

  • First significant subdomain.

Example

Query:

SELECT firstSignificantSubdomainCustom('bar.foo.there-is-no-such-domain', 'public_suffix_list');

Result:

┌─firstSignificantSubdomainCustom('bar.foo.there-is-no-such-domain', 'public_suffix_list')─┐
│ foo                                                                                      │
└──────────────────────────────────────────────────────────────────────────────────────────┘

port(URL[, default_port = 0])

Returns the port, or default_port if the URL contains no port (or if validation fails).

path

Returns the path. Example: /top/news.html. The path does not include the query string.

pathFull

The same as path, but including the query string and fragment. Example: /top/news.html?page=2#comments

queryString

Returns the query string. Example: page=1&lr=213. The query string does not include the initial question mark, nor # and everything after it.

fragment

Returns the fragment identifier. The fragment does not include the initial hash symbol.

queryStringAndFragment

Returns the query string and fragment identifier. Example: page=1#29390.

extractURLParameter(URL, name)

Returns the value of the ‘name’ parameter in the URL, if present. Otherwise, returns an empty string. If there are several parameters with this name, it returns the first occurrence. This function works under the assumption that the parameter name is encoded in the URL exactly the same way as in the passed argument.

extractURLParameters(URL)

Returns an array of name=value strings corresponding to the URL parameters. The values are not decoded in any way.

extractURLParameterNames(URL)

Returns an array of name strings corresponding to the names of URL parameters. The values are not decoded in any way.

URLHierarchy(URL)

Returns an array containing the URL, truncated at the end by the symbols /,? in the path and query-string. Consecutive separator characters are counted as one. The cut is made in the position after all the consecutive separator characters.

URLPathHierarchy(URL)

The same as URLHierarchy, but without the protocol and host in the result. The / element (root) is not included.

URLPathHierarchy('https://example.com/browse/CONV-6788') =
[
    '/browse/',
    '/browse/CONV-6788'
]
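As a rough illustration of the URL-part functions described above, the query below runs several of them against one sample URL. The URL is made up for the example, and the function names (path, pathFull, queryString, fragment, extractURLParameter) are assumed to be the standard ClickHouse names for the descriptions above. The commented values follow from those descriptions and are not captured output.

```sql
-- Sample URL is hypothetical; any URL with a path, query string and fragment works.
WITH 'https://example.com/top/news.html?page=2&lr=213#comments' AS url
SELECT
    path(url)                        AS p,      -- /top/news.html
    pathFull(url)                    AS pf,     -- /top/news.html?page=2&lr=213#comments
    queryString(url)                 AS qs,     -- page=2&lr=213
    fragment(url)                    AS frag,   -- comments
    extractURLParameter(url, 'page') AS page;   -- 2
```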

decodeURLComponent(URL)

Returns the decoded URL. Example:

SELECT decodeURLComponent('http://127.0.0.1:8123/?query=SELECT%201%3B') AS DecodedURL;
┌─DecodedURL─────────────────────────────┐
│ http://127.0.0.1:8123/?query=SELECT 1; │
└────────────────────────────────────────┘

netloc

Extracts network locality (username:password@host:port) from a URL.

Syntax

netloc(URL)

Arguments

  • URL — URL. String.

Returned value

  • username:password@host:port.

Type: String.

Example

Query:

SELECT netloc('http://paul@www.example.com:80/');

Result:

┌─netloc('http://paul@www.example.com:80/')─┐
│ paul@www.example.com:80                   │
└───────────────────────────────────────────┘

cutWWW

Removes no more than one ‘www.’ from the beginning of the URL’s domain, if present.

cutQueryString

Removes the query string. The question mark is also removed.

cutFragment

Removes the fragment identifier. The number sign is also removed.

cutQueryStringAndFragment

Removes the query string and fragment identifier. The question mark and number sign are also removed.

cutURLParameter(URL, name)

Removes the name parameter from the URL, if present. This function does not encode or decode characters in parameter names, e.g. Client ID and Client%20ID are treated as different parameter names.

Syntax

cutURLParameter(URL, name)

Arguments

  • URL — URL. String.
  • name — Name of the URL parameter. String or Array of Strings.

Returned value

  • URL with name URL parameter removed.

Type: String.

Example

Query:

SELECT
    cutURLParameter('http://bigmir.net/?a=b&c=d&e=f#g', 'a') as url_without_a,
    cutURLParameter('http://bigmir.net/?a=b&c=d&e=f#g', ['c', 'e']) as url_without_c_and_e;

Result:

┌─url_without_a────────────────┬─url_without_c_and_e──────┐
│ http://bigmir.net/?c=d&e=f#g │ http://bigmir.net/?a=b#g │
└──────────────────────────────┴──────────────────────────┘

UUID

The functions for working with UUID are listed below.

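generateUUIDv4

Generates a version 4 UUID.

Syntax

generateUUIDv4([x])

Arguments

  • x — Expression resulting in any of the supported data types. The resulting value is discarded, but the expression itself is used to bypass common subexpression elimination if the function is called multiple times in one query. Optional.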

Returned value

The UUID type value.

Usage example

This example demonstrates creating a table with the UUID type column and inserting a value into the table.

CREATE TABLE t_uuid (x UUID) ENGINE=TinyLog

INSERT INTO t_uuid SELECT generateUUIDv4()

SELECT * FROM t_uuid
┌────────────────────────────────────x─┐
│ f4bf890f-f9dc-4332-ad5c-0c18e73f28e9 │
└──────────────────────────────────────┘

Usage example for generating multiple values in one row

SELECT generateUUIDv4(1), generateUUIDv4(2)
┌─generateUUIDv4(1)────────────────────┬─generateUUIDv4(2)────────────────────┐
│ 2d49dc6e-ddce-4cd0-afb8-790956df54c1 │ 8abf8c13-7dea-4fdf-af3e-0e18767770e6 │
└──────────────────────────────────────┴──────────────────────────────────────┘

Converts String type value to UUID type.

toUUID(String)

Returned value

The UUID type value.

Usage example

SELECT toUUID('61f0c404-5cb3-11e7-907b-a6006ad3dba0') AS uuid
┌─────────────────────────────────uuid─┐
│ 61f0c404-5cb3-11e7-907b-a6006ad3dba0 │
└──────────────────────────────────────┘

Takes an argument of type String and tries to parse it as a UUID. If parsing fails, returns NULL.

toUUIDOrNull(String)

Returned value

The Nullable(UUID) type value.

Usage example

SELECT toUUIDOrNull('61f0c404-5cb3-11e7-907b-a6006ad3dba0T') AS uuid
┌─uuid─┐
│ ᴺᵁᴸᴸ │
└──────┘

Takes an argument of type String and tries to parse it as a UUID. If parsing fails, returns the zero UUID.

toUUIDOrZero(String)

Returned value

The UUID type value.

Usage example

SELECT toUUIDOrZero('61f0c404-5cb3-11e7-907b-a6006ad3dba0T') AS uuid
┌─────────────────────────────────uuid─┐
│ 00000000-0000-0000-0000-000000000000 │
└──────────────────────────────────────┘

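UUIDStringToNum

Accepts a string containing 36 characters in the format xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx and returns a FixedString(16) as its binary representation, with its format optionally specified by variant (Big-endian by default).

Syntax

UUIDStringToNum(string[, variant = 1])

Arguments

  • string — String of 36 characters or FixedString(36).
  • variant — Integer representing a variant as specified by RFC 4122. 1 = Big-endian (default), 2 = Microsoft.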

Returned value

FixedString(16)

Usage examples

SELECT
    '612f3c40-5d3b-217e-707b-6a546a3d7b29' AS uuid,
    UUIDStringToNum(uuid) AS bytes
┌─uuid─────────────────────────────────┬─bytes────────────┐
│ 612f3c40-5d3b-217e-707b-6a546a3d7b29 │ a/<@];!~p{jTj={) │
└──────────────────────────────────────┴──────────────────┘
SELECT
    '612f3c40-5d3b-217e-707b-6a546a3d7b29' AS uuid,
    UUIDStringToNum(uuid, 2) AS bytes
┌─uuid─────────────────────────────────┬─bytes────────────┐
│ 612f3c40-5d3b-217e-707b-6a546a3d7b29 │ @</a;]~!p{jTj={) │
└──────────────────────────────────────┴──────────────────┘

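UUIDNumToString

Accepts a binary value containing a binary representation of a UUID, with its format optionally specified by variant (Big-endian by default), and returns a string containing 36 characters in text format.

Syntax

UUIDNumToString(binary[, variant = 1])

Arguments

  • binary — FixedString(16) as a binary representation of a UUID.
  • variant — Integer representing a variant as specified by RFC 4122. 1 = Big-endian (default), 2 = Microsoft.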

Returned value

String.

Usage example

SELECT
    'a/<@];!~p{jTj={)' AS bytes,
    UUIDNumToString(toFixedString(bytes, 16)) AS uuid
┌─bytes────────────┬─uuid─────────────────────────────────┐
│ a/<@];!~p{jTj={) │ 612f3c40-5d3b-217e-707b-6a546a3d7b29 │
└──────────────────┴──────────────────────────────────────┘
SELECT
    '@</a;]~!p{jTj={)' AS bytes,
    UUIDNumToString(toFixedString(bytes, 16), 2) AS uuid
┌─bytes────────────┬─uuid─────────────────────────────────┐
│ @</a;]~!p{jTj={) │ 612f3c40-5d3b-217e-707b-6a546a3d7b29 │
└──────────────────┴──────────────────────────────────────┘

minus(a, b), a - b operator

multiply(a, b), a * b operator

divide(a, b), a / b operator

intDiv(a, b)

intDivOrZero(a, b)

modulo(a, b), a % b operator

moduloOrZero(a, b)

Differs from modulo in that it returns zero when the divisor is zero.

negate(a), -a operator

abs(a)

gcd(a, b)

lcm(a, b)

empty

Can be optimized by enabling the optimize_functions_to_subcolumns setting. With optimize_functions_to_subcolumns = 1 the function reads only the size0 subcolumn instead of reading and processing the whole array column. The query SELECT empty(arr) FROM TABLE; transforms to SELECT arr.size0 = 0 FROM TABLE;.

The function also works for strings or UUID.

[x] — Input array. Array.

Type: UInt8.

notEmpty

Can be optimized by enabling the optimize_functions_to_subcolumns setting. With optimize_functions_to_subcolumns = 1 the function reads only the size0 subcolumn instead of reading and processing the whole array column. The query SELECT notEmpty(arr) FROM table transforms to SELECT arr.size0 != 0 FROM TABLE.

The function also works for strings or UUID.

[x] — Input array. Array.

Type: UInt8.

length

Can be optimized by enabling the optimize_functions_to_subcolumns setting. With optimize_functions_to_subcolumns = 1 the function reads only the size0 subcolumn instead of reading and processing the whole array column. The query SELECT length(arr) FROM table transforms to SELECT arr.size0 FROM TABLE.

emptyArrayInt8, emptyArrayInt16, emptyArrayInt32, emptyArrayInt64

emptyArrayFloat32, emptyArrayFloat64

emptyArrayDate, emptyArrayDateTime

emptyArrayString

emptyArrayToSingle

range(end), range([start, ] end [, step])

start — The first element of the array. Optional, required if step is used. Default value: 0.

end — The number before which the array is constructed. Required.

step — Determines the incremental step between each element in the array. Optional. Default value: 1.

An exception is thrown if the query results in arrays with a total length greater than the number of elements specified by the function_range_max_elements_in_block setting.
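A minimal sketch of the three call forms of range described above. The argument values are arbitrary, and the commented results follow from the parameter descriptions (end is excluded, start defaults to 0, step defaults to 1).

```sql
SELECT
    range(5)       AS only_end,        -- [0,1,2,3,4]
    range(1, 5)    AS start_and_end,   -- [1,2,3,4]
    range(1, 5, 2) AS with_step;       -- [1,3]
```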

array(x1, …), operator [x1, …]

arrayConcat

arrays – Arbitrary number of arguments of Array type.

has(arr, elem)

hasAll

hasAny

hasSubstr

indexOf(arr, x)

arrayCount([func,] arr1, …)

Note that arrayCount is a higher-order function. You can pass a lambda function to it as the first argument.

countEqual(arr, x)

arrayEnumerate(arr)

arrayEnumerateUniq(arr, …)

arrayPopBack

arrayPopFront

arrayPushBack

single_value – A single value. Only numbers can be added to an array with numbers, and only strings can be added to an array of strings. When adding numbers, ClickHouse automatically sets the single_value type for the data type of the array. For more information about the types of data in ClickHouse, see “Data types”. Can be NULL. The function adds a NULL element to an array, and the type of array elements converts to Nullable.

arrayPushFront

single_value – A single value. Only numbers can be added to an array with numbers, and only strings can be added to an array of strings. When adding numbers, ClickHouse automatically sets the single_value type for the data type of the array. For more information about the types of data in ClickHouse, see “Data types”. Can be NULL. The function adds a NULL element to an array, and the type of array elements converts to Nullable.

arrayResize

arraySlice

arraySort([func,] arr, …)

Note that arraySort is a higher-order function. You can pass a lambda function to it as the first argument. In this case, sorting order is determined by the result of the lambda function applied to the elements of the array.

For each element of the source array, the lambda function returns the sorting key, that is, [1 –> -1, 2 –> -2, 3 –> -3]. Since the arraySort function sorts the keys in ascending order, the result is [3, 2, 1]. Thus, the (x) –> -x lambda function sets the descending order in a sorting.

To improve sorting efficiency, the Schwartzian transform is used.
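The sorting-key behaviour described above can be sketched with two short queries. The input arrays are illustrative. The second query assumes the multi-array form, where the lambda's key comes from a second array of the same length.

```sql
-- Keys are [-1, -2, -3]; sorting keys ascending yields the elements in descending order.
SELECT arraySort((x) -> -x, [1, 2, 3]) AS res;                      -- [3,2,1]

-- Elements of the first array are ordered by the matching keys in the second array.
SELECT arraySort((x, y) -> y, ['hello', 'world'], [2, 1]) AS res;   -- ['world','hello']
```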

arrayReverseSort([func,] arr, …)

Note that arrayReverseSort is a higher-order function. You can pass a lambda function to it as the first argument.

arrayUniq(arr, …)

arrayJoin(arr)

A special function. See the section “ArrayJoin function”.

arrayDifference

array – Array.

Type: UInt*, Int*, Float*.

arrayDistinct

array – Array.

arrayEnumerateDense(arr)

arrayIntersect(arr)

arrayReduce

agg_func — The name of an aggregate function, which should be a constant string.

arr — Any number of array type columns as the parameters of the aggregation function.

arrayReduceInRanges

agg_func — The name of an aggregate function, which should be a constant string.

ranges — The ranges to aggregate, which should be an array of tuples containing the index and the length of each range.

arr — Any number of array type columns as the parameters of the aggregation function.

Type: Array.
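A small sketch of arrayReduceInRanges with made-up input data. Each tuple in ranges is read as (index, length) with 1-based indexes, so the commented result is the sum of each slice.

```sql
SELECT arrayReduceInRanges(
    'sum',                                               -- aggregate function name (constant string)
    [(1, 5), (2, 3), (3, 4), (4, 4)],                    -- (index, length) pairs
    [1000000, 200000, 30000, 4000, 500, 60, 7]           -- source array
) AS res;                                                -- [1234500,234000,34560,4567]
```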

arrayReverse(arr)

reverse(arr)

Synonym for “arrayReverse”.

arrayFlatten

array_of_arrays — Array of arrays. For example, [[1,2,3], [4,5]].

arrayCompact

arr — The array to inspect.

arrayZip

arrN — Array.

Array with elements from the source arrays grouped into tuples. Data types in the tuple are the same as types of the input arrays and in the same order as arrays are passed.

Type: Array(Tuple).
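For illustration, a minimal arrayZip call on two made-up arrays of equal length; the tuple element types follow the input arrays in the order they are passed.

```sql
SELECT arrayZip(['a', 'b', 'c'], [5, 2, 1]) AS res;   -- [('a',5),('b',2),('c',1)]
```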

arrayAUC

Calculates AUC (Area Under the Curve, a concept in machine learning).

arrayMap(func, arr1, …)

Note that arrayMap is a higher-order function. You must pass a lambda function to it as the first argument, and it can’t be omitted.

arrayFilter(func, arr1, …)

Note that arrayFilter is a higher-order function. You must pass a lambda function to it as the first argument, and it can’t be omitted.

arrayFill(func, arr1, …)

Note that arrayFill is a higher-order function. You must pass a lambda function to it as the first argument, and it can’t be omitted.

arrayReverseFill(func, arr1, …)

Note that arrayReverseFill is a higher-order function. You must pass a lambda function to it as the first argument, and it can’t be omitted.

arraySplit(func, arr1, …)

Note that arraySplit is a higher-order function. You must pass a lambda function to it as the first argument, and it can’t be omitted.

arrayReverseSplit(func, arr1, …)

Note that arrayReverseSplit is a higher-order function. You must pass a lambda function to it as the first argument, and it can’t be omitted.

arrayExists([func,] arr1, …)

Note that arrayExists is a higher-order function. You can pass a lambda function to it as the first argument.

arrayAll([func,] arr1, …)

Note that arrayAll is a higher-order function. You can pass a lambda function to it as the first argument.

arrayFirst(func, arr1, …)

Note that arrayFirst is a higher-order function. You must pass a lambda function to it as the first argument, and it can’t be omitted.

arrayFirstIndex(func, arr1, …)

Note that arrayFirstIndex is a higher-order function. You must pass a lambda function to it as the first argument, and it can’t be omitted.
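A short sketch of how a lambda is passed to several of the functions above. The arrays and predicates are invented for the example, and array indexes are assumed to be 1-based.

```sql
SELECT arrayMap(x -> (x + 2), [1, 2, 3]) AS res;                            -- [3,4,5]
SELECT arrayFilter(x -> x LIKE '%World%', ['Hello', 'abc World']) AS res;   -- ['abc World']
SELECT arrayExists(x -> x > 2, [1, 2, 3]) AS res;                           -- 1
SELECT arrayFirst(x -> x > 1, [1, 2, 3]) AS res;                            -- 2 (first matching element)
SELECT arrayFirstIndex(x -> x > 1, [1, 2, 3]) AS res;                       -- 2 (its 1-based position)
```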

arrayMin

Note that arrayMin is a higher-order function. You can pass a lambda function to it as the first argument.

func — Function. Expression.

arr — Array. Array.

arrayMax

Note that arrayMax is a higher-order function. You can pass a lambda function to it as the first argument.

func — Function. Expression.

arr — Array. Array.

arraySum

Note that arraySum is a higher-order function. You can pass a lambda function to it as the first argument.

func — Function. Expression.

arr — Array. Array.

Type: for decimal numbers in source array (or for converted values, if func is specified) — Decimal128, for floating point numbers — Float64, for numeric unsigned — UInt64, and for numeric signed — Int64.

arrayAvg

Note that arrayAvg is a higher-order function. You can pass a lambda function to it as the first argument.

func — Function. Expression.

arr — Array. Array.

Type: Float64.

arrayCumSum([func,] arr1, …)

Note that arrayCumSum is a higher-order function. You can pass a lambda function to it as the first argument.

arrayCumSumNonNegative(arr)

Note that arrayCumSumNonNegative is a higher-order function. You can pass a lambda function to it as the first argument.

arrayProduct

Multiplies elements of an array.

arr — Array of numeric values.

Type: Float64.

Return value type is always Float64.

bitOr(a, b)

bitXor(a, b)

bitNot(a)

bitShiftLeft(a, b)

a — A value to shift. Integer types, String or FixedString.

b — The number of shift positions. Unsigned integer types, 64 bit types or less are allowed.

In the following queries the bin and hex functions are used to show bits of shifted values.
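A minimal sketch of bitShiftLeft on a one-byte value, using bin to display the bits. The literal 99 is arbitrary. Because it is inferred as an 8-bit type, bits shifted past the top of the byte are dropped, which the commented values assume.

```sql
SELECT
    99                 AS a,           -- 99
    bin(a)             AS a_bits,      -- 01100011
    bitShiftLeft(a, 2) AS a_shifted,   -- 140
    bin(a_shifted)     AS shifted_bits -- 10001100
;
```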

bitShiftRight(a, b)

a — A value to shift. Integer types, String or FixedString.

b — The number of shift positions. Unsigned integer types, 64 bit types or less are allowed.

bitRotateLeft(a, b)

bitRotateRight(a, b)

bitTest

Takes any integer, converts it into binary form, and returns the value of the bit at the specified position. The countdown starts from 0 from the right to the left.

bitTestAll

Returns the result of logical conjunction (AND operator) of all bits at given positions. The countdown starts from 0 from the right to the left.

bitTestAny

Returns the result of logical disjunction (OR operator) of all bits at given positions. The countdown starts from 0 from the right to the left.
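The three bit-test functions above can be compared on one value. The number 43 is chosen only because its binary form, 101011, has bits 0, 1, 3 and 5 set and bits 2 and 4 clear.

```sql
SELECT bitTest(43, 1);                  -- 1: bit 1 is set
SELECT bitTest(43, 2);                  -- 0: bit 2 is clear
SELECT bitTestAll(43, 0, 1, 3, 5);      -- 1: every listed bit is set
SELECT bitTestAll(43, 0, 1, 3, 5, 2);   -- 0: bit 2 is clear
SELECT bitTestAny(43, 0, 2);            -- 1: bit 0 is set
SELECT bitTestAny(43, 4, 2);            -- 0: neither bit is set
```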

bitCount

x — Integer or floating-point number. The function uses the value representation in memory. It allows supporting floating-point numbers.

The function does not convert the input value to a larger type (sign extension). So, for example, bitCount(toUInt8(-1)) = 8.

bitHammingDistance

Returns the Hamming distance between the bit representations of two integer values. Can be used with SimHash functions for detection of semi-duplicate strings. The smaller the distance, the more likely those strings are the same.

int1 — First integer value. Int64.

int2 — Second integer value. Int64.

Type: UInt8.

For more information on RoaringBitmap, see: CRoaring.

bitmapBuild

bitmapToArray

bitmapSubsetInRange

bitmap – Bitmap object.

range_start – Range start point. Type: UInt32.

range_end – Range end point (excluded). Type: UInt32.

bitmapSubsetLimit

bitmap – Bitmap object.

range_start – The subset starting point. Type: UInt32.

cardinality_limit – The subset cardinality upper limit. Type: UInt32.

Type: Bitmap object.

subBitmap

Returns the bitmap elements, starting from the offset position. The number of returned elements is limited by the cardinality_limit parameter. Analog of the substring() string function, but for bitmap.

bitmap – The bitmap. Type: Bitmap object.

offset – The position of the first element of the subset. Type: UInt32.

cardinality_limit – The maximum number of elements in the subset. Type: UInt32.

Type: Bitmap object.

bitmapContains

haystack – Bitmap object, where the function searches.

needle – Value that the function searches. Type: UInt32.

bitmapHasAny

If you are sure that bitmap2 contains strictly one element, consider using the bitmapContains function. It works more efficiently.

bitmapHasAll

bitmapCardinality

bitmapMin

bitmapMax

bitmapTransform

bitmapAnd

bitmapOr

bitmapXor

bitmapAndnot

bitmapAndCardinality

bitmapOrCardinality

bitmapXorCardinality

bitmapAndnotCardinality

if

You can use the short_circuit_function_evaluation setting to calculate the if function according to a short scheme. If this setting is enabled, the then expression is evaluated only on rows where cond is true, and the else expression only on rows where cond is false. For example, an exception about division by zero is not thrown when executing the query SELECT if(number = 0, 0, intDiv(42, number)) FROM numbers(10), because intDiv(42, number) will be evaluated only for numbers that don't satisfy the condition number = 0.

Note: NULL values are not used in this example; check the NULL values in conditionals section.
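The short-circuit behaviour described above can be tried with the query from the text plus an explicit setting. The value 'enable' is assumed here to be an accepted value of short_circuit_function_evaluation; check the settings reference for your server version.

```sql
-- With short-circuit evaluation, intDiv(42, number) is never computed for number = 0,
-- so the query is not expected to raise a division-by-zero exception.
SELECT if(number = 0, 0, intDiv(42, number)) AS res
FROM numbers(10)
SETTINGS short_circuit_function_evaluation = 'enable';
```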

Ternary Operator


multiIf

Allows you to write the CASE operator more compactly in the query.

You can use the short_circuit_function_evaluation setting to calculate the multiIf function according to a short scheme. If this setting is enabled, the then_i expression is evaluated only on rows where ((NOT cond_1) AND (NOT cond_2) AND ... AND (NOT cond_{i-1}) AND cond_i) is true, and cond_i is evaluated only on rows where ((NOT cond_1) AND (NOT cond_2) AND ... AND (NOT cond_{i-1})) is true. For example, an exception about division by zero is not thrown when executing the query SELECT multiIf(number = 2, intDiv(1, number), number = 5) FROM numbers(10).
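As a sketch of how multiIf stands in for a chained CASE, the query below buckets the numbers 0 to 9. The thresholds and labels are invented for the example. Conditions are checked left to right and the final argument is the fallback.

```sql
SELECT
    number,
    multiIf(number < 3, 'low',      -- rows 0..2
            number < 7, 'mid',      -- rows 3..6
            'high') AS bucket       -- rows 7..9
FROM numbers(10);
```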

Case

timeZone

Type: String.

toTimeZone

value — Time or date and time. DateTime64.

timezone — Timezone for the returned value. String. This argument is a constant, because toTimeZone changes the timezone of a column (timezone is an attribute of DateTime* types).

Type: DateTime.

timeZoneOf

Returns the timezone name of DateTime or DateTime64 data types.

value — Date and time. DateTime or DateTime64.

Type: String.

timeZoneOffset

Returns a timezone offset in seconds from UTC. The function takes into account daylight saving time and historical timezone changes at the specified date and time. The IANA timezone database is used to calculate the offset.

value — Date and time. DateTime or DateTime64.

Type: Int32.

toYear

toQuarter

toMonth

toDayOfYear

toDayOfMonth

toDayOfWeek

toHour

toMinute

toSecond

toUnixTimestamp

For a DateTime argument: converts the value to a number of type UInt32 (Unix timestamp). For a String argument: converts the input string to a datetime according to the timezone (optional second argument, server timezone is used by default) and returns the corresponding Unix timestamp.

The return type of the toStartOf*, toLastDayOfMonth, toMonday, timeSlot functions described below is determined by the enable_extended_results_for_datetime_functions configuration parameter, which is 0 by default.

toStartOfYear

toStartOfISOYear

toStartOfQuarter

toStartOfMonth

toMonday

toStartOfWeek(t[,mode])

toStartOfDay

toStartOfHour

toStartOfMinute

toStartOfSecond

value — Date and time. DateTime64.

timezone — Timezone for the returned value (optional). If not specified, the function uses the timezone of the value parameter. String.

Type: DateTime64.

See also: the timezone server configuration parameter.

toStartOfFiveMinutes

toStartOfTenMinutes

toStartOfFifteenMinutes

toStartOfInterval(time_or_data, INTERVAL x unit [, time_zone])

toTime

toRelativeYearNum

toRelativeQuarterNum

toRelativeMonthNum

toRelativeWeekNum

toRelativeDayNum

toRelativeHourNum

toRelativeMinuteNum

toRelativeSecondNum

toISOYear

toISOWeek

toWeek(date[,mode])

toYearWeek(date[,mode])

date_trunc

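unit — The type of interval to truncate the result. String literal. Possible values: second, minute, hour, day, week, month, quarter, year.

value — Date and time. DateTime or DateTime64.

timezone — Timezone name for the returned value (optional). If not specified, the function uses the timezone of the value parameter. String.

Type: DateTime.

date_add

unit — The type of interval to add. String. Possible values: second, minute, hour, day, week, month, quarter, year.

value — Value of interval to add. Int.

date — The date or date with time to which value is added. Date or DateTime.

Type: Date or DateTime.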

date_diff

Returns the difference between two dates or dates with time values. The difference is calculated using relative units, e.g. the difference between 2022-01-01 and 2021-12-29 is 3 days for the day unit (see toRelativeDayNum), 1 month for the month unit (see toRelativeMonthNum), and 1 year for the year unit (see toRelativeYearNum).

unit — The type of interval for result. String. Possible values: second, minute, hour, day, week, month, quarter, year.

startdate — The first time value to subtract (the subtrahend). Date, Date32, DateTime or DateTime64.

enddate — The second time value to subtract from (the minuend). Date, Date32, DateTime or DateTime64.

timezone — Timezone name (optional). If specified, it is applied to both startdate and enddate. If not specified, timezones of startdate and enddate are used. If they are not the same, the result is unspecified. String.

Type: Int.
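The relative-unit rule above can be checked with the dates from the description. This sketch assumes date_diff accepts Date arguments built with toDate; the commented values are the ones stated in the text.

```sql
SELECT
    date_diff('day',   toDate('2021-12-29'), toDate('2022-01-01')) AS days,     -- 3
    date_diff('month', toDate('2021-12-29'), toDate('2022-01-01')) AS months,   -- 1
    date_diff('year',  toDate('2021-12-29'), toDate('2022-01-01')) AS years;    -- 1
```

The month and year results are 1 even though only three days have passed, because the function compares relative month and year numbers rather than elapsed time.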

date_sub

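unit — The type of interval to subtract. String. Possible values: second, minute, hour, day, week, month, quarter, year.

value — Value of interval to subtract. Int.

date — The date or date with time from which value is subtracted. Date or DateTime.

Type: Date or DateTime.

timestamp_add

date — Date or date with time. Date or DateTime.

value — Value of interval to add. Int.

unit — The type of interval to add. String. Possible values: second, minute, hour, day, week, month, quarter, year.

Type: Date or DateTime.

timestamp_sub

unit — The type of interval to subtract. String. Possible values: second, minute, hour, day, week, month, quarter, year.

value — Value of interval to subtract. Int.

date — Date or date with time. Date or DateTime.

Type: Date or DateTime.

now

timezone — Timezone name for the returned value (optional). String.

Type: DateTime.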

timezone — for the returned value (optional). .

Type: .

today

yesterday

timeSlot

toYYYYMM

toYYYYMMDD

toYYYYMMDDhhmmss

addYears, addMonths, addWeeks, addDays, addHours, addMinutes, addSeconds, addQuarters

subtractYears, subtractMonths, subtractWeeks, subtractDays, subtractHours, subtractMinutes, subtractSeconds, subtractQuarters

formatDateTime

formatDateTime uses MySQL datetime format style; refer to the MySQL DATE_FORMAT documentation.

%G — four-digit year format for ISO week number, calculated from the week-based year defined by the ISO 8601 standard, normally useful only with %V

formatDateTimeInJodaSyntax

Similar to formatDateTime, except that it formats datetime in Joda style instead of MySQL style. Refer to the Joda-Time DateTimeFormat documentation.

dateName

date_part — Date part. Possible values: 'year', 'quarter', 'month', 'week', 'dayofyear', 'day', 'weekday', 'hour', 'minute', 'second'. String.

date — Date. Date, Date32, DateTime or DateTime64.

timezone — Timezone. Optional. String.

Type: String.

FROM_UNIXTIME

Converts a Unix timestamp to a calendar date and a time of day. When there is only a single argument of Integer type, it acts in the same way as toDateTime and returns the DateTime type.

FROM_UNIXTIME uses MySQL datetime format style; refer to the MySQL DATE_FORMAT documentation.

When there are two or three arguments (the first an Integer, Date, Date32, DateTime or DateTime64, the second a constant format string, and the third an optional constant time zone string), it acts in the same way as formatDateTime and returns the String type.

fromUnixTimestampInJodaSyntax

Similar to FROM_UNIXTIME, except that it formats time in Joda style instead of MySQL style. Refer to the Joda-Time DateTimeFormat documentation.

toModifiedJulianDay

Converts a date in text form YYYY-MM-DD to a number in Int32. This function supports date from 0000-01-01 to 9999-12-31. It raises an exception if the argument cannot be parsed as a date, or the date is invalid.

date — Date in text form. String or FixedString.

Type: Int32.

toModifiedJulianDayOrNull

Similar to toModifiedJulianDay(), but instead of raising exceptions it returns NULL.

date — Date in text form. String or FixedString.

Type: Nullable(Int32).

fromModifiedJulianDay

Converts a number to a date in text form YYYY-MM-DD. This function supports day number from -678941 to 2973119 (which represent 0000-01-01 and 9999-12-31 respectively). It raises an exception if the day number is outside of the supported range.

day — Modified Julian Day number. Any integer type.

Type: String.

fromModifiedJulianDayOrNull

Similar to fromModifiedJulianDay(), but instead of raising exceptions it returns NULL.

day — Modified Julian Day number. Any integer type.

Type: Nullable(String).
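A brief round-trip sketch for the Modified Julian Day functions above. The date 2020-01-01 is arbitrary. Its day number, 58849, follows from MJD 51544 being 2000-01-01 plus 7305 days.

```sql
SELECT toModifiedJulianDay('2020-01-01');    -- 58849
SELECT fromModifiedJulianDay(58849);         -- 2020-01-01

-- The OrNull variants return NULL instead of raising an exception on bad input.
SELECT
    toModifiedJulianDayOrNull('2020-01-01') AS ok,
    toModifiedJulianDayOrNull('not-a-date') AS bad;
```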

For information on connecting and configuring dictionaries, see External dictionaries.

dictGet, dictGetOrDefault, dictGetOrNull

dict_name — Name of the dictionary. String literal.

attr_names — Name of the column of the dictionary, String literal, or tuple of column names, Tuple(String literal).

id_expr — Key value. Expression returning dictionary key-type value or Tuple-type value depending on the dictionary configuration.

default_value_expr — Values returned if the dictionary does not contain a row with the id_expr key. Expression or Tuple(Expression), returning the value (or values) in the data types configured for the attr_names attribute.

If ClickHouse parses the attribute successfully in the attribute’s data type, functions return the value of the dictionary attribute that corresponds to id_expr.

dictHas

dict_name — Name of the dictionary. String literal.

id_expr — Key value. Expression returning dictionary key-type value or Tuple-type value depending on the dictionary configuration.

dictGetHierarchy

Creates an array containing all the parents of a key in the hierarchical dictionary.

dict_name — Name of the dictionary. String literal.

key — Key value. Expression returning a UInt64-type value.

Type: Array(UInt64).

dictIsIn

dict_name — Name of the dictionary. String literal.

child_id_expr — Key to be checked. Expression returning a UInt64-type value.

ancestor_id_expr — Alleged ancestor of the child_id_expr key. Expression returning a UInt64-type value.

Other Functions

dict_name — Name of the dictionary. String literal.

attr_name — Name of the column of the dictionary. String literal.

id_expr — Key value. Expression returning a UInt64 or Tuple-type value depending on the dictionary configuration.

default_value_expr — Value returned if the dictionary does not contain a row with the id_expr key. Expression returning the value in the data type configured for the attr_name attribute.

If ClickHouse parses the attribute successfully in the attribute’s data type, functions return the value of the dictionary attribute that corresponds to id_expr.

char

number_1, number_2, ..., number_n — Numerical arguments interpreted as integers. Types: Int, Float.

hex

Values of type Date and DateTime are formatted as corresponding integers (the number of days since Epoch for Date and the value of Unix Timestamp for DateTime).

For String and FixedString, all bytes are simply encoded as two hexadecimal numbers. Zero bytes are not omitted.

Values of Float and Decimal types are encoded as their representation in memory. As we support little-endian architecture, they are encoded in little-endian. Zero leading/trailing bytes are not omitted.

Values of UUID type are encoded as a big-endian order string.

arg — A value to convert to hexadecimal. Types: String, UInt, Float, Decimal, Date or DateTime.

Type: String.

unhex

Performs the opposite operation of hex. It interprets each pair of hexadecimal digits (in the argument) as a number and converts it to the byte represented by the number. The return value is a binary string (BLOB).

If you want to convert the result to a number, you can use the reverse and reinterpretAs<Type> functions.

arg — A string containing any number of hexadecimal digits. Type: String, FixedString.

Type: String.
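A small sketch of hex and unhex as inverse operations on strings. The inputs are arbitrary, and the commented values follow from the ASCII codes of the characters involved.

```sql
SELECT hex('ABC');                  -- 414243 (each byte becomes two hexadecimal digits)
SELECT hex(1);                      -- 01
SELECT unhex('303132');             -- 012
SELECT unhex('4D7953514C');         -- MySQL
SELECT unhex(hex('round trip'));    -- round trip
```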

bitmaskToList(num)

bitmaskToArray(num)

encrypt

mode — Encryption mode. String.

plaintext — Text that needs to be encrypted. String.

key — Encryption key. String.

iv — Initialization vector. Required for -gcm modes, optional for others. String.

aad — Additional authenticated data. It isn't encrypted, but it affects decryption. Works only in -gcm modes; for others, an exception is thrown. String.

Ciphertext binary string. String.

aes_encrypt_mysql

Compatible with MySQL encryption; the resulting ciphertext can be decrypted with the AES_DECRYPT function.

mode — Encryption mode. String.

plaintext — Text that needs to be encrypted. String.

key — Encryption key. If key is longer than required by mode, MySQL-specific key folding is performed. String.

iv — Initialization vector. Optional, only the first 16 bytes are taken into account. String.

Ciphertext binary string. String.

decrypt

mode — Decryption mode. String.

ciphertext — Encrypted text that needs to be decrypted. String.

key — Decryption key. String.

iv — Initialization vector. Required for -gcm modes, optional for others. String.

aad — Additional authenticated data. Decryption fails if this value is incorrect. Works only in -gcm modes; for others, an exception is thrown. String.

Decrypted String. String.

Re-using table from encrypt.

aes_decrypt_mysql

Compatible with MySQL encryption; decrypts data encrypted with the AES_ENCRYPT function.

mode — Decryption mode. String.

ciphertext — Encrypted text that needs to be decrypted. String.

key — Decryption key. String.

iv — Initialization vector. Optional. String.

Decrypted String. String.

file

path — The relative path to the file from user_files_path. The path supports the following wildcards: *, ?, {abc,def} and {N..M}, where N and M are numbers and 'abc' and 'def' are strings.

default — The value returned when the file does not exist or cannot be accessed. Data types supported: String and NULL.

greatCircleDistance

Calculates the distance between two points on the Earth’s surface using the great-circle formula.

geoDistance

greatCircleAngle

Calculates the central angle between two points on the Earth’s surface using the great-circle formula.

pointInEllipses

pointInPolygon

(x, y) — Coordinates of a point on the plane. Data type — Tuple — A tuple of two numbers.

[(a, b), (c, d) ...] — Polygon vertices. Data type — Array. Each vertex is represented by a pair of coordinates (a, b). Vertices should be specified in a clockwise or counterclockwise order. The minimum number of vertices is 3. The polygon must be constant.

Geohash is the geocode system, which subdivides Earth’s surface into buckets of grid shape and encodes each cell into a short string of letters and digits. It is a hierarchical data structure, so the longer the geohash string, the more precise the geographic location.

If you need to manually convert geographic coordinates to geohash strings, you can use geohash.org.

geohashEncode

Encodes latitude and longitude as a geohash-string.

geohashDecode

Decodes any geohash-encoded string into longitude and latitude.

geohashesInBox

Returns an array of geohash-encoded strings of given precision that fall inside and intersect boundaries of given box, basically a 2D grid flattened into an array.

longitude_min — Minimum longitude. Range: [-180°, 180°]. Type: Float.

latitude_min — Minimum latitude. Range: [-90°, 90°]. Type: Float.

longitude_max — Maximum longitude. Range: [-180°, 180°]. Type: Float.

latitude_max — Maximum latitude. Range: [-90°, 90°]. Type: Float.

precision — Geohash precision. Range: [1, 12]. Type: UInt8.

Type: Array(String).
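A hedged sketch of geohashesInBox using the argument order listed above (longitude_min, latitude_min, longitude_max, latitude_max, precision). The bounding box is an arbitrary example, and the size of the result depends on the chosen precision.

```sql
-- Returns an Array(String) of precision-4 geohashes covering the box.
SELECT geohashesInBox(24.48, 40.56, 24.785, 40.81, 4) AS cells;
```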

is a geographical indexing system where Earth’s surface divided into a grid of even hexagonal cells. This system is hierarchical, i. e. each hexagon on the top level ("parent") can be split into seven even but smaller ones ("children"), and so on.

The full description of the H3 system is available at .

h3IsValid

Verifies whether the number is a valid index.

h3index — Hexagon index number. Type: .

Type: .

h3GetResolution

Defines the resolution of the given index.

h3index — Hexagon index number. Type: .

If the index is not valid, the function returns a random value. Use to verify the index.

Type: .

h3EdgeAngle

Calculates the average length of the hexagon edge in grades.

resolution — Index resolution. Type: . Range: [0, 15].

The average length of the hexagon edge in grades. Type: .

h3EdgeLengthM

Calculates the average length of the hexagon edge in meters.

resolution — Index resolution. Type: . Range: [0, 15].

The average length of the hexagon edge in meters. Type: .

geoToH3

Returns the H3 index of the point (lon, lat) at the specified resolution.

lon — Longitude. Type: .

lat — Latitude. Type: .

resolution — Index resolution. Range: [0, 15]. Type: .

Type: .

h3kRing

Lists all the hexagons within radius k of the given hexagon, in random order.

h3index — Hexagon index number. Type: .

k — Radius. Type:

Type: ().

h3GetBaseCell

Returns the base cell number of the index.

index — Hexagon index number. Type: .

Type: .

h3HexAreaM2

resolution — Index resolution. Range: [0, 15]. Type: .

Type: .

h3IndexesAreNeighbors

Returns whether or not the provided indexes are neighbors.

index1 — Hexagon index number. Type: .

index2 — Hexagon index number. Type: .

Type: .

h3ToChildren

Returns an array of child indexes for the given index.

index — Hexagon index number. Type: .

resolution — Index resolution. Range: [0, 15]. Type: .

Type: ().

h3ToParent

Returns the parent (coarser) index containing the given index.

index — Hexagon index number. Type: .

resolution — Index resolution. Range: [0, 15]. Type: .

Type: .

h3ToString

index — Hexagon index number. Type: .

Type: .

stringToH3

index_str — String representation of the H3 index. Type: .

Hexagon index number. Returns 0 on error. Type: .

halfMD5

Interprets all the input parameters as strings and calculates the hash value for each of them. Then combines the hashes, takes the first 8 bytes of the hash of the resulting string, and interprets them as UInt64 in big-endian byte order.

The function is relatively slow (5 million short strings per second per processor core). Consider using the sipHash64 function instead.

The function takes a variable number of input parameters. Arguments can be any of the supported data types. For some data types, the calculated hash value may be the same for the same values even if the argument types differ (integers of different size, named and unnamed Tuple with the same data, Map and the corresponding Array(Tuple(key, value)) type with the same data).

A data type hash value.
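The "first 8 bytes, big-endian" step can be sketched in Python for the simplest case of a single string argument. The helper name half_md5_single is hypothetical, and the step that combines the hashes of several arguments is deliberately not reproduced, so this sketch should not be expected to match the function's output for multiple or non-string arguments.

```python
import hashlib


def half_md5_single(data: bytes) -> int:
    """Take the first 8 bytes of the MD5 digest as a big-endian UInt64."""
    digest = hashlib.md5(data).digest()  # 16 raw bytes
    return int.from_bytes(digest[:8], byteorder="big")
```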

MD5

sipHash64

Produces a 64-bit hash value.

This is a cryptographic hash function. It works at least three times faster than the MD5 function.

The function interprets all the input parameters as strings and calculates the hash value for each of them. It then combines the hashes by the following algorithm:

The function takes a variable number of input parameters. Arguments can be any of the supported data types. For some data types, the calculated hash value may be the same for the same values even if the argument types differ (integers of different size, named and unnamed Tuple with the same data, Map and the corresponding Array(Tuple(key, value)) type with the same data).

A data type hash value.

sipHash128

Produces a 128-bit hash value. Differs from sipHash64 in that the final xor-folding state is done up to 128 bits.

The function takes a variable number of input parameters. Arguments can be any of the supported data types. For some data types, the calculated hash value may be the same for the same values even if the argument types differ (integers of different size, named and unnamed Tuple with the same data, Map and the corresponding Array(Tuple(key, value)) type with the same data).

Type: .

cityHash64

Produces a 64-bit hash value.

The function takes a variable number of input parameters. Arguments can be any of the supported data types. For some data types, the calculated hash value may be the same for the same values even if the argument types differ (integers of different size, named and unnamed Tuple with the same data, Map and the corresponding Array(Tuple(key, value)) type with the same data).

A data type hash value.

intHash32

intHash64

SHA1, SHA224, SHA256, SHA512

Calculates the SHA-1, SHA-224, SHA-256, or SHA-512 hash of a string and returns the resulting set of bytes as FixedString.

s — Input string for SHA hash calculation. .

Type: .

Use the hex function to represent the result as a hex-encoded string.
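The difference between the raw digest bytes and their hex-encoded form can be seen in a short Python sketch using the standard hashlib module. The helper name sha256_raw_and_hex is hypothetical and only illustrates the idea; letter case of the hex output may differ from what the SQL function produces.

```python
import hashlib


def sha256_raw_and_hex(text: str):
    """Return the raw SHA-256 digest and its hex-encoded representation."""
    raw = hashlib.sha256(text.encode("utf-8")).digest()  # 32 raw bytes
    return raw, raw.hex()  # hex string is twice as long as the raw digest
```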

URLHash(url[, N])

farmFingerprint64

farmHash64

Produces a 64-bit FarmHash or Fingerprint value. farmFingerprint64 is preferred for a stable and portable value.

These functions use the Fingerprint64 and Hash64 methods respectively from all .

The function takes a variable number of input parameters. Arguments can be any of the supported data types. For some data types, the calculated hash value may be the same for the same values even if the argument types differ (integers of different size, named and unnamed Tuple with the same data, Map and the corresponding Array(Tuple(key, value)) type with the same data).

A data type hash value.

javaHash

Calculates JavaHash from a , , , , . This hash function is neither fast nor of good quality. The only reason to use it is when the algorithm is already used in another system and you have to calculate exactly the same result.

javaHashUTF16LE

Calculates JavaHash from a string, assuming it contains bytes representing a string in UTF-16LE encoding.

hiveHash

This is just JavaHash with the sign bit zeroed out. This function is used in Apache Hive for versions before 3.0. This hash function is neither fast nor of good quality. The only reason to use it is when the algorithm is already used in another system and you have to calculate exactly the same result.
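For readers who need to reproduce these values elsewhere, the following Python sketch implements the well-known Java String.hashCode recurrence (h = 31 * h + c over 32-bit signed integers) and the sign-bit masking described above. The names java_hash and hive_hash are illustrative, and the sketch assumes characters in the Basic Multilingual Plane; it is not guaranteed to match the SQL functions for other inputs.

```python
def java_hash(text: str) -> int:
    """Java-style String.hashCode as a signed 32-bit integer."""
    h = 0
    for ch in text:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF  # wrap to 32 bits
    # Reinterpret the unsigned 32-bit value as signed.
    return h - (1 << 32) if h & 0x80000000 else h


def hive_hash(text: str) -> int:
    """Same hash with the sign bit cleared, so the result is never negative."""
    return java_hash(text) & 0x7FFFFFFF
```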

metroHash64

Produces a 64-bit hash value.

The function takes a variable number of input parameters. Arguments can be any of the supported data types. For some data types, the calculated hash value may be the same for the same values even if the argument types differ (integers of different size, named and unnamed Tuple with the same data, Map and the corresponding Array(Tuple(key, value)) type with the same data).

A data type hash value.

jumpConsistentHash

Calculates JumpConsistentHash from a UInt64. Accepts two arguments: a UInt64-type key and the number of buckets. Returns Int32. For more information, see the link:
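As background, the published jump consistent hash algorithm (Lamping and Veach) can be sketched in a few lines of Python. The function name jump_consistent_hash is chosen for this illustration, and whether the SQL function returns bit-identical results is an assumption that should be verified against the server.

```python
def jump_consistent_hash(key: int, num_buckets: int) -> int:
    """Map a 64-bit key to a bucket in [0, num_buckets)."""
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit linear congruential step; the mask emulates uint64 overflow.
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b
```

The useful property is that growing the bucket count from n to n + 1 either leaves a key where it was or moves it to the new bucket n, so only a small share of keys is relocated.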

murmurHash2_32, murmurHash2_64

Produces a hash value.

Both functions take a variable number of input parameters. Arguments can be any of the supported data types. For some data types, the calculated hash value may be the same for the same values even if the argument types differ (integers of different size, named and unnamed Tuple with the same data, Map and the corresponding Array(Tuple(key, value)) type with the same data).

The murmurHash2_32 function returns hash value having the data type.

The murmurHash2_64 function returns hash value having the data type.

gccMurmurHash

Calculates a 64-bit hash value using the same hash seed as gcc. It is portable between Clang and GCC builds.

par1, ... — A variable number of parameters that can be any of the supported data types.

Type: .

murmurHash3_32, murmurHash3_64

Produces a hash value.

Both functions take a variable number of input parameters. Arguments can be any of the supported data types. For some data types, the calculated hash value may be the same for the same values even if the argument types differ (integers of different size, named and unnamed Tuple with the same data, Map and the corresponding Array(Tuple(key, value)) type with the same data).

The murmurHash3_32 function returns a data type hash value.

The murmurHash3_64 function returns a data type hash value.

murmurHash3_128

Produces a 128-bit hash value.

expr — A list of . .

Type: .

xxHash32, xxHash64


ngramSimHash

Can be used for detection of semi-duplicate strings with bitHammingDistance. The smaller the Hamming distance between the calculated simhashes of two strings, the more likely these strings are the same.

string — String. .

ngramsize — The size of an n-gram. Optional. Possible values: any number from 1 to 25. Default value: 3. .

Type: .
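The idea behind comparing simhashes by Hamming distance can be sketched in Python. This is only an illustration of the general simhash technique: the per-n-gram hash (zlib.crc32), the 32-bit width, and the names ngram_simhash and bit_hamming_distance are assumptions of this sketch, so the numbers it produces will not necessarily match the SQL functions.

```python
import zlib


def ngram_simhash(text: str, ngram_size: int = 3, bits: int = 32) -> int:
    """Build a simhash by letting every n-gram vote on every bit."""
    data = text.encode("utf-8")
    votes = [0] * bits
    for i in range(len(data) - ngram_size + 1):
        h = zlib.crc32(data[i:i + ngram_size])
        for b in range(bits):
            votes[b] += 1 if (h >> b) & 1 else -1
    # A bit is set when more n-grams voted for 1 than for 0.
    return sum(1 << b for b in range(bits) if votes[b] > 0)


def bit_hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two integers differ."""
    return bin(a ^ b).count("1")
```

Identical strings always give a distance of 0; strings that share most of their n-grams tend to differ in only a few bits.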

ngramSimHashCaseInsensitive

Can be used for detection of semi-duplicate strings with bitHammingDistance. The smaller the Hamming distance between the calculated simhashes of two strings, the more likely these strings are the same.

string — String. .

ngramsize — The size of an n-gram. Optional. Possible values: any number from 1 to 25. Default value: 3. .

Type: .

ngramSimHashUTF8

Can be used for detection of semi-duplicate strings with bitHammingDistance. The smaller the Hamming distance between the calculated simhashes of two strings, the more likely these strings are the same.

string — String. .

ngramsize — The size of an n-gram. Optional. Possible values: any number from 1 to 25. Default value: 3. .

Type: .

ngramSimHashCaseInsensitiveUTF8

Can be used for detection of semi-duplicate strings with bitHammingDistance. The smaller the Hamming distance between the calculated simhashes of two strings, the more likely these strings are the same.

string — String. .

ngramsize — The size of an n-gram. Optional. Possible values: any number from 1 to 25. Default value: 3. .

Type: .

wordShingleSimHash

Can be used for detection of semi-duplicate strings with bitHammingDistance. The smaller the Hamming distance between the calculated simhashes of two strings, the more likely these strings are the same.

string — String. .

shinglesize — The size of a word shingle. Optional. Possible values: any number from 1 to 25. Default value: 3. .

Type: .

wordShingleSimHashCaseInsensitive

Can be used for detection of semi-duplicate strings with bitHammingDistance. The smaller the Hamming distance between the calculated simhashes of two strings, the more likely these strings are the same.

string — String. .

shinglesize — The size of a word shingle. Optional. Possible values: any number from 1 to 25. Default value: 3. .

Type: .

wordShingleSimHashUTF8

Can be used for detection of semi-duplicate strings with bitHammingDistance. The smaller the Hamming distance between the calculated simhashes of two strings, the more likely these strings are the same.

string — String. .

shinglesize — The size of a word shingle. Optional. Possible values: any number from 1 to 25. Default value: 3. .

Type: .

wordShingleSimHashCaseInsensitiveUTF8

Can be used for detection of semi-duplicate strings with bitHammingDistance. The smaller the Hamming distance between the calculated simhashes of two strings, the more likely these strings are the same.

string — String. .

shinglesize — The size of a word shingle. Optional. Possible values: any number from 1 to 25. Default value: 3. .

Type: .

ngramMinHash

Can be used for detection of semi-duplicate strings with tupleHammingDistance. For two strings: if one of the returned hashes is the same for both strings, the strings are considered the same.

string — String. .

ngramsize — The size of an n-gram. Optional. Possible values: any number from 1 to 25. Default value: 3. .

hashnum — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1 to 25. Default value: 6. .

Type: (, ).

ngramMinHashCaseInsensitive

Can be used for detection of semi-duplicate strings with tupleHammingDistance. For two strings: if one of the returned hashes is the same for both strings, the strings are considered the same.

string — String. .

ngramsize — The size of an n-gram. Optional. Possible values: any number from 1 to 25. Default value: 3. .

hashnum — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1 to 25. Default value: 6. .

Type: (, ).

ngramMinHashUTF8

Can be used for detection of semi-duplicate strings with tupleHammingDistance. For two strings: if one of the returned hashes is the same for both strings, the strings are considered the same.

string — String. .

ngramsize — The size of an n-gram. Optional. Possible values: any number from 1 to 25. Default value: 3. .

hashnum — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1 to 25. Default value: 6. .

Type: (, ).

ngramMinHashCaseInsensitiveUTF8

Can be used for detection of semi-duplicate strings with tupleHammingDistance. For two strings: if one of the returned hashes is the same for both strings, the strings are considered the same.

string — String. .

ngramsize — The size of an n-gram. Optional. Possible values: any number from 1 to 25. Default value: 3. .

hashnum — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1 to 25. Default value: 6. .

Type: (, ).

ngramMinHashArg

Splits an ASCII string into n-grams of ngramsize symbols and returns the n-grams with minimum and maximum hashes, calculated by the ngramMinHash function with the same input. Case sensitive.

string — String. .

ngramsize — The size of an n-gram. Optional. Possible values: any number from 1 to 25. Default value: 3. .

hashnum — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1 to 25. Default value: 6. .

Type: ((), ()).

ngramMinHashArgCaseInsensitive

Splits an ASCII string into n-grams of ngramsize symbols and returns the n-grams with minimum and maximum hashes, calculated by the ngramMinHashCaseInsensitive function with the same input. Case insensitive.

string — String. .

ngramsize — The size of an n-gram. Optional. Possible values: any number from 1 to 25. Default value: 3. .

hashnum — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1 to 25. Default value: 6. .

Type: ((), ()).

ngramMinHashArgUTF8

Splits a UTF-8 string into n-grams of ngramsize symbols and returns the n-grams with minimum and maximum hashes, calculated by the ngramMinHashUTF8 function with the same input. Case sensitive.

string — String. .

ngramsize — The size of an n-gram. Optional. Possible values: any number from 1 to 25. Default value: 3. .

hashnum — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1 to 25. Default value: 6. .

Type: ((), ()).

ngramMinHashArgCaseInsensitiveUTF8

Splits a UTF-8 string into n-grams of ngramsize symbols and returns the n-grams with minimum and maximum hashes, calculated by the ngramMinHashCaseInsensitiveUTF8 function with the same input. Case insensitive.

string — String. .

ngramsize — The size of an n-gram. Optional. Possible values: any number from 1 to 25. Default value: 3. .

hashnum — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1 to 25. Default value: 6. .

Type: ((), ()).

wordShingleMinHash

Can be used for detection of semi-duplicate strings with tupleHammingDistance. For two strings: if one of the returned hashes is the same for both strings, the strings are considered the same.

string — String. .

shinglesize — The size of a word shingle. Optional. Possible values: any number from 1 to 25. Default value: 3. .

hashnum — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1 to 25. Default value: 6. .

Type: (, ).

wordShingleMinHashCaseInsensitive

Can be used for detection of semi-duplicate strings with tupleHammingDistance. For two strings: if one of the returned hashes is the same for both strings, the strings are considered the same.

string — String. .

shinglesize — The size of a word shingle. Optional. Possible values: any number from 1 to 25. Default value: 3. .

hashnum — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1 to 25. Default value: 6. .

Type: (, ).

wordShingleMinHashUTF8

Can be used for detection of semi-duplicate strings with tupleHammingDistance. For two strings: if one of the returned hashes is the same for both strings, the strings are considered the same.

string — String. .

shinglesize — The size of a word shingle. Optional. Possible values: any number from 1 to 25. Default value: 3. .

hashnum — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1 to 25. Default value: 6. .

Type: (, ).

wordShingleMinHashCaseInsensitiveUTF8

Can be used for detection of semi-duplicate strings with tupleHammingDistance. For two strings: if one of the returned hashes is the same for both strings, the strings are considered the same.

string — String. .

shinglesize — The size of a word shingle. Optional. Possible values: any number from 1 to 25. Default value: 3. .

hashnum — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1 to 25. Default value: 6. .

Type: (, ).

wordShingleMinHashArg

Splits an ASCII string into parts (shingles) of shinglesize words each and returns the shingles with minimum and maximum word hashes, calculated by the wordShingleMinHash function with the same input. Case sensitive.

string — String. .

shinglesize — The size of a word shingle. Optional. Possible values: any number from 1 to 25. Default value: 3. .

hashnum — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1 to 25. Default value: 6. .

Type: ((), ()).

wordShingleMinHashArgCaseInsensitive

Splits an ASCII string into parts (shingles) of shinglesize words each and returns the shingles with minimum and maximum word hashes, calculated by the wordShingleMinHashCaseInsensitive function with the same input. Case insensitive.

string — String. .

shinglesize — The size of a word shingle. Optional. Possible values: any number from 1 to 25. Default value: 3. .

hashnum — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1 to 25. Default value: 6. .

Type: ((), ()).

wordShingleMinHashArgUTF8

Splits a UTF-8 string into parts (shingles) of shinglesize words each and returns the shingles with minimum and maximum word hashes, calculated by the wordShingleMinHashUTF8 function with the same input. Case sensitive.

string — String. .

shinglesize — The size of a word shingle. Optional. Possible values: any number from 1 to 25. Default value: 3. .

hashnum — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1 to 25. Default value: 6. .

Type: ((), ()).

wordShingleMinHashArgCaseInsensitiveUTF8

Splits a UTF-8 string into parts (shingles) of shinglesize words each and returns the shingles with minimum and maximum word hashes, calculated by the wordShingleMinHashCaseInsensitiveUTF8 function with the same input. Case insensitive.

string — String. .

shinglesize — The size of a word shingle. Optional. Possible values: any number from 1 to 25. Default value: 3. .

hashnum — The number of minimum and maximum hashes used to calculate the result. Optional. Possible values: any number from 1 to 25. Default value: 6. .

Type: ((), ()).

You can use functions described in this chapter to introspect and for query profiling.

Set the setting to 1.

ClickHouse saves profiler reports to the system table. Make sure the table and profiler are configured properly.

addressToLine

address_of_binary_instruction () — Address of instruction in a running process.

Type: .

The arrayMap function lets you process each individual element of the trace array with the addressToLine function. The result of this processing appears in the trace_source_code_lines column of the output.

addressToLineWithInlines

address_of_binary_instruction () — Address of instruction in a running process.

Type: .

The arrayJoin function splits the array into rows.

addressToSymbol

address_of_binary_instruction () — Address of instruction in a running process.

Type: .

The arrayMap function lets you process each individual element of the trace array with the addressToSymbol function. The result of this processing appears in the trace_symbols column of the output.

demangle

Converts a symbol that you can get using the addressToSymbol function to the C++ function name.

symbol () — Symbol from an object file.

Type: .

The arrayMap function lets you process each individual element of the trace array with the demangle function. The result of this processing appears in the trace_functions column of the output.

tid

Returns the ID of the thread in which the current block is processed.

Current thread id. .

logTrace

Emits a trace log message to the server log for each block.

message — Message that is emitted to server log. .

IPv4NumToString(num)

IPv4StringToNum(s)

IPv4NumToStringClassC(num)

IPv6NumToString(x)

IPv6StringToNum

The reverse function of IPv6NumToString. If the IPv6 address has an invalid format, it throws an exception.

string — IP address. .

Type: .


IPv4ToIPv6(x)

Takes a UInt32 number. Interprets it as an IPv4 address in big-endian byte order. Returns a FixedString(16) value containing the IPv6 address in binary format. Examples:
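As an illustration of what a 16-byte binary result can look like, the Python sketch below builds an IPv4-mapped IPv6 address (the ::ffff:a.b.c.d layout) from a 32-bit number. That the SQL function uses exactly this mapped layout is an assumption of this sketch, and the helper name ipv4_to_ipv6_bytes is hypothetical.

```python
import ipaddress


def ipv4_to_ipv6_bytes(num: int) -> bytes:
    """Build a 16-byte IPv4-mapped IPv6 address from a 32-bit IPv4 number."""
    # 80 zero bits, 16 one bits, then the IPv4 address in big-endian order.
    return b"\x00" * 10 + b"\xff\xff" + num.to_bytes(4, "big")


ipv4_number = int(ipaddress.IPv4Address("192.168.0.1"))
mapped = ipaddress.IPv6Address(ipv4_to_ipv6_bytes(ipv4_number))
```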

cutIPv6(x, bytesToCutForIPv6, bytesToCutForIPv4)

IPv4CIDRToRange(ipv4, Cidr)

Accepts an IPv4 value and a UInt8 value containing the CIDR. Returns a tuple of two IPv4 values containing the lower and the higher bound of the subnet range.
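The subnet range computation can be reproduced with Python's standard ipaddress module. This is a sketch of the concept rather than the function's implementation; the helper name ipv4_cidr_to_range is hypothetical.

```python
import ipaddress


def ipv4_cidr_to_range(address: str, cidr: int):
    """Return the lowest and highest address of the subnet containing address."""
    # strict=False lets the host bits of `address` be non-zero.
    network = ipaddress.ip_network(f"{address}/{cidr}", strict=False)
    return str(network.network_address), str(network.broadcast_address)
```

For example, the address 192.168.5.2 with a 16-bit prefix falls in the range from 192.168.0.0 to 192.168.255.255.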

IPv6CIDRToRange(ipv6, Cidr)

toIPv4(string)

An alias to IPv4StringToNum() that takes the string form of an IPv4 address and returns a value of IPv4 type, which is binary-equal to the value returned by IPv4StringToNum().

toIPv6

Converts the string form of an IPv6 address to IPv6 type. If the IPv6 address has an invalid format, returns an empty value. Similar to the IPv6StringToNum function, which converts an IPv6 address to binary format.

string — IP address.

Type: .

isIPv4String

string — IP address. .

Type: .

isIPv6String

string — IP address. .

Type: .

isIPAddressInRange

Determines whether an IP address is contained in a network represented in CIDR notation. Returns 1 if true, or 0 otherwise.

address — An IPv4 or IPv6 address. .

prefix — An IPv4 or IPv6 network prefix in CIDR. .

Type: .
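The membership check can be sketched with Python's standard ipaddress module. The helper name is_ip_address_in_range is hypothetical, and treating an address and a prefix of different IP versions as "not contained" is a choice made for this sketch.

```python
import ipaddress


def is_ip_address_in_range(address: str, prefix: str) -> int:
    """Return 1 if address lies inside the CIDR prefix, otherwise 0."""
    network = ipaddress.ip_network(prefix, strict=False)
    ip = ipaddress.ip_address(address)
    if ip.version != network.version:
        return 0  # an IPv4 address is never inside an IPv6 network here
    return 1 if ip in network else 0
```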

visitParamHas(params, name)

visitParamExtractUInt(params, name)

visitParamExtractInt(params, name)

visitParamExtractFloat(params, name)

visitParamExtractBool(params, name)

visitParamExtractRaw(params, name)

visitParamExtractString(params, name)

The following functions are based on simdjson and are designed for more complex JSON parsing requirements. Assumption 2 mentioned above still applies.

isValidJSON(json)

JSONHas(json[, indices_or_keys]…)

JSONLength(json[, indices_or_keys]…)

JSONType(json[, indices_or_keys]…)

JSONExtractUInt(json[, indices_or_keys]…)

JSONExtractInt(json[, indices_or_keys]…)

JSONExtractFloat(json[, indices_or_keys]…)

JSONExtractBool(json[, indices_or_keys]…)

JSONExtractString(json[, indices_or_keys]…)

JSONExtract(json[, indices_or_keys…], Return_type)

JSONExtractKeysAndValues(json[, indices_or_keys…], Value_type)

JSONExtractKeys

json — String with valid JSON.

a, b, c... — Comma-separated indices or keys that specify the path to the inner field in a nested JSON object. Each argument can be either a string to get the field by the key or an integer to get the N-th field (indexed from 1, negative integers count from the end). If not set, the whole JSON is parsed as the top-level object. Optional parameter.

Type: ().
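The path rules above (string steps select by key, integer steps select the N-th field counting from 1, negative integers count from the end) can be sketched in Python. The function json_extract_keys is an illustration written for this section, not the engine's code; returning an empty list when the target is not an object and rejecting index 0 are assumptions of the sketch.

```python
import json


def json_extract_keys(json_text: str, *path):
    """Follow a path of keys/indices and list the keys of the object found."""
    node = json.loads(json_text)
    for step in path:
        if isinstance(step, str):
            node = node[step]  # select a field by key
        else:
            if step == 0:
                raise ValueError("indices start at 1 (or -1 from the end)")
            fields = list(node.values()) if isinstance(node, dict) else node
            # Positive indices are 1-based; negative ones count from the end.
            node = fields[step - 1] if step > 0 else fields[step]
    return list(node.keys()) if isinstance(node, dict) else []
```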

JSONExtractRaw(json[, indices_or_keys]…)

JSONExtractArrayRaw(json[, indices_or_keys…])

JSONExtractKeysAndValuesRaw

json — String with valid JSON.

p, a, t, h — Comma-separated indices or keys that specify the path to the inner field in a nested JSON object. Each argument can be either a string to get the field by the key or an integer to get the N-th field (indexed from 1, negative integers count from the end). If not set, the whole JSON is parsed as the top-level object. Optional parameter.

Type: ((, ).

JSON_EXISTS(json, path)

JSON_QUERY(json, path)

JSON_VALUE(json, path)

toJSONString

Serializes a value to its JSON representation. Various data types and nested structures are supported. 64-bit or bigger (like UInt64 or Int128) are enclosed in quotes by default. controls this behavior. Special values NaN and inf are replaced with null. Enable setting to show them. When serializing an value, the function outputs its name.

Type: .

The first example shows serialization of a . The second example shows some special values wrapped into a .

evalMLMethod

stochasticLinearRegression

The aggregate function implements stochastic gradient descent method using linear model and MSE loss function. Uses evalMLMethod to predict on new data.

stochasticLogisticRegression

The aggregate function implements stochastic gradient descent method for binary classification problem. Uses evalMLMethod to predict on new data.

map

Arranges key:value pairs into the Map data type.

key — The key part of the pair. , , , , , , , , .

value — The value part of the pair. Arbitrary type, including and .

Type: .

data type

mapAdd

Arguments are maps or tuples of two arrays, where items in the first array represent keys, and the second array contains values for each key. All key arrays should have the same type, and all value arrays should contain items that can be promoted to one common type. The common promoted type is used as the type of the result array.

Depending on the arguments, returns one map or tuple, where the first array contains the sorted keys and the second array contains values.
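The tuple-of-arrays form can be sketched in Python as follows, assuming (as the function name suggests) that values belonging to equal keys are summed. The helper map_add is hypothetical and ignores type promotion.

```python
def map_add(*maps):
    """Sum values per key across (keys, values) pairs; keys come back sorted."""
    totals = {}
    for keys, values in maps:
        for key, value in zip(keys, values):
            totals[key] = totals.get(key, 0) + value
    ordered_keys = sorted(totals)
    return ordered_keys, [totals[key] for key in ordered_keys]
```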

mapSubtract

Arguments are maps or tuples of two arrays, where items in the first array represent keys, and the second array contains values for each key. All key arrays should have the same type, and all value arrays should contain items that can be promoted to one common type. The common promoted type is used as the type of the result array.

Depending on the arguments, returns one map or tuple, where the first array contains the sorted keys and the second array contains values.

mapPopulateSeries

Arguments are maps or two arrays, where the first array represents keys, and the second array contains values for each key.

keys — Array of keys. ().

values — Array of values. ().

max — Maximum key value. Optional. .

map — Map with integer keys. .

Depending on the arguments, returns a map or a tuple of two arrays: keys in sorted order, and the values of the corresponding keys.

mapContains

map — Map. .

Type: .

mapKeys

Can be optimized by enabling the optimize_functions_to_subcolumns setting. With optimize_functions_to_subcolumns = 1 the function reads only the keys subcolumn instead of reading and processing the whole column data. The query SELECT mapKeys(m) FROM table transforms to SELECT m.keys FROM table.

map — Map. .

Type: .

mapValues

Can be optimized by enabling the optimize_functions_to_subcolumns setting. With optimize_functions_to_subcolumns = 1 the function reads only the values subcolumn instead of reading and processing the whole column data. The query SELECT mapValues(m) FROM table transforms to SELECT m.values FROM table.

map — Map. .

Type: .

e()

exp(x)

log(x), ln(x)

exp2(x)

log2(x)

exp10(x)

log10(x)

sqrt(x)

cbrt(x)

erf(x)

erfc(x)

lgamma(x)

tgamma(x)

sin(x)

cos(x)

tan(x)

asin(x)

acos(x)

atan(x)

pow(x, y), power(x, y)

intExp2

intExp10

cosh(x)

Returns the hyperbolic cosine of x.

x — The angle, in radians. Values from the interval: -∞ < x < +∞. .

Type: .

acosh(x)

Returns the inverse hyperbolic cosine of x.

x — Hyperbolic cosine of angle. Values from the interval: 1 <= x < +∞. .

Type: .

sinh(x)

Returns the hyperbolic sine of x.

x — The angle, in radians. Values from the interval: -∞ < x < +∞. .

Type: .

asinh(x)

Returns the inverse hyperbolic sine of x.

x — Hyperbolic sine of angle. Values from the interval: -∞ < x < +∞. .

Type: .

atanh(x)

Returns the inverse hyperbolic tangent of x.

x — Hyperbolic tangent of angle. Values from the interval: –1 < x < 1. .

Type: .

atan2(y, x)

The atan2 function calculates the angle in the Euclidean plane, given in radians, between the positive x axis and the ray to the point (x, y) ≠ (0, 0).

y — y-coordinate of the point through which the ray passes. .

x — x-coordinate of the point through which the ray passes. .

Type: .

hypot(x, y)

Calculates the length of the hypotenuse of a right-angle triangle. The function avoids problems that occur when squaring very large or very small numbers.

x — The first cathetus of a right-angle triangle. .

y — The second cathetus of a right-angle triangle. .

Type: .
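The overflow and underflow problem mentioned above is easy to reproduce in Python, whose math.hypot behaves analogously. The helper naive_hypot below is a deliberately naive formula written for comparison; it is not part of any library.

```python
import math


def naive_hypot(x: float, y: float) -> float:
    """Textbook formula; squaring can overflow to inf or underflow to 0."""
    return math.sqrt(x * x + y * y)


large_naive = naive_hypot(1e200, 1e200)   # x * x exceeds the float range
large_safe = math.hypot(1e200, 1e200)     # stays finite
small_naive = naive_hypot(1e-200, 1e-200)  # x * x underflows to zero
small_safe = math.hypot(1e-200, 1e-200)    # keeps a non-zero result
```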

log1p(x)

Calculates log(1+x). The log1p(x) is more accurate than log(1+x) for small values of x.

x — Values from the interval: -1 < x < +∞. .

Type: .
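The accuracy difference for small x can be demonstrated with Python's math module, which offers the same pair of functions. The variable names are illustrative only.

```python
import math

x = 1e-18

# 1 + x rounds to exactly 1.0 in double precision, so the information in x is lost.
via_log = math.log(1.0 + x)

# log1p works on x directly and keeps its magnitude.
via_log1p = math.log1p(x)
```

Here via_log is exactly zero, while via_log1p stays close to x, which is the mathematically expected result for such a small argument.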

sign(x)

isNull

Checks whether the argument is NULL.

isNotNull

Checks whether the argument is not NULL.

coalesce

ifNull

nullIf

assumeNotNull

Results in an equivalent non-Nullable value for a Nullable type. If the original value is NULL, the result is undetermined. See also the ifNull and coalesce functions.

toNullable

hostName()

getMacro

Gets a named value from the macros section of the server configuration.

name — Name to retrieve from the macros section. .

Type: .

FQDN

basename

expr — Expression resulting in a String type value. All the backslashes must be escaped in the resulting value.

visibleWidth(x)

toTypeName(x)

blockSize()

byteSize

Type: .

For String arguments the function returns the string length + 9 (terminating zero + length).

materialize(x)

ignore(…)

sleep(seconds)

sleepEachRow(seconds)

currentDatabase()

currentUser()

isConstant

A constant expression is an expression whose resulting value is known at query analysis (i.e. before execution). For example, expressions over literals are constant expressions.

Type: .

isFinite(x)

isInfinite(x)

ifNotFinite

x — Value to be checked for infinity. Type: .

y — Fallback value. Type: .

You can get a similar result by using the ternary operator: isFinite(x) ? x : y.

isNaN(x)

hasColumnInTable([‘hostname’[, ‘username’[, ‘password’]],] ‘database’, ‘table’, ‘column’)

bar

transform

transform(x, array_from, array_to, default)

transform(x, array_from, array_to)

formatReadableDecimalSize(x)

formatReadableSize(x)

formatReadableQuantity(x)

formatReadableTimeDelta

least(a, b)

greatest(a, b)

uptime()

version()

blockNumber

rowNumberInBlock

rowNumberInAllBlocks()

neighbor

The row order used during the calculation of neighbor can differ from the order of rows returned to the user. To prevent that, you can make a subquery with ORDER BY and call the function from outside the subquery.

offset — The number of rows forwards or backwards from the current row of column. .

runningDifference(x)

The row order used during the calculation of runningDifference can differ from the order of rows returned to the user. To prevent that, you can make a subquery with ORDER BY and call the function from outside the subquery.

runningDifferenceStartingWithFirstValue

Same as runningDifference, except that the first row returns the value of the first row itself, and each subsequent row returns the difference from the previous row.
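The difference between the two variants can be sketched in Python over a plain list. The helper running_difference is hypothetical; it assumes the plain variant yields 0 for the first row and it ignores the fact that the real functions operate per data block.

```python
def running_difference(values, start_with_first_value=False):
    """Difference from the previous element; the first element is special."""
    result = []
    previous = None
    for value in values:
        if previous is None:
            # First row: either the value itself or 0, depending on the variant.
            result.append(value if start_with_first_value else 0)
        else:
            result.append(value - previous)
        previous = value
    return result
```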

runningConcurrency

start — A column with the start time of events. , , or .

end — A column with the end time of events. , , or .

Type:

MACNumToString(num)

MACStringToNum(s)

MACStringToOUI(s)

getSizeOfEnumType

Returns the number of fields in an Enum.

blockSerializedSize

toColumnTypeName

dumpColumnStructure

defaultValueOfArgumentType

ᴺᵁᴸᴸ for Nullable.

defaultValueOfTypeName

ᴺᵁᴸᴸ for Nullable.

indexHint

Type: .

Here is the example of test data from the table .

replicate

Used for internal implementation of .

filesystemAvailable

Returns the amount of remaining space on the filesystem where the database files are located. It is always smaller than the total free space (filesystemFree) because some space is reserved for the OS.

Type: .

filesystemFree

Type: .

filesystemCapacity

Returns the capacity of the filesystem in bytes. For evaluation, the path to the data directory must be configured.

Type: .

initializeAggregation

Calculates the result of an aggregate function based on a single value. This function is intended for initializing aggregate functions with the -State combinator. You can create states of aggregate functions and insert them into columns of type AggregateFunction, or use initialized aggregates as default values.

aggregate_function — Name of the aggregation function to initialize. .

finalizeAggregation

Takes the state of an aggregate function. Returns the result of aggregation (or the finalized state when using the -State combinator).

state — State of aggregation. .

runningAccumulate

agg_state — State of the aggregate function. .

grouping — Grouping key. Optional. The state of the function is reset if the grouping value is changed. It can be any of the for which the equality operator is defined.

The subquery generates sumState for every number from 0 to 9. sumState returns the state of the function that contains the sum of a single number.

joinGet

The function lets you extract data from the table the same way as from a dictionary.

Gets data from tables using the specified join key.

join_storage_table_name — an identifier that indicates where the search is performed. The identifier is searched in the default database (see the default_database parameter in the config file). To override the default database, use USE db_name or specify the database and the table through the separator db_name.db_table; see the example.

If a certain key does not exist in the source table, then 0 or null is returned, depending on the join_use_nulls setting.

More info about join_use_nulls in .

catboostEvaluate(path_to_model, feature_1, feature_2, …, feature_n)

Evaluates an external CatBoost model. CatBoost is an open-source gradient boosting library developed by Yandex for machine learning. Accepts a path to a CatBoost model and model arguments (features). Returns Float64.

Before evaluating catboost models, the libcatboostmodel.<so|dylib> library must be made available. See how to compile it.

See for how to train catboost models from a training data set.

throwIf(x[, message[, error_code]])

identity

getSetting

Returns the current value of a custom setting.

custom_setting — The setting name. String.

isDecimalOverflow

Checks whether the value is out of its (or specified) precision.

d — Value. Decimal.

p — Precision. Optional. If omitted, the initial precision of the first argument is used. Using this parameter can be helpful when extracting data to another DBMS or file. UInt8.

countDigits

x — Int or Decimal value.

Type: UInt8.

For Decimal values, the function takes their scales into account: it calculates the result over the underlying integer type, which is (value * scale). For example: countDigits(42) = 2, countDigits(42.000) = 5, countDigits(0.04200) = 4. That is, you can check decimal overflow for Decimal64 with countDigits(x) > 18. It is a slow variant of isDecimalOverflow.
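The three examples above follow from counting the digits of the underlying integer. Here is a minimal Python sketch of that rule, assuming the scale is passed explicitly (in ClickHouse it comes from the Decimal type); the helper name `count_digits` is hypothetical.

```python
from decimal import Decimal


def count_digits(value: str, scale: int = 0) -> int:
    """Count decimal digits of the integer that stores `value` at the given scale."""
    # A Decimal with scale S is stored as the integer value * 10**S.
    underlying = int(Decimal(value).scaleb(scale))
    return len(str(abs(underlying)))


print(count_digits("42"))          # integer input, scale 0
print(count_digits("42.000", 3))   # stored as 42000
print(count_digits("0.04200", 5))  # stored as 4200
```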

errorCodeToName

Type: LowCardinality(String).

tcpPort

Returns the native interface TCP port number listened on by this server. If it is executed in the context of a distributed table, it generates a normal column; otherwise it produces a constant value.

Type: UInt16.

rand, rand32

rand64

randCanonical

randConstant

x — Expression resulting in any of the supported data types. The resulting value is discarded, but the expression itself is used for bypassing common subexpression elimination if the function is called multiple times in one query. Optional parameter.

Type: UInt32.

randomString

randomFixedString

randomPrintableASCII

randomStringUTF8

fuzzBits

Functions for searching and other manipulations with strings are described separately.

replaceOne(haystack, pattern, replacement)

replaceAll(haystack, pattern, replacement), replace(haystack, pattern, replacement)

replaceRegexpOne(haystack, pattern, replacement)

Replaces the first occurrence of the substring matching the regular expression ‘pattern’ in ‘haystack’ with the ‘replacement’ string. ‘pattern’ must be a constant re2 regular expression. ‘replacement’ must be a plain constant string or a constant string containing substitutions \0-\9. Substitutions \1-\9 correspond to the 1st to 9th capturing group (submatch); substitution \0 corresponds to the entire match. To use a verbatim \ character in the ‘pattern’ or ‘replacement’ string, escape it using \. Also keep in mind that string literals require extra escaping.
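The substitution rules can be tried out with Python's `re` module. This is a rough sketch only: Python's regex engine is not re2, the helper `replace_regexp_one` is hypothetical, and the translation of \0 is naive (it does not handle an escaped backslash followed by 0).

```python
import re


def replace_regexp_one(haystack: str, pattern: str, replacement: str) -> str:
    """Replace only the first match, supporting \\0 (whole match) and \\1-\\9 (groups)."""
    # Python spells the whole-match reference \g<0>; \1-\9 work unchanged.
    py_replacement = replacement.replace("\\0", "\\g<0>")
    return re.sub(pattern, py_replacement, haystack, count=1)


# Reorder the capturing groups of the first date only.
print(replace_regexp_one("2014-03-17 and 2015-01-01",
                         r"(\d{4})-(\d{2})-(\d{2})", r"\2/\3/\1"))
# Wrap the entire first match in brackets via \0.
print(replace_regexp_one("Hello", "l+", r"[\0]"))
```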

replaceRegexpAll(haystack, pattern, replacement)

regexpQuoteMeta(s)

The function adds a backslash before some predefined characters in the string. Predefined characters: \0, \\, |, (, ), ^, $, ., [, ], ?, *, +, {, :, -. This implementation slightly differs from re2::RE2::QuoteMeta. It escapes the zero byte as \0 instead of \x00, and it escapes only the required characters. For more information, see RE2.
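The escaping rule can be restated as a short Python sketch built directly from the character list above. The helper name `regexp_quote_meta` and the `PREDEFINED` constant are hypothetical; this models the description rather than the actual ClickHouse code.

```python
# Characters from the list above, except the zero byte, which is handled separately.
PREDEFINED = set("\\|()^$.[]?*+{:-")


def regexp_quote_meta(s: str) -> str:
    """Put a backslash before each predefined character; write the zero byte as \\0."""
    out = []
    for ch in s:
        if ch == "\0":
            out.append("\\0")      # backslash followed by the digit 0, not \x00
        elif ch in PREDEFINED:
            out.append("\\" + ch)
        else:
            out.append(ch)         # everything else is left untouched
    return "".join(out)


print(regexp_quote_meta("a.b*c"))
print(regexp_quote_meta("price: $5 (approx)"))
```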

floor(x[, N])

ceil(x[, N]), ceiling(x[, N])

trunc(x[, N]), truncate(x[, N])

round(x[, N])

expression — A number to be rounded. Can be any expression returning the numeric data type.

Examples

roundBankers

expression — A number to be rounded. Can be any expression returning the numeric data type.

Examples

roundToExp2(num)

roundDuration(num)

roundAge(num)

roundDown(num, arr)

Functions for replacing and other manipulations with strings are described separately.

position(haystack, needle), locate(haystack, needle)

For a case-insensitive search, use the function positionCaseInsensitive.

haystack — String in which the substring is searched. String.

needle — Substring to be searched. String.

start_pos — Position of the first character in the string to start the search. UInt. Optional.

The same phrase in Russian contains characters that cannot be represented using a single byte. The function returns an unexpected result (use the positionUTF8 function for multi-byte encoded text):
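The difference between byte positions and code-point positions can be reproduced in Python. This is an illustrative sketch only; the helper names `position_bytes` and `position_utf8` are hypothetical stand-ins for the two behaviors.

```python
def position_bytes(haystack: str, needle: str) -> int:
    """1-based position counted in bytes of the UTF-8 encoding (0 if not found)."""
    return haystack.encode("utf-8").find(needle.encode("utf-8")) + 1


def position_utf8(haystack: str, needle: str) -> int:
    """1-based position counted in Unicode code points (0 if not found)."""
    return haystack.find(needle) + 1


phrase = "Привет, мир!"
# Each Cyrillic letter takes two bytes, so the byte position is much larger.
print(position_bytes(phrase, "!"))
print(position_utf8(phrase, "!"))
# For ASCII-only text both functions agree.
print(position_bytes("Hello, world!", "!"), position_utf8("Hello, world!", "!"))
```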

positionCaseInsensitive

The same as position: returns the position (in bytes) of the found substring in the string, starting from 1. Use this function for a case-insensitive search.

haystack — String in which the substring is searched. String.

needle — Substring to be searched. String.

start_pos — Optional parameter, position of the first character in the string to start the search. UInt.

positionUTF8

For a case-insensitive search, use the function positionCaseInsensitiveUTF8.

haystack — String in which the substring is searched. String.

needle — Substring to be searched. String.

start_pos — Optional parameter, position of the first character in the string to start search.

positionCaseInsensitiveUTF8

The same as positionUTF8, but case-insensitive. Returns the position (in Unicode points) of the found substring in the string, starting from 1.

haystack — String in which the substring is searched. String.

needle — Substring to be searched. String.

start_pos — Optional parameter, position of the first character in the string to start search.

multiSearchAllPositions

The same as position, but returns an Array of positions (in bytes) of the found corresponding substrings in the string. Positions are indexed starting from 1.

For search in UTF-8, use the function multiSearchAllPositionsUTF8.

haystack — String in which the substring is searched. String.

needle — Substring to be searched. String.

multiSearchAllPositionsUTF8

multiSearchFirstPosition(haystack, [needle1, needle2, …, needlen])

multiSearchFirstIndex(haystack, [needle1, needle2, …, needlen])

multiSearchAny(haystack, [needle1, needle2, …, needlen])

match(haystack, pattern)

Checks whether the string matches the regular expression pattern in re2 syntax. Re2 has a more limited syntax than Perl regular expressions.

Matching is based on UTF-8, e.g. . matches the Unicode code point ¥ which is represented in UTF-8 using two bytes. The regular expression must not contain null bytes. If the haystack or pattern contains a sequence of bytes that is not valid UTF-8, the behavior is undefined. No automatic Unicode normalization is performed; if you need it, you can use the normalizeUTF8*() functions for that.

multiMatchAny(haystack, [pattern1, pattern2, …, patternn])

The same as match, but returns 0 if none of the regular expressions match and 1 if any of the patterns matches. It uses the hyperscan library. For patterns that search for substrings in a string, it is better to use multiSearchAny since it works much faster.

multiMatchAnyIndex(haystack, [pattern1, pattern2, …, patternn])

multiMatchAllIndices(haystack, [pattern1, pattern2, …, patternn])

multiFuzzyMatchAny(haystack, distance, [pattern1, pattern2, …, patternn])

The same as multiMatchAny, but returns 1 if any pattern matches the haystack within a constant edit distance. This function relies on the experimental feature of the hyperscan library, and can be slow for some corner cases. The performance depends on the edit distance value and the patterns used, but it is always more expensive than the non-fuzzy variants.

multiFuzzyMatchAnyIndex(haystack, distance, [pattern1, pattern2, …, patternn])

multiFuzzyMatchAllIndices(haystack, distance, [pattern1, pattern2, …, patternn])

extract(haystack, pattern)

extractAll(haystack, pattern)

extractAllGroupsHorizontal

The extractAllGroupsHorizontal function is slower than extractAllGroupsVertical.

haystack — Input string. Type: String.

pattern — Regular expression with re2 syntax. Must contain groups, each group enclosed in parentheses. If pattern contains no groups, an exception is thrown. Type: String.

Type: Array.

extractAllGroupsVertical

haystack — Input string. Type: String.

pattern — Regular expression with re2 syntax. Must contain groups, each group enclosed in parentheses. If pattern contains no groups, an exception is thrown. Type: String.

Type: Array.

like(haystack, pattern), haystack LIKE pattern operator

Matching is based on UTF-8, e.g. _ matches the Unicode code point ¥ which is represented in UTF-8 using two bytes. If the haystack or pattern contains a sequence of bytes that is not valid UTF-8, the behavior is undefined. No automatic Unicode normalization is performed; if you need it, you can use the normalizeUTF8*() functions for that.

notLike(haystack, pattern), haystack NOT LIKE pattern operator

ilike

Case-insensitive variant of the like function. You can use the ILIKE operator instead of the ilike function.

haystack — Input string. String.

ngramDistance(haystack, needle)

ngramSearch(haystack, needle)

countSubstrings

For a case-insensitive search, use the countSubstringsCaseInsensitive or countSubstringsCaseInsensitiveUTF8 functions.

haystack — The string to search in. String.

needle — The substring to search for. String.

start_pos — Position of the first character in the string to start the search. Optional. UInt.

Type: UInt64.

countSubstringsCaseInsensitive

haystack — The string to search in. String.

needle — The substring to search for. String.

start_pos — Position of the first character in the string to start the search. Optional. UInt.

Type: UInt64.

countSubstringsCaseInsensitiveUTF8

haystack — The string to search in. String.

needle — The substring to search for. String.

start_pos — Position of the first character in the string to start the search. Optional. UInt.

Type: UInt64.

countMatches(haystack, pattern)

haystack — The string to search in. String.

pattern — The regular expression with re2 syntax. String.

Type: UInt64.

splitByChar(separator, s[, max_substrings])

separator — The separator, which must contain exactly one character. String.

s — The string to split. String.

Type: Array(String).

splitByString(separator, s[, max_substrings])

separator — The separator. String.

s — The string to split. String.

Type: Array(String).

arrayStringConcat(arr[, separator])

alphaTokens(s[, max_substrings]), splitByAlpha(s[, max_substrings])

s — The string to split. String.

Type: Array(String).

extractAllGroups(text, regexp)

text — String or FixedString.

regexp — Regular expression. Constant. String or FixedString.

Type: Array.

Functions for searching and replacing in strings are described separately.

empty

The function also works for arrays or UUID.

x — Input value. String.

Type: UInt8.

notEmpty

The function also works for arrays or UUID.

x — Input value. String.

Type: UInt8.

length

lengthUTF8

char_length, CHAR_LENGTH

character_length, CHARACTER_LENGTH

leftPad

string — Input string that needs to be padded. String.

length — The length of the resulting string. UInt. If the value is less than the input string length, then the input string is returned as-is.

pad_string — The string to pad the input string with. String. Optional. If not specified, then the input string is padded with spaces.

Type: String.

leftPadUTF8

Pads the current string from the left with spaces or a specified string (multiple times, if needed) until the resulting string reaches the given length. Similar to the MySQL LPAD function. While in the leftPad function the length is measured in bytes, in the leftPadUTF8 function it is measured in code points.

string — Input string that needs to be padded. String.

length — The length of the resulting string. UInt. If the value is less than the input string length, then the input string is returned as-is.

pad_string — The string to pad the input string with. String. Optional. If not specified, then the input string is padded with spaces.

Type: String.
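The bytes-versus-code-points distinction can be made concrete with a small Python sketch. Both helpers are hypothetical models of the behavior described here (including returning the input as-is when it is already long enough); they assume a non-empty pad string and are not ClickHouse's implementation.

```python
def left_pad_utf8(string: str, length: int, pad_string: str = " ") -> str:
    """Pad on the left until the result has `length` code points."""
    missing = length - len(string)
    if missing <= 0:
        return string                      # already long enough: returned as-is
    repeats = -(-missing // len(pad_string))  # ceiling division
    return (pad_string * repeats)[:missing] + string


def left_pad(string: str, length: int, pad_string: str = " ") -> bytes:
    """Pad on the left until the result has `length` bytes (UTF-8 encoded)."""
    data, pad = string.encode("utf-8"), pad_string.encode("utf-8")
    missing = length - len(data)
    if missing <= 0:
        return data
    repeats = -(-missing // len(pad))
    return (pad * repeats)[:missing] + data


# Four Cyrillic letters are 4 code points but 8 bytes.
print(left_pad_utf8("абвг", 7, "*"))
print(left_pad("абвг", 7, "*").decode("utf-8"))
```

The code-point variant adds three asterisks, while the byte variant adds none because the input already occupies eight bytes.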

rightPad

string — Input string that needs to be padded. String.

length — The length of the resulting string. UInt. If the value is less than the input string length, then the input string is returned as-is.

pad_string — The string to pad the input string with. String. Optional. If not specified, then the input string is padded with spaces.

Type: String.

rightPadUTF8

Pads the current string from the right with spaces or a specified string (multiple times, if needed) until the resulting string reaches the given length. Similar to the MySQL RPAD function. While in the rightPad function the length is measured in bytes, in the rightPadUTF8 function it is measured in code points.

string — Input string that needs to be padded. String.

length — The length of the resulting string. UInt. If the value is less than the input string length, then the input string is returned as-is.

pad_string — The string to pad the input string with. String. Optional. If not specified, then the input string is padded with spaces.

Type: String.

lower, lcase

upper, ucase

lowerUTF8

upperUTF8

isValidUTF8

toValidUTF8

input_string — Any set of bytes represented as the String data type object.

repeat

s — The string to repeat. String.

n — The number of times to repeat the string. UInt.

reverse

reverseUTF8

format(pattern, s0, s1, …)

concat

concatAssumeInjective

Same as concat; the difference is that you need to ensure that concat(s1, s2, ...) → sn is injective. It is used for optimization of GROUP BY.

substring(s, offset, length), mid(s, offset, length), substr(s, offset, length)

substringUTF8(s, offset, length)

appendTrailingCharIfAbsent(s, c)

convertCharset(s, from, to)

base58Encode(plaintext)

Accepts a String and encodes it using the Base58 encoding scheme with the "Bitcoin" alphabet.

plaintext — String column or constant.

Type: String.

base64Encode(s)

base64Decode(s)

tryBase64Decode(s)

endsWith(s, suffix)

startsWith(str, prefix)

trim

trim_character — The characters to trim. String.

input_string — The string to trim. String.

trimLeft

input_string — The string to trim. String.

trimRight

input_string — The string to trim. String.

trimBoth

input_string — The string to trim. String.

CRC32(s)

CRC32IEEE(s)

CRC64(s)

normalizeQuery

x — Sequence of characters. String.

Type: String.

normalizedQueryHash

x — Sequence of characters. String.

Type: UInt64.

normalizeUTF8NFC

Converts a string to NFC normalized form, assuming the string contains a set of bytes that make up a UTF-8 encoded text.

words — Input string that contains UTF-8 encoded text. String.

Type: String.

normalizeUTF8NFD

Converts a string to NFD normalized form, assuming the string contains a set of bytes that make up a UTF-8 encoded text.

words — Input string that contains UTF-8 encoded text. String.

Type: String.

normalizeUTF8NFKC

Converts a string to NFKC normalized form, assuming the string contains a set of bytes that make up a UTF-8 encoded text.

words — Input string that contains UTF-8 encoded text. String.

Type: String.

normalizeUTF8NFKD

Converts a string to NFKD normalized form, assuming the string contains a set of bytes that make up a UTF-8 encoded text.

words — Input string that contains UTF-8 encoded text. String.

Type: String.

encodeXMLComponent

x — The sequence of characters. String.

Type: String.

decodeXMLComponent

x — A sequence of characters. String.

Type: String.

extractTextFromHTML

x — Input text. String.

Type: String.

The first example contains several tags and a comment and also shows whitespace processing. The second example shows CDATA and script tag processing. In the third example, text is extracted from the full HTML response received by the url function.

tuple

tupleElement

untuple

Performs syntactic substitution of tuple elements in the call location.

x — A tuple function, column, or tuple of elements. Tuple.

tupleHammingDistance

Returns the Hamming Distance between two tuples of the same size.

tuple1 — First tuple. Tuple.

tuple2 — Second tuple. Tuple.

Type: The result type is calculated the same way it is for Arithmetic functions, based on the number of elements in the input tuples.

Can be used with MinHash functions for detection of semi-duplicate strings:
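For reference, the Hamming distance between two equal-length tuples is the number of positions at which they differ. A minimal Python sketch (the function name `tuple_hamming_distance` is hypothetical and mirrors the description, not the ClickHouse source):

```python
def tuple_hamming_distance(tuple1: tuple, tuple2: tuple) -> int:
    """Count the positions at which two tuples of the same size differ."""
    if len(tuple1) != len(tuple2):
        raise ValueError("tuples must have the same size")
    return sum(1 for a, b in zip(tuple1, tuple2) if a != b)


print(tuple_hamming_distance((1, 2, 3), (3, 2, 1)))  # first and last elements differ
print(tuple_hamming_distance((1, 2, 3), (1, 2, 3)))  # identical tuples
```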

Common Issues of Numeric Conversions

ClickHouse has the same behavior as C++ programs.

toInt(8|16|32|64|128|256)

Converts an input value to the Int data type. This function family includes toInt8, toInt16, toInt32, toInt64, toInt128, and toInt256.

expr — Expression returning a number or a string with the decimal representation of a number. Binary, octal, and hexadecimal representations of numbers are not supported. Leading zeroes are stripped.

Functions use rounding towards zero, meaning they truncate fractional digits of numbers.

The behavior of functions for the NaN and Inf arguments is undefined. Keep the numeric conversion issues in mind when using the functions.
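Rounding towards zero and the decimal-only string rule can be illustrated with a short Python sketch. The helper `to_int_like` is hypothetical and ignores integer width and overflow; it only models the truncation and parsing behavior described above.

```python
import math


def to_int_like(expr):
    """Truncate numbers towards zero; parse strings as decimal integers only."""
    if isinstance(expr, str):
        # Base 10 only: hexadecimal, octal, and binary prefixes are rejected.
        return int(expr, 10)
    return math.trunc(expr)  # drops the fractional digits


print(to_int_like(8.8))     # fractional part removed
print(to_int_like(-8.8))    # truncation, not floor
print(to_int_like("0042"))  # leading zeroes are stripped
```

Note that truncation differs from floor for negative inputs, which is why -8.8 does not become -9.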

toInt(8|16|32|64|128|256)OrZero

toInt(8|16|32|64|128|256)OrNull

toInt(8|16|32|64|128|256)OrDefault

toUInt(8|16|32|64|256)

Converts an input value to the UInt data type. This function family includes toUInt8, toUInt16, toUInt32, toUInt64, and toUInt256.

expr — Expression returning a number or a string with the decimal representation of a number. Binary, octal, and hexadecimal representations of numbers are not supported. Leading zeroes are stripped.

Functions use rounding towards zero, meaning they truncate fractional digits of numbers.

The behavior of functions for negative arguments and for the NaN and Inf arguments is undefined. If you pass a string with a negative number, for example '-32', ClickHouse raises an exception. Keep the numeric conversion issues in mind when using the functions.

toUInt(8|16|32|64|256)OrZero

toUInt(8|16|32|64|256)OrNull

toUInt(8|16|32|64|256)OrDefault

toFloat(32|64)

toFloat(32|64)OrZero

toFloat(32|64)OrNull

toFloat(32|64)OrDefault

toDate

toDateOrZero

toDateOrNull

toDateOrDefault

toDateTime

toDateTimeOrZero

toDateTimeOrNull

toDateTimeOrDefault

toDate32

Converts the argument to the Date32 data type. If the value is outside the range, toDate32 returns the border values supported by Date32. If the argument has Date type, the borders of Date are taken into account.

expr — The value. String, UInt32, or Date.

Type: Date32.

toDate32OrZero

The same as toDate32, but returns the min value of Date32 if an invalid argument is received.

toDate32OrNull

The same as toDate32, but returns NULL if an invalid argument is received.

toDate32OrDefault

Converts the argument to the Date32 data type. If the value is outside the range, toDate32OrDefault returns the lower border value supported by Date32. If the argument has Date type, the borders of Date are taken into account. Returns the default value if an invalid argument is received.

toDateTime64

Converts the argument to the DateTime64 data type.

expr — The value. String, UInt32, Float, or DateTime.

Type: DateTime64.

toDecimal(32|64|128|256)

Converts value to the Decimal data type with a scale of S. The value can be a number or a string. The S (scale) parameter specifies the number of decimal places.

toDecimal(32|64|128|256)OrNull

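Converts an input string to a Nullable(Decimal(P,S)) data type value. This family of functions includes toDecimal32OrNull, toDecimal64OrNull, toDecimal128OrNull, and toDecimal256OrNull.

expr — Expression, returns a value in the String data type. ClickHouse expects the textual representation of the decimal number. For example, '1.111'.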

toDecimal(32|64|128|256)OrDefault

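Converts an input string to a Decimal(P,S) data type value. This family of functions includes toDecimal32OrDefault, toDecimal64OrDefault, toDecimal128OrDefault, and toDecimal256OrDefault.

expr — Expression, returns a value in the String data type. ClickHouse expects the textual representation of the decimal number. For example, '1.111'.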

toDecimal(32|64|128|256)OrZero

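Converts an input value to the Decimal(P,S) data type. This family of functions includes toDecimal32OrZero, toDecimal64OrZero, toDecimal128OrZero, and toDecimal256OrZero.

expr — Expression, returns a value in the String data type. ClickHouse expects the textual representation of the decimal number. For example, '1.111'.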

toString

toFixedString(s, N)

toStringCutToZero(s)

reinterpretAsUInt(8|16|32|64)

reinterpretAsInt(8|16|32|64)

reinterpretAsFloat(32|64)

reinterpretAsDate

reinterpretAsDateTime

reinterpretAsString

reinterpretAsFixedString

reinterpretAsUUID

fixed_string — Big-endian byte string. FixedString.

The UUID type value. UUID.

reinterpret(x, T)

type — Destination type. String.

CAST(x, T)

Converts an input value to the specified data type. Unlike the reinterpret function, CAST tries to present the same value using the new data type. If the conversion cannot be done, an exception is raised. Several syntax variants are supported.

T — The name of the target data type. String.

Conversion to FixedString(N) only works for arguments of type String or FixedString.

Type conversion to Nullable and back is supported.

See also the cast_keep_nullable setting.

accurateCast(x, T)

The difference from cast(x, T) is that accurateCast does not allow overflow of numeric types during the cast if the value x does not fit the bounds of type T. For example, accurateCast(-1, 'UInt8') throws an exception.

accurateCastOrNull(x, T)

Converts input value x to the specified data type T. Always returns a Nullable type and returns NULL if the cast value is not representable in the target type.
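The three cast flavors differ only in how they treat a value that does not fit the target type. The Python sketch below models this for UInt8; all three helper names are hypothetical, and the wrap-around shown for the plain cast is an illustration of integer overflow rather than a statement about every type pair.

```python
UINT8_MIN, UINT8_MAX = 0, 255


def cast_uint8(x: int) -> int:
    """Plain cast: out-of-range values wrap around (overflow is allowed)."""
    return x % 256


def accurate_cast_uint8(x: int) -> int:
    """Accurate cast: refuse values outside the bounds of the target type."""
    if not UINT8_MIN <= x <= UINT8_MAX:
        raise OverflowError(f"{x} does not fit in UInt8")
    return x


def accurate_cast_or_null_uint8(x: int):
    """Accurate cast that yields None (NULL) instead of raising."""
    try:
        return accurate_cast_uint8(x)
    except OverflowError:
        return None


print(cast_uint8(-1))                   # wraps around silently
print(accurate_cast_or_null_uint8(-1))  # not representable
print(accurate_cast_or_null_uint8(200))
```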

accurateCastOrDefault(x, T[, default_value])

toInterval(Year|Quarter|Month|Week|Day|Hour|Minute|Second)

Converts a Number type argument to an Interval data type.

parseDateTimeBestEffort

parseDateTime32BestEffort

Converts a date and time in the String representation to the DateTime data type.

The function parses ISO 8601, RFC 1123 - 5.2.14 RFC-822 Date and Time Specification, ClickHouse’s and some other date and time formats.

time_string — String containing a date and time to convert. String.

time_zone — Time zone. The function parses time_string according to the time zone. String.

A string containing a 9..10 digit unix timestamp.

parseDateTimeBestEffortUS

This function behaves like parseDateTimeBestEffort for ISO date formats, e.g. YYYY-MM-DD hh:mm:ss, and other date formats where the month and date components can be unambiguously extracted, e.g. YYYYMMDDhhmmss, YYYY-MM, DD hh, or YYYY-MM-DD hh:mm:ss ±h:mm. If the month and the date components cannot be unambiguously extracted, e.g. MM/DD/YYYY, MM-DD-YYYY, or MM-DD-YY, it prefers the US date format instead of DD/MM/YYYY, DD-MM-YYYY, or DD-MM-YY. As an exception to the latter, if the month is greater than 12 and less than or equal to 31, this function falls back to the behavior of parseDateTimeBestEffort, e.g. 15/08/2020 is parsed as 2020-08-15.
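The ambiguity rule for slash-separated dates can be sketched in Python. This models only the MM/DD/YYYY case described above; the helper `parse_slash_date_us` is hypothetical and is not how ClickHouse parses dates internally.

```python
from datetime import date


def parse_slash_date_us(text: str) -> date:
    """Parse 'A/B/YYYY', preferring the US reading (A = month, B = day)."""
    first, second, year = (int(part) for part in text.split("/"))
    if 12 < first <= 31:
        # The first component cannot be a month, so fall back to DD/MM/YYYY.
        return date(year, second, first)
    return date(year, first, second)


print(parse_slash_date_us("02/10/2021"))  # ambiguous: read as February 10
print(parse_slash_date_us("15/08/2020"))  # 15 is not a month: read as August 15
```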

parseDateTimeBestEffortOrNull

parseDateTime32BestEffortOrNull

Same as parseDateTimeBestEffort, except that it returns NULL when it encounters a date format that cannot be processed.

parseDateTimeBestEffortOrZero

parseDateTime32BestEffortOrZero

Same as parseDateTimeBestEffort, except that it returns a zero date or zero date time when it encounters a date format that cannot be processed.

parseDateTimeBestEffortUSOrNull

Same as the parseDateTimeBestEffortUS function, except that it returns NULL when it encounters a date format that cannot be processed.

parseDateTimeBestEffortUSOrZero

Same as the parseDateTimeBestEffortUS function, except that it returns a zero date (1970-01-01) or a zero date with time (1970-01-01 00:00:00) when it encounters a date format that cannot be processed.

parseDateTime64BestEffort

Same as the parseDateTimeBestEffort function, but also parses milliseconds and microseconds and returns the DateTime64 data type.

time_string — String containing a date or date with time to convert. String.

precision — Required precision. 3 — for milliseconds, 6 — for microseconds. Default — 3. Optional. UInt8.

time_zone — Timezone. The function parses time_string according to the timezone. Optional. String.

time_string converted to the DateTime64 data type.

parseDateTime64BestEffortUS

Same as parseDateTime64BestEffort, except that this function prefers the US date format (MM/DD/YYYY etc.) in case of ambiguity.

parseDateTime64BestEffortOrNull

Same as parseDateTime64BestEffort, except that it returns NULL when it encounters a date format that cannot be processed.

parseDateTime64BestEffortOrZero

Same as parseDateTime64BestEffort, except that it returns a zero date or zero date time when it encounters a date format that cannot be processed.

parseDateTime64BestEffortUSOrNull

Same as parseDateTime64BestEffort, except that this function prefers the US date format (MM/DD/YYYY etc.) in case of ambiguity and returns NULL when it encounters a date format that cannot be processed.

parseDateTime64BestEffortUSOrZero

Same as parseDateTime64BestEffort, except that this function prefers the US date format (MM/DD/YYYY etc.) in case of ambiguity and returns a zero date or zero date time when it encounters a date format that cannot be processed.

toLowCardinality

Converts the input parameter to the LowCardinality version of the same data type.

To convert data from the LowCardinality data type, use the CAST function. For example, CAST(x as String).

expr — Expression resulting in one of the supported data types.

toUnixTimestamp64Milli

toUnixTimestamp64Micro

toUnixTimestamp64Nano

fromUnixTimestamp64Milli

fromUnixTimestamp64Micro

fromUnixTimestamp64Nano

formatRow

format — Text format. For example, CSV, TSV.

formatRowNoNewline

format — Text format. For example, CSV, TSV.

protocol

domain

url — URL. Type: String.

domainWithoutWWW

topLevelDomain

url — URL. Type: String.

firstSignificantSubdomain

cutToFirstSignificantSubdomain

cutToFirstSignificantSubdomainWithWWW

cutToFirstSignificantSubdomainCustom

Returns the part of the domain that includes top-level subdomains up to the first significant subdomain. Accepts a custom TLD list name.

URL — URL. String.

TLD — Custom TLD list name. String.

Type: String.

See also firstSignificantSubdomain.

cutToFirstSignificantSubdomainCustomWithWWW

URL — URL. String.

TLD — Custom TLD list name. String.

Type: String.

See also firstSignificantSubdomain.

firstSignificantSubdomainCustom

URL — URL. String.

TLD — Custom TLD list name. String.

Type: String.

See also firstSignificantSubdomain.

port(URL[, default_port = 0])

path

pathFull

queryString

fragment

queryStringAndFragment

extractURLParameter(URL, name)

extractURLParameters(URL)

extractURLParameterNames(URL)

URLHierarchy(URL)

URLPathHierarchy(URL)

decodeURLComponent(URL)

netloc

url — URL. String.

cutWWW

cutQueryString

cutFragment

cutQueryStringAndFragment

cutURLParameter(URL, name)

url — URL. String.

name — Name of the URL parameter. String or Array of Strings.

generateUUIDv4

Generates the version 4 UUID.

x — Expression resulting in any of the supported data types. The resulting value is discarded, but the expression itself is used for bypassing common subexpression elimination if the function is called multiple times in one query. Optional parameter.

toUUID (x)

toUUIDOrNull (x)

toUUIDOrZero (x)

UUIDStringToNum

Accepts a string containing 36 characters in the format xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx, and returns a FixedString(16) as its binary representation, with its format optionally specified by variant (Big-endian by default).

string — String of 36 characters or FixedString(36). String.

variant — Integer, representing a variant as specified by RFC4122. 1 = Big-endian (default), 2 = Microsoft.
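The two byte layouts can be inspected with Python's standard `uuid` module, whose `bytes` and `bytes_le` attributes correspond to the big-endian and Microsoft layouts. The wrapper `uuid_string_to_num` is a hypothetical sketch that mirrors the variant argument described above; the UUID value is an arbitrary example.

```python
import uuid


def uuid_string_to_num(string: str, variant: int = 1) -> bytes:
    """Return the 16-byte binary form of a 36-character UUID string."""
    parsed = uuid.UUID(string)
    if variant == 1:
        return parsed.bytes     # big-endian layout
    if variant == 2:
        return parsed.bytes_le  # Microsoft layout: first three fields byte-swapped
    raise ValueError("variant must be 1 or 2")


example = "612f3c40-5d3b-217e-707b-6a546a3d7b29"
print(uuid_string_to_num(example).hex())
print(uuid_string_to_num(example, 2).hex())
```

Only the first eight bytes differ between the two layouts; the last eight bytes are identical.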

UUIDNumToString

binary — FixedString(16) as a binary representation of a UUID.

variant — Integer, representing a variant as specified by RFC4122. 1 = Big-endian (default), 2 = Microsoft.

modulo
​
​
​
​
​
optimize_functions_to_subcolumns
size0
strings
UUID
Array
UInt8
​
optimize_functions_to_subcolumns
size0
strings
UUID
Array
UInt8
​
optimize_functions_to_subcolumns
size0
​
​
​
​
​
​
UInt
UInt
UInt
function_range_max_elements_in_block
​
​
Array
​
​
​
​
​
​
higher-order function
​
​
​
​
​
​
Data types
​
Data types
​
​
​
higher-order function
descending order
Schwartzian transform
​
higher-order function
​
​
“ArrayJoin function”
​
Array
UInt*
Int*
Float*
​
Array
​
​
​
string
array
​
string
array
tuples
Array
Array
​
​
“arrayReverse”
​
Array
​
array
​
Array
tuples
Array
​
https://en.wikipedia.org/wiki/Receiver_operating_characteristic#Area_under_the_curve
​
higher-order function
​
higher-order function
​
higher-order function
​
higher-order function
​
higher-order function
​
higher-order function
​
higher-order function
​
higher-order function
​
higher-order function
​
higher-order function
​
higher-order function
Expression
Array
​
higher-order function
Expression
Array
​
higher-order function
Expression
Array
Decimal128
Float64
UInt64
Int64
​
higher-order function
Expression
Array
Float64
​
higher-order function
​
higher-order function
​
array
Array
Float64
Float64
​
​
​
​
​
Integer types
String
FixedString
Unsigned integer types
bin
hex
​
Integer types
String
FixedString
Unsigned integer types
​
​
​
binary form
​
logical conjuction
​
logical disjunction
​
Integer
floating-point
sign extension
​
Hamming Distance
SimHash
Int64
Int64
UInt8
SimHash
CRoaring
​
​
​
Bitmap object
UInt32
UInt32
​
Bitmap object
UInt32
UInt32
Bitmap object
​
substring
Bitmap object
UInt32
UInt32
Bitmap object
​
Bitmap object
UInt32
​
bitmapContains
​
​
​
​
​
​
​
​
​
​
​
​
​
​
short_circuit_function_evaluation
NULL values in conditionals
​
ifNotFinite
​
CASE
short_circuit_function_evaluation
​
​
String
​
DateTime64
String
DateTime
​
DateTime
DateTime64
DateTime
DateTime64
String
​
UTC
daylight saving time
IANA timezone database
DateTime
DateTime64
Int32
​
​
​
​
​
​
​
​
​
​
https://en.wikipedia.org/wiki/Unix_time
enable_extended_results_for_datetime_functions
​
​
​
​
​
​
​
​
​
​
DateTime64
Timezone
String
DateTime64
Timezone
​
​
​
​
​
​
​
​
​
​
​
​
​
​
​
​
​
​
String Literal
DateTime
DateTime64
Timezone name
String
DateTime
toStartOfInterval
​
String
Int
Date
DateTime
Date
DateTime
​
toRelativeDayNum
toRelativeMonthNum
toRelativeYearNum
String
Date
Date32
DateTime
DateTime64
Date
Date32
DateTime
DateTime64
Timezone name
String
Int
​
String
Int
Date
DateTime
Date
DateTime
​
Date
DateTime
Int
String
Date
DateTime
​
String
Int
Date
DateTime
Date
DateTime
​
Timezone name
String
DateTime
Timezone name
String
DateTime64
​
​
​
​
​
​
​
​
​
https://dev.mysql.com/doc/refman/8.0/en/date-and-time-functions.html#function_date-format
formatDateTimeInJodaSyntax
​
https://joda-time.sourceforge.net/apidocs/org/joda/time/format/DateTimeFormat.html
​
String
Date
Date32
DateTime
DateTime64
String
String
​
Integer
toDateTime
DateTime
https://dev.mysql.com/doc/refman/8.0/en/date-and-time-functions.html#function_date-format
Integer
Date
Date32
DateTime
DateTime64
formatDateTime
String
fromUnixTimestampInJodaSyntax
​
https://joda-time.sourceforge.net/apidocs/org/joda/time/format/DateTimeFormat.html
​
Proleptic Gregorian calendar
Modified Julian Day
String
FixedString
Int32
​
toModifiedJulianDay()
String
FixedString
Nullable(Int32)
​
Modified Julian Day
Proleptic Gregorian calendar
Any integral types
String
​
fromModifiedJulianDayOrNull()
Any integral types
Nullable(String)
Dictionaries
​
String literal
String literal
Tuple
String literal
Expression
Tuple
Expression
Tuple
Expression
attribute’s data type
Dictionaries
​
String literal
Expression
Tuple
​
hierarchical dictionary
String literal
Expression
UInt64
Array(UInt64)
​
String literal
Expression
UInt64
Expression
UInt64
​
String literal
String literal
Expression
UInt64
Tuple
Expression
attribute’s data type
​
Int
Float
​
Date
DateTime
String
FixedString
Float
Decimal
UUID
String
UInt
Float
Decimal
Date
DateTime
String
​
hex
reverse
reinterpretAs<Type>
String
FixedString
String
​
​
​
String
String
String
String
String
String
​
AES_DECRYPT
String
String
String
String
String
​
String
String
String
String
String
String
encrypt
​
AES_ENCRYPT
String
String
String
String
String
​
user_files_path
String
NULL
user_files_path
file
​
the great-circle formula
​
​
the great-circle formula
​
​
Tuple
Array
Geohash
geohash.org
​
geohash
​
geohash
​
geohash
Float
Float
Float
Float
UInt8
Array
String
H3
the Uber Engeneering site
​
H3
UInt64
UInt8
​
H3
UInt64
h3IsValid
UInt8
​
H3
UInt8
H3
Float64
​
H3
UInt8
H3
Float64
​
H3
Float64
Float64
UInt8
UInt64
​
H3
UInt64
integer
Array
UInt64
​
H3
UInt64
UInt8
​
UInt8
Float64
​
H3
UInt64
UInt64
UInt8
​
H3
UInt64
UInt8
Array
UInt64
​
H3
UInt64
UInt8
UInt64
​
UInt64
String
​
String
UInt64
​
Interprets
MD5
sipHash64
supported data types
UInt64
​
​
SipHash
MD5
interprets
supported data types
UInt64
​
SipHash
sipHash64
supported data types
FixedString(16)
​
CityHash
supported data types
UInt64
​
​
​
FixedString
String
FixedString
hex
​
​
​
FarmHash
available methods
supported data types
UInt64
​
string
Byte
Short
Integer
Long
​
JavaHash
​
JavaHash
Apache Hive
​
MetroHash
supported data types
UInt64
​
JumpConsistentHash
​
MurmurHash2
supported data types
UInt32
UInt64
​
MurmurHash2
gcc
supported data types
UInt64
​
MurmurHash3
supported data types
UInt32
UInt64
​
MurmurHash3
expressions
String
FixedString(16)
​
xxHash
​
bitHammingDistance
Hamming Distance
String
UInt8
UInt64
​
bitHammingDistance
Hamming Distance
String
UInt8
UInt64
​
bitHammingDistance
Hamming Distance
String
UInt8
UInt64
​
bitHammingDistance
Hamming Distance
String
UInt8
UInt64
​
bitHammingDistance
Hamming Distance
String
UInt8
UInt64
​
bitHammingDistance
Hamming Distance
String
UInt8
UInt64
​
bitHammingDistance
Hamming Distance
String
UInt8
UInt64
​
bitHammingDistance
Hamming Distance
String
UInt8
UInt64
​
tupleHammingDistance
String
UInt8
UInt8
Tuple
UInt64
UInt64
​
tupleHammingDistance
String
UInt8
UInt8
Tuple
UInt64
UInt64
​
tupleHammingDistance
String
UInt8
UInt8
Tuple
UInt64
UInt64
​
tupleHammingDistance
String
UInt8
UInt8
Tuple
UInt64
UInt64
​
ngramMinHash
String
UInt8
UInt8
Tuple
Tuple
String
Tuple
String
​
ngramMinHashCaseInsensitive
String
UInt8
UInt8
Tuple
Tuple
String
Tuple
String
​
ngramMinHashUTF8
String
UInt8
UInt8
Tuple
Tuple
String
Tuple
String
​
ngramMinHashCaseInsensitiveUTF8
String
UInt8
UInt8
Tuple
Tuple
String
Tuple
String
​
tupleHammingDistance
String
UInt8
UInt8
Tuple
UInt64
UInt64
​
tupleHammingDistance
String
UInt8
UInt8
Tuple
UInt64
UInt64
​
tupleHammingDistance
String
UInt8
UInt8
Tuple
UInt64
UInt64
​
tupleHammingDistance
String
UInt8
UInt8
Tuple
UInt64
UInt64
​
wordshingleMinHash
String
UInt8
UInt8
Tuple
Tuple
String
Tuple
String
​
wordShingleMinHashCaseInsensitive
String
UInt8
UInt8
Tuple
Tuple
String
Tuple
String
​
wordShingleMinHashUTF8
String
UInt8
UInt8
Tuple
Tuple
String
Tuple
String
​
wordShingleMinHashCaseInsensitiveUTF8
String
UInt8
UInt8
Tuple
Tuple
String
Tuple
String
ELF
DWARF
allow_introspection_functions
trace_log
​
UInt64
String
arrayMap
​
UInt64
Array(String)
arrayJoin
​
UInt64
String
arrayMap
​
addressToSymbol
String
String
arrayMap
​
Block
Uint64
​
Block
String
​
​
​
​
​
IPv6NumToString
String
FixedString(16)
cutIPv6
​
big endian
​
​
CIDR
​
​
IPv4
​
IPv6
IPv6StringToNum
String
IPv6
​
String
UInt8
​
String
UInt8
​
CIDR
String
String
UInt8
​
​
​
​
​
​
​
simdjson
​
​
​
​
​
​
​
​
​
​
​
​
String
String
Integer
Array
String
​
​
​
String
string
integer
Array
Tuple
String
String
​
​
​
​
integers
output_format_json_quote_64bit_integers
output_format_json_quote_denormals
Enum
String
Map
Tuple
​
​
stochasticLinearRegression
​
stochasticLogisticRegression
​
Map(key, value)
String
Integer
LowCardinality
FixedString
UUID
Date
DateTime
Date32
Enum
Map
Array
Map(key, value)
Map(key, value)
​
maps
tuples
arrays
Int64
UInt64
Float64
map
tuple
​
maps
tuples
arrays
Int64
UInt64
Float64
map
tuple
​
maps
arrays
Array
Int
Array
Int
Int8, Int16, Int32, Int64, Int128, Int256
Map
map
tuple
arrays
​
Map
UInt8
​
optimize_functions_to_subcolumns
keys
Map
Array
​
optimize_functions_to_subcolumns
values
Map
Array
​
Hyperbolic cosine
Float64
Float64
​
Inverse hyperbolic cosine
Float64
Float64
cosh(x)
​
Hyperbolic sine
Float64
Float64
​
Inverse hyperbolic sine
Float64
Float64
sinh(x)
​
Inverse hyperbolic tangent
Float64
Float64
​
function
Float64
Float64
Float64
​
function
Float64
Float64
Float64
​
function
Float64
Float64
log(x)
​
​
NULL
​
NULL
​
​
​
​
Nullable
​
​
​
macros
String
String
​
​
String
​
​
​
​
UInt64
String
​
literals
UInt8
​
​
​
Float*
Float*
ternary operator
​
ORDER BY
Int64
​
ORDER BY
​
runningDifference
​
Date
DateTime
DateTime64
Date
DateTime
DateTime64
UInt32
​
​
​
​
Enum
​
​
​
​
Nullable
​
Nullable
​
UInt8
ontime
​
arrayJoin
​
filesystemFree
UInt64
​
UInt64
​
path
UInt64
​
-State
AggregateFunction
String
arrayReduce
​
-State
AggregateFunction
arrayReduce
initializeAggregation
​
AggregateFunction
supported data types
sum
​
dictionary
Join
identifier
join_use_nulls
Join operation
​
CatBoost
CatBoost documentation
Training and applying models
​
​
​
custom setting
String
Custom Settings
​
Decimal
Decimal
UInt8
​
Int
Decimal
UInt8
isDecimalOverflow
​
LowCardinality(String)
​
native interface
UInt16
tcp_port
​
​
​
​
Expression
supported data types
common subexpression elimination
UInt32
​
searching
other manipulations with strings
​
​
​
re2 regular expression
​
​
RE2
​
​
​
​
expression
data type
​
roundBankers
​
expression
data type
​
round
​
​
​
​
replacing
other manipulations with strings
​
positionCaseInsensitive
String
String
UInt
positionUTF8
​
position
String
String
UInt
​
positionCaseInsensitiveUTF8
String
String
UInt
​
positionUTF8
String
String
UInt
​
position
multiSearchAllPositionsUTF8
String
String
​
syntax
normalizeUTF8*()
​
hyperscan
​
​
​
edit distance
hyperscan
​
extractAllGroupsVertical
String
re2 syntax
String
Array
extractAllGroupsVertical
​
String
re2 syntax
String
Array
extractAllGroupsHorizontal
​
normalizeUTF8*()
​
​
like
String
like
​
​
​
countSubstringsCaseInsensitive
countSubstringsCaseInsensitiveUTF8
String
String
UInt
UInt64
​
String
String
UInt
UInt64
​
String
String
UInt
UInt64
​
String
re2 syntax
String
UInt64
​
String
String
Array
String
​
String
String
Array
String
​
​
String
Array
String
​
String
FixedString
String
FixedString
Array
searching
replacing
​
arrays
UUID
String
UInt8
​
arrays
UUID
String
UInt8
​
String
UInt
String
String
​
leftPad
String
UInt
String
String
​
String
UInt
String
String
​
rightPad
String
UInt
String
String
​
String
​
String
UInt
​
concat
​
Base58
String
String
​
String
String
​
String
​
String
​
String
​
​
​
​
String
String
​
String
UInt64
​
NFC normalized form
String
String
​
NFD normalized form
String
String
​
NFKC normalized form
String
String
​
NFKD normalized form
String
String
​
String
String
​
String
String
List of XML and HTML character entity references
​
String
String
url
​
​
​
tuple
Tuple
Tuple
​
Hamming Distance
Tuple
Tuple
Arithmetic functions
MinHash
​
same behavior as C++ programs
​
Int
Expression
rounding towards zero
NaN and Inf
numeric conversions issues
​
​
​
​
UInt
Expression
rounding towards zero
NaN and Inf
numeric conversions issues
​
Date32
Date
String
UInt32
Date
Date32
​
toDate32
Date32
​
toDate32
​
Date32
Date
​
DateTime64
String
UInt32
Float
DateTime
DateTime64
​
Decimal
​
Nullable(Decimal(P,S))
Expression
String
​
Decimal(P,S)
Expression
String
​
Decimal(P,S)
Expression
String
​
FixedString
UUID
​
String
​
reinterpret
String
String
FixedString
Nullable
cast_keep_nullable
​
cast(x, T)
​
Nullable
NULL
​
​
Interval
​
​
String
DateTime
ISO 8601
RFC 1123 - 5.2.14 RFC-822 Date and Time Specification
String
String
unix timestamp
ISO 8601 announcement by @xkcd
RFC 1123
toDate
toDateTime
​
parseDateTimeBestEffort
parseDateTimeBestEffort
​
​
parseDateTimeBestEffort
​
​
parseDateTimeBestEffort
​
parseDateTimeBestEffortUS
​
parseDateTimeBestEffortUS
​
parseDateTimeBestEffort
DateTime
String
UInt8
Timezone
String
DateTime
​
parseDateTime64BestEffort
​
parseDateTime64BestEffort
​
parseDateTime64BestEffort
​
parseDateTime64BestEffort
​
parseDateTime64BestEffort
​
LowCardinality
CAST
Expression
supported data types
​
CSV
TSV
​
CSV
TSV
​
​
String
​
​
String
​
​
​
​
TLD list
String
String
String
firstSignificantSubdomain
​
String
String
String
firstSignificantSubdomain
​
String
String
String
firstSignificantSubdomain
​
String
​
String
String
Array
​
UUID
version 4
Expression
supported data types
common subexpression elimination
​
​
​
​
FixedString(16)
String
RFC4122
​
FixedString(16)
RFC4122
defined by the ISO 8601