
CSV upload

Ingesting CSV data into the data warehouse involves three steps:

  1. Detect the schema from the CSV file

  2. Register the detected schema with the data warehouse

  3. Ingest the data into the registered schema
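The three steps above can be driven from any HTTP client against the `/api/ingestion/csv/detect`, `/api/ingestion/csv/schema`, and `/api/ingestion/csv` endpoints documented below. A minimal Python sketch using only the standard library (function names are illustrative, not part of the API):

```python
import json
import urllib.request


def build_request(base_url: str, path: str, body: dict) -> urllib.request.Request:
    """Build a JSON POST request for one step of the ingestion flow."""
    return urllib.request.Request(
        base_url + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(req: urllib.request.Request) -> dict:
    """Send the request and parse the JSON response."""
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# The three steps, in order (request bodies as documented below):
#   1. /api/ingestion/csv/detect  -> returns the detected schema
#   2. /api/ingestion/csv/schema  -> registers the schema (db_name, name filled in)
#   3. /api/ingestion/csv         -> ingests batches of CSV lines
```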

Detect schema from CSV file

| Parameter | Type | Description |
| --- | --- | --- |
| csv_setting.delimiter | string | Character used to separate fields |
| csv_setting.include_header | boolean | `true` if the CSV contains a header line with column names, otherwise `false` |
| sample | string | First n lines of the CSV file; may or may not include the header line |

Sample curl:

curl --request POST \
  --url http://[client].datainsider.co/api/ingestion/csv/detect \
  --header 'Content-Type: application/json' \
  --data '{
    "csv_setting": {
      "include_header": true,
      "delimiter": ","
    },
    "sample": "user_id,name,date_of_birth,last_payment\n1,Nguyen Van A,16/05/1997,99.90\n2,Nguyen Thi B,25/01/2000,100.0\n3,Tran Thi C,01/01/2021,25.5"
}'

Sample response:

{
  "schema": {
    "name": "",
    "db_name": "",
    "organization_id": 0,
    "display_name": "",
    "columns": [
      {
        "class_name": "int32",
        "name": "user_id",
        "display_name": "user_id",
        "is_nullable": true
      },
      {
        "class_name": "string",
        "name": "name",
        "display_name": "name",
        "is_nullable": true
      },
      {
        "class_name": "datetime",
        "name": "date_of_birth",
        "display_name": "date_of_birth",
        "input_as_timestamp": false,
        "input_formats": [
          "dd/MM/yyyy"
        ],
        "is_nullable": true
      },
      {
        "class_name": "double",
        "name": "last_payment",
        "display_name": "last_payment",
        "is_nullable": true
      }
    ],
    "primary_keys": [],
    "partition_by": [],
    "order_bys": [],
    "temporary": false
  },
  "csv_setting": {
    "include_header": true,
    "delimiter": ",",
    "add_batch_info": false
  },
  "records": [
    [1, "Nguyen Van A", "1997-05-15T17:00:00.000+00:00", 99.9],
    [2, "Nguyen Thi B", "2000-01-24T17:00:00.000+00:00", 100.0],
    [3, "Tran Thi C", "2020-12-31T17:00:00.000+00:00", 25.5]
  ]
}
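Assuming the CSV is available locally, the `sample` field can be built from its first few lines. A minimal sketch (the helper name and the `n_lines` default are illustrative):

```python
def build_detect_payload(csv_text: str, delimiter: str = ",",
                         include_header: bool = True, n_lines: int = 5) -> dict:
    """Build the request body for the detect-schema endpoint from raw CSV text.

    Only the first `n_lines` lines are sent as `sample`, which is enough
    for the service to infer column names and types.
    """
    sample = "\n".join(csv_text.splitlines()[:n_lines])
    return {
        "csv_setting": {
            "include_header": include_header,
            "delimiter": delimiter,
        },
        "sample": sample,
    }
```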

Register CSV schema

| Parameter | Type | Description |
| --- | --- | --- |
| api_key | string | API key with CSV ingestion permission |
| schema | json | JSON object returned by the detect-schema API; add two fields, `db_name` and `name`, to specify the destination in the data warehouse |
| schema.db_name | string | Name of the destination database in the data warehouse |
| schema.name | string | Name of the destination table in the data warehouse |

Sample curl:

curl --request POST \
  --url http://[client].datainsider.co/api/ingestion/csv/schema \
  --header 'Content-Type: application/json' \
  --data '{
    "api_key": "cccccccc-14a1-4eb1-8964-000000000000",
    "schema": {
      "organization_id": 0,
      "name": "sale_records_100",
      "db_name": "csv_ingestion",
      "display_name": "",
      "columns": [
        {
          "class_name": "string",
          "name": "user_id",
          "display_name": "user_id",
          "is_nullable": true
        },
        {
          "class_name": "string",
          "name": "name",
          "display_name": "name",
          "is_nullable": true
        },
        {
          "class_name": "datetime",
          "name": "date_of_birth",
          "display_name": "date_of_birth",
          "input_formats": [
            "dd/MM/yyyy"
          ],
          "is_nullable": true
        },
        {
          "class_name": "double",
          "name": "last_payment",
          "display_name": "last_payment",
          "is_nullable": true
        }
      ]
    }
}'

Sample response:

{
  "name": "sale_records_100",
  "db_name": "csv_ingestion",
  "organization_id": 0,
  "display_name": "",
  "columns": [
    {
      "class_name": "string",
      "name": "user_id",
      "display_name": "user_id",
      "is_nullable": true
    },
    {
      "class_name": "string",
      "name": "name",
      "display_name": "name",
      "is_nullable": true
    },
    {
      "class_name": "datetime",
      "name": "date_of_birth",
      "display_name": "date_of_birth",
      "input_as_timestamp": false,
      "input_formats": [
        "dd/MM/yyyy"
      ],
      "is_nullable": true
    },
    {
      "class_name": "double",
      "name": "last_payment",
      "display_name": "last_payment",
      "is_nullable": true
    }
  ],
  "primary_keys": [],
  "partition_by": [],
  "order_bys": [],
  "temporary": false
}
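In practice, the `schema` object from the detect step is reused directly, with only the destination `db_name` and `name` filled in before registering. A minimal sketch (the helper name is illustrative):

```python
def build_register_payload(detected_schema: dict, api_key: str,
                           db_name: str, tbl_name: str) -> dict:
    """Build the request body for the register-schema endpoint.

    Takes the `schema` object returned by the detect endpoint and fills in
    the destination database and table names.
    """
    schema = dict(detected_schema)  # shallow copy; columns are reused as-is
    schema["db_name"] = db_name
    schema["name"] = tbl_name
    return {"api_key": api_key, "schema": schema}
```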

Ingest data into warehouse

| Parameter | Type | Description |
| --- | --- | --- |
| api_key | string | API key with CSV ingestion permission |
| db_name | string | Name of the destination database |
| tbl_name | string | Name of the destination table |
| csv_setting | json | Settings of the CSV file |
| data | string | CSV lines to be ingested. A batch is a single string of CSV lines separated by '\n' and must not contain the header line. A batch of about 1000 lines is recommended; send batches sequentially. |

Sample curl:

curl --request POST \
  --url http://[client].datainsider.co/api/ingestion/csv \
  --header 'Content-Type: application/json' \
  --data '{
    "api_key": "cccccccc-14a1-4eb1-8964-000000000000",
    "db_name": "csv_ingestion",
    "tbl_name": "sale_records_100",
    "csv_setting": {
      "include_header": true,
      "delimiter": ","
    },
    "data": "1,Nguyen Van A,16/05/1997,99.90\n2,Nguyen Thi B,25/01/2000,100.0\n3,Tran Thi C,01/01/2021,25.5\n4,Nguyen Van D,26/06/2006,919.90\n5,Nguyen Thi E,05/10/2001,200.0\n6,Tran Thi F,31/12/2020,75.5"
}'

Sample response:

{
  "succeed": true,
  "row_inserted": 6
}
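Following the batching recommendation above, a large CSV file can be split into sequential batches of roughly 1000 lines, each sent as one ingest request. A minimal sketch (the helper name and defaults are illustrative; the delimiter is assumed to be a comma):

```python
from typing import Iterator


def iter_ingest_payloads(csv_text: str, api_key: str, db_name: str, tbl_name: str,
                         batch_size: int = 1000,
                         include_header: bool = True) -> Iterator[dict]:
    """Yield ingest request bodies, one per batch of `batch_size` CSV lines.

    The header line (if present) is dropped, since ingest batches must not
    contain it. Batches should be POSTed to /api/ingestion/csv one after
    another, not in parallel.
    """
    lines = csv_text.splitlines()
    if include_header and lines:
        lines = lines[1:]  # the data field must not include the header line
    for start in range(0, len(lines), batch_size):
        yield {
            "api_key": api_key,
            "db_name": db_name,
            "tbl_name": tbl_name,
            "csv_setting": {"include_header": include_header, "delimiter": ","},
            "data": "\n".join(lines[start:start + batch_size]),
        }
```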

Last updated 2 years ago