File upload

The File upload API puts files into the Data Lake using one of two sync modes:

  • Full sync: data from earlier syncs is moved to trash

  • Incremental sync: data is synced gradually

File upload flow:

  1. Start a sync with your api_key to get a sync_id for this sync

  2. Upload files; multiple uploads can share the same sync_id

  3. End the sync
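
The three steps above can be sketched as a small driver. This is a minimal sketch, not an official client: `run_sync` and the three callables it receives (`start`, `upload`, `end`) are hypothetical names, each standing in for one of the HTTP calls described in the sections below.

```python
from typing import Callable, Iterable


def run_sync(
    start: Callable[[], int],
    upload: Callable[[int, str], None],
    end: Callable[[int], bool],
    files: Iterable[str],
) -> bool:
    """Run one sync: start it, upload every file with the same sync_id, end it."""
    sync_id = start()               # step 1: obtain the sync_id for this sync
    for file_path in files:         # step 2: every upload reuses that sync_id
        upload(sync_id, file_path)
    return end(sync_id)             # step 3: end the sync
```

Passing the HTTP calls in as callables keeps the flow easy to test without a network connection; how errors during upload should affect step 3 is not covered by this page, so the sketch does not handle them.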

Start sync

Method: POST

Path:

https://[client].datainsider.co/api/ingestion/file/sync/start

Parameter   Type    Description
name        string  Name of the sync
path        string  File storage path on the Data Lake
api_key     string  API key
sync_type   enum    FullSync | IncrementalSync

Sample request:

curl --request POST \
  --url https://[client].datainsider.co/api/ingestion/file/sync/start \
  --header 'Content-Type: application/json' \
  --data '{
  "name" : "transactions incremental sync 06/12/21",
  "path" : "/data/db/transactions",
  "api_key" : "cccccccc-14a1-4eb1-8964-000000000000",
  "sync_type" : "IncrementalSync"
}'

Sample response:

{
  "sync_id": 1
}
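
The same request can be built in Python with only the standard library. This is a sketch under assumptions: the function names are hypothetical, the `client` argument stands for the `[client]` placeholder in the URL, and the accepted `sync_type` values are taken from the samples on this page rather than from a formal schema.

```python
import json
import urllib.request

# Values as they appear in the sample request and responses (assumed complete).
SYNC_TYPES = {"FullSync", "IncrementalSync"}


def build_start_sync_request(client, name, path, api_key, sync_type):
    """Build (but do not send) the POST request that starts a sync."""
    if sync_type not in SYNC_TYPES:
        raise ValueError(f"unknown sync_type: {sync_type!r}")
    payload = {
        "name": name,
        "path": path,
        "api_key": api_key,
        "sync_type": sync_type,
    }
    return urllib.request.Request(
        url=f"https://{client}.datainsider.co/api/ingestion/file/sync/start",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_sync_id(response_body):
    """Extract sync_id from the JSON response body (str or bytes)."""
    return json.loads(response_body)["sync_id"]
```

To send it, pass the request to `urllib.request.urlopen` and hand the bytes from `read()` to `parse_sync_id`.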

Upload file

Method: POST

Path:

https://[client].datainsider.co/api/lake/file/upload?path={path}&sync_id={sync_id}

Parameter   Type    Description
path        string  File storage path on the Data Lake
sync_id     string  sync_id returned by the Start sync API

Sample request:

curl --request POST \
  --url 'https://[client].datainsider.co/api/lake/file/upload?sync_id=8' \
  --form file=@/data/example/sales_transaction.parquet

Sample response:

{
  "code": 0,
  "msg": null
}
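
For callers that are not using curl, the upload can be sketched with the Python standard library. The assumptions here: the form field is named `file` as in the curl sample, the part is sent as `application/octet-stream` (this page does not state a required content type), and the function names and `client` argument are hypothetical.

```python
import uuid
import urllib.parse
import urllib.request


def encode_multipart_file(field, filename, content, boundary=None):
    """Encode one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def build_upload_request(client, path, sync_id, filename, content):
    """Build (but do not send) the upload request for one file."""
    # urlencode percent-encodes the path, e.g. "/" becomes "%2F".
    query = urllib.parse.urlencode({"path": path, "sync_id": sync_id})
    body, content_type = encode_multipart_file("file", filename, content)
    return urllib.request.Request(
        url=f"https://{client}.datainsider.co/api/lake/file/upload?{query}",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

The boundary in the Content-Type header must match the one used in the body, which is why both values come from the same helper.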

End sync job

Method: POST

Path:

https://[client].datainsider.co/api/ingestion/file/sync/end

Sample request:

curl --request POST \
  --url https://[client].datainsider.co/api/ingestion/file/sync/end \
  --header 'Content-Type: application/json' \
  --data '{
  "sync_id" : 8,
  "api_key" : "cccccccc-14a1-4eb1-8964-000000000000"
}'

Sample response:

{
  "success": true
}

Get list of sync jobs

Method: GET

Path:

https://[client].datainsider.co/api/ingestion/file/sync/list?from=0&size=10

Sample request:

curl --request GET \
  --url 'https://[client].datainsider.co/api/ingestion/file/sync/list?from=0&size=10' \
  --header 'Content-Type: application/json'

Sample response:

[
  {
    "org_id": 0,
    "sync_id": 1,
    "name": "hello",
    "path": "/data/db/transaction2020",
    "sync_type": "FullSync",
    "sync_status": "Finished",
    "start_time": 1638728596890,
    "end_time": 1638729532803,
    "total_files": 4,
    "num_failed": 2
  },
  {
    "org_id": 0,
    "sync_id": 2,
    "name": "hello",
    "path": "/data/db/transaction2021",
    "sync_type": "IncrementalSync",
    "sync_status": "Syncing",
    "start_time": 1638760723792,
    "end_time": 0,
    "total_files": 0,
    "num_failed": 0
  }
]

Note: sync_status is an enum with the values Syncing, Failed, and Finished.
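
A response like the one above can be condensed into a per-job summary. This sketch assumes, based only on the sample values, that `start_time` and `end_time` are Unix epoch timestamps in milliseconds and that an `end_time` of 0 means the job has not ended; the function names are hypothetical.

```python
import json
from datetime import datetime, timezone


def ms_to_utc(ms):
    """Convert epoch milliseconds to an aware UTC datetime; 0 is treated as unset."""
    if not ms:
        return None
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def summarize_jobs(response_body):
    """Return one small dict per sync job in the list response."""
    summary = []
    for job in json.loads(response_body):
        ended = job["end_time"] != 0
        summary.append({
            "sync_id": job["sync_id"],
            "status": job["sync_status"],
            "started_at": ms_to_utc(job["start_time"]),
            # Duration is computed on the raw integers to avoid float rounding.
            "duration_ms": job["end_time"] - job["start_time"] if ended else None,
            "failed_files": job["num_failed"],
        })
    return summary
```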

Uploaded file history

Method: GET

Path:

https://[client].datainsider.co/api/ingestion/file/sync/history?from=0&size=10

Sample request:

curl --request GET \
  --url 'https://[client].datainsider.co/api/ingestion/file/sync/history?from=0&size=10' \
  --header 'Content-Type: application/json'

Sample response:

[
  {
    "org_id": 0,
    "history_id": 3,
    "sync_id": 2,
    "file_name": "random_data.parquet",
    "start_time": 1638787907031,
    "end_time": 0,
    "sync_status": "Syncing",
    "message": ""
  },
  {
    "org_id": 0,
    "history_id": 4,
    "sync_id": 2,
    "file_name": "random_data.parquet",
    "start_time": 1638787922904,
    "end_time": 1638787922952,
    "sync_status": "Finished",
    "message": ""
  },
  {
    "org_id": 0,
    "history_id": 5,
    "sync_id": 3,
    "file_name": "random_data.parquet",
    "start_time": 1638788464984,
    "end_time": 1638788465020,
    "sync_status": "Finished",
    "message": ""
  },
  {
    "org_id": 0,
    "history_id": 6,
    "sync_id": 4,
    "file_name": "full_vn_map.jpg",
    "start_time": 1638788697950,
    "end_time": 1638788697985,
    "sync_status": "Finished",
    "message": ""
  }
]
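
Because the history mixes entries from several syncs, it can help to group them by sync_id. The sketch below uses hypothetical function names and assumes that history entries use the same sync_status values as sync jobs (Syncing, Failed, Finished); only Syncing and Finished appear in the sample.

```python
import json
from collections import Counter, defaultdict


def status_counts_by_sync(response_body):
    """Map each sync_id to a Counter of the sync_status values of its uploads."""
    counts = defaultdict(Counter)
    for entry in json.loads(response_body):
        counts[entry["sync_id"]][entry["sync_status"]] += 1
    return dict(counts)


def failed_files(response_body):
    """List (file_name, message) for every history entry marked Failed."""
    return [
        (entry["file_name"], entry["message"])
        for entry in json.loads(response_body)
        if entry["sync_status"] == "Failed"
    ]
```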
