
DataWorks: Data upload

Last Updated: Oct 11, 2025

The DataWorks data upload feature lets you upload data from local files, DataAnalysis workbooks, Object Storage Service (OSS) files, and HTTP files to engines such as MaxCompute, EMR Hive, Hologres, and StarRocks for analysis and management. This feature provides a convenient data transmission service to help you quickly use data to drive your business. This topic describes how to use the data upload feature.

Precautions

  • If you perform cross-border data uploads, such as transferring data from mainland China to outside mainland China or between different countries or regions, read the compliance statement in the appendix of this topic in advance. Otherwise, the data upload may fail, and you bear the resulting legal responsibility.

  • Before you upload data, make sure that the table headers are in English. If the headers are in Chinese, parsing may fail and cause an upload error. The sketch after this list shows one way to rename headers in advance.
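
    The following is a minimal sketch of renaming headers before upload, using the pandas package. The file name and the header mapping are hypothetical; adapt them to your data.

      import pandas as pd

      # Hypothetical mapping from Chinese headers to English ones.
      HEADER_MAP = {
          "用户编号": "user_id",
          "订单金额": "order_amount",
          "创建时间": "created_at",
      }

      # Read the original file, rename the headers, and write a file
      # that is safe to upload.
      df = pd.read_csv("data.csv", encoding="utf-8")
      df = df.rename(columns=HEADER_MAP)
      df.to_csv("data_en.csv", index=False, encoding="utf-8")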

Limits

Billing

Data upload incurs the following fees:

  • Data transmission fees.

  • If you create a new table, computing and storage fees are charged.

The preceding fees are charged by the respective engines. For specific fees, see the billing documentation for the corresponding engine: MaxCompute billing, Hologres billing, E-MapReduce billing, and EMR Serverless StarRocks product billing.

Go to the data upload page

  1. Go to the Upload and Download page.

    Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose Data Integration > Data Upload and Download. On the page that appears, click Go to Data Upload and Download.

  2. In the navigation pane on the left, click the Data Upload icon to go to the Data Upload page.

  3. Click Data Upload and follow the on-screen instructions to upload the data.

Select the file data to upload

You can upload data from local files, workbooks, OSS, and HTTP files. Select a data source as needed.

Note

When you upload a file, specify whether to filter out dirty data as needed.

  • Yes: If dirty data is encountered, the platform automatically ignores it and continues to upload the data.

  • No: If dirty data is encountered, the platform does not ignore it, and the data upload is interrupted.
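
If you prefer to remove dirty rows yourself before you upload a file, you can pre-clean it locally. The following is a minimal sketch that uses the pandas package and assumes a hypothetical file orders.csv in which the order_amount column should be numeric:

    import pandas as pd

    df = pd.read_csv("orders.csv")
    # Coerce the column to numeric; values that cannot be parsed become NaN.
    df["order_amount"] = pd.to_numeric(df["order_amount"], errors="coerce")
    # Drop the unparsable (dirty) rows and write a clean file for upload.
    df = df.dropna(subset=["order_amount"])
    df.to_csv("orders_clean.csv", index=False)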

Local file

If the data that you want to upload is in a local file, select this method.

  1. Set Data Source to Local File.

  2. Specify Data To Upload: Drag your local files to the Select File area.

    Note
    • The supported file formats are CSV, XLS, XLSX, and JSON. The maximum file size is 5 GB for CSV files and 100 MB for other file formats.

    • By default, only the first sheet of a file is uploaded. To upload other sheets, create a destination table for each sheet and upload the file once per sheet, each time with the target sheet moved to the first position. The sketch after this list shows a way to split sheets into separate files instead.

    • Uploading files in SQL format is not supported.
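
    To avoid reordering sheets by hand, you can split a workbook into one single-sheet file per sheet and upload the files one by one. A minimal sketch that uses the pandas and openpyxl packages; the workbook name report.xlsx is hypothetical:

      import pandas as pd

      # Read every sheet into a dict that maps sheet names to DataFrames.
      sheets = pd.read_excel("report.xlsx", sheet_name=None)

      # Write each sheet to its own single-sheet file so that each upload
      # picks up the intended data as the first sheet.
      for name, df in sheets.items():
          df.to_excel(f"report_{name}.xlsx", index=False)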

Workbook

If the data that you want to upload is in a DataWorks DataAnalysis workbook, select this method.

  1. Set Data Source to Workbook.

  2. Specify Data To Upload:

    1. From the drop-down list next to Select File, select the workbook file to upload.

    2. If the workbook does not exist, click the New button next to it to create one. You can also go to the DataAnalysis module to create a workbook and import data.

Object Storage Service (OSS)

If the data that you want to upload is in Object Storage Service (OSS), select this method.

  1. Set Data Source to Object Storage OSS.

  2. Specify Data To Upload:

    1. From the Select Bucket drop-down list, select the destination OSS bucket that stores the data to upload.

      Note

      You can upload data only from a bucket that is in the same region as the current DataWorks workspace.

    2. In the Select File area, select the file data that you want to upload.

      Note

      Only files in CSV, XLS, XLSX, and JSON formats are supported.
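
    If your data is not yet in OSS, you can stage it there first with the OSS Python SDK (oss2). A minimal sketch; the credentials, endpoint, bucket name, and object path are hypothetical, and the bucket must be in the same region as the workspace, as noted above:

      import os
      import oss2

      # Hypothetical credentials and location; replace with your own values.
      auth = oss2.Auth(
          os.environ["OSS_ACCESS_KEY_ID"],
          os.environ["OSS_ACCESS_KEY_SECRET"],
      )
      bucket = oss2.Bucket(auth, "https://oss-cn-hangzhou.aliyuncs.com", "my-upload-bucket")

      # Upload a local CSV so that it can be selected in the Select File area.
      bucket.put_object_from_file("uploads/orders.csv", "orders.csv")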

HTTP file

If the data that you want to upload is in an HTTP file, select this method.

  1. Set Data Source to HTTP File.

  2. Specify Data To Upload:

    • File Address: The address where the file data is stored. See the sketch after this list for a way to verify the address before you submit it.

      Note

      File addresses in HTTP and HTTPS formats are supported.

    • File Type: The file type is automatically detected based on the file that you upload. Files in CSV, XLS, and XLSX formats are supported. The maximum size of a CSV file is 5 GB. The maximum size of other files is 50 MB.

    • Request Method: GET, POST, and PUT are supported. We recommend that you use GET to obtain data. The method that you can actually use depends on the request methods that the file server allows.

    • Advanced Parameters: You can also set the Request Header and Request Body in this section as needed.
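
    Before you submit the form, you can check that the file address is reachable and within the size limits. A minimal sketch that uses the requests package; the URL is hypothetical:

      import requests

      url = "https://example.com/data/orders.csv"  # hypothetical file address

      # A HEAD request returns the response headers without downloading the file.
      resp = requests.head(url, allow_redirects=True, timeout=10)
      resp.raise_for_status()

      # Content-Length may be absent; a value of 0 then skips the size check.
      size = int(resp.headers.get("Content-Length", 0))
      print(resp.headers.get("Content-Type"), size)

      # CSV files may be at most 5 GB; other supported formats at most 50 MB.
      assert size <= 5 * 1024**3, "file exceeds the 5 GB CSV limit"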

Set the destination table

In the Set Destination Table section, select a Destination Engine for the data upload and configure the related parameters for the selected engine.

Important

When you set the destination table, pay attention to whether the selected data source belongs to the production (PROD) or development (DEV) environment. If you select the wrong environment, the data is uploaded to the unintended environment.

MaxCompute

To upload data to a MaxCompute table, configure the following parameters.

  • MaxCompute Project Name: Select a MaxCompute data source that is attached to the current region. If the data source that you want to use is not found, you can attach a MaxCompute compute resource to the current workspace to generate a data source with the same name.

  • Destination Table: Select Existing Table or New Table.

  • Destination Table > Existing Table:

    • Select Destination Table: The table where the data is stored. You can search for the table by keyword.

      Note

      You can upload data only to tables that you own. For more information, see Limits.

    • Upload Mode: The method that is used to add the data to the destination table.

      • Overwrite: Clears the data in the destination table and then imports all the data into the corresponding mapped fields in the destination table.

      • Append: Appends the data to the corresponding mapped fields in the destination table.

  • Destination Table > New Table:

    • Table Name: Enter a custom name for the new table.

      Note

      When a new table is created for the MaxCompute engine, the MaxCompute account information that is configured for the DataWorks compute resource is used, and the table is created in the corresponding MaxCompute project.

    • Table Type: Select Non-partitioned Table or Partitioned Table as needed. If you select Partitioned Table, specify the partition fields and their values.

    • Lifecycle: Specify the lifecycle of the table. After the table expires, it may become unavailable. For more information about table lifecycles, see Lifecycle and Lifecycle action. A sketch of changing a lifecycle in SQL follows this list.
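
For reference, the lifecycle of an existing MaxCompute table can also be changed in SQL. A minimal sketch that uses the PyODPS package; the credentials, project, endpoint, and table name are hypothetical:

    import os
    from odps import ODPS

    # Hypothetical connection values; replace with your own project settings.
    o = ODPS(
        os.environ["ALIBABA_CLOUD_ACCESS_KEY_ID"],
        os.environ["ALIBABA_CLOUD_ACCESS_KEY_SECRET"],
        project="my_project",
        endpoint="https://service.cn-hangzhou.maxcompute.aliyun.com/api",
    )

    # Keep the table for 30 days; it expires after that period.
    o.execute_sql("ALTER TABLE uploaded_orders SET LIFECYCLE 30")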

EMR Hive

To upload data to an EMR Hive table, configure the following parameters.

  • Data Source: Select an EMR Hive data source (Alibaba Cloud instance mode) that is attached to the workspace in the current region.

  • Destination Table: You can upload data only to an Existing Table.

  • Select Destination Table: The table where the data is stored. You can search for the table by keyword.

    Note
    • If the destination table does not exist, follow the on-screen instructions to go to Table Management in Data Development to create a table. A sketch of creating a table over a HiveServer2 connection follows this list.

    • You can upload data only to tables that you own. For more information, see Limits.

  • Upload Mode: The method that is used to add the data to the destination table.

    • Overwrite: Clears the data in the destination table and then imports all the data into the corresponding mapped fields in the destination table.

    • Append: Appends the data to the corresponding mapped fields in the destination table.
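
If you manage tables outside the console, you can also create the destination table over a HiveServer2 connection. A minimal sketch that uses the PyHive package; the host, port, user, and table schema are hypothetical:

    from pyhive import hive

    # Hypothetical connection values; use your cluster's HiveServer2 address.
    conn = hive.connect(host="emr-header-1", port=10000, username="hadoop")
    cursor = conn.cursor()

    # Create a simple destination table for the uploaded data.
    cursor.execute(
        "CREATE TABLE IF NOT EXISTS uploaded_orders ("
        "user_id BIGINT, order_amount DOUBLE, created_at STRING)"
    )
    conn.close()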

Hologres

To upload data to a Hologres table, configure the following parameters.

  • Data Source: Select a Hologres data source that is attached to the workspace in the current region. If the data source that you want to use is not found, you can attach a Hologres compute resource to the current workspace to generate a data source with the same name.

  • Destination Table: You can upload data only to an Existing Table.

  • Select Destination Table: The table where the data is stored. You can search for the table by keyword.

    Note
    • If the destination table does not exist, follow the on-screen instructions to go to the Hologres console to create a table.

    • You can upload data only to tables that you own. For more information, see Limits.

  • Upload Mode: The method that is used to add the data to the destination table.

    • Overwrite: Clears the data in the destination table and then imports all the data into the corresponding mapped fields in the destination table.

    • Append: Appends the data to the corresponding mapped fields in the destination table.

  • Primary Key Conflict Policy: If the data upload causes a primary key conflict in the destination table, one of the following policies is applied. The sketch after this list illustrates the policies in SQL.

    • Ignore: The uploaded data is ignored, and the data in the destination table is not updated.

    • Update (replace): The uploaded data completely overwrites the old data in the destination table. Fields that are not mapped are forcibly set to NULL.

    • Update (update): The uploaded data overwrites the old data in the destination table, but only for the mapped fields.
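
Because Hologres is compatible with the PostgreSQL protocol, the conflict policies behave much like INSERT ... ON CONFLICT clauses. The following illustration uses the psycopg2 package; the endpoint, credentials, and table are hypothetical, and this is an analogy for the policies rather than what the upload feature executes internally:

    import psycopg2

    # Hypothetical Hologres endpoint; the table uploaded_orders is assumed
    # to have the primary key user_id.
    conn = psycopg2.connect(
        host="hgpostcn-cn-xxxx-cn-hangzhou.hologres.aliyuncs.com",
        port=80, dbname="my_db", user="my_user", password="my_password",
    )
    cur = conn.cursor()
    row = (1001, 99.5)

    # Ignore: keep the existing row when the primary key conflicts.
    cur.execute(
        "INSERT INTO uploaded_orders (user_id, order_amount) VALUES (%s, %s) "
        "ON CONFLICT (user_id) DO NOTHING", row)

    # Update (update): overwrite only the mapped fields on conflict.
    cur.execute(
        "INSERT INTO uploaded_orders (user_id, order_amount) VALUES (%s, %s) "
        "ON CONFLICT (user_id) DO UPDATE SET order_amount = EXCLUDED.order_amount", row)

    conn.commit()
    conn.close()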

StarRocks

To upload data to a StarRocks table, configure the following parameters.

  • Data Source: Select a StarRocks data source that is attached to the workspace in the current region.

  • Destination Table: You can upload data only to an Existing Table.

  • Select Destination Table: The table where the data is stored. You can search for the table by keyword.

    Note
    • If the destination table does not exist, follow the on-screen instructions to go to the EMR Serverless StarRocks instance page to create a table.

    • You can upload data only to tables that you own. For more information, see Limits.

  • Upload Mode: The method that is used to add the data to the destination table.

    • Overwrite: Clears the data in the destination table and then imports all the data into the corresponding mapped fields in the destination table.

    • Append: Appends the data to the corresponding mapped fields in the destination table.

  • Advanced Parameters: You can configure Stream Load request parameters. The sketch after this list shows where these parameters appear in a raw Stream Load request.
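
For context, Stream Load is an HTTP interface, and its request parameters are passed as headers. A minimal sketch of a raw Stream Load request that uses the requests package; the frontend address, credentials, database, and table are hypothetical:

    import uuid
    import requests

    # Hypothetical FE address and destination table.
    url = "http://fe-host:8030/api/my_db/uploaded_orders/_stream_load"
    headers = {
        "label": f"upload-{uuid.uuid4()}",  # unique label for safe retries
        "column_separator": ",",
        "Expect": "100-continue",
    }

    with open("orders_clean.csv", "rb") as f:
        data = f.read()

    # The FE may redirect to a backend node; requests re-sends the
    # in-memory body on a 307 redirect.
    resp = requests.put(url, headers=headers, data=data, auth=("my_user", "my_password"))
    print(resp.json())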

Preview the data to upload

After you set the destination table, you can adjust the file encoding and data mapping based on the data preview.

Note

You can preview only the first 20 rows of data.

  • File Encoding: If the data contains garbled text, you can switch the encoding format. UTF-8, GB18030, Big5, UTF-16LE, and UTF-16BE are supported. The sketch after this list shows how to convert a file to UTF-8 in advance.

  • Preview data and set destination table fields:

    • Upload data to an existing table: You must configure the mapping between the columns in the source file and the fields in the destination table. After the mapping is configured, the data can be uploaded. You can select Map By Column Name or Map By Position. After the mapping is complete, you can also customize the field names in the destination table.

      Note
      • If a column in the source data is not mapped to a field in the destination table, the data in that column is grayed out and is not uploaded.

      • A column in the source data cannot be mapped to multiple fields in the destination table.

      • The field name and field type cannot be empty. Otherwise, the data cannot be uploaded.

    • Upload data to a new table: You can use Smart Field Generation to automatically fill in field information, or you can manually modify the field information.

      Note
      • The field name and field type cannot be empty. Otherwise, the data cannot be uploaded.

      • The EMR Hive, Hologres, and StarRocks engines do not support creating a new table during data upload.

  • Ignore First Row: Specify whether to upload the first row of the file data, which is usually the column names, to the destination table.

    • Selected: If the first row of the file contains column names, the first row is not uploaded to the destination table.

    • Not selected: If the first row of the file contains data, the first row is uploaded to the destination table.
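
If switching the preview encoding does not fix garbled text, you can convert the file to UTF-8 before you upload it. A minimal sketch that uses the pandas package and assumes a hypothetical CSV file exported with GB18030 encoding; header=0 treats the first row as column names, which mirrors selecting Ignore First Row:

    import pandas as pd

    # Read with the source encoding; header=0 treats row 1 as column names.
    df = pd.read_csv("orders_gb.csv", encoding="gb18030", header=0)

    # Re-export as UTF-8 so that the upload preview renders correctly.
    df.to_csv("orders_utf8.csv", index=False, encoding="utf-8")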

Upload the data

After you preview the data, click the Data Upload button in the lower-left corner to upload the data.

What to do next

After the data is uploaded, you can click the Data Upload icon in the navigation pane on the left to return to the Data Upload page. Find the data upload task that you created and perform the following operations as needed:

  • Continue upload: In the Actions column, click Continue Upload to upload the data again.

  • Query data: In the Actions column, click Query Data to query and analyze the data.

  • View details of the uploaded data: Click the destination Table Name to go to Data Map and view the detailed information of the destination table. For more information, see General data query and management.

Appendix: Compliance statement for cross-border data upload

Important

If you perform cross-border data uploads, such as transferring data from mainland China to outside mainland China or between different countries or regions, read the related compliance statement in advance. Otherwise, the data upload may fail, and you will be held legally responsible.

Cross-border data operations will cause your business data in the cloud to be transferred to the region or product deployment area that you select. You must ensure that such operations comply with the following requirements:

  • You have the right to process the relevant business data in the cloud.

  • You have adopted sufficient data security protection technologies and policies.

  • The data transfer complies with the requirements of relevant laws and regulations. For example, the transferred data does not contain any content that is restricted or prohibited from being transferred or disclosed by applicable laws.

Alibaba Cloud reminds you that if your data upload operation may result in cross-border data transfer, you should consult with professional legal or compliance personnel before you perform the operation. Ensure that the cross-border data transfer complies with the requirements of applicable laws, regulations, and regulatory policies. For example, you must obtain valid authorization from personal information subjects, complete the signing and filing of relevant contract clauses, and complete relevant security assessments and other legal obligations.

If you perform cross-border data operations without complying with this statement, you will bear the corresponding legal consequences. You are also liable for any losses incurred by Alibaba Cloud and its affiliates.

FAQ

  1. Resource group configuration issue.

    Error message: The current file source or destination engine requires a resource group to be configured for data upload. Contact the workspace administrator to configure a resource group.

    Solution: To configure resource groups for an engine in DataAnalysis, see System administration.

  2. Resource group attachment issue.

    Error message: The global data upload resource group configured for your current workspace is not attached to the workspace to which the upload table belongs. Contact the workspace administrator to attach it.

    Solution: You can attach the resource group that you set in System Administration to the workspace.