After you create a pipeline, you can manage its details, column definition, and transformations as necessary.
Requirements
Data Prep is controlled entirely at the org level and does not recognize individual workspaces or their permissions.
This means:
- Data Prep is shared among all authorized users in your org.
- Any user with access to Chain Builder also has access to Data Prep.
- All users who can create or edit chains will have the ability to manage pipelines in Data Prep.
- A single Data Prep pipeline can be used across multiple chains and workspaces within an organization.
Edit a pipeline's column definition
You can update the name, data type, or format of a pipeline's column at any time. When you define a pipeline's column, select the type of data it contains:
| Data type | Description | Example values |
|---|---|---|
| String | A sequence of alphanumeric characters | California, 400010 |
| Integer | A whole number, with or without a thousands separator | 25, 37450 |
| Number | A number that includes a decimal, with or without a thousands separator | 15.75, 37865.95, 25,789.62 |
| Boolean | A true or false value | True or 1, False or 0 |
| Date | A date with a day, month, and year | 1/1/2021, 2021-01-01 |
| Time | A time of day | 14:37, 09:52:10 |
| DateTime | A date and timestamp | 2021-01-01T18:26:33 |
You can define a pipeline's columns manually, or use the column definition from a sample file or file upload.
To ease pipeline creation, we recommend you use a sample file to define its columns:
Note: To use a sample file, first upload it to Sample files.
- From Wdata Chains, click Data Prep.
- From Pipelines, open the pipeline.
- On the Columns tab, click Edit columns.
- Under Define columns, click Pick from list.
- Select the sample file with the column definition to use, and click OK.
Note: The sample file's column definition will replace any columns defined for the pipeline.
- Review the column definition, and edit the columns' names as necessary.
- Click Save.
Alternatively, to define the pipeline's columns, upload a file that has the same column definition:
Note: The file must be delimited and contain a header row.
- From Wdata Chains, click Data Prep.
- From Pipelines, open the pipeline.
- On the Columns tab, click Edit columns.
- Under Define columns, click Create from file.
- Browse to and select the file with the column definition to use, and click OK.
Note: The file's column definition will replace any columns defined for the pipeline.
- Review the column definition, and edit the columns' names and data types as necessary.
Note: Be sure to review and update the column definition. The pipeline takes column names from the file's header row and guesses each column's data type based on its data.
- Click Save.
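To illustrate why the guessed data types deserve a review, here is a minimal Python sketch of type inference over a delimited file with a header row. It is not Data Prep's actual logic, which isn't documented here: the function names (`guess_type`, `guess_columns`), the accepted date and time patterns, and the order of the checks are all assumptions made for illustration.

```python
import csv
import io
from datetime import datetime

# Hypothetical patterns, modeled on the example values in the data type table.
FORMATS = {
    "DateTime": ["%Y-%m-%dT%H:%M:%S"],
    "Date": ["%m/%d/%Y", "%Y-%m-%d"],
    "Time": ["%H:%M", "%H:%M:%S"],
}


def _parses(value, convert):
    """Return True if convert(value) succeeds without a ValueError."""
    try:
        convert(value)
        return True
    except ValueError:
        return False


def _matches_any(value, patterns):
    """Return True if the value matches at least one strptime pattern."""
    return any(
        _parses(value, lambda v, p=p: datetime.strptime(v, p)) for p in patterns
    )


def guess_type(values):
    """Guess the first type, in check order, that fits every non-empty value."""
    values = [v.strip() for v in values if v.strip()]
    if not values:
        return "String"
    checks = [
        ("Boolean", lambda v: v.lower() in {"true", "false", "1", "0"}),
        ("Integer", lambda v: _parses(v.replace(",", ""), int)),
        ("Number", lambda v: _parses(v.replace(",", ""), float)),
        ("DateTime", lambda v: _matches_any(v, FORMATS["DateTime"])),
        ("Date", lambda v: _matches_any(v, FORMATS["Date"])),
        ("Time", lambda v: _matches_any(v, FORMATS["Time"])),
    ]
    for name, check in checks:
        if all(check(v) for v in values):
            return name
    return "String"  # Fallback when no narrower type fits.


def guess_columns(text, delimiter=","):
    """Map each header name to the type guessed from that column's values."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    rows = list(reader)
    return {
        name: guess_type([row[i] for row in rows if i < len(row)])
        for i, name in enumerate(header)
    }


sample = (
    "state,postal_code,amount,active,opened\n"
    'California,400010,"25,789.62",True,2021-01-01\n'
    "Oregon,97201,15.75,0,1/1/2021\n"
)
guessed = guess_columns(sample)
print(guessed)
```

In this sketch, the `postal_code` column is guessed as Integer because every value happens to be numeric, even though a code like 400010 is usually better stored as a String. That is the kind of guess to correct before you save.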
To manually define a column:
- From Wdata Chains, click Data Prep.
- From Pipelines, open the pipeline.
- On the Columns tab, click Edit columns.
- Under Define columns, click Add columns.
- Select the column's data type.
- Enter a name and description to help identify the column.
- Specify the format of the column's data, based on its type:
- For a String column, select any special format, such as for universally unique identifiers (UUIDs), binary strings, email addresses, or uniform resource identifier (URI) web addresses.
- For an Integer column, select the thousands separator.
- For a Number column, enter the number of decimal places, and select the decimal and thousands separators.
- For a Date, Time, or DateTime column, select its string format time (strftime) format.
Note: A Boolean column contains values such as True or False, or 1 or 0.
- After you define all columns, click Save.
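The format options above can be sketched in Python, whose `datetime` module uses the same strftime-style directives. This is only an illustration: the patterns are matched to the example values from the data type table, `parse_number` is a hypothetical helper, and whether Data Prep accepts every directive that Python does is an assumption.

```python
from datetime import datetime
from decimal import Decimal

# strftime-style patterns paired with the example values from the data type table.
patterns = {
    "1/1/2021": "%m/%d/%Y",                      # Date
    "2021-01-01": "%Y-%m-%d",                    # Date
    "14:37": "%H:%M",                            # Time
    "09:52:10": "%H:%M:%S",                      # Time
    "2021-01-01T18:26:33": "%Y-%m-%dT%H:%M:%S",  # DateTime
}

for value, pattern in patterns.items():
    parsed = datetime.strptime(value, pattern)  # Raises ValueError on a mismatch.
    print(f"{value!r} matches {pattern!r}")


def parse_number(text, decimal_sep=".", thousands_sep=","):
    """Parse a formatted number with the given separators (hypothetical helper)."""
    cleaned = text.replace(thousands_sep, "").replace(decimal_sep, ".")
    return Decimal(cleaned)


print(parse_number("25,789.62"))
print(parse_number("25.789,62", decimal_sep=",", thousands_sep="."))
```

A pattern that doesn't match the data, such as `%d/%m/%Y` applied to `2021-01-01`, fails to parse in this sketch, which is why the selected format should mirror the values the column actually contains.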
Copy a pipeline
To quickly create a pipeline with columns or transformations similar to those of an existing pipeline, start with a copy of that pipeline:
- From Pipelines, click Copy for the existing pipeline.
- To rename the new pipeline, edit its name, and click OK.
- Edit the column definition or transformations as necessary.
- Click Publish.
Archive a pipeline
If you no longer use a pipeline, you can archive it so it's no longer active:
- From Pipelines, on the Active tab, click Archive for the pipeline.
- Alternatively, from the open pipeline, select Archive from its menu.
Note: To return an archived pipeline to Active status, from Pipelines, select the Archived tab, and click Unarchive for the pipeline.
Delete a pipeline
To completely remove a pipeline, you can delete it.
Note: Unlike an archived pipeline, a deleted pipeline can't be restored. Delete a pipeline only if you're sure you won't need it again.
- From Pipelines, archive the pipeline if it's active.
- On the Archived tab, click Delete for the pipeline.
- In Confirm, enter delete.
- Click Delete.