To ease the creation of pipelines or mapping groups, you can use sample files that represent some or all of the data to be transformed.
Note: The sample file is not transformed by the pipeline; it only helps streamline the pipeline's creation and enables a preview of the transformations' impact.
Requirements
Data Prep is controlled entirely at the org level and does not recognize individual workspaces or their permissions.
This means:
- Data Prep is shared among all authorized users in your org.
- Any user with access to Chain Builder also has access to Data Prep.
- All users who can create or edit chains will have the ability to manage pipelines in Data Prep.
- A single Data Prep pipeline can be used across multiple chains and workspaces within an organization.
Sample file specifications
To be used by a pipeline, the sample file must:
- Be no larger than 1 MB
- Include a header row
- Use one of these delimiters: comma (,), tab, pipe (|), or semicolon (;)
- Be viewable in a text editor, such as Notepad++, Wordpad, or Textpad
- Have a consistent data layout for all rows
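The specifications above can be checked before you upload a file. The following is a minimal sketch, not part of Data Prep: the function name `check_sample_file`, the reading of 1 MB as 1,000,000 bytes, and the UTF-8 decoding test for "viewable in a text editor" are all assumptions made for illustration. It cannot tell whether the first row is really a header; it only checks that the first row has no empty names.

```python
import csv
import io

MAX_BYTES = 1_000_000  # assumption: "1 MB" is treated as 1,000,000 bytes
DELIMITERS = [",", "\t", "|", ";"]  # the delimiters listed in the specifications


def check_sample_file(raw: bytes) -> list[str]:
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    if len(raw) > MAX_BYTES:
        problems.append("file is larger than 1 MB")
    try:
        # Assumption: a file that decodes as UTF-8 is viewable in a text editor.
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return problems + ["file is not readable as plain text"]
    lines = text.splitlines()
    if not lines:
        return problems + ["file is empty, so it has no header row"]
    # Pick the supported delimiter that occurs most often in the first row.
    delimiter = max(DELIMITERS, key=lines[0].count)
    if lines[0].count(delimiter) == 0:
        return problems + ["no supported delimiter found in the header row"]
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header = rows[0]
    if any(not name.strip() for name in header):
        problems.append("header row has an empty column name")
    # A consistent layout means every data row has as many fields as the header.
    for number, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"row {number} has {len(row)} fields, expected {len(header)}"
            )
    return problems
```

To use it, read the file in binary mode and pass the bytes, for example `check_sample_file(open("sample.csv", "rb").read())`, where `sample.csv` is a placeholder file name. Blank values inside a row are fine; only a differing field count is reported.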
Tip: While the header names and column order in the sample file don't need to match the actual data transformed by the pipeline, align the sample file and actual data when possible to avoid confusion and further ease pipeline creation.
For example:
PERIOD,YEAR,ENTITY,ACCOUNT,PRODUCT,AMOUNT
JAN,2021,US,SALES,REGULAR-COLA,12500
JAN,2021,US,SALES,DIET-COLA,10000
JAN,2021,US,SALARIES,,3000
Note: The columns defined by the sample file may contain null or blank values, such as the blank PRODUCT field in the example's last row.
Upload sample files
To upload a sample file:
- From Wdata Chains, click Data Prep.
Note: To access Data Prep from Wdata Chains, first set up a Data Prep connector.
- From Sample files, click Add files (+) next to the search bar.
- Under File upload, drag or browse to the file to upload.
- Under Columns, review and adjust the file's column definition as necessary.
- Click Save.
Edit a sample file's column definition
When you define a sample file's column, select the type of data it contains:
Data type | Description | Example values |
---|---|---|
String | A sequence of alphanumeric characters | California, 400010 |
Integer | A whole number, with or without a thousands separator | 25, 37450 |
Number | A number that includes a decimal, with or without a thousands separator | 15.75, 37865.95, 25,789.62 |
Boolean | A true or false value | True or 1, False or 0 |
Date | A date with a day, month, and year | 1/1/2021, 2021-01-01 |
Time | A time of day | 14:37, 09:52:10 |
DateTime | A date and timestamp | 2021-01-01T18:26:33 |
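To see how the data types differ, the sketch below tests whether a text value fits a given type. It is illustrative only and does not reproduce Data Prep's own parsing: the function name `matches_type` is hypothetical, and the accepted formats are assumptions drawn from the example values in the table (for instance, 1/1/2021 is read as month/day/year, and Boolean text is matched without regard to case).

```python
import re
from datetime import datetime

# Whole number, optionally grouped with commas as the thousands separator.
INTEGER = re.compile(r"-?(\d+|\d{1,3}(,\d{3})+)")
# Same as above, but a decimal part is required.
NUMBER = re.compile(r"-?(\d+|\d{1,3}(,\d{3})+)\.\d+")

# Assumed formats, based only on the table's example values.
FORMATS = {
    "Date": ["%m/%d/%Y", "%Y-%m-%d"],
    "Time": ["%H:%M", "%H:%M:%S"],
    "DateTime": ["%Y-%m-%dT%H:%M:%S"],
}


def _parses(value: str, formats: list[str]) -> bool:
    """Return True if the value parses with at least one of the formats."""
    for fmt in formats:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue
    return False


def matches_type(value: str, data_type: str) -> bool:
    """Return True if the text value fits the named data type."""
    if data_type == "String":
        return True  # any text can be kept as a string
    if data_type == "Integer":
        return INTEGER.fullmatch(value) is not None
    if data_type == "Number":
        return NUMBER.fullmatch(value) is not None
    if data_type == "Boolean":
        return value.lower() in {"true", "false", "1", "0"}
    if data_type in FORMATS:
        return _parses(value, FORMATS[data_type])
    raise ValueError(f"unknown data type: {data_type}")
```

A value can fit more than one type: 400010 fits both String and Integer, which is why the column definition asks you to choose the type rather than relying on the values alone.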
To edit a sample file's column definition:
- From Sample files, click the file's row.
- Click Columns, then adjust the data type and details as necessary.
- Click Save.
Note: To edit or delete a column from your sample file, use the Pipelines tab. Learn more about managing pipelines.
Pin a sample file to a pipeline
To enable a preview of the transformations a pipeline applies to data, pin a sample file with the same column definition:
- From Pipelines, open the pipeline.
- On the Files tab, click Pin file for the sample file.
Note: If necessary, click Upload sample files to upload the sample file to Sample files.
- Map the columns from the sample file to the pipeline's column definition.
Note: The pipeline automatically maps columns that have exactly the same name and data type. You can map only columns that share a data type.
- Click Submit.
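The automatic mapping rule described in the steps above can be pictured with a short sketch. It is an illustration of the rule only, not Data Prep's implementation; the function name `auto_map` and the dictionaries of column names and data types are hypothetical.

```python
def auto_map(sample_columns: dict[str, str], pipeline_columns: dict[str, str]):
    """Pair columns that share the exact same name and data type.

    Both arguments map a column name to its data type. Returns the automatic
    mapping (sample name -> pipeline name) and the pipeline columns that were
    not matched and would need to be mapped by hand.
    """
    mapping = {
        name: name
        for name, data_type in sample_columns.items()
        if pipeline_columns.get(name) == data_type
    }
    unmapped = [name for name in pipeline_columns if name not in mapping]
    return mapping, unmapped


# Hypothetical column definitions for a sample file and a pipeline.
sample = {"PERIOD": "String", "YEAR": "Integer", "AMOUNT": "Number"}
pipeline = {"PERIOD": "String", "YEAR": "String", "AMOUNT": "Number", "ENTITY": "String"}

mapping, unmapped = auto_map(sample, pipeline)
print(mapping)   # {'PERIOD': 'PERIOD', 'AMOUNT': 'AMOUNT'}
print(unmapped)  # ['YEAR', 'ENTITY']
```

Here YEAR is left unmapped because its data type differs between the two definitions, so it also could not be mapped manually until the types agree.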
Pin a sample file to a mapping group
To enable a preview of the rules a mapping group applies to its primary column, pin a sample file indicative of the data it will map to:
- From Mapping groups, open the mapping group.
- On the Files tab, click Pin file for the sample file.
- Under Match columns, select which columns from the file map to the mapping group's columns.
Note: You can only map columns with the same data type.
- Click Submit.
Delete a sample file
To remove a sample file you no longer need, go to Sample files and click Delete for that file.