Decode Stage

http://pic.dhe.ibm.com/infocenter/iisinfsv/v8r7/advanced/print.jsp?topic=...
Contents
1. Decode stage: fast path
2. Decode stage: Stage page
2.1. Decode stage: Properties tab
2.1.1. Decode stage: Options category
2.2. Decode stage: Advanced tab
3. Decode stage: Input page
3.1. Decode stage: Partitioning tab
4. Decode stage: Output page
IBM InfoSphere DataStage, Version 8.7.0
Decode stage
The Decode stage is a processing stage. It decodes a data set using a UNIX decoding command, such as gzip, that you supply, converting a stream of raw binary data into a data set. Its companion stage, Encode, does the reverse: it converts a data set from a sequence of records to a stream of raw binary data (see Encode Stage). Because the input is always a single stream, you do not have to define metadata for the input link.
The stage editor has three pages:
Stage Page. This is always present and is used to specify general information about the stage.
Input Page. This is where you specify details about the single input data set that is to be decoded.
Output Page. This is where you specify details about the processed data being output from the stage.
Release date: 2011-10-01
PDF version of this information: IBM InfoSphere DataStage and QualityStage Parallel Job Developer's Guide
Decode stage: fast path

This section specifies the minimum steps to take to get a Decode stage functioning.
About this task

InfoSphere DataStage has many defaults, which means that it can be very easy to include Decode stages in a job. InfoSphere DataStage also provides a versatile user interface with many shortcuts to achieving a particular end. This section describes the basic method; you will learn where the shortcuts are as you become familiar with the product.
Procedure
In the Stage Page Properties Tab, specify the UNIX command that will be used to decode the data, together with any required arguments. The command must read its input from STDIN and write its output to STDOUT.
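The STDIN-to-STDOUT contract described in this procedure can be tried outside InfoSphere DataStage before a command is entered in the stage. The following Python sketch is only an illustration and is not part of the product: the helper name run_filter is hypothetical, and the decoder used here is a small Python stand-in, chosen so that the example does not depend on gzip being installed.

```python
import gzip
import subprocess
import sys


def run_filter(command, data):
    """Feed `data` to `command` on STDIN and return what it writes to STDOUT."""
    result = subprocess.run(
        command,
        input=data,
        stdout=subprocess.PIPE,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout


# Stand-in decoder (an assumption for this sketch): it reads gzip data from
# STDIN and writes the decompressed bytes to STDOUT, as a decode command must.
PY_GUNZIP = [
    sys.executable,
    "-c",
    "import gzip, sys; "
    "sys.stdout.buffer.write(gzip.decompress(sys.stdin.buffer.read()))",
]

if __name__ == "__main__":
    encoded = gzip.compress(b"id,name\n1,alpha\n")
    decoded = run_filter(PY_GUNZIP, encoded)
    print(decoded.decode("ascii"), end="")
```

On a system where gzip is installed, calling run_filter(["gzip", "-dc"], encoded) should behave the same way, because gzip reads STDIN when no file name is given and the -c option sends the result to STDOUT.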
Decode stage: Stage page

The General tab allows you to specify an optional description of the stage. The Properties tab lets you specify what the stage does. The Advanced tab allows you to specify how the stage executes.
Decode stage: Properties tab

The Properties tab allows you to specify properties that determine what the stage actually does. This stage has only one property, and you must supply a value for it. The property appears in the warning color (red by default) until you supply a value.
Table 1. Properties
Category/Property: Options/Command Line
Values: Command Line
Default: N/A
Mandatory?: Y
Repeats?: N
Dependent of: N/A
Decode stage: Options category

Command line
Specifies the command line used for decoding the data set. The command line must configure the UNIX command to accept input from standard input and write its results to standard output. The command must be located in the search path of your application and be accessible by every processing node on which the Decode stage executes.
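One way to check the search-path requirement before running a job is a small script such as the following Python sketch. It is an illustration under stated assumptions: the function name command_on_path is hypothetical, and the check covers only the machine on which the script runs, so it would have to be repeated on every processing node.

```python
import shlex
import shutil


def command_on_path(command_line):
    """Return True if the program named first in `command_line` is on PATH."""
    parts = shlex.split(command_line)  # split as a POSIX shell would
    if not parts:
        return False
    # shutil.which searches the directories listed in the PATH variable.
    return shutil.which(parts[0]) is not None


if __name__ == "__main__":
    for candidate in ("gzip -dc", "no-such-decoder-xyz123 -d"):
        status = "found" if command_on_path(candidate) else "missing"
        print(f"{candidate!r}: {status}")
```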
Decode stage: Advanced tab

This tab allows you to specify the following:
Execution Mode. The stage can execute in parallel mode or sequential mode. In parallel mode the input data is processed by the available nodes as specified in the Configuration file, and by any node constraints specified on the Advanced tab. In sequential mode the entire data set is processed by the conductor node.
Combinability mode. This is Auto by default, which allows InfoSphere DataStage to combine the operators that underlie parallel stages so that they run in the same process if it is sensible for this type of stage.
Preserve partitioning. This is Propagate by default, which adopts Set or Clear from the previous stage. You can explicitly select Set or Clear. Select Set to request that the next stage in the job should attempt to maintain the partitioning.
Node pool and resource constraints. Select this option to constrain parallel execution to the node pool or pools or resource pool or pools specified in the grid. The grid allows you to make choices from drop-down lists populated from the Configuration file.
Node map constraint. Select this option to constrain parallel execution to the nodes in a defined node map. You can define a node map by typing node numbers into the text box or by clicking the browse button to open the Available Nodes dialog box and selecting nodes from there. You are effectively defining a new node pool for this stage (in addition to any node pools defined in the Configuration file).
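The Preserve partitioning rule can be read as a simple lookup: Propagate passes on whatever the previous stage requested, while an explicit Set or Clear overrides it. The following Python sketch models only that rule as it is worded in this topic; the function name and the string values are assumptions made for illustration, not a DataStage API.

```python
VALID_SETTINGS = ("Propagate", "Set", "Clear")
VALID_FLAGS = ("Set", "Clear")


def effective_preserve_partitioning(setting, upstream_flag):
    """Return the flag ('Set' or 'Clear') that this stage passes on."""
    if setting not in VALID_SETTINGS:
        raise ValueError(f"unknown setting: {setting!r}")
    if upstream_flag not in VALID_FLAGS:
        raise ValueError(f"unknown upstream flag: {upstream_flag!r}")
    # Propagate adopts the previous stage's flag; Set and Clear are explicit.
    return upstream_flag if setting == "Propagate" else setting


if __name__ == "__main__":
    print(effective_preserve_partitioning("Propagate", "Set"))
    print(effective_preserve_partitioning("Clear", "Set"))
```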
Decode stage: Input page

The Input page allows you to specify details about the incoming data set. The Decode stage expects a single incoming data set.
The General tab allows you to specify an optional description of the input link. The Partitioning tab allows you to specify how incoming data is partitioned before being decoded. The Columns tab specifies the column definitions of incoming data. The Advanced tab allows you to change the default buffering settings for the input link.
Details about Decode stage partitioning are given in the following section. See "Stage Editors" for a general description of the other tabs.
Decode stage: Partitioning tab

The Partitioning tab allows you to specify details about how the incoming data is partitioned or collected before it is decoded. It also allows you to specify that the data should be sorted before being operated on.
The Decode stage partitions in Same mode and this cannot be overridden.
If the Decode stage is set to execute in sequential mode, but the preceding stage is executing in parallel, then you can set a collection method from the Collector type drop-down list. This will override the default collection method.
The following collection methods are available:
(Auto). This is the default collection method for Decode stages. Normally, when you are using Auto mode, InfoSphere DataStage will eagerly read any row from any input partition as it becomes available.
Ordered. Reads all records from the first partition, then all records from the second partition, and so on.
Round Robin. Reads a record from the first input partition, then from the second partition, and so on. After reaching the last partition, the operator starts over.
Sort Merge. Reads records in an order based on one or more columns of the record. This requires you to select a collecting key column from the Available list.
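The difference between the Ordered, Round Robin, and Sort Merge collection methods is easiest to see on a small example. The following Python sketch models partitions as in-memory lists; the function names are hypothetical, and two behaviours are assumptions of the sketch rather than statements from this topic: Round Robin skips partitions that have run out of records, and Sort Merge expects each partition to be already sorted on the collecting key. Auto is not modelled because its result depends on when rows become available.

```python
import heapq
from itertools import chain


def collect_ordered(partitions):
    """All records from the first partition, then the second, and so on."""
    return list(chain.from_iterable(partitions))


def collect_round_robin(partitions):
    """One record from each partition in turn, starting over after the last."""
    iterators = [iter(p) for p in partitions]
    collected = []
    while iterators:
        still_active = []
        for it in iterators:
            try:
                collected.append(next(it))
            except StopIteration:
                continue  # assumption: an exhausted partition is skipped
            still_active.append(it)
        iterators = still_active
    return collected


def collect_sort_merge(partitions, key):
    """Merge partitions that are each already sorted on `key`."""
    return list(heapq.merge(*partitions, key=key))


if __name__ == "__main__":
    parts = [[1, 5, 9], [2, 3], [4, 6, 7, 8]]
    print(collect_ordered(parts))
    print(collect_round_robin(parts))
    print(collect_sort_merge(parts, key=lambda record: record))
```

With the sample partitions shown, Ordered keeps each partition contiguous, Round Robin interleaves them, and Sort Merge yields a single sequence sorted on the key.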
Decode stage: Output page

The Output page allows you to specify details about data output from the Decode stage. The Decode stage can have only one output link.
The General tab allows you to specify an optional description of the output link. The Columns tab specifies the column definitions for the decoded data. See "Stage Editors" for a general description of the tabs.