Sort & Icetool
1. Display the number of input or output records
I have some millions of records in my input file and I
want to know only the number of records present in that input file. Can DFSORT
display just the number of input records in an output file?
Here's a simple DFSORT job that will write just the
record count to SORTOUT as an 8-digit number:
//CTRCDS EXEC PGM=ICEMAN
//SYSOUT DD SYSOUT=_
//SORTIN DD DSN=... input file
//SORTOUT DD DSN=... output file
//SYSIN DD *
OPTION COPY
OUTFIL NODETAIL,REMOVECC,
TRAILER1=(COUNT=(M11,LENGTH=8))
/*
NODETAIL ensures that the data records are not included
in the SORTOUT file. REMOVECC ensures that a '1' for the ANSI eject character
is not included in the SORTOUT record. TRAILER1 with COUNT writes one
record with the count to the SORTOUT file. M11,LENGTH=8 formats the count as 8
digits with leading zeros. So if the input file had five thousand records,
SORTOUT would have:
00005000
Other edit masks could be used to format the count
differently, if appropriate. For example, M10,LENGTH=8 formats the count as 8
digits with blanks substituted for leading zeros. So if the input file had five
thousand records, SORTOUT would have:
5000
Like that,
…
//SYSIN DD *
OPTION COPY
OUTFIL REMOVECC,
TRAILER1=(‘XXXXXXX’,COUNT=(M11,LENGTH=8),DATENS=(DM4))
/*
whereas, ‘XXXXXXX’
can be any string which we need before displaying the count.
DATENS=(DM4) will print the system date in
dd/mm/yyyy format.
2. Sample records
An SORT/ICETOOL job that can
sample every 100th record using the SAMPLE=n parameter of OUTFIL.
//S1 EXEC PGM=ICEMAN
//SYSOUT DD SYSOUT=*
//SORTIN DD DSN=... input file
//SORTOUT DD DSN=... output file
//SYSIN DD *
OPTION COPY
OUTFIL STARTREC=100,SAMPLE=100
/*
In this case STARTREC=100 is
used to write record 100 to SORTOUT, and SAMPLE=100 is used to write records
200, 300, and so on to SORTOUT.
3a.
Use of STARTREC and ENDREC
STARTREC=n and ENDREC=m can be used to select a
sequential range of records to be included in each output data set. STARTREC=n
starts processing at the n’th record while ENDREC=m ends processing at the m’th
record. Here's an example of STARTREC=n and ENDREC=m:
//RANGE
EXEC PGM=ICEMAN
//SYSOUT DD
SYSOUT=*
//SORTIN DD
DSN=Y897797.INPUT2,DISP=OLD
//FRONT DD
DSN=Y897797.RANGE1,DISP=(NEW,CATLG),
// SPACE=(CYL,(5,5)),UNIT=SYSDA
//MIDDLE DD
DSN=Y897797.RANGE2,DISP=(NEW,CATLG),
// SPACE=(CYL,(5,5)),UNIT=SYSDA
//BACK DD
DSN=Y897797.RANGE3,DISP=(NEW,CATLG),
// SPACE=(CYL,(5,5)),UNIT=SYSDA
//SYSIN DD
*
OPTION COPY
OUTFIL FNAMES=FRONT,ENDREC=500
OUTFIL FNAMES=MIDDLE,STARTREC=501,ENDREC=2205
OUTFIL FNAMES=BACK,STARTREC=2206
Input records 1 through input record 500 are written
to the FRONT data set. Input record 501 through input record 2205 are written
to the MIDDLE data set. Input record 2206 through the last input record are
written to the BACK data set.
3b. Use INCLUDE, OMIT and
SAVE
INCLUDE/OMIT and SAVE can be used to select specific
records to be included in each output data set. The INCLUDE and OMIT operands
provide all of the capabilities of the INCLUDE and OMIT statements including
substring search and bit logic. SAVE can
be used to select the records that are not selected for any other subset,
eliminating the need to specify complex conditions.
//SUBSET
EXEC PGM=ICEMAN
//SYSOUT DD
SYSOUT=*_
//SORTIN DD
DSN=Y897797.INPUT3,DISP=OLD
//OUT1 DD
DSN=Y897797.SUBSET1,DISP=(NEW,CATLG),
// SPACE=(CYL,(5,5)),UNIT=SYSDA
//OUT2 DD
DSN=Y897797.SUBSET2,DISP=(NEW,CATLG),
// SPACE=(CYL,(5,5)),UNIT=SYSDA
//OUT3 DD
DSN=Y897797.SUBSET3,DISP=(NEW,CATLG),
// SPACE=(CYL,(5,5)),UNIT=SYSDA
//SYSIN DD *
OPTION COPY
OUTFIL INCLUDE=(8,6,CH,EQ,C'ACCTNG'),FNAMES=OUT1
OUTFIL INCLUDE=(8,6,CH,EQ,C'DVPMNT'),FNAMES=OUT2
OUTFIL SAVE,FNAMES=OUT3
Records with ACCTNG in positions 8-13 are included in
the OUT1 data set. Records with DVPMNT in positions 8-13 are included in the
OUT2 data set. Records without ACCTNG or DVPMNT in positions 8-13 are written
to the OUT3 data set.
3c. Use STOPAFT , ENDREC and SKIPREC
For example, to keep the
first 10 records,
you can use either:
OPTION COPY,STOPAFT=10
(or)
OPTION COPY
OUTFIL ENDREC=10
Similarly, to skip first 10
records,
OPTION COPY,SKIPREC=10
3d. Keep the last n records
See last page.
(Before
going to that, you should be aware of Seq. no., so, continue reading and at
last go through that.)
4. Change all zeros / strings in your records to
spaces / strings
Here's how you could change
all low values (X'00') to spaces (X'40'), in an FB data set with
an LRECL of 60:
ALTSEQ CODE=(0040)
OUTREC FIELDS=(1,60,TRAN=ALTSEQ)
Here's how you could change
all 'a' (X'81') and 'x' (X'A7') characters to 'b' (X'82')
and 'y' (X'A8') characters, respectively, in a VB input data set with
any LRECL:
ALTSEQ CODE=(81A7,82A8)
OUTREC FIELDS=(1,4,5,TRAN=ALTSEQ)
5. Using IFTHEN clauses
I have to replace all leading spaces (X'40') in
positions 1 to 6 with zeroes (X'F0'). The input file has
RECFM=FB and LRECL=80 and looks like this:
0001XX 20041210
7 S5
2044X
X4407 20041110
X8768978
25 0003
Non-leading blanks, as for example the fourth
character in the 7 S5 record, should not be replaced by zeros.
So the output file should look like this:
0001XX 20041210
007 S5
02044X
00X4407 20041110
X8768978
000025 0003
In this case, we check for
6-1 leading blanks and replace them with 6-1 leading zeros, respectively, as
follows:
//SYSIN DD *
OPTION
COPY
INREC
IFTHEN=(WHEN=(1,6,CH,EQ,C’ ’),OVERLAY=(6C‘0’)),
IFTHEN=(WHEN=(1,5,CH,EQ,C‘
’),OVERLAY=(5C‘0’)),
IFTHEN=(WHEN=(1,4,CH,EQ,C‘
’),OVERLAY=(4C‘0’)),
IFTHEN=(WHEN=(1,3,CH,EQ,C‘
’),OVERLAY=(3C‘0’)),
IFTHEN=(WHEN=(1,2,CH,EQ,C‘
’),OVERLAY=(2C‘0’)),
IFTHEN=(WHEN=(1,1,CH,EQ,C‘
’),OVERLAY=(C‘0’))
/*
6. DISPLAY Operator
To
print the values or characters of specified numeric or character fields in a
separate list data set. Simple, tailored, or sectioned reports can be produced.
e.g.
DISPLAY
FROM(CODASD) LIST(OUT) ON(28,6,PD) ON(18,4,ZD) ON(1,15,CH)
(28,6,PD) (18,4,ZD) (1,15,CH)
+000000000005200 +000000000000020 Aspen
+000000000007351
+000000000000032 Boulder
7. RANGE Operator
To filter out records which are having the values
between 2 extremes?
e.g. 1 - for picking records
which are having values X; and X is
-1500 < X < +8000.
RANGE
FROM(CADASD) ON(28,6,PD) HIGHER(-1500) LOWER(+8000)
──┬───
───┬─── ──┬── ──┬──
│ │ │ └── Upper limit for range
│
│ └──────── Lower limit for range
│ └────── ────────location
└──────────────────────
dd name of input data set
e.g. 2 - RANGE
FROM(ALL) ON(18,4,ZD) LOWER(32) - for giving
records which are having values less than 32 at 18th position.
8a. SELECT Operator
- You can keep only the first record for each value
(FIRST operand), only the last record for each value (LAST
operand),
- or only those records with values that occur:
More
than once, that is, duplicate values (ALLDUPS operand)
- Only once, that is, non-duplicate values (NODUPS
operand)
- A specified number of times (EQUAL operand)
- More than a specified number of times (HIGHER
operand)
- Less than a specified number of times (LOWER
operand)
e.g. SELECT
FROM(BKIN) TO(BKOUT) ON(106,4,CH) HIGHER(4)
BKIN is the ddname for the sample input data set. BKOUT is the ddname of the output data set
that will contain the records for each respective field value that occurs more
than 4 times.
8b. DISCARD Parameter:
To
save the records which are not satisfy as per our condition?
e.g. SELECT FROM(IN1)
TO(OUT1) ON(1,60,CH) ALLDUPS DISCARD(OUT2)
and here, whichever records
are unique will be pushed to OUT2.
8c. SPLICE
Operator
SPLICE is a new ICETOOL operator that helps you to perform various
file "join" and "match" operations. SPLICE allows you to
create output records in a variety of ways by splicing together fields from
records that have the same key, but different information.
e.g. SPLICE FROM(T1) TO(OUT12) ON(1,1,CH) WITH(13,1) USING(CTL3) KEEPNODUPS
SPLICE overlays the WITH fields from the overlay record to the
base record using matching ON fields, it's
usually necessary to do some initial setup before using SPLICE, to
ensure that:
- the ON
fields are in the same positions in the base and overlay records
- the
WITH fields in the overlay records are in the positions they will occupy in the
base records
- the
base records and overlay records are the same length.
(Note: For more detail, refer to document SORT New Enhancements at \\mbpnas\DATA\ACC_T444_DATA\Mainframe\Programming
Languages\Mainframes\JCL)
9. Set RC=12 or RC=4 if file is empty, has more than n
records, etc
I need to check if a data set is empty or not. If it's
empty I want to skip all other steps. Is there any way to check for empty data
sets?
I would like to skip certain steps in
my job if a file is empty. This would be easiest if a utility would generate a
non-zero RC if the input file is empty.
I
have several datasets that I need to sort together. How can I terminate if the
total count of records in these data sets is greater than 5000.
I
have a file that always has a header and trailer record and may or may not have
data records. Is there anyway to check if there are no data records in the
file?
//STEP1 EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//IN DD DSN=...
//TOOLIN DD *
* SET RC=12 IF THE
'IN' DATA SET IS EMPTY, OR
* SET RC= IF THE
'IN' DATA SET IS NOT EMPTY
COUNT FROM(IN) EMPTY
/*
….
….
….
// IF STEP1.RC =
THEN
//*** STEP2 WILL
RUN IF 'IN' IS NOT EMPTY
//*** STEP2 WILL
NOT RUN IF 'IN' IS EMPTY
//STEP2 EXEC ...
...
// ENDIF
10. Join fields from two files on a key
File A has the following
records:
000 $$$$$
111 AAAAA
222 CCCCC
333 EEEEE
444 GGGGG
and File B has the following
records:
111 BBBBB
333 DDDDD
444 FFFFF
555 HHHHH
I want to join the data
fields for pairs of records with the same key to get the following output:
111 AAAAA BBBBB
333 EEEEE DDDDD
444 GGGGG FFFFF
Below is an ICETOOL job that can do it. The trick is
to reformat the fields of the IN1 and IN2 files so you can join them with the
SPLICE operator. SPLICE helps you perform various file join and match
operations.
//DFSORT
EXEC PGM=ICETOOL
//TOOLMSG
DD SYSOUT=*
//DFSMSG DD
SYSOUT=*
//IN1 DD
DSN=... file 1
//IN2 DD
DSN=... file 2
//TMP1 DD
DSN=&&TEMP1,DISP=(MOD,PASS),SPACE=(TRK,(5,5)),UNIT=SYSDA
//OUT DD
DSN=... output file
//TOOLIN DD
*
* For this
example, the fields (p,m) are as follows:
* IN1: sort
key - 1,3
* f1fld -
5,5
* IN2: sort
key - 1,3
* f2fld -
5,5
*
* Reformat
the IN1 data set so it can be spliced
COPY
FROM(IN1) TO(TMP1) USING(CPY1)
* Reformat
the IN2 data set so it can be spliced
COPY
FROM(IN2) TO(TMP1) USING(CPY2)
* Splice
records with matching IN1/IN2 keys
SPLICE
FROM(TMP1) TO(OUT) ON(1,3,CH) WITH(11,5)
/*
//CPY1CNTL
DD _
* Use
OUTREC to create |key |f1fld |blank|
OUTREC
FIELDS=(1:1,3,5:5,5,11:5X)
/*
//CPY2CNTL
DD _
* Use
OUTREC to create:|key |blank |f2fld|
OUTREC
FIELDS=(1:1,3,11:5,5)
/*
11. Join fields from two files record-by-record
I want to merge two files laterally record by record.
For example:
If FILE1 contains:
AAAAA
BBBBB
CCCCC
and FILE2 contains:
11111
22222
33333
then my output file should contain:
AAAAA11111
BBBBB22222
CCCCC33333
Normally, we're asked how to join files on a key, but
in this case there is no key; we just want to join the files record-by-record.
Well, no problem, we'll just create a key we can use by adding a sequence
number (1, 2, ...) to the records in file1 and a sequence number (1, 2, to the
records in file2. Then we can join the fields from the 1 records, from the 2
records, and so on using the same technique we used for joining files on a key.
//S1 EXEC
PGM=ICETOOL
//TOOLMSG
DD SYSOUT=*
//DFSMSG DD
SYSOUT=*
//IN1 DD
DSN=... file1
//IN2 DD
DSN=... file2
//TMP1 DD
DSN=&&TEMP1,DISP=(MOD,PASS),SPACE=(TRK,(5,5)),UNIT=SYSDA
//OUT DD
DSN=... output file
//TOOLIN DD
*
* Reformat
the IN1 data set so it can be spliced
COPY
FROM(IN1) TO(TMP1) USING(CTL1)
* Reformat
the IN2 data set so it can be spliced
COPY
FROM(IN2) TO(TMP1) USING(CTL2)
* Splice
records with matching sequence numbers.
SPLICE
FROM(TMP1) TO(OUT) ON(11,8,PD) WITH(6,5) USING(CTL3)
/*
//CTL1CNTL
DD _
* Use
OUTREC to create: |f1fld|blank|seqnum|
OUTREC
FIELDS=(1:1,5,11:SEQNUM,8,PD)
/*
//CTL2CNTL
DD _
* Use
OUTREC to create: |blank|f2fld|seqnum|
OUTREC
FIELDS=(6:1,5,11:SEQNUM,8,PD)
/*
//CTL3CNTL
DD _
* Use
OUTFIL OUTREC to remove the sequence number
OUTFIL
FNAMES=OUT,OUTREC=(1,10)
/*
3d. Keep the last n records
But keeping the last n
records of an input data set is a bit more challenging since
DFSORT doesn't have any
built-in functions to do that.
The
trick is to attach a sequence number to the records and sort them in descending
order by the sequence number.
I also recommend reading
following documents present at \\mbpnas\DATA\ACC_T444_DATA\Mainframe\Programming
Languages\Mainframes\JCL
- IBM
Storage Storage Software DFSORT-MVS Smart DFSORT Tricks.html
- Sort
and Icetool
- SORT
New Enhancements and
- Dfsort-application
programming guide
- DFSORT
Tricks
~~~~~
The End ~~~~
Additions…
12. Using SORT we can even do arithmetic operations also.
For eg. OUTREC
FIELDS=(1,4,5:6,6,11:1X,(12,11,ZD,MUL,-1),ZD,LENGTH=11,58X) will multiply number 11 digit number starting from 12
by -1.
Comments
Post a Comment