Sort Utilities
Sort Utilities
Sort Utilities
PIC X(20).
X(15).
X(15).
PIC 99.
X(28).
In this JCL
SORTIN represents the input file
SORTOUT represents the output file
SORT FILEDS=(1,20,CH,A) is the sort control statement.
//V021984S JOB ,,NOTIFY=&SYSUID,CLASS=U,MSGLEVEL=(1,1),
//
MSGCLASS=X
//***************************************************************
//** THIS JOB EXPLAINS DIFFERENT TYPE OF SORT UTILITIES
**
//***************************************************************
//SORT
EXEC PGM=SORT
//SYSOUT
DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//SORTIN
DD DSN=V021984.SORT.INPUT.FILE,DISP=SHR
//SORTWK01 DD UNIT=SYSDA,SPACE=(CYL,(1,1))
//SORTOUT DD DSN=V021984.SORT.OUTPUT.FILE,
//
DISP=(NEW,CATLG,DELETE),
//
UNIT=SYSDA,SPACE=(TRK,(10,10),RLSE),
//
DCB=(MODEL,RECFM=FB,LRECL=80,BLKSIZE=800)
//***************************************************************
//** SORT CARD FOR SORTING THE FILE BY ONE FIELD
**
//***************************************************************
//SYSIN
DD *
SORT FIELDS=(1,20,CH,A)
/*
Here the input file is sorted first by the category and later by their
ranks.
FORMAT:
If a file is sorted by two fields which are of same data type then the
keyword FORMAT can be used as shown below.
//SYSIN
DD *
SORT FIELDS=(21,15,CH,A,36,15,CH,A)
/*
OR
//SYSIN
DD *
SORT FIELDS=(21,15,A,36,15,A),FORMAT=CH
/*
//SYSIN
DD *
SORT FIELDS=COPY
/*
The Sort card for this will need information like range of columns
within which you want the search to be performed. To search in the
whole file you can give 1 to the LRECL of the file as done above. We can
also replace the EQ parameter by NE which will fetch us all the
records which dont have the specified string.
Skip Header Records
When you want to sort a file there might be some header records
which you may want to skip. This can be done with the following sort
card.
SORT FIELDS=COPY,
SKIPREC=9
The above sort card will copy the Input file to the Output file but
will skip the first 9 records. Please note that SKIPREC is a parameter of
the SORT verb and not a statement.
Sort Certain Number Of Records.
When you want to sort a file you might need only a certain
number of records in the Output file without having any specific
condition. This can be done with the following sort card.
SORT FIELDS=COPY,
STOPAFT=1000
The above sort card will copy the Input file to the Output file but
will copy only the first 1000 records. Please note the STOPAFT is a
parameter of the SORT verb and not a statement. These above two
parameters i.e. SKIPREC and STOPAFT can also be combined together
SORT FIELDS=COPY,
SKIPREC=9,STOPAFT=1000
The above sort card will skip the first 9 records and copy the next
1000 records to the output file.
This statement tells the JCL that includes only those records/rows in
the output file in which the value of the field which starts at column
21 of length 20 is equal to the value of a field which starts at column
41 of length 20.
SORT FIELDS=COPY
INCLUDE COND=(21,20,CH,EQ,40,21,CH)
This statement tells the JCL that includes only those records/rows in
the output file in which the value of the field which starts at column
41 of length 20 is equal to the character string INDIA. In our case
it writes only those rows to the output file in which the country name
is INDIA.
SORT FIELDS=COPY
INCLUDE COND=(41,20,CH,EQ,C'INDIA')
SORT FIELDS=COPY
INCLUDE COND=(71,2,CH,GE,C'40')
SORT FIELDS=COPY
INCLUDE COND=(71,2,BI,LT,X'0000000A')
SORT FIELDS=COPY
INCLUDE COND=(71,2,CH,GE,C'40',AND,21,20,CH,EQ,CINDIA)
In a case where the field length of 2 fields which are being compared
is not same then the depending on the type of data in field it is
truncated. If the filed has a character data then it will be truncated
or padded with spaces to the right.
INCLUDE COND=(21,20,CH,EQ,CINDIA')
Here the field is compared with value INDIA and not INDIA
.
Similarly in INCLUDE COND=(71,2,CH,EQ,C'987')
Here the field is compared with value 98 and not 987.
The binary fields are truncated and padded with zeroes to the left if
the length mismatch is there between the items being compared.
OMIT COND:
OMIT is used to exclude records based upon some criteria.
SORT FIELDS=COPY
OMIT COND=(41,30,CH,EQ,C'ENGLAND')
OUTREC Statement-1
SORT FIELDS=COPY
OUTREC FIELDS=(01,20,71,2)
In this case the OUTPUT file will only contain 2 fields, first starting
from 1st column and of length 20 and second field starting from column
71 and of length 2 (INPUT FILE).
These fields will be written in the OUTPUT file one after the other.
OUTREC Statement-2
SORT FIELDS=COPY
OUTREC FIELDS=(71,2,01,20)
OUTREC Statement-3
SORT FIELDS=COPY
OUTREC FIELDS = (2Z,71,2,01,20)
OUTREC Statement-4
SORT FIELDS=COPY
OUTREC FIELDS = (71,2,10X,01,20)
OUTREC Statement-5
SORT FIELDS=COPY
OUTREC FIELDS=(01,20,Cis of 71,2,C years age)
OUTREC Statement-6
SORT FIELDS=COPY
OUTREC FIELDS=(11:01,20,Cis of 71,2,C years age)
When we use the INREC and the OUTREC statement the columns
which the user has formatted comes as desired. If any value in between
are not populated by the user then it is replaced with the blank
character (Hex value: 40). But after the end of the user formatted
columns the rest of the space till LRECL contains the Invalid/Nondisplay characters (HEX value: 00). This might cause problems in case
if the Output file is going to be used in any COBOL programs.
In order to overcome this problem we pass a blank character
(Hex value: 40) as the last column in the record of the Output file.
Hence as a result all the blank values within the file have a HEX value
as 40.
SORT FIELDS=(1,10,CH,A)
OUTREC FIELDS=(1:10,10,
11:1,5,
80:1x)
The LRECL in the above case is 80 and hence a blank character has
been moved to that particular column so that there are no Invalid/NonDisplay characters in the Output files.
INREC- CONTROL STATEMENT
To filter/reformat the data coming on the input file and then
sorting the filtered/reformatted data and writing it to the Output file
'INREC' can be used. This increases the performance of the sort as the
records are first filtered/reformatted. Hence the whole file is not
required to be sorted and only the filtered/reformatted fields are sorted
depending on the condition. The sort in this case is performed on the
Output file structure and not on the Input file structure.
Because INREC reformats the records before they are sorted, the SORT
statements must refer to the reformatted records as they will appear in
the output
INREC FIELDS=(1:1,6,
10:32,10,
21:43,5)
SORT FIELDS=(10,10,CH,A)
In the above SORT card we include only the EMPLOYEE NO, CITY and
the SALARY fields in the Output file. The file is then sorted on the basis
of CITY. Please note that the Output record structure is being used for
the SORT. Along with reformatting the values present in the Input file
we can also add characters to the records using the INREC statement.
INREC FIELDS=(1:32,10,
15:1,6,
23:C'**',
25:43,5)
SORT FIELDS=(1,10,CH,A)
The above Sort card will put ** at the position 23 in the formatted data
of Output file before sorting. This will be done for all the records
coming on the Input file.
Along with this the EDIT parameter can also be used to format
the field values coming from the Input file. Some of the examples using
the EDIT parameter are
INREC FIELDS=(1:32,10,
15:1,6,
23:C'**',
25:43,5,EDIT(TTT-TT)
SORT FIELDS=(1,10,CH,A)
In the above sort card if the value at position 43 on the Input file is
00000 then it will be displayed as 000-00 on the Output file. We can
have a comma (,) or any other character in place of the hyphen (-).
INREC FIELDS=(1:32,10,
15:1,6,
23:C'**',
25:43,5,EDIT(IIIIT)
SORT FIELDS=(1,10,CH,A)
In the above Sort card if the value at position 43 on the Input file is
00001 then on the Output file
1 will be displayed. The preceding
zeros are removed.
Using the Sum Fields Statement
The Sum Fields statement can be used to sum the values of a
particular field for all those records which have the same values for the
fields given in the SORT card condition. A simple SUM FIELDS
statement will look like.
SORT FIELDS=(32,10,CH,A)
SUM FIELDS=(43,5,ZD)
The above SORT card will sum the values of the Salary field of all
the records which have the same value for the City field. Thus on the
Output file there will only be one record written per City. The City name
and the Salary (cumulative value for that city) will be written and the
rest of the fields will have the values of the first occurrence of that
particular City. Here ZD means Zoned Decimal and the Sum Fields
statement requires the type of a field for it to sum. It can also be
Packed Decimal (PD).
When using the Sum Fields statement you might observe the
output always comes in the Signed format. In order to circumvent this
to display the Output in the human readable format we can reformat
the Output of the Sum filed statement using the EDIT Statement.
.
SORT FIELDS=(32,10,CH,A)
SUM FIELDS=(43,5,ZD)
OUTREC FIELDS=(1:32,10,
15:43,5,ZD,EDIT(TTT-TT),
80:1X)
In the above Sort card the sum of fields will be formatted with the
help of EDIT Statement and hence a value of 2500{will look like 25000.
Creating Multiple Output Data Sets and Reports (OUTFIL)
With OUTFIL control statement, we can:
Create multiple output data sets from a single pass over the input
data set. These data sets can have the same or different records and
field arrangements because each OUTFIL statement can specify one
or more OUTFIL DD names and can have its own INCLUDE or OMIT
and OUTREC parameters as well as report parameters.
Create detailed reports with three levels (report, page, and section)
of report elements such as:
- Headers
- Trailers
- Totals
- Counts
- Maximums
- Minimums
- Averages
- Page control via line counters
OUTFIL- CONTROL STATEMENT
Creating multiple output data sets from a single pass over the input
data set-1
//SYSOUT DD SYSOUT=A
//SYSPRINT DD SYSOUT=A
//SORTIN DD DSN=T67835.SORT.INPUT.FILE,DISP=SHR,
//SORTOF1 DD DSN=T67835.SORT.OUTPUT1,DISP=OLD
//SORTOF2 DD DSN=T67835.SORT.OUTPUT2,DISP=OLD
//SORTOF3 DD DSN=T67835.SORT.OUTPUT3,DISP=OLD
//SYSIN DD *
OPTION COPY
OUTFIL FILES=(1,2,3)
/*
Note: SAVE specifies that OUTFIL input records not included for any
other OUTFIL groups are to be included in the data sets for this
OUTFIL group.
Table lookup:
You can also introduce table lookups in DFSORT. It is like if you want
to replace ENGLAND which is present in Input file with United
Kingdom in output file. This can be achieved using DFSORT table
lookups. OUTREC parameter is used for this purpose.
PROCESSING ORDER OF CONTROL STATEMENTS:
//V019548R JOB ,,NOTIFY=&SYSUID,CLASS=D,MSGLEVEL=(1,1),MSGCLASS=X
//SORT EXEC PGM=SORT
//SYSOUT DD SYSOUT=A
//SYSPRINT DD SYSOUT=A
//SORTIN DD DSN=T67835.SORT.INPUT.FILE,DISP=SHR,
//ENGL DD DSN=T67835.SORT.OUTPUT,DISP=OLD
//
//SYSIN DD *
SORT FIELDS=(1,75,CH,A),
OUTFIL FNAMES=ENGL,INCLUDE=(41,30,CH,EQ,CENGLAND'),
OUTREC= (
CHANGE=(30,C'ENGLAND',C'United Kingdom'),
NOMATCH=(C'UNKNOWN'),
/*
Making Reports
The SORT Program can also be used to prepare Formatted
Reports. This is one of the most powerful features of SORT and also
one of the more under utilized once. We can save a lot of time by
having the SORT program make reports for us instead of writing
COBOL programs for it.
We will be using the same record Structure as shown above to
make the Report.
SORT FIELDS=(1,6,CH,A)
OUTFIL OUTREC=(5:1,6,
21:7,25,
50:32,10,
65:43,5,
132:1X),
HEADER2=(02:'REPORT NO:',
14:'RPT 1',
40:'COMPANY NAME',
60:'DATE:',
66:&DATE,
1/,40:'NAME OF REPORT',
60:'PAGE:',
66:&PAGE,
2/,05:'EMPLOYEE NO',
21:'EMPLOYEE NAME',
50:'CITY',
65:'SALARY',
1/),
LINES=35
RPT 1
COMPANY NAME
NAME OF REPORT
DATE:
PAGE:
EMPLOYEE NO
EMPLOYEE NAME
CITY
SALARY
000120
000121
000122
000123
000124
000125
000126
000127
000128
000129
000130
000131
NAME1 SURNAME1
NAME2 SURNAME2
NAME3 SURNAME3
NAME4 SURNAME4
NAME5 SURNAME5
NAME6 SURNAME6
NAME7 SURNAME7
NAME8 SURNAME8
NAME9 SURNAME9
NAME10 SURNAME 10
NAME11 SURNAME 11
NAME12 SURNAME 12
PUNE
DELHI
PUNE
BOMBAY
BOMBAY
DELHI
MOHALI
PUNE
MOHALI
BOMBAY
PUNE
ARKONNAM
02000
00300
05000
12000
11000
02000
00500
07000
00700
02000
05000
00300
Points to Note
The report which is generated will be 133 characters long. In this
case we put 1X at position 132 and not at the last position i.e. 133
as at the last position the new line character is present which is
useful while printing the report and helps the printer to go to the
new line.
The &DATE will get the system date and display in the format
MM/DD/YY.
The LINES=35 will make the report to have a page break after every
35 lines. Thus after every 35 lines it will display the header section
and then continue with the rest of the records.
The &PAGE will calculate the page no i.e. after every 35 lines have
been displayed the page no will be incremented by one.
The 1/ and 2/ signifies the amount of lines to be skipped before
the next line is to be written and till / doesnt arrive it is assumed to
be the same line.
The record format for the output record structure should be FBA.
The Output DD should look like.
DSN=Output-file-name,
DISP=(,CATLG,DELETE),SPACE=(CYL,(5,5),RLSE),
DCB=(LRECL=133,BLKSIZE=1330,RECFM=FBA)
&DATE
&DATE(YMD/)
&DATE(MYD-)
&DATE(4DM-)
&DATE(D4M/)
:
:
:
:
:
05/01/02
02/05/01
05-02-01
2002-01-05
01/2002/05
&TIME
&TIME(24:)
&TIME(12:)
&TIME(24.)
&TIME(12-)
:
:
:
:
:
01:30:00
13:30:00
01:30:00 PM
13.30.00
01-30-00 PM
The above clause will sum the salary values and give that total
at the End of the page. So in the SORT card above it will add the
Salary values of 35 employees and give the Total at the end of the
page. Please note that the Salary values coming on the Input file can
be formatted as required. This is because the data for the summation
is taken from the Input file.
Accumulating data at the end of Report:
We usually also have the need to accumulate the data for some
particular fields at the end of the report. This can be done using the
Trailer1 parameter
TRAILER1=(40:'TOTAL SALARY SUM : ',
65:TOTAL(43,5,ZD,EDIT(IIITT)),
The above clause will sum up all the values coming in the Salary fields
and will display it only once when all the records in the report has
been displayed.
Other parameters which can be used with Trailer1 and/or
Trailer2:
The other parameters which can be used in Trailer1 and Trailer2 are
COUNT
TOTAL
MAX
MIN
:
:
:
:
AVG
The Section clause above asks the Sort program to check from
32 for 10 characters which is the position of the City field. For any
change in the City the total for that City is written and the rest of the
Page is skipped. The new page starts with the report header and
continues in the same vein. Where Report For : is written in front
the City name appears in the Report Header as also for Section Wise
Summary Total.
Points to Note:
REPORT NO:
RPT 1
COMPANY NAME
REPORT FOR :ARKONNAM
EMPLOYEE NO
EMPLOYEE NAME
CITY
000131
NAME12 SURNAME 12
ARKONNAM
RPT 1
COMPANY NAME
REPORT FOR :BOMBAY
EMPLOYEE NO
EMPLOYEE NAME
CITY
000123
000124
000129
NAME4 SURNAME4
NAME5 SURNAME5
NAME10 SURNAME 10
BOMBAY
BOMBAY
BOMBAY
RPT 1
COMPANY NAME
REPORT FOR :DELHI
EMPLOYEE NO
EMPLOYEE NAME
CITY
000121
000125
NAME2 SURNAME2
NAME6 SURNAME6
DELHI
DELHI
SALARY SUM FOR DELHI
REPORT NO:
RPT 1
COMPANY NAME
DATE: 02-05-14
PAGE:
1
SALARY
300
-----------:
300
-----------DATE: 02-05-14
PAGE:
2
SALARY
12000
11000
2000
-----------:
25000
-----------DATE: 02-05-14
PAGE:
3
SALARY
300
2000
-----------:
2300
-----------DATE: 02-05-14
EMPLOYEE NAME
CITY
000126
000128
NAME7 SURNAME7
NAME9 SURNAME9
MOHALI
MOHALI
SALARY SUM FOR MOHALI
REPORT NO:
RPT 1
COMPANY NAME
REPORT FOR :PUNE
EMPLOYEE NO
EMPLOYEE NAME
CITY
000120
000122
000127
000130
NAME1 SURNAME1
NAME3 SURNAME3
NAME8 SURNAME8
NAME11 SURNAME 11
PUNE
PUNE
PUNE
PUNE
PAGE:
SALARY
500
700
-----------:
1200
-----------DATE: 02-05-14
PAGE:
5
SALARY
2000
5000
7000
5000
-----------:
19000
-----------************
47800
************
SORT D 10
SORT 16 20
SORT A 10 20 D 30 35
descending
SORT .FROM .TO D 10 20 - Sorts columns 10 to 20 in descending order for the labeled section
SORT X
SORT NX
SORT FIELDS=COPY
INCLUDE COND=(447,02,CH,NE,X'0000') For low values