Hi,
For a requirement, I want to delete the duplicate records from a file, but I do not want to sort the records in the output file. They should appear in the output in the same order as they are in the input.
Is this even possible in SORT? By default it always sorts the records.
Keep the SORT order in output as input while removing the duplicates.
- New Member
- Posts: 3
- Joined: Sun Jun 19, 2016 7:44 pm
- Website Team
- Posts: 70
- Joined: Wed Jul 31, 2013 10:19 pm
Re: Keep the SORT order in output as input while removing the duplicates.
I can think of a two-step solution here:
1. In INREC, add a sequence number at the end of each record, remove the duplicates using your key, and create an output file that still carries the sequence number.
2. Sort the output file from step 1 on the sequence number and write the final output file. Remove the sequence number in OUTREC.
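As a rough sketch of the two steps above, the control statements might look something like the following. The layout is an assumption for illustration only: fixed-length 80-byte records with the key in columns 1-10, so the sequence number goes in columns 81-88. EQUALS is specified so that the first record of each set of duplicates is the one kept. Each group of statements would go in the SYSIN of its own step.

```
* Step 1 SYSIN: tag each record with its input position, then
* drop duplicates on the key (assumed to be columns 1-10).
  INREC OVERLAY=(81:SEQNUM,8,ZD)
  SORT FIELDS=(1,10,CH,A),EQUALS
  SUM FIELDS=NONE
```

```
* Step 2 SYSIN: put the survivors back in input order and
* strip the sequence number (assumed to be columns 81-88).
  SORT FIELDS=(81,8,ZD,A)
  OUTREC BUILD=(1,80)
```

Adjust the positions, lengths and key format to your actual record layout.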
Thanks,
Chandan
- Global Moderator
- Posts: 490
- Joined: Sun Aug 25, 2013 7:24 pm
Re: Keep the SORT order in output as input while removing the duplicates.
Don't sort the data twice when you may not even need to sort it once.
You may have to do one SORT if records with the same key are not contiguous but you still want them treated as duplicates. In that case you do need two SORTs, as outlined above.
However, you may instead want to deduplicate only contiguous keys whilst leaving the same key value elsewhere untouched, in which case SORT is an extremely bad thing to do.
Assuming you don't need to SORT, how about IFTHEN WHEN=GROUP with KEYBEGIN and PUSH with SEQ (long enough to cover the maximum number of duplicates)? Then use OUTFIL INCLUDE= to select the records where the sequence number is "one", and BUILD to drop the sequence number.
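A minimal sketch of that idea, again assuming fixed-length 80-byte records with the key in columns 1-10 (both are placeholders, not taken from the original post):

```
* Copy only, no SORT. KEYBEGIN starts a new group whenever the
* key (assumed columns 1-10) changes; SEQ numbers the records
* within each group starting from 1.
  OPTION COPY
  INREC IFTHEN=(WHEN=GROUP,KEYBEGIN=(1,10),PUSH=(81:SEQ=8))
* Keep the first record of each group and drop the sequence number.
  OUTFIL INCLUDE=(81,8,ZD,EQ,1),BUILD=(1,80)
```

Note that this only removes duplicates that are next to each other. If the same key value appears again later in the file, it starts a new group and its first record is kept.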