Reading variable length records in Sequencial Stage

Shaina Austin · Post by **Shaina Austin** » Wed Jul 07, 2004 8:45 am

Hi All:

I have a fixed width input file which has 3 record types - A, B and C.

My job has to produce a csv output file with all all the fields of record type B, 2 fields of record type A and one of C. The value of fields from A and C is repeated for all the records in the output file.

All the fields are of variable length. I am unable to read data with variable length in a sequencial stage.

Can you please advice me on how to do this.

Thanks in advance.

Shaina

tonystark622 · Post by **tonystark622** » Wed Jul 07, 2004 9:06 am

Are each of the record types the same length, i.e. are type A records always the same width, type B records always the same width and type C records always the same width? If so, probably the easiest way to read this file would be with a complex flat file stage. If not, you can probably read each row in as one big field (variable length) and split the fields out manually, or use the row splitter stage). How I would implement this depends on the data in the file. Do these records always have the same number of rows (always A, B, C or A, B, B, C, etc) and occur in a given sequence (always A,B.C or sometimes A, B, C ; sometimes A, C; Sometimes just A) ? Are there fields in A, B, and C that tie the records together?

If you can tell us more about the data, we can probably give you better advise on how to implement this.

Tony

Shaina Austin · Post by **Shaina Austin** » Wed Jul 07, 2004 9:37 am

Thanks Tony,

There is only one record of type A and C each. All other records are type B. Now I have to produce an output file with A.Field1, A.Field2, C.Field1, and all the fields of B.

A.Field1, A.Field2, C.Field1 have only one value which has to be replicated for all the records in the file.

The records always occur in the sequence of A(1 record), B(n no of records), C(1 record). There are no fields in the these records which connect them.

Shaina

tonystark622 · Post by **tonystark622** » Wed Jul 07, 2004 10:00 am

Shaina,

Is there any data in the A or B records that is unique and could be used as a lookup for the C data? If so, save that unique data in a stage variable.

You may have to read each row as a single long variable length field and pull data from each row by using the substring function or the Field[start,length] syntax (look up [] in the help file).

For A records save the Fields from A in Stage variables and don't output the A record. The way to do this is in the derivation of the stage variable put something like If (Whatever determines that this is an A record) Then In.Field1 Else StageVarName.

For A or B records that contain a unique identifier you may have to save the unique identifier in a stage variable.

For B records, write the Stage Variables with the B data and the unique identifier to the next stage (probably a flat file).

When you read a C record, write the unique identifier stage variable and the C fields you need to a hash file.

You can write A and B data from the first transformer to a flat file. Output the flat file to a second transformer. In the second transformer do a look up agains the hash file that you created in the first transformer and get the C fields you need. Output the A, B and C fields to your output flat file.

I know this sounds complicated, but it's pretty easy. Yell if you have any further questions.

Tony

ray.wurlod · Post by **ray.wurlod** » Wed Jul 07, 2004 3:54 pm

Sounds like A=header, B=detail and C=trailer.
Search for these terms on this Forum to get more ideas.
There are at least three different solutions, even without using a Complex Flat File stage.

elavenil · Post by **elavenil** » Wed Jul 07, 2004 9:57 pm

Hi,

As Ray mentioned, it looks like header, detail & trailer records.

Create field for header, detail and trailer records and assign a value as '1' and load header & trailer records into a hash file and use these hash files as look up to get the values from header & trailer records.

Hope this would help.

Regards
Saravanan

Shaina Austin · Post by **Shaina Austin** » Thu Jul 08, 2004 12:17 pm

Thanks All.

I did try outputting header and trailer to a hash file and then output all required fields from the detail and hash files to a seq file. But the header and trailer fields do not get populated. Hence I tried using stage variables and call them in the end seq file, but still the header and trailer fields do not get populated.

Can you suggest why they are not getting populated or is there an alternative to this....

ray.wurlod · Post by **ray.wurlod** » Thu Jul 08, 2004 5:03 pm

The "why" is almost certainly an issue with your constraint and/or derivation expressions. Are you able to post your design overview, with the actual constraint expressions that you're using?

rasi · Post by **rasi** » Thu Jul 08, 2004 6:05 pm

Things you need to look at

1. Check your filter conditions to seprate header and footer records.
2. Try to do this first and check your hash file
3. If you can see your hash files with header and footer record then check your join condition with your detail records.

Once if you had checked the above mentioned, I can't see a reason "Why" your job is not running.

Regards
Rasi

willpeng · Post by **willpeng** » Thu Jul 08, 2004 6:43 pm

This is what I think you are trying to do:

Record A1-1 A1-2 A1-3 A1-4
Record B1-1 B1-2 B1-3 B1-4
Record B2-1 B2-2 B2-3 B2-4
Record B3-1 B3-2 B3-3 B3-4
Record C1-1 C1-2 C1-3 C1-4

Into

A1-1, A1-1, B1-1, B1-2, B1-3, B1-4, C1-1
A1-1, A1-1, B2-1, B2-2, B2-3, B2-4, C1-1
A1-1, A1-1, B3-1, B3-2, B3-3, B3-4, C1-1

I will do this like this.

Flat -> Transform -> 2 hashes, 3 link outs

Link 1 out to Hash 1 - All records minus first and last, Natural key for record
Link 2 out to Hash 2 - Constraint: RowCount = 1, key = "Head", Value = Record F1@FMF2
Link 3 out to Hash 2 - Constraint: None, key = "Tail", Value = Record F1

Then

Use Hash 1 as drive Hash and Hash 2 as look up.

Key Hash 2 for Head and Field[@FM, 1, 1] for Col 1 Field[@FM, 2, 1] for Col 2.
Move all Hash 1 cols to Output Cols
Key Hash 2 for Tail and move Col to output Cols.

This is rough design, you may need to iron out some code issue.

Let me know if you try this design and if it works for you.

William