Flat File and Custom Data Format Support

Comprehensive Support for a Range of Formats


Data Integration Suite provides built-in support for many common flat file formats. Using its extensibility mechanism, developers can also build custom XML converters for enterprise-specific file formats.

Overall, Data Integration Suite supports a wide variety of flat file formats, each processed by either a built-in XML converter or a user-defined, custom XML converter. Conversion runs in a streaming environment, which provides scalability without sacrificing performance. The formats fall into four broad categories:

  • Flat Files – such as comma- and tab-separated value files
  • Fixed-width Files – like dumps from databases or certain EDI-like formats
  • Tagged Files – which contain multiple row types within a single data stream
  • Hybrid Files – those that contain mixtures of the above qualities
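
To make the idea of a tagged file and streaming conversion concrete, here is a minimal Python sketch. It is not the Data Integration Suite converter or its API; the row tags (HDR, ITM, TRL), the field names, and the output element names are all hypothetical, chosen only for illustration. Each input line is converted and emitted on its own, so the whole file never has to be held in memory.

```python
import io
from xml.sax.saxutils import escape

# Hypothetical layout: the first comma-separated field names the row type,
# and each row type has its own list of fields.
ROW_FIELDS = {
    "HDR": ["batch_id", "created"],
    "ITM": ["sku", "quantity", "price"],
    "TRL": ["item_count"],
}

def tagged_to_xml(lines, root="batch"):
    """Yield XML text one chunk per input row (streaming, no full-file buffer)."""
    yield f"<{root}>"
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        tag, *values = line.split(",")
        names = ROW_FIELDS.get(tag)
        if names is None or len(values) != len(names):
            raise ValueError(f"unrecognized or malformed row: {line!r}")
        cells = "".join(f"<{n}>{escape(v)}</{n}>" for n, v in zip(names, values))
        element = tag.lower()
        yield f"<{element}>{cells}</{element}>"
    yield f"</{root}>"

if __name__ == "__main__":
    sample = io.StringIO("HDR,42,2024-01-05\nITM,A&B,2,9.50\nTRL,1\n")
    for chunk in tagged_to_xml(sample):
        print(chunk)
```

Because the function is a generator, the caller can write each chunk to the output as soon as it is produced. The sketch splits on plain commas and does not handle quoted fields.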

Built-In Flat File Conversion


The following file types are supported out of the box:

  • Base-64
  • Binary
  • CSV
  • dBase (II, III, III+, IV, V)
  • DIF
  • DotD
  • HTML
  • JavaProps
  • JSON
  • Pyx
  • RTF
  • SDI
  • SYLK
  • Tab-separated values
  • Whole-line text
  • Windows .ini file
  • Windows Write
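
As a rough illustration of what converting one of these formats to XML involves, the following Python sketch turns a Windows .ini file into an XML document using only the standard library. It does not reproduce the output of the built-in converter; the element and attribute names (ini, section, entry, name) are assumptions made for this example.

```python
import configparser
import xml.etree.ElementTree as ET

def ini_to_xml(text):
    """Convert .ini text to an XML string (hypothetical element names)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # preserve the case of key names
    parser.read_string(text)

    root = ET.Element("ini")
    for section in parser.sections():
        sec = ET.SubElement(root, "section", name=section)
        for key, value in parser.items(section):
            entry = ET.SubElement(sec, "entry", name=key)
            entry.text = value  # ElementTree escapes special characters on output
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    print(ini_to_xml("[display]\nWidth=800\nTitle=A & B\n"))
```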

Custom Flat File Conversion


When it comes to XML conversion, Data Integration Suite understands more than just textual data types. A file will often contain binary information in any number of specialized formats, from hardware-specific types like BCD or IEEE floats and doubles to cross-platform standards such as COBOL Comp3 or ISO 8601 date-time.

Rather than resorting to extension functions, developers specify the exact native data type of each field, and Data Integration Suite handles the conversion automatically. Supported data types include:

  • BCD (Binary Coded Decimal)
  • Binary (including the W3C Schema types base64Binary and hexBinary)
  • Boolean (including support for null or unknown values)
  • Byte (8-bit integers)
  • Comp3 (the COBOL internal format; "IBM Packed")
  • Date (in multiple languages)
  • DateTime
  • Decimal (from System.Decimal on .NET)
  • Double
  • Float
  • Integer (32-bit integers)
  • Long (64-bit integers)
  • Number (unlimited-precision numbers)
  • Short (16-bit integers)
  • String
  • Time
  • Zoned ("IBM Zoned" mainframe datatype)
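
To show the kind of work that native-type conversion saves, here is a hedged Python sketch that decodes two of the mainframe types listed above: Comp3 (packed decimal) and Zoned decimal. It is not Data Integration Suite code. The function names and the scale parameter (the number of implied decimal places) are hypothetical, and the zoned decoder assumes the common EBCDIC layout in which the sign is carried in the high nibble of the last byte.

```python
from decimal import Decimal

_NEGATIVE = {0xB, 0xD}            # sign nibbles meaning "negative"
_POSITIVE = {0xA, 0xC, 0xE, 0xF}  # sign nibbles meaning "positive" or "unsigned"

def _to_decimal(digits, sign_nibble, scale):
    """Build an exact Decimal from digit nibbles, a sign nibble, and a scale."""
    if any(d > 9 for d in digits):
        raise ValueError("invalid digit nibble")
    if sign_nibble not in _NEGATIVE | _POSITIVE:
        raise ValueError("invalid sign nibble")
    sign = 1 if sign_nibble in _NEGATIVE else 0
    return Decimal((sign, tuple(digits), -scale))

def unpack_comp3(data, scale=0):
    """Packed decimal: two digits per byte; the last low nibble is the sign."""
    if not data:
        raise ValueError("empty field")
    nibbles = []
    for byte in data:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    sign_nibble = nibbles.pop()
    return _to_decimal(nibbles, sign_nibble, scale)

def unpack_zoned(data, scale=0):
    """Zoned decimal (assumed EBCDIC): one digit per byte in the low nibble;
    the high nibble of the last byte is the sign."""
    if not data:
        raise ValueError("empty field")
    digits = [byte & 0x0F for byte in data]
    return _to_decimal(digits, data[-1] >> 4, scale)

if __name__ == "__main__":
    print(unpack_comp3(bytes.fromhex("12345C"), scale=2))
    print(unpack_zoned(bytes.fromhex("F1F2D3")))
```

Building the Decimal from a digit tuple keeps the result exact regardless of field length, which matters for financial data where these types are typical.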