Formats for Importing Data into a CAAT

One of the challenges involving the use of Computer Aided Audit Tools and Techniques (CAATs) when testing application controls is the transfer of information from the production system being audited into the auditor’s CAAT of choice.

There are a large number of data formats that could be used, depending on the DBMS (database management system), application software, or platform of the system being audited. The extracted data file could be a delimited text file, an XML file, an MS Excel spreadsheet, a PDF file, a dBase .dbf file, or even a connection to the database itself via ODBC (Open Database Connectivity).

In the article, the author ranks Data File formats by how efficient they are for importing data into a CAAT. His ranking, from most efficient to least efficient is:

  • dBase dbf file.
  • Adobe PDF File exported from system (not a scanned image).
  • Microsoft Excel XLS or XLSX file.
  • Delimited text file, such as CSV or TXT file.
  • XML file.
  • Miscellaneous others. (Tommie W. Singleton 2010)

I don’t agree with his ranking – I think that trying to extract data from a PDF file is more difficult and challenging than using almost any of the other methods.

There are limits to how many rows of data an MS Excel file can utilize. In the older XLS format, there is a 65,536 row limit (Microsoft Corporation 2012). In the newer XLSX file format, the limit is 1,048,576 rows. (Microsoft Corporation n.d.)

XML has become more common as a data extaction and communication tool. However, XML data use data dictionaries, and the auditor may not have access to the audited system’s data dictionary.

In the article, the author states that the “ideal format” is typically a flat file composed of rows of data. The first row contains column heading names, while subsequent rows contain the data.

CAAT Flat File

CAAT Flat File

After the data has been imported into the CAAT, it’s important to verify that it is exactly the same as the data in the operational system. One method of verification is similar to the batch transmittal sheet method. This involves using metrics about the data itself, such as the number of data records, summing total dollar amount or quantity columns or other similar kinds of data.

The author concludes that CAATs provide an effective and efficient method for meeting the goals of the audit. However, the most difficult step of the process may be putting the data into a format that can be used by the CAAT. (Tommie W. Singleton 2010)

Bibliography

Microsoft Corporation. Excel specifications and limits. n.d. http://office.microsoft.com/en-us/excel-help/excel-specifications-and-limits-HA103980614.aspx (accessed 03 09, 2014).

—. Text files that are larger than 65,536 rows cannot be imported to Excel 97, Excel 2000, Excel 2002 and Excel 2003. December 19, 2012. http://support.microsoft.com/kb/120596 (accessed March 09, 2014).

Tommie W. Singleton, Ph.D., CISA, CGEIT, CITP, CMA, CPA. Data Extraction, A Hindrance to Using CAATs . June 01, 2010. http://www.isaca.org/Journal/Past-Issues/2010/Volume-6/Pages/Data-Extraction-A-Hindrance-to-Using-CAATs.aspx (accessed 03 09, 2014).

 

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>