CSV file reading and writing
Comma Separated Value or CSV files are a common format typically used to represent the contents of a spreadsheet or database (or any other 2D matrix of data cells); they can by read and written by programs such as Microsoft Excel and Microsoft Access. The main properties of CSV files are as follows:
- Data rows are separated by end-of-line sequences.
- Data columns are separated by commas.
- Quotation marks must be escaped using the convention "".
- Data cells containing commas or quotes must be enclosed in quotes.
- Any data cell may optionally be enclosed in quotes.
Here is an example of the contents of a CSV file:
1997,Ford,E350,"AC, ABS, moon roof",3000.00 1999,Chevy,"Venture ""Extended Edition""",,4900.00
For more details see this Wikipedia article.
Records with no quote characters
If you know that no records contain " (quote) characters, or are willing to ignore them, the following code will read and write a CSV file.
REM Read CSV file from in% to array$()
imax%=0:REPEAT
s$=FNrd(in%)
FOR record%=0 TO rmax%-1
i%=INSTR(s$+",",","):r$=LEFT$(s$,i%-1):s$=MID$(s$,i%+1)
IFLEFT$(r$,1)="""":IF RIGHT$(r$,1)="""":r$=MID$(r$,2,LENr$-2)
array$(imax%,record%)=r$
NEXT record%
imax%=imax%+1
UNTIL EOF#in%
REM Write CSV file from array$() to out%
FOR item%=0 TO imax%-1
FOR record%=0 TO rmax%-1
s$=array$(item%,record%)
IF INSTR(s$,",")<>0 THEN s$=""""+s$+""""
PROCwr(out%,s$+",")
NEXT record%
PTR#out%=PTR#out%-1:BPUT#out%,13
NEXT item%
This code uses imax% to hold the maximum number of items
and rmax% to hold the maximum number of records, and
PROCwr() to write a string to a file and
FNrd() to read a terminated line from a file.
PROCwr() and FNrd() are in the
StringIO library.
Records containing quote characters
The following code will read and write CSV files if you need to deal with records containing " (quote) characters.
REM Read CSV file from in% to array$()
imax%=0:REPEAT
s$=FNrd(in%)
FOR record%=0 TO rmax%-1
i%=INSTR(s$+",",","):r$=LEFT$(s$,i%-1):s$=MID$(s$,i%+1)
IFLEFT$(r$,1)="""":IF RIGHT$(r$,1)="""":r$=MID$(r$,2,LENr$-2)
i%=0
REPEAT
i%=INSTR(r$,"""",i%+1):IF i%:r$=LEFT$(r$,i%)+MID$(r$,i%+1)
UNTIL i%=0
array$(imax%,record%)=r$
NEXT record%
imax%=imax%+1
UNTIL EOF#in%
REM Write CSV file from array$() to out%
FOR item%=0 TO imax%-1
FOR record%=0 TO rmax%-1
r$=array$(item%,record%)
i%=0
REPEAT
i%=INSTR(r$,"""",i%+1):IF i%:r$=LEFT$(r$,i%)+MID$(r$,2,i%):i%=i%+1
UNTIL i%=0
IF INSTR(r$,",")<>0 THEN r$=""""+r$+""""
PROCwr(out%,r$+",")
NEXT record%
PTR#out%=PTR#out%-1:BPUT#out%,13
NEXT item%
Note that this code is vunerable to String too long errors if escaping embedded characters result in the output string becoming too long.
Jgharston 20:20, 29 August 2008 (BST)