How not to do it
The impetus for this blog posting is a project I was assigned which accessed CSV files all the wrong ways. I was taken aback recently with a 2016 era project when I saw it include code to read each line in a text file (System.IO.File.ReadAllLines) and use Array.Split to process the CSV and at the same time to do it in a flawed manner. An example of this travesty against Alan Turing and all things good is as follows:private static DataTable ReadCSVFile()
{
string[] rows = File.ReadAllLines(GetCSVFilename());
string[] columnNames = null;
DataTable table = new DataTable();
if (rows.Length == 0)
{
return table; // error empty CSV file
}
columnNames = rows[0].Split(',');
foreach (string columnName in columnNames)
{
table.Columns.Add(columnName);
}
object[] rowValues = new object[columnNames.Length];
// start from 1 b/c rowindex=0 is the header row of column names
for (int rowIndex = 1; rowIndex < rows.Length; rowIndex++)
{
string rowLine = rows[rowIndex];
// flaw in code -- any string containing a column will
// generate an extra column
string[] columnValues = rowLine.Split(',');
if (columnNames.Length != columnValues.Length)
{
throw new Exception(
"Split() failed because there is a comma in the data.");
}
columnIndex < columnValues.Length;
columnIndex++)
{
rowValues[columnIndex] = columnValues[columnIndex];
}
table.Rows.Add(rowValues);
}
return table;
}
Pay close attention to the code marked in boldface which is a core flaw in how the CSV values are parsed. This bug is the reason it is better to use established technologies. The bug show up at line 291 due this text data that contains a column:
For urgent issues, do not send e-mail
How to access a CSV file using ADO.NET's OLE DB Provider
This article will demonstrated a cleaner way to convert a CSV file into a DataTable. The previously referenced MSDN article that demonstrates CSV file to DataTable code is actually flawed with respect to how connections to CSV files are handled and how the command text related to accessing values is written. The example presented here will address these issues and said example is as follows:
public static DataTable GetDataTableFromCSV(string folderPath,
string csvFileName)
{
DataTable table = new DataTable();
string connectionText = String.Format(
"Provider=Microsoft.Jet.OLEDB.4.0;Data Source='{0}';" +
"Extended Properties='text;HDR=Yes';",
folderPath);
string commandText = String.Format(
"SELECT * FROM [{0}]", csvFileName);
using (OleDbConnection connection =
new OleDbConnection(connectionText))
{
connection.Open();
using (OleDbDataAdapter command =
new OleDbDataAdapter(commandText, connection))
{
command.Fill(table);
command.Dispose();
}
connection.Close();
}
}
The connection text in the previous code includes the folder in which the CSV file resides. The folder path is assigned to the Data Source property of the connection string as follows:
"Provider=Microsoft.Jet.OLEDB.4.0;Data Source='{0}';" +
"Extended Properties='text;HDR=Yes';",
"Extended Properties='text;HDR=Yes';",
Notice that the {0} String.Format parameters is bracketed by single quotes: '{0}'. This single quotes protect the code from any special characters in the folder path that would cause an error in the connection string.
The command text is a SQL SELECT where the CSV filename is the value specified for the FROM clause:
"SELECT * FROM [{0}]"
Notice that the {0} String.Format parameters is bracketed by square brackets. In a perfect would this would protect the command execution from special characters in the filename.
Accessing the Source Code
Each project used in this application can be found in Github for user SoftwarePronto (https://github.com/softwarepronto/Blog). The solution associated with blog is:
The pertinent projects are:
Each project includes a sample CSV file that is part of the project and is copied to the build folder. The settings associated with supporting thsi were discussed in a previous posting, "Visual Studio Installer Projects: Installing PDFs and other Content with an Install Project".
- ADODotNetReadingCSVFiles
The pertinent projects are:
- BarbaricCSVAccess
- CSVToDataTableUsingADODotNet
Each project includes a sample CSV file that is part of the project and is copied to the build folder. The settings associated with supporting thsi were discussed in a previous posting, "Visual Studio Installer Projects: Installing PDFs and other Content with an Install Project".
No comments :
Post a Comment