
Cinchoo ETL - CSV Reader

Cinchoo
4.93 (82 votes)


24 May 2020
CPOL
Simple CSV file reader for .NET
ChoETL is an open source ETL (extract, transform, and load) framework for .NET. It
is a code-based library for extracting data from multiple sources, transforming it,
and loading it into your own data warehouse in a .NET environment. This article
covers the CSVReader component offered by the ChoETL framework, a simple utility
class for extracting CSV data from a file or other source.
Download source code
Download binary (.NET Framework)
Download binary (.NET Standard / .NET Core)
Contents
1. Introduction
2. Requirement
3. "Hello World!" Sample
3.1. Quick load - Data First Approach
3.2. Code First Approach
3.3. Configuration First Approach
3.4. Code First with Data Annotation
3.5. Code First with Fluent API
4. Reading All Records
5. Read Records Manually
6. Customize CSV Record
7. Customize CSV Header
8. Customize CSV Fields
8.1. DefaultValue
8.2. ChoFallbackValue
8.3. Type Converters
8.3.1. Declarative Approach
8.3.2. Configuration Approach
8.3.3. Custom Value Converter Approach
8.4. Validations
8.5. ChoIgnoreMember
8.6. StringLength
8.7. Display
8.8. DisplayName
9. Excel Field Separator
10. Callback Mechanism
10.1. Using CSVReader events
10.2. Implementing IChoNotifyRecordRead interface
10.3. BeginLoad
10.4. EndLoad
10.5. BeforeRecordLoad
10.6. AfterRecordLoad
10.7. RecordLoadError
10.8. BeforeRecordFieldLoad
10.9. AfterRecordFieldLoad
10.10. RecordLoadFieldError
10.11. SkipUntil
10.12. DoWhile
10. Customization
11. AsDataReader Helper Method
12. AsDataTable Helper Method
13. Using Dynamic Object
13.1. DefaultValue
13.2. ChoFallbackValue
13.3. FieldType
13.4. Type Converters
13.5. Validations
14. Working with sealed POCO object
15. Exceptions
16. Tips
16.1. Multiline CSV column value
16.2. CSV column value with field separator
16.3. CSV column value with single quotes
17. Using MetadataType Annotation
18. Configuration Choices
18.1. Manual Configuration
18.2. Auto Map Configuration
18.3. Attaching MetadataType class
19. LoadText Helper Method
20. Advanced Topics
20.1. Override Converters Format Specs
20.2. Currency Support
20.3. Enum Support
20.4. Boolean Support
20.5. DateTime Support
21. Fluent API
21.1 WithDelimiter
21.2. WithFirstLineHeader
21.3. WithFields
21.4. WithField
21.5. QuoteAllFields
21.6. ColumnCountStrict
21.6. ColumnOrderStrict
21.8. NotifyAfter
21.9. Configure
21.10. Setup
21.11. IgnoreHeader
21.12. WithEOLDelimiter
21.13. WithHeaderLineAt
22. FAQ
22.1. I want to read file with headers
22.1. File has lines I want to skip them from loading
22.2. Not all characters are read from data file
22.3. How to handle multiline records
22.4. How to handle errors and log them
22.5. How to handle column headers having spaces or special characters in them?
22.6. Failed to format DateTime during load?
22.7. How to handle duplicate CSV columns?
22.8. What if CSV header is found other than first line?
22.9. How to turn off Records loaded or other trace messages?
22.10. How to specify default values to CSV columns?
22.11. Does Cinchoo ETL support hierarchical object support?
22.12. Does the Cinchoo driver auto discover column datatypes?
22.13 How to comment or ignore line in CSV file?
22.14 How to bulk copy CSV file to SqlServer (any db)?
22.15. What is the best way to get column names from CSV file?
22.16. Is there a way to parse CSV, mapping by varied column names?
22.17. How to extract CSV file to DataTable?
22.18. How to extract CSV file to DataReader?
22.19. How to handle column values with quotes in it?
22.20. How to read a CSV file into a typed DataTable?
22.21. How can I find empty lines in csv file for tracking purpose?
22.22. How can I change the order of the fields?
22.23. How to turn off column type discovery?
22.24. How to skip lines conditionally during CSV parsing?
22.25. How to stop CSV parsing conditionally?
22.26. Does the Cinchoo reader support loading CSV to child classes?
22.27. How to turn the validation on child object?
22.28. How to specify CSV column size to member?
22.28. How to specify CSV column name to member?
22.29. How to handle null special values in CSV?
22.30 Does Cinchoo handle currency values?
22.31 Can Cinchoo read different record types in same CSV?
22.32. How to validate CSV file?
22.33. How to sort large CSV file?
22.34 How to ignore a member from CSV loading?
22.35. How to handle custom datetime formatted values?
22.36. How to handle custom boolean values
22.37. How to handle custom boolean values
22.38. How to read CSV to Collection / Array members?
22.39. How to preserve leading / padded zeros correctly in Excel?
22.40. Does ChoCSVReader support a multiline header?
22.41. How to load CSV to Dictionary Member of POCO type?
22.42. Can duplicate names while reading dynamic records be handled?
22.43. Can ChoCSVReader find delimiter in CSV file automatically?
23. ChoTSVReader
24. References
1. Introduction
ChoETL is an open source ETL (extract, transform, and load) framework for .NET. It
is a code-based library for extracting data from multiple sources, transforming it,
and loading it into your own data warehouse in a .NET environment. This article
covers the CSVReader component offered by the ChoETL framework, a simple utility
class for extracting CSV data from a file or other source.

Corresponding CSVWriter article can be found here.

Features:

Ultra-fast CSV Reader, parses CSV file with 15 columns / 500,000 records in 15
seconds.
Stream based parsers allow for ultimate performance, low resource usage, and nearly
unlimited versatility scalable to any size data file, even tens or hundreds of
gigabytes.
Event based data manipulation and validation allows total control over the flow of
data during the bulk insert process.
Follows CSV standard file rules. Gracefully handles data fields that contain commas
and line breaks.
In addition to comma, most delimiting characters can be used, including tab
delimited fields.
Exposes an IEnumerable list of objects - which is often used with LINQ queries for
projection, aggregation, filtering, etc.
Supports deferred reading.
Supports processing files with culture specific date, currency and number formats.
Supports different character encodings.
Recognizes a wide variety of date, currency, enum, boolean and number formats when
reading files.
Provides fine control of date, currency, enum, boolean, number formats when writing
files.
Detailed and robust error handling, allowing you to quickly find and fix problems.
Performance:

This framework was tested with test files containing 15 fields and varying numbers
of records (250,000, 500,000, 1M, 1.5M, 2M, 2.5M, etc.). The table below shows the
total time taken to load each file. The test was performed on an Intel Core i5 /
8 GB RAM machine.


No of Records    Run time (secs)
250,000          7
500,000          15
1,000,000        29
1,500,000        44
2,000,000        58
2,500,000        72

2. Requirement
This framework library is written in C# for .NET Framework 4.5 / .NET Core 2.x.

3. "Hello World!" Sample


Open VS.NET 2013 or higher
Create a sample VS.NET (.NET Framework 4.5) Console Application project
Install ChoETL via the Package Manager Console, using the NuGet command appropriate
to your .NET environment:
Install-Package ChoETL
Install-Package ChoETL.NETStandard
Use the ChoETL namespace
Let's begin by looking at a simple example of reading a CSV file with two columns.

Listing 3.1 Sample CSV data file (Emp.csv)

Id,Name
1,Tom
2,Carl
3,Mark
There are a number of ways you can get CSV file parsing started with minimal
setup.

3.1. Quick load - Data First Approach


This is the zero-config, quick way to load a CSV file in no time. No POCO object is
required. The sample code below shows how to load the file.

Listing 3.1.1 Load CSV file using iterator

foreach (dynamic rec in new ChoCSVReader("Emp.csv")
.WithFirstLineHeader())
{
Console.WriteLine($"Id: {rec.Id}");
Console.WriteLine($"Name: {rec.Name}");
}
Listing 3.1.2 Load CSV file using loop

var reader = new ChoCSVReader("Emp.csv").WithFirstLineHeader();
dynamic rec;

while ((rec = reader.Read()) != null)


{
Console.WriteLine($"Id: {rec.Id}");
Console.WriteLine($"Name: {rec.Name}");
}
You can also access CSV fields by index. The sample below shows how to access them
by index.

Listing 3.1.3 Access by index

foreach (dynamic rec in new ChoCSVReader("Emp.csv")
.WithFirstLineHeader())
{
Console.WriteLine($"Id: {rec[0]}");
Console.WriteLine($"Name: {rec[1]}");
}
If the CSV file does not come with a header, CSVReader automatically names the
columns Column1, Column2, ... in the dynamic object.
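For example, such a headerless file can be read using those generated column names. This is a minimal sketch; the file name EmpNoHeader.csv is hypothetical:

foreach (dynamic rec in new ChoCSVReader("EmpNoHeader.csv"))
{
    Console.WriteLine($"Id: {rec.Column1}");
    Console.WriteLine($"Name: {rec.Column2}");
}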

3.2. Code First Approach


This is another zero-config way to parse and load a CSV file, using a POCO class.
First, define a simple data class to match the underlying CSV file layout.

Listing 3.2.1 Simple POCO entity class

public partial class EmployeeRec
{
public int Id { get; set; }
public string Name { get; set; }
}
In the above, the class defines two properties matching the sample CSV file layout.

Listing 3.2.2 Load CSV file

foreach (var rec in new ChoCSVReader<EmployeeRec>("Emp.csv")
.WithFirstLineHeader())
{
Console.WriteLine($"Id: {rec.Id}");
Console.WriteLine($"Name: {rec.Name}");
}
3.3. Configuration First Approach
In this model, we define the CSV configuration with all the necessary parsing
parameters, along with the CSV columns matching the underlying CSV file.

Listing 3.3.1 Define CSV configuration

ChoCSVRecordConfiguration config = new ChoCSVRecordConfiguration();
config.CSVRecordFieldConfigurations.Add(new ChoCSVRecordFieldConfiguration("Id",
1));
config.CSVRecordFieldConfigurations.Add(new ChoCSVRecordFieldConfiguration("Name",
2));
In the above, the configuration defines two fields matching the sample CSV file layout.

Listing 3.3.2 Load CSV file without POCO object

foreach (dynamic rec in new ChoCSVReader("Emp.csv", config)
.WithFirstLineHeader())
{
Console.WriteLine($"Id: {rec.Id}");
Console.WriteLine($"Name: {rec.Name}");
}
Listing 3.3.3 Load CSV file with POCO object

foreach (var rec in new ChoCSVReader<EmployeeRec>("Emp.csv",
config).WithFirstLineHeader())
{
Console.WriteLine(String.Format("Id: {0}", rec.Id));
Console.WriteLine(String.Format("Name: {0}", rec.Name));
}
3.4. Code First with Data Annotation
Data annotations provide a simple attribute-based way to configure the model by
applying attributes to POCO classes. In the sample below, Id is a required column
and Name is an optional column with the default value "XXXX". If Name is not
present, it takes the default value.

Listing 3.4.1 Define POCO Object

[ChoCSVFileHeader]
[ChoCSVRecordObject(ObjectValidationMode = ChoObjectValidationMode.MemberLevel)]
public class EmployeeRec
{
[Required]
public int? Id
{
get;
set;
}
[DefaultValue("XXXX")]
public string Name
{
get;
set;
}

public override string ToString()


{
return $"{Id}. {Name}.";
}
}
The code above illustrates defining a POCO object to carry the values of each
record line in the input file. Define the members in the order the CSV columns
appear in the file. The [ChoCSVFileHeader] attribute tells the parser that the
first line is the header line. By default, validation is turned off by the parser.
To turn it on, specify ChoObjectValidationMode as MemberLevel or ObjectLevel using
the ChoCSVRecordObject attribute. Id is a required property; we decorated it with
RequiredAttribute. Name is given a default value using DefaultValueAttribute,
meaning that if the Name CSV column contains an empty value in the file, it will
be defaulted to 'XXXX'.

It is very simple and ready to extract CSV data in no time.

Listing 3.4.2 Main Method

class Program
{
static void Main(string[] args)
{
string csv = @"Id, Name
1, Mark
2, Tom";
foreach (var rec in ChoCSVReader<EmployeeRec>.LoadText(csv))
{
Console.WriteLine($"Id: {rec.Id}");
Console.WriteLine($"Name: {rec.Name}");
}
}
}
We start by creating a new instance of the ChoCSVReader object. That's all. All the
heavy lifting of parsing and loading the CSV data stream into the objects is done
by the parser under the hood.

By default, CSVReader discovers and uses default configuration parameters while
loading the CSV file. These can be overridden according to your needs. The
following sections give details about each configuration attribute.

3.5. Code First with Fluent API


This is another approach: define the POCO entity class along with custom, selective
CSV configuration using the Fluent API. First, define the POCO class.

Listing 3.5.1 Define POCO Object

public class EmployeeRec
{
public int Id
{
get;
set;
}
public string Name
{
get;
set;
}
public string Address
{
get;
set;
}
}
The sample below shows how to do custom CSV field mapping using the Fluent API. The
POCO has three members defined, but the CSV file comes with only Id and Name
columns; you can use the fluent API to tell the parser to consider only the Id and
Name columns.

Listing 3.5.2 Main Method

class Program
{
static void Main(string[] args)
{
string csv = @"Id, Name
1, Mark
2, Tom";
using (var p = ChoCSVReader<EmployeeRec>.LoadText(csv)
.WithField(c => c.Id)
.WithField(c => c.Name)
)
{
foreach (var rec in p)
{
Console.WriteLine($"Id: {rec.Id}");
Console.WriteLine($"Name: {rec.Name}");
}
}
}
}
We start by creating a new instance of the ChoCSVReader object. That's all. All the
heavy lifting of parsing and loading the CSV data stream into the objects is done
by the parser under the hood.

By default, CSVReader discovers and uses default configuration parameters while
loading the CSV file. These can be overridden according to your needs. The
following sections give details about each configuration attribute.

4. Reading All Records


Once a POCO object is set up to match the CSV file structure, you can read the
whole file using the enumerable pattern. Reading is deferred, so be careful with
aggregate operations: these types of operations load the entire file's records
into memory.

Listing 4.1 Read CSV File

foreach (var rec in new ChoCSVReader<EmployeeRec>("Emp.csv").WithFirstLineHeader())
{
Console.WriteLine($"Id: {rec.Id}");
Console.WriteLine($"Name: {rec.Name}");
}
or:

Listing 4.2 Read CSV file stream

using (var stream = File.Open(@"test.csv", FileMode.Open))
{
foreach (var rec in new
ChoCSVReader<EmployeeRec>(stream).WithFirstLineHeader())
{
Console.WriteLine($"Id: {rec.Id}");
Console.WriteLine($"Name: {rec.Name}");
}
}
This model keeps your code elegant, clean, and easy to read and maintain. It also
lets you leverage LINQ extension methods to perform grouping, joining, projection,
aggregation, etc.

Listing 4.3 Using LINQ

var list = (from o in new
ChoCSVReader<EmployeeRec>("Emp.csv").WithFirstLineHeader()
where o.Name != null && o.Name.StartsWith("R")
select o).ToArray();

foreach (var rec in list)


{
Console.WriteLine($"Id: {rec.Id}");
Console.WriteLine($"Name: {rec.Name}");
}
5. Read Records Manually
After setting up a POCO object to match the CSV file structure, you can also read
the records one at a time using the Read() method.

Listing 5.1 Read CSV file

var reader = new ChoCSVReader<EmployeeRec>("Emp.csv").WithFirstLineHeader();
EmployeeRec rec = null;

while ((rec = reader.Read()) != null)


{
Console.WriteLine($"Id: {rec.Id}");
Console.WriteLine($"Name: {rec.Name}");
}
6. Customize CSV Record
Using ChoCSVRecordObjectAttribute, you can customize the POCO entity object
declaratively.

Listing 6.1 Customizing POCO object for each record

[ChoCSVRecordObject(Encoding = "Encoding.UTF32",
ErrorMode = ChoErrorMode.IgnoreAndContinue, IgnoreFieldValueMode =
ChoIgnoreFieldValueMode.All)]
public class EmployeeRec
{
[ChoCSVRecordField(1, FieldName = "id")]
public int Id { get; set; }
[ChoCSVRecordField(2, FieldName ="Name", QuoteField = true)]
[Required]
[DefaultValue("ZZZ")]
public string Name { get; set; }
}
Here are the available attributes to customize the CSV load operation on a file
(a short runtime sketch follows the list).

Delimiter - The value used to separate the fields in a CSV row. Default is the
culture's TextInfo.ListSeparator.
EOLDelimiter - The value used to separate CSV rows. Default is \r\n (NewLine).
CultureName - The culture name (ex. en-US, en-GB) used to read and write CSV data.
IgnoreEmptyLine - A flag to let the reader know if a record should be skipped when
reading if it's empty. A record is considered empty if all fields are empty.
Comments - The value used to denote a line that is commented out. Multiple comment
tokens can be specified, separated by commas.
QuoteChar - The value used to escape fields that contain a delimiter, quote, or
line ending.
QuoteAllFields - N/A for reader.
Encoding - The encoding of the CSV file.
HasExcelSeperator - N/A for reader. The reader seamlessly recognizes the Excel
separator if specified in the CSV file and uses it for parsing.
ColumnCountStrict - This flag indicates if an exception should be thrown when an
expected field is missing.
ColumnOrderStrict - This flag indicates if an exception should be thrown when an
expected field is in the wrong position in the file. This check is performed only
when ColumnCountStrict is true.
BufferSize - The size of the internal buffer used when the reader reads from a
StreamReader.
NullValue - Special text expected to be treated as a null value in the CSV file,
at the record level.
ErrorMode - This flag indicates how a failure to load an expected field is handled.
This can be overridden per property. Possible values are:
IgnoreAndContinue - Ignore the error, record will be skipped and continue with
next.
ReportAndContinue - Report the error to POCO entity if it is of
IChoNotifyRecordRead type
ThrowAndStop - Throw the error and stop the execution
IgnoreFieldValueMode - A flag to let the reader know if a record should be skipped
when reading if it's empty / null. This can be overridden per property. Possible
values are:
Null - N/A
DBNull - N/A
Empty - skipped if the record value is empty
WhiteSpace - skipped if the record value contains only whitespaces
ObjectValidationMode - A flag to let the reader know about the type of validation
to be performed with record object. Possible values are:
Off - No object validation performed.
MemberLevel - Validation performed at the time of each CSV property gets loaded
with value.
ObjectLevel - Validation performed after all the properties are loaded to the POCO
object.
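As a quick illustration, a few of these record-level settings can also be tweaked at runtime through the reader's Configuration property. This is a minimal sketch only; the property names are assumed to mirror the attribute names listed above, and the values are illustrative:

using (var parser = new ChoCSVReader("Emp.csv").WithFirstLineHeader())
{
    // illustrative settings; assumed to match the members described above
    parser.Configuration.Delimiter = ";";
    parser.Configuration.IgnoreEmptyLine = true;
    parser.Configuration.ErrorMode = ChoErrorMode.IgnoreAndContinue;

    foreach (dynamic rec in parser)
        Console.WriteLine($"Id: {rec.Id}, Name: {rec.Name}");
}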
7. Customize CSV Header
If the CSV file has a header, you can instruct the POCO entity by using
ChoCSVFileHeaderAttribute.

Listing 7.1 Customizing POCO object for file header

[ChoCSVFileHeader]
public class EmployeeRec
{
[ChoCSVRecordField(1, FieldName = "id")]
public int Id { get; set; }
[ChoCSVRecordField(2, FieldName ="Name", QuoteField = true)]
[Required]
[DefaultValue("ZZZ")]
public string Name { get; set; }
}
Here are the available members for customizing header handling (a brief sketch
follows the list):

FillChar - N/A for reader


Justification - N/A for reader
IgnoreCase - If true, CSV column matching is done case-insensitively. Otherwise
false.
TrimOption - This flag tells the reader to trim whitespace from the beginning and
ending of the CSV column header when reading. Possible values are Trim, TrimStart,
TrimEnd.
Truncate - N/A for reader
IgnoreColumnsWithEmptyHeader - If true, CSV columns having empty headers are
ignored. Otherwise false.
QuoteAll - N/A for reader
HeaderLineAt - If the CSV file's header is NOT at the first line, use this member
to specify the header line.
IgnoreHeader - If true, the file header is ignored and the data is loaded by field
position.
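For example, a hedged sketch of applying one of these header settings declaratively (assuming the ChoCSVFileHeader attribute exposes the IgnoreCase member listed above):

[ChoCSVFileHeader(IgnoreCase = true)]   // match header names case-insensitively (assumed property)
public class EmployeeRec
{
    public int Id { get; set; }
    public string Name { get; set; }
}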
8. Customize CSV Fields
For each CSV column, you can specify the mapping in POCO entity property using
ChoCSVRecordFieldAttribute.

Listing 8.1 Customizing POCO object for CSV columns

[ChoCSVFileHeader]
public class EmployeeRec
{
[ChoCSVRecordField(1, FieldName = "id")]
public int Id { get; set; }
[ChoCSVRecordField(2, FieldName ="Name", QuoteField = true)]
[Required]
[DefaultValue("ZZZ")]
public string Name { get; set; }
}
Here are the available members for customizing each property (an illustrative sketch follows the list):

FieldPosition - When mapping by position, you specify the index of the CSV column
that you want to use for that property. It is 1 based.
FieldName - When mapping by name, you specify the name of the CSV column that you
want to use for that property. For this to work, the CSV file must have a header
record. The name you specify must match with the name of the header record.
FillChar - N/A for reader.
FieldValueJustification - N/A for reader.
FieldValueTrimOption - This flag tells the reader to trim whitespace from the
beginning and ending of the field value when reading. Possible values are Trim,
TrimStart, TrimEnd.
Truncate - Truncate field value if over maximum field size.
Size - Maximum size of the field value.
QuoteField - A flag that tells the reader that the CSV column value is surrounded
by quotes.
NullValue - Special text expected to be treated as a null value in the CSV file,
at the field level.
FormatText - Specify custom format specifier to parse the CSV field values.
ErrorMode - This flag indicates how a failure to load an expected field is handled.
Possible values are:
IgnoreAndContinue - Ignore the error and continue to load other properties of the
record.
ReportAndContinue - Report the error to POCO entity if it is of IChoRecord type.
ThrowAndStop - Throw the error and stop the execution.
IgnoreFieldValueMode - A flag to let the reader know if a record should be skipped
when reading if it's empty / null. Possible values are:
Null - N/A
DBNull - N/A
Empty - skipped if the record value is empty.
WhiteSpace - skipped if the record value contains only whitespaces.
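As an illustration only, the sketch below combines a few of these field-level settings. The attribute properties used (NullValue, Size, FormatText) are assumed from the member list above, and the third column (JoinedDate) is hypothetical:

[ChoCSVFileHeader]
public class EmployeeRec
{
    [ChoCSVRecordField(1, FieldName = "id")]
    public int Id { get; set; }
    [ChoCSVRecordField(2, FieldName = "Name", NullValue = "N/A", Size = 25)]
    public string Name { get; set; }
    [ChoCSVRecordField(3, FieldName = "JoinedDate", FormatText = "yyyyMMdd")]   // hypothetical column
    public DateTime JoinedDate { get; set; }
}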
8.1. DefaultValue
It is the value assigned to the property when the CSV value is empty or
whitespace (controlled via IgnoreFieldValueMode).

Any POCO entity property can be specified with default value using
System.ComponentModel.DefaultValueAttribute.
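For example, a minimal sketch (DefaultValueAttribute comes from System.ComponentModel):

public class EmployeeRec
{
    public int Id { get; set; }
    [DefaultValue("XXXX")]   // used when the Name CSV value is empty or whitespace
    public string Name { get; set; }
}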

8.2. ChoFallbackValue
It is the value assigned to the property when the CSV value fails to be set. The
fallback value is only applied when ErrorMode is either IgnoreAndContinue or
ReportAndContinue.

Any POCO entity property can be specified with fallback value using
ChoETL.ChoFallbackValueAttribute.

8.3. Type Converters


Most primitive types are automatically converted and assigned to the properties.
If the value of the CSV field can't automatically be converted into the type of the
property, you can specify a custom or built-in .NET converter to convert the value.
These can be IValueConverter, IChoTypeConverter, or TypeConverter converters.

There are a couple of ways you can specify the converters for each field:

Declarative Approach
Configuration Approach
8.3.1. Declarative Approach
This model is applicable to POCO entity objects only. If you have a POCO class, you
can specify converters on each property to carry out the necessary conversion. The
samples below show how.

Listing 8.3.1.1 Specifying type converters

[ChoCSVFileHeader]
public class EmployeeRec
{
[ChoCSVRecordField(1, FieldName = "id")]
[ChoTypeConverter(typeof(IntConverter))]
public int Id { get; set; }
[ChoCSVRecordField(2, FieldName ="Name", QuoteField = true)]
[Required]
[DefaultValue("ZZZ")]
public string Name { get; set; }
}
Listing 8.3.1.2 IntConverter implementation

public class IntConverter : IValueConverter
{
public object Convert(object value, Type targetType, object parameter,
CultureInfo culture)
{
return value;
}

public object ConvertBack(object value, Type targetType, object parameter,
CultureInfo culture)
{
return value;
}
}
In the example above, we defined a custom IntConverter class and showed how to use
it with the 'Id' CSV property.

8.3.2. Configuration Approach


This model is applicable to both dynamic and POCO entity objects. It gives you the
freedom to attach the converters to each property at runtime, and it takes
precedence over the declarative converters on POCO classes.

Listing 8.3.2.1 Specifying TypeConverters

ChoCSVRecordConfiguration config = new ChoCSVRecordConfiguration();
config.FileHeaderConfiguration.HasHeaderRecord = true;
config.ThrowAndStopOnMissingField = false;

ChoCSVRecordFieldConfiguration idConfig = new ChoCSVRecordFieldConfiguration("Id", 1);
idConfig.AddConverter(new IntConverter());
config.CSVRecordFieldConfigurations.Add(idConfig);

config.CSVRecordFieldConfigurations.Add(new ChoCSVRecordFieldConfiguration("Name",
2));
config.CSVRecordFieldConfigurations.Add(new ChoCSVRecordFieldConfiguration("Name1",
2));
In the above, we construct and attach the IntConverter to the 'Id' field using the
AddConverter helper method on the ChoCSVRecordFieldConfiguration object.

Likewise, if you want to remove any converter from it, you can use RemoveConverter
on ChoCSVRecordFieldConfiguration object.

8.3.3. Custom Value Converter Approach


This approach allows you to attach a value converter to each CSV member using the
Fluent API. This is a quick way to handle any odd conversion process and avoid
creating a value converter class.

Listing 8.3.3.1 POCO class

[ChoCSVFileHeader]
public class EmployeeRec
{
[ChoCSVRecordField(1, FieldName = "id")]
public int Id { get; set; }
[ChoCSVRecordField(2, FieldName ="Name", QuoteField = true)]
[Required]
[DefaultValue("ZZZ")]
public string Name { get; set; }
}
Using the fluent API, the sample below shows how to attach a value converter to the Id column.

Listing 8.3.3.2 Attaching Value Converter

using (var dr = new ChoCSVReader<EmployeeRec>(@"Test.csv")
.WithFirstLineHeader()
.WithField(c => c.Id, valueConverter: (v) => Convert.ToInt32(v as string))
)
{
foreach (var rec in dr)
{
Console.WriteLine(rec.Id);
}
}
8.4. Validations
CSVReader leverages both System.ComponentModel.DataAnnotations and Validation Block
validation attributes to specify validation rules for individual fields of POCO
entity. Refer to the MSDN site for a list of available DataAnnotations validation
attributes.

Listing 8.4.1 Using validation attributes in POCO entity

[ChoCSVFileHeader]
[ChoCSVRecordObject(Encoding = "Encoding.UTF32", ErrorMode =
ChoErrorMode.IgnoreAndContinue,
IgnoreFieldValueMode = ChoIgnoreFieldValueMode.All,
ThrowAndStopOnMissingField = false)]
public partial class EmployeeRec
{
[ChoCSVRecordField(1, FieldName = "id")]
[ChoTypeConverter(typeof(IntConverter))]
[Range(1, int.MaxValue, ErrorMessage = "Id must be > 0.")]
[ChoFallbackValue(1)]
public int Id { get; set; }

[ChoCSVRecordField(2, FieldName = "Name")]


[Required]
[DefaultValue("ZZZ")]
[ChoFallbackValue("XXX")]
public string Name { get; set; }
}
In the example above, the Range validation attribute is used for the Id property
and the Required validation attribute for the Name property. CSVReader performs
validation on them during load when Configuration.ObjectValidationMode is set to
ChoObjectValidationMode.MemberLevel or ChoObjectValidationMode.ObjectLevel.

Sometimes you may want to override the declarative validation behavior that comes
with the POCO class. You can do so with Cinchoo ETL via the configuration approach.
The sample below shows how to override it.

static void ValidationOverridePOCOTest()
{
ChoCSVRecordConfiguration config = new ChoCSVRecordConfiguration();
var idConfig = new ChoCSVRecordFieldConfiguration("Id", 1);
idConfig.Validators = new ValidationAttribute[] { new RequiredAttribute() };
config.CSVRecordFieldConfigurations.Add(idConfig);
config.CSVRecordFieldConfigurations.Add(new
ChoCSVRecordFieldConfiguration("Name", 2));
config.CSVRecordFieldConfigurations.Add(new
ChoCSVRecordFieldConfiguration("Salary", 3) { FieldType = typeof(ChoCurrency) });

using (var stream = new MemoryStream())


using (var reader = new StreamReader(stream))
using (var writer = new StreamWriter(stream))
using (var parser = new ChoCSVReader<EmployeeRecWithCurrency>(reader, config))
{
writer.WriteLine("1,Carl,$100000");
writer.WriteLine("2,Mark,$50000");
writer.WriteLine("3,Tom,1000");

writer.Flush();
stream.Position = 0;

EmployeeRecWithCurrency rec;
while ((rec = parser.Read()) != null)
{
Console.WriteLine(String.Format("Id: {0}", rec.Id));
Console.WriteLine(String.Format("Name: {0}", rec.Name));
Console.WriteLine(String.Format("Salary: {0}", rec.Salary));
}
}
}

public class EmployeeRecWithCurrency


{
public int? Id { get; set; }
public string Name { get; set; }
public ChoCurrency Salary { get; set; }
}
In some cases, you may want to take control and perform manual self-validation
within the POCO entity class. This can be achieved by having the POCO object
implement the IChoValidatable interface.

Listing 8.4.2 Manual validation on POCO entity

[ChoCSVFileHeader]
[ChoCSVRecordObject(Encoding = "Encoding.UTF32", ErrorMode =
ChoErrorMode.IgnoreAndContinue,
IgnoreFieldValueMode = ChoIgnoreFieldValueMode.All,
ThrowAndStopOnMissingField = false)]
public partial class EmployeeRec : IChoValidatable
{
[ChoCSVRecordField(1, FieldName = "id")]
[ChoTypeConverter(typeof(IntConverter))]
[ChoFallbackValue(1)]
public int Id { get; set; }

[ChoCSVRecordField(2, FieldName = "Name")]


[Required]
[DefaultValue("ZZZ")]
[ChoFallbackValue("XXX")]
public string Name { get; set; }

public bool TryValidate(object target, ICollection<ValidationResult> validationResults)
{
return true;
}

public bool TryValidateFor(object target, string memberName, ICollection<ValidationResult> validationResults)
{
return true;
}
}
The sample above shows how to implement custom self-validation in a POCO object.

The IChoValidatable interface exposes the below methods:

TryValidate - Validate the entire object; return true if all validation passed,
otherwise return false.
TryValidateFor - Validate a specific property of the object; return true if all
validation passed, otherwise return false.
8.5. ChoIgnoreMember
If you want to exclude a POCO class member from CSV parsing in OptOut mode, decorate
it with ChoIgnoreMemberAttribute. The sample below shows the Title member being
ignored by the CSV loading process.

Listing 8.5.1 Ignore a member

[ChoCSVFileHeader]
public class EmployeeRec
{
public int Id { get; set; }
public string Name { get; set; }
[ChoIgnoreMember]
public string Title { get; set; }
}
8.6. StringLength
In OptOut mode, you can specify the size of the CSV column by using
System.ComponentModel.DataAnnotations.StringLengthAttribute.

Listing 8.6.1 Specifying Size of CSV member

[ChoCSVFileHeader]
public class EmployeeRec
{
public int Id { get; set; }
[StringLength(25)]
public string Name { get; set; }
[ChoIgnoreMember]
public string Title { get; set; }
}
8.7. Display
In OptOut mode, you can specify the name of CSV column mapped to member using
System.ComponentModel.DataAnnotations.DisplayAttribute.

Listing 8.7.1 Specifying name of CSV column

[ChoCSVFileHeader]
public class EmployeeRec
{
public int Id { get; set; }
[Display(Name="FullName")]
[StringLength(25)]
public string Name { get; set; }
[ChoIgnoreMember]
public string Title { get; set; }
}
8.8. DisplayName
In OptOut mode, you can specify the name of CSV column mapped to member using
System.ComponentModel.DataAnnotations.DisplayNameAttribute.

Listing 8.8.1 Specifying name of CSV column

[ChoCSVFileHeader]
public class EmployeeRec
{
public int Id { get; set; }
[DisplayName("FullName")]
[StringLength(25)]
public string Name { get; set; }
[ChoIgnoreMember]
public string Title { get; set; }
}
9. Excel Field Separator
If the input CSV file contains an Excel field separator line (sep=), the reader
automatically detects it and uses it as the field separator.

Listing 9.1 Sample CSV file with Excel field separator

sep=,
1,"Eldon Base for stackable storage shelf, platinum"
2,"1.7 Cubic Foot Compact ""Cube"" Office Refrigerators"
3,"Cardinal Slant-D® Ring Binder, Heavy Gauge Vinyl"
4,R380
5,Holmes HEPA Air Purifier
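No extra configuration is needed to consume such a file. A minimal sketch (the file name ExcelSep.csv is hypothetical; since the sample data has no header row, the fields are accessed by index):

foreach (dynamic rec in new ChoCSVReader("ExcelSep.csv"))
{
    Console.WriteLine($"Id: {rec[0]}, Description: {rec[1]}");
}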
10. Callback Mechanism
CSVReader offers industry-standard CSV parsing out of the box to handle most
parsing needs. If the built-in parsing does not cover a need, you can use the
callback mechanism offered by CSVReader to handle such situations. To participate
in the callback mechanism, you can use any of the following models:

Using event handlers exposed by CSVReader via the IChoReader interface.
Implementing the IChoNotifyRecordRead / IChoNotifyFileRead /
IChoNotifyRecordFieldRead interfaces on the POCO entity object.
Implementing the IChoNotifyRecordRead / IChoNotifyFileRead /
IChoNotifyRecordFieldRead interfaces on the DataAnnotation MetadataType object.
Implementing the IChoNotifyRecordConfigurable / IChoNotifyRecordFieldConfigurable
configuration interfaces.
Note: Any exceptions raised out of these interface methods will be ignored.

IChoReader exposes the below events:

BeginLoad - Invoked at the beginning of the CSV file load
EndLoad - Invoked at the end of the CSV file load
BeforeRecordLoad - Raised before the CSV record load
AfterRecordLoad - Raised after CSV record load
RecordLoadError - Raised when CSV record load errors out
BeforeRecordFieldLoad - Raised before CSV column value load
AfterRecordFieldLoad - Raised after CSV column value load
RecordFieldLoadError - Raised when CSV column value errors out
SkipUntil - Raised before the CSV parsing kicks off to add custom logic to skip
record lines.
DoWhile - Raised during CSV parsing where you can add custom logic to stop the
parsing.
IChoNotifyRecordRead exposes the below methods:

BeforeRecordLoad - Raised before the CSV record load


AfterRecordLoad - Raised after CSV record load
RecordLoadError - Raised when CSV record load errors out
IChoNotifyFileRead exposes the below methods:

BeginLoad - Invoked at the beginning of the CSV file load
EndLoad - Invoked at the end of the CSV file load
SkipUntil - Raised before the CSV parsing kicks off to add custom logic to skip
record lines.
DoWhile - Raised during CSV parsing where you can add custom logic to stop the
parsing.
IChoNotifyRecordFieldRead exposes the below methods:

BeforeRecordFieldLoad - Raised before CSV column value load


AfterRecordFieldLoad - Raised after CSV column value load
RecordFieldLoadError - Raised when CSV column value errors out
IChoNotifyRecordConfigurable exposes the below methods:

RecondConfigure - Raised for CSV record configuration


IChoNotifyRecordFieldConfigurable exposes the below methods:

RecondFieldConfigure - Raised for each CSV record field configuration


10.1. Using CSVReader events
This is more direct and simplest way to subscribe to the callback events and handle
your odd situations in parsing CSV files. Downside is that code can't be reusable
as you do by implementing IChoNotifyRecordRead with POCO record object.

Sample below shows how to use the BeforeRecordLoad callback method to skip lines
stating with '%' characters.

Listing 10.1.1 Using CSVReader callback events

static void IgnoreLineTest()
{
using (var parser = new
ChoCSVReader("IgnoreLineFile.csv").WithFirstLineHeader())
{
parser.Configuration.Encoding = Encoding.BigEndianUnicode;

parser.BeforeRecordLoad += (o, e) =>


{
if (e.Source != null)
{
e.Skip = ((string)e.Source).StartsWith("%");
}
};
foreach (var e in parser)
Console.WriteLine(e.Dump());
}
}
Likewise you can use other callback methods as well with CSVReader.

10.2. Implementing IChoNotifyRecordRead interface


The sample below shows how to implement the IChoNotifyRecordRead interface directly
on the POCO class.

Listing 10.2.1 Direct POCO callback mechanism implementation

[ChoCSVFileHeader]
[ChoCSVRecordObject(Encoding = "Encoding.UTF32", ErrorMode =
ChoErrorMode.IgnoreAndContinue,
IgnoreFieldValueMode = ChoIgnoreFieldValueMode.All,
ThrowAndStopOnMissingField = false)]
public partial class EmployeeRec : IChoNotifyRecordRead
{
[ChoCSVRecordField(1, FieldName = "id")]
[ChoTypeConverter(typeof(IntConverter))]
[Range(1, int.MaxValue, ErrorMessage = "Id must be > 0.")]
[ChoFallbackValue(1)]
public int Id { get; set; }

[ChoCSVRecordField(2, FieldName = "Name", QuoteField = true)]


[Required]
[DefaultValue("ZZZ")]
[ChoFallbackValue("XXX")]
public string Name { get; set; }

public bool AfterRecordLoad(object target, int index, object source)


{
throw new NotImplementedException();
}
public bool BeforeRecordLoad(object target, int index, ref object source)
{
throw new NotImplementedException();
}

public bool RecordLoadError(object target, int index, object source, Exception ex)
{
throw new NotImplementedException();
}
}
The sample below shows how to attach a metadata class to the POCO class by using
MetadataTypeAttribute.

Listing 10.2.2 MetaDataType based callback mechanism implementation

[ChoCSVFileHeader]
[ChoCSVRecordObject(Encoding = "Encoding.UTF32", ErrorMode =
ChoErrorMode.IgnoreAndContinue,
IgnoreFieldValueMode = ChoIgnoreFieldValueMode.All,
ThrowAndStopOnMissingField = false)]
public class EmployeeRecMeta : IChoNotifyRecordRead
{
[ChoCSVRecordField(1, FieldName = "id")]
[ChoTypeConverter(typeof(IntConverter))]
[Range(1, int.MaxValue, ErrorMessage = "Id must be > 0.")]
[ChoFallbackValue(1)]
public int Id { get; set; }

[ChoCSVRecordField(2, FieldName = "Name", QuoteField = true)]


[Required]
[DefaultValue("ZZZ")]
[ChoFallbackValue("XXX")]
public string Name { get; set; }

public bool AfterRecordLoad(object target, int index, object source)


{
throw new NotImplementedException();
}

public bool BeforeRecordLoad(object target, int index, ref object source)


{
throw new NotImplementedException();
}

public bool RecordLoadError(object target, int index, object source, Exception ex)
{
throw new NotImplementedException();
}
}

[MetadataType(typeof(EmployeeRecMeta))]
public partial class EmployeeRec
{
public int Id { get; set; }
public string Name { get; set; }
}
The sample below shows how to attach a metadata class to a sealed or third-party
POCO class by using ChoMetadataRefTypeAttribute.

Listing 10.2.3 ChoMetaDataRefType based callback mechanism implementation

[ChoMetadataRefType(typeof(EmployeeRec))]
[ChoCSVFileHeader]
[ChoCSVRecordObject(Encoding = "Encoding.UTF32", ErrorMode =
ChoErrorMode.IgnoreAndContinue,
IgnoreFieldValueMode = ChoIgnoreFieldValueMode.All,
ThrowAndStopOnMissingField = false)]
public class EmployeeRecMeta : IChoNotifyRecordRead
{
[ChoCSVRecordField(1, FieldName = "id")]
[ChoTypeConverter(typeof(IntConverter))]
[Range(1, int.MaxValue, ErrorMessage = "Id must be > 0.")]
[ChoFallbackValue(1)]
public int Id { get; set; }

[ChoCSVRecordField(2, FieldName = "Name", QuoteField = true)]


[Required]
[DefaultValue("ZZZ")]
[ChoFallbackValue("XXX")]
public string Name { get; set; }

public bool AfterRecordLoad(object target, int index, object source)


{
throw new NotImplementedException();
}

public bool BeforeRecordLoad(object target, int index, ref object source)


{
throw new NotImplementedException();
}

public bool RecordLoadError(object target, int index, object source, Exception ex)
{
throw new NotImplementedException();
}
}

public partial class EmployeeRec


{
public int Id { get; set; }
public string Name { get; set; }
}
10.3. BeginLoad
This callback is invoked once at the beginning of the CSV file load. source is the
CSV file stream object. Here you have a chance to inspect the stream; return true
to continue the CSV load, or false to stop the parsing.

Listing 10.3.1 BeginLoad Callback Sample

public bool BeginLoad(object source)
{
StreamReader sr = source as StreamReader;
return true;
}
10.4. EndLoad
This callback is invoked once at the end of the CSV file load. source is the CSV
file stream object. Here you have a chance to inspect the stream and perform any
post-load steps on it.

Listing 10.4.1 EndLoad Callback Sample

public void EndLoad(object source)
{
StreamReader sr = source as StreamReader;
}
10.5. BeforeRecordLoad
This callback is invoked before each record line in the CSV file is loaded. target
is the instance of the POCO record object. index is the line index in the file.
source is the CSV record line. Here you have a chance to inspect the line and
override it with a new line if you want to.

TIP: If you want to skip the line from loading, set the source to null.

TIP: If you want to take control of parsing and loading the record properties by
yourself, set the source to String.Empty.

Return true to continue the load process, otherwise return false to stop the
process.

Listing 10.5.1 BeforeRecordLoad Callback Sample

public bool BeforeRecordLoad(object target, int index, ref object source)
{
string line = source as string;
return true;
}
10.6. AfterRecordLoad
This callback is invoked after each record line in the CSV file is loaded. target
is the instance of the POCO record object. index is the line index in the file.
source is the CSV record line. Here you have a chance to do any post-load operation
on the record line.

Return true to continue the load process, otherwise return false to stop the
process.

Listing 10.6.1 AfterRecordLoad Callback Sample

public bool AfterRecordLoad(object target, int index, object source)
{
string line = source as string;
return true;
}
10.7. RecordLoadError
This callback is invoked if an error is encountered while loading a record line.
target is the instance of the POCO record object. index is the line index in the
file. source is the CSV record line. ex is the exception object. Here you have a
chance to handle the exception. This method is invoked only when
Configuration.ErrorMode is ReportAndContinue.

Return true to continue the load process, otherwise return false to stop the
process.

Listing 10.7.1 RecordLoadError Callback Sample

public bool RecordLoadError(object target, int index, object source, Exception ex)
{
string line = source as string;
return true;
}
10.8. BeforeRecordFieldLoad
This callback is invoked before each CSV record column is loaded. target is the
instance of the POCO record object. index is the line index in the file. propName
is the CSV record property name. value is the CSV column value. Here you have a
chance to inspect the CSV record property value and perform any custom validations,
etc.

Return true to continue the load process, otherwise return false to stop the
process.

Listing 10.8.1 BeforeRecordFieldLoad Callback Sample

public bool BeforeRecordFieldLoad(object target, int index, string propName, ref
object value)
{
return true;
}
10.9. AfterRecordFieldLoad
This callback is invoked after each CSV record column is loaded. target is the
instance of the POCO record object. index is the line index in the file. propName
is the CSV record property name. value is the CSV column value. Any post-field
operation can be performed here, like computing other properties, validations, etc.

Return true to continue the load process, otherwise return false to stop the
process.

Listing 10.9.1 AfterRecordFieldLoad Callback Sample

public bool AfterRecordFieldLoad(object target, int index, string propName, object
value)
{
return true;
}
10.10. RecordLoadFieldError
This callback is invoked when an error is encountered while loading a CSV record
column value. target is the instance of the POCO record object. index is the line
index in the file. propName is the CSV record property name. value is the CSV
column value. ex is the exception object. Here you have a chance to handle the
exception. This method is invoked only after the below two steps have been
performed by the CSVReader:

CSVReader looks for the FallbackValue of the CSV property. If present, it tries to
assign it.
If no FallbackValue is present and Configuration.ErrorMode is specified as
ReportAndContinue, this callback is executed.
Return true to continue the load process, otherwise return false to stop the
process.

Listing 10.10.1 RecordFieldLoadError Callback Sample

public bool RecordFieldLoadError(object target, int index, string propName, object
value, Exception ex)
{
return true;
}
10.11. SkipUntil
This callback is invoked for each line at the start of CSV parsing so you can add
custom logic to skip lines. index is the line index in the file.

Return true to skip the line, otherwise return false.

Listing 10.11.1 SkipUntil Callback Sample

public bool SkipUntil(long index, object source)
{
return false;
}
10.12. DoWhile
This callback is invoked during CSV parsing so you can add custom logic to stop the
parsing. index is the line index in the file.

Return true to stop the parsing, otherwise return false.

Listing 10.12.1 DoWhile Callback Sample

public bool DoWhile(long index, object source)
{
return false;
}
10. Customization
CSVReader automatically detects and loads the configured settings from the POCO
entity. At runtime, you can customize and tweak these parameters before CSV
parsing. CSVReader exposes a Configuration property of type
ChoCSVRecordConfiguration; using this property, you can customize them.

Listing 10.1 Customizing CSVReader at run-time

class Program
{
static void Main(string[] args)
{
string csv = @"Id, Name
1, Carl
2, Mark
3,";

dynamic row = null;


using (var parser = ChoCSVReader.LoadText(csv)
.WithFirstLineHeader()
)
{
parser.Configuration.ColumnCountStrict = true;
while ((row = parser.Read()) != null)
{
Console.WriteLine($"Id: {row.Id}");
Console.WriteLine($"Name: {row.Name}");
}
}
}
}
11. AsDataReader Helper Method
CSVReader exposes the AsDataReader helper method to retrieve the CSV records as a
.NET data reader object. Data readers are forward-only streams of data. This data
reader can be used in a few places, like bulk copying data to a database using
SqlBulkCopy (see the sketch after the listing below) or loading a disconnected
DataTable.

Listing 11.1 Reading as DataReader sample

static void AsDataReaderTest()
{
string csv = @"Id, Name
1, Carl
2, Mark
3,";

using (var parser = ChoCSVReader.LoadText(csv)


.WithFirstLineHeader()
)
{
IDataReader dr = parser.AsDataReader();
while (dr.Read())
{
Console.WriteLine("Id: {0}, Name: {1}", dr[0], dr[1]);
}
}
}
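For example, the returned data reader can be handed straight to SqlBulkCopy to bulk load the CSV into SQL Server. This is a minimal sketch only; the connection string and the destination table name are assumptions, and System.Data.SqlClient is required:

using (var parser = new ChoCSVReader("Emp.csv").WithFirstLineHeader())
using (var conn = new SqlConnection("<your connection string>"))   // placeholder connection string
{
    conn.Open();
    using (var bcp = new SqlBulkCopy(conn))
    {
        bcp.DestinationTableName = "Employees";    // assumed destination table
        bcp.WriteToServer(parser.AsDataReader());  // streams the CSV records into the table
    }
}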
12. AsDataTable Helper Method
CSVReader exposes the AsDataTable helper method to retrieve the CSV records as a
.NET DataTable object. It can then be persisted to disk, displayed in
grids/controls, or stored in memory like any other object.

Listing 12.1 Reading as DataTable sample

static void AsDataTableTest()
{
string csv = @"Id, Name
1, Carl
2, Mark
3,";

using (var parser = ChoCSVReader.LoadText(csv)


.WithFirstLineHeader()
)
{
DataTable dt = parser.AsDataTable();
foreach (DataRow row in dt.Rows)
{
Console.WriteLine("Id: {0}, Name: {1}", row[0], row[1]);
}
}
}
13. Using Dynamic Object
So far, the article has explained using CSVReader with a POCO object. CSVReader
also supports loading a CSV file without a POCO object, leveraging the .NET dynamic
feature. The sample below shows how to read a CSV stream without a POCO object.

If you have a CSV file, you can parse and load it with minimal or zero
configuration. If the CSV file does not have a header record line, the parser
automatically names the columns Column1, Column2, etc.

The sample below shows it:

Listing 13.1 Loading CSV file without header sample

class Program
{
static void Main(string[] args)
{
string csv = @"1, Carl
2, Mark
3,";

dynamic row = null;


using (var parser = ChoCSVReader.LoadText(csv))
{
while ((row = parser.Read()) != null)
{
Console.WriteLine($"Id: {row.Column1}");
Console.WriteLine($"Name: {row.Column2}");
}
}
}
}
If the CSV file has a header, you can state that in the configuration by setting
HasHeaderRecord to true, or use the WithFirstLineHeader fluent API, and parse the
file as simply as below:

Listing 13.2 Loading CSV file with header sample

class Program
{
static void Main(string[] args)
{
string csv = @"Id, Name
1, Carl
2, Mark
3,";

dynamic row = null;


using (var parser = ChoCSVReader.LoadText(csv)
.WithFirstLineHeader()
)
{
while ((row = parser.Read()) != null)
{
Console.WriteLine($"Id: {row.Id}");
Console.WriteLine($"Name: {row.Name}");
}
}
}
}
The above example automatically discovers the CSV columns from the header and
parses the file.

You can override the default behavior of discovering columns automatically by
adding field configurations manually and passing them to CSVReader when parsing
the file.

The sample below shows how to do it:

Listing 13.3 Loading CSV file with configuration

class Program
{
static void Main(string[] args)
{
string csv = @"Id, Name
1, Carl
2, Mark
3,";

ChoCSVRecordConfiguration config = new ChoCSVRecordConfiguration();
config.FileHeaderConfiguration.HasHeaderRecord = true;
config.CSVRecordFieldConfigurations.Add(new
ChoCSVRecordFieldConfiguration("Id", 1));
config.CSVRecordFieldConfigurations.Add(new
ChoCSVRecordFieldConfiguration("Name", 2));

dynamic row = null;


using (var parser = ChoCSVReader.LoadText(csv, config)
)
{
while ((row = parser.Read()) != null)
{
Console.WriteLine($"Id: {row.Id}");
Console.WriteLine($"Name: {row.Name}");
}
}
}
}
To completely turn off the auto column discovery, you will have to set
ChoCSVRecordConfiguration.AutoDiscoverColumns to false.
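For example, a short sketch combining that switch with the manual field configuration shown above:

ChoCSVRecordConfiguration config = new ChoCSVRecordConfiguration();
config.AutoDiscoverColumns = false;   // rely solely on the field configurations below
config.CSVRecordFieldConfigurations.Add(new ChoCSVRecordFieldConfiguration("Id", 1));
config.CSVRecordFieldConfigurations.Add(new ChoCSVRecordFieldConfiguration("Name", 2));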

13.1. DefaultValue
It is the value assigned to the property when the CSV value is empty or
whitespace (controlled via IgnoreFieldValueMode).

Any POCO entity property can be specified with default value using
System.ComponentModel.DefaultValueAttribute.

For dynamic object members or to override the declarative POCO object member's
default value specification, you can do so through configuration as shown below.

ChoCSVRecordConfiguration config = new ChoCSVRecordConfiguration();
config.CSVRecordFieldConfigurations.Add(new ChoCSVRecordFieldConfiguration("Id",
1));
config.CSVRecordFieldConfigurations.Add(new ChoCSVRecordFieldConfiguration("Name",
2) { DefaultValue = "NoName" });
13.2. ChoFallbackValue
It is the value assigned to the property when the CSV value fails to be set. The
fallback value is only applied when ErrorMode is either IgnoreAndContinue or
ReportAndContinue.

Any POCO entity property can be specified with fallback value using
ChoETL.ChoFallbackValueAttribute.

For dynamic object members, or to override the declarative POCO object member's
fallback values, you can do so through configuration as shown below.

ChoCSVRecordConfiguration config = new ChoCSVRecordConfiguration();
config.CSVRecordFieldConfigurations.Add(new ChoCSVRecordFieldConfiguration("Id",
1));
config.CSVRecordFieldConfigurations.Add(new ChoCSVRecordFieldConfiguration("Name",
2) { FallbackValue = "Tom" });
13.3. FieldType
In the typeless dynamic object model, the reader reads each field value and
populates the dynamic object members as 'string' values. If you want to enforce a
type and do extra type checking during load, you can do so by declaring the field
type in the field configuration.

Listing 13.3.1 Defining FieldType

ChoCSVRecordConfiguration config = new ChoCSVRecordConfiguration();
config.CSVRecordFieldConfigurations.Add(new ChoCSVRecordFieldConfiguration("Id", 1)
{ FieldType = typeof(int) });
config.CSVRecordFieldConfigurations.Add(new ChoCSVRecordFieldConfiguration("Name",
2));
The above sample shows defining the field type as 'int' for the 'Id' field. This
instructs the CSVReader to parse and convert the value to an integer before
assigning it. This extra type safety prevents incorrect values from being loaded
into the object while parsing.

13.4. Type Converters


Most primitive types are automatically converted and assigned to the properties by
CSVReader. If the value of the CSV field can't automatically be converted into the
type of the property, you can specify a custom or built-in .NET converter to
convert the value. These can be either IValueConverter or TypeConverter converters.

In the dynamic object model, you can specify these converters via configuration.
See the example below for the approach taken to specify type converters for CSV
columns.

Listing 13.4.1 Specifying TypeConverters

ChoCSVRecordConfiguration config = new ChoCSVRecordConfiguration();
config.FileHeaderConfiguration.HasHeaderRecord = true;
config.ThrowAndStopOnMissingField = false;

ChoCSVRecordFieldConfiguration idConfig = new ChoCSVRecordFieldConfiguration("Id", 1);
idConfig.AddConverter(new IntConverter());
config.CSVRecordFieldConfigurations.Add(idConfig);
config.CSVRecordFieldConfigurations.Add(new ChoCSVRecordFieldConfiguration("Name",
2));
config.CSVRecordFieldConfigurations.Add(new ChoCSVRecordFieldConfiguration("Name1",
2));
In the above, we construct and attach the IntConverter to the 'Id' field using the
AddConverter helper method on the ChoCSVRecordFieldConfiguration object.

Likewise, if you want to remove any converter from it, you can use RemoveConverter
on ChoCSVRecordFieldConfiguration object.

13.5. Validations
CSVReader leverages both System.ComponentModel.DataAnnotations and Validation Block
validation attributes to specify validation rules for individual CSV fields. Refer
to the MSDN site for a list of available DataAnnotations validation attributes.

Listing 13.5.1 Specifying Validations

ChoCSVRecordConfiguration config = new ChoCSVRecordConfiguration();
config.FileHeaderConfiguration.HasHeaderRecord = true;
config.ThrowAndStopOnMissingField = false;

ChoCSVRecordFieldConfiguration idConfig = new ChoCSVRecordFieldConfiguration("Id", 1);
idConfig.Validators = new ValidationAttribute[] { new RangeAttribute(0, 100) };
config.CSVRecordFieldConfigurations.Add(idConfig);

config.CSVRecordFieldConfigurations.Add(new ChoCSVRecordFieldConfiguration("Name",
2));
config.CSVRecordFieldConfigurations.Add(new ChoCSVRecordFieldConfiguration("Name1",
2));
In the example above, we used the Range validation attribute for the Id field.
CSVReader performs validation during load when Configuration.ObjectValidationMode
is set to ChoObjectValidationMode.MemberLevel or
ChoObjectValidationMode.ObjectLevel.

PS: Self-validation is NOT supported in the dynamic object model.

14. Working with sealed POCO object


If you already have an existing sealed POCO object, or the object is in a 3rd party
library, you can still use it with CSVReader. All you need is a CSV file with a
header in it.

Listing 14.1 Existing sealed POCO Object

public sealed class ThirdPartyRec
{
public int Id
{
get;
set;
}
public string Name
{
get;
set;
}
}
Listing 14.2 Consuming CSV file

class Program
{
static void Main(string[] args)
{
string csv = @"Id, Name
1, Carl
2, Mark
3,";

dynamic row = null;


using (var parser = ChoCSVReader<ThirdPartyRec>.LoadText(csv)
.WithFirstLineHeader()
)
{
while ((row = parser.Read()) != null)
{
Console.WriteLine($"Id: {row.Id}");
Console.WriteLine($"Name: {row.Name}");
}
}
}
}
In this case, CSVReader discovers the CSV columns from the CSV file and loads
the data into the POCO object. If the CSV file structure and the POCO object match, the
load will succeed, populating all corresponding data into its properties. If a
property is missing for any CSV column, CSVReader silently ignores it and
continues with the rest.

You can override this behavior by setting the
ChoCSVRecordConfiguration.ThrowAndStopOnMissingField property to true. In this
case, CSVReader will throw a ChoMissingRecordFieldException if a
property is missing for a CSV column, as sketched below.
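A minimal sketch of turning this check on via the fluent Configure API, reusing the ThirdPartyRec class and CSV text from Listing 14.2:

using (var parser = ChoCSVReader<ThirdPartyRec>.LoadText(csv)
    .WithFirstLineHeader()
    .Configure(c => c.ThrowAndStopOnMissingField = true)  // a missing member for a CSV column now raises ChoMissingRecordFieldException
    )
{
    while (parser.Read() != null) { }
}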

Is there a way to extend the sealed third party object with a CSV specification? Yes,
Cinchoo provides a model to extend it by creating a metadata class decorated with
ChoMetadataRefTypeAttribute. Cinchoo auto discovers it and loads the specs from this
special class.

Hide Copy Code


[ChoMetadataRefType(typeof(ThirdPartyRec))]
public sealed class ThirdPartyRecMetaData
{
[ChoCSVRecordField(1, Name = "SeqNo")]
public int Id
{
get;
set;
}
[ChoCSVRecordField(2)]
public string Name
{
get;
set;
}
}
There is another way to inject the MetaData object into the Cinchoo framework as well.

Hide Copy Code


ChoMetadataObjectCache.Default.Attach(typeof(ThirdPartyRec), new
ThirdPartyRecMetaData());
15. Exceptions
CSVReader throws different types of exceptions in different situations.

ChoParserException - the CSV file is bad and the parser is not able to recover.
ChoRecordConfigurationException - raised when any invalid configuration settings are specified.
ChoMissingRecordFieldException - raised when a property is missing for a CSV column (and ThrowAndStopOnMissingField is enabled).
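A minimal sketch of catching these exception types around a read loop; the file name 'Emp.csv' is a placeholder:

try
{
    using (var parser = new ChoCSVReader("Emp.csv").WithFirstLineHeader())
    {
        foreach (var rec in parser)
            Console.WriteLine(rec.Dump());
    }
}
catch (ChoMissingRecordFieldException ex)
{
    Console.WriteLine("Missing CSV column mapping: " + ex.Message);
}
catch (ChoRecordConfigurationException ex)
{
    Console.WriteLine("Invalid reader configuration: " + ex.Message);
}
catch (ChoParserException ex)
{
    Console.WriteLine("Unrecoverable parse error: " + ex.Message);
}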
16. Tips
16.1. Multiline CSV column value
If a CSV file contains column values with newline characters in them, ChoCSVReader can
handle it when Configuration.MayContainEOLInData = true is specified.

Listing 16.1.1 Multiline column values in CSV file

Hide Copy Code


Id,Name
1,"Tom
Cassawaw"
2,"Carl"
3,"Mark"
In the above, record Id 1 has a multiline name surrounded by quotes. The sample below
shows how to handle it.

Listing 16.1.2 Reading multiline column values in CSV file

Hide Copy Code


static void MultiLineColumnValueTest()
{
using (var parser = new ChoCSVReader("Emp.csv").WithFirstLineHeader())
{
parser.Configuration.MayContainEOLInData = true;

foreach (var e in parser)


Console.WriteLine(e.Dump());
}
}
16.2. CSV column value with field separator
If a CSV file contains column values with the field separator (,) in them, ChoCSVReader can
handle it as long as the values are surrounded by quotes.

Listing 16.2.1 CSV column value with separators

Hide Copy Code


Id,Name
1,"Tom Cassawaw"
2,"Carl, Malcolm"
3,"Mark"
In the above, record Id 2 has a name containing the separator (,). In order for ChoCSVReader
to recognize this situation, the value must be surrounded by quotes.
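A short reading sketch for the file above; beyond the header flag, no extra configuration is needed because the embedded separators are inside quotes (the method name and 'Emp.csv' are placeholders):

static void SeparatorInValueTest()
{
    using (var parser = new ChoCSVReader("Emp.csv").WithFirstLineHeader())
    {
        foreach (var e in parser)
            Console.WriteLine(e.Name);   // "Carl, Malcolm" is read as a single value
    }
}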

16.3. CSV column value with single quotes


ChoCSVReader can read CSV column values with single quotes in them seamlessly. No
surrounding quotes are required.
Listing 16.3.1 CSV column value with single quotes

Hide Copy Code


Id,Name
1,Tom Cassawaw
2,Carl'Malcolm
3,Mark
In the above, record Id 2 has a name with a single quote (') in it. ChoCSVReader recognizes
this situation and loads these values successfully.
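A short reading sketch for the file above; the single quote needs no special handling (the method name and 'Emp.csv' are placeholders):

static void SingleQuoteTest()
{
    using (var parser = new ChoCSVReader("Emp.csv").WithFirstLineHeader())
    {
        foreach (var e in parser)
            Console.WriteLine(e.Name);   // "Carl'Malcolm" is read as-is
    }
}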

17. Using MetadataType Annotation


Cinchoo ETL works well with data annotation's MetadataType model. It is a way to
attach a metadata class to a data model class. In this associated class, you provide
additional metadata information that is not in the data model itself. Its role is to add
attributes to a class without having to modify it: you apply the MetadataType attribute,
which takes a single type parameter, to the data class and put all the attributes on the
referenced metadata class. This is useful when the POCO classes are auto generated
(by Entity Framework, MVC, etc.) by automated tools; you can add new behavior
without touching the generated file. It also promotes modularization by
separating the concerns into multiple classes.

For more information about it, please search in MSDN.

Listing 17.1 MetadataType annotation usage sample

Hide Shrink Copy Code


[MetadataType(typeof(EmployeeRecMeta))]
public class EmployeeRec
{
public int Id { get; set; }
public string Name { get; set; }
}

[ChoCSVFileHeader]
[ChoCSVRecordObject(Encoding = "Encoding.UTF32", ErrorMode =
ChoErrorMode.ThrowAndStop,
IgnoreFieldValueMode = ChoIgnoreFieldValueMode.All, ThrowAndStopOnMissingField =
false,
ObjectValidationMode = ChoObjectValidationMode.MemberLevel)]
public class EmployeeRecMeta : IChoNotifyRecordRead, IChoValidatable
{
[ChoCSVRecordField(1, FieldName = "id", ErrorMode =
ChoErrorMode.ReportAndContinue )]
[ChoTypeConverter(typeof(IntConverter))]
[Range(1, 1, ErrorMessage = "Id must be > 0.")]
[ChoFallbackValue(1)]
public int Id { get; set; }

[ChoCSVRecordField(2, FieldName = "Name", QuoteField = true)]


[StringLength(1)]
[DefaultValue("ZZZ")]
[ChoFallbackValue("XXX")]
public string Name { get; set; }

public bool AfterRecordLoad(object target, int index, object source)


{
throw new NotImplementedException();
}
public bool BeforeRecordLoad(object target, int index, ref object source)
{
throw new NotImplementedException();
}

public bool RecordFieldLoadError(object target, int index, string propName,


object value, Exception ex)
{
throw new NotImplementedException();
}

public bool TryValidate(object target, ICollection<ValidationResult>


validationResults)
{
return true;
}

public bool TryValidateFor(object target, string memberName,


ICollection<ValidationResult> validationResults)
{
return true;
}

public void Validate(object target)


{
}

public void ValidateFor(object target, string memberName)


{
}
}
In the above, EmployeeRec is the data class. It contains only domain specific properties
and operations, keeping it a very simple class to look at.

We separate the validation, callback mechanism, configuration, etc. into the metadata
type class, EmployeeRecMeta.

18. Configuration Choices


If the POCO entity class is auto-generated, exposed via a library, or sealed, you are
limited in attaching the CSV schema definition to it declaratively.
In such cases, you can choose one of the options below to specify the CSV layout
configuration:

Manual Configuration
Auto Map Configuration
Attaching MetadataType class
I'm going to show you how to configure the below POCO entity class with each approach.

Listing 18.1 Sealed POCO entity class

Hide Copy Code


public sealed class EmployeeRec
{
public int Id { get; set; }
public string Name { get; set; }
}
18.1. Manual Configuration
Define a brand new configuration object from scratch and add all the necessary CSV
fields to the ChoCSVRecordConfiguration.CSVRecordFieldConfigurations collection property.
This option gives you the greatest flexibility to control the CSV parsing configuration.
The downside is the possibility of making mistakes, and the configuration is hard to
manage if the CSV file layout is large.

Listing 18.1.1 Manual Configuration

Hide Copy Code


ChoCSVRecordConfiguration config = new ChoCSVRecordConfiguration();

config.FileHeaderConfiguration.HasHeaderRecord = true;
config.ThrowAndStopOnMissingField = true;
config.CSVRecordFieldConfigurations.Add(new ChoCSVRecordFieldConfiguration("Id",
1));
config.CSVRecordFieldConfigurations.Add(new ChoCSVRecordFieldConfiguration("Name",
2));

foreach (var e in new ChoCSVReader<EmployeeRec>("Emp.csv", config))


{
Console.WriteLine(String.Format("Id: {0}", e.Id));
Console.WriteLine(String.Format("Name: {0}", e.Name));
}
The above configuration approach is suitable for both the POCO and the dynamic model.
The approach below is more type safe if you parse the CSV file using the POCO model.

Hide Copy Code


var config = new ChoCSVRecordConfiguration<EmployeeRec>()
.WithFirstLineHeader()
.Configure(c => c.ThrowAndStopOnMissingField = true)
.Map(f => f.Id, 1)
.Map(f => f.Name, 2);

foreach (var e in new ChoCSVReader<EmployeeRec>("Emp.csv", config))


{
Console.WriteLine(String.Format("Id: {0}", e.Id));
Console.WriteLine(String.Format("Name: {0}", e.Name));
}
18.2. Auto Map Configuration
This is an alternative, much less error-prone method to auto map the
CSV columns for the POCO entity class. This approach is helpful if the POCO model
objects are sealed, exposed via a third party library, or maintained in a separate
repository not under your control; you can define a mapping class and register it
with the parser to define the field mapping and configuration.

First define a schema class for EmployeeRec POCO entity class as below

Listing 18.2.1 Auto Map class

Hide Copy Code


public class EmployeeRecMap
{
[ChoCSVRecordField(1, FieldName = "id")]
public int Id { get; set; }

[ChoCSVRecordField(2, FieldName = "Name")]


public string Name { get; set; }
}
Then you can use it to auto map CSV columns by using
ChoCSVRecordConfiguration.MapRecordFields method

Listing 18.2.2 Using Auto Map configuration

Hide Copy Code


ChoCSVRecordConfiguration config = new ChoCSVRecordConfiguration();
config.MapRecordFields<EmployeeRecMap>();

foreach (var e in new ChoCSVReader<EmployeeRec>("Emp.csv", config).WithFirstLineHeader())
{
Console.WriteLine(String.Format("Id: {0}", e.Id));
Console.WriteLine(String.Format("Name: {0}", e.Name));
}
18.3. Attaching MetadataType class
This is another approach: attach a MetadataType class to the POCO entity object.
The previous approach only cares about auto mapping of the CSV columns; other
configuration properties like property converters, parser parameters,
default/fallback values, etc. are not considered.

This model accounts for everything by defining a MetadataType class and specifying
the CSV configuration parameters declaratively. This is useful when your POCO
entity is sealed and not a partial class. It is also a favorable and less error-prone
approach to configuring CSV parsing of a POCO entity.

Listing 18.3.1 Define MetadataType class

Hide Shrink Copy Code


[ChoCSVFileHeader()]
[ChoCSVRecordObject(Encoding = "Encoding.UTF32", ErrorMode =
ChoErrorMode.ReportAndContinue,
IgnoreFieldValueMode = ChoIgnoreFieldValueMode.All, ThrowAndStopOnMissingField =
false,
ObjectValidationMode = ChoObjectValidationMode.MemberLevel)]
public class EmployeeRecMeta : IChoNotifyRecordRead, IChoValidatable
{
[ChoCSVRecordField(1, FieldName = "id", ErrorMode =
ChoErrorMode.ReportAndContinue )]
[ChoTypeConverter(typeof(IntConverter))]
[Range(1, 1, ErrorMessage = "Id must be > 0.")]
public int Id { get; set; }

[ChoCSVRecordField(2, FieldName = "Name", QuoteField = true)]


[StringLength(1)]
[DefaultValue("ZZZ")]
[ChoFallbackValue("XXX")]
public string Name { get; set; }

public bool AfterRecordLoad(object target, int index, object source)


{
throw new NotImplementedException();
}

public bool BeforeRecordLoad(object target, int index, ref object source)


{
throw new NotImplementedException();
}

public bool RecordFieldLoadError(object target, int index, string propName,


object value, Exception ex)
{
return true;
}

public bool TryValidate(object target, ICollection<ValidationResult>


validationResults)
{
return true;
}

public bool TryValidateFor(object target, string memberName,


ICollection<ValidationResult> validationResults)
{
return true;
}

public void Validate(object target)


{
}

public void ValidateFor(object target, string memberName)


{
}
}
Listing 18.3.2 Attaching MetadataType class

Hide Copy Code


//Attach metadata
ChoMetadataObjectCache.Default.Attach<EmployeeRec>(new EmployeeRecMeta());

foreach (var e in new ChoCSVReader<EmployeeRec>("Emp.csv").WithFirstLineHeader())


{
Console.WriteLine(String.Format("Id: {0}", e.Id));
Console.WriteLine(String.Format("Name: {0}", e.Name));
}
19. LoadText Helper Method
This is a nifty little helper method to parse and load a CSV text string into objects.

Listing 19.1 Using LoadText method

Hide Copy Code


string txt = "Id, Name\r\n1, Mark";
foreach (dynamic e in ChoCSVReader.LoadText(txt).WithFirstLineHeader())
{
Console.WriteLine(String.Format("Id: {0}", e.Id));
Console.WriteLine(String.Format("Name: {0}", e.Name));
}
20. Advanced Topics
20.1. Override Converters Format Specs
Cinchoo ETL automatically parses and converts each CSV column value to the
corresponding CSV column's underlying data type seamlessly. Most of the basic .NET
types are handled automatically without any setup needed.

This is achieved through two key settings in the ETL system

ChoCSVRecordConfiguration.CultureInfo - Represents information about a specific


culture including the names of the culture, the writing system, and the calendar
used, as well as access to culture-specific objects that provide information for
common operations, such as formatting dates and sorting strings. Default is 'en-
US'.
ChoTypeConverterFormatSpec - It is global format specifier class holds all the
intrinsic .NET types formatting specs.
In this section, I'm going to talk about changing the default format specs for each
.NET intrinsic data type according to parsing needs.

ChoTypeConverterFormatSpec is a singleton class; the instance is exposed via the
'Instance' static member. It is thread local, meaning that a separate
instance copy is kept on each thread.

There are 2 sets of format spec members for each intrinsic type, one for
loading and another for writing the value, except for the Boolean, Enum, and DateTime
types. These types have only one member for both loading and writing operations.

Specifying intrinsic data type format specs through ChoTypeConverterFormatSpec
has a system wide impact. i.e., setting ChoTypeConverterFormatSpec.IntNumberStyle =
NumberStyles.AllowParentheses will allow parentheses for all integer members of CSV
objects. If you want to override this behavior and take control of a specific CSV data
member so it handles its own parsing, independent of the global setting, it can be done
by specifying a TypeConverter at the CSV field member level. Refer to section 13.4 for
more information.

NumberStyles (optional) is used for loading values from the CSV stream, and format strings
are used for writing values to the CSV stream.

In this article, I'll cover using NumberStyles for loading CSV data from a stream.
These values are optional and determine the styles permitted for each type
during parsing of the CSV file. The system automatically figures out how to parse and
load the values from the underlying Culture. In odd situations, you may want to override
and set the styles in order to successfully load the file. Refer to MSDN for more about
NumberStyles and its values.

Listing 20.1.1 ChoTypeConverterFormatSpec Members

Hide Shrink Copy Code


public class ChoTypeConverterFormatSpec
{
public static readonly ThreadLocal<ChoTypeConverterFormatSpec> Instance = new
ThreadLocal<ChoTypeConverterFormatSpec>(() => new ChoTypeConverterFormatSpec());

public string DateTimeFormat { get; set; }


public ChoBooleanFormatSpec BooleanFormat { get; set; }
public ChoEnumFormatSpec EnumFormat { get; set; }

public NumberStyles? CurrencyNumberStyle { get; set; }


public string CurrencyFormat { get; set; }

public NumberStyles? BigIntegerNumberStyle { get; set; }


public string BigIntegerFormat { get; set; }

public NumberStyles? ByteNumberStyle { get; set; }


public string ByteFormat { get; set; }

public NumberStyles? SByteNumberStyle { get; set; }


public string SByteFormat { get; set; }

public NumberStyles? DecimalNumberStyle { get; set; }


public string DecimalFormat { get; set; }
public NumberStyles? DoubleNumberStyle { get; set; }
public string DoubleFormat { get; set; }

public NumberStyles? FloatNumberStyle { get; set; }


public string FloatFormat { get; set; }

public string IntFormat { get; set; }


public NumberStyles? IntNumberStyle { get; set; }

public string UIntFormat { get; set; }


public NumberStyles? UIntNumberStyle { get; set; }

public NumberStyles? LongNumberStyle { get; set; }


public string LongFormat { get; set; }

public NumberStyles? ULongNumberStyle { get; set; }


public string ULongFormat { get; set; }

public NumberStyles? ShortNumberStyle { get; set; }


public string ShortFormat { get; set; }

public NumberStyles? UShortNumberStyle { get; set; }


public string UShortFormat { get; set; }
}
The sample below shows how to load a CSV data stream having 'se-SE' (Swedish) culture
specific data using CSVReader. The input feed also comes with 'EmployeeNo' values
containing parentheses. In order to make the load successful, we have to set
ChoTypeConverterFormatSpec.IntNumberStyle to NumberStyles.AllowParentheses.

Listing 20.1.2 Using ChoTypeConverterFormatSpec in code

Hide Copy Code


static void UsingFormatSpecs()
{
ChoTypeConverterFormatSpec.Instance.IntNumberStyle = NumberStyles.AllowParentheses;

string csv = @"Id,Name,Salary,JoinedDate,EmployeeNo
1,Carl,12.345679 kr,2017-10-10,(5)
2,Mark,50000 kr,2001-10-01,6
3,Tom,150000 kr,1996-01-25,9";

using (var parser = ChoCSVReader.LoadText(csv)


.WithFirstLineHeader()
.WithField("Id")
.WithField("Name")
.WithField("Salary", fieldType: typeof(float))
.WithField("JoinedDate", fieldType: typeof(DateTime))
.WithField("EmployeeNo", fieldType: typeof(int))
.Configure(c => c.Culture = new System.Globalization.CultureInfo("se-SE"))
)
{
foreach (var rec in parser)
{
Console.WriteLine(rec.Dump());
}
}
}
20.2. Currency Support
Cinchoo ETL provides the ChoCurrency object to read and write currency values in CSV
files. ChoCurrency is a wrapper class that holds the currency value as a decimal,
with support for serializing it in text format during CSV load.

Listing 20.2.1 Using Currency members in dynamic model

Hide Copy Code


static void CurrencyDynamicTest()
{
string csv = @"Id,Name,Salary,JoinedDate,EmployeeNo
1,Carl,12.345679 kr,2017-10-10,5
2,Mark,50000 kr,2001-10-01,6
3,Tom,150000 kr,1996-01-25,9";

using (var parser = ChoCSVReader.LoadText(csv)


.WithFirstLineHeader()
.WithField("Id")
.WithField("Name")
.WithField("Salary", fieldType: typeof(ChoCurrency))
.WithField("JoinedDate", fieldType: typeof(DateTime))
.WithField("EmployeeNo", fieldType: typeof(int))
.Configure(c => c.Culture = new System.Globalization.CultureInfo("se-SE"))
)
{
foreach (var rec in parser)
{
Console.WriteLine(rec.Dump());
}
}
}
The sample above shows how to load currency values using the dynamic object model. By
default, all the members of a dynamic object are treated as string type unless
specified explicitly via ChoCSVRecordFieldConfiguration.FieldType. By specifying
ChoCurrency as the field type of the 'Salary' CSV field, CSVReader loads it as a
currency object.

PS: The format of the currency value is determined by CSVReader through
ChoRecordConfiguration.Culture and ChoTypeConverterFormatSpec.CurrencyNumberStyle.

Sample below shows how to use ChoCurrency CSV field in POCO entity class.

Listing 20.2.2 Using Currency members in POCO model

Hide Copy Code


public class EmployeeRecWithCurrency
{
public int Id { get; set; }
public string Name { get; set; }
public ChoCurrency Salary { get; set; }
}

static void CurrencyTest()


{
string csv = @"Id,Name,Salary
1,Carl,$12.345679
2,Mark,$50000
3,Tom,$150000";

using (var parser = ChoCSVReader<EmployeeRecWithCurrency>.LoadText(csv)


.WithFirstLineHeader()
)
{
foreach (var rec in parser)
{
Console.WriteLine(rec.Dump());
}
}
}
20.3. Enum Support
Cinchoo ETL implicitly handles parsing of enum column values from CSV files. If you
want fine control over the parsing of these values, you can specify the format globally via
ChoTypeConverterFormatSpec.EnumFormat. The default is ChoEnumFormatSpec.Value.

FYI, changing this value has a system wide impact.

There are 3 possible values that can be used:

ChoEnumFormatSpec.Value - Enum value is used for parsing.


ChoEnumFormatSpec.Name - Enum key name is used for parsing.
ChoEnumFormatSpec.Description - If each enum key is decorated with
DescriptionAttribute, its value will be used for parsing.
Listing 20.3.1 Specifying Enum format specs during parsing

Hide Shrink Copy Code


public enum EmployeeType
{
[Description("Full Time Employee")]
Permanent = 0,
[Description("Temporary Employee")]
Temporary = 1,
[Description("Contract Employee")]
Contract = 2
}

static void EnumTest()


{
ChoTypeConverterFormatSpec.Instance.EnumFormat = ChoEnumFormatSpec.Description;

string csv = @"Id,Name,Salary,JoinedDate,EmployeeType


1,Carl,12345679,01/10/2016,Full Time Employee
2,Mark,50000,10/01/1995,Temporary Employee
3,Tom,150000,01/01/1940,Contract Employee";

using (var parser = ChoCSVReader.LoadText(csv)


.WithFirstLineHeader()
.WithField("Id")
.WithField("Name")
.WithField("Salary", fieldType: typeof(float))
.WithField("JoinedDate", fieldType: typeof(DateTime))
.WithField("EmployeeType", fieldType: typeof(EmployeeType))
)
{
foreach (var rec in parser)
{
Console.WriteLine(rec.Dump());
}
}
}
20.4. Boolean Support
Cinchoo ETL implicitly handles parsing of boolean CSV column values from CSV files.
If you want fine control over the parsing of these values, you can specify the format
globally via ChoTypeConverterFormatSpec.BooleanFormat. The default value is
ChoBooleanFormatSpec.ZeroOrOne.

FYI, changing this value has a system wide impact.

There are 4 possible values that can be used:

ChoBooleanFormatSpec.ZeroOrOne - '0' for false. '1' for true.


ChoBooleanFormatSpec.YOrN - 'Y' for true, 'N' for false.
ChoBooleanFormatSpec.TrueOrFalse - 'True' for true, 'False' for false.
ChoBooleanFormatSpec.YesOrNo - 'Yes' for true, 'No' for false.
Listing 20.4.1 Specifying boolean format specs during parsing

Hide Copy Code


static void BoolTest()
{
ChoTypeConverterFormatSpec.Instance.BooleanFormat =
ChoBooleanFormatSpec.ZeroOrOne;

string csv = @"Id,Name,Salary,JoinedDate,Active


1,Carl,12345679,01/10/2016,0
2,Mark,50000,10/01/1995,1
3,Tom,150000,01/01/1940,1";

using (var parser = ChoCSVReader.LoadText(csv)


.WithFirstLineHeader()
.WithField("Id")
.WithField("Name")
.WithField("Salary", fieldType: typeof(float))
.WithField("JoinedDate", fieldType: typeof(DateTime))
.WithField("Active", fieldType: typeof(bool))
)
{
foreach (var rec in parser)
{
Console.WriteLine(rec.Dump());
}
}
}
20.5. DateTime Support
Cinchoo ETL implicitly handles parsing of datetime CSV column values from CSV files
using the system culture or a custom set culture. If you want fine control over the parsing
of these values, you can specify the format globally via
ChoTypeConverterFormatSpec.DateTimeFormat. The default value is 'd'.

FYI, changing this value has a system wide impact.

You can use any valid standard or custom datetime .NET format specification to
parse the datetime CSV values from the file.

Listing 20.5.1 Specifying datetime format specs during parsing

Hide Copy Code


static void DateTimeTest()
{
ChoTypeConverterFormatSpec.Instance.DateTimeFormat = "MMM dd, yyyy";
ChoTypeConverterFormatSpec.Instance.BooleanFormat =
ChoBooleanFormatSpec.ZeroOrOne;
string csv = @"Id,Name,Salary,JoinedDate,Active
1,Carl,12345679,""Jan 01, 2011"",0
2,Mark,50000,""Sep 23, 1995"",1
3,Tom,150000,""Apr 10, 1999"",1";

using (var parser = ChoCSVReader.LoadText(csv)


.WithFirstLineHeader()
.WithField("Id")
.WithField("Name")
.WithField("Salary", fieldType: typeof(float))
.WithField("JoinedDate", fieldType: typeof(DateTime))
.WithField("Active", fieldType: typeof(bool))
)
{
foreach (var rec in parser)
{
Console.WriteLine(rec.Dump());
}
}
}
The sample above shows how to parse custom datetime CSV values from a CSV file.

Note: As the datetime values contain the CSV separator, they are surrounded with double
quotes so they parse correctly.

21. Fluent API


CSVReader exposes a few frequently used configuration parameters via fluent API
methods. This makes programming CSV file parsing quicker.

21.1 WithDelimiter
This API method sets the CSV field separator on CSVReader.

Hide Copy Code


static void QuickDynamicTest()
{
string csv = @"Id,Name,Salary
1,Carl,10000
2,Mark,5000
3,Tom,2000";

using (var parser = ChoCSVReader.LoadText(csv)


.WithDelimiter(",")
.WithFirstLineHeader()
)
{
foreach (var rec in parser)
{
Console.WriteLine(rec.Dump());
}
}
}
21.2. WithFirstLineHeader
This API method flags whether the CSV file contains the first row as a header. The optional
bool parameter specifies how the first row header is treated. Default is true.

Hide Copy Code


static void QuickDynamicTest()
{
string csv = @"Id,Name,Salary
1,Carl,10000
2,Mark,5000
3,Tom,2000";

using (var parser = ChoCSVReader.LoadText(csv)


.WithFirstLineHeader()
)
{
foreach (var rec in parser)
{
Console.WriteLine(rec.Dump());
}
}
}
21.3. WithFields
This API method specifies the list of CSV fields to be considered for parsing and
loading. Other fields in the CSV file will be discarded. This call re-initializes
the field list with the specified columns.

In the dynamic object model, all the CSV columns will be created and parsed as string
type.

Hide Copy Code


static void QuickDynamicTest()
{
string csv = @"Id,Name,Salary
1,Carl,10000
2,Mark,5000
3,Tom,2000";

using (var parser = ChoCSVReader.LoadText(csv)


.WithFirstLineHeader()
.WithFields("Id", "Name")
)
{
foreach (var rec in parser)
{
Console.WriteLine(rec.Dump());
}
}
}
21.4. WithField
This API method is used to add a CSV column with a specific data type. This method is
helpful in the dynamic object model, letting you specify each individual CSV column with
the appropriate data type.

Hide Copy Code


static void QuickDynamicTest()
{
string csv = @"Id,Name,Salary
1,Carl,10000
2,Mark,5000
3,Tom,2000";

using (var parser = ChoCSVReader.LoadText(csv)


.WithFirstLineHeader()
.WithField("Id")
.WithField("Name")
.WithField("Salary", typeof(float))
)
{
foreach (var rec in parser)
{
Console.WriteLine(rec.Dump());
}
}
}
In the POCO model, you can opt in CSV columns using the type safe WithField API call.

For a POCO class

Hide Copy Code


public class EmployeeRec
{
public int Id
{
get;
set;
}
public string Name
{
get;
set;
}
public double Salary
{
get;
set;
}
}
You can opt-in Id, Name fields as below

Hide Copy Code


foreach (var e in new ChoCSVReader<EmployeeRec>("Emp.csv")
.WithFirstLineHeader()
.WithField(c => c.Id)
.WithField(c => c.Name)
)
{
Console.WriteLine(String.Format("Id: {0}", e.Id));
Console.WriteLine(String.Format("Name: {0}", e.Name));
}
21.5. QuoteAllFields
This API method is used to specify whether all fields are to be surrounded by quotes
or not.

Hide Copy Code


static void QuickDynamicTest()
{
string csv = @"Id,Name,Salary
1,Carl,10000
2,Mark,5000
3,Tom,2000";

using (var parser = ChoCSVReader.LoadText(csv)


.WithFirstLineHeader()
.QuoteAllFields()
)
{
foreach (var rec in parser)
{
Console.WriteLine(String.Format("Id: {0}", rec.Id));
Console.WriteLine(String.Format("Name: {0}", rec.Name));
Console.WriteLine(String.Format("Salary: {0}", rec.Salary));
}
}
}
21.6. ColumnCountStrict
This API method is used to make the CSVReader perform a column count check before
loading each line of the CSV file.

Hide Copy Code


static void QuickDynamicTest()
{
string csv = @"Id,Name,Salary
1,Carl,10000
2,Mark,5000
3,Tom,2000";

using (var parser = ChoCSVReader.LoadText(csv)


.WithFirstLineHeader()
.ColumnCountStrict()
)
{
foreach (var rec in parser)
{
Console.WriteLine(String.Format("Id: {0}", rec.Id));
Console.WriteLine(String.Format("Name: {0}", rec.Name));
Console.WriteLine(String.Format("Salary: {0}", rec.Salary));
}
}
}
21.7. ColumnOrderStrict
This API method is used to make the CSVReader perform a column order check before
loading each line of the CSV file. If any order mismatch is found on a line,
an error will be reported. This option is effective only if the CSV file contains
the first line as a header, so it must be combined with WithFirstLineHeader().

Hide Copy Code


static void QuickDynamicTest()
{
string csv = @"Id,Name,Salary
1,Carl,10000
2,Mark,5000
3,Tom,2000";

using (var parser = ChoCSVReader.LoadText(csv)


.WithFirstLineHeader()
.ColumnOrderStrict()
)
{
foreach (var rec in parser)
{
Console.WriteLine(String.Format("Id: {0}", rec.Id));
Console.WriteLine(String.Format("Name: {0}", rec.Name));
Console.WriteLine(String.Format("Salary: {0}", rec.Salary));
}
}
}
21.8. NotifyAfter
This API method is used to define the number of rows to be processed before generating
a notification event. This property is designed for user interface components that
illustrate CSV loading progress. Notifications are sent to subscribers of the
RowsLoaded event.

Hide Copy Code


static void NotifyAfterTest()
{
string csv = @"Id,Name,Salary
1,Carl,10000
2,Mark,5000
3,Tom,2000";

using (var parser = ChoCSVReader.LoadText(csv)


.WithFirstLineHeader()
.NotifyAfter(1000)
)
{
parser.RowsLoaded += (o, e) => Console.WriteLine(e.RowsLoaded);

foreach (var rec in parser)


{
Console.WriteLine(String.Format("Id: {0}", rec.Id));
Console.WriteLine(String.Format("Name: {0}", rec.Name));
Console.WriteLine(String.Format("Salary: {0}", rec.Salary));
}
}
}
21.9. Configure
This API method is used to configure any configuration parameters which are not
exposed via the fluent API.

Hide Copy Code


static void ConfigureTest()
{
string csv = @"Id,Name,Salary
1,Carl,10000
2,Mark,5000
3,Tom,2000";

using (var parser = ChoCSVReader.LoadText(csv)


.WithFirstLineHeader()
.Configure(c => c.ErrorMode = ChoErrorMode.ThrowAndStop)
)
{
foreach (var rec in parser)
{
Console.WriteLine(String.Format("Id: {0}", rec.Id));
Console.WriteLine(String.Format("Name: {0}", rec.Name));
Console.WriteLine(String.Format("Salary: {0}", rec.Salary));
}
}
}
21.10. Setup
This API method is used to set up the reader's parameters / events via the fluent API.
Hide Copy Code
static void SetupTest()
{
string csv = @"Id,Name,Salary
1,Carl,10000
2,Mark,5000
3,Tom,2000";

using (var parser = ChoCSVReader.LoadText(csv)


.WithFirstLineHeader()
.Setup(r => r.BeforeRecordLoad += (o, e) =>
{
if (e.Source.CastTo<string>().StartsWith("//"))
e.Skip = true;
}
)
)
{
foreach (var rec in parser)
{
Console.WriteLine(String.Format("Id: {0}", rec.Id));
Console.WriteLine(String.Format("Name: {0}", rec.Name));
Console.WriteLine(String.Format("Salary: {0}", rec.Salary));
}
}
}
21.11. IgnoreHeader
This API method is used to simply ignore the CSV header line in the file. In this
case, no field name cross checking will be performed. In dynamic mode, all
column names will be Column1, Column2, etc.

Hide Copy Code


static void IgnoreHeaderTest()
{
string csv = @"Id,Name,Salary
1,Carl,10000
2,Mark,5000
3,Tom,2000";

using (var parser = ChoCSVReader.LoadText(csv)


.IgnoreHeader()
)
{
foreach (var rec in parser)
{
Console.WriteLine(String.Format("Column1: {0}", rec.Column1));
Console.WriteLine(String.Format("Column2: {0}", rec.Column2));
Console.WriteLine(String.Format("Column3: {0}", rec.Column3));
}
}
}
21.12. WithEOLDelimiter
This API method lets you set a non standard end of line (EOL) character for parsing
the CSV file.

Hide Copy Code


static void EOLDelimiterTest()
{
string csv = @"Id,Name,Salary
1,Carl,10000
2,Mark,5000
3,Tom,2000";

using (var parser = ChoCSVReader.LoadText(csv)


.WithFirstLineHeader()
.WithEOLDelimiter(Environment.NewLine)
)
{
foreach (var rec in parser)
{
Console.WriteLine(String.Format("Id: {0}", rec.Id));
Console.WriteLine(String.Format("Name: {0}", rec.Name));
Console.WriteLine(String.Format("Salary: {0}", rec.Salary));
}
}
}
21.13. WithHeaderLineAt
If your CSV file comes with the header line appearing at a different line, you can
use this API method to set the header line number.

Hide Copy Code


using (var parser = new ChoCSVReader("emp.csv").WithHeaderLineAt(10))
{
foreach (var t in parser)
{
}
}
22. FAQ
22.1. I want to read a file with headers
You can use the WithFirstLineHeader() fluent API method as below:

Hide Copy Code


foreach (dynamic e in new ChoCSVReader("Emp.csv").WithFirstLineHeader())
{
Console.WriteLine(String.Format("Id: {0}", e.Id));
Console.WriteLine(String.Format("Name: {0}", e.Name));
}
Or use the Configuration.FileHeaderConfiguration.HasHeaderRecord member:

Hide Copy Code


using (var parser = new ChoCSVReader("Emp.csv"))
{
parser.Configuration.FileHeaderConfiguration.HasHeaderRecord = true;

foreach (dynamic e in parser)
{
Console.WriteLine(String.Format("Id: {0}", e.Id));
Console.WriteLine(String.Format("Name: {0}", e.Name));
}
}
22.1. The file has lines I want to skip from loading
You can subscribe to the parser's BeforeRecordLoad event to observe the lines and
ignore unwanted ones, as below:

Hide Copy Code


using (var parser = new ChoCSVReader("Emp.csv").WithFirstLineHeader())
{
parser.BeforeRecordLoad += (o, e) =>
{
if (e.Source != null)
{
e.Skip = ((string)e.Source).StartsWith("_");
}
};

foreach (var e in parser)


{
Console.WriteLine(String.Format("Id: {0}", e.Id));
Console.WriteLine(String.Format("Name: {0}", e.Name));
}
}
22.2. Not all characters are read from data file
CSVReader is smart enough to determine the encoding of the CSV file automatically. In
rare cases, you may want to specify the encoding explicitly so CSVReader reads
out all the characters from the CSV file.

The sample below shows how to specify the encoding on the CSVReader configuration object,
so the reader opens the file with the passed encoding value.

Hide Copy Code


using (var parser = new ChoCSVReader("Emp.csv").WithFirstLineHeader())
{
parser.Configuration.Encoding = Encoding.BigEndianUnicode;

foreach (var e in parser)


{
Console.WriteLine(String.Format("Id: {0}", e.Id));
Console.WriteLine(String.Format("Name: {0}", e.Name));
}
}
22.3. How to handle multiline records
CSVReader can handle multiline records in the data file. You can turn on this option
via the Configuration.MayContainEOLInData option.

The sample below shows how to read such a file:

Hide Copy Code


using (var parser = new ChoCSVReader("MultiLineValue.csv").WithFirstLineHeader())
{
parser.Configuration.MayContainEOLInData = true;

foreach (var e in parser)


Console.WriteLine(e.Dump());
}
22.4. How to handle errors and log them
CSVReader exposes a callback mechanism by which you can subscribe to events and
handle odd situations gracefully. The sample below shows how to handle field level
errors and log them to a custom log object.

Hide Copy Code


using (var parser = new
ChoCSVReader<EmployeeRec>("empwithsalary.csv").WithFirstLineHeader())
{
parser.RecordFieldLoadError += (o, e) =>
{
//Log the error message to log object
Console.Write(e.Exception.Message);
//Set the handled flag as true to continue loading of CSV file
e.Handled = true;
};
foreach (var i in parser)
Console.WriteLine(i.Dump());
}
In the above, we subscribe to the RecordFieldLoadError event and capture any errors in the
custom log object.

22.5. How to handle column headers having spaces or special characters in them?
CSVReader handles this situation gracefully and loads the CSV file without any
issues. When you use the dynamic parser, special characters in column names are
replaced with the _ character.

ex.

COMPANY NAME -> COMPANY_NAME

COMPANY TYPE -> COMPANY_TYPE

FIRST NAME$ -> FIRST_NAME_

Hide Copy Code


COMPANY NAME, Email, Job Title,COMPANY TYPE, CITY, FIRST NAME$, LAST NAME
"Bbc Worldwide Labs, Bounce Republic Ltd",hidden @hidden.com,"Broadcaster, Voice
Over & Founder Of Passion Pods", Broadcast Media, London, Emily, Chiswell
When loading the file using the dynamic parser, the field names are normalized automatically
with the _ character as below:

Hide Copy Code


foreach (dynamic rec in new ChoCSVReader("Emp.csv").WithFirstLineHeader())
{
Console.WriteLine(rec.COMPANY_NAME);
Console.WriteLine(rec.COMPANY_TYPE);
Console.WriteLine(rec.FIRST_NAME_);
}
UPDATE:

Now you can access members by original names through indexers as shown below

Hide Copy Code


foreach (dynamic rec in new ChoCSVReader("Emp.csv").WithFirstLineHeader())
{
Console.WriteLine(rec["COMPANY_NAME");
Console.WriteLine(rec["COMPANY TYPE");
Console.WriteLine(rec["FIRST NAME$"]);
}
22.6. Failed to format DateTime during load?
CSVReader uses en-US as the default culture. If the data file contains datetimes in a
different culture, CSVReader may fail to load those values during parsing. This
can be handled by setting the culture to match the data file's culture. The sample below
shows how to load a data file containing datetimes in the en-GB culture.

CSV File (Emp.csv):

Hide Copy Code


Id,Date,Account,Amount,Subcategory,Memo
1,09/05/2017,XXX XXXXXX,-29.00,FT,Sample string
2,09/05/2017,XXX XXXXXX,-20.00,FT,Sample string
3,25/05/2017,XXX XXXXXX,-6.30,PAYMENT,Sample string
POCO class:
Hide Copy Code
class Transaction
{
public string Id { get; set; }
public DateTime Date { get; set; }
public string Account { get; set; }
public decimal Amount { get; set; }
public string Subcategory { get; set; }
public string Memo { get; set; }
}
Parsing Code:

Hide Copy Code


using (var parser = new ChoCSVReader<Transaction>("emp.csv").WithFirstLineHeader())
{
parser.Configuration.Culture = CultureInfo.GetCultureInfo("en-GB");
foreach (var t in parser)
Console.WriteLine(string.Format("{0:dd-MMM-yyyy} {1} {2,6} {3,-7} {4}",
t.Date, t.Account, t.Amount, t.Subcategory, t.Memo));
}
22.7. How to handle duplicate CSV columns?
CSVReader can't handle duplicate CSV columns out of the box. If duplicates are found, it will
error out and stop the parsing of the CSV file. But there is a workaround to handle
such a situation.

CSV File (Emp.csv):

Hide Copy Code


Id,Date,Account,Amount,Subcategory,Memo,Memo
1,09/05/2017,XXX XXXXXX,-29.00,FT,Memo1,DupMemo1
2,09/05/2017,XXX XXXXXX,-20.00,FT,Memo2,DupMemo2
3,25/05/2017,XXX XXXXXX,-6.30,PAYMENT,Memo3,DupMemo2
In the above, the file has a duplicate 'Memo' CSV column. In order to load it using
dynamic mode, you should tell the CSVReader to ignore the header line using the
WithFirstLineHeader(true) overload as below:

Hide Copy Code


using (var parser = new ChoCSVReader("emp.csv").WithFirstLineHeader(true))
{
foreach (var t in parser)
{
Console.WriteLine(t[1]);
Console.WriteLine(t["Column1"]);
}
}
In the above, you have the choice of accessing the CSV columns by index or by the system
generated names Column1, Column2, etc.

Alternatively, you can provide column name overrides via configuration as below to
access them by name:

Hide Copy Code


using (var parser = new ChoCSVReader("emp.csv").WithFirstLineHeader(true)
.WithColumns("Id,Date,Account,Amount,Subcategory,Memo,Memo1"))
{
foreach (var t in parser)
{
Console.WriteLine(t.Memo);
Console.WriteLine(t.Memo1);
}
}
POCO class:

When using a POCO class for parsing, define the members mapped to CSV columns by position
using ChoCSVRecordFieldAttribute. Below, we define the Memo1 object member to map
to the duplicate Memo column.

Hide Copy Code


class Transaction
{
[ChoCSVRecordField(1)]
public string Id { get; set; }
[ChoCSVRecordField(2)]
public DateTime Date { get; set; }
[ChoCSVRecordField(3)]
public string Account { get; set; }
[ChoCSVRecordField(4)]
public decimal Amount { get; set; }
[ChoCSVRecordField(5)]
public string Subcategory { get; set; }
[ChoCSVRecordField(6)]
public string Memo { get; set; }
[ChoCSVRecordField(7)]
public string Memo1 { get; set; }
}
Then pass this POCO class to ChoCSVReader to load the file as below:

Hide Copy Code


using (var parser = new ChoCSVReader<Transaction>("emp.csv").WithFirstLineHeader(true))
{
foreach (var t in parser)
{
Console.WriteLine(t.Memo);
Console.WriteLine(t.Memo1);
}
}
22.8. What if the CSV header is found on a line other than the first?
Rarely, some CSV files may come with the header line somewhere other than the first
line. If the header line always comes at a fixed line, this sample shows you how to
parse the file using the WithHeaderLineAt() method.

CSV File (Emp.csv):

Hide Copy Code


# This file is produced by XYZ systems
# Id - id column
# Date - Date of the account opened
Id,Date,Account,Amount,Subcategory,Memo,Memo1
1,09/05/2017,XXX XXXXXX,-29.00,FT,Memo1,DupMemo1
2,09/05/2017,XXX XXXXXX,-20.00,FT,Memo2,DupMemo2
3,25/05/2017,XXX XXXXXX,-6.30,PAYMENT,Memo3,DupMemo2
In the above, the CSV file has the header at line 4 (PS: the index starts at 1). Here is
how you can load it when the header is present at line 4.

Hide Copy Code


using (var parser = new ChoCSVReader("emp.csv").WithHeaderLineAt(4))
{
foreach (var t in parser)
{
Console.WriteLine(t.Id);
Console.WriteLine(t.Date);
}
}
What if the CSV file header comes at a random line each time you receive the
file, but is preceded only by comment lines? This can be handled by the standard
comment mechanism as shown below:

Hide Copy Code


using (var parser = new ChoCSVReader("emp.csv")
.Configure(c => c.Comments = new string[] { "#" })
.WithFirstLineHeader()
)
{
foreach (var t in parser)
{
Console.WriteLine(t.Id);
Console.WriteLine(t.Date);
}
}
22.9. How to turn off Records loaded or other trace messages?
You can turn off or control the Cinchoo ETL trace messages by setting the
ChoETLFrxBootstrap.TraceLevel property at application startup:

Hide Copy Code


ChoETLFrxBootstrap.TraceLevel = System.Diagnostics.TraceLevel.Off;
Possible values are

Hide Copy Code


System.Diagnostics.TraceLevel.Off
System.Diagnostics.TraceLevel.Info;
System.Diagnostics.TraceLevel.Error;
System.Diagnostics.TraceLevel.Verbose;
22.10. How to specify default values to CSV columns?
The sample below shows how to specify defaults for CSV columns in the dynamic model:

Hide Copy Code


using (var cr = new ChoCSVReader("sample.csv")
.WithFirstLineHeader()
.WithField("firstName", defaultValue: "NA")
.WithField("lastName")
.WithField("salary", defaultValue: 100, fieldType: typeof(double))
)
{
foreach (var x in cr)
Console.WriteLine(ChoUtility.Dump(x));
}
In the above sample, the default value of the 'Salary' CSV column is specified as 100
(and 'firstName' gets a placeholder default of "NA") using the WithField fluent API.

In the POCO model, the default values can be specified using
System.ComponentModel.DefaultValueAttribute as shown below:

Hide Copy Code


public class Employee
{
public string FirstName { get; set; }
public string LastName { get; set; }
[DefaultValue(100)]
public double Salary { get; set; }
}
22.11. Does Cinchoo ETL support hierarchical objects?
Yes, it does. If your CSV file comes with a header whose column names use a
hierarchical separator, Cinchoo ETL automatically constructs them into a nested object.

PS: This feature is currently supported in the dynamic object model only.

Nested CSV:

Hide Copy Code


id,name,category/id,category/name,category/subcategory/id,category/subcategory/name,description
1,Test123,15,Cat123,10,SubCat123,Desc123
The above sample CSV file comes with nested column names, e.g. category/name,
category/subcategory/id, etc. The column name separator is '/'. In order to treat
them as nested columns and load them as a nested object, you must specify the
'NestedColumnSeparator' property of the ChoCSVReader engine. The sample below shows how
to do it:

Hide Copy Code


using (var csv = new ChoCSVReader("nested.csv").WithFirstLineHeader()
.Configure(c => c.NestedColumnSeparator = '/')
)
{
foreach (var x in csv)
Console.WriteLine(x.DumpAsJson());
}
22.12. Does the Cinchoo driver auto discover column data types?
Yes, it does. The CSV configuration exposes the MaxScanRows property to specify the
number of rows to be scanned before deciding the column data types. By default, it
is 0. The Cinchoo ETL driver applies its own algorithm to determine the column
types. It does not always help; it is most effective when all the cell
values exist in the file.

Hide Copy Code


using (var csv = new ChoCSVReader("sample.csv").WithFirstLineHeader()
.Configure(c => c.MaxScanRows = 10)
)
{
foreach (var x in csv)
Console.WriteLine(x.DumpAsJson());
}
22.13. How to comment or ignore lines in a CSV file?
There are a couple of ways you can ignore lines.

Using a comment character:

Hide Copy Code


using (var csv = new ChoCSVReader("sample.csv").WithFirstLineHeader()
.Configure(c => c.Comment = "#")
)
{
foreach (var x in csv)
Console.WriteLine(x.DumpAsJson());
}
By registering a 'BeforeRecordLoad' event handler with the CSVReader engine:

Hide Copy Code


foreach (dynamic rec in new ChoCSVReader("emp.csv")
.Setup(r => r.BeforeRecordLoad += (o, e) =>
{
if (e.Source != null)
{
e.Skip = ((string)e.Source).StartsWith("#");
}
})
)
{
Console.WriteLine(rec.Id);
}
22.14. How to bulk copy a CSV file to SQL Server (or any db)?
SqlBulkCopy lets you efficiently bulk load data to SQL Server. With Cinchoo ETL, you
can load a large CSV file to SQL Server quickly. The sample below shows how you can do it.

For a sample CSV:

Hide Copy Code


CustId,Name,Balance,AddedDate
1732,Tom Perez,435.00,05/11/2002
5541,Pedro Gomez,12342.30,06/02/2004
1122,Mark Politti,0.00,01/02/2000
1924,Pablo Ramirez,3321.30,11/24/2002
Define database table matching above CSV file format

Hide Copy Code


IF OBJECT_ID ('dbo.Customers') IS NOT NULL
DROP TABLE dbo.Customers
GO

CREATE TABLE dbo.Customers


(
CustId INT NOT NULL,
Name NVARCHAR (max) NULL,
Balance DECIMAL (18, 2) NOT NULL,
AddedDate DATETIME NOT NULL
)
GO
Here is how you can upload the file to sqlserver

Hide Copy Code


string connectionstring =
@"Data Source=(localdb)\v11.0;Initial Catalog=TestDb;Integrated Security=True";

using (SqlBulkCopy bcp = new SqlBulkCopy(connectionstring))


{
using (var dr = new ChoCSVReader("Cust.csv").WithFirstLineHeader()
.WithField("CustId", fieldType: typeof(int))
.WithField("Name", fieldType: typeof(string))
.WithField("Balance", fieldType: typeof(double))
.WithField("AddedDate", fieldType: typeof(DateTime))
.AsDataReader())
{
bcp.DestinationTableName = "dbo.Customers";
bcp.EnableStreaming = true;

bcp.BatchSize = 10000;
bcp.BulkCopyTimeout = 0;
bcp.NotifyAfter = 10;
bcp.SqlRowsCopied += delegate (object sender, SqlRowsCopiedEventArgs e)
{
Console.WriteLine(e.RowsCopied.ToString("#,##0") + " rows copied.");
};
bcp.WriteToServer(dr);
}
}
22.15. What is the best way to get column names from CSV file?
The header is stored in the CSVReader.Context object. In order to get the headers,
you must read the first record. The samples below show how to get the column names of
the CSV file.

Hide Copy Code


Id, Name
1, Tom
2, Mark
Sample code shows how to get the column names

Hide Copy Code


using (var p = new ChoCSVReader("emp.csv").WithFirstLineHeader())
{
p.Read();
Console.WriteLine(String.Join(", ", p.Context.Headers));
}
22.16. Is there a way to parse CSV, mapping by varied column names?
Let's say you have multiple CSV files with varying column names and you want to parse
them using the same POCO object. Cinchoo ETL offers a way to set up this
scenario by specifying alternative column names for each field.

For a CSV files

Emp1.csv

Hide Copy Code


Num, FirstName
1, Tom
2, Mark
Emp2.csv

Hide Copy Code


Identity, LastName
1, Tom
2, Mark
In order to parse these 2 files using the same POCO object, set it up as below:

Hide Copy Code


public class Employee
{
[ChoCSVRecordField(1, AltFieldNames = "Num, Identity")]
public int Id { get; set; }

[ChoCSVRecordField(1, AltFieldNames = "FirstName, LastName")]


public int Name { get; set; }
}
Once you have the columns set up with alternative names, you can use the same POCO
to parse the above 2 files. The sample below shows how:

Hide Copy Code


using (var p = new ChoCSVReader<Employee>("emp1.csv").WithFirstLineHeader())
{
foreach (var emp in p)
Console.WriteLine(emp.Id);
}

using (var p = new ChoCSVReader<Employee>("emp2.csv").WithFirstLineHeader())


{
foreach (var emp in p)
Console.WriteLine(emp.Id);
}
22.17. How to extract CSV file to DataTable?
You can do it with Cinchoo ETL in a few lines of code:

Hide Copy Code


using (var p = new ChoCSVReader("emp.csv").WithFirstLineHeader())
{
DataTable dt = p.AsDataTable();
}
22.18. How to extract CSV file to DataReader?
You can do it with Cinchoo ETL in a few lines of code:

Hide Copy Code


using (var p = new ChoCSVReader("emp.csv").WithFirstLineHeader())
{
IDataReader dr = p.AsDataReader();
while (dr.Read())
{
Console.WriteLine("Id: {0}, Name: {1}", dr[0], dr[1]);
}
}
22.19. How to handle column values with quotes in them?
You can mark the columns whose values carry quotes by setting the QuoteField
attribute value to true. Let's say the below CSV file comes with the Name column
surrounded by quotes:

Emp.csv

Hide Copy Code


Id, Name
1, "Tom"
2, "Mark"
You can specify the Name property in the POCO object with QuoteField as true.

Hide Copy Code


public class Employee
{
[ChoCSVRecordField(1)]
public int Id { get; set; }

[ChoCSVRecordField(2, QuoteField = true)]
public string Name { get; set; }
}
22.20. How to read a CSV file into a typed DataTable?
Cinchoo CSV reader simplifies it by exposing the Fill() method to populate a typed
DataTable from a CSV file.

Let's say DataSet1 is created with an Employees typed DataTable. The sample below shows
how to load the CSV into it.

Hide Copy Code


DataSet1 ds = new DataSet1();
using (var p = new ChoCSVReader("emp.csv").WithFirstLineHeader())
{
p.Fill(ds.Employees);
}
22.21. How can I find empty lines in a CSV file for tracking purposes?
Cinchoo CSV reader exposes a callback event to report empty lines found in the
CSV file.

The sample below shows how to achieve it in the dynamic model:

Hide Copy Code


static void ReportEmptyLines()
{
using (var p = new ChoCSVReader("EmptyLines.csv").WithFirstLineHeader()
.Setup(s => s.EmptyLineFound += (o, e) =>
{
Console.WriteLine(e.LineNo);
})
)
{
foreach (dynamic rec in p)
Console.WriteLine(rec.DumpAsJson());
}
}
In the POCO model, you can also track them by having the record object implement the
IChoEmptyLineReportable interface.

Hide Copy Code


public class Employee : IChoEmptyLineReportable
{
[ChoCSVRecordField(1)]
public int Id { get; set; }

[ChoCSVRecordField(2, QuoteField = true)]
public string Name { get; set; }

public bool EmptyLineFound(long lineNo)


{
Console.WriteLine("Empty line at: " + lineNo);
return true; //true to continue the process, otherwise false
}
}
Then parsing code looks as below

Hide Copy Code


static void ReportEmptyLines()
{
using (var p = new
ChoCSVReader<Employee>("EmptyLines.csv").WithFirstLineHeader())
{
foreach (var rec in p)
Console.WriteLine(rec.DumpAsJson());
}
}
22.22. How can I change the order of the fields?
If you have multiple classes with the setup below:

Hide Copy Code


public class Employee
{
public int Id { get; set; }
public string Name { get; set; }
}

public class FullTime : Employee


{
public double Salary { get; set; }
public string Department { get; set; }
}

public class Contract : Employee


{
public double Rate { get; set; }
public string Department { get; set; }
}
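One way to control the field order (a sketch, not necessarily the only approach) is to pin the positions explicitly with ChoCSVRecordFieldAttribute, the same positional attribute used in sections 22.7 and 22.19; the positions below are illustrative:

public class Employee
{
    [ChoCSVRecordField(1)]
    public int Id { get; set; }
    [ChoCSVRecordField(2)]
    public string Name { get; set; }
}

public class FullTime : Employee
{
    [ChoCSVRecordField(3)]
    public double Salary { get; set; }
    [ChoCSVRecordField(4)]
    public string Department { get; set; }
}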
22.23. How to turn off column type discovery?
Cinchoo CSV reader automatically discovers column types as best it can in the dynamic model.
You can turn this feature off by setting Configuration.MaxScanRows to 0. In this
case, all the column types are treated as string unless specified explicitly.

Hide Copy Code


static void TurnOffColumnTypeDiscovery()
{
using (var p = new ChoCSVReader("EmptyLines.csv").WithFirstLineHeader()
.Configure(c => c.MaxScanRows = 0)
)
{
foreach (var rec in p)
Console.WriteLine(rec.DumpAsJson());
}
}
22.24. How to skip lines conditionally during CSV parsing?
If the CSV file comes with lines besides the actual CSV data and you want to
ignore them, Cinchoo exposes the SkipUntil callback event to skip those lines. The sample
below shows how to skip lines of a CSV file.

Hide Copy Code


static void SkipUntilTest()
{
using (var p = new ChoCSVReader("EmptyLines.csv").WithFirstLineHeader()
.Setup(r => r.SkipUntil += (o, e) =>
{
string line = e.Source as string;
e.Skip = !line.StartsWith("** Some Match **");
})
)
{
foreach (var rec in p)
Console.WriteLine(rec.DumpAsJson());
}
}
22.25. How to stop CSV parsing conditionally?
If the CSV file comes with lines besides the actual CSV data and you want to
stop the parsing once you come across some invalid line, Cinchoo exposes the DoWhile
callback event to customize the logic and stop the parsing if needed. The sample below
shows how to stop the CSV parsing.

Hide Copy Code


static void DoWhileTest()
{
using (var p = new ChoCSVReader("EmptyLines.csv").WithFirstLineHeader()
.Setup(r => r.DoWhile += (o, e) =>
{
string line = e.Source as string;
e.Stop = line.StartsWith("** Some Match **");
})
)
{
foreach (var rec in p)
Console.WriteLine(rec.DumpAsJson());
}
}
22.26. Does the Cinchoo reader support loading CSV into child classes?
Yes, it does. Make sure the field names are distinct. In case there are
duplicates, you can resolve them by specifying unique names using
DisplayNameAttribute. DisplayNameAttribute can also be used to map a class member
to a specific CSV column (see the sketch after the parsing code). The sample below
shows how to do it.

For a sample CSV file:

Hide Copy Code


SITE_ID,HOUSE,STREET,CITY,STATE,ZIP,APARTMENT
44,545395,PORT ROYAL,CORPUS CHRISTI,TX,,2
44,608646,TEXAS AVE,ODESSA,TX,79762,
44,487460,EVERHART RD,CORPUS CHRISTI,TX,78413,
44,275543,EDWARD GARY,SAN MARCOS,TX,78666,4
44,136811,MAGNOLIA AVE,SAN ANTONIO,TX1,,1
Class structure:

Hide Copy Code


public class Site
{
public int SiteID { get; set; }
public int House { get; set; }
public SiteAddress SiteAddress { get; set; }
public int Apartment { get; set; }
}
public class SiteAddress
{
public string Street { get; set; }
public string City { get; set; }
public SitePostal SitePostal { get; set; }
}
public class SitePostal
{
public string State { get; set; }
public string Zip { get; set; }
}
Parsing code:

Hide Copy Code


using (var p = new ChoCSVReader<Site>("Sample3.csv")
.WithFirstLineHeader()
)
{
foreach (var rec in p)
Console.WriteLine(rec.Dump());
}
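A minimal sketch of the DisplayNameAttribute mapping mentioned above, binding members explicitly to the column names from the sample file; this is an illustrative variant of the Site class, not a required change (System.ComponentModel namespace):

using System.ComponentModel;

public class Site
{
    [DisplayName("SITE_ID")]
    public int SiteID { get; set; }
    [DisplayName("HOUSE")]
    public int House { get; set; }
    public SiteAddress SiteAddress { get; set; }
    [DisplayName("APARTMENT")]
    public int Apartment { get; set; }
}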
22.27. How to turn on validation for a child object?
By decorating the child class member with ChoValidateObjectAttribute, you can turn
on validation for it.

Hide Copy Code


public class Site
{
[Required(ErrorMessage = "SiteID can't be null")]
public int SiteID { get; set; }
[Required]
public int House { get; set; }
[ChoValidateObject]
public SiteAddress SiteAddress { get; set; }
public int Apartment { get; set; }
}
public class SiteAddress
{
[Required]
public string Street { get; set; }
[Required]
[RegularExpression("^[a-zA-Z][a-zA-Z ]*$")]
public string City { get; set; }
}
In the above, SiteAddress is a child member decorated with
ChoValidateObjectAttribute to fire object validation during CSV parsing.

The sample below shows loading the CSV file with validation on:

Hide Copy Code


using (var p = new ChoCSVReader<Site>("Sample3.csv")
.WithFirstLineHeader()
.Configure(c => c.ObjectValidationMode = ChoObjectValidationMode.ObjectLevel)
)
{
foreach (var rec in p)
Console.WriteLine(rec.Dump());
}
22.28. How to specify a CSV column size for a member?
You can use either ChoCSVRecordFieldAttribute or StringLengthAttribute
(System.ComponentModel.DataAnnotations) to specify the maximum size of each CSV column
value. By default, the CSV reader truncates and loads CSV values that exceed the
specified size. To turn off the truncate behavior, set
ChoCSVConfiguration.Truncate to false. If truncate is off, an exception will be
raised when a CSV value is found over the size.

Hide Copy Code


public class Site
{
[Required(ErrorMessage = "SiteID can't be null")]
[StringLength(5)]
public int SiteID { get; set; }
[Required]
public int House { get; set; }
[ChoValidateObject]
public SiteAddress SiteAddress { get; set; }
public int Apartment { get; set; }
}
In the sample above, SiteID is specified with size 5 using StringLengthAttribute. A sketch of turning the truncate behavior off is shown below.
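A minimal sketch of disabling truncation via the fluent Configure API, assuming the Truncate flag named above is exposed on the reader configuration:

using (var p = new ChoCSVReader<Site>("Sample3.csv")
    .WithFirstLineHeader()
    .Configure(c => c.Truncate = false)   // over-size values now raise an error instead of being silently truncated
    )
{
    foreach (var rec in p)
        Console.WriteLine(rec.Dump());
}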

22.28. How to specify a CSV column name for a member?


You can use either ChoCSVRecordFieldAttribute or DisplayAttribute
(System.ComponentModel.DataAnnotations) to map a CSV column to an object member.

Hide Copy Code


public class Site
{
[Required(ErrorMessage = "SiteID can't be null")]
[Display(Name="ID")]
public int SiteID { get; set; }
[Required]
public int House { get; set; }
[ChoValidateObject]
public SiteAddress SiteAddress { get; set; }
public int Apartment { get; set; }
}
22.29. How to handle special null values in CSV?
Cinchoo handles special null values in CSV data via Configuration.NullValue.

For CSV file:

Hide Copy Code


Id, Name, City
1, Tom, #NULL#
2, Mark, NJ
3, Lou, FL
4, Smith, PA
5, Raj, DC
In the above CSV, some city values are specified with #NULL# to represent null
values. This can be handled as below

Hide Copy Code


using (var cp2 = new ChoCSVReader(new StringReader(csv))
.WithFirstLineHeader()
.Configure(c => c.NullValue = "#NULL#")
)
{
foreach (var rec in cp2)
Console.WriteLine(rec.Dump());
}
Or, if a POCO object is used for parsing, it can be specified at the object level as below:

Hide Copy Code


[ChoCSVRecordObject(NullValue = "#NULL#")]
public class Emp
{
public int Id { get; set; }
public string Name { get; set; }
public string City { get; set; }
}
Or at the member level

Hide Copy Code


public class Emp
{
[ChoCSVRecordField(1)]
public int Id { get; set; }
[ChoCSVRecordField(2)]
public string Name { get; set; }
[ChoCSVRecordField(3, NullValue = "#NULL#")]
public string City { get; set; }
}
22.30. Does Cinchoo handle currency values?
Yes, it does. Cinchoo auto discovers them based on the current culture and loads them as
ChoCurrency objects by default.

For CSV file:

Hide Copy Code


Id, Name, City, Salary
1, Tom, NY, $10000
2, Mark, NJ, $12000
3, Lou, FL, $150000
4, Smith, PA, $132000
5, Raj, DC, $200500
In the above CSV, the Salary field will be loaded as the ChoCurrency type.

Hide Copy Code


using (var cp2 = new ChoCSVReader(new StringReader(csv))
.WithFirstLineHeader()
)
{
foreach (var rec in cp2)
Console.WriteLine(rec.Dump());
}
Alternatively, you can control how the currency values are loaded through
Configuration.TreatCurrencyAsDecimal, as the sample below shows.

Hide Copy Code


foreach (var p in new ChoCSVReader("Sample2.csv").WithFirstLineHeader()
.Configure(c => c.TreatCurrencyAsDecimal = false)
)
{
Console.WriteLine(p.Dump());
}
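
If you parse the same feed through a POCO class, the salary column can also be typed as ChoCurrency on the member itself. A minimal sketch, assuming ChoCurrency (from the ChoETL library) is usable as a member type; reading it works the same way as the earlier ChoCSVReader<T> samples:

public class Emp
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string City { get; set; }
    public ChoCurrency Salary { get; set; }   // "$10000" style values land here
}
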
22.31. Can Cinchoo read different record types in the same CSV?
Yes, it can. You use a custom record selector to inspect each line and return the
matching record type to the driver, which then loads the line as that type.

For CSV file:

Hide Copy Code


PlaceName,Longitude,Latitude,Elevation
NameString,123.456,56.78,40

Date,Count
1/1/2012,1
2/1/2012,3
3/1/2012,10
4/2/2012,6
Define POCO classes matching the records in the CSV file as below

Hide Copy Code


public class LocationDef
{
    public string PlaceName { get; set; }
    public double Longitude { get; set; }
    public double Latitude { get; set; }
    public double Elevation { get; set; }
}

public class CountDef
{
    public DateTime Date { get; set; }
    public int Count { get; set; }
}
With the POCO classes in place, wire up a custom record selector to pick the record type for each line:

Hide Copy Code


using (var p = new ChoCSVReader(new StringReader(csv))
    .WithCustomRecordSelector((l) =>
    {
        Tuple<long, string> kvp = l as Tuple<long, string>;
        if (kvp.Item1 == 1 || kvp.Item1 == 3 || kvp.Item1 == 4) // skip the headers and empty lines
            return null;

        if (kvp.Item1 < 4)
            return typeof(LocationDef);
        else
            return typeof(CountDef);
    })
    )
{
    foreach (var rec in p)
        Console.WriteLine(ChoUtility.Dump(rec));
}
22.32. How to validate a CSV file?
You may have a requirement to validate the CSV file before doing any other
operation on it (e.g., saving to a database). Cinchoo provides an API to do this.
It leverages the DataAnnotations/Validation library to perform the CSV validation.
The sample below shows how to do it.

First, declare the POCO object with validation rules.

Hide Copy Code


public class Site
{
[Required(ErrorMessage = "SiteID can't be null")]
public int SiteID { get; set; }
[Required]
public int House { get; set; }
[ChoValidateObject]
public SiteAddress SiteAddress { get; set; }
public int Apartment { get; set; }
}
To perform validation,

Hide Copy Code


using (var p = new ChoCSVReader<Site>("Sample3.csv")
.WithFirstLineHeader(true)
)
{
Exception ex;
Console.WriteLine("IsValid: " + p.IsValid(out ex));
}
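
Besides the up-front IsValid check, you can also let the reader validate each record while it parses by turning on an object validation mode (as shown earlier with ObjectValidationMode). A minimal sketch, assuming an invalid record surfaces as an exception during enumeration; MemberLevel is used here purely as an illustration:

using (var p = new ChoCSVReader<Site>("Sample3.csv")
    .WithFirstLineHeader()
    .Configure(c => c.ObjectValidationMode = ChoObjectValidationMode.MemberLevel)
    )
{
    try
    {
        foreach (var rec in p)
            Console.WriteLine(rec.Dump());
    }
    catch (Exception ex)
    {
        // The first record that fails validation ends up here
        Console.WriteLine("Validation failed: " + ex.Message);
    }
}
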
22.33. How to sort a large CSV file?
Large CSV files present a challenge when the need arises to sort them. A naive .NET
application fails with an OutOfMemoryException, and Microsoft Excel will promptly
report an error the first time you try to sort such a file.

If a plain string sort will work for you, just use the Windows SORT command: sort
the file and be done with it.

If you need to filter and convert the file, particularly date or custom-typed
fields, you will need to write a small conversion program that transforms the
fields the way you like and rewrites each record. This is where the Cinchoo ETL
framework plays a role, easing the conversion of large CSV files with a minimal
memory footprint and fast performance.
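
As an illustration of such a conversion program, the sketch below streams records from a large input file, filters them, and rewrites them using ChoCSVWriter, the companion writer class. The file names and the filter condition are placeholders, it reuses the Customer POCO defined below, and Where() needs a using System.Linq directive:

using (var r = new ChoCSVReader<Customer>("BigInput.csv").WithFirstLineHeader())
using (var w = new ChoCSVWriter<Customer>("Converted.csv").WithFirstLineHeader())
{
    // Records are streamed one at a time, so the memory footprint stays flat
    w.Write(r.Where(rec => rec.Balance > 0));
}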

Sample CSV File:

Hide Copy Code


CustId,Name,Balance,AddedDate
1732,Tom Perez,435.00,05/11/2002
5541,Pedro Gomez,12342.30,06/02/2000
1122,Mark Politti,0.00,01/02/2004
1924,Pablo Ramirez,3321.30,11/24/2002
Define a POCO class

Hide Copy Code


public class Customer
{
public int CustId { get; set; }
public string Name { get; set; }
public decimal Balance { get; set; }
public DateTime AddedDate { get; set; }
}
The code below shows how to sort the feed by 'AddedDate':

Hide Copy Code


public static void POCOSort()
{
    using (var dr = new ChoCSVReader<Customer>(@"Test.csv").WithFirstLineHeader())
    {
        foreach (var rec in dr.ExternalSort(new ChoLamdaComparer<Customer>((e1, e2)
            => DateTime.Compare(e1.AddedDate, e2.AddedDate))))
        {
            Console.WriteLine(rec.CustId);
        }
    }
}
Sorting the CSV file without a POCO class:

Hide Copy Code


public static void DynamicSort()
{
    using (var dr = new ChoCSVReader(@"Test.csv").WithFirstLineHeader())
    {
        foreach (var rec in dr.ExternalSort(new ChoLamdaComparer<dynamic>((e1, e2)
            => DateTime.Compare(e1.AddedDate, e2.AddedDate))))
        {
            Console.WriteLine(rec.CustId);
        }
    }
}
22.34. How to ignore a member from CSV loading?
In the opt-out model, you can exclude a member from CSV loading using
ChoIgnoreMemberAttribute.

Hide Copy Code


public class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
    [ChoIgnoreMember]
    public string City { get; set; }
}
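
With City excluded, reading the file works exactly like the earlier typed samples; only Id and Name are populated. A quick usage sketch (Emp.csv is a placeholder file name):

using (var p = new ChoCSVReader<Employee>("Emp.csv").WithFirstLineHeader())
{
    foreach (var rec in p)
        Console.WriteLine(rec.Id + ", " + rec.Name);   // City is never read from the file
}
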
22.35. How to handle custom datetime formatted values?
There are a number of ways you can handle custom date values in the CSV file. The
simplest, built-in option is to set a custom format string via formatText on the
CSV column.

For a sample CSV file

Hide Copy Code


Id, DateCreated
1, 20180201
2, 20171120
Here the DateCreated column comes with date values in 'yyyyMMdd' format. The sample
below shows how to handle such values and load the file successfully using the dynamic model.

Hide Copy Code


using (var p = new ChoCSVReader(new StringReader(csv))
    .WithFirstLineHeader()
    .WithField("Id", fieldType: typeof(int))
    .WithField("DateCreated", fieldType: typeof(DateTime), formatText: "yyyyMMdd")
    )
{
    foreach (var rec in p)
        Console.WriteLine(rec.Dump());
}
In the POCO opt-out model, define the class as below:

Hide Copy Code


[ChoCSVFileHeader]
public class Consumer
{
public int Id { get; set; }
[DisplayFormat(DataFormatString = "yyyyMMdd")]
public DateTime DateCreated { get; set; }
}
where the DateCreated field is decorated with the DisplayFormat attribute carrying the format text.

In the POCO opt-in model, you must define the class with all members decorated with
ChoCSVRecordFieldAttribute, as below:

Hide Copy Code


[ChoCSVFileHeader]
public class ConsumerOptIn
{
[ChoCSVRecordField(1)]
public int Id { get; set; }
[ChoCSVRecordField(2, FormatText = "yyyyMMdd")]
public DateTime DateCreated { get; set; }
}
In the above, the DateCreated field is given the format text in
ChoCSVRecordFieldAttribute to handle the custom format.

22.36. How to handle custom boolean values?
There are a number of ways you can handle custom boolean values in the CSV file. The
simplest, built-in option is to set a custom format string via formatText on the
CSV column.

Here are the standard formatting specs available (a case-insensitive comparison is
performed on the bool values):

YOrN - 'Y' - true, 'N' - false
TOrF - 'T' - true, 'F' - false
TrueOrFalse - 'true' - true, 'false' - false
YesOrNo - 'yes' - true, 'no' - false
ZeroOrOne - '1' - true, '0' - false
For a sample CSV file

Hide Copy Code


Id, DateCreated, IsActive
1, 20180201, A
2, 20171120, B
Here the IsActive column comes with custom boolean values: 'A' means true, any
other value means false.

The sample below shows how to handle such values and load the file successfully
using the dynamic model.
Hide Copy Code
using (var p = new ChoCSVReader(new StringReader(csv))
    .WithFirstLineHeader()
    .WithField("Id", fieldType: typeof(int))
    .WithField("DateCreated", fieldType: typeof(DateTime), formatText: "yyyyMMdd")
    .WithField("IsActive", fieldType: typeof(bool), formatText: "A")
    )
{
    foreach (var rec in p)
        Console.WriteLine(rec.Dump());
}
In the POCO opt-out model, define the class as below:

Hide Copy Code


[ChoCSVFileHeader]
public class Consumer
{
public int Id { get; set; }
[DisplayFormat(DataFormatString = "yyyyMMdd")]
public DateTime DateCreated { get; set; }
[DisplayFormat(DataFormatString = "A")]
public bool IsActive { get; set; }
}
where the IsActive field is decorated with the DisplayFormat attribute using 'A' as
the format text.

In the POCO opt-in model, you must define the class with all members decorated with
ChoCSVRecordFieldAttribute, as below:

Hide Copy Code


[ChoCSVFileHeader]
public class ConsumerOptIn
{
[ChoCSVRecordField(1)]
public int Id { get; set; }
[ChoCSVRecordField(2, FormatText = "yyyyMMdd")]
public DateTime DateCreated { get; set; }
[ChoCSVRecordField(3, FormatText = "A")]
public bool IsActive { get; set; }
}
In the above, the IsActive field is given the format text in
ChoCSVRecordFieldAttribute to handle the custom format.

22.37. How to load CSV into nested object members?
Cinchoo takes care of this seamlessly. Just define the class structure, mapping a
CSV column to each member.

For a sample CSV file

Hide Copy Code


Id, Name, Street, City
1, Tom, 1 Main Street, New York
2, Mark, 10 River Road, Boston
You can define a class structure matching the CSV file by field names as below:

Hide Copy Code


public class StudentInfo
{
    [DisplayName("Id")]
    public string Id { get; set; }
    public Student Student { get; set; }
}

public class Student
{
    [DisplayName("Name")]
    public string Name { get; set; }
    public Address Address { get; set; }
}

public class Address
{
    [DisplayName("Street")]
    public string Street { get; set; }
    [DisplayName("City")]
    public string City { get; set; }
}
Or you can define a class structure matching the CSV file by field position as below:

Hide Copy Code


public class StudentInfo
{
    [ChoFieldPosition(1)]
    public string Id { get; set; }
    public Student Student { get; set; }
}

public class Student
{
    [ChoFieldPosition(2)]
    public string Name { get; set; }
    public Address Address { get; set; }
}

public class Address
{
    [ChoFieldPosition(3)]
    public string Street { get; set; }
    [ChoFieldPosition(4)]
    public string City { get; set; }
}
To parse the CSV file into the above class structure, you can do as below:

Hide Copy Code


using (var r = ChoCSVReader<StudentInfo>.LoadText(csv)
.WithFirstLineHeader()
)
{
foreach (var rec in r)
{
Console.WriteLine(rec.Dump());
}
}
22.38. How to read CSV into Collection / Array members?
Cinchoo handles loading CSV into array/collection members of a specific type. Just
define the class structure, mapping a CSV column to each member.

For a sample CSV file, where CreId_0, CreName_0, CreId_1, CreName_1 are values to
be loaded into collection objects:

Hide Copy Code


Id, Name, CreId_0, CreName_0, CreId_1, CreName_1
1, Tom, CI0, CN0, CI1, CN1
2, Mark, CI20, CN20, CI21, CN21
You can define a class structure matching the CSV file by field names as below:

Hide Copy Code


public class StudentInfo
{
public string Id { get; set; }
public string Name { get; set; }
[Range(1, 2)]
public Course[] Courses { get; set; }
[DisplayName("Grade")]
[Range(1, 3)]
public List<string> Grades { get; set; }

public StudentInfo()
{
Courses = new Course[2];
}
}
public class Course
{
[DisplayName("CreId")]
public string CourseId { get; set; }
[DisplayName("CreName")]
public string CourseName { get; set; }
}
In the above, Courses is defined as an array of Course objects with RangeAttribute
specifying the number of occurrences (in this case 2 items, [0, 1]). Courses must
be initialized with an array of objects.

Here is the sample code to load the CSV into StudentInfo objects
Hide Copy Code
string csv = @"Id, Name, CreId_0, CreName_0, CreId_1,
CreName_1,Grade_1,Grade_2,Grade_3
1, Tom, CI0, CN0, CI1, CN1,A,B,C
2, Mark, CI20, CN20, CI21, CN21,A,B,C
";
using (var r = ChoCSVReader<StudentInfo>.LoadText(csv)
.WithFirstLineHeader()
)
{
foreach (var rec in r)
{
Console.WriteLine(rec.Dump());
}
}
The default array index separator is '_'. If your CSV file comes with a different
separator, you can specify it via Configuration.ArrayIndexSeparator, as in the
short sketch below.
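
For instance, if the header used a dash (CreId-0, CreName-0, ...), the separator could be changed like this; a minimal sketch that simply adds the separator setting to the earlier loading code:

using (var r = ChoCSVReader<StudentInfo>.LoadText(csv)
    .WithFirstLineHeader()
    .Configure(c => c.ArrayIndexSeparator = '-')
    )
{
    foreach (var rec in r)
        Console.WriteLine(rec.Dump());
}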

The sample below shows the full configuration approach using the Map/IndexMap helpers:


Hide Copy Code
string csv = @"Id, Name, CreId_0, CreName_0, CreId_1,
CreName_1,Grade_1,Grade_2,Grade_3
1, Tom, CI0, CN0, CI1, CN1,A,B,C
2, Mark, CI20, CN20, CI21, CN21,A,B,C
";

var config = new ChoCSVRecordConfiguration<StudentInfo>()
    .Map(f => f.Id)
    .Map(f => f.Grades, "Grade")
    .IndexMap(f => f.Courses, 0, 1)
    .IndexMap(f => f.Grades, 1, 3)
    .MapForType<Course>(f => f.CourseId, "CreId")
    .MapForType<Course>(f => f.CourseName, "CreName")
    .WithFirstLineHeader();

using (var r = ChoCSVReader<StudentInfo>.LoadText(csv, config))
{
    foreach (var rec in r)
    {
        Console.WriteLine(rec.Dump());
    }
}
In the above, the Map, IndexMap, and MapForType helper methods configure the CSV
layout against the POCO object model so that the CSV file loads into the defined
object structure.
22.39. How to preserve leading / padded zeros correctly in Excel?
When you open a CSV file in Excel, leading zeros are cut off from any numbers. This
is actually an Excel issue. To work around it, a value such as 08820 must be saved
as ="08820", which preserves the leading zeros when the file is opened in Excel.

If the CSV file is such an Excel-aware file, ChoCSVReader can parse these values
out of the box. There are a couple of ways to instruct the parser.

To handle a specific field, set its ExcelField property to true.
To handle it globally, set ImplicitExcelFieldValueHandling to true.

Listing 22.39.1 Global Excel field value handling approach

Hide Copy Code


static void ExcelFieldTest()
{
    string csv = @"Id,Name,Salary
1,Carl,=""10000""
2,Mark,=""5000""
3,Tom,=""2000""";

    using (var parser = ChoCSVReader.LoadText(csv)
        .WithFirstLineHeader()
        .Configure(c => c.ImplicitExcelFieldValueHandling = true)
        )
    {
        foreach (var rec in parser)
        {
            Console.WriteLine(String.Format("Id: {0}", rec.Id));
            Console.WriteLine(String.Format("Name: {0}", rec.Name));
            Console.WriteLine(String.Format("Salary: {0}", rec.Salary));
        }
    }
}
Listing 22.39.2 Targeted CSV field Excel value handling approach

Hide Copy Code


static void ExcelFieldTest()
{
    string csv = @"Id,Name,Salary
1,Carl,=""10000""
2,Mark,=""5000""
3,Tom,=""2000""";

    using (var parser = ChoCSVReader.LoadText(csv)
        .WithFirstLineHeader()
        .WithField("Id")
        .WithField("Name")
        .WithField("Salary", m => m.Configure(c => c.ExcelField = true))
        )
    {
        foreach (var rec in parser)
        {
            Console.WriteLine(String.Format("Id: {0}", rec.Id));
            Console.WriteLine(String.Format("Name: {0}", rec.Name));
            Console.WriteLine(String.Format("Salary: {0}", rec.Salary));
        }
    }
}
22.40. Does ChoCSVReader support a multi-line header?
In rare cases, a CSV file may come with a multi-line header. ChoCSVReader handles
it seamlessly. The sample CSV below has a multi-line header spanning lines 1-2.

Listing 22.40.1 Multi-Line header CSV file

Hide Copy Code


CUSTOMER,CUSTOMER NAME,INVOICE ID,PURCHASE,PRODUCT ID,PURCHASED,PURCHASED
QTY,LOCATION
ID,DATE,AMOUNT,QUANTITY ID
22160,MANSFIELD BROTHERS HEATING & AIR,sss.001,02/08/2017,193792,69.374,2,30
27849,OWSLEY SUPPLY LLC - EQUIPMENT,sss.001,03/14/2017,123906,70.409,1,2
27849,OWSLEY SUPPLY LLC - EQUIPMENT,sss.001,03/14/2017,40961,10.000,1,2
16794,ALEXANDER GILMORE dba AL'S HEATING,sss.001,01/25/2017,116511,63.016,1,15
16794,ALEXANDER GILMORE dba AL'S HEATING,sss.001,01/25/2017,116511,-63.016,-1,15
16794,ALEXANDER GILMORE dba AL'S HEATING,sss.001,01/25/2017,122636,30.748,1,15
16794,ALEXANDER GILMORE dba AL'S HEATING,sss.001,01/25/2017,137661,432.976,1,15
16794,ALEXANDER GILMORE dba AL'S HEATING,sss.001,01/25/2017,137661,-432.976,-1,15
The sample code below shows how to parse the above CSV file successfully:

Listing 22.40.2 Parsing Multi-Line header CSV file

Hide Shrink Copy Code


static void MultiLineHeaderTest()
{
string csv = @"CUSTOMER,CUSTOMER NAME,INVOICE ID,PURCHASE,PRODUCT
ID,PURCHASED,PURCHASED QTY,LOCATION
ID,DATE,AMOUNT,QUANTITY ID
22160,MANSFIELD BROTHERS HEATING & AIR,sss.001,02/08/2017,193792,69.374,2,30
27849,OWSLEY SUPPLY LLC - EQUIPMENT,sss.001,03/14/2017,123906,70.409,1,2
27849,OWSLEY SUPPLY LLC - EQUIPMENT,sss.001,03/14/2017,40961,10.000,1,2
16794,ALEXANDER GILMORE dba AL'S HEATING,sss.001,01/25/2017,116511,63.016,1,15
16794,ALEXANDER GILMORE dba AL'S HEATING,sss.001,01/25/2017,116511,-63.016,-1,15
16794,ALEXANDER GILMORE dba AL'S HEATING,sss.001,01/25/2017,122636,30.748,1,15
16794,ALEXANDER GILMORE dba AL'S HEATING,sss.001,01/25/2017,137661,432.976,1,15
16794,ALEXANDER GILMORE dba AL'S HEATING,sss.001,01/25/2017,137661,-432.976,-1,15";

foreach (var rec in ChoCSVReader.LoadText(csv)
.WithMaxScanRows(2)
.Setup(s =>
{
s.MultiLineHeader += (o, e) =>
{
if (e.LineNo <= 2)
e.IsHeader = true;
else
e.IsHeader = false;
};
})
.Configure(c => c.TurnOnMultiLineHeaderSupport = true)
.ThrowAndStopOnMissingField(false)
)
Console.WriteLine(rec.Dump());
}
22.41. How to load CSV into a Dictionary member of a POCO type?
If your CSV file comes with columns that you want to load into a dictionary member,
you can do so with ChoCSVReader. For the sample CSV file below:

Hide Copy Code


Id, Name, K1, K2
1, Tom, A, B
2, Mark, C, D
where K1 and K2 are the columns you want to load into the dictionary member.

First define a POCO class as below

Hide Copy Code


public class StudentInfo
{
public string Id { get; set; }
public string Name { get; set; }
[ChoDictionaryKey("K1, K2")]
public Dictionary<string, string> Grades { get; set; }
}
In the above, the Grades member is decorated with ChoDictionaryKeyAttribute listing
the CSV columns to be loaded into it.

Then you can consume the CSV file; ChoCSVReader takes care of the plumbing to load
the data into the object appropriately.

Hide Copy Code


public static void CSV2DictionaryMemberTest()
{
string csv = @"Id, Name, K1, K2
1, Tom, A, B
2, Mark, C, D
";

using (var r = ChoCSVReader<StudentInfo2>.LoadText(csv))


{
foreach (var rec in r)
Console.WriteLine(rec.Dump());
}
}
Next, I'll show how to achieve the same using the configuration approach.
(StudentInfo2 below can be assumed to be the same POCO shape as StudentInfo but
without the ChoDictionaryKey attribute, since the mapping is supplied through the
configuration instead.)

Hide Copy Code


public static void CSV2DictionaryMemberTest()
{
string csv = @"Id, Name, K1, K2
1, Tom, A, B
2, Mark, C, D
";

var config = new ChoCSVRecordConfiguration<StudentInfo2>()


.Map(f => f.Id)
.Map(f => f.Name)
.DictionaryMap(f => f.Grades, new string[] { "K1", "K2" });

using (var r = ChoCSVReader<StudentInfo2>.LoadText(csv, config))


{
foreach (var rec in r)
{
Console.WriteLine(rec.Dump());
}
}
}
22.42. Can duplicate column names be handled while reading dynamic records?
If the CSV file comes with duplicate column names and you want the parser to handle
them automatically by appending an auto-incrementing suffix ('_2', '_3', etc.),
ChoCSVReader does it automatically.

For the sample CSV file below:

Hide Copy Code


Id, Name, Name
1, Tom, Mark
2, Kevin, Fahey
The code below shows how to load them into dynamic records:

Hide Copy Code


static void DuplicateNameInDynamicModeTest()
{
string csv = @"Id, Name, Name
1, Tom, Mark
2, Kevin, Fahey";

using (var r = ChoCSVReader.LoadText(csv)
.WithFirstLineHeader()
.AutoIncrementDuplicateColumnNames()
.ArrayIndexSeparator('_')
)
{
foreach (var rec in r)
Console.WriteLine(rec.Dump());
}
}
Here the CSV file comes with two columns of the same name. By instructing
ChoCSVReader with AutoIncrementDuplicateColumnNames, you can load this file
successfully. The dynamic object will have Id, Name, and Name_2 as its members,
with the corresponding values in them.

22.43. Can ChoCSVReader find the delimiter in a CSV file automatically?
Yes, it can. The sample below shows how:

Hide Copy Code


static void AutoDetectDelimiter()
{
string csv = @"Id;Guid
10;cc6f0116-589a-4cf1-8605-a4eb6ab3bd34
20;cc6f0116-589a-4cf1-8605-a4eb6ab3bd34
";

using (var r = ChoCSVReader.LoadText(csv)
.WithFirstLineHeader()
.WithMaxScanRows(1)
.AutoDetectDelimiter()
)
{
foreach (var rec in r)
Console.WriteLine(rec.Dump());
}
}
By setting `AutoDetectDelimiter` along with `WithMaxScanRows`, you can instruct
ChoCSVReader to scan and detect the CSV delimiter automatically.

23. ChoTSVReader
ChoTSVReader is a specialized version of the CSV reader, used to parse TSV (Tab
Separated) files.

Hide Copy Code


static void TSV2Xml()
{
string tsv = @"Time Object pmPdDrb pmPdcDlSrb
00:45 EUtranCellFDD=GNL02294_7A_1 2588007 1626
00:45 EUtranCellFDD=GNL02294_7B_1 18550 32
00:45 EUtranCellFDD=GNL02294_7C_1 26199 38
00:45 EUtranCellFDD=GNL02294_9A_1 3857243 751";

StringBuilder xml = new StringBuilder();


using (var r = ChoTSVReader.LoadText(tsv)
.WithFirstLineHeader()
)
{
using (var w = new ChoXmlWriter(xml)
.Configure(c => c.RootName = "xmlnodes")
.Configure(c => c.NodeName = "xmlnode")
)
w.Write(r);
}
Console.WriteLine(xml.ToString());
}
The sample above shows how to parse a TSV-formatted file and write it out as XML.
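
If you only need to enumerate the TSV rows (without converting them to XML), a minimal sketch reusing the same tsv string looks like this:

using (var r = ChoTSVReader.LoadText(tsv)
    .WithFirstLineHeader()
    )
{
    foreach (var rec in r)
        Console.WriteLine(rec.Dump());
}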

25. References
Here is the list of other modules exposed by Cinchoo ETL, for your reference:

1. CSV Reader / Writer
   CSV Reader
   CSV Writer
2. Fixed Length Reader / Writer
   Flat (Fixed Length) file Reader
   Flat (Fixed Length) file Writer
3. Xml Reader / Writer
   Xml Reader
   Xml Writer
4. JSON Reader / Writer
   JSON Reader
   JSON Writer - Coming soon
5. Key-Value Reader / Writer
   Key-Value Reader - Coming soon
   Key-Value Writer - Coming soon
6. Manifold Reader / Writer
   Manifold Reader - Coming soon
   Manifold Writer - Coming soon

License
This article, along with any associated source code and files, is licensed under
The Code Project Open License (CPOL)
