Data Lake Analytics and Nullable DateTime fields

My first solo foray into Data Lake Analytics on Azure had me stumped for a while so I thought I’d share in case anyone came across the same problem.

I had created a CSV file from some live data extracted from SQL Server which contained several nullable DateTime fields. The upload to Azure Data Lake storage was fine, but when I ran my U-SQL job it failed.

It turns out, although I had declared the field as nullable in the U-SQL script:-

DECLARE @in string = @"data/MyFile.csv";
// Define schema of file
@absences =
	EXTRACT OrganisationId int,
			StartDate DateTime,
			EndDate DateTime?,
			CompletionDate DateTime?
	FROM @in
	USING Extractors.Csv();

It did not like the fact that the CSV file contained ‘NULL’ in the field:-

1,2017-07-11 00:00:00.000,NULL,NULL
1,2017-07-10 00:00:00.000,2017-07-11 00:00:00.000,NULL
1,2017-07-09 00:00:00.000,2017-07-11 00:00:00.000,2017-07-11 00:00:00.000

I had to replace the ‘NULL’ with an empty field:-

1,2017-07-11 00:00:00.000,,
1,2017-07-10 00:00:00.000,2017-07-11 00:00:00.000,
1,2017-07-09 00:00:00.000,2017-07-11 00:00:00.000,2017-07-11 00:00:00.000

It’s always the stupid little things that get you 🙂

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s