My first solo foray into Data Lake Analytics on Azure had me stumped for a while so I thought I’d share in case anyone came across the same problem.
I had created a CSV file from some live data extracted from SQL Server which contained several nullable DateTime fields. The upload to Azure Data Lake storage was fine, but when I ran my U-SQL job it failed.
It turns out, although I had declared the field as nullable in the U-SQL script:-
DECLARE @in string = @"data/MyFile.csv";
// Define schema of file
@absences =
EXTRACT OrganisationId int,
StartDate DateTime,
EndDate DateTime?,
CompletionDate DateTime?
FROM @in
USING Extractors.Csv();
It did not like the fact that the CSV file contained ‘NULL’ in the field:-
1,2017-07-11 00:00:00.000,NULL,NULL
1,2017-07-10 00:00:00.000,2017-07-11 00:00:00.000,NULL
1,2017-07-09 00:00:00.000,2017-07-11 00:00:00.000,2017-07-11 00:00:00.000
I had to replace the ‘NULL’ with an empty field:-
1,2017-07-11 00:00:00.000,,
1,2017-07-10 00:00:00.000,2017-07-11 00:00:00.000,
1,2017-07-09 00:00:00.000,2017-07-11 00:00:00.000,2017-07-11 00:00:00.000
It’s always the stupid little things that get you 🙂