How to handle NULL in Polybase – Quick Tip
How to prepare a file for ingestion into Azure SQL Data Warehouse using Polybase, when there are NULLs in the data?
There are only a few web pages that talk about how Polybase handles NULL values… And most of them are misleading. For example:
- Polybase to treat the NULL string as Null values and not text.
- and this – which didn’t clear things up for me.
So here is the answer:
I’ll talk in the context of delimited files. To indicate a NULL in a column you simply put nothing between the delimiters. i.e. a missing value.
E.g. A row with three values, where the second value is NULL would look like this: “A”,,”B”
Simple right? If only those other resources were a bit clearer!
There is one more thing that you need to do, when defining your External File Format you should set the Format Option USE_TYPE_DEFAULT = False. e.g.
CREATE EXTERNAL FILE FORMAT Fmt_psg_SV_Service_335f7_SV_Service
WITH (FORMAT_TYPE = DELIMITEDTEXT,
FORMAT_OPTIONS(
FIELD_TERMINATOR = ‘,’,
STRING_DELIMITER = ‘”‘,
DATE_FORMAT = ”,
USE_TYPE_DEFAULT = False)
)
As per this CREATE EXTERNAL FILE FORMAT (Transact-SQL) article, which explains that, if USE_TYPE_DEFAULT = False, then missing values are treated as NULL. It’s a good explanation, you just need to know to look there !
I’ve tested this, and it works for int, varchar and DateTime type columns.
[box type=”download”] If you found this useful please like (via one of the share buttons below) or link to it, to help others find it![/box]