
Commit 77d1017

Add DW PolyBase support for ADLS Gen2
1 parent a8e4bf8 commit 77d1017


articles/data-factory/connector-azure-sql-data-warehouse.md

Lines changed: 15 additions & 12 deletions
@@ -12,7 +12,7 @@ ms.workload: data-services
 ms.tgt_pltfrm: na
 
 ms.topic: conceptual
-ms.date: 04/16/2019
+ms.date: 04/19/2019
 ms.author: jingwang
 
 ---
@@ -396,22 +396,29 @@ Learn more about how to use PolyBase to efficiently load SQL Data Warehouse in t
 
 Using [PolyBase](https://docs.microsoft.com/sql/relational-databases/polybase/polybase-guide) is an efficient way to load a large amount of data into Azure SQL Data Warehouse with high throughput. You'll see a large gain in throughput by using PolyBase instead of the default BULKINSERT mechanism. See [Performance reference](copy-activity-performance.md#performance-reference) for a detailed comparison. For a walkthrough with a use case, see [Load 1 TB into Azure SQL Data Warehouse](https://docs.microsoft.com/azure/data-factory/v1/data-factory-load-sql-data-warehouse).
 
-* If your source data is in Azure Blob storage or Azure Data Lake Store, and the format is compatible with PolyBase, copy direct to Azure SQL Data Warehouse by using PolyBase. For details, see **[Direct copy by using PolyBase](#direct-copy-by-using-polybase)**.
+* If your source data is in **Azure Blob, Azure Data Lake Storage Gen1, or Azure Data Lake Storage Gen2** and the **format is PolyBase compatible**, you can use the copy activity to directly invoke PolyBase and let Azure SQL Data Warehouse pull the data from the source. For details, see **[Direct copy by using PolyBase](#direct-copy-by-using-polybase)**.
 * If your source data store and format isn't originally supported by PolyBase, use the **[Staged copy by using PolyBase](#staged-copy-by-using-polybase)** feature instead. The staged copy feature also provides better throughput: it automatically converts the data into a PolyBase-compatible format, stores it in Azure Blob storage, and then loads it into SQL Data Warehouse.
 
 ### Direct copy by using PolyBase
 
-SQL Data Warehouse PolyBase directly supports Azure Blob and Azure Data Lake Store. It uses service principal as a source and has specific file format requirements. If your source data meets the criteria described in this section, use PolyBase to copy direct from the source data store to Azure SQL Data Warehouse. Otherwise, use [Staged copy by using PolyBase](#staged-copy-by-using-polybase).
+SQL Data Warehouse PolyBase directly supports Azure Blob, Azure Data Lake Storage Gen1, and Azure Data Lake Storage Gen2. If your source data meets the criteria described in this section, use PolyBase to copy directly from the source data store to Azure SQL Data Warehouse. Otherwise, use [Staged copy by using PolyBase](#staged-copy-by-using-polybase).
 
 > [!TIP]
-> To copy data efficiently from Data Lake Store to SQL Data Warehouse, learn more from [Azure Data Factory makes it even easier and convenient to uncover insights from data when using Data Lake Store with SQL Data Warehouse](https://blogs.msdn.microsoft.com/azuredatalake/2017/04/08/azure-data-factory-makes-it-even-easier-and-convenient-to-uncover-insights-from-data-when-using-data-lake-store-with-sql-data-warehouse/).
+> To copy data efficiently to SQL Data Warehouse, learn more from [Azure Data Factory makes it even easier and convenient to uncover insights from data when using Data Lake Store with SQL Data Warehouse](https://blogs.msdn.microsoft.com/azuredatalake/2017/04/08/azure-data-factory-makes-it-even-easier-and-convenient-to-uncover-insights-from-data-when-using-data-lake-store-with-sql-data-warehouse/).
 
 If the requirements aren't met, Azure Data Factory checks the settings and automatically falls back to the BULKINSERT mechanism for the data movement.
 
-1. The **Source linked service** type is Azure Blob storage (**AzureBLobStorage**/**AzureStorage**) with **account key authentication** or Azure Data Lake Storage Gen1 (**AzureDataLakeStore**) with **service principal authentication**.
-2. The **input dataset** type is **AzureBlob** or **AzureDataLakeStoreFile**. The format type under `type` properties is **OrcFormat**, **ParquetFormat**, or **TextFormat**, with the following configurations:
+1. The **source linked service** uses one of the following types and authentication methods:
 
-   1. `fileName` doesn't contain wildcard filter.
+   | Supported source data store type | Supported source authentication type |
+   |:--- |:--- |
+   | [Azure Blob](connector-azure-blob-storage.md) | Account key authentication |
+   | [Azure Data Lake Storage Gen1](connector-azure-data-lake-store.md) | Service principal authentication |
+   | [Azure Data Lake Storage Gen2](connector-azure-data-lake-storage.md) | Account key authentication |
+
+2. The **source dataset format** is **ParquetFormat**, **OrcFormat**, or **TextFormat**, with the following configurations:
+
+   1. `folderPath` and `fileName` don't contain a wildcard filter.
    2. `rowDelimiter` must be **\n**.
   3. `nullValue` is either set to **empty string** ("") or left as default, and `treatEmptyAsNull` is left as default or set to true.
   4. `encodingName` is set to **utf-8**, which is the default value.
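
For the Azure Data Lake Storage Gen2 path that this commit adds, a source dataset meeting the requirements above might look like the minimal sketch below. It is illustrative only and not part of the commit: the dataset type name **AzureBlobFSFile**, the linked service name, and the placeholder paths are assumptions to verify against the [Azure Data Lake Storage Gen2](connector-azure-data-lake-storage.md) connector article.

```json
{
    "name": "AdlsGen2SourceDataset",
    "properties": {
        "type": "AzureBlobFSFile",
        "linkedServiceName": {
            "referenceName": "AdlsGen2LinkedService",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "folderPath": "<filesystem>/<folder>",
            "format": {
                "type": "TextFormat",
                "columnDelimiter": ",",
                "rowDelimiter": "\n",
                "nullValue": "",
                "encodingName": "utf-8",
                "firstRowAsHeader": true
            }
        }
    }
}
```

The `rowDelimiter`, `nullValue`, and `encodingName` values match the direct-copy requirements listed above, and the referenced Gen2 linked service would need account key authentication per the table.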
@@ -420,18 +427,14 @@ If the requirements aren't met, Azure Data Factory checks the settings and autom
 
 ```json
 "typeProperties": {
-    "folderPath": "<blobpath>",
+    "folderPath": "<path>",
     "format": {
         "type": "TextFormat",
         "columnDelimiter": "<any delimiter>",
         "rowDelimiter": "\n",
         "nullValue": "",
         "encodingName": "utf-8",
         "firstRowAsHeader": <any>
-    },
-    "compression": {
-        "type": "GZip",
-        "level": "Optimal"
     }
 },
 ```
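
To show how direct copy is invoked from a pipeline, here is a hedged sketch of a copy activity whose SQL Data Warehouse sink enables PolyBase. The activity and dataset names are hypothetical, and the source type **AzureBlobFSSource** and the `polyBaseSettings` values are assumptions drawn from the connector documentation rather than from this commit.

```json
{
    "name": "CopyFromAdlsGen2ToSqlDw",
    "type": "Copy",
    "inputs": [ { "referenceName": "AdlsGen2SourceDataset", "type": "DatasetReference" } ],
    "outputs": [ { "referenceName": "SqlDwOutputDataset", "type": "DatasetReference" } ],
    "typeProperties": {
        "source": {
            "type": "AzureBlobFSSource",
            "recursive": true
        },
        "sink": {
            "type": "SqlDWSink",
            "allowPolyBase": true,
            "polyBaseSettings": {
                "rejectType": "percentage",
                "rejectValue": 10.0,
                "rejectSampleValue": 100,
                "useTypeDefault": true
            }
        }
    }
}
```

When the source store or format doesn't meet the direct-copy criteria, the same activity can instead use staged copy by setting `enableStaging` and `stagingSettings` to point at an interim Azure Blob storage account; see the staged-copy section of the article for the exact properties.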
