
Commit d048310

Transform data: minor url update; write data: tables cut
1 parent 09c6d64 commit d048310

2 files changed: +29 -43 lines


articles/machine-learning/service/how-to-transform-data.md

Lines changed: 2 additions & 2 deletions
@@ -285,7 +285,7 @@ dataflow.head(5)
 
 ### Filtering columns
 
-To filter columns, use `Dataflow.drop_columns()`. This method takes a list of columns to drop or a more complex argument called [`ColumnSelector`](https://docs.microsoft.com/en-us/python/api/azureml-dataprep/azureml.dataprep.columnselector?view=azure-dataprep-py).
+To filter columns, use `Dataflow.drop_columns()`. This method takes a list of columns to drop or a more complex argument called [`ColumnSelector`](https://docs.microsoft.com/python/api/azureml-dataprep/azureml.dataprep.columnselector?view=azure-dataprep-py).
 
 #### Filtering columns with list of strings
 
@@ -490,7 +490,7 @@ df.head(2)
 |0|ALABAMA|Jefferson County|Jefferson County, Alabama|1.019200e+10|1.0|
 |1|ALABAMA|Jefferson County|Jefferson County, Alabama|1.019200e+10|0.0|
 
-## Next Steps
+## Next steps
 
 * See the SDK [overview](https://aka.ms/data-prep-sdk) for design patterns and usage examples
 * See the Azure Machine Learning Data Prep SDK [tutorial](tutorial-data-prep.md) for an example of solving a specific scenario
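The `ColumnSelector` reference whose URL was fixed above is the more flexible of the two arguments `drop_columns()` accepts. A minimal sketch of both call styles follows, reusing the sample file from the write-data examples; the `ColumnSelector` keyword arguments (`term`, `use_regex`) are assumptions read off the linked reference page, not confirmed by this diff:

```python
import azureml.dataprep as dprep

dataflow = dprep.auto_read_file('./data/fixed_width_file.txt')

# Simple form: drop an explicit list of column names.
dataflow = dataflow.drop_columns(['Column8', 'Column9'])

# Flexible form: drop every column whose name matches a pattern
# via a ColumnSelector (argument names are assumptions).
dataflow = dataflow.drop_columns(
    dprep.ColumnSelector(term='Column[1-3]', use_regex=True))
```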

articles/machine-learning/service/how-to-write-data.md

Lines changed: 27 additions & 41 deletions
@@ -15,7 +15,7 @@ ms.custom: seodec18
 ---
 # Write data using the Azure Machine Learning Data Prep SDK
 
-In this article, you learn different methods to write data using the Azure Machine Learning Data Prep SDK. Output data can be written at any point in a dataflow, and writes are added as steps to the resulting data flow and are run every time the data flow is. Data is written to multiple partition files to allow parallel writes.
+In this article, you learn different methods to write data using the [Azure Machine Learning Data Prep SDK](https://aka.ms/data-prep-sdk). Output data can be written at any point in a dataflow. Writes are added as steps to the resulting data flow and are run every time the data flow is run. Data is written to multiple partition files to allow parallel writes.
 
 Since there is no limit on the number of write steps in a pipeline, you can easily add write steps to capture intermediate results for troubleshooting or for other pipelines.
 
@@ -27,7 +27,7 @@ The following file formats are supported
 - Delimited files (CSV, TSV, etc.)
 - Parquet files
 
-Using the [Azure Machine Learning Data Prep python SDK](https://aka.ms/data-prep-sdk), you can write data to:
+Using the Azure Machine Learning Data Prep Python SDK, you can write data to:
 + a local file system
 + Azure Blob Storage
 + Azure Data Lake Storage
@@ -48,22 +48,17 @@ For this example, start by loading data into a data flow. You reuse this data wi
 import azureml.dataprep as dprep
 t = dprep.auto_read_file('./data/fixed_width_file.txt')
 t = t.to_number('Column3')
-t.head(10)
+t.head(5)
 ```
 
 Example output:
-| | Column1 | Column2 | Column3 | Column4 |Column5 | Column6 | Column7 | Column8 | Column9 |
-| -------- | -------- | -------- | -------- | -------- | -------- | -------- | -------- | -------- | -------- |
-| 0 | 10000.0 | 99999.0 | None| NO| NO | ENRS |NaN | NaN | NaN|
-| 1| 10003.0 | 99999.0 | None| NO| NO | ENSO| NaN| NaN |NaN|
-| 2| 10010.0| 99999.0| None| NO| JN| ENJA| 70933.0| -8667.0 |90.0|
-|3| 10013.0| 99999.0| None| NO| NO| | NaN| NaN| NaN|
-|4| 10014.0| 99999.0| None| NO| NO| ENSO| 59783.0| 5350.0| 500.0|
-|5| 10015.0| 99999.0| None| NO| NO| ENBL| 61383.0| 5867.0| 3270.0|
-|6| 10016.0 |99999.0| None| NO| NO| |64850.0| 11233.0| 140.0|
-|7| 10017.0| 99999.0| None| NO| NO| ENFR| 59933.0| 2417.0| 480.0|
-|8| 10020.0| 99999.0| None| NO| SV| |80050.0| 16250.0| 80.0|
-|9| 10030.0| 99999.0| None| NO| SV| |77000.0| 15500.0| 120.0|
+| | Column1 | Column2 | Column3 | Column4 | Column5 | Column6 | Column7 | Column8 | Column9 |
+| -------- | -------- | -------- | -------- | -------- | -------- | -------- | -------- | -------- | -------- |
+|0| 10000.0 | 99999.0 | None | NO | NO | ENRS | NaN | NaN | NaN |
+|1| 10003.0 | 99999.0 | None | NO | NO | ENSO | NaN | NaN | NaN |
+|2| 10010.0 | 99999.0 | None | NO | JN | ENJA | 70933.0 | -8667.0 | 90.0 |
+|3| 10013.0 | 99999.0 | None | NO | NO | | NaN | NaN | NaN |
+|4| 10014.0 | 99999.0 | None | NO | NO | ENSO | 59783.0 | 5350.0 | 500.0|
 
 ### Delimited file example
 
@@ -77,22 +72,18 @@ write_t = t.write_to_csv(directory_path=dprep.LocalFileOutput('./test_out/'))
 write_t.run_local()
 
 written_files = dprep.read_csv('./test_out/part-*')
-written_files.head(10)
+written_files.head(5)
 ```
 
 Example output:
-| | Column1 | Column2 | Column3 | Column4 |Column5 | Column6 | Column7 | Column8 | Column9 |
-| -------- | -------- | -------- | -------- | -------- | -------- | -------- | -------- | -------- | -------- |
-| 0 | 10000.0 | 99999.0 | ERROR | NO| NO | ENRS |ERROR | ERROR | ERROR|
-| 1| 10003.0 | 99999.0 | ERROR | NO| NO | ENSO| ERROR| ERROR |ERROR|
-| 2| 10010.0| 99999.0| ERROR | NO| JN| ENJA| 70933.0| -8667.0 |90.0|
-|3| 10013.0| 99999.0| ERROR | NO| NO| | ERROR| ERROR| ERROR|
-|4| 10014.0| 99999.0| ERROR | NO| NO| ENSO| 59783.0| 5350.0| 500.0|
-|5| 10015.0| 99999.0| ERROR | NO| NO| ENBL| 61383.0| 5867.0| 3270.0|
-|6| 10016.0 |99999.0| ERROR | NO| NO| |64850.0| 11233.0| 140.0|
-|7| 10017.0| 99999.0| ERROR | NO| NO| ENFR| 59933.0| 2417.0| 480.0|
-|8| 10020.0| 99999.0| ERROR | NO| SV| |80050.0| 16250.0| 80.0|
-|9| 10030.0| 99999.0| ERROR | NO| SV| |77000.0| 15500.0| 120.0|
+| | Column1 | Column2 | Column3 | Column4 | Column5 | Column6 | Column7 | Column8 | Column9 |
+| -------- | -------- | -------- | -------- | -------- | -------- | -------- | -------- | -------- | -------- |
+|0| 10000.0 | 99999.0 | ERROR | NO | NO | ENRS | NaN | NaN | NaN |
+|1| 10003.0 | 99999.0 | ERROR | NO | NO | ENSO | NaN | NaN | NaN |
+|2| 10010.0 | 99999.0 | ERROR | NO | JN | ENJA | 70933.0 | -8667.0 | 90.0 |
+|3| 10013.0 | 99999.0 | ERROR | NO | NO | | NaN | NaN | NaN |
+|4| 10014.0 | 99999.0 | ERROR | NO | NO | ENSO | 59783.0 | 5350.0 | 500.0|
 
 In the preceding output, several errors appear in the numeric columns because of numbers that were not parsed correctly. When written to CSV, these error values are replaced with the string "ERROR" by default.
 
@@ -104,22 +95,17 @@ write_t = t.write_to_csv(directory_path=dprep.LocalFileOutput('./test_out/'),
 na='NA')
 write_t.run_local()
 written_files = dprep.read_csv('./test_out/part-*')
-written_files.head(10)
+written_files.head(5)
 ```
 
 The preceding code produces this output:
-| | Column1 | Column2 | Column3 | Column4 |Column5 | Column6 | Column7 | Column8 | Column9 |
-| -------- | -------- | -------- | -------- | -------- | -------- | -------- | -------- | -------- | -------- |
-| 0 | 10000.0 | 99999.0 | BadData | NO| NO | ENRS |BadData | BadData | BadData|
-| 1| 10003.0 | 99999.0 | BadData | NO| NO | ENSO| BadData| BadData |BadData|
-| 2| 10010.0| 99999.0| BadData | NO| JN| ENJA| 70933.0| -8667.0 |90.0|
-|3| 10013.0| 99999.0| BadData | NO| NO| | BadData| BadData| BadData|
-|4| 10014.0| 99999.0| BadData | NO| NO| ENSO| 59783.0| 5350.0| 500.0|
-|5| 10015.0| 99999.0| BadData | NO| NO| ENBL| 61383.0| 5867.0| 3270.0|
-|6| 10016.0 |99999.0| BadData | NO| NO| |64850.0| 11233.0| 140.0|
-|7| 10017.0| 99999.0| BadData | NO| NO| ENFR| 59933.0| 2417.0| 480.0|
-|8| 10020.0| 99999.0| BadData | NO| SV| |80050.0| 16250.0| 80.0|
-|9| 10030.0| 99999.0| BadData | NO| SV| |77000.0| 15500.0| 120.0|
+| | Column1 | Column2 | Column3 | Column4 | Column5 | Column6 | Column7 | Column8 | Column9 |
+| -------- | -------- | -------- | -------- | -------- | -------- | -------- | -------- | -------- | -------- |
+|0| 10000.0 | 99999.0 | BadData | NO | NO | ENRS | NaN | NaN | NaN |
+|1| 10003.0 | 99999.0 | BadData | NO | NO | ENSO | NaN | NaN | NaN |
+|2| 10010.0 | 99999.0 | BadData | NO | JN | ENJA | 70933.0 | -8667.0 | 90.0 |
+|3| 10013.0 | 99999.0 | BadData | NO | NO | | NaN | NaN | NaN |
+|4| 10014.0 | 99999.0 | BadData | NO | NO | ENSO | 59783.0 | 5350.0 | 500.0|
 
 ### Parquet file example
 
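Pulling the surviving pieces of the write-data examples together, here is a minimal end-to-end sketch. The `error='BadData'` argument is an assumption inferred from the third output table (that line of the original file falls outside the hunks shown); everything else mirrors code visible in the diff:

```python
import azureml.dataprep as dprep

# Load and transform: Column3 values that fail to parse become errors.
t = dprep.auto_read_file('./data/fixed_width_file.txt')
t = t.to_number('Column3')

# Writing is itself a dataflow step: nothing is written until the
# dataflow runs, and output is split across partition files.
write_t = t.write_to_csv(directory_path=dprep.LocalFileOutput('./test_out/'),
                         error='BadData',  # assumed: replacement string for error values
                         na='NA')          # replacement string for null values
write_t.run_local()

# Read the partition files back to verify what was written.
written_files = dprep.read_csv('./test_out/part-*')
written_files.head(5)
```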