| 1 | +--- |
| 2 | +title: Data scientists |
| 3 | +description: Get started with Coder as a data scientist. |
| 4 | +--- |
| 5 | + |
| 6 | +This article walks you through setting up a Coder workspace that supports |
| 7 | +data science projects. You'll learn how to: |
| 8 | + |
| 9 | +- Connect Coder to your Git provider (this example assumes that you're using |
| 10 | + GitHub, but Coder supports GitLab and Bitbucket as well); |
| 11 | +- Create a workspace with Jupyter Notebook and other data science packages |
| 12 | + present; |
| 13 | +- Add a sample project to your workspace, specifically |
| 14 | + [one as a Jupyter Notebook using IMDB movie data](https://github.com/khorne3/data-science-imdb-sample); |
| 15 | +- Create a dev URL and preview changes to your project. |
| 16 | + |
| 17 | +## Prerequisites |
| 18 | + |
| 19 | +This guide assumes that you have a Coder deployment available to you and that |
| 20 | +you have the credentials needed to access the deployment. |
| 21 | + |
| 22 | +## Step 1: Log in and connect Coder to your Git provider |
| 23 | + |
| 24 | +In this step, you'll log into Coder, then link your Coder account with your Git |
| 25 | +provider. This will allow you to do things like pull repositories and push |
| 26 | +changes. |
| 27 | + |
| 28 | +1. Navigate to the Coder deployment using the URL provided to you by your site |
| 29 | + manager, and log in. |
| 30 | + |
| 31 | +1. Click on your avatar in the top-right, and select **Account**. |
| 32 | + |
| 33 | +  |
| 34 | + |
| 35 | +1. Set up authentication so that Coder can connect to GitHub on your behalf. |
| 36 | + |
| 37 | + If your site manager has configured OAuth, go to **Linked Accounts** and |
| 38 | + follow the on-screen instructions to link your GitHub account. |
| 39 | + |
| 40 | +  |
| 41 | + |
| 42 | + If your site manager has _not_ configured OAuth or you are using a Git |
| 43 | + provider that Coder does not support, go to **SSH keys**. Copy your public |
| 44 | + SSH key and |
| 45 | + [provide it to GitHub](https://docs.github.com/en/authentication/connecting-to-github-with-ssh/adding-a-new-ssh-key-to-your-github-account). |
| 46 | + |
| 47 | +  |
| 48 | + |
| 49 | +## Step 2: Import an image |
| 50 | + |
| 51 | +At this point, you'll import your image, which you can think of as a template |
| 52 | +for your workspace. This template contains the language version, tooling, and |
| 53 | +dependencies you need to work on the project. In this case, the image also |
| 54 | +contains a `configure` script that will clone the data science project from |
| 55 | +GitHub to your workspace. |
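To make the role of the `configure` script more concrete, here is a minimal Python sketch of "clone the project if it is not already present". It is not the script shipped in the image, whose contents this guide does not show. The function names `clone_command` and `clone_if_missing` are hypothetical, and the repository URL is the sample project linked at the top of this article.

```python
import subprocess
from pathlib import Path

# Sample project linked in this guide.
REPO_URL = "https://github.com/khorne3/data-science-imdb-sample"


def clone_command(repo_url, parent_dir):
    """Return (target_dir, git_command) for cloning repo_url under parent_dir."""
    name = repo_url.rstrip("/").rsplit("/", 1)[-1]
    if name.endswith(".git"):
        name = name[: -len(".git")]
    target = Path(parent_dir) / name
    return target, ["git", "clone", repo_url, str(target)]


def clone_if_missing(repo_url, parent_dir):
    """Clone the repository unless the target directory already exists.

    Returns True if a clone was performed and False if it was skipped.
    Hypothetical helper: the real configure script may behave differently.
    """
    target, command = clone_command(repo_url, parent_dir)
    if target.exists():
        return False
    subprocess.run(command, check=True)
    return True
```

Calling `clone_if_missing(REPO_URL, Path.home())` would place the project in a `data-science-imdb-sample` directory under your home directory, assuming `git` is installed and the repository is reachable. Skipping the clone when the directory exists keeps the step safe to repeat each time a workspace is rebuilt.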
| 56 | + |
| 57 | +To import an image: |
| 58 | + |
| 59 | +1. In the top navigation bar, click **Images**. Then, click on **Import Image**. |
| 60 | + |
| 61 | +1. Leave the default registry (which is **dockerhub**) selected. |
| 62 | + |
| 63 | +1. Under **repository**, provide **kmhcdr/python**. Provide **latest** as the |
| 64 | + **tag**. Optionally, you can provide a **description** of the image. |
| 65 | + |
| 66 | +1. Specify the minimum amount of resources (cores, memory, and disk space) the |
| 67 | + workspace should have when using this image. For this project, we recommend 4 |
| 68 | + cores, 8 GB memory, and 10 GB disk space as a starting point. |
| 69 | + |
| 70 | +1. Click **Import Image**. |
| 71 | + |
| 72 | + |
| 73 | + |
| 74 | +## Step 3: Create your workspace |
| 75 | + |
| 76 | +You will now create the workspace where you'll work on your development project. |
| 77 | + |
| 78 | +1. Return to **Workspaces** using the top navigation bar. |
| 79 | + |
| 80 | +1. Click **New workspace** to launch the workspace-creation dialog. |
| 81 | + |
| 82 | +1. Provide a **Workspace Name**. |
| 83 | + |
| 84 | +1. In the **Image** section, select the **kmhcdr/python** image you just |
| 85 | + imported. |
| 86 | + |
| 87 | +1. Under **Workspace providers**, leave the default option (which is |
| 88 | + **built-in**) selected. |
| 89 | + |
| 90 | +1. Expand the **Advanced** section. If the **Run as a container-based virtual |
| 91 | + machine** option is selected, _clear_ the checkbox. Leave the **CPU**, |
| 92 | + **Memory**, **Disk**, and **GPU** allocations as-is. |
| 93 | + |
| 94 | +1. Scroll to the bottom, and click **Create workspace**. The dialog will close, |
| 95 | + allowing you to see the main workspace page. You can track the workspace |
| 96 | + build process using the **Build log** on the right-hand side. |
| 97 | + |
| 98 | + Due to the number of packages present in the image, this might take a few |
| 99 | + minutes. |
| 100 | + |
| 101 | + |
| 102 | + |
| 103 | +Once your workspace is ready for use, you'll see a chip that says **Running** |
| 104 | +next to the name of your workspace. |
| 105 | + |
| 106 | +## Step 4: Open up the sample project |
| 107 | + |
| 108 | +At this point, you're ready to open up Jupyter to access your notebook. |
| 109 | + |
| 110 | +1. Under **Browser applications**, click **Jupyter** to open the IDE in a new |
| 111 | + browser tab. |
| 112 | + |
| 113 | +1. Under **Files**, click to open the **data-science-imdb-sample** project. |
| 114 | + |
| 115 | +1. Click **Data Science Workflow.ipynb** to launch the notebook. |
| 116 | + |
| 117 | + You're now ready to proceed with work on the project. |
| 118 | + |
| 119 | + |
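Once the notebook is open, a quick check in a new cell can confirm that the data science packages you expect are importable in the workspace. The sketch below uses only the Python standard library. The package names in `EXPECTED_PACKAGES` are assumptions, since this guide does not list what the image contains, so adjust them to match your project. `find_missing` is a hypothetical helper name.

```python
from importlib.util import find_spec

# Assumed package list; the guide does not state what the image installs.
EXPECTED_PACKAGES = ["numpy", "pandas", "matplotlib"]


def find_missing(packages):
    """Return the top-level package names that cannot be imported here."""
    return [name for name in packages if find_spec(name) is None]


missing = find_missing(EXPECTED_PACKAGES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All expected packages are available.")
```

If the check reports missing packages, the image may not include them, or the notebook may be running on a different Python environment from the one you expect.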