PySpark debugging can't set --jar (or other spark-submit) options #123

@DonJayamanne

From @KevinGrealish on May 30, 2017 7:48

Environment data

VS Code version: 1.12.1
Python Extension version: 0.6.3
Python Version: 3.5.2
OS and version: Windows 10 Creators Update

Actual behavior

In the PySpark debug configuration there is no way to pass options to spark-submit, for example the --jar option to add a .jar file to the run. Using args in the debug config just appends the arguments to the end of the spark-submit command, so they are passed to visualstudio_py_launcher (and then to the Python program being debugged) rather than to spark-submit. Adding the --jar option to the end of "pythonPath" does not work either, because the entire value, now including the options, is then quoted.
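To make the ordering problem concrete, here is a minimal Python sketch of how the command appears to be assembled. The argument order is inferred from the logged command in this issue, not from the extension's source, and `build_command` is a hypothetical helper. The debug id is replaced with a placeholder.

```python
def build_command(python_path, launcher, launcher_args, program, args):
    """Assemble argv in the order seen in the logged command (assumed order)."""
    return [python_path, launcher, *launcher_args, program, *args]


cmd = build_command(
    python_path="spark-submit.cmd",
    launcher="visualstudio_py_launcher.py",
    launcher_args=["c:\\Jex", "2283", "<debug-id>",
                   "WaitOnAbnormalExit,WaitOnNormalExit"],
    program="c:\\Jex\\src\\main.py",
    args=["--jar", "C:\\Jex\\out\\artifacts\\Jex_jar\\Jex.jar"],
)

launcher_index = cmd.index("visualstudio_py_launcher.py")
jar_index = cmd.index("--jar")

# spark-submit reads its own options only before the first script;
# everything after that script is handed to the application.
print("option reaches spark-submit:", jar_index < launcher_index)
```

Because `--jar` lands after the launcher script, spark-submit never sees it as one of its own options.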

Expected behavior

Another option, perhaps: pythonArgs, which would work like args but place the specified arguments after the pythonPath and before the program.
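A rough Python sketch of the ordering such a setting would produce. `python_args` stands in for the suggested pythonArgs setting, and `build_command_with_python_args` is a hypothetical helper, not part of the extension. Note that spark-submit itself spells the option `--jars`.

```python
def build_command_with_python_args(python_path, python_args, launcher,
                                   program, args):
    """Place python_args right after the executable, before any script."""
    return [python_path, *python_args, launcher, program, *args]


proposed = build_command_with_python_args(
    python_path="spark-submit.cmd",
    python_args=["--jars", "C:\\myjar.jar"],  # options meant for spark-submit
    launcher="visualstudio_py_launcher.py",
    program="main.py",
    args=["--app-flag"],  # hypothetical argument for the debugged program
)

# The spark-submit options now precede the launcher script, while the
# program's own arguments stay at the end.
print(proposed)
```

Keeping args and pythonArgs separate would let the debugged program still receive its own arguments unchanged.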

Steps to reproduce:

With Spark installed and SPARK_HOME set, use the provided PySpark debugger config.

  • In launch.json, in the PySpark configuration, add:
    "args": [
    "--jar",
    "C:\\myjar.jar"
    ],
  • Launch the debugger and find that the PySpark program does not have the .jar loaded. See that the --jar option was added at the very end of the spark-submit command, but it needs to come before the .py file.
    OR
  • In launch.json, in the PySpark configuration, change the pythonPath to include the options and then launch the debugger. The launch fails because the whole pythonPath value, including the options, is quoted.
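The quoting failure in the second variant can be approximated in Python. This is only an illustration: `subprocess.list2cmdline` applies Windows-style quoting rules and is not necessarily what the extension uses, and the paths are made up.

```python
import subprocess

# pythonPath with the options appended, as tried in the second variant
python_path = "C:\\spark\\bin\\spark-submit.cmd --jars C:\\myjar.jar"

# Any argument containing a space is wrapped in double quotes as a whole.
command_line = subprocess.list2cmdline([python_path, "main.py"])
print(command_line)
```

The shell then looks for an executable whose name is the entire quoted string, options included, which does not exist.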

Settings

Your launch.json (if dealing with debugger issues):

{
    "version": "0.2.0",
    "configurations": [

        {
            "name": "Python",
            "type": "python",
            "request": "launch",
            "stopOnEntry": true,
            "pythonPath": "${config:python.pythonPath}",
            "program": "${file}",
            "cwd": "${workspaceRoot}",
            "env": {},
            "envFile": "${workspaceRoot}/.env",
            "debugOptions": [
                "WaitOnAbnormalExit",
                "WaitOnNormalExit",
                "RedirectOutput"
            ]
        },
        {
            "name": "PySpark",
            "type": "python",
            "request": "launch",
            "stopOnEntry": true,
            "osx": {
                "pythonPath": "${env:SPARK_HOME}/bin/spark-submit"
            },
            "windows": {
                "pythonPath": "${env:SPARK_HOME}/bin/spark-submit.cmd"
            },
            "linux": {
                "pythonPath": "${env:SPARK_HOME}/bin/spark-submit"
            },
            "program": "${file}",
            "cwd": "${workspaceRoot}",
            "args": [
                "--jar",
                "C:\\Jex\\out\\artifacts\\Jex_jar\\Jex.jar"
            ],

            "env": {},
            "envFile": "${workspaceRoot}/.env",
            "debugOptions": [
                "WaitOnAbnormalExit",
                "WaitOnNormalExit",
                "RedirectOutput"
            ],
            "console": "integratedTerminal"
        },
        {
            "name": "Python Module",
            "type": "python",
            "request": "launch",
            "stopOnEntry": true,
            "pythonPath": "${config:python.pythonPath}",
            "module": "module.name",
            "cwd": "${workspaceRoot}",
            "env": {},
            "envFile": "${workspaceRoot}/.env",
            "debugOptions": [
                "WaitOnAbnormalExit",
                "WaitOnNormalExit",
                "RedirectOutput"
            ]
        },
        {
            "name": "Integrated Terminal/Console",
            "type": "python",
            "request": "launch",
            "stopOnEntry": true,
            "pythonPath": "${config:python.pythonPath}",
            "program": "${file}",
            "cwd": "",
            "console": "integratedTerminal",
            "env": {},
            "envFile": "${workspaceRoot}/.env",
            "debugOptions": [
                "WaitOnAbnormalExit",
                "WaitOnNormalExit"
            ]
        },
        {
            "name": "External Terminal/Console",
            "type": "python",
            "request": "launch",
            "stopOnEntry": true,
            "pythonPath": "${config:python.pythonPath}",
            "program": "${file}",
            "cwd": "",
            "console": "externalTerminal",
            "env": {},
            "envFile": "${workspaceRoot}/.env",
            "debugOptions": [
                "WaitOnAbnormalExit",
                "WaitOnNormalExit"
            ]
        },
        {
            "name": "Django",
            "type": "python",
            "request": "launch",
            "stopOnEntry": true,
            "pythonPath": "${config:python.pythonPath}",
            "program": "${workspaceRoot}/manage.py",
            "cwd": "${workspaceRoot}",
            "args": [
                "runserver",
                "--noreload"
            ],
            "env": {},
            "envFile": "${workspaceRoot}/.env",
            "debugOptions": [
                "WaitOnAbnormalExit",
                "WaitOnNormalExit",
                "RedirectOutput",
                "DjangoDebugging"
            ]
        },
        {
            "name": "Flask",
            "type": "python",
            "request": "launch",
            "stopOnEntry": false,
            "pythonPath": "${config:python.pythonPath}",
            "program": "fully qualified path to 'flask' executable. Generally located along with python interpreter",
            "cwd": "${workspaceRoot}",
            "env": {
                "FLASK_APP": "${workspaceRoot}/quickstart/app.py"
            },
            "args": [
                "run",
                "--no-debugger",
                "--no-reload"
            ],
            "envFile": "${workspaceRoot}/.env",
            "debugOptions": [
                "WaitOnAbnormalExit",
                "WaitOnNormalExit",
                "RedirectOutput"
            ]
        },
        {
            "name": "Flask (old)",
            "type": "python",
            "request": "launch",
            "stopOnEntry": false,
            "pythonPath": "${config:python.pythonPath}",
            "program": "${workspaceRoot}/run.py",
            "cwd": "${workspaceRoot}",
            "args": [],
            "env": {},
            "envFile": "${workspaceRoot}/.env",
            "debugOptions": [
                "WaitOnAbnormalExit",
                "WaitOnNormalExit",
                "RedirectOutput"
            ]
        },
        {
            "name": "Pyramid",
            "type": "python",
            "request": "launch",
            "stopOnEntry": true,
            "pythonPath": "${config:python.pythonPath}",
            "cwd": "${workspaceRoot}",
            "env": {},
            "envFile": "${workspaceRoot}/.env",
            "args": [
                "${workspaceRoot}/development.ini"
            ],
            "debugOptions": [
                "WaitOnAbnormalExit",
                "WaitOnNormalExit",
                "RedirectOutput",
                "Pyramid"
            ]
        },
        {
            "name": "Watson",
            "type": "python",
            "request": "launch",
            "stopOnEntry": true,
            "pythonPath": "${config:python.pythonPath}",
            "program": "${workspaceRoot}/console.py",
            "cwd": "${workspaceRoot}",
            "args": [
                "dev",
                "runserver",
                "--noreload=True"
            ],
            "env": {},
            "envFile": "${workspaceRoot}/.env",
            "debugOptions": [
                "WaitOnAbnormalExit",
                "WaitOnNormalExit",
                "RedirectOutput"
            ]
        },
        {
            "name": "Attach (Remote Debug)",
            "type": "python",
            "request": "attach",
            "localRoot": "${workspaceRoot}",
            "remoteRoot": "${workspaceRoot}",
            "port": 3000,
            "secret": "my_secret",
            "host": "localhost"
        }
    ]
}

Your settings.json:

Logs

Output from Python Debug Console output panel

Windows PowerShell
Copyright (C) 2016 Microsoft Corporation. All rights reserved.

PS C:\Jex> cd 'c:\Jex'; $env:PYTHONIOENCODING='UTF-8'; C:\spark-2.0.2-bin-hadoop2.7\/bin/spark-submit.cmd C:\Users\KevinGre\.vscode\extensions\donjayamanne.python-0.6.3\pythonFiles\PythonTools\visualstudio_py_launcher.py c:\Jex 2283 34806ad9-833a-4524-8cd6-18ca4aa74f14 WaitOnAbnormalExit,WaitOnNormalExit c:\Jex\src\main.py --jar C:\Jex\out\artifacts\Jex_jar\Jex.jar

Copied from original issue: DonJayamanne/pythonVSCode#988

Labels: feature-request (Request for new features or functionality)