Some issues with calling a langchain_core.tools.tool with **kwargs #25405

Closed
5 tasks done
pdhoolia opened this issue Aug 14, 2024 · 4 comments
Labels
🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature Ɑ: core Related to langchain-core investigate

Comments

@pdhoolia

pdhoolia commented Aug 14, 2024

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangGraph/LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangGraph/LangChain rather than my code.
  • I am sure this is better as an issue rather than a GitHub discussion, since this is a LangGraph bug and not a design question.

Example Code

I have a simple query tool (over JSON data) as follows:

from langchain_core.tools import tool

data = [
    {
        "name": "Ann Lau",
        "jobTitle": "Director of Talent Management",
        "userId": "82092",
        "email": "Ann.Lau@bestrunsap.com",
        "departmentName": "Talent Management Corp",
        "departmentNumber": "50007750",
        "managerId": "71081",
    },
    {
        "name": "Ellen Reckert",
        "jobTitle": "Recruiting Manager",
        "userId": "100009",
        "email": "Ellen.Reckert@bestrunsap.com",
        "departmentName": "Human Resources US",
        "departmentNumber": "50150001",
        "managerId": "82092",
    },
    {
        "name": "Jack Kincaid",
        "jobTitle": "Development Manager",
        "userId": "100083",
        "email": "Jack.Kincaid@bestrunsap.com",
        "departmentName": "Employee Development",
        "departmentNumber": "50012007",
        "managerId": "82092",
    },
    {
        "name": "Joanne Pawlucky",
        "jobTitle": "Compensation Manager",
        "userId": "100152",
        "email": "Joanne.Pawlucky@bestrunsap.com",
        "departmentName": "Total Rewards",
        "departmentNumber": "50007730",
        "managerId": "82092",
    },
    {
        "name": "Parry Nolan",
        "jobTitle": "Recruiter",
        "userId": "108724",
        "email": "Parry.Nolan@bestrunsap.com",
        "departmentName": "Human Resources US",
        "departmentNumber": "50150001",
        "managerId": "100009",
    },
    {
        "name": "Melissa Collins",
        "jobTitle": "Recruiter",
        "userId": "108721",
        "email": "Melissa.Collins@bestrunsap.com",
        "departmentName": "Human Resources US",
        "departmentNumber": "50150001",
        "managerId": "100009",
    },
    {
        "name": "Jack Powers",
        "jobTitle": "Recruiter",
        "userId": "108723",
        "email": "Jack.Powers@bestrunsap.com",
        "departmentName": "Human Resources US",
        "departmentNumber": "50150001",
        "managerId": "100009",
    },
    {
        "name": "Brett Neil",
        "jobTitle": "Recruiter",
        "userId": "108722",
        "email": "Brett.Neil@bestrunsap.com",
        "departmentName": "Human Resources US",
        "departmentNumber": "50150001",
        "managerId": "100009",
    },
    {
        "name": "Amelia Ruiz",
        "jobTitle": "Sr Recruiter",
        "userId": "108713",
        "email": "Amelia.Ruiz@bestrunsap.com",
        "departmentName": "Human Resources US",
        "departmentNumber": "50150001",
        "managerId": "100009",
    },
    {
        "name": "Michael Roth",
        "jobTitle": "Development Analyst",
        "userId": "196802",
        "email": "Michael.Roth@bestrunsap.com",
        "departmentName": "Employee Development",
        "departmentNumber": "50012007",
        "managerId": "100083",
    },
    {
        "name": "Amanda Winters",
        "jobTitle": "Development Analyst Lead",
        "userId": "100052",
        "email": "Amanda.Winters@bestrunsap.com",
        "departmentName": "Employee Development",
        "departmentNumber": "50012007",
        "managerId": "100083",
    },
    {
        "name": "Rick Smolla",
        "jobTitle": "Development Analyst",
        "userId": "100093",
        "email": "Rick.Smolla@bestrunsap.com",
        "departmentName": "Employee Development",
        "departmentNumber": "50012007",
        "managerId": "100083",
    },
    {
        "name": "Kay Holliston",
        "jobTitle": "Program Manager",
        "userId": "100095",
        "email": "Kay.Holliston@bestrunsap.com",
        "departmentName": "Human Resources US",
        "departmentNumber": "50150001",
        "managerId": "100083",
    },
    {
        "name": "John Parker",
        "jobTitle": "Sr. Compensation Analyst",
        "userId": "100241",
        "email": "John.Parker@bestrunsap.com",
        "departmentName": "Total Rewards",
        "departmentNumber": "50007730",
        "managerId": "100152",
    },
]

@tool
def query_data(query_type, **kwargs) -> list[dict]:
    """Query employee data with various filters
    Args:
        query_type (str): Must be one of: ["get_by_name", "get_by_email", "get_by_department", "get_by_manager", "get_by_managerName", "get_by_userId"]
            to retrieve entries based on: employee name, employee email, department name, manager's ID, manager's name, or user ID, respectively.

        **kwargs: Additional arguments specific to the query type
            - name (str, optional): employee name (required for "get_by_name")
            - email (str, optional): employee email (required for "get_by_email")
            - departmentName (str, optional): department name (required for "get_by_department")
            - managerId (str, optional): manager ID (required for "get_by_manager")
            - managerName (str, optional): manager name (required for "get_by_managerName")
            - userId (str, optional): employee's user ID (required for "get_by_userId")

    Returns:
        list: list of dictionary entries from data that match query criteria
        dict: On error, returns a dictionary with error message
    """

    print(f"Received query_type: {query_type}")
    print(f"Received kwargs: {kwargs}")

    if query_type == "get_by_name":
        return [entry for entry in data if entry["name"].lower() == kwargs.get("name", "").lower()]
    elif query_type == "get_by_email":
        return [entry for entry in data if entry["email"].lower() == kwargs.get("email", "").lower()]
    elif query_type == "get_by_department":
        return [entry for entry in data if entry["departmentName"].lower() == kwargs.get("departmentName", "").lower()]
    elif query_type == "get_by_manager":
        return [entry for entry in data if entry["managerId"] == kwargs.get("managerId")]
    elif query_type == "get_by_managerName":
        # First, find the manager's ID by their name
        manager_entries = [entry for entry in data if entry["name"].lower() == kwargs.get("managerName", "").lower()]
        if not manager_entries:
            return {"error": "Manager with the given name not found"}
        manager_ids = [entry["userId"] for entry in manager_entries]
        # Then, find all entries managed by this manager
        return [entry for entry in data if entry["managerId"] in manager_ids]
    elif query_type == "get_by_userId":
        return [entry for entry in data if entry["userId"] == kwargs.get("userId")]
    else:
        return {"error": "Invalid query type or parameters"}

tools = [query_data]

If I run this function in isolation using something like:

print(query_data(query_type="get_by_name", name="Jack Kincaid"))

It works as expected.

However, when the tool is called by LangGraph, it returns []. It seems that no kwargs are received.

In the studio I see the following:

{
  "values": {
    "messages": [
      {
        "content": "Hi there! My name is Jack Kincaid. Who is my manager?",
        "additional_kwargs": {},
        "response_metadata": {},
        "type": "human",
        "name": null,
        "id": "4622ca43-4c73-4656-907a-6ff0016173b7",
        "example": false
      },
      {
        "content": "",
        "additional_kwargs": {
          "tool_calls": [
            {
              "index": 0,
              "id": "call_TtwL3GLZmHr2gHNYJPMCTAOF",
              "function": {
                "arguments": "{\"query_type\":\"get_by_name\",\"name\":\"Jack Kincaid\"}",
                "name": "query_data"
              },
              "type": "function"
            }
          ]
        },
        "response_metadata": {
          "finish_reason": "tool_calls",
          "model_name": "gpt-4o-2024-05-13",
          "system_fingerprint": "fp_c9aa9c0491"
        },
        "type": "ai",
        "name": null,
        "id": "run-619273c0-f976-4ceb-91a4-e2bf0045e7a4",
        "example": false,
        "tool_calls": [
          {
            "name": "query_data",
            "args": {
              "query_type": "get_by_name",
              "name": "Jack Kincaid"
            },
            "id": "call_TtwL3GLZmHr2gHNYJPMCTAOF",
            "type": "tool_call"
          }
        ],
        "invalid_tool_calls": [],
        "usage_metadata": null
      },
      {
        "content": "[]",
        "additional_kwargs": {},
        "response_metadata": {},
        "type": "tool",
        "name": "query_data",
        "id": "58bc763e-b2ca-4dd8-adbd-f4c54e072045",
        "tool_call_id": "call_TtwL3GLZmHr2gHNYJPMCTAOF",
        "artifact": null,
        "status": "success"
      }
    ]
  },
  "next": [
    "agent"
  ],
  "config": {
    "configurable": {
      "thread_id": "a72d302c-1f87-4e84-a39c-db2823da90f6",
      "checkpoint_ns": "",
      "checkpoint_id": "1ef5a640-0cc9-6e85-8002-2271b93dc76c"
    }
  },
  "metadata": {
    "step": 2,
    "run_id": "1ef5a63f-ea6f-63e8-9e2f-3c9aec75708e",
    "source": "loop",
    "writes": {
      "action": {
        "messages": [
          {
            "id": "58bc763e-b2ca-4dd8-adbd-f4c54e072045",
            "name": "query_data",
            "type": "tool",
            "status": "success",
            "content": "[]",
            "artifact": null,
            "tool_call_id": "call_TtwL3GLZmHr2gHNYJPMCTAOF",
            "additional_kwargs": {},
            "response_metadata": {}
          }
        ]
      }
    },
    "user_id": "",
    "graph_id": "agent",
    "thread_id": "a72d302c-1f87-4e84-a39c-db2823da90f6",
    "created_by": "system",
    "assistant_id": "fe096781-5601-53d2-b2f6-0d3403f7e9ca"
  },
  "created_at": "2024-08-14T17:38:22.841401+00:00",
  "parent_config": {
    "configurable": {
      "thread_id": "a72d302c-1f87-4e84-a39c-db2823da90f6",
      "checkpoint_ns": "",
      "checkpoint_id": "1ef5a640-0cb7-6a25-8001-8d5a0e3defd6"
    }
  }
}

The tool call message is correct:

          {
            "name": "query_data",
            "args": {
              "query_type": "get_by_name",
              "name": "Jack Kincaid"
            },
            "id": "call_TtwL3GLZmHr2gHNYJPMCTAOF",
            "type": "tool_call"
          }

However, the tool response in the next message is empty.

Also my log for the query_data function is:

langgraph-api-1       | Received query_type: get_by_name
langgraph-api-1       | Received kwargs: {}

I think something related to kwargs here may be causing the problem.
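The log above is consistent with a dispatcher that forwards only the keys matching parameter names declared in the function signature. The following is a toy model of that idea using only the standard library. It is an assumption for illustration, not langchain-core's actual code; `call_with_declared_only` and `probe` are hypothetical names.

```python
import inspect


def call_with_declared_only(func, tool_input):
    """Toy dispatcher: forward only keys that match a declared parameter name.

    This models a schema inferred from the signature. It is an assumption,
    not langchain-core's actual implementation.
    """
    declared = inspect.signature(func).parameters  # includes the name "kwargs"
    accepted = {k: v for k, v in tool_input.items() if k in declared}
    return func(**accepted)


def probe(query_type, **kwargs):
    """Stand-in for the issue's query_data; returns what it received."""
    return query_type, kwargs


# Flat arguments, as emitted by the model in the issue: "name" is dropped.
flat = call_with_declared_only(
    probe, {"query_type": "get_by_name", "name": "Jack Kincaid"}
)
print(flat)  # ('get_by_name', {})

# Nested under a literal "kwargs" key: the value arrives, but still nested.
nested = call_with_declared_only(
    probe, {"query_type": "get_by_name", "kwargs": {"name": "Jack Kincaid"}}
)
print(nested)  # ('get_by_name', {'kwargs': {'name': 'Jack Kincaid'}})
```

Under this model, a flat `name` argument never reaches the function, which would explain both the empty kwargs in the log and the empty result list.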

Description

I expect the tool call to return a result, but the kwargs part of the tool signature seems to get lost.

System Info

pip freeze | grep langchain

langchain==0.2.13
langchain-anthropic==0.1.23
langchain-community==0.2.12
langchain-core==0.2.30
langchain-openai==0.1.21
langchain-text-splitters==0.2.2

platform: mac
Python version: I have tried both 3.11 and 3.12, with the same results.

@hinthornw
Collaborator

hinthornw commented Aug 14, 2024

Great question!

High level, I don't think it's good practice to expose polymorphic functions directly to LLMs. It's more confusing to the LLM to use kwargs since you're giving even more ambiguity to the LLM that will almost definitely reduce reliability.

Mechanistically, tools are called using invoke() or ainvoke() (for async), not as plain functions. That means you'd call it like:

query_data.invoke({"query_type": "foo", "kwargs": {"ey": "yo"}})

and then you'd get a literal 'kwargs'-keyed keyword argument that you would have to de-nest here. We could potentially look into adding more native support for kwargs, though that would be directed at the langchain-core repo.
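As a minimal sketch of the de-nesting mentioned above, the function body could merge a literal "kwargs" entry back into the top level before dispatching. The helper name `denest_kwargs` is hypothetical, the data is a two-row excerpt of the issue's list, and the `@tool` decorator is omitted so the example runs with the standard library alone.

```python
# Two sample rows taken from the issue's data list (fields trimmed).
DATA = [
    {"name": "Ann Lau", "userId": "82092", "managerId": "71081"},
    {"name": "Jack Kincaid", "userId": "100083", "managerId": "82092"},
]


def denest_kwargs(kwargs):
    """Merge a literal 'kwargs' entry into the top level (hypothetical helper)."""
    flat = dict(kwargs)
    nested = flat.pop("kwargs", None)
    if isinstance(nested, dict):
        # Explicit top-level keys win over nested ones on conflict.
        flat = {**nested, **flat}
    return flat


def query_data(query_type, **kwargs):
    """Simplified version of the issue's tool, handling one query type."""
    kwargs = denest_kwargs(kwargs)
    if query_type == "get_by_name":
        wanted = kwargs.get("name", "").lower()
        return [e for e in DATA if e["name"].lower() == wanted]
    return {"error": "Invalid query type or parameters"}


# Works whether the argument arrives flat or nested under "kwargs".
print(query_data("get_by_name", name="Jack Kincaid"))
print(query_data("get_by_name", kwargs={"name": "Jack Kincaid"}))
```

This only helps if the caller actually sends the nested form. In the trace from the issue, the model sent `name` at the top level, so de-nesting alone would not have recovered it.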

But again, if I were trying to get good results from an LLM I would avoid overloading 6 tools in one here and make it explicit what it's supposed to provide.
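One way to follow this advice, sketched under assumptions, is to split the polymorphic function into one explicitly typed function per query. The function names below are hypothetical and the data is a three-row excerpt of the issue's list. Each function could then be decorated with `@tool` from `langchain_core.tools` as in the original post, which is omitted here so the sketch runs with the standard library alone.

```python
# Three sample rows taken from the issue's data list (fields trimmed).
DATA = [
    {"name": "Ann Lau", "userId": "82092", "managerId": "71081"},
    {"name": "Jack Kincaid", "userId": "100083", "managerId": "82092"},
    {"name": "Michael Roth", "userId": "196802", "managerId": "100083"},
]


def get_employee_by_name(name: str) -> list:
    """Return entries whose name matches, case-insensitively."""
    return [e for e in DATA if e["name"].lower() == name.lower()]


def get_reports_by_manager_id(manager_id: str) -> list:
    """Return entries whose managerId equals the given ID."""
    return [e for e in DATA if e["managerId"] == manager_id]


def get_reports_by_manager_name(manager_name: str) -> list:
    """Resolve the manager's userId by name, then return their reports."""
    manager_ids = {e["userId"] for e in get_employee_by_name(manager_name)}
    return [e for e in DATA if e["managerId"] in manager_ids]


print(get_employee_by_name("jack kincaid"))
print(get_reports_by_manager_name("Ann Lau"))
```

Every parameter is now named in the signature, so nothing depends on how `**kwargs` is handled during tool invocation.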

@hwchase17
Contributor

I'm going to move this to the langchain repo, but I do think it would be nice to consider whether we can improve support for this...

@hwchase17 hwchase17 transferred this issue from langchain-ai/langgraph Aug 14, 2024
@langcarl langcarl bot added the investigate label Aug 14, 2024
@dosubot dosubot bot added Ɑ: core Related to langchain-core 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature labels Aug 14, 2024
@pdhoolia
Author

pdhoolia commented Aug 15, 2024

High level, I don't think it's good practice to expose polymorphic functions directly to LLMs. It's more confusing to the LLM to use kwargs since you're giving even more ambiguity to the LLM that will almost definitely reduce reliability.

I agree, and for a quick proof-of-concept that was what I was going to do. Before doing that, however, I asked the LLM what it would do (with a prompt like):

I want to write a generic python function that i can use with Open AI function_call support to perform all kinds of queries with this data.

Ironically, the LLM led with that and settled on:

def query_data(query_type, **kwargs):
    """
    Query the employee data with various filters.

    Args:
        query_type (str): The type of query to perform. Must be one of the following:
            - "get_by_name": Retrieve entries based on the employee's name.
            - "get_by_email": Retrieve entries based on the employee's email.
            - "get_by_department": Retrieve entries based on the department name.
            - "get_by_manager": Retrieve entries based on the manager's ID.
            - "get_by_managerName": Retrieve entries based on the manager's name.
            - "get_by_userId": Retrieve entries based on the employee's user ID.
        
        **kwargs: Additional arguments specific to the query type.
            - name (str, optional): The name of the employee to search for (required for "get_by_name").
            - email (str, optional): The email of the employee to search for (required for "get_by_email").
            - departmentName (str, optional): The name of the department to search for (required for "get_by_department").
            - managerId (str, optional): The ID of the manager to search for (required for "get_by_manager" and "get_direct_reports").
            - managerName (str, optional): The name of the manager to search for (required for "get_by_managerName").
            - userId (str, optional): The user ID of the employee to search for (required for "get_by_userId").

    Returns:
        list: A list of dictionary entries from the data that match the query criteria.
        dict: If an error occurs, returns a dictionary with an error message.

    Examples:
        >>> query_data(query_type="get_by_name", name="Ann Lau")
        Returns all entries with the name "Ann Lau".

        >>> query_data(query_type="get_by_email", email="Ann.Lau@bestrunsap.com")
        Returns the entry with the email "Ann.Lau@bestrunsap.com".

        >>> query_data(query_type="get_by_department", departmentName="Human Resources US")
        Returns all entries within the "Human Resources US" department.

        >>> query_data(query_type="get_by_manager", managerId="82092")
        Returns all entries managed by the manager with ID "82092".

        >>> query_data(query_type="get_by_managerName", managerName="Ann Lau")
        Returns all entries managed by the manager named "Ann Lau".
    """
    ...

Also, generic lookup APIs like that are quite common. While they could be wrapped in a set of de-polymorphing functions, that could be quite a lot of work every time we encounter such an enterprise API.
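The wrapping effort might be reduced by generating the explicit wrappers from a small table instead of writing each one by hand. The sketch below is a hypothetical approach using only the standard library. `generic_lookup`, `QUERY_ARGS`, and `make_wrapper` are illustrative names, and whether a given tool framework honors `__signature__` when inferring its schema is an assumption that would need checking.

```python
import inspect

# Two sample rows taken from the issue's data list (fields trimmed).
DATA = [
    {"name": "Ann Lau", "userId": "82092", "managerId": "71081"},
    {"name": "Jack Kincaid", "userId": "100083", "managerId": "82092"},
]


def generic_lookup(query_type, **kwargs):
    """Stand-in for a polymorphic enterprise API (simplified from the issue)."""
    if query_type == "get_by_name":
        wanted = kwargs.get("name", "").lower()
        return [e for e in DATA if e["name"].lower() == wanted]
    if query_type == "get_by_manager":
        return [e for e in DATA if e["managerId"] == kwargs.get("managerId")]
    return {"error": "Invalid query type or parameters"}


# Hypothetical table: each query type and the one argument it requires.
QUERY_ARGS = {"get_by_name": "name", "get_by_manager": "managerId"}


def make_wrapper(query_type, arg_name):
    """Build a single-purpose function with an explicit, named parameter."""
    sig = inspect.Signature(
        [inspect.Parameter(arg_name, inspect.Parameter.POSITIONAL_OR_KEYWORD,
                           annotation=str)]
    )

    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)  # raises TypeError on wrong arguments
        return generic_lookup(query_type, **bound.arguments)

    wrapper.__name__ = query_type
    wrapper.__doc__ = f"Run the {query_type!r} query using {arg_name!r}."
    wrapper.__signature__ = sig  # what inspect.signature() will report
    return wrapper


wrappers = {qt: make_wrapper(qt, arg) for qt, arg in QUERY_ARGS.items()}

print(inspect.signature(wrappers["get_by_name"]))  # (name: str)
print(wrappers["get_by_name"](name="Jack Kincaid"))
```

The table still has to be written once per API, but each new query type becomes a one-line entry instead of a new hand-written function.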


dosubot bot commented Nov 29, 2024

Hi, @pdhoolia. I'm Dosu, and I'm helping the LangChain team manage their backlog. I'm marking this issue as stale.

Issue Summary

  • You reported a bug with **kwargs in langchain_core.tools.tool.
  • A code snippet was provided to demonstrate the issue.
  • hinthornw suggested using invoke() or ainvoke() due to potential ambiguity with polymorphic functions.
  • hwchase17 mentioned moving the issue to the langchain repository and considering improvements for kwargs support.
  • You acknowledged the feedback but highlighted the commonality of generic lookup APIs.

Next Steps

  • Please confirm if this issue is still relevant to the latest version of the LangChain repository. If so, you can keep the discussion open by commenting here.
  • Otherwise, this issue will be automatically closed in 7 days.

Thank you for your understanding and contribution!

@dosubot dosubot bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Nov 29, 2024
@dosubot dosubot bot closed this as not planned Won't fix, can't repro, duplicate, stale Dec 6, 2024
@dosubot dosubot bot removed the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Dec 6, 2024