TextLoader Dynamic encoding detection similar to python #7933

Open. Wants to merge 4 commits into main.

gangadharrr

Ported the Python file-encoding detection logic to the JS TextLoader, using jschardet to detect a file's encoding. Tested with a UTF-16LE-encoded legacy file containing raw text. This PR fixes how raw text is represented by the TextLoader load() method, which creates documents in raw format.
Python helpers.py:
https://github.com/langchain-ai/langchain/blob/b075eab3e0af9a578af80c6e38f869419e770b5c/libs/community/langchain_community/document_loaders/helpers.py#L19
Python text.py (TextLoader):
https://github.com/langchain-ai/langchain/blob/b075eab3e0af9a578af80c6e38f869419e770b5c/libs/community/langchain_community/document_loaders/text.py#L13
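
For reviewers, here is a rough sketch of the "try the default encoding, then fall back to detected ones" flow described above. It is not the code in this PR: the names `EncodingDetector`, `detectByBom`, and `decodeWithFallback` are hypothetical, and the detector only looks at the byte-order mark as a stand-in for a real detection library such as jschardet. It assumes a runtime with a global `TextDecoder` (Node.js or a browser).

```typescript
// Hypothetical detector shape: returns candidate encoding labels, best first.
type EncodingDetector = (bytes: Uint8Array) => string[];

// Stand-in detector (assumption): inspects only the byte-order mark.
// A statistical detector such as jschardet could fill this role instead.
function detectByBom(bytes: Uint8Array): string[] {
  if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) {
    return ["utf-8"];
  }
  if (bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xfe) {
    return ["utf-16le"];
  }
  if (bytes.length >= 2 && bytes[0] === 0xfe && bytes[1] === 0xff) {
    return ["utf-16be"];
  }
  return [];
}

// Try the preferred encoding strictly; if that fails, try each detected one.
function decodeWithFallback(
  bytes: Uint8Array,
  preferred: string = "utf-8",
  detect: EncodingDetector = detectByBom
): { text: string; encoding: string } {
  const candidates = [preferred, ...detect(bytes)];
  for (const encoding of candidates) {
    try {
      // fatal: true makes decode() throw on invalid byte sequences
      // instead of silently inserting U+FFFD replacement characters.
      const text = new TextDecoder(encoding, { fatal: true }).decode(bytes);
      return { text, encoding };
    } catch {
      // Invalid bytes or unsupported label: move on to the next candidate.
    }
  }
  throw new Error(`Could not decode input with any of: ${candidates.join(", ")}`);
}
```

In Node.js the input bytes could come from `fs.readFile(path)`, since a `Buffer` is a `Uint8Array`. One caveat of this ordering: a UTF-16LE file without a BOM that contains only ASCII-range characters is also valid UTF-8 (with embedded NUL bytes), so the strict UTF-8 attempt would succeed; running the detector first is the alternative design.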

Fixes # (issue)

vercel bot commented Mar 30, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

| Name | Status | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| langchainjs-docs | ❌ Failed (Inspect) | | 💬 Add feedback | Mar 30, 2025 5:31am |

1 Skipped Deployment

| Name | Status | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| langchainjs-api-refs | ⬜️ Ignored (Inspect) | | | Mar 30, 2025 5:31am |

@dosubot dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Mar 30, 2025
@dosubot dosubot bot added the auto:bug Related to a bug, vulnerability, unexpected error with an existing feature label Mar 30, 2025
@gangadharrr
Author

Hey team, I tried to fix the docs deployment in the ways I know, but I'm still not sure why the docs build is failing. LMK if any changes need to be made on my side. Happy Learning!

Labels
auto:bug Related to a bug, vulnerability, unexpected error with an existing feature size:M This PR changes 30-99 lines, ignoring generated files.
1 participant