hyperlinked text contents are ignored #406
Comments
I stumbled into this problem too. AFAIK there is some progress on this in issue #85, and an official patch will be released eventually. If you need this working right now and are content with getting only the text from hyperlinks (without the address part), you can install the package with this quick and dirty hack, q210@336ed9f, from my own fork (based on the patch proposed by Brad-Python in #85). The pip command is:
Running into the same wall. I would appreciate an official patch for this. I tested the fork above, and it solves my problem. Thanks, /PA
I have written a brute-force PoC using just lxml, which seems to work. I hope it helps. /PA
I downloaded the file mentioned below and it is not working for me: pip install git+git://github.com/q210/pythondocx.git@336ed9f
@kart8172 if you post a minimal example of your data here (perhaps a .docx file and the script where you are trying to use my fork) and describe your expectations in more detail, I can try to debug the problem.
Hi, I also ran into this problem, and I tried to fix it by working with the XML directly.
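A minimal sketch of what an XML-level workaround can look like (this is not the commenter's code). It assumes the paragraph is available as an element in the WordprocessingML namespace and simply joins every `w:t` descendant, so runs nested inside `w:hyperlink` are included. The name `paragraph_text_with_links` and the sample paragraph are hypothetical; tabs and line breaks (`w:tab`, `w:br`) are not handled.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used inside .docx documents.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraph_text_with_links(p_element):
    """Join the text of every w:t descendant, including those inside w:hyperlink."""
    return "".join(t.text or "" for t in p_element.iter(f"{{{W_NS}}}t"))


# Hypothetical paragraph: one plain run followed by one hyperlinked run.
SAMPLE_P = f"""
<w:p xmlns:w="{W_NS}">
  <w:r><w:t xml:space="preserve">See </w:t></w:r>
  <w:hyperlink><w:r><w:t>the docs</w:t></w:r></w:hyperlink>
</w:p>
"""

paragraph = ET.fromstring(SAMPLE_P)
print(paragraph_text_with_links(paragraph))  # See the docs
```

The same traversal should also work on the lxml element behind a python-docx paragraph, since lxml elements support `.iter()` with a namespaced tag, but that is an assumption to verify against the installed version.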
I wrote a small script to fetch the contents of a table in a docx file. Some of the cells in the table contained hyperlinked text. While iterating over the cells of the table's rows, I was able to retrieve the text contents using the .text attribute. But if a cell contained text with hyperlinks, the hyperlinked text was simply ignored, while the other text was retrieved successfully.
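The symptom can be illustrated at the XML level without the library. The sketch below assumes, as the behavior described here suggests, that the text is assembled only from runs (`w:r`) that are direct children of a paragraph, while hyperlinked runs sit one level deeper inside `w:hyperlink`. The cell XML and both helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}

# Hypothetical table cell: plain text followed by hyperlinked text.
CELL_XML = f"""
<w:tc xmlns:w="{W_NS}">
  <w:p>
    <w:r><w:t xml:space="preserve">Contact: </w:t></w:r>
    <w:hyperlink><w:r><w:t>support page</w:t></w:r></w:hyperlink>
  </w:p>
</w:tc>
"""

cell = ET.fromstring(CELL_XML)


def direct_run_text(cell_element):
    # Only runs that are direct children of a paragraph (assumed old behavior).
    return "".join(t.text or "" for t in cell_element.findall("w:p/w:r/w:t", NS))


def all_run_text(cell_element):
    # Every w:t below the cell, so runs wrapped in w:hyperlink are included.
    return "".join(t.text or "" for t in cell_element.findall(".//w:t", NS))


print(repr(direct_run_text(cell)))  # 'Contact: '
print(repr(all_run_text(cell)))     # 'Contact: support page'
```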