Skip to content

Header and Footer #291

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

eupharis
Copy link
Contributor

No description provided.

@eupharis eupharis mentioned this pull request Apr 20, 2016

@text.setter
def text(self, text):
raise NotImplementedError('todo')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I just revised this commit and merged it in: https://github.com/python-openxml/python-docx/commits/feature/header

Since it's your first one of these I'm thinking it's faster that way, we can maintain momentum and I'll just note the whys and wherefores here so you can pick it up as you go :)

You'll probably want to rebase and force push after you've digested the comments so you're building on the main branch, but as long as your commits are individually mergeable without error there's some discretion on all that. I'm going to be cherry picking them and generally there are always revisions in the process. I'll leave the rebasing decision to you.

I do recommend you study the diff and understand why each change was made because they'll come up over and over. The general patterns are remarkably consistent on this project, probably because everything is just a fancy way of reading and writing to an XML file :)


  • All classes, functions, and methods get a full docstring at their first appearance. Follow the conventions of the rest of the library. Everyone's got a favorite; but consistency is the objective here.
  • Header should inherit from ElementProxy because it will wrap the <w:header> element. More on that later.
  • These two methods are unsupported by tests and not yet indicated by the acceptance test errors. So they come out for now. They'll get their turn in due course :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good!

@eupharis eupharis closed this May 2, 2016
@eupharis eupharis force-pushed the feature/header2 branch from d11d160 to bbbc287 Compare May 2, 2016 17:59
@eupharis eupharis reopened this May 2, 2016
@eupharis eupharis force-pushed the feature/header2 branch from 7f02b5d to c68f667 Compare May 2, 2016 22:32
"""

__slots__ = ()
Copy link
Contributor Author

@eupharis eupharis May 2, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used BlockItemContainer in my previous implementation of header and it works great. And its docstring says it's supposed to be used for headers.

It gives you all the .paragraphs and .add_paragraphs and all that jazz pretty much out of the box.

Header is a bit weird in that it needs two separate elements in two separate files to function:

<w:headerReference> in document.xml
<w:header> in header1.xml (or whatever)

in my previous implementation i just made Header handle <w:header> in header1.xml and then did this for the headerReference element

        header_ref_elm_tag = 'w:headerReference'
        header_ref_elm = OxmlElement(header_ref_elm_tag, attrs=header_attrs)
        # add the relationship
        sectPr.insert(0, header_ref_elm)

it seemed to work pretty well.

@scanny
Copy link
Contributor

scanny commented May 4, 2016

Okay, I've added my comments on that commit. Want to have another go and I think we'll have this commit ready to merge :)

@eupharis eupharis force-pushed the feature/header2 branch 5 times, most recently from 6459a93 to 3598440 Compare May 5, 2016 16:47
from .shared import ElementProxy


class Header(ElementProxy):
"""
The default page header of a section.
Proxy for ``<w:hdr>`` element in this section, having primarily a
container role.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commit is merged now, these comments are just for reference for future commits :)
https://github.com/python-openxml/python-docx/commits/feature/header

I left this change out. The docstring is for the library end-user (developer using python-docx) because it appears in the API documentation. So the details of what's being proxied wouldn't generally be a concern for them. I know I've used the "container role" phrase elsewhere, but I'm thinking if we want to elaborate we'd want to do so with specifics that would be helpful to a developer learning how to use this object. I'm sure we'll find more tidbits to add here as we go.

@scanny
Copy link
Contributor

scanny commented May 10, 2016

As far as next steps are concerned, it's probably time to start a new hdr-header-props.feature file and start on the header properties.

In general we'll want to stay to the "getter" side of things until that's pretty well fleshed out; the acceptance tests for the "setter" side generally use the getter side methods to validate they worked.

I'm thinking the getter aspect of Header.is_linked_to_previous is a good place to start. So something like:

Scenario Outline: Get Header.is_linked_to_previous
  Given a header <having-or-no> definition 
   Then header.is_linked_to_previous is <value>

  Examples: ...
    | having-or-no | value |
    | having a     | False |
    | having no    | True  |

That will get us down to the next step in the process, which is accessing the right header definition elements under w:sectPr.

After that, I'm thinking the next step is Header.body, which is the actual block item container. Something like:

Scenario: Get Header.body
  Given a header having a definition 
   Then header.body is a BlockItemContainer object

This will need to get more sophisticated later, adding a case for a header that doesn't have a definition and may need to create one, but this will be a start.

Want to work in that direction and we'll see where we go?

@eupharis
Copy link
Contributor Author

@scanny ok! I just did that acpt test and pushed it. I tried to keep it to just the acceptance test and nothing else for that commit.

Tomorrow I'll dive in to the implementation and unittests.

@eupharis
Copy link
Contributor Author

eupharis commented Jun 13, 2016

@scanny Ok! I reviewed the top commit. Looks good.

I just pushed those analysis updates. I think maybe we should hide the header.body interface, like on document. So the api people actually use is just header.add_paragraph or header.paragraphs, which internally uses header._body. What do you think?

What are the next steps to code? related_hdrftr_body?

To add the header relationship before between hdr and headerReference I did this

        reltype = nsmap['r'] + '/header'
        self._parent.part.rels.add_relationship(reltype, header_part, rel_id)

So for this reading stage we'll have to interact with self._parent.part.rels and use that to get to the HeaderPart

Here's what I had for the HeaderPart before:

class HeaderPart(XmlPart):
    # MOSTLY COPYPASTA FROM DOCUMENT PART BELOW THIS POINT
    # TODO ABSTRACT?
    @property
    def next_id(self):
        """
        The next available positive integer id value in this document. Gaps
        in id sequence are filled. The id attribute value is unique in the
        document, without regard to the element type it appears on.
        """
        id_str_lst = self._element.xpath('//@id')
        used_ids = [int(id_str) for id_str in id_str_lst if id_str.isdigit()]
        for n in range(1, len(used_ids)+2):
            if n not in used_ids:
                return n
        ...

And then in the docx/__init__.py this guy was necessary:

PartFactory.part_type_for[CT.WML_HEADER] = HeaderPart

So in English what I understand this to do is something like:

"For all content types in the document.xml.rels of type WML_HEADER, interact with the actual XML file via the HeaderPart."

@eupharis
Copy link
Contributor Author

eupharis commented Jul 6, 2016

Whew busy few weeks! Hope yours has been good! @scanny where are we at on this? any next steps?

@scanny
Copy link
Contributor

scanny commented Jul 8, 2016

Hi Dan, apologies, this one slipped down my inbox a bit :)

It looks from the current behave error that the next step is to implement DocumentPart.related_hdrftr_body().

Hiding Header.body might possibly work. I assume the behavior you mean is that it would provide access to the paragraphs or whatever of the effective header, meaning this one if defined or the inherited one if not. So .body would never be None and it would add a new header on the first section if there were no other header to inherit from.

We can probably postpone a final decision until we get further along, since I'm thinking these would just amount to API shortcuts; the "long-form" calls would still need to be implemented.

On the rels stuff, most of those operations (I think all) have been factored out into docx.opc.Part. So for the situation you mention, all you need in Header should be:

from docx.opc.constants import RELATIONSHIP_TYPE as RT
rId = self.part.relate_to(header_part, RT.HEADER)

If you look in docx.parts.styles.StylesPart.default() you can see an example of how a new part is created from scratch. SettingsPart has one too, very similarly implemented.

Let me know if you need more to go on :)

I'll wait to merge the docs updates until the next commit so you can update the page if you need to.

@eupharis
Copy link
Contributor Author

Ok! Just pushed another commit.

For some reason I find all this Part stuff way more confusing than anything else in this project. I hope this commit is going in the right direction.

HeaderPart feels slightly different than NumberingPart, StylesPart, and SettingsPart in that we can have mulitple HeaderParts in the same Document and they can have arbitrary PackURIs.

With this commit, if CT_HeaderFooter were done the behave test might be close to passing? Would CT_HeaderFooter be the next step?

@scanny
Copy link
Contributor

scanny commented Jul 12, 2016

Yes, it took me a while to get the part bits worked out. Initially the part and the API object (like document) were the same object; it's only relatively recently that I made them distinct.

The thing about a part object is that it deals with packaging concerns (and interacting with other parts). This is distinct from the API object, which deals strictly with the XML inside the part. So that's where the boundary is.

Anytime you're dealing with an rId (relationship), it's going to involve the part, because only a part can interact with another part. Also, most of those interactions are factored out into the XmlPart or Part base classes.

Hopefully that helps clarify a little bit :)

HeaderPart is distinct (so far) in the way that you mention in python-docx. But it's not the first time we've encountered sets of parts. In python-pptx, slide parts (and others) are multiple. So we should have all the infrastructure in place for dealing with them built into the .opc sub-package.

I'll have a look at this commit and try to get it merged this week. I'm traveling, so it will be toward the end of the week.

just adding it because it seems much easier to mock than
self.rels.related_parts
"""
return self.rels.related_parts[rId]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep this all in one piece. Here's an example of how I mocked related_parts in python-pptx. .related_parts is available directly on the part object, no need to go out to .rels to get it :)

    # in pptx.parts.presentation ------
    def related_slide(self, rId):
        """
        Return the |Slide| object for the related |SlidePart| corresponding
        to relationship key *rId*.
        """
        return self.related_parts[rId].slide

    # the rest of this is in tests.parts.test_presentation ------
    def it_provides_access_to_a_related_slide(self, slide_fixture):
        prs_part, rId, slide_ = slide_fixture
        slide = prs_part.related_slide(rId)
        prs_part.related_parts.__getitem__.assert_called_once_with(rId)
        assert slide is slide_

    @pytest.fixture
    def slide_fixture(self, slide_, related_parts_prop_):
        prs_part = PresentationPart(None, None, None, None)
        rId = 'rId42'
        related_parts_ = related_parts_prop_.return_value
        related_parts_.__getitem__.return_value.slide = slide_
        return prs_part, rId, slide_

    @pytest.fixture
    def related_parts_prop_(self, request):
        return property_mock(request, PresentationPart, 'related_parts')

@eupharis
Copy link
Contributor Author

Ok cool. Parts make a bit more sense.

The related_parts mock makes sense. I guess __getitem__ is all we need to do. That's handy. I'll do that from now on.

@Sum4196
Copy link

Sum4196 commented Jul 18, 2016

Hello @eupharis and @scanny, is there anything I can do to help with this?

@scanny
Copy link
Contributor

scanny commented Jul 18, 2016

Hi Michael, we're always happy to have help with things, the main question is whether you have the skills (or, like I was initially, the willingness to develop them quickly :). It's a pretty big library by many people's standards, and fairly complex, so it takes some time to get oriented. Also we maintain pretty high engineering standards in order to keep the library robust at this size with part-time attention.

If none of that scares you away I recommend reading through this full thread and its predecessor #264 and see what you can learn about what it takes. After that you would probably might want to pick a different feature because the coordination overhead would be fairly high and it would be easy to get in each other's way (it's a fairly linear process within a particular feature), but I'll leave that call to @eupharis.

@Sum4196
Copy link

Sum4196 commented Jul 18, 2016

Your response is much appreciated @scanny, I was reading up on this and it seemed to be more of a linear process as you say which I don't want to interfere with. I do know that I am not highly skilled at the time being and am learning as I go (I have learned A LOT reading up on this project by the way). I can possibly take a look at the full thread that you referenced, however, I'm unsure if I would be of much help. I just figured I would ask anyway :)

@dariojr
Copy link

dariojr commented Feb 12, 2017

Hello! The library is very useful, thanks for your work.
I would like to help in the development of this functionality (header and footer). Although I have experience in the work, but new in collaboration GitHub and high standards.
Do you recommend some documentation to start?
I already read branch 291 and download the last merge to start.
Thank you!

@scanny
Copy link
Contributor

scanny commented Feb 13, 2017

@dariojr There is a branch named feature/header which contains the committed work so far: https://github.com/python-openxml/python-docx/commits/feature/header

About the last 11 commits on that branch is the header/footer work. If you want to take a look at that and see where you think you want to go next.

Best is probably to get the tests working on your machine as it's all test-driven development.

The unit tests run with py.test and the acceptance tests are run with behave. There is also syntax checking with flake8. Once you get those three installed, the tests are run with:

make clean & flake 8 & py.test && behave --stop

from the project directory as the current working directory.

What operating system are you on?

@dariojr
Copy link

dariojr commented Feb 27, 2017

Hi, work in Ubuntu 12.04 tls, I am testing now.
Results:
root@dario:/usr/local/lib/python2.7/dist-packages/docx# make clean & flake8 & py.test && behave --stop
[1] 5703
[2] 5704
make: *** No rule to make target `clean'. Stop.
===================================================== test session starts =====================================================
platform linux2 -- Python 2.7.3, pytest-3.0.6, py-1.4.32, pluggy-0.4.0
rootdir: /usr/local/lib/python2.7/dist-packages/docx, inifile:
collected 0 items

================================================ no tests ran in 0.05 seconds =================================================
[1]- Exit 2 make clean
root@dario:/usr/local/lib/python2.7/dist-packages/docx# ./image/png.py:135:9: E731 do not assign a lambda expression, use a def
./image/png.py:146:9: E731 do not assign a lambda expression, use a def
./oxml/init.py:67:1: E402 module level import not at top of file
./oxml/init.py:70:1: E402 module level import not at top of file
./oxml/init.py:73:1: E402 module level import not at top of file
./oxml/init.py:77:1: E402 module level import not at top of file
./oxml/init.py:89:1: E402 module level import not at top of file
./oxml/init.py:98:1: E402 module level import not at top of file
./oxml/init.py:119:1: E402 module level import not at top of file
./oxml/init.py:133:1: E402 module level import not at top of file
./oxml/init.py:151:1: E402 module level import not at top of file
./oxml/init.py:184:1: E402 module level import not at top of file
./oxml/init.py:187:1: E402 module level import not at top of file
./oxml/init.py:198:1: E402 module level import not at top of file
./enum/text.py:84:1: E305 expected 2 blank lines after class or function definition, found 1
./enum/shape.py:21:1: E305 expected 2 blank lines after class or function definition, found 1
root@dario:/usr/local/lib/python2.7/dist-packages/docx#

As you will see I install it in dis-packages to test also with my codes. But I cant function Candidate Protocol publish in:
http://python-docx.readthedocs.io/en/latest/dev/analysis/features/header.html

        header = document.sections[0].header
        header.is_linked_to_previous
        header.text = 'foobar'
        header.is_linked_to_previous

Any suggestions for me to continue introducing myself to this project? Thank you!

@scanny scanny removed the active label Mar 12, 2017
@ascendedcrow
Copy link

Is there a timeframe for this to go in?

@tbell511
Copy link

Quick question... Why has this not been merged into another release?

@khanzf
Copy link

khanzf commented Sep 18, 2017

python-docx has 41 pull requests and has not updated since Jun 22, 2016.
It might be useful to fork and start a new active branch.

@scanny
Copy link
Contributor

scanny commented Sep 18, 2017

@formatically Work stalled on this contributor branch a while back. That often happens on contributed branches. If you want to pick it up by all means do. @khanzf This is a test-driven project, so no commits get merged without the tests. That discourages most casual additions it seems. In my experience only a relative few Python developers are test-driven development (TDD) guys. If you want to add some features by all means open the conversation here. Or if you want to maintain a non-TDD version, fork away :) You might soon be spending your weekends chasing bugs though. That's where indiscriminate committing of pull requests without tests leads in my experience. It might be more productive to become maintainer on this project.

@guerda
Copy link

guerda commented Feb 16, 2018

@scanny: I'm happy to help to push this feature to release.
What do you see to be done to be able to merge this feature branch? As far as I can see, Travis has been building and testing that branch successfully. Again, I'm happy to help

@wheeled
Copy link

wheeled commented Jul 24, 2018

Hi guys, is this PR working? I’d like to check it out.

@Spandan-Madan
Copy link

Can this be used to edit the header of a docx file?

If so, can someone please give a quick example for doing so?

Thanks a lot!

@Benjamin-T Benjamin-T mentioned this pull request Sep 2, 2018
@scanny
Copy link
Contributor

scanny commented Jan 6, 2019

A client sponsored the headers/footers feature and it is forthcoming in the next release v0.8.8 in the next few days. @ondrej-111 I credited you with several of the commits, although the implementation is somewhat different. You can see the implementation on the spike-header branch for comparison if you like. @eupharis I credited you with the analysis and user documentation commits. Thanks to both of you for your contributions to this feature and I'm glad we're finally getting it added to the build :)

@scanny scanny closed this Jan 6, 2019
@eupharis
Copy link
Contributor Author

eupharis commented Jan 7, 2019

@scanny Great news! Thanks for finishing this. Excited to try it out :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants