Headers and Footers API #236

eupharis · 2015-12-03T19:20:44Z

This PR is way not ready to actually be merged. Just want to get some feedback on the API before I go farther.

All existing tests pass
(Except for 3 hash comparison tests since I added a header and footer to tests/expanded_docx/. I skipped them for now.)
There are 5 new passing tests for header and 5 new passing tests for footer. There obviously need to be many, many more unittests before this is ready for primetime.
The code needs much cleanup.

Here's a sample:

header_style = document.styles.add_style('Bar', ...)
header_style.font = Pt(32)

document = Document()
header = document.add_header()
header_p = header.add_paragraph('foo', style='Bar')
header_p.add_run('alice')
header_p.add_run().add_picture(img_file, width, height)

Fuller API notes below in docs/header_footer.rst.

eupharis · 2015-12-08T01:23:25Z

Woops. Looks like I had some python 3 issues. Fixed now.

scanny · 2015-12-08T02:14:03Z

Hi @eupharis, looks like you've made some good progress. It probably makes sense for you to sketch out the API you're thinking of so we can consider it directly. You've correctly identified that as the key part of getting a commit. It's the one thing we can't change once it's been put in so needs to be gotten right the first time :)

There are analysis pages under the docs directory, here's one that's related:
http://python-docx.readthedocs.org/en/latest/dev/analysis/features/sections.html

There are a few others around this one that are fairly good examples:
http://python-docx.readthedocs.org/en/latest/dev/analysis/features/text/font-color.html

I use the word "protocol" in there to denote describing the API by showing a typical usage of it, rather than just saying what methods there are on what objects. I think that's a pretty good way to capture the API you propose. It would take too long and be too error-prone to pull that out of your draft code; better to spell it out so we can discuss it easily.

I'm thinking one of these pages is a good place to gather that sort of thing because it will then become part of the documentation. It will also probably be the first item committed.

There are other examples, some well developed, in the analysis documentation for python-pptx; here's a recent one:
http://python-pptx.readthedocs.org/en/latest/dev/analysis/features/shp-hyperlink.html

I recommend just starting out with the bits that seem helpful for communicating your thinking. I see these as documentation for my own use when I'm writing them. Things I need to get straight in my head or details I need to dig out of the schema or Microsoft API or whatever that I know I'll need when I get into the code.

Then we can focus the discussion around a fairly well circumscribed area.

eupharis · 2015-12-11T00:47:43Z

Ok cool! I just pushed a protocol here:

https://github.com/eupharis/python-docx/blob/master/docs/dev/analysis/features/header-footer.rst

It looks slightly better via local Sphinx but the github rendering is pretty good.

It includes a write-up of my notes of how the hell docx headers work. I will definitely reference it in the future.

I modeled the protocol format after this one:

http://python-docx.readthedocs.org/en/latest/dev/analysis/features/text/font-color.html

eupharis · 2016-01-07T19:01:17Z

Ping!

scanny · 2016-01-26T20:33:07Z

Hi @eupharis, apologies, I'm not actively working on this project at the moment (other projects call as you will understand I'm sure), so it's taken a while to get to this.

Thanks for writing up the analysis page. This helps a lot in getting details ironed out before you go in to do the actual implementation. I've been surprised a bit to discover that the implementation generally turns out to be the easy part; it's the discovery, analysis, and resolution up front that take the time and all the noodling :)

The approach you suggest in the analysis page appears to be inconsistent with the existing approach for that sort of thing and also how Word handles the broader bit about headers and footers.

I just realized I had started some analysis work on headers/footers a while back. I just posted it here, you might want to have a look:
e8fb68e

I think the first thing is that headers and footers have a natural affinity for sections, rather than the document as a whole. Most documents probably only have a single section, but of course the API needs to work for the general case. So the first step in the protocol would be something like this I'm pretty sure:

>>> document = Document('xyx.docx')
>>> section = document.sections[-1]  #last section, for example
>>> odd_page_header = section.primary_header
>>> odd_page_header
<docx.header.Header object at 0x1029a6820>
>>>

Another item that jumps out is that there can be as many as three headers and/or footers for each section, one for odd pages (the primary header/footer), one for even pages (if laying out verso/recto for binding), and a possibly different first-page header/footer. So the API will need to both provide access to those and provide properties like has_first_page_header or whatever to allow discovery of what is defined.

I think you'll want to take a look at the MS API for headers and footers to see what they do. We usually find it pretty useful to model after that API, although there are important departures in certain cases to make things more Pythonic or just more rational. In any case it's useful to understand the MS API protocol as a key input to the analysis. I know in this case they maintain a collection accessed by indices 1, 2, or 3. I'm thinking we just make three properties (six actually when you include footers, e.g. .even_page_header that either return the appropriate section or None if one is not defined.

You'll also want to pull together the key parts of the XML schema along with examples/snippets. You can find plenty of examples of those in the other analysis pages. You might be able to reuse some of the bits from sections here, since they're so closely related:
https://github.com/eupharis/python-docx/blob/master/docs/dev/analysis/features/sections.rst

Also have a look at this issue for some ideas and prior work:
#104

Let us know how you go :)

eupharis · 2016-01-26T21:17:48Z

Ok awesome! I totally understand about other projects hah.

This is all super helpful. I'm currently using this code on a project of mine, and I've run into the problem with this v1 version of headers not playing nicely with sections. The approach you outline above should work much better.

I've blocked out some time next week and will dig into this further!

…x into develop

eupharis · 2016-02-23T00:41:31Z

Next week turned out to be rather optimistic ;)

I merged in all your feature/header commits. It looks like that new settingsPart work will be invaluable for adding <w:evenAndOddHeaders/> to settings.xml right?

I spelled out the XML that should exist for all the different all/even/odd/first header scenarios:

https://github.com/eupharis/python-docx/blob/master/docs/dev/analysis/features/header-footer.rst#all-pages-even-pages-odd-pages-first-page

I also updated the analysis I did so that header, even_page_header, and friends are all on Section not Document:

https://github.com/eupharis/python-docx/blob/master/docs/dev/analysis/features/header-footer.rst#candidate-protocol

Your headerfooter.rst still has a lot of good tidbits still that aren't merged into header-footer.rst yet. I figure we'll tackle that when we get closer.

I think we are getting close with the API?!?

eupharis · 2016-02-23T01:23:02Z

I totally agree that defining all the API and understanding the XML structure and writing the analysis is the hard part. This is actually pretty simple to implement now. I could port all my document.add_header methods and tests pretty easily.

scanny · 2016-02-23T01:35:23Z

It's very counterintuitive that way, isn't it? It still surprises me when I do it :)

I'll take a look at your analysis page and provide some feedback.

scanny · 2016-02-23T02:08:58Z

Ok @eupharis, I've taken a look and have some remarks I want to add in as comments. What I'll need is to have this file in a commit of its own so I can make inline comments. And it needs to be the whole file, all at once. If there are changes spread across multiple commits it breaks up the text and it's impossible to trace through.

What I'm thinking probably makes sense is for you to create a new, separate pull request that we use for moving this to commit/merge. We can keep this one open as it might come in handy. The new one will be built up commit by commit, essentially getting each one ready to merge before bringing in the next one.

How's your Git? You'll be needing it here I expect. Let me know if you need help on some part as we go. Here's what we'll need.

The new branch will be called feature/header. It will be based (rooted) at python-openxml/master. All commits submitted for review and later merging will appear on that branch, which will be eupharis/feature/header once you push it to your repo. On your local it will likely be origin/feature/header, but that can vary depending on where you cloned from.

This first commit should be named: 'docs: document header/footer analysis'. It will remain a single commit until merging, at least on your GitHub repo. You can keep it any way you like locally as you're the only one who sees that and occasionally it makes sense to have multiple commits and then squash them or whatever before pushing.

In order to overwrite the pull request branch (eupharis/feature/header) with changes to an existing commit, you'll need to push with the --force option. After a forced push, the pull request will automatically show the updated version. The comments from me (often extensive) from the prior version of that commit will be gone I believe, so make sure you get what you need before force pushing :)

In general there will be no merges in the pull request branch; the commit history will remain entirely linear. Likewise, there will be no "fixed this, fixed that" commits. Changes will be rebased into place as necessary. If changes need to be made to a commit that's already been merged, we'll talk and work it out.

Take a browse through the commit history for the latent styles features added starting here: e7930e4

You'll see a pattern that happens over and over again, and should happen here too:

docs: document analysis (single commit)
acpt: add scenarios for #acceptance tests
style: add Object.property # contains one unit test and code to make that test pass
repeat for 2-6 atomic functionality bits, one commit each
after the last one, the acceptance test(s) pass, that's a wrap for that feature
rinse and repeat

You have to go back about that far to see it, because I did a bit of reorganization before the last release and that has a very different form (and it's not done very often :).

You'll notice a few things:

the commits are very atomic, there is only one thing happening in each one, but it's a whole thing.
there are no little back and forth fixing things up commits. If those happen, they're rebased out before the "merge".
There are no merges. The commit history is entirely linear, except at the tip where there may be various feature branches in progress at any given time.
The entire test suite passes on each and every commit. No exceptions. The currently active acceptance test is marked @wip until that feature is done so it is skipped by the main test script. Where it fails (when run in development) points to what commit needs to come next; the one that gets you past the acceptance test error and gives you a different error further along.

Let me know if you need more to go on. Otherwise I'll review and remark on your analysis page once it's up :)

eupharis · 2016-02-23T16:54:25Z

Ok new PR created!

#263

I'm curious why we are basing it off of python-openxml/master. Are we going to merge in python-openxml/feature-header later?

scanny · 2016-02-23T18:59:05Z

Basing it off of feature/header is even better, by all means, let's do that :)

eupharis · 2016-02-23T19:37:12Z

Perfect! Here we go:

#264

I spent a fair while this morning playing with git rebase and git rebase -i. I feel good about rebasing to keep up to speed if you push anything new to feature/header. And squashing to get nice commits!

scanny · 2016-05-26T04:36:54Z

Closing this one for now since we have a new, later one open for header/footer.

Steve Canny and others added 17 commits March 15, 2015 23:37

docs: document header/footer feature analysis

e8fb68e

reformat single-line xml, add header

c08ca7e

temporarily skip sha tests

4b37134

add v1 HeaderPart

9c0544a

v1 of remove headers

a4999fe

move header methods to _body

90e453c

wip

a460f3b

add header v1 works!

2a69e3f

remove null-op subclass of XmlPart

205819e

footer methods and tests, all except add_footer

34161ec

add add_footer method

07566b5

make header / footer rels and styles play nicely

5093714

put header and footer ref at proper locs

b601615

add v1 header / footer api docs

3cf0802

cleanup

67ae998

cleanup

178d8f7

fix tests on python3

bab2801

eupharis added 2 commits December 10, 2015 15:16

define header structure

3f3791f

v1 header-footer protocol

8f843f4

tweak

e4d92c3

fix: a couple small fixes

b1f33b6

eupharis added 3 commits February 22, 2016 13:58

Merge branch 'feature/header' of github.com:python-openxml/python-doc…

6a125cf

…x into develop

add failing test for section.add_header method

329e990

wip

188e943

eupharis added 2 commits February 22, 2016 16:08

finish fleshing out header / footer analysis

26899a7

cleanup

17d47bc

scanny closed this May 26, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Headers and Footers API #236

Headers and Footers API #236

Uh oh!

eupharis commented Dec 3, 2015

Uh oh!

eupharis commented Dec 8, 2015

Uh oh!

scanny commented Dec 8, 2015

Uh oh!

eupharis commented Dec 11, 2015

Uh oh!

eupharis commented Jan 7, 2016

Uh oh!

scanny commented Jan 26, 2016

Uh oh!

eupharis commented Jan 26, 2016

Uh oh!

eupharis commented Feb 23, 2016

Uh oh!

eupharis commented Feb 23, 2016

Uh oh!

scanny commented Feb 23, 2016

Uh oh!

scanny commented Feb 23, 2016

Uh oh!

eupharis commented Feb 23, 2016

Uh oh!

scanny commented Feb 23, 2016

Uh oh!

eupharis commented Feb 23, 2016

Uh oh!

scanny commented May 26, 2016

Uh oh!

Uh oh!

Headers and Footers API #236

Headers and Footers API #236

Uh oh!

Conversation

eupharis commented Dec 3, 2015

Uh oh!

eupharis commented Dec 8, 2015

Uh oh!

scanny commented Dec 8, 2015

Uh oh!

eupharis commented Dec 11, 2015

Uh oh!

eupharis commented Jan 7, 2016

Uh oh!

scanny commented Jan 26, 2016

Uh oh!

eupharis commented Jan 26, 2016

Uh oh!

eupharis commented Feb 23, 2016

Uh oh!

eupharis commented Feb 23, 2016

Uh oh!

scanny commented Feb 23, 2016

Uh oh!

scanny commented Feb 23, 2016

Uh oh!

eupharis commented Feb 23, 2016

Uh oh!

scanny commented Feb 23, 2016

Uh oh!

eupharis commented Feb 23, 2016

Uh oh!

scanny commented May 26, 2016

Uh oh!

Uh oh!